sysclk-lwla: Simplify and optimize word extraction.
It turns out that all LWLA protocol responses consist either
of 32-bit units or of 32-bit units combined into 64-bit units.
Thus it makes sense to double the basic unit size for reading
from 16 bit to 32 bit.
We cannot do the same for command messages though, as those
actually do use 16-bit quantities in some places, and 32-bit
arguments are not always aligned to 32-bit boundaries.
(acquisition_state.xfer_buf_in): Change unit type to uint32_t,
and update related macros and code accordingly.
(LWLA_TO_UINT32): New macro to replace LWLA_READ32, operating
directly on 32-bit values instead of pointers to 16-bit units.
Make use of a compiler-recognized idiom for bitwise rotation
to efficiently swap the 16-bit halves of a 32-bit word.
(LWLA_TO_UINT16): New macro to replace LWLA_READ16.
(LWLA_READ64): Remove unused macro.
(LWLA_WORD_[0123]): Slightly simplify 16-bit word extraction.