Difference between revisions of "Sysclk LWLA1034/Protocol"
Command 0008: Read Long Registers
This command reads several long registers in bulk. The long registers are 64 bit wide and use their own address space separate from the 32-bit control registers.
The software uses this command mainly to read the block of capture status registers. During idle periods, the original vendor software polls the channel state about 34 times per second for its live port status display. During a capture operation, it is necessary to poll the status in order to find out whether the capture buffer has been filled completely and samples can be retrieved.
Command
Fixed length of 3 words (6 bytes).
ID | Address | Length |
---|---|---|
0008 | aaaa | nnnn |
The two argument words are the start address and the length of the slice to read, in little endian byte order. Both the address and the length refer to quantities of 64 bit (4 words or 8 bytes). Thus, if length is 10 the reply will consist of 40 words or 80 bytes.
Although the vendor software always reads the full 10 fields of configuration and status information, this does not seem to be an actual requirement. Restricting reads to fields 5 to 9 in the sigrok driver has worked without any problems so far.
Response
Each returned 64-bit word is in a heavily mixed (6-5-8-7-2-1-4-3) byte order. See the table of long registers for the meaning of the values at each address.
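The following Python sketch shows one way to build a command 0008 message and to decode the reply. It assumes that the byte order notation counts from the most significant byte (so that 1-2-3-4 is big endian, as in the firmware header), which makes 6-5-8-7-2-1-4-3 equivalent to four little endian 16-bit words in the order bits 16..31, bits 0..15, bits 48..63, bits 32..47. The function names are made up for this example.

```python
import struct


def read_long_regs_command(address, count):
    """Build command 0008: ID, start address and length as 16-bit LE words."""
    return struct.pack("<HHH", 0x0008, address, count)


def unpack_u64(data, offset=0):
    """Decode one 64-bit value from 6-5-8-7-2-1-4-3 byte order (assumed reading)."""
    w0, w1, w2, w3 = struct.unpack_from("<4H", data, offset)
    # w0 = bits 16..31, w1 = bits 0..15, w2 = bits 48..63, w3 = bits 32..47
    return (w2 << 48) | (w3 << 32) | (w0 << 16) | w1


def parse_long_regs(response, count):
    """Split a reply of count * 8 bytes into a list of 64-bit integers."""
    if len(response) != count * 8:
        raise ValueError("unexpected response length")
    return [unpack_u64(response, 8 * i) for i in range(count)]
```

For instance, `read_long_regs_command(5, 5)` requests only the status fields 5 to 9.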
Control Registers
The device exposes a number of 32-bit wide registers accessed via commands 1 and 2. Some of the register addresses appearing in the protocol also occur in the LWLA1016 protocol, although it seems that their purpose may not be the same. Other registers appear to be specific to the LWLA1034.
Address | Name | Description |
---|---|---|
1074 | MEM_CTRL | Control register for capture memory access. |
1078 | MEM_FILL | Compressed size of captured data (number of 36-bit words). |
107C | MEM_ADDR? | Not clear if this is an address or another control register, or even if it is needed at all. |
1090 | TEST? | Writing 1 to this register apparently enables some sort of test mode. |
1094 | DIV_BYPASS | This is set to 0 when using the internal clock with sampling rates of 100 MHz and below. For 125 MHz internal clock or the external clock modes, it is set to 1. |
10B0 | LONG_STROBE | Long register read/write strobe. |
10B4 | LONG_ADDR | Long register address. |
10B8 | LONG_LOW | Long register low word. |
10BC | LONG_HIGH | Long register high word. |
10C0 | FREQ_CH1 | Edge counter for channel 1 (see the note below the table). |
10C4 | FREQ_CH2 | Edge counter for channel 2 (see the note below the table). |
10C8 | FREQ_CH3 | Edge counter for channel 3 (see the note below the table). |
10CC | FREQ_CH4 | Edge counter for channel 4 (see the note below the table). |
The FREQ_CH1 to FREQ_CH4 registers apparently count the number of rising (or falling?) clock edges on channels 1 to 4. The vendor software polls these counters every second to display a live frequency count for the first four channels.
It is unclear what time base is being used for those counters: The large error (sometimes by more than 20%, especially for CH1) hints at I/O latency, which would imply that the software resets the counters. However, manual testing of single register reads has shown that the values do not seem to scale with the time between reads. This would mean the device is using an internal time base after all. However, that makes the large error a bit hard to explain, especially since the error is different for each channel despite being driven by the same signal source.
Long Registers
These are separate 64-bit wide registers mainly used for capture control and status reporting. Single values can be read or written indirectly via special control registers. Commands 7 and 8 can be used to read or write multiple 64-bit values in a single command.
Index | Purpose | Description |
---|---|---|
0 | Channel enable mask | Bit mask of enabled channels. Bit 0 (LSB) corresponds to channel 1. |
1 | Clock divider count | Max value of the counter which divides the internal clock to yield the sampling clock, for sampling rates of 100 MS/s or less. The value is calculated as 1 / (samplefreq * 10ns) - 1. |
2 | Trigger level mask | Each bit corresponds to a channel, selecting whether to trigger on low/falling (0) or high/rising (1). |
3 | Trigger edge mask | Each bit corresponds to a channel, selecting whether to trigger on level (0) or edge (1). |
4 | Trigger enable mask | Each bit corresponds to a channel, selecting whether the trigger is enabled (1) or disabled (0). |
5 | Capture memory fill level | On write, the value limits the (compressed) size of the captured data. On read, the current memory fill level is returned. |
6 | Not used? | Apparently unused. |
7 | Running capture duration | Time passed since the first sample. For samplerates up to 100 MS/s, the value is a duration in milliseconds. At 125 MS/s, the value needs to be adjusted by 100/125 = 4/5. |
8 | Channel input state | Each bit corresponds to a channel, showing whether the input signal is currently low (0) or high (1). |
9 | Capture status flags | A bit vector of status flags, as described in the next section. Only the lowest 6 bits appear to be valid, the remaining ones may contain garbage. |
10 | Capture control | This appears to be a control register for starting and stopping data capture. |
100 | Test ID | When read, the fixed value 0x1234567887654321 is returned. This is used for a sanity check during initialization. |
The original vendor software uses the bulk read/write commands exclusively with the long registers 0 to 9. The long registers 10 and 100 are only ever accessed indirectly via the control registers 0x10B0 to 0x10BC.
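As a worked example of the two formulas in the table, the sketch below computes the clock divider count for long register 1 and converts the raw value of long register 7 into milliseconds. The 10 ns period corresponds to a 100 MHz base clock. Rejecting rates that do not divide 100 MHz evenly is a choice made for this example only, and "adjusted by 4/5" is read here as a multiplication; neither is confirmed by the protocol description.

```python
BASE_CLOCK_HZ = 100_000_000  # 1 / 10 ns, from the formula in the table


def clock_divider(samplerate_hz):
    """Value for long register 1: 1 / (samplefreq * 10 ns) - 1."""
    if not 0 < samplerate_hz <= BASE_CLOCK_HZ:
        raise ValueError("divider only applies to rates up to 100 MS/s")
    if BASE_CLOCK_HZ % samplerate_hz:
        # Assumption of this sketch: only integer divisions are accepted.
        raise ValueError("rate does not divide the base clock evenly")
    return BASE_CLOCK_HZ // samplerate_hz - 1


def duration_ms(raw_value, samplerate_hz):
    """Interpret long register 7 as milliseconds since the first sample."""
    if samplerate_hz == 125_000_000:
        return raw_value * 4 / 5  # assumed meaning of "adjusted by 4/5"
    return float(raw_value)
```

With these definitions, a sampling rate of 1 MS/s yields a divider count of 99, and 100 MS/s yields 0.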
Capture Status Flags
The status flag bits are used as follows:
Bit 0 | Bit 1 | Bit 2 | Bit 3 | Bit 4 | Bit 5 |
---|---|---|---|---|---|
??? | Capturing | ??? | ??? | Triggered | Memory available |
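A minimal sketch of decoding the status flags from long register 9 follows. The constant and function names are invented for this example; only the bit positions and the 6-bit validity mask come from the tables above.

```python
STATUS_CAPTURING = 1 << 1
STATUS_TRIGGERED = 1 << 4
STATUS_MEM_AVAIL = 1 << 5
STATUS_VALID_MASK = 0x3F  # only the lowest 6 bits appear to be valid


def decode_status(raw):
    """Return the known flags of long register 9 as a dict of booleans."""
    flags = raw & STATUS_VALID_MASK  # discard possible garbage in high bits
    return {
        "capturing": bool(flags & STATUS_CAPTURING),
        "triggered": bool(flags & STATUS_TRIGGERED),
        "memory_available": bool(flags & STATUS_MEM_AVAIL),
    }
```

The meaning of bits 0, 2 and 3 is unknown, so the sketch ignores them.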
Capture Control Bits
The capture control bits of long register 10 are assigned as follows:
Bit 0 | Bit 1 | Bit 2 | Bit 3 | Bit 4 | Bit 5 | Bit 6 |
---|---|---|---|---|---|---|
trg_en | | do_clr_timebase | | flush_fifo | clr_fifo32_ful | clr_cntr0 |
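The bit assignments can be written down as constants, as in the sketch below. It also shows that the value 0x74 written to long register 10 in the signal capture recipe is exactly the combination of the four clear/flush bits. The Python names are made up; the bit names are those from the table.

```python
CTRL_BITS = [
    ("trg_en", 1 << 0),
    ("do_clr_timebase", 1 << 2),
    ("flush_fifo", 1 << 4),
    ("clr_fifo32_ful", 1 << 5),
    ("clr_cntr0", 1 << 6),
]

# All clear/flush bits together: 0x04 | 0x10 | 0x20 | 0x40 = 0x74
CTRL_RESET = sum(bit for name, bit in CTRL_BITS if name != "trg_en")


def describe_ctrl(value):
    """List the names of the known control bits set in `value`."""
    return [name for name, bit in CTRL_BITS if value & bit]
```

Under this reading, writing 0x74 clears the capture state and the subsequent write of 1 sets only `trg_en`.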
Task Recipes
This section lists the commands issued by the software to perform a particular task.
Long Register Read
This sequence reads from a 64-bit wide internal memory, probably the same as that accessed by commands 7 and 8.
1. Write index to address register 0x10B4
2. Read dummy value from strobe register 0x10B0
3. Read high word from register 0x10BC
4. Read low word from register 0x10B8
Steps 3 and 4 appear to be interchangeable.
Long Register Write
This sequence writes to a 64-bit wide internal memory, probably the same as that accessed by commands 7 and 8.
1. Write index to address register 0x10B4
2. Write low word to register 0x10B8
3. Write high word to register 0x10BC
4. Write 0 (dummy value) to strobe register 0x10B0
Steps 2 and 3 appear to be interchangeable.
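The two long register recipes can be sketched as follows. The `dev` object is hypothetical: it is assumed to offer `read_reg(address)` and `write_reg(address, value)` methods that issue commands 0001 and 0002. The `FakeDevice` class is only a toy model for trying out the helpers and does not describe how the FPGA actually behaves.

```python
REG_LONG_STROBE = 0x10B0
REG_LONG_ADDR = 0x10B4
REG_LONG_LOW = 0x10B8
REG_LONG_HIGH = 0x10BC


def read_long_reg(dev, index):
    """Indirect 64-bit read, following the four steps of the read recipe."""
    dev.write_reg(REG_LONG_ADDR, index)
    dev.read_reg(REG_LONG_STROBE)  # dummy read acts as the strobe
    high = dev.read_reg(REG_LONG_HIGH)
    low = dev.read_reg(REG_LONG_LOW)
    return (high << 32) | low


def write_long_reg(dev, index, value):
    """Indirect 64-bit write, following the four steps of the write recipe."""
    dev.write_reg(REG_LONG_ADDR, index)
    dev.write_reg(REG_LONG_LOW, value & 0xFFFFFFFF)
    dev.write_reg(REG_LONG_HIGH, (value >> 32) & 0xFFFFFFFF)
    dev.write_reg(REG_LONG_STROBE, 0)  # dummy write acts as the strobe


class FakeDevice:
    """Toy stand-in for a real device (an assumption of this example)."""

    def __init__(self):
        self.regs = {}
        self.long_regs = {100: 0x1234567887654321}

    def write_reg(self, address, value):
        self.regs[address] = value
        if address == REG_LONG_STROBE:
            index = self.regs.get(REG_LONG_ADDR, 0)
            high = self.regs.get(REG_LONG_HIGH, 0)
            self.long_regs[index] = (high << 32) | self.regs.get(REG_LONG_LOW, 0)

    def read_reg(self, address):
        if address == REG_LONG_STROBE:
            value = self.long_regs.get(self.regs.get(REG_LONG_ADDR, 0), 0)
            self.regs[REG_LONG_LOW] = value & 0xFFFFFFFF
            self.regs[REG_LONG_HIGH] = value >> 32
            return 0
        return self.regs.get(address, 0)
```

Reading long register 100 through `read_long_reg` corresponds to the device test performed during initialization.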
Initialization
1. Acquire control of USB device and select configuration 1
2. Send FPGA bitstream (default: internal clock) to EP 4 via bulk transfer
3. Device test sequence:
   1. Read long register 100; ignore result
   2. Read long register 100; result should be 0x1234567887654321
4. Capture setup/state test (vendor software does this, not mandatory):
   1. Write sequence 0..9 to capture setup fields (via command 7)
   2. Read back capture state (via command 8): 0..4 should be read back as is, 5..9 are trashed anyway
5. Memory test (vendor software does this, not mandatory):
   1. Write 2 to register 0x1074
   2. Write 0 to register 0x107C
   3. Read memory beginning at address 0 in chunks of 120 36-bit words, up to but not including 0x013FB0
   4. Read memory beginning at address 0x013FB0 in chunks of 8 36-bit words, up to but not including 0x014000
It does not seem to be necessary to use exactly the same read chunk length as the original vendor software. See the description of command 6 for constraints.
Poll channel state
The vendor software continuously polls the channel state even when idle. However, it is not mandatory to do so.
1. Poll frequency of signal at CH1 to CH4:
   1. Read register 0x10C0: value is frequency of CH1 signal
   2. Read register 0x10C4: value is frequency of CH2 signal
   3. Read register 0x10C8: value is frequency of CH3 signal
   4. Read register 0x10CC: value is frequency of CH4 signal
2. Poll signal level of all channels:
   1. Read capture state (via command 8): the signal level of all channels is recorded in field 8
Clocking Mode Switch
1. Transfer one of three FPGA bitstreams to EP 4:
   - Configuration for internal clock
   - Configuration for external clock, rising edge
   - Configuration for external clock, falling edge
Signal Capture
1. Write 2 to register 0x1074
2. Write 1 to register 0x1074
3. Write 0x74 to long register 10
4. Write divider bypass flag to register 0x1094 (see description of register)
5. Write capture setup (command 7, address 0, length 10: see description of command)
6. Write 1 to long register 10
7. Wait for capture to finish:
   1. Poll capture state (command 8, address 0, length 10: see description of command)
   2. Report progress information (got trigger, cycles elapsed, memory fill percentage) to user
   3. Capture has finished once the memory available flag is reset
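The capture recipe above can be sketched in Python as shown below. The `dev` object and its methods (`write_reg`, `write_long_reg`, `write_long_regs`, `read_long_regs`) are hypothetical wrappers around commands 0002, 0007, 0008 and the indirect long register write; they are not part of any existing library. The polling interval is an arbitrary choice for this example.

```python
import time

REG_MEM_CTRL = 0x1074
REG_DIV_BYPASS = 0x1094
LREG_CAP_CTRL = 10
STATUS_MEM_AVAIL = 1 << 5  # "memory available" bit of long register 9


def start_capture(dev, setup, bypass):
    """Issue steps 1 to 6; `setup` holds the values for long registers 0..9."""
    if len(setup) != 10:
        raise ValueError("expected 10 capture setup values")
    dev.write_reg(REG_MEM_CTRL, 2)
    dev.write_reg(REG_MEM_CTRL, 1)
    dev.write_long_reg(LREG_CAP_CTRL, 0x74)
    dev.write_reg(REG_DIV_BYPASS, 1 if bypass else 0)
    dev.write_long_regs(0, setup)         # command 7, address 0, length 10
    dev.write_long_reg(LREG_CAP_CTRL, 1)


def wait_for_capture(dev, interval=0.05, sleep=time.sleep):
    """Step 7: poll until the memory available flag is reset."""
    while True:
        state = dev.read_long_regs(0, 10)  # command 8, address 0, length 10
        if not state[9] & STATUS_MEM_AVAIL:
            return state
        sleep(interval)
```

A real driver would also report progress from the polled state and would need a way to cancel the wait; both are left out here to keep the sketch short.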
Cancel Signal Capture
1. Write 0 to long register 10
2. Write 0 to register 0x1094 (divider bypass flag)
After that, the memory available flag in the capture state should have been cleared. Continue in the same manner as for a regularly finished capture.
Read Captured Data
1. Read register 0x1078: value is the number of 36-bit words in the capture buffer
2. Write 1 to register 0x1094 (divider bypass flag)
3. Write 2 to register 0x1074
4. Write 4 to register 0x107C
5. Read capture buffer:
   1. Read memory beginning at address 4 in chunks of 120 36-bit words, up to but not including 0x03FFC4
   2. Read memory beginning at address 0x03FFC4 in chunks of 8 36-bit words, up to but not including 0x03FFF4
6. Write 0 to register 0x1094 (divider bypass flag)
7. Issue command 5 with scrambled data to be verified by the device for copy protection
It does not seem to be necessary to use exactly the same read chunk length as the original vendor software. See the description of command 6 for constraints.
The original software does the copy protection check only for the first three captures. It is in fact possible to not send command 5 at all, which is exactly what the sigrok driver does.
Shutdown
1. Transfer FPGA configuration for device shutdown to EP 4
Latest revision as of 02:00, 31 October 2015
FPGA Configuration
The FPGA bitstream is loaded via bulk transfer to USB end point 4. Each firmware transfer starts with a 4-byte header to announce the transfer size. The payload appears to be a Raw Binary File (.rbf) with compression enabled.
Length | Payload... |
---|---|
nnnn-nnnn | dd... |
Unlike the control commands, the firmware transfer is apparently byte-based. The length is a byte count encoded in big endian (1-2-3-4) byte order, and includes the size of the length field (4 bytes) itself.
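The header layout can be illustrated with a short Python sketch. The function names are made up for this example; only the 4-byte big endian length that counts itself is taken from the description above.

```python
import struct


def frame_bitstream(payload):
    """Prepend the 4-byte big endian length header to a raw bitstream."""
    total = len(payload) + 4  # the byte count includes the header itself
    return struct.pack(">I", total) + payload


def check_frame(blob):
    """Return True if the announced length matches the actual blob size."""
    if len(blob) < 4:
        return False
    (announced,) = struct.unpack(">I", blob[:4])
    return announced == len(blob)
```

For a two-byte payload, the resulting transfer is six bytes long and starts with `00 00 00 06`.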
Application Behavior
The vendor software transfers a new bitstream to the FPGA
- on application start,
- when switching clocking mode between internal, external/rising or external/falling,
- on application exit.
The size of the bitstream varies due to compression, but is on the order of 50 kB to 80 kB. The vendor software splits the firmware transfer into packets with a 15-byte payload each, which makes the transfer rather slow inside VirtualBox. Testing with a libusb-based tool for issuing USB bulk transfers has shown that this splitting does not appear to be necessary: transferring even the entire firmware blob in a single bulk transfer appears to work without error.
Firmware Extraction
The firmware blobs can be extracted directly from the Windows installer executable located on the CD-ROM that ships with the device. The file lwla1034_EN_setup.exe on the CD-ROM from 2012-07-12 has the firmware blobs located at the following offsets:
Offset | Length | Mode |
---|---|---|
34110338 | 78398 | Internal clock |
34266237 | 78247 | External clock (rising edge) |
34344484 | 79145 | External clock (falling edge) |
34578631 | 48525 | Shutdown |
Both offsets and lengths are in bytes. The extracted blobs already include the header with the 32-bit length field.
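A minimal extraction sketch in Python, using the offsets from the table. The offsets apply only to the installer dated 2012-07-12. The check that the embedded header equals the blob length is a sanity check added for this example; it follows from the header counting itself and from the blobs including the header.

```python
import struct

# (offset, length, mode) as listed in the table above
FIRMWARE_BLOBS = [
    (34110338, 78398, "Internal clock"),
    (34266237, 78247, "External clock (rising edge)"),
    (34344484, 79145, "External clock (falling edge)"),
    (34578631, 48525, "Shutdown"),
]


def extract_blob(installer, offset, length):
    """Cut one firmware blob out of the installer image (a bytes object)."""
    blob = installer[offset:offset + length]
    if len(blob) != length:
        raise ValueError("installer image is too short")
    (announced,) = struct.unpack(">I", blob[:4])
    if announced != length:
        raise ValueError("length header does not match, wrong installer?")
    return blob
```

To use it, read `lwla1034_EN_setup.exe` into memory with `open(path, "rb").read()` and call `extract_blob` once per table row.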
Control Commands
Control commands are sent via bulk transfer to USB end point 2, with the response (if any) coming in from end point 6.
Command messages are sent to the device as a sequence of 16-bit words with little endian byte order. The first word in a message identifies the command type. Different command types have different message lengths. Some command types include a length field and allow for messages of variable length, others are of fixed size.
There are read commands which trigger an immediate response from the device, and write commands without a response.
Command 0001: Read Register
This command reads a 32-bit wide control register.
Command
Fixed length of 2 words (4 bytes).
ID | Address |
---|---|
0001 | aaaa |
Response
The response has a fixed length of 2 words (4 bytes). It is the content of a 32-bit register in mixed endian (2-1-4-3) byte order.
Command 0002: Write Register
This command writes a 32-bit value to a control register.
Command
Fixed length of 4 words (8 bytes).
ID | Address | Data |
---|---|---|
0002 | aaaa | dddd-dddd |
The value is encoded in mixed endian (2-1-4-3) byte order.
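The register commands can be assembled as in the following Python sketch. It assumes that the byte order notation counts from the most significant byte (1-2-3-4 being big endian, as stated for the firmware header), so that 2-1-4-3 amounts to two little endian 16-bit words with the high word first. The function names are invented for this example.

```python
import struct


def pack_u32(value):
    """Encode a 32-bit value in 2-1-4-3 byte order (assumed reading)."""
    return struct.pack("<HH", (value >> 16) & 0xFFFF, value & 0xFFFF)


def unpack_u32(data):
    """Decode a 4-byte register response in 2-1-4-3 byte order."""
    high, low = struct.unpack("<HH", data[:4])
    return (high << 16) | low


def read_reg_command(address):
    """Command 0001: 2 words (4 bytes)."""
    return struct.pack("<HH", 0x0001, address)


def write_reg_command(address, value):
    """Command 0002: 4 words (8 bytes)."""
    return struct.pack("<HH", 0x0002, address) + pack_u32(value)
```

As an example, `write_reg_command(0x1074, 2)` yields the eight bytes `02 00 74 10 00 00 02 00`.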
Command 0005: Copy Protection Check
This command writes 16 32-bit words which are then verified by the device for copy protection purposes. The vendor software issues this after a transfer of captured data to the host has finished, but only for the first three captures.
The values are apparently obtained by scrambling and/or hashing the captured data. The device functions correctly even without ever sending this command. For this reason, the LWLA1034 firmware distributed with sigrok implements the copy protection in the FPGA instead.
Command
Fixed length of 33 words (66 bytes).
ID | Data 1 | Data 2 | ... | Data 16 |
---|---|---|---|---|
0005 | dddd-dddd | dddd-dddd | ... | dddd-dddd |
The byte order of each 32-bit word is the usual 2-1-4-3 mixed endianness. It appears that the three high bytes of each word are always zero.
Command 0006: Read Memory at Address
This command reads a chunk of data from the device memory (SRAM). It allows for random access using a 32-bit start address, and for variable length via a 32-bit length field. The software uses this command to read captured data from the device's buffer.
Command
Fixed length of 5 words (10 bytes).
ID | Address | Length |
---|---|---|
0006 | aaaa-aaaa | nnnn-nnnn |
Both the address and the length are encoded in mixed endian (2-1-4-3) byte order.
Response
The memory is 36 bit wide, and thus the size of the response in bits is 36 times the value in the length field. The original vendor software reads chunks of 120 words @ 36 bit at a time, which works out to an integer multiple of 32 (i.e. 4320 bits = 135 32-bit words or 540 bytes). The final six reads are done in chunks of 8 words @ 36 bit, which works out to nine 32-bit words or 36 bytes. The overall amount of memory being read when fetching captured samples from a full buffer is just below the RAM size of 256k×36 bit.
Note that the software always starts reading at address 4 rather than 0. Presumably, the firmware uses the first four 36-bit words for internal bookkeeping or some other purpose. There is one exception: for some unknown reason (perhaps testing), the memory is also read on start-up, right after loading the firmware into the FPGA. In this case, a total of 128000 36-bit words are read, beginning at address 0.
Note that reading more than 1024 bytes at a time seems to be unreliable. Due to the constraints outlined in the following section, the maximum read length should therefore be restricted to 224 device words, which works out to 1008 bytes.
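The arithmetic above can be sketched in Python. The sketch builds the command 0006 message, computes the response size, and splits a read into chunks of at most 224 words. Rounding the total up to a multiple of eight words is a choice made for this example, based on the slice constraint described in the following section; the helper names are invented.

```python
import struct

SLICE_WORDS = 8        # 36-bit words per slice
MAX_CHUNK_WORDS = 224  # 224 * 36 / 8 = 1008 bytes, below the 1024 byte limit


def pack_u32(value):
    """Encode a 32-bit value in 2-1-4-3 byte order (assumed reading)."""
    return struct.pack("<HH", (value >> 16) & 0xFFFF, value & 0xFFFF)


def read_mem_command(address, count):
    """Command 0006: 5 words (10 bytes)."""
    return struct.pack("<H", 0x0006) + pack_u32(address) + pack_u32(count)


def response_size(count):
    """Size in bytes of the reply for `count` 36-bit words."""
    if count % SLICE_WORDS:
        raise ValueError("count must be a multiple of 8 words")
    return count * 36 // 8


def plan_reads(start, total, max_chunk=MAX_CHUNK_WORDS):
    """Return a list of (address, count) pairs covering `total` words."""
    remaining = -(-total // SLICE_WORDS) * SLICE_WORDS  # round up to slices
    address = start
    plan = []
    while remaining > 0:
        count = min(remaining, max_chunk)
        plan.append((address, count))
        address += count
        remaining -= count
    return plan
```

For the vendor chunk sizes, `response_size(120)` gives 540 bytes and `response_size(8)` gives 36 bytes, matching the figures above.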
36-to-32 Bit Mapping
The data returned by the read command consists of 32-bit words in 2-1-4-3 mixed endian byte order. Eight consecutive 36-bit words from the SRAM are mapped at a time to nine consecutive 32-bit words in the received stream. The first of these slices aligns to the beginning of the read-out stream, apparently irrespective of the absolute start address of the read operation.
The first eight 32-bit words in a slice contain the lower 32 bits of the eight encoded 36-bit words. The ninth 32-bit word contains the remaining four high bits of each of the eight 36-bit words. The high nibbles are shifted into the ninth word from right to left, resulting in a 1-2-3-4-5-6-7-8 order of nibbles. (This is after conversion from mixed endian byte order!)
As it is necessary to access the ninth 32-bit word in each slice even to fully extract the first 36-bit word, it follows that memory reads should always request a multiple of the slice length, i.e. eight 36-bit words. The length of the response will thus always be a multiple of nine 32-bit words (36 bytes). However, it does not appear to be necessary to restrict read operations to only the two lengths 120 and 8 used by the vendor software.
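The slice layout can be sketched in Python as follows. The sketch assumes that the 1-2-3-4-5-6-7-8 nibble order means the first 36-bit word's high nibble ends up in the most significant nibble of the ninth 32-bit word. This is one reading of the description above and should be verified against real captures. The function names are invented.

```python
import struct


def unpack_u32(data, offset=0):
    """Decode a 32-bit value from 2-1-4-3 byte order (assumed reading)."""
    high, low = struct.unpack_from("<HH", data, offset)
    return (high << 16) | low


def unpack_slice(slice_bytes):
    """Turn one 36-byte slice into eight 36-bit integers."""
    if len(slice_bytes) != 36:
        raise ValueError("a slice is nine 32-bit words (36 bytes)")
    words32 = [unpack_u32(slice_bytes, 4 * i) for i in range(9)]
    nibbles = words32[8]
    result = []
    for i in range(8):
        # Assumption: nibble of word i sits at bits (31 - 4i) .. (28 - 4i).
        high = (nibbles >> (28 - 4 * i)) & 0xF
        result.append((high << 32) | words32[i])
    return result


def unpack_response(data):
    """Unpack a whole read response consisting of complete slices."""
    if len(data) % 36:
        raise ValueError("response is not a multiple of the slice size")
    words = []
    for offset in range(0, len(data), 36):
        words.extend(unpack_slice(data[offset:offset + 36]))
    return words
```

Because the slices align to the start of the response, `unpack_response` can be applied to each read chunk on its own.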
Compression Scheme
The compression scheme is a form of run-length encoding. Very short run lengths of only one or two cycles are handled as special cases, which helps to keep the worst-case overhead of compression low. In particular, the scheme ensures that it is never necessary to write more than one 36-bit word per sampling cycle to the SRAM buffer.
Each 36-bit word in the stream is either a data word or a repeat half-count word. The first word in the stream is a data word, with the following layout:
Bit | 35 | 34 | 33 | 32 | 31 | ... | 2 | 1 | 0 |
---|---|---|---|---|---|---|---|---|---|
Meaning | Repeat count follows | LSB of repeat count | CH34 | CH33 | CH32 | ... | CH3 | CH2 | CH1 |
If bit 35 is set, then the next 36-bit word in the stream encodes the number of cycles the previous data word is repeated, divided by two. The actual number of repeat cycles is twice that number plus the LSB of repeat count, i.e. bit 34 from the data word. If bit 35 is not set the next word is again a data word and the repeat half-count is assumed as zero. However, the LSB of repeat count bit still applies; i.e. if it is set then the repeat count would be 1.
To recap, repeat counts of 0 and 1 are encoded within the channel data word itself, while larger repeat counts use up a full extra 36-bit word. The combined repeat count is thus 37 bits wide. At 125 MHz, this scheme would allow for encoding run lengths of more than 18 minutes. The vendor software does not make use of the full range, however: it apparently stops as soon as the count rolls over into bit 36, which cuts the maximum possible run length in half (about 9 minutes at 125 MHz). This detail makes no difference to the decompression algorithm.
The next word following a repeat half-count is again a data word. Note that it is possible for a sample/run-length pair to be split across a slice boundary, or even across successive read chunks. The decoder therefore needs to keep track of RLE state across slices as well as read operations.
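A decoder that carries its state across slices and read operations could look like the Python sketch below. It assumes that a repeat count of n means the sample occupies n + 1 sampling cycles in total (the sample itself plus n repetitions), which is an interpretation of the description above. The class name and output format are invented for this example.

```python
class RleDecoder:
    """Stateful decoder for the stream of 36-bit words."""

    CHANNEL_MASK = (1 << 34) - 1  # bits 0..33 carry CH1..CH34

    def __init__(self):
        # (sample, lsb) of a data word still waiting for its half-count
        self._pending = None

    def feed(self, words):
        """Decode 36-bit words; return a list of (sample, run_length)."""
        runs = []
        for word in words:
            if self._pending is not None:
                # This word is a repeat half-count for the previous sample.
                sample, lsb = self._pending
                self._pending = None
                runs.append((sample, 1 + 2 * word + lsb))
                continue
            sample = word & self.CHANNEL_MASK
            lsb = (word >> 34) & 1
            if (word >> 35) & 1:
                self._pending = (sample, lsb)  # half-count follows
            else:
                runs.append((sample, 1 + lsb))
        return runs
```

Since the pending sample is kept on the instance, the same decoder object can be fed one read chunk after another, even when a sample and its half-count land in different chunks.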
Command 0007: Write Long Registers
This command writes several long registers in bulk. The long registers are 64 bits wide and use their own address space, separate from the 32-bit control registers.
The vendor software issues this command once during start-up, and once for each capture operation as part of the setup sequence. It is also issued when clicking the Stop button to cancel a capture in progress.
Command
Variable length of 3 words (6 bytes) plus length × 4 words (8 bytes).
ID | Address | Length | Data | ... |
---|---|---|---|---|
0007 | aaaa | nnnn | dddd-dddd-dddd-dddd | ... |
The two argument words are the start address and the length of the slice to write, in little endian byte order. Both the address and the length are given in units of 64 bits (4 words or 8 bytes). Thus, if length is 10 the payload should consist of 40 words or 80 bytes.
The vendor software always writes a slice of length 10 beginning at address 0, thus completely resetting both the capture configuration as well as the capture status. See the table of long registers for a description of the configuration and status fields.
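As an illustration of the message layout, the following sketch assembles a command 7 message. It rests on an assumption that this section does not confirm: that each 64-bit data value is written in the same byte order in which command 8 returns it (6-5-8-7-2-1-4-3, counting bytes from most significant to least significant). The function names are hypothetical.

```python
import struct


def pack_long(value):
    """Pack a 64-bit value as four little-endian 16-bit words.

    Assumption: same word order as in command 8 replies, i.e.
    bits 31..16, bits 15..0, bits 63..48, bits 47..32.
    """
    words = [(value >> shift) & 0xFFFF for shift in (16, 0, 48, 32)]
    return struct.pack('<4H', *words)


def build_write_long_regs(address, values):
    """Build a command 7 message: ID, start address, length, then data."""
    header = struct.pack('<3H', 0x0007, address, len(values))
    return header + b''.join(pack_long(v) for v in values)
```

A full reset as issued by the vendor software would then be `build_write_long_regs(0, setup)` with a list of 10 values, giving a message of 6 + 80 bytes.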
Command 0008: Read Long Registers
This command reads several long registers in bulk. The long registers are 64 bits wide and use their own address space, separate from the 32-bit control registers.
The software uses this command mainly to read the block of capture status registers. During idle periods, the original vendor software polls the channel state about 34 times per second for its live port status display. During a capture operation, it is necessary to poll the status in order to find out whether the capture buffer has been filled completely and samples can be retrieved.
Command
Fixed length of 3 words (6 bytes).
ID | Address | Length |
---|---|---|
0008 | aaaa | nnnn |
The two argument words are the start address and the length of the slice to read, in little endian byte order. Both the address and the length are given in units of 64 bits (4 words or 8 bytes). Thus, if length is 10 the reply will consist of 40 words or 80 bytes.
Although the vendor software always reads the full 10 fields of configuration and status information, this does not seem to be required. The sigrok driver restricts its reads to fields 5 to 9 and has not run into any problems so far.
Response
Each returned 64-bit word arrives in a scrambled byte order (6-5-8-7-2-1-4-3). See the table of long registers for the meaning of the values at each address.
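The request and reply formats can be sketched as follows. The byte order is interpreted here under the assumption that bytes are numbered from most significant (1) to least significant (8); this is the reading that fits a stream of little-endian 16-bit words, but it remains an assumption. The function names are made up for illustration.

```python
import struct


def build_read_long_regs(address, length):
    """Build a command 8 message: ID, start address, length."""
    return struct.pack('<3H', 0x0008, address, length)


def parse_long_regs(reply):
    """Convert a command 8 reply into a list of 64-bit integers."""
    if len(reply) % 8:
        raise ValueError('reply is not a whole number of 64-bit values')
    values = []
    # Each value arrives as four little-endian 16-bit words in the order
    # bits 31..16, bits 15..0, bits 63..48, bits 47..32 (assumed reading
    # of the 6-5-8-7-2-1-4-3 byte order).
    for w0, w1, w2, w3 in struct.iter_unpack('<4H', reply):
        values.append((w2 << 48) | (w3 << 32) | (w0 << 16) | w1)
    return values
```

For a request with length 5, the caller would expect a reply of 40 bytes and get back a list of five integers.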
Control Registers
The device exposes a number of 32-bit wide registers accessed via commands 1 and 2. Some of the register addresses appearing in the protocol also occur in the LWLA1016 protocol, although it seems that their purpose may not be the same. Other registers appear to be specific to the LWLA1034.
Address | Name | Description |
---|---|---|
1074 | MEM_CTRL | Control register for capture memory access. |
1078 | MEM_FILL | Compressed size of captured data (number of 36-bit words). |
107C | MEM_ADDR? | Not clear if this is an address or another control register, or even if it is needed at all. |
1090 | TEST? | Writing 1 to this register apparently enables some sort of test mode. |
1094 | DIV_BYPASS | This is set to 0 when using the internal clock with sampling rates of 100 MHz and below. For 125 MHz internal clock or the external clock modes, it is set to 1. |
10B0 | LONG_STROBE | Long register read/write strobe. |
10B4 | LONG_ADDR | Long register address. |
10B8 | LONG_LOW | Long register low word. |
10BC | LONG_HIGH | Long register high word. |
10C0 | FREQ_CH1 | Edge counter for channel 1; see the note below the table. |
10C4 | FREQ_CH2 | Edge counter for channel 2; see the note below the table. |
10C8 | FREQ_CH3 | Edge counter for channel 3; see the note below the table. |
10CC | FREQ_CH4 | Edge counter for channel 4; see the note below the table. |

The FREQ_CH1 to FREQ_CH4 registers apparently count the number of rising (or falling?) clock edges on channels 1 to 4. The vendor software polls these counters every second to display a live frequency count for the first four channels. It is unclear what time base is being used for those counters: the large error (sometimes by more than 20%, especially for CH1) hints at I/O latency, which would imply that the software resets the counters. However, manual testing of single register reads has shown that the values do not seem to scale with the time between reads. This would mean the device is using an internal time base after all. That in turn makes the large error hard to explain, especially since the error is different for each channel despite all channels being driven by the same signal source.
Long Registers
These are separate 64-bit wide registers mainly used for capture control and status reporting. Single values can be read or written indirectly via special control registers. Commands 7 and 8 can be used to read or write multiple 64-bit values in a single command.
Index | Purpose | Description |
---|---|---|
0 | Channel enable mask | Bit mask of enabled channels. Bit 0 (LSB) corresponds to channel 1. |
1 | Clock divider count | Max value of the counter which divides the internal clock to yield the sampling clock, for sampling rates of 100 MS/s or less. The value is calculated as 1 / (samplefreq * 10ns) - 1. |
2 | Trigger level mask | Each bit corresponds to a channel, selecting whether to trigger on low/falling (0) or high/rising (1). |
3 | Trigger edge mask | Each bit corresponds to a channel, selecting whether to trigger on level (0) or edge (1). |
4 | Trigger enable mask | Each bit corresponds to a channel, selecting whether the trigger is enabled (1) or disabled (0). |
5 | Capture memory fill level | On write, the value limits the (compressed) size of the captured data. On read, the current memory fill level is returned. |
6 | Not used? | Apparently unused. |
7 | Running capture duration | Time passed since the first sample. For samplerates up to 100 MS/s, the value is a duration in milliseconds. At 125 MS/s, the value needs to be adjusted by 100/125 = 4/5. |
8 | Channel input state | Each bit corresponds to a channel, showing whether the input signal is currently low (0) or high (1). |
9 | Capture status flags | A bit vector of status flags, as described in the next section. Only the lowest 6 bits appear to be valid, the remaining ones may contain garbage. |
10 | Capture control | This appears to be a control register for starting and stopping data capture. |
100 | Test ID | When read, the fixed value 0x1234567887654321 is returned. This is used for a sanity check during initialization. |
The original vendor software uses the bulk read/write commands exclusively with the long registers 0 to 9. The long registers 10 and 100 are only ever accessed indirectly via the control registers 0x10B0 to 0x10BC.
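The arithmetic for long registers 1 and 7 can be written out as below. This is a sketch under two assumptions: that only sample rates dividing 100 MHz evenly are used, and that "adjusted by 4/5" means the raw value is multiplied by 4/5. The function names are hypothetical.

```python
BASE_CLOCK_HZ = 100_000_000  # 1 / 10 ns


def clock_divider_count(samplerate_hz):
    """Value for long register 1: 1 / (samplefreq * 10 ns) - 1."""
    if not 0 < samplerate_hz <= BASE_CLOCK_HZ:
        raise ValueError('divider only applies up to 100 MS/s')
    if BASE_CLOCK_HZ % samplerate_hz:
        # Assumption: only integer divisions of the 100 MHz clock are valid.
        raise ValueError('sample rate must divide 100 MHz evenly')
    return BASE_CLOCK_HZ // samplerate_hz - 1


def capture_duration_ms(raw_value, samplerate_hz):
    """Interpret long register 7 as a duration in milliseconds."""
    if samplerate_hz == 125_000_000:
        return raw_value * 4 / 5  # assumed: scale by 100/125
    return raw_value
```

For example, a sample rate of 1 MS/s gives a divider count of 99, and 100 MS/s gives 0.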
Capture Status Flags
The status flag bits are used as follows:
Bit 0 | Bit 1 | Bit 2 | Bit 3 | Bit 4 | Bit 5 |
---|---|---|---|---|---|
??? | Capturing | ??? | ??? | Triggered | Memory available |
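A small helper for picking the known flags out of long register 9 might look like this. It is only a sketch: the names are invented here, and the bits marked ??? in the table are simply ignored.

```python
STATUS_CAPTURING = 1 << 1
STATUS_TRIGGERED = 1 << 4
STATUS_MEM_AVAILABLE = 1 << 5
STATUS_VALID_MASK = 0x3F  # only the lowest 6 bits appear to be valid


def decode_status(raw):
    """Return the known capture status flags as a dict of booleans."""
    flags = raw & STATUS_VALID_MASK  # discard possible garbage in upper bits
    return {
        'capturing': bool(flags & STATUS_CAPTURING),
        'triggered': bool(flags & STATUS_TRIGGERED),
        'memory_available': bool(flags & STATUS_MEM_AVAILABLE),
    }
```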
Capture Control Bits
The capture control bits of long register 10 are assigned as follows:
Bit 0 | Bit 1 | Bit 2 | Bit 3 | Bit 4 | Bit 5 | Bit 6 |
---|---|---|---|---|---|---|
trg_en | | do_clr_timebase | | flush_fifo | clr_fifo32_ful | clr_cntr0 |
Task Recipes
This section lists the commands issued by the software to perform a particular task.
Long Register Read
This sequence reads from a 64-bit wide internal memory, probably the same as that accessed by commands 7 and 8.
1. Write index to address register 0x10B4
2. Read dummy value from strobe register 0x10B0
3. Read high word from register 0x10BC
4. Read low word from register 0x10B8
Steps 3 and 4 appear to be interchangeable.
Long Register Write
This sequence writes to a 64-bit wide internal memory, probably the same as that accessed by commands 7 and 8.
1. Write index to address register 0x10B4
2. Write low word to register 0x10B8
3. Write high word to register 0x10BC
4. Write 0 (dummy value) to strobe register 0x10B0
Steps 2 and 3 appear to be interchangeable.
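The indirect read and write sequences above can be sketched in terms of the 32-bit register commands. The `dev` object and its `read_reg`/`write_reg` methods are hypothetical stand-ins for whatever code issues commands 1 and 2; they are not part of the protocol.

```python
REG_LONG_STROBE = 0x10B0
REG_LONG_ADDR = 0x10B4
REG_LONG_LOW = 0x10B8
REG_LONG_HIGH = 0x10BC


def read_long_reg(dev, index):
    """Read one 64-bit long register via the control registers."""
    dev.write_reg(REG_LONG_ADDR, index)
    dev.read_reg(REG_LONG_STROBE)          # dummy read acts as strobe
    high = dev.read_reg(REG_LONG_HIGH)
    low = dev.read_reg(REG_LONG_LOW)
    return (high << 32) | low


def write_long_reg(dev, index, value):
    """Write one 64-bit long register via the control registers."""
    dev.write_reg(REG_LONG_ADDR, index)
    dev.write_reg(REG_LONG_LOW, value & 0xFFFFFFFF)
    dev.write_reg(REG_LONG_HIGH, (value >> 32) & 0xFFFFFFFF)
    dev.write_reg(REG_LONG_STROBE, 0)      # dummy write acts as strobe
```

With such helpers, the sanity check from the initialization sequence amounts to comparing `read_long_reg(dev, 100)` against 0x1234567887654321.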
Initialization
- Acquire control of USB device and select configuration 1
- Send FPGA bitstream (default: internal clock) to EP 4 via bulk transfer
- Device test sequence:
  - Read long register 100; ignore result
  - Read long register 100; result should be 0x1234567887654321
- Capture setup/state test (vendor software does this, not mandatory):
  - Write sequence 0..9 to capture setup fields (via command 7)
  - Read back capture state (via command 8): fields 0..4 should read back unchanged, fields 5..9 are overwritten anyway
- Memory test (vendor software does this, not mandatory):
  - Write 2 to register 0x1074
  - Write 0 to register 0x107C
  - Read memory beginning at address 0 in chunks of 120 36-bit words, up to but not including 0x013FB0
  - Read memory beginning at address 0x013FB0 in chunks of 8 36-bit words, up to but not including 0x014000
It does not seem to be necessary to use exactly the same read chunk length as the original vendor software. See the description of command 6 for constraints.
Poll channel state
The vendor software continuously polls the channel state even when idle. However, it is not mandatory to do so.
- Poll frequency of signal at CH1 to CH4:
  - Read register 0x10C0: value is frequency of CH1 signal
  - Read register 0x10C4: value is frequency of CH2 signal
  - Read register 0x10C8: value is frequency of CH3 signal
  - Read register 0x10CC: value is frequency of CH4 signal
- Poll signal level of all channels:
  - Read capture state (via command 8): the signal level of all channels is recorded in field 8
Clocking Mode Switch
- Transfer one of three FPGA bitstreams to EP 4:
  - Configuration for internal clock
  - Configuration for external clock, rising edge
  - Configuration for external clock, falling edge
Signal Capture
- Write 2 to register 0x1074
- Write 1 to register 0x1074
- Write 0x74 to long register 10
- Write divider bypass flag to register 0x1094 (see description of register)
- Write capture setup (command 7, address 0, length 10: see description of command)
- Write 1 to long register 10
- Wait for capture to finish:
  - Poll capture state (command 8, address 0, length 10: see description of command)
  - Report progress information (got trigger, cycles elapsed, memory fill percentage) to user
  - Capture has finished once the memory available flag is reset
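The capture sequence above could be driven by code along the following lines. This is a sketch only: `dev` is a hypothetical object whose methods wrap the register commands, the indirect long register write, and commands 7 and 8. Progress reporting is left out, and the polling interval is an arbitrary choice.

```python
import time

STATUS_MEM_AVAILABLE = 1 << 5  # bit 5 of long register 9


def run_capture(dev, setup, bypass_divider, poll_interval=0.05):
    """Start a capture and block until the device reports it as finished.

    `setup` is the list of 10 long register values written via command 7.
    Returns the last capture state read via command 8.
    """
    dev.write_reg(0x1074, 2)
    dev.write_reg(0x1074, 1)
    dev.write_long_reg(10, 0x74)
    dev.write_reg(0x1094, 1 if bypass_divider else 0)
    dev.write_long_regs(0, setup)           # command 7, address 0
    dev.write_long_reg(10, 1)               # start capturing
    while True:
        state = dev.read_long_regs(0, 10)   # command 8, address 0, length 10
        if not state[9] & STATUS_MEM_AVAILABLE:
            return state                    # flag reset: capture finished
        time.sleep(poll_interval)
```

A real driver would more likely poll from an event loop than block in `time.sleep`, and would use fields 5, 7 and 9 of each polled state to report progress.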
Cancel Signal Capture
- Write 0 to long register 10
- Write 0 to register 0x1094 (divider bypass flag)
After that, the memory available flag in the capture state should have been cleared. Continue in the same manner as for a regularly finished capture.
Read Captured Data
- Read register 0x1078: value is the number of 36-bit words in the capture buffer
- Write 1 to register 0x1094 (divider bypass flag)
- Write 2 to register 0x1074
- Write 4 to register 0x107C
- Read capture buffer:
  - Read memory beginning at address 4 in chunks of 120 36-bit words, up to but not including 0x03FFC4
  - Read memory beginning at address 0x03FFC4 in chunks of 8 36-bit words, up to but not including 0x03FFF4
- Write 0 to register 0x1094 (divider bypass flag)
- Issue command 5 with scrambled data to be verified by the device for copy protection
It does not seem to be necessary to use exactly the same read chunk length as the original vendor software. See the description of command 6 for constraints.
The original software does the copy protection check only for the first three captures. It is in fact possible to not send command 5 at all, which is exactly what the sigrok driver does.
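For reference, the read pattern of the vendor software can be reproduced with a small chunk planner. The addresses and chunk sizes are the ones listed above; the function name and its parameters are made up for illustration, and the sketch says nothing about how each chunk is actually requested via command 6.

```python
def plan_reads(start, switch, end, big=120, small=8):
    """Yield (address, word_count) pairs covering start..end (end exclusive).

    Chunks of `big` words are used below `switch`, chunks of `small`
    words from `switch` onwards. Counts are in 36-bit words.
    """
    addr = start
    while addr < end:
        if addr < switch:
            count = min(big, switch - addr)
        else:
            count = min(small, end - addr)
        yield addr, count
        addr += count


# Read pattern used by the vendor software for the capture buffer:
capture_chunks = list(plan_reads(4, 0x03FFC4, 0x03FFF4))
```

With the vendor's boundaries, both ranges divide evenly into their chunk size, so no short chunk occurs at either boundary.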
Shutdown
1. Transfer FPGA configuration for device shutdown to EP 4