Difference between revisions of "Protocol decoder API"

From sigrok
(Added tags)
 
(69 intermediate revisions by 8 users not shown)
Line 1: Line 1:
This page describes how Protocol Decoders (PDs) work in sigrok.
This page describes how [[libsigrokdecode]] '''Protocol Decoders ([[Protocol decoders|PDs]])''' work.
 
See also [[Protocol decoder HOWTO]] for a quick introduction of how to write your own decoders.
 
See [[Protocol decoder API/Queries]] for changes to the decoder API in version 3. All decoders use this API now and the v2 API is no longer supported.


== Architecture ==
== Architecture ==


The frontend gets input from the user on which PDs to use in an acquisition session. It then configures these into the session with '''<tt>session_pd_add()</tt>'''. As the first PD is added, the session sets up an additional datafeed callback to itself, which it uses as input to the first PD in the stack. The output of that is sent to the frontend, along with its original datafeed, as well as fed into the next PD in the stack.
* All PDs are written in Python (>= 3.0).
* Every PD registers its name, description, capabilities, etc.
* PDs can be stacked, so the user can construct a decoding pipeline/stack. The control of communication to/from PDs is done by the backend code in libsigrokdecode.
* The sample data passed into the PDs will be streamed/chunked, so they can run in real time as the data comes in from the hardware (or from a file).
* In order to keep PDs simple, they don't have to deal with the intricacies of the datafeed packets.


The frontend thus gets the raw datafeed as well as a feed from every PD in the stack. Which of these different feeds is actually displayed to the user is a matter of configuration or selection by the user; it should be possible, for example, to have [[Command-line|sigrok-cli]] print only the top of the PD stack's output on stdout.
The frontend passes sample data into libsigrokdecode and gets decoder output (of various types) from every PD in the stack. Which of these output types of which PDs are actually displayed to the user is a matter of configuration or selection by the user; it is possible, for example, to have [[sigrok-cli]] print only the top of the PD stack's annotation output on stdout.


* All PDs are written in Python.
== API ==
* Every PD registers its name, description, capabilities, etc by populating a hash (dictionary) in its own main namespace called <tt>register</tt>.
* PDs will be stacked together, so the user can construct a decoding pipeline. The control of communication to/from PDs is done by the PD controller code in libsigrokdecode.
* The data feed into the PDs will be streamed, so they will run in real time as the data comes in from the hardware (or from a file).
* In order to keep PDs simple, they don't have to deal with the intricacies of the datafeed packets. Instead, the PD controller will hide the details from the Python code:
** When receiving a DF_HEADER packet going to a PD, the controller intercepts the packet and instead generates a DF_HEADER "coming from" that PD across the session bus, with that PD's output characteristics (which it derived from the PD's register dictionary).
** Data packets get translated into a bytestream, which the PDs access through an API: the function '''<tt>get()</tt>''' is a blocking call into the controller, which only returns when a datafeed packet has arrived, and its payload queued up for the PD.
** DF_END packets are translated into an EOF on the '''<tt>get()</tt>''' call.


== API ==
=== Backend library ===


A module called '''<tt>sigrok</tt>''' is provided which contains functions and definitions for PDs to use.
A Python module called '''<tt>sigrokdecode</tt>''' is provided. Every protocol decoder must import this. It contains the following items:


* '''<tt>get_meta()</tt>''': returns a dictionary with meta information about the stream.
* '''<tt>the Decoder object</tt>'''
<blockquote>
<blockquote>
The information in this dictionary will not change during the stream; it is essentially fixed from the start. The following keys may be provided:
Every protocol decoder must subclass this object.
{| border="0" style="font-size: smaller"
</blockquote>
|- bgcolor="#6699ff"
!Key
!Guaranteed
!Description


|- bgcolor="#eee"
* '''<tt>OUTPUT_ANN</tt>'''
| <tt>driver</tt>
<blockquote>
| yes
A constant used to register annotation output, used as argument to the '''<tt>[[#register-function|register()]]</tt>''' function. By default, [[Sigrok-cli|sigrok-cli]] shows the annotation output of a decoder stack's topmost decoder; [[PulseView]] shows annotation output as graphical boxes, or as circles if the duration of the annotation is zero.
| The name of the hardware driver which is feeding this data into the module. When input comes from a sigrok session file, the name of the driver originally used will be given.
</blockquote>
* '''<tt>OUTPUT_PYTHON</tt>'''
<blockquote>
A constant used to register Python output, used as argument to the '''<tt>[[#register-function|register()]]</tt>''' function. Python output is passed as input to a decoder that is stacked onto the current decoder. The format of the data that is given to the '''<tt>[[#put-function|put()]]</tt>''' function is specific to a certain PD and should be documented for the authors of the higher level decoders, for example with a comment at the top of the decoder's source file.
</blockquote>
* '''<tt>OUTPUT_BINARY</tt>'''
<blockquote>
A constant used to register binary output, used as argument to the '''<tt>[[#register-function|register()]]</tt>''' function. The format of the output data is not specified; it is up to the author of the decoder to choose one (or multiple) appropriate format(s). For example, the [[Protocol_decoder:Uart|UART]] decoder outputs the raw bytes that it decodes and the [[Protocol_decoder:I2s|I²S]] decoder outputs the audio in WAV format, but the output could also be an image (JPG, PNG, other) file for a decoder that decodes a display protocol, a PCAP file for network/USB decoders, or one of [[Protocol decoder output|many other]] formats. [[Sigrok-cli|sigrok-cli]] can redirect the binary output of a decoder into a file (or pipe it into other applications); see the documentation of its '''--protocol-decoder-binary''' ('''-B''') option.
</blockquote>
* '''<tt>OUTPUT_META</tt>'''
<blockquote>
A constant used to register metadata output, used as argument to the '''<tt>[[#register-function|register()]]</tt>''' function. An example for a PD that outputs metadata is the SPI decoder that uses it to output the detected bitrate. See [[Protocol decoder output]] for various other possible examples.
</blockquote>


|- bgcolor="#ddd"
* <div id="put-function">'''<tt>put(startsample, endsample, output_id, data)</tt>'''</div>
| <tt>unitsize</tt>
<blockquote>
| yes
This is used to provide the decoded data back into the backend. '''<tt>startsample</tt>''' and '''<tt>endsample</tt>''' specify the absolute sample numbers of where this item (e.g. an annotation) starts and ends. '''<tt>output_id</tt>''' is an output identifier returned by the '''<tt>[[#register-function|register()]]</tt>''' function.
| The size of an item of data, in bytes. This corresponds to the length of each array in the list returned by the get() function.
The '''<tt>data</tt>''' parameter's contents depend on the output type ('''<tt>output_id</tt>'''):
* '''OUTPUT_ANN''': The '''<tt>data</tt>''' parameter is a Python list with two items. The first item is the annotation index (determined by the order of items in '''<tt>Decoder.annotations</tt>''', see [[#Decoder_registration|below]]), the second is a list of annotation strings. The strings should be longer and shorter versions of the same annotation text (sorted by length, longest first), which can be used by frontends to show different annotation texts depending on e.g. zoom level.
** Example: '''<tt>self.put(10, 20, self.out_ann, [4, ['Start', 'St', 'S']])</tt>'''
*** The emitted data spans samples 10 to 20, is of type OUTPUT_ANN, the annotation index is 4, the list of annotation strings is "Start", "St", "S".
** Example: '''<tt>self.put(10, 20, self.out_ann, [4, ['CRC']])</tt>'''
*** The emitted data spans samples 10 to 20, is of type OUTPUT_ANN, the annotation index is 4, the list of annotation strings is just "CRC" (the list contains only one item).
** Example: '''<tt>self.put(35, 9000, self.out_ann, [17, ['Registered Parameter Number', 'Reg Param Num', 'RPN', 'R']])</tt>'''
*** The emitted data spans samples 35 to 9000, is of type OUTPUT_ANN, the annotation index is 17, the list of annotation strings is "Registered Parameter Number", "Reg Param Num", "RPN", "R".
* '''OUTPUT_PYTHON''': The '''<tt>data</tt>''' parameter is any arbitrary Python object that will be passed to stacked decoders. The format and contents are entirely decoder-dependent. Typically a Python list with various contents is passed to the stacked PDs.
** Example: '''<tt>self.put(10, 20, self.out_python, ['PACKET', ['Foo', 19.7, [1, 2, 3], ('bar', 'baz')]])</tt>'''
*** The emitted data spans samples 10 to 20, is of type OUTPUT_PYTHON, the data contents themselves are entirely dependent on the respective decoder and should be documented in its [[Protocol_decoder_HOWTO#pd.py|pd.py]] file.
* '''OUTPUT_BINARY''': The '''<tt>data</tt>''' parameter is a Python list with two items. The first item is the binary format's index (determined by the order of items in '''<tt>Decoder.binary</tt>''', see [[#Decoder_registration|below]]), the second is a Python [https://docs.python.org/3/library/stdtypes.html#typebytes '''<tt>bytes</tt>'''] object.
** Example: '''<tt>self.put(10, 20, self.out_binary, [4, b'\xfe\x55\xaa'])</tt>'''
*** The emitted data spans samples 10 to 20, is of type OUTPUT_BINARY, the binary format's index is 4, the emitted bytes are 0xfe, 0x55, 0xaa.
* '''OUTPUT_META''': The '''<tt>data</tt>''' parameter is a Python object of a certain type, as defined in the respective '''<tt>[[#register-function|register()]]</tt>''' function.
** Example: '''<tt>self.put(10, 20, self.out_meta, 15.7)</tt>'''
*** The emitted data spans samples 10 to 20, is of type OUTPUT_META, the data itself is a floating point number in this case.
** Example: '''<tt>self.put(10, 20, self.out_meta, 42)</tt>'''
*** The emitted data spans samples 10 to 20, is of type OUTPUT_META, the data itself is an integer number in this case.
</blockquote>
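The OUTPUT_ANN data layout described above can be sketched in plain Python. The helper name <tt>make_annotation</tt> is hypothetical and not part of sigrokdecode; it only builds the two-item list that would be passed as the <tt>data</tt> argument of <tt>put()</tt>, ordering the strings longest first.

```python
def make_annotation(index, texts):
    """Build the [index, [strings]] list used as OUTPUT_ANN data.

    Hypothetical helper for illustration; not part of sigrokdecode.
    """
    if not texts:
        raise ValueError("at least one annotation string is required")
    # Frontends pick a string that fits the available space, so order
    # the variants from longest to shortest.
    ordered = sorted(texts, key=len, reverse=True)
    return [index, ordered]


data = make_annotation(4, ['St', 'Start', 'S'])
print(data)
```

Inside a decoder, the resulting list would then be handed to the backend, for example with <tt>self.put(10, 20, self.out_ann, data)</tt>.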


|- bgcolor="#eee"
=== Decoder class functions ===
| <tt>starttime</tt>
| yes
| The date/time at which the acquisition started, expressed as a floating point number representing seconds since the epoch. Python's '''time''' module can be used to convert this to e.g. a printable string.


|- bgcolor="#ddd"
==== Required functions ====
| <tt>probes</tt>
| yes
| A list containing probe names, in the order in which they will be returned from '''<tt>get()</tt>'''. The length of this list corresponds to the exact number of probes that were enabled in the acquisition.


|- bgcolor="#eee"
* '''<tt>start(self)</tt>'''
| <tt>samplerate</tt>
<blockquote>
| yes
This function is called before the beginning of the decoding. This is the place to '''<tt>[[#register-function|register()]]</tt>''' the output types, check the user-supplied PD options for validity, and so on.
| The samplerate at which the acquisition is done, specified in Hz.
</blockquote>


|}
* '''<tt>reset(self)</tt>'''
<blockquote>
This function is called before the beginning of the decoding. This is the place to reset variables internal to your protocol decoder to their initial state, such as state machines and counters.
</blockquote>
</blockquote>


* '''<tt>get()</tt>''': returns a dictionary with the following keys:
* <div id="decode-and-wait-function-">'''<tt>decode(self)</tt>'''</div>
** <tt>time</tt>: the time when this set of samples was captured, expressed as an offset in picoseconds since the session's starttime as returned by the meta() function.
** <tt>duration</tt>: the time it took, in picoseconds, for all the data in this set to be captured.
** <tt>data</tt>: a list containing items of data, such as samples or output from protocol decoders lower on the stack. Each item consists of an [http://docs.python.org/py3k/library/array.html array] of type '''B''', which is an unsigned char. The length of this array corresponds with the <tt>unitsize</tt> value returned by the <tt>meta()</tt> call.
 
* '''<tt>put()</tt>''':
<blockquote>
<blockquote>
This is used to provide the decoded data back into the backend. It takes a single argument, which is in the same form as that returned by the <tt>get()</tt> function.
'''In non-stacked decoders''', this function is called by the [[libsigrokdecode]] backend to start the decoding.


Since a frontend uses the time and duration parameters of the various PDs to align their output with each other, it is vital that the <tt>time</tt> and <tt>duration</tt> keys be correct. If the PD was not able to handle all the data it received from the <tt>get()</tt> function, the duration key must be recalculated to take this into account. The next batch of data must then have its <tt>time</tt> parameter adjusted accordingly.
It takes no arguments. Instead, it enters an infinite loop and gets samples by calling the more versatile '''<tt>[[Protocol_decoder_API/Queries#self.wait()|wait()]]</tt>''' method. This frees protocol decoders from tedious yet common tasks like detecting edges, or sampling signals at specific points in time relative to the current position.


The <tt>data</tt> parameter's arrays must be of whatever length is appropriate to this PD's output. In other words, it must match the PD's native unitsize.
'''Note:''' This '''<tt>[[Protocol_decoder_API/Queries#self.decode()|decode(self)]]</tt>''' method signature was introduced in version 3 of the protocol decoder API. In previous versions, only '''<tt>decode(self, startsample, endsample, data)</tt>''' was available.
</blockquote>
</blockquote>


== Module structure ==
* <div id="decode-function">'''<tt>decode(self, startsample, endsample, data)</tt>'''</div>
 
<blockquote>
A PD must contain at least a populated dictionary '''register''' which defines the name, capabilities etc of the PD. The following keys can be used:
'''In stacked decoders''', this is a function that is called by the [[libsigrokdecode]] backend whenever it has a chunk of data for the protocol decoder to handle.


<blockquote>
{| border="0" style="font-size: smaller;" class="alternategrey sortable sigroktable"
{| border="0" style="font-size: smaller"
|-
|- bgcolor="#6699ff"
!style="width: 8em;" | Argument
!Key
!Type
!Description
!Description


|- bgcolor="#6699ff"
|-
| colspan="3" | '''Protocol Decoder information'''
| '''<tt>startsample</tt>'''
| The absolute samplenumber of the first sample in this chunk of data.


|- bgcolor="#eee"
|-
| '''<tt>id</tt>'''
| '''<tt>endsample</tt>'''
| required
| The absolute samplenumber of the last sample in this chunk of data.
| A short unique identifier for this protocol decoder. It should be all-lowercase, and only contains a-z, 0-9 and underscores. The [[Command-line|sigrok-cli]] tool uses this to specify PDs.


|- bgcolor="#ddd"
|-
| '''<tt>description</tt>'''
| '''<tt>data</tt>'''
| required
| A list containing the data to decode. Depending on whether the decoder decodes raw samples or is stacked onto another decoder, this argument is:
| A freeform one-line description of the decoder. Used when listing available PDs.
* Raw samples ('''<tt>inputs = ['logic']</tt>'''):
  <blockquote>
  '''<tt>data</tt>''' is a list of tuples containing the (absolute) sample number and the channels of that sample: '''<tt>[(samplenum, channels), (samplenum, channels), ...]</tt>'''.<br />
  '''<tt>samplenum</tt>''' is the (absolute) number of the sample, an integer that takes the values from '''<tt>startsample</tt>''' to '''<tt>endsample</tt>''' - 1.<br />
  The type of '''<tt>channels</tt>''' is [https://docs.python.org/3/library/stdtypes.html#typebytes '''<tt>bytes</tt>'''],
  a sequence type whose length is the sum of the lengths of '''<tt>channels</tt>''' and '''<tt>optional_channels</tt>'''
  (in other words, '''<tt>channels</tt>''' contains a byte for every channel/optional channel).<br />
  The order of the bytes is the same as the order of the channels in '''<tt>channels</tt>''' and '''<tt>optional_channels</tt>'''.<br>
  The individual bytes take the values '''<tt>0</tt>''' or '''<tt>1</tt>''', or some other value for optional channels that aren't supplied to the decoder.<br>
  The [[Protocol_decoder_HOWTO#channels_.26_optional_channels|Protocol decoder HOWTO]] page contains an example of how the data can be processed.
  </blockquote>


|- bgcolor="#eee"
* Stacked decoder ('''<tt>inputs = ['</tt>'''<id of some other decoder>'''<tt>']</tt>'''):
| '''<tt>author</tt>'''
  <blockquote>
| optional
  '''<tt>data</tt>''' is the '''<tt>OUTPUT_PYTHON</tt>''' output of the decoder this PD is stacked upon.
| Name and optionally email address of the author of this PD.
  Its format depends on the implementation of the underlying decoder and should be documented there.
  </blockquote>
|}


|- bgcolor="#ddd"
</blockquote>
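As a rough illustration of the raw-sample format described above, the following sketch walks a list of <tt>(samplenum, channels)</tt> tuples and reports rising edges on one channel. The function name <tt>find_rising_edges</tt> and the sample values are made up for this example and are not part of sigrokdecode.

```python
def find_rising_edges(data, channel):
    """Return sample numbers where `channel` goes from 0 to 1.

    `data` is a list of (samplenum, channels) tuples, where `channels`
    is a bytes object holding one byte per channel.
    Hypothetical helper for illustration; not part of sigrokdecode.
    """
    edges = []
    previous = None
    for samplenum, channels in data:
        level = channels[channel]  # indexing bytes yields an int
        if previous == 0 and level == 1:
            edges.append(samplenum)
        # Only track valid logic levels; any other value marks an
        # optional channel that was not supplied.
        previous = level if level in (0, 1) else None
    return edges


# Made-up chunk covering samples 100 to 104 with two channels.
chunk = [(100, b'\x00\x01'), (101, b'\x01\x01'),
         (102, b'\x01\x00'), (103, b'\x00\x00'), (104, b'\x01\x00')]
print(find_rising_edges(chunk, 0))
```

Tracking the previous level across calls (rather than resetting it per chunk) would be needed in a real decoder, since an edge can fall on a chunk boundary.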
| '''<tt>version</tt>'''
| optional
| Version of this PD.


|- bgcolor="#eee"
==== Optional functions ====
| '''<tt>apiversion</tt>'''
| required
| The sigrok API version which this module uses. This is currently 1.


|- bgcolor="#ddd"
* '''<tt>metadata(self, key, value)</tt>'''
| '''<tt>license</tt>'''
<blockquote>
| required
Used to pass the decoder metadata about the data stream. Currently the only value for '''<tt>key</tt>''' is '''<tt>sigrokdecode.SRD_CONF_SAMPLERATE</tt>''', '''<tt>value</tt>''' is then the sample rate of the data stream in Hz.  
| The license under which the module is provided. This must be either 'gplv2+' (meaning the GNU General Public License 2 or later), or 'gplv3+' (GNU General Public License 3 or later). No other licenses for modules are permitted in sigrok.


|- bgcolor="#eee"
This function can be called multiple times, so make sure your protocol decoder handles this correctly! Do not place statements in there that depend on metadata to be called only once.
| '''<tt>in</tt>'''
</blockquote>
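A minimal sketch of a <tt>metadata()</tt> handler that tolerates repeated calls is shown below. So that it runs outside libsigrokdecode, it uses a placeholder constant instead of the real <tt>sigrokdecode.SRD_CONF_SAMPLERATE</tt>, and the class name <tt>ExampleDecoder</tt> is hypothetical.

```python
# Stand-in for sigrokdecode.SRD_CONF_SAMPLERATE so the sketch runs
# outside libsigrokdecode; the value here is only a placeholder.
SRD_CONF_SAMPLERATE = 'samplerate-key'


class ExampleDecoder:
    """Hypothetical decoder fragment showing only metadata()."""

    def __init__(self):
        self.samplerate = None

    def metadata(self, key, value):
        # May be called more than once: just overwrite the stored value
        # and avoid one-time setup work here.
        if key == SRD_CONF_SAMPLERATE:
            self.samplerate = value


dec = ExampleDecoder()
dec.metadata(SRD_CONF_SAMPLERATE, 1_000_000)
dec.metadata(SRD_CONF_SAMPLERATE, 2_000_000)
print(dec.samplerate)
```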
| required
| The type of input this decoder needs. If the decoder takes input from a logic analyzer driver, this should be set to 'logic', which maps to DF_LOGIC, the datafeed type. If it takes input from another PD, it should be set to the value of the 'out' key of that PD. It should conform to the same rules as the '''id''' key.


|- bgcolor="#ddd"
=== Decoder registration ===
| '''<tt>out</tt>'''
| optional
| If this decoder can feed decoded data back into the datafeed stream, its output will be identified with this key's value. It should conform to the same rules as the `id` key. If not specified, this decoder cannot feed data back into the stream. This typically means it only does analysis on the whole stream, producing a report at the end of acquisition.


|- bgcolor="#6699ff"
A PD's '''Decoder''' class must contain a few attributes specifying metadata about the PD. The following keys can be used:
| colspan="3" | '''Available functions'''


|- bgcolor="#eeeeee"
<blockquote>
| '''<tt>start</tt>'''
{| border="0" style="font-size: smaller;" class="alternategrey sortable sigroktable"
| required
|-
| Reference to a Python function that will be called to start the decoder. The PD controller will call this function when the first packet with real data in it comes in on the stream. See below for arguments to this function.
!style="width: 8em;" | Key
!Description


|- bgcolor="#dddddd"
|-
| '''<tt>report</tt>'''
| '''<tt>api_version</tt>'''
| optional
| The libsigrokdecode API version which this module uses. This is currently either 2 or 3.
| If given, this is a reference to a function which returns a report in freeform text, describing some analysis done on the stream. For example, PDs for counting logic level transitions, bit rate analysis and so on would use this to report back. This function is typically only called at the end of the stream, so the whole stream is analyzed. However, the function may be called at any time -- possibly giving an analysis of data received up until that point.


|- bgcolor="#6699ff"
|-
| colspan="3" | '''Protocol Decoder configuration'''
| '''<tt>id</tt>'''
| A short unique identifier for this protocol decoder. It should be all-lowercase, and only contains a-z, 0-9 and underscores. This must match the PD's Python module name (subdirectory name in the '''decoders''' directory). The [[sigrok-cli]] tool uses this to specify PDs on the command-line. Examples: 'jtag', 'sdcard_spi', 'uart'.


|- bgcolor="#eeeeee"
|-
| '''<tt>probes</tt>'''
| '''<tt>name</tt>'''
| optional
| The name of the decoder. Used when listing available PDs. Examples: 'JTAG', 'SD card (SPI mode)', 'UART'.
| For the 'logic' input type this is a required key. It lists the probes (pins) the logic analyzer should feed to this PD, in order for it to make sense of the data. For example, an SPI decoder has to know which probe has the clock, which has the chip select, and so on. This key contains a list of probe entries, where each entry can be either a string with the probe name (`"SCLK"`) or a list with the probe name and description, such as `["SCLK", "Clock"]`.


|- bgcolor="#dddddd"
|-
| '''<tt>options</tt>'''
| '''<tt>longname</tt>'''
| optional
| The (long) name of the decoder. Used when listing available PDs. Examples: 'Joint Test Action Group (IEEE 1149.1)', 'Secure Digital card (SPI mode)', 'Universal Asynchronous Receiver/Transmitter'.
| A dictionary with options for this decoder. The keys should follow the same rules as for the `id` key above, and each value is a list consisting of a short freeform description of the option, and the default value for that option. For example, an SPI decoder might have an entry with key `cpol`, and value `['Clock polarity', 0]`.


<!--
|-
|- bgcolor="#eeeeee"
| '''<tt>desc</tt>'''
|
| A freeform one-line description of the decoder. Used when listing available PDs. Should end with a full stop. Example: 'Protocol for testing, debugging, and flashing ICs.', 'Secure Digital card (SPI mode) low-level protocol.', 'Asynchronous, serial bus.'.
|  
|  


|- bgcolor="#dddddd"
|-
|  
| '''<tt>license</tt>'''
|  
| The license under which the module is provided. This must be either '''<tt>gplv2+</tt>''' (meaning the GNU General Public License 2 or later), or '''<tt>gplv3+</tt>''' (GNU General Public License 3 or later). No other licenses for modules are permitted in libsigrokdecode.
|
-->


|}
|-
| '''<tt>inputs</tt>'''
| The list of types of input this decoder needs. If the decoder takes input from a logic analyzer driver, this should be set to '''<tt>logic</tt>''', which maps to SR_DF_LOGIC, the datafeed type. If it takes input from another PD, it should be set to the value of the '''<tt>outputs</tt>''' key of that PD. It should conform to the same rules as the '''<tt>id</tt>''' key (lowercase, no spaces, and so on).


</blockquote>
|-
| '''<tt>outputs</tt>'''
| The list of types of output this decoder produces. If this decoder can feed decoded data back into the datafeed stream, its outputs will be identified with this key's value. It should conform to the same rules as the '''<tt>id</tt>''' key.


Here's an example for an SPI decoder:
|-
| '''<tt>channels</tt>'''
| This key contains information about the channels (pins) that '''must''' be provided to this PD; the PD will not be able to work without them. For example, the [[Protocol_decoder:Spi|SPI]] decoder has to know which channel has the clock signal. This key contains a tuple of channel entries, where each entry is a Python dict with the keys '''id''', '''name''', and '''desc'''. Example: '''<tt>{'id': 'rx', 'name': 'RX', 'desc': 'UART receive line'}</tt>'''.


register = {
|-
  'id': 'spi',
| '''<tt>optional_channels</tt>'''
  'description': 'SPI',
| The channels the PD can make use of, but are not strictly required. The key has the same format as that of the <tt>channels</tt> key above (a tuple of dicts). This tuple is allowed to be empty if the respective protocol decoder has no optional channels.
  'author': 'sigrok project',
  'license': 'gplv3+',
  'in': 'logic',
  'out': 'spi',
  'probes': [
    ['sclk', 'Clock, also known as sck or clk'],
    'mosi',
    'miso',
    ['ncs', 'Chip Select, also known as cs'],
  ],
  'options': {
    'cpol': ['Clock polarity', 0],
    'cpha': ['Clock phase', 0],
    'wordsize': ['Word size (in bits)', 8],
  },
  'start': spi_start,
  'report': report_stats
}


The <tt>start</tt> function is called by libsigrokdecode at the start of a session which has this PD in its pipeline. It should implement a loop calling '''<tt>get()</tt>''' and '''<tt>put()</tt>''', and only return when an EOF is detected in that loop.
|-
| '''<tt>options</tt>'''
| A tuple describing the options for this decoder. Each tuple entry is a Python dict with the keys '''id''', '''desc''', '''default''', and '''values'''. Example: '''<tt>{'id': 'bitorder', 'desc': 'Bit order', 'default': 'msb-first', 'values': ('msb-first', 'lsb-first')}</tt>'''. This tuple can be empty, if the PD has no options.


== File structure ==
|-
| '''<tt>annotations</tt>'''
| A list of annotation classes this protocol decoder can output. Elements of this list are tuples consisting of an identifier string and a human readable description string. The identifier string can be used in the options of [[sigrok-cli]] to select the specific annotation type, and should therefore not contain whitespace or special characters.


==== Code ====
|-
| '''<tt>annotation_rows</tt>'''
| Annotation rows are used to group multiple annotation types together. The elements of this list are three element tuples consisting of:
* An annotation row ID (same naming rules as for other IDs).
* A human readable name/description string for the annotation row.
* A tuple containing the indices of the annotation classes in the '''<tt>annotations</tt>''' tuple.
See the [[Protocol_decoder_HOWTO#annotations_.26_annotation_rows|example on the Protocol decoder HOWTO page]] for more information on this attribute.


A protocol decoder always has its own directory, named after the PD's ID, corresponding to the <tt>id</tt> field in the <tt>register</tt> dictionary. The directory should have an <tt>__init__.py</tt> file in it. As in any other Python module, this file is responsible for importing and organizing the rest of the module's code into the namespace.
|-
| '''<tt>binary</tt>'''
| A list of binary output types this protocol decoder can output, same format as the '''<tt>annotations</tt>''' list.


The <tt>register</tt> dictionary is the only symbol that absolutely MUST be in the module's main namespace, since that's where sigrok expects to find it. Here's an example <tt>__init__.py</tt>:
|-
| '''<tt>tags</tt>'''
| A list of strings naming the categories under which this protocol decoder is listed (e.g. when adding a decoder in PulseView). See [[Protocol_decoders]] for a list of the categories currently in use.


  import foo
|}
 
  register = {
    'id': 'spi',
  [...]
    'start': foo.start
  }


==== Source code and copyright ====
</blockquote>
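The attributes in the table above might be combined as in the following sketch. The decoder ID <tt>example</tt>, all names and descriptions, and the <tt>tags</tt> value are made up for illustration, and <tt>DecoderBase</tt> is a stand-in for <tt>sigrokdecode.Decoder</tt> so that the snippet runs outside libsigrokdecode.

```python
class DecoderBase:
    """Stand-in for sigrokdecode.Decoder so the sketch runs anywhere."""


class Decoder(DecoderBase):
    api_version = 3
    id = 'example'            # hypothetical; must match the directory name
    name = 'Example'
    longname = 'Example protocol'
    desc = 'Made-up protocol used for illustration.'
    license = 'gplv2+'
    inputs = ['logic']
    outputs = ['example']
    tags = ['Example']        # placeholder category
    channels = (
        {'id': 'clk', 'name': 'CLK', 'desc': 'Clock line'},
        {'id': 'data', 'name': 'DATA', 'desc': 'Data line'},
    )
    optional_channels = ()    # may be empty
    options = (
        {'id': 'bitorder', 'desc': 'Bit order', 'default': 'msb-first',
         'values': ('msb-first', 'lsb-first')},
    )
    annotations = (
        ('bit', 'Bit'),       # annotation index 0
        ('byte', 'Byte'),     # annotation index 1
    )
    annotation_rows = (
        ('bits', 'Bits', (0,)),
        ('bytes', 'Bytes', (1,)),
    )
    binary = (
        ('raw', 'Raw bytes'),
    )


# Sanity check: every row index must refer to an existing annotation.
valid = all(i < len(Decoder.annotations)
            for _, _, indices in Decoder.annotation_rows
            for i in indices)
print(valid)
```

In a real decoder the class would derive from <tt>sigrokdecode.Decoder</tt> instead of the stand-in base class.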


The module MUST come with source code in the form of .py files. No pre-compiled code should be present, python or otherwise. The module must not use any helpers that are not provided as source code under the same license as the module itself.
* <div id="register-function">'''<tt>register(output_type)</tt>'''</div>
 
<blockquote>
The <tt>register</tt> struct must have a license declaration (see above), stating the license under which all the contents in the module directory are provided.
This function is used to register the output that will be generated by the decoder. Its argument should be one of the '''<tt>OUTPUT_...</tt>''' constants described above. The function returns an identifier that can then be used as the '''<tt>output_id</tt>''' argument of the '''<tt>[[#put-function|put()]]</tt>''' function.
 
</blockquote>
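The interplay of <tt>register()</tt> and <tt>put()</tt> can be sketched with a small stand-in backend. <tt>FakeBackend</tt>, its integer identifiers, the placeholder <tt>OUTPUT_ANN</tt> value and the <tt>emit_start</tt> method are assumptions made so the example runs without libsigrokdecode; the real backend's identifiers should be treated as opaque.

```python
OUTPUT_ANN = 0  # placeholder; real decoders use sigrokdecode.OUTPUT_ANN


class FakeBackend:
    """Stand-in for sigrokdecode.Decoder that records put() calls."""

    def __init__(self):
        self._outputs = []
        self.emitted = []

    def register(self, output_type):
        # Hand back an identifier to be used in later put() calls.
        self._outputs.append(output_type)
        return len(self._outputs) - 1

    def put(self, startsample, endsample, output_id, data):
        self.emitted.append((startsample, endsample, output_id, data))


class Decoder(FakeBackend):
    annotations = (('start', 'Start condition'),)

    def start(self):
        # Register outputs once, before decoding begins.
        self.out_ann = self.register(OUTPUT_ANN)

    def emit_start(self, ss, es):
        # Annotation index 0 refers to the first entry in annotations.
        self.put(ss, es, self.out_ann, [0, ['Start', 'St', 'S']])


dec = Decoder()
dec.start()
dec.emit_start(10, 20)
print(dec.emitted)
```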
==== Example/test files ====
 
Every protocol decoder module MUST come with example input and output files, that put the decoder through its paces. They must have the following names (in case of a PD called '''foo'''):
 
* '''foo.in''': example input as it would be received from sigrok, either raw samples from a hardware device, or decoded data from an upstream PD. In other words, the data in this file corresponds in format to the <tt>in</tt> field of this PD's <tt>register</tt> struct. Just as with live data in a real session, the module should not assume it will be fed this file at any particular speed, or in same-sized chunks.


* '''foo.out''': this must correspond '''exactly''' with what this PD will output, having been fed the input file above. No more, no less.
See [[Protocol decoder HOWTO#pd.py|pd.py]] for an example.


These two files must be present in the module's main directory.
[[Category:APIs]]

Latest revision as of 15:14, 13 November 2020

This page describes how libsigrokdecode Protocol Decoders (PDs) work.

See also Protocol decoder HOWTO for a quick introduction of how to write your own decoders.

See Protocol decoder API/Queries for changes to the decoder API in version 3. All decoders use this API now and the v2 API is no longer supported.

Architecture

  • All PDs are written in Python (>= 3.0).
  • Every PD registers its name, description, capabilities, etc.
  • PDs can be stacked, so the user can construct a decoding pipeline/stack. The control of communication to/from PDs is done by the backend code in libsigrokdecode.
  • The sample data passed into the PDs will be streamed/chunked, so they can run in real time as the data comes in from the hardware (or from a file).
  • In order to keep PDs simple, they don't have to deal with the intricacies of the datafeed packets.

The frontend passes sample data into libsigrokdecode and gets decoder output (of various types) from every PD in the stack. Which of these output types of which PDs are actually displayed to the user is a matter of configuration or selection by the user; it is possible, for example, to have sigrok-cli print only the top of the PD stack's annotation output on stdout.

API

Backend library

A Python module called sigrokdecode is provided. Every protocol decoder must import this. It contains the following items:

  • the Decoder object

Every protocol decoder must subclass this object.

  • OUTPUT_ANN

A constant used to register annotation output, used as argument to the register() function. By default, sigrok-cli shows the annotation output of a decoder stack's topmost decoder; PulseView shows annotation output as graphical boxes, or as circles if the duration of the annotation is zero.

  • OUTPUT_PYTHON

A constant used to register Python output, used as argument to the register() function. Python output is passed as input to a decoder that is stacked onto the current decoder. The format of the data that is given to the put() function is specific to a certain PD and should be documented for the authors of the higher level decoders, for example with a comment at the top of the decoder's source file.

  • OUTPUT_BINARY

A constant used to register binary output, used as argument to the register() function. The format of the output data is not specified; it is up to the author of the decoder to choose one (or multiple) appropriate format(s). For example, the UART decoder outputs the raw bytes that it decodes and the I²S decoder outputs the audio in WAV format, but the output could also be an image (JPG, PNG, other) file for a decoder that decodes a display protocol, a PCAP file for network/USB decoders, or one of many other formats. sigrok-cli can redirect the binary output of a decoder into a file (or pipe it into other applications); see the documentation of its --protocol-decoder-binary (-B) option.

  • OUTPUT_META

A constant used to register metadata output, used as argument to the register() function. An example of a PD that outputs metadata is the SPI decoder, which uses it to output the detected bitrate. See Protocol decoder output for various other possible examples.

  • put(startsample, endsample, output_id, data)

This is used to provide the decoded data back into the backend. startsample and endsample specify the absolute sample numbers of where this item (e.g. an annotation) starts and ends. output_id is an output identifier returned by the register() function. The data parameter's contents depend on the output type (output_id):

  • OUTPUT_ANN: The data parameter is a Python list with two items. The first item is the annotation index (determined by the order of items in Decoder.annotations, see below), the second is a list of annotation strings. The strings should be longer and shorter versions of the same annotation text (sorted by length, longest first), which can be used by frontends to show different annotation texts depending on e.g. zoom level.
    • Example: self.put(10, 20, self.out_ann, [4, ['Start', 'St', 'S']])
      • The emitted data spans samples 10 to 20, is of type OUTPUT_ANN, the annotation index is 4, the list of annotation strings is "Start", "St", "S".
    • Example: self.put(10, 20, self.out_ann, [4, ['CRC']])
      • The emitted data spans samples 10 to 20, is of type OUTPUT_ANN, the annotation index is 4, the list of annotation strings is just "CRC" (the list contains only one item).
    • Example: self.put(35, 9000, self.out_ann, [17, ['Registered Parameter Number', 'Reg Param Num', 'RPN', 'R']])
      • The emitted data spans samples 35 to 9000, is of type OUTPUT_ANN, the annotation index is 17, the list of annotation strings is "Registered Parameter Number", "Reg Param Num", "RPN", "R".
  • OUTPUT_PYTHON: The data parameter is any arbitrary Python object that will be passed to stacked decoders. The format and contents are entirely decoder-dependent. Typically a Python list with various contents is passed to the stacked PDs.
    • Example: self.put(10, 20, self.out_python, ['PACKET', ['Foo', 19.7, [1, 2, 3], ('bar', 'baz')]])
      • The emitted data spans samples 10 to 20, is of type OUTPUT_PYTHON, the data contents themselves are entirely dependent on the respective decoder and should be documented in its pd.py file.
  • OUTPUT_BINARY: The data parameter is a Python list with two items. The first item is the binary format's index (determined by the order of items in Decoder.binary, see below), the second is a Python bytes object.
    • Example: self.put(10, 20, self.out_binary, [4, b'\xfe\x55\xaa'])
      • The emitted data spans samples 10 to 20, is of type OUTPUT_BINARY, the binary format's index is 4, the emitted bytes are 0xfe, 0x55, 0xaa.
  • OUTPUT_META: The data parameter is a Python object of a certain type, as defined in the respective register() function.
    • Example: self.put(10, 20, self.out_meta, 15.7)
      • The emitted data spans samples 10 to 20, is of type OUTPUT_META, the data itself is a floating point number in this case.
    • Example: self.put(10, 20, self.out_meta, 42)
      • The emitted data spans samples 10 to 20, is of type OUTPUT_META, the data itself is an integer number in this case.
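The list layouts for OUTPUT_ANN and OUTPUT_BINARY can be assembled with small helpers. The following is a minimal sketch only: ann_data() and bin_data() are hypothetical helper names that are not part of the sigrokdecode module, and a decoder would pass their return values as the data argument of put().

```python
def ann_data(ann_index, *texts):
    """Build the two-item list that put() expects for OUTPUT_ANN."""
    if not texts:
        raise ValueError('at least one annotation string is required')
    # Longest text first, so frontends can fall back to shorter
    # versions depending on e.g. the zoom level.
    ordered = sorted(texts, key=len, reverse=True)
    return [ann_index, ordered]


def bin_data(format_index, payload):
    """Build the two-item list that put() expects for OUTPUT_BINARY."""
    # The second item must be a Python bytes object.
    return [format_index, bytes(payload)]


print(ann_data(4, 'S', 'Start', 'St'))  # [4, ['Start', 'St', 'S']]
```

Inside a decoder this could be used as self.put(10, 20, self.out_ann, ann_data(4, 'S', 'Start', 'St')).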

Decoder class functions

Required functions

  • start(self)

This function is called before the beginning of the decoding. This is the place to register() the output types, check the user-supplied PD options for validity, and so on.

  • reset(self)

This function is called before the beginning of the decoding. This is the place to reset variables internal to your protocol decoder to their initial state, such as state machines and counters.

  • decode(self)

In non-stacked decoders, this function is called by the libsigrokdecode backend to start the decoding.

It takes no arguments; instead, it enters an infinite loop and obtains samples by calling the more versatile wait() method. This frees specific protocol decoders from tedious yet common tasks like detecting edges, or sampling signals at specific points in time relative to the current position.

Note: This decode(self) signature was introduced in version 3 of the protocol decoder API. Previous versions only provided decode(self, startsample, endsample, data).
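The loop structure of decode(self) can be sketched as follows. So that the example runs outside libsigrokdecode, StubDecoder is a simplified stand-in for the real Decoder base class: its constructor argument, its restriction to a single rising-edge condition, and its use of EOFError at the end of the sample list are assumptions of this sketch. The condition syntax ({0: 'r'}) and self.samplenum follow the description on the Protocol decoder API/Queries page.

```python
class StubDecoder:
    """Simplified stand-in for sigrokdecode.Decoder (not the real class)."""

    def __init__(self, samples):
        self._samples = samples  # one tuple of pin values per sample
        self._pos = 0            # index of the next sample to inspect
        self.samplenum = 0

    def wait(self, conds):
        # This stub supports only one {channel: 'r'} (rising edge) condition.
        (channel, kind), = conds.items()
        assert kind == 'r'
        while self._pos < len(self._samples):
            i = self._pos
            self._pos += 1
            if (i > 0 and self._samples[i - 1][channel] == 0
                    and self._samples[i][channel] == 1):
                self.samplenum = i
                return self._samples[i]
        raise EOFError  # stub-only signal that the input is exhausted


class Decoder(StubDecoder):
    def __init__(self, samples):
        super().__init__(samples)
        self.reset()

    def reset(self):
        self.edges = []

    def decode(self):
        while True:
            # Block until the next rising edge on channel 0.
            pins = self.wait({0: 'r'})
            self.edges.append((self.samplenum, pins))


d = Decoder([(0, 0), (1, 0), (1, 1), (0, 1), (1, 0)])
try:
    d.decode()
except EOFError:
    pass
print(d.edges)  # [(1, (1, 0)), (4, (1, 0))]
```

A real decoder would typically call put() inside the loop instead of collecting the edges in a list.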

  • decode(self, startsample, endsample, data)

In stacked decoders, this is a function that is called by the libsigrokdecode backend whenever it has a chunk of data for the protocol decoder to handle.

Argument Description
startsample The absolute sample number of the first sample in this chunk of data.
endsample The absolute sample number of the last sample in this chunk of data.
data A list containing the data to decode. Depending on whether the decoder decodes raw samples or is stacked onto another decoder, this argument is:
  • Raw samples (inputs = ['logic']):

data is a list of tuples containing the (absolute) sample number and the channels of that sample: [(samplenum, channels), (samplenum, channels), ...].
samplenum is the (absolute) number of the sample, an integer that takes the values from startsample to endsample - 1.
The type of channels is bytes, a sequence type whose length is the sum of the lengths of channels and optional_channels (in other words, channels contains a byte for every channel/optional channel).
The order of the bytes is the same as the order of the channels in channels and optional_channels.
The individual bytes take the values 0 or 1, or some other value for optional channels that aren't supplied to the decoder.
The Protocol decoder HOWTO page contains an example of how the data can be processed.

  • Stacked decoder (inputs = ['<id of some other decoder>']):

data is the OUTPUT_PYTHON output of the decoder this PD is stacked upon. Its format depends on the implementation of the underlying decoder and should be documented there.
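For the raw-sample case, the (samplenum, channels) layout described above can be walked as in the following sketch. rising_edges() is a hypothetical helper, and the sample values are made up for illustration; the sketch only relies on channels being a bytes object with one byte per channel.

```python
def rising_edges(data, channel):
    """Return the sample numbers where `channel` changes from 0 to 1."""
    edges = []
    previous = None
    for samplenum, channels in data:
        level = channels[channel]  # indexing a bytes object yields an int
        # Values other than 0 or 1 (optional channels that are not
        # supplied) never match, so they are ignored here.
        if previous == 0 and level == 1:
            edges.append(samplenum)
        previous = level
    return edges


# Two channels, five samples starting at absolute sample number 100.
data = [
    (100, bytes([0, 1])),
    (101, bytes([1, 1])),
    (102, bytes([1, 0])),
    (103, bytes([0, 0])),
    (104, bytes([1, 0])),
]
print(rising_edges(data, 0))  # [101, 104]
```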

Optional functions

  • metadata(self, key, value)

Used to pass the decoder metadata about the data stream. Currently the only value for key is sigrokdecode.SRD_CONF_SAMPLERATE; value is then the sample rate of the data stream in Hz.

This function can be called multiple times, so make sure your protocol decoder handles this correctly! Do not put statements there that rely on metadata() being called only once.
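One way to stay safe against repeated calls is to only overwrite state in metadata(), never accumulate it. The sketch below is self-contained and therefore uses a placeholder constant instead of sigrokdecode.SRD_CONF_SAMPLERATE; the 9600 baud rate and the samples_per_bit attribute are hypothetical.

```python
# Placeholder so the sketch runs standalone; a real decoder compares
# against sigrokdecode.SRD_CONF_SAMPLERATE instead.
SRD_CONF_SAMPLERATE = 10000

BAUDRATE = 9600  # hypothetical fixed bit rate, for illustration only


class Decoder:
    def __init__(self):
        self.samplerate = None
        self.samples_per_bit = None

    def metadata(self, key, value):
        if key != SRD_CONF_SAMPLERATE:
            return  # ignore keys this decoder does not know
        # Overwrite instead of accumulating, so a second call simply
        # replaces the earlier values.
        self.samplerate = value
        self.samples_per_bit = value / BAUDRATE


d = Decoder()
d.metadata(SRD_CONF_SAMPLERATE, 1000000)
d.metadata(SRD_CONF_SAMPLERATE, 1000000)  # repeated call is harmless
print(d.samplerate)  # 1000000
```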

Decoder registration

A PD's Decoder class must contain a few attributes specifying metadata about the PD. The following keys can be used:

Key Description
api_version The libsigrokdecode API version which this module uses. This is currently 3 (the v2 API is no longer supported).
id A short unique identifier for this protocol decoder. It should be all-lowercase, and contain only a-z, 0-9 and underscores. This must match the PD's Python module name (subdirectory name in the decoders directory). The sigrok-cli tool uses this to specify PDs on the command-line. Examples: 'jtag', 'sdcard_spi', 'uart'.
name The name of the decoder. Used when listing available PDs. Examples: 'JTAG', 'SD card (SPI mode)', 'UART'.
longname The (long) name of the decoder. Used when listing available PDs. Examples: 'Joint Test Action Group (IEEE 1149.1)', 'Secure Digital card (SPI mode)', 'Universal Asynchronous Receiver/Transmitter'.
desc A freeform one-line description of the decoder. Used when listing available PDs. Should end with a full stop. Examples: 'Protocol for testing, debugging, and flashing ICs.', 'Secure Digital card (SPI mode) low-level protocol.', 'Asynchronous, serial bus.'.
license The license under which the module is provided. This must be either gplv2+ (meaning the GNU General Public License 2 or later), or gplv3+ (GNU General Public License 3 or later). No other licenses for modules are permitted in libsigrokdecode.
inputs The list of types of input this decoder needs. If the decoder takes input from a logic analyzer driver, this should be set to logic, which maps to SR_DF_LOGIC, the datafeed type. If it takes input from another PD, it should be set to the value of the outputs key of that PD. It should conform to the same rules as the id key (lowercase, no spaces, and so on).
outputs The list of types of output this decoder produces. If this decoder can feed decoded data back into the datafeed stream, its outputs will be identified with this key's value. It should conform to the same rules as the id key.
channels This key contains information about the channels (pins) that must be provided to this PD; the PD will not be able to work without them. For example, the SPI decoder has to know which channel has the clock signal. This key contains a tuple of channel entries, where each entry is a Python dict with the keys id, name, and desc. Example: {'id': 'rx', 'name': 'RX', 'desc': 'UART receive line'}.
optional_channels The channels the PD can make use of, but are not strictly required. The key has the same format as that of the channels key above (a tuple of dicts). This tuple is allowed to be empty if the respective protocol decoder has no optional channels.
options A tuple describing the options for this decoder. Each tuple entry is a Python dict with the keys id, desc, default, and values. Example: {'id': 'bitorder', 'desc': 'Bit order', 'default': 'msb-first', 'values': ('msb-first', 'lsb-first')}. This tuple can be empty, if the PD has no options.
annotations A list of annotation classes this protocol decoder can output. Elements of this list are tuples consisting of an identifier string and a human readable description string. The identifier string can be used in the options of sigrok-cli to select the specific annotation type, and should therefore not contain whitespace or special characters.
annotation_rows Annotation rows are used to group multiple annotation types together. The elements of this list are three-element tuples consisting of:
  • An annotation row ID (same naming rules as for other IDs).
  • A human readable name/description string for the annotation row.
  • A tuple containing the indices of the annotation classes in the annotations tuple.

See the example on the Protocol decoder HOWTO page for more information on this attribute.

binary A list of binary output types this protocol decoder can output, same format as the annotations list.
tags A list of category strings under which this protocol decoder is listed (e.g. when adding a decoder in PulseView). See Protocol decoders for a list of the categories currently in use.
  • register(output_type)

This function is used to register the output that will be generated by the decoder. Its argument should be one of the OUTPUT_... constants described above. The function returns an identifier that can then be used as the output_id argument of the put() function.

See pd.py for an example.
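Putting the registration attributes, register(), and put() together, a stacked decoder might look roughly like the sketch below. Everything decoder-specific is hypothetical: the IDs 'example_stacked' and 'example_lower', the annotation classes, and the assumed [ptype, value] format of the lower decoder's OUTPUT_PYTHON data. The fallback srd class is a minimal stand-in with placeholder constant values and an emitted list, added only so the sketch runs where the real sigrokdecode module is not available.

```python
try:
    import sigrokdecode as srd  # provided by libsigrokdecode
except ImportError:
    class srd:  # minimal stand-in so the sketch runs standalone
        OUTPUT_ANN, OUTPUT_PYTHON = 0, 1  # placeholder values

        class Decoder:
            def register(self, output_type):
                return output_type

            def put(self, startsample, endsample, output_id, data):
                self.emitted = getattr(self, 'emitted', [])
                self.emitted.append((startsample, endsample, output_id, data))


class Decoder(srd.Decoder):
    api_version = 3
    id = 'example_stacked'       # hypothetical decoder ID
    name = 'Example'
    longname = 'Example stacked decoder'
    desc = 'Labels bytes emitted by a lower-level decoder.'
    license = 'gplv2+'
    inputs = ['example_lower']   # hypothetical output of another PD
    outputs = []
    tags = ['Util']
    options = ()
    annotations = (
        ('byte', 'Data byte'),
        ('other', 'Other packet'),
    )
    annotation_rows = (
        ('packets', 'Packets', (0, 1)),
    )

    def __init__(self):
        self.reset()

    def reset(self):
        self.count = 0

    def start(self):
        # The returned identifier is passed to put() as output_id.
        self.out_ann = self.register(srd.OUTPUT_ANN)

    def decode(self, startsample, endsample, data):
        ptype, value = data      # assumed format: [ptype, value]
        if ptype == 'BYTE':
            self.count += 1
            texts = ['Byte: 0x%02X' % value, '0x%02X' % value, '%02X' % value]
            self.put(startsample, endsample, self.out_ann, [0, texts])
        else:
            self.put(startsample, endsample, self.out_ann, [1, [ptype]])
```

The annotation indices 0 and 1 passed to put() refer to the positions of 'byte' and 'other' in the annotations tuple, and the single annotation row groups both classes.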