Protocol decoder HOWTO

Revision as of 09:15, 27 July 2020

This page serves as a quick-start guide for people who want to write their own libsigrokdecode protocol decoders (PDs).

It is not intended to replace the Protocol decoder API page, but rather to give a short overview/tutorial and some tips.

Introduction

Protocol decoders are written entirely in Python (>= 3.0).

Files

Every protocol decoder is a Python module and has its own subdirectory in libsigrokdecode's decoders directory.

This is a minimal example of what a protocol decoder looks like, in this case the i2c decoder (license header, comments, and some other parts omitted).

Note: Do not start new protocol decoders by copying code from here. Instead, it's recommended to select an already existing decoder in the source code which is similar to the one you plan to write, and copy that as a starting point.

__init__.py

 '''
 I²C (Inter-Integrated Circuit) is a bidirectional, multi-master
 bus using two signals (SCL = serial clock line, SDA = serial data line).
 
 <Insert notes and hints for the user here>
 '''
 
 from .pd import Decoder

This is a standard Python file, required in every Python module. It contains a module-level docstring, which is accessible by frontends via the libsigrokdecode API. It should contain a (very) short description of what the protocol (in this case I²C) is about, and some notes and hints for the user of this protocol decoder (which can be shown in GUIs when the user selects/browses different PDs).

This docstring should not contain the full, extensive protocol description. Instead, the per-PD wiki page should be used for protocol description, photos of devices or photos of example acquisition setups, and so on. Each decoder has one unique wiki page at the URL http://sigrok.org/wiki/Protocol_decoder:<pd>, where <pd> is the Python module name of the decoder (i2c in this case). Some examples for such per-PD wiki pages: UART, PAN1321, MX25Lxx05D, DCF77.

The "from .pd import Decoder" line will make sure the code from pd.py gets properly imported when this module is used.

pd.py

 import sigrokdecode as srd
 
 class Decoder(srd.Decoder):
     api_version = 2
     id = 'i2c'
     name = 'I²C'
     longname = 'Inter-Integrated Circuit'
     desc = 'Two-wire, multi-master, serial bus.'
     license = 'gplv2+'
     inputs = ['logic']
     outputs = ['i2c']
     channels = (
         {'id': 'scl', 'name': 'SCL', 'desc': 'Serial clock line'},
         {'id': 'sda', 'name': 'SDA', 'desc': 'Serial data line'},
     )
     optional_channels = ()
     options = (
         {'id': 'address_format', 'desc': 'Displayed slave address format',
            'default': 'shifted', 'values': ('shifted', 'unshifted')},
     )
     annotations = (
         ('start', 'Start condition'),
         ('repeat-start', 'Repeat start condition'),
         ('stop', 'Stop condition'),
         ('ack', 'ACK'),
         ('nack', 'NACK'),
         ('bit', 'Data/address bit'),
         ('address-read', 'Address read'),
         ('address-write', 'Address write'),
         ('data-read', 'Data read'),
         ('data-write', 'Data write'),
         ('warnings', 'Human-readable warnings'),
     )
     annotation_rows = (
         ('bits', 'Bits', (5,)),
         ('addr-data', 'Address/Data', (0, 1, 2, 3, 4, 6, 7, 8, 9)),
         ('warnings', 'Warnings', (10,)),
     )
 
     def __init__(self, **kwargs):
         self.state = 'FIND START'
         # And various other variable initializations...
 
     def metadata(self, key, value):
         if key == srd.SRD_CONF_SAMPLERATE:
             self.samplerate = value
 
     def start(self):
         self.out_ann = self.register(srd.OUTPUT_ANN)
 
     def decode(self, ss, es, data):
         for self.samplenum, (scl, sda) in data:
             # Decode the samples.

The recommended name for the actual decoder file is pd.py. This file contains some meta information about the decoder, and the actual code itself, mostly in the decode() method.
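To give a feel for what the body of decode() does, here is a small, self-contained sketch of such a sample loop that runs without libsigrokdecode. The function name find_starts and the hand-made sample list are assumptions for illustration only; the real i2c decoder is a full state machine. The sketch relies solely on the I²C rule that a START condition is a falling edge on SDA while SCL is high.

```python
def find_starts(data):
    """Return the sample numbers at which an I2C START condition occurs.

    `data` mimics the decode() argument: an iterable of
    (samplenum, (scl, sda)) tuples. Hypothetical helper, not library API.
    """
    starts = []
    old_sda = None  # SDA level of the previous sample
    for samplenum, (scl, sda) in data:
        # START: SDA falls from 1 to 0 while SCL is high.
        if old_sda == 1 and sda == 0 and scl == 1:
            starts.append(samplenum)
        old_sda = sda
    return starts


# Hand-made samples: idle bus (both lines high), then SDA drops while SCL is high.
samples = [
    (0, bytes([1, 1])),
    (1, bytes([1, 1])),
    (2, bytes([1, 0])),  # START
    (3, bytes([0, 0])),
    (4, bytes([0, 1])),
    (5, bytes([1, 1])),
    (6, bytes([1, 0])),  # START again
]
print(find_starts(samples))  # [2, 6]
```

Each pin tuple is a bytes object here, so unpacking it yields plain integers that can be compared with 0 and 1.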

If needed, large unwieldy lists or similar things can also be factored out into another *.py file (examples: midi, z80).

Copyright and license

Every protocol decoder must come with source code in the form of *.py files. No pre-compiled code should be present, Python or otherwise. The PD must not use any helpers that are not provided as source code under the same license as the PD itself.

The Decoder class must have a license declaration (see above), stating the license under which all the contents in the decoder's directory are provided. This is usually 'gplv2+' or 'gplv3+', whichever you prefer. In either case, the decoder license must be compatible with the libsigrokdecode license (which is "GPL, version 3 or later").

channels & optional_channels

The following excerpt from the SPI PD shows how to use channels and optional_channels. To decode SPI, the clock signal is always needed; the chip-select signal is optional and only used when provided. To give the user the flexibility to provide only one of the MOSI/MISO signals, both of them are also defined as optional:

 class Decoder(srd.Decoder):
     ...
     id = 'spi'
     ...
     channels = (
         {'id': 'clk', 'name': 'CLK', 'desc': 'Clock'},
     )
     optional_channels = (
         {'id': 'miso', 'name': 'MISO', 'desc': 'Master in, slave out'},
         {'id': 'mosi', 'name': 'MOSI', 'desc': 'Master out, slave in'},
         {'id': 'cs', 'name': 'CS#', 'desc': 'Chip-select'},
     )

data, the argument of the decoder's decode() function that contains the data to decode, is a list of tuples. These tuples contain the (absolute) number of the sample and the data at that sample. To process all samples, the SPI decoder loops over data like this:

 def decode(self, ss, es, data):
     ...
     for (self.samplenum, pins) in data:

channels and optional_channels contain in total four channels, therefore the second member of the tuple is an object of Python's bytes class containing 4 bytes, one for each channel. The decoder unpacks the bytes into the variables clk, miso, mosi, and cs as shown below.

Then it checks each optional channel: a value of 0 or 1 means the channel was provided to the decoder, any other value means it was not. If neither MISO nor MOSI is supplied, an exception is raised:

 (clk, miso, mosi, cs) = pins
 self.have_miso = (miso in (0, 1))
 self.have_mosi = (mosi in (0, 1))
 self.have_cs = (cs in (0, 1))
 
 # Either MISO or MOSI (but not both) can be omitted.
 if not (self.have_miso or self.have_mosi):
     raise ChannelError('Either MISO or MOSI (or both) pins required.')
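The check above can be tried outside of a decoder with the following sketch. The function name check_spi_pins, the local ChannelError class and the use of 0xFF as the value of an unconnected optional channel are assumptions made for this example (0xFF matches the "unassigned" marker used in the reduce_bus tip further down this page).

```python
UNUSED = 0xFF  # assumed value of an optional channel that is not connected


class ChannelError(Exception):
    """Raised when the supplied channels are not sufficient for decoding."""


def check_spi_pins(pins):
    """Unpack one sample's pin values and report which optional pins exist."""
    clk, miso, mosi, cs = pins
    have = {
        'miso': miso in (0, 1),
        'mosi': mosi in (0, 1),
        'cs': cs in (0, 1),
    }
    # At least one data line is needed, otherwise there is nothing to decode.
    if not (have['miso'] or have['mosi']):
        raise ChannelError('Either MISO or MOSI (or both) pins required.')
    return have


# CLK=1, MISO not connected, MOSI=0, CS#=1
print(check_spi_pins(bytes([1, UNUSED, 0, 1])))
# {'miso': False, 'mosi': True, 'cs': True}
```

Passing bytes([1, UNUSED, UNUSED, 0]) instead raises ChannelError, because neither data line is present.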

annotations & annotation_rows

To make the relation between the annotations and the annotation_rows members of a decoder object clearer, take a look at how the ir_nec PD uses them:

 class Decoder(srd.Decoder):
     ...
     id = 'ir_nec'
     ...
     annotations = (                        # Implicitly assigned annotation type ID
         ('bit', 'Bit'),                    # 0
         ('agc-pulse', 'AGC pulse'),        # 1
         ('longpause', 'Long pause'),       # 2
         ('shortpause', 'Short pause'),     # 3
         ('stop-bit', 'Stop bit'),          # 4
         ('leader-code', 'Leader code'),    # 5
         ('addr', 'Address'),               # 6
         ('addr-inv', 'Address#'),          # 7
         ('cmd', 'Command'),                # 8
         ('cmd-inv', 'Command#'),           # 9
         ('repeat-code', 'Repeat code'),    # 10
         ('remote', 'Remote'),              # 11
         ('warnings', 'Warnings'),          # 12
     )
     annotation_rows = (
         ('bits', 'Bits', (0, 1, 2, 3, 4)),
         ('fields', 'Fields', (5, 6, 7, 8, 9, 10)),
         ('remote', 'Remote', (11,)),
         ('warnings', 'Warnings', (12,)),
     )

It groups the first five annotation types together into the bits row and the next six into the fields row. The rows remote and warnings both only contain one annotation type.
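The grouping can also be expressed as a lookup from annotation type ID to row. The following sketch only illustrates the data structure; the helper name rows_by_annotation is hypothetical and not part of libsigrokdecode.

```python
annotation_rows = (
    ('bits', 'Bits', (0, 1, 2, 3, 4)),
    ('fields', 'Fields', (5, 6, 7, 8, 9, 10)),
    ('remote', 'Remote', (11,)),
    ('warnings', 'Warnings', (12,)),
)


def rows_by_annotation(rows):
    """Map each annotation type ID to the ID of the row that contains it."""
    mapping = {}
    for row_id, _row_name, ann_ids in rows:
        for ann_id in ann_ids:
            mapping[ann_id] = row_id
    return mapping


lookup = rows_by_annotation(annotation_rows)
print(lookup[0], lookup[6], lookup[12])  # bits fields warnings
```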

Without annotation_rows, PulseView would have to put each annotation type in its own row (which is unwieldy if the decoder has many annotations) or put them all on the same row (which would result in unreadable output due to overlaps). Thanks to annotation_rows, the output of the ir_nec decoder is grouped together as shown in the following picture (note how different annotation types, distinguishable by their different colors, share the same row):

[Image: Pv example ir nec cropped.png]


However, handling numeric IDs by hand is cumbersome, especially when they change and every affected ID has to be updated throughout the PD. To avoid this, you can define named constants in an enum-like fashion:

 ann_bit, ann_agc_pulse, ann_long_pause, ann_short_pause, ann_stop_bit, ann_leader_code, ann_addr, ann_addr_inv, ann_cmd, ann_cmd_inv, ann_repeat_code, ann_remote, ann_warning = range(13)

 class Decoder(srd.Decoder):
     ...
     id = 'ir_nec'
     ...
     annotations = (                        # Implicitly assigned annotation type ID
         ('bit', 'Bit'),                    # 0  = ann_bit
         ('agc-pulse', 'AGC pulse'),        # 1  = ann_agc_pulse
         ('longpause', 'Long pause'),       # 2  = ann_long_pause
         ('shortpause', 'Short pause'),     # 3  = ann_short_pause
         ('stop-bit', 'Stop bit'),          # 4  = ann_stop_bit
         ('leader-code', 'Leader code'),    # 5  = ann_leader_code
         ('addr', 'Address'),               # 6  = ann_addr
         ('addr-inv', 'Address#'),          # 7  = ann_addr_inv
         ('cmd', 'Command'),                # 8  = ann_cmd
         ('cmd-inv', 'Command#'),           # 9  = ann_cmd_inv
         ('repeat-code', 'Repeat code'),    # 10 = ann_repeat_code
         ('remote', 'Remote'),              # 11 = ann_remote
         ('warnings', 'Warnings'),          # 12 = ann_warning
     )
     annotation_rows = (
         ('bits', 'Bits', (ann_bit, ann_agc_pulse, ann_long_pause, ann_short_pause, ann_stop_bit)),
         ('fields', 'Fields', (ann_leader_code, ann_addr, ann_addr_inv, ann_cmd, ann_cmd_inv, ann_repeat_code)),
         ('remote', 'Remote', (ann_remote,)),
         ('warnings', 'Warnings', (ann_warning,)),
     )

This way, all you need to ensure is that the order of the enum entries matches the order of the entries in the annotations tuple.
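If keeping two lists in the same order by hand feels fragile, one possible alternative is to derive the numeric IDs from the annotations tuple itself. This is only a sketch and not what the ir_nec decoder does; the names ANN and ids are hypothetical, and the code is written at module level rather than inside the Decoder class.

```python
annotations = (
    ('bit', 'Bit'),
    ('agc-pulse', 'AGC pulse'),
    ('longpause', 'Long pause'),
    ('shortpause', 'Short pause'),
    ('stop-bit', 'Stop bit'),
    ('leader-code', 'Leader code'),
    ('addr', 'Address'),
    ('addr-inv', 'Address#'),
    ('cmd', 'Command'),
    ('cmd-inv', 'Command#'),
    ('repeat-code', 'Repeat code'),
    ('remote', 'Remote'),
    ('warnings', 'Warnings'),
)

# Hypothetical lookup table: annotation ID string -> numeric annotation type ID.
ANN = {ann_id: index for index, (ann_id, _desc) in enumerate(annotations)}


def ids(*names):
    """Translate annotation ID strings into a tuple of numeric IDs."""
    return tuple(ANN[name] for name in names)


annotation_rows = (
    ('bits', 'Bits', ids('bit', 'agc-pulse', 'longpause', 'shortpause', 'stop-bit')),
    ('fields', 'Fields', ids('leader-code', 'addr', 'addr-inv', 'cmd', 'cmd-inv', 'repeat-code')),
    ('remote', 'Remote', ids('remote')),
    ('warnings', 'Warnings', ids('warnings')),
)

print(annotation_rows[1][2])  # (5, 6, 7, 8, 9, 10)
```

With this approach a misspelled annotation name fails immediately with a KeyError instead of silently pointing at the wrong annotation type.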

Random notes, tips and tricks

  • You should usually only use raise in a protocol decoder to raise exceptions in cases which are a clear bug in how the protocol decoder is invoked (e.g. if no samplerate was provided for a PD which needs the samplerate, or if some of the required channels were not provided by the user, and so on).
  • A simple way to check whether an optional PD channel was supplied (by the frontend/user) in this run is:
 if pin in (0, 1):
     do_stuff()
  • The v3 API provides a has_channel() method.
  • A simple and fast way to count the number of 1 bits in a number (0x55 in this example), e.g. as the basis for a parity check, is:
 ones = bin(0x55).count('1')
  • A simple function to convert a BCD number (max. 8 bits) to an integer is:
 def bcd2int(b):
     return (b & 0x0f) + ((b >> 4) * 10)
  • An elegant way to convert a sequence of bus pins to a numeric value:
 from functools import reduce

 def reduce_bus(bus):
     if 0xFF in bus:
         return None # unassigned bus channels
     else:
         return reduce(lambda a, b: (a << 1) | b, reversed(bus))
  • A nice way to construct method names according to e.g. protocol commands is (assuming cmd is 8, this would call the function self.handle_cmd_0x08):
 fn = getattr(self, 'handle_cmd_0x%02x' % cmd)
 fn(arg1, arg2, ...)
  • A cheap way to get enumerations without relying on the enum module, which was only added in Python 3.4 (useful for states, pin indices, annotation indices, etc.):
 class Cycle:
     NONE, MEMRD, MEMWR, IORD, IOWR, FETCH, INTACK = range(7)
  • You don't need to reinstall the whole libsigrokdecode project every time you make a change on your decoder. Instead, you can use the environment variable SIGROKDECODE_DIR to point the software to your development directory:
 $ SIGROKDECODE_DIR=/path/to/libsigrokdecode/decoders/ sigrok-cli … -P <decodername>

Because this environment variable is evaluated by the libsigrokdecode code itself, it can be used for any program that uses the library, for example when calling PulseView or the pdtest unit test utility from the sigrok-test repository.

If you compiled a recent libsigrokdecode yourself (newer than this commit), you can also put decoders into your home directory, without the need for an additional environment variable. On Linux systems, this location follows the XDG base directory specification, which by default resolves to ~/.local/share/libsigrokdecode/decoders. If that folder does not exist, you can simply create it and drop your decoders there, each in its own subdirectory, as you would in the libsigrokdecode source tree. On Windows systems, additional decoders are read from %ProgramData%\libsigrokdecode\decoders.
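As a rough illustration of the Linux default described above, the following sketch computes the per-user decoder directory. The function name user_decoder_dir is hypothetical, and the lookup via XDG_DATA_HOME with a fallback to ~/.local/share reflects the XDG base directory specification; how the library itself resolves the path is not shown here.

```python
import os


def user_decoder_dir(environ=None):
    """Return the per-user decoder directory on Linux, following XDG defaults."""
    if environ is None:
        environ = os.environ
    # XDG_DATA_HOME falls back to ~/.local/share when unset or empty.
    data_home = environ.get('XDG_DATA_HOME') or os.path.join(
        os.path.expanduser('~'), '.local', 'share')
    return os.path.join(data_home, 'libsigrokdecode', 'decoders')


print(user_decoder_dir())
```

The printed directory is the place to create (if missing) before dropping a decoder subdirectory into it.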

  • To debug the Python implementation of a decoder during development, maintenance or research, either add print() statements at appropriate locations, or get WinPDB and use its remote debugging feature as outlined below (add this hook somewhere in pd.py, then use "File -> Attach" to attach to the running process). Decoders cannot be run in "regular" debuggers because they expect a rather specific environment: for receiving their input, for having their output saved or presented, and for processing samples (data types, runtime routines). Remote debugging works in both the sigrok-cli and PulseView contexts. Adding another print() statement before starting the embedded debugger can help identify the moment at which to attach.
 def __init__(self):
     import rpdb2
     rpdb2.start_embedded_debugger("pd")
     ...
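The small helpers from the tips above (bit counting, BCD conversion, bus reduction and name-based command dispatch) can be tried outside of a decoder as follows. The class CommandHandler, its handle() method, the fallback for unknown commands and the sample values are assumptions made for this example.

```python
from functools import reduce


def bcd2int(b):
    """Convert a two-digit BCD byte to an integer."""
    return (b & 0x0f) + ((b >> 4) * 10)


def reduce_bus(bus):
    """Combine bus pins into a number; bus[0] is the least significant bit."""
    if 0xFF in bus:
        return None  # unassigned bus channels
    return reduce(lambda a, b: (a << 1) | b, reversed(bus))


class CommandHandler:
    """Hypothetical dispatcher that picks a method based on the command byte."""

    def handle_cmd_0x08(self, payload):
        return 'cmd 0x08 with payload %r' % (payload,)

    def handle(self, cmd, payload):
        # Fall back to None so that unknown commands do not raise AttributeError.
        fn = getattr(self, 'handle_cmd_0x%02x' % cmd, None)
        if fn is None:
            return 'unknown command 0x%02x' % cmd
        return fn(payload)


ones = bin(0x55).count('1')
print(ones, ones % 2)                # 4 0 (number of 1 bits, even-parity bit)
print(bcd2int(0x42))                 # 42
print(reduce_bus((1, 0, 1, 0)))      # 5
print(reduce_bus((1, 0xFF, 0)))      # None
print(CommandHandler().handle(8, 1))     # cmd 0x08 with payload 1
print(CommandHandler().handle(0x10, 1))  # unknown command 0x10
```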

Unit tests

In order to keep protocol decoders in a running state even when we make changes to a decoder or libsigrokdecode itself, we use unit tests for as many decoders as we can. These are stored in the sigrok-test repository. If you want to add, modify or run one of them, clone that repository and check the README for documentation. We greatly appreciate it when you submit unit tests for your decoder so we can keep it in good health!

Submitting your decoder

When you've finished your decoder and everything is working nicely, please contribute the decoder to the sigrok project so that other people can benefit from it (and test it, improve upon it, and so on).

  • Send the decoder (preferably as a patch against the current git HEAD of libsigrokdecode) to the sigrok-devel mailing list, and/or tell us the location of your git repository containing the decoder (on the #sigrok IRC channel on FreeNode).
  • Please also send us a few example data files (*.sr) and a small README to go with your decoder. We'll need these in order to properly review and test your decoder. Preferably, these files should also come as patches against the latest git HEAD of the sigrok-dumps repository. See Example dumps for details.
  • Finally, please also consider adding a few "unit tests" for your decoder in the sigrok-test repository. These tests will automatically run the decoder against various input files specified in test.conf and check whether the expected output is produced (examples: rfm12, nrf24l01). This allows us to notice and fix any regressions in the decoder and/or the libsigrokdecode backend that may arise over time.

Thanks a lot!