File format:Sigrok/v3
Problem
The current .sr file format is a ZIP file containing multiple files (some metadata files and actual sampling data files). This works fine, but it also has some issues:
- To get to the data you want, you need to decompress the whole file.
- Appending to a file is not easily possible (and it's not efficient).
- ...
Wish list
- It should be able to store metadata and arbitrary data (logic samples, and/or analog samples, and so on).
- It must support compression.
- It should be able to handle run-time changes in the data streams (via meta packets on the session bus), e.g. changing samplerates, changing probes, etc.
- Better compression properties (e.g. using LZO or other algorithms; this is to be evaluated). What we ideally want out of the compression algorithm is:
  - Good and relatively fast compression results at only moderate CPU usage.
  - Very fast decompression (LZO is probably the best one here, as it's specifically designed for this).
  - Ideally, support for appending further data to already compressed data chunks (though this could also be implemented outside of the compression algorithm per se).
  - Open-source license and OS portability. There should be an open-source library or code chunk for compression/decompression, and it should be widely available in Linux distros and portable to Windows, Mac OS X, FreeBSD, and so on.
- Independent of hardware architecture (x86, ARM, PowerPC, MIPS, and so on), OS, endianness, float representation, and so on. All data fields must be properly specified (endianness, signedness, size, format).
- It must be possible to (optionally) store extra UI state data, e.g. user-configured probe colours, names, and positions.
We were planning a new file format and a full spec for it for other reasons already, but this will probably be obsolete anyway and solved differently (see above).
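Several of the wishes above (appending without rewriting, reading one part without decompressing everything, fully specified field sizes and endianness) point towards a chunked layout. The following is only a minimal sketch of that idea, not a proposed spec: the header fields, the type codes, and the function names `append_chunk` and `read_chunk` are all hypothetical, and zlib is used purely as a stand-in for LZO because it ships with Python's standard library.

```python
import struct
import zlib

# Hypothetical chunk header, all fields little-endian with no padding:
#   chunk_type  uint8   (assumed codes: 1 = logic, 2 = analog, 3 = metadata)
#   raw_len     uint32  payload size before compression
#   packed_len  uint32  payload size after compression
CHUNK_HEADER = struct.Struct("<BII")


def append_chunk(path, chunk_type, payload):
    """Compress `payload` on its own and append it to the end of the file."""
    packed = zlib.compress(payload)
    with open(path, "ab") as f:
        f.write(CHUNK_HEADER.pack(chunk_type, len(payload), len(packed)))
        f.write(packed)


def read_chunk(path, index):
    """Return (chunk_type, payload) for one chunk, decompressing only that chunk."""
    if index < 0:
        raise IndexError("chunk index out of range")
    with open(path, "rb") as f:
        for i in range(index + 1):
            header = f.read(CHUNK_HEADER.size)
            if len(header) < CHUNK_HEADER.size:
                raise IndexError("chunk index out of range")
            chunk_type, raw_len, packed_len = CHUNK_HEADER.unpack(header)
            if i < index:
                # Skip over earlier chunks without decompressing them.
                f.seek(packed_len, 1)
        payload = zlib.decompress(f.read(packed_len))
    if len(payload) != raw_len:
        raise ValueError("chunk length mismatch")
    return chunk_type, payload
```

Because every chunk is compressed independently, appending never touches existing data, and a reader can seek past chunks it does not need. A real format would likely add an index or timestamps so that seeking does not require walking every header.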
jhol's Musings
- Data should be encoded in a data-aware way. This would give greater compression:
  - Logic data is most efficiently stored using RLE+Huffman or Golomb coding, e.g. a clock signal may compress to one bit per edge.
  - FLAC (libFLAC) or a FLAC-inspired codec (linear prediction) is probably as good as it gets for lossless analog data encoding.
- If data is stored in a format-specific way, it would be best to store it as a series of stream blocks, similar to how video containers work. Would it be possible to simply leverage a video container such as Ogg? IIRC this contains headers to declare metadata about each stream, then a series of timestamped stream blocks interleaved together. The timestamp is a format-specific number (for audio, the sample number; for video, the frame number), so sigrok formats can easily leverage this.
- PulseView needs some means to save state: probe names, colours, positions, etc. It is probably best to store this in an INI file (or JSON?). In ZIP this is easy: just add an extra file to the archive.
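To illustrate the data-aware encoding of logic data mentioned in the list above, here is a rough sketch of the run-length step only, for a single channel. The function names `rle_encode` and `rle_decode` are made up for this example, and the Huffman or Golomb stage that would follow is left out.

```python
def rle_encode(samples):
    """Encode a list of 0/1 samples as (first_level, run_lengths)."""
    if not samples:
        return 0, []
    runs = []
    level = samples[0]
    count = 0
    for s in samples:
        if s == level:
            count += 1
        else:
            # An edge: close the current run and start a new one.
            runs.append(count)
            level = s
            count = 1
    runs.append(count)
    return samples[0], runs


def rle_decode(first_level, runs):
    """Rebuild the sample list; the level toggles after every run."""
    out = []
    level = first_level
    for n in runs:
        out.extend([level] * n)
        level ^= 1
    return out


# A clock with 4 samples per half period: 24 samples become 6 equal runs.
clock = [0, 0, 0, 0, 1, 1, 1, 1] * 3
first, runs = rle_encode(clock)
assert runs == [4, 4, 4, 4, 4, 4]
assert rle_decode(first, runs) == clock
```

For a regular clock every run has the same length, so a subsequent entropy coder can represent each edge with very few bits, which is the effect the clock example above refers to.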