Difference between revisions of "File format:Sigrok/v3"

Revision as of 15:42, 4 December 2013

Problem

The current .sr file format is a ZIP file containing multiple files (some metadata files and actual sampling data files). This works fine, but it also has some issues:

  • In order to get to the data you want, you need to decompress the whole file.
  • Appending to a file is not easily possible (and it is not efficient).
  • ...

Wish list

  • It should be able to store metadata and arbitrary data (logic samples, and/or analog samples, and so on).
  • It must support compression.
  • It should be able to handle run-time changes in the data streams (via meta packets on the session bus), e.g. changing samplerates, changing probes, and so on.
  • Better compression properties (e.g. using LZO or other algorithms; this is to be evaluated). What we ideally want out of the compression algorithm is:
    • Good and relatively fast compression results at only moderate CPU usage.
    • Very fast decompression (LZO is probably the best one here, as it's specifically designed for this).
    • Ideally, support for appending further data to already compressed data chunks (though this could be also implemented outside of the compression algorithm per se).
    • Open-source license and OS portability. There should be an open-source library or code chunk for compression/decompression, and it should be widely available in Linux distros and portable to Windows, Mac OS X, FreeBSD, and so on.
  • Independent of hardware architecture (x86, ARM, PowerPC, MIPS, and so on), OS, endianness, float representation, and so on. All data fields must be properly specified (endianness, signedness, size, format).
  • It must be (optionally) possible to store extra UI state data, e.g. user-configured probe colours, names, and positions.
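
Several of the wishes above (reading one stream without decompressing everything, cheap appends, explicitly specified field layout) can be illustrated with a small block-based sketch. This is only an illustration, not a proposed spec: the header fields (stream id, first sample number, payload length) and the function names are hypothetical, and zlib from the Python standard library stands in for LZO purely so the example is self-contained.

```python
import io
import struct
import zlib

# Hypothetical block header: stream id (u16), first sample number (u64),
# compressed payload length (u32). "<" fixes little-endian, no padding.
HEADER = struct.Struct("<HQI")

def append_block(f, stream_id, first_sample, payload):
    """Compress one payload on its own and append it as a new block."""
    compressed = zlib.compress(payload)
    f.seek(0, io.SEEK_END)
    f.write(HEADER.pack(stream_id, first_sample, len(compressed)))
    f.write(compressed)

def read_stream(f, wanted_stream):
    """Yield (first_sample, payload) for one stream, skipping other blocks."""
    f.seek(0)
    while True:
        header = f.read(HEADER.size)
        if len(header) < HEADER.size:
            return
        stream_id, first_sample, length = HEADER.unpack(header)
        if stream_id == wanted_stream:
            yield first_sample, zlib.decompress(f.read(length))
        else:
            f.seek(length, io.SEEK_CUR)  # skip without decompressing

# Usage: two interleaved streams in an in-memory file.
buf = io.BytesIO()
append_block(buf, 0, 0, b"\x00\xff" * 512)
append_block(buf, 1, 0, b"\x10\x20" * 512)
append_block(buf, 0, 1024, b"\x0f\xf0" * 512)
blocks = list(read_stream(buf, 0))
```

Because every block is compressed independently, appending never touches existing data, and a reader only pays the decompression cost for the blocks it actually wants.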

We were planning a new file format and a full spec for it for other reasons already, but this will probably be obsolete anyway and solved differently (see above).

jhol's Musings

  • Data should be encoded in a data-aware way. This would give greater compression:
    • Logic data is most efficiently stored using RLE+Huffman or Golomb coding, e.g. a clock signal may compress to one bit per edge.
    • FLAC (libflac) or a FLAC-inspired codec (linear prediction) is probably as good as it gets for lossless analog data encoding.
  • If data is stored in a format-specific way, it would be best to store it as a series of stream blocks, similar to how video containers work. Would it be possible to simply leverage a video container such as Ogg? IIRC this contains headers to declare metadata about each stream, then a series of timestamped stream blocks interleaved together. The timestamp is a format-specific number (for audio: the sample number, for video: the frame number), so sigrok formats can easily leverage this.
  • PulseView needs some means to save state: probe names, colours, positions, etc. It is probably best to store this in an INI file (or JSON?). In ZIP this is easy: just add an extra file to the archive.
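
To make the RLE idea for logic data concrete, here is a minimal sketch for a single channel. It assumes samples are plain 0/1 values and covers only the run-length stage; the Huffman or Golomb stage mentioned above is not implemented. The function names are made up for this example.

```python
from itertools import groupby

def rle_encode(samples):
    """Return (first_level, run_lengths) for a sequence of 0/1 samples."""
    samples = list(samples)
    if not samples:
        return None, []
    # On a single channel the level alternates at every edge, so only the
    # first level and the length of each run need to be stored.
    runs = [len(list(group)) for _, group in groupby(samples)]
    return samples[0], runs

def rle_decode(first_level, runs):
    """Rebuild the sample list from the first level and the run lengths."""
    out = []
    level = first_level
    for length in runs:
        out.extend([level] * length)
        level ^= 1  # each run ends with an edge
    return out

# A clock with 4 samples low, 4 samples high, repeated 1000 times.
clock = ([0] * 4 + [1] * 4) * 1000
first, runs = rle_encode(clock)
```

For this clock, 8000 samples become 2000 identical run lengths. Since every run has the same value, an entropy coder placed after this stage has almost nothing left to encode per edge, which is the intuition behind the "one bit per edge" remark.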