Bug 1190 - Python method self.wait() throws exceptions in decoders when it doesn't supposed to
Summary: Python method self.wait() throws exceptions in decoders when it doesn't suppo...
Status: RESOLVED DUPLICATE of bug 292
Alias: None
Product: sigrok-cli
Classification: Unclassified
Component: Other (show other bugs)
Version: unreleased development snapshot
Hardware: All All
: Normal normal
Target Milestone: ---
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-08 20:33 CEST by Aleksander Alsekseev
Modified: 2018-05-09 18:05 CEST (History)
2 users (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Aleksander Alsekseev 2018-05-08 20:33:12 CEST
While working on decoder for ST7735 controller I discovered that self.wait() hangs forever after reaching end of data. This is a behavior of PulseView and it's considered to be an expected one. 

However the behavior of sigrok-cli differs - self.wait() throws an exception.

Here is how to reproduce it:

```
$ sigrok-cli -i /path/to/sigrok-dumps/display/st7735/st7735.sr \
  -P 'st7735:cs=CS:clk=SCLK:mosi=MOSI:dc=DC' \
  -A 'st7735=description'
...
st7735-1: CASET: Column address set
st7735-1: RASET: Row address set
st7735-1: RAMWR: Memory write
```

Note that the description of the last command RAMWR was successfully sent. This is because unlike PulseView sigrok-cli makes 'self.wait()' to throw an exception instead of hanging. The decoder interprets the exception as an end of stream and decodes buffered data.

I described this behavior in IRC to Uwe and apparently this is a bug in sigrok-cli: no exceptions should be thrown in self.wait().
Comment 1 Aleksander Alsekseev 2018-05-08 21:20:49 CEST
This is weird. When I change the code to make it print an exception:

```
            try:
                (cs, clk, mosi, dc) = self.wait( { 1:'e' } ) 
            except:
                print("Exception!!!:", sys.exc_info()[0])
                break

```

... it behaves as expected (doesn't show buffered command):


```
...
st7735-1: MADCTL: Memory data address control
st7735-1: CASET: Column address set
st7735-1: RASET: Row address set
```

However without printing it (`print` line is commented) it does print the last command that it doesn't supposed to print:

```
...
st7735-1: CASET: Column address set
st7735-1: RASET: Row address set
st7735-1: RAMWR: Memory write
```

Don't know what's going on yet. Sounds like something Valgrind is good for...
Comment 2 Gerhard Sittig 2018-05-08 21:39:09 CEST
The specific st7735 decoder implementation appears to trigger an unsupported 
code path.  Decoders' decode() methods are not supposed to catch so broad an 
exception, instead the libsigrokdecode backend tried to terminate decode() 
by raising that exception from within the wait() call.  But unexpectedly 
ended up executing "a little more of" the decoder's Python logic, while 
the backend already sent the DF_END packet and is about to shutdown.

Snippet of -l 5 output:
...
srd: Instance st7735-1 put 44284914-44284915 OUTPUT_ANN on oid 0.
srd: Instance st7735-1 put 44284893-44284915 OUTPUT_ANN on oid 0.
srd: Done, handled all samples (abs cur 44285020 / abs end 44285020).
sr: [00:19.100713] std: virtual-session: Sending SR_DF_END packet.
sr: [00:19.100726] session: bus: Received SR_DF_END packet.
cli: Received SR_DF_END.
sr: [00:19.100746] session: fd_source_finalize: key 0xffffffffffffffff
sr: [00:19.100762] session: Stopped.
srd: Exiting libsigrokdecode.
srd: Freeing instance st7735-1
srd: st7735-1: Joining decoder thread.
srd: st7735-1: Raising want_term, sending got_new.
srd: st7735-1: Running join().
srd: st7735-1: Decoder_wait: Will return from wait().
SystemError('error return without exception set',)
srd: Instance st7735-1 put 42277576-44284915 OUTPUT_ANN on oid 0.
42277576-44284915 st7735-1: description: "RAMWR: Memory write"
srd: st7735-1: decode() method terminated.
srd: st7735-1: decode() terminated (req 1).
srd: st7735-1: Thread done (with res).
srd: st7735-1: Call to join() done.
srd: st7735-1: Resetting decoder state.
srd: st7735-1: Releasing initial pin state.
srd: Destroyed session 1.
sr: [00:19.103913] hwdriver: Cleaning up all drivers.

The (current) lack of support for "end of input" signalling is not a bug, 
instead it's a still open design consideration.  It's yet to get decided 
what exactly should be done in that case (see bug 292 and past ML and IRC 
discussion). 

I'd suggest to adjust the PD implementation (if not already done that way, 
haven't checked thoroughly) like so:  Emit annotations as soon as their 
information becomes available (bits, and bytes).  Emit annotations for 
frames when the end of the frame _clearly_ was seen.  Assuming that a frame 
is complete just because the input happened to end (and in the absence of a 
better condition) could be considered cheating, or lying.  How can one tell 
when the input stream does not contain the end of a frame?
Comment 3 Aleksander Alsekseev 2018-05-08 22:30:47 CEST
The problem with this particular protocol is that the length of data is varying. For instance the RAMWR (write memory) can follow any amount of data bytes. And unfortunately PD can't rely on DC or CS as end of frame markers. The only way PD can say for sure that the command terminated and the next one began is to receive EOF (which is currently not implemented in Sigrok) or by encountering next command. Thus adjusting this particular PD the way that was proposed is not an option - it just implements the only option currently available.

This being said I don't know whether the described behavior of sigrok-cli should be considered a bug or not and whether it should be changed. From user's and PD developer perspective naturally the should be some sort of EOF marker (en exception, special method, or documentation that says to use destructor for this). Because otherwise I can't see how the entire dump can be decoded in case like this one. This however is a feature request, not a bug report.
Comment 4 Gerhard Sittig 2018-05-09 06:45:42 CEST
Let me rephrase:

The feature request to learn of "end of input" was communicated before, was 
understood, and is planned. Experiments are in sombody's queue, but nothing 
is in mainline yet.  There is no doubt that such a feature is desirable, 
but it's as important to be aware that nothing is as simple as it looks at 
first glance.

Two alternative approaches come to mind: Run an optional separate end routine 
that decoders can implement in parallel to decode(). Or a specific exception 
like Python file handling already has upon EOF, to be handled from within 
decode(), still optional of course.  Good names are yet to be found, but not 
that critical.

Independent from the availability of the EOF signalling there remains the 
question of how a decoder can "dare" to assume that a frame had ended, just 
because it ran out of input data.  If the protocol has no clear concept of 
"a frame ends here", and the decoder has not seen such a condition, it would 
be inappropriate to _claim_ that there had been a frame completion.  A warning 
("frame start seen, partial data seen, but not sure whether the frame has 
completed") would be most appropriate.

So I suggest to drop the try/except block and the final statement from the 
decoder implementation, because it happens to trigger a condition that is not 
officially supported by the library code which the Python decoder relies on. 
When official support for "end of input" becomes available in the library, 
then the Python implementation can make use of the feature.  But not before 
that.

For the specific case of st7735, no data is lost from the user's point of view. 
all bits and bytes are shown which clearly are there.  All frames (sequences 
of commands and their data each) which are known to be there are shown, too. 
The last command has not completed, as you state yourself (the protocol lacks 
the concept of such an "end of command" condition).  So there should not 
be an annotation about there being a full command, while all its parts are 
there.  Just the warning about the uncertainty of that last command's status 
is not there, since EOF currently is not yet communicated.  That can get added 
later.
Comment 5 Aleksander Alsekseev 2018-05-09 10:26:55 CEST
Good point. I updated PD and it's test cases accordingly.

Should we close this issue then?
Comment 6 Gerhard Sittig 2018-05-09 18:04:48 CEST
I'd suggest to mark it as a duplicate of bug 292, as it's yet another 
request for the "end of input" signal.  Which is fine, reasonable, and 
understood.  This report here has some related reasoning that's not 
filed in bug 292 yet, so it's good to have on record.

Thank you for looking, reporting, and adjusting your decoder implementation 
so swiftly!
Comment 7 Gerhard Sittig 2018-05-09 18:05:15 CEST

*** This bug has been marked as a duplicate of bug 292 ***