Bug 1374

Summary: not Clearing RAM after acquisition
Product: libsigrokdecode Reporter: gabse
Component: Other Assignee: Uwe Hermann <uwe>
Status: RESOLVED FIXED    
Severity: major CC: soeren
Priority: High    
Version: unreleased development snapshot   
Target Milestone: ---   
Hardware: All   
OS: All   
Attachments: Empty error window when trying to Run with reconnected tool

Description gabse 2019-04-15 08:46:11 CEST
Created attachment 523
Empty error window when trying to Run with reconnected tool

In 0.5.0-git-af33d4c there is a bug. I was debugging some DMX signals and discovered that PulseView accumulates a few 100 MB of RAM with each acquisition, until the RAM is full and the computer becomes unusably slow.

I also found a bug when disconnecting and reconnecting my LA (Saleae clone). The software does not recognise the reconnected tool, and when I hit the Run button it crashes with an empty error window (see attached picture).

I had both errors on a Windows 7 and a Windows 8 machine, both 64bit.
Comment 1 gabse 2019-06-10 11:40:13 CEST
The excessive RAM usage error still exists. It may be an error related to the Saleae Logic driver, because in demo mode it doesn't cause any problems.
Comment 2 Uwe Hermann 2019-06-11 23:04:41 CEST
Hi, quick note on the "does not recognise the reconnected tool", that's expected. Removing a USB device while PulseView is running or plugging new devices while it's running is not supported yet (and has not been supported in the past). It would be nice if it were supported of course, and it probably will be eventually, but it's not entirely trivial. But that's unrelated to this specific bug report either way.

As for the potential memory leaks or "excessive RAM usage" (I assume "not Clearing RAM after acquisition" also refers to that?):

I was not able to reproduce that after a few quick tests with current nightly/git PulseView on Windows 10 64bit (both 32bit and 64bit PulseView) or Linux.

I can press the Run button multiple times, after each acquisition the RAM usage goes back to the initial "base" value before filling up again with the newly acquired samples. Even when adding a decoder this doesn't seem to change, so there's probably no (major) memory leak from what I can see (there were some leaks in the past, but those have been fixed a while ago).

Can you please describe how exactly you test and where you check the RAM usage?

Here's what I did for my quick tests with a Saleae Logic / FX2 device:

 - Start PulseView (64bit) on Windows 10 64bit.
 - Start "Task manager", go to the "Details" tab, sort by name, select the "pulseview.exe" item. For me, the "Memory (active private working set)" says something like "29,096 K".
 - In PulseView, select the "Saleae Logic" device if it's not selected already, set "200 M samples" sample limit and "12 MHz" sample rate.
 - Press Run. Task manager reads "238,028 K" for PulseView after the acquisition is done (after 16 seconds or so).
 - Press Run again. Task manager goes down to "42,000 K", then rises during the acquisition; when it's finished it shows "238,704 K" again.
 - Press Run again. "238,356 K".
 - Press Run again. "238,684 K".
 - Press Run again. "238,756 K".
 - Press Run again. "238,796 K".
 - Press Run again. "238,452 K".

Those are pretty minor variations, I can't see any major memory leaks in my tests.

You can now add a protocol decoder, e.g. "timing", and set the "Data" decoder channel to "D0".

For me, the value now goes up to "451,316 K" in task manager. Multiple runs with the decoder:

 - Press Run again. "451,392 K".
 - Press Run again. "451,276 K".
 - Press Run again. "450,608 K".
 - Press Run again. "451,480 K".
 - Press Run again. "451,508 K".

So I can't reproduce any major memory leaks in the protocol decoder setup, either.
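
For anyone repeating these tests on Linux, the readings can also be taken by a script instead of by eye. This is only a minimal sketch: it assumes a Linux system with /proc mounted, the helper names are made up for illustration, and the VmRSS value it reads (resident set size) is not the same metric as the "active private working set" shown by the Windows task manager.

```python
import re
from pathlib import Path


def parse_vmrss_kib(status_text):
    """Return the VmRSS value (in kB) found in /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid):
    """Read the current resident set size of a running process (Linux only)."""
    return parse_vmrss_kib(Path(f"/proc/{pid}/status").read_text())
```

Calling read_rss_kib() with the PulseView process ID after each click on Run gives one number per acquisition that can be compared across runs.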
Comment 3 gabse 2019-06-16 13:13:30 CEST
Hey, I did some further investigation and found out that the excessive RAM usage (a.k.a. "not Clearing RAM after acquisition") is related to the DMX512 decoder. If the decoder is added, restarting the acquisition will clear some RAM, but not all of the RAM used by the last acquisition. Probably the raw data is overwritten, but the decoded data is not? I could shoot a short video if it helps.
Comment 4 Uwe Hermann 2019-06-20 14:17:19 CEST
I can now reproduce the issue, thanks! It seems to happen with any PDs (not just DMX512), but only if they emit annotations (or other types of output) apparently. 

If the PD does run but never emits an annotation, that doesn't seem to leak any memory.

We'll look into this and update the bug as soon as we know more.
Comment 5 gabse 2019-10-26 09:50:40 CEST
Hey Uwe, I wanted to ask if there is a fix for the bug yet, because I have been using the decoder for a while now and saw that the bug still seems to exist. It also seems that the DMX decoder has problems decoding data if not all 512 bytes are transmitted, which, by the way, is also allowed by the DMX512 standard.
Comment 6 Soeren Apel 2019-11-06 21:08:24 CET
Assigning to DMX PD as the bug only occurs with the DMX PD, so it is likely a bug in libsrd.
Comment 7 Uwe Hermann 2019-11-20 00:06:01 CET
Re-assigning, this is not Windows-specific and not DMX512-specific, happens with multiple (probably all) PDs. I've identified a few leaks in libsigrokdecode (maybe not all, but at least the largest ones), fixes are coming up.
Comment 8 Uwe Hermann 2019-11-20 00:28:15 CET
This should be fixed now in e2768fbcdeba97d10c2beaea709412e6fb8b047d, thanks!

The root cause was multiple memory leaks in libsigrokdecode. From my testing, these changes drastically improve the behavior on both Windows and Linux (and probably other OSes). There might be additional (smaller) leaks in all libs and frontends, but those are not directly related to this specific issue, and we'll probably do some more systematic leak testing + fixing at some point in the future anyway. I'm considering this specific bug fixed for the time being, though, hence the closing.

Some random stats from Windows (PulseView 64bit on 64bit Windows 10, spi/mx25l1605d/mx25l1605d_read.sr from sigrok-dumps with SPI decoder, 10 times clicking on Run in PV, Windows task manager "Memory, private working set" field):

Before the fixes:
206052K
273196K
340088K
378552K
457428K
524536K
591092K
658000K
724568K
791480K

After the fixes:
140724K
140848K
140540K
140608K
140800K
140872K
140960K
141168K
141072K
141136K

Other tools, like VMMap on Windows, ps/top/smem on Linux etc. show similar results.

As you can see, the absolute numbers of used memory are now lower, and the numbers after each consecutive click on "Run" in PulseView are now staying more or less "stable".

The numbers are different on Linux, but it's also clearly noticeable that they no longer increase as drastically per run as they did before.
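
To put a number on "stable", the Windows readings listed above can be reduced to an average growth per click on Run. This is just a quick sketch over the ten values from each list; the function name is made up for illustration.

```python
def growth_per_run(readings_k):
    """Average change in memory between consecutive runs, in K."""
    return (readings_k[-1] - readings_k[0]) / (len(readings_k) - 1)


# Task manager readings (in K) copied from the two lists above.
before = [206052, 273196, 340088, 378552, 457428,
          524536, 591092, 658000, 724568, 791480]
after = [140724, 140848, 140540, 140608, 140800,
         140872, 140960, 141168, 141072, 141136]

print(f"before the fixes: {growth_per_run(before):.0f} K per run")
print(f"after the fixes:  {growth_per_run(after):.0f} K per run")
```

That works out to roughly 65,000 K of growth per run before the fixes and under 50 K per run after them.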
Comment 9 gabse 2019-11-21 20:50:45 CET
I tested it, and it seems to work now.
Thanks a lot. The bug can be closed.