Bug 1115 - pulseview crashes during startup with ATEN KVMs
Summary: pulseview crashes during startup with ATEN KVMs
Status: CONFIRMED
Alias: None
Product: PulseView
Classification: Unclassified
Component: Acquisition
Version: unreleased development snapshot
Hardware: x86 Windows
Importance: Normal normal
Target Milestone: ---
Assignee: Nobody
URL:
Keywords:
Duplicates: 1319
Depends on:
Blocks:
 
Reported: 2018-01-23 14:48 CET by Andy Burns
Modified: 2021-08-17 22:07 CEST

Attachments
output from pulseview -l 5 (5.14 KB, text/plain)
2018-01-23 15:39 CET, Andy Burns
gdb output (1.85 KB, text/plain)
2018-01-23 17:51 CET, Andy Burns
gdb backtrace (3.54 KB, text/plain)
2018-01-23 23:39 CET, Andy Burns
gdb backtrace (11.36 KB, text/plain)
2018-01-23 23:53 CET, Andy Burns
libusb debug info (26.50 KB, text/plain)
2018-01-24 01:27 CET, Andy Burns
lsusb stuff (114.59 KB, text/plain)
2018-01-24 19:25 CET, Andy Burns
pulseview -l 5 (11.43 KB, text/plain)
2018-01-24 19:37 CET, Andy Burns
Crash at start with Adafruit Feather nRF52840 Express attached (44.55 KB, application/octet-stream)
2019-05-04 03:50 CEST, Henry Gabryjelski
Crash log runs and usb device information from device that causes crash. (13.33 KB, application/zip)
2021-03-06 00:15 CET, Chris

Description Andy Burns 2018-01-23 14:48:16 CET
Purchased a cheap 24MHz 8Ch logic analyzer from aliexpress,
installed windows pulseview nightly, 
configured drivers with Zadig.

When I run pulseview (with or without USB device connected) 
startup dialog says something like "scanning for center-309"
then I get "pulseview.exe has stopped working" with no further details, 
I checked I do have the VS2010 runtime installed.

I looked around but couldn't see anything other than the latest nightly build,
which at the time was #1082 22-Jan-2018 23:38

Should there be an archive of other versions I could try? I didn't see anything on Jenkins ...
Comment 1 Soeren Apel 2018-01-23 14:55:57 CET
Can you please try starting PV with no USB devices attached (except for mouse/keyboard)?
Comment 2 Andy Burns 2018-01-23 15:05:30 CET
I unplugged the USB logic analyzer, and my external USB audio device, 
but as this is a laptop it has numerous "internal" USB devices (camera, fingerprint reader, Bluetooth, smartcard reader, etc.) which I can't remove.

I found an older mingw32 build on jenkins #272 and took just the pulseview.exe from that, but it didn't help.

Windows 10 64bit (creators release 1709) 

I can try on a different machine, but it's also a laptop.
Comment 3 Soeren Apel 2018-01-23 15:21:11 CET
okay, then please use the debug build from the download page, start it from the console using "pulseview.exe -l 5" and show us the last ~20 lines of text from the console output. This should tell us which driver crashes it.
Comment 4 Andy Burns 2018-01-23 15:39:33 CET
Created attachment 379
output from pulseview -l 5
Comment 5 Soeren Apel 2018-01-23 17:01:28 CET
Uh... it crashed after the "Scan found 0 devices" line?
Comment 6 Andy Burns 2018-01-23 17:07:18 CET
Well, it crashes whether or not I have the LA connected, if that's what you're asking
Comment 7 Soeren Apel 2018-01-23 17:18:39 CET
What I wanted to know is whether that's the last output you see from the application before it disappears (i.e. crashes).

If that's the case then the only way to find out what's going on is to create a backtrace using gdb.

Which Windows version do you use?
Comment 8 Andy Burns 2018-01-23 17:29:03 CET
Yes that was the end of the debug output (since there was so little I included it all, rather than just the last 20 lines)

The laptop is running Windows 10 Pro 64bit (16299.192)

I'm a bit rusty on gdb, but will give it a try ...
Comment 9 Andy Burns 2018-01-23 17:51:06 CET
Created attachment 380
gdb output

I didn't get much output when I tried a backtrace after the fault

(gdb) backtrace
#0  0x0000002b in ?? ()
Cannot access memory at address 0xabababb7

While pulseview was paused in the debugger, I could at least see that the message displayed in the dialog says "Scanning for center-309..." and that the progress bar reaches 7%.
Comment 10 Soeren Apel 2018-01-23 21:18:26 CET
> #0  0x0000002b in ?? ()
okay, so we know that *something* tried to access an object but used a null pointer instead of the object's address.

However, I'm surprised that the backtrace comes out empty. Would you mind trying it with the gdb script shown here: https://sigrok.org/wiki/Developers#PulseView_and_GDB ?

I'm not sure it makes a difference but I'd like to make sure.
Comment 11 Andy Burns 2018-01-23 23:39:51 CET
Created attachment 381
gdb backtrace

I didn't use your script; instead I used a 32-bit version of gdb rather than the 64-bit one, and it seems to give a meaningful backtrace.
Comment 12 Andy Burns 2018-01-23 23:53:37 CET
Created attachment 382
gdb backtrace

This is the output, still from the 32-bit version of gdb, but this time called from the script. There is more information about threads (but only hex addresses), so I think the actual backtrace for the main thread is more or less the same.
Comment 13 Soeren Apel 2018-01-23 23:59:20 CET
Thanks for the backtraces, much more helpful this time :)

I haven't seen such a bug before and the only relevant bug report I came across was https://github.com/libusb/libusb/issues/16 - which is closed and fixed in 1.0.20, which we're already using.
Comment 14 Andy Burns 2018-01-24 00:15:15 CET
It seems I need a way to avoid pulseview scanning for asix-sigma (or maybe chronovu-la) devices? 

Rather than having pulseview scan for the correct device, I see the mechanism to use "-d xxxxxxx" to specify the driver, is there a way to generate a list of suitable driver names? 

I guess I want something like fx2lafw ?
Comment 15 Uwe Hermann 2018-01-24 00:16:33 CET
Hi, I just ran a quick test with the PulseView installer I downloaded 5 minutes ago (i686 installer, used on Windows 10 64bit though), used Zadig, started PulseView with a USBee SX attached. Works fine here, both startup and acquisitions, multiple times in a row.

This hints even more strongly at something system-specific.

Could you please also try "set LIBUSB_DEBUG=1" and then "pulseview -l 5" from the command line? That should contain a more verbose libusb log as well.

Thanks!

Oh, btw, don't use the USB cable the device shipped with, those are usually garbage (might cause all kinds of USB issues). Please retry with a known-good USB cable, maybe that helps.
Comment 16 Uwe Hermann 2018-01-24 00:17:38 CET
Yup, "sigrok-cli -L" will show all driver names that are compiled in the lib. In this case "fx2lafw" is correct, yes.
Comment 17 Andy Burns 2018-01-24 00:35:16 CET
The LIBUSB_DEBUG=1 environment variable didn't seem to produce any extra output

I had heard that the supplied USB cables were poor quality, but in this case pulseview crashes even when the LA is not plugged in.

This machine was wiped and reinstalled when 1709 edition of Win10 was released (the previous install was Win7->Win8->Win10 upgrade and I had loaded various libusb devices such as FTDI JTAG breakout boards using Zadig or similar driver loaders, but I am sure I haven't "polluted" the machine with that type of driver since it was installed clean).
Comment 18 Uwe Hermann 2018-01-24 01:19:33 CET
Oops, sorry. I meant "set LIBUSB_DEBUG=5".
Comment 19 Andy Burns 2018-01-24 01:27:31 CET
Created attachment 383
libusb debug info

Yes that's very verbose!

My fx2lafw device has VID_0925&PID_3881, which I do see mentioned within the output.
Comment 20 Andy Burns 2018-01-24 01:38:52 CET
OK, by judicious unplugging of USB cable after libusb/sigrok has started its scan, I found that it is my KVM which libusb objects to.

The KVM is an ATEN CS782DP

<http://www.aten.com/au/en/products/kvm/desktop-kvm-switches/cs782dp> 

If I get the timing right, I can enter the command to start pulseview, then very quickly unplug the KVM; the scanning detects my LA and pulseview starts, and then I can plug my KVM back in to get keyboard and mouse again.

The LA shows as a Saleae Logic, not as an fx2lafw. Does that mean I need to alter the USB VID/PID to get the firmware to load?
Comment 21 Andy Burns 2018-01-24 02:15:10 CET
I noticed there was a firmware update available for the KVM, installed it, and tried again leaving the KVM connected, but that didn't help with the crash.
Comment 22 Andy Burns 2018-01-24 13:39:47 CET
To avoid frantic typing and unplugging of USB devices, I now start pulseview from a command line such as

timeout /t 10 & pulseview.exe

which gives me time to unplug the KVM, wait a while and then plug it back in with pulseview working.

Some selections from the verbose log output are below

libusb: debug [libusb_init] created default context
libusb: debug [libusb_init] libusb v1.0.20.11003-rc3
libusb: debug [windows_init] Windows 8 64-bit
libusb: debug [init_dlls] Will use CancelIoEx for I/O cancellation
libusb: info [winusbx_init] libusbK DLL is not available, will use native WinUSB
libusb: debug [winusbx_init] initalized sub API libusbK
libusb: debug [winusbx_init] initalized sub API libusb0
libusb: debug [winusbx_init] initalized sub API WinUSB
[...]
libusb: debug [init_device] found 1 configurations (active conf: 1)
libusb: debug [cache_config_descriptors] cached config descriptor 0 (bConfigurationValue=1, 25 bytes)
libusb: debug [init_device] (bus: 2, addr: 21, depth: 1, port: 3): '\\.\USB#VID_0925&PID_3881#SALEAE_LOGIC'
libusb: debug [windows_get_device_list] allocating new device for session [258]
[...]
sr: std: fx2lafw: std_opts_config_list: sdi/cg != NULL: not handling.

I'm a little surprised it believes the device is a Saleae Logic; I was expecting it to show up as fx2lafw and use the fx2lafw firmware, but I see no sign of that. Should that be the case?

The device works, if I set a trigger on the D0 pin, and then touch that pin on and off the CLK output pin, I do get a capture.

The device seems similar but not identical to the ARMFLY mini-logic shown on the wiki, I'll add a page for it, if I can ...
Comment 23 Uwe Hermann 2018-01-24 13:47:54 CET
If the device has a VID/PID of 0925:3881 it's indistinguishable from a Saleae Logic (or an actual Saleae Logic) and sigrok will use fx2lafw for it, that's expected.
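
For anyone matching log lines to devices, the VID/PID pair can be read straight out of the Windows device path that libusb prints. Below is a minimal Python sketch, assuming the `VID_xxxx&PID_xxxx` path format seen in the log excerpt from comment 22; the function name is made up for illustration.

```python
import re

# Matches the "VID_xxxx&PID_xxxx" part of a Windows USB device path.
_VID_PID = re.compile(r"VID_([0-9A-Fa-f]{4})&PID_([0-9A-Fa-f]{4})")


def vid_pid_from_path(device_path):
    """Return (vid, pid) as lowercase hex strings, or None if absent."""
    match = _VID_PID.search(device_path)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).lower()


# Device path copied from the libusb log in comment 22.
path = r"\\.\USB#VID_0925&PID_3881#SALEAE_LOGIC"
print(vid_pid_from_path(path))
```

Any device reporting this pair would look the same to this check, whatever the hardware behind it is.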
Comment 24 Andy Burns 2018-01-24 14:03:38 CET
OK thanks, I just expected to see some sign of uploading one of the .fw files to the device

If you think there's anything further I can do to resolve the crash when libusb sees my KVM, I'm happy to give whatever info, or do tests.

Thanks so far ...
Comment 25 Andy Burns 2018-01-24 19:25:34 CET
Created attachment 384
lsusb stuff

lsusb, lsusb -t and lsusb -vvv output

The machine is a laptop, in a docking station, with the ATEN KVM that causes the problem
Comment 26 Andy Burns 2018-01-24 19:37:59 CET
Created attachment 385
pulseview -l 5

While booted into fedora27 liveUSB to get the lsusb outputs, I installed and ran the version from the fedora repo (and manually copied .fw files)

It scanned and found the LA at first attempt ...
Comment 27 Andy Burns 2018-01-30 20:10:23 CET
Using the new "-D" option together with the existing "-d fx2lafw" provides a much more satisfactory workaround to the KVM/libusb problem, tested using nightly build 1090.
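
The workaround can be wrapped in a small launcher. This is only a sketch: it assumes `pulseview.exe` is on the PATH, takes the `-D` and `-d` options exactly as they are named in this comment, and the helper name is hypothetical.

```python
def pulseview_command(driver, exe="pulseview.exe"):
    """Build an argv list that passes -D plus an explicit -d <driver>."""
    if not driver:
        raise ValueError("a driver name such as 'fx2lafw' is required")
    return [exe, "-D", "-d", driver]


# Print the command line rather than running it, so the sketch is safe
# to try on a machine without PulseView installed.
print(" ".join(pulseview_command("fx2lafw")))
```

The list can be handed to `subprocess.run()` to actually start the program.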
Comment 28 bram 2018-03-25 19:46:14 CEST
Not sure why this is marked resolved? I have the exact same symptoms, and can only avoid the crash by running with -D. If debugging output is desired, let me know.
Comment 29 Uwe Hermann 2018-03-25 21:02:11 CEST
Reopening for now, not sure why it's closed either.

Please use the latest *.exe installers for sigrok-cli and PulseView and attach "-l 5" logs of the problem, ideally use "set LIBUSB_DEBUG=5" beforehand as well.

Please also mention which OS you use, whether 32/64 bit, which device exactly you use (VID/PID etc.) as well as whether or not there are USB hubs or KVM switches or the like involved here. Thanks!
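
Collecting the requested log can be scripted. The following Python sketch sets `LIBUSB_DEBUG` and runs the program with `-l 5`, writing everything to a file. The executable name, the log file name and the helper names are assumptions for illustration, not part of PulseView.

```python
import os
import subprocess


def debug_env(base_env=None, libusb_level=5):
    """Copy the environment and set LIBUSB_DEBUG to the given level."""
    env = dict(os.environ if base_env is None else base_env)
    env["LIBUSB_DEBUG"] = str(libusb_level)
    return env


def capture_log(exe="pulseview.exe", log_path="pulseview-log.txt", loglevel=5):
    """Run `exe -l <loglevel>` and write stdout and stderr to log_path."""
    with open(log_path, "wb") as log:
        result = subprocess.run(
            [exe, "-l", str(loglevel)],
            env=debug_env(),
            stdout=log,
            stderr=subprocess.STDOUT,
            check=False,  # a crash is the expected outcome here
        )
    return result.returncode
```

Calling `capture_log()` from the directory containing the debug build should leave a single file that can be attached to the bug.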
Comment 30 John 2018-10-15 15:05:22 CEST
I may be able to add some information to this bug. I experienced it for the first time today.

PulseView
Windows 10 64-bit.
ASIX SIGMA.

With this device connected (it appears as a USB Microchip virtual COM port):
https://www.microchip.com/DevelopmentTools/ProductDetails/rn-4678-pictail

PulseView fails to start. I see a search progress bar reach about 3 or 4% and then it closes.

Reinstalling the latest PulseView version did not help.

If I unplug the Microchip development kit, PulseView starts normally.

If PulseView is already running, I can plug in the Microchip kit and PulseView continues to run.

I have tested this a couple of times and it seems to be repeatable.

I have also tested with an FTDI virtual COM port (RS422 dongle) and have no problems.

In my case at least, there would seem to be a clash between the USB drivers. Although where the fault lies is another matter.

I am happy to carry out further testing to provide more detail if that is of interest - time allowing.

John
Comment 31 Henry Gabryjelski 2019-05-03 23:31:39 CEST
I've been able to successfully load PulseView once.
I am now experiencing this bug also.

Reading in this thread that the trigger may be a COM port for others was a useful hint!
I caught a first trace using a USB-attached Rigol DS1054z.

Later attempts to launch PulseView just crashed, but it's possible I had attached an Arduino variant that exposed a COM port.  Having an Arduino attached (with its COM port) is hopefully a "core" scenario that PulseView would want to support.  :)

I can obtain the requested information, using a debug build, via:

SET LIBUSB_DEBUG=5
PulseView.exe -l 5

However, I can *also* obtain a type of trace that will revolutionize how these types of bugs are root-caused... (at least on Windows).

It's called `Time Travel Debugging`.  Once a TTD trace is generated, that trace allows a remote replay of the exact instructions and data, as it occurred when generating the trace, and playing the execution "backwards in time".  No more back-and-forth asking users to try additional variations... the exact root cause can quickly be tracked down, by debugging it "as though" it were live, but with added capability of stepping/going backwards in time (including breakpoints).

Creating a TTD trace (.RUN file) takes 10x-20x more time.  So, I want to ensure someone would be able to review the resulting .RUN prior to spending the time generating it.


How to consume a Time Travel Trace?

Requirements:
1. Windows 10
2. `WinDBG Preview` application from the Windows App Store

See https://blogs.windows.com/buildingapps/2017/09/27/time-travel-debugging-now-available-windbg-preview/


Yes, it's worth the time to learn WinDBG… Example:

0. Load the .RUN file and setup symbols, index of trace, etc.
   0:000> .symfix
   0:000> .sympath+ C:\SRC\Symbols\
   0:000> .srcfix
   0:000> .srcpath+ C:\SRC\PulseView\
   0:000> .reload
1. Load the .RUN file into WinDBG Preview, and index it
   0:000> !tt.index
2. Run the program until the crash (guaranteed to repro the issue)
   0:000> g
3. Set a break-on-write breakpoint for that (e.g. 4-byte) pointer
   0:000> ba w4 <address>
4. Go "backwards" in time until that breakpoint is hit
   0:000> g-


Execution is now at the exact instruction that wrote the NULL value
lather/rinse/repeat....
See https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/time-travel-debugging-navigation-commands

WILL YOU LOOK AT A TIME TRAVEL TRACE (.RUN file), IF GENERATED / PROVIDED?
Comment 32 Soeren Apel 2019-05-03 23:52:57 CEST
Thanks for your offer Henry, but I don't run Windows. Not sure if Uwe has the time/resources to dig into WinDBG either, but I'm sure he'll let us know.

Essentially, what happens is that libsigrok thinks that there are devices present that need to be scanned because they could be known data acquisition devices - and crashes while attempting this. This crash could be the result of a broken device driver or a matching VID:PID pair. In either case, our goal is to try and figure out a way to prevent libsigrok from scanning devices it should be ignoring.

To do so, a simple backtrace is sufficient in the case of a broken device driver and a USB info overview in the case of an invalid device match.
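
One way to picture "not scanning devices that should be ignored" is a check of the VID:PID pair before a device is opened at all. The Python sketch below is only an illustration of that idea under stated assumptions: the table, the function name and the second device pair are made up, and it does not show how libsigrok drivers actually match devices.

```python
# VID:PID pairs a driver is known to handle. 0925:3881 is the pair
# discussed in this thread; a real driver would carry its own table.
KNOWN_DEVICES = {(0x0925, 0x3881)}


def should_scan(vid, pid, known=KNOWN_DEVICES):
    """Return True only for devices whose VID:PID pair is on the list."""
    return (vid, pid) in known


# (vid, pid) pairs as they might come from USB device descriptors.
attached = [(0x0925, 0x3881), (0x1234, 0x5678)]  # second pair is made up
print([f"{v:04x}:{p:04x}" for v, p in attached if should_scan(v, p)])
```

With such a check, a device that is not on the list would never be opened, so a misbehaving driver for it could not be reached from the scan.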
Comment 33 Soeren Apel 2019-05-04 00:05:22 CEST
@Uwe: I looked a little closer at Andy's backtrace and noticed that

Thread 1 received signal SIGSEGV, Segmentation fault.
windows_assign_endpoints (dev_handle=0xafc0600, iface=2, altsetting=<optimized out>) at os/windows_usb.c:762

could indeed cause a crash when iface #0 hasn't been claimed before - it's a "known" Windows quirk: https://sourceforge.net/p/libusb/mailman/message/24535681/ (final few messages)

I wonder if we have AUTOCLAIM enabled or not. If not, I think we should enable it when building for Windows.
Comment 34 Henry Gabryjelski 2019-05-04 03:49:20 CEST
I cannot find symbols, even in the debug nightly builds.  Are those available?

The difference between PulseView starting or crashing is whether or not I've got an Adafruit Feather nRF52840 Express device attached via USB or not.


Here's a stack trace, from x64 nightly windows binaries dated 2019-05-03:

0:000> kv
 # Child-SP          RetAddr           : Args to Child                                                           : Call Site
00 00000000`072ce550 00000000`00a276fe : 00000000`000005cc 00000000`00000005 00000000`072ce7c0 00000000`013c9ecf : pulseview!sp_get_lib_version_string+0x61b
01 00000000`072ce750 00000000`00a28185 : 00000000`00000564 00000000`072cea70 00000000`013c9ecf 00000000`072cea70 : pulseview!sp_get_lib_version_string+0x4be
02 00000000`072ce950 00000000`00a1faae : 00000000`387d4a90 00000000`32ff5da0 00000000`0000006c 00000000`00000056 : pulseview!sp_get_lib_version_string+0xf45
03 00000000`072cee30 00000000`00a1fe2a : 00000000`00000000 00000000`00000000 00007ffb`7c89d450 00007ffb`7da19da0 : pulseview!sp_get_port_by_name+0x10e
04 00000000`072cee90 00000000`00000000 : 00000000`00000000 00000000`32ff5da0 00000000`000001c1 81010101`01010100 : pulseview!sp_free_port_list+0x11a

See attached file for debug output of when it crashes (pulseview04.txt) vs without the Adafruit Feather nRF52840 Express board (Arduino based board) attached.
Comment 35 Henry Gabryjelski 2019-05-04 03:50:17 CEST
Created attachment 528
Crash at start with Adafruit Feather nRF52840 Express attached
Comment 36 Soeren Apel 2020-02-10 13:41:22 CET
*** Bug 1319 has been marked as a duplicate of this bug. ***
Comment 37 Chris 2021-03-06 00:15:43 CET
Created attachment 729
Crash log runs and usb device information from device that causes crash.

Hello all!

I've had this problem for a while and decided to dig at it a little bit. My workaround has been launching in safe mode, and manually selecting my device type and then scanning.

Starting with all of my LAs detached, I went through the list of all the types today, and found that I only ran into the crash when it tried to scan with the `chronovu-la` driver, and the `zeroplus-logic-cube` driver.

I snagged the debug build and set LIBUSB_DEBUG=5, and ran with -l 5 to grab some output. I've attached two runs. The first generated from clicking the scan button after launching in safe mode. The second is just running regularly, letting it do the auto scan.

I noticed both times it was hitting the segfault after trying to open device "3.5", so I traced up the output a bit. 3.5 seemed to refer to VID 0x0B05, PID 0x18F3, (which seemed to be referred to as session '2CC')
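
The "tracing up the output" step can be sketched in Python. It assumes the `init_device` line format shown in the libusb excerpt in comment 22, and that a label such as "3.5" means bus 3, address 5; both are assumptions, and the function name is hypothetical.

```python
import re

# Format of libusb's init_device debug lines, as seen in comment 22.
_INIT_DEVICE = re.compile(
    r"\[init_device\] \(bus: (\d+), addr: (\d+), .*?\): '(.*)'"
)


def devices_by_address(log_lines):
    """Map "bus.addr" strings to the device path logged for them."""
    devices = {}
    for line in log_lines:
        match = _INIT_DEVICE.search(line)
        if match:
            bus, addr, path = match.groups()
            devices[f"{bus}.{addr}"] = path
    return devices


# Two lines copied from the log excerpt in comment 22; only the first
# one describes a device, the second is skipped by the pattern.
sample = [
    r"libusb: debug [init_device] (bus: 2, addr: 21, depth: 1, port: 3): "
    r"'\\.\USB#VID_0925&PID_3881#SALEAE_LOGIC'",
    "libusb: debug [init_device] found 1 configurations (active conf: 1)",
]
print(devices_by_address(sample))
```

Run over a full log, looking up the key "3.5" in the returned mapping would give the device path, and with it the VID/PID, of the device opened just before the crash.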

I recognized 0x0B05 as ASUS - my motherboard manufacturer. I opened up the Zadig that came with PulseView. Of course, there was nothing in the list, so I enabled "List All Devices" and thumbed through the list looking for the VID/PID pair from above. Sure enough, I found it:

Two interfaces were listed for "AURA LED Controller". I fired up UsbTreeView and tracked down which device was presenting those interfaces; they both came from a single USB Composite Device (which is why it presents as usbccgp). I disabled it and rebooted, as it wouldn't let me hot-remove it since the driver was holding on to it. (It's a physical chip on the motherboard, just connected via USB. There's nothing to physically unplug.)

After the reboot, I fired up PulseView without safe mode; it scanned for everything and dropped me into the application with the "Demo device"! I closed it, re-enabled my LED controller, ran PulseView again and reproduced the crash!

Now the hard part: but why? Well, I hope that's where y'all have something fun to tell me, because I'm stumped at the moment.

The zip file I've attached also has the output of USB Device Tree View for the LED controller, in case it helps.

Let me know if I can look at anything else, thanks for all the hard work!
Comment 38 Nadir Syed 2021-08-17 22:05:31 CEST
(In reply to Chris from comment #37)
Comment 39 Nadir Syed 2021-08-17 22:07:28 CEST
I have a similar issue also with my Asus motherboard.
I can only successfully launch PV using -D and then manually selecting the device.

I also have the ASUS Aura LED thing using WinUSB that I cannot "unplug".