After fix from bug 1416 (commit 02a8c07d89ae), libsigrok is miscompiled when LTO is enabled. It finds no devices: $ sigrok-cli --scan <no output> If I disable LTO: $ sigrok-cli --scan The following devices were found: demo - Demo device with 12 channels: D0 D1 D2 D3 D4 D5 D6 D7 A0 A1 A2 A3 The problem are sr_driver_list__start and sr_driver_list__stop symbols: (gdb) p &sr_driver_list__start $1 = (const struct sr_dev_driver *(*)[1]) 0x7ffff7dd7f80 <sr_driver_list.start> (gdb) p &sr_driver_list__stop $2 = (const struct sr_dev_driver *(*)[1]) 0x7ffff7dd7f88 <sr_driver_list.stop> With LTO, the array contains only the dummy NULL entry (1 pointer/8 bytes). On the other hand, when LTO is disabled: (gdb) p &sr_driver_list__start $1 = (const struct sr_dev_driver *(*)[]) 0x7ffff7dd81f0 <sr_driver_list.start> (gdb) p &sr_driver_list__stop $2 = (const struct sr_dev_driver *(*)[]) 0x7ffff7dd86b8 <sr_driver_list.stop> (gdb) p (0x7ffff7dd86b8-0x7ffff7dd81f0)/8 $9 = 153 Note that in both cases, __sr_driver_list is correct: $ objdump -h -j __sr_driver_list /usr/lib64/libsigrok.so.4 /usr/lib64/libsigrok.so.4: file format elf64-x86-64 Sections: Idx Name Size VMA LMA File off Algn 24 __sr_driver_list 000004d0 00000000001071f0 00000000001071f0 001061f0 2**3 CONTENTS, ALLOC, LOAD, DATA So with LTO, the hack for sr_driver_list__start and stop does not work.
Created attachment 556 [details] working fix If I use this linker script instead of the hack, it works.
agreed on the idea to remove the head and tail source files, and to use linker provided symbols instead for start and stop -- good idea! suggest minor adjustments (though these are low prio): - add the driver list after .text or .rodata and before .data, because this is constant data - will -Wc,-T,filename work? keeps the -T option and its argument together and increases readability/awareness/maintainability these are more severe: - it's essential to terminate the list with a NULL item, can the linker script do that? like ". += 16;" after the stop symbol and before the end of the new output section? libsigrok (internal library code, and bindings) do rely on the NULL termination, its absence results in undetermined behaviour (may or may not work, by coincidence) - is the linker script's collecting the list items portable across the set of sigrok supported platforms? not all of them use __sr_driver_list as the items' section name -- is something wider needed? like "*(*__sr_driver_list)" perhaps? with a wildcard in the section name
(In reply to Gerhard Sittig from comment #2) > - will -Wc,-T,filename work? keeps the -T option and its argument together > and increases readability/awareness/maintainability I am not sure, it actually might work, I will try. > these are more severe: > - it's essential to terminate the list with a NULL item, can the linker > script do that? Simply adding QUAD(0) after the sections content should do the job. > - is the linker script's collecting the list items portable across the > set of sigrok supported platforms? not all of them use __sr_driver_list > as the items' section name -- is something wider needed? like > "*(*__sr_driver_list)" perhaps? with a wildcard in the section name If there are other sections names, sure, wildcards would work. BTW do you have some test farm cross-compiling on all the various metal, so that you can test whether it compiles at least?
(In reply to Jiri Slaby from comment #3) > (In reply to Gerhard Sittig from comment #2) > > - will -Wc,-T,filename work? keeps the -T option and its argument together > > and increases readability/awareness/maintainability > > I am not sure, it actually might work, I will try. It works.
I'm not aware of a publicly accessible build farm (can't tell whether an OBS user created a configuration for the sigrok software). But the sigrok project certainly has a CI setup for the mainline codebase, if this was your question. Thank you for checking the -Wc,-T,file and the section wildcards syntax, and for the QUAD(0) hint which is more reliable than FILL(). Have also noticed that the linker script needs a $(srcdir) prefix to unbreak out-of-source builds. Build dependencies may be incomplete, modifications to the linker script don't always trigger rebuilds or take effect here. Have yet to check the resulting binaries, static and dynamic, and the symbols and terminators which they contain. Ideally your LTO motivated change would improve robustness of the build process for all other build configurations, too. And a partial link of driver sources and list items to one libdrivers.o file which contains the driver code and the complete list of drivers including the terminator and the iteration helper would be most preferrable.
I have created an isolated experiment to tinker with autotools and libtool and partial linking and different build configurations (LTO and non-LTO). Turns out to be rather non-trivial. The autotools seem to like getting in the way, but I cannot see a better approach to the driver list collection than this approach of partial linking. So I followed that road. There are several subtleties involved, like amending vs overriding linker scripts, order of command line words and turning special options on and off again, static vs dynamic libs, the different times of compilation and linking of LTO and non-LTO setups, etc. This implementation tried to use official .am syntax, but failed and then fell back to custom make(1) rules which hook up to libtool and automake use. The idea is to isolate the intermediate step between driver source compilation and creating the complete libsigrok library, so that the adjustment to the existing sigrok build process remains least intrusive. git://repo.or.cz/autotool-experiments.git The current implementation does work here for empty and non-empty sets of user specified drivers, with LTO enabled and disabled, on Linux. It has not yet been tested on other platforms and may need adjustment of filename extensions there. @Jiri: The patch that you provided was useful to get there, but needs more adjustment before it will work in the sigrok project for more build configs. Unless you provide another version, I will port my experiment (once it has seen wider tests) to the libsigrok environment, and will add your Reported-By: for the record.
Hi! Is there any news about this issue? FYI: Fedora recently switched on LTO globally for all packages. This resulted in missing HW drivers in Sigrok in Fedora 33: https://bugzilla.redhat.com/show_bug.cgi?id=1877485.
See comment 6. When I asked for review and/or test, I got zero feedback as usual (got this on several occassions before). Seems like no other users care about this issue, or maybe they rather wait for somebody else to do all of what's involved. So the situation has not changed since then. There is (independent) work of moving from autotools to cmake. Can't tell what the status is on that front, and how it affects the drivers list topic.
Created attachment 697 [details] Simpler fix I think I found a simpler fix for this issue. It does not require any build system modifications at all.
I like the approach in comment 9 a lot for its simplicity. Easy to reason about, low risk. Unfortunately the feature appears to depend on the toolchain version, and causes lots of warnings on some of the platforms that are supported by sigrok (one message per driver registration). ... CC src/driver_list_start.lo ./source/src/driver_list_start.c:33:3: warning: 'no_reorder' attribute directive ignored [-Wattributes] = { NULL /* Dummy item, as zero length arrays are not allowed by C99 */ }; ^ CCLD src/libdrivers_head.la CC src/driver_list_stop.lo ./source/src/driver_list_stop.c:33:3: warning: 'no_reorder' attribute directive ignored [-Wattributes] = { NULL /* Dummy item, as zero length arrays are not allowed by C99 */ }; ^ CCLD src/libdrivers_tail.la CCLD src/libdrivers.o ... https://www.gnu.org/software//gcc/gcc-5/changes.html suggests the feature was introduced in GCC 5. Can you add a feature check (preferred) or a version check? That'd allow me to pick up your fix for mainline, and give you credit for the complete fix.
Created attachment 698 [details] conditional no_reorder linker flag application Found an inspiration in the above mentioned release notes. See the attached diff. If this works for you, I can take this into mainline and give you a Submitted-By. Can you agree? [ am rushing this diff to avoid duplicate effort, hence the dirty form ]
(In reply to Gerhard Sittig from comment #11) I tested you patch, and it works. So yes, I agree.
Fixed in libsigrok 3decd3b1f0cb. Phrased such that other similar setups can get handled by future commits, and that developers/packagers can specify what _they_ need in their specific situation.
Passing an update from our toolchain person. https://bugzilla.redhat.com/show_bug.cgi?id=1877485#c14 Fundamentally this code from src/drivers.c is broken: for (const struct sr_dev_driver **drivers = sr_driver_list__start + 1; drivers < sr_driver_list__stop; drivers++) g_array_append_val(array, *drivers); The patch referenced in this BZ may work now, but I wouldn't at all be surprised if this breaks again one day. Depending on two distinct objects being ordered in a particular way in memory is just wrong and I'm confident if I had the time to dig through the ISO C/C++ standards that it's got undefined behavior.
Re comment 14: That's rather bold a claim, especially in these general terms. I can agree that the location of mere C language variables or routines in memory is not defined and shall not be relied upon. But the code in question is different, and the "just wrong" comment may miss its context. A linker script arranges for the sr_driver_list items to form a NULL terminated array of references to driver item descriptors. If this is considered unreliable, then .data copies from flash to RAM, .bss initialization, constructor and destructor calls, the separation of early discardable init code from persistent .text content, and similar constructs were broken, too. If some LTO configuration manages to bodge these essential mechanisms, then I would not want to use it. :) Granted, future maintenance needs to remain aware of the necessity to order these registered items and terminate the list properly. So if LTO based setups ignore the linker script, some other enforcement needs to be put in place. That was the point of using declaration helpers for all items such that the list construction can remain transparent as well as reliable. Comment 13 suggests that we are aware, _and_ have prepared for future adjustment as the need arises. Can you agree with that point of view, or am I missing something critical? Was comment 14 referring to a different patch than commit 3decd3b1f0cb perhaps?
Call for feedback: If the current implementation of the driver list item declaration is considered incomplete or unreliable, got suggestions how to improve its reliability and/or portability? That'd be very much appreciated. Would be better than waiting for the next breakage to get reported after having a user suffer from the issue again. We don't want that. Thank you for helping improve the implementation.
I still feel that the linker section with the list of pointers to driver items _is_ the most appropriate way of implementing the drivers list contruction. Though I can also agree that the linker section approach depends on the separate compiler and linker tools and how external build infrastructure combines these, or even adds more complexity and fragility to the setup. The recent trend to more aggressively optimize without much regard for other valid use cases beyond application development just reveals the fragility. But I have yet to understand what is considered _wrong_ about the approach of collecting the pointers in a linker section. In the sense of "incorrect". Beyond the potential "rather complex" or "not intuitive" concerns, which do apply but don't relate to correctness, only express some expectation. Is there a convincing reason why the linker section of only pointers is _wrong_? The more I think about this, the better I like comparing the drivers list to constructor calls, which also happens to be implemented by means of a list of pointers in a linker section. What am I missing when I consider the drivers list and the constructors list as something similar? IRC conversation suggests that "C language constructors" and runtime registration of items in the drivers list could work. It's unfortunate that these have to re-invent the drivers list collection that was done at compile time before. Cost would be a little higher, but acceptable. The complexity would remain at a similar level: The iteration in calling code remains as simple as it is now. Individual drivers don't need adjustment as the specifically chosen approach remains transparent. All adjustment is concentrated in a single spot which declares the registration helper. Are constructor attributes considered more reliable than linker section attributes? Are constructors considered "used" and will be kept and not dropped by aggressive optimizers? Have updated the isolated experiment which eliminates the complexity of libsigrok build rules out of the picture. The recent version implements the constructor approach and registers driver items at runtime. A linked list is used to eliminate the necessity for a compile time constant for the maximum list size. Local experiments on a small group of platforms look promising so far. git://repo.or.cz/autotool-experiments.git
*** Bug 1849 has been marked as a duplicate of this bug. ***
This has also recently cropped up in ubuntu 23.04: https://bugs.launchpad.net/ubuntu/+source/libsigrok/+bug/2025248 I kind of blame them for packaging the old 0.5.2 (which predates http://sigrok.org/gitweb/?p=libsigrok.git;a=commit;h=3decd3b1f0cbb3a035f72e9eade42279d0507b89) but I kind of don't because it is the last release tag. So it might be time to tag another release? As long as I'm asking dumb questions, was there a reason not to make each driver a separate shared library then try to dlopen all of them (ignoring ones that aren't found) instead of arcane linker tricks?
updating status based on https://bugs.launchpad.net/ubuntu/+source/libsigrok/+bug/2025248
Proposed fix (WIP), https://github.com/sigrokproject/libsigrok/pull/233 Instead of using a special section that holds a pointer to each of the driver descriptors, use "__attribute__((constructor))" to run a certain function at runtime just before main() entry. That is also before glib init and possibly (unclear) before heap is usable, so it may not be possible to directly build a GArray or GSList. A simple linked-list provides the intermediarey step, and sr_drivers_init() still builds the final driver list via a GArray. This compiler attribute is well-supported by gcc and clang. The SR_REGISTER_DEV_DRIVER_LIST() implementation is also not very elegant, but is a drop-in replacement. Comments & tests welcome, currently just $worksforme