Over the last month or so I've been switching my system over to a state where the default system libraries and error-prone programs (e.g. firefox, galeon, etc.) have full debugging capabilities. When these programs and their required libraries are emerged, I use -ggdb for CFLAGS & CXXFLAGS, and the debug files are placed on a non-critical disk symlinked to /usr/lib/debug. This generally works fine (and my gracious thanks and doffed hat to the people who made this possible).

Except (there are always one or two, aren't there...) there seems to be a problem with getting symbols (files, line numbers, argument names, etc.) for a subset of the required libraries. When debugging firefox or seamonkey, I can't get symbols for libglib.so. When debugging galeon, I can't get symbols for libxul.so (which is kind of strange, since emerge reports galeon was built (debug -seamonkey -xulrunner)).

The debug files are there:

-rw-r--r-- 1 root root 343584 May 12 12:33 /usr/lib/debug/usr/bin/galeon.debug
-rw-r--r-- 1 root root 6187897 May 12 08:21 /usr/lib/debug/usr/lib/xulrunner-1.9/libxul.so.debug

And gdb "thinks" it loaded the symbols, e.g.

Reading symbols from /usr/lib/xulrunner-1.9/libxul.so...Reading symbols from /root3/usr/lib/debug/usr/lib/xulrunner-1.9/libxul.so.debug...done.
done.
Loaded symbols for /usr/lib/xulrunner-1.9/libxul.so

(in the middle of loading at least 149 shared libraries -- and one wonders why programs take so long to start!!)

For almost *all* of these shared libraries the gdb backtraces print what one would expect. But for any functions in libxul.so: zip. The same applies to libglib.so functions when debugging firefox. I've tried playing around with info sharedlibrary and add-symbol-file, but these appear to have no effect. I did notice that the loading of libxul.so symbols in galeon takes place after some additional threads have been started, but I don't think this should make a difference. I don't think that is the case with firefox.
Reproducible: Always

Steps to Reproduce:
1. Emerge galeon (or firefox) and associated system libraries with -ggdb flags.
2. Run these programs under gdb and set a breakpoint at main().
3. When main() is reached, set a breakpoint in an often-used function; poll() is a good example (or you can set the poll() breakpoint at startup, taking into account that glibc isn't loaded yet).
4. Wait until a poll() is hit; or run the program for a while, then signal it with a "kill -13 galeon-PID-number" and do a thread apply all bt.

Actual Results:
One gets a mixed bag of functions with and functions without symbols. In the example I'm looking at, it appears that libxul, libglib and libbonobo are missing symbols in spite of the fact that the debug libraries all exist. I've tried linking the "version" named debug files to the generic versions, e.g. /usr/lib/debug/usr/lib/libglib-2.0.so.0.2000.2.debug links to /usr/lib/debug/usr/lib/libglib-2.0.so.0.debug, but that doesn't appear to do anything. Gdb acts as if it doesn't have any idea about certain library symbols. For example, gpoll.c:g_poll() calls poll() but I can't list gpoll.c.

Expected Results:
Gdb should locate and provide access to all symbols. See attachments.
Created attachment 193956 [details]
emerge --info output

Emerge --info for the gdb/galeon/firefox system with missing symbols
Created attachment 193958 [details]
Backtrace (thread apply all bt) for galeon showing missing symbols

This is a thread apply all bt for galeon, interrupted during a relatively large "restart" which was in an endless non-productive CPU loop. (Galeon, firefox and opera are all problematic with older and/or larger "restarts", involving, to varying degrees, maxing out the CPU and/or IP I/O, their use of poll() calls, and timeouts on various descriptors.) This backtrace resulted from interrupting galeon with a SIGPIPE signal. Similar results are obtained if one sets a breakpoint at poll(), proceeds through the first few simple calls to poll(), and gets to the point where there are multiple threads (doing DNS lookups?) running and one is attempting to poll() 10+ fds (sockets, pipes & files).
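One way to triage a long "thread apply all bt" like this is to count, per library, the frames for which gdb printed "from LIBRARY" instead of "at FILE:LINE". What follows is only a minimal sketch: it assumes the backtrace has been saved as text with one frame per line (no wrapping), and the function name is hypothetical rather than part of gdb.

```python
import re
from collections import Counter

# A frame without source info ends in "from /path/to/lib.so";
# a frame with source info ends in "at file.c:123" and will not match.
_FRAME_FROM = re.compile(r"^#\d+\s+.*\bfrom\s+(\S+)\s*$")


def frames_without_line_info(backtrace_text):
    """Count frames per shared library that lack file/line information."""
    counts = Counter()
    for line in backtrace_text.splitlines():
        match = _FRAME_FROM.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python triage_bt.py < galeon-bt.txt
    for lib, n in frames_without_line_info(sys.stdin.read()).most_common():
        print(n, lib)
```

The libraries at the top of that list are the ones whose debug files are worth inspecting first.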
Post the build log for the packages in question. Most likely they aren't respecting the debug flag and instead are stripping debug information. gdb is most likely correct -- it is reading the symbols *that are available* in the split debug file.
There are two separate bugs here.

1) Lack of symbol information from specific libraries (e.g. libgtk-2.0, libgdk-2.0, libxul and some others). The packages gtk+ and mozilla-firefox are two that I'm dealing with right now. They *do* however produce "local" debug information; for example, the difference between an "nm" and an "nm --extern-only" on /usr/lib/debug/usr/lib/libgtk-x11-2.0.so.0.1600.1.debug is 18,000+ symbols. And an "info shared" in gdb indicates that it has the symbols for "/usr/lib/libgtk-x11-2.0.so.0", which is symlinked to libgtk-x11-2.0.so.1600.1. Gdb has the global symbols, which is what you would expect, but is missing the line numbers and the argument symbols. This does not apply to all shared libraries, which is the strange part.

2) Building a glibc with debug symbols fails to create argument symbols for some functions. For example, poll() gets symbols while read() does not. This is presumably a library build problem. I'm relatively convinced that this involves the use of strong_alias / weak_alias function name mapping (which may or may not imply saving argument symbol names & # for the aliased functions).

Attachments forthcoming.
Created attachment 198694 [details] Example of C program calling read() and poll() w/ gdb trace Program and trace thereof showing that in gdb with glibc functions one can get valid traces for some system calls (e.g. poll) and not for others (e.g. read). Try it on your system with a glibc built for debugging.
Created attachment 198696 [details]
Gdb trace (edited) showing lack of arguments in back traces

The back traces (bt) show the mix of functions, most of which are from shared libraries (most of which are compiled for debugging). Some of them provide argument lists (or function line numbers), some do not. The only explanation that I have is a gdb bug. I don't know if there is a way to pass an argument to "nm" to confirm that the line numbers & argument symbols are indeed in the *.debug libraries. The "info shared" output seems to suggest the symbols are present.
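On the question of confirming what is inside a *.debug file: line numbers and argument names live in DWARF sections (.debug_info, .debug_line, .debug_abbrev), and "readelf -S" lists a file's sections. The sketch below assumes binutils' readelf is installed and simply checks its output for those section names; the function names are illustrative only.

```python
import re
import subprocess

# DWARF sections gdb needs for file/line and argument information.
DWARF_SECTIONS = (".debug_info", ".debug_line", ".debug_abbrev")


def dwarf_sections_present(readelf_output):
    """Map each DWARF section name to True/False, given `readelf -S -W` text."""
    # In wide mode each section is one row: "[Nr] name type addr ..."
    names = set(re.findall(r"^\s*\[\s*\d+\]\s+(\S+)", readelf_output, re.MULTILINE))
    return {sec: sec in names for sec in DWARF_SECTIONS}


def has_line_info(readelf_output):
    """True if both .debug_info and .debug_line are listed."""
    present = dwarf_sections_present(readelf_output)
    return present[".debug_info"] and present[".debug_line"]


def read_section_headers(path):
    """Run readelf on `path` and return its section header listing."""
    result = subprocess.run(
        ["readelf", "-S", "-W", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    import sys

    for debug_file in sys.argv[1:]:
        print(debug_file, dwarf_sections_present(read_section_headers(debug_file)))
```

A large difference between "nm" and "nm --extern-only" only shows that local symbol-table entries exist; a file can have those and still lack the DWARF sections checked here.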
While trying to analyze core dumps generated from xserver-1.8.0.901 using gdb-7.1, I observed that bypassing the splitdebug machinery by copying over libglx.so from portage's build directory to /usr/lib/opengl/xorg-x11/extensions/libglx.so brought back line numbers and source file information. I'm therefore suspecting another problem in elfutils/debugedit (like the one fixed in bug #288977).
While analysing an x11-base/xorg-server-1.8.2 core file for an opengl related segfault, I think I figured out a part of the problem:

# gdb $(which X) X-6362-6-1281112323
GNU gdb (Gentoo 7.1 p1) 7.1
[...]
(gdb) bt
[...]
#9 <signal handler called>
#10 0x4ebc8f46 in ?? () from /usr/lib/xorg/modules/extensions/libglx.so
#11 0x1102dbef in FreeClientResources (client=0x11421bf0) at resource.c:818
#12 0x11009fad in CloseDownClient (client=0x11421bf0) at dispatch.c:3631
#13 0x1100f071 in KillAllClients () at dispatch.c:3655
#14 Dispatch () at dispatch.c:468
#15 0x1100345a in main (argc=10, argv=0x5aecaee4, envp=0x5aecaf10) at main.c:286

==> no symbols for libglx.so

(gdb) info shared
[...]
0x4efa0d90 0x4efa364c Yes /usr/lib/xorg/modules/extensions/libdbe.so
0x4ec1c9c0 0x4ec21d74 Yes (*) /usr/lib/xorg/modules/extensions/libdri.so
0x4ebe9000 0x4ebee124 Yes /usr/lib/libdrm.so.2
0x4ec15320 0x4ec17d54 Yes (*) /usr/lib/xorg/modules/extensions/libdri2.so
0x4eb8f500 0x4ebd5520 Yes (*) /usr/lib/xorg/modules/extensions/libglx.so
0x4ec08570 0x4ec10d5c Yes /usr/lib/xorg/modules/input/synaptics_drv.so
[...]
(*): Shared library is missing debugging information

--> except for libdri.so, libdri2.so and libglx.so, all symbols are available. But these three shared objects are the ones that are shuffled around using symlinks by "eselect opengl set xorg-x11":

# ls /usr/lib/xorg/modules/extensions -l
-rwxr-xr-x 1 root root 17896 6. Aug 20:59 libdbe.so
lrwxrwxrwx 1 root root 46 6. Aug 21:00 libdri2.so -> ../../../opengl/xorg-x11/extensions/libdri2.so
lrwxrwxrwx 1 root root 45 6. Aug 21:00 libdri.so -> ../../../opengl/xorg-x11/extensions/libdri.so
-rwxr-xr-x 1 root root 96440 6. Aug 20:59 libextmod.so
lrwxrwxrwx 1 root root 45 6. Aug 21:00 libglx.so -> ../../../opengl/xorg-x11/extensions/libglx.so
-rwxr-xr-x 1 root root 26132 6. Aug 20:59 librecord.so

Looking into the equivalent debug symbol path in /usr/src/debug:

# ls -l /usr/src/debug/usr/lib/xorg/modules/extensions
-rw-r--r-- 1 root root 67112 6. Aug 20:59 libdbe.so.debug
-rw-r--r-- 1 root root 277363 6. Aug 20:59 libextmod.so.debug
-rw-r--r-- 1 root root 71076 6. Aug 20:59 librecord.so.debug

So, where are libglx.so.debug, libdri.so.debug and libdri2.so.debug? Well, here:

# ls -l /usr/src/debug/usr/lib/opengl/xorg-x11/extensions/
-rw-r--r-- 1 root root 69468 6. Aug 20:59 libdri2.so.debug
-rw-r--r-- 1 root root 111270 6. Aug 20:59 libdri.so.debug
-rw-r--r-- 1 root root 1514581 6. Aug 20:59 libglx.so.debug

Now testing the hypothesis by linking libglx.so.debug manually:

# pwd
/usr/src/debug/usr/lib/xorg/modules/extensions
# ln -s ../../../opengl/xorg-x11/extensions/libglx.so.debug libglx.so.debug
# nm --line-numbers libglx.so.debug | head
000484f9 t .L10 /mnt/hda1/tmp/portage/x11-base/xorg-server-1.8.2/work/xorg-server-1.8.2/glx/singlesize.c:156
0004848b t .L11 /mnt/hda1/tmp/portage/x11-base/xorg-server-1.8.2/work/xorg-server-1.8.2/glx/singlesize.c:159
[...]
# gdb $(which X) X-6362-6-1281112323
GNU gdb (Gentoo 7.1 p1) 7.1
[...]
(gdb) info shared
[...]
0x4efa0d90 0x4efa364c Yes /usr/lib/xorg/modules/extensions/libdbe.so
0x4ec1c9c0 0x4ec21d74 Yes (*) /usr/lib/xorg/modules/extensions/libdri.so
0x4ebe9000 0x4ebee124 Yes /usr/lib/libdrm.so.2
0x4ec15320 0x4ec17d54 Yes (*) /usr/lib/xorg/modules/extensions/libdri2.so
0x4eb8f500 0x4ebd5520 Yes /usr/lib/xorg/modules/extensions/libglx.so
0x4ec08570 0x4ec10d5c Yes /usr/lib/xorg/modules/input/synaptics_drv.so
[...]
(gdb) bt
[...]
#9 <signal handler called>
#10 0x4ebc8f46 in DrawableGone (glxPriv=0x11ed1710, xid=20971624) at glxext.c:133
#11 0x1102dbef in FreeClientResources (client=0x11421bf0) at resource.c:818
#12 0x11009fad in CloseDownClient (client=0x11421bf0) at dispatch.c:3631
#13 0x1100f071 in KillAllClients () at dispatch.c:3655
#14 Dispatch () at dispatch.c:468
#15 0x1100345a in main (argc=10, argv=0x5aecaee4, envp=0x5aecaf10) at main.c:286

That however does not resolve the other issues:

1.) The gdb symbol-file command does not work correctly, possibly also because gdb is unable to relate the debug symbols to the already loaded object file due to the same pathname mismatch. If you try to use symbol-file with gdb-7.1, it will even destroy the other symbols already loaded.

2.) The missing argument issue seen by Robert, which he suspects has to do with weak symbols.

Robert, can you try gdb-7.1 to see if there are also shared objects marked with (*), and then inspect the corresponding symbol files very carefully to check that they really are in the right relative place? From your logs I would assume they are, or at least that they were in the correct location at some point. Maybe all you would have had to do is restart the gdb session after copying the symbol files?
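The path mismatch can be made concrete with gdb's documented lookup order for a debug link: the object's own directory, a .debug subdirectory of it, then each global debug directory with the object's directory path appended. The sketch below only computes those candidate paths. It assumes gdb uses the path through which the library was loaded (the symlink), as the listings above suggest, and it ignores the build-id lookup method; the function name is hypothetical.

```python
import os


def debuglink_candidates(objfile_path, debuglink_name, debug_dirs=("/usr/lib/debug",)):
    """List the paths a debug-link lookup would try, in order."""
    objdir = os.path.dirname(objfile_path)
    candidates = [
        os.path.join(objdir, debuglink_name),            # next to the object
        os.path.join(objdir, ".debug", debuglink_name),  # .debug subdirectory
    ]
    for debug_dir in debug_dirs:
        # Global debug directory + the object's leading directories.
        candidates.append(
            os.path.join(debug_dir, objdir.lstrip("/"), debuglink_name)
        )
    return candidates


if __name__ == "__main__":
    # Path as seen by gdb (a symlink) versus where the file really lives.
    for obj in ("/usr/lib/xorg/modules/extensions/libglx.so",
                "/usr/lib/opengl/xorg-x11/extensions/libglx.so"):
        print(obj)
        for path in debuglink_candidates(obj, "libglx.so.debug", ("/usr/src/debug",)):
            print("   ", path)
```

Only the second object path produces a candidate under opengl/xorg-x11/extensions, which is where the split debug file was installed.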
By using "add-symbol-file" instead of "symbol-file" and specifying the module load address as taken from the "info shared" output, one can cope with such situations.

(gdb) info shared
0x4eb8f500 0x4ebd5520 Yes (*) /usr/lib/xorg/modules/extensions/libglx.so
[...]
(gdb) add-symbol-file /usr/src/debug/usr/lib/opengl/xorg-x11/extensions/libglx.so.debug 0x4eb8f500

Looking at the (horrible) gdb code in symfile.c, it seems as if the command "symbol-file" simply overwrites all symbolic information that has been loaded before:

/* Currently we keep symbols from the add-symbol-file command. If the user wants to get rid of them, they should do "symbol-file" without arguments first. Not sure this is the best behavior (PR 2207). */
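When several libraries are marked (*), the same workaround can be generated from saved "info shared" output. This is a rough sketch under a few assumptions: the four-column layout shown in this thread, a debug tree that mirrors the resolved (non-symlink) library path, and a ".debug" suffix on the debug file. The function name and the resolve parameter are hypothetical.

```python
import os
import re

# From-address, To-address, "Yes"/"No", optional "(*)" marker, library path.
_ROW = re.compile(
    r"^(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(Yes|No)\s+(\(\*\)\s+)?(\S+)$"
)


def add_symbol_file_commands(info_shared_text, debug_root, resolve=os.path.realpath):
    """Build add-symbol-file commands for libraries marked with (*)."""
    commands = []
    for line in info_shared_text.splitlines():
        match = _ROW.match(line.strip())
        if not match or not match.group(4):
            continue  # not a library row, or debug info was found
        start_addr = match.group(1)
        real_path = resolve(match.group(5))  # follow eselect-style symlinks
        debug_file = debug_root.rstrip("/") + real_path + ".debug"
        commands.append("add-symbol-file %s %s" % (debug_file, start_addr))
    return commands


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python gen_cmds.py /usr/src/debug < info-shared.txt
    for cmd in add_symbol_file_commands(sys.stdin.read(), sys.argv[1]):
        print(cmd)
```

The printed lines can be pasted into the gdb session; whether each generated debug path actually exists still needs checking by hand.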
4 months with no info; closing. Feel free to reopen if you are continuing to have issues.