(Blocks bug 264558, per request by SpanKY: http://bugs.gentoo.org/show_bug.cgi?id=264558#c20 )

On AMD64 (and only on AMD64) I get a long list of errors with *any* executable in valgrind. For example "valgrind ls" or "valgrind ldd" (ldd is a static executable). Or anything else. This is with glibc-2.10.1.

Valgrind can't cope with the new, sse-optimized strlen function of glibc 2.10.1 on amd64. Fedora is applying a patch for this:

valgrind-3.4.1-x86_64-ldso-strlen.patch

as well as another, glibc 2.10.1 specific one:

valgrind-3.4.1-glibc-2.10.1.patch

Those patches can be found at:

http://cvs.fedoraproject.org/viewvc/rpms/valgrind/F-11

However, the strlen patch (the one we need) does not work with Gentoo unless glibc is emerged with debug symbols enabled:

valgrind: Fatal error at startup: a function redirection
valgrind: which is mandatory for this platform-tool combination
valgrind: cannot be set up. Details of the redirection are:
valgrind:
valgrind: A must-be-redirected function
valgrind: whose name matches the pattern: strlen
valgrind: in an object with soname matching: ld-linux-x86-64.so.2
valgrind: was not found whilst processing
valgrind: symbols from the object with soname: ld-linux-x86-64.so.2
valgrind:
valgrind: Possible fix: install glibc's debuginfo package on this machine.
valgrind:
valgrind: Cannot continue -- exiting now. Sorry.

I'm attaching the errors and my emerge --info.
Created attachment 195254 [details] valgrind ldd
Created attachment 195256 [details] emerge --info
An SVN version of valgrind (revision 10418) is working fine with glibc 2.10.1 on AMD64.
(In reply to comment #3)
> An SVN version of valgrind (revision 10418) is working fine with glibc 2.10.1 on
> AMD64.

Unfortunately, SVN revision 10480 doesn't work with the same glibc on AMD64. Running valgrind ls shows tons of errors.
Edit: SVN 10418 or 10419 doesn't even compile anymore due to an undefined LibVEX_version. Also, I could have read it wrong, but it seems that the Valgrind developers gave up on this or a similar issue on the ppc platform: https://bugs.kde.org/show_bug.cgi?id=182474 (bottom comment)
I added valgrind-3.4.1-r1 that includes both patches as well as the same warning message for amd64 that was already present for ppc before. People will just have to emerge libc with splitdebug to not have any problems with valgrind.
(In reply to comment #6)
> I added valgrind-3.4.1-r1 that includes both patches as well as the same
> warning message for amd64 that was already present for ppc before.
>
> People will just have to emerge libc with splitdebug to not have any problems
> with valgrind.

I just emerged glibc with splitdebug enabled (it produced *.debug files in /usr/lib64/debug) but it still doesn't work. I always get:

valgrind: Fatal error at startup: a function redirection
valgrind: which is mandatory for this platform-tool combination
valgrind: cannot be set up. Details of the redirection are:
valgrind:
valgrind: A must-be-redirected function
valgrind: whose name matches the pattern: strlen
valgrind: in an object with soname matching: ld-linux-x86-64.so.2
valgrind: was not found whilst processing
valgrind: symbols from the object with soname: ld-linux-x86-64.so.2
valgrind:
valgrind: Possible fix: add splitdebug to FEATURES in make.conf and remerge glibc.
valgrind:
valgrind: Cannot continue -- exiting now. Sorry.

So I'm reopening.
It's looking at the right place, the symbol is there, it's not finding it. wolf@wolfpc ~ $ emerge -pv glibc valgrind These are the packages that would be merged, in order: Calculating dependencies... done! [ebuild R ] sys-libs/glibc-2.10.1 USE="(multilib) nls -debug -gd -glibc-omitfp (-hardened) -profile (-selinux) -vanilla" 0 kB [ebuild R ] dev-util/valgrind-3.4.1-r1 USE="-mpi" 0 kB Total: 2 packages (2 reinstalls), Size of downloads: 0 kB wolf@wolfpc ~ $ valgrind -v true ==22091== Memcheck, a memory error detector. ==22091== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al. ==22091== Using LibVEX rev 1884, a library for dynamic binary translation. ==22091== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP. ==22091== Using valgrind-3.4.1, a dynamic binary instrumentation framework. ==22091== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al. ==22091== --22091-- Command line --22091-- true --22091-- Startup, with flags: --22091-- -v --22091-- Contents of /proc/version: --22091-- Linux version 2.6.31-rc5 (wolf@wolfpc) (gcc version 4.4.1 (Gentoo 4.4.1 p1.0) ) #4 SMP Wed Aug 5 08:06:29 IDT 2009 --22091-- Arch and hwcaps: AMD64, amd64-sse2 --22091-- Page sizes: currently 4096, max supported 4096 --22091-- Valgrind library directory: /usr/lib64/valgrind --22091-- Reading syms from /bin/true (0x400000) --22091-- object doesn't have a symbol table --22091-- Reading syms from /usr/lib64/valgrind/amd64-linux/memcheck (0x38000000) --22091-- Reading debug info from /usr/lib/debug/usr/lib64/valgrind/amd64-linux/memcheck.debug .. --22091-- object doesn't have a dynamic symbol table --22091-- Reading syms from /lib64/ld-2.10.1.so (0x3637200000) --22091-- Reading debug info from /usr/lib/debug/lib64/ld-2.10.1.so.debug .. valgrind: Fatal error at startup: a function redirection valgrind: which is mandatory for this platform-tool combination valgrind: cannot be set up. 
Details of the redirection are: valgrind: valgrind: A must-be-redirected function valgrind: whose name matches the pattern: strlen valgrind: in an object with soname matching: ld-linux-x86-64.so.2 valgrind: was not found whilst processing valgrind: symbols from the object with soname: ld-linux-x86-64.so.2 valgrind: valgrind: Possible fix: add splitdebug to FEATURES in make.conf and remerge glibc. valgrind: valgrind: Cannot continue -- exiting now. Sorry. wolf@wolfpc ~ $ nm /usr/lib/debug/lib64/ld-2.10.1.so.debug | grep ' strlen' 0000000000015800 t strlen wolf@wolfpc ~ $ emerge --info Portage 2.1.6.13 (default/linux/amd64/2008.0/desktop, gcc-4.4.1, glibc-2.10.1-r0, 2.6.31-rc5 x86_64) ================================================================= System uname: Linux-2.6.31-rc5-x86_64-Intel-R-_Core-TM-2_Duo_CPU_E6550_@_2.33GHz-with-gentoo-2.0.1 Timestamp of tree: Fri, 07 Aug 2009 05:15:02 +0000 app-shells/bash: 4.0_p28 dev-java/java-config: 2.1.8-r1 dev-lang/python: 2.6.2-r1, 3.1 dev-util/cmake: 2.6.4-r2 sys-apps/baselayout: 2.0.1 sys-apps/openrc: 0.4.3-r3 sys-apps/sandbox: 2.0 sys-devel/autoconf: 2.13, 2.63-r1 sys-devel/automake: 1.7.9-r1, 1.9.6-r2, 1.10.2, 1.11 sys-devel/binutils: 2.19.1-r1 sys-devel/gcc-config: 1.4.1 sys-devel/libtool: 2.2.6a virtual/os-headers: 2.6.30-r1 ACCEPT_KEYWORDS="amd64 ~amd64" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-g -O2 -pipe -march=core2 -momit-leaf-frame-pointer" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/config" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/udev/rules.d" CXXFLAGS="-g -O2 -pipe -march=core2 -momit-leaf-frame-pointer" DISTDIR="/usr/portage/distfiles" FEATURES="distlocks fixpackages installsources parallel-fetch protect-owned sandbox sfperms splitdebug strict unmerge-orphans userfetch" 
GENTOO_MIRRORS="http://mirror.hamakor.org.il/pub/mirrors/gentoo/ " LANG="en_US.utf8" LDFLAGS="-Wl,-O1 -Wl,--hash-style=gnu -Wl,--as-needed" LINGUAS="en en_US" MAKEOPTS="-j3" PKGDIR="/usr/portage/packages" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="/usr/local/portage/layman/java-overlay /usr/local/portage/layman/gcc-porting /usr/local/portage/layman/mozilla /usr/local/portage" SYNC="rsync://mirror.hamakor.org.il/gentoo-portage" USE="X acl acpi alsa amd64 bash-completion bidi bluetooth branding bzip2 cairo cdr cli cracklib crypt dbus dri dvd dvdr eds emboss encode evo fam firefox gif gpm gstreamer gtk hal iconv ipv6 isdnlog jpeg kde libnotify mad mikmod mmx mp3 mpeg mudflap multilib ncurses nls nptl nptlonly ogg opengl openmp pch pcre pdf perl png ppds pppd python qt3 qt3support qt4 quicktime readline reflection sdl session spell spl sqlite3 sse sse2 ssl startup-notification svg sysfs tcpd tiff truetype unicode usb vorbis xcb xcomposite xml xorg xulrunner xv zlib" ALSA_CARDS="hda-intel" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CAMERAS="stv0680" ELIBC="glibc" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" 
LINGUAS="en en_US" SANE_BACKENDS="niash stv680" USERLAND="GNU" VIDEO_CARDS="nv nvidia" Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Upstream bug: http://bugs.kde.org/show_bug.cgi?id=190429

Upstream commit mentioned in http://bugs.kde.org/show_bug.cgi?id=190429#c11:
http://sourceforge.net/mailarchive/forum.php?thread_name=20090802122132.B1F9E10887D@jail0086.vps.exonetric.net&forum_name=valgrind-developers

The upstream SVN refuses to talk to me, and the commit in the mailing list doesn't apply cleanly. I've applied it manually and it doesn't work. Maybe a previous change (like the one that adds "DebugInfo_get_soname") is required. Once I manage to check out the code from the upstream repository, I'll test that and report here.
valgrind built manually (without portage) from svn checkout (r10754) works. $ /usr/local/bin/valgrind -v true ==6515== Memcheck, a memory error detector. ==6515== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al. ==6515== Using Valgrind-3.5.0.SVN and LibVEX; rerun with -h for copyright info ==6515== Command: true ==6515== --6515-- Valgrind flags: --6515-- -v --6515-- Contents of /proc/version: --6515-- Linux version 2.6.31-rc5 (wolf@wolfpc) (gcc version 4.4.1 (Gentoo 4.4.1 p1.0) ) #4 SMP Wed Aug 5 08:06:29 IDT 2009 --6515-- Arch and hwcaps: AMD64, amd64-sse3-cx16 --6515-- Page sizes: currently 4096, max supported 4096 --6515-- Valgrind library directory: /usr/local/lib/valgrind --6515-- Reading syms from /bin/true (0x400000) --6515-- object doesn't have a symbol table --6515-- Reading syms from /usr/local/lib64/valgrind/memcheck-amd64-linux (0x38000000) --6515-- object doesn't have a dynamic symbol table --6515-- Reading syms from /lib64/ld-2.10.1.so (0x3662200000) --6515-- Reading debug info from /usr/lib/debug/lib64/ld-2.10.1.so.debug .. --6515-- Reading suppressions file: /usr/local/lib/valgrind/default.supp --6515-- REDIR: 0x3662215800 (strlen) redirected to 0x3803f647 (vgPlain_amd64_linux_REDIR_FOR_strlen) --6515-- Reading syms from /usr/local/lib64/valgrind/vgpreload_core-amd64-linux.so (0x4802000) --6515-- Reading syms from /usr/local/lib64/valgrind/vgpreload_memcheck-amd64-linux.so (0x4a04000) ==6515== WARNING: new redirection conflicts with existing -- ignoring it --6515-- new: 0x3662215800 (strlen ) R-> 0x04a085d0 strlen --6515-- REDIR: 0x3662215750 (index) redirected to 0x4a08320 (index) --6515-- REDIR: 0x36622157d0 (strcmp) redirected to 0x4a088d0 (strcmp) --6515-- Reading syms from /lib64/libc-2.10.1.so (0x3662800000) --6515-- Reading debug info from /usr/lib/debug/lib64/libc-2.10.1.so.debug .. 
--6515-- REDIR: 0x366287a840 (rindex) redirected to 0x4a08180 (rindex) --6515-- REDIR: 0x3662879f30 (strcmp) redirected to 0x4a08880 (strcmp) --6515-- REDIR: 0x366287a4c0 (strlen) redirected to 0x4a08590 (strlen) --6515-- REDIR: 0x366287a690 (strncmp) redirected to 0x4a08810 (strncmp) --6515-- REDIR: 0x3662879eb0 (index) redirected to 0x4a08220 (index) --6515-- REDIR: 0x366287fb20 (strchrnul) redirected to 0x4a09520 (strchrnul) --6515-- REDIR: 0x3662876810 (malloc) redirected to 0x4a07491 (malloc) --6515-- REDIR: 0x3662876730 (free) redirected to 0x4a070a1 (free) --6515-- REDIR: 0x366287d3a0 (memcpy) redirected to 0x4a089a0 (memcpy) --6515-- REDIR: 0x366287d050 (stpcpy) redirected to 0x4a091d0 (stpcpy) ==6515== ==6515== HEAP SUMMARY: ==6515== in use at exit: 0 bytes in 0 blocks. ==6515== total heap usage: 29 allocs, 29 frees, 3,897 bytes allocated. ==6515== ==6515== All heap blocks were freed -- no leaks are possible. ==6515== ==6515== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 1) --6515-- --6515-- used_suppression: 4 dl-hack3-cond-1 ==6515== ==6515== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 1)
Then I suspect there must be something else that solves the problem for you. As far as I can tell, the only significant difference between that patch and the one in portage is that it makes the redir non-mandatory for glibc < 2.10.

Could you try replacing our ldso-strlen patch with the one from upstream? You'll have to get rid of the last hunk in the patch (taken from svn diff -r10688:10689), because it's irrelevant and fails to apply.

The only noticeable difference between your system and mine seems to be gcc-4.4. If you still have 4.3 installed it may be worth remerging glibc with that to see what happens.
I said I did apply the upstream patch instead; it didn't work. I looked at the commit log (though it's big, six months of development) and applied r8959 and r9329 (in modified form to fit without the intermediate steps) because they seemed likely to affect a function override in a prelinked shared object based on split debug info in dwarf format, but still no go.

I don't know how recompiling glibc with gcc-4.3 will help. After all, the upstream trunk version of valgrind works with my glibc. I just don't know which combination of commits in those six months does the trick.

I think I'll use the valgrind I built without portage and wait for 3.5 to be out. Or, if someone makes a live ebuild, I'll use that. I have the upstream repo in git (with full history) and can probably send it to you if upstream svn gives you trouble.
Created attachment 200938 [details] valgrind-9999.ebuild Current SVN works here OK too. Here's a live ebuild which can be used as a temporary work-around until 3.5 is out.
(In reply to comment #13)
> Created an attachment (id=200938) [edit]
> valgrind-9999.ebuild
>
> Current SVN works here OK too. Here's a live ebuild which can be used as a
> temporary work-around until 3.5 is out.
>

Did not fix the issue for me

valgrind: Fatal error at startup: a function redirection
valgrind: which is mandatory for this platform-tool combination
valgrind: cannot be set up. Details of the redirection are:
valgrind:
valgrind: A must-be-redirected function
valgrind: whose name matches the pattern: strlen
valgrind: in an object with soname matching: ld-linux-x86-64.so.2
valgrind: was not found whilst processing
valgrind: symbols from the object with soname: ld-linux-x86-64.so.2
valgrind:
valgrind: Possible fixes: (1, short term): install glibc's debuginfo
valgrind: package on this machine. (2, longer term): ask the packagers
valgrind: for your Linux distribution to please in future ship a non-
valgrind: stripped ld.so (or whatever the dynamic linker .so is called)
valgrind: that exports the above-named function using the standard
valgrind: calling conventions for this platform.
valgrind:
valgrind: Cannot continue -- exiting now. Sorry.
(In reply to comment #14)
> (In reply to comment #13)
> > Created an attachment (id=200938) [edit]
> > valgrind-9999.ebuild
> >
> > Current SVN works here OK too. Here's a live ebuild which can be used as a
> > temporary work-around until 3.5 is out.
> >
>
> Did not fix the issue for me
>
> valgrind: Fatal error at startup: a function redirection
> valgrind: which is mandatory for this platform-tool combination
> valgrind: cannot be set up. Details of the redirection are:
> valgrind:
> valgrind: A must-be-redirected function
> valgrind: whose name matches the pattern: strlen
> valgrind: in an object with soname matching: ld-linux-x86-64.so.2
> valgrind: was not found whilst processing
> valgrind: symbols from the object with soname: ld-linux-x86-64.so.2
> valgrind:
> valgrind: Possible fixes: (1, short term): install glibc's debuginfo
> valgrind: package on this machine. (2, longer term): ask the packagers
> valgrind: for your Linux distribution to please in future ship a non-
> valgrind: stripped ld.so (or whatever the dynamic linker .so is called)
> valgrind: that exports the above-named function using the standard
> valgrind: calling conventions for this platform.
> valgrind:
> valgrind: Cannot continue -- exiting now. Sorry.
>

Works fine for me. You just have to install glibc with FEATURES=splitdebug.
Valgrind 3.5.0 has just been released.
I emerged glibc with splitdebug on my laptop (core2duo) and desktop machine (phenom). It works fine on my laptop, but the issue is still present on my desktop. Both are ~amd64 with gcc-4.4.1 and (apart from -march) the same CFLAGS.

valgrind-9999 does not help.
Bernd, doesn't 3.5.0 work for you?
(In reply to comment #18)
> Bernd, doesn't 3.5.0 work for you?
>

It does not work for me either. I compiled glibc with splitdebug and installed valgrind 3.5... without success. Any advice?

Vinenzo
Created attachment 209182 [details, diff] Allows valgrind to search debuginfo files into /usr/lib/debug/libXX I think a runtime check for determining the arch is fine, but maybe a compile time one could be desired. For users who don't want to fix valgrind a quick and dirty hack is to symlink /usr/lib/debug/libXX to /usr/lib/debug/lib/. This fixes the problem for me.
I ask everybody, if running 64bit multilib systems (and report if not but you are running into this issue anyway), to post the following command output: $ file /usr/lib* Thanks
(In reply to comment #21)

# file /usr/lib*
/usr/lib: symbolic link to `lib64'
/usr/lib32: directory
/usr/lib64: directory
/usr/libexec: directory

( Phenom X4 )
(In reply to comment #22) > (In reply to comment #21) > > # file /usr/lib* > /usr/lib: symbolic link to `lib64' > /usr/lib32: directory > /usr/lib64: directory > /usr/libexec: directory > > ( Phenom X4 ) > I've got the same output, and an Athlon II X4
I'm having the same problem - very frustrating, since I'm trying to use valgrind and work to help check the quality of new code.

As per comment 21:

philip@gridbug(0)debug $ file /usr/lib*
/usr/lib: symbolic link to `lib64'
/usr/lib32: directory
/usr/lib64: directory
/usr/libexec: directory

However, I don't think any amount of symlinking will get around the issue, the strlen symbol just doesn't seem to be there in the debuginfo for the 64-bit ld.so:

philip@gridbug(128)debug $ find -type f -exec strings -f {} \; | grep strlen
./lib32/ld-2.10.1.so.debug: max_capstrlen
./lib32/ld-2.10.1.so.debug: strlen
strings: ./usr/lib32/misc/glibc/pt_chown.debug: Permission denied
strings: ./usr/lib64/misc/glibc/pt_chown.debug: Permission denied
./lib64/ld-2.10.1.so.debug: max_capstrlen
A shot in the dark, but there were issues with debugedit recently. Try to re-emerge dev-libs/beecrypt, dev-util/debugedit, glibc.

I'm now using sys-libs/glibc-2.11 and it works great with dev-util/valgrind-3.5.0.

$ nm /usr/lib/debug/lib64/ld-2.11.so.debug | grep ' strlen'
0000000000016530 t strlen
I don't have debugedit installed. What do you have installed that depends on it?
FEATURES="installsources"
That's a different matter. I don't want/need the source of glibc installed - I don't want to debug glibc, just programs compiled against it. I haven't needed either "installsources" or "splitdebug" with previous versions of valgrind/glibc.
(In reply to comment #28) > I haven't needed either "installsources" or "splitdebug" with previous > versions of valgrind/glibc. You need at least splitdebug now.
To summarize: - you need to have emerged glibc with splitdebug in FEATURES in make.conf - /usr/lib should be a symlink to /usr/lib64 and not the other way around I can't figure out what is going wrong without being able to reproduce it. If you're still seeing this problem despite the above conditions, and if you're able to do some debugging yourself, please share any information you find that can help determine the root cause. Alternatively you can also contact the upstream developers either through their bugzilla (https://bugs.kde.org/enter_valgrind_bug.cgi) or through the mailing lists (http://valgrind.org/support/mailing_lists.html). RESOLVED/FIXED+INVALID+WORKSFORME+NEEDINFO+UPSTREAM
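The second condition in the summary above can be checked mechanically. The following is only a sketch: the function name check_lib_symlink and its optional root-directory argument are made up for illustration, and it tests nothing beyond whether <root>/usr/lib is a symlink pointing at lib64. It does not verify that glibc was merged with splitdebug.

```shell
#!/bin/sh
# Hypothetical helper (not part of portage or valgrind): succeed if
# <root>/usr/lib is a symlink whose target is lib64, as the summary asks.
check_lib_symlink() {
    root="${1:-}"
    link="$root/usr/lib"
    # Must be a symlink at all, not a real directory.
    [ -L "$link" ] || return 1
    target=$(readlink "$link")
    case "$target" in
        lib64|lib64/|/usr/lib64|/usr/lib64/) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: report the state of the running system.
if check_lib_symlink ""; then
    echo "/usr/lib -> lib64: OK"
else
    echo "/usr/lib is not a symlink to lib64"
fi
```

Passing a directory as the first argument checks a chroot or stage tree instead of the running system.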
In my case, /usr/lib *is* a symlink to /usr/lib64, and I *did* emerge glibc with "splitdebug". I don't want or need to use "installsources" and emerge debugedit, because it isn't glibc itself that I'm trying to debug. The part I don't understand is why there is *no strlen symbol anywhere in the 64-bit glibc's debug info* (especially when it's still there in the 32-bit version). No version of Valgrind is going to work correctly with a version of glibc where that is the case! FWIW, I upgraded to glibc 2.11, which seems to have a strlen symbol in the debug info again. I would love to help further, but I don't have time to downgrade again and continue playing - as I said before, I'm using Valgrind for work purposes. Perhaps strlen is forcibly inlined with 2.10 in certain conditions?
Apparently, the glibc build shouldn't be split, and the glibc ebuild specifically prevents that with RESTRICT="strip" as per http://bugs.gentoo.org/46186, so splitdebug is not a solution here.
(In reply to comment #32) > http://bugs.gentoo.org/46186 The problem mentioned there has nothing to do with this bug report.
Has anyone seen this rear up again? I am running a (mostly) amd64 machine, with ~amd64 valgrind (3.5.0). This was working when I was debugging a lot in the winter. I just tried to do 'valgrind ls', and get the same output as comment #7. I have recompiled both glibc and valgrind to no avail. I have yet to try "splitdebug" et al. but it seems that this is not necessarily the solution?
(In reply to comment #34) > I have yet to try "splitdebug" et al. but it seems that this is not necessarily > the solution? > splitdebug is still necessary in order to work properly
Another solution:

mkdir -p /etc/portage/env/sys-libs
echo FEATURES=\"nostrip\" > /etc/portage/env/sys-libs/glibc
emerge -1v sys-libs/glibc
So just to clarify this: setting "splitdebug" in FEATURES should solve this? Because in my case it doesn't. I have set this and the issue remains.

Is there anything else one could do?

Cheers,
Stephan
RE: comment #37, did you recompile glibc after setting FEATURES="splitdebug"? This feature produces separate ELF files (shared objects) with the symbol table for the binaries, while they are compiled / installed.

glibc must be built with debug symbols (-g in CFLAGS) and valgrind must be able to find those symbols, either in the shared objects themselves or in a separate file pointed to by .gnu_debuglink.

What is your libc? Something like:

$ which true
/bin/true
$ ldd /bin/true
linux-vdso.so.1 => (0x00007fff423ff000)
libc.so.6 => /lib/libc.so.6 (0x0000003d9ee00000)
/lib64/ld-linux-x86-64.so.2 (0x0000003d9ea00000)

So libc is /lib/libc.so.6

$ ll /lib/libc.so.6
lrwxrwxrwx 1 root root 14 Jun 11 13:47 /lib/libc.so.6 -> libc-2.11.2.so

Real file.

$ objdump -t /lib/libc-2.11.2.so

/lib/libc-2.11.2.so: file format elf64-x86-64

SYMBOL TABLE:
no symbols

So if we look at the binary, there is no way to tell where some function is. All we have is a big .text section of code. We need the offset of malloc! But, maybe a separate file has the symbols?

$ objdump -s -j .gnu_debuglink /lib/libc-2.11.2.so

/lib/libc-2.11.2.so: file format elf64-x86-64

Contents of section .gnu_debuglink:
0000 6c696263 2d322e31 312e322e 736f2e64 libc-2.11.2.so.d
0010 65627567 00000000 65c52829 ebug....e.()

Aha! A file libc-2.11.2.so.debug with a crc32 of 0x65c52829 has the symbols. Where is it? Under /usr/lib/debug/$(lib_dir)/debug_file_name. So:

$ objdump -t /usr/lib/debug/lib64/libc-2.11.2.so.debug | egrep ' malloc$'
0000000000076d10 g F .text 0000000000000257 malloc

Now we know the offset of malloc, so we can see calls to malloc and patch them.
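The lookup rule described above (/usr/lib/debug/$(lib_dir)/debug_file_name) can be written down as a small shell sketch. The function name debug_candidate is hypothetical, and the sketch assumes an absolute path and the "<file name>.debug" naming seen in the logs in this bug. It neither reads the .gnu_debuglink section nor resolves symlinks such as /lib -> lib64, so the result is only a candidate path.

```shell
#!/bin/sh
# Hypothetical helper: print where a split debug file for the given
# object would be expected, assuming the layout shown in this bug
# (/usr/lib/debug/<original directory>/<file name>.debug).
# Expects an absolute path; symlinked directories are not resolved.
debug_candidate() {
    obj="$1"
    dir=$(dirname "$obj")
    base=$(basename "$obj")
    printf '%s\n' "/usr/lib/debug${dir}/${base}.debug"
}

# Example with the dynamic linker from the logs in this bug.
debug_candidate /lib64/ld-2.10.1.so
```

If the printed file does not exist, or exists but lacks the symbol valgrind wants, that would match the failure reported here.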
OK, I have tried all sorts of combinations now I can think of. To no avail. I have "splitdebug" with or without "nostrip" and several combinations of -g and -ggdb in my CFLAGS. Here's what I think would be relevant: It is an ~amd64 box with all current ebuilds. No overlays or any special customization. sm@nt468 ~ $ ldd /bin/ls linux-vdso.so.1 => (0x00007fffb0bff000) librt.so.1 => /lib/librt.so.1 (0x00007f5f7d4ac000) libacl.so.1 => /lib/libacl.so.1 (0x00007f5f7d2a5000) libc.so.6 => /lib/libc.so.6 (0x00007f5f7cf58000) libpthread.so.0 => /lib/libpthread.so.0 (0x00007f5f7cd3d000) /lib64/ld-linux-x86-64.so.2 (0x00007f5f7d6b4000) libattr.so.1 => /lib/libattr.so.1 (0x00007f5f7cb38000) sm@nt468 ~ $ ll /lib/libc.so.6 lrwxrwxrwx 1 root root 14 15. Jun 16:56 /lib/libc.so.6 -> libc-2.11.2.so sm@nt468 ~ $ objdump -t /lib/libc-2.11.2.so |grep strlen 0000000000078480 l F .text 0000000000000045 __GI_strlen 0000000000078480 g F .text 0000000000000045 strlen (the command issues lots and lots of .text symbols) sm@nt468 ~ $ objdump -s -j .gnu_debuglink /lib/libc-2.11.2.so /lib/libc-2.11.2.so: file format elf64-x86-64 (no .gnu_debuglink section, that's it.) So I have to assume that despite all my efforts to change that and all my rebuilds, glibc has no debug symbols at all, right? Or if it does, it's split away and valgrind can't find it. 
here's my make.conf: CFLAGS="-O3 -pipe -g -march=native -mtune=native" CXXFLAGS="-O3 -pipe -g -march=native -mtune=native" CHOST="x86_64-pc-linux-gnu" USE="3dnow 3dnowext 64bit admin administrator adns ads aio akode akonadi apache2 aqua_theme async banshee bash-completion bind blender blender-game boost c++ ccache cdr cgi cover css curl curlwrappers cxx dhcp dhcpcd drm-next dvd examples exceptions extensions extra extra-algorithms extraengine extrafilters extras ffmpeg firefox3 flash fortune fts3 gd gimp git github glib gnutls google google-gadgets gpg graphviz grub gstreamer hddtemp hdri html http i18n icu id3 id3tag java java6 jpeg2k kate kcal kdcraw kde kdevplatform kdm kdrive kerberos kexi keyboard kpoll lame ldap-sasl libsigsegv libssh2 linux-smp-stats lm_sensors logrotate mjpeg mmxext mng mono mozbranding mozdevelop mozembed mp3tunes mpe mpi mpi-threads mplayer musicbrainz mysql net netpbm network nsplugin nss numeric nvidia nvram offensive ogg123 ogm okular openexr openssl optimization optimized-qmake phonon php plasma player plugins posix postgres postproc postscript povray ps pth python-bindings python3 qt-dbus qt-webkit qt4 qtmultimedia raytracerx real regex rtc rtsp samba sasl scp secure-delete semantic semantic-desktop sensord server sftp sip smbclient smbkrb5passwd smbsharemodes sql sqlite ssh stream subversion swat swig symlink syslog taglib tcpdump tex tex4ht themes threads tools transcode utils uuid valgrind vdpau video_cards_nvidia vim vim-pager vim-syntax vlc wav webkit wireshark wma wmf xorgmodule youtube zeroconf zip -dso -hal -ipv6" ACCEPT_KEYWORDS="~amd64" ACCEPT_LICENSE="dlj-1.1" PORTAGE_NICENESS=19 MAKEOPTS="-j2" FEATURES="splitdebug nostrip" LINGUAS="de en" VIDEO_CARDS="nvidia" INPUT_DEVICES="evdev keyboard mouse" there's additionally the use flag "debug" set for glibc. Is that helpful at all? I really need valgrind and must have this issue fixed in order to use it. Cheers, Stephan
PS: Could it be a ccache issue? I use ccache...
Your glibc was compiled with debug symbols and with "nostrip", so valgrind should be happy, but, I've just noticed I'm using valgrind-9999 (from svn), so apparently 3.5.0 didn't work for me. I suggest trying that and hope what's in svn trunk compiles and works for you. My svn checkout is: $ /usr/portage/distfiles/svn-src/valgrind/trunk $ svn info Path: . URL: svn://svn.valgrind.org/valgrind/trunk Repository Root: svn://svn.valgrind.org/valgrind Repository UUID: a5019735-40e9-0310-863c-91ae7b9d1cf9 Revision: 11100 Node Kind: directory Schedule: normal Last Changed Author: bart Last Changed Rev: 11100 Last Changed Date: 2010-04-02 13:27:35 +0300 (Fri, 02 Apr 2010)
Blast! I'm sitting behind a HTTP only firewall here and can't access svn repos. So I cannot try. Is there any place where I can get a snapshot with the patch via HTTP?
(In reply to comment #24) That's interesting. As expected: $ nm /usr/lib/debug/lib32/ld-2.11.2.so.debug | grep ' strlen' 00017fc0 t strlen but strlen is missing in lib64: $ nm /usr/lib/debug/lib64/ld-2.11.2.so.debug | grep ' strlen'
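The same comparison can be scripted so that names like max_capstrlen do not produce false positives. This is a sketch only: has_strlen_symbol is a made-up name, and the two file paths are the ones from this comment, which will differ for other glibc versions.

```shell
#!/bin/sh
# Hypothetical helper: read `nm` output on stdin and succeed only if a
# symbol named exactly "strlen" is listed (max_capstrlen does not count).
has_strlen_symbol() {
    awk '$NF == "strlen" { found = 1 } END { exit(found ? 0 : 1) }'
}

# Example with the file names from this comment; adjust them to the
# installed glibc version.
for f in /usr/lib/debug/lib32/ld-2.11.2.so.debug \
         /usr/lib/debug/lib64/ld-2.11.2.so.debug; do
    if [ -r "$f" ] && nm "$f" | has_strlen_symbol; then
        echo "$f: strlen present"
    else
        echo "$f: strlen missing or file unreadable"
    fi
done
```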
Request for re-opening this bug - it's not fixed in SVN (3.6.0) (see attachment)
/usr/local/bin/valgrind -v true ==14250== Memcheck, a memory error detector ==14250== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al. ==14250== Using Valgrind-3.6.0.SVN and LibVEX; rerun with -h for copyright info ==14250== Command: true ==14250== --14250-- Valgrind options: --14250-- -v --14250-- Contents of /proc/version: --14250-- Linux version 2.6.31-gentoo-r6 (root@quadux) (gcc version 4.3.4 (Gentoo 4.3.4 p1.0, pie-10.1.5) ) #8 SMP Fri Dec 18 14:00:52 CET 2009 --14250-- Arch and hwcaps: AMD64, amd64-sse3-cx16 --14250-- Page sizes: currently 4096, max supported 4096 --14250-- Valgrind library directory: /usr/local/lib/valgrind --14250-- Reading syms from /bin/true (0x400000) --14250-- Considering /usr/lib/debug/bin/true.debug .. --14250-- .. CRC is valid --14250-- Reading syms from /lib64/ld-2.11.2.so (0x4000000) --14250-- Considering /usr/lib/debug/lib64/ld-2.11.2.so.debug .. --14250-- .. CRC is valid valgrind: Fatal error at startup: a function redirection valgrind: which is mandatory for this platform-tool combination valgrind: cannot be set up. Details of the redirection are: valgrind: valgrind: A must-be-redirected function valgrind: whose name matches the pattern: strlen valgrind: in an object with soname matching: ld-linux-x86-64.so.2 valgrind: was not found whilst processing valgrind: symbols from the object with soname: ld-linux-x86-64.so.2 valgrind: valgrind: Possible fixes: (1, short term): install glibc's debuginfo valgrind: package on this machine. (2, longer term): ask the packagers valgrind: for your Linux distribution to please in future ship a non- valgrind: stripped ld.so (or whatever the dynamic linker .so is called) valgrind: that exports the above-named function using the standard valgrind: calling conventions for this platform. valgrind: valgrind: Cannot continue -- exiting now. Sorry.
Well, is there any news on this one? I'd love to have valgrind working again after more than a year now...

Stephan
There's still activity on this bug and it may not be resolved. I'm going to reopen it.
I emerged valgrind 3.6.0 and it didn't work. strace revealed the solution:

cd /usr/lib64/debug
ln -s lib64 lib

Now it works for me.
I remember checking this a couple of days ago with valgrind 3.5 and it wasn't working. Emerged 3.6 yesterday and it's finally working without thousands of errors.
Sadly I cannot confirm this works now. valgrind 3.6 just came in from upstream and the bug still occurs.

I also tried the link trick Joerg mentioned. It's probably wrong though, since the link points to nothing here.

I have "nostrip" and not "splitdebug".

Is there anything else one can try?
(In reply to comment #50)
> sadly I can not confirm this works now. valgrind 3.6 just came in on upstream
> and the bug still occurs.
>
> I also tried the link trick Joerg mentioned. It's probably wrong though since
> the link points to nothing here.
>
> I have "nostrip" and not "splitdebug".
>
> Is there anything else one can try?
>

Can you post your emerge --info? I know there's an emerge --info above, but it's on an old 2008.0 profile. I'm going to build a VM which is as close as possible, to reproduce this bug and nail it.
Created attachment 254083 [details] emerge --info Sure, no problem. Here it is. Let me know if I can provide anything else. Cheers, Stephan
Created attachment 254133 [details] My emerge info
I was not able to hit the bug, i.e. valgrind worked fine for me on ls, ldd and some other random tests. Comparing my emerge --info with the one in the previous comment, the only significant difference I see is glibc: I have 2.11.2-r3, and the one above has 2.12.1-r3. There are some other differences, but I'm not sure they are significant. Maybe we should compare our make.conf files. I have:
CFLAGS="-O2 -pipe -ggdb"
CXXFLAGS="${CFLAGS}"
CHOST="x86_64-pc-linux-gnu"
USE="mmx sse sse2"
FEATURES="nostrip"
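To make that make.conf comparison less error-prone, something like the following could be used. It is only a sketch: the function name is hypothetical, and the list of variables is simply the five shown in the comment above.

```shell
# Print the build-related settings from a make.conf-style file, sorted,
# so that two machines can be compared line by line.
# $1: path to the file (for example /etc/make.conf)
build_settings() {
    grep -E '^[[:space:]]*(CFLAGS|CXXFLAGS|CHOST|USE|FEATURES)=' "$1" | sort
}
```

Save the output on each machine (build_settings /etc/make.conf > settings.txt) and run diff on the two files.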
I also have no problems with valgrind anymore. I'm on glibc 2.11.2-r3. I never updated to 2.12 simply because portage seems to make it impossible to downgrade again unless I start hacking it. So I tend to wait many months before updating glibc on my system.
Okay, I think I know what's going on here. To confirm my suspicion, I upgraded from glibc 2.11.2-r3 to 2.12.1-r3 by *just* running emerge =sys-libs/glibc-2.12.1-r3 and leaving the rest of my system untouched. Immediately after, I tried valgrind, which had worked under 2.11.2-r3, and it failed (see the failure below). I then ran emerge -e world and it worked. Aside: upgrading glibc should not be taken lightly. Whenever I change any part of my toolchain, I emerge -e world. Can someone repeat the above and confirm?
valgrind portage # valgrind /bin/ls
==30229== Memcheck, a memory error detector
==30229== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==30229== Using Valgrind-3.6.0 and LibVEX; rerun with -h for copyright info
==30229== Command: /bin/ls
==30229==
==30229== Conditional jump or move depends on uninitialised value(s)
==30229==    at 0x40163F6: index (strchr.S:56)
==30229==    by 0x4007264: expand_dynamic_string_token (dl-load.c:320)
==30229==    by 0x400766F: _dl_map_object (dl-load.c:2184)
==30229==    by 0x400198D: map_doit (rtld.c:629)
==30229==    by 0x400D8A5: _dl_catch_error (dl-error.c:178)
==30229==    by 0x40018A6: do_preload (rtld.c:813)
==30229==    by 0x40044A7: dl_main (rtld.c:1691)
==30229==    by 0x4014907: _dl_sysdep_start (dl-sysdep.c:244)
==30229==    by 0x40014D2: _dl_start (rtld.c:334)
==30229==    by 0x4000BA7: ??? (in /lib64/ld-2.12.1.so)
==30229==
==30229== Conditional jump or move depends on uninitialised value(s)
==30229==    at 0x40163FB: index (strchr.S:59)
==30229==    by 0x4007264: expand_dynamic_string_token (dl-load.c:320)
==30229==    by 0x400766F: _dl_map_object (dl-load.c:2184)
==30229==    by 0x400198D: map_doit (rtld.c:629)
==30229==    by 0x400D8A5: _dl_catch_error (dl-error.c:178)
==30229==    by 0x40018A6: do_preload (rtld.c:813)
==30229==    by 0x40044A7: dl_main (rtld.c:1691)
==30229==    by 0x4014907: _dl_sysdep_start (dl-sysdep.c:244)
==30229==    by 0x40014D2: _dl_start (rtld.c:334)
==30229==    by 0x4000BA7: ??? (in /lib64/ld-2.12.1.so)
==30229==
bin package.keywords package.unmask postsync.d savedconfig
==30229==
==30229== HEAP SUMMARY:
==30229==     in use at exit: 20,171 bytes in 11 blocks
==30229==   total heap usage: 16 allocs, 5 frees, 53,057 bytes allocated
==30229==
==30229== LEAK SUMMARY:
==30229==    definitely lost: 0 bytes in 0 blocks
==30229==    indirectly lost: 0 bytes in 0 blocks
==30229==      possibly lost: 0 bytes in 0 blocks
==30229==    still reachable: 20,171 bytes in 11 blocks
==30229==         suppressed: 0 bytes in 0 blocks
==30229== Rerun with --leak-check=full to see details of leaked memory
==30229==
==30229== For counts of detected and suppressed errors, rerun with: -v
==30229== Use --track-origins=yes to see where uninitialised values come from
==30229== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 4 from 4)
Hi, I'd like to reply to some of the last comments. Yes, I do have glibc 2.12, but I really don't think this is as closely related as it may seem.
a) I've been having this problem for at least a year now, and during that time I've had a fair number of glibc revisions. I always do the upgrade when it comes in from upstream and have rarely had any kind of problem.
b) I maintain 2 systems with almost exactly the same setup, make.conf and all, but one is x86 and one is AMD64. They have the same glibc revision and build flags. The problem only occurs on the AMD64 one. Sadly, this is the one at work where I need valgrind :-(
I believe the reason is that the debug info for glibc either cannot be found or, when it can, doesn't contain the strlen symbol because strlen has been inlined away, or some such optimization magic. However, I can do a downgrade to prove my point here.
(In reply to comment #56)
> However, I can do a downgrade to prove my point here.
Correction: No, I cannot. I just learned that downgrading glibc is not supported. Sorry. But consider that glibc 2.12 wasn't out a year ago when this bug was filed. S.
(In reply to comment #55)
> leaving the rest of my system untouched. Immediately after, I tried valgrind
> which had worked under 2.11.2-r3 and it failed --- see the failure below. I
> then emerge -e world and it worked.
Anthony, to respond to this one: the output you posted is relatively normal valgrind output. The tool actually runs all the way to the end and presents the heap summary, so this is not an error, or at least not the one this bug is about. So your test just proved that you don't have the problem with your make.conf and toolset. Since you have just set this up, maybe it has something to do with the age of the system. Maybe some glibc upgrade a long time ago left something behind that causes this, so a freshly set-up system won't have it? Can anyone tell me how I can scan the system for possible debris from older upgrades? Is there a Gentoo tool for that? S.
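Since two different symptoms keep getting mixed up in this bug, here is a rough sketch for telling them apart from captured output. The function name and the three labels are made up; the two marker strings are copied from the logs posted in this thread.

```shell
# Classify valgrind output read from stdin:
#   redirect-failure  valgrind refused to start (the subject of this bug)
#   completed         valgrind ran to the end and printed its error summary
#   unknown           neither marker was found
classify_valgrind_log() {
    log=$(cat)
    case $log in
        *"Fatal error at startup: a function redirection"*)
            echo "redirect-failure" ;;
        *"ERROR SUMMARY:"*)
            echo "completed" ;;
        *)
            echo "unknown" ;;
    esac
}
```

Typical use would be: valgrind /bin/true 2>&1 | classify_valgrind_log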
(In reply to comment #58)
> The tool actually runs all the way to the end to present you
> the Heap summary. So this is not an error. At least not one related to the one
> this bug is about. So your test just proved you don't have the problem with
> your make.conf and toolset.
Yep, the original was strictly on strlen.
> Since you have just set this up, maybe it's got something to do with the age of
> the system. Maybe some glibc upgrade a long time ago left something behind that
> causes this so a freshly setup system won't have it? Can anyone tell me how I
> can scan the system for possible debris of older upgrades? Is there a gentoo
> tool for that?
Upgrade paths make a difference and I've seen toolchains break as a result. I wish there were a tool to find such problems. The best I can think of is something like epm, where you can do epm -qf to see whether a file belongs to a package or not. However, this is like searching for a needle in a haystack. I don't know how to proceed with this one. If you can hit it using a fresh build and post step by step how to get to the error, it would make life a lot easier. I understand the difficulty in this :(
Well, frankly a fresh install is for me just not in the cards right now. What I just did was trying to find such orphans. First I looked for *.la files and deleted all the orphaned ones. Just a few. Then I did the same search for left behind *.so files as such: qfile -o `find /lib/ /usr/lib/ /lib32 /lib64 /usr/local/lib/ /usr/x86_64-pc-linux-gnu/lib/ -name \*.so\*` Surprisingly there were quite a few. I have no idea though whether it would be wise to delete them. This is the output I got: /lib/libdevmapper-event-lvm2mirror.so.2.02 /lib/libdevmapper-event-lvm2snapshot.so.2.02 /lib/libgcc_s.so.1 /lib/modules/2.6.35-gentoo-r10/modules.softdep /lib/modules/2.6.36-gentoo/modules.softdep /usr/lib/libboost_python-mt.so /usr/lib/libboost_python.so /usr/lib/libboost_prg_exec_monitor-mt.so /usr/lib/libboost_wserialization.so /usr/lib/libpgtypes.so /usr/lib/libnvidia-cfg.so.1 /usr/lib/libdb_java.so /usr/lib/libnvcuvid.so.1 /usr/lib/libboost_thread-mt.so /usr/lib/libtalloc.so.2 /usr/lib/libboost_math_c99l.so /usr/lib/libboost_iostreams-mt.so /usr/lib/libdb_cxx.so /usr/lib/libblas.so /usr/lib/libboost_math_tr1f-mt.so /usr/lib/libecpg.so /usr/lib/libblas.so.0 /usr/lib/libboost_math_tr1l-mt.so /usr/lib/libboost_system-mt.so /usr/lib/libboost_math_c99l-mt.so /usr/lib/libboost_prg_exec_monitor.so /usr/lib/libcblas.so /usr/lib/libboost_graph_parallel-mt.so /usr/lib/nsbrowser/plugins/npwrapper.nppdf.so /usr/lib/nsbrowser/plugins/javaplugin.so /usr/lib/libboost_mpi-mt.so /usr/lib/libecpg_compat.so /usr/lib/libboost_regex.so /usr/lib/libboost_signals-mt.so /usr/lib/libboost_iostreams.so /usr/lib/libboost_math_tr1.so /usr/lib/libboost_serialization.so /usr/lib/libboost_date_time.so /usr/lib/libcblas.so.0 /usr/lib/libboost_system.so /usr/lib/libboost_mpi_python-mt.so /usr/lib/libboost_random-mt.so /usr/lib/libGL.so /usr/lib/libboost_wave-mt.so /usr/lib/libboost_unit_test_framework-mt.so /usr/lib/libGLcore.so /usr/lib/libboost_random.so /usr/lib/libboost_filesystem.so 
/usr/lib/libboost_math_c99f.so /usr/lib/libboost_graph-mt.so /usr/lib/libboost_math_tr1-mt.so /usr/lib/libboost_regex-mt.so /usr/lib/libdb.so /usr/lib/libboost_filesystem-mt.so /usr/lib/xorg/modules/extensions/libglx.so /usr/lib/libboost_wave.so /usr/lib/libpq.so /usr/lib/libboost_signals.so /usr/lib/python2.6/site-packages/mpi.so /usr/lib/libboost_math_tr1l.so /usr/lib/libboost_math_c99-mt.so /usr/lib/libboost_mpi.so /usr/lib/libboost_math_c99f-mt.so /usr/lib/libboost_math_c99.so /usr/lib/libboost_program_options.so /usr/lib/libboost_unit_test_framework.so /usr/lib/libboost_program_options-mt.so /usr/lib/libboost_math_tr1f.so /usr/lib/libboost_serialization-mt.so /usr/lib/libboost_graph.so /usr/lib/libboost_date_time-mt.so /usr/lib/libXvMCNVIDIA_dynamic.so.1 /usr/lib/libboost_thread.so /usr/lib/libboost_wserialization-mt.so /lib32/libgcc_s.so.1 /lib64/libdevmapper-event-lvm2mirror.so.2.02 /lib64/libdevmapper-event-lvm2snapshot.so.2.02 /lib64/libgcc_s.so.1 /lib64/modules/2.6.35-gentoo-r10/modules.softdep /lib64/modules/2.6.36-gentoo/modules.softdep /usr/local/lib/libcurlpp.so /usr/local/lib/libglog.so /usr/local/lib/libutilspp.so /usr/local/lib/libcurlpp.so.0 /usr/local/lib/libglog.so.0 /usr/local/lib/libgtest.so.0.0.0 /usr/local/lib/libgtest_main.so /usr/local/lib/libcurlpp.so.0.0.2 /usr/local/lib/libgtest.so /usr/local/lib/libjrtp.so /usr/local/lib/libjrtp-3.8.0.so /usr/local/lib/libeqserver.so /usr/local/lib/libeq.so /usr/local/lib/libgtest_main.so.0 /usr/local/lib/libutilspp.so.0 /usr/local/lib/libgtest_main.so.0.0.0 /usr/local/lib/libglog.so.0.0.0 /usr/local/lib/libutilspp.so.0.0.0 /usr/local/lib/libgtest.so.0 /usr/x86_64-pc-linux-gnu/lib/libopcodes-2.20.1.20100303.so /usr/x86_64-pc-linux-gnu/lib/libbfd.so /usr/x86_64-pc-linux-gnu/lib/libbfd-2.20.1.20100303.so /usr/x86_64-pc-linux-gnu/lib/libopcodes.so I really didn't expect that many. 
The way I understand Gentoo, it doesn't know about these files, so in a fresh install they wouldn't be there but the system would work, right? So I could delete them. But would that do any good here? If there is no chance it could solve the issue, I'd rather leave them alone because there appears to be some risk involved. What's your suggestion? Cheers, Stephan
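A slightly more robust variant of the orphan search above, as a sketch only. It assumes qfile from portage-utils and GNU find/xargs; the function name and the QFILE override (there only so the pipeline can be dry-run) are made up. The pattern is tightened so that files like modules.softdep, which matched \*.so\* above, are no longer listed.

```shell
# List shared-library files under the given directories that no installed
# package owns. File names are passed NUL-separated, so paths containing
# whitespace survive, unlike with backticks.
# $@: directories to search
list_orphan_libs() {
    find "$@" \( -name '*.so' -o -name '*.so.*' \) -print0 2>/dev/null |
        xargs -0 -r "${QFILE:-qfile}" -o
}
```

Example: list_orphan_libs /lib64 /usr/lib64 /usr/local/lib. This only lists files; it says nothing about whether deleting them is safe.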
When trying to emerge glibc-2.11.2-r3 with USE="debug" and FEATURES="splitdebug" got next installation error: * >>> SetUID: [chmod go-r] /usr/lib32/misc/glibc/pt_chown ... [ ok ] !!! FAILED prerm: 2816 * The ebuild phase 'postrm' has exited unexpectedly. This type of behavior * is known to be triggered by things such as failed variable assignments * (bug #190128) or bad substitution errors (bug #200313). Normally, before * exiting, bash should have displayed an error message above. If bash did * not produce an error message above, it's possible that the ebuild has * called `exit` when it should have called `die` instead. This behavior * may also be triggered by a corrupt bash binary or a hardware problem * such as memory or cpu malfunction. If the problem is not reproducible or * it appears to occur randomly, then it is likely to be triggered by a * hardware problem. If you suspect a hardware problem then you should try * some basic hardware diagnostics such as memtest. Please do not report * this as a bug unless it is consistently reproducible and you are sure * that your bash binary and hardware are functioning properly. Sandboxed process killed by signal: Segmentation fault * The ebuild phase 'die_hooks' has exited unexpectedly. This type of * behavior is known to be triggered by things such as failed variable * assignments (bug #190128) or bad substitution errors (bug #200313). * Normally, before exiting, bash should have displayed an error message * above. If bash did not produce an error message above, it's possible * that the ebuild has called `exit` when it should have called `die` * instead. This behavior may also be triggered by a corrupt bash binary or * a hardware problem such as memory or cpu malfunction. If the problem is * not reproducible or it appears to occur randomly, then it is likely to * be triggered by a hardware problem. If you suspect a hardware problem * then you should try some basic hardware diagnostics such as memtest. 
* Please do not report this as a bug unless it is consistently * reproducible and you are sure that your bash binary and hardware are * functioning properly. !!! FAILED postrm: 1 * The 'postrm' phase of the 'sys-libs/glibc-2.11.2-r3' package has failed * with exit value 1. * * The problem occurred while executing the ebuild file named * 'glibc-2.11.2-r3.ebuild' located in the '/var/db/pkg/sys- * libs/glibc-2.11.2-r3' directory. If necessary, manually remove the * environment.bz2 file and/or the ebuild file located in that directory. * * Removal of the environment.bz2 file is preferred since it may allow the * removal phases to execute successfully. The ebuild will be sourced and * the eclasses from the current portage tree will be used when necessary. * Removal of the ebuild file will cause the pkg_prerm() and pkg_postrm() * removal phases to be skipped entirely. * The ebuild phase 'postinst' has exited unexpectedly. This type of * behavior is known to be triggered by things such as failed variable * assignments (bug #190128) or bad substitution errors (bug #200313). * Normally, before exiting, bash should have displayed an error message * above. If bash did not produce an error message above, it's possible * that the ebuild has called `exit` when it should have called `die` * instead. This behavior may also be triggered by a corrupt bash binary or * a hardware problem such as memory or cpu malfunction. If the problem is * not reproducible or it appears to occur randomly, then it is likely to * be triggered by a hardware problem. If you suspect a hardware problem * then you should try some basic hardware diagnostics such as memtest. * Please do not report this as a bug unless it is consistently * reproducible and you are sure that your bash binary and hardware are * functioning properly. Sandboxed process killed by signal: Segmentation fault * The ebuild phase 'die_hooks' has exited unexpectedly. 
This type of * behavior is known to be triggered by things such as failed variable * assignments (bug #190128) or bad substitution errors (bug #200313). * Normally, before exiting, bash should have displayed an error message * above. If bash did not produce an error message above, it's possible * that the ebuild has called `exit` when it should have called `die` * instead. This behavior may also be triggered by a corrupt bash binary or * a hardware problem such as memory or cpu malfunction. If the problem is * not reproducible or it appears to occur randomly, then it is likely to * be triggered by a hardware problem. If you suspect a hardware problem * then you should try some basic hardware diagnostics such as memtest. * Please do not report this as a bug unless it is consistently * reproducible and you are sure that your bash binary and hardware are * functioning properly. !!! FAILED postinst: 1 Sandboxed process killed by signal: Segmentation fault * The ebuild phase 'success_hooks' has exited unexpectedly. This type of * behavior is known to be triggered by things such as failed variable * assignments (bug #190128) or bad substitution errors (bug #200313). * Normally, before exiting, bash should have displayed an error message * above. If bash did not produce an error message above, it's possible * that the ebuild has called `exit` when it should have called `die` * instead. This behavior may also be triggered by a corrupt bash binary or * a hardware problem such as memory or cpu malfunction. If the problem is * not reproducible or it appears to occur randomly, then it is likely to * be triggered by a hardware problem. If you suspect a hardware problem * then you should try some basic hardware diagnostics such as memtest. * Please do not report this as a bug unless it is consistently * reproducible and you are sure that your bash binary and hardware are * functioning properly. 
Sandboxed process killed by signal: Segmentation fault * The ebuild phase 'die_hooks' has exited unexpectedly. This type of * behavior is known to be triggered by things such as failed variable * assignments (bug #190128) or bad substitution errors (bug #200313). * Normally, before exiting, bash should have displayed an error message * above. If bash did not produce an error message above, it's possible * that the ebuild has called `exit` when it should have called `die` * instead. This behavior may also be triggered by a corrupt bash binary or * a hardware problem such as memory or cpu malfunction. If the problem is * not reproducible or it appears to occur randomly, then it is likely to * be triggered by a hardware problem. If you suspect a hardware problem * then you should try some basic hardware diagnostics such as memtest. * Please do not report this as a bug unless it is consistently * reproducible and you are sure that your bash binary and hardware are * functioning properly. * Messages for package sys-libs/glibc-2.11.2-r3: * The ebuild phase 'postrm' has exited unexpectedly. This type of behavior * is known to be triggered by things such as failed variable assignments * (bug #190128) or bad substitution errors (bug #200313). Normally, before * exiting, bash should have displayed an error message above. If bash did * not produce an error message above, it's possible that the ebuild has * called `exit` when it should have called `die` instead. This behavior * may also be triggered by a corrupt bash binary or a hardware problem * such as memory or cpu malfunction. If the problem is not reproducible or * it appears to occur randomly, then it is likely to be triggered by a * hardware problem. If you suspect a hardware problem then you should try * some basic hardware diagnostics such as memtest. Please do not report * this as a bug unless it is consistently reproducible and you are sure * that your bash binary and hardware are functioning properly. 
* The 'postrm' phase of the 'sys-libs/glibc-2.11.2-r3' package has failed * with exit value 1. * * The problem occurred while executing the ebuild file named * 'glibc-2.11.2-r3.ebuild' located in the '/var/db/pkg/sys- * libs/glibc-2.11.2-r3' directory. If necessary, manually remove the * environment.bz2 file and/or the ebuild file located in that directory. * * Removal of the environment.bz2 file is preferred since it may allow the * removal phases to execute successfully. The ebuild will be sourced and * the eclasses from the current portage tree will be used when necessary. * Removal of the ebuild file will cause the pkg_prerm() and pkg_postrm() * removal phases to be skipped entirely. * Messages for package sys-libs/glibc-2.11.2-r3: * The ebuild phase 'postinst' has exited unexpectedly. This type of * behavior is known to be triggered by things such as failed variable * assignments (bug #190128) or bad substitution errors (bug #200313). * Normally, before exiting, bash should have displayed an error message * above. If bash did not produce an error message above, it's possible * that the ebuild has called `exit` when it should have called `die` * instead. This behavior may also be triggered by a corrupt bash binary or * a hardware problem such as memory or cpu malfunction. If the problem is * not reproducible or it appears to occur randomly, then it is likely to * be triggered by a hardware problem. If you suspect a hardware problem * then you should try some basic hardware diagnostics such as memtest. * Please do not report this as a bug unless it is consistently * reproducible and you are sure that your bash binary and hardware are * functioning properly.
Created attachment 258334 [details] My emerge --info
Also, I get this on stderr: "!!! No gcc found. You probably need to 'source /etc/profile' !!! to update the environment of this terminal and possibly !!! other terminals also." I also get segfaults when catching signals, and I can't run gdb.
Also, I can't re-emerge without debug:
# emerge -1 glibc
 * IMPORTANT: 6 news items need reading for repository 'gentoo'.
 * Use eselect news to read news items.
Calculating dependencies... done!
>>> Verifying ebuild manifests
>>> Emerging (1 of 1) sys-libs/glibc-2.11.2-r3
Sandboxed process killed by signal: Segmentation fault
 * The ebuild phase 'die_hooks' has exited unexpectedly. This type of
 * behavior is known to be triggered by things such as failed variable
 * assignments (bug #190128) or bad substitution errors (bug #200313).
 * Normally, before exiting, bash should have displayed an error message
 * above. If bash did not produce an error message above, it's possible
 * that the ebuild has called `exit` when it should have called `die`
 * instead. This behavior may also be triggered by a corrupt bash binary or
 * a hardware problem such as memory or cpu malfunction. If the problem is
 * not reproducible or it appears to occur randomly, then it is likely to
 * be triggered by a hardware problem. If you suspect a hardware problem
 * then you should try some basic hardware diagnostics such as memtest.
 * Please do not report this as a bug unless it is consistently
 * reproducible and you are sure that your bash binary and hardware are
 * functioning properly.
>>> Failed to emerge sys-libs/glibc-2.11.2-r3
 * Messages for package sys-libs/glibc-2.11.2-r3:
 * IMPORTANT: 6 news items need reading for repository 'gentoo'.
 * Use eselect news to read news items.
Oleg, are you sure this has got anything to do with this bug? If so, what makes you think so? Cheers, Stephan
Can I add debug info into glibc by other methods?
(In reply to comment #65)
> Can I add debug info into glibc by other methods?
No, because the *.so files themselves need to contain links to the debug information files. Do not put "debug" in USE. Never do that! All you need is "-g" in CFLAGS and "splitdebug" in FEATURES. Putting "debug" in USE does much more than just generate debug info; it enables extra debugging paths (assertions and extra checks) in the code and is intended to be used by developers of those packages. Also, your CFLAGS look silly to me. Use something like "-pipe -g -O2 -march=generic".
(In reply to comment #66) > Also, your CFLAGS look silly to me. Use something line "-pipe -g -O2 > -march=generic". Er, oops, I meant "-march=native" of course, not "generic".
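Put together, the advice from the last two comments would look roughly like this in make.conf. Treat it as a sketch: only the values named above are taken from this thread, and whether -march=native suits a given machine (distcc setups in particular) is a separate question.

```shell
# make.conf fragment: debug info via the compiler flags plus split debug
# files, with -march=native as corrected in the follow-up comment.
CFLAGS="-pipe -g -O2 -march=native"
CXXFLAGS="${CFLAGS}"
FEATURES="splitdebug"
# "debug" is deliberately NOT added to USE: it switches on extra
# assertions and checks meant for package developers.
```

glibc has to be re-emerged after changing these for the debug files to appear.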
Is it possible that using an older gcc as the default compiler causes this problem? I have two amd64 systems which are very similar. Both use glibc-2.12.1-r3. But valgrind reports the error only on the system which has gcc-4.4.2 as system compiler. I did emerge -e world and glibc is emerged with FEATURES="splitdebug", and valgrind still doesn't work. The other system, with gcc-4.5.2 as system compiler, doesn't have any problems with valgrind. Info from the system where valgrind doesn't work:
$ gcc-config -l
 [1] x86_64-pc-linux-gnu-4.2.4
 [2] x86_64-pc-linux-gnu-4.3.5
 [3] x86_64-pc-linux-gnu-4.4.2 *
 [4] x86_64-pc-linux-gnu-4.5.2
$ valgrind -v ldd
==4747== Memcheck, a memory error detector
==4747== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==4747== Using Valgrind-3.7.0.SVN and LibVEX; rerun with -h for copyright info
==4747== Command: /usr/bin/ldd
==4747==
--4747-- Valgrind options:
--4747--    -v
--4747-- Contents of /proc/version:
--4747--   Linux version 2.6.38-gentoo-r1 (root@d630) (gcc version 4.4.2 (Gentoo 4.4.2 p1.0) ) #1 SMP PREEMPT Mon Apr 18 10:58:47 PDT 2011
--4747-- Arch and hwcaps: AMD64, amd64-sse3-cx16
--4747-- Page sizes: currently 4096, max supported 4096
--4747-- Valgrind library directory: /usr/lib64/valgrind
--4747-- Reading syms from /bin/bash (0x400000)
--4747--   object doesn't have a symbol table
--4747-- Reading syms from /lib64/ld-2.12.1.so (0x4000000)
--4747--   Considering /usr/lib/debug/lib64/ld-2.12.1.so.debug ..
--4747--   .. CRC is valid
valgrind: Fatal error at startup: a function redirection
valgrind: which is mandatory for this platform-tool combination
valgrind: cannot be set up. Details of the redirection are:
...
valgrind: Cannot continue -- exiting now. Sorry.
$ nm /usr/lib/debug/lib64/ld-2.12.1.so.debug | grep ' strlen$'
$ nm /usr/lib/debug/lib32/ld-2.12.1.so.debug | grep ' strlen$'
$ nm /usr/lib/debug/lib64/libc-2.12.1.so.debug | grep ' strlen$'
0000000000078960 i strlen
$ nm /usr/lib/debug/lib64/libc-2.12.1.so.debug | grep ' malloc$'
0000000000073c79 T malloc
$ file /usr/lib*
/usr/lib: symbolic link to `lib64'
/usr/lib32: directory
/usr/lib64: directory
/usr/libexec: directory
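The nm checks above can be wrapped into a small helper, sketched below (the function name is made up). It reads nm output on stdin, prints the type letter of the requested symbol and returns non-zero when the symbol is absent, which is the case valgrind complains about for the loader. For what it's worth, GNU nm documents "i" on ELF as an indirect function and "T" as a global symbol in the text section.

```shell
# Read `nm` output on stdin and print the type letter of every line whose
# last field is exactly the given symbol name.
# $1: symbol name, e.g. strlen
# Exit status: 0 if the symbol was seen, 1 otherwise.
nm_symbol_type() {
    awk -v sym="$1" '
        $NF == sym { print $(NF - 1); found = 1 }
        END { exit !found }
    '
}
```

Example: nm /usr/lib/debug/lib64/ld-2.12.1.so.debug | nm_symbol_type strlen || echo "strlen not found"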
(In reply to comment #68)
> Is it possible that using older gcc as default compiler causes this problem? I
> have two amd64 systems which are very similar. Both use glibc-2.12.1-r3. But
> valgrind reports the error only on the system which has gcc-4.4.2 as system
> compiler. I did emerge -e world and glibc is emerged with FEATURES="splitdebug"
> and valgrind still doesn't work.
> The other system with gcc-4.5.2 as system compiler doesn't have any problems
> with valgrind. Info from the system where valgrind doesn't work:
>
I have not been able to get a handle on this bug because I was never able to reproduce it. I've used valgrind with both compilers, no problem. My suspicion is that your compiler is broken, but that's just a guess, because you were able to emerge -e world with it.
(In reply to comment #0)
> valgrind: Fatal error at startup: a function redirection
> valgrind: which is mandatory for this platform-tool combination
> valgrind: cannot be set up. Details of the redirection are:
> valgrind:
> valgrind: A must-be-redirected function
> valgrind: whose name matches the pattern: strlen
> valgrind: in an object with soname matching: ld-linux-x86-64.so.2
> valgrind: was not found whilst processing
> valgrind: symbols from the object with soname: ld-linux-x86-64.so.2
I was getting exactly this ^^^ even for 'valgrind /bin/true' after upgrading to glibc-2.12.2 with gcc-4.5.2 and -march=nocona in CFLAGS. I temporarily removed -march=nocona from CFLAGS, rebuilt glibc and the problem vanished.
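If someone wants to try the same experiment without editing make.conf by hand, the flag can be dropped from an existing CFLAGS string with a helper like this. It is a sketch with a made-up name; how the result is then fed to the glibc rebuild depends on the setup.

```shell
# Remove any -march=... option from a flags string and tidy up the
# leftover whitespace.
# $1: flags string, e.g. "-O2 -march=nocona -pipe"
without_march() {
    printf '%s\n' "$1" |
        sed -e 's/-march=[^ ]*//g' -e 's/  */ /g' -e 's/^ //' -e 's/ $//'
}
```

For example, without_march "-O2 -march=nocona -pipe" prints the flags to put into CFLAGS for the test rebuild of glibc.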
You should replace -march=nocona with -march=native anyway.
(In reply to comment #71)
> You should replace -march=nocona with -march=native anyway.
Is -march=native compatible with distcc? If it inherited the flags from the distcc host, which defaults to -march=core2, it would end up with a broken glibc installation on my laptop. Anyway, I suspect that -march=nocona is equivalent to -march=native on my laptop. Why do you think that -march=native is going to solve the problem in general?
(In reply to comment #72)
> Anyway, I suspect that -march=nocona is equivalent to -march=native on my
> laptop. Why do you think that -march=native is going to solve the problem in
> general?
Because traditionally, "nocona" was used due to GCC not having a flag for "core2" and "atom" in earlier versions. I assumed you were one of those cases, simply because there are so many core2 and atom users who still use "nocona" ;-) If you really have a Pentium4 (with 64-bit extensions) laptop though, then yes, "nocona" is correct. No idea about distcc.
It does not, however, explain why the symbol strlen is missing in /lib64/ld-2.12.2.so when compiled with -march=nocona. Is it a glibc bug? A gcc bug? valgrind's wrong assumption about availability of that symbol? A user error? I would still like to get back the optimized glibc as soon as this issue is resolved. Here are the involved ebuilds:
sys-devel/gcc-4.5.2
sys-libs/glibc-2.12.2
dev-util/valgrind-3.6.1
The problem is that gcc optimizes out all calls of strlen() when building ld-2.12.2.so and the symbol is then not available from outside, which badly surprises valgrind. Forwarded upstream: https://bugs.kde.org/show_bug.cgi?id=190429#c17
Thank you Kamil. I confirm that changing -march=nocona to -march=core2 in make.conf and reemerging glibc solved the issue.
I just gave it another try here and it doesn't occur anymore. Don't know why or since when exactly but I consider it fixed. Stephan
Created attachment 311403 [details] My emerge --info
Still there for me; emerge --info attached. This is with valgrind-3.6.1-r3. Note -march=native in my make.conf.
I just want to add a fact for the casual reader who looks into this bug report because *running* valgrind (rather than building it) spews out a long list of messages regarding libc internals rather than problems in the actual application. The important thing to note here is that valgrind creates an exclusion list for such libc-internal warnings when it is emerged. Therefore, every time glibc gets emerged, valgrind should be re-emerged after that too. Then valgrind will rebuild its exclusion list, and glibc-internal problems will no longer be displayed.
(In reply to comment #80)
> The important thing to note here is that valgrind creates an exclusion list
> for such libc-internal warnings when it is emerged.
That's certainly an important thing, but not the only thing discussed here.
> Therefore, every time glibc gets emerged, valgrind should be re-emerged
> after that too.
>
> Then valgrind will rebuild its exclusion list, and glibc-internal problems
> will no longer be displayed.
Still, if /lib/ld-2.??.?.so does not expose the strlen symbol, valgrind fails to start. A rebuild of valgrind can hardly solve this issue, since it is glibc that needs to be rebuilt to make the symbol available.
*** This bug has been marked as a duplicate of bug 390323 ***