This applies to valgrind v2.1.1. Using a sample program with an array indexing bug, I run the following command:

valgrind --tool=memcheck --db-attach=yes ./dumb

This produces the following output:

==13657== Memcheck, a memory error detector for x86-linux.
==13657== Copyright (C) 2002-2004, and GNU GPL'd, by Julian Seward.
==13657== Using valgrind-2.1.1, a program supervision framework for x86-linux.
==13657== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward.
==13657== For more details, rerun with: -v
==13657==
==13657== Invalid read of size 4
==13657== at 0x80483B9: main (dumb.c:7)
==13657== Address 0x3C15D054 is not stack'd, malloc'd or free'd
==13657==
==13657== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ----

I reply 'y', and the following output shows that the gdb started by valgrind fails to open the binary via /proc:

<--------------------------------------- starting debugger
==13657== starting debugger with cmd: /usr/bin/gdb -nw /proc/13693/fd/822 13693

valgrind: vg_signals.c:1587 (vg_sync_signalhandler): Assertion `info->si_code <= 0' failed.
==13693== at 0xB802FA30: vgPlain_skin_assert_fail (vg_mylibc.c:1211)
==13693== by 0xB802FA2F: assert_fail (vg_mylibc.c:1207)
==13693== by 0xB802FA9D: vgPlain_core_assert_fail (vg_mylibc.c:1218)
==13693== by 0xB8036125: vg_sync_signalhandler (vg_signals.c:1630)

sched status:

Thread 1: status = Runnable, associated_mx = 0x0, associated_cv = 0x0
==13693== at 0x80483B0: main (dumb.c:5)

Note: see also the FAQ.txt in the source distribution.
It contains workarounds to several common problems.

If that doesn't help, please report this bug to: valgrind.kde.org

In the bug report, send all the above text, the valgrind
version, and what Linux distro you are using. Thanks.

GNU gdb 6.0
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".../proc/13693/fd/822: Permission denied.
Attaching to process 13693
ptrace: Operation not permitted.
/home/graham/junk/13693: No such file or directory.
--------------------------------------->

Looking in /proc, I can see that a numbered (nnnn) directory exists for each of my processes. Each of the "live" ones contains files that are owned by me. However, the perms for 13693 are:

graham@spiceisland junk $ ls -l /proc/13693
ls: cannot read symbolic link /proc/13693/cwd: Permission denied
ls: cannot read symbolic link /proc/13693/root: Permission denied
ls: cannot read symbolic link /proc/13693/exe: Permission denied
total 0
-r--r--r-- 1 root root 0 Apr 27 00:22 cmdline
lrwxrwxrwx 1 root root 0 Apr 27 00:22 cwd
-r-------- 1 root root 0 Apr 27 00:22 environ
lrwxrwxrwx 1 root root 0 Apr 27 00:22 exe
dr-x------ 2 root root 0 Apr 27 00:22 fd
-r--r--r-- 1 root root 0 Apr 27 00:22 maps
-rw------- 1 root root 0 Apr 27 00:22 mem
-r--r--r-- 1 root root 0 Apr 27 00:22 mounts
lrwxrwxrwx 1 root root 0 Apr 27 00:22 root
-r--r--r-- 1 root root 0 Apr 27 00:22 stat
-r--r--r-- 1 root root 0 Apr 27 00:22 statm
-r--r--r-- 1 root root 0 Apr 27 00:22 status

Doing a "cat /proc/13693/status" shows that 13693 is a zombie:

Name: valgrind
State: Z (zombie)
Tgid: 13693
Pid: 13693
PPid: 13657
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 407 407 407 407
FDSize: 0
Groups: 407 10 35 80 441
SigPnd: 0000000000000000
SigBlk: fffffffffffbfeff
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000

and yet this is the process being executed by valgrind and the one gdb is trying to attach to!

Non-attaching modes of operation of valgrind 2.1.1 seem to work fine, but db-attach seems badly broken ...
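For anyone who wants to check for this state without eyeballing the status file, here is a minimal Python sketch. It is not part of valgrind or gdb, and the function names (parse_proc_status, is_zombie) are made up for illustration. It only parses the "Key: value" layout shown in the status output above and reports whether the State field says zombie; the sample text is a shortened copy of that output.

```python
# Hypothetical helpers for illustration only; not part of valgrind or gdb.

def parse_proc_status(text):
    """Turn the 'Key: value' lines of /proc/<pid>/status into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip any line that has no colon
            fields[key.strip()] = value.strip()
    return fields


def is_zombie(fields):
    """True when the State field starts with 'Z', as in 'Z (zombie)'."""
    return fields.get("State", "").startswith("Z")


# First few lines of the status output quoted in this report.
SAMPLE = """\
Name: valgrind
State: Z (zombie)
Tgid: 13693
Pid: 13693
PPid: 13657
TracerPid: 0
"""

if __name__ == "__main__":
    info = parse_proc_status(SAMPLE)
    state = "zombie" if is_zombie(info) else "not a zombie"
    print(info["Pid"], state, "parent", info["PPid"])
```

On a live system the same functions could be fed the contents of /proc/<pid>/status read with open(). Note that the PPid field (13657) matches the pid in the ==13657== prefix of the first valgrind banner.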
Reproducible: Always

Steps to Reproduce:

emerge info produces:

Portage 2.0.50-r6 (default-x86-1.4, gcc-3.3.2, glibc-2.3.2-r9, 2.4.25-gentoo-r2)
=================================================================
System uname: 2.4.25-gentoo-r2 i686 AMD Athlon(tm) processor
Gentoo Base System version 1.4.9
Autoconf: sys-devel/autoconf-2.58-r1
Automake: sys-devel/automake-1.8.3
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-O3 -mcpu=athlon-tbird -march=athlon-tbird -funroll-loops -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3/share/config /usr/lib/mozilla/defaults/pref /usr/share/config /var/lib/jboss /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O3 -mcpu=athlon-tbird -march=athlon-tbird -funroll-loops -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox"
GENTOO_MIRRORS="http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/ http://ftp.belnet.be/mirror/rsync.gentoo.org/gentoo/ http://gentoo.tiscali.nl/gentoo/ http://ftp.heanet.ie/pub/gentoo/ http://ftp.snt.utwente.nl/pub/os/linux/gentoo"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X apache2 apm avi berkdb cdr crypt cups encode esd foomaticdb gdbm gif gimpprint gnome gpm gtk gtk2 gtkhtml imap imlib java jpeg ldap libg++ libwww mad mikmod motif mozilla moznoirc mpeg ncurses nls oggvorbis opengl oss pam pdflib perl png postgres ppds python quicktime readline sdl slang spell ssl svga tcltk tcpd tiff truetype usb x86 xml2 xmms xv zlib"

This also applies to a Gentoo system I have at work, which is a Pentium 4 with gcc/g++ -O2 set.
Created attachment 30119
Example program used to test valgrind
Just noticed from the sample output I gave that the process number reported by valgrind when the assertion fails (13693) is different from the process number output at the beginning (13657). This is not a mistake on my part; this is what happens!

Got a colleague to try this out on their Debian system (woody, 2.4.18), and the same failure to attach happens there. Tried the same thing on a RH9 system (2.4.20-30.9smp) and everything works fine!?!

Finally, just downloaded a copy of valgrind from their CVS. Gdb attaching works fine with that build too, so I guess there is a bug in 2.1.1 that has been fixed or worked around.
Yep, found the bug report over at the valgrind web site. Fixed 6 days ago (21st April). Have a look at:

http://bugs.kde.org/show_bug.cgi?id=77824

Concocted a patch with a view to creating a 2.1.1-r1 ebuild. Applied the patch and re-ran my test scenario. The immediate problem is now fixed (valgrind successfully attaches gdb), but trying other stuff from within gdb (such as step, next or cont) produces a similar assertion failure and/or segmentation violation as before. This does NOT happen in the CVS sources of valgrind that I have downloaded and built.

I tried to produce a diff between 2.1.1 and 2.1.2.CVS, but this was over 180K, so that seems a little impractical!

Personally, I'm giving up on valgrind 2.1.1 and going back to 2.0.0 until 2.1.2 comes out. If anyone else can be bothered (i.e. can't wait for 2.1.2), I'm willing to test any fixes/patches they produce.

Ho hum.
Final note (for the moment)! Should say that 2.1.0 is even more broken in this respect than 2.1.1, but since both are development snapshots, not much point in trying to fix 2.1.0.
Problem does not occur in Valgrind v2.2.0 (just released). Suggest this bug is binned. Also, after a new ebuild for 2.2.0 arrives, suggest ebuilds for 2.1.0 and 2.1.1 are removed?