Bug 264558 - sys-libs/glibc-2.8_p20080602-r1: ld-2.8.so reads from uninitialised memory regions
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo Toolchain Maintainers
URL:
Whiteboard:
Keywords:
Duplicates: 271596
Depends on: 274771
Blocks:
Reported: 2009-04-01 22:23 UTC by Stephan Krauß
Modified: 2009-09-06 16:46 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Test code showing some misbehaviour (test.tar.bz2,5.06 KB, text/plain)
2009-04-04 14:27 UTC, Stephan Krauß
Test code showing some misbehaviour (test.tar.bz2,5.06 KB, application/octet-stream)
2009-04-04 14:29 UTC, Stephan Krauß
valgrind ldd (error,188.11 KB, text/plain)
2009-06-13 16:47 UTC, Nikos Chantziaras
emerge --info (emerge--info,4.14 KB, text/plain)
2009-06-13 16:48 UTC, Nikos Chantziaras

Description Stephan Krauß 2009-04-01 22:23:45 UTC
ld-2.8.so reads from uninitialised variables while doing dynamic linking. As a result, a program of mine computes wrong results. (I suppose this affects all dynamically linked programs.) Originally I thought my own program was reading from uninitialised memory, but Valgrind showed that the fault actually lies in ld.so. (I tested the program on an Ubuntu installation, where it computes everything correctly, so this really is the linker's fault.)

Reproducible: Always

Steps to Reproduce:
1. Make sure you have glibc-2.8_p20080602-r1 installed.
2. Run a program with Valgrind or gdb.
   For example: valgrind ls
3. Watch it make nondeterministic jumps.
Actual Results:  
The relevant parts of the Valgrind output:
==15748== ERROR SUMMARY: 7 errors from 7 contexts (suppressed: 0 from 0)
==15748== 
==15748== 1 errors in context 1 of 7:
==15748== Conditional jump or move depends on uninitialised value(s)
==15748==    at 0x400ACA4: _dl_relocate_object (do-rel.h:117)
==15748==    by 0x4003F4E: dl_main (rtld.c:2304)
==15748==    by 0x4014225: _dl_sysdep_start (dl-sysdep.c:239)
==15748==    by 0x400138E: _dl_start (rtld.c:330)
==15748==    by 0x4000986: (within /lib/ld-2.8.so)
==15748==  Uninitialised value was created by a stack allocation
==15748==    at 0x400AA66: _dl_relocate_object (dl-reloc.c:142)
==15748== 
==15748== 1 errors in context 2 of 7:
==15748== Conditional jump or move depends on uninitialised value(s)
==15748==    at 0x400AB61: _dl_relocate_object (do-rel.h:68)
==15748==    by 0x4003F4E: dl_main (rtld.c:2304)
==15748==    by 0x4014225: _dl_sysdep_start (dl-sysdep.c:239)
==15748==    by 0x400138E: _dl_start (rtld.c:330)
==15748==    by 0x4000986: (within /lib/ld-2.8.so)
==15748==  Uninitialised value was created by a stack allocation
==15748==    at 0x400AA66: _dl_relocate_object (dl-reloc.c:142)
==15748== 
==15748== 1 errors in context 3 of 7:
==15748== Conditional jump or move depends on uninitialised value(s)
==15748==    at 0x400AB59: _dl_relocate_object (do-rel.h:65)
==15748==    by 0x4003F4E: dl_main (rtld.c:2304)
==15748==    by 0x4014225: _dl_sysdep_start (dl-sysdep.c:239)
==15748==    by 0x400138E: _dl_start (rtld.c:330)
==15748==    by 0x4000986: (within /lib/ld-2.8.so)
==15748==  Uninitialised value was created by a stack allocation
==15748==    at 0x400AA66: _dl_relocate_object (dl-reloc.c:142)
==15748== 
==15748== 1 errors in context 4 of 7:
==15748== Conditional jump or move depends on uninitialised value(s)
==15748==    at 0x400ACA4: _dl_relocate_object (do-rel.h:117)
==15748==    by 0x400410D: dl_main (rtld.c:2234)
==15748==    by 0x4014225: _dl_sysdep_start (dl-sysdep.c:239)
==15748==    by 0x400138E: _dl_start (rtld.c:330)
==15748==    by 0x4000986: (within /lib/ld-2.8.so)
==15748==  Uninitialised value was created by a stack allocation
==15748==    at 0x400AA66: _dl_relocate_object (dl-reloc.c:142)
==15748== 
==15748== 1 errors in context 5 of 7:
==15748== Conditional jump or move depends on uninitialised value(s)
==15748==    at 0x400B019: _dl_relocate_object (do-rel.h:104)
==15748==    by 0x400410D: dl_main (rtld.c:2234)
==15748==    by 0x4014225: _dl_sysdep_start (dl-sysdep.c:239)
==15748==    by 0x400138E: _dl_start (rtld.c:330)
==15748==    by 0x4000986: (within /lib/ld-2.8.so)
==15748==  Uninitialised value was created by a stack allocation
==15748==    at 0x400AA66: _dl_relocate_object (dl-reloc.c:142)
==15748== 
==15748== 1 errors in context 6 of 7:
==15748== Conditional jump or move depends on uninitialised value(s)
==15748==    at 0x400AB61: _dl_relocate_object (do-rel.h:68)
==15748==    by 0x400410D: dl_main (rtld.c:2234)
==15748==    by 0x4014225: _dl_sysdep_start (dl-sysdep.c:239)
==15748==    by 0x400138E: _dl_start (rtld.c:330)
==15748==    by 0x4000986: (within /lib/ld-2.8.so)
==15748==  Uninitialised value was created by a stack allocation
==15748==    at 0x400AA66: _dl_relocate_object (dl-reloc.c:142)
==15748== 
==15748== 1 errors in context 7 of 7:
==15748== Conditional jump or move depends on uninitialised value(s)
==15748==    at 0x400AB59: _dl_relocate_object (do-rel.h:65)
==15748==    by 0x400410D: dl_main (rtld.c:2234)
==15748==    by 0x4014225: _dl_sysdep_start (dl-sysdep.c:239)
==15748==    by 0x400138E: _dl_start (rtld.c:330)
==15748==    by 0x4000986: (within /lib/ld-2.8.so)
==15748==  Uninitialised value was created by a stack allocation
==15748==    at 0x400AA66: _dl_relocate_object (dl-reloc.c:142)
==15748== IN SUMMARY: 7 errors from 7 contexts (suppressed: 0 from 0)

Expected Results:  
No errors, neither in the Valgrind output nor in my program.
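
Editor's sketch: the seven contexts in the Valgrind output above involve only four lines of do-rel.h (65, 68, 104 and 117), reached from two call sites in rtld.c (lines 2234 and 2304). Grouping the reports by the line that raised them makes this easier to see. The C++ sketch below shows one way to do that, assuming the log is held in a single string and that the reporting frame is the line directly after each error header. `count_error_sites` is a hypothetical helper name, not part of Valgrind.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Count how often each source location appears as the reporting frame
// ("at 0x...: function (file:line)") directly after a memcheck error header.
// Hypothetical helper for illustration; it only understands the header below.
std::map<std::string, int> count_error_sites(const std::string& log) {
    const std::string header =
        "Conditional jump or move depends on uninitialised value(s)";
    std::map<std::string, int> sites;
    std::istringstream in(log);
    std::string line;
    bool expect_site = false;
    while (std::getline(in, line)) {
        if (line.find(header) != std::string::npos) {
            expect_site = true;   // the next line names the reporting frame
            continue;
        }
        if (expect_site) {
            expect_site = false;
            // The location is the text inside the last pair of parentheses.
            const std::string::size_type open = line.rfind('(');
            const std::string::size_type close = line.rfind(')');
            if (open != std::string::npos && close != std::string::npos &&
                open < close) {
                ++sites[line.substr(open + 1, close - open - 1)];
            }
        }
    }
    return sites;
}
```

Each key of the returned map is a `file:line` string such as `do-rel.h:117`, and the value is the number of error contexts reported at that line.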

The obligatory emerge --info:

Portage 2.1.6.7 (default/linux/x86/2008.0/desktop, gcc-4.1.2, glibc-2.8_p20080602-r1, 2.6.27-gentoo-r8 i686)
=================================================================
System uname: Linux-2.6.27-gentoo-r8-i686-Intel-R-_Pentium-R-_M_processor_2.00GHz-with-glibc2.0
Timestamp of tree: Wed, 01 Apr 2009 21:00:16 +0000
app-shells/bash:     3.2_p39
dev-java/java-config: 2.1.7
dev-lang/python:     2.5.2-r7
dev-util/cmake:      2.4.8
sys-apps/baselayout: 1.12.11.1
sys-apps/sandbox:    1.2.18.1-r2
sys-devel/autoconf:  2.13, 2.63
sys-devel/automake:  1.5, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.2
sys-devel/binutils:  2.18-r3
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.27-r2
ACCEPT_KEYWORDS="x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=pentium-m -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/config"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-O2 -march=pentium-m -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks fixpackages parallel-fetch protect-owned sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="ftp://ftp.wh2.tu-dresden.de/pub/mirrors/gentoo 		ftp://files.gentoo.org	http://files.gentoo.org"
LANG="de_DE.UTF-8"
LC_ALL="de_DE.UTF-8"
LDFLAGS="-Wl,-O1"
LINGUAS="de"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X a52 aac acl acpi alsa berkdb bluetooth bzip2 cairo cdr cli cracklib crypt cups dbus dri dvd dvdr dvdread emboss encode exif fam firefox fortran gdbm gif gnutls gpm gstreamer gtk hal iconv ipv6 isdnlog java jpeg jpeg2k laptop libnotify mad midi mikmod mmx mp3 mpeg mudflap ncurses nls nptl nptlonly ogg opengl openmp pam pcre pdf perl png ppds pppd python qt3support quicktime readline reflection sdl session spl sse sse2 ssl startup-notification svg sysfs tcpd tiff truetype unicode usb vorbis win32codecs x264 x86 xcomposite xml xorg xulrunner xv xvid xvmc zlib" ALSA_CARDS="intel8x0 intel8x0m" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="de" USERLAND="GNU" VIDEO_CARDS="fbdev radeon vesa vga vmware"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, MAKEOPTS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY
Comment 1 Peter Alfredsen (RETIRED) gentoo-dev 2009-04-01 22:38:47 UTC
Do:
FEATURES="nostrip" emerge -1 glibc and see if this doesn't go away.

*** This bug has been marked as a duplicate of bug 47576 ***
Comment 2 Stephan Krauß 2009-04-01 23:17:38 UTC
I had already re-emerged it with the debug USE flag, -g in CFLAGS, and the nostrip feature. That's why you can see the function names and source line numbers. But in case you changed something, I'm doing it again...
Comment 3 Stephan Krauß 2009-04-01 23:41:27 UTC
I re-emerged it again and the problem is still there. Maybe this is a regression? (Sorry for not finding bug 47576.)
Comment 4 Peter Alfredsen (RETIRED) gentoo-dev 2009-04-02 00:08:06 UTC
That's weird, reopening.
Comment 5 SpanKY gentoo-dev 2009-04-02 01:35:35 UTC
post some code that actually exhibits a problem.  valgrind has a history of not being reliable with the ldso.
Comment 6 Stephan Krauß 2009-04-02 21:23:50 UTC
Here is what I found out after a day of work:
The bug in ld.so seems to be triggered only under very special circumstances. My setup is as follows:
I have an archive of library code that does some calculations. This archive is linked statically into a binary created with LLVM. (LLVM links in other libraries, does some instrumentation, and compiles the code to a normal object file. After that, the archive is linked in.) When the resulting program is started, ld.so dynamically links some standard libraries and, in doing so, damages the code of the contained archive. (At least its calculations get corrupted. It's hard to track down what is actually going wrong: if I print out some intermediate values, the final result changes. And no, it's not my code's fault. Under Debian and Ubuntu, this doesn't happen.)
I've tried to reproduce this behaviour without LLVM being involved, without success. So it seems that ld.so has trouble handling the code generated by LLVM. Strangely, the same code works flawlessly on other distros. Well, that is not so surprising, as Valgrind does not find any reads from uninitialised memory there.
I suppose you won't want to bother with LLVM. Unfortunately, I can't provide non-LLVM code, as everything seems to work fine when LLVM is not involved.
I understand that you don't want to waste your time on a bug when you can't be sure it is really there and which, even if it exists, won't affect most users. (At least I hope so.)
I think it will be best if I test with newer versions of the glibc package, and if that doesn't help, I'll install a different Linux distribution.
Comment 7 SpanKY gentoo-dev 2009-04-03 17:18:38 UTC
that's still a high level description, not a set of instructions like:
 - download XXX files
 - run XXX commands
 - see XXX misbehavior
Comment 8 Stephan Krauß 2009-04-04 14:27:25 UTC
Created attachment 187298 [details]
Test code showing some misbehaviour

I finally figured out a way to reproduce it without LLVM. (It takes hours to compile LLVM; I don't want to put you through that.)
- download and extract the test archive
- run make to compile the example
- run ./test
- rename the emitted debug.log
- comment out or delete line 522 in calc.cpp (cout ...)
- re-run make and ./test
- make a diff of the old and new debug.log
- see different results at some places
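
Editor's sketch: the comparison step in the list above is normally done with the command-line diff tool. For readers who want to script it, the C++ sketch below reports the line numbers at which two logs differ. It assumes the two runs emit the same number of lines in the same order, so a plain line-by-line comparison is enough. `differing_lines` is a hypothetical helper name, not part of the attached test archive.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Return the 1-based numbers of the lines at which two logs differ.
// A line that exists in only one of the logs counts as a difference.
std::vector<std::size_t> differing_lines(std::istream& before,
                                         std::istream& after) {
    std::vector<std::size_t> result;
    std::string a;
    std::string b;
    std::size_t number = 0;
    for (;;) {
        const bool has_a = static_cast<bool>(std::getline(before, a));
        const bool has_b = static_cast<bool>(std::getline(after, b));
        if (!has_a && !has_b) {
            break;                     // both logs are exhausted
        }
        ++number;
        if (has_a != has_b || a != b) {
            result.push_back(number);  // content or length mismatch
        }
    }
    return result;
}
```

To compare real files, pass two `std::ifstream` objects opened on the renamed log and the new debug.log. An empty result means the two runs produced identical logs.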

Another way to get different results is to run it under Valgrind (with its memcheck tool). So it seems to be somehow memory-related. (Although I've spent a great deal of time looking for reads from uninitialised memory and couldn't find any in my code.)
By the way, it makes no difference where the cout line is put (as long as it is in the same method). It also doesn't matter what you print, or how often.
While I cut down the code, I sometimes even saw the results of the calculation change when I removed methods that were never called in this test.
By the way, you need Boost installed.
Unfortunately, I have not yet been able to produce a smaller piece of code that shows this behaviour. The problem is that it is hard to figure out what's going on if adding some output code changes the results. The strange thing is that not every value is wrong. Thus, I'm still wondering if it is my fault. But I have no idea what I could be doing in the code, besides reading from uninitialised memory, that could result in such strange behaviour.
Comment 9 Stephan Krauß 2009-04-04 14:29:44 UTC
Created attachment 187299 [details]
Test code showing some misbehaviour

Sorry for the wrong content type. Fixed it.
Comment 10 SpanKY gentoo-dev 2009-04-04 18:57:39 UTC
thanks for spending the time to put that together
Comment 11 SpanKY gentoo-dev 2009-04-06 01:04:34 UTC
i really dont think these warnings from glibc have any bearing on your test case misbehavior whatsoever.  they seem to be x86-specific ... or at least, valgrind doesnt whine on x86_64.  and the same warning exists in glibc-2.9.
Comment 12 Stephan Krauß 2009-04-06 20:46:36 UTC
I'm so sorry. I was totally wrong. This has nothing to do with ld.so. When I tested my code on non-Gentoo machines, I also compiled it on them (and had no issues). But if I use the binary built under Gentoo, I see the same strange behaviour on those non-Gentoo machines. Thus, it is clearly a compiler issue, and ld.so is not to blame. I will move to the recently stabilized GCC as soon as possible and hope that fixes it.
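
Editor's sketch: when the same source is built on several machines, as described above, it helps if each binary can report which compiler produced it, so that a log can be matched to a build. The C++ sketch below uses the version macros that GCC and Clang predefine. `compiler_stamp` is a hypothetical helper name and is not part of the attached test code. Writing its result at the top of debug.log would be one possible use.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Describe the compiler that built this translation unit, using the
// predefined version macros. Clang is checked first because it also
// defines __GNUC__ for compatibility.
std::string compiler_stamp() {
    std::ostringstream out;
#if defined(__clang__)
    out << "clang " << __clang_major__ << '.' << __clang_minor__ << '.'
        << __clang_patchlevel__;
#elif defined(__GNUC__)
    out << "gcc " << __GNUC__ << '.' << __GNUC_MINOR__ << '.'
        << __GNUC_PATCHLEVEL__;
#else
    out << "unknown compiler";
#endif
    return out.str();
}
```

The stamp identifies only the compiler version. Build flags such as CFLAGS would have to be recorded separately.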
Comment 13 Nessuno 2009-04-07 06:18:12 UTC
After an upgrade, my code:

--- program.cpp ---
int main(void) {
  return 0;
}
--- program.cpp ---

$ g++ -g program.cpp
$ valgrind ./a.out

produced the error, until i did:
# emerge -C valgrind
# emerge =valgrind-3.3.1      # after this I get no errors
# emerge -C valgrind
# emerge valgrind             # means version 3.4.0
after the reinstall I can't reproduce it.
(with gcc 4.1.2 on a Core2Duo in x86 mode (not 64 bit).)
Comment 14 SpanKY gentoo-dev 2009-04-10 08:53:34 UTC
yes, on my x86 systems where i could reproduce, i rebuilt glibc and no longer get the valgrind warnings
Comment 15 SpanKY gentoo-dev 2009-05-30 00:05:55 UTC
*** Bug 271596 has been marked as a duplicate of this bug. ***
Comment 16 Nikos Chantziaras 2009-06-13 16:47:25 UTC
I get a long list of errors with *any* executable in valgrind.  For example "valgrind ls" or "valgrind ldd" (ldd is a static executable).  Or anything else.  This is with glibc-2.10.1.

Although it's a different glibc version than the one this bug is about, I'm not opening a new bug, since someone already did (bug 271596) and it was marked as a duplicate of this one.

I'm attaching the errors and my emerge --info.
Comment 17 Nikos Chantziaras 2009-06-13 16:47:56 UTC
Created attachment 194570 [details]
valgrind ldd
Comment 18 Nikos Chantziaras 2009-06-13 16:48:23 UTC
Created attachment 194571 [details]
emerge --info
Comment 19 Nikos Chantziaras 2009-06-13 17:53:18 UTC
OK, it seems I found the solution.  Valgrind can't cope with the new, sse-optimized strlen function of glibc 2.10.1 on amd64.  Fedora is applying a patch for this:

  valgrind-3.4.1-x86_64-ldso-strlen.patch

as well as another, glibc 2.10.1 specific one:

  valgrind-3.4.1-glibc-2.10.1.patch

Those patches can be found at:

  http://cvs.fedoraproject.org/viewvc/rpms/valgrind/F-11

However, the strlen patch (the one we need) does not work with Gentoo unless glibc is emerged with debug symbols enabled:

valgrind:  Fatal error at startup: a function redirection
valgrind:  which is mandatory for this platform-tool combination
valgrind:  cannot be set up.  Details of the redirection are:
valgrind:
valgrind:  A must-be-redirected function
valgrind:  whose name matches the pattern:      strlen
valgrind:  in an object with soname matching:   ld-linux-x86-64.so.2
valgrind:  was not found whilst processing
valgrind:  symbols from the object with soname: ld-linux-x86-64.so.2
valgrind:
valgrind:  Possible fix: install glibc's debuginfo package on this machine.
valgrind:
valgrind:  Cannot continue -- exiting now.  Sorry.
Comment 20 SpanKY gentoo-dev 2009-06-19 22:41:25 UTC
could you file a new bug about that (as that would be assigned to the valgrind maintainer) and have it block this bug please ?
Comment 21 Nikos Chantziaras 2009-06-20 08:40:46 UTC
(In reply to comment #20)
> could you file a new bug about that (as that would be assigned to the valgrind
> maintainer) and have it block this bug please ?

Done.