Bug 637116 - sys-libs/glibc : Segmentation fault in on invalid ELF files
Summary: sys-libs/glibc : Segmentation fault in on invalid ELF files
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Toolchain Maintainers
Depends on:
Reported: 2017-11-10 23:58 UTC by Alexander Bezrukov
Modified: 2018-09-11 19:15 UTC (History)

See Also:
Package list:
Runtime testing required: ---

Example of shared object which causes crash (,4.41 KB, application/x-sharedlib)
2017-11-10 23:58 UTC, Alexander Bezrukov

Description Alexander Bezrukov 2017-11-10 23:58:16 UTC
Created attachment 503552 [details]
Example of shared object which causes crash

After upgrading to glibc-2.25 (a long, long while ago), or around that time, I started to get sporadic segfaults in After a scan with revdep-rebuild (now, as an example), my syslog is spammed with tons of messages like:
[1985463.850366][2352]: segfault at 4 ip 00000000f77602c7 sp 00000000ffd1b700 error 4 in[f7754000+23000]
[1985468.718062] show_signal_msg: 177 callbacks suppressed

Segfaults happen at different ip values and at different addresses.

Today I finally looked into the crash.

1. I managed to reduce the reproduction to this simple command (see the attachment):

$ /lib64/ --list
Segmentation fault

2. Backtrace:

(gdb) where
#0  0x00007ffff7ddfca1 in elf_get_dynamic_info (temp=0x0, l=0x7ffff7ffe120) at get-dynamic-info.h:102
#1  _dl_map_object_from_fd (name=name@entry=0x7fffffffe12e "/tmp/", origname=origname@entry=0x0, fd=<optimized out>, fbp=<optimized out>, 
    realname=<optimized out>, loader=loader@entry=0x0, l_type=0, mode=536870912, stack_endp=0x7fffffffc888, nsid=0) at dl-load.c:1200
#2  0x00007ffff7de2524 in _dl_map_object (loader=loader@entry=0x0, name=0x7fffffffe12e "/tmp/", type=type@entry=0, 
    trace_mode=trace_mode@entry=0, mode=mode@entry=536870912, nsid=nsid@entry=0) at dl-load.c:2199
#3  0x00007ffff7ddece0 in dl_main (phdr=<optimized out>, phnum=6, user_entry=0x7fffffffdd38, auxv=0x7fffffffdfc8) at rtld.c:1037
#4  0x00007ffff7df114e in _dl_sysdep_start (start_argptr=start_argptr@entry=0x7fffffffdde0, dl_main=dl_main@entry=0x7ffff7ddbd70 <dl_main>)
    at ../elf/dl-sysdep.c:253
#5  0x00007ffff7ddb909 in _dl_start_final (arg=0x7fffffffdde0) at rtld.c:399
#6  _dl_start (arg=0x7fffffffdde0) at rtld.c:505
#7  0x00007ffff7ddab48 in _start ()

3. I have a Gentoo host which produces these crashes (in normal everyday work) and one (also Gentoo) which doesn't. Both are amd64, both are mostly on stable with the same packages installed in the @system set, and both have the same versions of gcc and binutils. The difference is the CPU: the "bad" one is a K8 while the "good" one is a Core i5. CFLAGS on both include "-march=native". If I copy an .so from the "bad" host to the "good" one and run --list <this_bad_so_specimen>, then I reproduce the crash on the "good" host. That is, what causes the crash is the .so file, not the environment. Another difference is the kernel configs, and the .so files that produced the crashes I have caught so far belong to /usr/src/linux-*.

4. The corresponding snippet of code which produces the segfault is:

# define DT_HASH 4

# define ADJUST_DYN_INFO(tag) \
  do                                                               \
    if (info[tag] != NULL)                                         \
      {                                                            \
        if (temp)                                                  \
          {                                                        \
            temp[cnt].d_tag = info[tag]->d_tag;                    \
            temp[cnt].d_un.d_ptr = info[tag]->d_un.d_ptr + l_addr; \
            info[tag] = temp + cnt++;                              \
          }                                                        \
        else                                                       \
          info[tag]->d_un.d_ptr += l_addr;                         \
      }                                                            \
  while (0)


And in assembly:

   0x00007ffff7ddfc97 <+2055>:	mov    0x60(%r12),%rax
   0x00007ffff7ddfc9c <+2060>:	test   %rax,%rax
   0x00007ffff7ddfc9f <+2063>:	je     0x7ffff7ddfca5 <_dl_map_object_from_fd+2069>
=> 0x00007ffff7ddfca1 <+2065>:	add    %rcx,0x8(%rax)

What puzzles me is:

(gdb) p temp
$27 = (Elf64_Dyn *) 0x0
(gdb) p &info[4]
$28 = (Elf64_Dyn **) 0x7ffff7ffe180
(gdb) p &info[4]->d_un.d_ptr
$29 = (Elf64_Addr *) 0x7ffff7ff7370
(gdb) i r rax
rax            0x7ffff7ff7368	140737354101608
(gdb) p *info[4]
$30 = {d_tag = 4, d_un = {d_val = 288, d_ptr = 288}}
(gdb) p info[4]->d_un.d_ptr += l_addr
$31 = 140737354101024
(gdb) p *info[4]
$32 = {d_tag = 4, d_un = {d_val = 140737354101024, d_ptr = 140737354101024}}

So the memory (despite the VDSO page address) seems to be readable and writable. I see no obvious cause for the segmentation fault. I assume that after the access violation this page somehow becomes RW (maybe it is remapped in a signal handler), but I don't know how to quickly check this assumption.

5. Rebuilding glibc, gcc, binutils does not make the issue disappear.

6. On both ("good" and "bad") hosts:
>grep CONFIG_COMPAT_VDSO /usr/src/linux/.config

7. emerge --info (on "bad" host, to be specific)
$ emerge --info
Portage 2.3.8 (python 3.4.5-final-0, default/linux/amd64/13.0, gcc-5.4.0, glibc-2.25-r8, 4.9.57-alb x86_64)
System uname: Linux-4.9.57-alb-x86_64-Dual_Core_AMD_Opteron-tm-_Processor_290-with-gentoo-2.4.1
KiB Mem:    16536472 total,   2109828 free
KiB Swap:   50331636 total,  48188980 free
Timestamp of repository gentoo: Fri, 10 Nov 2017 15:15:01 +0000
Head commit of repository gentoo: 9974fd94f4fdc69679834e441c7f86a787effe8e
sh bash 4.3_p48-r1
ld GNU ld (Gentoo 2.28.1 p1.0) 2.28.1
app-shells/bash:          4.3_p48-r1::gentoo
dev-java/java-config:     2.2.0-r3::gentoo
dev-lang/perl:            5.24.3::gentoo
dev-lang/python:          2.7.14::gentoo, 3.4.5::gentoo, 3.5.4::gentoo
dev-util/cmake:           3.8.2::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.4.1-r2::gentoo
sys-apps/openrc:          0.32.1::gentoo
sys-apps/sandbox:         2.10-r4::gentoo
sys-devel/autoconf:       2.13::gentoo, 2.69::gentoo
sys-devel/automake:       1.11.6-r1::gentoo, 1.15-r2::gentoo
sys-devel/binutils:       2.28.1::gentoo
sys-devel/gcc:            5.4.0-r3::gentoo
sys-devel/gcc-config:     1.8-r1::gentoo
sys-devel/libtool:        2.4.6-r3::gentoo
sys-devel/make:           4.2.1::gentoo
sys-kernel/linux-headers: 4.9::gentoo (virtual/os-headers)
sys-libs/glibc:           2.25-r8::gentoo

    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://
    priority: -1000

    location: /usr/local/portage
    masters: gentoo
    priority: 0

    location: /var/lib/layman/vmware
    masters: gentoo
    priority: 50

ACCEPT_LICENSE="* -@EULA sun-bcla-java-vm Oracle-BCLA-JavaSE dlj-1.1 skype-eula skype- googleearth AdobeFlash-11.x Intel-SDP TeamViewer NVIDIA-CUDA NVIDIA-gdk ACML-EULA OPERA-12 RAR"
CFLAGS="-march=native -mtune=native -O2 -pipe -fomit-frame-pointer -finline-functions-called-once -ftree-vectorize"
CONFIG_PROTECT="/etc /etc/stunnel/stunnel.conf /usr/lib64/libreoffice/program/sofficerc /usr/share/gnupg/qualified.txt /usr/share/maven-bin-3.3/conf /usr/share/themes/oxygen-gtk/gtk-2.0"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php7.0/ext-active/ /etc/php/cgi-php7.0/ext-active/ /etc/php/cli-php7.0/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=native -mtune=native -O2 -pipe -fomit-frame-pointer -finline-functions-called-once -ftree-vectorize"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync multilib-strict news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-march=native -mtune=native -O2 -pipe -fomit-frame-pointer -finline-functions-called-once -ftree-vectorize -fprefetch-loop-arrays -funroll-loops -fno-stack-protector"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
USE="X a52 aac aacs acl acpi alsa amd64 amr apng asm bash-completion berkdb bidi bluray bundled-libs bzip2 caps cdda cdr celt cjk cli cracklib crypt cryptsetup cscope cups cxx dbus dirac djvu dri dv dvb dvd dvdr dvdread eselect exif faac ffmpeg flac fontconfig fortran g726 g729 gdbm gif gimp gmp gpm gsm gsm-nonstandard gtk http iconv icu idn ieee1394 ilbc jpeg jpeg2k lame lcms ldap ldapsam libnotify lm_sensors lock logrotate mad matroska mmap mms mng modules mp3 mpeg multilib musepack ncurses nls nodrm nptl nsplugin numa ogg opencl opengl openmp opus pam pcre pkcs11 png qt5 readline samba seccomp session silk srtp ssl startup-notification taglib tcpd theora threads thunar tiff timidity truetype udev unicode usb vcd vdpau vim-syntax visio vorbis vpx wavpack winbind wmf wpg x264 xattr xcomposite xinerama xmp xv xvid xvmc zlib" ABI_X86="32 64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="alias auth_basic auth_digest authn_alias authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_user autoindex dir env expires filter headers deflate info log_config logio mime mime_magic negotiation status unique_id userdir rewrite reqtimeout proxy proxy_connect proxy_http authn_core authz_core unixd socache_shmcb" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="3dnow 3dnowext mmx mmxext sse sse2 sse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="evdev wacom" KERNEL="linux" L10N="en fa ru" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 
lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="en fa ru" NGINX_MODULES_HTTP="auth_pam access auth_basic autoindex browser charset fastcgi fancyindex geoip gzip headers_more limit_conn limit_req proxy referer rewrite scgi stub_status" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-0" POSTGRES_TARGETS="postgres9_5" PYTHON_SINGLE_TARGET="python3_4" PYTHON_TARGETS="python2_7 python3_4" RUBY_TARGETS="ruby22" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Comment 1 Alexander Bezrukov 2017-11-11 00:53:23 UTC
I realized what made this from linux kernel "toxic". It is


on the "bad" host. On the "good" host the selection is


(Both settings are on purpose.)

This also probably explains why memory becomes read-write after the fault.

Anyway, in my opinion, should not segfault with any input.
Comment 2 Andreas K. Hüttel gentoo-dev 2017-11-11 15:31:44 UTC
Yep, that looks like a bug.
Comment 3 Sergei Trofimovich gentoo-dev 2017-11-11 20:02:17 UTC
(In reply to Alexander Bezrukov from comment #1)
> I realized what made this from linux kernel "toxic". It is
> on the "bad" host. On the "good" host the selection is
> (Both settings are on purpose.)
> This also probably explains why memory becomes read-write after the fault.
> Anyway, in my opinion, should not segfault with any input.

I suggest reporting the bug directly upstream. I would be very wary of adding Gentoo-specific code to the early startup phase; it is very easy to break. does minimal to no validation of mapped ELF files:
libc is not loaded, relocations are not yet processed.
It's a very sensitive piece of code both from performance
and fragility standpoints. But maybe it can be tweaked for
this particular case.;a=blob;f=elf/dl-load.c;h=1220183ce29f83668d2044dc25093c08184335fe;hb=HEAD#l857

Note how little it does before actually crashing:

$ LD_DEBUG=all /lib/ --list ./ 
     24623:     file=./ [0];  generating link map

$ strace -f /lib/ --list ./ 
execve("/lib/", ["/lib/", "--list", "./"], 0x7ffe60180838 /* 80 vars */) = 0
brk(NULL)                               = 0x7fa529126000
openat(AT_FDCWD, "./", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\t\0\0\0\0\0\0"..., 832) = 832
lseek(3, 1960, SEEK_SET)                = 1960
read(3, "\6\0\0\0\4\0\0\0\0\0\0\0Linux\0\0\0<\t\4\0\4\0\0\0\24\0\0\0"..., 60) = 60
fstat(3, {st_mode=S_IFREG|0755, st_size=4512, ...}) = 0
getcwd("/home/slyfox/Downloads", 128)   = 23
mmap(NULL, 3209, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fa5277bc000
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7fa5277bc370} ---
+++ killed by SIGSEGV (core dumped) +++
Comment 4 Andreas K. Hüttel gentoo-dev 2018-09-11 14:58:13 UTC
From comment #3 I assume that this is not restricted to 2.25 ...

When you file an upstream bug report, please link to it here!