Bug 599894 - sys-apps/sandbox-2.11: programs misbehave/crash when prelinked
Summary: sys-apps/sandbox-2.11: programs misbehave/crash when prelinked
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Sandbox
Hardware: All Linux
Importance: Normal normal with 1 vote
Assignee: Sandbox Maintainers
URL:
Whiteboard:
Keywords:
Duplicates: 600048 636074
Depends on:
Blocks:
 
Reported: 2016-11-15 16:43 UTC by Anders Larsson
Modified: 2019-06-22 07:46 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Generated sandbox log (sandbox-11125.log, 7.17 KB, text/plain)
2016-11-15 16:43 UTC, Anders Larsson
Full build log (previously included as paste link) (build.log, 18.27 KB, text/plain)
2016-11-16 15:06 UTC, Anders Larsson
Emerge --info (emergeinfo, 5.36 KB, text/plain)
2016-11-16 15:07 UTC, Anders Larsson
readelf -a -W pypy-prelinked (readelf-pypy-prelinked, 39.08 KB, text/plain)
2016-11-17 12:32 UTC, Marien Zwart
readelf -a -W pypy-unprelinked (readelf-pypy-unprelinked, 9.37 KB, text/plain)
2016-11-17 12:33 UTC, Marien Zwart

Description Anders Larsson 2016-11-15 16:43:56 UTC
Created attachment 453392 [details]
Generated sandbox log

After updating to sys-apps/sandbox-2.11-r2 it was no longer possible to merge packages with sandbox enabled. After downgrading to sys-apps/sandbox-2.10-r2 everything works again.

To downgrade to sys-apps/sandbox-2.10-r2, sandbox had to be disabled (FEATURES='-sandbox -usersandbox'). Thanks to FireBurn on IRC.
Comment 1 Anders Larsson 2016-11-15 16:46:35 UTC
Please see the log below from my attempt to merge dev-perl/XML-LibXML; the same failure occurred with every package.
https://paste.pound-python.org/show/g3PZRUZR5B2nyIuayzns/
Comment 2 SpanKY gentoo-dev 2016-11-15 23:15:06 UTC
please attach the full build log, and emerge info
Comment 3 Kenton Groombridge 2016-11-16 02:33:01 UTC
I am guessing an incompatibility with prelink.  If you are prelinking your files, undo them and check again.
Comment 4 Marien Zwart 2016-11-16 10:40:28 UTC
This does look like a problem with prelink.

I've made two copies of /bin/true and ran "prelink -u" on one of them. In a sandbox shell (/usr/bin/sandbox) from sandbox-2.10-r2, both work. If I upgrade to 2.11-r2, the unprelinked binary works while the prelinked one segfaults. Running emerge results in error message spew similar to what Anders Larsson reports.

It looks like running emerge also results in things segfaulting. gdb on one of the resulting cores reports:

#0  0x00007f743d78272a in sb_check_exec (filename=0xe64c60 "/usr/bin/tr", argv=0xe722c0)
    at /var/tmp/portage/sys-apps/sandbox-2.11-r2/work/sandbox-2.11/libsandbox/wrapper-funcs/__wrapper_exec.c:175
#1  0x00007f743d782a8b in execve_DEFAULT (path=0xe64c60 "/usr/bin/tr", argv=0xe722c0, envp=0xe70680)
    at /var/tmp/portage/sys-apps/sandbox-2.11-r2/work/sandbox-2.11/libsandbox/wrapper-funcs/__wrapper_exec.c:256

I assume the rest of the frames are not interesting, but let me know if you want them.

We're crashing at:
175                             PARSE_ELF(64);

but that's not helpful as PARSE_ELF is an almost 100-line macro.

Judging from the locals, we've made it to the final loop in that macro. Some interesting locals:

(gdb) p vhash
$25 = 0
(gdb) p/x symoff
$29 = 0x318
(gdb) p/x stroff
$28 = 0xa164

As mentioned in the comments, the code assumes the symbol table runs until the start of the sysv hash table (if present) or the string table (if not). That assumption is not valid for this binary:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        0000000000400270 000270 00001c 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            000000000040028c 00028c 000020 00   A  0   0  4
  [ 3] .gnu.hash         GNU_HASH        00000000004002b0 0002b0 000064 00   A  4   0  8
  [ 4] .dynsym           DYNSYM          0000000000400318 000318 000660 18   A 17   1  8
  [ 5] .gnu.liblist      GNU_LIBLIST     0000000000400978 000978 000028 14   A 17   0  4
  [ 6] .gnu.version      VERSYM          0000000000400c64 000c64 000088 02   A  4   0  2
  [ 7] .gnu.version_r    VERNEED         0000000000400cf0 000cf0 000060 00   A 17   1  8
  [ 8] .rela.dyn         RELA            0000000000400d50 000d50 0000a8 18   A  4   0  8
  [ 9] .rela.plt         RELA            0000000000400df8 000df8 000588 18  AI  4  11  8
  [10] .init             PROGBITS        0000000000401380 001380 00001a 00  AX  0   0  4
  [11] .plt              PROGBITS        00000000004013a0 0013a0 0003c0 10  AX  0   0 16
  [12] .text             PROGBITS        0000000000401760 001760 005539 00  AX  0   0 16
  [13] .fini             PROGBITS        0000000000406c9c 006c9c 000009 00  AX  0   0  4
  [14] .rodata           PROGBITS        0000000000406cc0 006cc0 0022ae 00   A  0   0 32
  [15] .eh_frame_hdr     PROGBITS        0000000000408f70 008f70 0002c4 00   A  0   0  4
  [16] .eh_frame         PROGBITS        0000000000409238 009238 000f2c 00   A  0   0  8
  [17] .dynstr           STRTAB          000000000040a164 00a164 000308 00   A  0   0  1
  [18] .gnu.conflict     RELA            000000000040a470 00a470 0003f0 18   A  4   0  8
  [19] .init_array       INIT_ARRAY      000000000060ae10 00ae10 000008 00  WA  0   0  8
  [20] .fini_array       FINI_ARRAY      000000000060ae18 00ae18 000008 00  WA  0   0  8
  [21] .jcr              PROGBITS        000000000060ae20 00ae20 000008 00  WA  0   0  8
  [22] .dynamic          DYNAMIC         000000000060ae28 00ae28 0001d0 10  WA 17   0  8
  [23] .got              PROGBITS        000000000060aff8 00aff8 000008 08  WA  0   0  8
  [24] .got.plt          PROGBITS        000000000060b000 00b000 0001f0 08  WA  0   0  8
  [25] .data             PROGBITS        000000000060b200 00b200 000074 00  WA  0   0 32
  [26] .dynbss           PROGBITS        000000000060b280 00b280 000048 00  WA  0   0 32
  [27] .bss              NOBITS          000000000060b2c8 00b2c8 002478 00  WA  0   0 32
  [28] .gnu_debuglink    PROGBITS        0000000000000000 00b2c8 000010 00      0   0  1
  [29] .gnu.prelink_undo PROGBITS        0000000000000000 00b2d8 0008f0 01      0   0  8
  [30] .shstrtab         STRTAB          0000000000000000 00bbc8 000120 00      0   0  1

Notice symoff is the offset for the DYNSYM section, which is the 4th one, while stroff is the offset for section 17. The loop is going to try to process sections 5-16 as if they're part of the symbol table, which won't work so well.

And sure enough:

(gdb) p/x (void*)sym-elf
$32 = 0x990

the symbol we're processing is past the end of the symbol table, partway into the GNU_LIBLIST section. We're probably segfaulting dereferencing symname:

(gdb) p symname
$33 = 0x7f749581dd33 <error: Cannot access memory at address 0x7f749581dd33>

Comparing readelf output for a prelinked and an unprelinked file confirms that the unprelinked one has its strtab section immediately after the dynsym section, as the code expects.
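
The overrun can be checked with plain arithmetic. What follows is a minimal Python sketch, not sandbox's code: it uses symoff and stroff from the gdb session above, plus the .dynsym size (0x660) and entry size (0x18) from the readelf listing, to compare the entry count implied by the "symbol table runs up to the string table" assumption with the real one. The function names are hypothetical.

```python
ENTSIZE = 0x18  # size of one Elf64_Sym entry (the ES column of .dynsym)


def count_by_section_size(dynsym_size, entsize=ENTSIZE):
    """Real number of entries, taken from the section header's size field."""
    return dynsym_size // entsize


def count_by_heuristic(symoff, end_marker, entsize=ENTSIZE):
    """Entries the walker visits if it assumes the table ends at end_marker."""
    return (end_marker - symoff) // entsize


def index_of(offset, symoff, entsize=ENTSIZE):
    """Index of the symbol-table slot that a given file offset falls into."""
    return (offset - symoff) // entsize


if __name__ == "__main__":
    symoff, stroff = 0x318, 0xA164  # values printed by gdb in this comment
    print("real entries:     ", count_by_section_size(0x660))
    print("heuristic entries:", count_by_heuristic(symoff, stroff))
    print("crashing slot:    ", index_of(0x990, symoff))
```

With these inputs the real table holds 68 entries (indices 0 to 67), the heuristic would walk 1688, and the offset 0x990 reported by gdb is slot 69, which already lies inside .gnu.liblist.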

The code claims "the size of the symbol table" isn't recorded. I'm not particularly familiar with ELF,  but isn't the sh_size field in the section header exactly that? Shouldn't that be used to calculate symend?

The code also claims glibc makes the same assumption it does. I haven't tracked that bit down yet.
Comment 5 Anders Larsson 2016-11-16 15:06:26 UTC
Created attachment 453518 [details]
Full build log (previously included as paste link)
Comment 6 Anders Larsson 2016-11-16 15:07:02 UTC
Created attachment 453520 [details]
Emerge --info
Comment 7 SpanKY gentoo-dev 2016-11-16 15:32:31 UTC
thanks guys, i can reproduce that behavior over here.  i'll take a look.
Comment 8 cristiano04 2016-11-16 15:52:54 UTC
Anything above 2.10-r2 segfaults here, including git, which I compiled myself. Segfaults (as per dmesg) and fails to create the work directory because of it. Compiling old version by hand fixes it.
Comment 9 Mike Lothian 2016-11-16 17:04:04 UTC
Compiling the old version with FEATURES="-sandbox -usersandbox" gets it working again
Comment 10 SpanKY gentoo-dev 2016-11-16 20:44:32 UTC
(In reply to Marien Zwart from comment #4)
> The code claims "the size of the symbol table" isn't recorded. I'm not
> particularly familiar with ELF,  but isn't the sh_size field in the section
> header exactly that? Shouldn't that be used to calculate symend?

while the symbol table has that info, it's not what's used at runtime, nor is a symbol table required at all to execute.  the only info the ldso uses and requires are program headers and dynamic tags so those are the only things sandbox looks at.  it's also possible, albeit unlikely, that the section headers do not match the program headers.

there is no info at runtime available that says for sure the symbol end.
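
To illustrate the point about dynamic tags, here is a hedged Python sketch (not sandbox's implementation) that scans a raw ELF64 little-endian dynamic section for the table-related tags. The tag values are the standard ones from elf.h; the helper name read_dynamic_tags is hypothetical. Note that the tag set includes DT_STRSZ for the string table, but there is no equivalent tag giving the size of the symbol table.

```python
import struct

# Standard dynamic tag values from elf.h.
DT_NULL = 0
DT_HASH = 4
DT_STRTAB = 5
DT_SYMTAB = 6
DT_STRSZ = 10
DT_GNU_HASH = 0x6FFFFEF5

WANTED = {
    DT_HASH: "DT_HASH",
    DT_STRTAB: "DT_STRTAB",
    DT_SYMTAB: "DT_SYMTAB",
    DT_STRSZ: "DT_STRSZ",
    DT_GNU_HASH: "DT_GNU_HASH",
}


def read_dynamic_tags(dynamic):
    """Collect table-related tags from a raw array of Elf64_Dyn entries.

    Each entry is a signed 64-bit d_tag followed by an unsigned 64-bit value.
    The buffer length must be a multiple of 16 bytes. Scanning stops at the
    first DT_NULL, as the dynamic loader does.
    """
    found = {}
    for d_tag, d_val in struct.iter_unpack("<qQ", dynamic):
        if d_tag == DT_NULL:
            break
        if d_tag in WANTED:
            found[WANTED[d_tag]] = d_val
    return found
```

Fed the PT_DYNAMIC contents of a binary, this returns at most where the tables start; how many symbols DT_SYMTAB holds has to be derived some other way.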

> The code also claims glibc makes the same assumption it does. I haven't
> tracked that bit down yet.

"just" grep the glibc source:
https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/dl-addr.c;h=1b16a58cedaa534318aa33039181350c40c69db2#l82

i need to implement support for walking the .gnu.hash table.
Comment 11 SpanKY gentoo-dev 2016-11-16 22:21:07 UTC
turns out i can continue to delay on implementing hash table support ... prelink itself replaces the string table with its own liblist table, so sandbox can use that as an end marker.

should be fixed in 2.11-r3.

https://gitweb.gentoo.org/proj/sandbox.git/commit/?id=3ff625739ab2660e7f0adeb99f75ee44c20fef09
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=8ec3e4da8a051ed983a770b719b10730038474bb
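
The end-marker idea in comment 11 can be sketched roughly as follows. This is an assumption about the general approach, not a transcription of the linked commit: the hypothetical helper picks the nearest known table that starts after the symbol table and treats it as the end. The offsets are those of the prelinked /bin/true from comment 4.

```python
ENTSIZE = 0x18  # size of one Elf64_Sym entry


def symbol_count_until_marker(symoff, candidate_offsets, entsize=ENTSIZE):
    """Guess the symbol count by stopping at the nearest following table.

    candidate_offsets holds the file offsets of tables that might directly
    follow the symbol table (hash table, string table, prelink's liblist).
    Offsets at or before symoff cannot be an end marker and are ignored.
    """
    ends = [off for off in candidate_offsets if off > symoff]
    if not ends:
        raise ValueError("no table found after the symbol table")
    return (min(ends) - symoff) // entsize


if __name__ == "__main__":
    symoff, gnu_hash, stroff, liblist = 0x318, 0x2B0, 0xA164, 0x978
    # Without the liblist the walker overshoots; with it, the count is exact.
    print(symbol_count_until_marker(symoff, [gnu_hash, stroff]))
    print(symbol_count_until_marker(symoff, [gnu_hash, stroff, liblist]))
```

This only works while some known table sits directly behind .dynsym, so it remains a heuristic.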
Comment 12 Marien Zwart 2016-11-17 08:53:59 UTC
Thanks, confirmed that fixes it for me. Also thanks for the glibc source link and explanation.
Comment 13 Marien Zwart 2016-11-17 12:32:58 UTC
Created attachment 453644 [details]
readelf -a -W pypy-prelinked
Comment 14 Marien Zwart 2016-11-17 12:33:27 UTC
Created attachment 453646 [details]
readelf -a -W pypy-unprelinked
Comment 15 Marien Zwart 2016-11-17 12:46:18 UTC
I spoke too soon: I found a binary on my system that still crashes when prelinked with sandbox-2.11-r3 :(

Unfortunately the binary is pypy (from dev-python/pypy-5.4.1), which takes inconveniently long to build if you want to reproduce the problem.

Comparing the readelf output for an unprelinked and prelinked version, we see the unprelinked version has strtab after dynsym, as sandbox expects:

  [ 4] .dynsym           DYNSYM          00000000004002c0 0002c0 000120 18   A  5   1  8
  [ 5] .dynstr           STRTAB          00000000004003e0 0003e0 0000dd 00   A  0   0  1
  [ 6] .gnu.version      VERSYM          00000000004004be 0004be 000018 02   A  4   0  2
  [ 7] .gnu.version_r    VERNEED         00000000004004d8 0004d8 000020 00   A  5   1  8
(skipping...)
  [17] .eh_frame         PROGBITS        0000000000400758 000758 0000ec 00   A  0   0  8
  [18] .init_array       INIT_ARRAY      0000000000600de0 000de0 000008 00  WA  0   0  8

But when prelinked, instead of replacing the strtab with the liblist, prelink left a hole. Then it put the strtab and liblist much further down:

  [ 5] .dynstr           STRTAB          00000000004003e0 0003e0 0000dd 00   A  0   0  1
  [ 6] .gnu.version      VERSYM          00000000004004be 0004be 000018 02   A  4   0  2
(skipping...)
  [16] .eh_frame         PROGBITS        0000000000400758 000758 0000ec 00   A  0   0  8
  [17] .dynstr           STRTAB          0000000000400844 000844 000197 00   A  0   0  1
  [18] .gnu.liblist      GNU_LIBLIST     00000000004009dc 0009dc 000140 14   A 17   0  4
  [19] .init_array       INIT_ARRAY      0000000000600de0 000de0 000008 00  WA  0   0  8

Looking at the sizes of these sections, I think this happened because the new liblist is larger than the old strtab (maybe it actually always first grows strtab, having to move it down, and then puts liblist in the first available hole, which is usually the one just left by strtab?). I haven't spotted binaries other than pypy affected by this yet.
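
The size comparison can be spelled out with the numbers from the two listings. This is a small Python sketch under one stated assumption: .dynsym is taken to sit at the same offset (0x2c0, size 0x120) in the prelinked file as in the unprelinked one, since the prelinked excerpt does not show it.

```python
ENTSIZE = 0x18  # size of one Elf64_Sym entry

# .dynsym from the unprelinked listing (assumed unchanged by prelink).
DYNSYM_OFF, DYNSYM_SIZE = 0x2C0, 0x120
# Original .dynstr, which sits directly behind .dynsym.
OLD_STRTAB_OFF, OLD_STRTAB_SIZE = 0x3E0, 0xDD
# Relocated .dynstr and the new .gnu.liblist from the prelinked listing.
NEW_STRTAB_OFF, NEW_STRTAB_SIZE = 0x844, 0x197
LIBLIST_OFF, LIBLIST_SIZE = 0x9DC, 0x140

# Would the liblist fit in the space the old string table occupied?
liblist_fits_in_hole = LIBLIST_SIZE <= OLD_STRTAB_SIZE

# Real symbol count versus the count obtained by stopping at the nearest
# of the two relocated tables.
real_count = DYNSYM_SIZE // ENTSIZE
nearest_marker = min(NEW_STRTAB_OFF, LIBLIST_OFF)
guessed_count = (nearest_marker - DYNSYM_OFF) // ENTSIZE

if __name__ == "__main__":
    print("liblist fits in old strtab hole:", liblist_fits_in_hole)
    print("real entries:", real_count, "guessed entries:", guessed_count)
```

The liblist (0x140 bytes) is larger than the old string table (0xdd bytes), and a walker that stops at the relocated string table would visit 58 slots in a table that holds only 12.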

I've attached the readelf output for pypy, and for the same pypy with prelinking undone. Let me know if you need more information.
Comment 16 SpanKY gentoo-dev 2016-11-17 14:13:18 UTC
*** Bug 600048 has been marked as a duplicate of this bug. ***
Comment 17 SpanKY gentoo-dev 2016-11-17 21:28:10 UTC
(In reply to Marien Zwart from comment #15)

i think your truncated readelf output there for the prelinked ELF is broken, but the full readelf output you attached looks good, so don't worry about it.

it does look like prelink relocated the contents elsewhere and either left the old content untouched or zeroed it.  either way, can't rely on that.

i'll have to implement gnu hash table walking to match glibc, but if it's much more uncommon now, i won't super prioritize it.  got some more research & documentation to write on the topic first (since `man 5 elf` doesn't cover any of these sections/segments).
Comment 18 Steven Newbury 2017-06-11 09:40:12 UTC
(In reply to SpanKY from comment #17)
> (In reply to Marien Zwart from comment #15)
> 
> i think your truncated readelf output there for the prelinked ELF is broken,
> but the full readelf output you attached looks good, so don't worry about it.
> 
> it does look like prelink relocated the contents elsewhere and either left
> the old content untouched or zeroed it.  either way, can't rely on that.
> 
> i'll have to implement gnu hash table walking to match glibc, but if it's
> much more uncommon now, i won't super prioritize it.  got some more research
> & documentation to write on the topic first (since `man 5 elf` doesn't cover
> any of these sections/segments).

Glad I found this bug, I've been trying to debug why I've been unable to upgrade sandbox to 2.11 which has been frustrating since I maintain a VAAPI enabled Chromium ebuild!  I use pypy as my default python with portage, and auto-prelink* on emerge, so upgrading sandbox resulted in segfaults immediately on any portage operation.

Will the hash table walking be implemented before 2.11 is unmasked?

* It's an experimental new feature I'm working on :-)
Comment 19 Martin von Gagern 2017-12-08 12:26:26 UTC
I've been bitten by this, too. First in bug 636074 with /usr/sbin/gencmn (binary attached there) crashing which prevents libreoffice from building. And just now gnuplot fails to build as it can't detect the fontforge version due to this:

# sandbox /bin/sh -c '/usr/bin/fontforge --version'
Sandboxed process killed by signal: Segmentation fault
Segmentation fault

Thanks for pointing out that sandbox 2.10 is not affected, will stick to that for the time being. But a proper fix would be appreciated.
Comment 20 Martin von Gagern 2017-12-08 12:28:56 UTC
*** Bug 636074 has been marked as a duplicate of this bug. ***
Comment 21 Sergei Trofimovich (RETIRED) gentoo-dev 2019-06-22 07:46:48 UTC
DT_GNU_HASH support was implemented in https://gitweb.gentoo.org/proj/sandbox.git/commit/?id=c8146cfbcd36f9be4a447bf057811fe2f6c543b2

Sandbox no longer relies on section ordering to determine the size of the symbol table.
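
For readers unfamiliar with the technique, the following Python sketch shows the commonly used way to derive the dynamic symbol count from a DT_GNU_HASH table. It is an illustration under stated assumptions (little-endian data, 64-bit bloom words by default) and not the code from the linked commit; the function name is hypothetical.

```python
import struct


def gnu_hash_symbol_count(table, bloom_word_size=8):
    """Return the number of dynamic symbols implied by a GNU hash table.

    Layout: four 32-bit header words (nbuckets, symoffset, bloom_size,
    bloom_shift), then bloom_size bloom words, nbuckets 32-bit buckets,
    and one 32-bit chain entry per hashed symbol.
    """
    nbuckets, symoffset, bloom_size, _shift = struct.unpack_from("<4I", table, 0)
    buckets_off = 16 + bloom_size * bloom_word_size
    buckets = struct.unpack_from("<%dI" % nbuckets, table, buckets_off)
    chains_off = buckets_off + 4 * nbuckets

    # The bucket with the highest start index owns the last chain.
    last = max(buckets, default=0)
    if last < symoffset:
        # No hashed symbols: only the unhashed leading ones exist.
        return symoffset

    # Walk that chain; the low bit of a chain entry marks its final symbol.
    while True:
        (entry,) = struct.unpack_from("<I", table, chains_off + 4 * (last - symoffset))
        if entry & 1:
            return last + 1
        last += 1
```

Because the count comes from the hash table's own contents, it does not depend on which section happens to follow .dynsym in the file.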