Bug 820071 - sys-libs/libcap[abi_x86_32]: tests segfault (with glibc-2.34 in fstatat64_time64_statx on amd64/hardened?)
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords: TESTFAILURE
Depends on:
Blocks: glibc-2.34
Reported: 2021-10-24 21:52 UTC by Sam James
Modified: 2024-08-14 16:01 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
build.log (file_820071.txt,53.40 KB, text/plain)
2021-10-24 21:52 UTC, Sam James
.config (5.10.75, gentoo-kernel[hardened]) (file_820071.txt,229.09 KB, text/plain)
2021-10-25 08:44 UTC, Sam James
build.log + emerge --info (amd64 with glibc-2.33-r7) (build.log-emerge-info.txt,72.72 KB, text/plain)
2021-10-26 06:04 UTC, Ionen Wolkens
Dockerfile (gentoo, glibc-2.34, build libcap from git) (file_820071.txt,380 bytes, text/plain)
2021-11-11 06:58 UTC, Sam James
Dockerfile (gentoo, musl, build libcap from git) (file_820071.txt,252 bytes, text/plain)
2021-11-13 21:59 UTC, Sam James

Description Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-10-24 21:52:54 UTC
Created attachment 746538 [details]
build.log

gdb wasn't super helpful:
```
Reading symbols from ./libcap.so...
(gdb) r
Starting program: /var/tmp/portage/sys-libs/libcap-2.57/work/libcap-2.57-abi_x86_32.x86/libcap/libcap.so

Program received signal SIGSEGV, Segmentation fault.
0xf7e80b44 in ?? ()
(gdb) bt
#0  0xf7e80b44 in ?? ()
#1  0x00000000 in ?? ()
(gdb) q
A debugging session is active.

	Inferior 1 [process 3933492] will be killed.

Quit anyway? (y or n) y
```

But from coredumpctl:
```

       Message: Process 3933170 (libcap.so) of user 250 dumped core.

                Found module /var/tmp/portage/sys-libs/libcap-2.57/work/libcap-2.57-abi_x86_32.x86/libcap/libcap.so.2.57 without build-id.
                Found module /lib/libc.so.6 without build-id.
                Found module /usr/lib/libsandbox.so without build-id.
                Found module linux-gate.so.1 with build-id: 49b4e744799734796173c78e9030bb29c2d3354e
                Found module /lib/ld-linux.so.2 without build-id.
                Stack trace of thread 147:
                #0  0x00000000f7dfdb44 fstatat64_time64_statx (/lib/libc.so.6 + 0x106b44)
                #1  0x00000000f7dfd65d __GI___stat64_time64 (/lib/libc.so.6 + 0x10665d)
                #2  0x00000000f7f2021b fopen64 (/usr/lib/libsandbox.so + 0x821b)
                #3  0x000000005657818b n/a (/var/tmp/portage/sys-libs/libcap-2.57/work/libcap-2.57-abi_x86_32.x86/libcap/libcap.so.2.57 + 0x718b)
                #4  0x000000005657836c n/a (/var/tmp/portage/sys-libs/libcap-2.57/work/libcap-2.57-abi_x86_32.x86/libcap/libcap.so.2.57 + 0x736c)
```
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-10-24 21:53:15 UTC
Portage 3.0.28 (python 3.10.0-final-0, default/linux/amd64/17.1/desktop/plasma/systemd, gcc-11.2.0, glibc-2.34, 5.10.75-gentoo-dist-hardened x86_64)
=================================================================
System uname: Linux-5.10.75-gentoo-dist-hardened-x86_64-AMD_Ryzen_9_3950X_16-Core_Processor-with-glibc2.34
KiB Mem:    16365060 total,   1914612 free
KiB Swap:   25067512 total,  25033720 free
Timestamp of repository gentoo: Sun, 24 Oct 2021 21:21:30 +0000
Head commit of repository gentoo: 8a3e7257bcd00a7251ea616e8c823acfa05b9e15

Timestamp of repository kde: Sun, 24 Oct 2021 16:36:49 +0000
Head commit of repository kde: 5102652de8f29bfa82b2b992ed5258a7c6e7588a

Timestamp of repository qt: Thu, 21 Oct 2021 00:34:37 +0000
Head commit of repository qt: 898c77083ea7795a655a5c5c8c8c1dd3030366a4

Timestamp of repository steam-overlay: Sun, 24 Oct 2021 16:36:44 +0000
Head commit of repository steam-overlay: 022a274069862bab24ce4d1a83b6372eea5b3988

sh dash 0.5.11.5
ld GNU ld (Gentoo 2.37_p1 p0) 2.37
ccache version 4.4.2 [disabled]
app-shells/bash:          5.1_p8::gentoo
dev-lang/perl:            5.34.0-r5::gentoo
dev-lang/python:          3.8.12_p1::gentoo, 3.9.7_p1::gentoo, 3.10.0_p1::gentoo
dev-lang/rust-bin:        1.56.0::gentoo
dev-util/ccache:          4.4.2::gentoo
dev-util/cmake:           3.21.3::gentoo
sys-apps/baselayout:      2.8::gentoo
sys-apps/sandbox:         2.27::gentoo
sys-devel/autoconf:       2.13-r1::gentoo, 2.71-r1::gentoo
sys-devel/automake:       1.16.5::gentoo
sys-devel/binutils:       2.37_p1::gentoo
sys-devel/gcc:            9.4.0::gentoo, 10.3.0-r2::gentoo, 11.2.0::gentoo
sys-devel/gcc-config:     2.4::gentoo
sys-devel/libtool:        2.4.6-r6::gentoo
sys-devel/make:           4.3::gentoo
sys-kernel/linux-headers: 5.14::gentoo (virtual/os-headers)
sys-libs/glibc:           2.34::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: git
    sync-uri: git://github.com/gentoo-mirror/gentoo.git
    priority: -1000
    eclass-overrides: sam_c
    sync-git-verify-commit-signature: yes
    sync-git-clone-extra-opts: -b stable -c gc.reflogExpire=0 -c gc.reflogExpireUnreachable=0 -c gc.rerereresolved=0 -c gc.rerereunresolved=0 -c gc.pruneExpire=now

crossdev
    location: /var/db/repos/crossdev
    masters: gentoo
    eclass-overrides: sam_c

kde
    location: /var/db/repos/kde
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/kde.git
    masters: gentoo
    eclass-overrides: sam_c

qt
    location: /var/db/repos/qt
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/qt.git
    masters: gentoo
    eclass-overrides: sam_c

sam_c
    location: /home/sam/git/overlay
    masters: gentoo
    eclass-overrides: sam_c

steam-overlay
    location: /var/db/repos/steam-overlay
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/steam-overlay.git
    masters: gentoo
    eclass-overrides: sam_c

ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="@FREE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native -fdiagnostics-color=always -frecord-gcc-switches"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -pipe -march=native -fdiagnostics-color=always -frecord-gcc-switches"
DISTDIR="/var/cache/distfiles"
EMERGE_DEFAULT_OPTS="--keep-going --with-bdeps=y --complete-graph --deep --changed-deps-report=y"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe -march=native -fdiagnostics-color=always -frecord-gcc-switches"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs compressdebug config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox mount-sandbox multilib-strict network-sandbox news parallel-fetch parallel-install pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe -march=native -fdiagnostics-color=always -frecord-gcc-switches"
GENTOO_MIRRORS="http://mirror.bytemark.co.uk/gentoo/ http://www.mirrorservice.org/sites/distfiles.gentoo.org/ http://mirrors.soeasyto.com/distfiles.gentoo.org/ http://mirrors.gethosted.online/gentoo"
LANG="en_GB.UTF-8"
LDFLAGS="-fuse-ld=lld -fpie -Wl,--as-needed -Wl,-z,relro,-z,now -Wl,--defsym=__gentoo_check_ldflags__=0"
LINGUAS="en en_GB"
MAKEOPTS="-j16"
PKGDIR="/var/cache/binpkgs"
PORTAGE_COMPRESS="pzstd"
PORTAGE_COMPRESS_FLAGS="-9 --rm"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="PIC X a52 aac acl acpi activities aes alsa amd64 avx avx2 bash-completion bluetooth branding bzip2 cairo caps cdda cdr cli crypt dbus declarative dist-kernel dri dts dvd dvdr emacs emboss encode exif f16c filecaps firewalld flac fma3 fortran gdbm gif gmp gpm graphite gtk gui hardened hunspell iconv icu ipv6 jit jpeg kde kdesu kipi kwallet lcms libglvnd libnotify libtirpc llvm-libunwind mad mmx mmxext mng mp3 mp4 mpeg multilib ncurses nls nptl ogg opengl openmp pam pango pclmul pcre pdf pgo pie plasma png policykit popcnt ppds pulseaudio qml qt5 rdrand readline sdl seccomp semantic-desktop sha spell split-usr sse sse2 sse3 sse4_1 sse4_2 sse4a ssl ssse3 startup-notification svg system-av1 system-binutils system-boost system-bootstrap system-cairo system-clang system-digest system-ffmpeg system-harfbuzz system-heimdal system-icu system-jpeg system-leveldb system-libevent system-libs system-libvpx system-libyaml system-lz4 system-mitkrb5 system-sqlite system-ssl system-tbb system-uulib system-webp system-zlib systemd threads tiff truetype udev udisks unicode upower usb verify-sig vorbis vulkan wayland widgets wxwidgets x264 xattr xcb xml xv xvid zfs zlib zsh-completion" ABI_X86="32 64" ADA_TARGET="gnat_2019" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool 
swap syslog" CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sha sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput" KERNEL="linux" L10N="en en-GB" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-3 php7-4" POSTGRES_TARGETS="postgres12 postgres13" PYTHON_SINGLE_TARGET="python3_9" PYTHON_TARGETS="python3_9 python3_10 python3_8" RUBY_TARGETS="ruby26 ruby27" USERLAND="GNU" VIDEO_CARDS="amdgpu radeonsi radeon" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, INSTALL_MASK, LC_ALL, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_RSYNC_EXTRA_OPTS, RUSTFLAGS
Comment 2 SpanKY gentoo-dev 2021-10-24 22:07:55 UTC
please check to see if sandbox-2.26 passes, and if sandbox-2.27[-nnp] works
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-10-24 22:13:30 UTC
(In reply to SpanKY from comment #2)
> please check to see if sandbox-2.26 passes, and if sandbox-2.27[-nnp] works

- sandbox-2.27[-nnp] didn't work, did try before but forgot to mention (apologies)
- sandbox-2.26 didn't work either, nor does sandbox-2.25, nor does sandbox-2.24

... so I'm guessing this never worked with glibc-2.34 (unkeyworded but we're nearly done with most blockers) and we only just noticed?
Comment 4 SpanKY gentoo-dev 2021-10-25 00:24:41 UTC
(In reply to Sam James from comment #3)

thanks, that makes me happy(ish).  i was afraid it was a regression.

unfortunately, libcap tests are passing fine for me on my amd64 system w/glibc-2.34.  the crash looks like it's in the x86 ABI on your amd64 system, but i verified that passed too.
  ABI_X86=32 FEATURES=test emerge libcap
checked 2.57 & 2.59.

but i'm not running a hardened setup.  let me see if i have a hardened install laying around.
Comment 5 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-10-25 01:11:12 UTC
(In reply to SpanKY from comment #4)
> (In reply to Sam James from comment #3)
> 
> thanks, that makes me happy(ish).  i was afraid it was a regression.
> 
> unfortunately, libcap tests are passing fine for me on my amd64 system
> w/glibc-2.34.  the crash looks like it's in the x86 ABI on your amd64
> system, but i verified that passed too.
>   ABI_X86=32 FEATURES=test emerge libcap
> checked 2.57 & 2.59.
> 
> but i'm not running a hardened setup.  let me see if i have a hardened
> install laying around.

Wait, it happens without sandbox:

```
                #0  0x00000000f7dbfb44 fstatat64_time64_statx (/lib/libc.so.6 + 0x106b44)
                #1  0x00000000f7dbf6cc __GI___fstat64_time64 (/lib/libc.so.6 + 0x1066cc)
                #2  0x00000000f7d29381 __GI__IO_file_doallocate (/lib/libc.so.6 + 0x70381)
                #3  0x00000000f7d38215 __GI__IO_doallocbuf (/lib/libc.so.6 + 0x7f215)
                #4  0x00000000f7d360c6 __GI__IO_file_xsgetn (/lib/libc.so.6 + 0x7d0c6)
                #5  0x00000000f7d2a3d5 __GI__IO_fread (/lib/libc.so.6 + 0x713d5)
                #6  0x0000000056601c76 n/a (/var/tmp/portage/sys-libs/libcap-2.60/work/libcap-2.60-abi_x86_32.x86/libcap/libcap.so.2.60 + 0x6c76)
```

Uh oh.
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-10-25 01:13:46 UTC
It's also passing without abi_x86_32(!). Don't know why I didn't check that earlier, sorry.

Poking at https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/release/2.34/master to see if there's anything worth trying.
Comment 7 SpanKY gentoo-dev 2021-10-25 01:30:51 UTC
(In reply to Sam James from comment #5)

ok, let's stop thinking about sandbox then
Comment 8 SpanKY gentoo-dev 2021-10-25 07:00:52 UTC
i can't reproduce.  i wonder if the kernel is the diff.

i also have USE=-pam, but i don't *think* that's relevant.  worth a test on your side at least.

Portage 3.0.28 (python 3.9.7-final-0, default/linux/amd64/17.1/hardened, gcc-11.2.0, glibc-2.34, 5.14.1 x86_64)
=================================================================
System uname: Linux-5.14.1-x86_64-AMD_FX-tm-4350_Quad-Core_Processor-with-glibc2.34
KiB Mem:    32787516 total,   4308456 free
KiB Swap:          0 total,         0 free
sh bash 5.1_p8
ld GNU ld (Gentoo 2.37_p1 p0) 2.37
ccache version 3.3.4 [enabled]
app-shells/bash:          5.1_p8::gentoo
dev-lang/perl:            5.34.0-r3::gentoo
dev-lang/python:          3.9.7_p1::gentoo
dev-util/ccache:          3.3.4::gentoo
sys-apps/baselayout:      2.8::gentoo
sys-apps/sandbox:         2.26::gentoo
sys-devel/autoconf:       2.69-r5::gentoo, 2.71-r1::gentoo
sys-devel/automake:       1.16.5::gentoo
sys-devel/binutils:       2.37_p1::gentoo
sys-devel/gcc:            11.2.0::gentoo
sys-devel/gcc-config:     2.4::gentoo
sys-devel/libtool:        2.4.6-r6::gentoo
sys-devel/make:           4.3::gentoo
sys-kernel/linux-headers: 5.14::gentoo (virtual/os-headers)
sys-libs/glibc:           2.34::gentoo
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="@FREE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native -g"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf /etc/gentoo-release /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe -march=native -g"
DISTDIR="/usr/portage/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs ccache compressdebug config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news noinfo parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms splitdebug strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.UTF8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="en en@quot en_US"
MAKEOPTS="-j4"
PKGDIR="/root/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="acl amd64 bzip2 crypt hardened iconv ipv6 libglvnd libtirpc multilib ncurses nptl openmp pcre pie readline seccomp split-usr ssl ssp unicode xattr xtpax zlib" ABI_X86="64" ADA_TARGET="gnat_2019" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput" KERNEL="linux" L10N="en en-US" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-3 php7-4" POSTGRES_TARGETS="postgres12 postgres13" PYTHON_SINGLE_TARGET="python3_9" PYTHON_TARGETS="python3_9" RUBY_TARGETS="ruby26 ruby27" USERLAND="GNU" VIDEO_CARDS="amdgpu fbdev intel nouveau radeon radeonsi vesa dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RUSTFLAGS
Comment 9 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-10-25 08:44:33 UTC
Created attachment 746595 [details]
.config (5.10.75, gentoo-kernel[hardened])

(In reply to SpanKY from comment #8)
> i can't reproduce.  i wonder if the kernel is the diff. 

Attached config. Managed to reproduce it on same machine w/ stage3-amd64-openrc-20211024T170536Z.tar.xz upgraded to ~arch + glibc-2.34 at least.

Works fine in the same chroot if I roll it back to a pre-2.34 snapshot (2.33-r7).

Stumped as to why I can't get anything more meaningful out of gdb but assuming it's because of the voodoo that's going on with libcap:

https://git.kernel.org/pub/scm/libs/libcap/libcap.git/tree/libcap/execable.c?id=5306fa23ff92832be949b28d86eec39b54fbee26
https://git.kernel.org/pub/scm/libs/libcap/libcap.git/tree/libcap/execable.h?id=5306fa23ff92832be949b28d86eec39b54fbee26

> i also have USE=-pam, but i don't *think* that's relevant.

no difference
Comment 10 Ionen Wolkens gentoo-dev 2021-10-26 06:04:01 UTC
Created attachment 746757 [details]
build.log + emerge --info (amd64 with glibc-2.33-r7)

Unlikely to help (and may be a different cause entirely) but posting anyway.

I get a similar issue in my test chroot, but it happens with glibc-2.33-r7 and only for amd64 while x86's ./libcap.so executes normally.

The further odd bit is that tests pass fine if built outside the chroot (same running kernel, and shares exact same glibc build -- few minor differences that I use for testing but haven't been able to pinpoint a cause).

And to make things worse, rebuilding glibc-2.33-r7 with debug symbols in the chroot made the tests pass.. so I don't really have anything meaningful to give to help debug this. May indicate that how glibc is built has some impact though.
Comment 11 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-11-11 00:41:19 UTC
CCing upstream (Andrew).

Summary is:
1. Segfaults when running the test suite (or executing 32-bit libcap.so) at least with glibc-2.34 pretty reliably and it seems to happen with glibc-2.33 too.

2. I can't get a meaningful backtrace because things seem to get corrupted.

3. Output looks like:
```
cap_test PASS
make libcapsotest
make[2]: Entering directory '/var/tmp/portage/sys-libs/libcap-2.60/work/libcap-2.60-abi_x86_32.x86/libcap'
Makefile:106: warning: overriding recipe for target 'cap_names.list.h'
Makefile:102: warning: ignoring old recipe for target 'cap_names.list.h'
./libcap.so
make[2]: *** [Makefile:149: libcapsotest] Segmentation fault (core dumped)
make[2]: Leaving directory '/var/tmp/portage/sys-libs/libcap-2.60/work/libcap-2.60-abi_x86_32.x86/libcap'
make[1]: *** [Makefile:156: test] Error 2
make[1]: Leaving directory '/var/tmp/portage/sys-libs/libcap-2.60/work/libcap-2.60-abi_x86_32.x86/libcap'
make: *** [Makefile:12: test] Error 2
```

4. Running directly fails the same way:
```
mop /var/tmp/portage/sys-libs/libcap-2.60/work/libcap-2.60-abi_x86_32.x86/libcap # ./libcap.so.2.60
Segmentation fault (core dumped)
```

5. gdb:
```
(gdb) r
Starting program: /var/tmp/portage/sys-libs/libcap-2.60/work/libcap-2.60-abi_x86_32.x86/libcap/libcap.so.2.60

Program received signal SIGSEGV, Segmentation fault.
0xf7e81b94 in ?? ()
(gdb) bt
#0  0xf7e81b94 in ?? ()
#1  0x00000000 in ?? ()
(gdb)
```

6. strace:
```
[...]
openat(AT_FDCWD, "/proc/self/cmdline", O_RDONLY|O_LARGEFILE) = 3
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0444, stx_size=0, ...}) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=NULL} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
```

If I comment out the line causing /proc/self/cmdline to be read (and change it to NULL so it should fall back based on reading execable.h), it just dies on another statx call very similarly. Seems like something isn't being set up correctly before handing over to normal loader(?) but I've got no idea what the spec is for stuff like that.
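
For readers unfamiliar with the file in the strace output: `/proc/self/cmdline` holds the process arguments separated by NUL bytes, and the executable-shared-object entry point presumably reads it to rebuild `argv`. The following is a minimal Python sketch of that parsing step only, not libcap's code; the helper name `split_cmdline` and the sample bytes are hypothetical.

```python
import os

CMDLINE_PATH = "/proc/self/cmdline"  # the file opened in the strace output


def split_cmdline(raw: bytes) -> list:
    """Split the NUL-separated contents of a /proc/<pid>/cmdline file."""
    parts = raw.split(b"\0")
    # The last argument is NUL-terminated too, which leaves one empty
    # trailing piece after splitting.
    if parts and parts[-1] == b"":
        parts.pop()
    return [os.fsdecode(part) for part in parts]


# Hypothetical sample: what the file might hold for `./libcap.so --help`.
sample = b"./libcap.so\0--help\0"
print(split_cmdline(sample))

# On Linux, the same helper applied to the running interpreter.
if os.path.exists(CMDLINE_PATH):
    with open(CMDLINE_PATH, "rb") as handle:
        print(split_cmdline(handle.read()))
```

The parsing itself is trivial, and the strace output shows the `openat` and `statx` calls both succeeding before the signal arrives.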
Comment 12 Andrew G. Morgan 2021-11-11 02:51:02 UTC
What does ldd say for the failing binary?
Comment 13 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-11-11 03:14:55 UTC
(In reply to Andrew G. Morgan from comment #12)
> What does ldd say for the failing binary?

# ldd libcap.so
	linux-gate.so.1 (0xf7f97000)
	libc.so.6 => /lib/libc.so.6 (0xf7d3e000)
	/lib/ld-linux.so.2 (0xf7f99000)

FWIW, I got a bit stuck trying to understand glibc's dynamic linker code. I had an idea to try musl and fortunately it failed even with 64-bit.

I was able to extract a possibly more meaningful backtrace:
```
#0  0x00007ffff7fabcdd in vfprintf (f=0x7ffff7ffb280 <__stdout_FILE>, fmt=fmt@entry=0x55555555c5d8 "%s is the shared library version: libcap-2.60.\nSee the License file for distribution information.\nMore information on this library is available from:\n\n    https://sites.google.com/site/fullycapable/\n", ap=ap@entry=0x7fffffffe150) at src/stdio/vfprintf.c:660
#1  0x00007ffff7fa8af9 in printf (fmt=fmt@entry=0x55555555c5d8 "%s is the shared library version: libcap-2.60.\nSee the License file for distribution information.\nMore information on this library is available from:\n\n    https://sites.google.com/site/fullycapable/\n") at src/stdio/printf.c:9
#2  0x000055555555b6f6 in __execable_main (argc=<optimized out>, argv=0x555555560cc0) at execable.c:10
#3  __so_start () at execable.c:4
```

It might be possible for you to try e.g. an Alpine container (if not a Gentoo one) and run tests there and see if it happens for you. But this might be a different bug and if it looks like that, feel free to ignore this bit for now.
Comment 14 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-11-11 03:15:33 UTC
(In reply to Sam James from comment #13)
> (In reply to Andrew G. Morgan from comment #12)
> > What does ldd say for the failing binary?
> 
> # ldd libcap.so
> 	linux-gate.so.1 (0xf7f97000)
> 	libc.so.6 => /lib/libc.so.6 (0xf7d3e000)
> 	/lib/ld-linux.so.2 (0xf7f99000)
> 

In dmesg:
[108801.223685] traps: libcap.so[2353148] general protection fault ip:f7e74b94 sp:ffc0ab74 error:0 in libc.so.6[f7d8e000+179000]
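
A small sketch for pulling the numbers out of a `traps:` line like the one above. The regular expression is an assumption that fits this one message shape, and `parse_trap` is a hypothetical helper name. The offset it reports is relative to the mapping start printed by the kernel, which may not be the same base that coredumpctl uses for its `libc.so.6 + 0x...` offsets.

```python
import re

# Line copied from the dmesg output above.
LINE = ("[108801.223685] traps: libcap.so[2353148] general protection fault "
        "ip:f7e74b94 sp:ffc0ab74 error:0 in libc.so.6[f7d8e000+179000]")

# Assumed pattern for this message shape; other kernels may format it differently.
PATTERN = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:\d+ "
    r"in (?P<module>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_trap(line: str) -> dict:
    """Extract the module, the ip offset into its mapping, and sp alignment."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like a traps: message")
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    size = int(match["size"], 16)
    return {
        "module": match["module"],
        "offset": ip - base,                    # ip relative to mapping start
        "inside": base <= ip < base + size,     # sanity check on the range
        "sp_mod_16": int(match["sp"], 16) % 16, # stack pointer modulo 16
    }


info = parse_trap(LINE)
print(info["module"], hex(info["offset"]), info["inside"], info["sp_mod_16"])
```

For this line the faulting ip lies 0xe6b94 bytes into the reported libc.so.6 mapping, and the stack pointer is 4 bytes past a 16-byte boundary.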
Comment 15 Andrew G. Morgan 2021-11-11 05:55:08 UTC
This is curious. OOC on your target system can you reproduce the crash with my answer on stackoverflow ( https://stackoverflow.com/a/68339111 ) summarizing how all this stuff works? That example is completely stand alone and might be easier to debug if it crashes as well.

Other notes.

The attached build.log is from a libcap-2.57 build, but I tried with HEAD (I don't see any code changes between these two revisions that would change this stuff).

At this point, I tried and wasn't able to reproduce any crash on a fedora-34 system gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1). I had to install glibc-2.33-20.fc34.i686 and glibc-devel-2.33-20.fc34.i686 to build it:

$ make clean COPTS="-m32 -O0 -g -ggdb3" -C libcap all test
[...]
make libcapsotest
make[1]: Entering directory '/home/andrew/OLDME/gits/libcap/libcap'
./libcap.so
./libcap.so is the shared library version: libcap-2.60.
See the License file for distribution information.
More information on this library is available from:

    https://sites.google.com/site/fullycapable/
make[1]: Leaving directory '/home/andrew/OLDME/gits/libcap/libcap'
make libpsxsotest
make[1]: Entering directory '/home/andrew/OLDME/gits/libcap/libcap'
./libpsx.so
./libpsx.so is the shared library version: libpsx-2.60.
See the License file for distribution information.
More information on this library is available from:

    https://sites.google.com/site/fullycapable/
make[1]: Leaving directory '/home/andrew/OLDME/gits/libcap/libcap'
make: Leaving directory '/home/andrew/OLDME/gits/libcap/libcap'
$ ldd libcap/libcap.so
        linux-gate.so.1 (0xf7ed8000)
        libc.so.6 => /lib/libc.so.6 (0xf7ce4000)
        /lib/ld-linux.so.2 (0xf7eda000)
Comment 16 Ionen Wolkens gentoo-dev 2021-11-11 06:28:13 UTC
(In reply to Andrew G. Morgan from comment #15)
> glibc-2.33-20.fc34.i686 and glibc-devel-2.33-20.fc34.i686
The x86 issue is with glibc-2.34, which I can also reproduce (I've had oddities with 2.33 too but I believe that's something else).

(In reply to Andrew G. Morgan from comment #15)
> This is curious. OOC on your target system can you reproduce the crash with
> my answer on stackoverflow ( https://stackoverflow.com/a/68339111 )
> summarizing how all this stuff works? That example is completely stand alone
> and might be easier to debug if it crashes as well.
glibc-2.33: both amd64 and `gcc -m32`'s ./multi.so work
glibc-2.34: 64 works, -m32 ./multi.so created the same way segfaults
Comment 17 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-11-11 06:58:02 UTC
Created attachment 750255 [details]
Dockerfile (gentoo, glibc-2.34, build libcap from git)

Not had a chance to try the SO minimal version yet, but for now, this may be useful.

I've attached a Dockerfile which installs glibc-2.34 (currently unkeyworded - not unleashed on users by default) and tries to build libcap and run its tests using the command you gave.

```
mkdir -p ~/docker/gentoo && cd ~/docker

# Create the attached Dockerfile
$EDITOR ~/docker/gentoo/Dockerfile

# Build the image
docker build -t gentoo gentoo/

# Run it (which builds & tries to run tests for libcap)
docker run -it gentoo
```

If Docker isn't usable (I'm no expert with it myself, just wanted something generic I could give you to get set up easily), I can hopefully give some other minimal instructions to use a Gentoo chroot quickly. But that may not be necessary if I can build the SO version from the link you gave and debug it there.
Comment 18 Andrew G. Morgan 2021-11-12 05:57:28 UTC
Using the docker stuff, I can reproduce it. A mystery so far. I have to confess, the SSE instruction is not something I recognize (stepi-ing through the code under gdb). We seem to be crashing on some sort of SSE move:

=> 0xf7df4674:  31 db   xor    %ebx,%ebx
(gdb) 
0xf7df4676 in ?? ()
=> 0xf7df4676:  66 0f eb e0     por    %xmm0,%xmm4
(gdb) 
0xf7df467a in ?? ()
=> 0xf7df467a:  66 0f 6e c1     movd   %ecx,%xmm0
(gdb) 
0xf7df467e in ?? ()
=> 0xf7df467e:  0f b6 ca        movzbl %dl,%ecx
(gdb) 
0xf7df4681 in ?? ()
=> 0xf7df4681:  66 0f 62 c1     punpckldq %xmm1,%xmm0
(gdb) 
0xf7df4685 in ?? ()
=> 0xf7df4685:  0f 29 64 24 10  movaps %xmm4,0x10(%esp)
(gdb) 

Program received signal SIGSEGV, Segmentation fault.
0xf7df4685 in ?? ()
=> 0xf7df4685:  0f 29 64 24 10  movaps %xmm4,0x10(%esp)

The objdump in the container doesn't seem to be able to disassemble the 32 bit glibc so at this point, I'm stuck.
Comment 19 Andrew G. Morgan 2021-11-12 14:59:19 UTC
It looks like the crash is somewhere inside the glibc:fread() function code. glibc seems unfriendly to the debugger as currently installed, so I can't easily debug what is going on. Is there a "debug symbol" build of libc.so.6 ?

f5d495130a5e /libcap/libcap # gdb ./libcap.so.2.60 
GNU gdb (Gentoo 11.1 vanilla) 11.1
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./libcap.so.2.60...
(gdb) b fread@plt
Breakpoint 1 at 0x2210
(gdb) r
Starting program: /libcap/libcap/libcap.so.2.60 
warning: Error disabling address space randomization: Operation not permitted

Breakpoint 1, 0x565b0210 in fread@plt ()
(gdb) stepi
0x565b0216 in fread@plt ()
(gdb) 
0x565b021b in fread@plt ()
(gdb) 
0x565b0020 in ?? ()
(gdb) 
0x565b0026 in ?? ()
(gdb) 
0xf7f4bb90 in ?? () from /lib/ld-linux.so.2
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0xf7e11685 in ?? ()
(gdb) p/x 0xf7f4bb90 - 0xf7e11685
$1 = 0x13a50b
(gdb) x/2i 0xf7e11685
=> 0xf7e11685:  movaps %xmm4,0x10(%esp)
   0xf7e1168a:  movq   0xcc(%esp),%xmm4
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /libcap/libcap/libcap.so.2.60 
warning: Error disabling address space randomization: Operation not permitted

Breakpoint 1, 0x56643210 in fread@plt ()
(gdb) stepi
0x56643216 in fread@plt ()
(gdb) 
0x5664321b in fread@plt ()
(gdb) 
0x56643020 in ?? ()
(gdb) 
0x56643026 in ?? ()
(gdb) 
0xf7f4bb90 in ?? () from /lib/ld-linux.so.2
(gdb) p/x 0xf7f4bb90 - 0x13a50b
$2 = 0xf7e11685
(gdb) b *0xf7e11685
Breakpoint 2 at 0xf7e11685
(gdb) c
Continuing.

Breakpoint 2, 0xf7e11685 in ?? ()
(gdb) x/5i $eip
=> 0xf7e11685:  movaps %xmm4,0x10(%esp)
   0xf7e1168a:  movq   0xcc(%esp),%xmm4
   0xf7e11693:  movdqa %xmm0,%xmm5
   0xf7e11697:  psllq  $0x20,%xmm0
   0xf7e1169c:  psllq  $0x8,%xmm5
(gdb) bt
#0  0xf7e11685 in ?? ()
#1  0xf7f67b88 in ?? ()
#2  0xffd3d93c in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) info registers 
eax            0x0                 0
ecx            0x0                 0
edx            0x0                 0
ebx            0x0                 0
esp            0xffd3d614          0xffd3d614
ebp            0xffd3d804          0xffd3d804
esi            0x8124              33060
edi            0xf7f23000          -135122944
eip            0xf7e11685          0xf7e11685
eflags         0x246               [ PF ZF IF ]
cs             0x23                35
ss             0x2b                43
ds             0x2b                43
es             0x2b                43
fs             0x0                 0
gs             0x63                99
(gdb) x/20x $esp
0xffd3d614:     0x000007f8      0xf7f35000      0xf7f23000      0xf7f23000
0xffd3d624:     0x00800000      0xfffff218      0x00000012      0x0000000f
0xffd3d634:     0xffd3d648      0xf7d95790      0xf7f23000      0xffd3d668
0xffd3d644:     0xf7f66fc8      0xffd3d8a8      0xf7d8eee2      0x00000012
0xffd3d654:     0xffd3d668      0xf7d8ee10      0xf7f4d54f      0xf7f29000
(gdb)
Comment 20 Mike Gilbert gentoo-dev 2021-11-12 15:17:50 UTC
(In reply to Andrew G. Morgan from comment #19)
> It looks like the crash is somewhere inside the glibc:fread() function code.
> glibc seems unfriendly to the debugger as currently installed, so I can't
> easily debug what is going on. Is there a "debug symbol" build of libc.so.6 ?

Probably easiest to just rebuild it with symbols. The following command line should work.

FEATURES="nostrip" CFLAGS="-ggdb" emerge --oneshot sys-libs/glibc
Comment 21 Mike Gilbert gentoo-dev 2021-11-12 15:27:25 UTC
Also maybe add MAKEOPTS="-j$(nproc)" to enable a multi-job build.
Comment 22 Andrew G. Morgan 2021-11-13 05:59:20 UTC
So I rebuilt glibc with those instructions. The segfault is still there. Oddly, the crash is still in a location without any symbols...? Assembly code?

I downloaded a copy of glibc and browsed around in there looking for some of the SSE codes. I thought I might find something that matched without having symbols to go on...

From last time, the location of the crash was before this code sequence:

Breakpoint 2, 0xf7e11685 in ?? ()
(gdb) x/5i $eip
=> 0xf7e11685:  movaps %xmm4,0x10(%esp)
   0xf7e1168a:  movq   0xcc(%esp),%xmm4
   0xf7e11693:  movdqa %xmm0,%xmm5
   0xf7e11697:  psllq  $0x20,%xmm0
   0xf7e1169c:  psllq  $0x8,%xmm5

Without symbols it is hard to guess what it might be. I grepped around looking for examples of 'psllq' in the glibc-2.34 tagged code. Interestingly, the only assembly that contains 'psllq' is in files with names like sysdeps/x86_64/fpu/multiarch/svml_*.S and they look like sin and other math functions.

However, that is 64 bit code, so how would that get invoked from 32-bit code? I'm really not familiar with any of this stuff. I'm more and more convinced we need someone who is to look deeper into this.

Other things I did:

I waded around for some time in gdb decoding the syscalls manually, putting a breakpoint on __kernel_vsyscall.

Eventually, I just emerged strace and ran that:

f5d495130a5e /libcap/libcap # strace ./libcap.so
execve("./libcap.so", ["./libcap.so"], 0x7fff569d86e0 /* 9 vars */) = 0
[ Process PID=191791 runs in 32 bit mode. ]
brk(NULL)                               = 0x57667000
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf7fa9000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=15552, ...}) = 0
mmap2(NULL, 15552, PROT_READ, MAP_PRIVATE, 3, 0) = 0xf7fa5000
close(3)                                = 0
openat(AT_FDCWD, "/lib/libc.so.6", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0 \26\2\0004\0\0\0"..., 512) = 512
pread64(3, "\4\0\0\0\20\0\0\0\1\0\0\0GNU\0\0\0\0\0\3\0\0\0\2\0\0\0\0\0\0\0"..., 72, 468) = 72
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0755, stx_size=12387916, ...}) = 0
mmap2(NULL, 2201084, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xf7d8b000
mprotect(0xf7dab000, 2039808, PROT_NONE) = 0
mmap2(0xf7dab000, 1515520, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x20000) = 0xf7dab000
mmap2(0xf7f1d000, 520192, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x192000) = 0xf7f1d000
mmap2(0xf7f9d000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x211000) = 0xf7f9d000
mmap2(0xf7fa0000, 17916, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf7fa0000
close(3)                                = 0
set_thread_area({entry_number=-1, base_addr=0xf7faa100, limit=0x0fffff, seg_32bit=1, contents=0, read_exec_only=0, limit_in_pages=1, seg_not_present=0, useable=1}) = 0 (entry_number=12)
set_tid_address(0xf7faa168)             = 191791
set_robust_list(0xf7faa170, 12)         = 0
mprotect(0xf7f9d000, 8192, PROT_READ)   = 0
mprotect(0x565fc000, 4096, PROT_READ)   = 0
mprotect(0xf7fe1000, 8192, PROT_READ)   = 0
ugetrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
munmap(0xf7fa5000, 15552)               = 0
getrandom("\xf6\xb9\x25\x64", 4, GRND_NONBLOCK) = 4
brk(NULL)                               = 0x57667000
brk(0x57688000)                         = 0x57688000
brk(0x57689000)                         = 0x57689000
openat(AT_FDCWD, "/proc/self/cmdline", O_RDONLY|O_LARGEFILE) = 3
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0444, stx_size=0, ...}) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=NULL} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
Comment 23 Andrew G. Morgan 2021-11-13 16:35:28 UTC
The fact that the test case in #c16 fails with glibc-2.34 suggests we can decouple the libcap-specific aspects from the failure itself. Something about the execution mechanism it relies on is incompatible with the 32-bit build and environment of the -m32 shared object.

We can focus on this pared down code:

# cat multi.h 
/* multi.h */
void multi_main(void);
void multi(const char *caller);
# cat multi.c 
/* multi.c */
#include <stdio.h>
#include <stdlib.h>
#include "multi.h"

void multi(const char *caller) {
    printf("called from %s\n", caller);
}

void multi_main(void) {
    multi(__FILE__);
    exit(42);
}

const char dl_loader[] __attribute__((section(".interp"))) =
    DL_LOADER ;
# gcc -fPIC -shared -m32 -DDL_LOADER="\"/lib/ld-linux.so.2\"" -o multi32.so multi.c -Wl,-e,multi_main
# ./multi32.so
Segmentation fault (core dumped)

Which we can compare with 

# gcc -fPIC -shared -DDL_LOADER="\"/lib64/ld-linux-x86-64.so.2\"" -o multi64.so multi.c -Wl,-e,multi_main
# ./multi64.so 
called from multi.c

It looks as if the loader itself is happy with the binary. That is, both of these work:

# /lib64/ld-linux-x86-64.so.2 --verify multi64.so
# /lib/ld-linux.so.2 --verify multi32.so

I'm kind of stumbling around in the dark, but could there be some sort of HWCAP problem for these -m32 builds? That is, this seems to actually work(!):

# /lib/ld-linux.so.2 --glibc-hwcaps-mask i686,tls,sse2 ./multi32.so 
called from multi.c

Which, if I'm reading the output of '/lib/ld-linux.so.2 --help' correctly, limits HWCAP features available when running the program.

- Could there be some issue with other HWCAP things on these systems that is interfering with the correct operation of the program when it implicitly leverages /lib/ld-linux.so.2 at execution time?

- Does this new glibc require that something explicitly mask the hwcaps, via some new linker trickery in the .so binary?
Comment 24 Andrew G. Morgan 2021-11-13 16:46:20 UTC
Is this saying that the way glibc is built can affect this sort of thing?

https://www.gnu.org/software/libc/manual/html_node/Tunables.html
Comment 25 Arsen Arsenović gentoo-dev 2021-11-13 21:57:07 UTC
The reason this crash happens is because libc isn't set up. Due to the way
the binary is built, constructors and other libc init code aren't run, so it's
honestly unfair to assume it should (not could!) work.

While I'm largely unfamiliar with glibc internals and even more unfamiliar with
x86 weirdness, I am decently sure what happens here is that fopen@plt resolves
fopen which then swiftly trips either on a misaligned stack or an uninitialized
global. This is further confirmed with LD_DEBUG=all and LD_BIND_NOW=no, which
would skip the other potential suspect: the RTDL.

On the topic of the (AFAIU) motivation behind this:

Honestly, while the idea of having .sos be given capabilities outside
the permitted set of its host process is a novel one, I think it's too
abuse-able/easy to trick for it to be worth implementing, moreover, I think
forking for it is too unportable/fragile. A better idea would be to improve
capability related things in the kernel, permitting you to be granted a
"ticket" in form of a file descriptor for privileged operations (for instance,
and this is something I'd like to implement via a flag to pidfd_open, ownership
of a pidfd ought to be enough to authorize controlling the process on the other
side).

P.S. tunables could be involved, though it's more likely that a tunable makes
this work by coincidence rather than making it break by coincidence.
Comment 26 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-11-13 21:59:19 UTC
Created attachment 750957 [details]
Dockerfile (gentoo, musl, build libcap from git)

(I've also attached a musl Dockerfile which crashes in a possibly more informative way. I was able to get a more useful backtrace out of it. Note that it's possibly a different issue, but like Arsen, I suspect that this is more to do with how things are set up in the libcap loader, and is therefore fragile, but I'm not sure.)
Comment 27 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-11-13 22:09:32 UTC
(In reply to Andrew G. Morgan from comment #24)
> Is this saying that the way glibc is built can affect this sort of thing?
> 
> https://www.gnu.org/software/libc/manual/html_node/Tunables.html

According to https://stackoverflow.com/questions/42451492/disable-avx-optimized-functions-in-glibc-ld-hwcap-mask-etc-ld-so-nohwcap-for, for the purposes of testing, we can do LD_HWCAP_MASK=0.

But we may want to try building glibc with simpler instructions to help too.

In the 'emerge' line for glibc in the Dockerfile, prefix it with something like: CFLAGS="-march=x86-64", maybe?
Comment 28 Andrew G. Morgan 2021-11-14 05:37:21 UTC
It looks as if there is a missing detail in execable.h. GCC generates code on
the assumption that the stack is 16-byte aligned on entry to main. This is
required for the SSE instruction 'movaps %xmm4,0x10(%esp)' to work. That is,
the store from %xmm4 will segfault if the memory address the instruction
writes to is not 16-byte aligned.

Since 0x10 is a 16-byte aligned offset, we need to ensure that %esp starts off
aligned from the beginning of main.
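
The alignment rule above can be checked against the register dump from comment 19, where %esp was 0xffd3d614 at the faulting instruction. The following is only a minimal sketch; the helper names is_aligned16 and movaps_operand are made up for illustration and are not part of libcap or glibc.

```c
#include <stdint.h>

/* Return nonzero when addr is a multiple of 16, which is what movaps
 * requires of its memory operand. (Hypothetical helper name.) */
int is_aligned16(uintptr_t addr)
{
    return (addr & 0xfu) == 0;
}

/* Effective address of the memory operand in
 * "movaps %xmm4,0x10(%esp)" for a given %esp value.
 * (Hypothetical helper name.) */
uintptr_t movaps_operand(uintptr_t esp)
{
    return esp + 0x10;
}
```

With %esp = 0xffd3d614 the operand address is 0xffd3d624, whose low four bits are 0x4, so is_aligned16() reports it as misaligned. That is consistent with the SIGSEGV at that instruction.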

I've pushed the following commit to force this alignment on __i386__ builds:

https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=c234bf90839f19e0332b586335411cb626a25a18

The x86_64 ABI requires it to always be true, so it can't be violated by the
compiler. However, the 32-bit x86 variant doesn't require it, and as
witnessed in this bug, we don't reliably get it.
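
As a rough illustration of forcing the alignment from C, GCC's x86 function attribute force_align_arg_pointer makes a function's prologue realign the stack before its body runs. The sketch below is an assumption about the general approach, not a copy of the linked commit; ENTRY_REALIGN and entry_with_realigned_stack are hypothetical names.

```c
#include <stdint.h>

/* force_align_arg_pointer is an x86-only GCC attribute, so only apply
 * it there. ENTRY_REALIGN is a hypothetical macro name. */
#if defined(__i386__) || defined(__x86_64__)
#define ENTRY_REALIGN __attribute__((force_align_arg_pointer))
#else
#define ENTRY_REALIGN
#endif

/* Stand-in for a custom entry point that may be entered with an
 * arbitrarily aligned stack. Returns 1 when a 16-byte-aligned local
 * really ends up on a 16-byte boundary. */
ENTRY_REALIGN int entry_with_realigned_stack(void)
{
    volatile char scratch[16] __attribute__((aligned(16)));

    scratch[0] = 0;
    return ((uintptr_t)scratch & 0xfu) == 0;
}
```

When called normally from C this returns 1 either way; the attribute only matters when the function is entered with a stack that the caller did not align, as with an -e entry point.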

I'll update the stackoverflow answer I referenced in #c15 so I don't forget this detail.

Re musl: I've filed this bug to investigate what is going on with musl builds:

   https://bugzilla.kernel.org/show_bug.cgi?id=215009
Comment 29 Arsen Arsenović gentoo-dev 2021-11-14 11:18:59 UTC
argc and argv are passed as the bottom two entries in the stack (so, 4(%esp) and 8(%esp)), by the way, so that should be used instead of parsing /proc/self/cmdline.

I managed to get reading those arguments to work by adding a stub to push another zero below them and then jump to __so_start (I think this is the case because there should be a pointer below them for the return address, which there isn't since the rtdl does not push one at the bottom of the stack, or something of such nature).

libc.so.6 does something similar, and I am not able to justify it in their case either.
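
For comparison, the portable alternative mentioned above, recovering the arguments from /proc/self/cmdline, comes down to splitting a NUL-separated buffer. This is only a sketch of that idea, not libcap's actual code; split_cmdline is a hypothetical name.

```c
#include <stddef.h>

/* Split a /proc/self/cmdline-style buffer (arguments separated by NUL
 * bytes) into argv-style pointers into buf. Returns the number of
 * arguments found, at most max_args. The caller must make sure the
 * last argument is NUL-terminated. (Hypothetical helper name.) */
int split_cmdline(char *buf, size_t len, char **argv, int max_args)
{
    int argc = 0;
    size_t i = 0;

    while (i < len && argc < max_args) {
        argv[argc++] = &buf[i];
        while (i < len && buf[i] != '\0')
            i++;
        i++; /* step over the terminating NUL */
    }
    return argc;
}
```

A caller would first read /proc/self/cmdline into buf (for example with fopen() and fread()) and pass the number of bytes read as len. This needs no per-architecture assembly, at the cost of depending on /proc being mounted.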
Comment 30 Andrew G. Morgan 2021-11-14 14:53:12 UTC
While this is true, it then requires assembly for each supported architecture to implement. My preference is C on all variants of Linux.
Comment 31 Andrew G. Morgan 2021-11-14 15:41:33 UTC
For reference, this long bug discusses this issue. It goes back and forth until this specific entry:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40838#c91

which explains what appears to be the current state of affairs.
Comment 32 Larry the Git Cow gentoo-dev 2021-11-20 08:29:48 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=8b27cb9d5856f9461666b7e40bc047522ab91aed

commit 8b27cb9d5856f9461666b7e40bc047522ab91aed
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2021-11-20 08:28:30 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2021-11-20 08:29:41 +0000

    sys-libs/libcap: backport alignment fixes
    
    This fixes a segfault in the test suite for abi_x86_32 and musl.
    
    Closes: https://bugs.gentoo.org/820071
    Thanks-to: Arsen Arsenovic <arsen@aarsen.me>
    Thanks-to: Andrew G. Morgan <morgan@kernel.org>
    Signed-off-by: Sam James <sam@gentoo.org>

 .../files/libcap-2.60-libcap-alignment.patch       | 105 +++++++++++++++++++++
 sys-libs/libcap/libcap-2.60-r1.ebuild              |  89 +++++++++++++++++
 2 files changed, 194 insertions(+)