Bug 690600

Summary: sys-libs/glibc-2.29-r2 fails to build with segmentation fault on ARM (ptrace?)
Product: Gentoo Linux
Reporter: Nuno <can.ecodo.nu.n.o+bugs.gentoo>
Component: Current packages
Assignee: Gentoo Toolchain Maintainers <toolchain>
Status: RESOLVED OBSOLETE
Severity: normal
CC: herrtimson, matoro_bugzilla_gentoo
Priority: Normal
Version: unspecified
Hardware: ARM
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---
Attachments: build.log.gz (13MB uncompressed)
environment.gz
emerge.info

Description Nuno 2019-07-24 16:43:05 UTC
Created attachment 584302 [details]
build.log.gz (13MB uncompressed)

sys-libs/glibc-2.29-r2 fails to build with segmentation fault


/var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list
make[1]: *** [Makefile:106: install-symbolic-link] Error 139
make[1]: Leaving directory '/var/tmp/portage/sys-libs/glibc-2.29-r2/work/glibc-2.29'
make: *** [Makefile:12: install] Error 2
 * ERROR: sys-libs/glibc-2.29-r2::gentoo failed (install phase):
 *   emake failed
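
For reference, the "Error 139" that make reports above is a shell-style exit status: values above 128 mean the command was killed by signal (status - 128). Here that is 11, i.e. SIGSEGV, which agrees with the "fatal signal 11" line in the kernel log below. A minimal sketch of the decoding (the variable names are illustrative only):

```shell
# Decode a shell-style exit status into a signal number and name.
status=139                  # taken from "Error 139" in the build log
sig=$((status - 128))       # statuses above 128 encode 128 + signal number
name=$(kill -l "$sig")      # kill -l maps a signal number to its name
echo "exit status $status = signal $sig (SIG$name)"
```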




[Jul24 04:13] make/558: potentially unexpected fatal signal 11.
[  +0.003849] Pid: 558, comm:                 make
[  +0.004715] CPU: 0    Not tainted  (3.4.104-sunxi-g1df3de8e #35)
[  +0.001881] PC is at 0xb6dd0b0c
[  +0.001976] LR is at 0xcbdf3c00
[  +0.008888] pc : [<b6dd0b0c>]    lr : [<cbdf3c00>]    psr: 800d0010
[  +0.000007] sp : be864e10  ip : 00000001  fp : be864fc8
[  +0.003965] r10: b6e74c00  r9 : b6e74f00  r8 : b7823e58
[  +0.005219] r7 : 000000c0  r6 : 00000010  r5 : 00000000  r4 : ffffffff
[  +0.005235] r3 : b6e93000  r2 : cbdf3c00  r1 : 00000010  r0 : b6e93000
[  +0.005988] Flags: Nzcv  IRQs on  FIQs on  Mode USER_32  ISA ARM  Segment user
[  +0.004478] Control: 10c5387d  Table: 473b806a  DAC: 00000015

[  +0.001671] LR: 0xcbdf3b80:
[  +0.000447] 3b80  11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 75f2fbd5 75f2fbd5 11fef3d0
[  +0.007117] 3ba0  11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[  +0.007179] 3bc0  11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 75f5f5d5
[  +0.006974] 3be0  11fef3d0 33f6fddb 11fef3d0 11fef3d0 11fef3d0 75f2fbd5 75f2fbd5 75f2fbd5
[  +0.007209] 3c00  75f2fbd5 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[  +0.007029] 3c20  11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[  +0.006967] 3c40  11fef3d0 11fef3d0 11fef3d0 75f5f5d5 11fef3d0 33f6fddb 11fef3d0 11fef3d0
[  +0.007153] 3c60  11fef3d0 75f2fbd5 75f2fbd5 75f2fbd5 75f2fbd5 11fef3d0 11fef3d0 11fef3d0

[  +0.008226] R2: 0xcbdf3b80:
[  +0.000447] 3b80  11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 75f2fbd5 75f2fbd5 11fef3d0
[  +0.007060] 3ba0  11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[  +0.006992] 3bc0  11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 75f5f5d5
[  +0.007094] 3be0  11fef3d0 33f6fddb 11fef3d0 11fef3d0 11fef3d0 75f2fbd5 75f2fbd5 75f2fbd5
[  +0.007116] 3c00  75f2fbd5 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[  +0.007066] 3c20  11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[  +0.006953] 3c40  11fef3d0 11fef3d0 11fef3d0 75f5f5d5 11fef3d0 33f6fddb 11fef3d0 11fef3d0
[  +0.006988] 3c60  11fef3d0 75f2fbd5 75f2fbd5 75f2fbd5 75f2fbd5 11fef3d0 11fef3d0 11fef3d0
[  +0.014973] [<c0014508>] (unwind_backtrace+0x0/0xec) from [<c00461e0>] (get_signal_to_deliver+0x4e0/0x588)
[  +0.007944] [<c00461e0>] (get_signal_to_deliver+0x4e0/0x588) from [<c0010e78>] (do_signal+0x68/0x4f0)
[  +0.007324] [<c0010e78>] (do_signal+0x68/0x4f0) from [<c00115e8>] (do_notify_resume+0x50/0x5c)
[  +0.007502] [<c00115e8>] (do_notify_resume+0x50/0x5c) from [<c000dff8>] (work_pending+0x24/0x28)
Comment 1 Nuno 2019-07-24 16:44:27 UTC
Created attachment 584304 [details]
environment.gz
Comment 2 Nuno 2019-07-24 16:48:59 UTC
Created attachment 584306 [details]
emerge.info

emerge --info '=sys-libs/glibc-2.29-r2::gentoo' > emerge.info



emerge -pqv '=sys-libs/glibc-2.29-r2::gentoo'
[ebuild     U ] sys-libs/glibc-2.29-r2 [2.28-r6] USE="multiarch nscd ssp (-audit) -caps (-cet) (-compile-locales) -doc -gd -headers-only (-multilib) -profile (-selinux) -suid -systemtap -test (-vanilla)"
Comment 3 Sergei Trofimovich (RETIRED) gentoo-dev 2019-08-04 17:05:53 UTC
> [Jul24 04:13] make/558: potentially unexpected fatal signal 11.
> [  +0.003849] Pid: 558, comm:                 make
> [  +0.004715] CPU: 0    Not tainted  (3.4.104-sunxi-g1df3de8e #35)

Looks like most interesting stuff has happened before. Can you attach full 'dmesg'? Is crash reproducible?
Comment 4 Nuno 2019-08-05 00:34:55 UTC
(In reply to Sergei Trofimovich from comment #3)
> > [Jul24 04:13] make/558: potentially unexpected fatal signal 11.
> > [  +0.003849] Pid: 558, comm:                 make
> > [  +0.004715] CPU: 0    Not tainted  (3.4.104-sunxi-g1df3de8e #35)
> 
> Looks like most interesting stuff has happened before. Can you attach full
> 'dmesg'? Is crash reproducible?

There is nothing interesting in dmesg before that. Apart from UFW and w1_slave_driver lines, the only entry besides what I already posted is:

[Jul24 01:43] sysctl: The scan_unevictable_pages sysctl/node-interface has been disabled for lack of a legitimate use case.  If you have one, please send an email to linux-mm@kvack.org.

Note that the dmesg output I posted is from `dmesg -H`, so timestamps beginning with '+' are relative to the previous absolute timestamps, not system boot.


I can reproduce the crash every time on this system (Banana Pi M1), but I am not able to reproduce it on my Raspberry Pi or in an armv7a-unknown-linux-gnueabihf chroot.

I am happy to follow additional steps to debug this, I just don't know what else to try :/
Comment 5 Sergei Trofimovich (RETIRED) gentoo-dev 2019-08-07 18:23:32 UTC
Can you extract the exact command that crashes, for example by running 'make' manually in the build directory? It would be nice to extract a minimal example to explore and debug the cause of the crash.
Comment 6 Nuno 2019-08-28 23:57:04 UTC
(In reply to Sergei Trofimovich from comment #5)
> can you extract exact command that crashes? If you run 'make' manually in a
> build directory for example. Would be nice to extract minimal example to
> explore and debug the cause of crash.

I tried the following and none of them crashed:
* as portage:
 make -j1 -l2 -C /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl


* as root:
 cd /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl; make -j1 -l2 install_root=/var/tmp/portage/sys-libs/glibc-2.29-r2/image/ install


The latter ends with the following lines:
/var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list                                                                  
rm -f /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list
make[1]: Leaving directory '/var/tmp/portage/sys-libs/glibc-2.29-r2/work/glibc-2.29'


Note that, in build.log, make segfaults after running sln.

---

So, I tried running the command that leads to the segmentation fault (from build.log):

portage@banana:~/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl$ strace /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list
execve("/var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln", ["/var/tmp/portage/sys-libs/glibc-"..., "/var/tmp/portage/sys-libs/glibc-"...], 0xbedf41b4 /* 33 vars */) = 0                    
brk(NULL)                               = 0x10b4000
brk(0x10b4d08)                          = 0x10b4d08
set_tls(0x10b44c0)                      = 0
uname({sysname="Linux", nodename="banana", ...}) = 0
readlink("/proc/self/exe", "/var/tmp/portage/sys-libs/glibc-"..., 4096) = 98
brk(0x10d5d08)                          = 0x10d5d08
brk(0x10d6000)                          = 0x10d6000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x1} ---
+++ killed by SIGSEGV +++

I'm assuming it should run as portage because all the files in work/build-arm-armv7a-unknown-linux-gnueabihf-nptl are owned by portage. However, I previously ran the same command as root and it succeeded...


Running with gdb:

portage@banana:~/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl$ gdb /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln
GNU gdb (Gentoo 8.1 p1) 8.1
(...)
Reading symbols from /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln...done.
(gdb) run /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list
Starting program: /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list
Invalid link from "ld-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/ld-linux-armhf.so.3": Permission denied
Invalid link from "libc-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libc.so.6": Permission denied
Invalid link from "libBrokenLocale-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libBrokenLocale.so.1": Permission denied
Invalid link from "libm-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libm.so.6": Permission denied
Invalid link from "libdl-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libdl.so.2": Permission denied
Invalid link from "libc-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libc.so.6": Permission denied
Invalid link from "libpthread-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libpthread.so.0": Permission denied
Invalid link from "librt-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/librt.so.1": Permission denied
Invalid link from "libcrypt-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libcrypt.so.1": Permission denied
Invalid link from "libthread_db-1.0.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libthread_db.so.1": Permission denied
Invalid link from "libresolv-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libresolv.so.2": Permission denied
Invalid link from "libnss_dns-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnss_dns.so.2": Permission denied
Invalid link from "libanl-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libanl.so.1": Permission denied
Invalid link from "libnss_files-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnss_files.so.2": Permission denied
Invalid link from "libnss_db-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnss_db.so.2": Permission denied
Invalid link from "libnss_compat-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnss_compat.so.2": Permission denied
Invalid link from "libnss_hesiod-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnss_hesiod.so.2": Permission denied
Invalid link from "libnsl-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnsl.so.1": Permission denied
Invalid link from "libutil-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libutil.so.1": Permission denied
Invalid link from "ld-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/ld-linux-armhf.so.3": Permission denied
[Inferior 1 (process 17547) exited with code 01]


/var/tmp/portage/sys-libs/glibc-2.29-r2/image/ is owned by root, so that explains the "Permission denied" errors. It doesn't explain the segmentation fault under strace, though.

---

I then recompiled the package with CFLAGS="-Og -ggdb (...)" and got the same output on gdb and strace.

I was able to get a core dump, but it shows nothing useful:

portage@banana:~/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl$ gdb -c core
GNU gdb (Gentoo 8.1 p1) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "armv7a-unknown-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
[New LWP 16745]
Core was generated by `/var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnu'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00037828 in ?? ()
(gdb) symbol-file /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln
Reading symbols from /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln...done.
(gdb) bt
#0  0x00037828 in __open_nocancel (file=<optimized out>, oflag=-1946596096) at ../sysdeps/unix/sysv/linux/open_nocancel.c:42
#1  0x00404000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 


Not sure if 'sln' segfaulting under 'strace' could be related to 'make' segfaulting... what else could I try?
Comment 7 Sergei Trofimovich (RETIRED) gentoo-dev 2019-08-29 06:33:38 UTC
(In reply to Nuno from comment #6)
...
> Not sure if 'sln' segfaulting under 'strace' could be related to 'make'
> segfaulting... what else could I try?

Nice debugging! Yes, I think it's related. Static binaries are handled with ptrace() by emerge's sandbox (very similar to what strace uses).

As a workaround, you can try merging glibc with FEATURES="-sandbox -usersandbox" to check whether building without the sandbox makes the glibc ebuild work.
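
A rough sketch of what that one-off rebuild could look like. The FEATURES value and the package atom come from this bug; the use of `--oneshot` (to keep glibc out of the world file) and the `pkg` variable are assumptions added for illustration. The guard only prints a message on systems without Portage:

```shell
# Sketch: rebuild glibc once with Portage's sandbox features disabled.
pkg='=sys-libs/glibc-2.29-r2'   # atom from this bug report

if command -v emerge >/dev/null 2>&1; then
    # FEATURES set on the command line applies to this invocation only.
    FEATURES="-sandbox -usersandbox" emerge --oneshot "$pkg"
else
    echo "emerge not found; this only applies on a Gentoo system" >&2
fi
```

Setting FEATURES in the environment keeps the change local to this run, so the sandbox stays enabled for every other package.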
Comment 8 Nuno 2019-08-30 15:10:20 UTC
(In reply to Sergei Trofimovich from comment #7)
> Nice debugging! Yes, I think it's related. static binaries are handled with
> ptrace() by emerge's sandbox (very similar to what strace uses).
> 
> As a workaround you can try merging glibc with FEATURES="-sandbox
> -usersandbox" to check if lack of sandbox makes glibc ebuild work.

Indeed, I was able to build sys-libs/glibc-2.29-r2 with FEATURES="-sandbox -usersandbox":


[ebuild     U  ] sys-libs/glibc-2.29-r2:2.2::gentoo [2.28-r6:2.2::gentoo] USE="multiarch nscd (split-usr%*) ssp (-audit) -caps (-cet) (-compile-locales) -doc -gd -headers-only (-multilib) -profile (-selinux) -suid -systemtap -test (-vanilla)" 0 KiB

So is this an upstream bug on 'sln'?
Comment 9 Sergei Trofimovich (RETIRED) gentoo-dev 2019-08-30 18:46:35 UTC
(In reply to Nuno from comment #8)
> So is this an upstream bug on 'sln'?

I suspect it's a bug in Gentoo's sandbox (or a ptrace bug in the kernel). I suspect the sandbox somehow corrupts syscall parameters and brings sln into an inconsistent state.
Comment 10 matoro archtester 2020-02-18 21:39:16 UTC
I am seeing this on armv6j as well...more specifically, after bump from =sys-libs/glibc-2.30-r3 to -r4, as well as the same thing on =sys-libs/glibc-2.31-r1.  This leads me to think it may be a regression in the patchset from https://github.com/gentoo/gentoo/commit/9097e0e8399937b751ac38153f44a12f9f1c2b54 which was supposed to resolve https://bugs.gentoo.org/708758
Comment 11 Sergei Trofimovich (RETIRED) gentoo-dev 2020-02-19 09:17:29 UTC
(In reply to matoro from comment #10)
> I am seeing this on armv6j as well...more specifically, after bump from
> =sys-libs/glibc-2.30-r3 to -r4, as well as the same thing on
> =sys-libs/glibc-2.31-r1.  This leads me to think it may be a regression in
> the patchset from
> https://github.com/gentoo/gentoo/commit/
> 9097e0e8399937b751ac38153f44a12f9f1c2b54 which was supposed to resolve
> https://bugs.gentoo.org/708758

It might also be a glibc rebuild against new linux-headers that exposes a deficiency in our sandbox ptrace wrapper.

Do you also see SIGSEGV on openat()?
Comment 12 Sergei Trofimovich (RETIRED) gentoo-dev 2020-02-21 14:22:54 UTC
I tried to reproduce it on a smaller example in qemu-system-arm and failed. Things just work for me on the vexpress-a9 emulator.

I'll need more details from the affected systems:
1. kernel's .config
2. emerge --info
3. static 'sln' binary attached (if it crashes under sandbox)
Comment 13 Nuno 2020-02-22 23:41:25 UTC
Here's the files from my Banana Pi M2.

The kernel is from linux sunxi with a minor patch: https://github.com/linux-sunxi/linux-sunxi

Linux banana 3.4.104-sunxi-g1df3de8e #35 SMP Tue Jul 17 23:37:34 WEST 2018 armv7l sun7i GNU/Linux


banana ~ # emerge -pqv '=sys-libs/glibc-2.29-r7::gentoo'
[ebuild   R   ] sys-libs/glibc-2.29-r7  USE="multiarch nscd ssp (-audit) -caps (-cet) -compile-locales -doc -gd -headers-only (-multilib) -profile (-selinux) -suid -systemtap -test (-vanilla)"
Comment 14 Nuno 2020-02-22 23:47:11 UTC
Files are too big to attach.
Here's a link: https://drive.google.com/file/d/1lAvrZpPLQasVR_kkr3WJcXtoVaNZ7QN5/view?usp=sharing

sha1sum:
c1ab631f561e9dc1422e5a4980a29f757b71f4f0  bug-690600_banana-pi_glibc-2.29-r7.tar.gz
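
To check the download against the checksum above, something along these lines should work, assuming the archive has been saved under the same file name in the current directory (the variable names are illustrative):

```shell
# Verify the downloaded archive against the SHA-1 posted in this comment.
expected=c1ab631f561e9dc1422e5a4980a29f757b71f4f0
file=bug-690600_banana-pi_glibc-2.29-r7.tar.gz

if [ -f "$file" ]; then
    # sha1sum -c reads "<hash>  <name>" lines; "-" means standard input.
    echo "$expected  $file" | sha1sum -c -
else
    echo "$file not found; download it first" >&2
fi
```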
Comment 15 Nuno 2020-02-22 23:48:06 UTC
(In reply to Nuno from comment #13)
> Here's the files from my Banana Pi M2.
I meant Banana Pi M1, not M2.