Summary: | sys-libs/glibc-2.29-r2 fails to build with segmentation fault on ARM (ptrace?) | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Nuno <can.ecodo.nu.n.o+bugs.gentoo> |
Component: | Current packages | Assignee: | Gentoo Toolchain Maintainers <toolchain> |
Status: | RESOLVED OBSOLETE | ||
Severity: | normal | CC: | herrtimson, matoro_bugzilla_gentoo |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | ARM | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: |
build.log.gz (13MB uncompressed)
environment.gz emerge.info |
Created attachment 584304 [details]
environment.gz
Created attachment 584306 [details]
emerge.info
emerge --info '=sys-libs/glibc-2.29-r2::gentoo' > emerge.info
emerge -pqv '=sys-libs/glibc-2.29-r2::gentoo'
[ebuild U ] sys-libs/glibc-2.29-r2 [2.28-r6] USE="multiarch nscd ssp (-audit) -caps (-cet) (-compile-locales) -doc -gd -headers-only (-multilib) -profile (-selinux) -suid -systemtap -test (-vanilla)"
> [Jul24 04:13] make/558: potentially unexpected fatal signal 11.
> [ +0.003849] Pid: 558, comm: make
> [ +0.004715] CPU: 0 Not tainted (3.4.104-sunxi-g1df3de8e #35)
Looks like most interesting stuff has happened before. Can you attach full 'dmesg'? Is crash reproducible?
(In reply to Sergei Trofimovich from comment #3)
> > [Jul24 04:13] make/558: potentially unexpected fatal signal 11.
> > [ +0.003849] Pid: 558, comm: make
> > [ +0.004715] CPU: 0 Not tainted (3.4.104-sunxi-g1df3de8e #35)
>
> Looks like most interesting stuff has happened before. Can you attach full
> 'dmesg'? Is crash reproducible?

There is nothing interesting in dmesg before that; besides UFW and w1_slave_driver lines, the only thing besides what I already posted is:

[Jul24 01:43] sysctl: The scan_unevictable_pages sysctl/node-interface has been disabled for lack of a legitimate use case. If you have one, please send an email to linux-mm@kvack.org.

Note that the dmesg output I posted is from `dmesg -H`, so timestamps beginning with '+' are relative to the previous absolute timestamp, not to system boot.

I can reproduce the crash every time on this system (Banana Pi M1), but I am not able to reproduce it on my Raspberry Pi or in an armv7a-unknown-linux-gnueabihf chroot.

I am happy to follow additional steps to debug this, I just don't know what else to try :/

Can you extract the exact command that crashes? If you run 'make' manually in a build directory, for example. It would be nice to extract a minimal example to explore and debug the cause of the crash.

(In reply to Sergei Trofimovich from comment #5)
> can you extract exact command that crashes? If you run 'make' manually in a
> build directory for example. Would be nice to extract minimal example to
> explore and debug the cause of crash.
I tried the following and none of them crashed:

* as portage: make -j1 -l2 -C /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl
* as root: cd /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl; make -j1 -l2 install_root=/var/tmp/portage/sys-libs/glibc-2.29-r2/image/ install

The latter ends with the following lines:

/var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list
rm -f /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list
make[1]: Leaving directory '/var/tmp/portage/sys-libs/glibc-2.29-r2/work/glibc-2.29'

Note that, in build.log, make segfaults after running sln.

---

So, I tried running the command that leads to the segmentation fault (from build.log):

portage@banana:~/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl$ strace /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list
execve("/var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln", ["/var/tmp/portage/sys-libs/glibc-"..., "/var/tmp/portage/sys-libs/glibc-"...], 0xbedf41b4 /* 33 vars */) = 0
brk(NULL) = 0x10b4000
brk(0x10b4d08) = 0x10b4d08
set_tls(0x10b44c0) = 0
uname({sysname="Linux", nodename="banana", ...}) = 0
readlink("/proc/self/exe", "/var/tmp/portage/sys-libs/glibc-"..., 4096) = 98
brk(0x10d5d08) = 0x10d5d08
brk(0x10d6000) = 0x10d6000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x1} ---
+++ killed by SIGSEGV +++

I'm assuming it should run as portage because all the files in work/build-arm-armv7a-unknown-linux-gnueabihf-nptl are owned by portage. However, I previously ran the same command as root and it succeeded...

Running with gdb:

portage@banana:~/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl$ gdb /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln
GNU gdb (Gentoo 8.1 p1) 8.1
(...)
Reading symbols from /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln...done.
(gdb) run /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list
Starting program: /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list
Invalid link from "ld-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/ld-linux-armhf.so.3": Permission denied
Invalid link from "libc-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libc.so.6": Permission denied
Invalid link from "libBrokenLocale-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libBrokenLocale.so.1": Permission denied
Invalid link from "libm-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libm.so.6": Permission denied
Invalid link from "libdl-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libdl.so.2": Permission denied
Invalid link from "libc-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libc.so.6": Permission denied
Invalid link from "libpthread-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libpthread.so.0": Permission denied
Invalid link from "librt-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/librt.so.1": Permission denied
Invalid link from "libcrypt-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libcrypt.so.1": Permission denied
Invalid link from "libthread_db-1.0.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libthread_db.so.1": Permission denied
Invalid link from "libresolv-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libresolv.so.2": Permission denied
Invalid link from "libnss_dns-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnss_dns.so.2": Permission denied
Invalid link from "libanl-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libanl.so.1": Permission denied
Invalid link from "libnss_files-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnss_files.so.2": Permission denied
Invalid link from "libnss_db-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnss_db.so.2": Permission denied
Invalid link from "libnss_compat-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnss_compat.so.2": Permission denied
Invalid link from "libnss_hesiod-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnss_hesiod.so.2": Permission denied
Invalid link from "libnsl-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libnsl.so.1": Permission denied
Invalid link from "libutil-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/libutil.so.1": Permission denied
Invalid link from "ld-2.29.so" to "/var/tmp/portage/sys-libs/glibc-2.29-r2/image//lib/ld-linux-armhf.so.3": Permission denied
[Inferior 1 (process 17547) exited with code 01]

/var/tmp/portage/sys-libs/glibc-2.29-r2/image/ is owned by root, so that explains the "Permission denied" errors. It doesn't explain the segmentation fault under strace.

---

I then recompiled the package with CFLAGS="-Og -ggdb (...)" and got the same output from gdb and strace. I was able to get a core dump, but it contains nothing useful:

portage@banana:~/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl$ gdb -c core
GNU gdb (Gentoo 8.1 p1) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "armv7a-unknown-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
[New LWP 16745]
Core was generated by `/var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnu'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00037828 in ?? ()
(gdb) symbol-file /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln
Reading symbols from /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln...done.
(gdb) bt
#0 0x00037828 in __open_nocancel (file=<optimized out>, oflag=-1946596096) at ../sysdeps/unix/sysv/linux/open_nocancel.c:42
#1 0x00404000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

Not sure if 'sln' segfaulting under 'strace' could be related to 'make' segfaulting... what else could I try?

(In reply to Nuno from comment #6)
...
> Not sure if 'sln' segfaulting under 'strace' could be related to 'make'
> segfaulting... what else could I try?

Nice debugging! Yes, I think it's related. Static binaries are handled with ptrace() by emerge's sandbox (very similar to what strace uses).

As a workaround you can try merging glibc with FEATURES="-sandbox -usersandbox" to check if lack of sandbox makes glibc ebuild work.

(In reply to Sergei Trofimovich from comment #7)
> Nice debugging! Yes, I think it's related.
> Static binaries are handled with
> ptrace() by emerge's sandbox (very similar to what strace uses).
>
> As a workaround you can try merging glibc with FEATURES="-sandbox
> -usersandbox" to check if lack of sandbox makes glibc ebuild work.

Indeed, I was able to build sys-libs/glibc-2.29-r2 with FEATURES="-sandbox -usersandbox":

[ebuild U ] sys-libs/glibc-2.29-r2:2.2::gentoo [2.28-r6:2.2::gentoo] USE="multiarch nscd (split-usr%*) ssp (-audit) -caps (-cet) (-compile-locales) -doc -gd -headers-only (-multilib) -profile (-selinux) -suid -systemtap -test (-vanilla)" 0 KiB

So is this an upstream bug on 'sln'?

(In reply to Nuno from comment #8)
> So is this an upstream bug on 'sln'?

I suspect it's a Gentoo sandbox bug (or a ptrace kernel bug). I suspect sandbox manages to corrupt syscall parameters somehow and brings sln into an inconsistent state.

I am seeing this on armv6j as well...more specifically, after bump from =sys-libs/glibc-2.30-r3 to -r4, as well as the same thing on =sys-libs/glibc-2.31-r1. This leads me to think it may be a regression in the patchset from https://github.com/gentoo/gentoo/commit/9097e0e8399937b751ac38153f44a12f9f1c2b54 which was supposed to resolve https://bugs.gentoo.org/708758

(In reply to matoro from comment #10)
> I am seeing this on armv6j as well...more specifically, after bump from
> =sys-libs/glibc-2.30-r3 to -r4, as well as the same thing on
> =sys-libs/glibc-2.31-r1. This leads me to think it may be a regression in
> the patchset from
> https://github.com/gentoo/gentoo/commit/
> 9097e0e8399937b751ac38153f44a12f9f1c2b54 which was supposed to resolve
> https://bugs.gentoo.org/708758

It might also be a glibc rebuild against new linux-headers that exposes a deficiency in our sandbox ptrace wrapper. Do you also see SIGSEGV on openat()?

I tried to reproduce it on a smaller example in qemu-system-arm and failed. Things just work for me on the vexpress-a9 emulator. I'll need more details from the affected systems:
1. kernel's .config
2. emerge --info
3. static 'sln' binary attached (if it crashes under sandbox)

Here are the files from my Banana Pi M2. The kernel is from linux sunxi with a minor patch: https://github.com/linux-sunxi/linux-sunxi

Linux banana 3.4.104-sunxi-g1df3de8e #35 SMP Tue Jul 17 23:37:34 WEST 2018 armv7l sun7i GNU/Linux

banana ~ # emerge -pqv '=sys-libs/glibc-2.29-r7::gentoo'
[ebuild R ] sys-libs/glibc-2.29-r7 USE="multiarch nscd ssp (-audit) -caps (-cet) -compile-locales -doc -gd -headers-only (-multilib) -profile (-selinux) -suid -systemtap -test (-vanilla)"

Files are too big to attach. Here's a link: https://drive.google.com/file/d/1lAvrZpPLQasVR_kkr3WJcXtoVaNZ7QN5/view?usp=sharing

sha1sum:
c1ab631f561e9dc1422e5a4980a29f757b71f4f0 bug-690600_banana-pi_glibc-2.29-r7.tar.gz

(In reply to Nuno from comment #13)
> Here's the files from my Banana Pi M2.

I meant Banana Pi M1, not M2.
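The observation above, that sln exits normally when run directly but dies with SIGSEGV when traced, can be checked in one step. The following is only a sketch: `compare_runs` is a hypothetical helper name, and it assumes the tracer passes the traced program's exit status through (strace does this, re-raising the fatal signal on itself when the child is killed by one).

```shell
# compare_runs TRACER CMD [ARGS...]
# Run CMD once directly and once under TRACER, then print both exit statuses.
# TRACER is deliberately left unquoted so it may carry its own options.
compare_runs() {
    tracer=$1
    shift
    "$@" >/dev/null 2>&1
    plain=$?
    $tracer "$@" >/dev/null 2>&1
    traced=$?
    echo "plain=$plain traced=$traced"
}

# Example for the case in this bug (run from the glibc build directory):
#   compare_runs "strace -f -o /tmp/sln.trace" ./elf/sln ./elf/symlink.list
```

A traced status of 139 next to a different plain status would match what was seen on the Banana Pi; identical statuses would suggest the tracer is not the trigger on that machine.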
Created attachment 584302 [details]
build.log.gz (13MB uncompressed)

sys-libs/glibc-2.29-r2 fails to build with segmentation fault

/var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/sln /var/tmp/portage/sys-libs/glibc-2.29-r2/work/build-arm-armv7a-unknown-linux-gnueabihf-nptl/elf/symlink.list
make[1]: *** [Makefile:106: install-symbolic-link] Error 139
make[1]: Leaving directory '/var/tmp/portage/sys-libs/glibc-2.29-r2/work/glibc-2.29'
make: *** [Makefile:12: install] Error 2
 * ERROR: sys-libs/glibc-2.29-r2::gentoo failed (install phase):
 *   emake failed

[Jul24 04:13] make/558: potentially unexpected fatal signal 11.
[ +0.003849] Pid: 558, comm: make
[ +0.004715] CPU: 0 Not tainted (3.4.104-sunxi-g1df3de8e #35)
[ +0.001881] PC is at 0xb6dd0b0c
[ +0.001976] LR is at 0xcbdf3c00
[ +0.008888] pc : [<b6dd0b0c>] lr : [<cbdf3c00>] psr: 800d0010
[ +0.000007] sp : be864e10 ip : 00000001 fp : be864fc8
[ +0.003965] r10: b6e74c00 r9 : b6e74f00 r8 : b7823e58
[ +0.005219] r7 : 000000c0 r6 : 00000010 r5 : 00000000 r4 : ffffffff
[ +0.005235] r3 : b6e93000 r2 : cbdf3c00 r1 : 00000010 r0 : b6e93000
[ +0.005988] Flags: Nzcv IRQs on FIQs on Mode USER_32 ISA ARM Segment user
[ +0.004478] Control: 10c5387d Table: 473b806a DAC: 00000015
[ +0.001671] LR: 0xcbdf3b80:
[ +0.000447] 3b80 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 75f2fbd5 75f2fbd5 11fef3d0
[ +0.007117] 3ba0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[ +0.007179] 3bc0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 75f5f5d5
[ +0.006974] 3be0 11fef3d0 33f6fddb 11fef3d0 11fef3d0 11fef3d0 75f2fbd5 75f2fbd5 75f2fbd5
[ +0.007209] 3c00 75f2fbd5 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[ +0.007029] 3c20 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[ +0.006967] 3c40 11fef3d0 11fef3d0 11fef3d0 75f5f5d5 11fef3d0 33f6fddb 11fef3d0 11fef3d0
[ +0.007153] 3c60 11fef3d0 75f2fbd5 75f2fbd5 75f2fbd5 75f2fbd5 11fef3d0 11fef3d0 11fef3d0
[ +0.008226] R2: 0xcbdf3b80:
[ +0.000447] 3b80 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 75f2fbd5 75f2fbd5 11fef3d0
[ +0.007060] 3ba0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[ +0.006992] 3bc0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 75f5f5d5
[ +0.007094] 3be0 11fef3d0 33f6fddb 11fef3d0 11fef3d0 11fef3d0 75f2fbd5 75f2fbd5 75f2fbd5
[ +0.007116] 3c00 75f2fbd5 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[ +0.007066] 3c20 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0 11fef3d0
[ +0.006953] 3c40 11fef3d0 11fef3d0 11fef3d0 75f5f5d5 11fef3d0 33f6fddb 11fef3d0 11fef3d0
[ +0.006988] 3c60 11fef3d0 75f2fbd5 75f2fbd5 75f2fbd5 75f2fbd5 11fef3d0 11fef3d0 11fef3d0
[ +0.014973] [<c0014508>] (unwind_backtrace+0x0/0xec) from [<c00461e0>] (get_signal_to_deliver+0x4e0/0x588)
[ +0.007944] [<c00461e0>] (get_signal_to_deliver+0x4e0/0x588) from [<c0010e78>] (do_signal+0x68/0x4f0)
[ +0.007324] [<c0010e78>] (do_signal+0x68/0x4f0) from [<c00115e8>] (do_notify_resume+0x50/0x5c)
[ +0.007502] [<c00115e8>] (do_notify_resume+0x50/0x5c) from [<c000dff8>] (work_pending+0x24/0x28)
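
The "Error 139" from make and the "fatal signal 11" in dmesg above fit together under the usual shell convention that a status above 128 means "killed by signal (status - 128)". A minimal sketch of that decoding follows; `describe_status` is a hypothetical helper name, not part of portage or make.

```shell
# describe_status STATUS
# Turn a shell-style exit status into a readable label.
# Statuses above 128 conventionally mean "killed by signal (STATUS - 128)".
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l <number> prints the signal name without the SIG prefix.
        echo "killed by SIG$(kill -l "$((status - 128))")"
    else
        echo "exited with code $status"
    fi
}

describe_status 139   # status make reported for the install-symbolic-link step
describe_status 2     # status of the outer make
```

On Linux, where signal 11 is SIGSEGV, the first call reports a SIGSEGV, while the outer "Error 2" is just make's ordinary failure status.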