Summary: | sys-libs/glibc-2.32-r3: SIGSEGV on adjtime(NULL,...) (was: net-misc/openntpd-6.2_p3-r2 segfaults on start on x86) | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Tobias Leupold <tl> |
Component: | Current packages | Assignee: | Gentoo Toolchain Maintainers <toolchain> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | bugs.gentoo, proxy-maint, sam, vereecke.jan |
Priority: | Normal | Keywords: | PATCH |
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
URL: | https://sourceware.org/PR26833 | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: |
'strace' of failing OpenNTPD process
Patch for bug in glibc function adjtime |
Description
Tobias Leupold
2020-11-24 12:52:42 UTC
Able to get a gdb backtrace? (https://wiki.gentoo.org/wiki/Debugging)

This seems to be a bit complicated (although I'm not very experienced with gdb), as the daemon forks:

(gdb) run
Starting program: /usr/sbin/ntpd -d
[Detaching after fork from child process 3389]
ntp engine ready
Terminating
[Inferior 1 (process 3385) exited normally]

I also tried this, but it does not seem to be much better:

(gdb) set follow-fork-mode child
(gdb) run
Starting program: /usr/sbin/ntpd -d
[Attaching after process 3395 fork to child process 3399]
[New inferior 2 (process 3399)]
[Detaching after fork from parent process 3395]
[Inferior 1 (process 3395) detached]
process 3399 is executing new program: /usr/sbin/ntpd
[Attaching after process 3399 fork to child process 3400]
[New inferior 3 (process 3400)]
[Detaching after fork from parent process 3399]
[Inferior 2 (process 3399) detached]
ntp engine ready
process 3400 is executing new program: /usr/sbin/ntpd
Terminating
[Inferior 3 (process 3400) exited normally]

Linking thread about this for reference: https://forums.gentoo.org/viewtopic-t-1122620.html

I was able to reproduce in a fresh unmodified stage3-i686-20201116T214503Z, but didn't investigate further.

I'm also running into this on two 32-bit ARM systems and a 32-bit x86 system. It seems to work fine on my 64-bit system. I've not managed to get a stack trace or a core dump, despite using 'ulimit -c unlimited' and '/proc/sys/kernel/core_pattern'. It looks like the daemon is spawning threads and one of them dies shortly after running 'clock_gettime(CLOCK_REALTIME, {tv_sec=1606433599, tv_nsec=15663478}) = 0'. Will attach an 'strace' dump.

Created attachment 675985 [details]
'strace' of failing OpenNTPD process
I narrowed down the problem to a bug in the implementation of the function adjtime in sys-libs/glibc-2.32-r2. The bug is in the file sysdeps/unix/sysv/linux/adjtime.c, and the problem code is new in recent versions of glibc. The code in sys-libs/glibc-2.32-r2 does not allow the first parameter of adjtime to be NULL. That goes against the documented behaviour of adjtime (see man adjtime) and is something which openntpd requires.

I patched sys-libs/glibc-2.32-r2 so adjtime behaves as expected and confirmed that the patch works (i.e. prevents openntpd from segfaulting). I also checked the latest development source code for glibc available at https://www.gnu.org/software/libc/sources.html and there is a fix for this bug there.

Created attachment 676354 [details, diff]
Patch for bug in glibc function adjtime
This patch is almost identical to what already exists upstream for glibc. Also note that the code in question is specific to 32-bit hosts.
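For reference, the call pattern at issue can be sketched in a few lines of C. This is only an illustration, not code taken from openntpd or glibc: the helper names query_pending_adjustment and report_pending_adjustment are hypothetical. Passing NULL as the first argument asks adjtime to leave the clock alone and only report the adjustment still outstanding from an earlier call, which is the usage described in man adjtime.

```c
#define _DEFAULT_SOURCE   /* exposes adjtime() in <sys/time.h> on glibc */
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical helper (not from openntpd): read the adjustment that is
 * still pending from a previous adjtime() call without changing the clock.
 * Returns 0 on success, -1 on failure with errno set. */
int query_pending_adjustment(struct timeval *pending)
{
    /* First argument NULL: request no new adjustment, only a read-back. */
    return adjtime(NULL, pending);
}

/* Hypothetical helper: print the pending adjustment, or the error. */
void report_pending_adjustment(void)
{
    struct timeval pending;

    if (query_pending_adjustment(&pending) == -1) {
        perror("adjtime");
        return;
    }
    printf("pending adjustment: %ld s %ld us\n",
           (long)pending.tv_sec, (long)pending.tv_usec);
}
```

On an affected 32-bit build, a call such as report_pending_adjustment() would be expected to die with SIGSEGV inside adjtime; with the patch applied it should either print the pending adjustment or report an error through errno.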
See also: https://sourceware.org/bugzilla/show_bug.cgi?id=26833

I guess this bug should be reassigned to the glibc folks.

(In reply to acmondor from comment #6)
> I narrowed down the problem to a bug in the implementation of the function
> adjtime in sys-libs/glibc-2.32-r2. The bug is in the file
> sysdeps/unix/sysv/linux/adjtime.c and the problem code is something new for
> recent versions of glibc. The code in sys-libs/glibc-2.32-r2 does not allow
> the first parameter of adjtime to be NULL. That goes against the documented
> behaviour of adjtime (see man adjtime) and is something which openntpd
> requires.
>
> I patched sys-libs/glibc-2.32-r2 so adjtime behaves as expected and
> confirmed that the patch does work (i.e. prevents openntpd from segfaulting).
> I also checked the latest development source code for glibc available at
> https://www.gnu.org/software/libc/sources.html and there is a fix for this
> bug there.

Thanks for the analysis. It usually speeds things up to link to the upstream fix:
https://sourceware.org/PR26833

The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=973c2804ea92b4acb82e0d61ae9179b0fc15db74

commit 973c2804ea92b4acb82e0d61ae9179b0fc15db74
Author: Sergei Trofimovich <slyfox@gentoo.org>
AuthorDate: 2020-12-07 22:47:32 +0000
Commit: Sergei Trofimovich <slyfox@gentoo.org>
CommitDate: 2020-12-08 00:37:18 +0000

    sys-libs/glibc: 2.32: cut 3 patchset

    This patchset includes a few upstream backports:
    - sh: Add sh4 fpu Implies folder
    - linux: Allow adjtime with NULL argument [BZ #26833]
    - __vfscanf_internal: fix aliasing violation (bug 26690)
    - iconv: Accept redundant shift sequences in IBM1364 [BZ #26224]
    - aarch64: Add unwind information to _start (bug 26853)
    - support: Provide a way to reorder responses within the DNS test server
    - support: Provide a way to clear the RA bit in DNS server responses
    - resolv: Handle transaction ID collisions in parallel queries (bug 26600)
    - resolv: Serialize processing in resolv/tst-resolv-txnid-collision
    - struct _Unwind_Exception alignment should not depend on compiler flags
    - Remove __warn_memset_zero_len [BZ #25399]
    - Remove __warndecl
    - aarch64: Fix DT_AARCH64_VARIANT_PCS handling [BZ #26798]

    Two of them are special (noticed by Gentoo users):
    - "linux: Allow adjtime with NULL argument [BZ #26833]" fixes
      openntpd startup failure.
    - "__vfscanf_internal: fix aliasing violation (bug 26690)" fixes
      gcc-11 compatibility.

    Reported-by: Tobias Leupold
    Bug: https://bugs.gentoo.org/756316
    Bug: https://sourceware.org/PR26833
    Reported-by: andy
    Bug: https://bugs.gentoo.org/750992
    Bug: https://sourceware.org/PR26690
    Package-Manager: Portage-3.0.11, Repoman-3.0.2
    Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>

 sys-libs/glibc/Manifest | 1 +
 sys-libs/glibc/glibc-2.32-r4.ebuild | 1513 +++++++++++++++++++++++++++++++++++
 2 files changed, 1514 insertions(+)

(In reply to acmondor from comment #6)
> I narrowed down the problem to a bug in the implementation of the function
> adjtime in sys-libs/glibc-2.32-r2. The bug is in the file
> sysdeps/unix/sysv/linux/adjtime.c and the problem code is something new for
> recent versions of glibc. The code in sys-libs/glibc-2.32-r2 does not allow
> the first parameter of adjtime to be NULL. That goes against the documented
> behaviour of adjtime (see man adjtime) and is something which openntpd
> requires.
>
> I patched sys-libs/glibc-2.32-r2 so adjtime behaves as expected and
> confirmed that the patch does work (i.e. prevents openntpd from segfaulting).
> I also checked the latest development source code for glibc available at
> https://www.gnu.org/software/libc/sources.html and there is a fix for this
> bug there.

I'll second what Sergei said: thank you for finding this!

Let's declare it fixed in ~arch. The next glibc stabilization round should have it fixed for everyone.