So after today's system upgrade I have distcc segfaulting a lot. I've been able to pin it down to building glibc-2.25-r3 with gcc-7.1.0 (if I build it with 5.4.0, it is fine). The segv is related to using sys-auth/nss-mdns-9999; I've tried rebuilding it with gcc-7.1.0 to confirm it's not some incompatibility, and it fails the same way in both cases.

Core was generated by `/usr/lib64/distcc/bin/x86_64-pc-linux-gnu-gcc-7.1.0 -I/var/tmp/portage/dev-util'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __memcmp_sse2 () at ../sysdeps/x86_64/multiarch/../memcmp.S:74
74	../sysdeps/x86_64/multiarch/../memcmp.S: No such file or directory.
(gdb) bt
#0  __memcmp_sse2 () at ../sysdeps/x86_64/multiarch/../memcmp.S:74
#1  0x00007f41f9dcacbd in in6aicmp (p2=0x14b3908, p1=0x0) at ../sysdeps/posix/getaddrinfo.c:1757
#2  __GI_bsearch (__compar=0x7f41f9dc7710 <in6aicmp>, __size=24, __nmemb=<optimized out>, __base=<optimized out>, __key=0x0) at ../bits/stdlib-bsearch.h:33
#3  __GI_getaddrinfo (name=<optimized out>, service=<optimized out>, hints=<optimized out>, pai=0x7fff330ba998) at ../sysdeps/posix/getaddrinfo.c:2523
#4  0x0000000000403b3b in ?? ()
#5  0x0000000000405b01 in ?? ()
#6  0x0000000000405170 in ?? ()
#7  0x0000000000403428 in ?? ()
#8  0x00007f41f9d073fa in __libc_start_main (main=0x403080, argc=21, argv=0x7fff330badb8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff330bada8) at ../csu/libc-start.c:295
#9  0x000000000040348a in ?? ()

I'm going to investigate further.
(In reply to Michał Górny from comment #0)
> So after today's system upgrade I have distcc segfaulting a lot. I've been
> able to pin it down to building glibc-2.25-r3 with gcc-7.1.0 (if I build it
> with 5.4.0, it is fine). The segv is related to using
> sys-auth/nss-mdns-9999; I've tried rebuilding it with gcc-7.1.0 to confirm
> it's not some incompatibility, and in both cases it fails the same.
>
>
> Core was generated by `/usr/lib64/distcc/bin/x86_64-pc-linux-gnu-gcc-7.1.0
> -I/var/tmp/portage/dev-util'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 __memcmp_sse2 () at ../sysdeps/x86_64/multiarch/../memcmp.S:74
> 74 ../sysdeps/x86_64/multiarch/../memcmp.S: No such file or
> directory.
> (gdb) bt
> #0 __memcmp_sse2 () at ../sysdeps/x86_64/multiarch/../memcmp.S:74
> #1 0x00007f41f9dcacbd in in6aicmp (p2=0x14b3908, p1=0x0) at
> ../sysdeps/posix/getaddrinfo.c:1757
> #2 __GI_bsearch (__compar=0x7f41f9dc7710 <in6aicmp>, __size=24,
> __nmemb=<optimized out>, __base=<optimized out>, __key=0x0)
> at ../bits/stdlib-bsearch.h:33

This looks like a bsearch(key=NULL) crash on the IPv6 address list.

The crash happens in the comparison helper:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/posix/getaddrinfo.c;h=43eb31365ed10059bb6e1147af197ed54550e6c5;hb=db0242e3023436757bbc7c488a779e6e3343db04#l1752

The key should not be NULL, as it is a pointer to an on-stack variable, but perhaps something else caused stack/register corruption:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/posix/getaddrinfo.c;h=43eb31365ed10059bb6e1147af197ed54550e6c5;hb=db0242e3023436757bbc7c488a779e6e3343db04#l2494

> #3 __GI_getaddrinfo (name=<optimized out>, service=<optimized out>,
> hints=<optimized out>, pai=0x7fff330ba998)
> at ../sysdeps/posix/getaddrinfo.c:2523
> #4 0x0000000000403b3b in ?? ()
> #5 0x0000000000405b01 in ?? ()
> #6 0x0000000000405170 in ?? ()
> #7 0x0000000000403428 in ?? ()

This looks like missing debug symbols for the actual program that crashed. I take it that's distcc, which might shed some light on how exactly getaddrinfo() is called.
Well, I've lost some hair trying to figure this out, and it still looks like a problem inside glibc. The crash seems to happen in (reindented to avoid bugzie wrapping):

2523  struct in6addrinfo *found
2524    = bsearch (&tmp, in6ai, in6ailen, sizeof (*in6ai),
2525               in6aicmp);

(gdb) p &tmp
$71 = (struct in6addrinfo *) 0x7fffffffd720
(gdb) p in6ai
$72 = (struct in6addrinfo *) 0x555555757cb8
(gdb) p in6ailen
$73 = 4
(gdb) p sizeof(*in6ai)
$74 = 24
(gdb) p in6aicmp
$75 = {int (const void *, const void *)} 0x7ffff7b07710 <in6aicmp>

However, after stepping into the function:

(gdb) down
#0  __GI_bsearch (__compar=0x7ffff7b07710 <in6aicmp>, __size=24, __nmemb=4, __base=0x555555757cb8, __key=0x0) at ../bits/stdlib-bsearch.h:27
27	  __l = 0;

i.e. key somehow becomes 0x0.
What does your /etc/nsswitch.conf look like?
Tried to reproduce locally as:

$ cat /etc/nsswitch.conf
# /etc/nsswitch.conf:
# $Header: /var/cvsroot/gentoo/src/patchsets/glibc/extra/etc/nsswitch.conf,v 1.2 2017/08/12 16:21:44 slyfox Exp $

passwd:      compat files
shadow:      compat files
group:       compat files

#hosts:       files dns
hosts:       files mdns4_minimal [NOTFOUND=return] dns mdns4
networks:    files dns

services:    db files
protocols:   db files
rpc:         db files
ethers:      db files
netmasks:    files
netgroup:    files
bootparams:  files

automount:   files
aliases:     files

$ getent ahostsv6 sf.local
::ffff:192.168.1.200 STREAM sf.local
::ffff:192.168.1.200 DGRAM
::ffff:192.168.1.200 RAW

Seems to work on small clients.
hosts: files mdns_minimal [NOTFOUND=return] dns

By using -4 you disable ipv6, and I can reproduce it only with a v4+v6 address (not sure if it's just because there are two addresses or because it's v4+v6).
(In reply to Michał Górny from comment #5)
> hosts: files mdns_minimal [NOTFOUND=return] dns
>
> By using -4 you disable ipv6, and I reproduce it only with v4+v6 address
> (not sure if it's just two addresses or because it's v4+v6).

[sf] ~:LANG=C getent ahosts sf.local
Segmentation fault (core dumped)

\o/
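For anyone who wants to exercise the same code path without getent or distcc, a small getaddrinfo() caller can be sketched as below. This is an assumption-laden illustration, not the reproducer used in this bug: the helper name count_addresses and the default host name are made up, and whether a given .local name triggers the crash depends on the NSS configuration discussed above.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve `name` for both IPv4 and IPv6 and print
   each address. Returns the number of results, or -1 on failure. */
static int count_addresses(const char *name, int flags)
{
    assert(name != NULL);

    struct addrinfo hints;
    struct addrinfo *res = NULL;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* ask for v4 and v6 together */
    hints.ai_socktype = SOCK_STREAM;  /* one entry per address */
    hints.ai_flags = flags;

    int rc = getaddrinfo(name, NULL, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo(%s): %s\n", name, gai_strerror(rc));
        return -1;
    }

    int n = 0;
    for (struct addrinfo *ai = res; ai != NULL; ai = ai->ai_next) {
        char host[256];
        if (getnameinfo(ai->ai_addr, ai->ai_addrlen, host, sizeof host,
                        NULL, 0, NI_NUMERICHOST) == 0)
            printf("%s\n", host);
        n++;
    }
    freeaddrinfo(res);
    return n;
}

int main(int argc, char **argv)
{
    /* Pass the host name to look up as the first argument. */
    const char *name = argc > 1 ? argv[1] : "localhost";
    return count_addresses(name, 0) < 0 ? 1 : 0;
}
```

Requesting AF_UNSPEC matters here, since the crash was only observed when both a v4 and a v6 address came back for the same name.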
The happiness of breaking your system -- except I suppose you don't use MDNS, so your system isn't really broken ;-). But at least we know it's not specific to my system.
(In reply to Michał Górny from comment #7)
> The happiness of breaking your system -- except I suppose you don't use
> MDNS, so your system isn't really broken ;-). But at least we know it's not
> specific to my system.

I think I've found your buffer overflow in sys-auth/nss-mdns-9999. This seems to fix it:

diff --git a/src/nss.c b/src/nss.c
index ebb887c..7fee410 100644
--- a/src/nss.c
+++ b/src/nss.c
@@ -272,9 +272,9 @@ enum nss_status _nss_mdns_gethostbyname4_r(
         // Copy address
         memcpy(&(tuple->addr), &(u.data.result[i].address), address_length);
         if(address_length < sizeof(ipv6_address_t)) {
-            memset((&(tuple->addr) + address_length - sizeof(ipv6_address_t)), 0,
+            memset(((char*)&(tuple->addr) + address_length - sizeof(ipv6_address_t)), 0,
                 (sizeof(ipv6_address_t) - address_length)
             );
         }

tuple->addr is declared as 'uint32_t addr[4]', not as a byte array:
https://github.molgen.mpg.de/git-mirror/glibc/blob/20003c49884422da7ffbc459cdeee768a6fee07b/nss/nss.h#L47

Pointer arithmetic on &(tuple->addr) therefore advances in units of sizeof(uint32_t[4]) (16 bytes) instead of single bytes. As a result memset() wipes out a random part of the stack, because glibc keeps small allocations like this one on the stack.

I've found the overflow by tweaking the scratch buffer size so that it always overflows to malloc():

diff --git a/include/scratch_buffer.h b/include/scratch_buffer.h
index dd17a4a7e1..c28f48c978 100644
--- a/include/scratch_buffer.h
+++ b/include/scratch_buffer.h
@@ -66,7 +66,7 @@
 struct scratch_buffer {
   void *data;    /* Pointer to the beginning of the scratch area.  */
   size_t length; /* Allocated space at the data pointer, in bytes.  */
-  char __space[1024]
+  char __space[4]
     __attribute__ ((aligned (__alignof__ (max_align_t))));
 };

With that change, valgrind reports precisely where the out-of-bounds write happens:

==12732== Command: /home/slyfox/dev/git/glibc-build/elf/ld-linux-x86-64.so.2 --inhibit-cache --library-path /home/slyfox/dev/git/glibc-build:/home/slyfox/dev/git/glibc-build/resolv:/home/slyfox/dev/git/glibc-build/nss:/tmp/portage/sys-auth/nss-mdns-9999/work/nss-mdns-9999/src/.libs ./a sf.local 0 22
==12732== Invalid write of size 1
==12732==    at 0x4C11A29: memset (vg_replace_strmem.c:1239)
==12732==    by 0x57FA348: _nss_mdns_minimal_gethostbyname4_r (nss.c:292)
==12732==    by 0x4F016D8: gaih_inet.constprop.7 (getaddrinfo.c:806)
==12732==    by 0x4F02673: getaddrinfo (getaddrinfo.c:2317)
==12732==    by 0x4800B3B: main (a.c:34)
Created attachment 489106
nss-mdns-9999-src-nss.c-fix-out-of-bounds-memset.patch

nss-mdns-9999-src-nss.c-fix-out-of-bounds-memset.patch should fix the OOB
(In reply to Sergei Trofimovich from comment #9)
> Created attachment 489106
> nss-mdns-9999-src-nss.c-fix-out-of-bounds-memset.patch
>
> nss-mdns-9999-src-nss.c-fix-out-of-bounds-memset.patch should fix the OOB

Seems to work now:

$ getent ahosts sf.local
fe80::20d:81ff:fea9:990 STREAM sf.local
fe80::20d:81ff:fea9:990 DGRAM
fe80::20d:81ff:fea9:990 RAW
192.168.1.200 STREAM
192.168.1.200 DGRAM
192.168.1.200 RAW

Proposed upstream as: https://github.com/lathiat/nss-mdns/pull/23
(In reply to Sergei Trofimovich from comment #10)
> Seems to work now:
> $ getent ahosts sf.local
> fe80::20d:81ff:fea9:990 STREAM sf.local
> fe80::20d:81ff:fea9:990 DGRAM
> fe80::20d:81ff:fea9:990 RAW
> 192.168.1.200 STREAM
> 192.168.1.200 DGRAM
> 192.168.1.200 RAW
>
> Proposed upstream as: https://github.com/lathiat/nss-mdns/pull/23

Was merged upstream.
The new release includes the fix. https://github.com/lathiat/nss-mdns/releases/tag/v0.11
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=5adf255b5716f5b9c2b28dcb9898d3bafa732ea9

commit 5adf255b5716f5b9c2b28dcb9898d3bafa732ea9
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2018-01-23 08:27:28 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2018-01-23 08:28:08 +0000

    sys-auth/nss-mdns: Bump to 0.11

    Bump to the first release from the new upstream. Big thanks to
    Adam Goode for merging our patches and working on the code!

    Closes: https://bugs.gentoo.org/590968
    Closes: https://bugs.gentoo.org/600282
    Closes: https://bugs.gentoo.org/627770

 sys-auth/nss-mdns/Manifest             |  1 +
 sys-auth/nss-mdns/nss-mdns-0.11.ebuild | 54 ++++++++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+)