During standard emerge updates, went from dhcpcd-8.0.6 to 8.1.0, upon reboot received messages stating checksum failure from dhcp.server.ip.addy, it kept nfs mounts from working, etc. downgrading to 8.0.6 restored working functionality. Please remove/hardmask/fix/report to author that 8.1.0 doesn't function properly Thanks
Portage 2.3.76 (python 3.6.9-final-0, default/linux/amd64/17.1/desktop, gcc-9.2.0, glibc-2.29-r5, 5.3.4-gentoo x86_64) ================================================================= System uname: Linux-5.3.4-gentoo-x86_64-AMD_FX-tm-8350_Eight-Core_Processor-with-gentoo-2.6 KiB Mem: 8117860 total, 6131180 free KiB Swap: 2097148 total, 2097148 free Timestamp of repository gentoo: Sun, 13 Oct 2019 19:15:47 +0000 Head commit of repository gentoo: 41bc579929e8b0cbcbb0ac695b03c4fc20ed5a34 Timestamp of repository eclipse: Fri, 13 Sep 2019 19:44:02 +0000 Head commit of repository eclipse: e35205466a070b3d0976ad3652ff150120ce8b15 Head commit of repository steam-overlay: 5b099b835b5b8dc7f37390e37e44cebba7cc315d Head commit of repository x11: afbaafaa912dc09a19fee7ee28ef94c44a134a06 sh bash 5.0_p11 ld GNU ld (Gentoo 2.32 p2) 2.32.0 app-shells/bash: 5.0_p11::gentoo dev-java/java-config: 2.2.0-r4::gentoo dev-lang/perl: 5.30.0::gentoo dev-lang/python: 2.7.16::gentoo, 3.6.9::gentoo dev-util/cmake: 3.15.4::gentoo dev-util/pkgconfig: 0.29.2::gentoo sys-apps/baselayout: 2.6-r1::gentoo sys-apps/openrc: 0.42.1::gentoo sys-apps/sandbox: 2.18::gentoo sys-devel/autoconf: 2.13-r1::gentoo, 2.69-r4::gentoo sys-devel/automake: 1.11.6-r3::gentoo, 1.16.1-r1::gentoo sys-devel/binutils: 2.32-r1::gentoo sys-devel/gcc: 9.2.0-r1::gentoo sys-devel/gcc-config: 2.1::gentoo sys-devel/libtool: 2.4.6-r5::gentoo sys-devel/make: 4.2.1-r4::gentoo sys-kernel/linux-headers: 5.3::gentoo (virtual/os-headers) sys-libs/glibc: 2.29-r5::gentoo Repositories: gentoo location: /usr/portage sync-type: git sync-uri: https://anongit.gentoo.org/git/repo/gentoo.git priority: -1000 eclipse location: /var/db/repos/eclipse sync-type: git sync-uri: https://github.com/gentoo-mirror/eclipse.git masters: gentoo localrepo location: /usr/local/portage masters: gentoo steam-overlay location: /usr/local/steam-overlay sync-type: git sync-uri: https://github.com/anyc/steam-overlay/ masters: gentoo x11 location: /usr/local/x11-overlay sync-type: git sync-uri: https://anongit.gentoo.org/git/proj/x11.git masters: gentoo ACCEPT_KEYWORDS="amd64 ~amd64" ACCEPT_LICENSE="*" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-O2 -march=native -mtune=native" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo" CXXFLAGS="-O2 -march=native -mtune=native" DISTDIR="/usr/portagedistfiles" EMERGE_DEFAULT_OPTS="--jobs 2 --load-average 4 --nospinner --fail-clean --with-bdeps=y --autounmask-keep-masks" ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR" FCFLAGS="-O2 -pipe" FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs config-protect-if-modified distlocks ebuild-locks fail-clean fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync" FFLAGS="-O2 -pipe" GENTOO_MIRRORS="http://distfiles.gentoo.org" LANG="en_US.utf8" LDFLAGS="-O2 -march=native -mtune=native" MAKEOPTS="-j8 -l8" PKGDIR="/usr/portage/packages" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git" PORTAGE_TMPDIR="/var/tmp" USE="X a52 aac acl acpi amd64 berkdb bluetooth branding bzip2 cairo cdda cdr cli consolekit crypt cups cxx dbus dri dts dvd dvdr emboss encode exif fam flac fortran gdbm gif glamor gpm gtk iconv icu introspection jpeg lcms ldap libnotify libtirpc lm_sensors mad mng mp3 mp4 mpeg multilib ncurses nls nptl ogg opengl openmp pam pango pcre pdf png policykit ppds pulseaudio qt5 readline sdl seccomp spell split-usr ssl startup-notification svg tcpd threads tiff truetype udev udisks unicode upower usb vaapi vdpau vorbis wxwidgets x264 xattr xcb xml xv xvid zlib" ABI_X86="64" ADA_TARGET="gnat_2018" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx f16c fma3 fma4 mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 xop" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" NETBEANS_MODULES="apisupport cnd groovy gsf harness ide identity j2ee java mobility nb php profiler soa visualweb webcommon websvccommon xml" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-2" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_6" PYTHON_TARGETS="python2_7 python3_6" RUBY_TARGETS="ruby24 ruby25" SANE_BACKENDS="pixma" USERLAND="GNU" VIDEO_CARDS="amdgpu radeonsi" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" Unset: CC, CPPFLAGS, CTARGET, CXX, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Checksums are working fine here. Can you email me private a wireshark trace of the DHCP transaction please?
I too have run into this same issue, at least on x86_64. Rebuilding dhcpcd with CFLAGS set to -O0 or -O1 yields a working dhcpcd, but with -O2 we get a checksum failure.
It seems it may have something to do with the inlining of in_cksum. If I add the volatile keyword to sum within (or otherwise force it to not be inlined) the issue disappears.
ARCH=~x86 is also affected
I finally have taken a tshark capture 172MB, sent to Roy Marples. Glad someone else was affected, I was wondering if somehow my systems were unique in some way that was changing the outcome (running nfsroot systems)
dhcpcd-8.0.6 works dhcpcd-8.1.0 fails with checksum failure Portage 2.3.76 (python 3.6.9-final-0, default/linux/amd64/17.0, gcc-9.2.0, glibc-2.29-r5, 4.19.66-gentoo x86_64) ================================================================= System Settings ================================================================= System uname: Linux-4.19.66-gentoo-x86_64-Intel-R-_Core-TM-_i5-3570_CPU_@_3.40GHz-with-gentoo-2.6 KiB Mem: 16318796 total, 14139304 free KiB Swap: 0 total, 0 free Timestamp of repository gentoo: Tue, 15 Oct 2019 00:15:01 +0000 Head commit of repository gentoo: 491c3b2c497837cff2b3bc2d7da90154da353e41 sh bash 5.0_p11 ld GNU ld (Gentoo 2.32 p2) 2.32.0 app-shells/bash: 5.0_p11::gentoo dev-java/java-config: 2.2.0-r4::gentoo dev-lang/perl: 5.30.0::gentoo dev-lang/python: 2.7.16::gentoo, 3.5.7::gentoo, 3.6.9::gentoo, 3.7.4-r1::gentoo dev-util/cmake: 3.15.4::gentoo dev-util/pkgconfig: 0.29.2::gentoo sys-apps/baselayout: 2.6-r1::gentoo sys-apps/openrc: 0.42.1::gentoo sys-apps/sandbox: 2.18::gentoo sys-devel/autoconf: 2.13-r1::gentoo, 2.69-r4::gentoo sys-devel/automake: 1.13.4-r2::gentoo, 1.16.1-r1::gentoo sys-devel/binutils: 2.32-r1::gentoo sys-devel/gcc: 9.2.0-r1::gentoo sys-devel/gcc-config: 2.1::gentoo sys-devel/libtool: 2.4.6-r5::gentoo sys-devel/make: 4.2.1-r4::gentoo sys-kernel/linux-headers: 5.3::gentoo (virtual/os-headers) sys-libs/glibc: 2.29-r5::gentoo ACCEPT_KEYWORDS="amd64 ~amd64" ACCEPT_LICENSE="*" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-mtune=native -O2 -pipe" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php7.2/ext-active/ /etc/php/apache2-php7.3/ext-active/ /etc/php/cgi-php7.2/ext-active/ /etc/php/cgi-php7.3/ext-active/ /etc/php/cli-php7.2/ext-active/ /etc/php/cli-php7.3/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c" CXXFLAGS="-mtune=native -O2 -pipe" DISTDIR="/usr/portage/distfiles" ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR" FCFLAGS="-O2 -pipe" FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr" FFLAGS="-O2 -pipe" GENTOO_MIRRORS="http://tux.rainside.sk/gentoo/" LANG="en_US.utf8" LDFLAGS="-Wl,-O1 -Wl,--as-needed" LINGUAS="en" MAKEOPTS="-j4" PKGDIR="/usr/portage/packages" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git" PORTAGE_TMPDIR="/var/tmp" USE="acl amd64 berkdb bzip2 cli crypt cxx dri fortran gdbm iconv ipv6 libtirpc multilib ncurses nptl openmp pam pcre readline seccomp split-usr ssl tcpd unicode xattr zlib" ABI_X86="64" ADA_TARGET="gnat_2018" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_core authn_dbm authn_file authz_core authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation rewrite setenvif socache_shmcb speling status unique_id unixd userdir usertrack vhost_alias watchdog proxy proxy_connect proxy_fcgi proxy_http" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2 aes avx f16c popcnt sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="efi-64 xen pc" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" NETBEANS_MODULES="apisupport cnd groovy gsf harness ide identity j2ee java mobility nb php profiler soa visualweb webcommon websvccommon xml" NGINX_MODULES_HTTP="access charset gzip limit_conn limit_req log proxy rewrite gunzip upstream_check stub_status geoip2" NGINX_MODULES_STREAM="geoip2" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-2 php7-3" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_6" PYTHON_TARGETS="python2_7 python3_6 python3_7" RUBY_TARGETS="ruby25 ruby26" USERLAND="GNU" VIDEO_CARDS="intel i915 i965" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" Unset: CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS ================================================================= Package Settings ================================================================= net-misc/dhcpcd-8.0.6::gentoo was built with the following: USE="embedded ipv6 udev" ABI_X86="(64)"
Just to note that the dhcp server runs on Microsoft Windows servers.
there is some difference in declaration of »uh_sum« and »csum« which comparison finally gets »return«ed in checksums_valid(): checksums_valid(...) { ... uint16_t ..., uh_sum; ... uint32_t csum; ... return csum == uh_sum; } might that be something that is affected to be "optimized" by choosing a higher "-O" in CFLAGS, as this checksum failure seems to happen only with -O2 rather than -O0 or -O1 (also GCC-9.2.0 with -O2 here)
Created attachment 592746 [details, diff] Force boolean I cannot repliate this problem at my end. Does this patch help at all?
Created attachment 592750 [details, diff] Fixes issues This patch should fix things.
(In reply to Roy Marples from comment #11) > Created attachment 592750 [details, diff] [details, diff] > Fixes issues > > This patch should fix things. The patch didn't fix checksum error for me (GCC-9.2.0-r1, CFLAGS=-O2) dnsmasq at openwrt as DHCP server here
Just here to confirm same issues, updated earlier and this started happening. Tried with -O1 and it now works again, the above patch with -O2 didn't help. I would suggest masking 8.1.0 as soon as possible so fewer affected people are left attempting to fix things without easy connectivity.
the patch does not fix for me either. however, I second the proposal that code inlining might be the root cause from comment 4 if I add "-fno-inline-small-functions" to CFLAGS while keeping "-O2", the resulting binary also works - just like using "-O1".
(In reply to Roy Marples from comment #11) > Created attachment 592750 [details, diff] [details, diff] > Fixes issues > > This patch should fix things. I see things like udp = (struct udphdr *)(void *)((char *)ip + ip_hlen); in checksums_valid(). Are you sure that cast is valid and does not violate strict-aliasing rules?
(In reply to David Seifert from comment #15) > in checksums_valid(). Are you sure that cast is valid and does not violate > strict-aliasing rules? Likely on to something because -O2 -fno-strict-aliasing produces a working dhcpcd for me.
I've just checked: it DOES help to prevent in_cksum(..) being inlined automagically by implicit "-finline-small-functions" set by -O2+ changing the declaration from: > static uint16_t in_cksum(const void *data, size_t len, uint32_t *isum) { .. } to: > static __attribute__ ((noinline)) uint16_t in_cksum(const void *data, size_t len, uint32_t *isum) { .. } produced a working binary - though this stanza is GCC specific. googled for a portable way and found it might be called through a pointer will also keep it from being inlined: > static uint16_t in_cksum_ptr(const void *data, size_t len, uint32_t *isum) { .. } > static uint16_t (*in_cksum)() = in_cksum_ptr; this also works for me.
Created attachment 592758 [details, diff] Fix aliasing and sanity. New patch also copies out UDP header to avoid any aliasing issue. I don't know why this just broke ..... the UDP reference has been working fine for ages.
Unfortunately the problem appears to be persisting for me with the new patch, unless I pass -fno-strict-aliasing still.
Is everone here using gcc-9? The newest gcc I have is 8.3.
(In reply to Roy Marples from comment #20) > Is everyone here using gcc-9? The newest gcc I have is 8.3. I would assume so, otherwise they'd "likely" be using packages marked stable which mean they'd also still be using dhcpcd 7.2.3 instead. I don't have 8.3 around anymore but let me get this setup and I'll check if it works with it.
(In reply to Roy Marples from comment #20) > Is everone here using gcc-9? The newest gcc I have is 8.3. I am on gcc 9.2. However I havent yet had time to evaluate the patch(es) to see if anything changed.
(In reply to Roy Marples from comment #20) > Is everone here using gcc-9? The newest gcc I have is 8.3. Well there we have it, it works entirely fine with 8.3 without any patches and -O2. So this is gcc-9 specific.
I'll need someone to explain to me why this fails with gcc-9 AND if the expectation is on me to fix it and if so how. For the time being I'll assume this is a gcc bug as it works fine with all other compilers thus far.
(In reply to Roy Marples from comment #24) > I'll need someone to explain to me why this fails with gcc-9 AND if the > expectation is on me to fix it and if so how. > > For the time being I'll assume this is a gcc bug as it works fine with all > other compilers thus far. Have you tried running this code under UBSAN? It is very likely that dhcpcd triggers undefined behavior, given how GCC has been aggressively optimizing code in the presence of undefined behavior in recent releases. Disabling inlining seems to support this argument.
Let's ask toolchain team for their opinion.
I don't thing existing -fsanitize= would catch the bug. Compiler assumes loads and stores to different types are independent. llvm has something as a work in progress for that: https://reviews.llvm.org/D32199 Does dhcpcd have minimal unit test that exposes failure with and without -fno-strict-aliasing? When I build dhcp.c locally with and without -fno-strict-aliasing I see a lot of loads and stores reordered across local function calls. Having concrete function call would help a lot.
Confirm problem. Please mask =net-misc/dhcpcd-8.1.0 HW: Mikrotik RB951Ui-2HnD with latest stable firmware(6.45.6) <=======> Intel Centrino Advanced-N 6205 Prev version dhcpcd-8.0.6 work nice. With best wishes - Y.D.
Confirmed on my end too. Was going crazy debugging it, should have come here sooner.
(In reply to Sergei Trofimovich from comment #27) > Does dhcpcd have minimal unit test that exposes failure with and without > -fno-strict-aliasing? > > When I build dhcp.c locally with and without -fno-strict-aliasing I see a > lot of loads and stores reordered across local function calls. > > Having concrete function call would help a lot. I edited dhcp.c to save the packet and captured an invalid checksum. I copied the checksum code to a small test case file verbtaim and the passed the saved packet to it. Checksum code worked fine, but fails on exactly the same data in dhcpcd. I cannot explain this. I do have another data point though: Alpine Linux with gcc-9.2 works fine with dhcpcd-8.1.0. Arch Linux with gcc-9.2 works fine with dhcpcd-8.1.0. Devuan Linux with gcc-9.2 fails with dhcpcd-8.1.0.
Thank you for trying to debug and fix the issue, but since several people ran into this, could you please mask the package for the time being? Thanks
(In reply to Roy Marples from comment #30) > (In reply to Sergei Trofimovich from comment #27) > > Does dhcpcd have minimal unit test that exposes failure with and without > > -fno-strict-aliasing? > > > > When I build dhcp.c locally with and without -fno-strict-aliasing I see a > > lot of loads and stores reordered across local function calls. > > > > Having concrete function call would help a lot. > > I edited dhcp.c to save the packet and captured an invalid checksum. > I copied the checksum code to a small test case file verbtaim and the passed > the saved packet to it. Checksum code worked fine, but fails on exactly the > same data in dhcpcd. I cannot explain this. Looking at generated dhcp.c code everything is inlined into dhcp_readbpf(). Thus you might need to call it as-is. If I read correctly a diff of '-fverbose-asm' vs '-fverbose-asm -fno-strict-aliasing' checksums_valid() lost reads of ip->ip_src / ip->ip_dst around: static bool checksums_valid(void *packet, struct in_addr *from, unsigned int flags) { struct ip *ip = packet; struct ip pseudo_ip = { .ip_p = IPPROTO_UDP, .ip_src = ip->ip_src, // <- this 32-bit store .ip_dst = ip->ip_dst // <- and this 32-bit store }; ... in_cksum(&pseudo_ip, sizeof(pseudo_ip), &csum); ... and left them uninitialized on stack. in_cksum() treats 'pseudo_ip' effectively as unint16_t[]. I could see why gcc might decide to consider 32-bit stores and 16-bit loads as not affecting one another and removing "dead" stored of 'pseudo_ip.ip_src'. I think the only valid interpretation of struct is a uint8_t[].
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=c0cd07cf03f3024c83749097d9e70cae0c28e41b commit c0cd07cf03f3024c83749097d9e70cae0c28e41b Author: Lars Wendler <polynomial-c@gentoo.org> AuthorDate: 2019-10-16 07:58:58 +0000 Commit: Lars Wendler <polynomial-c@gentoo.org> CommitDate: 2019-10-16 07:58:58 +0000 package.mask: Masked =net-misc/dhcpcd-8.1.0 Bug: https://bugs.gentoo.org/697672 Signed-off-by: Lars Wendler <polynomial-c@gentoo.org> profiles/package.mask | 5 +++++ 1 file changed, 5 insertions(+)
(In reply to Sergei Trofimovich from comment #32) > Looking at generated dhcp.c code everything is inlined into dhcp_readbpf(). > Thus you might need to call it as-is. > > If I read correctly a diff of '-fverbose-asm' vs '-fverbose-asm > -fno-strict-aliasing' checksums_valid() lost reads of ip->ip_src / > ip->ip_dst around: > > static bool > checksums_valid(void *packet, > struct in_addr *from, unsigned int flags) > { > struct ip *ip = packet; > struct ip pseudo_ip = { > .ip_p = IPPROTO_UDP, > .ip_src = ip->ip_src, // <- this 32-bit store > .ip_dst = ip->ip_dst // <- and this 32-bit store > }; > ... > in_cksum(&pseudo_ip, sizeof(pseudo_ip), &csum); > ... > > and left them uninitialized on stack. What is not initialised exactly? We have a compile time assert at the top to ensure that there is no padding in the structure and C member initialisation ensures that all members are set to zero if not specified. > > in_cksum() treats 'pseudo_ip' effectively as unint16_t[]. I could see why > gcc might decide to consider 32-bit stores and 16-bit loads as not affecting > one another and removing "dead" stored of 'pseudo_ip.ip_src'. > > I think the only valid interpretation of struct is a uint8_t[]. in_cksum is based on RFC 1071 section 4.1 which you'll find in many samples like here: https://gist.github.com/fakessh/4137434#file-client-c-L34 Are suggesting all these implementations are broken?
(In reply to Roy Marples from comment #34) > (In reply to Sergei Trofimovich from comment #32) > > Looking at generated dhcp.c code everything is inlined into dhcp_readbpf(). > > Thus you might need to call it as-is. > > > > If I read correctly a diff of '-fverbose-asm' vs '-fverbose-asm > > -fno-strict-aliasing' checksums_valid() lost reads of ip->ip_src / > > ip->ip_dst around: > > > > static bool > > checksums_valid(void *packet, > > struct in_addr *from, unsigned int flags) > > { > > struct ip *ip = packet; > > struct ip pseudo_ip = { > > .ip_p = IPPROTO_UDP, > > .ip_src = ip->ip_src, // <- this 32-bit store > > .ip_dst = ip->ip_dst // <- and this 32-bit store > > }; > > ... > > in_cksum(&pseudo_ip, sizeof(pseudo_ip), &csum); > > ... > > > > and left them uninitialized on stack. > > What is not initialised exactly? We have a compile time assert at the top to > ensure that there is no padding in the structure and C member initialisation > ensures that all members are set to zero if not specified. > > > > > in_cksum() treats 'pseudo_ip' effectively as unint16_t[]. I could see why > > gcc might decide to consider 32-bit stores and 16-bit loads as not affecting > > one another and removing "dead" stored of 'pseudo_ip.ip_src'. > > > > I think the only valid interpretation of struct is a uint8_t[]. > > in_cksum is based on RFC 1071 section 4.1 which you'll find in many samples > like here: > https://gist.github.com/fakessh/4137434#file-client-c-L34 > > Are suggesting all these implementations are broken? It seems like you are ultimately overlaying a uint16_t pointer over struct Packet, which definitely violates strict-aliasing rules. Just because you ensure fields are padded doesn't make this valid. C is not portable assembly.
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=06a772cc26a47136b71d2d0a9eaec5bf09b817dd commit 06a772cc26a47136b71d2d0a9eaec5bf09b817dd Author: Lars Wendler <polynomial-c@gentoo.org> AuthorDate: 2019-10-16 15:10:17 +0000 Commit: Lars Wendler <polynomial-c@gentoo.org> CommitDate: 2019-10-16 15:11:09 +0000 package.mask: Removed dhcpcd-8.1.0 mask Bug: https://bugs.gentoo.org/697672 Signed-off-by: Lars Wendler <polynomial-c@gentoo.org> profiles/package.mask | 5 ----- 1 file changed, 5 deletions(-) https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=1b2a8af882b4b47b2c4189a52b4bd9f3308666fe commit 1b2a8af882b4b47b2c4189a52b4bd9f3308666fe Author: Lars Wendler <polynomial-c@gentoo.org> AuthorDate: 2019-10-16 15:09:50 +0000 Commit: Lars Wendler <polynomial-c@gentoo.org> CommitDate: 2019-10-16 15:11:08 +0000 net-misc/dhcpcd: Bump to version 8.1.1. Removed old Bug: https://bugs.gentoo.org/697672 Package-Manager: Portage-2.3.77, Repoman-2.3.17 Signed-off-by: Lars Wendler <polynomial-c@gentoo.org> net-misc/dhcpcd/Manifest | 2 +- net-misc/dhcpcd/{dhcpcd-8.1.0.ebuild => dhcpcd-8.1.1.ebuild} | 0 2 files changed, 1 insertion(+), 1 deletion(-)
Please test dhcpcd-8.1.1 and report back your results.
(In reply to David Seifert from comment #35) > It seems like you are ultimately overlaying a uint16_t pointer over struct > Packet, which definitely violates strict-aliasing rules. Just because you > ensure fields are padded doesn't make this valid. C is not portable assembly. That's a fair point. Using a union with struct ip and uint16_t[sizeof(struct ip)] members allow strict aliasing to work with in_cksum. dhcpcd-8.1.1 has been pushed with this change so should fix this. At least it does on the one box I can repliate this on.
=net-misc/dhcpcd-8.1.1 works for me, thanks.
It seems like working)))
It's probably worth providing a related upstream change for posterity: https://github.com/rsmarples/dhcpcd/commit/0f8f4506858ed4d3e029054ce74f7fc3be7112b3 [ on unrelated note 'uint16_t w[sizeof(struct ip)];' likely allocates twice as much space than you need ] (In reply to Roy Marples from comment #34) > (In reply to Sergei Trofimovich from comment #32) > > Looking at generated dhcp.c code everything is inlined into dhcp_readbpf(). > > Thus you might need to call it as-is. > > > > If I read correctly a diff of '-fverbose-asm' vs '-fverbose-asm > > -fno-strict-aliasing' checksums_valid() lost reads of ip->ip_src / > > ip->ip_dst around: > > > > static bool > > checksums_valid(void *packet, > > struct in_addr *from, unsigned int flags) > > { > > struct ip *ip = packet; > > struct ip pseudo_ip = { > > .ip_p = IPPROTO_UDP, > > .ip_src = ip->ip_src, // <- this 32-bit store > > .ip_dst = ip->ip_dst // <- and this 32-bit store > > }; > > ... > > in_cksum(&pseudo_ip, sizeof(pseudo_ip), &csum); > > ... > > > > and left them uninitialized on stack. > > What is not initialised exactly? We have a compile time assert at the top to > ensure that there is no padding in the structure and C member initialisation > ensures that all members are set to zero if not specified. GCC's optimiser effectively deletes the lines I marked as "// <- ". I suspect it's type based alias analysis but for my own curiosity will check in more detail. The presence of loop also seems to have the effect. > > in_cksum() treats 'pseudo_ip' effectively as unint16_t[]. I could see why > > gcc might decide to consider 32-bit stores and 16-bit loads as not affecting > > one another and removing "dead" stored of 'pseudo_ip.ip_src'. > > > > I think the only valid interpretation of struct is a uint8_t[]. > > in_cksum is based on RFC 1071 section 4.1 which you'll find in many samples > like here: > https://gist.github.com/fakessh/4137434#file-client-c-L34 > > Are suggesting all these implementations are broken? Yes, I think this code snippet is broken at: https://gist.github.com/fakessh/4137434#file-client-c-L101 as it also casts a struct* to uint16_t*.
(In reply to Sergei Trofimovich from comment #41) > [ on unrelated note 'uint16_t w[sizeof(struct ip)];' likely allocates twice > as much space than you need ] https://roy.marples.name/cgit/dhcpcd.git/commit/?id=f77022dbce8890eb5c6597af99490852aa4cd5ec Fixed, thanks
Can confirm fixed for me as well, great to see this was figured out rather than worked around.
Thanks for the fix, 8.1.1 works here too.
Created attachment 594184 [details] dhcp.c (In reply to Sergei Trofimovich from comment #41) > > What is not initialised exactly? We have a compile time assert at the top to > > ensure that there is no padding in the structure and C member initialisation > > ensures that all members are set to zero if not specified. > > GCC's optimiser effectively deletes the lines I marked as "// <- ". I > suspect it's type based alias analysis but for my own curiosity will check > in more detail. The presence of loop also seems to have the effect. Added trimmed down dhcp.c example. Looks like the loop was not the major issue here. The type cast is enough for gcc to throw away most of struct initializer. $ gcc dhcp.c -o dhcp -O2 && ./dhcp packet sum: 0x011f5b $ gcc dhcp.c -o dhcp -O2 -fno-strict-aliasing && ./dhcp packet sum: 0x005555 Fun fact: the code had a chance to mis-compile even by gcc-4.0.4: gcc-3.3.6: packet sum: 0x005555 gcc-3.4.6: packet sum: 0x005555 gcc-4.0.4: packet sum: 0x001180 gcc-4.1.2: packet sum: 0x005555 gcc-4.2.4: packet sum: 0x005555 gcc-4.3.6: packet sum: 0x005555 gcc-4.4.7: packet sum: 0x0011b0 gcc-4.5.4: packet sum: 0x0011b0 gcc-4.6.4: packet sum: 0x0010c0 gcc-4.7.4: packet sum: 0x0011d0 gcc-4.8.5: packet sum: 0x0011d0 gcc-4.9.4: packet sum: 0x001210 gcc-5.4.0: packet sum: 0x005555 gcc-5.5.0: packet sum: 0x005555 gcc-6.4.0: packet sum: 00000000 gcc-6.5.0: packet sum: 00000000 gcc-7.3.0: packet sum: 00000000 gcc-7.3.1: packet sum: 00000000 gcc-7.4.0: packet sum: 00000000 gcc-8.2.0: packet sum: 0x00bb1e gcc-8.3.0: packet sum: 0x00acf4 gcc-8.3.1: packet sum: 0x0109f1 gcc-9.1.0: packet sum: 0x016d51 gcc-9.2.0: packet sum: 0x01b295