My DECchip 21140 network card seems to not handle incoming packets with kernels younger than sys-kernel/hardened-sources-3.4.5. I've tried with hardened-sources-3.5.4 and hardened-sources-3.6.7 - in both versions packets are transmitted but no incoming packet is handled while it works very well with hardened-sources-3.4.5 and prior to that.

I've seen no log messages from the kernel and I suspect this might be an upstream bug.

Kernel tulip driver configuration for all affected and unaffected versions:
# grep TULIP .config
CONFIG_NET_TULIP=y
CONFIG_TULIP=m
# CONFIG_TULIP_MWI is not set
CONFIG_TULIP_MMIO=y
CONFIG_TULIP_NAPI=y
# CONFIG_TULIP_NAPI_HW_MITIGATION is not set

# emerge --info
Portage 2.1.11.31 (hardened/linux/amd64, gcc-4.5.4, glibc-2.15-r3, 3.4.5-hardened x86_64)
=================================================================
System uname: Linux-3.4.5-hardened-x86_64-Intel-R-_Core-TM-2_CPU_6320_@_1.86GHz-with-gentoo-2.1
Timestamp of tree: Tue, 27 Nov 2012 02:45:01 +0000
ld GNU ld (GNU Binutils) 2.22
distcc 3.1 x86_64-pc-linux-gnu [disabled]
app-shells/bash: 4.2_p37
dev-lang/python: 2.7.3-r2, 3.2.3
dev-util/cmake: 2.8.9
dev-util/pkgconfig: 0.27.1
sys-apps/baselayout: 2.1-r1
sys-apps/openrc: 0.11.5
sys-apps/sandbox: 2.5
sys-devel/autoconf: 2.68
sys-devel/automake: 1.9.6-r3, 1.11.6
sys-devel/binutils: 2.22-r1
sys-devel/gcc: 4.5.4
sys-devel/gcc-config: 1.7.3
sys-devel/libtool: 2.4-r1
sys-devel/make: 3.82-r3
sys-kernel/linux-headers: 3.4-r2 (virtual/os-headers)
sys-libs/glibc: 2.15-r3
Repositories: gentoo x-svn-portage x-portage
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe -fomit-frame-pointer -fforce-addr -ftracer"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/fax /usr/share/gnupg/qualified.txt /usr/share/openvpn/easy-rsa /var/bind /var/spool/fax/etc /var/yp/Makefile"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.3/ext-active/ /etc/php/apache2-php5.4/ext-active/ /etc/php/cgi-php5.3/ext-active/ /etc/php/cgi-php5.4/ext-active/ /etc/php/cli-php5.3/ext-active/ /etc/php/cli-php5.4/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -pipe -fomit-frame-pointer -fforce-addr -ftracer"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://ftp.spline.inf.fu-berlin.de/mirrors/gentoo/ http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/ http://gentoo.tiscali.nl/ http://ftp-stud.hs-esslingen.de/pub/Mirrors/gentoo/"
LANG="de_DE.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="de en"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/svn-portage /usr/local/portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="acl amd64 apache2 bzip2 bzlib cli cracklib crypt cups curl cxx dri gmp gpm hardened iconv imap ipv6 jpeg justify maildir mmx modules mudflap multilib ncurses nis nls nptl nptlonly openmp pax_kernel pcre png ppds pppd readline session skey sse sse2 ssl svg tcpd threads tiff truetype unicode urandom vhosts vim-syntax wmf xattr xml zlib"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol"
APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif status unique_id userdir vhost_alias auth_digest authn_dbd cern_meta charset_lite dbd imagemap log_forensic version"
APACHE2_MPMS="worker"
CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump"
CAMERAS="ptp2"
COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog"
ELIBC="glibc"
GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx"
INPUT_DEVICES="keyboard"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer"
LINGUAS="de en"
PHP_TARGETS="php5-3 php5-4"
PYTHON_SINGLE_TARGET="python2_7"
PYTHON_TARGETS="python2_7 python3_2"
QEMU_SOFTMMU_TARGETS="i386 x86_64"
QEMU_USER_TARGETS="i386 x86_64"
RUBY_TARGETS="ruby18 ruby19"
SANE_BACKENDS="dell1600n_net"
USERLAND="GNU"
VOICEMAIL_STORAGE="file"
XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
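
The "transmitted but nothing received" symptom described in the report can be backed up with numbers by sampling the interface's receive counter from sysfs. This is only a minimal sketch: the interface name eth1 is hypothetical (the report does not name the affected port), and the NET_SYSFS variable and both function names are made up for this example.

```sh
# Base directory of the per-interface sysfs entries; overridable for testing.
NET_SYSFS=${NET_SYSFS:-/sys/class/net}

# Print the kernel's received-packet counter for interface $1.
rx_packets() {
    cat "$NET_SYSFS/$1/statistics/rx_packets"
}

# Print how many packets interface $1 received during $2 seconds (default 5).
rx_delta() (
    before=$(rx_packets "$1") || exit 1
    sleep "${2:-5}"
    after=$(rx_packets "$1") || exit 1
    echo $((after - before))
)

# Example (hypothetical interface name):
#   rx_delta eth1 10
```

A result of 0 while the peer is known to be sending would match the behaviour reported for the newer kernels; a growing counter would suggest that packets do reach the driver and are lost further up the stack.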
(In reply to comment #0)
> My DECchip 21140 network card seems to not handle incoming packets with
> kernels younger than sys-kernel/hardened-sources-3.4.5. I've tried with
> hardened-sources-3.5.4 and hardened-sources-3.6.7 - in both versions packets
> are transmitted but no incoming packet is handled while it works very well
> with hardened-sources-3.4.5 and prior to that.
>
> I've seen no log messages from the kernel and I suspect this might be an
> upstream bug.
>
> Kernel tulip driver configuration for all affected and unaffected versions:
> # grep TULIP .config
> CONFIG_NET_TULIP=y
> CONFIG_TULIP=m
> # CONFIG_TULIP_MWI is not set
> CONFIG_TULIP_MMIO=y
> CONFIG_TULIP_NAPI=y
> # CONFIG_TULIP_NAPI_HW_MITIGATION is not set

I don't have a card with that chipset, so I'm going to have to ask you to try a few things:

1) Does this happen with the equivalent vanilla sources? If so, can you bracket which version bump it broke under? If it doesn't happen under vanilla, but does under hardened, then

2) Try using CONFIG_PAX_KERNEXEC_PLUGIN_METHOD_BTS for your Return address method under non-exe pages in the PaX config menu, rather than OR.

3) If BTS vs OR makes no difference, then please test the very latest hardened-sources and see if it's an issue there. I'll have the very latest from upstream available by the end of the day. If it is, then we'll have to pass stuff along upstream.
It would be nice to see your config. Also, as a first try, disable the PaX features that rely on a gcc plugin and see if that changes anything (KERNEXEC/SIZE_OVERFLOW/STACKLEAK/CONSTIFY/LATENT_ENTROPY).
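
A quick way to see which of those plugin-based features are enabled in a given kernel config is sketched below. The Kconfig symbol names are my assumption, derived from the feature names in the comment above (for example CONFIG_PAX_MEMORY_STACKLEAK for STACKLEAK); check them against the PaX menu of the sources actually in use. The function name is made up for this example.

```sh
# List the gcc-plugin-based PaX options that are set to "y" in a kernel config.
# $1: path to the config file (defaults to .config in the current directory).
# NOTE: the symbol names are assumed from the feature names, not taken from the thread.
pax_plugin_features() {
    grep -E '^CONFIG_PAX_(KERNEXEC|SIZE_OVERFLOW|MEMORY_STACKLEAK|CONSTIFY_PLUGIN|LATENT_ENTROPY)=y' "${1:-.config}"
}

# Example:
#   pax_plugin_features /usr/src/linux/.config
```

No output (and a non-zero exit status from grep) means none of the listed options is enabled, so disabling them cannot change the behaviour.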
Created attachment 330802
Current 3.6.7-hardened kernel config

(In reply to comment #2)
This is the config which I used to compile 3.6.7 (failing) with. It's basically the same configuration as it was with older versions as I only use 'make oldconfig' for newer kernels.

(In reply to comment #1)
> 2) Try using CONFIG_PAX_KERNEXEC_PLUGIN_METHOD_BTS for your Return address method
> under non-exe pages in the PaX config menu, rather than OR.

I am not using CONFIG_PAX_KERNEXEC at all. (And currently I have no idea why, but I'm sure there was a reason when that option was new...)
(In reply to comment #1)
> I don't have a card with that chipset, so I'm going to have to ask you to
> try a few things:
>
> 1) Does this happen with the equivalent vanilla sources? If so can you
> bracket which version bump it broke under?

It is at least broken for vanilla-sources-3.6.7 and vanilla-sources-3.6.8 as well. I can't narrow it down further as I have to keep the box alive and can't experiment with the network down for too long (each cycle takes about 15 minutes).

Given this, I suspect this is an upstream regression introduced somewhere between 3.4.5 and 3.5.4.
Is already open upstream at https://bugzilla.kernel.org/show_bug.cgi?id=48691 - at least that seems like my bug.
(In reply to comment #5)
> Is already open upstream at
> https://bugzilla.kernel.org/show_bug.cgi?id=48691 - at least that seems like
> my bug.

Given your info above, this should be fixed in hardened-sources-3.7.3 which is based on vanilla 3.7.3. Can you confirm?
(In reply to comment #6)
> (In reply to comment #5)
> > Is already open upstream at
> > https://bugzilla.kernel.org/show_bug.cgi?id=48691 - at least that seems like
> > my bug.
>
> Given your info above, this should be fixed in hardened-sources-3.7.3 which
> is based on vanilla 3.7.3. Can you confirm?

Unfortunately not.
I've tried with hardened-sources-3.7.3 and the link worked for about 13.5 minutes (that's when pppd, which uses the link in question, went down) and I was unable to revive it.

After going back to hardened-sources-3.4.5 with unchanged configuration, the link has now been up and stable for more than 6 hours.

I might have been mistaken with my earlier assumption about this bug and https://bugzilla.kernel.org/show_bug.cgi?id=48691 being the same.
(In reply to comment #7)
> (In reply to comment #6)
> > (In reply to comment #5)
> > > Is already open upstream at
> > > https://bugzilla.kernel.org/show_bug.cgi?id=48691 - at least that seems like
> > > my bug.
> >
> > Given your info above, this should be fixed in hardened-sources-3.7.3 which
> > is based on vanilla 3.7.3. Can you confirm?
>
> Unfortunately not.
> I've tried with hardened-sources-3.7.3 and the link worked for about 13.5
> minutes (that's what pppd using the link in question went down) and I was
> unable to revive it.
>
> Going back to hardened-sources-3.4.5 with unchanged configuration and the
> link is as of now up and stable for more than 6 hours.
>
> I might have been mistaken with my earlier assumption about this bug and
> https://bugzilla.kernel.org/show_bug.cgi?id=48691 being the same.

Do you have any more information on this? Have you tried any of the 3.8 series?
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #6)
> > > (In reply to comment #5)
> > > > Is already open upstream at
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=48691 - at least that seems like
> > > > my bug.
> > >
> > > Given your info above, this should be fixed in hardened-sources-3.7.3 which
> > > is based on vanilla 3.7.3. Can you confirm?
> >
> > Unfortunately not.
> > I've tried with hardened-sources-3.7.3 and the link worked for about 13.5
> > minutes (that's what pppd using the link in question went down) and I was
> > unable to revive it.
> >
> > Going back to hardened-sources-3.4.5 with unchanged configuration and the
> > link is as of now up and stable for more than 6 hours.
> >
> > I might have been mistaken with my earlier assumption about this bug and
> > https://bugzilla.kernel.org/show_bug.cgi?id=48691 being the same.
>
> Do you have anymore information on this? Have you tried any of the 3.8
> series?

Yes, just a few hours ago with hardened-sources-3.8.5: absolutely the same result as with 3.7.3. The link worked for a short amount of time, then went down, and not even rebooting the box helped.

So my guess is that any version above 3.4.5 does "something" to the card after it has been running for some time, which kills the link, and only booting the older kernel resets this "something" so the link comes back up.
I just thought about something:
My card is actually a multi-port NIC using a PCI-PCI-Bridge between its four network chips and the system's PCI-bus. Is it possible that this problem is less related to the tulip NIC driver and more a problem with the kernel's PCI subsystem and the driver for that PCI-PCI-bridge?

I also do apologize if that actually was the missing bit of information here.

Hardware information below:
# lspci -n
01:04.0 0604: 1011:0024 (rev 03)

# lspci -vvv
01:04.0 PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 03) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 64, Cache Line Size: 16 bytes
    Bus: primary=01, secondary=02, subordinate=02, sec-latency=64
    I/O behind bridge: 0000a000-0000bfff
    Memory behind bridge: edc00000-eddfffff
    Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
    Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
    BridgeCtl: Parity+ SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
        PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: [dc] Power Management version 1
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=220mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Bridge: PM- B3+
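
The bridge output above names the secondary bus the four network chips sit on, which makes it possible to list just those devices and compare their state between a working and a failing kernel. A minimal sketch, assuming the bridge stays at address 01:04.0 as shown above; the helper name secondary_bus is made up for this example.

```sh
# Read `lspci -vvv` output for a PCI bridge on stdin and print the number of
# its secondary bus, taken from the "Bus: primary=.., secondary=.." line.
secondary_bus() {
    sed -n 's/.*Bus: primary=[0-9a-f]*, secondary=\([0-9a-f]*\),.*/\1/p'
}

# Example: list every device behind the bridge at 01:04.0
#   bus=$(lspci -vvv -s 01:04.0 | secondary_bus)
#   lspci -vvv -s "$bus:"
```

Saving that listing once under 3.4.5 and once under a failing kernel, then running diff on the two files, would show whether the bridge or the devices behind it are configured differently.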
(In reply to comment #10)
> I just thought about something:
> My card is actually a multi-port NIC using a PCI-PCI-Bridge between its four
> network chips and the system's PCI-bus. Is it possible that this problem is
> less related to the tulip NIC driver and more a problem with the kernel's
> PCI subsystem and the driver for that PCI-PCI-bridge?
>

It could be. In comment 4 you say that you hit it with vanilla. Was it ever working and then it broke? If so, git bisect down to the commit that broke it.
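
The bisect suggested above could look roughly like the sketch below. It assumes a clone of the mainline kernel tree, and that mainline v3.4 behaves like the working 3.4.5 and v3.5 like the failing 3.5.4, which the thread does not confirm. The function names are made up for this example; restricting the bisect to paths is optional and only sensible if the regression really lives in the tulip driver or the PCI code.

```sh
# Start a bisect in kernel tree $1 between v3.4 (good) and v3.5 (bad).
# Any further arguments are paths used to limit the commits considered.
start_bisect() {
    tree=$1
    shift
    git -C "$tree" bisect start -- "$@" &&
    git -C "$tree" bisect bad v3.5 &&
    git -C "$tree" bisect good v3.4
}

# Rough number of build/boot/test cycles needed to bisect $1 commits:
# the smallest k such that 2^k >= $1.
bisect_steps() (
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
)

# Example (tree location is hypothetical):
#   start_bisect /usr/src/linux-git drivers/net/ethernet/dec/tulip drivers/pci
#   ... build, boot and test the checked-out commit, then run one of:
#   git -C /usr/src/linux-git bisect good
#   git -C /usr/src/linux-git bisect bad
#   ... repeat until git names the first bad commit, then:
#   git -C /usr/src/linux-git bisect reset
#
# Estimate the effort beforehand:
#   bisect_steps "$(git -C /usr/src/linux-git rev-list --count v3.4..v3.5)"
```

As an illustration only: if the range held about 10000 commits (an assumed figure, not measured), bisect_steps would print 14, and at the 15 minutes per cycle mentioned in comment 4 that comes to roughly three and a half hours of downtime spread over 14 reboots.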
(In reply to comment #11)
> (In reply to comment #10)
> > I just thought about something:
> > My card is actually a multi-port NIC using a PCI-PCI-Bridge between its four
> > network chips and the system's PCI-bus. Is it possible that this problem is
> > less related to the tulip NIC driver and more a problem with the kernel's
> > PCI subsystem and the driver for that PCI-PCI-bridge?
> >
>
> I could be. In comment 4 you say that you hit it with vanilla. Was it ever
> working and then it broke? If so, git bisect down to the commit that broke
> it.

I'll try. It will take some time: this is my main server, each test takes a long time, and I can't take it down for long periods.
(In reply to Felix Tiede from comment #12)
> (In reply to comment #11)
> > (In reply to comment #10)
> > > I just thought about something:
> > > My card is actually a multi-port NIC using a PCI-PCI-Bridge between its four
> > > network chips and the system's PCI-bus. Is it possible that this problem is
> > > less related to the tulip NIC driver and more a problem with the kernel's
> > > PCI subsystem and the driver for that PCI-PCI-bridge?
> > >
> >
> > I could be. In comment 4 you say that you hit it with vanilla. Was it ever
> > working and then it broke? If so, git bisect down to the commit that broke
> > it.
>
> I'll try. It will take some time as it is my main server and takes a long
> time for testing as well as I can't take it down for long periods of time.

Any news here?
(In reply to Anthony Basile from comment #13)
>
> Any news here?

I'm going to assume this issue is fixed. I've compiled tulip many times recently, configured as in comment 0 and had no problem. I never did hit the original issue.