Bug 444892

Summary: Network failure with >sys-kernel/hardened-sources-3.4.5 tulip driver (DECchip 21140)
Product: Gentoo Linux
Reporter: Felix Tiede <info>
Component: Hardened
Assignee: The Gentoo Linux Hardened Kernel Team (OBSOLETE) <hardened-kernel+disabled>
Status: RESOLVED FIXED
Severity: normal
CC: kernel, pageexec, spender
Priority: Normal
Version: unspecified
Hardware: AMD64
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---
Attachments: Current 3.6.7-hardened kernel config

Description Felix Tiede 2012-11-27 06:13:17 UTC
My DECchip 21140 network card does not seem to handle incoming packets with kernels newer than sys-kernel/hardened-sources-3.4.5. I've tried hardened-sources-3.5.4 and hardened-sources-3.6.7: in both versions packets are transmitted but no incoming packet is handled, while everything works very well with hardened-sources-3.4.5 and earlier.

I've seen no log messages from the kernel and I suspect this might be an upstream bug.
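
One way to confirm the "transmits but never receives" symptom is to watch the interface's packet counters, which the kernel exposes under /sys/class/net/<iface>/statistics. The following is only a sketch: the function name nic_counters and the interface name eth1 are made up for illustration, and the optional second argument exists only so the function can be pointed at a different directory tree.

```shell
#!/bin/sh
# Print the RX/TX packet counters of one network interface.
#   $1: interface name (eth1 in the usage line below is hypothetical)
#   $2: root of the sysfs net class tree (assumed default: /sys/class/net)
nic_counters() {
    iface=$1
    root=${2:-/sys/class/net}
    stats="$root/$iface/statistics"
    # Bail out if the interface does not exist or exposes no counters.
    [ -r "$stats/rx_packets" ] || {
        echo "no statistics for $iface" >&2
        return 1
    }
    printf 'rx=%s tx=%s\n' \
        "$(cat "$stats/rx_packets")" \
        "$(cat "$stats/tx_packets")"
}

# Usage (commented out so that sourcing this file has no side effects):
#   nic_counters eth1; sleep 10; nic_counters eth1
```

If rx stays flat between two calls while tx keeps growing, that matches the behaviour described above.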

Kernel tulip driver configuration for all affected and unaffected versions:
# grep TULIP .config
CONFIG_NET_TULIP=y
CONFIG_TULIP=m
# CONFIG_TULIP_MWI is not set
CONFIG_TULIP_MMIO=y
CONFIG_TULIP_NAPI=y
# CONFIG_TULIP_NAPI_HW_MITIGATION is not set

# emerge --info
Portage 2.1.11.31 (hardened/linux/amd64, gcc-4.5.4, glibc-2.15-r3, 3.4.5-hardened x86_64)
=================================================================
System uname: Linux-3.4.5-hardened-x86_64-Intel-R-_Core-TM-2_CPU_6320_@_1.86GHz-with-gentoo-2.1
Timestamp of tree: Tue, 27 Nov 2012 02:45:01 +0000
ld GNU ld (GNU Binutils) 2.22
distcc 3.1 x86_64-pc-linux-gnu [disabled]
app-shells/bash:          4.2_p37
dev-lang/python:          2.7.3-r2, 3.2.3
dev-util/cmake:           2.8.9
dev-util/pkgconfig:       0.27.1
sys-apps/baselayout:      2.1-r1
sys-apps/openrc:          0.11.5
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.68
sys-devel/automake:       1.9.6-r3, 1.11.6
sys-devel/binutils:       2.22-r1
sys-devel/gcc:            4.5.4
sys-devel/gcc-config:     1.7.3
sys-devel/libtool:        2.4-r1
sys-devel/make:           3.82-r3
sys-kernel/linux-headers: 3.4-r2 (virtual/os-headers)
sys-libs/glibc:           2.15-r3
Repositories: gentoo x-svn-portage x-portage
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe -fomit-frame-pointer -fforce-addr -ftracer"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/fax /usr/share/gnupg/qualified.txt /usr/share/openvpn/easy-rsa /var/bind /var/spool/fax/etc /var/yp/Makefile"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.3/ext-active/ /etc/php/apache2-php5.4/ext-active/ /etc/php/cgi-php5.3/ext-active/ /etc/php/cgi-php5.4/ext-active/ /etc/php/cli-php5.3/ext-active/ /etc/php/cli-php5.4/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -pipe -fomit-frame-pointer -fforce-addr -ftracer"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://ftp.spline.inf.fu-berlin.de/mirrors/gentoo/ http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/ http://gentoo.tiscali.nl/ http://ftp-stud.hs-esslingen.de/pub/Mirrors/gentoo/"
LANG="de_DE.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="de en"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/svn-portage /usr/local/portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="acl amd64 apache2 bzip2 bzlib cli cracklib crypt cups curl cxx dri gmp gpm hardened iconv imap ipv6 jpeg justify maildir mmx modules mudflap multilib ncurses nis nls nptl nptlonly openmp pax_kernel pcre png ppds pppd readline session skey sse sse2 ssl svg tcpd threads tiff truetype unicode urandom vhosts vim-syntax wmf xattr xml zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif status unique_id userdir vhost_alias auth_digest authn_dbd cern_meta charset_lite dbd imagemap log_forensic version" APACHE2_MPMS="worker" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="keyboard" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="de en" PHP_TARGETS="php5-3 php5-4" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_2" QEMU_SOFTMMU_TARGETS="i386 x86_64" QEMU_USER_TARGETS="i386 x86_64" 
RUBY_TARGETS="ruby18 ruby19" SANE_BACKENDS="dell1600n_net" USERLAND="GNU" VOICEMAIL_STORAGE="file" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Comment 1 Anthony Basile gentoo-dev 2012-11-27 11:25:53 UTC
(In reply to comment #0)
> My DECchip 21140 network card seems to not handle incoming packets with
> kernels younger than sys-kernel/hardened-sources-3.4.5. I've tried with
> hardened-sources-3.5.4 and hardened-sources-3.6.7 - in both versions packets
> are transmitted but no incoming packet is handled while it works very well
> with hardened-sources-3.4.5 and prior to that.
> 
> I've seen no log messages from the kernel and I suspect this might be an
> upstream bug.
> 
> Kernel tulip driver configuration for all affected and unaffected versions:
> # grep TULIP .config
> CONFIG_NET_TULIP=y
> CONFIG_TULIP=m
> # CONFIG_TULIP_MWI is not set
> CONFIG_TULIP_MMIO=y
> CONFIG_TULIP_NAPI=y
> # CONFIG_TULIP_NAPI_HW_MITIGATION is not set

I don't have a card with that chipset, so I'm going to have to ask you to try a few things:

1) Does this happen with the equivalent vanilla sources?  If so can you bracket which version bump it broke under?

If it doesn't happen under vanilla, but does under hardened, then

2) Try using CONFIG_PAX_KERNEXEC_PLUGIN_METHOD_BTS for your Return address method under non-exe pages in the PaX config menu, rather than OR.

3) If BTS vs OR makes no difference, then please test the very latest hardened-sources and see if it's an issue there.  I'll have the very latest from upstream available by the end of the day.  If it is still an issue there, then we'll have to pass it along upstream.
Comment 2 PaX Team 2012-11-27 11:57:38 UTC
would be nice to see your config. also as a first try, disable the PaX features that rely on a gcc plugin and see if that changes anything (KERNEXEC/SIZE_OVERFLOW/STACKLEAK/CONSTIFY/LATENT_ENTROPY).
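
To see which of these plugin-based features are currently enabled, the kernel config can be searched before rebuilding. This is a sketch under one assumption: that the relevant option names all start with CONFIG_PAX_ and contain the feature names listed above. The function name pax_plugin_options is hypothetical.

```shell
#!/bin/sh
# List every line of a kernel config that mentions one of the
# gcc-plugin-based PaX features named in this comment.
#   $1: path to the config file (default: ./.config)
# Assumption: the option names begin with CONFIG_PAX_ and contain the
# feature name; both "=y" lines and "is not set" lines are printed.
pax_plugin_options() {
    config=${1:-.config}
    grep -E 'CONFIG_PAX_[A-Z_]*(KERNEXEC|SIZE_OVERFLOW|STACKLEAK|CONSTIFY|LATENT_ENTROPY)' "$config"
}

# Usage (commented out so that sourcing this file has no side effects):
#   pax_plugin_options /usr/src/linux/.config
```

Lines ending in "is not set" are already disabled; anything set to y is a candidate to switch off for the test build.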
Comment 3 Felix Tiede 2012-11-28 06:25:18 UTC
Created attachment 330802 [details]
Current 3.6.7-hardened kernel config

(In reply to comment #2)
This is the config I used to compile 3.6.7 (failing). It's basically the same configuration as with older versions, as I only run 'make oldconfig' for newer kernels.

(In reply to comment #1)
> 2) Try using CONFIG_PAX_KERNEXEC_PLUGIN_METHOD_BTS for your Return address method
> under non-exe pages in the PaX config menu, rather than OR.
I am not using CONFIG_PAX_KERNEXEC at all. (And currently I have no idea why, but I'm sure there was a reason when that option was new...)
Comment 4 Felix Tiede 2012-11-28 08:27:34 UTC
(In reply to comment #1)
> I don't have a card with that chipset, so I'm going to have to ask you to
> try a few things:
> 
> 1) Does this happen with the equivalent vanilla sources?  If so can you
> bracket which version bump it broke under?
It is broken at least for vanilla-sources-3.6.7 and vanilla-sources-3.6.8 as well. I can't narrow it down further, as I have to keep the box alive and can't experiment with the network down for too long (each cycle takes about 15 minutes).

Given this I suspect this is an upstream regression introduced somewhere between 3.4.5 and 3.5.4.
Comment 5 Felix Tiede 2012-12-14 13:28:55 UTC
This is already open upstream at https://bugzilla.kernel.org/show_bug.cgi?id=48691 (at least that seems like my bug).
Comment 6 Anthony Basile gentoo-dev 2013-01-22 13:38:46 UTC
(In reply to comment #5)
> Is already open upstream at
> https://bugzilla.kernel.org/show_bug.cgi?id=48691 - at least that seems like
> my bug.

Given your info above, this should be fixed in hardened-sources-3.7.3 which is based on vanilla 3.7.3.  Can you confirm?
Comment 7 Felix Tiede 2013-01-24 14:11:59 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > Is already open upstream at
> > https://bugzilla.kernel.org/show_bug.cgi?id=48691 - at least that seems like
> > my bug.
> 
> Given your info above, this should be fixed in hardened-sources-3.7.3 which
> is based on vanilla 3.7.3.  Can you confirm?

Unfortunately not.
I've tried hardened-sources-3.7.3: the link worked for about 13.5 minutes (that's when pppd, which uses the link in question, went down) and I was unable to revive it.

After going back to hardened-sources-3.4.5 with an unchanged configuration, the link has now been up and stable for more than 6 hours.

I might have been mistaken with my earlier assumption about this bug and https://bugzilla.kernel.org/show_bug.cgi?id=48691 being the same.
Comment 8 Anthony Basile gentoo-dev 2013-04-13 22:50:36 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > (In reply to comment #5)
> > > Is already open upstream at
> > > https://bugzilla.kernel.org/show_bug.cgi?id=48691 - at least that seems like
> > > my bug.
> > 
> > Given your info above, this should be fixed in hardened-sources-3.7.3 which
> > is based on vanilla 3.7.3.  Can you confirm?
> 
> Unfortunately not.
> I've tried with hardened-sources-3.7.3 and the link worked for about 13.5
> minutes (that's what pppd using the link in question went down) and I was
> unable to revive it.
> 
> Going back to hardened-sources-3.4.5 with unchanged configuration and the
> link is as of now up and stable for more than 6 hours.
> 
> I might have been mistaken with my earlier assumption about this bug and
> https://bugzilla.kernel.org/show_bug.cgi?id=48691 being the same.

Do you have any more information on this?  Have you tried any of the 3.8 series?
Comment 9 Felix Tiede 2013-04-20 18:18:31 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #6)
> > > (In reply to comment #5)
> > > > Is already open upstream at
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=48691 - at least that seems like
> > > > my bug.
> > > 
> > > Given your info above, this should be fixed in hardened-sources-3.7.3 which
> > > is based on vanilla 3.7.3.  Can you confirm?
> > 
> > Unfortunately not.
> > I've tried with hardened-sources-3.7.3 and the link worked for about 13.5
> > minutes (that's what pppd using the link in question went down) and I was
> > unable to revive it.
> > 
> > Going back to hardened-sources-3.4.5 with unchanged configuration and the
> > link is as of now up and stable for more than 6 hours.
> > 
> > I might have been mistaken with my earlier assumption about this bug and
> > https://bugzilla.kernel.org/show_bug.cgi?id=48691 being the same.
> 
> Do you have anymore information on this?  Have you tried any of the 3.8
> series?

Yes, just a few hours ago with hardened-sources-3.8.5: exactly the same result as with 3.7.3. The link worked for a short time, then went down, and not even rebooting the box helped.

So my guess is that any version above 3.4.5 does "something" to the card after it lived for some time which kills the link and only booting the older kernel resets this "something" so the link comes back up.
Comment 10 Felix Tiede 2013-04-30 08:51:37 UTC
I just thought about something:
My card is actually a multi-port NIC using a PCI-PCI-Bridge between its four network chips and the system's PCI-bus. Is it possible that this problem is less related to the tulip NIC driver and more a problem with the kernel's PCI subsystem and the driver for that PCI-PCI-bridge?

I also do apologize if that actually was the missing bit of information here.

Hardware information below:
# lspci -n
01:04.0 0604: 1011:0024 (rev 03)

# lspci -vvv
01:04.0 PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 03) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64, Cache Line Size: 16 bytes
	Bus: primary=01, secondary=02, subordinate=02, sec-latency=64
	I/O behind bridge: 0000a000-0000bfff
	Memory behind bridge: edc00000-eddfffff
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
	Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity+ SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [dc] Power Management version 1
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=220mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
		Bridge: PM- B3+
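
To check whether a given network interface really sits behind this bridge, the sysfs device link can be followed one level up. This is a sketch that assumes the usual sysfs layout, in which /sys/class/net/<iface>/device is a symlink into the PCI device tree. The function name nic_parent_bridge and the interface name eth1 are hypothetical.

```shell
#!/bin/sh
# Print the name of the sysfs node directly above a network
# interface's PCI device; for a NIC behind a PCI-PCI bridge this is
# the bridge's PCI address.
#   $1: interface name (eth1 in the usage line below is hypothetical)
#   $2: root of the sysfs net class tree (assumed default: /sys/class/net)
nic_parent_bridge() {
    iface=$1
    root=${2:-/sys/class/net}
    # Virtual interfaces have no "device" link; report failure for them.
    [ -e "$root/$iface/device" ] || return 1
    dev=$(readlink -f "$root/$iface/device") || return 1
    basename "$(dirname "$dev")"
}

# Usage (commented out so that sourcing this file has no side effects):
#   nic_parent_bridge eth1
```

For a port on this card the result would be expected to end in 01:04.0, the bridge address shown above. lspci -tv prints the same topology as a tree, and lspci -vvv -s 01:04.0 limits the verbose dump to the bridge.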
Comment 11 Anthony Basile gentoo-dev 2013-04-30 13:08:39 UTC
(In reply to comment #10)
> I just thought about something:
> My card is actually a multi-port NIC using a PCI-PCI-Bridge between its four
> network chips and the system's PCI-bus. Is it possible that this problem is
> less related to the tulip NIC driver and more a problem with the kernel's
> PCI subsystem and the driver for that PCI-PCI-bridge?
> 

It could be.  In comment 4 you say that you hit it with vanilla.  Was it ever working and then it broke?  If so, git bisect down to the commit that broke it.
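
A bisection along those lines might be driven as follows. This is a sketch rather than a tested procedure for this machine: the helper names bisect_begin and bisect_mark are hypothetical, and using the mainline tags v3.4 and v3.5 as endpoints is an assumption based on the versions reported earlier (3.4.5 works, 3.5.4 does not).

```shell
#!/bin/sh
# Start a bisection in an existing git clone.
#   $1: path to the kernel git tree
#   $2: last known good ref (assumed here: v3.4)
#   $3: first known bad ref (assumed here: v3.5)
bisect_begin() {
    repo=$1
    good=$2
    bad=$3
    # "git bisect start <bad> <good>" records both endpoints and checks
    # out the first commit to test.
    git -C "$repo" bisect start "$bad" "$good"
}

# Record the verdict for the commit that was just built, booted and tested.
#   $1: path to the kernel git tree
#   $2: good | bad
bisect_mark() {
    repo=$1
    verdict=$2
    case $verdict in
        good|bad) git -C "$repo" bisect "$verdict" ;;
        *) echo "verdict must be good or bad" >&2; return 2 ;;
    esac
}

# Usage (commented out; the path is a placeholder):
#   bisect_begin /path/to/linux v3.4 v3.5
#   ... build, boot, test the link ...
#   bisect_mark /path/to/linux bad
#   git -C /path/to/linux bisect reset    # when finished
```

Each step roughly halves the remaining range of commits, so the number of build-boot-test cycles grows only logarithmically with the number of commits between the two tags.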
Comment 12 Felix Tiede 2013-05-01 18:57:29 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > I just thought about something:
> > My card is actually a multi-port NIC using a PCI-PCI-Bridge between its four
> > network chips and the system's PCI-bus. Is it possible that this problem is
> > less related to the tulip NIC driver and more a problem with the kernel's
> > PCI subsystem and the driver for that PCI-PCI-bridge?
> > 
> 
> I could be.  In comment 4 you say that you hit it with vanilla.  Was it ever
> working and then it broke?  If so, git bisect down to the commit that broke
> it.

I'll try. It will take some time, as this is my main server: each test takes a long time, and I can't take the box down for long periods.
Comment 13 Anthony Basile gentoo-dev 2013-06-24 21:28:42 UTC
(In reply to Felix Tiede from comment #12)
> (In reply to comment #11)
> > (In reply to comment #10)
> > > I just thought about something:
> > > My card is actually a multi-port NIC using a PCI-PCI-Bridge between its four
> > > network chips and the system's PCI-bus. Is it possible that this problem is
> > > less related to the tulip NIC driver and more a problem with the kernel's
> > > PCI subsystem and the driver for that PCI-PCI-bridge?
> > > 
> > 
> > I could be.  In comment 4 you say that you hit it with vanilla.  Was it ever
> > working and then it broke?  If so, git bisect down to the commit that broke
> > it.
> 
> I'll try. It will take some time as it is my main server and takes a long
> time for testing as well as I can't take it down for long periods of time.

Any news here?
Comment 14 Anthony Basile gentoo-dev 2013-09-27 11:12:17 UTC
(In reply to Anthony Basile from comment #13)
> 
> Any news here?

I'm going to assume this issue is fixed.  I've compiled tulip many times recently, configured as in comment 0 and had no problem.  I never did hit the original issue.