Bug 94668 - e1000 startup delays complicates net init process, especially for newer/advanced net scripts in latest baselayout
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] baselayout
Hardware: AMD64 Linux
Importance: Low normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
Blocks: 109803
Reported: 2005-05-31 13:51 UTC by Matthew Marlowe (RETIRED)
Modified: 2006-01-25 05:45 UTC

Description Matthew Marlowe (RETIRED) gentoo-dev 2005-05-31 13:51:35 UTC
Use six e1000 cards in a server with the latest baselayout, set up so that each pair of cards is in bonding mode. Several of the bonds will complain or fail to initialize properly because the cards were not active long enough before bonding was called (the latest e1000 drivers don't power on the card until the module is loaded). Rebooting the server multiple times results in different working configurations (I see this behavior across multiple machines and switches).

I've also seen ntp-client fail in simpler configs because e1000 isn't fully set up by the time the script is run. I'm tempted to think we need to provide some place for a user-specified delay after module loading.

Yes, I'm aware that portfast on some switches helps, but it seems we need to modify the startup scripts generically somehow for people who don't have it.
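
To make the "user-specified delay" idea concrete, here is a minimal sketch of what it might look like as a hook in /etc/conf.d/net. It assumes baselayout calls a user-defined postup() function with IFACE set once an interface has been brought up. NIC_SETTLE_DELAY is a hypothetical variable invented for this sketch and is not an existing baselayout setting.

```shell
# Hypothetical knob (not a real baselayout variable): seconds to wait
# after a physical interface has been brought up.
NIC_SETTLE_DELAY="${NIC_SETTLE_DELAY:-5}"

postup() {
    # Only delay physical ethN interfaces; lo, bonds, etc. are left alone.
    if [[ ${IFACE} == eth[0-9]* ]]; then
        sleep "${NIC_SETTLE_DELAY}"
    fi
    # Always report success so the delay never fails the interface.
    return 0
}
```

A fixed delay like this is crude: it slows every boot by the full amount, and it still fails if the card needs longer than the chosen value.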
Comment 1 SpanKY gentoo-dev 2005-05-31 14:59:08 UTC
Have you tried tweaking RC_NET_STRICT_CHECKING in /etc/conf.d/rc?
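
For reference, a sketch of the setting in question. The value descriptions below are paraphrased from memory of the comments shipped in baselayout's /etc/conf.d/rc, so treat them as an assumption and check them against the installed file.

```shell
# /etc/conf.d/rc (fragment)
# Assumed meanings, to be verified against the installed file:
#   none - the 'net' service is always considered up
#   no   - at least one net.* service besides net.lo must be up
#   lo   - same as 'no', but net.lo also counts
#   yes  - all network interfaces must be up
RC_NET_STRICT_CHECKING="yes"
```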
Comment 2 Roy Marples (RETIRED) gentoo-dev 2005-06-01 03:05:15 UTC
Are you declaring the bond dependencies correctly?

/etc/conf.d/net sample

depend_bond0() {
   need net.eth0 net.eth1
}

depend_bond1() {
   need net.eth2 net.eth3
}

depend_bond2() {
   need net.eth4 net.eth5
}
Comment 3 Roy Marples (RETIRED) gentoo-dev 2005-07-13 04:19:52 UTC
Closing as WORKSFORME
Comment 4 Matthew Marlowe (RETIRED) gentoo-dev 2006-01-04 21:16:02 UTC
The problem still exists. I discussed the bug tonight on the gentoo-server IRC channel, and the consensus seems to be that a bug does exist. I also believe Dell sent out a notice about a year ago saying that certain Intel GigE NICs would have long initialization times.

Strict net checking doesn't really help, as the net startup script succeeds. The NICs just aren't set up yet; it takes a few seconds more.

Bonding dependencies are correct.
Comment 5 Roy Marples (RETIRED) gentoo-dev 2006-01-05 02:53:22 UTC
So would you say that the real problem is with the kernel driver then as it's returning too fast?
Comment 6 Roy Marples (RETIRED) gentoo-dev 2006-01-09 03:32:40 UTC
You could do this to delay init.

preup() {
   # Sleep 5 seconds before bringing up a bonded interface
   [[ ${IFACE} == "bond"* ]] && sleep 5
}

But otherwise this sounds like a kernel bug.
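
As a variation on the fixed sleep above, the hook could poll for link and return as soon as it appears. This is only a sketch under several assumptions: the kernel exposes /sys/class/net/<iface>/carrier, the slave interfaces are already up by the time preup() runs for the bond (the carrier file cannot be read while an interface is down), and the slaves are listed in BOND_SLAVES. Both wait_for_carrier and BOND_SLAVES are hypothetical names made up for this example and are not part of baselayout.

```shell
# Overridable only so the helper can be exercised without real hardware.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Hypothetical helper: wait up to $2 seconds (default 10) for interface $1
# to report carrier. Returns 0 once link is seen, 1 on timeout.
wait_for_carrier() {
    local iface="$1" timeout="${2:-10}" waited=0 state
    while (( waited < timeout )); do
        state=""
        if [[ -r "${SYSFS_NET}/${iface}/carrier" ]]; then
            # The read can fail while the interface is down; treat that
            # the same as "no link yet".
            read -r state 2>/dev/null < "${SYSFS_NET}/${iface}/carrier" || state=""
        fi
        [[ ${state} == 1 ]] && return 0
        sleep 1
        waited=$(( waited + 1 ))
    done
    return 1
}

preup() {
    # Only act on bonded interfaces, as in the snippet above.
    [[ ${IFACE} == bond* ]] || return 0
    local slave
    # BOND_SLAVES is a hypothetical list, e.g. "eth0 eth1".
    for slave in ${BOND_SLAVES}; do
        wait_for_carrier "${slave}" 10 ||
            echo "no carrier on ${slave} after 10s, continuing anyway" >&2
    done
    return 0
}
```

Compared with a fixed sleep, this adds no delay when the link is already up and caps the wait when it never comes up. It does not address the underlying question of whether the driver reports the interface as ready too early.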
Comment 7 Matthew Marlowe (RETIRED) gentoo-dev 2006-01-10 00:13:12 UTC
Another note: I have found that some switches, especially Cisco, have a portfast option that allows ports to be manually designated as 'server only'. It skips the spanning tree loop checks and lets the link come up quickly. Currently, I have to enable this option on my switches to work around the e1000 bug.
Comment 8 Daniel Drake (RETIRED) gentoo-dev 2006-01-10 08:14:41 UTC
Usual procedure is to get this reported upstream. Is it reproducible on 2.6.15?
Comment 9 Daniel Drake (RETIRED) gentoo-dev 2006-01-20 14:44:31 UTC
Try a recent -git release of Linus' tree (e.g. git-sources). There have been many e1000 fixes committed over the last few days.
Comment 10 Daniel Drake (RETIRED) gentoo-dev 2006-01-25 05:45:12 UTC
Please reopen when you respond to comment #8 or #9.