Hi, eth0 is connected to an ADSL router. I have just updated baselayout to 1.8.6.12, and this is what I get:

 * Bringing eth0 up...
 * eth0 is not plugged in or has no carrier signal [ !! ]

When I comment out these lines in /etc/init.d/net.eth0 it works:

---CUT----
else
	# Check that eth0 was not brough up by the kernel ...
#	if [ "${status_IFACE}" != "up" ]
#	then
		# Check that the interface has a carrier
#		if [ "${carrier_IFACE}" = "running" ]
#		then
			/sbin/dhcpcd ${dhcpcd_IFACE} ${IFACE} >/dev/null || {
				retval=$?
				eend ${retval} "Failed to bring ${IFACE} up"
				return ${retval}
			}
#		else
#			eend 1 "${IFACE} is not plugged in or has no carrier signal"
#			return 1
#		fi
#	fi
fi
eend 0
---CUT---

But there is still one strange thing happening (this is the reason why I tried to update baselayout, to check if this was fixed): eth0 takes a long time to come up (maybe 1 or 2 minutes). A few months ago it was coming up fast. I then eventually updated baselayout and it became slow ... and now with v 1.8.6.12 it doesn't come up anymore without commenting out the above lines. The eth0 driver is compiled into the kernel, not built as a module.
better would be to alter: carrier_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '/RUNNING/ { if ($1 IFACE) print "running" }')" to: carrier_IFACE="$(ifconfig | grep -A4 ${iface} | gawk '/RUNNING/ { print "running" }')" By the way, I don't have the taking-two-minutes problem.
Glad I keep backups. I diffed the last version of net.eth0 with this one and promptly removed the carrier signal test. I wish these scripts could be tested a little better before making their way into Portage; it seems like with every other version of baselayout something breaks. At least .12 seems to correct a problem I brought up quite some time ago, with the return value of fsck being treated incorrectly (the maintainer must've finally understood it *was* a problem).
I had the same problem here... Found a solution by replacing : status_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '/Link/ { if ($1 == IFACE) print "up" }')" with status_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '"/Link/" { if ($1 == IFACE) print "up" }')" and carrier_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '/RUNNING/ { if ($1 == IFACE) print "running" }')" with carrier_IFACE="$(ifconfig -a | gawk -v IFACE="${iface}" '"/RUNNING/" { if ($1 == IFACE) print "running" }')" Now, eth0 is bringing up as before :-) Hope this can help...
Same problem here, running development-sources 2.6.0-test9 with the NIC compiled into the kernel. After boot, or if I stop net.eth0, ifconfig shows only lo, whereas ifconfig -a gives lo and eth0. By following Jack's instructions, net.eth0 works again. I'm running net-tools-1.60-r7, if that's of any consequence.
I ran into the same problem. If I look at my eth0, the term RUNNING is not displayed in the output of ifconfig eth0, so the carrier detection seems broken. If I comment out the line (or just change the condition from = to !=), it comes up fine (with dhcpcd). Should the interface be reported as running? I'm using the nvnet driver.
I just removed the carrier test, but this could really be a problem for people running systems that are not easy to get to, since they would have problems after a reboot.
I have a laptop with built-in natsemi LAN, which connects through DHCP. I tried with natsemi support built into the kernel and as a module, and eth0 fails to come up in both configurations. After the system is up, I was able to start eth0 manually with the dhcpcd eth0 command. To resolve the problem I keep baselayout 1.8.6.12 and use a previous copy of net.eth0 from baselayout 1.8.6.10 (or 11). BTW, my kernel is ck-sources 2.4.22-r2.
Same problem with mm-sources-2.6.0-test9-mm1 and the NIC built as a module. With the modifications from jack_mort (post 3) it works again.
Please try below patch.

--
Index: init.d/net.eth0
===================================================================
RCS file: /home/cvsroot/gentoo-src/rc-scripts/init.d/net.eth0,v
retrieving revision 1.32
diff -u -r1.32 net.eth0
--- init.d/net.eth0	11 Nov 2003 19:37:24 -0000	1.32
+++ init.d/net.eth0	13 Nov 2003 17:37:22 -0000
@@ -32,8 +32,8 @@
 	dhcpcd_IFACE="$(eval echo \$\{dhcpcd_${iface}\})"
 	inet6_IFACE="$(eval echo \$\{inet6_${iface}\})"
 	alias_IFACE="$(eval echo \$\{alias_${iface}\})"
-	status_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '/Link/ { if ($1 == IFACE) print "up" }')"
-	carrier_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '/RUNNING/ { if ($1 == IFACE) print "running" }')"
+	status_IFACE="$(ifconfig "${iface}" | gawk '/Link/ { print "up" }')"
+	carrier_IFACE="$(ifconfig "${iface}" | gawk '/RUNNING/ { print "running" }')"
 	vlans="$(eval echo \$\{iface_${IFACE}_vlans\})"
 }
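A side note on the two carrier_IFACE lines in this patch. Going by the ifconfig layout quoted elsewhere in this report, the line that carries the RUNNING flag starts with a flag word such as UP, not with the interface name, so the removed `$1 == IFACE` test would not seem able to match on that line. The sketch below replays both gawk programs against a made-up sample (the MAC address and the flag line are illustrative, not taken from a real machine) and uses plain awk, which should behave the same as gawk for these patterns.

```shell
#!/bin/sh
# Illustrative `ifconfig eth0` output for an interface that is up and has
# a carrier. The values are made up; only the layout matters here.
eth0_up='eth0      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1'

# The 1.8.6.12 check: the RUNNING line has "UP" as its first field, so the
# comparison with the interface name is false and nothing is printed.
old_check() { awk -v IFACE="$1" '/RUNNING/ { if ($1 == IFACE) print "running" }'; }

# The check from the patch: any line containing RUNNING prints "running".
new_check() { awk '/RUNNING/ { print "running" }'; }

old_result="$(printf '%s\n' "$eth0_up" | old_check eth0)"
new_result="$(printf '%s\n' "$eth0_up" | new_check)"
echo "old: '${old_result}'  new: '${new_result}'"
# prints: old: ''  new: 'running'
```

This only exercises the text matching. Whether the driver sets RUNNING at that point is a separate question.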
Yeah, this RUNNING check is a terrible idea. You get chicken-egg with wireless cards. For my orinoco to show RUNNING, dhcpcd eth1 has to be executed... dhcpcd is executed by net.eth1.
I should really be more specific. Here's the output of 'ifconfig eth1' at system startup: eth1 Link encap:Ethernet HWaddr 11:11:11:11:11:11 BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) Interrupt:11 Base address:0x100 I've removed my MAC address. Line 66: if [ "${carrier_IFACE}" = "running" ] then /sbin/dhcpcd ${dhcpcd_IFACE} ${IFACE} >/dev/null This condition can never be reached.
patch does not work for me although the net seems to come up it really does not. I first thought it had worked as "it came up" but eth0 was not working. Old version still works properly after it is sub'd it back in. derk
NOTE: my net.eth0 is also configured for 'dhcp' configuration so comment #11 applies to me also ... (by the way my net.eth1 was still old style and therefore came up properly. In order to get internet access back I copied it over top of the problematic net.eth0) derk
In reply to the patch in comment 9, that patch makes things worse - `ifconfig eth0` will ALWAYS contain "eth0 Link", whether or not the interface is actually up, so not only does it not bring the interface up, it SAYS it IS bringing the interface up. The "fix" here is to remove all references to the carrier_IFACE, and to _not_ apply the comment 9 patch for status_IFACE. After a bit of testing, my ifconfig eth0 output does not change at all when the cable is plugged in vs. not plugged in. Neither does mii-tool help, as it won't report on a down interface.
Comment #9 changed the "No link" message to "SCIOADDR: Network Unreachable", but does not solve the problem at all. Can somebody tell me how to extract that init script from the previous baselayout version, because it worked fine.
Hi, I've just made a patch, as it seems the problem is solved for people who tried my modifications :-) Here it is:

--- /etc/init.d/net.eth0	2003-11-14 11:03:40.000000000 +0100
+++ /etc/init.d/net.eth0	2003-11-14 11:06:57.000000000 +0100
@@ -32,8 +32,8 @@
 	dhcpcd_IFACE="$(eval echo \$\{dhcpcd_${iface}\})"
 	inet6_IFACE="$(eval echo \$\{inet6_${iface}\})"
 	alias_IFACE="$(eval echo \$\{alias_${iface}\})"
-	status_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '/Link/ { if ($1 == IFACE) print "up" }')"
-	carrier_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '/RUNNING/ { if ($1 == IFACE) print "running" }')"
+	status_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '"/Link/" { if ($1 == IFACE) print "up" }')"
+	carrier_IFACE="$(ifconfig -a | gawk -v IFACE="${iface}" '"/RUNNING/" { if ($1 == IFACE) print "running" }')"
 	vlans="$(eval echo \$\{iface_${iface}_vlans\})"
 }

Good luck :-)
Neither of the two patches works for me (I use dhcp for my network card). The problem is that upon boot (or after running /etc/init.d/net.eth0 stop), eth0 is not listed in the output of ifconfig. So ${status_IFACE} contains the empty string, as does ${carrier_IFACE}. Therefore the script outputs that no cable is plugged into my card, instead of starting dhcpcd.
For Comment #15: simply re-merge baselayout-1.8.6.11, eg: emerge =baselayout-1.8.6.11 Then do the etc-update dance and review the changes, most you can probably just delete (in which case you'll keep the 1.8.6.12 version), but the net.eth0 changes you should look at carefully and either merge them or overwrite the file (in which case you'll go from a .12 net.eth0 to a .11 net.eth0) BTW, I got bitten by this too, had to play musical monitors and keyboards to find out why my headless box in the corner was not responding to any network requests :)
In reply to comment #14: sure, the Link change was not really thought through or tested. As for 'ifconfig eth0' not showing 'RUNNING' when the cable is in ... broken driver?
I don't think it's a broken driver; I use the 8139too driver, which is pretty thoroughly tested by now. I think the carrier test should be removed, but there must be some way of telling the script there isn't a cable connected so it doesn't spend 5 minutes waiting for a DHCPOFFER that will never come. Until that can be figured out, just remove the offending carrier test lines.
just noticed a typo in it as well: # Check that eth0 was not brough up by the kernel ... s/brough/brought
*** Bug 33472 has been marked as a duplicate of this bug. ***
Could those of you that have issues with carrier detection merge 'ethtool', and do: # ethtool eth0 | grep Link Link detected: yes # Thanks.
Results on my - carrier detection broken - machine: root@tomsk swat # ethtool eth0 Settings for eth0: No data available root@tomsk swat #
In reply to comment 19, RUNNING is displayed if the IFF_RUNNING bit is set - however very few drivers actually SET this bit, so, as you put it, most drivers are "broken". If you don't believe me, grep through the kernel source for IFF_RUNNING - only a very small number of drivers set it directly. Most call netif_carrier_on/_off, which doesn't set it. linux/net/core/dev.c will set IFF_RUNNING when you retrieve interface information - but only if both the carrier is ok AND the interface is running. Regarding the ethtool information, the below is from a Broadcom Gigabit network adapter (using the tg3 kernel driver): [root@hades root]# ethtool eth0 && ifconfig eth0 down && ethtool eth0 && ifconfig eth0 up Settings for eth0: Supported ports: [ MII ] Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Half 1000baseT/Full Supports auto-negotiation: Yes Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Half 1000baseT/Full Advertised auto-negotiation: Yes Speed: 10Mb/s Duplex: Half Port: Twisted Pair PHYAD: 1 Transceiver: internal Auto-negotiation: on Supports Wake-on: g Wake-on: d Current message level: 0x000000ff (255) Link detected: yes Settings for eth0: Cannot get device settings: Resource temporarily unavailable Supports Wake-on: g Wake-on: d Current message level: 0x000000ff (255) Link detected: no Running the same on another machine with a 3Com card gives similar results. The current carrier logic needs a bit of work, IFF_RUNNING _is_ useful, but is only reliable when the interface is up. For example, if I run ``ifconfig eth0 up'' (but don't specify any network address) the RUNNING shows up as you would expect - it's there if the cable is plugged in, not there if the cable is unplugged. 
So, the logic should be like this: - run ifconfig eth0 up - check ``ifconfig eth0'' for 'RUNNING' - if not there, display warning, otherwise - run dhcpcd (dhcpcd doesn't care if the interface is already up) I'm attaching (so as not to lose tabs, etc.) a /etc/init.d/net.eth0 patch that fixes the carrier detection logic without removing the feature. I quite like the feature, in fact, as it now means I can start up my notebook without an active connection and won't have to wait for dhcpcd to timeout. There's also a minor bug in the baselayout ebuild - the ``chown root.uucp'' needs to change to root:uucp for updated POSIX compliance with newer glibc's, along the same vein as the `head -1' => `head -n 1' changes that have been needed all over the place recently.
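For readers without the attachment, the sequence listed at the top of this comment (bring the interface up, look for RUNNING, then start dhcpcd) might look roughly like the sketch below. It is not the attached patch: `start_dhcp_iface` is a hypothetical helper name, the Gentoo `ebegin`/`eend` wrappers are left out, and `dhcpcd` is called without the extra options the real script passes.

```shell
#!/bin/sh
# Sketch only. start_dhcp_iface is a hypothetical helper, not the code
# from the attached patch.
start_dhcp_iface() {
    iface="$1"

    # Bring the link up without assigning an address, so that the driver
    # gets a chance to report carrier state.
    ifconfig "${iface}" up || return 1

    # RUNNING is only meaningful once the interface is up.
    if ifconfig "${iface}" | grep -q 'RUNNING'; then
        # dhcpcd doesn't care that the interface is already up.
        dhcpcd "${iface}"
    else
        echo "${iface} is not plugged in or has no carrier signal" >&2
        return 1
    fi
}
```

On a real system this would be called as `start_dhcp_iface eth0` from a root shell. Drivers that set RUNNING late or never would still need separate handling.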
Created attachment 20807 [details, diff] Patch to fix net.eth0 carrier detection
Created attachment 20812 [details, diff] patch against net.ethX the bug is due to a typo in line 66 comparison won't work: there is a missing '='.
Hi, I am using the 3c59x driver with my 3Com card. Unless I run either dhcpcd or ifconfig eth0 up, my eth0 doesn't show up in ifconfig at all, so I would say either disable the link check completely or do ifconfig up first.

Now, about the actual check procedure. When my interface is down, ifconfig will not show it in the list at all, so the check will fail because it thinks there is no link. ifconfig -a however will actually list the interface. The problem is, the word Link will be there and RUNNING will not. Here is the output for a "down" interface.

ifconfig:
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:6 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:380 (380.0 b)  TX bytes:380 (380.0 b)

ifconfig -a:
eth0      Link encap:Ethernet  HWaddr 00:60:08:A9:DD:67
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:31494 errors:0 dropped:0 overruns:3 frame:0
          TX packets:302 errors:0 dropped:0 overruns:0 carrier:0
          collisions:5 txqueuelen:1000
          RX bytes:2215009 (2.1 Mb)  TX bytes:38780 (37.8 Kb)
          Interrupt:11 Base address:0xd000
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:6 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:380 (380.0 b)  TX bytes:380 (380.0 b)

So, I had to swap the status and carrier vars, and change ifconfig to ifconfig -a to get it to work.

P.S. Not to mention the fact that the stupid dependency checking promptly brought down all the services that depended on net, but since the restart of the interface failed, it didn't bring any of them back up. After fixing the script and starting net.eth0 I had to manually start half a dozen services since it wouldn't pick them up. Sucks.
In reply to comment 27, = works just as well (and exactly the same) as == in a bash [ comparison ]: $ if [ "foo" = "foo" ]; then echo "bar"; fi bar $ if [ "foo2" = "foo" ]; then echo "bar"; fi $ if [ "foo" == "foo" ]; then echo "bar"; fi bar $ if [ "foo2" == "foo" ]; then echo "bar"; fi In reply to comment 28, try applying my patch - it does an ifconfig up before checking for RUNNING. Also, for next time, it might be useful to remember the ``rc'' command - without arguments, it starts any services needed for your current runlevel, and stops any that shouldn't be running.
Created attachment 20832 [details, diff] net_eth0-fix-carrier-detection.patch Jason has the answer, thanks! Please try attached patch with a bit more error checking. As for the other 'fix' in comment #3: -- Found a solution by replacing : status_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '/Link/ { if ($1 == IFACE) print "up" }')" with status_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '"/Link/" { if ($1 == IFACE) print "up" }')" -- I do not like the '"/Link/"', etc things in gawk - its not correct according to me (could be wrong). I unfortunately did not check the status_IFACE at the time I added the patch (cant remember which bug), but the 'more correct' way is to do "gawk '$0 ~ /Link/ ....".
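To make the doubt about '"/Link/"' concrete: in awk, a pattern that is a plain string constant is just an expression, and a non-empty string counts as true, so such a pattern selects every input line instead of matching a regex. The sketch below compares the regex form and the quoted form on a made-up "down" interface stanza (no UP, no RUNNING). It uses plain awk and an illustrative MAC address, and it is a demonstration of awk semantics, not a statement about what any attached patch does.

```shell
#!/bin/sh
# First two lines of an `ifconfig -a` stanza for an interface that is DOWN.
# The values are illustrative; note there is no RUNNING flag anywhere.
down='eth0      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          BROADCAST MULTICAST  MTU:1500  Metric:1'

# Regex pattern: only lines containing RUNNING reach the action.
regex_result="$(printf '%s\n' "$down" |
    awk -v IFACE=eth0 '/RUNNING/ { if ($1 == IFACE) print "running" }')"

# String-constant pattern: "/RUNNING/" is a non-empty string, which is
# true for every line, so only the $1 == IFACE test is left.
string_result="$(printf '%s\n' "$down" |
    awk -v IFACE=eth0 '"/RUNNING/" { if ($1 == IFACE) print "running" }')"

echo "regex: '${regex_result}'  string: '${string_result}'"
# prints: regex: ''  string: 'running'
```

So the quoted variant reports "running" for any interface that `ifconfig -a` lists, carrier or not, which would explain why it appeared to fix things.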
Az, that patch seems to work for me
Hm, I'm not happy with the patches so far. My understanding is that the new net.ethX tries to avoid the long delay when the network cable is not plugged in and the interface wants an IP address via DHCP. That is a problem if you use a laptop 'in fresh air' ;-)

But that is not real life. DHCP is designed to minimize admin work. In most cases, if no network connection is found, it is an error condition. That holds not only on a server, but also if you plug the laptop in at a customer site and the network connection is not working properly. What you have to do in this case is bring up the interface manually and start every dependent service by hand. Or, if you are a typical Windows user, restart the machine.

dhcpcd has the option -t <timeout>, default 60sec. And I think that can be a better solution. Use a short timeout for laptops or a long one for servers. This solution not only fixes this ugly carrier detect problem, it will also fix the upcoming problem if a DHCP server is unavailable due to a reboot or simple network overloads/problems, or if there is no DHCP server in a working network at all. It is also possible to fork a script or dhcpcd into the background to bring up the interface if a connection/DHCP server becomes available.

Pls forget the carrier detection.
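For reference, the patch context earlier in this report shows net.eth0 reading a per-interface `dhcpcd_${iface}` variable and passing it to /sbin/dhcpcd, so the timeout approach would presumably be configured along these lines. The value 15 is only an example, and the sketch assumes the variable is set in /etc/conf.d/net.

```shell
# /etc/conf.d/net (fragment, sketch)
# Extra arguments handed to dhcpcd for eth0. -t is the timeout in seconds;
# 15 is an example value, not a recommendation.
dhcpcd_eth0="-t 15"
```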
Alright, let's see if I can address this and save the devs some work: > Hm, i'm not happy with the patches so far. My understanding is that the new > net.ethX tries to avoid the long delay if the network cable is not pluggend > in and the interface want an ip adress with dhcp. That is a problem if you use > a laptop 'in fresh air' ;-) > > But that is not real life. DHCP is desinged no minimize admin work. In most > cases if no network connection is found it is an error condition. Not only > if you wat use it on a server, even also if you try to plugin the laptop on > customer site and the network connection is not working propertly. What you > have to do in this case is to bring up the interface manualy and to start > every depended service by hand. Or, if you a usual windows user, restart the > machine. Yes, that is precisely the point. If no cable is plugged in, it isn't possible that a connection can be established - there's no cable to establish the connection on. The carrier detection aids in that it displays a message on boot if no network cable is plugged in. An error is going to be displayed, you can't change that fact, but an error such as "No cable connected" is going to help you fix the problem faster than a generic "Couldn't start eth0" error message. With regards to your point about manually starting the interface, well, yes you will have to do this - but the carrier detect doesn't factor in here because the only time it will show up (and bypass dhcpcd) is if there is no cable connected. Once you've figured out the problem, and have brought the network connection up manually, it's simply a matter of running ``rc'' - you certainly don't have to manually start each service. > dhcpcd has the option -t <timeout>, default 60sec. And i think, that can be > a better resolution. Use a short timeout for laptops or a long for servers. 
> This resolution not only fixes this ugly carrier detect problem, it will also > fix the upcoming problem if a dhcp server is unavalilable due to reboot or > simple network overloads/problems or if there is no dhcp server in a working > network at all. It is also possible to fork a script or dhcpcd into the > background to bring up the interface if a connection/dhcp server becomes > available. I've done that on my own laptop; I lowered it from 60 to 15. I would have lowered it to 10 or even 5, except those didn't work - I ended up with occassional dhcpcd timeouts on my home connection. It was still a nuissance as if I wanted to start up my notebook without a connection I'd have to wait an additional 15 seconds for something that I _know_ is going to fail. I fail to see what is "ugly" about the detection (at least, after applying the final patch in comment 30). Forking it off into the background isn't a realistic option - many, many services depend on the network interface being up, and can't be started until the network interface is up. Nothing in your objection actually has anything to do with the carrier detection - it seems to be more of a rant about not liking Gentoo's init scripts for bringing up the network interface. If you have a well thought out suggestion on how to improve it in such a way that it doesn't introduce other problems, please, by all means, file a bug! I may even help you with it, and I'm sure several devs would come on board if it was a really promising idea. But that isn't what THIS bug is about. This bug is about using carrier detection to immediately return an informative error rather than waiting for a dhcpcd timeout that is _GUARANTEED_ to happen. It's a nice shortcut, and I expect many users will find it useful - I know I've run into situations in the back where a "network cable unplugged"-type error would have saved me time. 
If there are situations where the carrier detection breaks (for example, some exotic network card that doesn't properly set the kernel's network interface carrier bit), then please support then - in such a case it may be necessary to make the code optional. > Pls forget the carrier detection. If you have some legitimate objections to the carrier detection rather than a rant about Gentoo's boot design, please add your comments and I'll attempt to help get them resolved. Otherwise, the carrier detection is a useful feature that, once patched, should help users diagnose network startup issues without adverse effect. That reminds me of something else: Azarah, perhaps the carrier detection should go into the non-DHCP network setup as well? You could check RUNNING immediately after running ifconfig up. In such a case it should probably be a warning rather than a failure since the machine _does_ have an IP and the other services _can_ start, even if they won't be accessible. But even just a warning might be useful to many.
Ok, last post here; the discussion should be carried on in another place. gentoo-dev?

> This bug is about using carrier detection to immediately return an
> informative error rather than waiting for a dhcpcd timeout that is
> _GUARANTEED_ to happen.

That is exactly what doesn't happen every time. Imagine you have some servers running Gentoo far, far away, and there is a power failure at that location. You lose the machines if, in the millisecond in which you ask for the carrier, the switch is unavailable for whatever reason. Services like sshd won't start up even if the network is available a second later. If you need this server for your laptop, you are lost.
Hmm, you've got a point (re: comment 34). I'm all in favour of making this an option in /etc/conf.d/net. It _is_ a nice feature for users with laptops, and a useful debugging tool for desktop systems, but it is a potential problem with headless machines (i.e. servers), as mentioned, since the check is an instantaneous thing, while dhcpcd by default allows up to 60 seconds for the network connection to be established. However, the approach I am suggesting is not "drop it, it's stupid, it can be bad sometimes so should never be used" but rather "do the carrier check by default, but provide an easy option to turn it off for those cases where it isn't wanted."
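A minimal sketch of what such a switch could look like, purely as an assumption: `carrier_check_<iface>` is a hypothetical variable name that does not exist in any baselayout release, and `carrier_check_enabled` is a hypothetical helper. The check stays on by default and is skipped only when the variable is set to "no".

```shell
#!/bin/sh
# Hypothetical: carrier_check_eth0="no" in /etc/conf.d/net would disable
# the carrier test for eth0. Neither name exists in baselayout.
carrier_check_enabled() {
    iface="$1"
    # Look up carrier_check_<iface>, defaulting to "yes" when unset.
    eval "setting=\${carrier_check_${iface}:-yes}"
    [ "${setting}" != "no" ]
}

# Possible use inside the start function:
#   if carrier_check_enabled "${IFACE}"; then
#       ... test for RUNNING and bail out early if it is missing ...
#   fi
#   ... run dhcpcd as before ...
```

A headless server would then set the variable to "no" and fall back to the plain dhcpcd timeout.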
Can we get back to the point please? The simple problem that I am having is that carrier detection does not work on my machine. For one reason or another my network card does not appear to support it (see comment 24). This has caused problems for me, and also it seems for other people. It returned that there was no carrier - when there blatantly was! Surely if carrier detection is that unreliable - then there is no way we should allow it to be a default option?
Simon, did you even try Jason's original patch, or my final version ??
Martin, many thanks, your latest patch fixed it :-)
Created attachment 20905 [details, diff] This workaround is needed because the behaviour of some ethernet drivers breaks the carrier autodetection stuff in net.eth0 The new carrier autodetection stuff in net.eth0 has introduced a problem: after an 'ifconfig up', some ethernet drivers (eg: e100) do not immediately mark the interface as 'RUNNING' (and, as Jason pointed out in comment #25, some drivers never do). But for those who are lucky enough to have a driver that sets this IFF_RUNNING bit (but after some delay), I propose this patch.
*** Bug 33761 has been marked as a duplicate of this bug. ***
I noticed the same thing as Marc (comment 40) yesterday on my broadcom (tg3) card, but only when the system first comes up. `ifconfig down; ifconfig up' would show RUNNING immediately, but as Marc described, I also found a 2 second timeout was needed when the system first booted for my tg3 adapter.
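The delay described here could be handled by polling for RUNNING a few times before giving up, roughly as sketched below. This is not the attached workaround: `wait_for_carrier` is a hypothetical helper, and the default of 5 one-second attempts is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only. wait_for_carrier is a hypothetical helper; the retry count
# and the one-second interval are arbitrary.
wait_for_carrier() {
    iface="$1"
    tries="${2:-5}"

    while [ "${tries}" -gt 0 ]; do
        # Succeed as soon as the driver reports RUNNING.
        if ifconfig "${iface}" | grep -q 'RUNNING'; then
            return 0
        fi
        tries=$((tries - 1))
        # Pause between attempts, but not after the last one.
        if [ "${tries}" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Possible use after `ifconfig eth0 up`:
#   wait_for_carrier eth0 3 || echo "eth0: no carrier after 3 attempts" >&2
```

This still fails on drivers that never set the bit, so it only helps the delayed case.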
Even comment #3 got things running on my set of mostly-e100 machines. Checking for a carrier does sound like a decent idea. I might even suggest firing up dhcp but *not waiting* to see if it succeeded if there was no carrier, but watching for success if there already was a carrier. So leaving it on for laptops and such would not slow down booting but allow them to hopefully come to life as soon as they are plugged in. And get around the whole problem of cards not being properly recognized by this logic. In any case, what are the chances of getting something - anything - applied to baselayout soon? I had to go patch this file on one of my machines here that I'd forgotten to fix... I was hoping it would've solved itself by now.
OK I could be wrong on this, but I believe that looking for Link in the script is incorrect, the only time this will not be present is in event of the interface not being present at all but this is not what the script appears to be looking for, it is just trying to check whether the status of the interface is up. As such I have made the following modifications to my script, which appears to work properly in all situations apart from a complete card failure (If we want to check for this it will require some additional code).

Changes to net.eth0 script are as follows:

<status_IFACE="$(ifconfig ${iface} | gawk '/UP/ { print "up" }')"
<carrier_IFACE="$(ifconfig ${iface} | gawk '/RUNNING/ { print "running" }')"
---
>status_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '/Link/ { if ($1 == IFACE) print "up" }')"
>carrier_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '/RUNNING/ { if ($1 == IFACE) print "running" }')"

<if [ "${carrier_IFACE}" != "running" ]
---
>if [ "${carrier_IFACE}" = "running" ]

Kind regards,
Robert Thorneycroft
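A slightly stricter variant of the UP test above would compare whole fields instead of using a substring regex, so that the letters "UP" inside some other word cannot match. The sketch assumes the net-tools layout quoted in this report and reads `ifconfig <iface>` output on stdin; `iface_is_up` is a hypothetical helper, and the "up" sample flag line is illustrative.

```shell
#!/bin/sh
# Sketch only. iface_is_up is a hypothetical helper that reads
# `ifconfig <iface>` output on stdin and succeeds if a field is exactly UP.
iface_is_up() {
    awk '{ for (i = 1; i <= NF; i++) if ($i == "UP") found = 1 }
         END { exit !found }'
}

# "Down" stanza as quoted in this report, and an illustrative "up" one.
down='eth0      Link encap:Ethernet  HWaddr 00:60:08:A9:DD:67
          BROADCAST MULTICAST  MTU:1500  Metric:1'
up='eth0      Link encap:Ethernet  HWaddr 00:60:08:A9:DD:67
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1'

if printf '%s\n' "$down" | iface_is_up; then down_status="up"; else down_status="down"; fi
if printf '%s\n' "$up" | iface_is_up; then up_status="up"; else up_status="down"; fi
echo "first sample: ${down_status}, second sample: ${up_status}"
# prints: first sample: down, second sample: up
```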
Re #38 - Sorry, yes I tried the final patch and it does seem better.
Created attachment 20935 [details, diff] A little revised patch that works for me (none of the others did) This is a revised version of Roberts script (comment 44). It works for me (none of the others did), plz give it a try! /Andreas (I use the eepro100 driver, maybe interesting to know that to be able to decide what works and not)
Re #46 and latest patch. Some parts of this patch seem to be unnecessary with the new entries suggested in my earlier post. Specifically the following sections make the code more difficult to read and do not perform anything that was not covered before: - carrier_IFACE="$(ifconfig | gawk -v IFACE="${iface}" '/RUNNING/ { if ($1 == IFACE) print "running" }')" and # Check that the interface has a carrier - if [ "${carrier_IFACE}" = "running" ] + /sbin/ifconfig ${IFACE} |grep -q 'RUNNING' >/dev/null + if [ $? -eq 0 ] If the line: carrier_IFACE="$(ifconfig ${iface} | gawk '/RUNNING/ { print "running" }')" is used instead of removing the carrier_IFACE definition from setup_env(), then there is no need to run ifconfig commands and check return codes in the middle of iface_start(), doing so only makes the code more complex and difficult to understand. It should still be noted that the following modification also needs to be made: - if [ "${carrier_IFACE}" = "running" ] + if [ "${carrier_IFACE}" != "running" ] This is because if the status is already running then dhcpcd does not need to be started, in your example you appear to be only trying to start dhcpcd if the interface is already up and running? I apologise if I did not understand your code correctly. Kind regards, Robert Thorneycroft
Some thoughts: A DHCP interface is like a PCMCIA card. If it is not plugged in at boot time, it is a bad idea to wait for a timeout in the hope that somebody plugs it in. But you are able to plug in this card at any time, and all services will immediately and magically listen on this interface too, unless configured otherwise. You can load a module for a network card at any time, even if the card is not connected to a network. If it loaded successfully you can see it in /proc/net/dev. A 'dhcpcd -t 99999 &' will wait a long time to bring that interface UP ;-) Carrier detection is not really necessary, but it will speed things up, especially if it is built into dhcpcd. Some services should start even if no (remote) network is available. sshd is a good example, from a server-side point of view. And sometimes you can abuse a laptop as a server. Or think about PCI hotplug.
*** Bug 33961 has been marked as a duplicate of this bug. ***
*** Bug 33970 has been marked as a duplicate of this bug. ***
*** Bug 33969 has been marked as a duplicate of this bug. ***
Re: comment 47 The problem is as Jason stated in comment 25, that the 'RUNNING' info is not shown if the device isn't up (at least for my device). That's why we/I need to do the check later on. Also I added a try to take the device down when no carrier was found, which makes it behave better if I try to restart the net.eth0 script later - else I end up with a message that everything is ok when it isn't. More comments? Anybody else that have tried my little patch? Results?
Can baselayout 1.8.6.12 be pulled from the portage tree until these issues are thrashed out? There's a good chance that anyone who installs 1.8.6.12 will break their network connection (as I did) - that well and truly cancels any potential gains it may have over 1.8.6.11 (which works fine for me).
Just to add some data: net.eth0 failed on my system, too. NIC: SMSC EPIC/100 83c170 Martin's patch from comment 30 solved the problem. ethtool's output is independent of ifconfig up|down status: With cable plugged in: bash-2.05b# ethtool eth0|grep Link Link detected: yes With cable plugged off: bash-2.05b# ethtool eth0|grep Link Link detected: no But all ifconfig output is independent of the cable plugging, so there is no way to detect carrier with this NIC simply by looking at the ifconfig output. You would need the routines of ethtool to achieve this. If ifconfig is down: bash-2.05b# ifconfig|grep eth0 bash-2.05b# ifconfig eth0 eth0 Protokoll:Ethernet Hardware Adresse 00:E0:29:28:0C:A1 BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:314937 errors:0 dropped:0 overruns:0 frame:3 TX packets:59428 errors:0 dropped:0 overruns:0 carrier:0 Kollisionen:0 Sendewarteschlangenlänge:100 RX bytes:86662192 (82.6 Mb) TX bytes:3285136 (3.1 Mb) Interrupt:10 Basisadresse:0xd400 If ifconfig is up: bash-2.05b# ifconfig eth0 Protokoll:Ethernet Hardware Adresse 00:E0:29:28:0C:A1 inet Adresse:134.60.106.64 Bcast:134.60.106.127 Maske:255.255.255.128 UP BROADCAST NOTRAILERS RUNNING MULTICAST MTU:1500 Metric:1 RX packets:315476 errors:0 dropped:0 overruns:0 frame:3 TX packets:59438 errors:0 dropped:0 overruns:0 carrier:0 Kollisionen:0 Sendewarteschlangenlänge:100 RX bytes:86701793 (82.6 Mb) TX bytes:3286730 (3.1 Mb) Interrupt:10 Basisadresse:0xd400
I hate to sound harsh, but this needs to be pulled and rolled back until a better solution is implemented. It took down 9 of 20 servers here and 4 workstations. I can deal with almost all issues; however, this is a bigger problem than a buggy application or ebuild. Please roll it back.
As per comment #54: this works great for me (2.6 8139too driver), replacing just the carrier_IFACE= line: carrier_IFACE="$(ethtool ${iface} | gawk '/Link detected: yes/ { print "running" }')" Though I'm not sure it would work for all network drivers. From the ethtool manpage: BUGS Not supported (in part or whole) on all ethernet drivers.
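Given the manpage caveat, it may help to tell "ethtool reports no link" apart from "ethtool gave no usable answer". The following is only a minimal sketch under that assumption: the function names parse_link_state and carrier_state are hypothetical and not part of baselayout, and plain awk stands in for gawk since nothing gawk-specific is needed.

```shell
#!/bin/bash
# Hypothetical helper, not part of baselayout: classify ethtool output
# read from stdin as "running", "down", or "unknown".
parse_link_state() {
    awk '
        /Link detected: yes/ { state = "running" }
        /Link detected: no/  { state = "down" }
        END { print (state == "" ? "unknown" : state) }
    '
}

# Query a real interface. A driver without ethtool support prints no
# "Link detected" line at all, which is reported as "unknown".
carrier_state() {
    LC_ALL=C ethtool "$1" 2>/dev/null | parse_link_state
}
```

A caller could then treat "unknown" as "do not block the interface", so that only an explicit "down" counts as a missing carrier.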
How about giving up on this current net.X and making some good ifplugd replacements?
*** Bug 34370 has been marked as a duplicate of this bug. ***
Just thought I'd add a 'me-too': v1.32 of net.eth0 broke all our network cards in the office on every Gentoo box. Going back to the 1.31 version (or, I would suspect, 1.34 currently in CVS) will fix the issue. Perhaps this should be made live ASAP if not done already?
Created attachment 21458 $(LC_MESSAGES=C ifconfig ...) On a localized system, you can't rely on messages displayed by the software unless you reset LC_MESSAGES when running it. On my French Gentoo, ifconfig says "Lien" instead of "Link"! ;-) Hope this helps.
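The effect can be reproduced without a localized system, because the pattern from net.eth0 simply fails to match translated text. The sketch below is an illustration, not the attached patch: status_from_ifconfig and iface_status are hypothetical names, and it swaps in LC_ALL=C for the attachment's LC_MESSAGES=C, on the grounds that a set LC_ALL takes precedence over LC_MESSAGES.

```shell
#!/bin/bash
# Hypothetical helper: read ifconfig-style output on stdin and print "up"
# if a line whose first field is the given interface contains "Link".
# This mirrors the pattern used in net.eth0 and only matches English output.
status_from_ifconfig() {
    awk -v IFACE="$1" '/Link/ { if ($1 == IFACE) print "up" }'
}

# Run ifconfig with the C locale forced for this one command, so the
# pattern above sees "Link" regardless of the user's language settings.
iface_status() {
    LC_ALL=C ifconfig | status_from_ifconfig "$1"
}
```

Fed a sample English line such as "eth0 Link encap:Ethernet ..." the parser prints "up"; fed the French "Lien" or German "Protokoll" variant it prints nothing, which is exactly the failure described above.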
I thought I'd add my two cents as well, since everyone else seems to be doing so. I'm in favor of not checking for a link state in net.eth0. I was unaware of ethtool; I've been using mii-tool for years. It seems, at least for me, that mii-tool works a bit better for the cards that support it. I can ifconfig eth0 down, and ethtool will not show a link. mii-tool, however, will properly show what is going on at the "plug" level. In my case, after a slight pause for autonegotiation to occur, mii-tool shows that there is a link even though eth0 is down. ethtool still shows no link. I suspect that when one uses ifconfig to up or down an interface, it resets the connection and auto-negotiation needs to occur before settling to link on. I've done some network driver programming myself, and have been using mii-tool for a while. Ethtool is new to me. It must operate on a somewhat higher level, perhaps that of ethX (hence the name!). mii-tool talks more or less directly to the driver, which in turn takes a look at the MII registers for the card. Not all cards support MII (a card would have to be pretty old for this to be the case), but I don't think I even have anything like that: I've got about 20 old NICs in a box in my basement (various brands) and I can't remember ONE that does not support MII. Anyway, that's my two cents.
None of the fixes provided thus far work on all NICs. One fix, without downgrading baselayout, is to grab the net.eth0 script from rc-scripts-1.4.3.11p2.tar.bz2 and use it until this problem is solved.
Wow, this bug is still "new". :) After checking out the latest addition, I'd like to amend my comment above. It has since become apparent to me that mii-tool is "old" and on the way out, while ethtool is "new" and on the way in. Any use of mii-tool should be migrated to ethtool. Any planned use of checking for link state in net.ethX scripts should use ethtool.
Re: comment 63, ethtool won't solve this problem. In particular, see 'man ethtool':

BUGS
Not supported (in part or whole) on all ethernet devices

Furthermore, try this:

# mii-tool eth0
# ethtool eth0
# ifconfig eth0 down
# mii-tool eth0
# ethtool eth0

In reply to this bug in general, here are the outputs of mii-tool and ethtool given four different network cards I have quick access to, which are all fairly common:

On my 'tg3' (Broadcom gigabit) on my Dell notebook, I get:

# mii-tool # Same whether up or down:
eth0: negotiated 100baseTx-FD flow-control, link ok

# ethtool eth0 # when eth0 down:
Settings for eth0:
Cannot get device settings: Resource temporarily unavailable
Supports Wake-on: g
Wake-on: d
Current message level: 0x000000ff (255)
Link detected: no

# ethtool eth0 # when eth0 up:
Settings for eth0:
Supported ports: [ MII ]
Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Half 1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Half 1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 100Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: g
Wake-on: d
Current message level: 0x000000ff (255)
Link detected: yes

Now, switching over to my desktop system, which uses 'forcedeth' (the reverse-engineered nForce Ethernet driver):

# mii-tool eth0 # same whether up or down:
SIOCGMIIPHY on 'eth0' failed: Operation not supported

# ethtool eth0 # same whether up or down:
Settings for eth0:
No data available

Here's a via-rhine:

# mii-tool eth0 # eth0 up:
eth0: negotiated 100baseTx-FD flow-control, link ok

# mii-tool eth0 # eth0 down:
SIOCGMIIPHY on 'eth0' failed: Operation not supported

# ethtool eth0 # same output whether up or down:
Settings for eth0:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full
Advertised auto-negotiation: Yes
Speed: 100Mb/s
Duplex: Full
Port: MII
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Current message level: 0x00000001 (1)
Link detected: yes

And finally, here's a 3c59x:

# mii-tool eth0 # eth0 up:
eth0: negotiated 100baseTx-FD, link ok

# mii-tool eth0 # eth0 down:
No MII transceiver present!.

# ethtool eth0 # Whether up or down:
Settings for eth0:
No data available

So, I've given 4 network cards, which are all reasonably common (I've got a Realtek around here somewhere, but it's not as easily accessible as it isn't currently in a machine), none of which work in both mii-tool and ethtool. It doesn't appear that either mii-tool or ethtool can get the job done, so depending on either would appear to be out of the question.

In the interests of getting this bug fixed, how about adding some new configuration options to /etc/conf.d/net - perhaps along the lines of:

# To enable link-detection (does not work on all network cards):
#checklink_eth0="mii-tool"
#checklink_eth1="ethtool"
#checklink_eth2="ifconfig"

That would give the capability to people with cards that support it, and not force a significant problem onto those with cards that don't support it. Furthermore, as has already been mentioned about 50 comments ago, not bringing up a network interface because the cable is unplugged is often a Bad Thing - if a cable is unplugged, to get the system back on the network all I should have to do is plug the cable back in. The above mentioned configuration option would give people the ability by simply commenting it out.
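To make the proposal concrete, here is one way such an opt-in setting might be consumed. This is purely a sketch of the idea: the link_ok function is hypothetical, no checklink_* handling exists in baselayout, and the grep patterns are taken from the tool outputs quoted in this bug rather than from any specification.

```shell
#!/bin/bash
# Hypothetical sketch of the proposed checklink_<iface> option.

# Example setting as it might appear in /etc/conf.d/net:
checklink_eth1="ethtool"

# Return 0 if no check is configured or the configured tool reports a link,
# non-zero if the configured tool does not report a link.
link_ok() {
    local iface="$1"
    # Skip interfaces whose name cannot form a shell variable name.
    [[ ${iface} =~ ^[A-Za-z0-9_]+$ ]] || return 0
    local var="checklink_${iface}"
    local method="${!var:-}"    # indirect expansion: value of checklink_<iface>

    case "${method}" in
        "")       return 0 ;;   # option unset or commented out: no link detection
        ethtool)  LC_ALL=C ethtool "${iface}" 2>/dev/null | grep -q 'Link detected: yes' ;;
        mii-tool) LC_ALL=C mii-tool "${iface}" 2>/dev/null | grep -q 'link ok' ;;
        ifconfig) LC_ALL=C ifconfig "${iface}" 2>/dev/null | grep -q 'RUNNING' ;;
        *)        echo "unknown checklink method: ${method}" >&2; return 0 ;;
    esac
}
```

The design choice here is that an unset or unrecognized setting never blocks the interface, so link detection stays strictly opt-in per card.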
As Jason mentioned in the previous comment, mii-tool and ethtool support is spotty in the drivers. mii-tool only works up to 100 Mbit adapters (though tg3 seems to support it for link detection). I have two 10 Gbit adapters that only support ethtool.

Since this bug was opened, net.eth0 was completely rewritten. It now has preup() and predown() functions which can be defined in /etc/conf.d/net. The functions are called with the interface as the first parameter, for example you can do:

preup() {
    if [[ $1 == eth0 ]]; then
        # hey, I *know* this card supports ethtool
        ethtool "$1" | grep -q 'Link detected: yes'
        return $?
    fi
    return 0
}

net.eth0 will abort if the return value from preup() is non-zero. Yes, this requires more work on your part, but it means you get to choose unequivocally what kind of link detection you want to do. That's what Gentoo is about, right? Choice! :-)

I don't foresee using mii-tool or ethtool or any other kind of link detection directly in net.* any time in the future, so I'm closing this bug at this point FIXED since you have preup() available.
*** Bug 25480 has been marked as a duplicate of this bug. ***
*** Bug 58586 has been marked as a duplicate of this bug. ***