Starting with kernel 2.6.21, Ethernet-over-FireWire devices (driver: eth1394) no longer have a proper parent device/subsystem in sysfs. The code change responsible is:

--- linux-2.6.20-gentoo/drivers/ieee1394/eth1394.c
+++ linux-2.6.21-gentoo/drivers/ieee1394/eth1394.c
@@ -585,7 +584,10 @@
 	}
 	SET_MODULE_OWNER(dev);
+#if 0
+	/* FIXME - Is this the correct parent device anyway? */
 	SET_NETDEV_DEV(dev, &host->device);
+#endif
 	priv = netdev_priv(dev);

The missing parent device breaks persistent-net: it adds DRIVERS=="?*" to the generated rules, but the driver attribute resides in the parent device, which is no longer available. As a result the existing rule does not match. Instead, every boot runs into the code that assigns a new number to the interface, so the ruleset grows by one line at every boot.
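To illustrate the failure mode, here is a minimal Python sketch of the DRIVERS=="?*" check. It is a simplified model and not udev's actual matching code: the function name and the list-of-dicts layout are assumptions made for the example, and only the driver names are taken from udevinfo output quoted in this report.

```python
from fnmatch import fnmatchcase


def drivers_key_matches(chain, pattern="?*"):
    """Return True if any device in the chain has a driver matching pattern.

    Simplified model: real udev evaluates further keys for each device.
    """
    return any(fnmatchcase(dev["driver"], pattern) for dev in chain)


# Device chains, the net device first, then its parents (hypothetical layout).
chain_2_6_20 = [
    {"kernel": "eth1", "driver": ""},
    {"kernel": "fw-host0", "driver": "nodemgr"},
]
chain_2_6_21 = [
    {"kernel": "eth2", "driver": ""},
    {"kernel": "net", "driver": ""},
    {"kernel": "virtual", "driver": ""},
]

print(drivers_key_matches(chain_2_6_20))  # True
print(drivers_key_matches(chain_2_6_21))  # False
```

The pattern "?*" needs at least one character, so a chain in which every driver is empty can never satisfy it.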
The commitdiff for this change can be found here: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=7a9eeb2fa1b3a3a83670b9ba08dd396beedb88f8
I reported your bug to LKML as a reply in the thread of the bug report which led to the patch: http://lkml.org/lkml/2007/5/13/53

Note, I don't see this bug here myself:

$ ls -l /etc/udev/rules.d/70-persistent-net.rules
-rw-r--r-- 1 root root 352 Mar 4 13:17 /etc/udev/rules.d/70-persistent-net.rules
$ cat /etc/udev/rules.d/70-persistent-net.rules
# Firewire device 080028560000319b)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="08:00:28:56:00:00:31:9b", NAME="eth1"

# Firewire device )
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:10:dc:56:00:fe:d2:d4", NAME="eth2"

# PCI device 0x8086:0x109a (e1000)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:16:17:a5:be:0c", NAME="eth0"

(I have two FireWire controllers installed.) However, I don't reboot often, and I don't load eth1394 during startup anymore. I still load eth1394 frequently for testing though. Also, I'm using kernel.org kernels patched with recent IEEE 1394 development patches, but that shouldn't matter.
Stefan, thanks for jumping in. I'd be interested to see your output of "udevinfo -a -p /sys/class/net/eth2" where eth2 is an eth1394 device.
# udevinfo -a -p /sys/class/net/eth2

Udevinfo starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/virtual/net/eth2':
    KERNEL=="eth2"
    SUBSYSTEM=="net"
    DRIVER==""
    ATTR{weight}=="0"
    ATTR{tx_queue_len}=="1000"
    ATTR{flags}=="0x1003"
    ATTR{mtu}=="1500"
    ATTR{operstate}=="unknown"
    ATTR{dormant}=="0"
    ATTR{carrier}=="1"
    ATTR{broadcast}=="ff:ff:ff:ff:ff:ff:ff:ff"
    ATTR{address}=="00:10:dc:56:00:fe:d2:d4"
    ATTR{link_mode}=="0"
    ATTR{type}=="24"
    ATTR{features}=="0x20"
    ATTR{ifindex}=="4"
    ATTR{iflink}=="4"
    ATTR{addr_len}=="8"

  looking at parent device '/devices/virtual/net':
    KERNELS=="net"
    SUBSYSTEMS==""
    DRIVERS==""

  looking at parent device '/devices/virtual':
    KERNELS=="virtual"
    SUBSYSTEMS==""
    DRIVERS==""

In contrast, udevinfo shows e1000 as DRIVER of the grandparent device of my eth0 device. However, /etc/udev/rules.d/70-persistent-net.rules looks like this:

# Firewire device 080028560000319b)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="08:00:28:56:00:00:31:9b", NAME="eth1"

# Firewire device )
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:10:dc:56:00:fe:d2:d4", NAME="eth2"

# PCI device 0x8086:0x109a (e1000)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:16:17:a5:be:0c", NAME="eth0"

The timestamp of my 70-persistent-net.rules is March 4 2007. I wrote the patch which moves eth1394's devices beneath the virtual device on March 20.

---------------------------------------------

Side note, if this is of any relevance: I get the following in /var/log/messages when eth1394 creates its interfaces.
May 20 10:33:46 stein ieee1394: Host added: ID:BUS[0-00:1023] GUID[080028560000319b]
May 20 10:33:46 stein ieee1394: Host added: ID:BUS[1-00:1023] GUID[0010dc5600fed2d4]
May 20 10:33:48 stein eth1394: eth1: IPv4 over IEEE 1394 (fw-host0)
May 20 10:33:48 stein eth1394: eth2: IPv4 over IEEE 1394 (fw-host1)
May 20 10:33:48 stein rc-scripts: We only hotplug for ethernet interfaces
May 20 10:33:48 stein rc-scripts: We only hotplug for ethernet interfaces

I bring the interfaces up when I need them with

# /etc/init.d/net.eth1 start
# /etc/init.d/net.eth2 start

I have only their IP address and netmask configured in /etc/conf.d/net.
How are entries added to 70-persistent-net.rules anyway? I just created /etc/init.d/net.eth3 and plugged in a 3rd FireWire card (I have a PCI to CardBus adapter installed). 70-persistent-net.rules remains unchanged.
They are generated by /etc/udev/rules.d/75-persistent-net-generator.rules
That file checks for DRIVERS=="?*" before generating any rules, so it's not clear why the sysfs change would be "enlarging the ruleset with one line at every boot", unless this is a new addition. However, the bug is still there in that previously generated rules match on DRIVERS=="?*", and such rules won't work with 2.6.21 kernels. E.g. try swapping your eth1 and eth2 entries in 70-persistent-net.rules: I don't think the devices would follow the new order (and if they do, it's probably coincidence).
What does it take to rename network interfaces anyway? I tested with the following changes.

In 70-persistent-net.rules:

# Firewire device 080028560000319b)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="08:00:28:56:00:00:31:9b", NAME="fwti"

# Firewire device )
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:10:dc:56:00:fe:d2:d4", NAME="fwvia"

# PCI device 0x8086:0x109a (e1000)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:16:17:a5:be:0c", NAME="eth0"

In /etc/init.d:

net.eth0 -> net.lo
net.eth1 -> net.lo
net.eth2 -> net.lo
net.eth3 -> net.lo
net.fwti -> net.lo
net.lo

In /etc/conf.d/net:

...
config_fwti=( "10.10.222.42 netmask 255.255.255.0" )
...

# modprobe eth1394

May 20 16:47:11 stein eth1394: eth1: IPv4 over IEEE 1394 (fw-host0)
May 20 16:47:11 stein eth1394: eth2: IPv4 over IEEE 1394 (fw-host1)
May 20 16:47:11 stein rc-scripts: We only hotplug for ethernet interfaces
May 20 16:47:11 stein rc-scripts: We only hotplug for ethernet interfaces

# /etc/init.d/net.fwti start
 * Caching service dependencies ... [ ok ]
 * Service net.fwti starting
network interface fwti does not exist
Please verify hardware or kernel module (driver) [ !! ]
 * ERROR: net.fwti failed to start

May 20 16:49:42 stein rc-scripts: network interface fwti does not exist
May 20 16:49:42 stein rc-scripts: Please verify hardware or kernel module (driver)
May 20 16:49:42 stein rc-scripts: ERROR: net.fwti failed to start

Can't I specify network interfaces by their MAC? The MAC is the only persistent and unique device property, hence the only property that can safely be used to persistently refer to network interfaces.
(In reply to comment #8)
> Can't I specify network interfaces by their MAC? The MAC is the only
> persistent and unique device property, hence the only property that can safely
> be used to persistently refer to network interfaces.

This is what this bug is about: udev uses the MAC to persist net devices. But to exclude secondary interfaces (bridges, bondings, VLANs and the like) that use the same MAC as the master device they are attached to, there is the additional check for DRIVERS=="?*". This check is broken by the latest kernel changes!
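The role of that extra check can be sketched in Python. This is only an illustrative model of the rule generator, not the real write_net_rules logic: the function name, the `parent_driver` field, the `br0` bridge and the simple ethN numbering are all assumptions made for the example.

```python
def generate_rules(interfaces, first_free_index=0):
    """Emit one persistent-net rule per interface that has a parent driver.

    Interfaces without a driver (bridges, bondings, VLANs in this model)
    are skipped, which is what the DRIVERS=="?*" check is meant to achieve.
    """
    rules = []
    index = first_free_index
    for iface in interfaces:
        if not iface["parent_driver"]:
            continue  # models DRIVERS=="?*" failing to match
        mac = iface["mac"]
        rules.append(
            f'SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{{address}}=="{mac}", NAME="eth{index}"'
        )
        index += 1
    return rules


interfaces = [
    # Physical NIC from this report, driven by e1000.
    {"name": "eth0", "mac": "00:16:17:a5:be:0c", "parent_driver": "e1000"},
    # Hypothetical bridge on top of it, sharing the MAC and having no driver.
    {"name": "br0", "mac": "00:16:17:a5:be:0c", "parent_driver": ""},
]

for rule in generate_rules(interfaces):
    print(rule)
# SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:16:17:a5:be:0c", NAME="eth0"
```

In the same model, an eth1394 interface on 2.6.21 looks like the bridge: it has a MAC but no driver anywhere above it, so it is skipped as well.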
Just to clarify, Stefan: The rules you changed will take effect in kernels older than 2.6.21. However, as of 2.6.21, due to the DRIVERS stuff going away, they don't work. If you remove the DRIVERS key from your rules, they will probably start working. I asked Matthias not to modify udev to account for the new 2.6.21 behaviour until we have a grasp on whether the kernel sysfs change was intentional, and whether it is fixable. Hopefully you can help us out here :)
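For anyone who wants to try that workaround, a small Python sketch for dropping the DRIVERS key from a rule line follows. The helper name and the regular expression are assumptions made for the example, and it only handles a DRIVERS key that is followed by further keys, as in the rules quoted in this report. Note that a rule without DRIVERS no longer excludes secondary interfaces that share the MAC.

```python
import re

# Matches the DRIVERS key together with its leading space and trailing comma.
_DRIVERS_KEY = re.compile(r'\s*DRIVERS=="[^"]*",?')


def drop_drivers_key(rule_line):
    """Return rule_line without its DRIVERS key; comment lines are untouched."""
    if rule_line.lstrip().startswith("#"):
        return rule_line
    return _DRIVERS_KEY.sub("", rule_line, count=1)


rule = ('SUBSYSTEM=="net", DRIVERS=="?*", '
        'ATTRS{address}=="08:00:28:56:00:00:31:9b", NAME="fwti"')
print(drop_drivers_key(rule))
# SUBSYSTEM=="net", ATTRS{address}=="08:00:28:56:00:00:31:9b", NAME="fwti"
```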
As you are probably aware, Greg KH is converting more and more subsystems away from class_device. The ieee1394 subsystem is the last one which hasn't been converted yet. After Greg submitted the conversion of the networking subsystem during the 2.6.21-rc1 merge window, the bug mentioned in the git commit in comment #1 and the lkml thread in comment #2 became known. I spent a day or so trying to find a fix, and ended up with the #if 0--#endif "solution". I shall ask Greg how (if?) he intends to finish the conversion of the 1394 subsystem; I shall also try one or two other ideas for how that bug could possibly be avoided. I am no longer sure what I already tried back then, so I might run into dead ends once more.
Created attachment 119851
ieee1394: eth1394: bring back a parent device

This patch uses as parent device the device which was eth1394's grandparent device in Linux 2.6.20 and older. Is this sufficient for you? (Tested on a different computer than in my previous comments, therefore it has a different MAC.)

# udevinfo -a -p /sys/class/net/eth1

Udevinfo starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/pci0000:00/0000:00:1e.0/0000:03:03.0/net/eth1':
    KERNEL=="eth1"
    SUBSYSTEM=="net"
    DRIVER==""
    ATTR{weight}=="0"
    ATTR{tx_queue_len}=="1000"
    ATTR{flags}=="0x1002"
    ATTR{mtu}=="1500"
    ATTR{operstate}=="down"
    ATTR{broadcast}=="ff:ff:ff:ff:ff:ff:ff:ff"
    ATTR{address}=="00:17:f2:ff:fe:66:fb:80"
    ATTR{link_mode}=="0"
    ATTR{type}=="24"
    ATTR{features}=="0x20"
    ATTR{ifindex}=="6"
    ATTR{iflink}=="6"
    ATTR{addr_len}=="8"

  looking at parent device '/devices/pci0000:00/0000:00:1e.0/0000:03:03.0/net':
    KERNELS=="net"
    SUBSYSTEMS==""
    DRIVERS==""

  looking at parent device '/devices/pci0000:00/0000:00:1e.0/0000:03:03.0':
    KERNELS=="0000:03:03.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="ohci1394"
    ATTRS{msi_bus}==""
    ATTRS{broken_parity_status}=="0"
    ATTRS{enable}=="2"
    ATTRS{modalias}=="pci:v000011C1d00005811sv000011C1sd00005811bc0Csc00i10"
    ATTRS{local_cpus}=="3"
    ATTRS{irq}=="19"
    ATTRS{class}=="0x0c0010"
    ATTRS{subsystem_device}=="0x5811"
    ATTRS{subsystem_vendor}=="0x11c1"
    ATTRS{device}=="0x5811"
    ATTRS{vendor}=="0x11c1"

  looking at parent device '/devices/pci0000:00/0000:00:1e.0':
    KERNELS=="0000:00:1e.0"
    SUBSYSTEMS=="pci"
    DRIVERS==""
    ATTRS{msi_bus}=="1"
    ATTRS{broken_parity_status}=="0"
    ATTRS{enable}=="1"
    ATTRS{modalias}=="pci:v00008086d00002448sv00000000sd00000000bc06sc04i01"
    ATTRS{local_cpus}=="3"
    ATTRS{irq}=="0"
    ATTRS{class}=="0x060401"
    ATTRS{subsystem_device}=="0x0000"
    ATTRS{subsystem_vendor}=="0x0000"
    ATTRS{device}=="0x2448"
    ATTRS{vendor}=="0x8086"

  looking at parent device '/devices/pci0000:00':
    KERNELS=="pci0000:00"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{uevent}==""
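To check such output quickly for the device that would satisfy DRIVERS=="?*", one could scan it with a few lines of Python. This is a rough sketch that assumes udevinfo's usual one-key-per-line layout. The function name is made up for the example, and the sample text is an abbreviated copy of the output from this comment.

```python
import re


def first_device_with_driver(udevinfo_text):
    """Return (devpath, driver) of the first device with a non-empty driver."""
    current = None
    for raw_line in udevinfo_text.splitlines():
        line = raw_line.strip()
        header = re.match(r"looking at (?:parent )?device '([^']+)':", line)
        if header:
            current = header.group(1)
            continue
        # DRIVER== is printed for the device itself, DRIVERS== for parents.
        driver = re.match(r'DRIVERS?=="([^"]*)"', line)
        if driver and driver.group(1) and current is not None:
            return current, driver.group(1)
    return None


sample = """
  looking at device '/devices/pci0000:00/0000:00:1e.0/0000:03:03.0/net/eth1':
    KERNEL=="eth1"
    SUBSYSTEM=="net"
    DRIVER==""
  looking at parent device '/devices/pci0000:00/0000:00:1e.0/0000:03:03.0/net':
    KERNELS=="net"
    SUBSYSTEMS==""
    DRIVERS==""
  looking at parent device '/devices/pci0000:00/0000:00:1e.0/0000:03:03.0':
    KERNELS=="0000:03:03.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="ohci1394"
"""

print(first_device_with_driver(sample))
# ('/devices/pci0000:00/0000:00:1e.0/0000:03:03.0', 'ohci1394')
```

Run against the unpatched 2.6.21 output (the /devices/virtual/net/eth2 chain), the same function would return None, since every DRIVER/DRIVERS value there is empty.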
I pulled the patch over to the machine with the 3 FireWire cards and deleted the previous entries for eth1...eth3 from 70-persistent-net.rules. When I loaded eth1394 with the two fixed cards present and ohci1394 inserted, I got these lines back:

# PCI device 0x1106:0x3044 (ohci1394)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:10:dc:56:00:fe:d2:d4", NAME="eth2"

# PCI device 0x104c:0x8025 (ohci1394)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="08:00:28:56:00:00:31:9b", NAME="eth1"

When I popped in the CardBus card, I got

# PCI device 0x1033:0x00cd (ohci1394)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:d0:f5:20:08:00:61:3d", NAME="eth3"

Also, if I exchange names in 70-persistent-net.rules, the replacement names are honored whenever I load eth1394. (If I revert the patch, no new rules are added to 70-persistent-net.rules after I delete them, and existing rules are not used to rename interfaces either.) I will have one more go at it tomorrow or so, in order to attempt to link to the fw-host device again instead of the PCI device.
I received a patch from Kay Sievers today which converts ieee1394 away from class_device. This should enable us to bring back the original SET_NETDEV_DEV(dev, &host->device);. But I have yet to test the conversion patch, and it might be too dangerous for a production kernel package. Therefore I'd like to hear your opinion on whether attachment 119851, i.e. SET_NETDEV_DEV(dev, host->device.parent), is OK as a temporary measure for kernels 2.6.21 and 2.6.22.
Assuming it allows the udev rules to work again (I can't test it, no firewire hardware), it's fine by me. Would you consider submitting this for upstream inclusion or is it too ugly for that?
I think what I'm going to do is

a) with the "bring back a parent device" patch:
   - compare with udevinfo output from 2.6.20,
   - test with udev of a SuSE 10.1 machine nearby,
   - check back whether anybody noted an objection here, otherwise
   - send it upstream to 2.6.22-rc and 2.6.21.y soon.

b) in search for other workarounds: I gave up now.

c) after Kay Sievers' ieee1394 subsystem conversion: Perhaps restore 2.6.20's host device relationship, or keep the new parent device relationship from the small patch.
Re comment #16:

a.1) The old device hierarchy had this as parent of the ethX device:

  looking at parent device '/devices/pci0000:00/0000:00:1e.0/0000:03:03.0/fw-host0':
    KERNELS=="fw-host0"
    SUBSYSTEMS=="ieee1394"
    DRIVERS=="nodemgr"
    ATTRS{is_busmgr}=="0"
    ATTRS{is_irm}=="1"
    ATTRS{is_cycmst}=="1"
    ATTRS{is_root}=="1"
    ATTRS{in_bus_reset}=="0"
    ATTRS{nodes_active}=="1"
    ATTRS{selfid_count}=="2"
    ATTRS{node_count}=="2"

(The "nodemgr" driver was new in 2.6.20; the fw-host devices had no driver in 2.6.19 or older.) SUBSYSTEMS=="ieee1394" is relevant insofar as it obviously enables the write_net_rules program to mention Firewire instead of PCI in the comment above an entry in 70-persistent-net.rules. However, the comments with the new rules as seen in comment #13 should be fine too, as they mention the PCI(-to-FireWire) driver ohci1394.

a.2) SuSE 10.1's udev scripts don't care about a driver. Therefore they work the same with the old parent device link, without a parent device link, and with the new parent device link. They don't add comments to the persistent rules file, only the actual rule. The rules look the same with all 3 variants.

I.e. the patch is good enough on Gentoo and unnecessary + harmless on SuSE 10.1. The 3rd distro I have here, Mandrake 10.1, is an older one which doesn't have udev-based persistent naming rules for network interfaces.

So in short, the patch is all good.
Sounds good to me, many thanks for all the time you have put into this. I'll wait until this patch has had some review and will then add it to gentoo-sources and close this bug.
Fixed in gentoo-sources-2.6.21-r3. Thanks again!
The fix is also in vanilla 2.6.21.5 and 2.6.22-rcsomething.