Bug 177199 - [2.6.21 regression] eth1394 devices break udev/persistent-net rules
Summary: [2.6.21 regression] eth1394 devices break udev/persistent-net rules
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Daniel Drake (RETIRED)
URL:
Whiteboard: linux-2.6.21-regression
Keywords:
Depends on:
Blocks:
 
Reported: 2007-05-05 16:18 UTC by Matthias Schwarzott
Modified: 2007-06-12 15:12 UTC
CC List: 5 users

See Also:
Package list:
Runtime testing required: ---


Attachments
ieee1394: eth1394: bring back a parent device (477-ieee1394-eth1394-bring-parent-device-back.patch,1.32 KB, patch)
2007-05-20 23:17 UTC, Stefan Richter

Description Matthias Schwarzott gentoo-dev 2007-05-05 16:18:05 UTC
Starting with kernel 2.6.21, Ethernet-over-FireWire devices (driver: eth1394) no longer have a proper parent device/subsystem in sysfs.

The code change resulting in this is:
--- linux-2.6.20-gentoo/drivers/ieee1394/eth1394.c
+++ linux-2.6.21-gentoo/drivers/ieee1394/eth1394.c
@@ -585,7 +584,10 @@
         }

        SET_MODULE_OWNER(dev);
+#if 0
+       /* FIXME - Is this the correct parent device anyway? */
        SET_NETDEV_DEV(dev, &host->device);
+#endif

        priv = netdev_priv(dev);

The missing parent device breaks persistent-net: the generated rules contain DRIVERS=="?*", but the driver attribute resides in the parent device, which is no longer available.

As a result, the existing rule does not match. Instead, every boot runs the code that assigns a new number to the interface, so the ruleset grows by one line per boot.
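
The failing match can be illustrated with a minimal Python sketch. It assumes a simplified model in which DRIVERS=="?*" succeeds when the device or any of its parents has a non-empty driver name. The function name and both device chains are hypothetical, and this is not udev's actual implementation.

```python
from fnmatch import fnmatchcase

def drivers_key_matches(driver_chain, pattern="?*"):
    """Return True if any driver name in the chain matches the glob pattern.

    driver_chain lists the driver name of the device itself followed by the
    driver names of its parents, using "" where no driver is bound.
    (Simplified model, not udev's real matching code.)
    """
    return any(fnmatchcase(name, pattern) for name in driver_chain)

# Hypothetical chain with a driver-bound ancestor (layout before 2.6.21).
with_parent = ["", "some_driver"]
# Hypothetical chain after the parent link was removed: no driver anywhere.
without_parent = ["", "", ""]

print(drivers_key_matches(with_parent))     # True
print(drivers_key_matches(without_parent))  # False
```

The glob "?*" requires at least one character, so an empty driver name never satisfies it.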
Comment 1 Mike Pagano gentoo-dev 2007-05-06 01:46:16 UTC
The commitdiff for this change can be found here:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=7a9eeb2fa1b3a3a83670b9ba08dd396beedb88f8
Comment 2 Stefan Richter 2007-05-13 11:55:16 UTC
I reported your bug to LKML as a reply in the thread of the bug report which led to the patch.  http://lkml.org/lkml/2007/5/13/53

Note, I don't see this bug here myself:
$ ls -l /etc/udev/rules.d/70-persistent-net.rules
-rw-r--r-- 1 root root 352 Mar  4 13:17 /etc/udev/rules.d/70-persistent-net.rules
$ cat /etc/udev/rules.d/70-persistent-net.rules

# Firewire device 080028560000319b)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="08:00:28:56:00:00:31:9b", NAME="eth1"

# Firewire device )
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:10:dc:56:00:fe:d2:d4", NAME="eth2"

# PCI device 0x8086:0x109a (e1000)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:16:17:a5:be:0c", NAME="eth0"


(I have two FireWire controllers installed.)
However, I don't reboot often, and I don't load eth1394 anymore during startup.  I still load eth1394 frequently for testing though.  Also, I'm using kernel.org kernels patched with recent IEEE 1394 development patches, but that shouldn't matter.
Comment 3 Daniel Drake (RETIRED) gentoo-dev 2007-05-20 04:48:58 UTC
Stefan, thanks for jumping in. I'd be interested to see your output of "udevinfo -a -p /sys/class/net/eth2" where eth2 is an eth1394 device.
Comment 4 Stefan Richter 2007-05-20 08:46:16 UTC
# udevinfo -a -p /sys/class/net/eth2

Udevinfo starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/virtual/net/eth2':
    KERNEL=="eth2"
    SUBSYSTEM=="net"
    DRIVER==""
    ATTR{weight}=="0"
    ATTR{tx_queue_len}=="1000"
    ATTR{flags}=="0x1003"
    ATTR{mtu}=="1500"
    ATTR{operstate}=="unknown"
    ATTR{dormant}=="0"
    ATTR{carrier}=="1"
    ATTR{broadcast}=="ff:ff:ff:ff:ff:ff:ff:ff"
    ATTR{address}=="00:10:dc:56:00:fe:d2:d4"
    ATTR{link_mode}=="0"
    ATTR{type}=="24"
    ATTR{features}=="0x20"
    ATTR{ifindex}=="4"
    ATTR{iflink}=="4"
    ATTR{addr_len}=="8"

  looking at parent device '/devices/virtual/net':
    KERNELS=="net"
    SUBSYSTEMS==""
    DRIVERS==""

  looking at parent device '/devices/virtual':
    KERNELS=="virtual"
    SUBSYSTEMS==""
    DRIVERS==""


By contrast, udevinfo shows e1000 as DRIVER of the grandparent device of my eth0 device.  However, /etc/udev/rules.d/70-persistent-net.rules looks like this:

# Firewire device 080028560000319b)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="08:00:28:56:00:00:31:9b", NAME="eth1"

# Firewire device )
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:10:dc:56:00:fe:d2:d4", NAME="eth2"

# PCI device 0x8086:0x109a (e1000)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:16:17:a5:be:0c", NAME="eth0"

The timestamp of my 70-persistent-net.rules is March 4 2007.  I wrote the patch which moves eth1394's devices beneath the virtual device on March 20.

---------------------------------------------

Side note, if this is of any relevance:
I get the following in /var/log/messages when eth1394 creates its interfaces.

May 20 10:33:46 stein ieee1394: Host added: ID:BUS[0-00:1023]  GUID[080028560000319b]
May 20 10:33:46 stein ieee1394: Host added: ID:BUS[1-00:1023]  GUID[0010dc5600fed2d4]
May 20 10:33:48 stein eth1394: eth1: IPv4 over IEEE 1394 (fw-host0)
May 20 10:33:48 stein eth1394: eth2: IPv4 over IEEE 1394 (fw-host1)
May 20 10:33:48 stein rc-scripts: We only hotplug for ethernet interfaces
May 20 10:33:48 stein rc-scripts: We only hotplug for ethernet interfaces

I bring the interfaces up when I need them with
# /etc/init.d/net.eth1 start
# /etc/init.d/net.eth2 start
I have only their IP address and netmask configured in /etc/conf.d/net.
Comment 5 Stefan Richter 2007-05-20 08:53:33 UTC
How are entries added to 70-persistent-net.rules anyway?

I just created /etc/init.d/net.eth3 and plugged in a 3rd FireWire card (I have a PCI to CardBus adapter installed).  70-persistent-net.rules remains unchanged.
Comment 6 Daniel Drake (RETIRED) gentoo-dev 2007-05-20 14:18:39 UTC
They are generated by /etc/udev/rules.d/75-persistent-net-generator.rules
Comment 7 Daniel Drake (RETIRED) gentoo-dev 2007-05-20 14:23:18 UTC
That file checks for DRIVERS=="?*" before generating any rules so it's not clear why the sysfs change would be "enlarging the ruleset with one line at every boot." unless this is a new addition.

However, the bug is still there in that previously generated rules matched DRIVERS=="?*", but such rules won't work for 2.6.21 kernels. E.g. try swapping your eth1 and eth2 entries in 70-persistent-net.rules; I don't think the devices would follow the new order (and if they do, it's probably coincidence).
Comment 8 Stefan Richter 2007-05-20 14:58:27 UTC
What does it take to rename network interfaces anyway?


I tested with the following changes.
In 70-persistent-net.rules:

# Firewire device 080028560000319b)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="08:00:28:56:00:00:31:9b", NAME="fwti"

# Firewire device )
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:10:dc:56:00:fe:d2:d4", NAME="fwvia"

# PCI device 0x8086:0x109a (e1000)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:16:17:a5:be:0c", NAME="eth0"


In /etc/init.d:

net.eth0 -> net.lo
net.eth1 -> net.lo
net.eth2 -> net.lo
net.eth3 -> net.lo
net.fwti -> net.lo
net.lo


In /etc/conf.d/net:
...
config_fwti=( "10.10.222.42 netmask 255.255.255.0" )
...


# modprobe eth1394
May 20 16:47:11 stein eth1394: eth1: IPv4 over IEEE 1394 (fw-host0)
May 20 16:47:11 stein eth1394: eth2: IPv4 over IEEE 1394 (fw-host1)
May 20 16:47:11 stein rc-scripts: We only hotplug for ethernet interfaces
May 20 16:47:11 stein rc-scripts: We only hotplug for ethernet interfaces


# /etc/init.d/net.fwti start
 * Caching service dependencies ...                                [ ok ]
 * Service net.fwti starting
 network interface fwti does not exist
 Please verify hardware or kernel module (driver)                  [ !! ]
 * ERROR:  net.fwti failed to start

May 20 16:49:42 stein rc-scripts: network interface fwti does not exist
May 20 16:49:42 stein rc-scripts: Please verify hardware or kernel module (driver)
May 20 16:49:42 stein rc-scripts: ERROR:  net.fwti failed to start


Can't I specify network interfaces by their MAC?  The MAC is the only persistent and unique device property, hence the only property that can safely be used to persistently refer to network interfaces.
Comment 9 Matthias Schwarzott gentoo-dev 2007-05-20 15:03:13 UTC
(In reply to comment #8)
> 
> Can't I specify network interfaces by their MAC?  The MAC is the only
> persistent and unique device property, hence the only property that can safely
> be used to persistently refer to network interfaces.
> 
This is what this bug is about:
udev uses the MAC to persist net devices. But to exclude secondary interfaces (bridges, bondings, VLAN devices) that use the same MAC as the master device they are attached to, there is the additional check for DRIVERS=="?*".
This check is broken by the latest kernel changes!
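
A minimal Python sketch of why the MAC alone is not enough, assuming hypothetical interface records (the names, MAC, and driver values below are made up for illustration): a VLAN interface shares its master's MAC, so only the additional driver check tells the two apart.

```python
from fnmatch import fnmatchcase

# Hypothetical interfaces: a physical NIC and a VLAN stacked on it.
# Both report the same MAC; only the physical one has a driver-bound parent.
interfaces = [
    {"name": "eth0",   "mac": "00:11:22:33:44:55", "parent_driver": "e1000"},
    {"name": "eth0.5", "mac": "00:11:22:33:44:55", "parent_driver": ""},
]

def match_by_mac(ifaces, mac):
    """Names of all interfaces carrying the given MAC."""
    return [i["name"] for i in ifaces if i["mac"] == mac]

def match_by_mac_and_driver(ifaces, mac, pattern="?*"):
    """Same as above, but also require a non-empty driver (like DRIVERS=="?*")."""
    return [i["name"] for i in ifaces
            if i["mac"] == mac and fnmatchcase(i["parent_driver"], pattern)]

print(match_by_mac(interfaces, "00:11:22:33:44:55"))             # ['eth0', 'eth0.5']
print(match_by_mac_and_driver(interfaces, "00:11:22:33:44:55"))  # ['eth0']
```

An eth1394 interface without a driver-bound parent ends up in the same bucket as eth0.5 here, which is why the check no longer lets it through.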
Comment 10 Daniel Drake (RETIRED) gentoo-dev 2007-05-20 17:24:13 UTC
Just to clarify, Stefan:

The rules you changed will take effect in kernels older than 2.6.21. However, as of 2.6.21, due to the DRIVERS stuff going away, they don't work.

If you remove the DRIVERS key from your rules, they will probably start working.

I asked Matthias not to modify udev to account for the new 2.6.21 behaviour until we have a grasp on whether the kernel sysfs change was intentional, and whether it is fixable. Hopefully you can help us out here :)
Comment 11 Stefan Richter 2007-05-20 18:34:39 UTC
As you are probably aware, Greg KH is converting more and more subsystems away from class_device.  The ieee1394 subsystem is the last one which hasn't been converted yet.  After Greg submitted the conversion of the networking subsystem during the 2.6.21-rc1 merge window, the bug mentioned in the git commit in comment #1 and the lkml thread in comment #2 became known.  I spent a day or so trying to find a fix, and ended up with the #if 0--#endif "solution".

I shall ask Greg how (if?) he intends to finish the conversion of the 1394 subsystem; also I shall try one or two other ideas for how that bug could possibly be avoided.  I am not sure anymore what I already tried back then, so I might run into dead ends once more.
Comment 12 Stefan Richter 2007-05-20 23:17:15 UTC
Created attachment 119851
ieee1394: eth1394: bring back a parent device

This patch sets as parent the device which was the eth1394 interface's grandparent device in Linux 2.6.20 and older.  Is this sufficient for you?  (Tested on a different computer than in my previous comments, therefore it has a different MAC.)

# udevinfo -a -p /sys/class/net/eth1

Udevinfo starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/pci0000:00/0000:00:1e.0/0000:03:03.0/net/eth1':
    KERNEL=="eth1"
    SUBSYSTEM=="net"
    DRIVER==""
    ATTR{weight}=="0"
    ATTR{tx_queue_len}=="1000"
    ATTR{flags}=="0x1002"
    ATTR{mtu}=="1500"
    ATTR{operstate}=="down"
    ATTR{broadcast}=="ff:ff:ff:ff:ff:ff:ff:ff"
    ATTR{address}=="00:17:f2:ff:fe:66:fb:80"
    ATTR{link_mode}=="0"
    ATTR{type}=="24"
    ATTR{features}=="0x20"
    ATTR{ifindex}=="6"
    ATTR{iflink}=="6"
    ATTR{addr_len}=="8"

  looking at parent device '/devices/pci0000:00/0000:00:1e.0/0000:03:03.0/net':
    KERNELS=="net"
    SUBSYSTEMS==""
    DRIVERS==""

  looking at parent device '/devices/pci0000:00/0000:00:1e.0/0000:03:03.0':
    KERNELS=="0000:03:03.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="ohci1394"
    ATTRS{msi_bus}==""
    ATTRS{broken_parity_status}=="0"
    ATTRS{enable}=="2"
    ATTRS{modalias}=="pci:v000011C1d00005811sv000011C1sd00005811bc0Csc00i10"
    ATTRS{local_cpus}=="3"
    ATTRS{irq}=="19"
    ATTRS{class}=="0x0c0010"
    ATTRS{subsystem_device}=="0x5811"
    ATTRS{subsystem_vendor}=="0x11c1"
    ATTRS{device}=="0x5811"
    ATTRS{vendor}=="0x11c1"

  looking at parent device '/devices/pci0000:00/0000:00:1e.0':
    KERNELS=="0000:00:1e.0"
    SUBSYSTEMS=="pci"
    DRIVERS==""
    ATTRS{msi_bus}=="1"
    ATTRS{broken_parity_status}=="0"
    ATTRS{enable}=="1"
    ATTRS{modalias}=="pci:v00008086d00002448sv00000000sd00000000bc06sc04i01"
    ATTRS{local_cpus}=="3"
    ATTRS{irq}=="0"
    ATTRS{class}=="0x060401"
    ATTRS{subsystem_device}=="0x0000"
    ATTRS{subsystem_vendor}=="0x0000"
    ATTRS{device}=="0x2448"
    ATTRS{vendor}=="0x8086"

  looking at parent device '/devices/pci0000:00':
    KERNELS=="pci0000:00"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{uevent}==""
Comment 13 Stefan Richter 2007-05-20 23:58:10 UTC
I pulled the patch over to the machine with the 3 FireWire cards and deleted the previous entries for eth1...eth3 from 70-persistent-net.rules.  When I loaded eth1394 with the two fixed cards present and ohci1394 inserted, I got the lines

# PCI device 0x1106:0x3044 (ohci1394)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:10:dc:56:00:fe:d2:d4", NAME="eth2"

# PCI device 0x104c:0x8025 (ohci1394)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="08:00:28:56:00:00:31:9b", NAME="eth1"

back. When I popped in the CardBus card, I got

# PCI device 0x1033:0x00cd (ohci1394)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:d0:f5:20:08:00:61:3d", NAME="eth3"

Also, if I exchange names in 70-persistent-net.rules, the replacement names are honored whenever I load eth1394.

(If I revert the patch, there are neither new rules added to 70-persistent-net.rules if I deleted them beforehand, nor are existing rules used to rename interfaces.)

I will have one more go at it circa tomorrow, in order to attempt to link to the fw-host device again instead of the PCI device.
Comment 14 Stefan Richter 2007-05-25 10:32:18 UTC
I received a patch from Kay Sievers today which converts ieee1394 away from class_device.  This should enable us to bring back the original SET_NETDEV_DEV(dev, &host->device);.  But I have yet to test the conversion patch, and it might be too dangerous for a production kernel package.  Therefore I'd like to hear your opinion on whether attachment 119851, i.e. SET_NETDEV_DEV(dev, host->device.parent), is OK as a temporary measure for kernels 2.6.21 and 2.6.22.
Comment 15 Daniel Drake (RETIRED) gentoo-dev 2007-05-25 16:03:43 UTC
Assuming it allows the udev rules to work again (I can't test it, no firewire hardware), it's fine by me. Would you consider submitting this for upstream inclusion or is it too ugly for that?
Comment 16 Stefan Richter 2007-05-25 18:29:23 UTC
I think what I'm going to do is
a) with the "bring back a parent device" patch:
  - compare with udevinfo output from 2.6.20,
  - test with udev of a SuSE 10.1 machine nearby,
  - check back whether anybody noted an objection here, otherwise
  - send it upstream to 2.6.22-rc and 2.6.21.y soon.
b) in search for other workarounds: I gave up now.
c) after Kay Sievers' ieee1394 subsystem conversion: Perhaps restore 2.6.20's host device relationship, or keep the new parent device relationship from the small patch.
Comment 17 Stefan Richter 2007-05-26 10:01:36 UTC
Re comment #16:

a.1) The old device hierarchy had this as parent of the ethX device:

  looking at parent device '/devices/pci0000:00/0000:00:1e.0/0000:03:03.0/fw-host0':
    KERNELS=="fw-host0"
    SUBSYSTEMS=="ieee1394"
    DRIVERS=="nodemgr"
    ATTRS{is_busmgr}=="0"
    ATTRS{is_irm}=="1"
    ATTRS{is_cycmst}=="1"
    ATTRS{is_root}=="1"
    ATTRS{in_bus_reset}=="0"
    ATTRS{nodes_active}=="1"
    ATTRS{selfid_count}=="2"
    ATTRS{node_count}=="2"

(The "nodemgr" driver was new in 2.6.20; the fw-host devices had no driver in 2.6.19 or older.)

SUBSYSTEMS=="ieee1394" is relevant insofar as it evidently enables the write_net_rules program to mention Firewire instead of PCI in the comment above an entry in 70-persistent-net.rules.  However, the comments on the new rules seen in comment #13 should be fine too, as they mention the PCI(-to-FireWire) driver ohci1394.

a.2) SuSE 10.1's udev scripts don't care about a driver.  Therefore they work the same with the old parent device link, without a parent device link, and with the new parent device link.  They don't add comments to the persistent rules file, only the actual rule.  The rules look the same with all 3 variants.

I.e. the patch is good enough on Gentoo and unnecessary + harmless on SuSE 10.1.  The 3rd distro I have here, Mandrake 10.1, is an older one which doesn't have udev-based persistent naming rules for network interfaces.

So in short, the patch is all good.
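
The comparison above can be summarized in a small Python sketch. It assumes a simplified model in which a Gentoo-style rule requires a non-empty driver somewhere in the parent chain while a SuSE 10.1-style rule does not look at drivers at all. The variant labels and function names are hypothetical; the driver names are taken from the udevinfo output in this report.

```python
# Driver name visible on the parent chain for each of the three variants.
variants = {
    "old parent link (fw-host)": "nodemgr",
    "no parent link": "",
    "new parent link (PCI device)": "ohci1394",
}

def gentoo_style_rule_matches(parent_driver):
    """Model of a rule containing DRIVERS=="?*": needs a non-empty driver."""
    return len(parent_driver) > 0

def suse_style_rule_matches(parent_driver):
    """Model of a rule that ignores drivers: the driver is irrelevant."""
    return True

for label, driver in variants.items():
    print(f"{label}: gentoo={gentoo_style_rule_matches(driver)}, "
          f"suse={suse_style_rule_matches(driver)}")
```

In this model only the middle variant fails for the Gentoo-style rule, which corresponds to the unpatched 2.6.21 behaviour.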
Comment 18 Daniel Drake (RETIRED) gentoo-dev 2007-05-26 16:58:49 UTC
Sounds good to me, many thanks for all the time you have put into this. I'll wait until this patch has had some review and will then add it to gentoo-sources and close this bug.
Comment 19 Daniel Drake (RETIRED) gentoo-dev 2007-06-12 13:23:07 UTC
Fixed in gentoo-sources-2.6.21-r3. Thanks again!
Comment 20 Stefan Richter 2007-06-12 15:12:10 UTC
The fix is also in vanilla 2.6.21.5 and 2.6.22-rcsomething.