Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 410127 - Infinite loop reading /proc/net/dev in openrc iproute2.sh
Summary: Infinite loop reading /proc/net/dev in openrc iproute2.sh
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Hosted Projects
Classification: Unclassified
Component: OpenRC (show other bugs)
Hardware: All Linux
: Normal critical (vote)
Assignee: OpenRC Team
URL:
Whiteboard: openrc:oldnet
Keywords:
Depends on:
Blocks: openrc-0.10
  Show dependency tree
 
Reported: 2012-03-29 11:11 UTC by John Keeping
Modified: 2013-04-25 23:08 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description John Keeping 2012-03-29 11:11:58 UTC
When starting net.eth0 the initscript entered an infinite loop, which I have traced to the while read ... done </proc/net/dev block in the _ifindex() function in iproute2.sh (also in ifconfig.sh but my system doesn't use that).

This appears to be the same problem described at:

http://lists.debian.org/debian-kernel/2012/02/msg01089.html

I have managed to work around it by changing the loop in _ifindex() from:

while read line; do
    ...
done </proc/net/dev

to:

cat /proc/net/dev |
while read line; do
    ...
done


I've managed to restart the interface before with this kernel (vanilla 3.3.0 with a patch for Apple backlight support) and haven't rebooted since that worked, so I think some set of conditions needs to be met before this bug is triggered, but once I got into the state where reading /proc/net/dev failed it kept failing until I edited the script as described above.

Reproducible: Sometimes
Comment 1 William Hubbs gentoo-dev 2012-03-29 19:00:31 UTC
@robbat2:
Can we rewrite _ifindex() to not use /proc/net/dev, at least for linux
systems? I think if we can get away from /proc/net/dev that will be a
good solution for this.
Comment 2 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-03-30 19:13:29 UTC
WilliamH:
can we rely on awk instead?
there is basically no other way to get the indices of an interface that will not change the ordering.
The interfaces in that file are ordered by when they were added to a system, and that order is important to us.

The entire function could be replaced by this awk function:
awk "
BEGIN{n=-2}
/^$IFACE:/{n=NR-2; }
END{if(n<0){n=NR-1}; print n}"
/proc/net/dev


if you want debugging, add this between the /^$IFACE:/ line and the END line:
{ printf(\"nr-2=%d n=%d s=%s\n\",NR-2,n,\$1);}
Comment 3 William Hubbs gentoo-dev 2012-03-30 22:29:33 UTC
(In reply to comment #2)
> WilliamH:
> can we rely on awk instead?
> there is basically no other way to get the indices of an interface that will
> not change the ordering.
> The interfaces in that file are ordered by when they were added to a system,
> and that order is important to us.

@robbat2:
I would rather not use awk, because that adds another dependency to openrc, and  I'm not sure it resolves the issue.

Looking at the related debian bug report, it looks like the issue is a kernel issue with reading from /proc/net/dev, regardless of how we do it. Since the issue is sporatic, I'm not sure that the fix originally proposed, or using awk would actually fix this.
Comment 4 John Keeping 2012-03-30 23:08:51 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > can we rely on awk instead?
> > there is basically no other way to get the indices of an interface that will
> > not change the ordering.
> > The interfaces in that file are ordered by when they were added to a system,
> > and that order is important to us.

Is this definitely the case?  It seems that the following commit means that the iteration order can now change since the file is now generated by iterating over the buckets in the hash table net->dev_name_head (see list_netdevice() in net/core/dev.c).

commit f04565ddf52e401880f8ba51de0dff8ba51c99fd
Author: Mihai Maruseac <mihai.maruseac@gmail.com>
Date:   Thu Oct 20 20:45:10 2011 +0000

    dev: use name hash for dev_seq_ops

> @robbat2:
> I would rather not use awk, because that adds another dependency to openrc,
> and  I'm not sure it resolves the issue.
> 
> Looking at the related debian bug report, it looks like the issue is a
> kernel issue with reading from /proc/net/dev, regardless of how we do it.
> Since the issue is sporatic, I'm not sure that the fix originally proposed,
> or using awk would actually fix this.

I'm not sure the issue is sporadic - I've only had an infinite loop once, but I consistently (across 4 machines) get corrupted output running:

while read -r line ; do echo "$line" ; done </proc/net/dev

whereas `cat /proc/net/dev` always seems reasonable.

After digging into this a bit more the problem seems to be that the read builtin reads a chunk from the file and then seeks backwards to the first newline so that the next call reads the next line and it is this seeking that is broken - so yes, it does seem to be a kernel issue, but it's much simpler to make OpenRC work around the bug.

cat seems to use a 4096 byte chunk which is bigger than the total size of /proc/net/dev on all of my systems, so I'm not sure if this still works once you have a lot of interfaces - I suspect that it will be fine since cat will not call lseek at all and the bug appears to be in dev_seq_next() in net/core/dev.c, which does not use the pos parameter in determining what to return - so it will work fine if called to output the devices in order but not if the caller seeks at all.
Comment 5 William Hubbs gentoo-dev 2012-03-31 01:00:48 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > can we rely on awk instead?
> > > there is basically no other way to get the indices of an interface that will
> > > not change the ordering.
> > > The interfaces in that file are ordered by when they were added to a system,
> > > and that order is important to us.
> 
> Is this definitely the case?  It seems that the following commit means that
> the iteration order can now change since the file is now generated by
> iterating over the buckets in the hash table net->dev_name_head (see
> list_netdevice() in net/core/dev.c).
> 
> commit f04565ddf52e401880f8ba51de0dff8ba51c99fd
> Author: Mihai Maruseac <mihai.maruseac@gmail.com>
> Date:   Thu Oct 20 20:45:10 2011 +0000
> 
>     dev: use name hash for dev_seq_ops

If this order assumption is not valid, there is a way we can get the device names from /sys/class/net. I have a patch that does this.

> I'm not sure the issue is sporadic - I've only had an infinite loop once,
> but I consistently (across 4 machines) get corrupted output running:
> 
> while read -r line ; do echo "$line" ; done </proc/net/dev
> 
> whereas `cat /proc/net/dev` always seems reasonable.

What about using the read command to parse out the device name, e.g.:

while read dev rest; do echo $dev; done < /proc/net/dev

Does that also give corrupted output?
Comment 6 John Keeping 2012-03-31 10:02:42 UTC
(In reply to comment #5)
> If this order assumption is not valid, there is a way we can get the device
> names from /sys/class/net. I have a patch that does this.

Interestingly (assuming you're talking about reading /sys/class/net/$dev/ifindex), I get the following ordering on this system - which would seem to imply the ordering assumption is indeed wrong:

$ cat /proc/net/dev | cut -c-70    # cut to avoid confusing line wrapping
Inter-|   Receive                                                |  Tr
 face |bytes    packets errs drop fifo frame compressed multicast|byte
    lo: 8836302   12384    0    0    0     0          0         0  883
virbr0:       0       0    0    0    0     0          0         0     
  sit0:       0       0    0    0    0     0          0         0     
 wlan0: 15583373697 11384063    0    0    0     0          0         0
  eth0:       0       0    0    0    0     0          0         0     

$ for i in /sys/class/net/*; do echo $(cat $i/ifindex) ${i##*/}; done | sort
1 lo
2 eth0
3 sit0
4 wlan0
5 virbr0


> (In reply to comment #4)
> > I'm not sure the issue is sporadic - I've only had an infinite loop once,
> > but I consistently (across 4 machines) get corrupted output running:
> > 
> > while read -r line ; do echo "$line" ; done </proc/net/dev
> > 
> > whereas `cat /proc/net/dev` always seems reasonable.
> 
> What about using the read command to parse out the device name, e.g.:
> 
> while read dev rest; do echo $dev; done < /proc/net/dev
> 
> Does that also give corrupted output?

Yes (with gentoo-sources-3.2.6 on the same machine as above):

$ while read dev rest; do echo $dev; done </proc/net/dev
Inter-|
face
virbr0:
0
lo:
eth0:

I think any solution using read like that will expose the bug because it will try to seek within the file.
Comment 7 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-03-31 19:17:57 UTC
Look at how the ordering in ifindex vs /proc/net/dev changes over the change of interfaces here:

# for i in /sys/class/net/*; do echo $(cat $i/ifindex) ${i##*/}; done | sort -n
cat: /sys/class/net/bonding_masters/ifindex: Not a directory
bonding_masters
1 lo
2 eth0
3 eth1
4 eth2
5 eth3
6 bond0
15 bond1
16 dummy0

# cat /proc/net/dev |cut -c-40
Inter-|   Receive                       
 face |bytes    packets errs drop fifo f
  eth3: 4826672837 16589455    4  100   
    lo: 2923881716 5759650    0    0    
  eth2: 4941753388 17055969    0  116   
  eth1: 10177055798 12884546    0 35777 
dummy0:       0       0    0    0    0  
  eth0:       0       0    0    0    0  
 bond1:       0       0    0    0    0  
 bond0: 9768426225 33645424    4  216   

# rmmod dummy

# for i in /sys/class/net/*; do echo $(cat $i/ifindex) ${i##*/}; done | sort -n
cat: /sys/class/net/bonding_masters/ifindex: Not a directory
bonding_masters
1 lo
2 eth0
3 eth1
4 eth2
5 eth3
6 bond0
15 bond1

# cat /proc/net/dev |cut -c-40
Inter-|   Receive                       
 face |bytes    packets errs drop fifo f
  eth3: 4826709377 16589687    4  100   
    lo: 2923971677 5759851    0    0    
  eth2: 4941785392 17056227    0  116   
  eth1: 10177239712 12884987    0 35778 
  eth0:       0       0    0    0    0  
 bond1:       0       0    0    0    0  
 bond0: 9768494769 33645914    4  216   

# modprobe dummy

# for i in /sys/class/net/*; do echo $(cat $i/ifindex) ${i##*/}; done | sort -n
cat: /sys/class/net/bonding_masters/ifindex: Not a directory
bonding_masters
1 lo
2 eth0
3 eth1
4 eth2
5 eth3
6 bond0
15 bond1
89 dummy0

# cat /proc/net/dev |cut -c-40
Inter-|   Receive                       
 face |bytes    packets errs drop fifo f
  eth3: 4826732972 16589806    4  100   
    lo: 2924075819 5760049    0    0    
  eth2: 4941809618 17056347    0  116   
  eth1: 10177583293 12885629    0 35780 
dummy0:       0       0    0    0    0  
  eth0:       0       0    0    0    0  
 bond1:       0       0    0    0    0  
 bond0: 9768542590 33646153    4  216
Comment 8 William Hubbs gentoo-dev 2012-03-31 22:49:29 UTC
(In reply to comment #6)
> I think any solution using read like that will expose the bug because it
> will try to seek within the file.

I think you mean any solution using read like that with /proc/net/dev right?

@robbat2:
Can we use the number in /sys/class/net/${IFACE}/ifindex as the default metric? It seems like that number is constant for each interface looking at your information in comment #7. What do you think? Do you know where that value comes from?
Comment 9 John Keeping 2012-04-01 11:26:22 UTC
(In reply to comment #8)
> (In reply to comment #6)
> > I think any solution using read like that will expose the bug because it
> > will try to seek within the file.
> 
> I think you mean any solution using read like that with /proc/net/dev right?

Yes, anything of the form:

while read ... ; done </proc/net/dev

exposes the bug.  AFAICT the issue is in the implementation of /proc/net/dev and affects any attempt to seek within that file.
Comment 10 William Hubbs gentoo-dev 2012-04-01 14:53:02 UTC
All,

here is my suggestion for a new _ifindex() function for Linux. It
returns the value the kernel assigns to /sys/class/net/${IFACE}/ifindex
instead of trying to calculate a value.

Here are my questions about this:

- If we ever hit the else path, what is a reasonable response? I have
  just assigned 9999 as a metric in that situation, because I do not
  know what else to do with it.

- Does a return code matter since we do not check for it in the network
  scripts anyway?

_ifindex()
{
	local i=0 rc=0
	if [ -e /sys/class/net/"${IFACE}"/ifindex ]; then
		i=$(cat /sys/class/net/"${IFACE}"/ifindex)
	else
		rc=1
		i=9999
	fi
	echo "${i}"
	return $rc
}
Comment 11 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-04-01 16:47:11 UTC
I'm not sure where the kernel gets the ordering in /proc/net/dev from. It's certainly not coming from the ifindex value as that changed in my testcase, while the position of dummy0 stayed the same.

(In reply to comment #10)
> - If we ever hit the else path, what is a reasonable response? I have
>   just assigned 9999 as a metric in that situation, because I do not
>   know what else to do with it.
The else path is important. It's a guess of the lowest possible value for the next ifindex value assigned. It needs to be unique as well, so that you don't get conflicting metric values in routes.

> - Does a return code matter since we do not check for it in the network
>   scripts anyway?
Nope.

> 
> _ifindex()
> {
> 	local i=0 rc=0
> 	if [ -e /sys/class/net/"${IFACE}"/ifindex ]; then
> 		i=$(cat /sys/class/net/"${IFACE}"/ifindex)
> 	else
> 		rc=1
> 		i=9999
> 	fi
> 	echo "${i}"
> 	return $rc
> }

I suggest this for the else path:
index=-1
for f in /sys/class/net/*/ifindex ; do
	v=$(cat $f)
	[ $v -gt $index ] && index=$v
done 
echo $(($index+1))
Comment 12 John Keeping 2012-04-01 18:02:59 UTC
(In reply to comment #11)
> I'm not sure where the kernel gets the ordering in /proc/net/dev from. It's
> certainly not coming from the ifindex value as that changed in my testcase,
> while the position of dummy0 stayed the same.

Since the commit referenced in comment #4 (which was included in kernel 3.2) /proc/net/dev is generated by iterating over a hash table, hashed on the name of the interface.
Comment 13 William Hubbs gentoo-dev 2012-04-01 18:05:40 UTC
All,

here is my updated replacement for _ifindex on linux which encorporates
the changes from robbat2.

_ifindex()
{
	local index=-1
	local f v
	if [ -e /sys/class/net/"${IFACE}"/ifindex ]; then
		index=$(cat /sys/class/net/"${IFACE}"/ifindex)
	else
		for f in /sys/class/net/*/ifindex ; do
			v=$(cat $f)
			[ $v -gt $index ] && index=$v
		done 
		: $(( index += 1 ))
	fi
	echo "${index}"
	return 0
}

@robbat2:
Are you ok with this version? If so I'll replace the function for linux
systems and commit.
Comment 14 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-04-01 18:17:01 UTC
williamh: yes, but definitely as 0.10 version
Comment 15 William Hubbs gentoo-dev 2012-04-02 14:11:54 UTC
This is fixed by commit 9127684.
Thanks for the report.