Bug 158867 - net-misc/dhcp - dhcp client not setting default route to seemingly unreachable gateway
Bug#: 158867
Product: Gentoo Linux
Version: 2006.1
Platform: All
OS/Version: Other
Status: RESOLVED
Severity: major
Priority: P2
Resolution: FIXED
Assigned To: base-system@gentoo.org
Reported By: michael@weiser.dinsnail.net
Component: Applications
URL:
Summary: net-misc/dhcp - dhcp client not setting default route to seemingly unreachable gateway
Keywords: Bug
Status Whiteboard:
Opened: 2006-12-22 15:11 0000
I've just set up a rented server with German provider 1&1. They've got quite a
unique network setup that has routes like this:
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.255.255.1    0.0.0.0         255.255.255.255 UH        0 0          0 eth0
0.0.0.0         10.255.255.1    0.0.0.0         UG        0 0          0 eth0
The actual IP of the machine is in a completely different, official subnet
(let's say it's 1.2.3.4). The subnet mask used is 255.255.255.255 to put the
server into its own little subnet world with just its gateway.
The servers acquire their routing information via DHCP. This includes the above
extra static route to the gateway so that it can be reached with the default
route. With dhcpcd this looks like this:
IPADDR='1.2.3.4'
NETMASK='255.255.255.255'
BROADCAST='1.2.3.4'
ROUTES='0.0.0.0,0.0.0.0,10.255.255.1 10.255.255.1,255.255.255.255,10.255.255.1'
1&1's Debian-based, RAM-disk booting recovery system uses dhclient which will
get an option like "static-routes" in its DHCP information as well.
Now I've tried to get Gentoo to work with it and neither dhcpcd nor dhclient
can set the default route; both fail with "network unreachable". This makes
sense because the gateway "10.255.255.1" cannot be reached via interface eth0
with IP "1.2.3.4" and subnet mask "255.255.255.255".
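The reasoning above can be checked with a few lines of Python. This is only an illustrative sketch, not code from dhcpcd or dhclient; the helper name gateway_is_on_link is made up for this example, and the second call uses an arbitrary conventional LAN setup for contrast.

```python
import ipaddress


def gateway_is_on_link(address: str, netmask: str, gateway: str) -> bool:
    """Return True if the gateway lies inside the interface's own subnet."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    return ipaddress.ip_address(gateway) in iface.network


# The setup from this report: a /32 address whose gateway lives elsewhere.
print(gateway_is_on_link("1.2.3.4", "255.255.255.255", "10.255.255.1"))

# A conventional setup for comparison (addresses chosen for illustration).
print(gateway_is_on_link("192.168.0.10", "255.255.255.0", "192.168.0.1"))
```

With a /32 netmask the interface's subnet contains only its own address, so no gateway can be on-link unless a separate host route to it exists first.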
dhcpcd gets the ROUTES setting from the DHCP server and puts it into its info
file but doesn't do anything with it. dhclient for some reason never gets the
"static-routes" option in its settings. Maybe the Debian dhclient used in the
recovery system of 1&1 is patched somehow.
My old Fedora Core System was working fine in a similar setup though and it was
using dhclient, so I had a look at it. I found that the /sbin/dhclient-script
has been heavily extended on Fedora. Amongst lots of other things it can cope
with DHCP servers that hand out gateway addresses that are not reachable with
just the IP/netmask information but need an extra route.
In such a case it checks if the host is reachable via the given interface
(using arping) and then just sets a host route to the gateway. I took some
snippets from that script and expanded my /sbin/dhclient-script. I will shortly
upload a patch showing what I changed.
Is this a known problem? Is there a nicer solution than mine?
Thanks in advance,
Michael
Which ebuild is this "patch" for? Really can't assign bugs about stuff I don't
know where it belongs.
(In reply to comment #2)
> Which ebuild is this "patch" for? Really can't assign bugs about stuff I don't
> know where it belongs.
It's against the /sbin/dhclient-script installed by =net-misc/dhcp-3.0.5.
Created an attachment (id=104622) [details]
Change order in dhcpcd
That Red Hat patch is the devil and is just wrong.
If this is a dhcp client issue, I would rather get dhcpcd working first -
especially as it supports Gentoo/FreeBSD too (whereas I would have to hack the
freebsd dhclient-script where arping is not available by default).
It looks like we have the information we need with dhcpcd, and this patch just
reverses the order in which we get the routes, as it never struck me that you
might need a static host route to then set the gateway. You can verify this
patch as the route entries in the info file should be reversed.
Created an attachment (id=104640) [details]
Fix static host routes
This patch for dhcpcd-3.0.8 should fix things for you.
Now, we just need someone to test this with normal styles of static routes to
ensure that they still work.
(In reply to comment #6)
> This patch for dhcpcd-3.0.8 should fix things for you.
It did indeed. Thanks for the lightning-like response.
> Now, we just need someone to test this with normal styles of static routes to
> ensure that they still work.
I'm not sure either. But normally static routes shouldn't need the default
route anyways. So it should be fine to set the static routes before the default
route to be able to reach the gateway in the first place.
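The ordering argument can be sketched in Python as follows. This is not the dhcpcd patch itself; it assumes that each ROUTES entry in the info file is a destination,netmask,gateway triple (inferred from the sample quoted in this report), and the names Route, parse_routes and install_order are hypothetical.

```python
import ipaddress
from typing import List, NamedTuple


class Route(NamedTuple):
    destination: ipaddress.IPv4Network
    gateway: ipaddress.IPv4Address


def parse_routes(value: str) -> List[Route]:
    """Parse a space-separated list of 'destination,netmask,gateway' entries."""
    routes = []
    for entry in value.split():
        dest, mask, gateway = entry.split(",")
        routes.append(Route(ipaddress.ip_network(f"{dest}/{mask}"),
                            ipaddress.ip_address(gateway)))
    return routes


def install_order(routes: List[Route]) -> List[Route]:
    """Most specific routes first, so the default route (prefix 0) goes last."""
    return sorted(routes, key=lambda r: r.destination.prefixlen, reverse=True)


routes = parse_routes(
    "0.0.0.0,0.0.0.0,10.255.255.1 10.255.255.1,255.255.255.255,10.255.255.1")
for route in install_order(routes):
    print(route.destination, "via", route.gateway)
```

Sorting by prefix length is just one simple way to guarantee that the host route to the gateway exists before the default route that depends on it.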
Now how do we proceed from here? I've made an overlay ebuild of dhcpcd-3.0.8
with the patch included. Is this of any use to anyone? Will you forward your
patch to the dhcpcd maintainer or should I? What about dhclient?
--
Michael
(In reply to comment #7)
> > Now, we just need someone to test this with normal styles of static routes to
> > ensure that they still work.
>
> I'm not sure either. But normally static routes shouldn't need the default
> route anyways. So it should be fine to set the static routes before the default
> route to be able to reach the gateway in the first place.
This is true.
> Now how do we proceed from here? I've made an overlay ebuild of dhcpcd-3.0.8
> with the patch included. Is this of any use to anyone? Will you forward your
> patch to the dhcpcd maintainer or should I? What about dhclient?
No need - I am the maintainer and upstream for dhcpcd :)
This will have to linger here until after Christmas and I'm now very busy, but
I should release a new dhcpcd with this fix in, and a fixed dhclient before the
New Year.
Thanks
(In reply to comment #8)
> > with the patch included. Is this of any use to anyone? Will you forward your
> > patch to the dhcpcd maintainer or should I? What about dhclient?
>
> No need - I am the maintainer and upstream for dhcpcd :)
> This will have to linger here until after christmas and I'm now very busy, but
> I should release a new dhcpcd with this fix in, and a fixed dhclient before the
> New Year.
Sounds great. Thanks again and happy Christmas!
--
Michael
dhcpcd is fixed in dhcpcd-3.0.8-r1
I'll close this bug when dhclient is fixed also.
I tested the new ebuild dhcpcd-3.0.8-r1 and everything continues to work
nicely. Thanks again.
(In reply to comment #11)
> I tested the new ebuild dhcpcd-3.0.8-r1 and everything continues to work
> nicely. Thanks again.
Good :)
The patch changed quite a bit so I can do the same thing on FreeBSD as dhcpcd
supports that too.
I'll see if I can do dhclient over the next few days. Should be
straightforward-ish.
Hello again Roy,
dhcpcd worked nicely for quite some time. Thanks again for your work!
But after updating to 3.1.1 I now get:
[root@host:/sbin] ./dhcpcd.3.1.1 eth0
Error, eth0: netlink: Invalid argument
Error, eth0: netlink: Network is unreachable
debug output says:
[root@host:/sbin] ./dhcpcd.3.1.1 -d eth0
Info, eth0: dhcpcd 3.1.1 starting
Info, eth0: hardware address = 00:41:05:de:88:06
Info, eth0: DUID = 00:01:00:01:0e:40:02:1a:00:41:05:de:88:06
Info, eth0: broadcasting for a lease
Debug, eth0: sending DHCP_DISCOVER with xid 0x44782312
Debug, eth0: waiting on select for 20 seconds
Debug, eth0: got a packet with xid 0x44782312
Info, eth0: offered 1.2.3.4 from 1.2.3.5
Debug, eth0: sending DHCP_REQUEST with xid 0x44782312
Debug, eth0: waiting on select for 20 seconds
Debug, eth0: got a packet with xid 0x44782312
Info, eth0: got subsequent offer of 1.2.3.4, ignoring
Debug, eth0: waiting on select for 20 seconds
Debug, eth0: got a packet with xid 0x44782312
Info, eth0: checking 1.2.3.4 is available on attached networks
Debug, eth0: sending ARP probe #1
Debug, eth0: sending ARP probe #2
Debug, eth0: sending ARP probe #3
Debug, eth0: sending ARP claim #1
Debug, eth0: sending ARP claim #2
Info, eth0: leased 1.2.3.4 for 172800 seconds
Debug, eth0: renew in 86400 seconds
Debug, eth0: rebind in 151200 seconds
Info, eth0: adding IP address 1.2.3.4/32
Info, eth0: adding route to 10.255.255.1 (255.0.0.0) metric 0
Error, eth0: netlink: Invalid argument
Info, eth0: adding default route via 10.255.255.1 metric 0
Error, eth0: netlink: Network is unreachable
Debug, eth0: writing /etc/resolv.conf
Debug, eth0: writing /var/lib/dhcpcd/dhcpcd-eth0.info
Debug, eth0: forking to background
This seems to be a regression in 3.1.1. I've downgraded to 3.0.16-r3 and it
works again. Any help would be greatly appreciated.
--
Thanks,
Micha
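A likely reading of the debug log above, offered as an assumption rather than a confirmed diagnosis: the route to 10.255.255.1 is being added with netmask 255.0.0.0, which leaves host bits set in the destination. Linux rejects such a route with "Invalid argument", and without that host route the default route then fails with "Network is unreachable". The check below is a small Python sketch; host_bits_set is a hypothetical helper, not part of dhcpcd.

```python
import ipaddress


def host_bits_set(destination: str, netmask: str) -> bool:
    """Return True if the destination has bits set outside the netmask."""
    network = ipaddress.ip_network(f"{destination}/{netmask}", strict=False)
    return ipaddress.ip_address(destination) != network.network_address


# The combination shown in the 3.1.1 debug output.
print(host_bits_set("10.255.255.1", "255.0.0.0"))

# The combination the DHCP server intends (a host route).
print(host_bits_set("10.255.255.1", "255.255.255.255"))
```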
Could you attach a Wireshark trace of the DHCP transaction please?
If you don't want to attach it because it shows potentially sensitive data then
please email it to me at uberlord@gentoo.org
Thanks
The invalid argument message is gone but "Error, eth0: netlink: Network is
unreachable" still appears and the default route isn't set afterwards.
Indeed, this patch seems to do it. Thanks!
Should I patch my ebuild or will you make a new release anyway?
It fails to apply cleanly:
[michael@host:~] gtar -xjf /usr/portage/distfiles/dhcpcd-3.1.1.tar.bz2
[michael@host:~] cd dhcpcd-3.1.1
[michael@host:~/dhcpcd-3.1.1] cat ../dhcpcd-3.1.1-2.patch | patch -p0
patching file interface.c
Hunk #6 FAILED at 863.
Hunk #7 FAILED at 871.
Hunk #8 succeeded at 897 (offset 3 lines).
Hunk #9 succeeded at 907 (offset 3 lines).
2 out of 9 hunks FAILED -- saving rejects to file interface.c.rej
patching file dhcp.c
Hunk #1 succeeded at 501 (offset 9 lines).
Hunk #2 succeeded at 741 (offset 9 lines).
patching file ChangeLog
Hunk #1 FAILED at 1.
1 out of 1 hunk FAILED -- saving rejects to file ChangeLog.rej
What am I missing?
Patch applied and compiled cleanly. The resulting dhcpcd still works and
correctly sets the default route.
(In reply to comment #23)
> Fixed in 3.1.2, thanks
The problem just reappeared after updating to 4.0.0_beta5. Downgrading to 3.2.3
made it go away again. Can you have another look at it?
Hello,
is there anything happening on this?
--
Thanks, Micha
Raising severity from 'enhancement' and assigning to base-system herd as they
are currently listed in the package metadata. I've just spent hours trying to
deal with the very same problem (as described by the original reporter) whilst
setting up a host for a client using 1&1. I got to the point where I was able
to switch between a skeletal Gentoo system and a default Ubuntu setup over a
serial console. Ubuntu was having no problems configuring the interface with
dhclient3, but no dice with Gentoo. It's an extremely unintuitive problem to
diagnose and had me on the verge of tearing my hair out. I followed many
potential leads - all of which were dead ends - before realising that:
1) All 4.x versions of dhcpcd seem to be broken (I tried 4.0.2 and 4.0.5)
2) dhcpcd-3.2.3 is fine
To be clear, in the second case the following two routes are added as
necessary:
10.255.255.1    *               255.255.255.255 UH    2      0        0 eth0
default         10.255.255.1    0.0.0.0         UG    2      0        0 eth0
Subsequently, I found this bug. It's interesting that it can be traced back to
two years prior and that almost 6 months have elapsed since it was re-opened
without a response, which I presume is due to Roy's retirement. As it stands, I
don't think that the current series of dhcpcd should be marked stable with a
bug such as this outstanding. However unusual the setup may be at 1&1, it
strikes me as important that a DHCP client do its job in a straightforward
fashion and stay out of the way otherwise.
The host in question is using almost exclusively stable tree packages, the
exceptions being openrc-0.3.0-r1 and baselayout-2.0.0. If there is any further
information that may be necessary to help fix the bug, I'd be more than happy
to provide it.
They must have changed something.
dhcpcd-4.0.5 (dhcpcd-4.0.2 has identical code here)
Option: (121) Classless Static Route
Length: 14
Value: 200AFFFF0100000000000AFFFF01
Subnet/MaskWidth-Router: 10.255.255.1/32-0.0.0.0
Subnet/MaskWidth-Router: default-10.255.255.1
uberpc ~ # route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.255.255.1    0.0.0.0         255.255.255.255 UH    0      0        0 ral0
10.73.1.0       0.0.0.0         255.255.255.0   U     0      0        0 ral0
127.0.0.0       127.0.0.1       255.0.0.0       UG    0      0        0 lo
0.0.0.0         10.255.255.1    0.0.0.0         UG    0      0        0 ral0
So it should work just fine. Could you attach a wireshark trace of dhcpcd-3.2.3
working and dhcpcd-4.0 failing?
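For reference, the Classless Static Route value quoted above can be decoded by hand following the RFC 3442 encoding (mask width, then only the significant destination octets, then a four-octet router). The Python below is a minimal sketch of that decoding, not dhcpcd's implementation; the function name is made up for this example.

```python
import ipaddress
from typing import List, Tuple


def decode_classless_static_routes(data: bytes) -> List[Tuple[str, str]]:
    """Decode DHCP option 121 payload into (destination/width, router) pairs."""
    routes = []
    pos = 0
    while pos < len(data):
        width = data[pos]
        pos += 1
        if width > 32:
            raise ValueError(f"invalid mask width {width}")
        significant = (width + 7) // 8  # destination octets actually sent
        if pos + significant + 4 > len(data):
            raise ValueError("truncated option data")
        # Pad the destination back out to four octets.
        dest = data[pos:pos + significant] + bytes(4 - significant)
        pos += significant
        router = data[pos:pos + 4]
        pos += 4
        routes.append((f"{ipaddress.IPv4Address(dest)}/{width}",
                       str(ipaddress.IPv4Address(router))))
    return routes


for destination, router in decode_classless_static_routes(
        bytes.fromhex("200AFFFF0100000000000AFFFF01")):
    print(destination, "via", router)
```

The 14 bytes split into a /32 route to 10.255.255.1 with router 0.0.0.0 (on-link) followed by the default route via 10.255.255.1, matching the Wireshark dissection quoted in the comment.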
I ran into this, too. I worked around it by using a static configuration like
this:
modules=( "iproute2" )
config_eth0=( "x.x.x.x netmask 255.255.255.255 brd x.x.x.x" )
routes_eth0=( "10.255.255.1 dev eth0" "default via 10.255.255.1" )
DHCP can still be used via udhcpc, but you have to add the static route yourself.
Workarounds are nice, but this bug will be fixed by someone attaching a full
wireshark trace AND what they expect the routing table to look like afterwards.
Heya folks!
I can do tcpdumps. Roy: I'll send you captures of
# /etc/init.d/net.eth0 start
on my 1&1 server with dhcpcd-3.2.3 and 4.0.4 respectively via private mail. The
tcpdump command was
# tcpdump -i eth0 -n -s 2048 -w tcpdump.dhcpcd-4.0.4 port bootpc or bootps
Also net.eth0 shows additional console messages with dhcpcd-4.0.4:
[root@heinz:~] /etc/init.d/net.eth0 start
* Bringing up interface eth0
* dhcp ...
* Running dhcpcd ...
eth0: dhcpcd 4.0.4 starting
eth0: broadcasting for a lease
eth0: offered 1.2.3.4 from 1.2.3.249
eth0: checking 1.2.3.4 is available on attached networks
eth0: ignoring offer of 1.2.3.4 from 1.2.3.250
eth0: acknowledged 1.2.3.4 from 1.2.3.249
eth0: leased 1.2.3.4 for 172800 seconds
eth0: add_route: Network is unreachable
eth0: add_route: Network is unreachable [ ok ]
* received address 1.2.3.4/32 [ ok ]
Created an attachment (id=173266) [details]
Re-order options
I've analysed the dump and looked at the patches here. The only thing I can
think of is that the order of options matters. Does this patch fix things?
Hi Roy,
unfortunately it doesn't. Same behaviour as without.
--
Micha
Created an attachment (id=173339) [details]
Fix adding host routes
This patch (along with the re-order options one) should now work as I've had a
chance to actually test it on a Linux box. Let me know and I'll roll a new
dhcpcd version :)
Hi Roy,
the new patch works nicely without the options reorder patch (I forgot to apply
it). Output from Gentoo's /etc/init.d/net.eth0 restart is now:
* dhcp ...
* Running dhcpcd ...eth0: dhcpcd 4.0.4 starting
eth0: broadcasting for a lease
eth0: offered 1.2.3.4 from 1.2.3.249
eth0: checking 1.2.3.4 is available on attached networks
eth0: ignoring offer of 1.2.3.4 from 1.2.3.250
eth0: acknowledged 1.2.3.4 from 1.2.3.249
eth0: leased 1.2.3.4 for 172800 seconds
[ ok ]
* received address 1.2.3.4/32
The routing table looks as expected afterwards.
--
Micha
dhcpcd-4.0.6 and dhcpcd-4.99.6 released.
Hopefully in portage soon.
Fixed in dhcpcd-4.0.6, which is now in-tree.