When starting net.eth2 on my system, dhcpcd 5.1.0 times out whereas 4.0.13 succeeds. No config files were changed between the emerges.
Reproducible: Always
Steps to Reproduce:
See above. I was trying to use dhcpcd to lease a public IP from the ONT of FIOS. In my conf.d/net I have
config_eth2=dhcp
modules_eth2=dhcpcd
I have /etc/dhcpcd.exit-hook, which is never executed for 5.1.0 since the lease is never acquired.
Actual Results:
dhcpcd times out.
Expected Results:
Lease to be acquired.
Created attachment 203811 [details] emerge info
Created attachment 203815 [details] debug output of 4.0.13
Created attachment 203817 [details] 5.1.0 debug output
Try adding the duid option to /etc/dhcpcd.conf
reopen with info.
duid does not work and 5.1.1 still gives a timeout error.
(In reply to comment #5)
> reopen with info.
Could you attach wireshark traces of dhcpcd-4 working and dhcpcd-5 failing please?
(In reply to comment #7)
> Could you attach wireshark traces of dhcpcd-4 working and dhcpcd-5 failing
> please?
I have tcpdump here; if that is satisfactory, I can attach traces for both.
Yes, provided they are a full capture like so:
tcpdump -s0 -w /tmp/dhcpcd.cap -i eth2
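A capture limited to DHCP traffic is usually easier to compare between versions. The sketch below is an assumption and not something requested in this thread: the `capture_dhcp` function name, the `DRY_RUN` switch, and the port filter are illustrative, while `-s0`, `-w`, and `-i` are the same tcpdump options as in the command above. Drop the filter expression if a capture of all traffic on the interface is wanted.

```shell
# Capture full-length DHCP packets (UDP ports 67/68) on one interface.
# Needs root. Set DRY_RUN=1 to print the command instead of running it.
# capture_dhcp, DRY_RUN and the default paths are illustrative names.
capture_dhcp() {
    iface="${1:-eth2}"
    capfile="${2:-/tmp/dhcpcd.cap}"
    # Build the command as positional parameters so quoting is preserved.
    set -- tcpdump -s0 -w "$capfile" -i "$iface" 'udp and (port 67 or port 68)'
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage (as root), in one terminal while dhcpcd runs in another:
#   capture_dhcp eth2 /tmp/dhcpcd-v5.cap
```

Running it once per dhcpcd version, with a different output file each time, gives two captures that can be opened side by side.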
Created attachment 205255 [details] tcpdump trace of dhcpcd v4 finding the lease
Created attachment 205257 [details] dhcpcd v5 trying to find a lease
Created attachment 205270 [details, diff] Report ClientID sent in debug
To make this a little more visible, this patch reports the ClientID being sent in debug mode for dhcpcd-5. A DUID is a fancy ClientID.
John, is this still an issue with 5.1.3-r1? Thanks, William
(In reply to comment #13)
> John,
> is this still an issue with 5.1.3-r1?
> Thanks,
> William
Yep, issue is still there, no joy.
Ok, no problem, I'll leave you and Roy to work on it, I just wanted to see if this was still happening.
(In reply to comment #15)
> Ok, no problem, I'll leave you and Roy to work on it, I just wanted to
> see if this was still happening.
I don't see this as a dhcpcd issue - one version is using a clientid and the other is not. This is plainly visible from the packet captures. This is due to a client config issue, as neither dhcpcd-4.0.13 nor dhcpcd-5.1.3 will send a clientid by default.
John, can you test with 5.1.4? Also, please check your dhcpcd configuration and verify that you are set up correctly. You might also want to try using the --noconfmem option for emerge and go back to the default dhcpcd configuration. Please let us know whether this works or not. Also, let us know if 4.0.15 works for you. Thanks, William
(In reply to comment #17)
> John,
> can you test with 5.1.4? Also, please check your dhcpcd configuration
> and verify that you are set up correctly.
> You might also want to try using the --noconfmem option for emerge and
> go back to the default dhcpcd configuration.
> Please let us know whether this works or not.
> Also, let us know if 4.0.15 works for you.
> Thanks,
> William
5.1.4 does not work, 4.0.15 has been working. Should there be a difference in the setup between the two?
(In reply to comment #18)
> 5.1.4 does not work, 4.0.15 has been working. Should there be a difference in
> the setup between the two?
Well you have configured dhcpcd-4 to send a ClientID and dhcpcd-5 to not. That is the difference. Try adding a line to /etc/dhcpcd.conf with just the word clientid on it.
(In reply to comment #19)
> (In reply to comment #18)
> > 5.1.4 does not work, 4.0.15 has been working. Should there be a difference in
> > the setup between the two?
> Well you have configured dhcpcd-4 to send a ClientID and dhcpcd-5 to not. That
> is the difference. Try adding a line to /etc/dhcpcd.conf with just the word
> clientid on it.
OK, here is the problem: if I use even 4.0.15 and add the clientid, it does not start. I normally have it commented out, but version 5 adds it anyway and the server does NOT want it. How do I tell it no?
You remove clientid AND duid from /etc/dhcpcd.conf if you do not want them sent (duid is a clientid also).
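Whether either directive is actually active in a given config file can be checked mechanically. This is a minimal sketch that assumes a directive counts as active when `clientid` or `duid` appears at the start of a line that is not commented out; the `active_clientid_opts` function name is hypothetical and not part of dhcpcd.

```shell
# List uncommented clientid/duid directives in a dhcpcd config file.
# Prints matching lines with their line numbers; returns non-zero if
# neither directive is active. active_clientid_opts is an illustrative name.
active_clientid_opts() {
    conf="${1:-/etc/dhcpcd.conf}"
    grep -nE '^[[:space:]]*(clientid|duid)([[:space:]]|$)' "$conf"
}

# Usage:
#   active_clientid_opts /etc/dhcpcd.conf || echo "no clientid/duid set in the file"
```

This only inspects the file; anything passed on the dhcpcd command line or by the init scripts is outside its view.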
(In reply to comment #21)
> You remove clientid AND duid from /etc/dhcpcd.conf if you do not want them sent
> (duid is a clientid also).
But I have no duid and clientid is commented out; nevertheless, apparently v5 sends it anyway.
(In reply to comment #22)
> But I have no duid and clientid is commented out, nevertheless apparently v5
> sends it anyway.
No it doesn't.
I'm about to give up on this - I've already told you what the issue is and how to fix it. But as a last ditch attempt, please attach your /etc/conf.d/net and /etc/dhcpcd.conf
Created attachment 214637 [details] dhcpcd.conf
Created attachment 214638 [details] conf.d/net
(In reply to comment #23)
> (In reply to comment #22)
> > But I have no duid and clientid is commented out, nevertheless apparently v5
> > sends it anyway.
> No it doesn't.
> I'm about to give up on this - I've already told you what the issue is and how
> to fix it. But as a last ditch attempt, please attach your /etc/conf.d/net and
> /etc/dhcpcd.conf
As you see, with this config v4.0.15 works, while v5 sends the clientid and so does not work. Thanks for all your efforts.
As root, run this on a console:
# pkill dhcpcd
# dhcpcd -dB
Capture the output and attach it here please, once it either times out or gets an address.
Created attachment 214739 [details] output of dhcpcd 5.1.4 with -dB eth2
I have the following use flags: -zeroconf and compat, which is apparently no longer supported in the 5 series.
Now do the same for this command - note the two single quotes at the end.
# pkill dhcpcd
# dhcpcd -dBI ''
(In reply to comment #29)
> Now do the same for this command - note the two single quotes at the end.
> # pkill dhcpcd
> # dhcpcd -dBI ''
Well, can I put eth2 instead of '' so it doesn't try to broadcast to other interfaces? Also, which version of dhcpcd do you want me to do this for? Thanks.
(In reply to comment #30)
> (In reply to comment #29)
> > Now do the same for this command - note the two single quotes at the end.
> > # pkill dhcpcd
> > # dhcpcd -dBI ''
> Well, can I put eth2 instead of '' so it doesn't try to broadcast to other
> interfaces? Also, which version of dhcpcd do you want me to do this for?
> Thanks.
I think he needs the output from dhcpcd-5.1.4, like you used in comment #28. Also, with the 5 series, it shouldn't try to broadcast to any interfaces that are already set up with addresses, so I would suggest using the exact command he gave you. Thanks, William
(In reply to comment #30)
> (In reply to comment #29)
> > Now do the same for this command - note the two single quotes at the end.
> > # pkill dhcpcd
> > # dhcpcd -dBI ''
> >
> Well, can I put eth2 instead of '' so it doesn't try to broadcast to other
> interfaces? Also, which version of dhcpcd do you want me to do this for?
No, you would need to put it afterwards like so:
# pkill dhcpcd
# dhcpcd -dBI '' eth2
It's for dhcpcd-5 only.
Created attachment 214997 [details] dhcpcd 5.1.4 with cmd dhcpcd -dBI '' eth2
John, we need you to attach a backtrace of the previous command. The document on the gentoo site that explains how to get one is: http://www.gentoo.org/proj/en/qa/backtraces.xml Basically this involves adding "-ggdb" to your cflags then either "nostrip" or "splitdebug" to features, re-emerging dhcpcd, then using gdb to run dhcpcd then saving the backtrace to a file and attaching it to the bug. Are you comfortable with following the instructions in that document? Let us know if you have any questions.
Created attachment 215357 [details] backtrace of dhcpcd 5.1.4
That backtrace is not usable - there are no function calls there.
Can you compile dhcpcd manually like so?
cd /tmp
tar xvjpf /usr/portage/distfiles/dhcpcd-5.1.4.tar.bz2
cd dhcpcd-5.1.4
./configure
CFLAGS=-ggdb make
./dhcpcd -dBI '' eth2
*crash*
gdb ./dhcpcd
> core dhcpcd.core
> bt
I did what you said, but it doesn't look like I got anything. The core dump came out in my / partition and was only 464k in size. I did set ulimit -c unlimited before doing this.
(In reply to comment #36)
> That backtrace is not usable - there are no function calls there.
> Can you compile dhcpcd manually like so?
> cd /tmp
> tar xvjpf /usr/portage/distfiles/dhcpcd-5.1.4.tar.bz2
> cd dhcpcd-5.1.4
> ./configure
> CFLAGS=-ggdb make
> ./dhcpcd -dBI '' eth2
> *crash*
> gdb ./dhcpcd
> > core dhcpcd.core
> > bt
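The interactive gdb steps can also be run non-interactively, so that the backtrace lands in a file ready to attach. This is a hedged sketch: the `save_backtrace` function name and the default output file are assumptions made for illustration, and it relies only on gdb's `-batch` and `-ex` options and the `bt full` command.

```shell
# Write a full backtrace from a core file to a text file.
# save_backtrace and the default output name are illustrative.
save_backtrace() {
    bin="$1"
    core="$2"
    btfile="${3:-backtrace.txt}"
    # Fail early if the core file is missing, e.g. because the core
    # size limit was 0 when the program crashed.
    if [ ! -r "$core" ]; then
        echo "no readable core file at $core; check 'ulimit -c'" >&2
        return 1
    fi
    gdb -batch -ex 'bt full' "$bin" "$core" > "$btfile" 2>&1
}

# Usage, from the build directory after the crash:
#   ulimit -c unlimited
#   ./dhcpcd -dBI '' eth2
#   save_backtrace ./dhcpcd dhcpcd.core /tmp/dhcpcd-bt.txt
```

The core file name and location depend on the system's core pattern, so the path passed in may differ from `dhcpcd.core`.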
Created attachment 215464 [details] backtrace of manually compiled dhcpcd
Are there any updates on this bug? Thanks, William
Could be related to this upstream ticket http://roy.marples.name/projects/dhcpcd/ticket/185 Which in turn links to ArchLinux http://bugs.archlinux.org/task/17838 where there is more info. I've also been sent a lot more stuff privately and hopefully it can be fixed soon. However, the segfault here is still odd and the backtrace is not useable - nothing dhcpcd related is found. Dunno if that's due to the OP using some kind of wrapper script as indicated at the top and bottom of his output.
I ran again with no scripting at all except redirection, and it did not crash, but it did not work either. Here is what I got.
Created attachment 217572 [details] trace without any scripting did not crash, but did not work.
I should note that I have a dhcp server on the desktop, and on my laptop, which is also gentoo, I have v5.1.4 and it works fine with the dhcp server on the desktop, which is 3.1.2_p1. So it's something with the FIOS server and the dhcpcd client. Now if I put clientid in 4.0.x, I can make it time out as well.
Can you try the patch here please? http://roy.marples.name/projects/dhcpcd/changeset/9eee804aed30995b1dd8f075cae5e6c343289242
Another one here http://roy.marples.name/projects/dhcpcd/changeset/beffe25d106d2b2452bd0d87b1b249ddb2e22528 If it fails to apply, you may want to try a git snapshot.
Both changes went in, but no joy. I did get another crash, but my ulimit was not set, so no core dump, but I tried again and no crash. Next time I will make sure ulimit is unlimited. Otherwise same results as always.
It's the crash that puzzles me the most. Can you try running it under valgrind?
valgrind dhcpcd -dBI '' eth2
You may have to emerge glibc with FEATURES="splitdebug" enabled in /etc/make.conf to get valgrind to work though.
John, dhcpcd-5.1.5 was just released. Can you try with that version? Thanks, William
No joy with .5. Also, I did try .4 with valgrind, but it didn't crash when I tried it. .5 did not crash, but exhibited the same behavior.
Hi John, dhcpcd 5.2.1 just hit portage. Can you update and try this again? Thanks, William
Sorry, no joy - exactly the same behavior.
There is a similar discussion on my upstream bug http://roy.marples.name/projects/dhcpcd/ticket/186 Although that is on a VM on a s390, so may not be of any use.
The recently released 5.2.2 has some IP header tweaks which may or may not resolve this issue.
Sorry, no joy, no crash, but exactly the same behavior
I just tried to install gentoo from the most recent installation CD (8 apr 2010) on amd64 and cannot access the network, dhcpcd fails (times out). On covici's recommendation, I switched to grml as the boot method and that succeeded. My machine is a dell latitude e6500 and my router is a linksys running the tomato firmware.
John, is this still an issue with 5.2.5? Thanks, William
(In reply to comment #56)
> John,
> is this still an issue with 5.2.5?
> Thanks,
> William
Yep, still the same.
This bug doesn't seem to be getting anywhere even though a fair bit of effort is going in to it. Roy, you suggested in the stablereq for dhcpcd that this bug is in fact caused by the OP's DHCP server rejecting messages without a clientid: can you succinctly demonstrate that here from the OP's log? It would be great if we could settle this bug one way or the other. There is a problem *somewhere*, and if we can show that it's not in dhcpcd, then we can close the bug. (Right?) Thanks.
(In reply to comment #58)
> Roy, you suggested in the stablereq for dhcpcd that this bug is in fact caused
> by the OP's DHCP server rejecting messages without a clientid: can you
> succinctly demonstrate that here from the OP's log? It would be great if we
> could settle this bug one way or the other. There is a problem *somewhere*,
> and if we can show that it's not in dhcpcd, then we can close the bug.
> (Right?) Thanks.
I agree with this. If we can show that this is a configuration issue and not an issue with dhcpcd itself, I will close the bug.
If the reporter provided logs are not proof enough then I'm not sure what kind of proof you need. The working one clearly shows a clientid, the non working one clearly shows no clientid.
Roy, I understand that your patience is waning, but please bear with us. :) Unfortunately, when I look at the logs (dhcpcd-v[45]-dump.txt) I just see a lot of nonsense, I don't understand what's going on in there. I'm not sure if William is any more enlightened by them either? So could you, if possible, please explain to us why/how those logs demonstrate that the server and not the client is malfunctioning. It would allow us to close the bug as invalid with confidence. On another note, John's comment (#43) about testing a different server that worked also sounds like a fair argument that the client is not at fault. Maybe we could find out what version the FiOS server is and confirm that it's faulty. Just an idea. I'd rather understand the logs. :)
Open the logs with an analyzer like wireshark. You will see that one of the DHCP options that dhcpcd-4.x sends (by default) is a client id. dhcpcd-5 does NOT send this by default. DHCP servers should NOT require a clientid. However, some do.
From the reporter's dhcpcd.conf:
# To share the DHCP lease across OSX and Windows a ClientID is needed.
# Enabling this may get a different lease than the kernel DHCP client.
# Some upstream DHCP servers may also require a ClientID, such as FRITZ!Box.
#clientid
My patience is now at an end. I document, I fix and I demonstrate configuration issues and solutions. But apparently that's not enough anymore.
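The same ClientID comparison can be made from a terminal for anyone who does not have wireshark at hand. The sketch below assumes that tcpdump's verbose DHCP decoding labels the option with the text "Client-ID" (DHCP option 61); the `count_clientid_lines` function name is hypothetical, and the exact output wording may differ between tcpdump versions.

```shell
# Count lines in tcpdump's verbose output (read from stdin) that mention
# a Client-ID option. count_clientid_lines is an illustrative name.
count_clientid_lines() {
    grep -ci 'client-id'
}

# Usage, once per capture file:
#   tcpdump -nn -vvv -r /tmp/dhcpcd.cap 'udp and (port 67 or port 68)' | count_clientid_lines
```

A count of zero for one capture and a non-zero count for the other would match the difference described above.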
OK, the FIOS people changed the server so now the 4.x series does not work and the 5.x series does, and I didn't change the configs at all, so I guess you can close this bug -- sorry for all the trouble.
lol. (Really.) Well, John, it's your bug, so it's probably best for you to close it if you feel that it's no longer valid. Roy, thank you for explaining how to view the log files.
(In reply to comment #63)
> OK, the FIOS people changed the server so now the 4.x series does not work and
> the 5.x series does, and I didn't change the configs at all, so I guess you can
> close this bug -- sorry for all the trouble.
I am closing this as invalid per your request. Thanks, William
I just got updated to 5.2.8, now I consistently get timeouts whereas the previous version worked with no issues. I have not changed the configuration of either version.
I tried uncommenting the "clientid" line in /etc/dhcpcd.conf but this had no effect.