The resolv.conf information for the interface is not removed when the interface is brought down with SIGTERM, which is the default in baselayout 1.11.14-r8.

This is due to the execOnStop handling introduced in this change:
http://svn.berlios.de/wsvn/dhcpcd/trunk/src/?rev=45&sc=1
http://svn.berlios.de/wsvn/dhcpcd/trunk/src/?rev=0&sc=0

dhcpStop in client.c returns early when execOnStop is 0, as set in sigHandler in signals.c. The behaviour is inconsistent with the un-renaming of resolv.conf, which still occurs with SIGTERM when /sbin/resolvconf is not used.

This means that when a different interface (wlan0) is brought up (after eth0) on a different network, gethostbyaddr is called for the new IP address using the old DNS servers, which don't respond, causing the whole DHCP communication to time out.

The workaround is to use dhcp_eth0="release" etc. in /etc/conf.d/net.
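For readers without the source to hand, the early-return pattern described above can be sketched roughly as follows. This is not dhcpcd's actual code: every name here (sketch_exec_on_stop, sketch_sig_handler, sketch_dhcp_stop, sketch_remove_dns_info) is a hypothetical stand-in, and the assumption that only SIGTERM clears the flag is inferred from the report rather than read from signals.c.

```c
#include <signal.h>

/* Hypothetical stand-in for the execOnStop flag; assumed to default to 1. */
static volatile sig_atomic_t sketch_exec_on_stop = 1;

/* Counts how often the cleanup ran, so the effect is observable. */
static int sketch_cleanup_calls = 0;

/* Stand-in for the signal handler: SIGTERM clears the flag (assumption),
 * any other signal leaves the cleanup enabled. */
void sketch_sig_handler(int sig)
{
    sketch_exec_on_stop = (sig == SIGTERM) ? 0 : 1;
}

/* Stand-in for removing the interface's DNS information. */
static void sketch_remove_dns_info(void)
{
    sketch_cleanup_calls++;
}

/* Stand-in for the stop routine: it returns early when the flag is 0,
 * so the DNS cleanup is skipped on SIGTERM. */
void sketch_dhcp_stop(void)
{
    if (!sketch_exec_on_stop)
        return;
    sketch_remove_dns_info();
}

/* Accessor so callers can check whether the cleanup happened. */
int sketch_cleanup_count(void)
{
    return sketch_cleanup_calls;
}
```

With this shape, a stop after SIGINT runs the cleanup while a stop after SIGTERM leaves the stale DNS information in place, which matches the symptom reported here.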
This is not easily fixable - here's why.

resolvconf in this instance would restart any currently started DNS resolvers, like nscd, dnsmasq, bind, etc. When dhcpcd is called with SIGINT, everything works as planned. However, when called with SIGTERM, everything *appears* to work, but the DNS resolver daemons are then stuck and have to be killed with a SIGKILL, which is very, very nasty.

I've spent quite some time trying to get dhcpcd to play nicely with SIGTERM and resolvconf -d, but I just cannot find a way to fix it at present. That is why we don't call resolvconf -d on SIGTERM. Simon Kelly (dhcpcd co-maintainer) said that this may be fixable by using a siglongjmp, but I haven't had time to look into this yet.

We get around this as baselayout-1.12 has resolvconf support, which will call resolvconf -d when eth0 goes down anyway, which should solve your problem.
Created attachment 88189
Allow dhcpStop to exec on SIGTERM

OK, this patch should fix the issue. Please test :)
Patch works :) Thank you for a quick response.

But I don't think I really understand the issues with SIGTERM:

If the daemons are restarted through resolvconf from dhcpcd
(i.e. before execute_on_change("down")) then they hang,
but when restarted from baselayout scripts they don't?

Isn't the behaviour on SIGTERM the same as SIGINT (unless Persistent)?

What is the difference between calling dhcpStop and exiting, or calling
siglongjmp before dhcpStop and exiting? Restoring saved signals?

Should env and jmpTerm be sigjmp_buf instead of jmp_buf?
(In reply to comment #3)
> But I don't think I really understand the issues with SIGTERM:
>
> If the daemons are restarted through resolvconf from dhcpcd
> (i.e. before execute_on_change("down")) then they hang,
> but when restarted from baselayout scripts they don't?

No. When restarted via resolvconf when triggered by SIGTERM on dhcpcd they hang. So when you do a /etc/init.d/nscd restart it doesn't restart and the nscd daemon can only be stopped with a kill -9. Why this happens I don't know. I've known about this for some time, but baselayout-1.12 did the resolvconf -d too so I've not really been fussed about fixing it.

> Isn't the behaviour on SIGTERM the same as SIGINT (unless Persistent)?

From a dhcpcd perspective yes - from a keeping daemons running perspective no.

> What is the difference between calling dhcpStop and exiting, or calling
> siglongjmp before dhcpStop and exiting? Restoring saved signals?

siglongjmp is basically a restore env and goto, so the signals are saved.

> Should env and jmpTerm be sigjmp_buf instead of jmp_buf?

sigjmp_buf is just an alias to jmp_buf AFAIK

Glad it's fixed :) I'll see if I can push out 2.0.6 next week.
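To make the "restore env and goto" description concrete, here is a minimal sketch of jumping out of a SIGTERM handler with sigsetjmp/siglongjmp. It is not taken from the attached patch: the names sketch_env, sketch_on_term and sketch_jump_on_sigterm are hypothetical, and the buffer is declared as sigjmp_buf because that is the type POSIX specifies for sigsetjmp, whatever it resolves to on a given libc.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical jump buffer; sigjmp_buf is the POSIX type for sigsetjmp. */
static sigjmp_buf sketch_env;

/* Handler: jump straight back to the saved point instead of returning. */
void sketch_on_term(int sig)
{
    (void)sig;
    siglongjmp(sketch_env, 1);
}

/* Returns 1 if control came back via siglongjmp and SIGTERM is unblocked
 * again afterwards, 0 otherwise. */
int sketch_jump_on_sigterm(void)
{
    struct sigaction sa;
    sigset_t mask;

    sa.sa_handler = sketch_on_term;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return 0;

    /* A non-zero second argument asks sigsetjmp to save the signal mask. */
    if (sigsetjmp(sketch_env, 1) == 0) {
        raise(SIGTERM);   /* the handler jumps, so raise() does not return here */
        return 0;
    }

    /* Reached via siglongjmp. SIGTERM was blocked while the handler ran;
     * the mask saved by sigsetjmp has been restored, so it should be
     * unblocked again. */
    if (sigprocmask(SIG_SETMASK, NULL, &mask) != 0)
        return 0;
    return sigismember(&mask, SIGTERM) == 0;
}
```

The point of the sketch is that cleanup code placed after the jump target runs in normal context with the original signal mask, rather than inside the handler.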
Fixed in dhcpcd-2.0.6