I bumped sys-apps/openrc from 0.2.4 to 0.2.5 last night and could no longer boot. This happened on two machines, both with parallel startup set to YES. Both hang after the first 4-5 lines that openrc prints (something related to /dev/shm, etc.). I never get to see any service-name-prefixed lines, only the lines that start with a green " * ". At that point I can press Ctrl-C, which tells me that a whole lot of services are getting interrupted, but "/etc/init.d/$service status" tells me that $service is still "starting". Restarting the services doesn't help. I have some mingettys starting after /sbin/rc boot and before /sbin/rc default, so I am able to do at least something after I interrupt the boot level, but I cannot get the system to a usable state since I cannot really get any service up. If someone has suggestions on debugging methods, I'm listening. For now, I've masked openrc-0.2.5 locally.
Same thing here. (amd64, parallel startup) Please mask 0.2.5 ASAP, because this bug really sucks.
Does it work if you turn parallel off?
(In reply to comment #2)
> Does it work if you turn parallel off?
In fact it does. Roy, are you unable to reproduce? I've already got this bug on two machines, I can only assume it's reproducible. Both amd64 (well, one's core2), both with parallel on.
Just to ensure that there is no ambiguity, I am rephrasing my response in comment #3 to "Everything works fine and the system boots properly if I set parallel_startup to NO".
(In reply to comment #3)
> Roy, are you unable to reproduce?
Correct.
(In reply to comment #1)
> Same thing here. (amd64, parallel startup)
> Please mask 0.2.5 ASAP, because this bug really sucks.
Same here, on x86, also with parallel startup switched on.
Could the people affected by this bug attach (post in the comment box and I'll kill ya!) their /lib/rc/init.d/deptree?
Created attachment 154861: deptree
Created attachment 154865: deptree of my system
Created attachment 154899: deptree
And mine.
Created attachment 154913: The deptree of my system
Happens also on my machine. After switching back to openrc-0.2.4-r1 I get a bunch of "* Caching service dependencies ..." which slooooow down the boot process almost 4 times.
Created attachment 154953: deptree
Same thing here. But it works (sometimes) with the interactive mode.
I don't know if it's important, but when I start the first service in interactive mode, I get this message: hwclock | * ERROR: cannot start hwclock as fsck would not start.
Created attachment 155001: deptree x86
Created attachment 155003: deptree amd64
I've been bitten by this one too. I'm not sure if this is relevant but it's not been mentioned yet so here goes. I left my machine for a while to see if anything would time out and eventually I got a bunch of "timed out waiting for $foo" messages from my various services, from what I could see device-mapper seemed to be the blocking service - most services were waiting on hwclock which was waiting on device-mapper.
Hello everyone. I have attached two deptrees. The first one (x86) is from a box where I'm having problems. The second one (amd64) is from a notebook where I have not experienced any problems at all. Both boxes are running openrc-0.2.5 with rc_parallel="YES". HTH, Norberto
Created attachment 155065: Deptree
The deptree was attached while booting with sys-apps/openrc-0.2.4-r1 on an x86 machine. (I downgraded 'cause of this bug.)
Same problem here, except it does not hang forever: if you wait > 20 mins it will boot. The service that was hanging WAS readahead-list, so once it booted I removed that from the startup. NOW it is timing out on hwclock, which I am sure is OK. It seems to be openrc itself, rather than any one script, that is causing this.
Some more information... This bug only affects boot level services, the default level ones seem to work once openrc gives up on the boot services. The boot services seem to actually start but they do not report back to openrc. This must be the case because I can get a working system once everything times out.
Created attachment 155219: My deptree amd64 (core 2)
In case it helps ;)
OK, I can repro this now. I'll work on a fix ASAP. For the time being, either don't use parallel startup or remove the `before clock` dependency from /etc/init.d/readahead-list.
Created attachment 155583: Remove broken before dependencies
OK, I think I have this fixed now. Try this patch or the 9999 ebuild in portage.
(In reply to comment #24)
> Created an attachment (id=155583)
> Remove broken before dependencies
>
> OK, I think I have this fixed now. Try this patch or the 9999 ebuild in portage.
What version is this supposed to apply to? It fails against 0.2.5.
$ cat ../attachment.cgi\?id\=155583 | patch -p1 --dry-run
patching file src/librc/librc-depend.c
Hunk #1 FAILED at 182.
Hunk #2 FAILED at 252.
Hunk #3 FAILED at 683.
Hunk #4 FAILED at 865.
Hunk #5 succeeded at 910 (offset 26 lines).
Hunk #6 succeeded at 1002 with fuzz 2 (offset 26 lines).
4 out of 6 hunks FAILED -- saving rejects to file src/librc/librc-depend.c.rej
OK, just try the 9999 git ebuild :)
(In reply to comment #26)
> OK, just try the 9999 git ebuild :)
Nothing. Fails as before. By the way, with rc_parallel="NO" I get
 * ERROR: hwclock did not start because fsck failed to start
though this is just an error and the boot goes on until the end. That happens right before the modules get loaded, which in turn is before I see some of the partitions get checked... could that be related to the problem?
I also have clock_systohc="YES" in /etc/conf.d/hwclock. Actually, if I look at my deptree (attachment #154899) I see this:
depinfo_35_service='hwclock'
depinfo_35_ibefore_35='hwclock'
depinfo_35_iafter_1='hwclock'
Could a circular dep be causing this issue?
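For anyone who wants to check their own deptree for the same pattern, here is a minimal sketch. It assumes the deptree consists only of lines shaped like the three quoted above (`depinfo_N_service='name'` and `depinfo_N_<type>_M='name'`); the function name `self_deps` is made up for this example, and the script makes no claim about what the `ibefore`/`iafter` fields mean. It only flags entries whose value equals the name of the service that owns them.

```shell
# self_deps: read deptree-style lines on stdin and print "<service> <type>"
# for every entry whose value is the owning service itself.
# Assumption: lines look like depinfo_35_service='hwclock' and
# depinfo_35_ibefore_35='hwclock', as quoted in this thread.
self_deps() {
    awk -F= -v q="'" '
        # Strip the single quotes from the value and split the key on "_".
        { gsub(q, "", $2); split($1, k, "_") }
        # Remember the service name for each index N.
        $1 ~ /^depinfo_[0-9]+_service$/ { name[k[2]] = $2; next }
        # Collect every other depinfo_N_<type>_M entry for the final pass.
        $1 ~ /^depinfo_[0-9]+_[a-z]+_[0-9]+$/ {
            n++; idx[n] = k[2]; type[n] = k[3]; val[n] = $2
        }
        END {
            for (i = 1; i <= n; i++)
                if ((idx[i] in name) && val[i] == name[idx[i]])
                    print name[idx[i]], type[i]
        }
    '
}

# Usage: self_deps < /lib/rc/init.d/deptree
```

Fed the three lines quoted above, this prints `hwclock ibefore` and `hwclock iafter`.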
(In reply to comment #27)
> Nothing. Fails as before.
OK, please attach a new deptree.
> I also have clock_systohc="YES" in /etc/conf.d/hwclock. Actually, if I look at my deptree (attachment #154899) I see this:
>
> depinfo_35_service='hwclock'
> depinfo_35_ibefore_35='hwclock'
> depinfo_35_iafter_1='hwclock'
That shows you've not updated to the new openrc, or the deptree needs a rebuild.
/lib/rc/bin/rc-depend -u
and try again.
> Could a circular dep be causing this issue?
This entire problem is a circular dep :) readahead-list has this: `need localmount; before clock;` However, clock has to come before localmount.
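The cycle described above can be reproduced outside openrc as a rough sketch. It restates the three constraints as "A must start before B" pairs and hands them to `tsort`. This is not how openrc resolves dependencies; the function names are invented for the example, and it relies on `tsort` exiting non-zero when its input contains a loop, which GNU coreutils does.

```shell
# Each pair "A B" means "A must start before B" (tsort's input format).
# The three pairs restate the constraints from the comment above:
#   readahead-list needs localmount  -> localmount before readahead-list
#   readahead-list is before clock   -> readahead-list before clock
#   clock has to come before localmount
readahead_edges() {
    printf '%s\n' \
        'localmount readahead-list' \
        'readahead-list clock' \
        'clock localmount'
}

# has_cycle: read pairs on stdin; succeed when tsort cannot order them.
# Assumption: tsort exits non-zero on a loop (true for GNU coreutils).
has_cycle() {
    ! tsort >/dev/null 2>&1
}

if readahead_edges | has_cycle; then
    echo "circular dependency"
else
    echo "ordering is consistent"
fi
```

Dropping the `readahead-list clock` pair, which corresponds to removing the `before clock` dependency, leaves a set that `tsort` can order.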
Created attachment 155795: working deptree
(In reply to comment #28)
> (In reply to comment #27)
> > Nothing. Fails as before.
>
> OK, please attach a new deptree
> That shows you've not updated to the new openrc, or the deptree needs a rebuild.
> /lib/rc/bin/rc-depend -u
> and try again.
Works now. I also rebuilt the git version of openrc but I don't think anything had changed (I have git-e640fdc5 now) so it must be the rc-depend -u that fixed it.
By the way, this is probably a separate problem but I'll mention it anyway. Just tell me if you want a separate bug for that. I have:
chutz@wolf ~ $ cat /etc/conf.d/rpc.idmapd
rc_need="net"
chutz@wolf ~ $ eselect rc list | egrep net\\.\|nfsmount
  net.ath0 default
  net.eth0
  net.lo boot
  nfsmount default
When I boot I get the following (copying manually the output):
net.ath0: Backgrounding...
net.ath0: WARNING: net.ath0 has started but is inactive
rpc.idmapd: WARNING: rpc.idmapd is scheduled to start when net.ath0 has started
nfsmount: ERROR: cannot start nfsmount as rpc.idmapd would not start
Shouldn't nfsmount get scheduled to start when rpc.idmapd has started?
(In reply to comment #29)
> > That shows you've not updated to the new openrc, or the deptree needs a rebuild.
> > /lib/rc/bin/rc-depend -u
> > and try again.
>
> Works now. I also rebuilt the git version of openrc but I don't think anything had changed (I have git-e640fdc5 now) so it must be the rc-depend -u that fixed it.
Great. Hopefully a dev can add that to the ebuild at the end of pkg_postinst()
> By the way, this is probably a separate problem but I'll mention it anyway. Just tell me if you want a separate bug for that.
>
> I have:
>
> chutz@wolf ~ $ cat /etc/conf.d/rpc.idmapd
> rc_need="net"
> chutz@wolf ~ $ eselect rc list | egrep net\\.\|nfsmount
>   net.ath0 default
>   net.eth0
>   net.lo boot
>   nfsmount default
>
> When I boot I get the following (copying manually the output):
>
> net.ath0: Backgrounding...
> net.ath0: WARNING: net.ath0 has started but is inactive
> rpc.idmapd: WARNING: rpc.idmapd is scheduled to start when net.ath0 has started
> nfsmount: ERROR: cannot start nfsmount as rpc.idmapd would not start
>
> Shouldn't nfsmount get scheduled to start when rpc.idmapd has started?
File a new bug for that please :)
Fixed. Thanks, guys.
I "upgraded" to version 0.2.5 today and it still hangs on my system. (I ran /lib/rc/bin/rc-depend -u before rebooting.)
Sorry, but fixed how? Every time I do a sync 0.2.5 gets unmasked again and again it will fail to start. I've run /lib/rc/bin/rc-depend -u with 0.2.4-r1 or with 0.2.5 but it will not start. I don't argue with being fixed in 9999 version, but 0.2.5 is still broken, right?
(In reply to comment #33)
> Sorry, but fixed how? Every time I do a sync 0.2.5 gets unmasked again and again it will fail to start. I've run /lib/rc/bin/rc-depend -u with 0.2.4-r1 or with 0.2.5 but it will not start.
> I don't argue with being fixed in 9999 version, but 0.2.5 is still broken, right?
No, 0.2.5 is fixed for THIS bug. Maybe you have a different one. You can mask package versions by editing /etc/portage/package.mask - see the portage documentation about this.
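As a small illustration of the masking advice above, the sketch below appends a version-specific atom to a mask file only if that exact line is not already there. The helper name `mask_atom` is invented for this example, and it assumes /etc/portage/package.mask is a plain file (on some systems it is a directory, in which case a file inside it would be the target).

```shell
# mask_atom FILE ATOM: append ATOM to FILE unless that exact line
# is already present, so repeated runs do not duplicate the entry.
mask_atom() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example, run as root (assumes package.mask is a regular file):
#   mask_atom /etc/portage/package.mask '=sys-apps/openrc-0.2.5'
```

The leading `=` pins the mask to exactly 0.2.5, so other versions remain installable.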
Actually, my initial bug was 224447. I also tend to think it is the same thing, but every time I bump to 0.2.5 it fails to start. Maybe I'm missing something?
Created attachment 162033: Depgraph after a rc-depend -u
I still have this problem. Apparently, after analysing my depgraph, even after I run rc-depend -u, fsck and hwclock seem to develop a circular dependency between them. This happens to me in both openrc-0.2.5 and -9999 as of today. I attached my depgraph in the previous comment.
(In reply to comment #37)
Do you, by any chance, still have /etc/init.d/clock? I was having the same problem. I removed that script and now all works well: hwclock doesn't complain about fsck, and parallel startup works too. ~amd64, openrc-0.2.5 here. So I guess the ebuild should remove the old clock script when upgrading/switching to OpenRC. HTH
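A quick way to check for the leftover script described above is sketched below. The helper name `has_stale_clock` is made up for this example; it only tests whether both a `clock` and an `hwclock` script exist in the init directory and does not remove anything.

```shell
# has_stale_clock [DIR]: succeed when DIR holds both the old "clock"
# script and the newer "hwclock" script (DIR defaults to /etc/init.d).
has_stale_clock() {
    dir=${1:-/etc/init.d}
    [ -e "$dir/clock" ] && [ -e "$dir/hwclock" ]
}

if has_stale_clock; then
    echo "Both clock and hwclock exist in /etc/init.d."
    echo "Consider removing the old clock script, then run: /lib/rc/bin/rc-depend -u"
fi
```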
(In reply to comment #38)
Or remove the localmount dependency in that script.
Ok, deleting /etc/init.d/clock fixed the problem. On the other hand, clock is still present in my deptree, even after executing rc-depend -u. Since everything works, I'm assuming this is expected, but is it?