Bug 224171 - sys-apps/openrc-0.2.5 hangs forever on startup
Summary: sys-apps/openrc-0.2.5 hangs forever on startup
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High normal
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-05-29 23:40 UTC by Georgi Georgiev
Modified: 2008-08-25 00:26 UTC
11 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
deptree (deptree,35.39 KB, text/plain)
2008-05-30 18:24 UTC, octoploid
Details
deptree of my system (deptree,41.92 KB, text/plain)
2008-05-30 19:18 UTC, Dirk Heinrichs
Details
deptree (deptree,40.01 KB, text/plain)
2008-05-31 02:15 UTC, Georgi Georgiev
Details
The deptree of my system (deptree.tar.bz2,6.90 KB, text/plain)
2008-05-31 06:19 UTC, Maciej Grela
Details
deptree (deptree,27.69 KB, text/plain)
2008-05-31 13:52 UTC, Michael Mair-Keimberger
Details
deptree x86 (deptree,43.32 KB, text/plain)
2008-05-31 20:42 UTC, Norberto Bensa
Details
deptree amd64 (deptree,31.89 KB, text/plain)
2008-05-31 20:43 UTC, Norberto Bensa
Details
Deptree (deptree,34.97 KB, text/plain)
2008-06-01 09:47 UTC, Thomas Lauckner
Details
My deptree amd64 (core 2) (deptree,38.05 KB, text/plain)
2008-06-02 11:42 UTC, Mike Dransfield
Details
Remove broken before dependencies (depend.diff,5.06 KB, patch)
2008-06-05 10:18 UTC, Roy Marples
Details | Diff
working deptree (deptree,32.69 KB, text/plain)
2008-06-07 09:48 UTC, Georgi Georgiev
Details
Depgraph after a rc-depend -u (deptree,23.12 KB, text/plain)
2008-08-02 21:14 UTC, Pedro Coelho
Details

Description Georgi Georgiev 2008-05-29 23:40:46 UTC
I just bumped sys-apps/openrc from 0.2.4 to 0.2.5 last night and was unable to boot anymore. This happened on two machines.

Both machines have parallel startup set to YES.

Both of them would hang after the first 4-5 lines that openrc produces (something related to /dev/shm, etc.). I never get to see any service-name-prefixed lines. Only the lines that start with a green " * ".

At that point I can press Ctrl-C, which reports that a whole lot of services are being interrupted, but "/etc/init.d/$service status" tells me that $service is still "starting". I can try restarting the services but it doesn't help.

I have some mingettys starting after /sbin/rc boot and before /sbin/rc default so I am able to do at least something after I interrupt the boot level but I cannot get the system to a usable state since I cannot really get any service up.

If someone has suggestions on debugging methods, I'm listening. For now, I've masked openrc-0.2.5 locally.
Comment 1 octoploid 2008-05-30 10:38:29 UTC
Same thing here. (amd64,  parallel startup)
Please mask 0.2.5 ASAP, because this bug really sucks.
Comment 2 Roy Marples 2008-05-30 12:32:59 UTC
Does it work if you turn parallel off?
Comment 3 Georgi Georgiev 2008-05-30 16:42:25 UTC
(In reply to comment #2)
> Does it work if you turn parallel off?
> 

In fact it does.

Roy, are you unable to reproduce? I've already got this bug on two machines, I can only assume it's reproducible. Both amd64 (well, one's core2), both with parallel on.
Comment 4 Georgi Georgiev 2008-05-30 16:43:51 UTC
Just to ensure that there is no ambiguity, I am rephrasing my response in comment #3 to "Everything works fine and the system boots properly if I set parallel_startup to NO".
Comment 5 Roy Marples 2008-05-30 17:10:45 UTC
(In reply to comment #3)
> Roy, are you unable to reproduce?

Correct.
Comment 6 Dirk Heinrichs 2008-05-30 17:17:26 UTC
(In reply to comment #1)
> Same thing here. (amd64,  parallel startup)
> Please mask 0.2.5 ASAP, because this bug really sucks.
> 

Same here, on x86, also with parallel startup switched on.
Comment 7 Roy Marples 2008-05-30 18:18:39 UTC
Could the people affected by this bug attach (post in the comment box and I'll kill ya!) their /lib/rc/init.d/deptree.
Comment 8 octoploid 2008-05-30 18:24:06 UTC
Created attachment 154861 [details]
deptree
Comment 9 Dirk Heinrichs 2008-05-30 19:18:59 UTC
Created attachment 154865 [details]
deptree of my system
Comment 10 Georgi Georgiev 2008-05-31 02:15:13 UTC
Created attachment 154899 [details]
deptree

And mine.
Comment 11 Maciej Grela 2008-05-31 06:19:12 UTC
Created attachment 154913 [details]
The deptree of my system

This also happens on my machine. After switching back to openrc-0.2.4-r1 I get a bunch of "* Caching service dependencies ..." messages, which slow down the boot process almost 4 times.
Comment 12 Michael Mair-Keimberger 2008-05-31 13:52:47 UTC
Created attachment 154953 [details]
deptree
Comment 13 Michael Mair-Keimberger 2008-05-31 13:53:44 UTC
Same thing here.
But it works (sometimes) with the interactive mode.

Comment 14 Michael Mair-Keimberger 2008-05-31 14:00:51 UTC
I don't know if it's important, but when I start the first service in interactive mode, I get this message:

hwclock   | * ERROR: cannot start hwclock as fsck would not start.
Comment 15 Norberto Bensa 2008-05-31 20:42:43 UTC
Created attachment 155001 [details]
deptree x86
Comment 16 Norberto Bensa 2008-05-31 20:43:25 UTC
Created attachment 155003 [details]
deptree amd64
Comment 17 Scott Thomson 2008-05-31 20:45:30 UTC
I've been bitten by this one too.

I'm not sure if this is relevant, but it hasn't been mentioned yet, so here goes. I left my machine for a while to see if anything would time out, and eventually I got a bunch of "timed out waiting for $foo" messages from my various services. From what I could see, device-mapper seemed to be the blocking service: most services were waiting on hwclock, which was waiting on device-mapper.
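The wait chain described here can be traced mechanically. The following is only a rough sketch in Python: the `waits_on` map is typed in by hand from the "timed out waiting for" messages, `some-service` is a hypothetical placeholder name, and `blocking_service` is an illustrative helper, not part of openrc.

```python
def blocking_service(waits_on, start):
    """Follow 'X is waiting on Y' links from start; return (last service, chain)."""
    seen = [start]
    current = start
    while current in waits_on:
        current = waits_on[current]
        if current in seen:
            # The chain loops back on itself, so no single service is the root.
            return current, seen + [current]
        seen.append(current)
    return current, seen


# hwclock and device-mapper come from the comment; some-service is hypothetical.
waits_on = {"some-service": "hwclock", "hwclock": "device-mapper"}
root, chain = blocking_service(waits_on, "some-service")
print(root, chain)
```

If the chain ever loops back to a service already visited, the helper returns early with the repeated name at the end of the chain, which would point at a circular wait rather than a single stuck service.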
Comment 18 Norberto Bensa 2008-05-31 20:47:11 UTC
Hello everyone.

I have attached two deptrees. The first one (x86) is from a box where I'm having problems. The second one (amd64) is from a notebook where I have not experienced any problems at all.

Both boxes are running openrc-0.2.5 with rc_parallel="YES"

HTH,
Norberto
Comment 19 Thomas Lauckner 2008-06-01 09:47:40 UTC
Created attachment 155065 [details]
Deptree

The deptree was captured while booting with sys-apps/openrc-0.2.4-r1 on an x86 machine. (I downgraded because of this bug.)
Comment 20 Mike Dransfield 2008-06-02 11:26:07 UTC
Same problem here, except that it does not hang forever: if you wait more than 20 minutes, it will boot.

The service that was hanging was readahead-list, so once it booted I removed that from the startup. Now it is timing out on hwclock, which I am sure is OK.

It seems to be openrc itself, rather than any one script, that is causing this.
Comment 21 Mike Dransfield 2008-06-02 11:38:11 UTC
Some more information...

This bug only affects boot level services, the default level ones seem to work once openrc gives up on the boot services.

The boot services seem to actually start but they do not report back to openrc.  This must be the case because I can get a working system once everything times out.
Comment 22 Mike Dransfield 2008-06-02 11:42:12 UTC
Created attachment 155219 [details]
My deptree amd64 (core 2)

In case it helps ;)
Comment 23 Roy Marples 2008-06-03 14:05:34 UTC
OK, I can reproduce this now. I'll work on a fix ASAP.
For the time being either don't use parallel startup or remove the before clock depend from /etc/init.d/readahead-list
Comment 24 Roy Marples 2008-06-05 10:18:21 UTC
Created attachment 155583 [details, diff]
Remove broken before dependencies

OK, I think I have this fixed now. Try this patch or the 9999 ebuild in portage.
Comment 25 Georgi Georgiev 2008-06-05 10:54:59 UTC
(In reply to comment #24)
> Created an attachment (id=155583) [edit]
> Remove broken before dependencies
> 
> OK, I think I have this fixed now. Try this patch or the 9999 ebuild in
> portage.
> 

What version is this supposed to apply to? It fails against 0.2.5.

$ cat ../attachment.cgi\?id\=155583 | patch -p1 --dry-run
patching file src/librc/librc-depend.c
Hunk #1 FAILED at 182.
Hunk #2 FAILED at 252.
Hunk #3 FAILED at 683.
Hunk #4 FAILED at 865.
Hunk #5 succeeded at 910 (offset 26 lines).
Hunk #6 succeeded at 1002 with fuzz 2 (offset 26 lines).
4 out of 6 hunks FAILED -- saving rejects to file src/librc/librc-depend.c.rej
Comment 26 Roy Marples 2008-06-05 12:32:38 UTC
OK, just try the 9999 git ebuild :)
Comment 27 Georgi Georgiev 2008-06-05 15:21:22 UTC
(In reply to comment #26)
> OK, just try the 9999 git ebuild :)
> 

Nothing. Fails as before.

By the way, with rc_parallel="NO" I get
* ERROR: hwclock did not start because fsck failed to start
though this is just an error and the boot goes on until the end.

That happens right before the modules get loaded, which in turn is before I see some of the partitions get checked... could that be related to the problem?

I also have clock_systohc="YES" in /etc/conf.d/hwclock. Actually, if I look at my deptree (attachment #154899 [details]) I see this:

depinfo_35_service='hwclock'
depinfo_35_ibefore_35='hwclock'
depinfo_35_iafter_1='hwclock'

Could a circular dep be causing this issue?
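A quick way to look for entries like these across a whole deptree is a small script. This is only a sketch in Python: the line layout (`depinfo_<n>_service=...` followed by `depinfo_<n>_<type>_<m>=...`) is inferred purely from the three lines quoted above and may not cover every line in a real deptree, and `find_self_references` is an illustrative name.

```python
import re

# Matches lines such as: depinfo_35_ibefore_35='hwclock'
# The layout is an assumption based only on the lines quoted in this comment.
LINE_RE = re.compile(r"^depinfo_(\d+)_([a-z]+)(?:_\d+)?='([^']*)'$")


def find_self_references(deptree_text):
    """Return (service, dep_type) pairs where a service lists itself."""
    names = {}    # entry index -> service name
    entries = []  # (entry index, dependency type, target name)
    for raw in deptree_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip anything that does not fit the assumed layout
        index, kind, value = match.groups()
        if kind == "service":
            names[index] = value
        else:
            entries.append((index, kind, value))
    return [(names[i], kind) for i, kind, target in entries
            if names.get(i) == target]


sample = """\
depinfo_35_service='hwclock'
depinfo_35_ibefore_35='hwclock'
depinfo_35_iafter_1='hwclock'
"""
self_refs = find_self_references(sample)
print(self_refs)
```

To run it against a real system, the text would be read from /lib/rc/init.d/deptree instead of the inline sample.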
Comment 28 Roy Marples 2008-06-05 15:46:11 UTC
(In reply to comment #27)
> Nothing. Fails as before.

OK, please attach a new deptree

> I also have clock_systohc="YES" in /etc/conf.d/hwclock. Actually, if I look at
> my deptree (attachment #154899 [details] [edit]) I see this:
> 
> depinfo_35_service='hwclock'
> depinfo_35_ibefore_35='hwclock'
> depinfo_35_iafter_1='hwclock'

That shows you've not updated to the new openrc, or the deptree needs a rebuild.
/lib/rc/bin/rc-depend -u
and try again.

> Could a circular dep be causing this issue?

This entire problem is a circular dep :)
readahead-list has this `need localmount; before clock;`
However, clock has to come before localmount.
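
The loop can be checked with a small depth-first search. This is a minimal sketch in Python, not how openrc resolves dependencies: the three constraints are written by hand as "A must start before B" edges, and `find_cycle` is an illustrative helper.

```python
def find_cycle(starts_before):
    """Return one cycle as a list of services, or None if an order exists."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in starts_before.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: that closes a loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(starts_before):
        if colour.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Each key must start before every service in its list.
constraints = {
    "localmount": ["readahead-list"],  # readahead-list: need localmount
    "readahead-list": ["clock"],       # readahead-list: before clock
    "clock": ["localmount"],           # clock has to come before localmount
}
cycle = find_cycle(constraints)
print(" -> ".join(cycle))
```

Emptying the `readahead-list` entry, which corresponds to the workaround of removing the `before clock` dependency, makes `find_cycle` return None.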
Comment 29 Georgi Georgiev 2008-06-07 09:48:37 UTC
Created attachment 155795 [details]
working deptree

(In reply to comment #28)
> (In reply to comment #27)
> > Nothing. Fails as before.
> 
> OK, please attach a new deptree

> That shows you've not updated to the new openrc, or the deptree needs a
> rebuild.
> /lib/rc/bin/rc-depend -u
> and try again.

Works now. I also rebuilt the git version of openrc but I don't think anything had changed (I have git-e640fdc5 now) so it must be the rc-depend -u that fixed it.

By the way, this is probably a separate problem but I'll mention it anyway. Just tell me if you want a separate bug for that.

I have:

chutz@wolf ~ $ cat /etc/conf.d/rpc.idmapd 
rc_need="net"
chutz@wolf ~ $ eselect rc list | egrep net\\.\|nfsmount
  net.ath0                  default
  net.eth0                  
  net.lo                    boot
  nfsmount                  default

When I boot I get the following (copying manually the output):

net.ath0: Backgrounding...
net.ath0: WARNING: net.ath0 has started but is inactive
rpc.idmapd: WARNING: rpc.idmapd is scheduled to start when net.ath0 has started
nfsmount: ERROR: cannot start nfsmount as rpc.idmapd would not start

Shouldn't nfsmount get scheduled to start when rpc.idmapd has started?
Comment 30 Roy Marples 2008-06-07 10:00:04 UTC
(In reply to comment #29)
> > That shows you've not updated to the new openrc, or the deptree needs a
> > rebuild.
> > /lib/rc/bin/rc-depend -u
> > and try again.
> 
> Works now. I also rebuilt the git version of openrc but I don't think anything
> had changed (I have git-e640fdc5 now) so it must be the rc-depend -u that fixed
> it.

Great. Hopefully a dev can add that to the ebuild at the end of pkg_postinst()

> 
> By the way, this is probably a separate problem but I'll mention it anyway.
> Just tell me if you want a separate bug for that.
> 
> I have:
> 
> chutz@wolf ~ $ cat /etc/conf.d/rpc.idmapd 
> rc_need="net"
> chutz@wolf ~ $ eselect rc list | egrep net\\.\|nfsmount
>   net.ath0                  default
>   net.eth0                  
>   net.lo                    boot
>   nfsmount                  default
> 
> When I boot I get the following (copying manually the output):
> 
> net.ath0: Backgrounding...
> net.ath0: WARNING: net.ath0 has started but is inactive
> rpc.idmapd: WARNING: rpc.idmapd is scheduled to start when net.ath0 has started
> nfsmount: ERROR: cannot start nfsmount as rpc.idmapd would not start
> 
> Shouldn't nfsmount get scheduled to start when rpc.idmapd has started?

File a new bug for that please :)
Comment 31 Doug Goldstein (RETIRED) gentoo-dev 2008-06-09 14:37:46 UTC
fixed. Thanks guys
Comment 32 octoploid 2008-06-12 15:30:01 UTC
I "upgraded" to version 0.2.5 today and it still hangs on my system.
(I ran /lib/rc/bin/rc-depend -u before rebooting).
Comment 33 cucu ionut 2008-07-05 05:53:21 UTC
Sorry, but fixed how? Every time I do a sync, 0.2.5 gets unmasked again, and again it fails to start. I've run /lib/rc/bin/rc-depend -u with 0.2.4-r1 or with 0.2.5, but it will not start.
I don't dispute that it is fixed in the 9999 version, but 0.2.5 is still broken, right?
Comment 34 Roy Marples 2008-07-06 06:53:50 UTC
(In reply to comment #33)
> Sorry, but fixed how? Every time I do a sync 0.2.5 gets unmasked again and
> again it will fail to start. I've ran /lib/rc/bin/rc-depend -u with 0.2.4-r1 or
> with 0.2.5 but it will not start. 
> I don't argue with being fixed in 9999 version ,but 0.2.5 is still broken,
> right?

No, 0.2.5 is fixed for THIS bug. Maybe you have a different one.
You can mask package versions by editing /etc/portage/package.mask - see the portage documentation about this. 

Comment 35 cucu ionut 2008-07-06 07:41:48 UTC
Actually, my initial bug was 224447. I also tend to think that it is the same thing, but every time I bump into 0.2.5 it fails to start. Maybe I missed something?
Comment 36 Pedro Coelho 2008-08-02 21:14:09 UTC
Created attachment 162033 [details]
Depgraph after a rc-depend -u
Comment 37 Pedro Coelho 2008-08-02 21:24:17 UTC
I still have this problem. After analysing my depgraph, it appears that even after I run rc-depend -u, fsck and hwclock develop a circular dependency between them.

This happens to me in both openrc-0.2.5 and -9999 as of today. I attached my depgraph in the previous comment.
Comment 38 Emilian Huminiuc 2008-08-04 23:22:03 UTC
(In reply to comment #37)
Do you, by any chance, still have /etc/init.d/clock?
I was having the same problem. I removed that script and now everything works well: hwclock doesn't complain about fsck, and parallel startup works too.
~amd64, openrc-0.2.5 here.
So I guess the ebuild should remove the old clock script when upgrading or switching to OpenRC.

HTH
Comment 39 Emilian Huminiuc 2008-08-04 23:29:53 UTC
(In reply to comment #38)
Or remove the localmount dependency in that script. 

Comment 40 Pedro Coelho 2008-08-06 00:39:50 UTC
OK, deleting /etc/init.d/clock fixed the problem. On the other hand, clock is still present in my deptree, even after executing rc-depend -u.

Since everything works, I'm assuming this is expected, but is it?
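
One way to narrow this down is to check whether clock still appears as a service entry of its own or only as a name that other services refer to. The following Python sketch assumes the `depinfo_<n>_service=...` / `depinfo_<n>_<type>_<m>=...` layout seen in the deptree lines quoted earlier in this bug; the sample data and the `classify` helper are hypothetical.

```python
import re

# Assumed deptree line layout, e.g. depinfo_35_iafter_1='hwclock'
LINE_RE = re.compile(r"^depinfo_(\d+)_([a-z]+)(?:_\d+)?='([^']*)'$")


def classify(deptree_text, name):
    """Return (is_service_entry, sorted list of services that reference name)."""
    as_service = False
    owners = {}   # entry index -> service name
    pending = []  # entry indexes whose dependency lines mention name
    for raw in deptree_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        index, kind, value = match.groups()
        if kind == "service":
            owners[index] = value
            if value == name:
                as_service = True
        elif value == name:
            pending.append(index)
    referenced_by = {owners[i] for i in pending
                     if i in owners and owners[i] != name}
    return as_service, sorted(referenced_by)


# Hypothetical sample, not taken from any attached deptree.
sample = """\
depinfo_1_service='readahead-list'
depinfo_1_ibefore_1='clock'
depinfo_2_service='hwclock'
depinfo_2_iafter_1='localmount'
"""
result = classify(sample, "clock")
print(result)
```

If the first value is False, the name survives only because other scripts still mention it in their dependencies, not because a clock service is still registered.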