Bug#: 224171
Status: RESOLVED
Resolution: FIXED
Assigned To: Gentoo's Team for Core System packages <base-system@gentoo.org>
Reporter: Georgi Georgiev <chutz@gg3.net>

Filename | Description | Type | Creator | Created | Size
deptree | deptree | text/plain | octoploid | 2008-05-30 18:24 0000 | 35.39 KB
deptree | deptree of my system | text/plain | Dirk Heinrichs | 2008-05-30 19:18 0000 | 41.92 KB
deptree | deptree | text/plain | Georgi Georgiev | 2008-05-31 02:15 0000 | 40.01 KB
deptree.tar.bz2 | The deptree of my system | text/plain | Maciej Grela | 2008-05-31 06:19 0000 | 6.90 KB
deptree | deptree | text/plain | Michael Mair-Keimberger | 2008-05-31 13:52 0000 | 27.69 KB
deptree | deptree x86 | text/plain | Norberto Bensa | 2008-05-31 20:42 0000 | 43.32 KB
deptree | deptree amd64 | text/plain | Norberto Bensa | 2008-05-31 20:43 0000 | 31.89 KB
deptree | Deptree | text/plain | Thomas Lauckner | 2008-06-01 09:47 0000 | 34.97 KB
deptree | My deptree amd64 (core 2) | text/plain | Mike Dransfield | 2008-06-02 11:42 0000 | 38.05 KB
depend.diff | Remove broken before dependencies | patch | Roy Marples | 2008-06-05 10:18 0000 | 5.06 KB
deptree | working deptree | text/plain | Georgi Georgiev | 2008-06-07 09:48 0000 | 32.69 KB
deptree | Depgraph after a rc-depend -u | text/plain | Pedro Coelho | 2008-08-02 21:14 0000 | 23.12 KB


Description:   Opened: 2008-05-29 23:40 0000
I just bumped sys-apps/openrc from 0.2.4 to 0.2.5 last night and was unable to
boot anymore. This happened on two machines.

Both machines have parallel startup set to YES.

Both of them would hang after the first 4-5 lines that openrc produces
(something related to /dev/shm, etc.). I never get to see any
service-name-prefixed lines. Only the lines that start with a green " * ".

At that time I can press control-c which will tell me that a whole lot of
services are getting interrupted but "/etc/init.d/$service status" tells me
that $service is still "starting". I can try restarting the services but it
doesn't help.

I have some mingettys starting after /sbin/rc boot and before /sbin/rc default
so I am able to do at least something after I interrupt the boot level but I
cannot get the system to a usable state since I cannot really get any service
up.

If someone has suggestions on debugging methods, I'm listening. For now, I've
masked openrc-0.2.5 locally.

------- Comment #1 From octoploid 2008-05-30 10:38:29 0000 -------
Same thing here. (amd64,  parallel startup)
Please mask 0.2.5 ASAP, because this bug really sucks.

------- Comment #2 From Roy Marples 2008-05-30 12:32:59 0000 -------
Does it work if you turn parallel off?

------- Comment #3 From Georgi Georgiev 2008-05-30 16:42:25 0000 -------
(In reply to comment #2)
> Does it work if you turn parallel off?
> 

In fact it does.

Roy, are you unable to reproduce? I've already got this bug on two machines, I
can only assume it's reproducible. Both amd64 (well, one's core2), both with
parallel on.

------- Comment #4 From Georgi Georgiev 2008-05-30 16:43:51 0000 -------
Just to ensure that there is no ambiguity, I am rephrasing my response in
comment #3 to "Everything works fine and the system boots properly if I set
parallel_startup to NO".

------- Comment #5 From Roy Marples 2008-05-30 17:10:45 0000 -------
(In reply to comment #3)
> Roy, are you unable to reproduce?

Correct.

------- Comment #6 From Dirk Heinrichs 2008-05-30 17:17:26 0000 -------
(In reply to comment #1)
> Same thing here. (amd64,  parallel startup)
> Please mask 0.2.5 ASAP, because this bug really sucks.
> 

Same here, on x86, also with parallel startup switched on.

------- Comment #7 From Roy Marples 2008-05-30 18:18:39 0000 -------
Could the people affected by this bug attach (post in the comment box and I'll
kill ya!) their /lib/rc/init.d/deptree.

------- Comment #8 From octoploid 2008-05-30 18:24:06 0000 -------
Created an attachment (id=154861)
deptree

------- Comment #9 From Dirk Heinrichs 2008-05-30 19:18:59 0000 -------
Created an attachment (id=154865)
deptree of my system

------- Comment #10 From Georgi Georgiev 2008-05-31 02:15:13 0000 -------
Created an attachment (id=154899)
deptree

And mine.

------- Comment #11 From Maciej Grela 2008-05-31 06:19:12 0000 -------
Created an attachment (id=154913)
The deptree of my system

Happens also on my machine. After switching back to openrc-0.2.4-r1 I get a
bunch of "* Caching service dependencies ..." which slooooow down the boot
process almost 4 times.

------- Comment #12 From Michael Mair-Keimberger 2008-05-31 13:52:47 0000 -------
Created an attachment (id=154953)
deptree

------- Comment #13 From Michael Mair-Keimberger 2008-05-31 13:53:44 0000 -------
Same thing here.
But it works (sometimes) with the interactive mode.

------- Comment #14 From Michael Mair-Keimberger 2008-05-31 14:00:51 0000 -------
I don't know if it's important, but when I start the first service in
interactive mode, I get this message:

hwclock   | * ERROR: cannot start hwclock as fsck would not start.

------- Comment #15 From Norberto Bensa 2008-05-31 20:42:43 0000 -------
Created an attachment (id=155001)
deptree x86

------- Comment #16 From Norberto Bensa 2008-05-31 20:43:25 0000 -------
Created an attachment (id=155003)
deptree amd64

------- Comment #17 From Scott Thomson 2008-05-31 20:45:30 0000 -------
I've been bitten by this one too.

I'm not sure if this is relevant but it's not been mentioned yet so here goes.
I left my machine for a while to see if anything would time out and eventually
I got a bunch of "timed out waiting for $foo" messages from my various
services, from what I could see device-mapper seemed to be the blocking service
- most services were waiting on hwclock which was waiting on device-mapper.

------- Comment #18 From Norberto Bensa 2008-05-31 20:47:11 0000 -------
Hello everyone.

I have attached two deptrees. The first one (x86) is from a box where I'm
having problems. The second one (amd64) is from a notebook where I have not    
experienced any problems at all.

Both boxes are running openrc-0.2.5 with rc_parallel="YES".

HTH,
Norberto

------- Comment #19 From Thomas Lauckner 2008-06-01 09:47:40 0000 -------
Created an attachment (id=155065)
Deptree

The deptree was attached while booting with sys-apps/openrc-0.2.4-r1 at an x86
machine. (I downgraded 'cause of this bug)

------- Comment #20 From Mike Dransfield 2008-06-02 11:26:07 0000 -------
Same problem here, except this does not hang forever, if you wait > 20 mins
then it will boot.

The service that was hanging WAS readahead-list so once it booted I removed
that from the startup.  NOW  it is timing out on hwclock which I am sure is OK.

It seems to be openrc itself, rather than any one script, that is stalling the boot.

------- Comment #21 From Mike Dransfield 2008-06-02 11:38:11 0000 -------
Some more information...

This bug only affects boot level services, the default level ones seem to work
once openrc gives up on the boot services.

The boot services seem to actually start but they do not report back to openrc.
 This must be the case because I can get a working system once everything times
out.

------- Comment #22 From Mike Dransfield 2008-06-02 11:42:12 0000 -------
Created an attachment (id=155219)
My deptree amd64 (core 2)

In case it helps ;)

------- Comment #23 From Roy Marples 2008-06-03 14:05:34 0000 -------
OK, I can reproduce this now. I'll work on a fix ASAP.
For the time being, either don't use parallel startup or remove the "before clock"
dependency from /etc/init.d/readahead-list.
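
For anyone applying the second workaround by hand, the edit might look roughly like the sketch below. It assumes the script's depend() function holds only the two directives Roy quotes in comment #28 (`need localmount; before clock;`); the real script may contain more. `need` and `before` are helpers that OpenRC supplies when it sources an init script, so they are not defined here.

```shell
# Sketch of depend() in /etc/init.d/readahead-list with the workaround applied.
# Assumption: the original function held just "need localmount" and
# "before clock", as quoted in comment #28.
depend() {
    need localmount
    # before clock    # commented out: this is the ordering named as the workaround
}
```

Going by comment #28, the cached deptree probably also has to be rebuilt with /lib/rc/bin/rc-depend -u before the change takes effect.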

------- Comment #24 From Roy Marples 2008-06-05 10:18:21 0000 -------
Created an attachment (id=155583)
Remove broken before dependencies

OK, I think I have this fixed now. Try this patch or the 9999 ebuild in
portage.

------- Comment #25 From Georgi Georgiev 2008-06-05 10:54:59 0000 -------
(In reply to comment #24)
> Created an attachment (id=155583)
> Remove broken before dependencies
> 
> OK, I think I have this fixed now. Try this patch or the 9999 ebuild in
> portage.
> 

What version is this supposed to apply to? It fails against 0.2.5.

$ cat ../attachment.cgi\?id\=155583 | patch -p1 --dry-run
patching file src/librc/librc-depend.c
Hunk #1 FAILED at 182.
Hunk #2 FAILED at 252.
Hunk #3 FAILED at 683.
Hunk #4 FAILED at 865.
Hunk #5 succeeded at 910 (offset 26 lines).
Hunk #6 succeeded at 1002 with fuzz 2 (offset 26 lines).
4 out of 6 hunks FAILED -- saving rejects to file src/librc/librc-depend.c.rej

------- Comment #26 From Roy Marples 2008-06-05 12:32:38 0000 -------
OK, just try the 9999 git ebuild :)

------- Comment #27 From Georgi Georgiev 2008-06-05 15:21:22 0000 -------
(In reply to comment #26)
> OK, just try the 9999 git ebuild :)
> 

Nothing. Fails as before.

By the way, with rc_parallel="NO" I get
* ERROR: hwclock did not start because fsck failed to start
though this is just an error and the boot goes on until the end.

That happens right before the modules get loaded, which in turn is before I see
some of the partitions get checked... could that be related to the problem?

I also have clock_systohc="YES" in /etc/conf.d/hwclock. Actually, if I look at
my deptree (attachment #154899) I see this:

depinfo_35_service='hwclock'
depinfo_35_ibefore_35='hwclock'
depinfo_35_iafter_1='hwclock'

Could a circular dep be causing this issue?
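
Self-references like the one quoted above can be searched for mechanically. The following is a rough sketch, not an OpenRC tool. It assumes every deptree line has the form `depinfo_N_service='name'` or `depinfo_N_TYPE_K='name'`, as in the three lines shown, and that the `service` line comes first within each numbered entry. The function name `find_self_deps` is made up for illustration.

```shell
# List services that appear in one of their own dependency lists.
# Assumed line formats (inferred from the excerpt above, not from OpenRC docs):
#   depinfo_N_service='name'    names entry N
#   depinfo_N_TYPE_K='name'     K-th member of entry N's TYPE list
find_self_deps() {
    # Reduce each line to "N TYPE name"; the two bare dots match the quotes.
    sed -n 's/^depinfo_\([0-9]*\)_\([a-z]*\)\(_[0-9]*\)\{0,1\}=.\(.*\).$/\1 \2 \4/p' "$1" |
    awk '
        $2 == "service" { name[$1] = $3; next }   # remember the name of entry N
        name[$1] == $3  { print $3 " lists itself in " $2 }
    '
}

# Usage (path as given in comment #7):
#   find_self_deps /lib/rc/init.d/deptree
```

An empty result would only mean there are no direct self-references; longer cycles across several services would not be caught by this check.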

------- Comment #28 From Roy Marples 2008-06-05 15:46:11 0000 -------
(In reply to comment #27)
> Nothing. Fails as before.

OK, please attach a new deptree

> I also have clock_systohc="YES" in /etc/conf.d/hwclock. Actually, if I look at
> my deptree (attachment #154899) I see this:
> 
> depinfo_35_service='hwclock'
> depinfo_35_ibefore_35='hwclock'
> depinfo_35_iafter_1='hwclock'

That shows you've not updated to the new openrc, or the deptree needs a
rebuild.
/lib/rc/bin/rc-depend -u
and try again.

> Could a circular dep be causing this issue?

This entire problem is a circular dep :)
readahead-list has this: `need localmount; before clock;`
However, clock has to come before localmount.
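
The three ordering constraints described here can be checked for a loop with a topological sort. This is a minimal sketch using `tsort`, not how OpenRC itself resolves dependencies. It assumes GNU coreutils `tsort`, which exits non-zero when its input contains a loop; the function name `deps_have_cycle` and the edge list are illustrative.

```shell
# Each input line is "A B", meaning A must start before B.
# Returns 0 if the orderings contain a loop, 1 otherwise.
# Assumption: tsort exits non-zero on a loop (true for GNU coreutils).
deps_have_cycle() {
    if tsort >/dev/null 2>&1; then
        return 1    # a valid start order exists
    fi
    return 0        # tsort failed: the input contains a loop
}

# Edges taken from this comment:
#   "clock has to come before localmount"  -> clock localmount
#   readahead-list "need localmount"       -> localmount readahead-list
#   readahead-list "before clock"          -> readahead-list clock
boot_edges='clock localmount
localmount readahead-list
readahead-list clock'

if printf '%s\n' "$boot_edges" | deps_have_cycle; then
    echo "circular dependency"
else
    echo "ordering is consistent"
fi
```

Dropping the last edge, which corresponds to removing `before clock` from readahead-list, leaves a chain that `tsort` can order.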

------- Comment #29 From Georgi Georgiev 2008-06-07 09:48:37 0000 -------
Created an attachment (id=155795)
working deptree

(In reply to comment #28)
> (In reply to comment #27)
> > Nothing. Fails as before.
> 
> OK, please attach a new deptree

> That shows you've not updated to the new openrc, or the deptree needs a
> rebuild.
> /lib/rc/bin/rc-depend -u
> and try again.

Works now. I also rebuilt the git version of openrc but I don't think anything
had changed (I have git-e640fdc5 now) so it must be the rc-depend -u that fixed
it.

By the way, this is probably a separate problem but I'll mention it anyway.
Just tell me if you want a separate bug for that.

I have:

chutz@wolf ~ $ cat /etc/conf.d/rpc.idmapd 
rc_need="net"
chutz@wolf ~ $ eselect rc list | egrep net\\.\|nfsmount
  net.ath0                  default
  net.eth0                  
  net.lo                    boot
  nfsmount                  default

When I boot I get the following (copying manually the output):

net.ath0: Backgrounding...
net.ath0: WARNING: net.ath0 has started but is inactive
rpc.idmapd: WARNING: rpc.idmapd is scheduled to start when net.ath0 has started
nfsmount: ERROR: cannot start nfsmount as rpc.idmapd would not start

Shouldn't nfsmount get scheduled to start when rpc.idmapd has started?

------- Comment #30 From Roy Marples 2008-06-07 10:00:04 0000 -------
(In reply to comment #29)
> > That shows you've not updated to the new openrc, or the deptree needs a
> > rebuild.
> > /lib/rc/bin/rc-depend -u
> > and try again.
> 
> Works now. I also rebuilt the git version of openrc but I don't think anything
> had changed (I have git-e640fdc5 now) so it must be the rc-depend -u that fixed
> it.

Great. Hopefully a dev can add that to the ebuild at the end of pkg_postinst()

> 
> By the way, this is probably a separate problem but I'll mention it anyway.
> Just tell me if you want a separate bug for that.
> 
> I have:
> 
> chutz@wolf ~ $ cat /etc/conf.d/rpc.idmapd 
> rc_need="net"
> chutz@wolf ~ $ eselect rc list | egrep net\\.\|nfsmount
>   net.ath0                  default
>   net.eth0                  
>   net.lo                    boot
>   nfsmount                  default
> 
> When I boot I get the following (copying manually the output):
> 
> net.ath0: Backgrounding...
> net.ath0: WARNING: net.ath0 has started but is inactive
> rpc.idmapd: WARNING: rpc.idmapd is scheduled to start when net.ath0 has started
> nfsmount: ERROR: cannot start nfsmount as rpc.idmapd would not start
> 
> Shouldn't nfsmount get scheduled to start when rpc.idmapd has started?

File a new bug for that please :)

------- Comment #31 From Doug Goldstein 2008-06-09 14:37:46 0000 -------
fixed. Thanks guys

------- Comment #32 From octoploid 2008-06-12 15:30:01 0000 -------
I "upgraded" to version 0.2.5 today and it still hangs on my system.
(I ran /lib/rc/bin/rc-depend -u before rebooting).

------- Comment #33 From cucu ionut 2008-07-05 05:53:21 0000 -------
Sorry, but fixed how? Every time I do a sync 0.2.5 gets unmasked again and
again it will fail to start. I've run /lib/rc/bin/rc-depend -u with 0.2.4-r1 or
with 0.2.5 but it will not start.
I don't argue with it being fixed in the 9999 version, but 0.2.5 is still broken,
right?

------- Comment #34 From Roy Marples 2008-07-06 06:53:50 0000 -------
(In reply to comment #33)
> Sorry, but fixed how? Every time I do a sync 0.2.5 gets unmasked again and
> again it will fail to start. I've run /lib/rc/bin/rc-depend -u with 0.2.4-r1 or
> with 0.2.5 but it will not start.
> I don't argue with it being fixed in the 9999 version, but 0.2.5 is still broken,
> right?

No, 0.2.5 is fixed for THIS bug. Maybe you have a different one.
You can mask package versions by editing /etc/portage/package.mask - see the
portage documentation about this. 
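
As a minimal sketch of the masking step: the atom `=sys-apps/openrc-0.2.5` is built from the package name and version in the original report, and the helper name `mask_atom` is hypothetical. It assumes /etc/portage/package.mask is a plain file; on systems where it is a directory, the line would go into a file inside it instead.

```shell
# Append a package atom to a mask file unless that exact line is already there.
# $1: atom to mask, e.g. '=sys-apps/openrc-0.2.5'
# $2: mask file (defaults to /etc/portage/package.mask, assumed to be a file)
mask_atom() {
    atom=$1
    mask_file=${2:-/etc/portage/package.mask}
    grep -qxF -- "$atom" "$mask_file" 2>/dev/null ||
        printf '%s\n' "$atom" >> "$mask_file"
}

# Usage (as root):
#   mask_atom '=sys-apps/openrc-0.2.5'
```

Because the entry lives in the local configuration rather than in the synced tree, it should survive a sync.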

------- Comment #35 From cucu ionut 2008-07-06 07:41:48 0000 -------
Actually, my initial bug was 224447. I also tend to think that it is the same
thing, but every time I bump into 0.2.5 it fails to start. Maybe I didn't get
something?

------- Comment #36 From Pedro Coelho 2008-08-02 21:14:09 0000 -------
Created an attachment (id=162033)
Depgraph after a rc-depend -u

------- Comment #37 From Pedro Coelho 2008-08-02 21:24:17 0000 -------
I still have this problem. Apparently, after analysing my depgraph, even after
I run rc-depend -u, fsck and hwclock seem to develop a circular dependency
between them.

This happens to me in both openrc-0.2.5 and -9999 as of today. I attached my
depgraph in the previous comment.

------- Comment #38 From Emilian Huminiuc 2008-08-04 23:22:03 0000 -------
(In reply to comment #37)
Do you, by any chance, still have /etc/init.d/clock?
I was having the same problem. I removed that script and now everything works:
hwclock doesn't complain about fsck, and parallel startup works too.
~amd64, openrc-0.2.5 here.
So I guess the ebuild should remove the old clock script when
upgrading/switching to OpenRC.

HTH
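
A quick way to check for the leftover script described here is sketched below. It assumes, as this comment suggests, that /etc/init.d/clock is a remnant from before the switch and that hwclock is the script OpenRC actually uses; the function name `has_stale_clock_script` is made up for illustration.

```shell
# Report whether an old "clock" init script sits next to "hwclock".
# $1: init script directory (defaults to /etc/init.d)
# Returns 0 and prints a note if both exist, 1 otherwise.
has_stale_clock_script() {
    initd=${1:-/etc/init.d}
    if [ -e "$initd/clock" ] && [ -e "$initd/hwclock" ]; then
        echo "$initd/clock exists alongside $initd/hwclock"
        return 0
    fi
    return 1
}

# Usage:
#   has_stale_clock_script && echo "consider removing the old clock script"
```

The check only reports; whether to delete the file or edit its dependencies is left to the reader.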

------- Comment #39 From Emilian Huminiuc 2008-08-04 23:29:53 0000 -------
(In reply to comment #38)
Or remove the localmount dependency in that script. 

------- Comment #40 From Pedro Coelho 2008-08-06 00:39:50 0000 -------
Ok, deleting /etc/init.d/clock fixed the problem. On the other hand, clock is
still present in my deptree, even after executing rc-depend -u.

Since everything works, I'm assuming this is expected, but is it?
