Bug 289486

Summary:	emerge --regen: "sandbox:main signal SIGQUIT already had a handler ..." confuses eix
Product:	Portage Development	Reporter:	Navid Zamani <navid.zamani>
Component:	Sandbox	Assignee:	Portage team <dev-portage>
Status:	RESOLVED FIXED
Severity:	normal	CC:	matrix47
Priority:	High	Keywords:	InVCS
Version:	unspecified
Hardware:	All
OS:	Linux
See Also:	https://bugs.gentoo.org/show_bug.cgi?id=675828
Whiteboard:
Package list:		Runtime testing required:	---
Bug Depends on:
Bug Blocks:	910332, 349307
Attachments:	portage _exec: disable SIGQUIT handler override by parent

Description Navid Zamani 2009-10-17 12:23:28 UTC

Hello,
there is a problem with emerge --regen, the message mentioned in the summary, and the exit status:
When you run emerge --regen and this message occurs, emerge returns an error exit status.
This causes programs like eix-sync to stop further work and exit too, especially when run from a file inside /etc/cron.*/.
Because the message is not really an error, and nothing actually goes wrong when it occurs, emerge should not return an error, so that eix-sync can continue its work.

Reproducible: Always

Steps to Reproduce:
1. Have “app-portage/eix” installed.
2. Add “@@emerge --regen” to “/etc/eix-sync.conf” (after the “*”).
3. Run “eix-sync”
Actual Results:  
eix-sync exits right after the successful regen and never executes the eix-update that it has to run at the end, which leaves eix in an improper state. This is *not* a problem of eix-sync, because it correctly exits when one of the executed programs exits with an error. It expects that each command depends on the previous commands, which happens to be true in this case.

Expected Results:  
emerge should not treat the messages as errors, and return with a non-error exit status, indicating success.
Alternatively, the problem with the SIGQUIT handlers should be fixed, so that the message only occurs on actual errors.

Of course this is a problem which only occurs in the teamwork of the sandbox, emerge and eix-sync. But it can only be fixed, if all three programs' maintainers work together. Not by everyone blaming the others. :)

Comment 1 Navid Zamani 2009-10-17 12:26:50 UTC

Warning: This one is the correct bug. Bugs #289487 and #289488 are duplicates, created by a bug in Bugzilla. Please ignore them.

Comment 2 Navid Zamani 2009-10-17 12:28:21 UTC

*** Bug 289484 has been marked as a duplicate of this bug. ***

Comment 3 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2009-10-17 12:44:38 UTC

*** Bug 289487 has been marked as a duplicate of this bug. ***

Comment 4 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2009-10-17 12:45:46 UTC

*** Bug 289488 has been marked as a duplicate of this bug. ***

Comment 5 SpanKY gentoo-dev 2009-10-17 21:21:25 UTC

sandbox does not exit in any way due to this warning.  it merely writes it to stderr and continues on.

Comment 6 Navid Zamani 2009-10-17 21:23:37 UTC

And what’s the status it exits with at the end? :)

Comment 7 Zac Medico gentoo-dev 2009-10-17 21:32:09 UTC

What versions of portage and sandbox did you see this problem with?

Comment 8 Navid Zamani 2009-10-17 21:38:06 UTC

Right now I have one machine with sandbox-2.1 and portage-2.2_rc46, and another machine with sandbox-1.6-r2 and portage-2.1.6.13. Eix is at 0.18.2 on both systems. They both show the problem, which has existed for quite some time now (across several version updates, since I first put emerge --regen into /etc/eix-sync.conf).

Comment 9 Zac Medico gentoo-dev 2009-10-17 21:43:59 UTC

Hmm, I wonder why I haven't seen this problem. Perhaps something abnormal on your system is causing this SIGQUIT signal to be delivered to the sandbox process? Does it seem to happen for all ebuilds or just some?

Comment 10 Navid Zamani 2009-10-17 21:52:52 UTC

Have you tried running “emerge --regen” in “eix-sync”, called from a cron job, and letting your cron software send you the output? I think it only occurs when run from the cron job (fcron here). I don’t know anything about the inner workings of that, so this is a wild guess: To me it looks like maybe fcron defines a handler for SIGQUIT, and sandbox:main tries to define it too, while only one definition is allowed, causing the message. Dunno, sorry. :/

Comment 11 SpanKY gentoo-dev 2009-10-17 22:15:36 UTC

the warning from sandbox means that when it registered its signal handler, some parent had already registered it previously.  this is the same behavior that has always existed in sandbox, but semi-recently i added the stderr warning to see if anything was actually doing this.
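
To illustrate the mechanism described above: an ignored signal disposition is inherited by child processes across fork and exec, so a child can find SIGQUIT already set to something other than the default when it starts. The Python sketch below is not sandbox's code. The helper name child_sigquit_disposition and the use of a Python child process are assumptions made for the demonstration, and it needs a POSIX system and has to run in the main thread.

```python
import signal
import subprocess
import sys

# Code run by the child: report the SIGQUIT disposition it started with.
CHILD_CODE = """
import signal
d = signal.getsignal(signal.SIGQUIT)
if d == signal.SIG_IGN:
    print("ignored")
elif d == signal.SIG_DFL:
    print("default")
else:
    print("other")
"""


def child_sigquit_disposition(ignore_in_parent):
    """Spawn a child and return the SIGQUIT disposition it inherited.

    Hypothetical helper: the parent plays the role of a cron daemon that
    may or may not ignore SIGQUIT before starting its jobs.
    """
    wanted = signal.SIG_IGN if ignore_in_parent else signal.SIG_DFL
    previous = signal.signal(signal.SIGQUIT, wanted)
    try:
        result = subprocess.run(
            [sys.executable, "-c", CHILD_CODE],
            capture_output=True,
            text=True,
            check=True,
        )
    finally:
        # Restore the parent's earlier disposition. None means the earlier
        # handler was not installed from Python, so fall back to the default.
        signal.signal(
            signal.SIGQUIT,
            previous if previous is not None else signal.SIG_DFL,
        )
    return result.stdout.strip()


if __name__ == "__main__":
    print(child_sigquit_disposition(ignore_in_parent=True))
    print(child_sigquit_disposition(ignore_in_parent=False))
```

A child that checks the previous disposition when installing its own handler, as sandbox is described as doing, would see a non-default value in the first case only.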

Comment 12 Navid Zamani 2009-10-17 22:23:48 UTC

Seems that you found something. :)
Now how to handle this? Disable the message? Find the other SIGQUIT handler registerer(?) and fix that one?

Comment 13 SpanKY gentoo-dev 2009-10-18 00:27:04 UTC

this bug doesn't have anything to do with the error message itself.  there is another bug open about that.  that's the whole point -- sandbox is not exiting and emerge's exit status should not be influenced in any way by this harmless message.

Comment 14 Navid Zamani 2009-10-18 00:58:08 UTC

Agreed. But calm down. There’s no point in spreading hate. :)

Comment 15 SpanKY gentoo-dev 2009-10-28 05:54:30 UTC

emerge syncing exits with 0 regardless of this message.  if eix has a problem with the output, that's a bug with eix.

Comment 16 Navid Zamani 2009-10-28 13:32:17 UTC

Are you kidding? When I edit /etc/eix-sync.conf, and replace the line
@@emerge --regen
with
@@emerge --regen || echo "„emerge --regen“ returned the error status $?."
, eix-sync suddenly runs through without problems.

Interesting. Exactly as I predicted. Everybody is blaming the others, and nobody is responsible. This replacement proves that it's emerge --regen. Prove me wrong, re-check your facts, or quit lying. Because this is getting silly.
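
The effect of the "||" replacement above can be checked in isolation: the exit status of an "a || b" list is that of the last command executed, so a failing command followed by a succeeding echo yields status 0, and a caller that stops on errors carries on. A minimal Python sketch, using "false" as a stand-in for the failing "emerge --regen" call (an assumption; emerge is not invoked here, and exit_status is a hypothetical helper):

```python
import subprocess


def exit_status(command):
    """Run a command line through sh and return its exit status."""
    completed = subprocess.run(
        ["sh", "-c", command],
        capture_output=True,
        text=True,
    )
    return completed.returncode


# "false" always fails, standing in for a command that returns an error.
plain = exit_status("false")

# The same failing command with the "|| echo ..." fallback appended.
masked = exit_status('false || echo "returned the error status $?."')

if __name__ == "__main__":
    print("plain command failed:", plain != 0)
    print("status with fallback:", masked)
```

This only shows why the line no longer stops eix-sync. It does not say which part of the "emerge --regen" run produced the non-zero status.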

Comment 17 Navid Zamani 2010-12-19 22:40:49 UTC

By the way: This error also comes from being started by a cron daemon. So it’s still not 100% fixed. I still get the errors. Maybe you can reproduce it better by running eix-sync from cron.

Comment 18 Martin Väth 2010-12-20 13:58:32 UTC

(In reply to comment #16)
> Everybody is blaming the others, and nobody is responsible.

You made sure that this happens by blaming programs like eix which
have nothing to do with it - which you knew when you posted this bug
as you prove in comment #16.
Now it is assigned to me (eix maintainer) which of course makes no sense
at all, and the portage team is not even on the CC list; thus your latest
comment will go unread.

> Prove me wrong, re-check your facts, or quit lying. Because this is
> getting silly.

However, when I see this style of discussion, I have no urge to help.

Comment 19 Zac Medico gentoo-dev 2010-12-23 19:45:49 UTC

Created attachment 257885 [details, diff]
portage _exec: disable SIGQUIT handler override by parent

Does this help?

patch /usr/lib/portage/pym/portage/process.py sigquit_override.patch
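
The attachment itself is not reproduced in this report. As a rough sketch of the general technique its title suggests (a child resetting an inherited SIGQUIT disposition to the default before exec), and not the actual patch, something like the following could be used. The helper names and the use of subprocess with preexec_fn are assumptions for illustration. Portage's process.py has its own spawn code, which this does not reproduce. It needs a POSIX system and has to run in the main thread.

```python
import signal
import subprocess
import sys

# Code run by the child: report the SIGQUIT disposition it started with.
CHILD_CODE = (
    "import signal\n"
    "d = signal.getsignal(signal.SIGQUIT)\n"
    "print('ignored' if d == signal.SIG_IGN"
    " else 'default' if d == signal.SIG_DFL else 'other')\n"
)


def _reset_sigquit():
    """Runs in the forked child, just before exec (hypothetical helper)."""
    signal.signal(signal.SIGQUIT, signal.SIG_DFL)


def spawn_under_ignoring_parent(reset_in_child):
    """Spawn a child from a parent that ignores SIGQUIT.

    Returns the disposition the child reports: 'ignored' when nothing is
    reset, 'default' when the child resets SIGQUIT before exec.
    """
    previous = signal.signal(signal.SIGQUIT, signal.SIG_IGN)
    try:
        result = subprocess.run(
            [sys.executable, "-c", CHILD_CODE],
            preexec_fn=_reset_sigquit if reset_in_child else None,
            capture_output=True,
            text=True,
            check=True,
        )
    finally:
        # Restore the parent's earlier disposition. None means the earlier
        # handler was not installed from Python, so fall back to the default.
        signal.signal(
            signal.SIGQUIT,
            previous if previous is not None else signal.SIG_DFL,
        )
    return result.stdout.strip()


if __name__ == "__main__":
    print(spawn_under_ignoring_parent(reset_in_child=False))
    print(spawn_under_ignoring_parent(reset_in_child=True))
```

With the reset in place, a program started this way no longer sees a disposition left behind by whatever launched the parent, such as a cron daemon.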

Comment 20 Navid Zamani 2010-12-28 17:26:26 UTC

(In reply to comment #19)
> Does this help?

Thank you, I’ve scheduled a cron.hourly job and when it runs and I have the results in, I’ll check back. :)

Comment 21 Navid Zamani 2010-12-28 20:14:57 UTC

(In reply to comment #19)
> Does this help?

Yes, it does indeed fix the problem. :)
Nice job man!
Some inspirational music as a thank you: http://www.youtube.com/watch?v=k7x51zWOBBs ^^

Comment 22 Navid Zamani 2010-12-28 20:17:05 UTC

Oh wait, we shouldn’t close it, until it’s in portage. Feel free to re-mark as fixed when it’s in the repository. Thanks. :)

Comment 23 Jeremy Olexa (darkside) (RETIRED) archtester 2010-12-28 20:22:25 UTC

well then, assigning back to dev-portage. :)

Comment 24 Zac Medico gentoo-dev 2010-12-28 21:51:12 UTC

Thanks for testing. This is in git now:

http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=2c2764a400c1fcc17d50aebccd5ec60692722761

Comment 25 Zac Medico gentoo-dev 2010-12-31 10:02:38 UTC

This is fixed in 2.1.9.27.

Comment 26 Navid Zamani 2011-01-03 05:37:50 UTC

Thanks. :)