Bug 177424 - >app-admin/metalog-0.8_rc1-r2 hangs applications writing to syslog when console parameter fails
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All Linux
Importance: High critical
Assignee: SpanKY
URL:
Whiteboard:
Keywords:
Duplicates: 184748 185356
Depends on:
Blocks:
 
Reported: 2007-05-07 09:20 UTC by Pim Dennendal
Modified: 2009-02-14 19:23 UTC (History)
CC List: 7 users

See Also:
Package list:
Runtime testing required: ---


Description Pim Dennendal 2007-05-07 09:20:26 UTC
Metalog v 0.8-rc4 came in as part of an emerge --sync, etc. upgrade on 2006.1. The prior sample and conf referred to the shell script "consolelog.sh" as being in /usr/sbin/, and this worked fine. The 0.8-rc4 sample and conf do the same, but the install put the script in /usr/bin. This error occurs on both x86 and x86_64 (AMD Turion) hardware. It prevents boot from completing, no kernel or other log messages appear either in the log or on the virtual console (tty12, for me), and the X server is unusable. Also, dmesg | less shows wtmp contents instead of the kernel ring buffer. I am very surprised that this is not already a known issue, and also at the (to me) inexplicable move: I see logging as a "system" task, not an (ordinary) user one.

Reproducible: Always

Steps to Reproduce:
1. Install a metalog version prior to 0.8-rc4 and tailor the conf to include logging to the (virtual) console, i.e. uncomment the relevant lines.
2. Upgrade to 0.8-rc4, but do not change the log to (virtual) console line(s).
3. Reboot.

Actual Results:  
As in "Steps to Reproduce".

Expected Results:  
Console log on (Virtual) console, log file(s) and ability to log on to system. Also, with Interactive or Single boot, dmesg | less should show the log from the kernel ring buffer not the contents of wtmp - log-on/-off(s) history.

The change in "style" of the sample "metalog.conf" is irrelevant to this problem; only the unexpected and undocumented move of the script matters. What is happening is that the VFS buffer pool fills with log messages and causes the system logger to terminate abnormally with a very high code (127, I believe).
Comment 1 Pim Dennendal 2007-05-07 09:25:07 UTC
Resolution is simple: just remove the "s" from /usr/sbin/consolelog.sh in /etc/metalog.conf, so that it reads /usr/bin/consolelog.sh.
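
For anyone applying this by hand, the same edit can be sketched as a small shell function. It assumes the path appears literally as /usr/sbin/consolelog.sh in the configuration file; the function name fix_consolelog_path and the .bak backup suffix are illustrative choices, not part of metalog.

```shell
#!/bin/sh
# Hypothetical helper: point metalog.conf at the location where the
# ebuild actually installed consolelog.sh (/usr/bin instead of /usr/sbin).
# Usage: fix_consolelog_path /etc/metalog.conf
fix_consolelog_path() {
    conf="$1"
    # Keep a backup so the change can be reverted.
    cp -p "$conf" "$conf.bak" || return 1
    # Rewrite from the backup into the original file (avoids sed -i).
    sed 's|/usr/sbin/consolelog\.sh|/usr/bin/consolelog.sh|g' "$conf.bak" > "$conf"
}
```

Run it as root against /etc/metalog.conf, then restart the logger. Copying the script into /usr/sbin instead would work equally well and leaves the configuration untouched.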

emerge --info is irrelevant to this problem - and now lost due to the upgrade activities.
Comment 2 Jakub Moc (RETIRED) gentoo-dev 2007-05-07 09:32:07 UTC
http://sources.gentoo.org/viewcvs.py/gentoo-x86/app-admin/metalog/metalog-0.8_rc1-r2.ebuild?r1=1.14&r2=1.15

All ebuilds in the tree install this into /usr/bin because exeinto has no effect on dobin.

<snip>
exeinto /usr/sbin
dobin "${FILESDIR}"/consolelog.sh || die
</snip>
Comment 3 SpanKY gentoo-dev 2007-05-07 10:34:01 UTC
i should update metalog so that it doesnt hang when a command doesnt exist ...
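
Until metalog copes with this itself, a pre-flight check run before restarting the logger could catch the mismatch. The sketch below assumes the configuration uses lines of the form command = "/path/to/script", as the sample metalog.conf does; check_metalog_commands is a hypothetical helper name, and commented-out lines are deliberately ignored.

```shell
#!/bin/sh
# Hypothetical pre-flight check: verify that every active
# 'command = "..."' entry in a metalog configuration file points at
# an existing, executable file.
# Usage: check_metalog_commands /etc/metalog.conf
check_metalog_commands() {
    conf="$1"
    # Extract the quoted path from each uncommented "command =" line
    # and collect the ones that are missing or not executable.
    missing=$(
        sed -n 's/^[[:space:]]*command[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$conf" |
        while IFS= read -r cmd; do
            [ -x "$cmd" ] || printf '%s\n' "$cmd"
        done
    )
    if [ -n "$missing" ]; then
        printf '%s\n' "$missing" | sed 's/^/missing or not executable: /' >&2
        return 1
    fi
    return 0
}
```

A non-zero return means at least one configured command cannot be run, which is the situation described in this bug.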
Comment 4 Pim Dennendal 2007-05-09 14:16:42 UTC
(In reply to comment #2)
> All ebuilds in the tree install this into /usr/bin because exeinto has no
> effect on dobin.

In which case, how about the same for the sample metalog.conf file contents. The difference had me running around for a month. After which I decided I would be better off using somebody else's Distro!
Comment 5 Pim Dennendal 2007-05-09 14:20:26 UTC
(In reply to comment #3)
> i should update metalog so that it doesnt hang when a command doesnt exist ...
> 
I agree that is the long-term solution. The PTF is as originally stated and noted in Comment # 2 - My reply.

Personally, I would have done that as pretty well the first thing in the development of the product.

Having said which, It is a fantastic, great, etc version of the System-Logger Service. Thanks.

Comment 6 Workaphobia 2007-06-20 03:51:00 UTC
I just got bitten by this problem today, and it had me scratching my head as I enabled the console logger on the same boot cycle as a bunch of other system changes. In my case the major symptom was that the terminal login program would stall after entering a password.

If I hadn't seen this thread I would've switched to another system logger and not looked back. As it's an exceedingly simple problem and this bug is marked New, I take it that just means no dev has seen this bug yet. What needs to happen to change that?
Comment 7 Jakub Moc (RETIRED) gentoo-dev 2007-07-09 18:42:07 UTC
*** Bug 184748 has been marked as a duplicate of this bug. ***
Comment 8 Jakub Moc (RETIRED) gentoo-dev 2007-07-09 18:43:31 UTC
(In reply to comment #3)
> i should update metalog so that it doesnt hang when a command doesnt exist ...

That's nice, feel free to work on it once you've fixed the broken ebuilds. Thanks. 

Comment 9 Chris Richards 2007-07-09 19:21:29 UTC
(In reply to comment #6)

> If I hadn't seen this thread I would've switched to another system logger and
> not looked back. As it's an exceedingly simple problem and this bug is marked
> New, I take it that just means no dev has seen this bug yet. What needs to
> happen to change that?
> 

That would be my question as well, considering this bug has existed for 2 months, and has the potential (as in my case) to prevent a server from even coming up to the point where you can log in, even from a console.

At the very least, 0.8_rc4 and 0.8 ebuilds should be masked so that x86 and x86_64 platforms don't get wounded by it.

I would be delighted to volunteer my time, except I haven't got the foggiest idea how to manage an e-build.
Comment 10 Jakub Moc (RETIRED) gentoo-dev 2007-07-10 06:38:53 UTC
I'm posting a diff here since the maintainer is clearly more concerned about this bug's keywords than about fixing the trivial issues in a clearly broken ebuild. Why it takes two months to fix one line in an ebuild that causes severe borkage to users is beyond me.

<snip>
--- metalog-0.8.ebuild	2007-06-21 09:05:34.000000000 +0200
+++ metalog-0.8.ebuild	2007-07-10 08:35:00.000000000 +0200
@@ -30,7 +30,7 @@
 	newconfd "${FILESDIR}"/metalog.confd metalog
 
 	exeinto /usr/sbin
-	dobin "${FILESDIR}"/consolelog.sh || die
+	doexe "${FILESDIR}"/consolelog.sh || die
 }
 
 pkg_preinst() {
</snip>
Comment 11 SpanKY gentoo-dev 2007-07-10 06:53:40 UTC
perhaps you should learn to stop being a jackass and maybe people would actually listen to you
Comment 12 Jakub Moc (RETIRED) gentoo-dev 2007-07-10 07:01:48 UTC
(In reply to comment #11)
> perhaps you should learn to stop being a jackass

Perhaps you could just fix the bug, which would take less time than what you've actually spent on pointless comments like the one above and on useless messing with this bug's keywords? If you don't want to, that's what the QAcanfix keyword is for - unless you assume that QA is unable to properly change one line in the ebuild so that it actually does what's intended and what it fails to do due to an obvious mistake.

Comment 13 SpanKY gentoo-dev 2007-07-10 07:14:20 UTC
if you want to piss and moan, go somewhere else

i'd rather have a completely broken system than read another comment from you
Comment 14 SpanKY gentoo-dev 2007-07-10 07:38:37 UTC
Pim: i'm pretty sure i already fixed the hang issue if the thing specified by command fails ... can you double check with metalog-0.8 ?
Comment 15 Chris Richards 2007-07-10 14:01:40 UTC
(In reply to comment #14)
> Pim: i'm pretty sure i already fixed the hang issue if the thing specified by
> command fails ... can you double check with metalog-0.8 ?
> 

Both 0.8_rc4 and 0.8 are bugged.  0.8 is the version that took my system down.  Once I got it back up, I downgraded through 0.8_rc4 before settling on 0.8_rc1-r2 as the version that didn't kill my system.  The other versions simply hang, and hang any process that tries to write to syslog.

Hope that helps.
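
One way to confirm this symptom without risking a login session is to send a test message under a time limit. This is only a sketch: it assumes the coreutils timeout and util-linux logger commands are available, and syslog_probe is a hypothetical helper name. A single message may still go through while the syslog socket buffer has room, so the probe may need repeating before a hung logger shows up.

```shell
#!/bin/sh
# Hypothetical probe: run a command under a time limit and report
# whether it hung. With no command given, it sends one message
# through the local syslog socket using logger.
# Usage: syslog_probe SECONDS [COMMAND [ARGS...]]
syslog_probe() {
    secs="$1"
    shift
    # Default probe command: one tagged syslog message.
    [ "$#" -gt 0 ] || set -- logger -t syslog-probe "probe from pid $$"
    rc=0
    timeout "$secs" "$@" || rc=$?
    # coreutils timeout exits with status 124 when the limit is hit.
    if [ "$rc" -eq 124 ]; then
        echo "probe timed out after ${secs}s; the system logger may be hung" >&2
    fi
    return "$rc"
}
```

For example, `syslog_probe 5` returns 124 if the write to syslog blocks for more than five seconds.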
Comment 16 Workaphobia 2007-07-10 18:10:42 UTC
> "if you want to piss and moan, go somewhere else"

He's pissed because the default behavior of a non-masked package is to hang the system when a sample configuration is uncommented, and this default behavior has remained uncorrected, either with a fix to the metalog program itself or to the ebuild, after the problem has been known for months, due to a perceived lack of effort. Whether or not this perception is correct is irrelevant to his being upset.

> "i'd rather have a completely broken system than read another comment from you"

That's all well and good for you, but I for one wouldn't mind listening to someone piss and moan if it saved me the time I spent determining my logger of all things was the reason my system wouldn't boot.

This is pretty much my only experience so far with the gentoo bug system or any bug tracker, so I'm not quite sure what the specific responsibilities of a maintainer are, or what the deal is with QAcanfix. But I would prefer that you not make that kind of decision for the rest of us.
Comment 17 SpanKY gentoo-dev 2007-07-10 18:47:55 UTC
you arent familiar with Jakub ... this is his behavior regardless of the bug and i'm tired of it ... so stick to the issue: metalog, and not the other crap

Chris: can you build metalog with debugging and no stripping and when metalog hangs, attach to it with gdb and get some information ?  i know older versions of metalog had problems in the signal handler with logging, but i fixed those in 0.8 and i cant get metalog to hang on my systems anymore ...
Comment 18 Chris Richards 2007-07-11 03:57:00 UTC
(In reply to comment #17)
> Chris: can you build metalog with debugging and no stripping and when metalog
> hangs, attach to it with gdb and get some information ?  i know older versions
> of metalog had problems in the signal handler with logging, but i fixed those
> in 0.8 and i cant get metalog to hang on my systems anymore ...
> 

I've sent you an e-mail, though I'm afraid what I got isn't going to be very useful to you.
Comment 19 Chris Richards 2007-07-12 04:23:51 UTC
SpanKY, were you able to make use of the info I sent you?  Is there anything else I can do?
Comment 20 Pim Dennendal 2007-07-12 09:40:58 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > Pim: i'm pretty sure i already fixed the hang issue if the thing specified by
> > command fails ... can you double check with metalog-0.8 ?
> > 
> 
> Both 0.8_rc4 and 0.8 are bugged.  0.8 is the version that took my system down. 
> Once I got it back up, I downgraded through 0.8_rc4 before settling on
> 0.8_rc1-r2 as the version that didn't kill my system.  The other versions
> simply hang, and hang any process that tries to write to syslog.
> 
> Hope that helps.
> 
Sorry to be so slow in coming back.

It does not really help, as the "PTF" (i.e. the get-you-out-of-a-hole type fix) is as originally stated by me and does all that I need it to do at the moment. Perhaps I was not explicit or clear enough? Either simply change the path of the "module" (shell script) to point to where it actually was placed by the installation process, or move the "module" to where I, in my utter stupidity, would believe to be the "natural" place to put this type of "module", i.e. "/usr/sbin/".

I stated in the original problem report that I had as a matter of course upgraded to "0.8_rc4" and the release of the "base" that I then had in use. Quite frankly I have simply no idea what is being requested of me here.

I have been subjected to an absolutely ginormous amount of being told that I am "legacy", etc., etc. But the way I have always handled these issues at either end of this very terrible stick is to 1) report the actual cause of the problem along with its symptoms, 2) provide a "quick and dirty" way of circumventing (that is, avoiding, for us normal bozos) the problem(s), and 3) either fix it properly, 'cos that is what I used to get paid mega-bucks to do by tiny and entirely unknown labs/companies like IBM and GEC (US one), or pass it to some poor sucker saddled with doing the job for my client - for some reason they got pretty narked at paying me mega-bucks to do what the supplier promised to do as part of the supplier's fee. Very strange world - just can not get rich quick - not no how not ever.

I would very strongly recommend to the developer(s) that they do their utmost to keep the provided sample - even if commented out - in line with what the poor old "legacy" berks like me get delivered at our end. Also I would add a "dire warning" to the self same sample as to the consequences of getting it wrong. This has effectively already been explicitly stated in other comments.

Slightly off-topic, but ... Almost any provider of "proprietary" software would term this issue "user error". Because this is "Free" software, and as is far more obvious from the multifarious comments above and below, this and many other issues are being taken seriously by, amongst others, the developer(s). For the great joy of being permitted to take part in all this - my heartfelt thanks to one and all.
Comment 21 Jakub Moc (RETIRED) gentoo-dev 2007-07-14 22:54:05 UTC
*** Bug 185356 has been marked as a duplicate of this bug. ***
Comment 22 Chris Richards 2007-07-14 23:14:03 UTC
Summary is incorrect: metalog-0.8_rc1-r2 is the only version that works for me.  ALL later versions (0.8_rc2, 0.8_rc4, 0.8, and 0.8-r1) hang my system.  To make matters worse, the only ebuild in the system is 0.8-r1, meaning that if some other poor sap gets nailed by this bug, he's going to have to go dig the e-build for a previous version out of CVS in order to recover his system.

I am willing to help in whatever way my meager skills allow to solve this problem, but whatever the issue is, it is NOT as simple as not being able to find the command file, because the only command file I'm using is consolelog.sh, and I have copied it into both /usr/sbin and /sbin, and I still have the problem.
Comment 23 Jakub Moc (RETIRED) gentoo-dev 2007-07-14 23:15:54 UTC
(In reply to comment #22)
> Summary is incorrect: metalog-0.8_rc1-r2 is the only version that works for me.

The summary is just fine, note the > in front.
Comment 24 Chris Richards 2007-07-14 23:24:35 UTC
Ah, So I see.  Thanks.
Comment 25 SpanKY gentoo-dev 2009-02-14 08:40:42 UTC
unable to reproduce with metalog-0.9
Comment 26 Chris Richards 2009-02-14 19:23:02 UTC
(In reply to comment #25)
> unable to reproduce with metalog-0.9
> 

I recompiled my system in late 2007 as part of a move to a new server, and moved to a newer system profile (/usr/portage/profiles/hardened/linux/x86).  The net result is that I am currently running Metalog 0.8-rc1 without issues, where previously it would kill my system.  It would thus appear that there was something 'interesting' about my previous system configuration that was interacting with metalog in some way to produce the issues I was having.  As that system configuration no longer exists, there's no way to determine what the problem was, or if the problem still exists.