Bug 177424 - >app-admin/metalog-0.8_rc1-r2 hangs applications writing to syslog when console parameter fails
Bug#: 177424
Product: Gentoo Linux
Version: unspecified
Platform: All
OS/Version: Linux
Status: RESOLVED
Severity: critical
Priority: P2
Resolution: WORKSFORME
Assigned To: vapier@gentoo.org
Reported By: pim@go.big-orange.com
Component: Ebuilds
Summary: >app-admin/metalog-0.8_rc1-r2 hangs applications writing to syslog when console parameter fails
Opened: 2007-05-07 09:20 0000
Metalog v 0.8-rc4 came in as a routine emerge --sync, etc. upgrade on 2006.1.
The prior sample and conf referred to the shell script "consolelog.sh" as being
in /usr/sbin/, and this worked fine. The 0.8-rc4 sample does the same, but the
script is now installed in /usr/bin. This error occurs on both x86 and x86_64
(AMD Turion) hardware. It prevents boot from completing, no kernel or other log
messages appear either in the log or on the virtual console (tty12, for me),
and the X server is unusable. Also, dmesg | less shows wtmp contents instead of
the kernel ring buffer. I am very surprised that this is not already a known
issue, and also by the (to me) inexplicable move - I see logging as a "system"
task, not an (ordinary) user one.
Reproducible: Always
Steps to Reproduce:
1. Install a metalog version prior to 0.8-rc4 and tailor the conf to include
logging to the (virtual) console - i.e. uncomment the sample lines.
2. Upgrade to 0.8-rc4, but do not change the log-to-(virtual)-console line(s).
3. Reboot.
Actual Results:
As in "Steps to Reproduce".
Expected Results:
Console log on (Virtual) console, log file(s) and ability to log on to system.
Also, with Interactive or Single boot, dmesg | less should show the log from
the kernel ring buffer not the contents of wtmp - log-on/-off(s) history.
The change in "style" of the sample "metalog.conf" is irrelevant to this
problem - only the unexpected/undocumented move of the script matters. What is
happening is that the vfs buffer pool fills with log messages and causes
the system logger to terminate abnormally with a very high code - 127, I
believe.
Resolution is simple: just remove the "s" from /usr/sbin/consolelog.sh in
/etc/metalog.conf, so that it reads /usr/bin/consolelog.sh.
emerge --info is irrelevant to this problem - and now lost due to the upgrade
activities.
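As a rough way to check for the path mismatch described in this report, the sketch below scans a metalog-style configuration for command = "..." lines and reports any target that is missing or not executable. The function name check_metalog_commands is made up for illustration and is not part of metalog. It also assumes that each quoted value is a bare path with no arguments.

```shell
# Report every `command = "..."` target in a metalog-style config file that
# is missing or not executable. Prints nothing when all targets are usable.
# Assumption: the quoted value is a plain path without arguments.
check_metalog_commands() {
    conf=${1:-/etc/metalog.conf}
    # Lines starting with '#' do not match, so commented samples are skipped.
    sed -n 's/^[[:space:]]*command[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$conf" |
    while IFS= read -r cmd; do
        [ -x "$cmd" ] || printf 'not executable: %s\n' "$cmd"
    done
}
```

Under those assumptions, running check_metalog_commands /etc/metalog.conf after an upgrade and before a reboot would flag a stale /usr/sbin/consolelog.sh entry.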
i should update metalog so that it doesnt hang when a command doesnt exist ...
(In reply to comment #2)
> All ebuilds in the tree install this into /usr/bin because exeinto has no
> effect on dobin.
In which case, how about the same for the sample metalog.conf file contents.
The difference had me running around for a month. After which I decided I would
be better off using somebody else's Distro!
(In reply to comment #3)
> i should update metalog so that it doesnt hang when a command doesnt exist ...
>
I agree that is the long-term solution. The PTF is as originally stated and
noted in Comment # 2 - My reply.
Personally, I would have done that as pretty well the first thing in the
development of the product.
Having said which, it is a fantastic, great, etc. version of the System-Logger
Service. Thanks.
I just got bitten by this problem today, and it had me scratching my head as I
enabled the console logger on the same boot cycle as a bunch of other system
changes. In my case the major symptom was that the terminal login program would
stall after entering a password.
If I hadn't seen this thread I would've switched to another system logger and
not looked back. As it's an exceedingly simple problem and this bug is marked
New, I take it that just means no dev has seen this bug yet. What needs to
happen to change that?
*** Bug 184748 has been marked as a duplicate of this bug. ***
(In reply to comment #3)
> i should update metalog so that it doesnt hang when a command doesnt exist ...
That's nice, feel free to work on it once you've fixed the broken ebuilds.
Thanks.
(In reply to comment #6)
> If I hadn't seen this thread I would've switched to another system logger and
> not looked back. As it's an exceedingly simple problem and this bug is marked
> New, I take it that just means no dev has seen this bug yet. What needs to
> happen to change that?
>
That would be my question as well, considering this bug has existed for 2
months, and has the potential (as in my case) to prevent a server from even
coming up to the point where you can log in, even from a console.
At the very least, 0.8_rc4 and 0.8 ebuilds should be masked so that x86 and
x86_64 platforms don't get wounded by it.
I would be delighted to volunteer my time, except I haven't got the foggiest
idea how to manage an e-build.
I'm posting a diff here since the maintainer is clearly more concerned about
this bug's keywords than about fixing the trivial issues in a clearly broken
ebuild. Why it takes two months to fix one line in an ebuild that causes
severe borkage for users is beyond me.
<snip>
--- metalog-0.8.ebuild 2007-06-21 09:05:34.000000000 +0200
+++ metalog-0.8.ebuild 2007-07-10 08:35:00.000000000 +0200
@@ -30,7 +30,7 @@
 	newconfd "${FILESDIR}"/metalog.confd metalog
 	exeinto /usr/sbin
-	dobin "${FILESDIR}"/consolelog.sh || die
+	doexe "${FILESDIR}"/consolelog.sh || die
 }
 pkg_preinst() {
</snip>
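The one-line change in the diff above matters because, as noted earlier in this thread, exeinto has no effect on dobin. It only sets the destination for doexe. The sketch below imitates that behaviour with simplified stand-ins. The sim_* functions and the IMAGE, DESTTREE and EXEDEST variables are illustrative names for this example only and are not Portage's implementations.

```shell
# Simplified stand-ins for three Portage helpers. Illustrative only; these
# are NOT Portage's implementations.
IMAGE=$(mktemp -d)   # plays the role of the install image
DESTTREE=/usr        # prefix used by the dobin stand-in
EXEDEST=             # directory used by the doexe stand-in, set by sim_exeinto

sim_exeinto() { EXEDEST=$1; }
sim_dobin()   { mkdir -p "$IMAGE$DESTTREE/bin" && cp "$1" "$IMAGE$DESTTREE/bin/"; }
sim_doexe()   { mkdir -p "$IMAGE$EXEDEST" && cp "$1" "$IMAGE$EXEDEST/"; }

# A throwaway empty file stands in for consolelog.sh.
SRC=$(mktemp -d)/consolelog.sh
: > "$SRC"

sim_exeinto /usr/sbin
sim_dobin "$SRC"     # exeinto is ignored: the file lands in usr/bin
sim_doexe "$SRC"     # exeinto is honoured: the file lands in usr/sbin
(cd "$IMAGE" && find . -type f | sort)
```

The final listing shows the file under both usr/bin and usr/sbin, which is the difference between the old and new lines of the ebuild.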
perhaps you should learn to stop being a jackass and maybe people would
actually listen to you
(In reply to comment #11)
> perhaps you should learn to stop being a jackass
Perhaps you could just fix the bug, which would take less time than what you've
actually spent on pointless comments like the one above and on useless messing
with this bug's keywords? If you don't want to, that's what the QAcanfix
keyword is for - unless you assume that QA is unable to properly change one
line in the ebuild so that it actually does what's intended and what it fails
to do due to an obvious mistake.
if you want to piss and moan, go somewhere else
i'd rather have a completely broken system than read another comment from you
Pim: i'm pretty sure i already fixed the hang issue if the thing specified by
command fails ... can you double check with metalog-0.8 ?
(In reply to comment #14)
> Pim: i'm pretty sure i already fixed the hang issue if the thing specified by
> command fails ... can you double check with metalog-0.8 ?
>
Both 0.8_rc4 and 0.8 are bugged. 0.8 is the version that took my system down.
Once I got it back up, I downgraded through 0.8_rc4 before settling on
0.8_rc1-r2 as the version that didn't kill my system. The other versions
simply hang, and hang any process that tries to write to syslog.
Hope that helps.
> "if you want to piss and moan, go somewhere else"
He's pissed because the default behavior of a non-masked package is to hang the
system when a sample configuration is uncommented, and this default behavior
has remained uncorrected, either with a fix to the metalog program itself or to
the ebuild, after the problem has been known for months, due to a perceived
lack of effort. Whether or not this perception is correct is irrelevant to his
being upset.
> "i'd rather have a completely broken system than read another comment from you"
That's all well and good for you, but I for one wouldn't mind listening to
someone piss and moan if it saved me the time I spent determining my logger of
all things was the reason my system wouldn't boot.
This is pretty much my only experience so far with the gentoo bug system or any
bug tracker, so I'm not quite sure what the specific responsibilities of a
maintainer are, or what the deal is with QAcanfix. But I would prefer that you
not make that kind of decision for the rest of us.
you arent familiar with Jakub ... this is his behavior regardless of the bug
and i'm tired of it ... so stick to the issue: metalog, and not the other crap
Chris: can you build metalog with debugging and no stripping and when metalog
hangs, attach to it with gdb and get some information ? i know older versions
of metalog had problems in the signal handler with logging, but i fixed those
in 0.8 and i cant get metalog to hang on my systems anymore ...
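The request above amounts to attaching a debugger to the hung daemon and dumping its stack. A minimal sketch follows, assuming gdb is installed and metalog was built with debug symbols and left unstripped. On Gentoo that is typically arranged with something like FEATURES="nostrip" and -ggdb in CFLAGS, but check your own setup. The helper name dump_backtrace and the GDB override variable are illustrative.

```shell
# Attach gdb to a running process, print a full backtrace of every thread,
# then detach and exit. Set GDB to another program to preview the arguments.
dump_backtrace() {
    "${GDB:-gdb}" -p "$1" -batch -ex "thread apply all bt full"
}
```

Pass the PID of the hung metalog process, for example one reported by pidof metalog. If several PIDs are listed, run the helper once per PID and save each output.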
(In reply to comment #17)
> Chris: can you build metalog with debugging and no stripping and when metalog
> hangs, attach to it with gdb and get some information ? i know older versions
> of metalog had problems in the signal handler with logging, but i fixed those
> in 0.8 and i cant get metalog to hang on my systems anymore ...
>
I've sent you an e-mail, though I'm afraid what I got isn't going to be very
useful to you.
SpanKY, were you able to make use of the info I sent you? Is there anything
else I can do?
(In reply to comment #15)
> (In reply to comment #14)
> > Pim: i'm pretty sure i already fixed the hang issue if the thing specified by
> > command fails ... can you double check with metalog-0.8 ?
> >
>
> Both 0.8_rc4 and 0.8 are bugged. 0.8 is the version that took my system down.
> Once I got it back up, I downgraded through 0.8_rc4 before settling on
> 0.8_rc1-r2 as the version that didn't kill my system. The other versions
> simply hang, and hang any process that tries to write to syslog.
>
> Hope that helps.
>
Sorry to be so slow in coming back.
It does not really help, as the "PTF", i.e. the get-you-out-of-a-hole type of
fix that I originally stated, does all that I need it to do at the moment.
Perhaps I was not explicit or clear enough? Either simply change the path of
the "module" (shell script) to point to where it was actually placed by the
installation process, or move the "module" to what I in my utter stupidity
would believe to be the "natural" place for this type of "module" - i.e.
"/usr/sbin/".
I stated in the original problem report that I had as a matter of course
upgraded to "0.8_rc4" and the release of the "base" that I then had in use.
Quite frankly I have simply no idea what is being requested of me here.
I have been subjected to an absolutely ginormous amount of being told that I am
"legacy", etc., etc. But the way I have always handled these issues at either
end of this very terrible stick is to 1) report the actual cause of the problem
along with its symptoms, 2) provide a "quick and dirty" way of circumventing -
that is, avoiding, for us normal bozos - the problem(s), and 3) either fix it
properly, 'cos that is what I used to get paid mega-bucks to do by tiny and
entirely unknown labs/companies like IBM and GEC (the US one), or pass it to
some poor sucker saddled with doing the job for my client - for some reason
they got pretty narked at paying me mega-bucks to do what the supplier promised
to do as part of the supplier's fee. Very strange world - just can not get rich
quick - not no how, not ever.
I would very strongly recommend to the developer(s) that they do their utmost
to keep the provided sample - even if commented out - in line with what the
poor old "legacy" berks like me get delivered at our end. Also I would add a
"dire warning" to the self same sample as to the consequences of getting it
wrong. This has effectively already been explicitly stated in other comments.
Slightly off-topic, but ... Almost any provider of "proprietary" software would
term this issue "user error". Because this is "Free" software, and far more
obviously from the multifarious comments above and below, this and many other
issues are being taken seriously by, amongst others, the developer(s). For the
great joy of being permitted to take part in all this - my heartfelt thanks to
one and all.
*** Bug 185356 has been marked as a duplicate of this bug. ***
Summary is incorrect: metalog-0.8_rc1-r2 is the only version that works for me.
ALL later versions (0.8_rc2, 0.8_rc4, 0.8, and 0.8-r1) hang my system. To
make matters worse, the only ebuild in the system is 0.8-r1, meaning that if
some other poor sap gets nailed by this bug, he's going to have to go dig the
e-build for a previous version out of CVS in order to recover his system.
I am willing to help in whatever way my meager skills allow to solve this
problem, but whatever the issue is, it is NOT as simple as not being able to
find the command file, because the only command file I'm using is
consolelog.sh, and I have copied it into both /usr/sbin and /sbin, and I still
have the problem.
(In reply to comment #22)
> Summary is incorrect: metalog-0.8_rc1-r2 is the only version that works for me.
The summary is just fine, note the > in front.
unable to reproduce with metalog-0.9
(In reply to comment #25)
> unable to reproduce with metalog-0.9
>
I recompiled my system in late 2007 as part of a move to a new server, and
moved to a newer system profile ( /usr/portage/profiles/hardened/linux/x86).
The net result is that I am currently running Metalog 0.8-rc1 without issues,
where previously it would kill my system. It would thus appear that there was
something 'interesting' about my previous system configuration that was
interacting with metalog in some way to produce the issues I was having. As
that system configuration no longer exists, there's no way to determine what
the problem was, or if the problem still exists.