This has happened several times, but I have no idea what triggers it.
metalog version: 0.8_pre20031130
kernel version: 2.6.17-gentoo-r4
I have console logging enabled on tty11. Lots of things that need logging (shutdown, login, etc.) became impossible. Logins timed out. The console log froze too (it didn't show the USB drive I plugged in). When I checked ps ax, I found several DEFUNCT instances of consolelog.sh. After I SIGKILLed metalog, everything went on its way again. I could then restart metalog and it would work.
It did this to me right after bootup just now. Couldn't log in, couldn't restart... (Man, I hate hard resetting a Linux box.) I downgraded to 0.7-r1 and will keep you informed if it does the same thing. (Hope not.)
Why don't you use 0.8_rc1-r2 which is stable everywhere???
Well, it's not. Tried it. It did the exact same thing. 0.7-r1 didn't kill the system, it just logged a bit, and then - God knows why - just stopped logging. No error messages, no nothing, just no logging. No logging on console, no logging in files. Dead silence. I unmerged metalog and emerged syslog-ng. I propose the keyword ~amd64 to be set for ALL metalog versions until this is solved.
I've got the same bug with metalog-0.8_rc1-r2 on a suspend2-2.6.18 smp kernel. I can log in immediately after boot, but not a few seconds later. `rc-update del metalog' fixed it. It seems there is currently no solution to this problem other than rebuilding metalog (a temporary solution at best: I did this two days ago and have the same problem again now) or using another logger like syslog-ng.
There are other people suffering from this metalog misbehaviour:
http://forums.gentoo.org/viewtopic-t-484031-highlight-metalog.html
http://forums.gentoo.org/viewtopic-t-500043-highlight-metalog.html?sid=5521630ad18fda4965305337878ae6e4
http://forums.gentoo.org/viewtopic-t-494029-highlight-metalog.html?sid=5521630ad18fda4965305337878ae6e4
Ok... this may be just me, but can you guys try replacing the very first line of /usr/sbin/consolelog.sh? Replace "#!/bin/sh" with "#!/bin/bash". This is stupid since /bin/sh is a link to bash, but it SOLVES the problem here... maybe the problem is in bash and not in metalog or the script.
still can't reproduce this myself ... the source code looks fine too
you could try adding this to the top of your metalog.conf:
Metalog :
  program = "metalog"
  logdir = "/var/log/metalog"
  break = 1
ok, i don't think this is consolelog.sh, i think this is any command ...
i just observed it on a machine of mine, and poking at it with strace makes it look like there's a race condition / deadlock in there somewhere
going to rebuild with debugging turned on so i can throw gdb at it if it happens again
Created attachment 105080: gdb output of hung metalog
looking at the gdb backtrace, it's readily apparent what the problem is ...
metalog uses functions in its signal handlers that are not reentrant, and according to the POSIX specs that is not valid usage and may lead to undefined behavior
in this case, the localtime() function is not required to be reentrant:
http://www.opengroup.org/onlinepubs/009695399/functions/localtime.html
since the metalog signal handlers call the doLog() function, and that calls localtime(), the hang is behavior the spec permits
in other words, the fault here lies in the implementation of unsafe signal handlers in metalog
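For anyone wondering what a safe handler would look like, here is a minimal sketch of the usual pattern. It is not the actual metalog code or the eventual fix: the names on_signal(), install_handler() and drain_pending() are made up for illustration, and the log line format is an assumption. The idea is that the handler only sets a flag, and the time formatting and logging happen later in the main loop, outside signal context.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t signal_pending = 0;

/* The handler only records that something happened.
 * No localtime(), no stdio, no malloc() in here. */
static void on_signal(int signo)
{
    (void)signo;
    signal_pending = 1;
}

/* Hypothetical helper: install on_signal() for one signal.
 * Returns 0 on success, -1 on error (same as sigaction()). */
static int install_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}

/* Hypothetical helper, called from the main loop in normal context.
 * If a signal was recorded, clears the flag, writes a timestamped
 * line into buf and returns 1. Otherwise returns 0. */
static int drain_pending(char *buf, size_t len)
{
    struct tm tm_buf;
    char stamp[32];
    time_t now;

    assert(buf != NULL && len > 0);
    if (!signal_pending)
        return 0;
    /* Clear before doing the work, so a signal that arrives
     * while we are logging is picked up on the next call. */
    signal_pending = 0;

    now = time(NULL);
    /* localtime_r() writes into our own buffer; calling it here is
     * fine in any case because we are no longer in the handler. */
    if (localtime_r(&now, &tm_buf) == NULL ||
        strftime(stamp, sizeof stamp, "%b %e %H:%M:%S", &tm_buf) == 0)
        strcpy(stamp, "?");
    snprintf(buf, len, "%s [example] signal event handled", stamp);
    return 1;
}
```

The main loop would call drain_pending() each time it wakes up (for example after poll() or read() returns) and pass the resulting line to the normal logging path. Several signals arriving close together collapse into one event, which is usually acceptable for things like reaping children.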
(In reply to comment #8)
> looking at the gdb backtrace it's readily apparent what the problem is ...
> metalog uses functions in its signal handler that are not reentrant and
> according to the POSIX specs, that is not valid usage and may lead to undefined
> behavior
[snip]
> in other words, the fault here lies in the implementation of unsafe signal
> handlers in metalog
Shouldn't that be sufficient to mask the package in the portage tree? The Gentoo handbook lists this logger in the install guide. However, metalog, though powerful, makes the whole system unstable, and after looking at the forums at SourceForge it seems this won't be fixed soon. Imagine a new user trying Gentoo for the first time and ending up with a system that constantly hangs for no apparent reason...
no
fixed in 0.8-rc2