Bug#: 8607
Status: RESOLVED
Resolution: TEST-REQUEST
Assigned To: Heinrich Wendel (RETIRED) <lanius@gentoo.org>
Reporter: Sascha Silbe <sascha-gentoo-bugzilla@silbe.org>

Filename            Description                                   Type                      Creator          Created                Size
metalog-spawn.diff  Patch to make it multiprocess                 patch                     Jedi/Sector One  2003-04-16 16:02 0000  1.37 KB
metalog.patch       this is a patch made against metalog-0.6-r10  patch                     roger            2003-05-06 15:42 0000  1.38 KB
metalog.patch       this is a patch made against metalog-0.6-r10  application/octet-stream  roger            2003-05-06 15:42 0000  1.38 KB
metalog.patch       this is a patch made against metalog-0.6-r10  text/plain                roger            2003-05-06 15:42 0000  1.38 KB

Bug 8607 depends on: 3434
Bug 8607 blocks: 20593


Description:   Opened: 2002-10-01 07:20 0000
From time to time, metalog hangs. Because it handles syslog requests, many
other processes hang, too.
First I thought it would occur after a long uptime, but today it happened 10
hours after the reboot.
There's no apparent cause for this, so I cannot really reproduce this. I guess
it's some kind of race condition.
strace shows that the two metalog processes are waiting for pause() and
syslog() to return:

root@cube:/home/sascha# strace -ff -p 6239
pause(

root@cube:/home/sascha# strace -ff -p 6247
syslog(0x2, 0xbffff518, 0x800

It happened with gcc 2.95.x and 3.2 and with Kernel 2.4.18 and 2.4.19.

Any suggestions on how to reproduce or at least trace this (without generating
gigabytes of logs during normal operation)?

------- Comment #1 From SpanKY 2002-10-01 10:04:45 0000 -------
well, anything that the kernel logs gets stored in its internal buffer and is 
sent to a user space logging daemon. 
 
so you could type `dmesg` to see if there are any funky kernel logs ... 

------- Comment #2 From Sascha Silbe 2002-10-03 06:51:07 0000 -------
It just happened again. There's nothing new in the kernel buffer (dmesg
output). Sending a CONT does not help, sending a HUP just tells metalog to
exit:

=== Begin screenshot 1 ===
root@cube:/home/sascha# strace -p 6312
pause()                                 = ? ERESTARTNOHAND (To be restarted)
--- SIGCONT (Continued) ---
pause()                                 = ? ERESTARTNOHAND (To be restarted)
--- SIGHUP (Hangup) ---
write(2, "Unlinking pid file: /var/run/met"..., 41) = -1 EBADF (Bad file descriptor)
unlink("/var/run/metalog.pid")          = 0
getpid()                                = 6312
write(2, "Process [6312] died with signal "..., 36) = -1 EBADF (Bad file descriptor)
kill(6320, SIGTERM)                     = 0
--- SIGCHLD (Child exited) ---
wait4(-1, NULL, WNOHANG, NULL)          = 6320
write(2, "Klog child [6320] died\n", 23) = -1 EBADF (Bad file descriptor)
wait4(-1, NULL, WNOHANG, NULL)          = -1 ECHILD (No child processes)
munmap(0x40018000, 4096)                = 0
munmap(0x40017000, 4096)                = 0
munmap(0x4001a000, 4096)                = 0
munmap(0x40019000, 4096)                = 0
munmap(0x40016000, 4096)                = 0
_exit(1)                                = ?
=== End screenshot 1 ===

=== Begin screenshot 2 ===
root@cube:/home/sascha# strace -p 6320
syslog(0x2, 0xbffff518, 0x800)          = ? ERESTARTSYS (To be restarted)
--- SIGCONT (Continued) ---
syslog(0x2, 0xbffff518, 0x800)          = ? ERESTARTSYS (To be restarted)
--- SIGHUP (Hangup) ---
--- SIGTERM (Terminated) ---
=== End screenshot 2 ===

The last few lines of dmesg output:

=== Begin dmesg ===
eth0: no IPv6 routers present
ipsec0: no IPv6 routers present
NVRM: AGPGART: allocated 136 pages
NVRM: AGPGART: freed 136 pages
NVRM: AGPGART: allocated 42 pages
NVRM: AGPGART: freed 42 pages
=== End dmesg ===

The old syslog file:

=== Begin /var/log/syslog/current.old ===
Oct  2 23:34:09 [pluto] "cube" #13: initiating Main Mode to replace #12
Oct  2 23:34:10 [pluto] "cube" #13: Peer ID is ID_IPV4_ADDR: '192.168.1.1'
Oct  2 23:34:10 [pluto] "cube" #13: ISAKMP SA established
Oct  2 23:38:39 [sSMTP mail] sendmail sent mail for sascha
Oct  2 23:52:37 [sSMTP mail] sendmail sent mail for sascha
Oct  3 00:09:45 [kernel] NVRM: AGPGART: allocated 136 pages
Oct  3 00:11:14 [kernel] NVRM: AGPGART: freed 136 pages
Oct  3 00:24:52 [pluto] "cube" #14: initiating Main Mode to replace #13
Oct  3 00:24:52 [pluto] "cube" #14: Peer ID is ID_IPV4_ADDR: '192.168.1.1'
Oct  3 00:24:52 [pluto] "cube" #14: ISAKMP SA established
Oct  3 00:33:40 [sSMTP mail] sendmail sent mail for sascha
Oct  3 00:42:21 [kernel] NVRM: AGPGART: allocated 42 pages
Oct  3 00:42:46 [kernel] NVRM: AGPGART: freed 42 pages
Oct  3 01:00:07 [sSMTP mail] /usr/lib/sendmail sent mail for root
Oct  3 01:10:20 [pluto] "cube" #15: initiating Main Mode to replace #14
Oct  3 01:10:21 [pluto] "cube" #15: Peer ID is ID_IPV4_ADDR: '192.168.1.1'
Oct  3 01:10:21 [pluto] "cube" #15: ISAKMP SA established
Oct  3 01:55:16 [pluto] "cube" #16: responding to Quick Mode
Oct  3 01:55:16 [pluto] "cube" #16: IPsec SA established
Oct  3 02:00:28 [pluto] "cube" #17: initiating Main Mode to replace #15
Oct  3 02:00:28 [pluto] "cube" #17: Peer ID is ID_IPV4_ADDR: '192.168.1.1'
Oct  3 02:00:28 [pluto] "cube" #17: ISAKMP SA established
Oct  3 02:06:39 [pluto] "cube" #17: ignoring Delete SA payload
Oct  3 02:06:39 [pluto] "cube" #17: received and ignored informational message
Oct  3 02:42:41 [pluto] "cube" #18: initiating Main Mode to replace #17
Oct  3 02:42:41 [pluto] "cube" #18: Peer ID is ID_IPV4_ADDR: '192.168.1.1'
Oct  3 02:42:41 [pluto] "cube" #18: ISAKMP SA established
Oct  3 03:29:48 [pluto] "cube" #19: initiating Main Mode to replace #18
Oct  3 03:29:49 [pluto] "cube" #19: Peer ID is ID_IPV4_ADDR: '192.168.1.1'
Oct  3 03:29:49 [pluto] "cube" #19: ISAKMP SA established
Oct  3 04:13:40 [pluto] "cube" #20: initiating Main Mode to replace #19
Oct  3 04:13:40 [pluto] "cube" #20: Peer ID is ID_IPV4_ADDR: '192.168.1.1'
Oct  3 04:13:40 [pluto] "cube" #20: ISAKMP SA established
Oct  3 04:55:54 [pluto] "cube" #21: initiating Main Mode to replace #20
Oct  3 04:55:54 [pluto] "cube" #21: Peer ID is ID_IPV4_ADDR: '192.168.1.1'
Oct  3 04:55:54 [pluto] "cube" #21: ISAKMP SA established
Oct  3 05:38:06 [pluto] "cube" #22: initiating Main Mode to replace #21
Oct  3 05:38:06 [pluto] "cube" #22: Peer ID is ID_IPV4_ADDR: '192.168.1.1'
Oct  3 05:38:06 [pluto] "cube" #22: ISAKMP SA established
Oct  3 06:22:36 [pluto] "cube" #23: initiating Main Mode to replace #22
=== End /var/log/syslog/current.old ===

The new file (i.e. after restarting metalog) contains these new entries:

=== Begin /var/log/syslog/current ===
Oct  3 13:40:53 [sshd(pam_unix)] session closed for user root
Oct  3 13:41:28 [pluto] "cube" #54: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #53: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #52: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #51: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #50: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #49: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #48: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #47: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #46: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #45: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #44: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #43: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #42: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #41: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #40: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #39: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #38: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #37: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #36: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #35: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #34: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #33: max number of retransmissions (2) reached STATE_MAIN_R1
Oct  3 13:41:28 [pluto] "cube" #32: max number of retransmissions (2) reached STATE_QUICK_R1
Oct  3 13:41:28 [pluto] "cube" #31: max number of retransmissions (2) reached STATE_QUICK_R1
Oct  3 13:41:28 [pluto] "cube" #30: max number of retransmissions (2) reached STATE_QUICK_R1
Oct  3 13:41:28 [pluto] "cube" #29: max number of retransmissions (2) reached STATE_QUICK_R1
Oct  3 13:41:28 [pluto] "cube" #28: max number of retransmissions (2) reached STATE_QUICK_R1
=== End /var/log/syslog/current ===

Seems like it has to do with syslog, not klog.
Any suggestions on how to proceed?

------- Comment #3 From Marcel Köppen 2003-01-20 05:31:23 0000 -------
I have the same problem here, and it seems to depend on the kernel I use. With
2.4.20-pre8-ac3 everything works as expected, but with vanilla 2.4.20,
2.4.20-ck1 and 2.4.21-pre3 metalog stops working after some time.

------- Comment #4 From Martin Zwickel 2003-02-12 02:59:09 0000 -------
I have the same problem!
It just sux...*argh*
I'll try to find a solution on my own until it gets fixed.

------- Comment #5 From Wouter Deconinck 2003-03-30 05:20:37 0000 -------
I have the same problem.

Also, logging in is no longer possible (neither as a user nor as root).  I type the username (if not using su), then I get "Password:".  If I type the password and press <ENTER>, nothing happens.  On the console I get a 60-second timeout.

My internet connection is also down while I have this problem (VPN, using pptp and pppd).

Both problems are caused by the logger (su wants to write an entry, pppd probably also wants to write something, and they have to wait...).
Two possibilities: a kernel problem (mine is 2.4.20) with sending log entries, or a problem with metalog (0.6-r10) receiving log entries.


I have put some diagnostics here about su.

=== where does it go wrong for 'su'? ===
# strace su > su.crash
# reboot
# strace su > su.normal
# diff su.crash su.normal
<cut>
3556c3472,3508
< send(3, "<37>Mar 30 00:16:14 su(pam_unix)"..., 133, 0 <unfinished ...>
---
> send(3, "<37>Mar 30 00:44:40 su(pam_unix)"..., 133, 0) = 133
<cut>
... the rest is only executed in the normal case (although strace somehow doesn't allow me to log in).
========================================
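
The unfinished send() above is what one would expect when the log daemon stops reading its socket: once the socket buffer is full, a blocking writer simply waits. A minimal sketch of that effect, assuming a Unix datagram socket pair as a stand-in for the real log socket (the function name fill_unread_socket is made up for illustration, and MSG_DONTWAIT is used so the demo reports the full buffer instead of hanging the way su does):

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo: send datagrams to a peer that never reads.
 * Returns how many were accepted before the kernel refused more,
 * or -1 on an unexpected error.
 */
int fill_unread_socket(void)
{
    int sv[2];
    char msg[128] = "<37>su(pam_unix): example log line";
    int sent = 0;
    int err = 0;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;

    /* sv[1] plays the hung daemon: nothing ever reads from it. */
    while (sent < 1000000) {
        if (send(sv[0], msg, sizeof msg, MSG_DONTWAIT) == -1) {
            err = errno;
            break;
        }
        sent++;
    }

    close(sv[0]);
    close(sv[1]);

    /* A full buffer shows up as EAGAIN/EWOULDBLOCK on a non-blocking send. */
    if (err == EAGAIN || err == EWOULDBLOCK)
        return sent;
    return -1;
}
```

Without MSG_DONTWAIT the same send() would block at that point, which matches the "<unfinished ...>" line in the strace diff.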

------- Comment #6 From Jens Kreiensiek 2003-04-02 02:40:19 0000 -------
Same problem here too! My configuration is
- gentoo-sources-2.4.20-r2
- metalog-0.6-r10

------- Comment #7 From simon 2003-04-06 07:31:30 0000 -------
I have the same problem with metalog stopping from time to time with
gentoo-2.4.20-r2 Kernel.

------- Comment #8 From Nils Ohlmeier 2003-04-11 09:26:11 0000 -------
I have the same problem here. And it is definitely a metalog problem,
because I had a reproducible freeze with a program I am developing.
But the same freeze did not occur on my laptop, where sysklogd is
running.
The problem occurs with synchronization turned on and without.

My workaround is to always keep a root console open from which I can
restart metalog, which solves the problem temporarily.
If the system does not shut down, a triple hit on CTRL-ALT-DEL brings the
system down. But sadly without unmounting the disks.

I just installed a version of metalog with debugging symbols. Maybe I can
deliver some gdb output.

For completeness:
Portage 2.0.47-r10 (default-x86-1.4, gcc-3.2.2, glibc-2.3.1-r4) 
================================================================= 
System uname: 2.4.20-gentoo-r2 i686 AMD Athlon(TM) XP 1900+ 
GENTOO_MIRRORS="ftp://ftp.tu-clausthal.de/pub/linux/gentoo/ 
ftp://ftp.ibiblio.org/pub/linux/distribution/gentoo " 
CONFIG_PROTECT="/etc /var/qmail/control /usr/kde/2/share/config 
/usr/kde/3/share/config /usr/X11R6/lib/X11/xkb /usr/kde/3.1/share/config 
/usr/share/config" 
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d" 
PORTDIR="/usr/portage" 
DISTDIR="/usr/portage/distfiles" 
PKGDIR="/usr/portage/packages" 
PORTAGE_TMPDIR="/var/tmp" 
PORTDIR_OVERLAY="/usr/local/portage" 
USE="x86 libg++ mikmod gdbm slang readline svga tcltk tcpd libwww perl 
python motif 3dnow acpi acpi4linux alsa -apm arts avi berkdb -bonobo cdr 
crypt cups dga directfb dvb dvd encode -esd -evo fbcon flash -gb gif 
-gnome gphoto2 gpm gtk imap imlib innodb java jpeg -ldap kde maildir 
matrox mbox mmx mozilla mpeg mysql ncurses nls odbc oggvorbis opengl 
oss pam pda pdflib pic png qt qtmt quicktime samba sasl scanner sdl slp 
spell sse ssl tiff tetex truetype wmf X xml2 xmms xv zlib" 
COMPILER="gcc3" 
CHOST="i686-pc-linux-gnu" 
CFLAGS="-mcpu=athlon-xp -O3 -pipe -fomit-frame-pointer -fPIC" 
CXXFLAGS="-mcpu=athlon-xp -O3 -pipe -fomit-frame-pointer -fPIC" 
ACCEPT_KEYWORDS="x86" 
MAKEOPTS="-j2" 
AUTOCLEAN="yes" 
SYNC="rsync://rsync.de.gentoo.org/gentoo-portage" 
FEATURES="sandbox ccache" 

------- Comment #9 From Nils Ohlmeier 2003-04-11 15:11:06 0000 -------
Here is a backtrace of the problem. I don't have the time right now to
investigate this further, so I am just storing it here:
 
cloudcity root # gdb /usr/sbin/metalog 8033 
GNU gdb 5.3 
Copyright 2002 Free Software Foundation, Inc. 
GDB is free software, covered by the GNU General Public License, and 
you are 
welcome to change it and/or distribute copies of it under certain 
conditions. 
Type "show copying" to see the conditions. 
There is absolutely no warranty for GDB.  Type "show warranty" for 
details. 
This GDB was configured as "i686-pc-linux-gnu"... 
Attaching to program: /usr/sbin/metalog, process 8033 
Reading symbols from /usr/lib/libpcre.so.0...done. 
Loaded symbols for /usr/lib/libpcre.so.0 
Reading symbols from /lib/libc.so.6...done. 
Loaded symbols for /lib/libc.so.6 
Reading symbols from /lib/ld-linux.so.2...done. 
Loaded symbols for /lib/ld-linux.so.2 
0x43bdb9a7 in pause () from /lib/libc.so.6 
(gdb) bt 
#0  0x43bdb9a7 in pause () from /lib/libc.so.6 
#1  0x0804b14b in spawnCommand (command=0x7 <Address 0x7 out of bounds>,
    date=0x804d920 "Apr 11 20:24:20", prg=0x804bf4c "kernel",
    info=0xbfffefc3 "usb-storage: queuecommand() called") at metalog.c:716
#2  0x0804b2e1 in processLogLine (logcode=7, date=0x804d920 "Apr 11 20:24:20",
    prg=0x804bf4c "kernel", info=0xbfffefc3 "usb-storage: queuecommand() called")
    at metalog.c:772
#3  0x0804b4f8 in process (sockets=0xbffff810) at metalog.c:854 
#4  0x0804ba8d in main (argc=6, argv=0xbffff874) at metalog.c:1058 
#5  0x43b45dc4 in __libc_start_main () from /lib/libc.so.6 
(gdb) 

------- Comment #10 From Nils Ohlmeier 2003-04-12 05:43:47 0000 -------
After looking at the code and my backtrace, I have a question for the people
who have the problem: do you use the console logging block from the end of the
configuration file?

It seems that this is the cause of the problem. I just disabled my console
logging and will report whether this fixes the problem.

------- Comment #11 From Sascha Silbe 2003-04-12 06:48:59 0000 -------
Yes, I do use console logging. Will disable it temporarily, too.


------- Comment #12 From Christoph Probst 2003-04-16 08:43:19 0000 -------
I'm able to trigger this bug using fetchnews, which comes with the news server
leafnode.  I just have to run four or five fetchnews processes at the same time
(just by starting them with fetchnews &) and metalog freezes reproducibly.
(Because of fetchnews' bad timeout behaviour, the bug is triggered by crond from
time to time.)

The bug only appears when metalog's console logging is activated. I use:

| chris@starbed2:/etc/metalog$ tail -4 metalog.conf
| console logging :
|
|   facility = "*"
|   command = "/usr/sbin/consolelog.sh"

| chris@starbed2:/etc/metalog$ cat /usr/sbin/consolelog.sh
| #!/bin/sh
| echo "$1 [$2] $3" > /dev/vc/10
| ...


The metalog master process gets stuck at

| #0 0x400ce9c7 in pause () from /lib/libc.so.6
| #1 0x0804b15b in spawnCommand (command=0x3e <Address 0x3e out of bounds>, ...)
| #2 0x0804b2f1 in processLogLine

Looking at the source:

| static int spawnCommand(const char * const command, ...)

is called by processLogLine() in line 772.

| spawnCommand(block->command, date, prg, info);

While "command" is out of bounds, "block->command" isn't, and its value is

| (gdb) p block->command
| $4 = 0x804ea38 "/usr/sbin/consolelog.sh"


Ok, now. Who can tell what happens there?

------- Comment #13 From Nils Ohlmeier 2003-04-16 15:20:36 0000 -------
I think the "out of bounds" is an error in the debugger, because if the
command address were not accessible, the stat call in line 705 should
fail (probably with a segfault).
I fear the problem lies more in the way they wait for the external command
to return: a while loop around pause(), plus a signal handler which should
change the value of command_child.
One idea is that this way of programming is not multi-process safe,
because the signal handler could change the value of command_child
between the while check and the pause() call. Then the pause() call will
never be interrupted.

------- Comment #14 From Nils Ohlmeier 2003-04-16 15:32:25 0000 -------
Sorry, I forgot to mention that the problem (at least for me) only occurs with
console logging activated. I ran my system for one day without console
logging, which resulted in no metalog hangs during the whole day (whereas I
have >3 blocks per day with console logging).
And in case someone gets the idea that the new 0.7 could solve the problem:
no, it does not.

So the workaround for Gentoo is easy: remove the console logging from
the config file, or at least document in the config file that it is risky to
activate it because of the possible blocks.

I'll try to point the metalog developers to this bug and my concern about
their programming solution.

------- Comment #15 From Jedi/Sector One 2003-04-16 16:02:27 0000 -------
Created an attachment (id=10754)
Patch to make it multiprocess

This patch (untested) against 0.7 should avoid Metalog waiting for processes to
complete before going on with logging.

------- Comment #16 From Christoph Probst 2003-04-17 19:17:40 0000 -------
Your patch seems to solve the problem.  I used it and wasn't able to trigger
the bug anymore.  To cross-check I reinstalled metalog without patch and the
bug reappeared.  Everything seems to be ok now. :-)

------- Comment #17 From Nils Ohlmeier 2003-04-18 12:20:05 0000 -------
Yes, although the solution in the patch is not the best, it works. And that
counts. With the patch applied I have had no lockup for the last 36 hours.

With a new ebuild which applies this patch we can close this bug :-)
A backport of the patch to 0.6 should not be hard, because the code has
not changed in the relevant areas.

------- Comment #18 From Jedi/Sector One 2003-04-18 12:39:07 0000 -------
I will release 0.8 (with the patch and some minor changes) in a few days. I'll
submit the new ebuild as soon as it is released.

Thanks again to all Gentoo Linux freaks not only for the cool distro, but also
for their help, reactivity and coolness.

-Frank. 


------- Comment #19 From Grant Goodyear 2003-04-20 18:55:50 0000 -------
Waiting a few days to see if 0.8 is released as promised.
(Also taking over this bug, since I have a few others related
to metalog.  Unfortunately, I don't currently use metalog because it's
still lacking remote logging.)

------- Comment #20 From roger 2003-05-06 14:05:31 0000 -------
I found this bug too, and it makes the system quite unstable if one doesn't have
a root window open to restart metalog.  (As one can see from my posts to the
gentoo-user mailing list :-/)

------- Comment #21 From roger 2003-05-06 15:42:06 0000 -------
Created an attachment (id=11594)
this is a patch made against metalog-0.6-r10

This was done after an ebuild <metalog.ebuild> unpack (or after all other
patches were applied).  This is merely the same patch as submitted above, but
back-ported to version 0.6-r10.

This problem turned up as soon as I installed the procmail/postfix/spamassassin
combo... I guess the logging done by postfix & spamassassin greatly aggravates
the problem.

I would strongly suggest that this patch be incorporated or that users use
sysklogd instead.

The only thing left to do is to modify the ebuild file to incorporate the
patch/hack.  If this patch is not recommended, maybe mask the ebuild file or
something.

------- Comment #22 From roger 2003-05-06 15:42:44 0000 -------
Created an attachment (id=11595)
this is a patch made against metalog-0.6-r10

------- Comment #23 From roger 2003-05-06 15:42:52 0000 -------
Created an attachment (id=11596)
this is a patch made against metalog-0.6-r10

------- Comment #24 From roger 2003-05-06 15:46:56 0000 -------
sorry about the triple post. bugzilla borked/errored during the send for some
reason and appeared to fail sending.

------- Comment #25 From roger 2003-05-07 13:49:35 0000 -------
OK, I give up on metalog.  Not only did it have this bug, but it also prevents
my cardbus/pcmcia from working entirely!  Somehow metalog prevents the second
(or the first) pcmcia slot from working, so I only get one working pcmcia
slot.  Very odd that a system logger has so many bugs in it!  From a
little more research, it looks like syslog-ng is the actual contender here.  And
this patch merely keeps the system freeze from occurring, but the metalog daemon
will still freeze (logging will freeze... permanently?).  This metalog package
*should* be masked! ..completely. lol.

------- Comment #26 From Peter Simons 2003-05-19 07:33:44 0000 -------
Just wondering: Is there anything going on regarding fixing metalog? (I have
had these hangs, too, since I upgraded to kernel 2.4.20). The 0.8 version someone
mentioned a month ago doesn't seem to have appeared.

------- Comment #27 From Martin Holzer (RETIRED) 2003-06-29 14:03:51 0000 -------
*** Bug 18384 has been marked as a duplicate of this bug. ***

------- Comment #28 From jani@iv.ro 2003-08-14 07:47:27 0000 -------
I tracked down this bug myself just to find that it has been found months ago
on gentoo and still there's no bugfix ebuild. At least it was educative for me.

------- Comment #29 From Martin Holzer (RETIRED) 2003-10-09 09:22:57 0000 -------
is this fixed in 0.7 ?

------- Comment #30 From Nils Ohlmeier 2003-10-09 10:08:13 0000 -------
No, as I already wrote before, 0.7 does not fix the problem.
But you can apply one of the patches from the head of this bug (they are
all the same); they fix the problem.
Personally I run metalog without console logging to prevent trouble. But as
there still does not seem to be a 0.8 available today, I would prefer if one
of the Gentoo developers could create a metalog-0.7-r2.ebuild which applies
the patch to finally close this bug.

------- Comment #31 From Heinrich Wendel (RETIRED) 2003-10-09 12:22:39 0000 -------
I contacted the developer, he said he forgot about 0.8 but will provide it
soon.

------- Comment #32 From Heinrich Wendel (RETIRED) 2003-10-17 04:31:11 0000 -------
*** Bug 31277 has been marked as a duplicate of this bug. ***

------- Comment #33 From Jonathan Manning 2003-10-24 11:05:41 0000 -------
I don't have console logging enabled, and it still happens. I don't want
to be another "me too", but this seems to be different than most reports
here. Perhaps it happens when I 'tail -f /var/log/everything/current', but
I haven't noticed a strong correlation there.

In addition to my primary machine with Gentoo already set up, I'm having this
same problem with a LiveCD install (1.4-rc4, yes I know 1.4 is out...). Metalog
is the logger for the LiveCD, and it's exhibiting the same hanging symptoms
(that a '/etc/init.d/metalog restart' fixes).

I think my solution is just to move to syslog-ng on my primary machine. I'll
give metalog one more try using "~x86" to see if unstable fixes it first.
I second the request for new 0.7 rev to add this patch.

The issues with the LiveCD are a major showstopper for an install. Is this
fixed in 1.4, or is 1.4 == 1.4-rc4?


------- Comment #34 From Joerg Schaible 2003-10-25 11:17:09 0000 -------
Been hit by this problem, too. Added me to cc. Wanna be informed about anything
new.

------- Comment #35 From Seemant Kulleen (RETIRED) 2003-11-12 13:03:08 0000 -------
heinrich, any further contact with the developer?

------- Comment #36 From Heinrich Wendel (RETIRED) 2003-11-13 05:13:08 0000 -------
not yet, i'll write him another mail

------- Comment #37 From Heinrich Wendel (RETIRED) 2003-11-30 12:44:40 0000 -------
I added a snapshot of the CVS tree (not touched for 6 months); the bug should
be fixed in this version, please test.

------- Comment #38 From Heinrich Wendel (RETIRED) 2003-11-30 12:46:57 0000 -------
(you have to wait for the tgz to be synced to the mirrors)

------- Comment #39 From Joerg Schaible 2003-12-18 13:28:52 0000 -------
Just to give some feedback: I've been running the new version for more than
two weeks now and have had no further occurrence of this issue.

------- Comment #40 From Paul Tötterman 2003-12-19 04:45:38 0000 -------
I'd like to add my experiences to this. With gentoo-sources-2.4.20-r[89] I
haven't had any problems with metalog. But when upgrading to
gentoo-dev-sources-2.6.0-test* and 2.6.0 I've had the same kind of login
failures as described here. And yes, I was using console-logging.

------- Comment #41 From Evan Powers 2004-02-22 11:02:31 0000 -------
I've just started testing kernel 2.6.3 on my system. No problems with my old
2.4 kernel, but the new one exhibits this bug. Console logging enabled of
course. I emerged the ~x86 metalog-0.8_pre20031130; it appears to resolve the
problem.

------- Comment #42 From Thomas R. (TRauMa) 2004-02-22 19:50:04 0000 -------
Currently stable metalog-0.8-CVS WFM. I'm still a bit uncomfortable with
metalog, now that I read all the comments. From the install guide sect. 10:

"Gentoo offers several system loggers to choose from. There are sysklogd, which
is the traditional set of system logging daemons, msyslog, a flexible system
logger with a modularized design, syslog-ng, an advanced system logger, and
metalog which is a highly-configurable system logger.

If you can't choose one, use syslog-ng as it is very powerful yet comes with a
great default configuration. "

Perhaps some words of warning against metalog, now that it seems to be quite
unmaintained?

------- Comment #43 From Richard Scott 2007-04-09 04:23:23 0000 -------
I'm still getting this problem....and I'm using v0.8_rc4

I don't use console logging, but I do execute external bash scripts via
metalog.conf

Can I be of any help to de-bug as this is getting worse for me.

My "emerge --info" is as follows:

Portage 2.1.2.2 (hardened/x86/2.6, gcc-3.4.6, glibc-2.3.6-r5,
2.6.18-hardened-r6 i686)
=================================================================
System uname: 2.6.18-hardened-r6 i686 Intel(R) Pentium(R) 4 CPU 2.80GHz
Gentoo Base System release 1.12.9
Timestamp of tree: Sun, 08 Apr 2007 22:50:01 +0000
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632)
[disabled]
ccache version 2.4 [disabled]
dev-lang/python:     2.3.5-r3, 2.4.3-r4
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache:     2.4-r6
sys-apps/sandbox:    1.2.17
sys-devel/autoconf:  2.13, 2.60
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils:  2.16.1-r3
sys-devel/gcc-config: 1.3.14
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.17-r2
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-march=i686 -O2 -pipe -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /var/bind"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/php/apache1-php5/ext-active/
/etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/
/etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/terminfo
/etc/texmf/web2c"
CXXFLAGS="-march=i686 -O2 -pipe -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks metadata-transfer sandbox sfperms strict"
GENTOO_MIRRORS="ftp://212.219.56.132/sites/www.ibiblio.org/gentoo/
ftp://212.219.56.135/sites/www.ibiblio.org/gentoo/
ftp://212.219.56.138/sites/www.ibiblio.org/gentoo/
http://212.219.56.135/sites/www.ibiblio.org/gentoo/
ftp://ftp.free.fr/mirrors/ftp.gentoo.org/"
LC_ALL="C"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress
--force --whole-file --delete --delete-after --stats --timeout=180
--exclude=/distfiles --exclude=/local --exclude=/packages
--filter=H_**/files/digest-*"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/opt/portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="apache2 apm berkdb bzip2 crypt curl curlwrappers gd gdbm gif gmp gpm
hardened idn innodb jpeg libg++ libwww midi mysql ncurses nls nptl nptlonly pam
pcre perl php pic png python readline session snmp ssl tcpd tetex tiff truetype
winbind x86 xml xml2 xorg zlib" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix
dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter
mulaw multi null plug rate route share shm softvol" ELIBC="glibc"
INPUT_DEVICES="mouse keyboard" KERNEL="linux" LCD_DEVICES="bayrad cfontz
cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU"
Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LDFLAGS, LINGUAS,
PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

------- Comment #44 From SpanKY 2007-04-09 04:31:03 0000 -------
file a new bug please ... the only way to really figure this out is to build
metalog with debugging and when it hangs, attach to the process with gdb and
run a backtrace
