Bug 101482 - Infinite loop in com_err-1.38.
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Library
Hardware: All Linux
Importance: High major
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-08-05 13:43 UTC by Fredrik Tolf
Modified: 2005-11-27 22:46 UTC
CC List: 1 user

See Also:
Package list:
Runtime testing required: ---


Description Fredrik Tolf 2005-08-05 13:43:38 UTC
Ever since sometime after the separation of com_err and mit-krb5, I've been
experiencing that some processes that use Kerberos hang, while staying runnable
all the time (smells like an infinite loop). In particular, this goes for
saslauthd, but I've seen it in imapd as well.

I attached gdb to such a process while hung in this manner, and it seems that
the error lies in libcom_err. I currently have com_err-1.38 installed. The stack
trace is very long, but here's the top of it:
#0  0xb7ef1a96 in error_message () from /lib/libcom_err.so.2
#1  0xb7f78468 in krb5_locate_kdc () from /usr/lib/libkrb5.so.3
#2  0xb7f77c1b in krb5int_locate_server () from /usr/lib/libkrb5.so.3
[...]

I tried to go further, but the following output from gdb confuses me, since
there is no file called auth.c in e2fsprogs:
[...]
(gdb) f 0
#0  0xb7ef1a96 in error_message () from /lib/libcom_err.so.2
(gdb) list
192     auth.c: No such file or directory.
        in auth.c

The "error_message" function appears to be defined in lib/et/error_message.c,
but it's not even 192 lines long. Of course, I didn't emerge any of the
associated libraries or programs with debugging, so it may well be
understandable that debugging doesn't work perfectly...
Comment 1 Fredrik Tolf 2005-08-08 13:31:42 UTC
I have re-emerged com_err with FEATURES="noclean nostrip" and CFLAGS="-g", and
I've managed to get some more debugging info from an ipop3d process that died
from this.

Apparently, the initial debugging info was, as I suspected, wrong, and the loop
is taking place in error_message.c rather than "auth.c" (whatever that came
from). The actual loop in question is the one from lines 57 to 65 in that file:

    for (et = _et_list; et; et = et->next) {
        if (et->table->base == table_num) {
            /* This is the right table */
            if (et->table->n_msgs <= offset)
                goto oops;
            return(et->table->msgs[offset]);
        }
    }

Apparently, the first element of the _et_list points to itself:
(gdb) p _et_list
$5 = (struct et_list *) 0xb7ba1bc8
(gdb) p *_et_list
$6 = {next = 0xb7ba1bc8, table = 0xb7b9c8c4}

In other words, _et_list and _et_list->next both point to the same address, so
"et = et->next" never reaches NULL, and the loop spins forever whenever
table_num does not match the base of that first table.
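
To make the failure mode concrete, here is a minimal sketch of the same walk over a node whose next pointer refers to itself. The struct layouts are simplified stand-ins based only on the field names visible in the gdb output, and find_table, max_steps and demo_self_loop are hypothetical names for this illustration, not part of libcom_err. The step cap exists only so that the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the lib/et types (assumed layout, illustration only). */
struct error_table {
    long base;
    int n_msgs;
    const char *const *msgs;
};

struct et_list {
    struct et_list *next;
    const struct error_table *table;
};

/*
 * Same traversal as the quoted loop, but it gives up after max_steps nodes.
 * Returns 1 if a table with the given base is found, 0 if the list ends
 * normally, and -1 if the step cap is hit (i.e. the list is probably cyclic).
 */
static int find_table(const struct et_list *head, long table_num, int max_steps)
{
    const struct et_list *et;
    int steps = 0;

    assert(max_steps > 0);
    for (et = head; et; et = et->next) {
        if (et->table->base == table_num)
            return 1;
        if (++steps >= max_steps)
            return -1;
    }
    return 0;
}

/* Builds a one-node list shaped like the gdb output: next == the node itself. */
static int demo_self_loop(long table_num)
{
    static const struct error_table tbl = { 1000, 0, NULL };
    struct et_list node = { NULL, &tbl };

    node.next = &node;
    return find_table(&node, table_num, 1000);
}
```

With this setup, looking up base 1000 returns immediately because the first table matches, while looking up any other base only stops because of the cap. Without the cap, as in the original loop, it would never stop.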

That's all I have for now. Hopefully I'll be able to find out how _et_list
ended up like this in the first place.
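
One way a singly linked list can end up in this state is if the same node is pushed onto the head twice. This is only a hypothesis: nothing in this report confirms that this is what happened in libcom_err. The sketch below uses a stripped-down et_list and hypothetical helpers (push_unchecked, push_once) to show the mechanism and a simple guard against it.

```c
#include <assert.h>
#include <stddef.h>

/* Stripped-down stand-in for the list node (assumed layout, illustration only). */
struct et_list {
    struct et_list *next;
    const void *table;
};

/* Naive head insertion: pushing a node that is already the head makes it point to itself. */
static void push_unchecked(struct et_list **head, struct et_list *node)
{
    assert(head != NULL && node != NULL);
    node->next = *head;
    *head = node;
}

/* Guarded insertion: skips nodes that are already on the (still finite) list. */
static int push_once(struct et_list **head, struct et_list *node)
{
    struct et_list *et;

    for (et = *head; et; et = et->next)
        if (et == node)
            return 0;   /* already registered, leave the list untouched */
    push_unchecked(head, node);
    return 1;
}

/* Returns 1 if pushing the same node twice without a check creates a self-loop. */
static int double_push_makes_self_loop(void)
{
    struct et_list *head = NULL;
    struct et_list node = { NULL, NULL };

    push_unchecked(&head, &node);   /* node.next = NULL, head = &node */
    push_unchecked(&head, &node);   /* node.next = &node, head = &node */
    return head == &node && node.next == &node;
}

/* Returns 1 if the guarded variant keeps the list NULL-terminated. */
static int guarded_push_keeps_list_finite(void)
{
    struct et_list *head = NULL;
    struct et_list node = { NULL, NULL };

    push_once(&head, &node);
    push_once(&head, &node);        /* second call is a no-op */
    return head == &node && node.next == NULL;
}
```

The resulting state in the unchecked case has the same shape as the $6 output above: the head and its next field hold the same address.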
Comment 2 SpanKY gentoo-dev 2005-08-31 21:01:49 UTC
i think ive seen mention of this before ... could you try contacting the
upstream author please ?  tytso@alum.mit.edu
Comment 3 Fredrik Tolf 2005-11-27 13:18:23 UTC
For unknown reasons, this doesn't happen for me anymore. Maybe upstream fixed it?
Comment 4 SpanKY gentoo-dev 2005-11-27 22:46:43 UTC
dunno ... there have been fixes incorporated which bring com_err up to speed
with the version used in mit-krb