Bug 98303 - samba 3.0.14a-r1 collecting dead smbd processes with mit-krb-1.4.1
Status: RESOLVED DUPLICATE of bug 95283
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: x86 Linux
Importance: High major
Assignee: Gentoo Kerberos Maintainers
URL:
Whiteboard:
Keywords:
Duplicates: 98848
Depends on:
Blocks:
 
Reported: 2005-07-07 23:01 UTC by Wolf Giesen (RETIRED)
Modified: 2005-07-18 07:53 UTC (History)
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Description Wolf Giesen (RETIRED) gentoo-dev 2005-07-07 23:01:10 UTC
The Samba log shows free() errors like "*** glibc detected *** free(): invalid
pointer: 0xb7f77e7c ***" for each connection. After that, the smbd subprocess
hangs and can only be killed with "kill -9".

This seems to correlate with my rebuild of the toolchain after updating to
linux-headers-2.6.11-r2.


Reproducible: Always
Steps to Reproduce:
1. update linux-headers to -2.6.11-r2 (stable)
2. rebuild toolchain, rebuild samba
3. start samba

Actual Results:  
Connect (smbclient -L is enough) and watch the number of smbd processes grow.

The smbd log file shows lots of "*** free(): invalid pointer:" messages.

Run /etc/init.d/samba stop and find lots of dead smbd processes left behind.
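
To put a number on the growing process count described above, a small helper can tally smbd entries in a process listing. This is a minimal sketch and not part of the original report: the function name `count_smbd` is made up for illustration, and it assumes three-column input (PID, state, command name).

```shell
# Hypothetical helper: count smbd entries in a process listing read from stdin.
# Expected input: one process per line as "PID STATE COMMAND".
count_smbd() {
    awk '$3 == "smbd" { n++ } END { print n + 0 }'
}
```

Feeding it the output of `ps -e -o pid= -o stat= -o comm=` before and after a few `smbclient -L` connections should show whether the count keeps climbing.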


Expected Results:  
Working like before.

I cannot yet confirm the toolchain as the cause, but this is a serious issue.
I'll try to downgrade the toolchain again :/
Comment 1 Christian Andreetta (RETIRED) gentoo-dev 2005-07-08 07:26:44 UTC
x86: 3.0.14a using 2.6.11-r2 (with 2.6.8.* toolchain) isn't affected
Comment 2 Wolf Giesen (RETIRED) gentoo-dev 2005-07-08 07:46:11 UTC
Forgot to say I am on x86 with NPTL.

Well, must be a different problem, then. I rebuilt the toolchain based on 2.6.8
(and samba, just in case) without any noticeable change.

Since the problem started occurring a couple of days ago (can't say exactly,
since it just became more and more sluggish until I finally felt I should look
at things), there is not too much that might have changed.

Reverting to kernel 2.6.11-r11 from 2.6.12-r4 did not solve the problem, either.

I'll investigate further :/
Comment 3 Christian Andreetta (RETIRED) gentoo-dev 2005-07-08 08:24:53 UTC
I'd like to try with the previous mit-krb5, but I see now that it was removed
from portage. Could you try with USE='-kerberos'? (Just for info: you can save
the installed package in /usr/portage/packages with 'quickpkg samba'. Copy it to
a proper location, since the next time you run quickpkg it will be overwritten.)
Comment 4 Wolf Giesen (RETIRED) gentoo-dev 2005-07-08 08:30:52 UTC
I have a suspicion about mit-krb5, too. But I have a major problem trying
without kerberos, because I can't access domain resources any longer, then.
Which basically means samba would be as useless as it is right now.
Comment 5 Wolf Giesen (RETIRED) gentoo-dev 2005-07-08 08:35:41 UTC
Well, 1.3.6-r2 is still in portage. Maybe I'll try on Monday.
This probably means I am going through libcom_err.so hell again :(

Well, what gives. :(
Comment 6 Wolf Giesen (RETIRED) gentoo-dev 2005-07-10 22:22:21 UTC
Ok, I tried it on a non-production server. It really seems to be kerberos. If I
connect to the machine with an anonymous "smbclient -L //machine" and no
password, everything is okay.

As soon as I connect with "smbclient -k -L //machine", I get a hanging smbd.
Comment 7 Wolf Giesen (RETIRED) gentoo-dev 2005-07-10 23:25:22 UTC
Well, the good news is: after downgrading to mit-krb5-1.3.6-r2, the described
behaviour is fixed.

Unfortunately though, I can no longer access the shares on my box (neither from
Linux nor from _the_thing_I_will_not_name_).

At least "net ads join" now works without error messages. Ho, ho, ho :/
Comment 8 Wolf Giesen (RETIRED) gentoo-dev 2005-07-11 00:17:56 UTC
Got it working again. winbindd going rogue, but now it's as good as before
(there is actually *nothing* good about CIFS/SMB with the exception that it
opens up the friggin' sealed Billyworld).

But there were security issues with mit-krb5 < 1.4, I remember. (?)
Comment 9 Seemant Kulleen (RETIRED) gentoo-dev 2005-07-11 07:25:19 UTC
wgi, what did you do to get it working?
Comment 10 Wolf Giesen (RETIRED) gentoo-dev 2005-07-11 08:49:43 UTC
Masked >=mit-krb5-1.4.1 and downgraded to 1.3.6-r2 (security hole!).

Rebuilt everything (well, most of it, I hope) that linked against the crippled
libcom_err.so, like samba and openssh.

Rebooted the system since I had processes hanging so hard I couldn't kill them.

Started samba built against the old mit-krb5, and just to be sure took my boxes
out of the domain and joined them again.
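
The masking step above could be scripted roughly as follows. This is a sketch under assumptions: the helper name `mask_atom` is invented here, `>=app-crypt/mit-krb5-1.4.1` is the category-qualified form of the package named in this bug, and `/etc/portage/package.mask` is assumed to be a plain file (on some systems it is a directory).

```shell
# Hypothetical helper: append a package atom to a mask file unless it is already there.
# $1 = atom to mask, $2 = mask file (assumed default: /etc/portage/package.mask)
mask_atom() {
    atom=$1
    file=${2:-/etc/portage/package.mask}
    # -x: whole-line match, -F: treat the atom as a fixed string, not a regex
    if ! grep -qxF -e "$atom" "$file" 2>/dev/null; then
        printf '%s\n' "$atom" >> "$file"
    fi
}
```

Called as `mask_atom '>=app-crypt/mit-krb5-1.4.1'` and followed by `emerge --oneshot '=app-crypt/mit-krb5-1.3.6-r2'` plus a rebuild of samba and openssh, this would mirror the downgrade described in this comment.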
Comment 11 Wolf Giesen (RETIRED) gentoo-dev 2005-07-12 22:01:18 UTC
Now, any idea how to make it work with mit-krb5-1.4.1?
Comment 12 Wolf Giesen (RETIRED) gentoo-dev 2005-07-13 02:09:40 UTC
Still does not work with mit-krb5-1.4.1-r2; same behaviour.
Comment 13 Seemant Kulleen (RETIRED) gentoo-dev 2005-07-13 04:31:29 UTC
*** Bug 98848 has been marked as a duplicate of this bug. ***
Comment 14 Seemant Kulleen (RETIRED) gentoo-dev 2005-07-13 05:00:01 UTC
wgi -- any chance you and I can talk in real time on IRC?  come to freenode so
we can please?
Comment 15 Wolf Giesen (RETIRED) gentoo-dev 2005-07-13 05:06:06 UTC
(In reply to comment #14)
> wgi -- any chance you and I can talk in real time on IRC?  come to freenode so
> we can please?

I'm on quakenet all day. But just gimme a server, I'll try.
Comment 16 Wolf Giesen (RETIRED) gentoo-dev 2005-07-13 06:05:21 UTC
mit-krb5-1.3.6-r3 confirmed working. More info to come.
Comment 17 Wolf Giesen (RETIRED) gentoo-dev 2005-07-13 23:47:51 UTC
Ok, rebuilt everything again including new OpenLDAP version.

Behaviour is still the same.

Interestingly, I am not really able to reproduce the problem when I run 
"smbd -i". The glibc error occurs, but the process bails out to my shell. If I
just start samba from the initscript, it collects dead processes.
Comment 18 Wolf Giesen (RETIRED) gentoo-dev 2005-07-13 23:54:30 UTC
Last lines of output from smbd -i with log level 10 after doing 'smbclient -k -L
//mymachine':




timeout_processing: End of file from client (client has disconnected).
Closing cache file
namecache_shutdown: netbios namecache closed successfully.
tallocs left:
global talloc allocations in pid: 22412
name                                       chunks    bytes
---------------------------------------- -------- --------
end_description                                 1      159
passdb internal SAM_ACCOUNT allocation         18      514
pdb_context internal allocation context         5     1521
passdb internal SAM_ACCOUNT allocation          8      399
---------------------------------------- -------- --------
TOTAL                                          32     2593

setting sec ctx (0, 0) - sec_ctx_stack_ndx = 0
NT user token: (NULL)
UNIX token of user 0
Primary group is 0 and contains 0 supplementary groups
change_to_root_user: now uid=(0,0) gid=(0,0)
Closing connections
attempting to free (and zero) a server_info structure
Yielding connection to 
receive_local_message: doing select with timeout of 1 ms
Server exit (normal exit)
*** glibc detected *** free(): invalid pointer: 0xb7f90cbc ***
===============================================================
INTERNAL ERROR: Signal 6 in pid 22412 (3.0.14a)
Please read the appendix Bugs of the Samba HOWTO collection
===============================================================
PANIC: internal error
BACKTRACE: 18 stack frames:
 #0 smbd(smb_panic2+0x100) [0x820e154]
 #1 smbd(smb_panic+0x19) [0x8211589]
 #2 smbd [0x81f86ea]
 #3 [0xffffe420]
 #4 /lib/tls/libc.so.6(abort+0x1d0) [0xb7c503e8]
 #5 /lib/tls/libc.so.6(posix_memalign+0) [0xb7c8a4fd]
 #6 /lib/tls/libc.so.6 [0xb7c8920d]
 #7 /lib/tls/libc.so.6(__libc_free+0x8c) [0xb7c8814b]
 #8 /lib/libcom_err.so.2(remove_error_table+0x55) [0xb7ee0c58]
 #9 /usr/lib/libkrb5.so.3 [0xb7f2c752]
 #10 /usr/lib/libkrb5.so.3 [0xb7f2c43b]
 #11 /usr/lib/libkrb5.so.3 [0xb7f884a6]
 #12 /lib/ld-linux.so.2 [0xb7ff6ab4]
 #13 /lib/tls/libc.so.6(exit+0x59) [0xb7c51531]
 #14 smbd(exit_server+0x21b) [0x8298cbe]
 #15 smbd(main+0x524) [0x82993ab]
 #16 /lib/tls/libc.so.6(__libc_start_main+0xed) [0xb7c3c15d]
 #17 smbd [0x8079501]
Aborted




libcom_err?
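
The guess at libcom_err comes from reading the backtrace by hand: the frame listed directly after `__libc_free` is the caller that handed free() the bad pointer. A minimal sketch of pulling that frame out of a saved log, assuming the frame layout shown above (`caller_of_free` is an illustrative name, not a Samba or glibc tool):

```shell
# Hypothetical helper: print the stack frame that called free(),
# i.e. the line right after the __libc_free frame in a glibc-style backtrace.
caller_of_free() {
    awk '
        found { sub(/^ *#[0-9]+ +/, ""); print; exit }   # strip the "#N " prefix
        /__libc_free/ { found = 1 }
    '
}
```

Run against the log above, this picks out the `/lib/libcom_err.so.2(remove_error_table+0x55)` frame, which is what points the finger at libcom_err.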
Comment 19 Wolf Giesen (RETIRED) gentoo-dev 2005-07-14 06:14:50 UTC
Preliminary info:

ss-1.38 + com_err-1.38 + mit-krb5-1.4.1-r2 => works
Comment 20 Wolf Giesen (RETIRED) gentoo-dev 2005-07-14 21:59:07 UTC
Okay, 16 hours and some stress testing later I'd say it is stable. So,

upgraded to
- sys-libs/ss-1.38
- sys-libs/com_err-1.38

after having built mit-krb5-1.4.1-r2

and samba is up and running. Case solved for me at least.

Thank you, Seemant!
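
Since the fix came down to which com_err library was installed, it can help to check which libcom_err a binary actually resolves at run time. A sketch, assuming `ldd`-style input lines of the form `name => /path (address)`; the name `com_err_paths` is illustrative only:

```shell
# Hypothetical helper: list the distinct libcom_err paths found in ldd output on stdin.
com_err_paths() {
    awk '/libcom_err\.so/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^\//) { print $i; break }   # first absolute path on the line
    }' | sort -u
}
```

For example, `ldd /usr/sbin/smbd /usr/lib/libkrb5.so.3 | com_err_paths` (the smbd path is an assumption; the libkrb5 path is the one from the backtrace) prints one line per distinct library path, so more than one line would mean the two binaries resolve different copies.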
Comment 21 Seemant Kulleen (RETIRED) gentoo-dev 2005-07-18 07:53:43 UTC
thanks for helping me fix, WGi :)

*** This bug has been marked as a duplicate of 95283 ***