Bug 98303 - samba 3.0.14a-r1 collecting dead smbd processes with mit-krb-1.4.1
Bug#: 98303
Product: Gentoo Linux
Version: unspecified
Platform: x86
OS/Version: Linux
Status: RESOLVED
Severity: major
Priority: P2
Resolution: DUPLICATE
Assigned To: kerberos@gentoo.org
Reported By: frilled@gentoo.org
Component: Applications
Summary: samba 3.0.14a-r1 collecting dead smbd processes with mit-krb-1.4.1
Opened: 2005-07-07 23:01 0000
The Samba log shows free() errors like "*** glibc detected *** free(): invalid
pointer: 0xb7f77e7c ***" for each connection. After that, the smbd subprocess
hangs and can only be killed with "-9".
This seems to correlate with my rebuild of the toolchain after updating to
linux-headers-2.6.11-r2.
Reproducible: Always
Steps to Reproduce:
1. update linux-headers to -2.6.11-r2 (stable)
2. rebuild toolchain, rebuild samba
3. start samba
Actual Results:
Connect (smbclient -L is enough) and watch the number of smbd processes grow.
The smbd log file shows lots of "*** free(): invalid pointer:"
Do /etc/init.d/samba stop and find lots of dead smbd processes.
Expected Results:
Working like before.
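The symptoms described above (one glibc report per connection, a growing number of smbd processes) can be checked with a small helper. This is only a sketch: the function names are made up for illustration, and the log file path passed to count_free_errors is an assumption that depends on the local Samba configuration.

```shell
#!/bin/sh
# Hypothetical helpers for watching the symptoms from this report.

# Count glibc "free(): invalid pointer" reports in a Samba log file.
count_free_errors() {
    # -c prints the number of matching lines, -F treats the pattern literally.
    # grep exits non-zero when nothing matches, so "|| true" keeps the
    # function usable under "set -e" (it still prints 0 in that case).
    grep -cF 'free(): invalid pointer' "$1" || true
}

# Count running smbd processes by exact process name.
count_smbd() {
    # pgrep -x matches the exact name; wc -l counts the PIDs it prints.
    pgrep -x smbd | wc -l
}
```

Running count_smbd before and after a few "smbclient -L" connections should show whether processes are piling up.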
I cannot yet confirm this but it is a serious issue. I'll try to downgrade the
toolchain again :/
x86: 3.0.14a using 2.6.11-r2 (with 2.6.8.* toolchain) isn't affected
Forgot to say I am on x86 with NPTL.
Well, must be a different problem, then. I rebuilt toolchain based on 2.6.8 (and
samba, just in case) without recognizable change.
Since the problem started occurring a couple of days ago (can't say exactly since
it just became more and more sluggish until I finally felt I should look at
things) there is not too much that might have changed.
Reverting to kernel 2.6.11-r11 from 2.6.12-r4 did not solve the problem, either.
I'll investigate further :/
I'd like a try with the previous mit-krb5, but I see now that it was removed
from portage. Could you try with USE='-kerberos'? (just for info: you can save
the installed package in /usr/portage/packages with 'quickpkg samba'. Copy it to
a proper location, since the next time you run quickpkg, it will be overwritten)
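The advice about copying the quickpkg result somewhere safe could be scripted roughly as follows. This is a sketch only: backup_pkg is a hypothetical helper, and the package file name and backup directory in the usage comment are assumptions, not paths taken from this report.

```shell
#!/bin/sh
# Hypothetical helper: copy a binary package to a backup directory under a
# timestamped name, so a later quickpkg run cannot overwrite the copy.
backup_pkg() {
    src=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    stamp=$(date +%Y%m%d-%H%M%S)
    target="$dest_dir/$(basename "$src").$stamp"
    # Preserve mode and timestamps, and print the path of the copy on success.
    cp -p "$src" "$target" && printf '%s\n' "$target"
}

# Example usage (paths are assumptions):
#   quickpkg samba
#   backup_pkg /usr/portage/packages/<package file> /root/pkg-backup
```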
I have a suspicion about mit-krb5, too. But I have a major problem trying
without kerberos, because I can't access domain resources any longer, then.
Which basically means samba would be as useless as it is right now.
Well, 1.3.6-r2 is still in portage. Maybe I'll try on Monday.
This probably means I am going through libcom_err.so hell again :(
Well, what gives. :(
Ok, I tried it on a non-production server. It really seems to be kerberos. If I
connect to the machine with an anonymous "smbclient -L //machine" and no
password, everything is okay.
As soon as I connect with "smbclient -k -L //machine", I get a hanging smbd.
Well, the good news is: after downgrading to mit-krb5-1.3.6-r2, the described
behaviour is fixed.
Unfortunately though, I can no longer access the shares on my box (neither from
Linux nor from _the_thing_I_will_not_name_).
At least "net ads join" now works without error messages. Ho, ho, ho :/
Got it working again. winbindd going rogue, but now it's as good as before
(there is actually *nothing* good about CIFS/SMB with the exception that it
opens up the friggin' sealed Billyworld).
But there were security issues with mit-krb5 < 1.4, I remember. (?)
wgi, what did you do to get it working?
Masked >=mit-krb-1.4.1 and downgraded to 1.3.6-r2 (security hole!)
rebuilt everything (well, most of, I hope) that linked against the crippled
libcom_err.so, like samba and openssh
Rebooted the system since I had processes hanging so hard I couldn't kill them.
Started samba built against old mit-krb5, and just to be sure took my boxes out
of the domain and joined them again.
Now, any idea how to make it work with mit-krb-1.4.1?
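The masking step of the workaround above might look like the sketch below. mask_atom is a hypothetical helper; the category app-crypt/ and the mask file location /etc/portage/package.mask are assumptions based on usual Gentoo conventions and are not stated in this report. Keep in mind that the downgrade reintroduces the security hole mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: append a package atom to a mask file unless that exact
# line is already present, so repeated runs do not duplicate the entry.
mask_atom() {
    atom=$1
    mask_file=$2
    # -x: whole-line match, -F: literal string, -q: no output.
    # "--" stops option parsing; stderr is silenced for a missing file.
    if ! grep -qxF -- "$atom" "$mask_file" 2>/dev/null; then
        printf '%s\n' "$atom" >> "$mask_file"
    fi
}

# Example usage (atom category and file path are assumptions):
#   mask_atom '>=app-crypt/mit-krb5-1.4.1' /etc/portage/package.mask
#   emerge mit-krb5        # now resolves to the unmasked 1.3.6 series
#   emerge samba openssh   # rebuild packages linked against libcom_err.so
```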
Still does not work with mit-krb-1.4.1-r2; same behaviour.
*** Bug 98848 has been marked as a duplicate of this bug. ***
wgi -- any chance you and I can talk in real time on IRC? come to freenode so
we can please?
(In reply to comment #14)
> wgi -- any chance you and I can talk in real time on IRC? come to freenode so
> we can please?
I'm on quakenet all day. But just gimme a server, I'll try.
mit-krb5-1.3.6-r3 confirmed working. More info to come.
Ok, rebuilt everything again including new OpenLDAP version.
Behaviour is still the same.
Interestingly, I am not really able to reproduce the problem when I run
"smbd -i". The glibc error occurs, but the process bails out to my shell. If I
just start samba from the initscript, it collects dead processes.
Last lines of output from smbd -i with log level 10 after doing
'smbclient -k -L //mymachine':
timeout_processing: End of file from client (client has disconnected).
Closing cache file
namecache_shutdown: netbios namecache closed successfully.
tallocs left:
global talloc allocations in pid: 22412
name                                       chunks    bytes
---------------------------------------- -------- --------
end_description                                 1      159
passdb internal SAM_ACCOUNT allocation         18      514
pdb_context internal allocation context         5     1521
passdb internal SAM_ACCOUNT allocation          8      399
---------------------------------------- -------- --------
TOTAL                                          32     2593
setting sec ctx (0, 0) - sec_ctx_stack_ndx = 0
NT user token: (NULL)
UNIX token of user 0
Primary group is 0 and contains 0 supplementary groups
change_to_root_user: now uid=(0,0) gid=(0,0)
Closing connections
attempting to free (and zero) a server_info structure
Yielding connection to
receive_local_message: doing select with timeout of 1 ms
Server exit (normal exit)
*** glibc detected *** free(): invalid pointer: 0xb7f90cbc ***
===============================================================
INTERNAL ERROR: Signal 6 in pid 22412 (3.0.14a)
Please read the appendix Bugs of the Samba HOWTO collection
===============================================================
PANIC: internal error
BACKTRACE: 18 stack frames:
#0 smbd(smb_panic2+0x100) [0x820e154]
#1 smbd(smb_panic+0x19) [0x8211589]
#2 smbd [0x81f86ea]
#3 [0xffffe420]
#4 /lib/tls/libc.so.6(abort+0x1d0) [0xb7c503e8]
#5 /lib/tls/libc.so.6(posix_memalign+0) [0xb7c8a4fd]
#6 /lib/tls/libc.so.6 [0xb7c8920d]
#7 /lib/tls/libc.so.6(__libc_free+0x8c) [0xb7c8814b]
#8 /lib/libcom_err.so.2(remove_error_table+0x55) [0xb7ee0c58]
#9 /usr/lib/libkrb5.so.3 [0xb7f2c752]
#10 /usr/lib/libkrb5.so.3 [0xb7f2c43b]
#11 /usr/lib/libkrb5.so.3 [0xb7f884a6]
#12 /lib/ld-linux.so.2 [0xb7ff6ab4]
#13 /lib/tls/libc.so.6(exit+0x59) [0xb7c51531]
#14 smbd(exit_server+0x21b) [0x8298cbe]
#15 smbd(main+0x524) [0x82993ab]
#16 /lib/tls/libc.so.6(__libc_start_main+0xed) [0xb7c3c15d]
#17 smbd [0x8079501]
Aborted
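To see at a glance which shared libraries are involved in a backtrace like the one above, the frames can be reduced to a list of library paths. This is a sketch only: backtrace_libs is a hypothetical helper, and it assumes the frame layout shown in this log ("#N path(symbol+offset) [address]") with the backtrace saved to a file.

```shell
#!/bin/sh
# Hypothetical helper: print each shared library that appears in a saved
# smbd backtrace, once, in order of first appearance.
backtrace_libs() {
    # Keep frames whose second field is an absolute path, strip any
    # "(symbol+offset)" suffix, and skip libraries already printed.
    awk '$1 ~ /^#[0-9]+$/ && $2 ~ /^\// {
        lib = $2
        sub(/\(.*/, "", lib)
        if (!(lib in seen)) { seen[lib] = 1; print lib }
    }' "$1"
}
```

For the trace above this lists libc, libcom_err, libkrb5 and ld-linux, which matches what frames #7 to #13 show: the invalid free() happens in remove_error_table of libcom_err, reached from libkrb5 code that runs during exit().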
libcom_err?
Preliminary info:
ss-1.38 + com_err-1.38 + mit-krb5-1.4.1-r2 => works
Okay, 16 hours and some stress testing later, I'd say it is stable. I
upgraded to
- sys-libs/ss-1.38
- sys-libs/com_err-1.38
after having built mit-krb-1.4.1-r2
and samba is up and running. Case solved for me at least.
Thank you, Seemant!
Thanks for helping me fix it, WGi :)
*** This bug has been marked as a duplicate of 95283 ***