Bug 189578 - net-misc/openssh-4.6_p1-r3 breaks LDAP auth
Summary: net-misc/openssh-4.6_p1-r3 breaks LDAP auth
Status: RESOLVED UPSTREAM
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: High major
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-08-20 11:44 UTC by vad3R
Modified: 2007-09-04 15:33 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Ldap query log (ldap_log.txt, 14.72 KB, text/plain)
2007-08-20 16:49 UTC, vad3R
SSHD debug log (ltrace) (debug_log, 6.21 KB, text/plain)
2007-08-24 15:37 UTC, vad3R

Description vad3R 2007-08-20 11:44:44 UTC
After upgrading openssh from 4.5_p1-r1p to 4.6_p1-r3, LPK LDAP authentication is broken. The update includes changes to pam.d/sshd. Even with this config update applied rather than ignored, nothing works anymore. It is still possible to switch to an LDAP account after logging in as root.

Reproducible: Always

Steps to Reproduce:
1. Log in with an LDAP account
2. See how it fails

Actual Results:  
No login is granted. The auth.log shows that the user doesn't belong to the group defined. The group membership is correct.

Expected Results:  
Login should be permitted

Here's the sshd_config:

UseLPK yes
#LpkLdapConf /etc/ldap.conf
LpkServers  ldap://1.2.3.4
LpkUserDN   ou=users,ou=dmz-auth,o=dc,c=de,dc=company,dc=com
LpkGroupDN  ou=groups,ou=dmz-auth,o=dc,c=de,dc=company,dc=com
LpkBindDN   cn=Auth,dc=company,dc=com
LpkBindPw   Just_a_poor_password
LpkServerGroup www1
LpkForceTLS  no
LpkSearchTimelimit 3
LpkBindTimelimit 3

Here are example messages from the auth.log

Aug 20 13:26:12 www1 sshd[12622]: [LDAP] 'dkerwin' is not in 'admin'
Aug 20 13:26:12 www1 sshd[12622]: [LDAP] 'dkerwin' is not in 'admin'
Aug 20 13:26:12 www1 sshd[12622]: [LDAP] 'dkerwin' is not in 'admin' 
Aug 20 13:26:12 www1 sshd[12622]: [LDAP] 'dkerwin' is not in 'admin'
Comment 1 Rob Holland 2007-08-20 13:05:52 UTC
It's not clear to me how a config with 'LpkServerGroup www1' would result in an error message regarding 'admin'.

This is definitely the config which is being used on the failing machine?
Comment 2 vad3R 2007-08-20 13:24:49 UTC
You're right. I was experimenting and created a test group admin. It should be "Aug 20 13:26:12 www1 sshd[12622]: [LDAP] 'dkerwin' is not in 'ww1'". Sorry for that mistake.
Comment 3 Rob Holland 2007-08-20 15:38:29 UTC
Sorry to be picky, but you meant: www1 right? Your last comment shows the line as:

Aug 20 13:26:12 www1 sshd[12622]: [LDAP] 'dkerwin' is not in 'ww1'

Assuming that was a typo when commenting here and not in the config:

Please show the output of: ldapsearch <whatever options you need to auth here> "(&(objectClass=posixGroup)(cn=www1)(memberUid=dkerwin))"

and:

ldapsearch <whatever options you need to auth here> "(&(objectClass=posixAccount)(objectClass=ldapPublicKey)(uid=dkerwin))"

censoring as required.

LpkServerGroup is functioning OK locally here with the latest version of the patch, so it's not clear where the problem is at the moment.
Comment 4 vad3R 2007-08-20 16:18:23 UTC
# ldapsearch -h 1.2.3.4 -D "cn=User,dc=company,dc=com" -W -x "(&(objectClass=posixGroup)(cn=www1)(memberUid=dkerwin))"
Enter LDAP Password: 
# extended LDIF
#
# LDAPv3
# base <> with scope subtree
# filter: (&(objectClass=posixGroup)(cn=www1)(memberUid=dkerwin))
# requesting: ALL
#

# www1, groups, dmz-auth, company.com
dn: cn=www1,ou=groups,ou=dmz-auth,dc=company,dc=com
cn: www1
gidNumber: 1100
description: SSH login group for www1
objectClass: posixGroup
objectClass: top
memberUid: dkerwin

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1


And here's the user query.

# dkerwin, users, dmz-auth, company.com
dn: uid=dkerwin,ou=users,ou=dmz-auth,dc=company,dc=com
uid: dkerwin
mail: john.doe@noname.de
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
objectClass: posixAccount
objectClass: top
objectClass: shadowAccount
objectClass: ldapPublicKey
shadowMax: 99999
shadowWarning: 7
homeDirectory: /home/dkerwin
gecos: Daniel Kerwin
sshPublicKey: ssh-dss xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx .......
uidNumber: 12345
cn: Daniel Kerwin
loginShell: /usr/local/bin/my_shell
shadowLastChange: 1234567
gidNumber: 12345
sn: dkerwin

Comment 5 Rob Holland 2007-08-20 16:25:56 UTC
The LpkUserDN/LpkGroupDN don't match the results of the query you showed. You have 'o=dc,c=de,' in there. Assuming that also may be a transcription error, can you please show query logs from the LDAP server while a login is attempted? I cannot see why ldapsearch would return valid answers and LPK not :/
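
The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of LPK: `dn_under_base` is a hypothetical helper that does a plain case-insensitive suffix comparison, and `actual_base` is an assumption read off the `dn:` line in the ldapsearch output from comment 4. It does not normalise spacing or escaped characters the way a real LDAP library would.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DN $1 lies under search base $2.
# Plain case-insensitive string comparison; no DN normalisation.
dn_under_base() {
    _dn=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    _base=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    case "$_dn" in
        "$_base"|*,"$_base") return 0 ;;
        *) return 1 ;;
    esac
}

# DN returned by ldapsearch for the user (comment 4).
entry_dn='uid=dkerwin,ou=users,ou=dmz-auth,dc=company,dc=com'
# LpkUserDN as posted in the bug description.
posted_base='ou=users,ou=dmz-auth,o=dc,c=de,dc=company,dc=com'
# Assumption: the base implied by the ldapsearch result.
actual_base='ou=users,ou=dmz-auth,dc=company,dc=com'

for candidate in "$posted_base" "$actual_base"; do
    if dn_under_base "$entry_dn" "$candidate"; then
        echo "match:    $candidate"
    else
        echo "mismatch: $candidate"
    fi
done
```

If the posted LpkUserDN really is what the server uses, a subtree search under it could not return the entry shown, which is why the query logs are worth comparing.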
Comment 6 vad3R 2007-08-20 16:49:56 UTC
Created attachment 128701 [details]
Ldap query log
Comment 7 vad3R 2007-08-20 16:50:52 UTC
What's really strange is that if I switch to the user, I can see that all group memberships are fine. :-(
Comment 8 Rob Holland 2007-08-22 12:15:07 UTC
Well, lpk and pam/nss_ldap are not using the same code, so it's possible for them to disagree. Having said that, I cannot see a reason for the failure you see I'm afraid. I've no real clue where to go from here.. :/ It's definitely the upgrade that broke it? Can you try reverting and see if that works?
Comment 9 vad3R 2007-08-22 12:21:44 UTC
I can revert, but I still have servers running the "old version" with the identical config and they work. As I mentioned earlier, this only fails with SSH. The only change to the systems was an openssh update :-(
Comment 10 Rob Holland 2007-08-22 12:33:59 UTC
Ok, please install ltrace and try this, replacing paths/filenames as required:

ltrace -l $(ldd /usr/sbin/sshd|grep ldap|cut -d' ' -f1) /usr/sbin/sshd -d -p 8022 > /tmp/debug_log

Try to connect to the sshd on port 8022, then please send me a copy of the debug_log file, either attach it to the bug or mail me it.
Comment 11 vad3R 2007-08-22 15:28:00 UTC
Ok. I'll do this but have to wait until tomorrow. I'll attach the file asap.

Thanks
Comment 12 vad3R 2007-08-24 15:37:46 UTC
Created attachment 129079 [details]
SSHD debug log (ltrace)

SSHD debug log (ltrace)
Comment 13 Rob Holland 2007-08-24 16:04:31 UTC
The logs show the ldap query timing out. Please try increasing LpkSearchTimelimit and LpkBindTimelimit and see if that changes anything.
Comment 14 vad3R 2007-08-24 16:33:27 UTC
I changed both time limits to 60 seconds. It doesn't change anything. The password prompt appears at most 2 seconds after starting the SSH session, so it doesn't look like a timeout to me...
Comment 15 vad3R 2007-08-27 12:35:21 UTC
I did some testing today and maybe it has to do with a change to /etc/pam.d/sshd which is included in the update. This is the pam.d/sshd config before the update (where it worked):

auth       include      system-auth
auth       required     pam_shells.so
auth       required     pam_nologin.so
account    include      system-auth
password   include      system-auth
session    include      system-auth

and here the config after the update:

auth       required     pam_shells.so
auth       required     pam_nologin.so
auth       include      system-auth
account    include      system-auth
password   include      system-auth
session    include      system-auth

Here's my system-auth:

auth       required     pam_env.so
auth       sufficient   pam_unix.so likeauth nullok
auth       sufficient   /lib/security/$ISA/pam_ldap.so use_first_pass
auth       required     pam_deny.so

account    required     pam_unix.so
account    sufficient   pam_ldap.so

password   required     pam_cracklib.so difok=2 minlen=8 dcredit=2 ocredit=2 retry=3
password   sufficient   pam_unix.so nullok md5 shadow use_authtok
password   required     pam_ldap.so use_authtok
password   required     pam_deny.so

session    required     pam_limits.so
session    required     pam_unix.so
session    required     pam_mkhomedir.so skel=/etc/skel/ umask=0077
session    optional     pam_ldap.so
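
Since PAM evaluates the lines of one type in file order, the only difference between the two pam.d/sshd versions above is where `system-auth` sits in the auth stack. A small sketch for listing that order follows; `pam_auth_order` is a hypothetical helper name, and the scratch files only reproduce the auth and account lines quoted in this comment rather than reading the live /etc/pam.d/sshd.

```shell
#!/bin/sh
# Hypothetical helper: print control flag and module of each "auth" line,
# in the order they appear in the given PAM service file.
pam_auth_order() {
    awk '$1 == "auth" { print $2, $3 }' "$1"
}

# Scratch copies of the two versions quoted above.
before=$(mktemp)
after=$(mktemp)
cat > "$before" <<'EOF'
auth       include      system-auth
auth       required     pam_shells.so
auth       required     pam_nologin.so
account    include      system-auth
EOF
cat > "$after" <<'EOF'
auth       required     pam_shells.so
auth       required     pam_nologin.so
auth       include      system-auth
account    include      system-auth
EOF

echo "before the update:"
pam_auth_order "$before"
echo "after the update:"
pam_auth_order "$after"
rm -f "$before" "$after"
```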



Comment 16 SpanKY gentoo-dev 2007-08-27 13:23:26 UTC
make sure you're using latest shadow and you've properly updated your pam.d files via etc-update
Comment 17 vad3R 2007-08-27 13:46:23 UTC
(In reply to comment #16)
> make sure you're using latest shadow and you've properly updated your pam.d
> files via etc-update
> 

The systems run shadow 4.0.18.1-r and I updated all configs after the ssh update. It works with the old version, but the new ssh version including the pam update doesn't work. I even checked the LDAP queries from the openldap log and they return the right entry. I'm running out of ideas...

Comment 18 vad3R 2007-09-03 15:06:16 UTC
There's a solution for the problem. Neil sent me an email containing the following information:

Removing the lines

LpkSearchTimelimit 3
LpkBindTimelimit   3 

makes it work like a charm. Even if this gives me a working setup, there's a bug in the patch or the Gentoo integration. These parameters should be configurable...
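
For anyone applying the same workaround on several hosts, here is a minimal sketch that comments the two directives out instead of deleting them. `disable_lpk_timeouts` is a hypothetical helper name, and the demonstration runs on a scratch file rather than the live /etc/ssh/sshd_config. It assumes the directives are written with the capitalisation shown above.

```shell
#!/bin/sh
# Hypothetical helper: comment out the two LPK timeout directives in the
# given sshd_config, leaving a backup copy next to it as <file>.bak.
disable_lpk_timeouts() {
    sed -i.bak -E \
        's/^[[:space:]]*(LpkSearchTimelimit|LpkBindTimelimit)[[:space:]]/#&/' "$1"
}

# Demonstration on a scratch file with a few lines from the posted config.
demo=$(mktemp)
cat > "$demo" <<'EOF'
UseLPK yes
LpkForceTLS  no
LpkSearchTimelimit 3
LpkBindTimelimit 3
EOF
disable_lpk_timeouts "$demo"
cat "$demo"
rm -f "$demo" "$demo.bak"
```

On a real system, sshd has to be restarted afterwards for the change to take effect; `sshd -t` can be used first to check the edited file for syntax errors.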

Thanks to Neil
Comment 19 SpanKY gentoo-dev 2007-09-03 22:35:56 UTC
the fact that they're in the config file sounds like they're configurable to me
Comment 20 Neil Scholten 2007-09-04 07:55:27 UTC
I got my bugzilla-password, finally 8).

According to the docs at http://dev.inversepath.com/openssh-lpk/ they should be configurable.
But they changed the default-values from
http://dev.inversepath.com/openssh-lpk/openssh-lpk-4.5p1-0.3.8.patch :
+	options->lpk.b_timeout.tv_sec = 0;
+	options->lpk.s_timeout.tv_sec = 0;

to
http://dev.inversepath.com/openssh-lpk/openssh-lpk-4.6p1-0.3.9.patch :
+	options->lpk.b_timeout.tv_sec = -1;
+	options->lpk.s_timeout.tv_sec = -1;

Maybe they broke something with this change. I'm not a C professional, nor do I have spare time today to check this, so I'm not sure if it's related to this change.

--
Kind regards,

neil

Comment 21 Rob Holland 2007-09-04 15:11:02 UTC
vapier: I'm not sure on what basis you've decided this is resolved, but it isn't.

That said, I cannot replicate any problems with the setting of either timeout here. The change to set the values to -1 on start was to fix a bug with the values being reset, and is working correctly as far as I can tell.

Please run under gdb with the values set in a config file as you had them when it was not working, break on servconf.c:294, step through that code and watch if it is assigning the values from the config file. When you've reached line 298, run: print options->lpk.s_timeout
and: print options->lpk.b_timeout

And let me know what is shown.
Comment 22 vad3R 2007-09-04 15:33:31 UTC
(In reply to comment #21)
> vapier: I'm not sure on what basis you've decided this is resolved, but it
> isn't.
> 
> That said, I cannot replicate any problems with the setting of either timeout
> here. The change to set the values to -1 on start was to fix a bug with the
> values being reset, and is working correctly as far as I can tell.
> 
> Please run under gdb with the values set in a config file as you had them when
> it was not working, break on servconf.c:294, step through that code and watch
> if it is assigning the values from the config file. When you've reached line
> 298, run: print options->lpk.s_timeout
> and: print options->lpk.b_timeout
> 
> And let me know what is shown.
> 

I will start debugging it when I get some free time. It may not be resolved, but I can use my systems again. This is a huge improvement and makes me happy. I'll paste the results soon.