11:17 < tomk> looks like gannet/godwit have rebooted again, can someone kick ssh?
11:26 <@antarus> tomk: hi
11:26 <@antarus> tomk: no reboots
11:26 <@antarus> tomk: forums problems?
11:26 < tomk> antarus: it won't let me in via ssh so it's the ldap thing
11:27 <@antarus> Feb 9 11:16:53 godwit sshd[25847]: [LDAP] no keys found for 'tomk'!
11:27 <@antarus> Feb 9 11:16:53 godwit sshd[25847]: [LDAP] no keys found for 'tomk'!
11:27 <@antarus> tomk: anyone file a bug for that yet?
11:27 < tomk> not sure
11:27 <@antarus> tomk: try godwit
11:28 < tomk> godwit works but it didn't earlier
11:28 <@antarus> yes I restarted ssh on both
11:29 <@tampakrap> antarus: there is a known ldap/pam issue, christian and robin know more
11:29 <@tampakrap> or not, because it's not fixed :)
11:29 <@antarus> Feb 9 11:26:26 godwit sshd[26272]: pam_ldap: ldap_starttls_s: Connect error
11:29 <@antarus> no one knows jack about it

Feb 9 11:16:53 godwit sshd[25847]: [LDAP] no keys found for 'tomk'!
Feb 9 11:16:53 godwit sshd[25847]: [LDAP] no keys found for 'tomk'!
Feb 9 11:26:26 godwit sshd[26272]: [LDAP] no keys found for 'antarus'!
Feb 9 11:26:26 godwit sshd[26272]: pam_ldap: ldap_starttls_s: Connect error
Feb 9 11:26:26 godwit sshd[26272]: Accepted publickey for antarus from 75.147.136.182 port 52882 ssh2
Feb 9 11:26:26 godwit sshd[26272]: pam_unix(sshd:session): session opened for user antarus by (uid=0)
Feb 9 11:26:28 godwit sshd[26279]: Received disconnect from 75.147.136.182: 11: disconnected by user
Feb 9 11:26:28 godwit sshd[26272]: pam_unix(sshd:session): session closed for user antarus
Feb 9 11:26:36 godwit sshd[26296]: error: [LDAP] could not initialize ldap connection
Feb 9 11:26:36 godwit sshd[26296]: SSH: Server;Ltype: Version;Remote: 75.147.136.182-52883;Protocol: 2.0;Client: OpenSSH_5.6
Feb 9 11:26:38 godwit sshd[26296]: [LDAP] no keys found for 'antarus'!
Feb 9 11:26:38 godwit sshd[26296]: pam_ldap: ldap_starttls_s: Connect error
Feb 9 11:26:38 godwit sshd[26296]: Accepted publickey for antarus from 75.147.136.182 port 52883 ssh2
Feb 9 11:26:38 godwit sshd[26296]: pam_unix(sshd:session): session opened for user antarus by (uid=0)
Feb 9 11:26:44 godwit sudo: antarus : TTY=pts/0 ; PWD=/home/antarus ; USER=root ; COMMAND=/bin/bash
Feb 9 11:26:44 godwit sudo: pam_unix(sudo:session): session opened for user root by antarus(uid=2097)
Feb 9 11:26:47 godwit sshd[26315]: error: [LDAP] could not initialize ldap connection
Feb 9 11:26:47 godwit sshd[26315]: SSH: Server;Ltype: Version;Remote: 140.211.166.162-45254;Protocol: 2.0;Client: check_ssh_1.4.14
Feb 9 11:27:31 godwit sshd[3247]: Received signal 15; terminating.

Basically it would be nice if someone found the root cause. For now the fix is to restart sshd (?).
There's no proper fix for it yet, so restarting sshd is necessary for now. It's a bug in the LPK patch for OpenSSH; we have some patches, but none of them help, and the real root cause has not yet been found. This is actually not an infra bug but more of an upstream/OpenSSH maintainer bug.
(In reply to comment #1)
> There's no proper fix for it yet, so restarting SSHD for now is necessary.
>
> It's a bug in the LPK patch of OpenSSH, we have some patches but none of them
> help and the real root cause is not yet found.
> This is actually not an infra bug but more a upstream/openssh maintainer bug.

So this bug is to track the 'issue' that sometimes forums people can't log in to their servers. It also records the fix (restarting sshd). Ideally we would do some debugging and then open a second bug with more patches. I just want some kind of note written down that we know about the issue and the workaround, so it isn't crazy infra lore.

-A
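To make the symptom easier to spot than by eyeballing auth logs, here is a minimal Python sketch that picks out the sshd lines matching the two failure messages seen in the godwit excerpt in this bug. The function name `ldap_failures` and the regex are my own, and it assumes the plain syslog format shown above; it only reports matches and does not restart anything.

```python
import re

# Failure messages copied from the godwit log excerpt in this bug.
FAILURE_PATTERNS = (
    "[LDAP] could not initialize ldap connection",
    "pam_ldap: ldap_starttls_s: Connect error",
)

# Assumes classic syslog lines: "Feb 9 11:26:36 godwit sshd[26296]: message"
SSHD_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ [\d:]{8}) (?P<host>\S+) "
    r"sshd\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)


def ldap_failures(lines):
    """Yield (timestamp, host, pid, message) for sshd lines showing the LDAP failure."""
    for line in lines:
        match = SSHD_LINE.match(line.strip())
        if match and any(p in match.group("msg") for p in FAILURE_PATTERNS):
            yield (
                match.group("stamp"),
                match.group("host"),
                int(match.group("pid")),
                match.group("msg"),
            )
```

Feeding it the lines of the auth log (wherever sshd logs on the box in question) gives a list of affected sshd PIDs and timestamps, which is at least a hint that it is time to restart sshd.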
I spent some time on this tonight. Basically the client connects, immediately issues a STARTTLS op, the server sends a response, the client responds, and then the TLS negotiation fails. The LDAP server claims to support things like 'packet tracing' or 'DEBUG_LDAP_ANY', which should print all kinds of 'interesting' messages about TLS to syslog, but I can't seem to get those to work. FYI, I rebuilt ldap on meadowlark with USE="debug syslog".

-A
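For anyone who wants to poke at the handshake outside of sshd, the sequence described above (connect, StartTLS extended request, server response, TLS negotiation) can be sketched with only the Python standard library. This is a rough sketch under stated assumptions, not what the LPK patch or pam_ldap actually does: the function names are mine, the host `ldap.example.org` in the usage note is a placeholder, the response check is deliberately crude, and the default TLS context will reject certificates the local trust store does not know.

```python
import socket
import ssl

# OID of the LDAP StartTLS extended operation (RFC 4511).
STARTTLS_OID = b"1.3.6.1.4.1.1466.20037"


def build_starttls_request(message_id: int = 1) -> bytes:
    """BER-encode an LDAP ExtendedRequest asking the server for StartTLS."""
    if not 0 < message_id < 128:
        raise ValueError("this sketch only handles single-byte message IDs")
    # [0] requestName, context-specific primitive tag 0x80
    request_name = b"\x80" + bytes([len(STARTTLS_OID)]) + STARTTLS_OID
    # [APPLICATION 23] ExtendedRequest, constructed tag 0x77
    extended_req = b"\x77" + bytes([len(request_name)]) + request_name
    # INTEGER messageID
    msg_id = b"\x02\x01" + bytes([message_id])
    body = msg_id + extended_req
    # Outer SEQUENCE (LDAPMessage); every length here fits in one byte
    return b"\x30" + bytes([len(body)]) + body


def probe_starttls(host: str, port: int = 389, timeout: float = 5.0) -> str:
    """Run the StartTLS exchange once and return the negotiated TLS version."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_starttls_request())
        response = sock.recv(4096)
        # Crude check: look for an ExtendedResponse tag (0x78) and a
        # resultCode of success (ENUMERATED 0, bytes 0a 01 00).
        if b"\x78" not in response or b"\x0a\x01\x00" not in response:
            raise RuntimeError("server did not accept StartTLS: %r" % response)
        context = ssl.create_default_context()
        # If the negotiation fails, wrap_socket raises ssl.SSLError here.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Calling `probe_starttls("ldap.example.org")` in a loop from an affected box while the problem is occurring might show whether the failure is in the server's response or in the TLS negotiation itself, independent of sshd.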
afaict, we haven't had issues with this in a while. Closing this for now.