Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 41498 - SSH "partially" hangs after several hours of server uptime
Summary: SSH "partially" hangs after several hours of server uptime
Status: RESOLVED CANTFIX
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Unspecified (show other bugs)
Hardware: All Linux
: High normal (vote)
Assignee: Daniel Ahlberg (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-02-13 13:24 UTC by Charles N. Burns
Modified: 2004-10-05 11:59 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Charles N. Burns 2004-02-13 13:24:18 UTC
After some period of time (a few hours, a few days sometimes), OpenSSH "hangs".
A person can log in remotely, and is then asked for a username. Once that is entered, the user is asked for their password. If the user enters the wrong password, they are told (as normal) that access is denied and are prompted again for the password.
If, however, a user enters the correct password...Nothing happens. No further text is sent to the user. The connection is still alive, and the user can, for example, hit "enter" and scroll a line, but nothing other than that.
When this happens, existing SSH sessions also "sort of" work. A user can hit enter and get the command prompt again, and can do some simple tasks, but doing something like, say, looking at a log file ("tail -n 50 /var/log/messages") will hang the terminal similarly.

Samba seems to die as well, refusing to accept any incoming connections. Possibly other network services die. I will try to test Apache.

I very much doubt that the server hardware is at fault. First, this doesn't "seem" to be a hardware-like problems. Additionally, the hardware is fairly stable stuff. Tyan TigerMPX (one CPU), a RAM module whose model is explicitely certified for exactly this motherboard, Seagate SCSI hard drives (no SMART errors), overkill Antec power supply, overkill CPU heatsink/fan, and 3Com networking hardware.

That said, I do not know where to start as far as providing more useful or specific information. This began happening on 12 Feb 2004 after a system update on 11 or 12 Feb. In a disturbingly Windows-like fashion, the only remedy seems to be a system reboot.
I tried some of the magic sysrq key combinations. For reference, the system seems to be borked enough that some do not work. I cannot, for example, unmount filesystems or display the contents of registers. "immediate reboot" seems to work fine though.
I am using the 2.6.2 vanilla kernel with fairly conservative settings. Preemption off, modules off, etc. My kernel config is attached. I am completely willing to try different kernel options if suggested, and do anything else needed to get more useful information out.


Reproducible: Always
Steps to Reproduce:




sheridan samba # emerge info
Portage 2.0.50-r1 (default-x86-1.4, gcc-3.3.2, glibc-2.3.2-r9, 2.6.2)
=================================================================
System uname: 2.6.2 i686 AMD Athlon(tm) XP 1800+
Gentoo Base System version 1.4.3.13
Autoconf: sys-devel/autoconf-2.58
Automake: sys-devel/automake-1.7.7
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-march=athlon-mp -O2 -fomit-frame-pointer -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config
/usr/share/config /var/qmail/alias /var/qm
ail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-march=athlon-mp -O2 -fomit-frame-pointer -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox userpriv"
GENTOO_MIRRORS="http://gentoo.oregonstate.edu
http://distro.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY=""
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="apm arts avi berkdb crypt encode foomaticdb gdbm gif gpm gtk2 imlib java
libg++ libwww mad mikmod mpeg mys
ql ncurses nls oggvorbis oss pam pdflib perl python qt quicktime readline sdl
slang spell ssl svga tcpd truetyp
e x86 xml2 xmms xv zlib"







-----------------
kernel config
------------------
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_STANDALONE=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_LOG_BUF_SHIFT=15
CONFIG_KALLSYMS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_X86_PC=y
CONFIG_MK7=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
CONFIG_HIGHMEM4G=y
CONFIG_HIGHMEM=y
CONFIG_MTRR=y
CONFIG_PM=y
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_BUS=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_PCI=y
CONFIG_ACPI_SYSTEM=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_LEGACY_PROC=y
CONFIG_PCI_NAMES=y
CONFIG_HOTPLUG=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y
CONFIG_BLK_DEV_FD=y
CONFIG_SCSI=y
CONFIG_SCSI_PROC_FS=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=y
CONFIG_BLK_DEV_SR=y
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=y
CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_SCSI_QLA2XXX_CONFIG=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_NETFILTER=y
CONFIG_IP_NF_CONNTRACK=y
CONFIG_IP_NF_IRC=y
CONFIG_IP_NF_AMANDA=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_MATCH_LIMIT=y
CONFIG_IP_NF_MATCH_IPRANGE=y
CONFIG_IP_NF_MATCH_MAC=y
CONFIG_IP_NF_MATCH_STATE=y
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_REJECT=y
CONFIG_IP_NF_NAT=y
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=y
CONFIG_IP_NF_TARGET_REDIRECT=y
CONFIG_IP_NF_TARGET_NETMAP=y
CONFIG_IP_NF_NAT_IRC=y
CONFIG_IP_NF_NAT_AMANDA=y
CONFIG_IP_NF_MANGLE=y
CONFIG_IP_NF_TARGET_LOG=y
CONFIG_IP_NF_TARGET_ULOG=y
CONFIG_IPV6_SCTP__=y
CONFIG_NETDEVICES=y
CONFIG_DUMMY=y
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
CONFIG_NET_VENDOR_3COM=y
CONFIG_VORTEX=y
CONFIG_NET_PCI=y
CONFIG_E100=y
CONFIG_INPUT=y
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_EVDEV=y
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_INPUT_MISC=y
CONFIG_INPUT_PCSPKR=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_CORE=y
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256
CONFIG_HW_RANDOM=y
CONFIG_RTC=y
CONFIG_HANGCHECK_TIMER=y
CONFIG_I2C=y
CONFIG_I2C_CHARDEV=y
CONFIG_I2C_ALGOBIT=y
CONFIG_I2C_ALGOPCF=y
CONFIG_I2C_AMD756=y
CONFIG_VIDEO_SELECT=y
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_USB=y
CONFIG_USB_DEVICEFS=y
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_STORAGE=y
CONFIG_USB_HID=y
CONFIG_USB_HIDINPUT=y
CONFIG_EXT2_FS=y
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
CONFIG_JBD=y
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=y
CONFIG_AUTOFS4_FS=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=y
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_NTFS_FS=y
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_DEVFS_FS=y
CONFIG_DEVFS_MOUNT=y
CONFIG_DEVPTS_FS=y
CONFIG_TMPFS=y
CONFIG_RAMFS=y
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_SECURITY=y
CONFIG_SECURITY_CAPABILITIES=y
CONFIG_ZLIB_INFLATE=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_PC=y
Comment 1 Charles N. Burns 2004-02-16 15:54:59 UTC
Upon trying many combinations of kernel options, I have narrowed it down to ACPI. Having any ACPI option of any kind hangs the system, presumably when the server goes into an idle state at night. APM seems to work fine in 2.4 and 2.6, and ACPI works fine in 2.4 on this system.

The server hardware that may have something to do with the ACPI bug is:
Motherboard: Tyan TigerMPX with AMD760MPX chipset

CPU: Single Athlon XP (not MP) 1800+ 0.13 micron

Storage: Various SCSI devices (3 hard drives, CD-ROM, tape backup) in an Adaptec 29160 64-bit PCI controller, probably running at 33MHz. Using the "new" Adaptec driver, not the older unmaintained driver.
System using ReiserFS and EXT3 partitions.

Video: Matrox G200 or G400

One connected USB mouse.

All fans are connected directly to the power supply, so likely do not turn off regardless of internal temperature.

The bug is probably something specific to the AMD760MP/MPX chipset, as these are usually used with SMP systems and thus ACPI is likely not tested particularly thoroughly.
Comment 2 Daniel Ahlberg (RETIRED) gentoo-dev 2004-10-05 11:59:07 UTC
.