Hi,

I have been seeing weird server hangs for a few days. I think the problems started when I updated to a vserver kernel. Basically, the system starts hanging while I am working on the server via SSH. For example, a hang happened when logging in via SSH (using -vv, the hang occurs directly after "starting interactive session") and when pressing tab after typing "ls /var/log/apa".

After the hang resolves (which happens on its own after 2-10 minutes), dmesg shows the following entry:

INFO: task sshd:28383 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
sshd D ffff880035e27e48 0 28383 16208 0x00000000
 ffff880035e27bd8 0000000000000086 ffff880024f9cf00 ffffea00006a2ca0
 ffffea0000d6eed8 ffff88007c244000 ffff88007d90c660 ffff88007c244270
 0000000135e27b98 ffffffff810bc8c8 0000000135e27b98 00007f3b1c316000
Call Trace:
 [<ffffffff810bc8c8>] ? 0xffffffff810bc8c8
 [<ffffffff8141cc15>] 0xffffffff8141cc15
 [<ffffffff810bcf3f>] ? 0xffffffff810bcf3f
 [<ffffffff8141cd2d>] 0xffffffff8141cd2d
 [<ffffffff810b458d>] 0xffffffff810b458d
 [<ffffffff810c212d>] ? 0xffffffff810c212d
 [<ffffffff810b3d20>] ? 0xffffffff810b3d20
 [<ffffffff810b65f5>] 0xffffffff810b65f5
 [<ffffffff810b6a55>] 0xffffffff810b6a55
 [<ffffffff810b6b88>] 0xffffffff810b6b88
 [<ffffffff810b7d6c>] 0xffffffff810b7d6c
 [<ffffffff810c212d>] ? 0xffffffff810c212d
 [<ffffffff810c0a2f>] ? 0xffffffff810c0a2f
 [<ffffffff810aa6ab>] 0xffffffff810aa6ab
 [<ffffffff810aa77b>] 0xffffffff810aa77b
 [<ffffffff810028eb>] 0xffffffff810028eb

The blocked program (sshd in this example) varies. Sometimes this also happens when running a simple "ls" in a directory; I was able to attach to the process with strace, but strace did not display any system calls at all.

To me it looks like a kernel scheduler bug, but I am not really sure how to debug this further.
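The call trace above contains only raw addresses, which makes it hard to tell where the task is actually blocked. A minimal sketch of how those addresses could be mapped back to symbol names, assuming a System.map file that matches the running kernel is available (its path is not known from this report and has to be supplied by hand). The function names `load_system_map`, `resolve` and `annotate` are made up for this illustration.

```python
import bisect
import re
import sys

# Matches the bracketed addresses in a trace line such as
# "[<ffffffff8141cc15>] 0xffffffff8141cc15".
ADDR_RE = re.compile(r"\[<([0-9a-f]{16})>\]")


def load_system_map(lines):
    """Parse System.map lines ("address type name") into sorted parallel lists."""
    pairs = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            pairs.append((int(parts[0], 16), parts[2]))
    pairs.sort()
    addrs = [a for a, _ in pairs]
    names = [n for _, n in pairs]
    return addrs, names


def resolve(addr, addrs, names):
    """Return "name+0xoffset" for the closest symbol at or below addr, else None."""
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    return "%s+0x%x" % (names[i], addr - addrs[i])


def annotate(trace, addrs, names):
    """Return (raw_address, resolved_symbol) for every address in the trace text."""
    result = []
    for match in ADDR_RE.finditer(trace):
        raw = match.group(1)
        result.append((raw, resolve(int(raw, 16), addrs, names)))
    return result


# Usage (hypothetical): python resolve_trace.py <path-to-System.map> < trace.txt
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as f:
        map_addrs, map_names = load_system_map(f)
    for raw, symbol in annotate(sys.stdin.read(), map_addrs, map_names):
        print(raw, symbol or "?")
```

The result is only meaningful if the System.map belongs to exactly the kernel build that produced the trace. Entries marked with "?" in the trace are stack values the kernel itself considers unreliable, so they should be read with caution even after resolving.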
The system base data:

Kernel: kernel-genkernel-x86_64-2.6.32-vs2.3.0.36.28-gentoo or kernel-genkernel-x86_64-2.6.33-vs2.3.0.36.30.4-gentoo
Profile: default/linux/amd64/10.0/server
Filesystem: ext3
Raid: Software Raid 1

I have already searched Google and there doesn't seem to be any solution. Right now, I'll test the following things:

* Booting the kernel with highres=off
* Upgrading to the hardened profile and recompiling everything

If you have any other ideas on how to debug the issue, please attach them to this bug report. Also, if I have missed important information, please let me know and I'll attach it.

I know that this is probably a meta-bug, probably caused by misconfiguration, but I think it's time to identify the cause and hopefully come up with a solution.

Reproducible: Sometimes

Steps to Reproduce:
See above

Actual Results:
Random hangs

Expected Results:
No hangs

Note: The emerge --info output below was taken after I updated the profile to hardened. The server is compiling the gcc toolchain now.
Portage 2.1.8.3 (hardened/linux/amd64/10.0, gcc-4.1.2, glibc-2.10.1-r1, 2.6.33-vs2.3.0.36.30.4-gentoo x86_64) ================================================================= System uname: Linux-2.6.33-vs2.3.0.36.30.4-gentoo-x86_64-AMD_Athlon-tm-_64_X2_Dual_Core_Processor_5600+-with-gentoo-1.12.13 Timestamp of tree: Sun, 13 Jun 2010 21:00:23 +0000 distcc 3.1 x86_64-pc-linux-gnu [disabled] app-shells/bash: 4.0_p37 dev-lang/python: 2.4.6, 2.5.4-r4, 2.6.4-r1, 3.1.2-r3 sys-apps/baselayout: 1.12.13 sys-apps/sandbox: 1.6-r2 sys-devel/autoconf: 2.13, 2.65 sys-devel/automake: 1.5-r1, 1.7.9-r2, 1.9.6-r3, 1.10.3, 1.11.1 sys-devel/binutils: 2.18-r3 sys-devel/gcc: 4.1.2, 4.3.4, 4.4.3-r2 sys-devel/gcc-config: 1.4.1 sys-devel/libtool: 2.2.6b virtual/os-headers: 2.6.30-r1 ACCEPT_KEYWORDS="amd64" ACCEPT_LICENSE="* -@EULA" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-march=athlon64 -O2 -pipe" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo" CXXFLAGS="-march=athlon64 -O2 -pipe" DISTDIR="/usr/portage/distfiles" FEATURES="assume-digests distlocks fixpackages news parallel-fetch protect-owned sandbox sfperms strict unmerge-logs unmerge-orphans userfetch" GENTOO_MIRRORS="http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/ http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/ " LANG="en_US.UTF-8" LDFLAGS="-Wl,-O1" LINGUAS="en de" MAKEOPTS="-j4" PKGDIR="/usr/portage/packages" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="/usr/local/portage" SYNC="rsync://rsync.gentoo.org/gentoo-portage" USE="LINGUAS acl amd64 apache2 bash-completion 
bcmath berkdb bzip2 cgi clamd cli cluster cracklib crypt ctype cups curl cxx dri extraengine filter ftp gd gdbm geoip gpm hardened iconv imagemagick imap innodb jpeg json justify libwww maildir mbstring mmx modules mudflap multilib mysql mysqli ncurses nls nptl nptlonly openmp pam pcre pdo perl pic png posix pppd python rcypt readline reflection sasl session simplexml snmp soap sockets spl sqlite sse sse2 ssl subject-rewrite suexec svg sysfs tcpd tidy unicode urandom vchroot xml xmlrpc xorg xsl zip zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en de" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga neomagic nv r128 radeon savage sis tdfx trident vesa via vmware voodoo" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
I can now reproduce this problem on the server by typing

cd /var<tab>log<tab>/apa<tab>

i.e. the system hangs when it tries to access the log directory. Once that happens, it is also no longer possible to log in via SSH. I will do a full fsck on the next reboot. Any further hints on how to debug the issue are greatly appreciated.
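One thing that might help narrow this down while a hang is in progress is a list of the processes sitting in uninterruptible sleep (state "D") and the kernel function each one is waiting in. A rough sketch, to be run from a session that is still responsive. It reads /proc/<pid>/stat and /proc/<pid>/wchan; the helper names `parse_stat`, `read_file` and `blocked_tasks` are made up for this example, and whether wchan shows a usable symbol name on this kernel is an assumption.

```python
import os


def parse_stat(text):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    The command name is enclosed in parentheses and may itself contain
    spaces or parentheses, so the last ")" is used as the delimiter.
    """
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar])
    comm = text[lpar + 1:rpar]
    state = text[rpar + 1:].split()[0]
    return pid, comm, state


def read_file(path):
    """Return the stripped file content, or "" if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except (IOError, OSError):
        return ""


def blocked_tasks(proc="/proc"):
    """Return (pid, comm, wchan) for every process currently in state D."""
    tasks = []
    if not os.path.isdir(proc):
        return tasks
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        stat = read_file(os.path.join(proc, entry, "stat"))
        if not stat:
            continue  # process exited between listdir() and open()
        pid, comm, state = parse_stat(stat)
        if state == "D":
            wchan = read_file(os.path.join(proc, entry, "wchan"))
            tasks.append((pid, comm, wchan))
    return tasks


if __name__ == "__main__":
    for pid, comm, wchan in blocked_tasks():
        print(pid, comm, wchan or "?")
```

This only looks at processes, not at the individual threads below /proc/<pid>/task. If several unrelated programs turn out to be waiting in the same function, that would point at a shared resource (for example the disk or the RAID layer) rather than at sshd itself.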
Please report this issue upstream.