|Summary:||gentoo-dev-sources 2.6.7 unstable after 15 minutes of heavy file i/o|
|Product:||Gentoo Linux||Reporter:||Steve Romanow <slestak989>|
|Component:||New packages||Assignee:||x86-kernel (DEPRECATED) <x86-kernel>|
|Package list:||Runtime testing required:||---|
Description Steve Romanow 2004-07-24 03:50:35 UTC
This machine is my LAN rsync mirror. Three other machines sync off of this one nightly. With the 2.6.7 kernel, at about 3:08am this thing flakes out.

Reproducible: Sometimes

Steps to Reproduce:
1. Boot the system on a 2.6.7 kernel (any I've tried).
2. Sync to gentoo.org (no issue), then have the other workstations sync to this machine.
3. Start applying random ebuilds (normal upgrades from emerge -pv world).

Actual Results:
Normally, after working for a while after a reboot, it dies at approximately 3:08am (when the sync crons start). Last night at 11pm I tried to simulate the 3am cron activity manually, and the system crashed. I had manually synced 2 machines to the problem machine and had run an rdiff-backup job against another machine (via ssh, one of my nightly cron jobs), and after 15-30 minutes the consoles flaked out reporting Buffer I/O error for /dev/hdb. That is the drive that / and /mnt/backup reside on. When I typed reboot, the system returned that /sbin/shutdown was not available.

Expected Results:
All rsync-related jobs complete successfully and the system stays up for days and weeks, not hours.

I put part of /var/log/messages in the http://bugs.gentoo.org/show_bug.cgi?id=54413 attachment. This was repeated several times, over several nights.

In regard to the 3am timing, I swapped three rdiff-backup crons around to see if the problem moved, and it did not. I moved jobs out of /etc/cron.daily to see if one of those scripts was the cause, but have not determined anything with that yet. The only odd thing I can add is that I stagger the other workstations' emerge sync cron scripts with a sleep statement so I can use the stock /etc/cron.daily dir. The script contains:
sleep 900 ; /usr/bin/emerge sync >> ~root/emerge.sync.log

The system does not exhibit the problem with g-d-s-2.6.5. It does with all versions of 2.6.7 I tried (a couple between r1 and r11, and also mm4). Half of the comments on http://bugs.gentoo.org/show_bug.cgi?id=54413 are about this problem.
I thought it was related to that bug initially, since I use hotplug + usb2 on this machine as well (added within the last two months via an add-on ALi PCI card, stable with 2.6.5). I had an issue with make oldconfig while troubleshooting this, so I thought I had the issue resolved when the kernel config was straightened out, but the issue remains. I have an nvidia mx440 card in the system and thought it might be related to that, but I have the appropriate nvidia packages installed.

* media-video/nvidia-kernel :
[ ] 1.0.4363-r3 (2.4.26-grsec-2.0)
[ ] 1.0.4496-r3 (2.4.26-grsec-2.0)
[M ] 1.0.4499 (2.4.26-grsec-2.0)
[ ~ ] 1.0.5328-r1 (2.4.26-grsec-2.0)
[M ] 1.0.5332-r1 (2.4.26-grsec-2.0)
[ ~ ] 1.0.5336-r2 (2.4.26-grsec-2.0)
[ ~ ] 1.0.5336-r3 (2.4.26-grsec-2.0)
[ ~I] 1.0.5336-r4 (2.4.26-grsec-2.0)
[ ~I] 1.0.6106 (2.4.26-grsec-2.0)

lol2 root # etcat -v nvidia-glx
[ Results for search key : nvidia-glx ]
[ Candidate applications found : 7 ]
Only printing found installed programs.
* media-video/nvidia-glx :
[ ] 1.0.4363-r1 (0)
[ ] 1.0.4496-r2 (0)
[M ] 1.0.4499-r1 (0)
[ ~ ] 1.0.5328-r2 (0)
[M ] 1.0.5332-r2 (0)
[ ~ ] 1.0.5336-r2 (0)
[ ~I] 1.0.6106-r3 (0)

I have one or two of my other machines using 2.6.7 with no issues, but this is the fileserver/backupserver. It is also the only usb2, non-Intel (athlon-xp), nvidia video card machine I have.
lol2 root # emerge info
Portage 2.0.50-r9 (default-x86-1.4, gcc-3.3.3, glibc-2.3.3.20040420-r0, 2.6.5-gentoo-r1)
=================================================================
System uname: 2.6.5-gentoo-r1 i686 AMD Athlon(TM) XP1800+
Gentoo Base System version 1.4.16
Autoconf: sys-devel/autoconf-2.59-r3
Automake: sys-devel/automake-1.8.3
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-O2 -march=athlon-xp -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3.1/share/config /usr/kde/3.2/share/config /usr/kde/3/share/config /usr/lib/mozilla/defaults/pref /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -march=athlon-xp -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox"
GENTOO_MIRRORS="http://mirrors.tds.net/gentoo http://open-systems.ufl.edu/mirrors/gentoo http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.us.gentoo.org/gentoo-portage"
USE="3dnow X aalib alsa apm artswrappersuid avi berkdb bonobo cdr crypt cups curl dvd encode esd foomaticdb freetype gdbm gif gimpprint gnome gphoto2 gpm gstreamer gtk gtk2 gtkhtml guile imap imlib jabber java jpeg kde libg++ libwww live mad mikmod mozilla moznoirc moznomail mpeg mysql ncurses nls nntp nptl oggvorbis opengl oss pam pda pdflib perl png ppds python qt quicktime readline scanner sdl slang spell sse ssl svga tcltk tcpd tiff truetype usb x86 xine xml2 xmms xv zlib"
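The "Buffer I/O error" messages mentioned in the description can be counted in a saved copy of the log to see when and how often they occur. The following is only a rough sketch: the function name `count_io_errors` is hypothetical, and it assumes the kernel logged the usual 2.6-style line "Buffer I/O error on device <name>, ..." (the exact text in the reporter's log is not shown here).

```shell
#!/bin/sh
# Hypothetical helper, not part of any Gentoo tool.
# Usage: count_io_errors LOGFILE DEVICE_PREFIX
# Prints the number of log lines that report a buffer I/O error for a
# device whose name starts with DEVICE_PREFIX (e.g. "hdb" also matches "hdb1").
count_io_errors() {
    log_file=$1
    device=$2
    # grep -c prints the count of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status at 0.
    grep -c "Buffer I/O error on device ${device}" "$log_file" || true
}
```

For example, `count_io_errors /var/log/messages hdb` would print how many such lines a log contains for the drive named in this report; comparing counts from a 2.6.5 boot and a 2.6.7 boot might help narrow down whether the errors are kernel-specific.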
Comment 1 Steve Romanow 2004-07-24 03:54:09 UTC
I have also tried with and without 4k stacks, and with both the nv and nvidia drivers for xorg. I disabled bootsplash for the 2.6.7 kernel (I still use it for 2.6.5, where it works OK).
Comment 2 Steve Romanow 2004-07-29 05:48:37 UTC
The Athlon XP this was reported with is no longer running Gentoo; I set it up as my wife's desktop. However, I moved the rsync mirror services to my webserver (P2-400, 96M RAM), and with 2.6.7-g-d-s-r11 its rsync ability is flaky. My other workstations (two Gentoo machines) cannot sync to it; they hang during the initial file transfer (before the compare step). After a reboot with 2.6.5-mm4 (mm for sure, I think mm4), rsyncd works fine. No hangs.
Comment 3 Daniel Drake (RETIRED) 2004-08-23 00:06:55 UTC
Could you please try with 2.6.8.1 from kernel.org?
Comment 4 Steve Romanow 2004-08-23 05:50:58 UTC
Will do. Note that the machine that had this problem has been repurposed. My current rsync server is running 2.6.7-r13 and has not exhibited this problem (at least not in a long while). I'm not sure if I can recreate the problem.
Comment 5 Daniel Drake (RETIRED) 2004-09-05 06:30:53 UTC
Any luck reproducing the problem? If not, there's not much we can do.
Comment 6 Steve Romanow 2004-09-06 17:38:08 UTC
Nothing. I cannot reinstall Gentoo on the offending workstation or get the second machine that exhibited problems to do it again. Let's close it, and I'll open a new bug and reference this one (or re-open this one) if needed. Thanks for your effort.
Comment 7 Daniel Drake (RETIRED) 2004-09-07 02:31:22 UTC