This machine is my LAN rsync mirror. Three other machines sync off of this one nightly. With a 2.6.7 kernel, it flakes out at around 3:08 am.
Steps to Reproduce:
1. Boot the system on a 2.6.7 kernel (any I've tried).
2. Sync to gentoo.org (no issue), then have the other workstations sync to this machine.
3. Start applying random ebuilds (normal upgrades from emerge -pv world).
Actual Results:
Normally, after working for a while after a reboot, it dies at approximately
3:08 am (when the sync crons start). Last night at 11pm I tried to simulate the
3am cron activity manually, and the system crashed.
I had manually synced 2 machines to the problem machine, and had run an
rdiff-backup job against another machine (via ssh, one of my nightly cron jobs),
and after 15-30 minutes the consoles flaked out, reporting Buffer I/O error for
/dev/hdb. That is the drive that / and /mnt/backup reside on. When I typed
reboot, the system returned that /sbin/shutdown was not available.
Expected Results:
All rsync-related jobs complete successfully and the system remains up for
days or weeks, not hours.
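The Buffer I/O errors mentioned above can be tallied per device from a saved log, which helps show whether they all point at the same drive. This is a rough sketch only: it assumes the kernel lines contain the text "Buffer I/O error on device <name>," (the wording 2.6 kernels normally print), and `count_io_errors` is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Sketch: count "Buffer I/O error" kernel messages per device in a log file.
# Assumption: messages look like
#   "... Buffer I/O error on device hdb1, logical block 12345"
# count_io_errors is a made-up helper name for this illustration.

count_io_errors() {
    # $1 = path to a log file (for example a copy of /var/log/messages)
    # Prints one "count device" line per device, most frequent first.
    grep 'Buffer I/O error on device' "$1" \
        | sed 's|.*Buffer I/O error on device \([^,]*\),.*|\1|' \
        | sort | uniq -c | sort -rn
}

# Example use:
#   count_io_errors /var/log/messages
```

If every line names a partition on the same disk, that at least narrows the failure to one drive or its controller path rather than to the filesystem layer in general.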
I put part of /var/log/messages in an attachment on
http://bugs.gentoo.org/show_bug.cgi?id=54413. This was repeated several times,
over several nights. In regard to the 3am timing, I swapped three rdiff-backup
crons around to see if the problem moved, and it did not. I moved jobs out of
/etc/cron.daily to see if one of those scripts was the cause, but have not
determined anything with that yet. The only odd thing I can add is that I
stagger the other workstations' emerge sync cron scripts with a sleep statement
so I can use the stock /etc/cron.daily dir. The script contains:
sleep 900 ; /usr/bin/emerge sync >> ~root/emerge.sync.log
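The staggering idea can be generalized so that each workstation waits a different amount of time before syncing. A minimal sketch, assuming each workstation is given a slot number by hand; `stagger_delay`, the slot numbering, and the default step of 900 seconds are illustrative assumptions, not the exact scripts in use here.

```shell
#!/bin/sh
# Sketch of a staggered /etc/cron.daily sync script.
# Assumption: each workstation has a slot number (0, 1, 2, ...) and waits
# slot * step seconds before syncing. stagger_delay is a hypothetical
# helper, not part of portage or cron.

stagger_delay() {
    # $1 = slot number, $2 = step in seconds (defaults to 900)
    slot=$1
    step=${2:-900}
    echo $(( slot * step ))
}

# Example use on the workstation in slot 1:
#   sleep "$(stagger_delay 1)"
#   /usr/bin/emerge sync >> ~root/emerge.sync.log 2>&1
```

With a step of 900 seconds, slots 0, 1 and 2 would start their syncs 15 minutes apart, so the mirror serves one client at a time instead of all of them at once.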
The system does not exhibit the problem with g-d-s-2.6.5. It does with every
2.6.7 version I have tried (a couple between r1 and r11, and also mm4).
Half of the comments on http://bugs.gentoo.org/show_bug.cgi?id=54413 describe
this problem. I thought it was related to that bug initially, since I use
hotplug + usb2 on this machine as well (added within the last two months via an
add-on Ali PCI card, stable with 2.6.5).
Had issue with make oldconfig while troubleshooting this, so thought I had issue
resolved when kernel config straightened out, but issue remains.
Have nvidia mx440 card in system, thought it might be related to that, but I
have appropriate nvidia packages installed.
* media-video/nvidia-kernel :
[ ] 1.0.4363-r3 (2.4.26-grsec-2.0)
[ ] 1.0.4496-r3 (2.4.26-grsec-2.0)
[M ] 1.0.4499 (2.4.26-grsec-2.0)
[ ~ ] 1.0.5328-r1 (2.4.26-grsec-2.0)
[M ] 1.0.5332-r1 (2.4.26-grsec-2.0)
[ ~ ] 1.0.5336-r2 (2.4.26-grsec-2.0)
[ ~ ] 1.0.5336-r3 (2.4.26-grsec-2.0)
[ ~I] 1.0.5336-r4 (2.4.26-grsec-2.0)
[ ~I] 1.0.6106 (2.4.26-grsec-2.0)
lol2 root # etcat -v nvidia-glx
[ Results for search key : nvidia-glx ]
[ Candidate applications found : 7 ]
Only printing found installed programs.
* media-video/nvidia-glx :
[ ] 1.0.4363-r1 (0)
[ ] 1.0.4496-r2 (0)
[M ] 1.0.4499-r1 (0)
[ ~ ] 1.0.5328-r2 (0)
[M ] 1.0.5332-r2 (0)
[ ~ ] 1.0.5336-r2 (0)
[ ~I] 1.0.6106-r3 (0)
I have one or two of my other machines using 2.6.7 with no issues, but this is
the fileserver/backupserver. It is also the only usb2, non-intel (athlon-xp),
nvidia video card machine I have.
lol2 root # emerge info
Portage 2.0.50-r9 (default-x86-1.4, gcc-3.3.3, glibc-2.3.3.20040420-r0,
System uname: 2.6.5-gentoo-r1 i686 AMD Athlon(TM) XP1800+
Gentoo Base System version 1.4.16
CFLAGS="-O2 -march=athlon-xp -pipe"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config
/usr/kde/3.1/share/config /usr/kde/3.2/share/config /usr/kde/3/share/config
/usr/lib/mozilla/defaults/pref /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -march=athlon-xp -pipe"
FEATURES="autoaddcvs ccache sandbox"
USE="3dnow X aalib alsa apm artswrappersuid avi berkdb bonobo cdr crypt cups
curl dvd encode esd foomaticdb freetype gdbm gif gimpprint gnome gphoto2 gpm
gstreamer gtk gtk2 gtkhtml guile imap imlib jabber java jpeg kde libg++ libwww
live mad mikmod mozilla moznoirc moznomail mpeg mysql ncurses nls nntp nptl
oggvorbis opengl oss pam pda pdflib perl png ppds python qt quicktime readline
scanner sdl slang spell sse ssl svga tcltk tcpd tiff truetype usb x86 xine xml2
xmms xv zlib"
I have also tried with and without 4k stacks, and with both the nv and nvidia drivers for xorg. I disabled bootsplash for the 2.6.7 kernel (I still use it for 2.6.5, where it works ok).
The Athlon XP this was reported with is no longer running Gentoo. I set it up as my wife's desktop.
However, I moved the rsync mirror services to my webserver (P2-400, 96M ram), and with 2.6.7-g-d-s-r11 its rsync ability is flaky. My other workstations (two Gentoo machines) cannot sync to it: the sync hangs during the initial file transfer (before the compare step). When I reboot with 2.6.5-mm4 (mm for sure, I think mm4), rsyncd works fine. No hangs.
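A hang like this can be told apart from a slow transfer by probing the rsync daemon with an I/O timeout, so the client gives up instead of waiting forever. The sketch below is only an illustration: `probe_rsync` and the `RSYNC` override variable are hypothetical names, the host `mirror.lan` in the usage comment is a placeholder, and listing a module may not exercise the same code path as a full sync.

```shell
#!/bin/sh
# Sketch: probe an rsync daemon and report whether it answers in time.
# Assumptions: RSYNC names the rsync binary (overridable for testing);
# probe_rsync is a made-up helper name.
RSYNC=${RSYNC:-rsync}

probe_rsync() {
    # $1 = rsync URL to list, $2 = I/O timeout in seconds (default 60)
    url=$1
    secs=${2:-60}
    if "$RSYNC" --timeout="$secs" "$url" > /dev/null 2>&1; then
        echo "ok $url"
    else
        # $? here is the exit status of the rsync command above
        echo "failed $url (exit $?)"
    fi
}

# Example use (placeholder host, standard Gentoo module name):
#   probe_rsync rsync://mirror.lan/gentoo-portage/ 30
```

The rsync man page lists exit code 30 for a timeout in data send/receive, so a "failed ... (exit 30)" line would point at a stalled connection rather than a refused one.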
Could you please try with 18.104.22.168 from kernel.org?
Will do. Note that the machine having this problem has been re-dispatched. My current rsync server is running 2.6.7-r13 and has not exhibited this problem (at least not in a long while). I'm not sure if I can recreate the problem.
Any luck reproducing the problem? If not, there's not much we can do.
Nothing. I cannot reinstall Gentoo on the offending workstation or get the second machine that exhibited problems to do it again. Let's close it, and I'll open a new bug and reference this one (or re-open this one) if needed.
ty for your effort.