Running reiserfsck with the following line

> reiserfsck --rebuild-tree -l /root/archive.fsck.log -y /dev/vg0/archive

stops with the following error:

> Pass 3 (semantic):
> ... some/path/to/a-specific/file.ext
> ufile.c 391 are_file_items_correct
> are_file_items_correct: Position (offset = 133857281) in the middle of the file [22 97180] was not found.

Then running the line

> reiserfsck --rebuild-tree -l /root/archive.fsck.log -y --adjust-size /dev/vg0/archive

crashes the kernel at an unknown point. (It runs for a very long time, and I am not going to stare at the screen for hours just to catch the second before it crashes.) I can only guess that it is not in pass 1, because I saw it still working at over 60% of that pass. I could reproduce the crash at least 3 times, but only with the "--adjust-size" option.

My recommendation would be to try to reproduce this on a test system with a small reiserfs partition and some realistic test data.

Reproducible: Always

Steps to Reproduce:
1. Create a realistic test partition with reiserfs
2. Run "reiserfsck --rebuild-tree -l /root/archive.fsck.log -y /dev/vg0/archive"
3. Run "reiserfsck --rebuild-tree -l /root/archive.fsck.log -y --adjust-size /dev/vg0/archive"

Actual Results:
System stops responding. Black screen. Switching through the terminals with Alt-F1 to Alt-F6 does not change the completely black screen. Num Lock does not work either. No network connection to the system is possible. Looks like a hard crash.

Expected Results:
The repair of the file system finishes.

- kernel 2.6.20-hardened-r6
- hardened profile
- reiserfs on lvm2 on Delock 70096 pci-sata-controller on 250GB Samsung sata harddisk
- used sys-fs/reiserfsprogs-3.6.19-r1

Running programs: None.
Running services: acpid, courier-authlib, courier-imapd-ssl, ddclient, dhcp, dracd, fcron, gpm, local, metalog, mldonkey, named, net.eth0, net.eth1, netmount, ntp-client, numlock, postfix, samba, sensord, shorewall, smartd, sshd.
Emerge info:

Portage 2.1.2.12 (hardened/x86/2.6, gcc-3.4.6, glibc-2.5-r4, 2.6.20-hardened-r6 i686)
=================================================================
System uname: 2.6.20-hardened-r6 i686 AMD Athlon(tm) Processor
Gentoo Base System release 1.12.9
Timestamp of tree: Sat, 01 Sep 2007 07:50:01 +0000
ccache version 2.4 [enabled]
app-shells/bash: 3.2_p17
dev-java/java-config: 1.3.7, 2.0.33-r1
dev-lang/python: 2.3.5-r3, 2.4.4-r4
dev-python/pycrypto: 2.0.1-r6
dev-util/ccache: 2.4-r7
sys-apps/baselayout: 1.12.9-r2
sys-apps/sandbox: 1.2.17
sys-devel/autoconf: 2.13, 2.61-r1
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils: 2.17
sys-devel/gcc-config: 1.3.16
sys-devel/libtool: 1.5.24
virtual/os-headers: 2.6.21
ACCEPT_KEYWORDS="x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=athlon-tbird -fomit-frame-pointer -pipe -falign-functions=4"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /var/bind"
CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/gconf /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/terminfo /etc/texmf/web2c"
CXXFLAGS="-O2 -march=athlon-tbird -fomit-frame-pointer -pipe -falign-functions=4"
DISTDIR="/usr/portage/distfiles"
FEATURES="candy ccache distlocks metadata-transfer parallel-fetch sandbox sfperms strict userpriv usersandbox"
GENTOO_MIRRORS="ftp://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/ http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/ http://mir.zyrianes.net/gentoo/ http://www.gigaload.org/gentoo.org/"
LANG="de_DE.utf8"
LC_ALL="de_DE.utf8"
LINGUAS="de"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_EXTRA_OPTS="--timeout=300"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --filter=H_**/files/digest-*"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.de.gentoo.org/gentoo-portage"
USE="3dnow aalib acl acpi apache2 audiofile bash-completion bluetooth bzip2 chroot clamav cracklib crypt cscope cups curl curlwrappers dbm dedicated dio doc encode examples exif expat fam fbcon fftw flac flash flatfile foomaticdb ftp gd gdbm geoip gpm gstreamer hardened hardenedphp idn imagemagick imap imlib innodb java javascript jbig jikes jpeg jpeg2k junit lash lcms libcaca libwww lm_sensors mad maildir matroska memlimit midi mime ming mmap mmx mng mp3 mpeg mysql mysqli ncurses nls nocd nptl nptlonly ocaml offensive ogg oggvorbis pam pcre pdf perl php pic png portaudio posix postgres ppds prelude python readline samba sasl session sharedmem shorten simplexml sndfile sockets source sox speex spell spl sse ssl svg tcpd threads tidy tiff tokenizer truetype unicode urandom usb utf8 vhosts vorbis x86 xml xorg xsl zeo zlib"
ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol"
ELIBC="glibc"
FRITZCAPI_CARDS="fcpci"
INPUT_DEVICES="mouse keyboard"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
LINGUAS="de"
USERLAND="GNU"
Unset: CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LDFLAGS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS
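The reproduction steps from the report could be tried on a scratch loop device instead of the production volume, roughly as sketched below. This is only a sketch under assumptions: the image path, size, loop device, mount point and test-data directory are all hypothetical placeholders, and a freshly created file system may well not trigger the Pass 3 error seen on the real volume. The script only prints the commands unless `RUN=1` is set, since actually executing them needs root and a free loop device.

```shell
#!/bin/sh
# Sketch: replay the two reiserfsck runs from the report on a throwaway image.
# All paths, the size and the loop device below are hypothetical placeholders.
IMG="${IMG:-/tmp/reiserfs-test.img}"
LOG="${LOG:-/tmp/reiserfs-test.fsck.log}"
MNT="${MNT:-/mnt/reiserfs-test}"
DATA="${DATA:-/tmp/reiserfs-test-data}"   # directory holding realistic test data
LOOPDEV="${LOOPDEV:-/dev/loop0}"
SIZE_MB="${SIZE_MB:-512}"

# Print each command by default; execute it only when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

repro() {
    # Step 1: create a small reiserfs file system and fill it with test data.
    run dd if=/dev/zero of="$IMG" bs=1M count="$SIZE_MB"
    run losetup "$LOOPDEV" "$IMG"
    run mkreiserfs -q "$LOOPDEV"
    run mkdir -p "$MNT"
    run mount -t reiserfs "$LOOPDEV" "$MNT"
    run cp -a "$DATA/." "$MNT/"
    run umount "$MNT"
    # Step 2: rebuild without --adjust-size (same options as in the report).
    run reiserfsck --rebuild-tree -l "$LOG" -y "$LOOPDEV"
    # Step 3: rebuild with --adjust-size, the run that hangs the machine.
    run reiserfsck --rebuild-tree -l "$LOG" -y --adjust-size "$LOOPDEV"
    run losetup -d "$LOOPDEV"
}

repro
```

Run as-is, this lists the commands for review; `RUN=1 sh repro.sh` (the file name is arbitrary) would execute them and should only be done on a machine that may crash.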
The first thing to do is to upgrade to the latest kernel and see if it works there.
This actually is the latest kernel... for the hardened profile. I even tried the latest "unstable (~x86)" kernel (2.6.22-r3, if I remember correctly) and could not boot with it. Additionally, I am afraid of destroying even more data by causing another crash while rebuilding the FS. :( It's a production server. A test system would be better suited for this kind of test.

I had hoped that someone else could run the exact command lines with my kernel and see if it crashes too, then try the newest kernel and see if it runs. If it does, they could test the commands on that kernel too. This would help locate the source of the problem, e.g. whether it is a hardware problem, a kernel problem, a reiserfsprogs problem, or some other software or strange config on my system.
When you say it crashes the kernel, do you mean you see an oops/call trace? If so, can you post that here? Can you post your dmesg output from the time of the crash? Any info you can give us will help. Please also post your kernel .config.
Created attachment 131230
.config for kernel 2.6.20-hardened-r6
(In reply to comment #3)
> When you say it crashes the kernel, do you mean you see an oops/call trace?
> If so, can you post that here? Can you post your dmesg output from the time
> of the crash?

Nope. When I switch on the monitor, I only get a black screen. That's it. And as I said, no network connection is possible, so the only choice left is to hard-reset the machine. When I then look at the log files, there is no error message, just the usual entries like "temperature is ok", "smart is fine", "a mail came in". Then the first bootup message follows. So there is no unusual dmesg output either. I could post it the next time, but I don't think there will be a next time, because (as you may understand) I will not "repair" (read: destroy) the FS even more by running reiserfsck another time.

> Any info you can give us will help. Please also post your kernel .config.

OK. It's attached now.
Oh, I found something interesting: while writing to a DVD-RAM on the same SATA controller via pktdvd, I get the following errors several times:

Sep 14 01:54:52 [kernel] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 14 01:54:52 [kernel] ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x5a data 12 in
Sep 14 01:54:52 [kernel] res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Sep 14 01:54:53 [kernel] ata1: soft resetting port
Sep 14 01:55:00 [kernel] ata1: port is slow to respond, please be patient (Status 0xf8)
Sep 14 01:55:23 [kernel] ata1: port failed to respond (30 secs, Status 0xf8)
Sep 14 01:55:23 [kernel] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep 14 01:55:23 [kernel] ATA: abnormal status 0xF8 on port 0xC88DC087
- Last output repeated 5 times -
Sep 14 01:55:53 [kernel] ata1.00: qc timeout (cmd 0xa1)
Sep 14 01:55:53 [kernel] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Sep 14 01:55:53 [kernel] ata1.00: revalidation failed (errno=-5)
Sep 14 01:55:53 [kernel] ata1: failed to recover some devices, retrying in 5 secs
Sep 14 01:55:58 [kernel] ata1: hard resetting port
Sep 14 01:55:58 [kernel] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep 14 01:55:59 [kernel] ata1.00: configured for UDMA/33
Sep 14 01:55:59 [kernel] ata1: EH complete

It happens sporadically, but it is completely reproducible and the messages are always exactly the same. I just do a

growisofs -Z /dev/sr0=/somepath/someimage.iso

Additionally, formatting a DVD+RW is not possible. It causes the same errors, and then the program exits in the middle of the formatting.

So there is a possibility that it's a problem with the SATA controller's driver. (I heard they are still experimental, so I guess such a bug report would be appreciated there.)
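To check whether such libata errors also show up around the times of the reiserfsck hangs, the log could be summarised per port with something like the following. This is a minimal sketch: the function name `ata_error_summary` is made up for illustration, the matched message types are only the ones visible in the excerpt above, and the log file path depends on the logger configuration.

```shell
#!/bin/sh
# Sketch: count libata error lines per ATA port in a kernel log file.
# The set of matched messages is taken from the excerpt in this report
# and is not a complete list of libata error messages.
ata_error_summary() {
    # $1: path to the log file (hypothetical; depends on the logger setup)
    awk '
        match($0, /ata[0-9]+(\.[0-9]+)?: (exception|qc timeout|failed to IDENTIFY|port failed to respond)/) {
            port = substr($0, RSTART, RLENGTH)
            sub(/[.:].*/, "", port)      # "ata1.00: exception" -> "ata1"
            count[port]++
        }
        END { for (p in count) printf "%s %d\n", p, count[p] }
    ' "$1" | sort
}
```

Calling `ata_error_summary /path/to/kernel.log` prints one line per port with the number of matching error lines; an empty result would suggest the controller logged no such errors in that file.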
Created attachment 131231
Hardware-Information

By the way... some info on the hardware, created with lshw.
Can you confirm that the error you get when writing to a DVD-RAM is reproducible on the latest development kernel (2.6.23-rc9 as of this writing)? If so, please post the kernel .config and the complete dmesg output, including the error(s).
Please reopen if/when you can provide the info requested in comment #8.