Today I merged mm-sources-2.6.0_beta3-r2, built it with the same config. Now, at boot up when checking the reiser journal, reiserfsck 3.6.11 segfaults, and I am asked to enter the root password to check the filesystem manually (it also segfaults). I type "exit", and the bootup goes on as usual. Then, in gnome, I tried reiserfsck on another partition. It segfaults. I tried to rebuild reiserfsprogs 3.6.11, it doesn't help. Everything was working fine with mm-sources-2.6.0_beta3-r1.
Could you please post the output of 'strace reiserfsck'? [emerge strace]
Created attachment 16082: strace reiserfsck output
OK, I remerged reiserfsprogs without -fomit-frame-pointer. The strace output is attached.
I forgot to add: reiserfsck alone on the command line does not segfault.
Got the same problem here: reiserfsck segfaults when checking /dev/sda5 (/usr). Snippet from the output:

------------[ cut here ]------------
kernel BUG at mm/filemap.c:1930!
invalid operand: 0000 [#10]
PREEMPT
CPU: 0
EIP: 0060:[<c013b669>] Not tainted VLI
EFLAGS: 00010282
EIP is at generic_file_aio_write_nolock+0xe9/0x100
eax: 00010286 ebx: 00010000 ecx: de230286 edx: de220000
esi: 00000000 edi: de221f6c ebp: de221e84 esp: de221e40
ds: 007b es: 007b ss: 0068
Process fsck.reiserfs (pid: 2400, threadinfo=de220000 task=df6452e0)
Stack: dfd00000 00000000 00000000 c0430c80 00000012 dfcff340 dfcff3d0 df896a40
       de221e84 df896a40 00010000 00000000 c013b7e2 de221e84 de221f6c 00000001
       df896a60 00000000 00000000 00000000 00000001 ffffffff df896a40 df34dba8
Call Trace:
 [<c013b7e2>] generic_file_write_nolock+0xa2/0xc0
 [<c021f100>] write_chan+0x180/0x260
 [<c011df40>] autoremove_wake_function+0x0/0x50
 [<c011c6b0>] default_wake_function+0x0/0x30
 [<c011c6b0>] default_wake_function+0x0/0x30
 [<c011be20>] scheduler_tick+0x210/0x4a0
 [<c0219224>] tty_write+0x224/0x350
 [<c015e137>] blkdev_file_write+0x37/0x40
 [<c015534e>] vfs_write+0xbe/0x130
 [<c0155472>] sys_write+0x42/0x70
 [<c031cafb>] syscall_call+0x7/0xb
Code: f2 90 8b 44 24 40 89 7c 24 04 c7 44 24 08 01 00 00 00 89 2c 24 89 44 24 0c e8 74 f4 ff ff 83 7d 10 ff 89 c7 75 cd e9 6f ff ff ff <0f> 0b 8a 07 ba de 32 c0 e9 53 ff ff ff 8d 76 00 8d bc 27 00 00

I also remerged reiserfsprogs, no luck, but after ^D the system comes up normally. Root on /dev/hda5 is mounted fine:

Aug 14 16:39:32 bull found reiserfs format "3.6" with standard journal
Aug 14 16:39:32 bull Reiserfs journal params: device hda5, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Aug 14 16:39:32 bull reiserfs: checking transaction log (hda5) for (hda5)
Aug 14 16:39:32 bull Using r5 hash to sort names
Aug 14 16:39:32 bull VFS: Mounted root (reiserfs filesystem) readonly.
OK. Please UNpatch (reverse) these diffs in your kernel tree:
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test3/2.6.0-test3-mm2/broken-out/reiserfs-bogus-kunmap-removal.patch
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test3/2.6.0-test3-mm2/broken-out/reiserfs-xattr-fix.patch
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test3/2.6.0-test3-mm2/broken-out/probe-udf-after-reiserfs.patch
I expect that removing them will fix the problem. If it does, please say so, and I'll send Andrew the bugs you've been getting...
If those don't help, try these [they are closer to the call trace]:
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test3/2.6.0-test3-mm2/broken-out/aio-O_SYNC-fix.patch
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test3/2.6.0-test3-mm2/broken-out/aio-12-readahead.patch
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test3/2.6.0-test3-mm2/broken-out/aio-09-o_sync.patch
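For reference, a minimal sketch of what "unpatching" means here: reverse-applying a diff with GNU patch's -R flag, with a --dry-run pass first so that rejected hunks show up before the tree is modified. The scratch directory, hello.c and demo.patch are made up for illustration. On a real tree you would run the two patch commands from the top of the kernel source against the downloaded broken-out patch.

```shell
#!/bin/sh
# Demonstration in a scratch directory; all file names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p a/src b/src tree/src

printf 'line one\nline two\n' > a/src/hello.c          # pristine file
printf 'line one\nline 2 (patched)\n' > b/src/hello.c  # patched file
# diff exits 1 when the files differ, so mask that under `set -e`.
diff -u a/src/hello.c b/src/hello.c > demo.patch || true

# "tree" stands in for a source tree that already has the patch applied.
cp b/src/hello.c tree/src/hello.c
cd tree

# 1. Check that the patch reverses cleanly without touching any files.
patch -R -p1 --dry-run < ../demo.patch
# 2. Actually reverse it.
patch -R -p1 < ../demo.patch

cat src/hello.c
```

If the dry run reports failed hunks, the patch probably has later patches layered on top of it, and those need to be reversed first.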
The first set of patches didn't help. When applying the second set and building the kernel, I get this:

root@gentoo linux # patch -p1 < /home/charlie/tmp2/aio-09-o_sync.patch
patching file fs/aio.c
Hunk #1 FAILED at 28.
Hunk #2 FAILED at 1246.
Hunk #3 FAILED at 1270.
3 out of 3 hunks FAILED -- saving rejects to file fs/aio.c.rej
patching file include/linux/pagemap.h
Hunk #1 succeeded at 211 with fuzz 1 (offset 5 lines).
patching file include/linux/writeback.h
Reversed (or previously applied) patch detected! Assume -R? [n] Apply anyway? [n] y
Hunk #1 FAILED at 87.
1 out of 1 hunk FAILED -- saving rejects to file include/linux/writeback.h.rej
patching file mm/filemap.c
Reversed (or previously applied) patch detected! Assume -R? [n] y
Hunk #1 succeeded at 1903 (offset 19 lines).
Hunk #2 succeeded at 2078 (offset 90 lines).
patching file mm/page-writeback.c
Reversed (or previously applied) patch detected! Assume -R? [n] y
Hunk #2 FAILED at 600.
Hunk #3 FAILED at 608.
Hunk #4 succeeded at 640 (offset 2 lines).
Hunk #5 succeeded at 654 (offset 2 lines).
Hunk #6 succeeded at 664 (offset 2 lines).
2 out of 6 hunks FAILED -- saving rejects to file mm/page-writeback.c.rej
root@gentoo linux # patch -p1 < /home/charlie/tmp2/aio-
aio-09-o_sync.patch aio-12-readahead.patch aio-O_SYNC-fix.patch
root@gentoo linux # patch -p1 < /home/charlie/tmp2/aio-12-readahead.patch
patching file fs/aio.c
Reversed (or previously applied) patch detected! Assume -R? [n] y
Hunk #2 FAILED at 1350.
1 out of 2 hunks FAILED -- saving rejects to file fs/aio.c.rej
patching file mm/filemap.c
Hunk #1 FAILED at 639.
1 out of 1 hunk FAILED -- saving rejects to file mm/filemap.c.rej
root@gentoo linux # patch -p1 < /home/charlie/tmp2/aio-
aio-09-o_sync.patch aio-12-readahead.patch aio-O_SYNC-fix.patch
root@gentoo linux # patch -p1 < /home/charlie/tmp2/aio-09-o_sync.patch
patching file fs/aio.c
Reversed (or previously applied) patch detected! Assume -R? [n] y
Hunk #2 FAILED at 1245.
Hunk #3 FAILED at 1264.
2 out of 3 hunks FAILED -- saving rejects to file fs/aio.c.rej
patching file include/linux/pagemap.h
Hunk #1 succeeded at 219 with fuzz 1 (offset 13 lines).
patching file include/linux/writeback.h
Reversed (or previously applied) patch detected! Assume -R? [n] y
patching file mm/filemap.c
Hunk #1 succeeded at 1903 (offset 19 lines).
Hunk #2 succeeded at 2078 (offset 90 lines).
patching file mm/page-writeback.c
Hunk #2 FAILED at 600.
Hunk #3 FAILED at 612.
Hunk #4 succeeded at 650 (offset 7 lines).
Hunk #5 succeeded at 664 (offset 7 lines).
Hunk #6 succeeded at 674 (offset 7 lines).
2 out of 6 hunks FAILED -- saving rejects to file mm/page-writeback.c.rej
root@gentoo linux # make
make[1]: `arch/i386/kernel/asm-offsets.s' is up to date.
  CC init/main.o
  CHK include/linux/compile.h
  CC init/do_mounts.o
In file included from include/linux/nfs_fs.h:15,
                 from init/do_mounts.c:9:
include/linux/pagemap.h:216: error: redefinition of `wait_on_page_writeback_wq'
include/linux/pagemap.h:203: error: `wait_on_page_writeback_wq' previously defined here
include/linux/pagemap.h:224: error: redefinition of `wait_on_page_writeback_wq'
include/linux/pagemap.h:216: error: `wait_on_page_writeback_wq' previously defined here
make[1]: *** [init/do_mounts.o] Error 1
make: *** [init] Error 2
Hmm, looks like that won't work, sadly, as each patch layers on top of the previous one. Try de-patching every aio-* patch in http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test3/2.6.0-test3-mm2/broken-out/ in order: 01, 02, etc.
Tried sequentially unpatching the aio-* patches, starting at -01, but compiling the kernel fails:

  CC kernel/ksyms.o
In file included from kernel/ksyms.c:52:
include/linux/buffer_head.h: In function `wait_on_buffer_wq':
include/linux/buffer_head.h:279: warning: implicit declaration of function `__wait_on_buffer_wq'
  CC kernel/module.o
  CC kernel/kallsyms.o
  CC kernel/pm.o
  CC kernel/power/process.o
  CC kernel/power/console.o
  LD kernel/power/built-in.o
  CC kernel/acct.o
  LD kernel/built-in.o
  CC mm/bootmem.o
  CC mm/filemap.o
In file included from mm/filemap.c:38:
include/linux/buffer_head.h: In function `wait_on_buffer_wq':
include/linux/buffer_head.h:279: warning: implicit declaration of function `__wait_on_buffer_wq'
mm/filemap.c: In function `wait_on_page_bit_wq':
mm/filemap.c:294: warning: implicit declaration of function `is_sync_wait'
mm/filemap.c:301: `EIOCBRETRY' undeclared (first use in this function)
mm/filemap.c:301: (Each undeclared identifier is reported only once
mm/filemap.c:301: for each function it appears in.)
mm/filemap.c: In function `do_generic_mapping_read':
mm/filemap.c:590: structure has no member named `io_wait'
mm/filemap.c:591: warning: implicit declaration of function `is_retried_kiocb'
mm/filemap.c:591: warning: implicit declaration of function `io_wait_to_kiocb'
mm/filemap.c:591: structure has no member named `io_wait'
mm/filemap.c:618: structure has no member named `io_wait'
make[1]: *** [mm/filemap.o] Error 1
make: *** [mm] Error 2

Some of the aio patches had already failed to unpatch before this (hunks rejected).
Sorry, silly me: I forgot that you need to remove them in reverse order too. http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test3/2.6.0-test3-mm2/announce.txt lists the patches and their order. Start unpatching from the bottom of the file and work towards the top; you can stop at aio-01-retry.patch, as there doesn't seem to be anything else AIO related after that.
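The reverse-order removal described above can be sketched as a small script. This is only an illustration: the series below is a shortened, hypothetical list (the real names and order are whatever announce.txt gives), and /tmp/broken-out is an assumed download directory. The script prints the patch -R commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Print the `patch -R` commands needed to back out a series of layered
# patches, newest first. Nothing is executed here.

# Hypothetical, shortened series in the order the patches were applied.
series='aio-01-retry.patch
aio-09-o_sync.patch
aio-12-readahead.patch'

# Hypothetical directory holding the downloaded broken-out patches.
patch_dir=/tmp/broken-out

revert_commands() {
    # Reverse the series by prepending each name to the accumulated list.
    reversed=
    for name in $series; do
        reversed="$name $reversed"
    done
    for name in $reversed; do
        printf 'patch -R -p1 < %s/%s\n' "$patch_dir" "$name"
    done
}

revert_commands
```

The printed commands are meant to be run from the top of the kernel tree, one at a time, stopping at the first one that reports rejected hunks.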
Done... now the kernel comes up fine. I used a fresh emerge of mm-sources and unpatched only the aio patches you suggested; the reiser patches are still in.
Thanks a lot for your testing and contribution to kernel development ;-) I'll send a mail off to Andrew Morton and see if he can say anything on this...
It was a bogus BUG_ON() check in mm/filemap.c. Removing line 1930 should fix your problems.
... of mm/filemap.c
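A cautious way to make that edit by hand, as a sketch only: the helper below (drop_bug_on is a hypothetical name, not an existing tool) deletes the given line only if it really contains BUG_ON, since line numbers shift between trees. The scratch file stands in for real source and its contents are made up.

```shell
#!/bin/sh
# Delete one line from a file, but only if it really is a BUG_ON line.
# Usage: drop_bug_on FILE LINENO
drop_bug_on() {
    file=$1
    lineno=$2
    if sed -n "${lineno}p" "$file" | grep -q 'BUG_ON'; then
        sed -i.orig "${lineno}d" "$file"   # keeps a backup as FILE.orig
        echo "removed line $lineno from $file"
    else
        echo "line $lineno of $file is not a BUG_ON; leaving it alone" >&2
        return 1
    fi
}

# Demonstration on a scratch file (contents are made up, not kernel code).
demo=$(mktemp)
printf 'int x;\nBUG_ON(x != 0);\nreturn x;\n' > "$demo"
drop_bug_on "$demo" 2
cat "$demo"
```

On the affected tree the equivalent call would be `drop_bug_on mm/filemap.c 1930`, run from the top of the kernel source.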
The fix is in portage. If you sync and remerge, it should work. Marking as fixed.
Thanks for your help. The kernel compiles fine. But the nvidia-kernel doesn't build (was building with mm1 and mm2):

emerge /usr/portage/media-video/nvidia-kernel/nvidia-kernel-1.0.4496.ebuild
Calculating dependencies ...done!
>>> emerge (1 of 1) media-video/nvidia-kernel-1.0.4496 to /
>>> md5 src_uri ;-) NVIDIA-Linux-x86-1.0-4496-pkg0.run
>>> Unpacking source...
Creating directory NVIDIA-Linux-x86-1.0-4496-pkg0
Verifying archive integrity... OK
Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86 1.0-4496.........................................................
 * Linux kernel 2.6.0
 * Applying tasklet patch for kernel 2.[56]... [ ok ]
 * Applying NVIDIA_kernel-1.0-4496-tail.diff... [ ok ]
 * Applying NVIDIA_kernel-1.0-4496-Makefile.diff... [ ok ]
>>> Source unpacked.
rm -f nv.o os-agp.o os-interface.o os-registry.o nv-linux.o nv_compiler.h *.d NVdriver nvidia.o
echo \#define NV_COMPILER \"`gcc -v 2>&1 | tail -n 1`\" > nv_compiler.h
gcc -c -Wall -Wimplicit -Wreturn-type -Wswitch -Wformat -Wchar-subscripts -Wparentheses -Wcast-qual -Wno-multichar -O -MD -D__KERNEL__ -DMODULE -D_LOOSE_KERNEL_NAMES -DKBUILD_MODNAME="nvidia" -DNTRM -D_GNU_SOURCE -D_LOOSE_KERNEL_NAMES -D__KERNEL__ -DMODULE -DNV_MAJOR_VERSION=1 -DNV_MINOR_VERSION=0 -DNV_PATCHLEVEL=4348 -DNV_UNIX -DNV_LINUX -DNV_INT64_OK -DNVCPU_X86 -DREMAP_PAGE_RANGE_5 -I. -I/usr/src/linux/include -I/usr/src/linux/include/asm/mach-default -Wno-cast-qual nv.c
nv.c: In function `nv_kern_read_agpinfo':
nv.c:1964: error: structure has no member named `name'
make: *** [nv.o] Error 1

!!! ERROR: media-video/nvidia-kernel-1.0.4496 failed.
!!! Function src_compile, Line 121, Exitcode 2
!!! (no error message)
Please see http://bugs.gentoo.org/show_bug.cgi?id=26812 for my other fix for the nvidia-kernel :-)
Works now. Thanks a lot !