Bug 208404

Summary: [2.6.24 regression] crash in xfs_file_readdir
Product: Gentoo Linux Reporter: .:deadhead:. <andreamtp+bz>
Component: [OLD] Core system Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers <kernel>
Status: RESOLVED FIXED    
Severity: major CC: tais.hansen, wschlich
Priority: Highest    
Version: unspecified   
Hardware: All   
OS: Linux   
URL: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=450790a2c51e6d9d47ed30dbdcf486656b8e186f
Whiteboard: linux-2.6.24-regression
Package list:
Runtime testing required: ---

Description .:deadhead:. 2008-02-01 10:20:47 UTC
Yesterday I ran into a problem with the 2.6.24 kernel and my xfs partitions. I was accessing my mail via imap (dovecot) when some errors occurred.

I checked the logs and I found this:


Jan 31 15:12:09 stakanov_II BUG: unable to handle kernel paging request at virtual address f8000000
Jan 31 15:12:09 stakanov_II printing eip: c022dd78 *pde = 00000000
Jan 31 15:12:09 stakanov_II Oops: 0000 [#1] PREEMPT
Jan 31 15:12:09 stakanov_II Modules linked in: b43 yenta_socket rsrc_nonstatic ssb
Jan 31 15:12:09 stakanov_II
Jan 31 15:12:09 stakanov_II Pid: 19390, comm: imap Not tainted (2.6.24-gentoo-metallica #4)
Jan 31 15:12:09 stakanov_II EIP: 0060:[<c022dd78>] EFLAGS: 00010282 CPU: 0
Jan 31 15:12:09 stakanov_II EIP is at xfs_file_readdir+0xf2/0x1aa
Jan 31 15:12:09 stakanov_II EAX: 00000000 EBX: 000001a4 ECX: 00000058 EDX: 00000000
Jan 31 15:12:09 stakanov_II ESI: 00000000 EDI: f7fffff8 EBP: f766d780 ESP: e05bbf1c
Jan 31 15:12:09 stakanov_II DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Jan 31 15:12:09 stakanov_II Process imap (pid: 19390, ti=e05ba000 task=f7310f60 task.ti=e05ba000)
Jan 31 15:12:09 stakanov_II Stack: 000001a4 00000000 0f49408a 00000000 00000000 c0159838 e05bbf94 e1de1a80
Jan 31 15:12:09 stakanov_II 00000000 00000000 00000000 000001a4 00000000 f7fff000 00001000 00000ff8
Jan 31 15:12:09 stakanov_II 000001ad 00000000 c03ae120 f766d780 fffffffe e1de27e8 c0159a00 e05bbf94
Jan 31 15:12:09 stakanov_II Call Trace:
Jan 31 15:12:09 stakanov_II [<c0159838>] filldir64+0x0/0xc5
Jan 31 15:12:09 stakanov_II [<c0159a00>] vfs_readdir+0x4a/0x74
Jan 31 15:12:09 stakanov_II [<c0159838>] filldir64+0x0/0xc5
Jan 31 15:12:09 stakanov_II [<c0159a8d>] sys_getdents64+0x63/0xa5
Jan 31 15:12:09 stakanov_II [<c0102546>] sysenter_past_esp+0x5f/0x85
Jan 31 15:12:09 stakanov_II =======================
Jan 31 15:12:09 stakanov_II Code: 04 81 e3 ff ff ff 7f 89 1c 24 ff 54 24 14 85 c0 0f 85 a9 00 00 00 8b 4f 10 31 d2 83 c1 1f 83 e1 f8 29 4c 24 24 19 54 24 28 01 cf <8b> 47 08 8b 57 0c 83 7c 24 28 00 89 44 24 2c 89 54 24 30 7f 99
Jan 31 15:12:09 stakanov_II EIP: [<c022dd78>] xfs_file_readdir+0xf2/0x1aa SS:ESP 0068:e05bbf1c
Jan 31 15:12:09 stakanov_II ---[ end trace 96802517b18c4092 ]---

If you need more info, just ask.
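
A minimal sketch of how the register dump above relates to the faulting address. It assumes the usual oops convention that the byte marked `<8b>` in the "Code:" line is the first byte of the faulting instruction, and the instruction decoding in the comments was done by hand, not taken from this report.

```python
# Values copied from the oops above.
EDI = 0xf7fffff8
FAULT_ADDRESS = 0xf8000000

# Bytes at the marked position in the "Code:" line: <8b> 47 08
# Hand-decoded as i386 "mov 0x8(%edi),%eax":
# opcode 0x8b, ModRM 0x47 (base register EDI, 8-bit displacement), displacement 0x08.
faulting_bytes = bytes([0x8b, 0x47, 0x08])
displacement = faulting_bytes[2]

# Address the instruction tried to read, truncated to 32 bits.
effective_address = (EDI + displacement) & 0xffffffff

print(hex(effective_address))              # 0xf8000000
print(effective_address == FAULT_ADDRESS)  # True
```

Under those assumptions, the read at EDI + 8 lands exactly on the address the kernel could not page in.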
Comment 1 Tais P. Hansen 2008-02-01 13:07:42 UTC
Try using xfs_repair on your xfs partition.
Comment 2 .:deadhead:. 2008-02-01 17:39:39 UTC
(In reply to comment #1)
> Try using xfs_repair on your xfs partition.

Do you think the corruption was caused by the kernel bug, or did the unclean file system trigger this error?

Anyway, after that error, I ran xfs_repair from a livecd on both my partitions: there were a couple of minor problems. With those fixed, everything now works as expected.
Comment 3 Tais P. Hansen 2008-02-01 20:30:53 UTC
I just had a somewhat similar problem on my laptop, and I also used xfs_repair to fix it.

I can't say if it's a kernel bug or not, but xfs (and its tools) have had a tendency to crash (or at least appear to crash) when more serious filesystem errors occur. One could call that a bug, but I think the developers must consider it an intentional one. :) - Also, the fact that Gentoo uses fsck.xfs to check and repair xfs filesystems, which does absolutely nothing (take a look at the script), could also be called a bug. Maybe whoever made the fsck.xfs script thought xfs filesystems can't get corrupted. :)

Anyway, I'm glad it fixed your problem.
Comment 4 .:deadhead:. 2008-02-04 10:19:54 UTC
What a lucky man I am! :D

This is the new error:

Feb  4 10:49:18 stakanov_II BUG: unable to handle kernel paging request at virtual address f8000000
Feb  4 10:49:18 stakanov_II printing eip: c022dd78 *pde = 00000000
Feb  4 10:49:18 stakanov_II Oops: 0000 [#1] PREEMPT
Feb  4 10:49:18 stakanov_II Modules linked in: yenta_socket rsrc_nonstatic snd_hda_intel
Feb  4 10:49:18 stakanov_II
Feb  4 10:49:18 stakanov_II Pid: 6057, comm: imap Not tainted (2.6.24-gentoo-metallica #9)
Feb  4 10:49:18 stakanov_II EIP: 0060:[<c022dd78>] EFLAGS: 00010282 CPU: 0
Feb  4 10:49:18 stakanov_II EIP is at xfs_file_readdir+0xf2/0x1aa
Feb  4 10:49:18 stakanov_II EAX: 00000000 EBX: 000001a9 ECX: 00000058 EDX: 00000000
Feb  4 10:49:18 stakanov_II ESI: 00000000 EDI: f7fffff8 EBP: ef5a2880 ESP: ef40bf1c
Feb  4 10:49:18 stakanov_II DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Feb  4 10:49:18 stakanov_II Process imap (pid: 6057, ti=ef40a000 task=ef626520 task.ti=ef40a000)
Feb  4 10:49:18 stakanov_II Stack: 000001a9 00000000 132fea33 00000000 00000000 c0159838 ef40bf94 ef98bc00
Feb  4 10:49:18 stakanov_II 00000000 00000000 00000000 000001a9 00000000 f7fff000 00001000 00000ff8
Feb  4 10:49:18 stakanov_II 000001b2 00000000 c03a6120 ef5a2880 fffffffe ef98ae28 c0159a00 ef40bf94
Feb  4 10:49:18 stakanov_II Call Trace:
Feb  4 10:49:18 stakanov_II [<c0159838>] filldir64+0x0/0xc5
Feb  4 10:49:18 stakanov_II [<c0159a00>] vfs_readdir+0x4a/0x74
Feb  4 10:49:18 stakanov_II [<c0159838>] filldir64+0x0/0xc5
Feb  4 10:49:18 stakanov_II [<c0159a8d>] sys_getdents64+0x63/0xa5
Feb  4 10:49:18 stakanov_II su[6262]: pam_unix(su:session): session closed for user root
Feb  4 10:49:18 stakanov_II [<c0102546>] sysenter_past_esp+0x5f/0x85
Feb  4 10:49:18 stakanov_II [<c0380000>] sta_info_release+0x0/0x5b
Feb  4 10:49:18 stakanov_II =======================
Feb  4 10:49:18 stakanov_II Code: 04 81 e3 ff ff ff 7f 89 1c 24 ff 54 24 14 85 c0 0f 85 a9 00 00 00 8b 4f 10 31 d2 83 c1 1f 83 e1 f8 29 4c 24 24 19 54 24 28 01 cf <8b> 47 08 8b 57 0c 83 7c 24 28 00 89 44 24 2c 89 54 24 30 7f 99
Feb  4 10:49:18 stakanov_II EIP: [<c022dd78>] xfs_file_readdir+0xf2/0x1aa SS:ESP 0068:ef40bf1c
Feb  4 10:49:18 stakanov_II ---[ end trace d899adf5158e5245 ]---

The problem is that, like last time, I was using dovecot: getmail was downloading mail and thunderbird was accessing that data via imap :|

I guess that this 2.6.24 is not my lucky kernel...
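
A small sketch for checking that this oops and the one in the description are the same crash. It compares only the faulting address and the "EIP is at" line; the function name `oops_signature` and the regular expressions are illustrative assumptions that match the log format quoted in this report, not a general oops parser.

```python
import re

# Patterns for the two oops lines quoted in this report (hypothetical helpers).
FAULT_RE = re.compile(
    r"unable to handle kernel paging request at virtual address ([0-9a-f]+)"
)
EIP_RE = re.compile(r"EIP is at (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def oops_signature(log_text):
    """Return (fault_address, function, offset, size), or None if a line is missing."""
    fault = FAULT_RE.search(log_text)
    eip = EIP_RE.search(log_text)
    if fault is None or eip is None:
        return None
    return (fault.group(1), eip.group(1), int(eip.group(2), 16), int(eip.group(3), 16))


# The relevant lines from the description and from this comment.
first = """Jan 31 15:12:09 stakanov_II BUG: unable to handle kernel paging request at virtual address f8000000
Jan 31 15:12:09 stakanov_II EIP is at xfs_file_readdir+0xf2/0x1aa"""
second = """Feb  4 10:49:18 stakanov_II BUG: unable to handle kernel paging request at virtual address f8000000
Feb  4 10:49:18 stakanov_II EIP is at xfs_file_readdir+0xf2/0x1aa"""

print(oops_signature(first) == oops_signature(second))  # True
```

Both reports fault on the same address at the same offset inside xfs_file_readdir, even though the kernel build number differs (#4 and #9).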
Comment 5 Tais P. Hansen 2008-02-04 10:44:21 UTC
There's definitely a problem with xfs in kernel 2.6.24. More here:

http://groups.google.com/group/fa.linux.kernel/browse_thread/thread/ccc9cae379683eca
Comment 6 .:deadhead:. 2008-02-04 22:40:43 UTC
(In reply to comment #5)
> There's definitely a problem with xfs in kernel 2.6.24. More here:
> 
> http://groups.google.com/group/fa.linux.kernel/browse_thread/thread/ccc9cae379683eca

Thank you very much for this link. I've written to David Chinner (SGI), explaining the whole thing to him. I hope this will help.

Comment 7 .:deadhead:. 2008-02-06 09:33:02 UTC
I contacted SGI and explained what happened, and after a few mails a patch was produced:

http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_file.c.diff?r1=text&tr1=1.163&r2=text&tr2=1.162&f=h

I want to say a big thanks to the whole SGI team, and in particular to David Chinner, who followed the bug.

I don't know when it will be integrated into the main vanilla tree.
Comment 8 Daniel Drake (RETIRED) gentoo-dev 2008-02-10 00:41:10 UTC
Fixed in gentoo-sources-2.6.24-r1 (genpatches-2.6.24-2)