Bug 194732 - kernel 2.6.22.3's xfs driver and xfs_repair crash with my corrupted xfs filesystem
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High major
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-10-04 20:18 UTC by Alex Cannon
Modified: 2007-10-13 01:35 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Description Alex Cannon 2007-10-04 20:18:38 UTC
I've been having lockups, and at least once I noticed a kernel Oops related to my xfs filesystem.  I couldn't record what it said because the filesystem had crashed.

So I booted into single-user mode and mounted / read-only.  xfs_check printed some errors, and when I then ran xfs_repair it crashed.  The xfs_repair crash is pasted below.  I'm guessing the kernel xfs driver and xfs_repair share some of the same code, so fixing one may help fix the other?  I'll try to paste a kernel Oops if I'm able to.


gateway ~ # cat /boot/xfs_error.txt 
gateway ~ # xfs_repair -d /dev/hda6
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
error following ag 2 unlinked list
        - process known inodes and perform inode discovery...
        - agno = 0
b7a7bb90: Badness in key lookup (length)
bp=(bno 1420656, len 16384 bytes) key=(bno 1420656, len 8192 bytes)
*** glibc detected *** xfs_repair: free(): invalid next size (fast): 0x09117408 ***
======= Backtrace: =========
/lib/libc.so.6[0xb7eaf9d0]
/lib/libc.so.6[0xb7eb235a]
/lib/libc.so.6(__libc_memalign+0xb1)[0xb7eb35b1]
xfs_repair[0x80824f5]
======= Memory map: ========
08048000-080d4000 r-xp 00000000 03:06 5403924    /sbin/xfs_repair
080d4000-080d5000 rw-p 0008c000 03:06 5403924    /sbin/xfs_repair
080d5000-0912b000 rw-p 080d5000 00:00 0          [heap]
b4f00000-b4f21000 rw-p b4f00000 00:00 0
b4f21000-b5000000 ---p b4f21000 00:00 0
b5068000-b5072000 r-xp 00000000 03:06 16934382   /usr/lib/gcc/i686-pc-linux-gnu/4.1.2/libgcc_s.so.1
b5072000-b5073000 rw-p 00009000 03:06 16934382   /usr/lib/gcc/i686-pc-linux-gnu/4.1.2/libgcc_s.so.1
b5073000-b5277000 rw-p b5073000 00:00 0
b5277000-b5278000 ---p b5277000 00:00 0
b5278000-b5a78000 rw-p b5278000 00:00 0
b5a78000-b5a79000 ---p b5a78000 00:00 0
b5a79000-b6279000 rw-p b5a79000 00:00 0
b6279000-b627a000 ---p b6279000 00:00 0
b627a000-b6a7a000 rw-p b627a000 00:00 0
b6a7a000-b6a7b000 ---p b6a7a000 00:00 0
b6a7b000-b727b000 rw-p b6a7b000 00:00 0
b727b000-b727c000 ---p b727b000 00:00 0
b727c000-b7e4c000 rw-p b727c000 00:00 0
b7e4c000-b7f6f000 r-xp 00000000 03:06 6690463    /lib/libc-2.5.so
b7f6f000-b7f70000 r--p 00123000 03:06 6690463    /lib/libc-2.5.so
b7f70000-b7f72000 rw-p 00124000 03:06 6690463    /lib/libc-2.5.so
b7f72000-b7f75000 rw-p b7f72000 00:00 0
b7f75000-b7f7c000 r-xp 00000000 03:06 6671750    /lib/librt-2.5.so
b7f7c000-b7f7e000 rw-p 00006000 03:06 6671750    /lib/librt-2.5.so
b7f7e000-b7f91000 r-xp 00000000 03:06 6654226    /lib/libpthread-2.5.so
b7f91000-b7f92000 r--p 00012000 03:06 6654226    /lib/libpthread-2.5.so
b7f92000-b7f93000 rw-p 00013000 03:06 6654226    /lib/libpthread-2.5.so
b7f93000-b7f95000 rw-p b7f93000 00:00 0
b7f95000-b7f97000 r-xp 00000000 03:06 58108646   /lib/libuuid.so.1.2
b7f97000-b7f98000 rw-p 00001000 03:06 58108646   /lib/libuuid.so.1.2
b7f98000-b7f99000 rw-p b7f98000 00:00 0
b7fb1000-b7fcb000 r-xp 00000000 03:06 6692170    /lib/ld-2.5.so
b7fcb000-b7fcc000 r--p 00019000 03:06 6692170    /lib/ld-2.5.so
b7fcc000-b7fcd000 rw-p 0001a000 03:06 6692170    /lib/ld-2.5.so
bff75000-bff8b000 rw-p bff75000 00:00 0          [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0          [vdso]
Aborted

gateway ~ # emerge --info
Portage 2.1.2.12 (default-linux/x86/2006.1, gcc-4.1.2, glibc-2.5-r4, 2.6.22.3 i686)
=================================================================
System uname: 2.6.22.3 i686 Pentium III (Coppermine)
Gentoo Base System release 1.12.9
Timestamp of tree: Wed, 03 Oct 2007 23:00:01 +0000
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
app-shells/bash:     3.2_p17
dev-java/java-config: 1.2.11
dev-lang/python:     2.4.4-r4
dev-python/pycrypto: 2.0.1-r6
sys-apps/baselayout: 1.12.9-r2
sys-apps/sandbox:    1.2.17
sys-devel/autoconf:  2.13, 2.61-r1
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5, 1.10
sys-devel/binutils:  2.17
sys-devel/gcc-config: 1.3.16
sys-devel/libtool:   1.5.24
virtual/os-headers:  2.6.21
ACCEPT_KEYWORDS="x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=pentium3 -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.3/env /usr/kde/3.3/share/config /usr/kde/3.3/shutdown /usr/kde/3.4/env /usr/kde/3.4/share/config /usr/kde/3.4/shutdown /usr/lib/mozilla/defaults/pref /usr/share/X11/xkb /usr/share/config /var/qmail/alias /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/terminfo"
CXXFLAGS="-O2 -march=pentium3 -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks metadata-transfer sandbox sfperms strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --filter=H_**/files/digest-*"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X acpi aim alsa arts berkdb bitmap-fonts cli cracklib crypt cups divx4linux dri dvd esd foomaticdb fortran gdbm gif gimpprint gpm gtk iconv ipv6 isdnlog jpeg midi mmx mmx2 mmxext mozilla mudflap ncurses nls nptl nptlonly opengl openmp pam pcre perl ppds pppd python qt readline real reflection session spl sse ssl tcpd truetype truetype-fonts type1-fonts unicode usb x86 xorg xvid zlib" ALSA_CARDS="snd-sb16" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU" VIDEO_CARDS="apm ark chips cirrus cyrix dummy fbdev glint i128 i740 i810 imstt mach64 mga neomagic nsc nv r128 radeon rendition s3 s3virge savage siliconmotion sis sisusb tdfx tga trident tseng v4l vesa vga via vmware voodoo"
Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY
Comment 1 Maarten Bressers (RETIRED) gentoo-dev 2007-10-07 20:42:05 UTC
What you've posted here is a userspace error. Please post the kernel oops if/when you manage to capture it. If you get the oops, please reproduce it with the latest development kernel, 2.6.23-rc9 as of this writing. Of course, you can switch to 2.6.23-rc9 first and then try to get it to oops, to save yourself some time.

It's possible that this is hardware-related, since you say you've been having lockups. Are all these lockups xfs related? Are you sure there's nothing wrong with your disk(s)?

When you get an oops, please post your kernel .config, dmesg output, and some info about your hardware.
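
For reference, a minimal sketch of one way to gather the items requested above into a single directory. The function name `collect_report`, the default output directory, and the file names are made up for illustration; `/proc/config.gz` only exists if the kernel was built with CONFIG_IKCONFIG_PROC, and `/usr/src/linux/.config` is assumed to be the config of the running kernel.

```shell
#!/bin/sh
# Sketch: collect dmesg, the kernel config and basic hardware info.
# collect_report and its default path are illustrative names only.
collect_report() {
    out="${1:-/tmp/xfs-bug-report}"
    mkdir -p "$out" || return 1

    # Kernel ring buffer; XFS and disk I/O errors would show up here.
    dmesg > "$out/dmesg.txt" 2>/dev/null || true

    # Kernel config: prefer the running kernel's embedded copy,
    # otherwise fall back to the source tree if one is present.
    if [ -r /proc/config.gz ]; then
        zcat /proc/config.gz > "$out/kernel-config.txt" || true
    elif [ -r /usr/src/linux/.config ]; then
        cp /usr/src/linux/.config "$out/kernel-config.txt" || true
    fi

    # Basic hardware description.
    uname -a > "$out/uname.txt"
    if [ -r /proc/cpuinfo ]; then
        cat /proc/cpuinfo > "$out/cpuinfo.txt"
    fi
    if command -v lspci >/dev/null 2>&1; then
        lspci > "$out/lspci.txt" 2>/dev/null || true
    fi

    # Print where the files ended up.
    echo "$out"
}
```

Calling `collect_report /root/report` writes the files under `/root/report`, which can then be attached to the bug. If smartmontools happens to be installed, `smartctl -a /dev/hda` is one possible way to check the disk itself.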
Comment 2 Alex Cannon 2007-10-07 23:12:46 UTC
I guess this bug has two parts really.  The first is that the xfs kernel drivers caused an Oops at least once, and I didn't have a way to record what it said.  The second is that xfs_repair crashes when trying to repair my filesystem.

This bug probably shouldn't be assigned to "kernel" because it's xfs_repair that is causing the problem right now, and I don't have a way of reproducing the kernel Oops.

The kernel part may be related to hardware, but the xfs_repair part does exactly the same thing every time.  There aren't any errors in dmesg about disk access or anything.

I would like to work on the xfs_repair issue first before trying to figure out what may be wrong with the kernel xfs driver.
Comment 3 Maarten Bressers (RETIRED) gentoo-dev 2007-10-07 23:22:19 UTC
I agree with you that this is not (primarily) a kernel issue, so I'm reassigning it to the xfsprogs maintainers.
Comment 4 SpanKY gentoo-dev 2007-10-08 00:02:30 UTC
please open a bug with the same information here:
http://oss.sgi.com/bugzilla/

you should also test the latest version of xfsprogs rather than the current x86 stable version
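
A rough sketch of how the testing version could be accepted on an x86 system like the one above. The helper name `keyword_xfsprogs` is made up for illustration, and it assumes `/etc/portage/package.keywords` is a plain file rather than a directory.

```shell
#!/bin/sh
# Sketch: accept the testing (~x86) keyword for sys-fs/xfsprogs.
# keyword_xfsprogs is an illustrative helper, not part of Portage.
keyword_xfsprogs() {
    kw_file="${1:-/etc/portage/package.keywords}"
    line="sys-fs/xfsprogs ~x86"
    # Append the entry only if it is not already present.
    grep -qxF -- "$line" "$kw_file" 2>/dev/null || printf '%s\n' "$line" >> "$kw_file"
}

# Then, as root (not run by this sketch):
#   keyword_xfsprogs
#   emerge --ask --verbose sys-fs/xfsprogs
#   xfs_repair -n /dev/hda6   # no-modify mode: only reports problems
```

Running `xfs_repair -n` first shows what the newer version would change before anything is written to the disk.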
Comment 5 Alex Cannon 2007-10-13 01:35:40 UTC
I downloaded xfsprogs-2.9.4 and it was able to repair my filesystem without crashing.  Now there are no more errors on it.  My assumption is that the kernel Oops(es) I saw happened because the driver ran into corruption on the filesystem and crashed (instead of remounting read-only or whatever it's supposed to do in that event).  So there may still be a bug in the kernel's xfs driver, but I have no way of finding it now.

Maybe the older versions of xfsprogs should be masked in portage?
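
Until something like that happens in the tree, a local mask would have a similar effect on a single machine. A minimal sketch, assuming `/etc/portage/package.mask` is a plain file; the helper name `mask_old_xfsprogs` is made up for illustration, and 2.9.4 is used as the cutoff only because that is the version that worked here.

```shell
#!/bin/sh
# Sketch: locally mask xfsprogs versions older than 2.9.4.
# mask_old_xfsprogs is an illustrative helper, not part of Portage.
mask_old_xfsprogs() {
    mask_file="${1:-/etc/portage/package.mask}"
    atom="<sys-fs/xfsprogs-2.9.4"
    # Append the atom only if it is not already listed.
    grep -qxF -- "$atom" "$mask_file" 2>/dev/null || printf '%s\n' "$atom" >> "$mask_file"
}
```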