Bug 246179 - net-fs/openafs-kernel-1.4.8_pre3 crashes kernel
Summary: net-fs/openafs-kernel-1.4.8_pre3 crashes kernel
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: AMD64 Linux
Importance: High normal
Assignee: Stefaan De Roeck (RETIRED)
Reported: 2008-11-09 16:32 UTC by Mike Hammill
Modified: 2009-08-23 10:04 UTC

Description Mike Hammill 2008-11-09 16:32:46 UTC
Have reproduced the following behavior twice:
# df -h
# above hangs after listing some of the files
# Message in dmesg:
Starting AFS cache scan...found 1714 non-empty cache files (3%).
BUG: unable to handle kernel paging request at fffffffffffffffe
IP: [<ffffffff80530ce8>] _read_lock+0x0/0xc
PGD 203067 PUD 204067 PMD 0 
Oops: 0002 [1] SMP 
CPU 1 
Modules linked in: libafs(P) snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_timer snd i2c_i801 i2c_core soundcore snd_page_alloc
Pid: 4746, comm: afsd Tainted: P         2.6.25-gentoo-r7 #2
RIP: 0010:[<ffffffff80530ce8>]  [<ffffffff80530ce8>] _read_lock+0x0/0xc
RSP: 0018:ffff8100755cde88  EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8100755cdeec RCX: 0000000000000000
RDX: 0000000000000010 RSI: ffffc20001f67348 RDI: fffffffffffffffe
RBP: 0000000049031e33 R08: 0000000000000000 R09: 0000000100000000
R10: ffffc20001d1f050 R11: 00000000000000c0 R12: 0000000049031e33
R13: 0000000049031e33 R14: 0000000049031cd8 R15: 0000000049031e21
FS:  0000000000000000(0000) GS:ffff81007f36acc0(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: fffffffffffffffe CR3: 000000006e80d000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process afsd (pid: 4746, threadinfo ffff8100755cc000, task ffff81007e7c6080)
Stack:  ffffffff8809134b ffff8100755cdeec ffffffff88096fe4 0000000049031e33
 ffffffff88087109 4903101349031e21 00000000490315cf 0000000049031e33
 0000000036298831 0000000049031e33 0000000036298831 ffffffff807502a0
Call Trace:
 [<ffffffff8809134b>] ? :libafs:afs_osi_TraverseProcTable+0x12/0x61
 [<ffffffff88096fe4>] ? :libafs:afs_GCPAGs+0xad/0x164
 [<ffffffff88087109>] ? :libafs:afs_Daemon+0x352/0x419
 [<ffffffff880d058c>] ? :libafs:afsd_launcher+0x225/0x6b3
 [<ffffffff8022c9cd>] ? schedule_tail+0x27/0x5c
 [<ffffffff880d0367>] ? :libafs:afsd_launcher+0x0/0x6b3
 [<ffffffff8020bc88>] ? child_rip+0xa/0x12
 [<ffffffff880d0367>] ? :libafs:afsd_launcher+0x0/0x6b3
 [<ffffffff880d0394>] ? :libafs:afsd_launcher+0x2d/0x6b3
 [<ffffffff8020bc7e>] ? child_rip+0x0/0x12


Code: 8b 07 38 e0 75 0a 66 89 c2 fe c6 f0 66 0f b1 17 0f 94 c2 0f b6 c2 85 c0 0f 95 c0 0f b6 c0 c3 fe 07 c3 fe 07 56 9d c3 fe 07 fb c3 <f0> 83 2f 01 79 05 e8 0d cb de ff c3 9c 58 fa f0 83 2f 01 79 05 
RIP  [<ffffffff80530ce8>] _read_lock+0x0/0xc
 RSP <ffff8100755cde88>
CR2: fffffffffffffffe
---[ end trace ff550a46ac0834f8 ]---


Reproducible: Always

Steps to Reproduce:
(see above)

Actual Results:  
The system continues to run, but df -h stays hung.  Unfortunately, the system will then NOT reboot with "reboot" and must be powered off.


Note that this is an x86_64 Intel machine (CXXFLAGS="-march=nocona -O2 -pipe") running Portage 2.1.4.5 (default-linux/amd64/2007.0, gcc-4.1.2, glibc-2.6.1-r0, 2.6.25-gentoo-r7 x86_64).  As it uses the amd64 profile (as suggested by Gentoo), I am filing this under AMD64.  Please feel free to change this.
Comment 1 Wormo (RETIRED) gentoo-dev 2008-11-10 07:03:14 UTC
I updated the subject line because you left off the package name -- make sure I got it right. Also, please post output of 'emerge --info' command.
Comment 2 Wormo (RETIRED) gentoo-dev 2008-11-10 07:13:33 UTC
Hm, this list thread looks quite relevant, a slightly older afs but very similar oops:
http://www.nabble.com/Erratic-kernel-Oops-(afsd-1.4.7_pre3-and-Linux-2.6.25)-td16772220.html
Comment 3 Mike Hammill 2008-11-12 14:59:27 UTC
Hi Wormo,

Yes, you got the package name just right.  Sorry I left that out.  Below is the emerge --info.  The only thing that has changed since I reported the bug is that I switched profiles to default/linux/amd64/2008.0/no-multilib.  This caused no change in my USE flags compared with the 2007 profile, since I was running without multilib there as well.  So apart from the profile name change, this is the correct emerge --info.  I also think you are right about the relevance of the nabble.com link you gave.  However, since the machine in question runs production (typically without AFS), I'm a little afraid to try the GCPAGs thing suggested in that link.

Thanks for the interest,
/Mike

# emerge --info
Portage 2.1.4.5 (default/linux/amd64/2008.0/no-multilib, gcc-4.1.2, glibc-2.6.1-r0, 2.6.25-gentoo-r7 x86_64)
=================================================================
System uname: 2.6.25-gentoo-r7 x86_64 Intel(R) Pentium(R) D CPU 2.80GHz
Timestamp of tree: Wed, 12 Nov 2008 12:00:01 +0000
ccache version 2.4 [enabled]
app-shells/bash:     3.2_p33
dev-lang/python:     2.5.2-r7
dev-python/pycrypto: 2.0.1-r6
dev-util/ccache:     2.4-r7
sys-apps/baselayout: 1.12.11.1
sys-apps/sandbox:    1.2.18.1-r2
sys-devel/autoconf:  2.13, 2.61-r2
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.1-r1
sys-devel/binutils:  2.18-r3
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.23-r3
ACCEPT_KEYWORDS="amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=nocona -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/terminfo /etc/udev/rules.d"
CXXFLAGS="-march=nocona -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="ccache distlocks metadata-transfer parallel-fetch sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="http://ftp.linux.ee/pub/gentoo/distfiles/ http://ftp.rhnet.is/pub/gentoo/ http://mirror.gentoo.no/ http://gentoo.osuosl.org/"
LDFLAGS="-Wl,-O1"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="acl afs amd64 apache2 berkdb bzip2 cli cracklib crypt cups curl dri emacs fam fortran gdbm geoip gpm iconv ipv6 isdnlog ldap mailwrapper midi mmx mudflap ncurses network-cron nls nptl nptlonly openmp pam pcre perl pppd python readline reflection session spl sse sse2 ssl sysfs tcpd threads unicode vhosts xattr xml xorg zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="access actions alias asis auth auth_anon auth_dbm auth_digest authz_default authz_host autoindex cache case_filter_in case_filter cern_meta cgi cgid charset_lite dav dav_fs dav_lock deflate dir disk_cache echo env expires ext_filter file_cache filter headers imap include info log_config logio mem_cache mime mime_magic negotiation proxy proxy_connect proxy_ftp proxy_http rewrite setenvif so speling status unique_id unique_id userdir usertrack vhost_alias" APACHE2_MPMS="worker" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU" VIDEO_CARDS="fbdev glint i810 intel mach64 mga neomagic nv r128 radeon savage sis tdfx trident vesa vga via vmware voodoo"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LANG, LC_ALL, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

Comment 4 Stefaan De Roeck (RETIRED) gentoo-dev 2008-11-12 19:39:51 UTC
I think the kernel config is especially relevant for this.  Could you check whether CONFIG_KEYS is enabled or not?  If not, could you try enabling it in your kernel?  (i.e. recompile kernel, install kernel, rebuild openafs-kernel, reboot)
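For anyone following along, the CONFIG_KEYS check asked for above could be scripted roughly as follows. This is only a sketch: the helper name config_keys_enabled is made up for illustration, and the path /usr/src/linux/.config assumes the usual Gentoo symlink to the kernel source you actually built.

```shell
# Hypothetical helper (name invented for this sketch): succeeds only if
# CONFIG_KEYS is built in, i.e. the .config contains the line CONFIG_KEYS=y.
config_keys_enabled() {
    # $1: path to a kernel .config; defaults to the assumed Gentoo location
    grep -q '^CONFIG_KEYS=y$' "${1:-/usr/src/linux/.config}"
}

# Guarded usage: only report when the assumed config file is readable.
cfg=/usr/src/linux/.config
if [ -r "$cfg" ]; then
    if config_keys_enabled "$cfg"; then
        echo "CONFIG_KEYS is enabled"
    else
        echo "CONFIG_KEYS is not set"
    fi
fi
```

Anchoring the pattern at both ends keeps related options such as CONFIG_KEYS_DEBUG_PROC_KEYS from being mistaken for CONFIG_KEYS itself.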
Comment 5 Stefaan De Roeck (RETIRED) gentoo-dev 2008-11-28 13:00:44 UTC
Could you provide some feedback please?
Comment 6 Mike Hammill 2008-12-11 17:13:45 UTC
(In reply to comment #5)
> Could you provide some feedback please?

Sorry for the slow reply.  I had surgery and was out of it for a while.  Currently: 

tjasse ~ # cd /usr/src/linux
tjasse linux # grep CONFIG_KEYS .config
# CONFIG_KEYS is not set
tjasse linux # uname -a
Linux tjasse 2.6.26-gentoo-r4 #1 SMP Sun Dec 7 15:17:18 CET 2008 x86_64 Intel(R) Pentium(R) D CPU 2.80GHz GenuineIntel GNU/Linux

But this also applied to the kernel I was running when I reported the error:

tjasse linux # cd ../linux-2.6.25-gentoo-r7/
tjasse linux-2.6.25-gentoo-r7 # grep CONFIG_KEYS .config
# CONFIG_KEYS is not set

I also now have OpenAFS 1.4.8 installed (but not running) on the system, instead of the 1.4.8_pre3 that I initially had the problem with.  Since this is a production machine and I unfortunately don't have any other x86_64 machines to test on, the soonest I could run a test would be over the weekend.  Assuming I can do that, which kernel/openafs version would it help you most for me to try?
Comment 7 Stefaan De Roeck (RETIRED) gentoo-dev 2008-12-13 08:16:48 UTC
(In reply to comment #6)
> # CONFIG_KEYS is not set
> 
> But this also applied to the kernel I was running when I reported the error:
OpenAFS follows different code paths depending on this parameter, and I currently have the feeling that the path with CONFIG_KEYS enabled is the better-tested one.  
This may be relevant because there have lately been changes in the code that depends on CONFIG_KEYS.  

> I also now have OpenAFS 1.4.8 installed but not running on the system now,
> instead of the 1.4.8_pre3 that I initially had the problem with.  Since this
> is a production machine and I unfortunately don't have any other x86_64
> machines to test on, the soonest test I would be able to do would be over the
> weekend.  Assuming I can do that, which kernel/openafs version would it help
> you most for me to try?

Well, if you have 1.4.8 (as opposed to 1.4.8_pre3) without CONFIG_KEYS ready to test and you can easily reproduce the problem, it would be interesting to know whether that works.  
Otherwise, I would still bet on 1.4.8 with a kernel that has CONFIG_KEYS enabled (take care to build the new kernel, reboot into it, make sure you build against the new kernel (KERNEL_DIR/KBUILD_OUTPUT), remerge openafs-kernel, and only then start the client).  

I hope your tests turn out well.  Thanks for testing.
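
One possible way to double-check the "build against the new kernel" step before starting the client is to compare the module's vermagic string with the running kernel release. This is a sketch under assumptions: the helper name vermagic_matches is invented here, and the module name libafs is taken from the "Modules linked in" line of the oops, so it may be named differently on other setups.

```shell
# Hypothetical helper (name invented for this sketch): does the first word of
# a module's vermagic string equal the given kernel release?
vermagic_matches() {
    # $1: vermagic string, e.g. "2.6.27-gentoo-r7 SMP mod_unload"
    # $2: kernel release, e.g. the output of `uname -r`
    [ "${1%% *}" = "$2" ]
}

# Guarded usage: only runs where modinfo can find a module named libafs
# (module name assumed from the oops output).
if vermagic=$(modinfo -F vermagic libafs 2>/dev/null); then
    if vermagic_matches "$vermagic" "$(uname -r)"; then
        echo "libafs was built for the running kernel"
    else
        echo "libafs was built for a different kernel: $vermagic"
    fi
fi
```

If modinfo finds no libafs module at all, that may itself mean openafs-kernel was not installed for the running kernel.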
Comment 8 Mike Hammill 2008-12-28 12:29:06 UTC
(In reply to comment #7)
Well, I have now tried compiling 2.6.27-gentoo-r7 with both CONFIG_KEYS and CONFIG_KEYS_DEBUG_PROC_KEYS set.  I compiled net-fs/openafs-kernel-1.4.8 against this kernel, but as soon as I try to run /etc/init.d/openafs-client start I get a kernel oops.  libafs is involved in the oops.  So far I don't seem to have any problems with the kernel with CONFIG_KEYS set, as long as I don't run openafs.  I tried to be careful with the compiling, etc., as you outlined.

Comment 9 Stefaan De Roeck (RETIRED) gentoo-dev 2009-08-23 10:04:17 UTC
Closing, as it seems to work with CONFIG_KEYS.