Bug 68819 - kernel 2.6.9-gentoo-r1 oops at boot time
Summary: kernel 2.6.9-gentoo-r1 oops at boot time
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-10-25 06:03 UTC by John Robinson
Modified: 2004-11-10 22:18 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
My .config (.config,32.32 KB, text/plain)
2004-10-25 06:04 UTC, John Robinson
The .config I used for 2.6.9-r1 (.config,33.02 KB, text/plain)
2004-10-25 16:34 UTC, John Robinson
Here's one that works (.config,29.54 KB, text/plain)
2004-10-25 17:38 UTC, John Robinson

Description John Robinson 2004-10-25 06:03:15 UTC
I'm afraid I've no idea what's gone wrong, but I'll give what details I can. I had to reinstall my system after a hard disc crash, and worked from the 2004.3-test1 LiveCD with EVMS. In the first instance rather than build a new kernel I simply copied the LiveCD kernel, and made up an EVMS initrd, under which my system boots and runs fine. Then I emerge'd gentoo-dev-sources 2.6.8-r10, built a kernel, remade my EVMS initrd with the new kernel's EVMS-related modules, and I find it crashes with an oops while the system is booting. It gets past the EVMS initrd and into the Gentoo init system, gets through several items there, sits "checking module dependencies" for a little while (as expected), then several screenfuls of blurb fly by and I'm left with an oops. Here's what's left of the oops that I can see on the screen (typed by hand so may not be perfectly correct):

Code: 8b 80 88 00 00 00 89 4c 24 04 c7 04 24 58 e8 2e c0 89 44 24
 <1>Unable to handle kernel paging request at virtual address ebffffd8
 printing eip:
c0118661
*pde = 00000000
Oops: 0002 [#103]
Modules linked in: via_rhine mii crc32 ide_tape st ide_cd sr_mod cdrom raid1 md dm_bbr dm_mod
CPU:    0
EIP:    0060:[<c0118661>]    Not tainted
EFLAGS: 00010093   (2.6.8-gentoo-r10)
EIP is at scheduler_tick+0x101/0x430
eax: ebffffd0   ebx: 00000001   ecx: 00000000   edx: 00000000
esi: c0222708   edi: 00000000   ebp: c03def90   esp: c03def78
ds: 007b   es: 007b   ss: 0068
Process   XXXXXXXX (pid: 18108811, threadinfo=c03de000 task=c0222708)
Stack: 00000000 00000000 ffffff9c 00000000 00000001 00000000 cc6c7244 c0122cb4
       00000000 00000001 00000001 00000000 cc6c7244 20000001 c0122e94 00000000
       cc6c7244 c010c4b8 cc6c7244 c032c228 20000001 00000000 c01086f9 00000000
Call Trace:
Stack pointer is garbage, not printing trace
Code: 0f ba 68 08 03 83 c4 0c 5b 5e 5f 5d c3 89 f6 8b 7e 18 83 ff
 <0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing

...except that under Process where I've written XXXXXXXX there are about 10 random-looking characters: graphic double vertical bar with leg to right, diamond, dollar, superscript n, five, slash, graphic L-shape, graphic backwards F-shape, U, smiley face.
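
For reference, the `Oops: 0002` value in the dump above is the i386 page-fault error code, and its low three bits can be read off mechanically. The following is a minimal sketch, assuming the usual i386 bit layout (bit 0: protection fault vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name `decode_oops_code` is made up for illustration and is not part of any kernel tool.

```python
# Low three bits of the i386 page-fault error code printed after "Oops:".
# Each entry: (bit mask, meaning if the bit is set, meaning if it is clear).
PAGE_FAULT_BITS = [
    (0x1, "protection fault", "page not present"),
    (0x2, "write access", "read access"),
    (0x4, "user mode", "kernel mode"),
]


def decode_oops_code(code_hex):
    """Decode the hex string printed after 'Oops:' into readable facts.

    Hypothetical helper: only meaningful for page-fault oopses such as
    'Unable to handle kernel paging request'.
    """
    code = int(code_hex, 16)
    return [
        set_text if code & mask else clear_text
        for mask, set_text, clear_text in PAGE_FAULT_BITS
    ]


if __name__ == "__main__":
    # The value from the oops in this report.
    print(decode_oops_code("0002"))
```

Read this way, 0002 would mean a write to a not-present page while in kernel mode, which fits the "Unable to handle kernel paging request" line but does not by itself say which config option is responsible.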

I'm not sure the oops actually occurs during the module dependency check though.  Under the LiveCD kernel, the next few steps are: autoload via-rhine, activate EVMS2 (unnecessary since the initrd already did it, but it appears harmless), Starting up RAID devices (ditto), Checking all filesystems. I guess my other partitions never got mounted when the system oopses, as the fsck on booting the working kernel says "Filesystem marked as cleanly umounted".

I'll attach my .config when I reboot to run the system under the 2.6.8-r9 kernel from the LiveCD.

Reproducible: Always

The system is a VIA EPIA M-II 10000, which until my hard disc crash ran fine
under a 2.4.26 gentoo kernel.

# emerge info
Portage 2.0.50-r11 (default-x86-2004.2, gcc-3.3.4, glibc-2.3.4.20040808-r1,
2.6.8-gentoo-r9)
=================================================================
System uname: 2.6.8-gentoo-r9 i686 VIA Nehemiah
Gentoo Base System version 1.4.16
distcc 2.16 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
Autoconf: sys-devel/autoconf-2.59-r5
Automake: sys-devel/automake-1.8.5-r1
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-march=pentium3 -O2 -Os -fomit-frame-pointer -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER=""
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config
/usr/kde/3.2/share/config /usr/kde/3/share/config /usr/lib/mozilla/defaults/pref
/usr/share/config /var/bind /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=pentium3 -O2 -Os -fomit-frame-pointer -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache distcc sandbox"
GENTOO_MIRRORS="http://www.mirrorservice.org/sites/www.ibiblio.org/gentoo
http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo
http://ftp.easynet.nl/mirror/gentoo http://gentoo.osuosl.org"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY=""
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="X alsa apache2 apm atm avi berkdb bitmap-fonts bzlib cdr crypt cups dbase
dga directfb divx4linux doc dvd emacs encode f77 fbcon foomaticdb ftp gd gdbm
gif gnome gpm gtk gtk2 imagemagick imap imlib java jikes jpeg ldap libg++ libwww
lirc mad maildir mailwrapper mbox mhash mikmod mmap mmx motif mozilla mpeg mysql
ncurses nls oggvorbis opengl oss pam pcmcia pcre pdflib perl php pic png pnp
python qt quicktime readline ruby samba sasl sdl shared slang spell sse ssl svga
tcltk tcpd truetype unicode usb vhosts x86 xinerama xml xml2 xmms xosd xprint
xsl xv xvid zlib"
Comment 1 John Robinson 2004-10-25 06:04:51 UTC
Created attachment 42556 [details]
My .config
Comment 2 Daniel Drake (RETIRED) gentoo-dev 2004-10-25 10:57:55 UTC
Please try gentoo-dev-sources-2.6.9-r1
Comment 3 John Robinson 2004-10-25 16:33:34 UTC
OK, have done. It crashes again, even earlier in the boot sequence, almost instantly after init starts - I didn't notice any green and blue for the "Gentoo Linux" banner fly past. This time I get what must be a stack trace, and unless you desperately need them, I'll skip the [<xxxxxxxx>] bits...

bad: scheduling while atomic!
 [<c02ef6bd>] schedule+0x47d/0x490
 [<c0xxxxxx>] file_read_actor+0x106/0x120
 [<c0xxxxxx>] sys_sched_yield+0x50/0x70
 [<c0xxxxxx>] coredump_wait+0x36/0xa0
 [<c0xxxxxx>] do_coredump+0xe1/0x1cd
 [<c0xxxxxx>] __dequeue_signal+0xf5/0x1b0
 [<c0xxxxxx>] dequeue_signal+0x35/0xa0
 [<c0xxxxxx>] get_signal_to_deliver+0x1f2/0x2f0
 [<c0xxxxxx>] do_signal+0x9b/0x130
 [<c0xxxxxx>] dput+0x195/0x1a0
 [<c0xxxxxx>] __fpy+0x105/0x170
 [<c0xxxxxx>] filp_close+0x59/0x90
 [<c0xxxxxx>] do_page_fault+0x0/0x5f0
 [<c0xxxxxx>] do_notify_resume+0x37/0x40
 [<c0xxxxxx>] work_notifysig+0x13/0x15
========================
 [<c0xxxxxx>] it_real_fn+0x0/0x60
 [<c0xxxxxx>] schedule+0x294/0x490
 [<c0xxxxxx>] it_real_fn+0x0/0x60
 [<c0xxxxxx>] schedule+0x294/0x490
 [<c0xxxxxx>] it_real_fn+0x0/0x60
 [<c0xxxxxx>] schedule+0x294/0x490
Kernel panic - not syncing: Aiee, killing interrupt handler!

To make this kernel config originally, I just started from fresh, i.e. as soon as I'd run `emerge gentoo-dev-sources` I used `make menuconfig` and picked all the likely-looking options for my system. To try 2.6.9-r1 I copied the .config into the new source directory, and ran `make oldconfig`, and took the offered default for the extra options (except a new iptables connection tracking module where I said 'M' rather than 'N').
Comment 4 John Robinson 2004-10-25 16:34:52 UTC
Created attachment 42585 [details]
The .config I used for 2.6.9-r1
Comment 5 John Robinson 2004-10-25 17:38:55 UTC
Created attachment 42588 [details]
Here's one that works

This is a completely different .config; I started with the config from the
2.6.8-r9 kernel on the 2004.3-test1 LiveCD, and switched off lots of things I
don't have. Both the version I used on the 2.6.8-r10 sources and this version I
used with the 2.6.9-r1 sources boot fine, but I'd still like to know what
caused the crashes (and if possible help fix it); I'm not happy with this
current config but will make do for now.
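
Since there is now one .config that crashes and one that boots, a first step towards finding the cause could be to list the options on which the two differ. Below is a minimal sketch in Python, assuming the standard .config line formats (`CONFIG_FOO=value` and `# CONFIG_FOO is not set`). The helper names and the sample option values are made up for illustration; they are not taken from the attachments.

```python
import re

# "CONFIG_FOO=y" style lines and "# CONFIG_FOO is not set" style lines.
SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map option name -> value; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = SET_RE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = UNSET_RE.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_configs(broken, working):
    """Return {option: (broken_value, working_value)} where the two differ.

    An option missing from one file is treated as 'n'.
    """
    names = sorted(set(broken) | set(working))
    return {
        name: (broken.get(name, "n"), working.get(name, "n"))
        for name in names
        if broken.get(name, "n") != working.get(name, "n")
    }


# Illustrative input only; real use would read the two .config files.
SAMPLE_BROKEN = "CONFIG_PREEMPT=y\nCONFIG_X86_UP_APIC=y\nCONFIG_VIA_RHINE=m\n"
SAMPLE_WORKING = "# CONFIG_PREEMPT is not set\nCONFIG_VIA_RHINE=m\n"

if __name__ == "__main__":
    delta = diff_configs(parse_config(SAMPLE_BROKEN), parse_config(SAMPLE_WORKING))
    for name, (bad, good) in delta.items():
        print(f"{name}: broken={bad} working={good}")
```

With real files, the text would come from something like `open(".config").read()` in each source directory. The resulting list only narrows the search; any of the differing options could be involved.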
Comment 6 John Robinson 2004-10-25 17:50:24 UTC
Sorry, where I wrote __fpy in the stack trace above I meant __fput.

I just tried the broken kernel again, and got a few more lines of the trace; interpret this as appearing before the one I typed in earlier:

 [<c0106007>] syscall_call+0x7/0xb
 =======================
 [<c0xxxxxx>] it_real_fn+0x0/0x60
 [<c0xxxxxx>] schedule+0x294/0x490
Comment 7 Daniel Drake (RETIRED) gentoo-dev 2004-10-27 15:58:08 UTC
I can't immediately tell from the oops which config option is causing the problem. It's really up to you to adapt your config in small steps to narrow down the bug. You may also wish to do this on 2.6.10-rc1 in case the problem has been fixed already.
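
One way to organise the "small steps" is to halve the set of differing options on each build. The sketch below assumes that a single option is responsible and that options can be toggled independently, which is not always true because of Kconfig dependencies (running `make oldconfig` after each change would still be needed). `bisect_options` and the `boots_ok` callback are hypothetical names; in practice the callback stands for "build a kernel with these options set as in the broken config, boot it, and report whether it survived".

```python
def bisect_options(candidates, boots_ok):
    """Narrow a list of suspect options down to one.

    candidates: option names that differ between the broken and working config.
    boots_ok:   callable taking a list of options; returns True if a kernel
                with only those options switched to their broken values boots.
    Assumes exactly one culprit among the candidates.
    """
    suspects = list(candidates)
    if not suspects:
        raise ValueError("no candidate options to test")
    while len(suspects) > 1:
        middle = len(suspects) // 2
        first_half = suspects[:middle]
        if boots_ok(first_half):
            # First half is innocent, so the culprit is in the rest.
            suspects = suspects[middle:]
        else:
            suspects = first_half
    return suspects[0]


if __name__ == "__main__":
    # Simulated test: pretend CONFIG_C is the option that breaks booting.
    names = ["CONFIG_A", "CONFIG_B", "CONFIG_C", "CONFIG_D", "CONFIG_E"]
    print(bisect_options(names, lambda subset: "CONFIG_C" not in subset))
```

The simulated callback here only stands in for a manual build-and-boot cycle; with N differing options this takes roughly log2(N) kernel builds rather than N.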
Comment 8 Daniel Drake (RETIRED) gentoo-dev 2004-11-10 22:18:44 UTC
Please reopen if you manage to track this down.