38648 – oops from kernel 2.4.22-gentoo-r5

Summary: oops from kernel 2.4.22-gentoo-r5

Status:	RESOLVED FIXED

Alias:	None

Product:	Gentoo Linux
Classification:	Unclassified
Component:	[OLD] Core system
Hardware:	x86 Linux

Importance:	High normal
Assignee:	x86-kernel@gentoo.org (DEPRECATED)

Reported:	2004-01-18 14:20 UTC by Collin Starkweather
Modified:	2004-01-27 13:49 UTC
CC List:	0 users

Attachments
The version of binutils used to compile the kernel (binutils.txt, 267 bytes, text/plain) 2004-01-18 14:24 UTC, Collin Starkweather
The .config used to build the kernel (config-2.4.22-gentoo-r5.txt, 22.82 KB, text/plain) 2004-01-18 14:24 UTC, Collin Starkweather
The output of dmesg following boot (dmesg.txt, 8.93 KB, text/plain) 2004-01-18 14:24 UTC, Collin Starkweather
The version of gcc used to build the kernel (gcc.txt, 283 bytes, text/plain) 2004-01-18 14:25 UTC, Collin Starkweather
The output of ksymoops on the oops message (included as another attachment) (ksymoops.txt, 2.79 KB, text/plain) 2004-01-18 14:25 UTC, Collin Starkweather
The contents of /lib/modules/2.4.22-gentoo-r5 (lib-modules-2.4.22-gentoo-r5.txt, 4.84 KB, text/plain) 2004-01-18 14:26 UTC, Collin Starkweather
The oops message (given during boot) (oops.txt, 792 bytes, text/plain) 2004-01-18 14:26 UTC, Collin Starkweather
The contents of /proc/cpuinfo (proc-cpuinfo.txt, 383 bytes, text/plain) 2004-01-18 14:27 UTC, Collin Starkweather
The contents of /proc/meminfo (proc-meminfo.txt, 525 bytes, text/plain) 2004-01-18 14:27 UTC, Collin Starkweather
The contents of /proc/pci (proc-pci.txt, 2.26 KB, text/plain) 2004-01-18 14:28 UTC, Collin Starkweather
The contents of /var/log/kernel/current (var-log-kernel-current.txt, 338 bytes, text/plain) 2004-01-18 14:28 UTC, Collin Starkweather

Description Collin Starkweather 2004-01-18 14:20:15 UTC

I'm new to compiling kernels and on a recent compile received an oops. Per linux/Documentation/oops-tracing.txt I would like to submit the issue and was told by linux-kernel@vger.kernel.org to submit it here.

I am using Gentoo 1.4 with the 2.4.22-gentoo-r5 kernel (the latest and greatest) from the gentoo-sources ebuild, on a single Pentium III (Coppermine) processor. (I do have a 2.4.19 kernel that works just fine, but I didn't save the .config, so unfortunately I can't refer back to check whether any configuration differences might account for the oops.) I will attach a set of relevant files:

oops.txt: The text of the oops message
ksymoops.txt: The output from ksymoops < oops.txt (Note that all the defaults for ksymoops apply and are the appropriate choices.)
dmesg.txt: The dmesg output
var-log-kernel-current.txt: The contents of /var/log/kernel/current after boot
proc-cpuinfo.txt: The contents of /proc/cpuinfo
proc-pci.txt: The contents of /proc/pci
proc-meminfo.txt: The contents of /proc/meminfo
gcc.txt: The version of gcc used to compile the kernel
binutils.txt: The version of binutils used to compile the kernel
config-2.4.22-gentoo-r5.txt: The .config from the kernel build

Please let me know if there is a more appropriate forum to deliver this oops trace to.

I also have a question for anyone who looks at this bug. Everything seems to run fine despite the oops. Can I run this kernel safely despite the oops?

Reproducible: Always
Steps to Reproduce:
1. Build the kernel based on the config on a system similar to mine
2. Reboot
3. Voila

Actual Results:
I saw the text given in oops.txt during boot.

Expected Results:
There should have been no error message.

Comment 1 Collin Starkweather 2004-01-18 14:24:03 UTC

Created attachment 24040 [details]
The version of binutils used to compile the kernel

Comment 2 Collin Starkweather 2004-01-18 14:24:33 UTC

Created attachment 24041 [details]
The .config used to build the kernel

Comment 3 Collin Starkweather 2004-01-18 14:24:54 UTC

Created attachment 24042 [details]
The output of dmesg following boot

Comment 4 Collin Starkweather 2004-01-18 14:25:14 UTC

Created attachment 24043 [details]
The version of gcc used to build the kernel

Comment 5 Collin Starkweather 2004-01-18 14:25:49 UTC

Created attachment 24044 [details]
The output of ksymoops on the oops message (included as another attachment)

Comment 6 Collin Starkweather 2004-01-18 14:26:20 UTC

Created attachment 24045 [details]
The contents of /lib/modules/2.4.22-gentoo-r5

Comment 7 Collin Starkweather 2004-01-18 14:26:57 UTC

Created attachment 24046 [details]
The oops message (given during boot)

Comment 8 Collin Starkweather 2004-01-18 14:27:20 UTC

Created attachment 24047 [details]
The contents of /proc/cpuinfo

Comment 9 Collin Starkweather 2004-01-18 14:27:42 UTC

Created attachment 24048 [details]
The contents of /proc/meminfo

Comment 10 Collin Starkweather 2004-01-18 14:28:05 UTC

Created attachment 24049 [details]
The contents of /proc/pci

Comment 11 Collin Starkweather 2004-01-18 14:28:34 UTC

Created attachment 24050 [details]
The contents of /var/log/kernel/current

Comment 12 Brian Jackson (RETIRED) gentoo-dev 2004-01-18 18:14:55 UTC

For future reference, you don't need to attach all that stuff. Usually the output of emerge info, the oops (decoded is best), and a description of when it happened and what you were doing at the time are enough.

On to the bug. You say this happens during boot, so I'm guessing the oops is when the init scripts are running. Check the md5sum of:
/usr/portage/distfiles/gentoo-sources-2.4.22-r5.patch.bz2
If it doesn't match 7f4a97d9c29f7dfc959a7a7efb077e29, then you've got the wrong patch. There was a bad patch on the mirrors for a few hours when it was first released. If it doesn't match, rm that file, emerge sync, and remerge gentoo-sources. Let us know what you come up with.
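For anyone who would rather script the check, here is a minimal sketch in Python. It only computes the MD5 of the patch file and compares it with the checksum quoted above; the function names `file_md5` and `patch_is_good` are made up for illustration and are not part of Portage.

```python
import hashlib
import os

# Path and checksum are taken from this comment.
PATCH_PATH = "/usr/portage/distfiles/gentoo-sources-2.4.22-r5.patch.bz2"
EXPECTED_MD5 = "7f4a97d9c29f7dfc959a7a7efb077e29"


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def patch_is_good(path, expected=EXPECTED_MD5):
    """True if the file's MD5 matches the expected checksum (case-insensitive)."""
    return file_md5(path) == expected.lower()


if __name__ == "__main__":
    if not os.path.exists(PATCH_PATH):
        print("patch not found:", PATCH_PATH)
    elif patch_is_good(PATCH_PATH):
        print("md5sum matches; the patch is the corrected one")
    else:
        print("md5sum mismatch; rm the file, emerge sync, and remerge gentoo-sources")
```

The script only reports the result. Removing the file and remerging are left as manual steps, as described above.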

Comment 13 Collin Starkweather 2004-01-19 08:49:54 UTC

Thanks.  I got md5sum bd9fe0048efaff6382d887bfb595f31a.  Guess I got the wrong patch.

In case anyone else comes across this problem, I should note that the primary symptom I experienced was that top segfaults.  Probably has something to do with the oops trace giving "Trace; c01f8656 <uptime_read_proc+86/180>".
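As an aside on reading that trace line: in ksymoops output the part in angle brackets is normally symbol+offset/size, with offset and size in hexadecimal. The sketch below splits the line on that assumption; the helper name `parse_trace_line` is hypothetical, and the regular expression only covers lines of this shape.

```python
import re

# Matches lines such as: Trace; c01f8656 <uptime_read_proc+86/180>
# Assumes address, offset, and size are all hexadecimal.
TRACE_RE = re.compile(
    r"^Trace;\s+([0-9a-fA-F]+)\s+<([^+>]+)\+([0-9a-fA-F]+)/([0-9a-fA-F]+)>$"
)


def parse_trace_line(line):
    """Split a ksymoops 'Trace;' line into its parts, or return None."""
    match = TRACE_RE.match(line.strip())
    if match is None:
        return None
    address, symbol, offset, size = match.groups()
    return {
        "address": int(address, 16),
        "symbol": symbol,
        "offset": int(offset, 16),  # bytes into the function
        "size": int(size, 16),      # total size of the function
    }


if __name__ == "__main__":
    info = parse_trace_line("Trace; c01f8656 <uptime_read_proc+86/180>")
    # The function's start address is the trace address minus the offset.
    print(info["symbol"], hex(info["address"] - info["offset"]))
```

Read this way, the return address lies 0x86 bytes into uptime_read_proc, a function 0x180 bytes long.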

I will try to update the patch and report back.

Comment 14 Curtis Napier (RETIRED) gentoo-dev 2004-01-19 11:21:18 UTC

This also happened to me: a kernel oops at boot and whenever certain programs such as top, ps, or clear were run. The oops message was: kernel cannot read NULL pointer in memory address....
Re-emerging gentoo-sources (2.4.22-gentoo-r5) and recompiling the kernel solved the problem.
The md5sum of the patch I originally downloaded was incorrect. It now matches the md5sum that Brian Jackson posted here.

Comment 15 Tim Yamin (RETIRED) gentoo-dev 2004-01-19 11:29:39 UTC

Looks like the confirmed fix is to sync and remerge gentoo-sources, so I'm closing this bug. Please reopen it if the problem persists.

Comment 16 Jean-Francois Patenaude 2004-01-19 11:31:17 UTC

Same problems here ...

How did that bad kernel patch get out on the production tree?

Anyway, shouldn't a new version (r6) be issued when this happens? I was waiting for a new r6 version before re-emerging the kernel.

Comment 17 Collin Starkweather 2004-01-19 13:58:58 UTC

The new patch did the trick.  No more oops.

Thanks, Brian!

Comment 18 Brian Jackson (RETIRED) gentoo-dev 2004-01-19 17:57:06 UTC

I was planning on doing a -r6; I was waiting for confirmation on something else first. As far as how it got on the "production tree", it's a little thing called human error. You'll always have it. You can't escape it. (btw, there is no such thing as a production tree per se; there is only one tree, and any number of devs can mess it up at any given time, whether intentionally or not)

Comment 19 Jean-Francois Patenaude 2004-01-20 06:40:16 UTC

Brian, thanks for your answer.

Comment 20 Olivier Castan 2004-01-27 13:49:20 UTC

Could you find a way to advertise this bug, even though it is resolved? I had the same bug, and found this report after many days of searching, thanks to Collin, who reported it on lklm.org... It was a pain in the ass, and it looks like many people may be affected...
Thanks