I'm new to compiling kernels, and on a recent build I received an oops. Per linux/Documentation/oops-tracing.txt I would like to submit the issue, and I was told by linux-kernel@vger.kernel.org to submit it here.

I am using Gentoo 1.4 with the 2.4.22-gentoo-r5 kernel (the latest and greatest) from the gentoo-sources ebuild, on a single Pentium III (Coppermine) processor. (I do have a 2.4.19 kernel that works just fine, but I didn't save the .config, so unfortunately I can't refer back to check whether there are any configuration differences that might account for the oops.)

I will attach a set of relevant files:

oops.txt: The text of the oops message
ksymoops.txt: The output from ksymoops < oops.txt (Note that all the defaults for ksymoops apply and are the appropriate choices.)
dmesg.txt: The dmesg output
var-log-kernel-current.txt: The contents of /var/log/kernel/current after boot
proc-cpuinfo.txt: The contents of /proc/cpuinfo
proc-pci.txt: The contents of /proc/pci
proc-meminfo.txt: The contents of /proc/meminfo
gcc.txt: The version of gcc used to compile the kernel
binutils.txt: The version of binutils used to compile the kernel
config-2.4.22-gentoo-r5.txt: The .config from the kernel build

Please let me know if there is a more appropriate forum to deliver this oops trace to.

I also have a question for anyone who looks at this bug. Everything seems to run fine despite the oops. Can I run this kernel safely despite the oops?

Reproducible: Always

Steps to Reproduce:
1. Build the kernel based on the config on a system similar to mine
2. Reboot
3. Voila

Actual Results: I saw the text given in oops.txt during boot.

Expected Results: There should have been no error message.
Created attachment 24040 [details] The version of binutils used to compile the kernel
Created attachment 24041 [details] The .config used to build the kernel
Created attachment 24042 [details] The output of dmesg following boot
Created attachment 24043 [details] The version of gcc used to build the kernel
Created attachment 24044 [details] The output of ksymoops on the oops message (included as another attachment)
Created attachment 24045 [details] The contents of /lib/modules/2.4.22-gentoo-r5
Created attachment 24046 [details] The oops message (given during boot)
Created attachment 24047 [details] The contents of /proc/cpuinfo
Created attachment 24048 [details] The contents of /proc/meminfo
Created attachment 24049 [details] The contents of /proc/pci
Created attachment 24050 [details] The contents of /var/log/kernel/current
For future reference, you don't need to attach all that stuff. Usually the output of emerge info, the oops (decoded is best), and a description of when it happened/what you were doing when it happened are enough.

On to the bug. You say this happens during boot, so I'm guessing the oops is when the init scripts are running. Check the md5sum of /usr/portage/distfiles/gentoo-sources-2.4.22-r5.patch.bz2. If it doesn't match 7f4a97d9c29f7dfc959a7a7efb077e29, then you've got the wrong patch. There was a bad patch on the mirrors for a few hours when it was first released. If it doesn't match, rm that file, emerge sync, and remerge gentoo-sources. Let us know what you come up with.
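The checksum check described above can also be scripted. The following is only a sketch in Python: the path and the expected md5sum are the ones quoted in this comment, while the function names md5_of_file and patch_is_good are made up for illustration and are not part of Portage or any Gentoo tool.

```python
import hashlib
from pathlib import Path

# Known-good checksum quoted in this bug report.
EXPECTED_MD5 = "7f4a97d9c29f7dfc959a7a7efb077e29"
# Path quoted in this bug report; adjust it if your distfiles live elsewhere.
PATCH_PATH = Path("/usr/portage/distfiles/gentoo-sources-2.4.22-r5.patch.bz2")


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def patch_is_good(path, expected=EXPECTED_MD5):
    """Hypothetical helper: True when the file's MD5 equals `expected`."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    if not PATCH_PATH.exists():
        print(f"{PATCH_PATH} not found")
    elif patch_is_good(PATCH_PATH):
        print("patch matches the known-good md5sum")
    else:
        print("md5sum mismatch: rm the file, emerge sync, remerge gentoo-sources")
```

Running md5sum on the file by hand and comparing the output by eye gives the same answer; the script only saves the comparison step.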
Thanks. I got md5sum bd9fe0048efaff6382d887bfb595f31a. Guess I got the wrong patch. In case anyone else comes across this problem, I should note that the primary symptom I experienced was that top segfaults. Probably has something to do with the oops trace giving "Trace; c01f8656 <uptime_read_proc+86/180>". I will try to update the patch and report back.
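For anyone reading a decoded trace line like the one quoted above, here is a rough Python sketch that splits it into its parts. It assumes the usual ksymoops layout of address, symbol, and a hexadecimal offset/size pair; the helper name parse_trace_line is hypothetical, and the pattern is only checked against the single line quoted in this comment.

```python
import re

# Pattern for a ksymoops trace line such as:
#   Trace; c01f8656 <uptime_read_proc+86/180>
# Assumption: address, offset, and size are all hexadecimal.
TRACE_RE = re.compile(
    r"^Trace;\s+(?P<addr>[0-9a-fA-F]+)\s+"
    r"<(?P<symbol>[^+>]+)\+(?P<offset>[0-9a-fA-F]+)/(?P<size>[0-9a-fA-F]+)>$"
)


def parse_trace_line(line):
    """Hypothetical helper: return the parts of one trace line, or None."""
    match = TRACE_RE.match(line.strip())
    if match is None:
        return None
    address = int(match.group("addr"), 16)
    offset = int(match.group("offset"), 16)
    return {
        "address": address,
        "symbol": match.group("symbol"),
        "offset": offset,
        "size": int(match.group("size"), 16),
        # Start of the function is the traced address minus the offset.
        "function_start": address - offset,
    }


if __name__ == "__main__":
    print(parse_trace_line("Trace; c01f8656 <uptime_read_proc+86/180>"))
```

Read this way, the quoted line points 0x86 bytes into uptime_read_proc, which would fit with top (which reads /proc/uptime) being the program that segfaults.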
This also happened to me: a kernel oops at boot and whenever certain programs such as top, ps, or clear are run. The oops message was: kernel cannot read NULL pointer in memory address.... Re-emerging gentoo-sources (2.4.22-gentoo-r5) and recompiling the kernel solved the problem. The md5sum of the patch I originally had was incorrect. It now matches the md5sum that Brian Jackson posted here.
Looks like the confirmed fix is to sync and remerge gentoo-sources, so I'm closing this bug. Please reopen it if this persists.
Same problems here... How did that bad kernel patch get out on the production tree? Anyway, shouldn't a new version (r6) be issued when this happens? I was waiting for a new r6 version before re-emerging the kernel.
The new patch did the trick. No more oops. Thanks, Brian!
I was planning on doing a -r6, but I was waiting for confirmation on something else first. As far as how it got on the "production tree", it's a little thing called human error. You'll always have it. You can't escape. (btw, there is no such thing as a production tree per se; there is only one tree, and any number of devs can mess it up at any given time, whether intentionally or not)
Brian, thanks for your answer.
Could you find a way to advertise this bug, even though it is resolved? I had the same bug, and found this report after many days of searching, thanks to Collin who reported on lklm.org... It was a pain in the ass, and it looks like many people may be affected... Thanks