Bug 334143 - >=gentoo-sources-2.6.34 based kernel crashes after some (random) time with strange NMI related message
Summary: >=gentoo-sources-2.6.34 based kernel crashes after some (random) time with st...
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High critical
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard: linux-2.6.35
Keywords:
Depends on: 317231
Blocks:
Reported: 2010-08-23 19:38 UTC by Ladislav Laska
Modified: 2011-08-08 20:50 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info (eminfo,4.54 KB, text/plain)
2010-08-23 19:39 UTC, Ladislav Laska
gentoo-sources-2.6.35-r2 config (.config,79.85 KB, text/plain)
2010-08-23 19:39 UTC, Ladislav Laska
gentoo-sources-2.6.33-r2 config (.config,76.89 KB, text/plain)
2010-08-23 19:40 UTC, Ladislav Laska
lspci output (lspci,29.81 KB, text/plain)
2010-08-23 19:41 UTC, Ladislav Laska
lspci -v output (lspciv,10.55 KB, text/plain)
2010-08-23 19:41 UTC, Ladislav Laska

Description Ladislav Laska 2010-08-23 19:38:34 UTC
This might be an upstream problem, but since I'm using gentoo-sources and I'm not sure exactly how they differ from vanilla, I'm posting it here. Please tell me if it should be reported upstream.

I have an X60s laptop running up-to-date ~x86 Gentoo (more hardware info attached).

Kernels from 2.6.34 onward crash at a random time after boot (even when idle). After a few hours (today it happened after 7-8 hours) my machine locks up and does not react to anything (except holding the power button for a few seconds). So I set up a network console on another computer and logged everything there (I don't have a serial port, and I don't think a USB converter is much better than the network). I received:

[21067.404724] Uhhuh. NMI received for unknown reason b1 on CPU 0.
[21067.404724] You have some hardware problem, likely on the PCI bus.
[21067.404724] Dazed and confused, but trying to continue

And then nothing. This has happened only with 2.6.34 and newer kernels (the log above was captured with 2.6.35-r2 from portage). With my 2.6.33-r2 everything works; I've tried it for several days and it didn't crash.
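For anyone reading the log: the reason byte (b1) can be broken into bits. The following is only a minimal sketch, assuming the value is what the kernel read from the legacy PC/AT "NMI status and control" port (0x61) and that the chipset follows the conventional bit layout; the table and the function name `decode_nmi_reason` are illustrative, and the chipset datasheet is the authority on what each bit really means.

```python
from __future__ import annotations

# Conventional meaning of each bit of port 0x61 (assumption: standard
# PC/AT-compatible layout; check the chipset datasheet before relying on it).
PORT_61_BITS = {
    7: "SERR# / memory parity error reported",
    6: "IOCHK# (I/O channel check) error reported",
    5: "timer counter 2 output high",
    4: "refresh cycle toggle",
    3: "IOCHK# NMI disabled",
    2: "PCI SERR# NMI disabled",
    1: "speaker data enabled",
    0: "timer counter 2 gate enabled",
}


def decode_nmi_reason(reason: int) -> list[str]:
    """Return descriptions of the bits set in an NMI reason byte, highest bit first."""
    if not 0 <= reason <= 0xFF:
        raise ValueError("reason must fit in one byte")
    return [
        description
        for bit, description in sorted(PORT_61_BITS.items(), reverse=True)
        if reason & (1 << bit)
    ]


if __name__ == "__main__":
    # 0xb1 is the value from the log above.
    for description in decode_nmi_reason(0xB1):
        print(description)
```

Under these assumptions, 0xb1 has bits 7, 5, 4 and 0 set. Bit 7 being set would be consistent with the "likely on the PCI bus" hint in the message, but it does not say which device raised the error.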

Note that I've been experiencing strange problems with my temperature sensor: it sometimes seems to be mistaken and reports that my CPU is really hot, 126 deg. C (which is way too much for my Core Duo, Yonah, L2400), and then the machine just shuts itself down (with a proper message in syslog). Could this be related, or is it just a coincidence? I don't remember this happening with 2.6.35, but I don't use it enough to have noticed.

Well, I know this isn't much to work with, but I don't know what else could be done (except bisecting the kernel, which is way too much work because of the relatively long time it takes for the bug to appear). If you need more info, just tell me how to get it and I'll do it.

Reproducible: Always
Comment 1 Ladislav Laska 2010-08-23 19:39:10 UTC
Created attachment 244283 [details]
emerge --info
Comment 2 Ladislav Laska 2010-08-23 19:39:47 UTC
Created attachment 244285 [details]
gentoo-sources-2.6.35-r2 config
Comment 3 Ladislav Laska 2010-08-23 19:40:23 UTC
Created attachment 244287 [details]
gentoo-sources-2.6.33-r2 config
Comment 4 Ladislav Laska 2010-08-23 19:41:16 UTC
Created attachment 244289 [details]
lspci output
Comment 5 Ladislav Laska 2010-08-23 19:41:33 UTC
Created attachment 244291 [details]
lspci -v output
Comment 6 Ladislav Laska 2010-08-24 07:55:21 UTC
I've just tested with the latest vanilla kernel from git and it happened there too (even a lot sooner).
Comment 7 George Kadianakis (RETIRED) gentoo-dev 2010-08-27 17:26:17 UTC
I don't know if you've seen it, but d2_racing's post here: http://forums.gentoo.org/viewtopic-p-4833359.html?sid=54fe6c4f345fcd73e8610e21755441c7
at Wed Feb 06, 2008 11:25 pm, seems interesting since you share the same wireless NIC.
Comment 8 Ladislav Laska 2010-08-27 20:18:29 UTC
I've seen it, but it doesn't fit. My problems started only with 2.6.34; that thread is two years old and about the 2.6.24 kernel. Another thing is that in all my test cases the wireless card is off (rfkilled, so the driver should be inactive). Nevertheless, it's worth a try. In the meantime, I've started bisecting the kernel, but it will take at least a week to finish.
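To show why bisecting a bug like this is slow: a rough estimate, assuming bisection needs about log2(N) build-and-test rounds for N candidate commits and that each round means waiting long enough to trust a "good" result. The commit count and the waiting time below are hypothetical placeholders, not figures from this report.

```python
def bisect_steps(num_commits: int) -> int:
    """Approximate number of kernels to build and test for num_commits candidates."""
    if num_commits < 1:
        raise ValueError("num_commits must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float rounding.
    return (num_commits - 1).bit_length()


def estimate_days(num_commits: int, hours_per_test: float) -> float:
    """Total waiting time in days if every round takes hours_per_test hours."""
    return bisect_steps(num_commits) * hours_per_test / 24


if __name__ == "__main__":
    commits = 10_000       # hypothetical size of the commit range
    hours_per_test = 12    # hypothetical wait before calling a kernel "good"
    print(bisect_steps(commits), "rounds,",
          estimate_days(commits, hours_per_test), "days")
```

A "bad" result can show up sooner than the full wait, so the real time may be shorter; a kernel that survives the wait only by luck would send the bisection the wrong way, which is the reason to wait generously.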
Comment 9 DEMAINE Benoît-Pierre, aka DoubleHP 2010-09-13 07:49:38 UTC
Same for me. This bug is a dup of 328889 and blocks 317231. Please update the status.
Comment 10 Mike Pagano gentoo-dev 2010-09-15 17:55:29 UTC
Ladislav,

Can you please test with gentoo-sources-2.6.34-r6? It includes fixes for crashes, and I want to see if it fixes your issue.
Comment 11 Ladislav Laska 2010-09-15 18:00:17 UTC
Compiling now, will post more info in a day or so (depends on the result).
Comment 12 Ladislav Laska 2010-09-16 09:33:38 UTC
Nope, 2.6.34-r6 is still freezing. I wonder if you meant 2.6.35-r6? I'll try it and see.
Comment 13 Ladislav Laska 2010-09-18 08:18:12 UTC
I've been running 2.6.35-gentoo-r5 for a day or so, and no freeze yet. It seems to have been fixed between -r2 and -r5.