Summary: | opteron system randomly crashes | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Thomas Beutin <tb> |
Component: | [OLD] Server | Assignee: | Gentoo Kernel Bug Wranglers and Kernel Maintainers <kernel> |
Status: | RESOLVED NEEDINFO | ||
Severity: | normal | CC: | radek, wschlich |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | AMD64 | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: | kernel config, kernel panic screen shot, "sysctl -a" output, /proc/cpuinfo, kernel config, "sysctl -a" output, /proc/cpuinfo, dmesg output, console screen shot, stat output |
Description
Thomas Beutin
2006-02-21 12:55:33 UTC
Created attachment 80375 [details]
kernel config
Created attachment 80376 [details]
kernel panic screen shot
Created attachment 80377 [details]
"sysctl -a" output
Created attachment 80378 [details]
/proc/cpuinfo
Created attachment 80379 [details]
kernel config
Created attachment 80381 [details]
"sysctl -a" output
Created attachment 80382 [details]
/proc/cpuinfo
Please enable CONFIG_KALLSYMS and post a new screenshot. Without it the crash is almost impossible to diagnose: kallsyms annotates the otherwise meaningless addresses in the trace with symbol names. How often does the crash occur?

Created attachment 80404 [details]
dmesg output
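Before rebooting into a rebuilt kernel, it can help to confirm that the option requested above is really set in the config file. The following is a minimal sketch, assuming the usual `.config` text format (`CONFIG_FOO=y`, or `# CONFIG_FOO is not set` for disabled options); the function name `config_option_state` and the sample text are illustrative and not taken from the attachments.

```python
import re


def config_option_state(config_text, option):
    """Return the value assigned to a kernel config option, or 'unset'.

    Only an exact 'OPTION=value' line counts, so CONFIG_KALLSYMS_ALL
    is not mistaken for CONFIG_KALLSYMS.
    """
    pattern = re.compile(r"^" + re.escape(option) + r"=(.*)$", re.MULTILINE)
    match = pattern.search(config_text)
    if match:
        return match.group(1).strip()
    return "unset"


# Illustrative config fragment (not the reporter's actual config).
sample = """\
CONFIG_X86_64=y
# CONFIG_KALLSYMS is not set
CONFIG_SWAP=y
"""

print(config_option_state(sample, "CONFIG_KALLSYMS"))
print(config_option_state(sample, "CONFIG_SWAP"))
```

In practice the text would be read from the `.config` file in the kernel source tree rather than from an inline string.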
Crashes "usually" occur about every 3 weeks, but this week the system crashed on Monday morning and again (after rebooting with 2.6.15.r5) on Tuesday evening. I have now turned off swap. I am currently recompiling the kernel with CONFIG_KALLSYMS enabled, but I cannot reboot before 8pm (GMT 7pm).

Ok. You should also upgrade to the latest development kernel (currently 2.6.16-rc4), as the problem may have been fixed there. I'm going to close this bug for now, as it sounds like we might be waiting weeks for a new crash screenshot. Please reopen it when you have one.

@Daniel: which sources do you mean? vanilla-sources-2.6.16-r4?

vanilla-sources-2.6.16-rc4

It crashed again. I'll attach some info.

Created attachment 87249 [details]
console screen shot
New screenshot; as requested, the kernel was compiled with CONFIG_KALLSYMS=y.
Created attachment 87250 [details]
stat output
A cron job logs the output of uptime and the contents of /proc/meminfo and /proc/vmstat once a minute. This is the last log entry before the system crashed.
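The once-a-minute logging described above could be sketched roughly as follows. This is not the reporter's actual script: the function names are hypothetical, and `/proc/loadavg` is used as an assumed stand-in for the load figures the uptime command prints. Each record is flushed and synced so that the last entry is more likely to reach the disk before a crash.

```python
import os
import time
from pathlib import Path

# Files to capture; /proc/loadavg is an assumed stand-in for the uptime output.
PROC_FILES = ("loadavg", "meminfo", "vmstat")


def snapshot(proc_root="/proc"):
    """Return one timestamped text record with the contents of PROC_FILES."""
    root = Path(proc_root)
    lines = ["=== " + time.strftime("%Y-%m-%d %H:%M:%S") + " ==="]
    for name in PROC_FILES:
        lines.append("--- " + name + " ---")
        lines.append((root / name).read_text().rstrip())
    return "\n".join(lines) + "\n"


def append_snapshot(log_path, proc_root="/proc"):
    """Append one record and ask the OS to write it to disk immediately."""
    with open(log_path, "a") as log:
        log.write(snapshot(proc_root))
        log.flush()
        os.fsync(log.fileno())
```

A cron entry would call `append_snapshot` once a minute with a log path of your choosing; the `proc_root` parameter exists only so the function can be tried against a directory of sample files.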
Would it be possible to set up a serial console or netconsole to capture the full error message? Documentation for both is in Documentation/serial-console.txt and Documentation/networking/netconsole.txt in your kernel source tree. It would also be a good idea to try the latest vanilla sources (currently 2.6.21.1). If you can reproduce the crash with that and capture the full error message, you will have a much better chance of getting help from LKML (assuming we can't identify the problem here).

I no longer administer this system, so I cannot provide more information, sorry. You may close the bug. However, a while ago I had a very similar problem on my x86 notebook running a suspend2 kernel. It crashed randomly every now and then after resuming when some USB devices (an external mouse) were not attached as they had been before I suspended, but the system seemed to run OK after a real reboot. Some days later I tried to emerge a new bash, and the system crashed reproducibly at the same point of the compile every time. The same happened with other packages as well. The cause was a corrupt reiser3 /tmp file system (I got permission denied on some files and directories even as root). Repairing this and, while I was at it, the other reiser3 file systems solved the problem. I couldn't check this on the server, though; when I left the company the machine had been running smoothly for a couple of months (without any changes except booting a new kernel version after every crash).

OK. Thanks for the update. If your notebook still exhibits those problems on the latest version of a supported kernel (e.g. gentoo-sources), then please file a new bug.
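For the netconsole suggestion made earlier in this thread, the receiving side only needs to listen for UDP datagrams on a second machine. The sketch below is an assumption-laden illustration rather than part of the kernel documentation: it assumes the sender uses netconsole's default target port 6666, and the function names are hypothetical.

```python
import socket


def format_datagram(payload, sender):
    """Decode one netconsole datagram and prefix it with the sender address."""
    text = payload.decode("utf-8", errors="replace").rstrip("\n")
    return "[" + sender + "] " + text


def receive_netconsole(bind_addr="0.0.0.0", port=6666, max_messages=None):
    """Print incoming kernel messages until max_messages is reached.

    With max_messages=None the loop runs until interrupted.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((bind_addr, port))
        count = 0
        while max_messages is None or count < max_messages:
            payload, (host, _port) = sock.recvfrom(65535)
            # flush=True so output redirected to a file is written promptly.
            print(format_datagram(payload, host), flush=True)
            count += 1
```

Running `receive_netconsole()` on the log host with its output redirected to a file would keep the complete panic text even when the crashing machine's screen only shows the last few lines.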