Bug 212317 - sys-kernel/gentoo-sources-2.6.23-r8 amd64 system randomly hangs
Summary: sys-kernel/gentoo-sources-2.6.23-r8 amd64 system randomly hangs
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: High critical
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
Depends on:
Reported: 2008-03-04 21:15 UTC by Dmitry Ilyin
Modified: 2008-03-15 17:44 UTC
Description Dmitry Ilyin 2008-03-04 21:15:39 UTC
I have problems with random hangs on my new AMD64 system, and I am nearly sure that they are caused by the proprietary nvidia drivers.
It looks like a similar problem is discussed here

The system sometimes hangs with only the keyboard LEDs flashing (kernel panic?) and I have to reset it. I switched to the nv driver without 3D and there have been no hangs yet.

I've tried to install 169.09 (I read somewhere that it should have better SMP support), but it does not work (failed to initialize device... maybe my card is not supported? I had the same error with a GeForce 4200, which is not supported by the non-legacy drivers).

I'll try to run tests with amd64 non-SMP and i386 SMP kernels, and with other kernel sources.

Reproducible: Didn't try

Actual Results:  
System hangs randomly

Expected Results:  
Stable system

My system:
GA-MA790FX-DQ6 motherboard
AMD Phenom(tm) 9500 Quad-Core Processor
VGA compatible controller: nVidia Corporation G71 [GeForce 7300 GS] (rev a1)
2.6.23-gentoo-r8 SMP AMD64
nvidia-drivers 100.14.19
Comment 1 Jakub Moc (RETIRED) gentoo-dev 2008-03-04 21:19:42 UTC
Bugs about binary stuff go upstream since we can't fix anything here. They'll want output among others.
Comment 2 Doug Goldstein (RETIRED) gentoo-dev 2008-03-04 23:33:05 UTC
Isn't that the processor with the known hardware flaw in it?

Either way, it's not a SMP issue unless you try a non-SMP kernel and don't have problems there.

Also, there's no kernel oops output or any additional information known, other than that your kernel oopses randomly. Blaming it on nvidia-drivers is pretty premature.

Additionally, if you tried another driver, you need to read the einfo and rmmod nvidia before starting X again. Otherwise, who knows why it wouldn't start, but the first place to look is dmesg.
Comment 3 Doug Goldstein (RETIRED) gentoo-dev 2008-03-04 23:34:03 UTC
Additionally, you linked to an Ubuntu post about a different motherboard, different chipset, different video card, different nvidia-drivers version, and different kernel. The only thing in common is that you both have SMP kernels.
Comment 4 Daniel Drake (RETIRED) gentoo-dev 2008-03-11 16:53:49 UTC
Can you reproduce this without the nvidia binary driver loaded? You must also make sure that it does not get loaded at all for that session (i.e. unloading it and then using the system is not acceptable)
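One way to confirm that a test session really never had the binary driver loaded is to look at both the current module list and the kernel taint flag. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the usual Linux layout where `/proc/modules` lists one module per line with the name first, and where bit 0 of `/proc/sys/kernel/tainted` is set once a proprietary module has been loaded and stays set until reboot.

```python
def module_is_loaded(proc_modules_text, name="nvidia"):
    """Return True if `name` is listed in /proc/modules-style text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0] == name:
            return True
    return False


def proprietary_taint_set(tainted_value):
    """Return True if bit 0 (proprietary module was loaded) is set."""
    return bool(int(tainted_value) & 1)


def session_is_clean(proc_modules_text, tainted_value, name="nvidia"):
    """True only if the module is absent now and no proprietary taint was recorded."""
    return (not module_is_loaded(proc_modules_text, name)
            and not proprietary_taint_set(tainted_value))
```

To use it on a live system, pass in the contents of `/proc/modules` and `/proc/sys/kernel/tainted`. Checking the taint bit as well as the module list matters because the bit survives an rmmod, which matches the point that unloading the driver and then using the system is not a valid test.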
Comment 5 Dmitry Ilyin 2008-03-11 20:53:09 UTC
I deleted nvidia.ko, rebooted, and used the nv driver. Then I loaded the CPU for two days without hangs. Then I emerged nvidia-drivers again and started a 3D game in the evening. In the morning I found my system hung.

I also saw this:
kernel BUG at mm/slab.c:3739!
invalid opcode: 0000 [1] SMP
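When a hang leaves a message like the one quoted above in dmesg or a saved log, the source file and line number are the parts worth copying into a report. A small sketch of pulling them out of log text follows; the function name and the regular expression are illustrative assumptions, and matching is case-insensitive so that hand-typed copies of the message are also found.

```python
import re

# Matches lines such as "kernel BUG at mm/slab.c:3739!"
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!", re.IGNORECASE)


def find_kernel_bugs(log_text):
    """Return a list of (source_file, line_number) for each BUG line found."""
    return [(m.group("file"), int(m.group("line")))
            for m in BUG_RE.finditer(log_text)]
```

Feeding it the saved output of dmesg gives a short list of locations to search for in existing bug reports.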
Comment 6 Dmitry Ilyin 2008-03-11 21:09:22 UTC
I have installed the 2.6.24-gentoo-r3 #4 SMP kernel and nvidia-drivers 169.12. (Thank you, Doug Goldstein, for the advice. I really did forget to unload nvidia.ko last time.)

It looks like the system is stable... No hangs yet, even after one day of load testing.

Maybe this bug was specific to my hardware, if no one else has reported the same issue?
Or even a bad contact in the PCI-e slot... (I have replugged the card several times, maybe it helped)

A little off-topic:
I also tried an ATI Radeon X1600. This time I could install it (xorg 7.2 is now supported) and got rendering working, and glxgears gave twice the GeForce 7300's fps (yes, glxgears is NOT a good benchmark). But none of the Wine games worked... ATI still sucks. I hope this will change soon, as ATI has opened some of their specifications.
Comment 7 David Heaps 2008-03-15 17:44:56 UTC
I just wanted to add that I have the same hardware (only a 9600) and have the same problem, and it only started occurring when I upgraded to this version of the kernel.

This is most likely the hardware flaw, as I haven't updated my BIOS. This version of the kernel most likely just runs into the problem more often. There is an updated BIOS image for this motherboard (and one should be available for all AM2+ motherboards) that works around this flaw. Although there is a performance penalty, it should be used, especially with Linux. A lot of Linux programs use optimizations that could make this hardware flaw apparent. Linux users often tax their systems a lot more than the typical Windows user... especially Linux users who bought a quad-core CPU. That increases the odds of this hardware flaw causing system hangs.

I don't think this bug should go upstream as there is nothing they can do really... it's a hardware problem and should be handled by the BIOS.