Bug 54431 - 2.6.7-gentoo sources do not let me load stable nvidia module
Summary: 2.6.7-gentoo sources do not let me load stable nvidia module
Status: VERIFIED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: x86-kernel@gentoo.org (DEPRECATED)
Reported: 2004-06-19 10:34 UTC by Miguel
Modified: 2005-08-16 11:54 UTC
CC List: 4 users

Description Miguel 2004-06-19 10:34:44 UTC
When I try to load the nvidia module while running the 2.6.7-gentoo kernel, I get the following message: "invalid module format". I am using the stable versions of nvidia-kernel and nvidia-glx.

I have tried the masked drivers, and the module loads, although I experience other problems with these drivers: a scrambled or unevenly lit LCD screen.
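
One common cause of "invalid module format" on 2.6 kernels is a module that was built against a different kernel release than the one currently running; nothing in this report establishes that as the cause here. A minimal sketch of that comparison, assuming the module's vermagic string (as printed by `modinfo`) and the output of `uname -r` have already been captured as text. The function names and the sample strings are hypothetical, and only the kernel release token is compared, not the remaining vermagic flags.

```python
def vermagic_kernel_release(vermagic: str) -> str:
    """Return the kernel release a module was built for.

    Assumption: the release is the first whitespace-separated token of the
    vermagic string, e.g. "2.6.7-gentoo preempt K7 gcc-3.3".
    """
    tokens = vermagic.split()
    return tokens[0] if tokens else ""


def built_for_running_kernel(vermagic: str, running_release: str) -> bool:
    """Compare only the release token against the running kernel release."""
    return vermagic_kernel_release(vermagic) == running_release


if __name__ == "__main__":
    # Hypothetical values, not taken from this report.
    print(built_for_running_kernel("2.6.5-gentoo-r1 preempt K7 gcc-3.3", "2.6.7-gentoo"))
```

A mismatch would suggest rebuilding nvidia-kernel against the running kernel's sources before looking further.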

Reproducible: Always
Comment 1 matt 2004-06-19 10:49:57 UTC
This bug is a duplicate of #54431.

Try using ACCEPT_KEYWORDS to emerge media-video/nvidia-kernel-1.0.5336-r4 and see if that works...
Comment 2 matt 2004-06-19 10:52:22 UTC
Whoops. Meant duplicate of #50768.
Comment 3 Miguel 2004-06-19 12:41:17 UTC
I know the nvidia module loads fine if both nvidia-kernel and nvidia-glx are updated to the masked version, but there are people like me who have problems with that version of the drivers. That is probably the reason this driver version is masked.

I have tried upgrading only nvidia-kernel but it did not work.
Comment 4 aurelboiss 2004-06-19 13:46:31 UTC
I have the same bug with 2.6.7 and the latest nvidia drivers (5336-r4, 5336-r2). My bug is not exactly the same, though: I can load the nvidia driver, but when I start X my PC freezes :(
Comment 5 Stefan Tittel 2004-06-19 18:56:14 UTC
I can confirm this bug on a PIV system with gentoo-dev-sources-2.6.7 and nvidia-kernel-1.0.4496-r3 (both are the most current stable packages for x86).

I might add that manually unmasking a masked version of nvidia-kernel obviously isn't a solution, since stable is supposed to work, and people who want to stick to stable shouldn't be left without working nvidia support. As long as there aren't any known problems, I suggest marking a newer nvidia-kernel (and the corresponding nvidia-glx) stable. My amd64 machine, for instance, is running fine with 1.0.5332-r2 (the current stable on amd64).
Comment 6 Ronny Haryanto 2004-06-20 08:01:22 UTC
OK, here's a bit more information since I'm in the same boat.

My system:
- ASUS A7N8X-X (nforce2 chipset)
- GeForce FX5200
- gentoo-dev-sources-2.6.7
- nvidia-kernel-1.0.5336-r4
- nvidia-glx-1.0.5336-r2
- xorg-x11-6.7.0-r1

During boot right after "starting pci hotplug" and before "starting usb hotplug", there was a message "Disabling IRQ 12" that was not there when I used 2.6.5-gentoo-r1.

X would start, and the X log says nothing unusual, but the screen is blank (with a little bit of scrambling at the bottom of the screen) and unresponsive. Only X is unresponsive, meaning that I can still ssh in from another machine and kill X.

The kernel syslog says this:

Badness in pci_find_subsys at drivers/pci/search.c:167
 [pci_find_subsys+232/240] pci_find_subsys+0xe8/0xf0
 [<c020e9e8>] pci_find_subsys+0xe8/0xf0
 [pci_find_device+47/64] pci_find_device+0x2f/0x40
 [<c020ea1f>] pci_find_device+0x2f/0x40
 [pci_find_slot+40/80] pci_find_slot+0x28/0x50
 [<c020e828>] pci_find_slot+0x28/0x50
 [pg0+951120080/1069117440] os_pci_init_handle+0x39/0x68 [nvidia]
 [<f8f760d0>] os_pci_init_handle+0x39/0x68 [nvidia]
 [pg0+949631071/1069117440] _nv001243rm+0x1f/0x24 [nvidia]
 [<f8e0a85f>] _nv001243rm+0x1f/0x24 [nvidia]
 [pg0+950968597/1069117440] _nv000816rm+0x2f5/0x384 [nvidia]
 [<f8f51115>] _nv000816rm+0x2f5/0x384 [nvidia]
 [pg0+950348076/1069117440] _nv003801rm+0xd8/0x100 [nvidia]
 [<f8eb992c>] _nv003801rm+0xd8/0x100 [nvidia]
 [pg0+950967375/1069117440] _nv000809rm+0x2f/0x34 [nvidia]
 [<f8f50c4f>] _nv000809rm+0x2f/0x34 [nvidia]
 [pg0+950351696/1069117440] _nv003816rm+0xf0/0x104 [nvidia]
 [<f8eba750>] _nv003816rm+0xf0/0x104 [nvidia]
 [pg0+950355143/1069117440] _nv000013rm+0x77/0x84 [nvidia]
 [<f8ebb4c7>] _nv000013rm+0x77/0x84 [nvidia]
 [pg0+950353515/1069117440] _nv003780rm+0x1df/0x2c8 [nvidia]
 [<f8ebae6b>] _nv003780rm+0x1df/0x2c8 [nvidia]
 [pg0+950353015/1069117440] _nv000012rm+0x43/0x58 [nvidia]
 [<f8ebac77>] _nv000012rm+0x43/0x58 [nvidia]
 [pg0+950352948/1069117440] _nv000012rm+0x0/0x58 [nvidia]
 [<f8ebac34>] _nv000012rm+0x0/0x58 [nvidia]
 [pg0+949581468/1069117440] _nv001219rm+0xa8/0x124 [nvidia]
 [<f8dfe69c>] _nv001219rm+0xa8/0x124 [nvidia]
 [pg0+951110056/1069117440] nv_kern_rc_timer+0x0/0x37 [nvidia]
 [<f8f739a8>] nv_kern_rc_timer+0x0/0x37 [nvidia]
 [pg0+949649078/1069117440] rm_run_rc_callback+0x36/0x4c [nvidia]
 [<f8e0eeb6>] rm_run_rc_callback+0x36/0x4c [nvidia]
 [pg0+951110075/1069117440] nv_kern_rc_timer+0x13/0x37 [nvidia]
 [<f8f739bb>] nv_kern_rc_timer+0x13/0x37 [nvidia]
 [run_timer_softirq+203/432] run_timer_softirq+0xcb/0x1b0
 [<c01211cb>] run_timer_softirq+0xcb/0x1b0
 [do_timer+223/240] do_timer+0xdf/0xf0
 [<c012139f>] do_timer+0xdf/0xf0
 [__do_softirq+125/128] __do_softirq+0x7d/0x80
 [<c011d23d>] __do_softirq+0x7d/0x80
 [do_softirq+38/48] do_softirq+0x26/0x30
 [<c011d266>] do_softirq+0x26/0x30
 [do_IRQ+253/304] do_IRQ+0xfd/0x130
 [<c01081ad>] do_IRQ+0xfd/0x130
 [common_interrupt+24/32] common_interrupt+0x18/0x20
 [<c01064b4>] common_interrupt+0x18/0x20
                                                                                                                        
... repeated several times.
                                                                                                                        
Found related discussions here:
http://www.nvnews.net/vbulletin/showthread.php?t=24866
http://forums.gentoo.org/viewtopic.php?t=187632

Comment 7 Ronny Haryanto 2004-06-20 09:26:25 UTC
This is the message printed in kernel syslog when "Disabling IRQ 12" happened.

[__report_bad_irq+42/144] __report_bad_irq+0x2a/0x90
[<c01074da>] __report_bad_irq+0x2a/0x90
[note_interrupt+112/160] note_interrupt+0x70/0xa0
[<c01075d0>] note_interrupt+0x70/0xa0
[do_IRQ+289/304] do_IRQ+0x121/0x130
[<c0107871>] do_IRQ+0x121/0x130
[common_interrupt+24/32] common_interrupt+0x18/0x20
[<c0105cb4>] common_interrupt+0x18/0x20
[handle_IRQ_event+36/112] handle_IRQ_event+0x24/0x70
[<c0107464>] handle_IRQ_event+0x24/0x70
[do_IRQ+145/304] do_IRQ+0x91/0x130
[<c01077e1>] do_IRQ+0x91/0x130
[common_interrupt+24/32] common_interrupt+0x18/0x20
[<c0105cb4>] common_interrupt+0x18/0x20
[__do_softirq+48/128] __do_softirq+0x30/0x80
[<c011ab70>] __do_softirq+0x30/0x80
[do_softirq+38/48] do_softirq+0x26/0x30
[<c011abe6>] do_softirq+0x26/0x30
[do_IRQ+253/304] do_IRQ+0xfd/0x130
[<c010784d>] do_IRQ+0xfd/0x130
[common_interrupt+24/32] common_interrupt+0x18/0x20
[<c0105cb4>] common_interrupt+0x18/0x20
[pci_bus_read_config_byte+95/144] pci_bus_read_config_byte+0x5f/0x90
[<c020705f>] pci_bus_read_config_byte+0x5f/0x90
[pg0+946316190/1069428736] ehci_start+0x2ce/0x360 [ehci_hcd]
[<f8a9539e>] ehci_start+0x2ce/0x360 [ehci_hcd]
[preempt_schedule+42/80] preempt_schedule+0x2a/0x50
[<c03013ba>] preempt_schedule+0x2a/0x50
[release_console_sem+203/224] release_console_sem+0xcb/0xe0
[<c01178fb>] release_console_sem+0xcb/0xe0
[printk+269/368] printk+0x10d/0x170
[<c011777d>] printk+0x10d/0x170
[pg0+944391415/1069428736] usb_register_bus+0x137/0x160 [usbcore]
[<f88bf4f7>] usb_register_bus+0x137/0x160 [usbcore]
[pg0+944412043/1069428736] usb_hcd_pci_probe+0x2ab/0x4e0 [usbcore]
[<f88c458b>] usb_hcd_pci_probe+0x2ab/0x4e0 [usbcore]
[pci_device_probe_static+82/112] pci_device_probe_static+0x52/0x70
[<c020a982>] pci_device_probe_static+0x52/0x70
[__pci_device_probe+59/80] __pci_device_probe+0x3b/0x50
[<c020a9db>] __pci_device_probe+0x3b/0x50
[pci_device_probe+44/80] pci_device_probe+0x2c/0x50
[<c020aa1c>] pci_device_probe+0x2c/0x50
[bus_match+63/112] bus_match+0x3f/0x70
[<c023608f>] bus_match+0x3f/0x70
[driver_attach+89/144] driver_attach+0x59/0x90
[<c02361b9>] driver_attach+0x59/0x90
[bus_add_driver+145/176] bus_add_driver+0x91/0xb0
[<c0236461>] bus_add_driver+0x91/0xb0
[driver_register+47/64] driver_register+0x2f/0x40
[<c023691f>] driver_register+0x2f/0x40
[pci_register_driver+92/144] pci_register_driver+0x5c/0x90
[<c020ac9c>] pci_register_driver+0x5c/0x90
[pg0+944902179/1069428736] init+0x23/0x30 [ehci_hcd]
[<f893c023>] init+0x23/0x30 [ehci_hcd]
[sys_init_module+276/560] sys_init_module+0x114/0x230
[<c012c8c4>] sys_init_module+0x114/0x230
[syscall_call+7/11] syscall_call+0x7/0xb
[<c0105b47>] syscall_call+0x7/0xb

I tried turning off hotplug so that IRQ 12 is not yet disabled when I start X, and it worked! But I have no mouse since usb hotplug is also off.

IRQ 12 is used (shared) by the video card:

  Bus  2, device   0, function  0:
    VGA compatible controller: nVidia Corporation NV34 [GeForce FX 5200] (rev 161).
      IRQ 12.
      Master Capable.  Latency=248.  Min Gnt=5.Max Lat=1.
      Non-prefetchable 32 bit memory at 0xec000000 [0xecffffff].
      Prefetchable 32 bit memory at 0xe0000000 [0xe7ffffff].
Comment 8 Thierry Carrez (RETIRED) gentoo-dev 2004-06-21 07:13:36 UTC
Same here.
The current stable x86 version of nvidia-kernel gives "invalid module format".

Upgrading to the ~x86 version does not work either. The current ~x86 version loads nicely, but when used it gives a blank (black) screen with no way to switch back to the console. It looks like only the display is garbled, since ctrl-alt-del reboots cleanly. Note that my PC at home runs 2.6.7-r1 with the latest ~x86 nvidia-kernel perfectly, so this side problem is probably card-specific.
Comment 9 Daniel Drake (RETIRED) gentoo-dev 2004-06-21 07:49:14 UTC
Make sure that you do not have 4k stacks enabled in your kernel.
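
A minimal sketch of such a check, assuming the kernel configuration is available as text (for example the contents of the `.config` file in the kernel source tree). The helper name `is_4kstacks_enabled` and the sample text are hypothetical.

```python
def is_4kstacks_enabled(config_text: str) -> bool:
    """Return True if CONFIG_4KSTACKS is set to "y" in a kernel .config text."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_4KSTACKS is not set"
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key == "CONFIG_4KSTACKS":
            return value == "y"
    # Option not mentioned at all: treat as not enabled.
    return False


if __name__ == "__main__":
    # Hypothetical config fragment.
    sample = "CONFIG_X86=y\n# CONFIG_4KSTACKS is not set\n"
    print(is_4kstacks_enabled(sample))
```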
Comment 10 Thierry Carrez (RETIRED) gentoo-dev 2004-06-21 09:02:11 UTC
My config has:
# CONFIG_4KSTACKS is not set
so it doesn't come from that.
Comment 11 Greg Kroah-Hartman (RETIRED) gentoo-dev 2004-06-21 11:13:31 UTC
That "badness" issue is an nvidia driver bug.  Please go bug them to fix this;
there is nothing we can do about it here.
Comment 12 aurelboiss 2004-06-21 12:17:38 UTC
I have exactly the same bug :(

Portage 2.0.50-r8 (default-x86-2004.0, gcc-3.3.3, glibc-2.3.3.20040420-r0, 2.6.5-gentoo-r1)
=================================================================
System uname: 2.6.5-gentoo-r1 i686 AMD Athlon(tm) XP 2700+
Gentoo Base System version 1.4.16
Autoconf: sys-devel/autoconf-2.59-r3
Automake: sys-devel/automake-1.8.3
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-O2 -march=athlon-xp -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3.2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -march=athlon-xp -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox"
GENTOO_MIRRORS="ftp://ftp.belnet.be/mirror/rsync.gentoo.org/gentoo/"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY=""
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X alsa apm arts avi berkdb crypt cups encode esd foomaticdb gdbm ggi gif gnome gpm gtk gtk2 imlib java jpeg kde libg++ libwww linguas_fr mad mikmod motif mpeg ncurses nls oggvorbis opengl oss pam pdflib perl png python qt quicktime readline sdl slang spell ssl svga tcpd truetype x86 xml2 xmms xv zlib"

I have installed the latest nvidia drivers:

nvidia-glx: 1.0.5336-r2
nvidia-kernel: 1.0.5336-r4

Jun 21 21:02:10 Gentoo Badness in pci_find_subsys at drivers/pci/search.c:167
Jun 21 21:02:10 Gentoo [<c02a8ae8>] pci_find_subsys+0xe8/0xf0
Jun 21 21:02:10 Gentoo [<c02a8b1f>] pci_find_device+0x2f/0x40
Jun 21 21:02:10 Gentoo [<c02a8928>] pci_find_slot+0x28/0x50
Jun 21 21:02:10 Gentoo [<e13ab0a8>] os_pci_init_handle+0x39/0x68 [nvidia]
Jun 21 21:02:10 Gentoo [<e123f85f>] _nv001243rm+0x1f/0x24 [nvidia]
Jun 21 21:02:10 Gentoo [<e1386115>] _nv000816rm+0x2f5/0x384 [nvidia]
Jun 21 21:02:10 Gentoo [<e12ee92c>] _nv003801rm+0xd8/0x100 [nvidia]
Jun 21 21:02:10 Gentoo [<e1385c4f>] _nv000809rm+0x2f/0x34 [nvidia]
Jun 21 21:02:10 Gentoo [<e12ef750>] _nv003816rm+0xf0/0x104 [nvidia]
Jun 21 21:02:10 Gentoo [<e12f04c7>] _nv000013rm+0x77/0x84 [nvidia]
Jun 21 21:02:10 Gentoo [<e12efe6b>] _nv003780rm+0x1df/0x2c8 [nvidia]
Jun 21 21:02:10 Gentoo [<e12efc77>] _nv000012rm+0x43/0x58 [nvidia]
Jun 21 21:02:10 Gentoo [<e12efc34>] _nv000012rm+0x0/0x58 [nvidia]
Jun 21 21:02:10 Gentoo [<e123369c>] _nv001219rm+0xa8/0x124 [nvidia]
Jun 21 21:02:10 Gentoo [<e13a89a8>] nv_kern_rc_timer+0x0/0x37 [nvidia]
Jun 21 21:02:10 Gentoo [<e1243eb6>] rm_run_rc_callback+0x36/0x4c [nvidia]
Jun 21 21:02:10 Gentoo [<e13a89bb>] nv_kern_rc_timer+0x13/0x37 [nvidia]
Jun 21 21:02:10 Gentoo [<e0b4b030>] rh_report_status+0x0/0x140 [usbcore]
Jun 21 21:02:10 Gentoo [<c01282cb>] run_timer_softirq+0xcb/0x1b0
Jun 21 21:02:10 Gentoo [<c012849f>] do_timer+0xdf/0xf0
Jun 21 21:02:10 Gentoo [<c012433d>] __do_softirq+0x7d/0x80
Jun 21 21:02:10 Gentoo [<c0124366>] do_softirq+0x26/0x30
Jun 21 21:02:10 Gentoo [<c0107ead>] do_IRQ+0xfd/0x130
Jun 21 21:02:10 Gentoo [<c01061f4>] common_interrupt+0x18/0x20

Comment 13 aurelboiss 2004-06-21 12:32:36 UTC
Oops ;) my kernel is 2.6.7-r1, not 2.6.5. 2.6.5 works fine :)
Comment 14 Greg Kroah-Hartman (RETIRED) gentoo-dev 2004-06-21 12:55:02 UTC
So, let's try to figure this out, does the 2.6.7-r2 kernel work for the 
nvidia driver or not?

The "badness" message is the nvidia driver's fault and has been there forever;
go bug them to fix this known issue (they are calling a core pci function at
a time when it is illegal to do so).
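
A rough sketch of how a trace like the ones posted above can be attributed to a module: walk the frames from the top and report the first one tagged with a module name in square brackets, which is usually the code that called into the core function. This assumes the 2.6-style frame format shown in this bug (` [<address>] symbol+0xoff/0xsize [module]`); the function name `first_module_frame` is hypothetical.

```python
import re

# Matches frames such as " [<f8f760d0>] os_pci_init_handle+0x39/0x68 [nvidia]".
# The trailing "[module]" part is optional; core kernel frames do not have it.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def first_module_frame(trace: str):
    """Return (symbol, module) for the first frame inside a loadable module.

    Returns None if no frame in the trace carries a module tag.
    """
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(2):
            return match.group(1), match.group(2)
    return None


if __name__ == "__main__":
    # Lines copied from the trace in comment 6.
    sample = (
        "Badness in pci_find_subsys at drivers/pci/search.c:167\n"
        " [<c020e9e8>] pci_find_subsys+0xe8/0xf0\n"
        " [<c020ea1f>] pci_find_device+0x2f/0x40\n"
        " [<c020e828>] pci_find_slot+0x28/0x50\n"
        " [<f8f760d0>] os_pci_init_handle+0x39/0x68 [nvidia]\n"
    )
    print(first_module_frame(sample))
```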
Comment 15 Thierry Carrez (RETIRED) gentoo-dev 2004-06-22 01:27:55 UTC
With 2.6.7-gentoo-r3:
- 1.0.4363-r3 does not merge
- 1.0.4496-r3 gives "invalid module format" when inserted
- 1.0.4499 gives "invalid module format" when inserted
- 1.0.5328-r1 gives "invalid module format" when inserted
- 1.0.5336-r4 loads, but is buggy on my platform

With 2.6.5-gentoo-r1:
- 1.0.4496-r3 works OK
- 1.0.5336-r4 loads, but is buggy on my platform
Comment 16 M. Creidieki Crouch 2004-06-22 06:13:48 UTC
Could we have the nvidia module block gentoo-dev-sources-2.6.7 until this is fixed?
Comment 17 Joel Parker 2004-06-23 19:46:36 UTC
I agree. We either need to make nvidia-kernel-4496 block gentoo-dev-sources-2.6.7, or mark the latest nvidia-* stable. If some cards get garbled, and it's NVIDIA's fault, there's nothing we can do anyway. If there aren't any major Gentoo-specific bugs out there, then why not mark stable?
Comment 18 G.K.MacGregor 2004-06-24 10:45:02 UTC
Someone has found the cause of the X crash here...
http://forums.gentoo.org/viewtopic.php?t=188870

It seems that EHCI (USB 2.0) support in 2.6.7 may be buggy for nForce 2
chipsets (and maybe some others). The evidence in that forum post mirrors
comments #6 and #7 here, and also this forum thread...
http://forums.gentoo.org/viewtopic.php?t=188410

What seems to be happening is that the IRQ for the EHCI controller does not
get a response (irq 10: nobody cared!) and the kernel disables that IRQ
totally. Anything else sharing that IRQ will also become unusable, and the
video card just happens to be sharing that IRQ.

I removed EHCI from my 2.6.7 kernel completely, and that was the only thing
that fixed the issue for me. Of course it is not ideal for those using USB 2
devices as well...

So I think it's a new bug in the 2.6.7 kernel.
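
To see whether the video card shares an IRQ with the EHCI controller as described above, one can look at which drivers are registered on that IRQ line. A minimal sketch, assuming the text of `/proc/interrupts` in the 2.6-era layout (a header with one `CPUn` column per CPU, then `irq: counts controller names`); the function name and the sample contents are hypothetical and other kernel versions may use a different column layout.

```python
def devices_sharing_irq(interrupts_text: str, irq: int) -> list:
    """Return the driver names registered on the given IRQ line."""
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    # Header line has one "CPUn" column per CPU.
    ncpus = len(lines[0].split())
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        fields = rest.split()
        # Skip the per-CPU counters and the interrupt controller name.
        names = " ".join(fields[ncpus + 1:])
        return [name.strip() for name in names.split(",") if name.strip()]
    return []


if __name__ == "__main__":
    # Hypothetical /proc/interrupts contents, not taken from this report.
    sample = (
        "           CPU0\n"
        "  0:    1000000          XT-PIC  timer\n"
        " 12:      20000          XT-PIC  ehci_hcd, nvidia\n"
    )
    print(devices_sharing_irq(sample, 12))
```

If more than one name comes back for the video card's IRQ, a disabled IRQ would affect every device in that list.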
Comment 19 Mark E. Drummond 2004-07-04 16:04:11 UTC
I think I can confirm it is a 2.6.7 bug. Everything was working fine with 2.6.5. I just installed 2.6.7-r8 and now it is broken. If I go back to my 2.6.5 kernel, it works fine.

Mark
Comment 20 Andrew Bevitt 2004-08-30 04:19:44 UTC
Can someone confirm or deny whether this is a problem with anything > 2.6.7?
Comment 21 Daniel Drake (RETIRED) gentoo-dev 2004-10-06 02:47:26 UTC
Please reopen with information if the issue still exists in newer kernels.
Comment 22 Greg Kroah-Hartman (RETIRED) gentoo-dev 2005-08-16 11:54:06 UTC
Closed due to lack of info.