When I try to load the nvidia module on a 2.6.7-gentoo kernel, I get the following message: "invalid module format". I am using the stable versions of nvidia-kernel and nvidia-glx. With the masked drivers the module loads, although I then run into other problems: a scrambled or unevenly lit LCD screen. Reproducible: Always
This bug is a duplicate of #54431. Try using ACCEPT_KEYWORDS to emerge media-video/nvidia-kernel-1.0.5336-r4 and see if that works...
Whoops. Meant duplicate of #50768.
I know the nvidia module loads fine if both nvidia-kernel and nvidia-glx are updated to the masked version, but there are people like me who have problems with that version of the drivers. That is probably the reason this driver version is masked. I have tried upgrading only nvidia-kernel, but it did not work.
I have a similar bug with 2.6.7 and the latest nvidia driver (5336-r4, 336-r2). It is not exactly the same: I can load the nvidia driver, but when I start X my PC freezes :(
I can confirm this bug on a PIV system with gentoo-dev-sources-2.6.7 and nvidia-kernel-1.0.4496-r3 (both are the most current stable packages for x86). I might add that manually unmasking a masked version of nvidia-kernel obviously isn't a solution, since stable is supposed to work, and people who want to stick to stable shouldn't be left without working nvidia support. As long as there aren't any known problems, I suggest marking a newer nvidia-kernel (and the corresponding nvidia-glx) stable. My amd64 machine, for instance, is running fine with 1.0.5332-r2 (the current stable on amd64).
OK, here's a bit more information since I'm in the same boat.

My system:
- ASUS A7N8X-X (nforce2 chipset)
- GeForce FX5200
- gentoo-dev-sources-2.6.7
- nvidia-kernel-1.0.5336-r4
- nvidia-glx-1.0.5336-r2
- xorg-x11-6.7.0-r1

During boot right after "starting pci hotplug" and before "starting usb hotplug", there was a message "Disabling IRQ 12" that was not there when I used 2.6.5-gentoo-r1. X would start, the X log says nothing unusual, but the screen is blank (a little bit of scramble at the bottom of the screen) and unresponsive. Only X is unresponsive, meaning that I can still ssh from other machine and kill X.

The kernel syslog says this:

Badness in pci_find_subsys at drivers/pci/search.c:167
[pci_find_subsys+232/240] pci_find_subsys+0xe8/0xf0
[<c020e9e8>] pci_find_subsys+0xe8/0xf0
[pci_find_device+47/64] pci_find_device+0x2f/0x40
[<c020ea1f>] pci_find_device+0x2f/0x40
[pci_find_slot+40/80] pci_find_slot+0x28/0x50
[<c020e828>] pci_find_slot+0x28/0x50
[pg0+951120080/1069117440] os_pci_init_handle+0x39/0x68 [nvidia]
[<f8f760d0>] os_pci_init_handle+0x39/0x68 [nvidia]
[pg0+949631071/1069117440] _nv001243rm+0x1f/0x24 [nvidia]
[<f8e0a85f>] _nv001243rm+0x1f/0x24 [nvidia]
[pg0+950968597/1069117440] _nv000816rm+0x2f5/0x384 [nvidia]
[<f8f51115>] _nv000816rm+0x2f5/0x384 [nvidia]
[pg0+950348076/1069117440] _nv003801rm+0xd8/0x100 [nvidia]
[<f8eb992c>] _nv003801rm+0xd8/0x100 [nvidia]
[pg0+950967375/1069117440] _nv000809rm+0x2f/0x34 [nvidia]
[<f8f50c4f>] _nv000809rm+0x2f/0x34 [nvidia]
[pg0+950351696/1069117440] _nv003816rm+0xf0/0x104 [nvidia]
[<f8eba750>] _nv003816rm+0xf0/0x104 [nvidia]
[pg0+950355143/1069117440] _nv000013rm+0x77/0x84 [nvidia]
[<f8ebb4c7>] _nv000013rm+0x77/0x84 [nvidia]
[pg0+950353515/1069117440] _nv003780rm+0x1df/0x2c8 [nvidia]
[<f8ebae6b>] _nv003780rm+0x1df/0x2c8 [nvidia]
[pg0+950353015/1069117440] _nv000012rm+0x43/0x58 [nvidia]
[<f8ebac77>] _nv000012rm+0x43/0x58 [nvidia]
[pg0+950352948/1069117440] _nv000012rm+0x0/0x58 [nvidia]
[<f8ebac34>] _nv000012rm+0x0/0x58 [nvidia]
[pg0+949581468/1069117440] _nv001219rm+0xa8/0x124 [nvidia]
[<f8dfe69c>] _nv001219rm+0xa8/0x124 [nvidia]
[pg0+951110056/1069117440] nv_kern_rc_timer+0x0/0x37 [nvidia]
[<f8f739a8>] nv_kern_rc_timer+0x0/0x37 [nvidia]
[pg0+949649078/1069117440] rm_run_rc_callback+0x36/0x4c [nvidia]
[<f8e0eeb6>] rm_run_rc_callback+0x36/0x4c [nvidia]
[pg0+951110075/1069117440] nv_kern_rc_timer+0x13/0x37 [nvidia]
[<f8f739bb>] nv_kern_rc_timer+0x13/0x37 [nvidia]
[run_timer_softirq+203/432] run_timer_softirq+0xcb/0x1b0
[<c01211cb>] run_timer_softirq+0xcb/0x1b0
[do_timer+223/240] do_timer+0xdf/0xf0
[<c012139f>] do_timer+0xdf/0xf0
[__do_softirq+125/128] __do_softirq+0x7d/0x80
[<c011d23d>] __do_softirq+0x7d/0x80
[do_softirq+38/48] do_softirq+0x26/0x30
[<c011d266>] do_softirq+0x26/0x30
[do_IRQ+253/304] do_IRQ+0xfd/0x130
[<c01081ad>] do_IRQ+0xfd/0x130
[common_interrupt+24/32] common_interrupt+0x18/0x20
[<c01064b4>] common_interrupt+0x18/0x20

... repeated several times.

Found related discussions here:
http://www.nvnews.net/vbulletin/showthread.php?t=24866
http://forums.gentoo.org/viewtopic.php?t=187632
This is the message printed in kernel syslog when "Disabling IRQ 12" happened.

[__report_bad_irq+42/144] __report_bad_irq+0x2a/0x90
[<c01074da>] __report_bad_irq+0x2a/0x90
[note_interrupt+112/160] note_interrupt+0x70/0xa0
[<c01075d0>] note_interrupt+0x70/0xa0
[do_IRQ+289/304] do_IRQ+0x121/0x130
[<c0107871>] do_IRQ+0x121/0x130
[common_interrupt+24/32] common_interrupt+0x18/0x20
[<c0105cb4>] common_interrupt+0x18/0x20
[handle_IRQ_event+36/112] handle_IRQ_event+0x24/0x70
[<c0107464>] handle_IRQ_event+0x24/0x70
[do_IRQ+145/304] do_IRQ+0x91/0x130
[<c01077e1>] do_IRQ+0x91/0x130
[common_interrupt+24/32] common_interrupt+0x18/0x20
[<c0105cb4>] common_interrupt+0x18/0x20
[__do_softirq+48/128] __do_softirq+0x30/0x80
[<c011ab70>] __do_softirq+0x30/0x80
[do_softirq+38/48] do_softirq+0x26/0x30
[<c011abe6>] do_softirq+0x26/0x30
[do_IRQ+253/304] do_IRQ+0xfd/0x130
[<c010784d>] do_IRQ+0xfd/0x130
[common_interrupt+24/32] common_interrupt+0x18/0x20
[<c0105cb4>] common_interrupt+0x18/0x20
[pci_bus_read_config_byte+95/144] pci_bus_read_config_byte+0x5f/0x90
[<c020705f>] pci_bus_read_config_byte+0x5f/0x90
[pg0+946316190/1069428736] ehci_start+0x2ce/0x360 [ehci_hcd]
[<f8a9539e>] ehci_start+0x2ce/0x360 [ehci_hcd]
[preempt_schedule+42/80] preempt_schedule+0x2a/0x50
[<c03013ba>] preempt_schedule+0x2a/0x50
[release_console_sem+203/224] release_console_sem+0xcb/0xe0
[<c01178fb>] release_console_sem+0xcb/0xe0
[printk+269/368] printk+0x10d/0x170
[<c011777d>] printk+0x10d/0x170
[pg0+944391415/1069428736] usb_register_bus+0x137/0x160 [usbcore]
[<f88bf4f7>] usb_register_bus+0x137/0x160 [usbcore]
[pg0+944412043/1069428736] usb_hcd_pci_probe+0x2ab/0x4e0 [usbcore]
[<f88c458b>] usb_hcd_pci_probe+0x2ab/0x4e0 [usbcore]
[pci_device_probe_static+82/112] pci_device_probe_static+0x52/0x70
[<c020a982>] pci_device_probe_static+0x52/0x70
[__pci_device_probe+59/80] __pci_device_probe+0x3b/0x50
[<c020a9db>] __pci_device_probe+0x3b/0x50
[pci_device_probe+44/80] pci_device_probe+0x2c/0x50
[<c020aa1c>] pci_device_probe+0x2c/0x50
[bus_match+63/112] bus_match+0x3f/0x70
[<c023608f>] bus_match+0x3f/0x70
[driver_attach+89/144] driver_attach+0x59/0x90
[<c02361b9>] driver_attach+0x59/0x90
[bus_add_driver+145/176] bus_add_driver+0x91/0xb0
[<c0236461>] bus_add_driver+0x91/0xb0
[driver_register+47/64] driver_register+0x2f/0x40
[<c023691f>] driver_register+0x2f/0x40
[pci_register_driver+92/144] pci_register_driver+0x5c/0x90
[<c020ac9c>] pci_register_driver+0x5c/0x90
[pg0+944902179/1069428736] init+0x23/0x30 [ehci_hcd]
[<f893c023>] init+0x23/0x30 [ehci_hcd]
[sys_init_module+276/560] sys_init_module+0x114/0x230
[<c012c8c4>] sys_init_module+0x114/0x230
[syscall_call+7/11] syscall_call+0x7/0xb
[<c0105b47>] syscall_call+0x7/0xb

I tried turning off hotplug so that IRQ 12 is not disabled yet before I start X, and it worked! But I have no mouse since usb hotplug is also off.

IRQ 12 is used (shared) by the video card:

Bus 2, device 0, function 0:
  VGA compatible controller: nVidia Corporation NV34 [GeForce FX 5200] (rev 161).
    IRQ 12.
    Master Capable. Latency=248. Min Gnt=5.Max Lat=1.
    Non-prefetchable 32 bit memory at 0xec000000 [0xecffffff].
    Prefetchable 32 bit memory at 0xe0000000 [0xe7ffffff].
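The IRQ sharing described in this comment can also be checked from /proc/interrupts, where handlers on the same line are listed together. The following is a minimal sketch, not a definitive tool: the function name `shared_irqs` is hypothetical, and it assumes the 2.6-era layout in which each numbered line holds one counter per CPU, a single-token controller name (for example XT-PIC), and then a comma-separated handler list.

```python
import re


def shared_irqs(interrupts_text):
    """Return {irq_number: [handler, ...]} for IRQs with more than one handler.

    Assumption: the input looks like 2.6-era /proc/interrupts output, i.e.
    "<irq>: <count per CPU>... <controller> <handler>[, <handler>...]".
    """
    shared = {}
    for line in interrupts_text.splitlines():
        match = re.match(r"\s*(\d+):\s+(.*)$", line)
        if not match:
            continue  # header line, or named entries such as NMI/ERR
        irq = int(match.group(1))
        fields = match.group(2).split()
        # Drop the per-CPU counters (the leading all-digit fields).
        while fields and fields[0].isdigit():
            fields.pop(0)
        if len(fields) < 2:
            continue  # no controller name, or no registered handler
        # Assumption: fields[0] is a single-token controller name.
        handlers = [h.strip() for h in " ".join(fields[1:]).split(",") if h.strip()]
        if len(handlers) > 1:
            shared[irq] = handlers
    return shared


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        for irq, handlers in sorted(shared_irqs(f.read()).items()):
            print(irq, ", ".join(handlers))
```

If the video driver and a USB host controller driver show up on the same line, a kernel decision to disable that IRQ would affect both.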
Same here. The current stable x86 nvidia-kernel gives "invalid module format", and upgrading to the ~x86 version does not help: the current ~x86 version loads nicely, but when used it gives a blank (black) screen with no way to switch back to the console. It looks like only the display is garbled, since ctrl-alt-del reboots cleanly. Note that my PC at home runs 2.6.7-r1 with the latest ~x86 nvidia-kernel perfectly, so this side problem is probably card-specific.
Make sure that you do not have 4k stacks enabled in your kernel.
My config has:
# CONFIG_4KSTACKS is not set
so it doesn't come from that.
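For anyone who wants to repeat the 4k stacks check, a kernel .config can be read programmatically. This is a minimal sketch under stated assumptions: the helper name `kconfig_value` is hypothetical, and the path /usr/src/linux/.config is only the conventional location of the configured source tree and may differ on a given system.

```python
def kconfig_value(config_text, option):
    """Return the value assigned to a kernel config option, or None.

    None covers both "# CONFIG_FOO is not set" and an option that does not
    appear in the file at all.
    """
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


if __name__ == "__main__":
    # Assumption: the running kernel was built from this tree.
    with open("/usr/src/linux/.config") as f:
        value = kconfig_value(f.read(), "CONFIG_4KSTACKS")
    print("4k stacks enabled" if value == "y" else "4k stacks not enabled")
```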
That "badness" issue is an nvidia driver bug. Please go bug them to fix this; there is nothing we can do about it here.
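To see at a glance which modules appear in a trace like the ones posted in this bug, the bracketed module tags can be counted. A minimal sketch, assuming the `[<address>] symbol+0xoff/0xlen [module]` frame layout seen in the traces here; the function name `frames_by_module` is hypothetical, and frames without a module tag are counted as "kernel".

```python
import re
from collections import Counter

# One raw frame: "[<f8f760d0>] os_pci_init_handle+0x39/0x68 [nvidia]".
# The trailing "[module]" tag is optional.
FRAME = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def frames_by_module(trace_text):
    """Count raw stack frames per module; untagged frames count as 'kernel'."""
    counts = Counter()
    for _address, _symbol, module in FRAME.findall(trace_text):
        counts[module or "kernel"] += 1
    return dict(counts)
```

A count only shows which modules are on the call path. It does not by itself establish which one is at fault.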
I have exactly the same bug :(

Portage 2.0.50-r8 (default-x86-2004.0, gcc-3.3.3, glibc-2.3.3.20040420-r0, 2.6.5-gentoo-r1)
=================================================================
System uname: 2.6.5-gentoo-r1 i686 AMD Athlon(tm) XP 2700+
Gentoo Base System version 1.4.16
Autoconf: sys-devel/autoconf-2.59-r3
Automake: sys-devel/automake-1.8.3
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-O2 -march=athlon-xp -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3.2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -march=athlon-xp -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox"
GENTOO_MIRRORS="ftp://ftp.belnet.be/mirror/rsync.gentoo.org/gentoo/"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY=""
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X alsa apm arts avi berkdb crypt cups encode esd foomaticdb gdbm ggi gif gnome gpm gtk gtk2 imlib java jpeg kde libg++ libwww linguas_fr mad mikmod motif mpeg ncurses nls oggvorbis opengl oss pam pdflib perl png python qt quicktime readline sdl slang spell ssl svga tcpd truetype x86 xml2 xmms xv zlib"

I have installed the latest nvidia drivers:
nvidia-glx: 1.0.5336-r2
nvidia-kernel: 1.0.5336-r4

Jun 21 21:02:10 Gentoo Badness in pci_find_subsys at drivers/pci/search.c:167
Jun 21 21:02:10 Gentoo [<c02a8ae8>] pci_find_subsys+0xe8/0xf0
Jun 21 21:02:10 Gentoo [<c02a8b1f>] pci_find_device+0x2f/0x40
Jun 21 21:02:10 Gentoo [<c02a8928>] pci_find_slot+0x28/0x50
Jun 21 21:02:10 Gentoo [<e13ab0a8>] os_pci_init_handle+0x39/0x68 [nvidia]
Jun 21 21:02:10 Gentoo [<e123f85f>] _nv001243rm+0x1f/0x24 [nvidia]
Jun 21 21:02:10 Gentoo [<e1386115>] _nv000816rm+0x2f5/0x384 [nvidia]
Jun 21 21:02:10 Gentoo [<e12ee92c>] _nv003801rm+0xd8/0x100 [nvidia]
Jun 21 21:02:10 Gentoo [<e1385c4f>] _nv000809rm+0x2f/0x34 [nvidia]
Jun 21 21:02:10 Gentoo [<e12ef750>] _nv003816rm+0xf0/0x104 [nvidia]
Jun 21 21:02:10 Gentoo [<e12f04c7>] _nv000013rm+0x77/0x84 [nvidia]
Jun 21 21:02:10 Gentoo [<e12efe6b>] _nv003780rm+0x1df/0x2c8 [nvidia]
Jun 21 21:02:10 Gentoo [<e12efc77>] _nv000012rm+0x43/0x58 [nvidia]
Jun 21 21:02:10 Gentoo [<e12efc34>] _nv000012rm+0x0/0x58 [nvidia]
Jun 21 21:02:10 Gentoo [<e123369c>] _nv001219rm+0xa8/0x124 [nvidia]
Jun 21 21:02:10 Gentoo [<e13a89a8>] nv_kern_rc_timer+0x0/0x37 [nvidia]
Jun 21 21:02:10 Gentoo [<e1243eb6>] rm_run_rc_callback+0x36/0x4c [nvidia]
Jun 21 21:02:10 Gentoo [<e13a89bb>] nv_kern_rc_timer+0x13/0x37 [nvidia]
Jun 21 21:02:10 Gentoo [<e0b4b030>] rh_report_status+0x0/0x140 [usbcore]
Jun 21 21:02:10 Gentoo [<c01282cb>] run_timer_softirq+0xcb/0x1b0
Jun 21 21:02:10 Gentoo [<c012849f>] do_timer+0xdf/0xf0
Jun 21 21:02:10 Gentoo [<c012433d>] __do_softirq+0x7d/0x80
Jun 21 21:02:10 Gentoo [<c0124366>] do_softirq+0x26/0x30
Jun 21 21:02:10 Gentoo [<c0107ead>] do_IRQ+0xfd/0x130
Jun 21 21:02:10 Gentoo [<c01061f4>] common_interrupt+0x18/0x20
Oops ;) My kernel is 2.6.7-r1, not 2.6.5. 2.6.5 works fine :)
So, let's try to figure this out: does the 2.6.7-r2 kernel work with the nvidia driver or not? The "badness" message is the nvidia driver's fault and has been there forever; go bug them to fix this known issue (they are calling a core pci function at an illegal time).
With 2.6.7-gentoo-r3:
- 1.0.4363-r3 does not merge
- 1.0.4496-r3 gives "invalid module format" when inserted
- 1.0.4499 gives "invalid module format" when inserted
- 1.0.5328-r1 gives "invalid module format" when inserted
- 1.0.5336-r4 loads, but is buggy on my platform

With 2.6.5-gentoo-r1:
- 1.0.4496-r3 works OK
- 1.0.5336-r4 loads, but is buggy on my platform
Could we have the nvidia module block gentoo-dev-sources-2.6.7 until this is fixed?
I agree. We either need to make nvidia-kernel-4496 block gentoo-dev-sources-2.6.7, or mark the latest nvidia-* stable. If some cards get garbled, and it's NVIDIA's fault, there's nothing we can do anyway. If there aren't any major Gentoo-specific bugs out there, then why not mark stable?
Someone has found the cause of the X crash here... http://forums.gentoo.org/viewtopic.php?t=188870 It seems that EHCI (USB 2.0) support in 2.6.7 may be buggy for nForce 2 chipsets (and maybe some others). The evidence in that forum post mirrors comments #6 and #7 here, and also this forum thread... http://forums.gentoo.org/viewtopic.php?t=188410 What seems to be happening is that the IRQ for the EHCI controller does not get a response (irq 10: nobody cared!) and the kernel disables that IRQ totally. Anything else sharing that IRQ will also become unusable, and the video card just happens to be sharing that IRQ. I removed EHCI from my 2.6.7 kernel completely, and that was the only thing that fixed the issue for me. Of course it is not ideal for those using USB 2 devices as well... So I think it's a new bug in the 2.6.7 kernel.
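To check whether a given machine is hitting the disabled-IRQ condition described here, the kernel log can be scanned for the two messages quoted in this bug. This is a minimal sketch: the function name `disabled_irqs` is hypothetical, and the patterns are based only on the wording reported in this thread ("Disabling IRQ 12", "irq 10: nobody cared!"), so the exact message format on other kernel versions is an assumption.

```python
import re

# Wording taken from the messages quoted in this bug; the optional '#'
# before the number is an assumption about variation between kernels.
DISABLED = re.compile(r"Disabling IRQ #?(\d+)")
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared", re.IGNORECASE)


def disabled_irqs(log_text):
    """Return the sorted IRQ numbers the log reports as unhandled or disabled."""
    irqs = set()
    for line in log_text.splitlines():
        for pattern in (DISABLED, NOBODY_CARED):
            match = pattern.search(line)
            if match:
                irqs.add(int(match.group(1)))
    return sorted(irqs)
```

Any IRQ number returned here can then be compared against the IRQ assigned to the video card.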
I think I can confirm it is a 2.6.7 bug. I was working fine with 2.6.5. I just installed 2.6.7-r8 and now I am broke. If I go back to my 2.6.5 kernel, it works fine. Mark
Can someone confirm / deny this is a problem with anything > 2.6.7?
Please reopen with information if the issue still exists in newer kernels.
Closed due to lack of info.