When I try to use nvidia-kernel/glx version 1.0.6111-r3 or 1.0.6629, I get a strange lockup when gdm starts up. The screen goes blank and I can't switch to a virtual terminal. I can, however, ssh into the box and use the system that way. I'm going to post my xorg.conf file so you can see it. I can't seem to find any errors in the Xorg logs, but if you'd like I'll post those as well. This machine is pretty old and uses an old TNT2 graphics chip.

Reproducible: Always

Steps to Reproduce:

emerge info:
Portage 2.0.51-r3 (default-linux/x86/2004.3, gcc-3.4.3, glibc-2.3.4.20041102-r0, 2.6.10-rc2 i686)
=================================================================
System uname: 2.6.10-rc2 i686 Pentium III (Katmai)
Gentoo Base System version 1.6.6
ccache version 2.3 [enabled]
Autoconf: sys-devel/autoconf-2.59-r5
Automake: sys-devel/automake-1.8.5-r1
Binutils: sys-devel/binutils-2.15.92.0.2-r1
Headers: sys-kernel/linux-headers-2.4.19,sys-kernel/linux-headers-2.4.22
Libtools: sys-devel/libtool-1.5.2-r7
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-march=pentium3 -O3 -fomit-frame-pointer -funroll-loops -ffast-math -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER=""
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3.2/share/config /usr/kde/3.3/env /usr/kde/3.3/share/config /usr/kde/3.3/shutdown /usr/kde/3/share/config /usr/lib/mozilla/defaults/pref /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/alias /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=pentium3 -O3 -fomit-frame-pointer -funroll-loops -ffast-math -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs autoconfig buildpkg ccache distlocks fixpackages sandbox sfperms"
GENTOO_MIRRORS="http://gentoo.chem.wisc.edu/gentoo/ http://gentoo.oregonstate.edu http://distro.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.namerica.gentoo.org/gentoo-portage"
USE="X aalib alsa apm arts avi berkdb bitmap-fonts bonobo cdr crd crypt cups dga dvd encode esd f77 fam flac foomaticdb fortran gd gdbm gif gimpprint gnome gstreamer gtk gtk2 gtkhtml imagemagick imlib java jpeg junit libg++ libwww mad maildir mikmod mmx motif mozilla mpeg mysql ncurses nls oggvorbis opengl oss pam pdflib perl png ppds python qt quicktime readline samba sdl slang spell sse ssl svga tcltk tcpd tetex tiff truetype usb x86 xml xml2 xmms xv zlib"
Created attachment 44801 [details] My xorg.conf Also important to note... I'm running x.org-6.8.0-r3.
I don't think anyone will care if we steal this. Now John, ftp://download.nvidia.com/XFree86/Linux-x86/1.0-6629/README.txt has an appendix (H, IIRC) that deals with TNT cards; perhaps read through it and see what you can find. It would also be nice if you could attach some logs: /var/log/Xorg.0.log and the output of dmesg, to start with.
Created attachment 45051 [details] dmesg Here is the dmesg from the system showing everything including the kernel module loading. No errors are ever reported to the log, even after it is locked up.
Created attachment 45053 [details] Xorg.0.log The X logs look pretty normal to me too. I have to admit I have no idea what all the resource range stuff indicates but I see no error messages that should be causing a problem.
OK, can you cause a lockup, then ssh into the box and have a look at memory and CPU usage? Try to diagnose what kind of lockup is occurring and what is causing it. top or ps aux should give you some info to start with.
Top output while X is running:

top - 10:15:13 up 3 days, 1:15, 4 users, load average: 4.21, 2.98, 2.32
Tasks: 46 total, 2 running, 44 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0% us, 100.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 644972k total, 477348k used, 167624k free, 146644k buffers
Swap: 1249912k total, 8k used, 1249904k free, 279964k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26630 root 25 0 14452 10m 2392 R 99.6 1.6 16:24.49 X
1 root 16 0 1480 496 436 S 0.0 0.1 0:01.66 init
2 root 34 19 0 0 0 S 0.0 0.0 0:02.34 ksoftirqd/0
3 root 5 -10 0 0 0 S 0.0 0.0 0:00.50 events/0
4 root 5 -10 0 0 0 S 0.0 0.0 0:00.00 khelper
5 root 14 -10 0 0 0 S 0.0 0.0 0:00.00 kacpid
6 root 5 -10 0 0 0 S 0.0 0.0 0:04.87 kblockd/0
7 root 15 0 0 0 0 S 0.0 0.0 0:00.00 khubd
11 root 12 -10 0 0 0 S 0.0 0.0 0:00.00 aio/0
10 root 16 0 0 0 0 S 0.0 0.0 0:39.66 kswapd0
14 root 25 0 0 0 0 S 0.0 0.0 0:00.00 kseriod
17 root 5 -10 0 0 0 S 0.0 0.0 0:00.81 reiserfs/0
75 root 18 0 1864 976 676 S 0.0 0.2 0:00.22 devfsd
5449 root 15 0 1764 780 600 S 0.0 0.1 0:00.16 syslog-ng
6116 root 23 0 1492 472 416 S 0.0 0.1 0:00.00 dhcpcd
6147 root 16 0 5792 2636 1384 S 0.0 0.4 0:01.38 cupsd
6398 root 16 0 1808 628 532 S 0.0 0.1 0:01.01 crond

ps aux output:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 1480 496 ? S Nov29 0:01 init [3]
root 2 0.0 0.0 0 0 ? SN Nov29 0:02 [ksoftirqd/0]
root 3 0.0 0.0 0 0 ? S< Nov29 0:00 [events/0]
root 4 0.0 0.0 0 0 ? S< Nov29 0:00 [khelper]
root 5 0.0 0.0 0 0 ? S< Nov29 0:00 [kacpid]
root 6 0.0 0.0 0 0 ? S< Nov29 0:04 [kblockd/0]
root 7 0.0 0.0 0 0 ? S Nov29 0:00 [khubd]
root 11 0.0 0.0 0 0 ? S< Nov29 0:00 [aio/0]
root 10 0.0 0.0 0 0 ? S Nov29 0:39 [kswapd0]
root 14 0.0 0.0 0 0 ? S Nov29 0:00 [kseriod]
root 17 0.0 0.0 0 0 ? S< Nov29 0:00 [reiserfs/0]
root 75 0.0 0.1 1864 976 ? Ss Nov29 0:00 /sbin/devfsd /dev
root 5449 0.0 0.1 1764 780 ? Ss Nov29 0:00 /usr/sbin/syslog-ng
root 6116 0.0 0.0 1492 472 ? Ss Nov29 0:00 /sbin/dhcpcd -h homer -N -Y -t 3 -h JTSHAW eth0
root 6147 0.0 0.4 5792 2636 ? Ss Nov29 0:01 /usr/sbin/cupsd
root 6398 0.0 0.0 1808 628 ? S Nov29 0:01 /usr/sbin/crond
ntp 6520 0.0 0.6 4072 4068 ? SLs Nov29 0:01 /usr/sbin/ntpd -p /var/run/ntpd.pid -u ntp:ntp
root 6570 0.0 0.3 7012 1996 ? Ss Nov29 0:00 /usr/sbin/smbd -D
root 6572 0.0 0.2 3724 1408 ? Ss Nov29 0:00 /usr/sbin/nmbd -D
root 6574 0.0 0.3 7012 1980 ? S Nov29 0:00 /usr/sbin/smbd -D
root 6622 0.0 0.2 3528 1504 ? Ss Nov29 0:00 /usr/sbin/sshd
root 6635 0.0 0.1 2344 1120 ? Ss Nov29 0:00 login -- root
root 6637 0.0 0.0 1472 516 tty3 Ss+ Nov29 0:00 /sbin/agetty 38400 tty3 linux
root 6638 0.0 0.0 1472 516 tty4 Ss+ Nov29 0:00 /sbin/agetty 38400 tty4 linux
root 6639 0.0 0.0 1472 516 tty5 Ss+ Nov29 0:00 /sbin/agetty 38400 tty5 linux
root 6643 0.0 0.0 1472 516 tty6 Ss+ Nov29 0:00 /sbin/agetty 38400 tty6 linux
root 6644 0.0 0.2 2556 1552 tty1 Ss+ Nov29 0:00 -bash
root 25537 0.0 0.0 0 0 ? S< Dec01 0:00 [loop0]
root 16877 0.0 0.0 0 0 ? S 09:06 0:01 [pdflush]
root 16884 0.0 0.0 0 0 ? S 09:06 0:00 [pdflush]
root 22146 0.0 0.1 2344 1120 ? Ss 09:38 0:00 login -- root
root 26042 0.0 0.2 2556 1544 tty2 Ss+ 09:43 0:00 -bash
root 26627 0.0 0.3 9556 2352 ? Ss 09:46 0:00 /usr/bin/gdm
root 26628 0.0 0.3 9556 2460 ? S 09:46 0:00 /usr/bin/gdm

Running ps aux causes the ssh session to lock up, and it never finishes its output. It seems to get stuck right before it prints the information for the X process. There is nothing new in Xorg.0.log or in dmesg.
It is probably also worth noting that I cannot kill the X process. It doesn't seem to respond to signals at all. The only way to get rid of the process at this point is to reboot the machine.
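A process generally acts on signals, including SIGKILL, only when it returns from kernel code to user space, so a task that is stuck inside a driver call can look unkillable. As a rough sketch of how one might check this from the ssh session, the Python below parses the state letter out of a /proc/<pid>/stat line. The function names and the sample line are made up for illustration; only the "pid (comm) state ..." layout is taken from proc(5).

```python
from typing import NamedTuple


class ProcStat(NamedTuple):
    pid: int
    comm: str
    state: str


# Meanings of the state letters that matter for this kind of hang.
STATE_NOTES = {
    "R": "running or runnable; with %sy pegged it may be looping in kernel code",
    "S": "interruptible sleep; signals should wake it",
    "D": "uninterruptible sleep; signals stay pending until the kernel call returns",
    "Z": "zombie; already dead, waiting to be reaped by its parent",
}


def parse_proc_stat(line: str) -> ProcStat:
    """Extract pid, command name and state from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split on the first '(' and the last ')'.
    left, _, right = line.rpartition(")")
    pid_text, _, comm = left.partition("(")
    return ProcStat(int(pid_text), comm, right.split()[0])


def describe(stat: ProcStat) -> str:
    """Return a one-line, human-readable note about the process state."""
    note = STATE_NOTES.get(stat.state, "state not covered by this sketch")
    return f"{stat.comm} (pid {stat.pid}) is in state {stat.state}: {note}"


if __name__ == "__main__":
    # Hypothetical sample line, truncated after the first few fields.
    sample = "6727 (X) D 6726 6727 6727 0 -1"
    print(describe(parse_proc_stat(sample)))
```

On a live system the input line would come from reading /proc/<pid>/stat for the X server's pid.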
Ok... so I have done some messing around and I tried with the 1.0.6629 nvidia-kernel/glx module and I get a lovely message on dmesg:

Unable to handle kernel NULL pointer dereference at virtual address 00000001
 printing eip:
c04cc290
*pde = 00000000
Oops: 0002 [#1]
PREEMPT
Modules linked in: nvidia
CPU: 0
EIP: 0060:[<c04cc290>] Tainted: P VLI
EFLAGS: 00213286 (2.6.10-rc2)
EIP is at pci_find_bus+0x0/0x60
eax: 00000001 ebx: 00000000 ecx: 00000001 edx: caca9bee
esi: caca9bee edi: caca9bee ebp: caca9bb0 esp: caca9b5c
ds: 007b es: 007b ss: 0068
Process X (pid: 2450, threadinfo=caca8000 task=e4b4b540)
Stack: e9390fb7 00000000 00000001 00203000 00000000 c045474c 00000001 e658d000
       e4ef5000 e658d400 e91a506d 00000001 00000000 00000000 caca9bee caca9bee
       caca9bc4 00203282 c045474c e4ef5000 00000001 caca9bf0 e9199e7d e658d000
Call Trace:
 [<e9390fb7>] os_pci_init_handle+0x3a/0x8e [nvidia]
 [<e91a506d>] _nv001746rm+0x25/0x2c [nvidia]
 [<e9199e7d>] _nv002400rm+0x35/0x3c [nvidia]
 [<c0217db3>] pci_find_subsys+0xc3/0x110
 [<e919fb7e>] _nv002355rm+0x72/0x52c [nvidia]
 [<e919fb71>] _nv002355rm+0x65/0x52c [nvidia]
 [<e9198a7a>] _nv001955rm+0x36/0xe0 [nvidia]
 [<e91ab9cc>] rm_update_agp_config+0x38/0x50 [nvidia]
 [<e91ab9dc>] rm_update_agp_config+0x48/0x50 [nvidia]
 [<e938ef6b>] nv_agp_init+0xb1/0x19b [nvidia]
 [<e91a7522>] _nv001779rm+0xda/0x110 [nvidia]
 [<e91a74c5>] _nv001779rm+0x7d/0x110 [nvidia]
 [<e92b0848>] _nv002152rm+0x98/0xa4 [nvidia]
 [<e91a7efd>] _nv004480rm+0x131/0x158 [nvidia]
 [<e91a7ee7>] _nv004480rm+0x11b/0x158 [nvidia]
 [<e9390f61>] os_pci_device_present+0x21/0x3d [nvidia]
 [<e91a7d59>] _nv004434rm+0xa9/0x11c [nvidia]
 [<e91a7da9>] _nv004434rm+0xf9/0x11c [nvidia]
 [<e91996d1>] _nv000806rm+0x15/0x24 [nvidia]
 [<e931a1fb>] _nv002078rm+0x73/0x94 [nvidia]
 [<e931a1ee>] _nv002078rm+0x66/0x94 [nvidia]
 [<e91a529e>] _nv001795rm+0x16/0x1c [nvidia]
 [<e92b1e5a>] _nv002236rm+0x1e6/0x1f4 [nvidia]
 [<e92b1cbd>] _nv002236rm+0x49/0x1f4 [nvidia]
 [<e92b2470>] _nv002231rm+0x74/0x80 [nvidia]
 [<e92b1a9e>] _nv002163rm+0x9e/0xb8 [nvidia]
 [<e92b1a91>] _nv002163rm+0x91/0xb8 [nvidia]
 [<e91a7bb7>] _nv001315rm+0x37/0xc0 [nvidia]
 [<e91a7bde>] _nv001315rm+0x5e/0xc0 [nvidia]
 [<e925bb0d>] _nv001933rm+0x65/0xc4 [nvidia]
 [<e91a8392>] _nv001320rm+0x1de/0x2bc [nvidia]
 [<e91a8386>] _nv001320rm+0x1d2/0x2bc [nvidia]
 [<e91a54be>] _nv001820rm+0x12/0x18 [nvidia]
 [<e91ab08b>] rm_init_adapter+0x5f/0x8c [nvidia]
 [<e91ab07f>] rm_init_adapter+0x53/0x8c [nvidia]
 [<e938c9e5>] nv_kern_open+0x2b3/0x34f [nvidia]
 [<e938d67f>] nv_kern_isr+0x0/0x144 [nvidia]
 [<c015f1a0>] chrdev_open+0x160/0x200
 [<c015f040>] chrdev_open+0x0/0x200
 [<c0154c52>] dentry_open+0x1d2/0x270
 [<c0154a6c>] filp_open+0x5c/0x70
 [<c0154d45>] get_unused_fd+0x55/0xf0
 [<c0154ea9>] sys_open+0x49/0x90
 [<c010323b>] syscall_call+0x7/0xb
Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 31 00 00 00 2f 75 73 72 2f 69 6e 63 6c 75 64 65 2f
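To see at a glance how much of a trace like the one above runs through the nvidia module rather than the core kernel, something along these lines could be used. It is only a sketch: the regular expression assumes the "[<address>] symbol+0xoff/0xlen [module]" frame layout seen in this oops, and the helper names are made up for illustration.

```python
import re
from collections import Counter

# One call-trace frame: "[<addr>] symbol+0xoff/0xlen", optionally followed
# by "[module]" when the symbol lives in a loadable module.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def parse_frames(text):
    """Return (address, symbol, module) per frame; built-in symbols get 'kernel'."""
    return [(addr, sym, mod or "kernel") for addr, sym, mod in FRAME_RE.findall(text)]


def count_by_module(frames):
    """Count how many frames belong to each module."""
    return Counter(mod for _, _, mod in frames)


if __name__ == "__main__":
    # First few frames copied from the oops in this report.
    sample = (
        "[<e9390fb7>] os_pci_init_handle+0x3a/0x8e [nvidia] "
        "[<e91a506d>] _nv001746rm+0x25/0x2c [nvidia] "
        "[<c0217db3>] pci_find_subsys+0xc3/0x110 "
        "[<e919fb7e>] _nv002355rm+0x72/0x52c [nvidia]"
    )
    frames = parse_frames(sample)
    for addr, sym, mod in frames:
        print(f"{mod:8s} {sym} @ {addr}")
    print(dict(count_by_module(frames)))
```

Stack traces from kernels of this era can contain stale entries, so the frame list is best read as a hint about the call path, not an exact record.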
Created attachment 45157 [details] Xorg.0.log with nvidia-kernel/glx-1.0.6629
Created attachment 45158 [details] Dmesg with nvidia-kernel/glx-1.0.6629
ps aux w/ nvidia-kernel/glx-1.0.6629:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.1 0.0 1480 496 ? S 12:02 0:00 init [3]
root 2 0.0 0.0 0 0 ? SN 12:02 0:00 [ksoftirqd/0]
root 3 0.0 0.0 0 0 ? S< 12:02 0:00 [events/0]
root 4 0.0 0.0 0 0 ? S< 12:02 0:00 [khelper]
root 5 0.0 0.0 0 0 ? S< 12:02 0:00 [kacpid]
root 6 0.0 0.0 0 0 ? S< 12:02 0:00 [kblockd/0]
root 8 0.0 0.0 0 0 ? S 12:02 0:00 [pdflush]
root 7 0.0 0.0 0 0 ? S 12:02 0:00 [khubd]
root 9 0.0 0.0 0 0 ? S 12:02 0:00 [pdflush]
root 11 0.0 0.0 0 0 ? S< 12:02 0:00 [aio/0]
root 10 0.0 0.0 0 0 ? S 12:02 0:00 [kswapd0]
root 14 0.0 0.0 0 0 ? S 12:02 0:00 [kseriod]
root 17 0.0 0.0 0 0 ? S< 12:02 0:00 [reiserfs/0]
root 75 0.0 0.1 1864 964 ? Ss 12:02 0:00 /sbin/devfsd /dev
root 5433 0.0 0.1 1756 728 ? Ss 12:03 0:00 /usr/sbin/syslog-ng
root 6100 0.0 0.0 1492 472 ? Ss 12:03 0:00 /sbin/dhcpcd -h homer -N -Y -t 3 -h JTSHAW eth0
root 6131 0.0 0.4 5792 2636 ? Ss 12:03 0:00 /usr/sbin/cupsd
root 6382 0.0 0.0 1808 628 ? S 12:03 0:00 /usr/sbin/crond
ntp 6504 0.0 0.6 4072 4068 ? SLs 12:03 0:00 /usr/sbin/ntpd -p /var/run/ntpd.pid -u ntp:ntp
root 6554 0.0 0.3 7012 1996 ? Ss 12:03 0:00 /usr/sbin/smbd -D
root 6556 0.0 0.2 3724 1408 ? Ss 12:03 0:00 /usr/sbin/nmbd -D
root 6571 0.0 0.3 7012 1980 ? S 12:03 0:00 /usr/sbin/smbd -D
root 6606 0.0 0.2 3528 1504 ? Ss 12:03 0:00 /usr/sbin/sshd
root 6622 0.0 0.0 1472 516 tty1 Ss+ 12:03 0:00 /sbin/agetty 38400 tty1 linux
root 6623 0.0 0.0 1472 516 tty2 Ss+ 12:03 0:00 /sbin/agetty 38400 tty2 linux
root 6624 0.0 0.0 1472 516 tty3 Ss+ 12:03 0:00 /sbin/agetty 38400 tty3 linux
root 6625 0.0 0.0 1472 516 tty4 Ss+ 12:03 0:00 /sbin/agetty 38400 tty4 linux
root 6626 0.0 0.0 1472 516 tty5 Ss+ 12:03 0:00 /sbin/agetty 38400 tty5 linux
root 6627 0.0 0.0 1472 516 tty6 Ss+ 12:03 0:00 /sbin/agetty 38400 tty6 linux
root 6628 0.0 0.2 6172 1880 ? Ss 12:03 0:00 sshd: jtshaw [priv]
jtshaw 6634 0.0 0.3 6556 2096 ? S 12:03 0:00 sshd: jtshaw@pts/0
jtshaw 6635 0.0 0.2 2816 1516 pts/0 Ss 12:03 0:00 -bash
root 6650 0.0 0.2 2556 1540 pts/0 S 12:03 0:00 /bin/bash
root 6724 0.0 0.3 9556 2352 ? Ss 12:04 0:00 /usr/bin/gdm
root 6726 0.0 0.3 9556 2496 ? S 12:04 0:00 /usr/bin/gdm
root 6727 0.1 0.0 0 0 ? D 12:04 0:00 [X]
root 6938 0.0 0.1 2496 864 pts/0 R+ 12:16 0:00 ps aux

top w/ nvidia-kernel/glx-1.0.6629:

top - 12:17:12 up 14 min, 1 user, load average: 0.99, 0.93, 0.57
Tasks: 37 total, 1 running, 36 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3% us, 0.0% sy, 0.0% ni, 99.7% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 644972k total, 77552k used, 567420k free, 12900k buffers
Swap: 1249912k total, 0k used, 1249912k free, 39300k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 16 0 1480 496 436 S 0.0 0.1 0:00.90 init
2 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
3 root 5 -10 0 0 0 S 0.0 0.0 0:00.01 events/0
4 root 5 -10 0 0 0 S 0.0 0.0 0:00.00 khelper
5 root 14 -10 0 0 0 S 0.0 0.0 0:00.00 kacpid
6 root 5 -10 0 0 0 S 0.0 0.0 0:00.00 kblockd/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pdflush
7 root 15 0 0 0 0 S 0.0 0.0 0:00.00 khubd
9 root 15 0 0 0 0 S 0.0 0.0 0:00.02 pdflush
11 root 12 -10 0 0 0 S 0.0 0.0 0:00.00 aio/0
10 root 25 0 0 0 0 S 0.0 0.0 0:00.00 kswapd0
14 root 25 0 0 0 0 S 0.0 0.0 0:00.00 kseriod
17 root 5 -10 0 0 0 S 0.0 0.0 0:00.00 reiserfs/0
75 root 18 0 1864 964 676 S 0.0 0.1 0:00.07 devfsd
5433 root 15 0 1756 728 576 S 0.0 0.1 0:00.04 syslog-ng
6100 root 22 0 1492 472 416 S 0.0 0.1 0:00.00 dhcpcd
6131 root 16 0 5792 2636 1384 S 0.0 0.4 0:00.46 cupsd
6382 root 16 0 1808 628 532 S 0.0 0.1 0:00.00 crond
6504 ntp 16 0 4072 4068 3000 S 0.0 0.6 0:00.02 ntpd
6554 root 18 0 7012 1996 1544 S 0.0 0.3 0:00.00 smbd
6556 root 16 0 3724 1408 1112 S 0.0 0.2 0:00.00 nmbd
6571 root 18 0 7012 1980 1528 S 0.0 0.3 0:00.00 smbd
6606 root 16 0 3528 1504 1232 S 0.0 0.2 0:00.00 sshd
6622 root 17 0 1472 516 460 S 0.0 0.1 0:00.01 agetty
6623 root 16 0 1472 516 460 S 0.0 0.1 0:00.00 agetty
6624 root 16 0 1472 516 460 S 0.0 0.1 0:00.00 agetty
6625 root 16 0 1472 516 460 S 0.0 0.1 0:00.00 agetty
6626 root 16 0 1472 516 460 S 0.0 0.1 0:00.00 agetty
6627 root 16 0 1472 516 460 S 0.0 0.1 0:00.00 agetty
6628 root 16 0 6172 1880 1532 S 0.0 0.3 0:00.05 sshd
6634 jtshaw 16 0 6556 2096 1692 S 0.0 0.3 0:00.51 sshd
6635 jtshaw 16 0 2816 1516 1252 S 0.0 0.2 0:00.04 bash
6650 root 15 0 2556 1540 1284 S 0.0 0.2 0:00.11 bash
6724 root 16 0 9556 2352 1908 S 0.0 0.4 0:00.00 gdm
6726 root 16 0 9556 2496 2044 S 0.0 0.4 0:00.00 gdm
6727 root 21 0 0 0 0 D 0.0 0.0 0:00.79 X
6942 root 16 0 2116 1052 836 R 0.0 0.2 0:00.14 top

The system load still seems to be really high... And of course the NULL pointer dereference is a little concerning.
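One likely reason the load average sits near 1 while the CPU is almost entirely idle: Linux counts tasks in uninterruptible sleep (state D) toward the load average, not just runnable ones, and the X process above is in state D. The sketch below picks such tasks out of ps aux text. The function names are hypothetical, and it assumes the standard eleven-column ps aux layout shown in this report.

```python
def parse_ps_aux(text):
    """Turn `ps aux` text into a list of dicts, skipping the header line."""
    rows = []
    for line in text.strip().splitlines():
        # Ten single-token columns, then COMMAND, which may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue
        rows.append(
            {
                "user": fields[0],
                "pid": int(fields[1]),
                "stat": fields[7],
                "command": fields[10],
            }
        )
    return rows


def load_contributors(rows):
    """Tasks whose state is R (runnable) or D (uninterruptible sleep)."""
    return [row for row in rows if row["stat"][0] in "RD"]


if __name__ == "__main__":
    # Rows copied from the ps aux output in this report.
    sample = """
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 6726 0.0 0.3 9556 2496 ? S 12:04 0:00 /usr/bin/gdm
root 6727 0.1 0.0 0 0 ? D 12:04 0:00 [X]
root 6938 0.0 0.1 2496 864 pts/0 R+ 12:16 0:00 ps aux
"""
    for row in load_contributors(parse_ps_aux(sample)):
        print(row["pid"], row["stat"], row["command"])
```

The ps process itself shows up as runnable only for the instant it takes the snapshot, so the long-lived contributor in this sample is the X server.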
Have a look at Appendix H in the NVIDIA README (the bit about TNT cards)... Does that apply to you? Does the described fix solve the problem?
No.. that is specific to the older TNT chipsets (not the TNT2 chipsets) as there were no TNT2 cards with SGRAM to my knowledge.
Ok, I think I have figured out all that can be figured out. According to a guy on the NVIDIA forum, the last two patches at http://www.minion.de/files/1.0-6629/ are needed for the driver to work with the 2.6.10-rc kernels. I was also told that 1.0.6629 will not work with older GPUs (TNT, TNT2, maybe more) and that there is currently no workaround for that problem. A new version that fixes support for the older GPUs should be available soon. So in short: if you are using a 2.6.10-rc kernel and you have this problem, use the two patches. If you have an old GPU, don't use the 1.0.6629 driver. And if you have an old GPU and a 2.6.10-rc kernel, I think you might be SOL for now... I'm trying to apply the patches to an older driver version to see if I can get it up and running.
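The summary above boils down to a small decision table. As a sketch only, here it is written out in Python; the function name, the kernel-string check, and the wording of the returned advice are illustrative and reflect nothing beyond what this comment says.

```python
def advise(kernel_version: str, old_gpu: bool) -> str:
    """Map (kernel, GPU age) to the advice summarized in this comment."""
    # "Old GPU" here means TNT/TNT2-class hardware, as described above.
    rc_kernel = kernel_version.startswith("2.6.10-rc")
    if rc_kernel and old_gpu:
        return "no known working combination yet; wait for a new driver release"
    if rc_kernel:
        return "apply the last two patches from minion.de to the 1.0.6629 driver"
    if old_gpu:
        return "do not use the 1.0.6629 driver"
    return "nothing in this report applies"


if __name__ == "__main__":
    # The reporter's own setup: 2.6.10-rc2 with a TNT2.
    print(advise("2.6.10-rc2", old_gpu=True))
```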
Well, we have all the patches being applied, so 2.6.10 will work... A new NVIDIA release is rumored to be coming soon; for now we can't fix this.