Bug 64072 - NVIDIA kernel causes oops & dump
Summary: NVIDIA kernel causes oops & dump
Status: RESOLVED CANTFIX
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Unspecified
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo X packagers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-09-14 18:02 UTC by Avuton Olrich
Modified: 2004-09-22 16:59 UTC
0 users

See Also:
Package list:
Runtime testing required: ---


Description Avuton Olrich 2004-09-14 18:02:51 UTC
I believe this started happening soon after the x.org/GCC upgrade, and I'm not really sure what I can do about it other than downgrade.


Reproducible: Always
Steps to Reproduce:
1. Set the driver in /etc/X11/xorg.conf to 'nvidia'
2. Restart xdm


Actual Results:  
It works with the plain jane 'nv' driver, but if I switch to 'nvidia' the
displays stop working completely, and I end up having to SSH into the machine
from another working computer to restart it.

Expected Results:  
Entered X

rocket sbh # emerge info
Portage 2.0.50-r11 (default-x86-2004.0, gcc-3.4.2, glibc-2.3.4.20040808-r0,
2.6.8.1-ck5)
=================================================================
System uname: 2.6.8.1-ck5 i686 AMD Athlon(tm) XP 2800+
Gentoo Base System version 1.5.3
distcc 2.17 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.3 [enabled]
Autoconf: sys-devel/autoconf-2.59-r4
Automake: sys-devel/automake-1.8.5-r1
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-march=athlon-xp -O3 -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER=""
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config
/usr/kde/3.2/share/config
/usr/kde/3.3/share/config:/usr/kde/3.3/env:/usr/kde/3.3/shutdown
/usr/kde/3/share/config /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=athlon-xp -O3 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache digest sandbox"
GENTOO_MIRRORS="http://gentoo.osuosl.org
http://distro.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j5"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X aalib acl acpi aim alsa apm arts audiofile avi berkdb bitmap-fonts crypt
cups dba dga doc dvd emacs emacs-e3 encode esd flac foomaticdb gd gdbm gif gnome
gpm gtk gtk2 icq imagemagick imlib java jikes joystick jpeg kde libg++ libwww
mad mikmod motif mozilla mpeg ncurses nls offensive oggvorbis opengl oscar oss
pam pcre pdflib perl php png python qt quicktime readline sdl slang speex spell
ssl svga tcltk tcpd theora tiff truetype unicode usb videos x86 xinerama xml
xml2 xmms xosd xprint xv xvid zlib"

The dump from dmesg:
NVRM: loading NVIDIA Linux x86 NVIDIA Kernel Module  1.0-6111  Tue Jul 27
07:55:38 PDT 2004
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
f10cdeb1
*pde = 00000000
Oops: 0000 [#1]
PREEMPT 
Modules linked in: nvidia
CPU:    0
EIP:    0060:[<f10cdeb1>]    Tainted: P  
EFLAGS: 00013296   (2.6.8.1-ck5) 
EIP is at os_string_compare+0x13/0x2d [nvidia]
eax: 00000046   ebx: df996e20   ecx: 00000000   edx: df996e00
esi: df996e01   edi: 00000000   ebp: ddf6be30   esp: ddf6be0c
ds: 007b   es: 007b   ss: 0068
Process X (pid: 6191, threadinfo=ddf6a000 task=ed9fc7d0)
Stack: df996e20 eee41400 f0f0d25a df996e00 00000000 b056af08 f0800000 b014d34b 
       b19f77fc ddf6be60 f0f0f1db eee41400 df996e00 00000000 b056af08 00000000 
       eee41400 00000000 b056af08 f09ff000 00001000 ddf6bea0 f0f12b74 eee41400 
Call Trace:
 [<f0f0d25a>] _nv001530rm+0x16/0x1c [nvidia]
 [<b014d34b>] map_area_pmd+0x6b/0xa0
 [<f0f0f1db>] _nv001618rm+0x33/0x7c [nvidia]
 [<f0f12b74>] _nv001244rm+0x208/0x2e8 [nvidia]
 [<f0f13c35>] rm_access_registry+0x75/0x9c [nvidia]
 [<f0f14134>] _nv001139rm+0x3a4/0x4b8 [nvidia]
 [<f0f13147>] rm_ioctl+0x23/0x38 [nvidia]
 [<f10cb6f0>] nv_kern_ioctl+0x387/0x3d3 [nvidia]
 [<b0166b6d>] sys_ioctl+0x14d/0x290
 [<b010421b>] syscall_call+0x7/0xb
Code: ae 75 08 84 c0 75 f8 31 c0 eb 04 19 c0 0c 01 8b 34 24 8b 7c
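
Editor's note: the line to look at first in a dump like the one above is "EIP is at ...", which names the function, the offset into it, and (in brackets) the module where the fault occurred. A minimal Python sketch for pulling those fields out of saved dmesg output follows. The name faulting_location is hypothetical, and the pattern assumes the 2.6-era x86 line format shown in this report, not every kernel's oops layout.

```python
import re

# Matches lines such as "EIP is at os_string_compare+0x13/0x2d [nvidia]".
# The "[module]" suffix is absent when the fault is in the core kernel.
EIP_RE = re.compile(
    r"EIP is at (?P<symbol>\S+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?"
)


def faulting_location(oops_text):
    """Return (symbol, offset, size, module) for the first EIP line, or None.

    offset and size are integers parsed from the hexadecimal values;
    module is None when no "[module]" suffix is present.
    """
    m = EIP_RE.search(oops_text)
    if m is None:
        return None
    return (
        m.group("symbol"),
        int(m.group("offset"), 16),
        int(m.group("size"), 16),
        m.group("module"),
    )


if __name__ == "__main__":
    line = "EIP is at os_string_compare+0x13/0x2d [nvidia]"
    print(faulting_location(line))
    # prints ('os_string_compare', 19, 45, 'nvidia')
```

For this report the result points at os_string_compare inside the nvidia module, which matches the "Modules linked in: nvidia" and "Tainted: P" lines.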
Comment 1 Andrew Bevitt 2004-09-15 05:34:21 UTC
Which version of xorg-x11 are you running?

I'd blame gcc 3.4.2, but unless you can downgrade and show that it works again, I'm not totally convinced about that.
Comment 2 Avuton Olrich 2004-09-15 08:53:42 UTC
Well, I run xorg-x11 6.8.0 at the moment, and I had just recompiled it (with gcc 3.4.2) right before this mess started. I'll only take action (downgrade or otherwise) on a Gentoo dev's recommendation.
Comment 3 Avuton Olrich 2004-09-16 17:20:44 UTC
OK, after much downgrading, upgrading and the like, I found the problem. For some reason, upgrading nvidia-glx didn't uninstall the older version, so I somehow had two versions installed and protected by emerge. This is all fixed now, though.
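
Editor's note: a leftover older version like the one described above can be looked for in Portage's installed-package database, which records each installed package as a directory named category/name-version under /var/db/pkg. The following is a minimal Python sketch under that assumption. The function name duplicate_installs and the name/version splitting rule are illustrative, not Portage's own logic, and packages that are deliberately slotted (several KDE versions side by side, for example) will also show up, so a hit is a hint rather than proof of a broken merge.

```python
import os
import re
from collections import defaultdict

# Splits e.g. "nvidia-glx-1.0.6111-r1" into name "nvidia-glx" and
# version "1.0.6111-r1". This is a simplified rule, not Portage's parser.
PKG_RE = re.compile(r"^(?P<name>.+?)-(?P<version>\d[^-]*(?:-r\d+)?)$")


def duplicate_installs(vdb_root="/var/db/pkg"):
    """Map "category/name" to its versions when more than one is recorded."""
    if not os.path.isdir(vdb_root):
        return {}
    found = defaultdict(list)
    for category in sorted(os.listdir(vdb_root)):
        cat_dir = os.path.join(vdb_root, category)
        if not os.path.isdir(cat_dir):
            continue
        for entry in sorted(os.listdir(cat_dir)):
            m = PKG_RE.match(entry)
            if m and os.path.isdir(os.path.join(cat_dir, entry)):
                key = category + "/" + m.group("name")
                found[key].append(m.group("version"))
    # Keep only packages recorded with more than one version.
    return {pkg: vers for pkg, vers in found.items() if len(vers) > 1}


if __name__ == "__main__":
    for pkg, versions in sorted(duplicate_installs().items()):
        print(pkg, "->", ", ".join(versions))
```

The script only reads the database; it does not remove anything. Any unmerging is left to emerge.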
Comment 4 Andrej Kacian (RETIRED) gentoo-dev 2004-09-22 15:32:15 UTC
This recently happened to me with kdesdk - I uninstalled it some months ago, and when I installed a recent version again, kbabel (the only package I ever used from kdesdk) tended to bail out with strange "undefined symbol" messages on random user actions. I unmerged kdesdk and tried to compile & install kbabel by hand. It did the same. Only after I removed every kdesdk lib did kbabel work again, even the one from the kdesdk package.

I suspect some hideous, hard-to-reproduce bug deep in portage...
Comment 5 Andrew Bevitt 2004-09-22 16:59:54 UTC
It's not uncommon for things like files to go googoogahgah.

closing.