After a while (I'm not sure how long) using nouveau, something happens which causes all of these symptoms to begin:
1. cursor movements become jerky
2. moving cursor from one screen to another leaves a copy of the cursor on the first screen
3. syslog shows many lines of "nouveau 0000:01:00.0: no space while hiding cursor"
4. switching to a text VT 'hangs' the display (showing the X screen still) and switching back 'unhangs' display (while on a text VT, though you can't see anything, you can do things 'blindly')
5. restarting X leaves the screen 'hung' with no ability to SysRq+REISUB (keyboard is fully locked up)
A while after the above happens, the entire system becomes unresponsive (though background processes like a RAID resync or torrents continue), and the keyboard doesn't work at all. The only thing that still works is moving the mouse cursor.
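Since it's unclear how long 'a while' is, one way to pin down when the symptoms begin is to look for the first "no space while hiding cursor" line in the log and count how many follow per minute. This is only a rough sketch: the message text is taken from symptom 3, but the timestamp layout (classic "Mon DD HH:MM:SS" syslog prefix) and the function names are assumptions, not anything nouveau provides.

```python
import re
from collections import Counter

# Message text copied from symptom 3 of this report.
CURSOR_MSG = "no space while hiding cursor"

# Assumption: lines start with a classic syslog stamp such as
# "Feb 12 11:51:03 host ...". Group 1 keeps everything up to the minute.
STAMP = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s")


def first_occurrence(lines):
    """Return the first log line containing the cursor message, or None."""
    for line in lines:
        if CURSOR_MSG in line:
            return line.rstrip("\n")
    return None


def cursor_errors_per_minute(lines):
    """Count cursor messages per minute; unparsable stamps go to 'unknown'."""
    counts = Counter()
    for line in lines:
        if CURSOR_MSG not in line:
            continue
        match = STAMP.match(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

It could be fed something like open("/var/log/messages"), if that is where the kernel messages end up on the system in question.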
Steps to Reproduce:
1. use a computer with the nouveau driver for 'a while'
Actual Results:
Please see 'description'.
Expected Results:
No loss of stability over time.
Linux delan2 2.6.38-rc2-delan2+ #5 SMP Fri Jan 28 16:21:14 WST 2011 x86_64 Intel(R) Core(TM)2 Quad CPU Q9650 @ 3.00GHz GenuineIntel GNU/Linux
Kernel is from torvalds/linux-2.6.git, though this did still occur with gentoo-sources-2.6.37.
x11-base/xorg-server-184.108.40.2061-r1 from gentoo
x11-drivers/xf86-video-nouveau-0.0.16_pre20101130 from gentoo
Have you tried disabling hardware cursor yet?
Option "HWCursor" "false"
Also should be worth trying this patch:
With the hardware cursor disabled, my cursor flickers heavily and is invisible most of the time. I'm using the default X cursors (the black one, no special cursors like DMZ installed).
The xorg.conf I tried:
Option "HWCursor" "false"
Also, if I understand correctly, that patch was for kernels before somewhere around 2.6.35, right? Just by eyeing my nv50_display.c I can see that the patch probably wouldn't apply anymore on my 2.6.38-rc3 tree.
I have tried to patch nv50_display.c but it failed:
File to patch: drivers/gpu/drm/nouveau/nv50_display.c
patching file drivers/gpu/drm/nouveau/nv50_display.c
Hunk #1 FAILED at 344.
1 out of 1 hunk FAILED -- saving rejects to file drivers/gpu/drm/nouveau/nv50_display.c.rej
If it helps, here's my new uname and some related lspci -vvvnn:
Linux delan2 2.6.38-rc3-delan2 #24 SMP Tue Feb 1 13:35:10 WST 2011 x86_64 Intel(R) Core(TM)2 Quad CPU Q9650 @ 3.00GHz GenuineIntel GNU/Linux
01:00.0 VGA compatible controller : nVidia Corporation GT200 [GeForce GTX 260] [10de:05e2] (rev a1) (prog-if 00 [VGA controller])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 4 bytes
Interrupt: pin A routed to IRQ 16
Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at e0000000 (64-bit, prefetchable) [size=256M]
Region 3: Memory at f8000000 (64-bit, non-prefetchable) [size=32M]
Region 5: I/O ports at bf00 [size=128]
[virtual] Expansion ROM at fb000000 [disabled] [size=512K]
Capabilities:  Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities:  MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities:  Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <1us, L1 <4us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Latency L0 <1us, L1 <1us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- Retrain- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
Kernel driver in use: nouveau
By the way, this problem still occurs with the latest nouveau as built in the latest linux-next git kernel. Apparently git.freedesktop has an even newer nouveau so I might try that. Unless the bug is actually in the xorg driver for nouveau?
Although the keyboard is fully non-operational (no VTs!) and the mouse can only move the cursor once the hang occurs, SSH and the network still work, and these four lines appeared in /var/log/messages the moment nouveau hung:
Feb 12 11:51:03 delan2 kernel: [drm] nouveau 0000:01:00.0: PGRAPH - TRAP_CCACHE FAULT
Feb 12 11:51:03 delan2 kernel: [drm] nouveau 0000:01:00.0: PGRAPH - TRAP_CCACHE 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Feb 12 11:51:03 delan2 kernel: [drm] nouveau 0000:01:00.0: PGRAPH - TRAP
Feb 12 11:51:03 delan2 kernel: [drm] nouveau 0000:01:00.0: PGRAPH - ch 2 (0x0000b00000) subc 5 class 0x8397 mthd 0x0f04 data 0x00000000
I have also seen it vary slightly in the mthd (which has been 0x1414) and the data (which has been 0x00000100). I searched for this error and found basically nothing, but it may be important in determining what causes nouveau to hang.
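To keep track of which mthd/data combinations show up across hangs, the "PGRAPH - ch ..." lines could be pulled apart with a small script. This is a sketch only: the regular expression is written against the one line quoted above and may not match other nouveau versions, the function names are made up for illustration, and the parenthesised value after the channel number is skipped because its meaning isn't stated here.

```python
import re
from collections import Counter

# Pattern modelled on the single trap line quoted in this report; other
# kernels may format it differently (assumption).
TRAP = re.compile(
    r"PGRAPH - ch (?P<ch>\d+) \(0x[0-9a-fA-F]+\) "
    r"subc (?P<subc>\d+) class (?P<cls>0x[0-9a-fA-F]+) "
    r"mthd (?P<mthd>0x[0-9a-fA-F]+) data (?P<data>0x[0-9a-fA-F]+)"
)


def parse_trap(line):
    """Return the trap fields as integers, or None if the line doesn't match."""
    match = TRAP.search(line)
    if match is None:
        return None
    return {
        "ch": int(match.group("ch")),
        "subc": int(match.group("subc")),
        "class": int(match.group("cls"), 16),
        "mthd": int(match.group("mthd"), 16),
        "data": int(match.group("data"), 16),
    }


def tally_traps(lines):
    """Count how often each (mthd, data) pair appears in the given lines."""
    counts = Counter()
    for line in lines:
        trap = parse_trap(line)
        if trap is not None:
            counts[(trap["mthd"], trap["data"])] += 1
    return counts
```

Running tally_traps over the log after several hangs would show whether 0x0f04 and 0x1414 are the only methods involved or whether others turn up as well.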
You can try xf86-video-nouveau-9999 from the x11 overlay. If the problem still exists with that one, it is probably best to report a bug at https://bugs.freedesktop.org/ too.
I am already using =xf86-video-nouveau-9999 from the x11 overlay. Tomorrow I will report the bug to freedesktop.
Sorry, I didn't make that clear in my original post. Since reporting the bug, I have installed the 9999 versions of all packages from the x11 overlay (including xorg-server, xorg-drivers, xf86-video-nouveau, etc.) and the problem still occurs.
This is probably obsolete now after so many years :/
Please test with an updated system