Bug 65625 - OpenGL applications cause Xorg 6.8 to hardlock when using Rage128 DRM
Summary: OpenGL applications cause Xorg 6.8 to hardlock when using Rage128 DRM
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: PPC Linux
Importance: High critical
Assignee: Gentoo X packagers
URL: http://forums.gentoo.org/viewtopic.ph...
Whiteboard:
Keywords:
Depends on:
Blocks: 67326
Reported: 2004-09-27 20:33 UTC by Joe Jezak (RETIRED)
Modified: 2005-11-20 12:13 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Strace of 6.8 OpenGL crash (glxstrace,17.93 KB, text/plain)
2004-10-02 04:46 UTC, Joe Jezak (RETIRED)
dri-stipples.png (dri-stipples.png,381.84 KB, image/png)
2004-12-19 19:52 UTC, Lars Weiler (RETIRED)

Description Joe Jezak (RETIRED) gentoo-dev 2004-09-27 20:33:12 UTC
Powerbook G4 500 with Rage 128 and DRM.  

This configuration worked great in Xorg 6.7, but causes a hard lock when an OpenGL application is started in Xorg 6.8.  The machine appears to be completely frozen, but the mouse can still be moved.  I am not using any of the new extensions.  There is nothing logged in the Xorg log or kernel messages.  This problem occurs with both 2.6.7 and 2.6.8.

Reproducible: Always
Steps to Reproduce:
1. Run glxgears on affected system

Actual Results:  
Instant hardlock
Comment 1 Joe McCann (RETIRED) gentoo-dev 2004-09-28 01:04:42 UTC
Please post emerge info (always) and check whether there is anything in ~/.xsession-errors. CC'ing ppc, as it seems to be a PPC issue.
Comment 2 Joe Jezak (RETIRED) gentoo-dev 2004-09-28 03:49:43 UTC
Running Xorg without render accel (Option "NoAccel" "True" in xorg.conf) stops it from hardlocking, but really slows things down.

I think this bug should remain critical, possibly even blocking making Xorg 6.8 stable on PPC because many PPC machines have Rage 128 video chipsets and this is a hard lock that can lead to data loss, requiring the machine to be reset.

There are no .xsession-error files.

Emerge info:

Portage 2.0.51_rc1 (default-ppc-2004.2, gcc-3.4.1, glibc-2.3.4.20040808-r0, 2.6.7-gentoo-r14 ppc)
=================================================================
System uname: 2.6.7-gentoo-r14 ppc 7410, altivec supported
Gentoo Base System version 1.5.3
ccache version 2.3 [enabled]
Autoconf: sys-devel/autoconf-2.59-r4
Automake: sys-devel/automake-1.8.5-r1
Binutils: sys-devel/binutils-2.15.90.0.3-r3
Headers:  sys-kernel/linux26-headers-2.6.6-r1
Libtools: sys-devel/libtool-1.5.2-r5
ACCEPT_KEYWORDS="ppc ~ppc"
AUTOCLEAN="yes"
CFLAGS="-O2 -pipe -mtune=7400 -maltivec -mabi=altivec -fno-strict-aliasing"
CHOST="powerpc-unknown-linux-gnu"
COMPILER=""
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -pipe -mtune=7400 -maltivec -mabi=altivec -fno-strict-aliasing"
DISTDIR="/usr/portage/distfiles"
FEATURES="ccache"
GENTOO_MIRRORS="mirror.aarnet.edu.au/pub/gentoo/"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.au.gentoo.org/gentoo-portage"
USE="X aac aim alsa altivec apache2 apm audiofile berkdb bonobo crypt divx4linux dv dvd dvdread encode ethereal faac faad fbcon flac ftp gdbm gif gnome-libs gpm gtk gtk2 gtkhtml guile imagemagick imap imlib jack javascript joystick jpeg lcd ldap mad mikmod motif moznocompose moznoirc moznomail mozsvg mpeg mpeg4 mysql ncurses nocd nptl oggvorbis opengl opie oscar oss pam pcmcia pda pdflib perl php png pnp ppc python qemu-fast quicktime rage128 readline sdl session sheep slang spell ssl svg tcpd tiff timidity tools truetype usb xml xml2 xmms xv xvid zlib"
Comment 3 Adam Jackson 2004-09-28 06:13:02 UTC
NoAccel disables _all_ acceleration, both XAA and DRI.  Try disabling just DRI by commenting out the Load "dri" line in xorg.conf.
Comment 4 Joe Jezak (RETIRED) gentoo-dev 2004-09-29 18:44:16 UTC
Commenting out DRI does stop the lockups in OpenGL.
Comment 5 Joe Jezak (RETIRED) gentoo-dev 2004-10-01 05:26:05 UTC
I'd just like to point out that the above isn't really a fix, because DRI should be supported on this setup.  If anyone on the X11 team can suggest any debugging methods, I'd be happy to try to figure this out; I just don't know where to start.
Comment 6 Joe Jezak (RETIRED) gentoo-dev 2004-10-02 04:46:54 UTC
Created attachment 40916 [details]
Strace of 6.8 OpenGL crash
Comment 7 Joe Jezak (RETIRED) gentoo-dev 2004-10-02 05:01:13 UTC
I've created a bug on Xorg's bugzilla for this problem:
http://freedesktop.org/bugzilla/show_bug.cgi?id=1513
Comment 8 Donnie Berkholz (RETIRED) gentoo-dev 2004-10-11 14:55:28 UTC
Looks like this might hold up stabling 6.8.0-r1 on ppc, so it might be nice for anyone who thinks they might have a clue to think about it.
Comment 9 Donnie Berkholz (RETIRED) gentoo-dev 2004-10-11 15:13:54 UTC
Some info for anyone trying to debug this:
http://freedesktop.org/XOrg/DebuggingTheXserver

Probably a build with USE=debug will do everything you need and then some. Remember to quickpkg your current xorg installation.
Comment 10 Donnie Berkholz (RETIRED) gentoo-dev 2004-10-11 19:34:09 UTC
One of the DRI developers hit the problem too, so we may see it get fixed before long.


From: 	Ian Romanick <idr@us.ibm.com>
To: 	DRI developer's list <dri-devel@lists.sourceforge.net>
Subject: 	Serious issues with Rage128 on PowerPC
Date: 	Mon, 11 Oct 2004 18:37:05 -0700
Mailer: 	Mozilla Thunderbird 0.8 (Windows/20040913)

I was trying to test the latest version of my ReadPixels work to make 
sure I didn't break anything on big-endian.  However, it seems someone 
beat me to it in the Rage128 driver. :)  In a nutshell, I can get one 
frame of gears, and then the 3D engine is toast.  After that frame is 
drawn, gears is at 100% and X is unresponsive.  When I kill gears, 
everything goes back to semi-normal.  If I run another 3D program, I get 
an empty (just a frame!) window.

Looking at the output from R128_DEBUG=all, it appears to be stuck in 
r128EmitHwStateLocked.

That single frame of gears is also wrong.  The colors are pinks and 
purples.  I suspect this may just be a byte-ordering problem.  I notice 
that the driver wants to use BGRA for primary color, but I suspect the 
hardware really wants ARGB.  Ditto for secondary color / fog.
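
The byte-ordering suspicion in the mail above can be illustrated with a minimal sketch. It is not taken from the r128 driver; the helper names `pack_color` and `read_as_argb_bytes` are hypothetical, and the sample colors are arbitrary. It only shows that one 32-bit ARGB word is laid out as B, G, R, A in memory on a little-endian host and as A, R, G, B on a big-endian host, so a byte stream produced under one assumption is misread under the other. It does not claim to reproduce the exact pinks and purples reported.

```python
import struct


def pack_color(a, r, g, b, byteorder):
    """Pack 8-bit channels into one 32-bit ARGB word; return its bytes in memory order."""
    word = (a << 24) | (r << 16) | (g << 8) | b
    fmt = "<I" if byteorder == "little" else ">I"
    return struct.pack(fmt, word)


def read_as_argb_bytes(data):
    """Interpret four bytes as A, R, G, B in that order (hypothetical consumer)."""
    a, r, g, b = data
    return (a, r, g, b)


# The same word (opaque orange: A=0xFF, R=0xFF, G=0x80, B=0x00) on both hosts.
print(pack_color(0xFF, 0xFF, 0x80, 0x00, "little").hex())  # 0080ffff -> B, G, R, A
print(pack_color(0xFF, 0xFF, 0x80, 0x00, "big").hex())     # ffff8000 -> A, R, G, B

# A byte stream written as B, G, R, A but consumed as A, R, G, B:
# every channel ends up in the wrong slot.
bgra_stream = bytes([0x00, 0x1A, 0xCC, 0xFF])  # B=0x00, G=0x1A, R=0xCC, A=0xFF
print(read_as_argb_bytes(bgra_stream))
```

If the hardware consumes the color as an ARGB word while the driver emits bytes in BGRA order, the two only agree on little-endian machines, which would be consistent with the problem showing up on PowerPC.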
Comment 11 David Holm (RETIRED) gentoo-dev 2004-10-11 23:28:01 UTC
We decided not to stable xorg-x11 >= 6.8.0 until the DRI-issues are fixed. Currently DRI is disabled if you run your card in PCI-mode. With 6.7.0 and XFree you would still get DRI even in PCI-mode. (There is no agpgart for the PegasosPPC, so if we stabilise 6.8.0 people will lose DRI-support)
Comment 12 Donnie Berkholz (RETIRED) gentoo-dev 2004-10-12 07:54:20 UTC
David, is there a bug anywhere for that? This one is certainly not it. I hope you're not just waiting for someone upstream to trip over the problem randomly and fix it, because you may be waiting a long time.

If there isn't one, the best place to file it would be bugs.freedesktop.org.
Comment 13 Duke 2004-10-14 12:54:58 UTC
This bug isn't specific to PowerPC hardware.

Check out Bug #61574; one of these should probably be marked a duplicate of the other.

I'm getting the same result on PC hardware with a Radeon 9600SE, fglrx, and X.org 6.8.0.  I added this information to the X.org Bugzilla bug submitted by Joe Jezak in comment #7.
Comment 15 Donnie Berkholz (RETIRED) gentoo-dev 2004-10-18 15:35:06 UTC
http://freedesktop.org/cgi-bin/viewcvs.cgi/mesa/Mesa/src/mesa/drivers/dri/r128/

Especially check out changes to r128_ioctl.c and r128_tris.c. The r128_ioctl.c one isn't quite right, though, as of now. See pasted mail below.

From: 	Michel Dänzer <michel@daenzer.net>
To: 	mesa3d-dev@lists.sourceforge.net
Subject: 	[Mesa3d-dev] Re: [Mesa3d-cvs] CVS Update: Mesa (branch: trunk)
Date: 	Sun, 17 Oct 2004 23:22:49 -0400	
On Sun, 2004-10-17 at 14:29 -0700, Ian Romanick wrote:
> 
> Log message:
>   Fix hangs on big-endian (e.g., PowerPC) hardware.
> 
> Modified files:
>       Mesa/src/mesa/drivers/dri/r128/:
>         r128_ioctl.c 
>   
>   Revision      Changes    Path
>   1.12          +1 -2      Mesa/src/mesa/drivers/dri/r128/r128_ioctl.c

Good to see your PPC fixes Ian, but this one isn't quite correct. The
equivalent of an X (not src/mesa/drivers/dri/r128/server/r128_macros.h!) INREG() is really needed.
Comment 16 Donnie Berkholz (RETIRED) gentoo-dev 2004-10-18 23:27:56 UTC
According to anholt, the above concerns are cosmetic and shouldn't affect functionality. Would anyone care to try patching in the commits to those two files?
Comment 17 Joe Jezak (RETIRED) gentoo-dev 2004-10-20 07:59:18 UTC
Patching r128_ioctl.c does not prevent the lock: it shows what appears to be the first frame, then locks, but the cursor is still active, same as before.  If you have any other ideas, I'd be happy to test them; I have access to a second machine and have compiled Xorg with debugging.
Comment 18 Donnie Berkholz (RETIRED) gentoo-dev 2004-10-20 08:33:59 UTC
Joe, maybe you should reopen and be commenting on the freedesktop bug.
Comment 19 Joe Jezak (RETIRED) gentoo-dev 2004-10-24 19:31:32 UTC
Rebuilding xorg with the cvs version of Mesa does fix the problem, but isn't really a solution for closing this bug.  If you have any other suggestions, I'd be happy to test them out.  

To get it working, I grabbed the cvs tarball, removed the ./xc/extras/Mesa directory and replaced it with the cvs version.  I also had to edit the makefiles in ./xc/lib/GL/mesa/ and remove all references to r128_vb.c and r128_vb.h as they have been removed in the cvs version of mesa.
Comment 20 Adam Jackson 2004-10-24 19:58:02 UTC
You should have been able to simply 'make linux-dri' from the Mesa CVS checkout and then just copy lib/r128_dri.so to /usr/X11R6/lib/modules/dri.

If I get my way, we'll soon be able to build all the GL and DRI client side bits directly from Mesa, so this sort of problem could be fixed by simply emerging a newer version of Mesa with USE=dri (or whatever).
Comment 21 Donnie Berkholz (RETIRED) gentoo-dev 2004-10-26 14:35:03 UTC
Try http://freedesktop.org/cgi-bin/viewcvs.cgi/mesa/Mesa/src/mesa/drivers/dri/r128/r128_ioctl.c?only_with_tag=mesa_6_2_branch&r2=1.11&r1=1.10 -- it's a backport that's supposed to accommodate an xorg 6.8.2 with a mesa 6.2 bugfix release.

If it's not fixed there, _please_ reopen and comment on the freedesktop bug.
Comment 22 Joe Jezak (RETIRED) gentoo-dev 2004-10-26 23:01:09 UTC
The changes made to the mesa_6_2_branch by Ian Romanick, completed on 04/10/26 at 19:05:20, fix the problem.
Comment 23 Donnie Berkholz (RETIRED) gentoo-dev 2004-10-27 11:13:53 UTC
If you could attach a patch against our 6.8.0, that would be greatly appreciated.
Comment 24 Joe Jezak (RETIRED) gentoo-dev 2004-10-27 11:44:02 UTC
I couldn't get the version of Mesa that comes with Xorg working.  The commits I referred to in #21 are the same patches as previously mentioned: the fixes for r128_ioctl and r128_tris.  The r128_tris patch does not apply to the version of Mesa that comes with Xorg, because the driver was converted to use the common t_vertex implementation after the Xorg release.  Applying just the r128_ioctl patch does not fix the lockups; using the mesa_6_2_branch does (after the commits noted in #21 were made).  Sorry for the confusion.
Comment 25 Donnie Berkholz (RETIRED) gentoo-dev 2004-10-27 14:44:45 UTC
Try a diff of the r128 directory from the mesa in xorg to the r128 dir of the mesa_6_2_branch. That should catch the t_vertex conversion.
Comment 26 Brett Boren 2004-11-09 18:20:40 UTC
I get a similar problem on pc hardware (intel845 3.0 HT ATI8500). Without disabling dri I get a segfault every time I try to run GL apps.
Comment 27 Donnie Berkholz (RETIRED) gentoo-dev 2004-12-18 18:22:52 UTC
Please test 6.8.1.901 -- I just added it.
Comment 28 Donnie Berkholz (RETIRED) gentoo-dev 2004-12-19 00:36:32 UTC
Don't really have a better place to say this, but I just committed a new patchset for 6.8.1.901 that should significantly help out Radeon (and slightly, Rage128) PPC users. It also adds an R128 patch to allow X without fbdev. https://bugs.freedesktop.org/show_bug.cgi?id=2089 has some more info.
Comment 29 Joe Jezak (RETIRED) gentoo-dev 2004-12-19 01:34:09 UTC
Thanks Donnie, I'll try this out tonight and let you know how it works.
Comment 30 Lars Weiler (RETIRED) gentoo-dev 2004-12-19 07:04:09 UTC
I just wanted to test, but I can't download the file
http://dev.gentoo.org/~spyderous/xorg-x11/patchsets/6.8.1.901/xorg-x11-6.8.1.901-files-0.6.tar.bz2
Comment 31 Donnie Berkholz (RETIRED) gentoo-dev 2004-12-19 13:19:57 UTC
As mentioned in the meeting, this has now been fixed. I neglected to sync my local stuff with something accessible.
Comment 32 Lars Weiler (RETIRED) gentoo-dev 2004-12-19 19:52:47 UTC
Created attachment 46400 [details]
dri-stipples.png

So, DRI is working again with the r128 in my iBook (still 16bpp, as 32bpp would
not fit into 8MB VRAM).  No hardlocks with glxgears or anything similar.

But after I leave such an application, I get some unusual "stipples" on my
screen.  If I move something over them, they disappear.  See the screenshot,
which I took after I left armagetron.
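
The 16bpp versus 32bpp remark can be checked with some rough arithmetic. This is a sketch under stated assumptions, not a description of how the r128 driver actually allocates memory: it assumes a 1024x768 screen (the size xloadimage reports for the attached screenshot), and that DRI needs a front, a back, and a depth buffer at the same bit depth. Textures and other allocations are ignored. The function names are hypothetical.

```python
def buffer_bytes(width, height, bits_per_pixel):
    """Size of one full-screen buffer in bytes."""
    return width * height * bits_per_pixel // 8


def dri_footprint_mib(width, height, bits_per_pixel, buffers=3):
    """Memory for `buffers` full-screen buffers, in MiB.

    buffers=3 is an assumption: front + back + depth at the same depth.
    """
    return buffers * buffer_bytes(width, height, bits_per_pixel) / (1024 * 1024)


VRAM_MIB = 8  # from the comment above

for bpp in (16, 32):
    needed = dri_footprint_mib(1024, 768, bpp)
    verdict = "fits" if needed <= VRAM_MIB else "does not fit"
    print(f"{bpp}bpp: {needed} MiB -> {verdict} in {VRAM_MIB} MiB")
```

Under these assumptions the three buffers take 4.5 MiB at 16bpp and 9.0 MiB at 32bpp, which is consistent with 32bpp not fitting into 8MB.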
Comment 33 Lars Weiler (RETIRED) gentoo-dev 2004-12-19 19:56:20 UTC
Another thing I noticed: xloadimage does not work any more:

lars@celeborn ~ $ xloadimage dri-stipples.png 
dri-stipples.png is 1024x768 PNG image, color type RGB, 16 bit
  Using TrueColor visual
xloadimage: X Error: BadAlloc (insufficient resources for operation) on 0x49
xloadimage: X Error: BadColor (invalid Colormap parameter) on 0x1a00001
imageToXImage: XAllocColor failed on a TrueColor/Directcolor visual
Cannot convert Image to XImage
lars@celeborn ~ $ 

I already recompiled it, with no luck.
Comment 34 Joe Jezak (RETIRED) gentoo-dev 2004-12-20 02:23:51 UTC
With hardware cursors on, OpenGL still sometimes locks up, and there is some cursor corruption.  Using software cursors fixes the problem.  Aside from that, it seems to work well, including DRI without FBDev.

In addition, I'm seeing sandbox violations as follows:

 * Found sources for kernel version:
 *     2.6.9-gentoo-r1                                                    [ ok ]
>>> Source unpacked.
--------------------------- ACCESS VIOLATION SUMMARY ---------------------------
LOG FILE = "/tmp/sandbox-x11-base_-_xorg-x11-6.8.1.901-9714.log"

unlink:    /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
open_wr:   /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
unlink:    /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
open_wr:   /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
unlink:    /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
open_wr:   /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
unlink:    /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
open_wr:   /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
unlink:    /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
open_wr:   /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
unlink:    /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
open_wr:   /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
unlink:    /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
open_wr:   /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
unlink:    /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
open_wr:   /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
unlink:    /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
open_wr:   /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
unlink:    /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
open_wr:   /usr/src/linux-2.6.9-gentoo-r1/.tmp_gas_check
--------------------------------------------------------------------------------

It might be related to the fix that was added to the kernel for the ppc /dev/null bug, but I'm not sure; I do know that's what that file is for.
Comment 35 Joe Jezak (RETIRED) gentoo-dev 2004-12-20 02:44:39 UTC
I can't replicate the problem Lars is having with xloadimage, but using xloadimage with .png files that have alpha layers results in incorrect colors being used.  Other graphics seem to work fine.
Comment 36 Joe Jezak (RETIRED) gentoo-dev 2004-12-20 05:12:51 UTC
Argh, I'm also getting the "snow" on my screen, but only after more complex OpenGL, possibly related to texturing, as I can't seem to replicate it in games without textures.
Comment 37 Donnie Berkholz (RETIRED) gentoo-dev 2004-12-20 09:40:12 UTC
Let's follow the sandbox problem on bug #75034. It's clearly not caused by xorg because 6.7 wasn't having problems before AFAIK.
Comment 38 Lars Weiler (RETIRED) gentoo-dev 2005-01-09 01:04:58 UTC
I think we can close this bug.  It runs really stable on my system and I could not replicate the errors any more.

The upcoming X.org 6.8.2 release just has to be marked stable once a stable version is out.
Comment 39 Joshua Baergen (RETIRED) gentoo-dev 2005-11-20 12:13:06 UTC
This should be fixed in all current X versions.
Comment 40 Joshua Baergen (RETIRED) gentoo-dev 2005-11-20 12:13:22 UTC
Marking fixed.