Bug 275861 - x11-base/xorg-server-1.6.1.901-r5: hang with intel KMS
Summary: x11-base/xorg-server-1.6.1.901-r5: hang with intel KMS
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High critical
Assignee: Gentoo X packagers
URL: http://bugs.freedesktop.org/show_bug....
Whiteboard:
Keywords:
Duplicates: 275852 275860
Depends on:
Blocks:
 
Reported: 2009-06-29 19:16 UTC by Robert Bradbury
Modified: 2009-07-07 09:59 UTC (History)
CC List: 9 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info information (EmergeInfo.lst, 3.85 KB, text/plain)
2009-06-29 19:22 UTC, Robert Bradbury
Subset of grub.conf for how kernel is booted (i915.modeset=1) (GrubInfo.lst, 381 bytes, text/plain)
2009-06-29 19:23 UTC, Robert Bradbury
loaded module information (ModuleInfo.lst, 2.58 KB, text/plain)
2009-06-29 19:25 UTC, Robert Bradbury
Linux kernel .config information (ConfigInfo.lst, 72.25 KB, text/plain)
2009-06-29 19:27 UTC, Robert Bradbury
Linux kernel boot dmesg file (DmesgInfo.lst, 39.63 KB, text/plain)
2009-06-29 19:27 UTC, Robert Bradbury
xorg.conf information (should work with KMS/UXA/DRI2) (Xorg-conf.Info.lst, 13.17 KB, text/plain)
2009-06-29 19:28 UTC, Robert Bradbury
Glxinfo with driver information and glxgear results (GlxInfo.lst, 14.58 KB, text/plain)
2009-06-29 19:30 UTC, Robert Bradbury
Xorg.2.log stack trace (Xorg.log.crash1, 2.66 KB, text/plain)
2009-06-30 16:24 UTC, Robert Bradbury

Description Robert Bradbury 2009-06-29 19:16:58 UTC
There are severe problems when one tries to run the latest and greatest kernel + intel kernel & X drivers + xorg-server.

Problems can be summarized as follows:
1) Videos do not play on the console (which they will if you use intelfb and tell mplayer to use -vo fbdev).

2) Playing certain videos (in my case .flac files) *hangs* the system.  No video on any X terminals or the consoles (Ctrl-Alt-F#).  System must be turned off and rebooted.  This is an X server/driver problem because mplayer "attempts" to start a video stream.  It draws a small video window in the middle of the screen and then colorizes about 80% of it (the bottom right hand corner is black) then hangs the system.  While the video file is labeled ".flac", mplayer appears to recognize it as a "RAWDV" format and uses the DVSD / ffmpeg driver to decode it.

3) The xorg-server-1.6.1.901-r5 patches appear to not work.  Starting the X server appears to hang the system.  Falling back to "-r4" was necessary to get a running X server.

4) The glxgears numbers are going down!  As documented in the attached files, I'm down from 650+ FPS to ~500 FPS.  Now, if this is "expected" (due to KMS/UXA/DRI2) then there should be documentation, e.g. [R1], which *clearly* documents how to optimize for 2D vs. 3D.  Some people care primarily about "2D" (video/TV/WebCam) streaming and low CPU loads and couldn't care less about 3D graphics (DRI2?) / gaming (assuming that I have some limited understanding of why KMS/UXA/DRI2 are "supposed" to be important).

5) I've currently only been able to get the high resolution on the console with the drivers compiled into the kernel (see .config file).  If this gives poor performance for video (relative to intelfb or vesafb) there should be a way to load & unload the drivers and get them in "optimal mode" for their capabilities (or else the current intel driver should *really* support a "FB" mode).

R1. http://en.gentoo-wiki.com/wiki/Intel_GMA

Reproducible: Always

Steps to Reproduce:
1. Rebuild the 2.6.30 kernel with the latest intel / 915 drivers built-in.
2. Rebuild latest xorg-server, intel_drv & mesa.
3. Boot the whole mess (which is a non-trivial process when fallback is required...)
4. Run the X-server
5. Test.  Windows, glxgears, video files, TV stream (Hauppauge), WebCam, etc.

Actual Results:  
As stated above.  System can (a) hang; (b) performs poorly; (c) doesn't appear to allow easy kernel driver unloading/reloading while still retaining high resolution consoles (tty1, tty2, etc.).

Expected Results:  
In an ideal world, we would get "OLD" glxgears performance (I think I've seen 800+ FPS on this Pentium IV 2.8 GHz with the i915 [R2]) *and* low CPU loads while playing video or streaming TV *and* high resolution 1680x1050+ console terminals.

R2. This was back on the 2.6.24 kernels with either the intelfb or vesafb drivers.

All associated configuration files will be attached.  As an aside, this software should never have seen the light of day if the developers were doing their job (for driver changes of this magnitude testing Windows+Video+Streaming TV+WebCams should be a standard requirement).

This isn't an uncommon hardware configuration.  It's a standard HP Pavilion a630 PC which happens to be a few years old.  And for most "home" work (internet access, email, document preparation and heavy duty bioinformatics work) it's a *fine* machine.  It's too bad that the developers seem to be more inclined to test things on the latest & greatest chips rather than the existing install base.
Comment 1 Robert Bradbury 2009-06-29 19:22:35 UTC
Created attachment 196101 [details]
emerge --info information
Comment 2 Robert Bradbury 2009-06-29 19:23:34 UTC
Created attachment 196102 [details]
Subset of grub.conf for how kernel is booted (i915.modeset=1)
Comment 3 Robert Bradbury 2009-06-29 19:25:44 UTC
Created attachment 196103 [details]
loaded module information

Output of lsmod, though the critical drivers are compiled into the kernel, so you have to look at the .config parameters.
Comment 4 Robert Bradbury 2009-06-29 19:27:00 UTC
Created attachment 196104 [details]
Linux kernel .config information
Comment 5 Robert Bradbury 2009-06-29 19:27:47 UTC
Created attachment 196106 [details]
Linux kernel boot dmesg file
Comment 6 Robert Bradbury 2009-06-29 19:28:44 UTC
Created attachment 196107 [details]
xorg.conf information (should work with KMS/UXA/DRI2)
Comment 7 Robert Bradbury 2009-06-29 19:30:36 UTC
Created attachment 196108 [details]
Glxinfo with driver information and glxgear results

Da.. da... here is the (poor) performance documentation
Comment 8 Rafael 2009-06-29 20:57:32 UTC
> 2) Playing certain videos (in my case .flac files) *hangs* the system.  No
> video on any X terminals or the consoles (Ctrl-Alt-F#).  System must be turned
> off and rebooted.  This is an X server/driver problem because mplayer
> "attempts" to start a video stream.  It draws a small video window in the
> middle of the screen and then colorizes about 80% of it (the bottom right hand
> corner is black) then hangs the system.  While the video file is labled
> ".flac", mplayer appears to recognize it as a "RAWDV" format and uses the DVSD
> / ffmpeg driver to decode it.
> 
This may be the same problem I reported a while ago in bug 270636.
Do you have KMS enabled? 
If you do, does the problem still happen if you disable KMS?
Comment 9 Rémi Cardona (RETIRED) gentoo-dev 2009-06-30 08:47:22 UTC
(In reply to comment #0)
> The xorg-server-1.6.1.901-r5 patches appear to not work.  Starting the X
> server appears to hang the system.  Falling back to "-r4" was necessary to get
> a running X server.

This is the worst problem of the lot. Let's try to get that fixed first; we'll tackle the rest later.

I'm short on time today, but I'll help you out later.

Thanks
Comment 10 Mike Auty (RETIRED) gentoo-dev 2009-06-30 13:05:27 UTC
I'm also experiencing issue number 3 (-r5 patch set hanging X on startup).  I'm using gdm, which would load and almost immediately afterwards hang.  The Xorg.0.log output looks as follows:

(II) intel(0): EDID vendor "SEC", prod id 12885
(II) intel(0): Printing DDC gathered Modelines:
(II) intel(0): Modeline "1920x1200"x0.0  167.80  1920 2020 2052 2264  1200 1202 1208 1235 -hsync -vsync (74.1 kHz)

Backtrace:
0: /usr/bin/Xorg(xorg_backtrace+0x3c) [0x8135ffc]
1: /usr/bin/Xorg(xf86SigHandler+0x52) [0x80d5562]
2: [0xffffe400]
3: /usr/lib/libdrm_intel.so.1(drm_intel_bo_alloc_for_render+0x24) [0xb7a65484]
4: /usr/lib/xorg/modules/drivers//intel_drv.so [0xb7ab8e9c]
5: /usr/lib/xorg/modules/drivers//intel_drv.so [0xb7ad6d1c]
6: /usr/lib/xorg/modules/drivers//intel_drv.so(uxa_trapezoids+0x271) [0xb7ad7401]
7: /usr/bin/Xorg(CompositeTrapezoids+0x9c) [0x8171a2c]
8: /usr/bin/Xorg [0x817a1fd]
9: /usr/bin/Xorg [0x81748e6]
10: /usr/bin/Xorg(Dispatch+0x33f) [0x808d4ef]
11: /usr/bin/Xorg(main+0x3bd) [0x807214d]
12: /lib/libc.so.6(__libc_start_main+0xe6) [0xb7c62a76]
13: /usr/bin/Xorg [0x80715d1]


The EDID modeline lines are normal, the backtrace is not.  ;)  I'll try figuring out which patch in particular caused the problems (when I next get some time free), but I can confirm that mesa-7.5_rc4 works fine with 1.6.1.901-r4, and that it is purely -r5 that's causing the problems.
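
For triage, the named frames in a backtrace like the one above are the useful part: here they run from CompositeTrapezoids through uxa_trapezoids into drm_intel_bo_alloc_for_render. A minimal sketch for pulling the module and symbol out of each frame, assuming the log keeps the "N: path(symbol+offset) [address]" layout shown above. The function name xorg_frames is made up for illustration:

```shell
# Print "module symbol" for every frame of an Xorg backtrace read on stdin.
# Frames that carry no symbol print the module path only; frames with no
# path at all (such as "2: [0xffffe400]") are skipped.
xorg_frames() {
    awk '/^[0-9]+: \// {
        frame = $2
        sub(/\+0x[0-9a-f]+\)$/, "", frame)   # drop the trailing "+0xNN)" offset
        sub(/\(/, " ", frame)                # separate module path from symbol
        print frame
    }'
}
```

Typical use would be `xorg_frames < /var/log/Xorg.0.log`, assuming the log lives at that path on the machine in question.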
Comment 11 Robert Bradbury 2009-06-30 16:15:44 UTC
Mike, 1.6.1.901-r4 isn't out of the woods yet.  I'm experiencing a similar but somewhat different (infrequent) X-server stack trace problem in 1.6.1.901-r4.  It appears to involve the memory allocation code rather than the drm code.

I get my trace (will be attached) either when I switch X displays or, more likely, when I log off from a display.  It doesn't seem to be consistent.

Also, is there any freedesktop.org bug report filed for enhancing the X server stack traces?  (Bug #?)  I've got both X-server and intel_drv compiled in debug mode (/usr/lib/debug/.... symbols & /usr/src/debug/.... associated files are present) and it doesn't seem like it should be too hard to rip the code out of GDB which uses these files to produce a more helpful stack trace.

Also, is there any documentation/hints on running GDB on the X server?  Are there any signals (or performance issues) to be concerned about?  Does it really require 2 video adaptors/terminals to work well (or will switching between an X server and a console window work ok)?  Given the "questionable" aspect of the intertwining of the KMS / modeset code across various servers and the console code I'd expect making this work to be iffy.

Also, I think I ran across a program that can split a large console into multiple terminals (kind of like splitvt but perhaps 4 terminals in a rectangle rather than vertical splitting).  Anybody know the name?
Comment 12 Robert Bradbury 2009-06-30 16:24:19 UTC
Created attachment 196177 [details]
Xorg.2.log stack trace

xorg-server trace of segfault, perhaps when gnome session logged off.
Comment 13 Rafael 2009-06-30 16:52:07 UTC
I've also experienced the xorg-server-1.6.1.901-r5+KMS bug, but I see that the ebuild has already been hard masked.
I also confirm the problem number 2 still persists with xorg-server-1.6.1.901-r5+KMS+vanilla-sources-2.6.30.
Comment 14 Rémi Cardona (RETIRED) gentoo-dev 2009-06-30 17:29:21 UTC
*** Bug 275852 has been marked as a duplicate of this bug. ***
Comment 15 Rémi Cardona (RETIRED) gentoo-dev 2009-06-30 17:34:20 UTC
@Robert, please, _one_ issue per bug. If you have other issues, please file _new_ bugs. Rafael, that goes for you too. Please understand that keeping bug reports on topic is essential if you want us to be able to help you.

As for the KMS hang, I talked to upstream devs this afternoon and we're getting closer. In a nutshell, -r5 uncovered a bug in the intel driver. Adding link to the upstream bug.

Thanks
Comment 16 Robert Bradbury 2009-07-03 07:04:47 UTC
Remi, is there any specific location on the web that we should watch upstream for possible fixes?  I've looked at freedesktop.org from time to time and it isn't really clear what the state of the various drivers is relative to "works robustly", "works maybe", "use at own risk" and "bleeding edge".  It isn't as if the core developers are providing press releases that clearly state what the bug(s) are and what patches resolve them.

I'm debating whether to go in the direction of trying to install a 2nd Radeon video card and perhaps attempting to debug the Intel driver myself or fall back to an i915.modeset=0 environment or even older intelfb or vesafb drivers which appear to provide video support and decent glxgears performance.  Also, do you happen to have a pointer to the complete hardware specs on the i915 (and/or other Intel graphics chips)?  Thanks.
Comment 17 Rémi Cardona (RETIRED) gentoo-dev 2009-07-03 08:38:38 UTC
(In reply to comment #16)
> Remi, any specific location on the web that we should watch upstream for
> possible fixes (I've looked at freedesktop.org from time to time and it isn't
> really clear what the state of the various drivers is relative to "works
> robustly", "works maybe", "use at own risk" and "bleeding edge.  It isn't as if
> the core developers are providing press releases that clearly state what the
> bug(s) are and what patches resolve them.

Thing is, graphics drivers are hugely complex beasts. Some chips will work beautifully with great performance, suspend/resume support, etc, while some will be less stable.

Bottom line, you _have_ to report bugs upstream if you want them to even consider that things might not work as they intend. The kernel and X drivers support more than a _hundred_ variants of their chipsets and they don't test half of those on a regular basis.

> I'm debating whether to go in the direction of trying to install a 2nd Radeon
> video card and perhaps attempting to debug the Intel driver myself or fall back
> to a i915.modset=0 environment or even older intelfb or vesafb drivers which
> appear to provide video support and decent glxgears performance.  Also, do you
> happen to have a pointer to the complete hardware specs on the i915 (and/or
> other Intel graphics chips)?  Thanks.

For now, I suggest you stop using KMS until this is fixed. As for going back to intelfb or (u)vesafb, this is probably a big mistake.
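
Since the reporter's i915 driver is built into the kernel and KMS is switched on with i915.modeset=1 in grub.conf, turning KMS off means booting with i915.modeset=0 instead. A small sketch for checking which value a given kernel command line carries. kms_setting is a hypothetical helper name, and treating the last occurrence as the effective one is an assumption:

```shell
# Report the i915.modeset value found in a kernel command line string.
# $1: the command line text, e.g. the contents of /proc/cmdline.
# Prints the value of the last i915.modeset= word, or "unset" if none exists.
kms_setting() {
    value=unset
    for word in $1; do
        case $word in
            i915.modeset=*) value=${word#i915.modeset=} ;;
        esac
    done
    echo "$value"
}
```

On a running system this could be called as `kms_setting "$(cat /proc/cmdline)"`; a result of 0 would indicate that KMS was disabled at boot.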

As for the docs, all available docs are listed on Intel's Linux graphics website. So that means only 965 and newer.

Again, I cannot stress enough how important it is that you report bugs upstream.

Thanks
Comment 18 Rémi Cardona (RETIRED) gentoo-dev 2009-07-03 12:23:52 UTC
Ok, I've committed xorg-server 1.6.1.902 with a patch that should fix the crash/hang of xf86-video-intel when using KMS.

_Please_ test it out, so I can report upstream and they can include it or fix it if need be.

Thanks
Comment 19 Andrew Gaydenko 2009-07-03 13:38:15 UTC
I have tried 1.6.1.902 (with 1.6.1.901-r5 one CPU core was pegged at 100% and the system practically hung) - this CPU problem is resolved. Nevertheless glxgears (or my son's GL games) just blinks. I use up-to-date ~amd64. The output of 'lspci | grep -i vga' is:

00:02.0 VGA compatible controller: Intel Corporation 82G965 Integrated Graphics Controller (rev 02)

I have tried to rebuild mesa without success. revdep-rebuild has found nothing to rebuild.
Comment 20 Rémi Cardona (RETIRED) gentoo-dev 2009-07-03 13:40:38 UTC
(In reply to comment #19)
> I have tried 1.6.1.902 (having 100% one CPU core eating and practically hanging
> with 1.6.1.901-r5) - this eating problem is resolved. Nevertheless glxgears (or
> my son's gl-games) just blinks. I use up to date ~amd64. 'lspci | grep -i vga'
> out is:
> 
> 00:02.0 VGA compatible controller: Intel Corporation 82G965 Integrated Graphics
> Controller (rev 02)
> 
> I have tired to rebuild mesa without success. revdep-rebuild has found nothing
> to rebuild.
> 

It looks like both bugs you described have _nothing_ to do with the current bug report. If you have issues, please file separate bug reports so we can actually figure out what to do.

Thanks
Comment 21 Mike Auty (RETIRED) gentoo-dev 2009-07-04 00:02:32 UTC
*** Bug 275860 has been marked as a duplicate of this bug. ***
Comment 22 Giovanni Pellerano 2009-07-04 15:03:22 UTC
Hey Rémi,

great work =) the patch works great for me =)

I'm one of the guys who initially submitted some issues here and also upstream, and it seems that your patch now fixes all my problems.

I'm going to notify Jesse Barnes of Intel about it as well.


thanks
Comment 23 Giovanni Pellerano 2009-07-04 15:12:34 UTC
Hey Rémi, I think I spoke too soon: it does not freeze, but it also does not work properly :/

With glxinfo I get "direct rendering: yes", but with glxgears I get a black window and this console output:

unhandled buffer attach event, attacment type 215048776
unhandled buffer attach event, attacment type 209761664
.... and over... and over 

Comment 24 Rémi Cardona (RETIRED) gentoo-dev 2009-07-05 18:18:47 UTC
(In reply to comment #23)
> hey rémi think i've talk too early, it does not freeze but also does not work
> properly :/

Hi Giovanni,

Thanks for testing and getting back to us, it's greatly appreciated :)

> with glxinfo i got direct rendering:yes, but with glxgears i goot a black
> window and the console output

Hum, somehow, I'm not surprised. I'm currently trying out mesa's 7.5 branch and xf86-video-intel's master/2.8 branch so I can't really confirm what you have.

However, I'll try to get up-to-date snapshots of both packages for you to test (both branches should be much more stable than what we currently ship in portage).

Stay tuned :)

Thanks
Comment 25 yan nails 2009-07-06 10:00:06 UTC
Here are my installed versions:
The first one worked:
======================================
x11-base/xorg-server-1.6.1.901-r1
media-libs/mesa-7.4.2
x11-drivers/xf86-video-intel-2.7.1
--------------------------------------
~250fps

But after updating xorg-server I get the same error with each of the following xorg-server and mesa combinations when running glxgears, glxinfo or games:
======================================
x11-base/xorg-server-1.6.1.901-r4
media-libs/mesa-7.4.2 (7.4.4)
x11-drivers/xf86-video-intel-2.7.1
====================================
x11-base/xorg-server-1.6.1.902
media-libs/mesa-7.4.2 (7.4.4) (7.5_rc4)
x11-drivers/xf86-video-intel-2.7.1
--------------------------------------
unhandled buffer attach event, attacment type 174637688
unhandled buffer attach event, attacment type 176806968
unhandled buffer attach event, attacment type 170069536
unhandled buffer attach event, attacment type 176221936
Segmentation Error
=======================================

I also have a problem when logging out from KDE; kdm.log shows:

Backtrace:
0: /usr/bin/X(xorg_backtrace+0x38) [0x812b5e0]

Fatal server error:
Caught signal 11.  Server aborting

==================================
gentoo-sources-2.6.30_r1
GEM,KMS (wiki page installed)

Maybe it'll help
Comment 26 Robert Bradbury 2009-07-06 20:55:43 UTC
It looks like the "unhandled buffer attach event" messages are coming from /usr/lib/dri/i915_dri.so (or i965_dri.so).  These in turn are part of the mesa package.  My Xorg.#.log files show that X reloads the dri drivers under the EXA driver.  I have 'Option "AccelMethod" "UXA"' in my xorg.conf file and I seem to recall some discussion of EXA/UXA and the intel driver in some forums.  I am not sure if this helps any.

Note, it seems possible to debug the Xorg & Intel drivers "in the act".  See "Debugging the Xserver" from xorg.freedesktop.org/wiki/Development.  But it is tricky in that you need the debug symbols for robust debugging.  On Gentoo I've finally worked this out: one wants to compile with "-ggdb" in CFLAGS/CXXFLAGS and with FEATURES (in /etc/make.conf or in the environment) set to include "splitdebug" and/or "nostrip", plus "installsources".  The "splitdebug" option places the debug symbol information in /usr/lib/debug/<--path-->/binary-name.debug.  The "nostrip" option leaves the debug symbol information in the binary output files, e.g. /usr/lib/lib*, /usr/bin/* etc.  The "installsources" option leaves the package sources in /usr/src/debug/<package-directory>.  One can symlink the /usr/lib/debug and /usr/src/debug directories elsewhere to save space on / & /usr (the directories can get quite large if you want to do this for lots of libraries).  One obviously needs to build all of the X packages (xorg-server, xf86-video-intel, libdrm & mesa) under this scheme to get useful debug environments in gdb.  It also helps to rebuild the standard C libraries such as glibc & libm.
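
The settings described above could look roughly like this in /etc/make.conf. This is a sketch, not a tested configuration: the "-O2 -pipe" base flags are an assumed placeholder for whatever flags are already in use, and only "-ggdb", "splitdebug" and "installsources" come from the description above.

```shell
# /etc/make.conf fragment (sketch).
# "-O2 -pipe" is an assumed placeholder; keep your existing flags and
# append -ggdb to them.
CFLAGS="-O2 -pipe -ggdb"
CXXFLAGS="-O2 -pipe -ggdb"

# splitdebug puts symbols under /usr/lib/debug, installsources puts the
# sources under /usr/src/debug. Use "nostrip" instead of "splitdebug" to
# leave the symbols inside the installed binaries.
FEATURES="splitdebug installsources"

# Afterwards the X stack has to be rebuilt, for example:
#   emerge --oneshot xorg-server xf86-video-intel libdrm mesa
```

Portage treats FEATURES as an incremental variable, so listing only the extra features here should add them to the profile defaults rather than replace them.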

At some point, I hope someone will integrate the gdb stack trace code into the X server (.../os/backtrace.c which typically uses backtrace(3)) and then we could get backtraces which were more informative if one had the code compiled for debugging.  It doesn't look as if it would be too hard to hack the X server (os/backtrace.c or hw/xfree86/common/xf86Events.c) to boot the user into a gdb session on the Xserver.  Though this would have to be on one of the console tty's or a different graphics card.

I have my MSI 3450 (ATI R620) installed on my machine now, so once that is all configured properly I'll be able to use it to debug the i915 (though I have yet to figure out the xorg.conf configuration for 2 graphics cards+monitors).  Of course since the KMS/EXA/mesa(dri) code is even more experimental it may be a case of out of the frying pan and into the fire.
Comment 27 Robert Bradbury 2009-07-06 23:10:22 UTC
I've been going through the mesa driver source and it looks like they include a fair amount of debug code.  There are at least two environment variables for the intel-915 driver (perhaps the same for i810 & i965 drivers):
  INTEL_DEBUG=list-of-strings (comma separated?)
and
  INTEL_STRICT_CONFORMANCE=#  (where number is the "conformance level")

The various debug option strings are in the file .../src/mesa/drivers/dri/i965/intel_context.c in the array: debug_control[].  If you use the string "all", you get all 27 debug options which I expect produces a lot of output.  The specific option "dri" should provide some feedback as to what is going wrong with the "unhandled buffer attach" messages.  The error reports don't seem to make clear precisely what program (X-startup / Gnome / Application) is causing the errors or what the mesa hardware/driver (i810/i915/i965) is.  The Mesa code supports three different drivers (presumably due to the hardware capabilities).
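
As a sketch of how the variable described above might be used for a single program without exporting it to the whole session: the wrapper name run_with_intel_debug is hypothetical, and whether several options can be combined with commas is, as noted above, not confirmed.

```shell
# Run one command with INTEL_DEBUG set only for that command.
# $1: the debug option string (e.g. "dri" or "all"); the rest: the command.
run_with_intel_debug() {
    opts=$1
    shift
    INTEL_DEBUG=$opts "$@"
}
```

For example, `run_with_intel_debug dri glxgears > glxgears-debug.log 2>&1` would capture whatever the driver prints alongside the "unhandled buffer attach event" messages, which makes it clear which program produced them.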

As mentioned previously it helps to have the source around (with the installsources FEATURES option) to see what might be going on.
Comment 28 Rémi Cardona (RETIRED) gentoo-dev 2009-07-07 09:59:24 UTC
@All, this bug is fixed.

Robert, like I told you earlier, please open _new_ bug reports instead of continuing on this one. I have more than 150 bugs to look after, please don't make my job harder than it already is.

Thank you
Comment 29 Rémi Cardona (RETIRED) gentoo-dev 2009-07-07 09:59:43 UTC
And closing with the proper resolution.

Thanks