Bug 439368

Summary: x11-drivers/xf86-video-intel-2.20.14: hard system freezes
Product: Gentoo Linux
Component: [OLD] Unspecified
Status: RESOLVED UPSTREAM
Severity: normal
Priority: Normal
Version: unspecified
Hardware: AMD64
OS: Linux
Reporter: Tassilo Horn <tsdh>
Assignee: Gentoo X packagers <x11>
CC: nikoli
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=57118
Whiteboard:
Package list:
Runtime testing required: ---
Attachments: The Xorg.0.log file after the system locked up
dmesg output after the lockup
"bt full" in GDB attached to the frozen X process

Description Tassilo Horn 2012-10-23 08:36:54 UTC
The two latest x11-drivers/xf86-video-intel ebuilds (2.20.10 and 2.20.12) cause my system to lock up after some minutes or hours.  The lockups occur mostly when switching windows (GNOME 3.4) or when the screen is reactivated.

It also seems to occur much sooner (or maybe only) when I have two outputs connected: the laptop LCD and an external monitor.

When it happens, the system is completely locked up.  I can't switch to a console using Ctrl-Alt-Fx.  All I can do is power off the system using the Magic SysRq keys.  I haven't had a chance to try ssh-ing into the system yet, but I'll do that as soon as I find some spare time.

The USE flags for x11-drivers/xf86-video-intel are: dri glamor sna udev -uxa -xvmc

For the time being, I've downgraded to x11-drivers/xf86-video-intel-2.20.9 which doesn't lock up the system.

Reproducible: Always

# emerge --info
Portage 2.2.0_alpha141 (default/linux/amd64/10.0/desktop/gnome, gcc-4.6.3, glibc-2.15-r3, 3.6.2-gentoo x86_64)
=================================================================
System uname: Linux-3.6.2-gentoo-x86_64-Intel-R-_Core-TM-2_Duo_CPU_T8100_@_2.10GHz-with-gentoo-2.2
Timestamp of tree: Mon, 22 Oct 2012 18:00:01 +0000
app-shells/bash:          4.2_p37
dev-java/java-config:     2.1.12
dev-lang/python:          2.7.3-r2, 3.2.3-r1
dev-util/cmake:           2.8.9-r1
dev-util/pkgconfig:       0.27.1
sys-apps/baselayout:      2.2
sys-apps/sandbox:         2.6
sys-devel/autoconf:       2.13, 2.69
sys-devel/automake:       1.9.6-r3, 1.11.6, 1.12.4
sys-devel/binutils:       2.22.90
sys-devel/gcc:            4.6.3
sys-devel/gcc-config:     1.7.3
sys-devel/libtool:        2.4.2
sys-devel/make:           3.82-r4
sys-kernel/linux-headers: 3.6 (virtual/os-headers)
sys-libs/glibc:           2.15-r3
Repositories: gentoo emacs my_local_overlay
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe -fomit-frame-pointer"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt /usr/share/maven-bin-3.0/conf"
CONFIG_PROTECT_MASK="${EPREFIX}/etc/gconf /etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=native -O2 -pipe -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--quiet-build n"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://ftp-stud.hs-esslingen.de/pub/Mirrors/gentoo/ http://gentoo.mneisen.org/ http://mirror.netcologne.de/gentoo/"
LANG="en_US.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="en de"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/lib/layman/emacs /usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X a52 aac acpi alsa amd64 apng archive avahi avx berkdb bluetooth branding bzip2 cairo cdda cdr cli colord consolekit cracklib crypt cups curl cxx dbus dri dts dvd dvdr emacs emboss encode evo exif fam ffmpeg firefox flac fuse gdbm gif gnome gnome-keyring gnome-online-accounts gnutls gpg gpm gstreamer gtk gtk3 iconv icq icu idn introspection ipv6 jabber jpeg kpathsea lcms ldap libnotify mad mmx mng modules mp3 mp4 mpeg mudflap multilib nautilus ncurses networkmanager nls nptl ntp offensive ogg opengl openmp pam pango pcre pdf png policykit ppds pppd pulseaudio qt3support qt4 readline schroedinger sdl session smp socialweb spell sqlite sqlite3 sse sse2 sse3 sse4_1 ssl ssse3 startup-notification svg systemd tcpd theora threads tiff truetype udev udisks unicode upower usb vaapi vorbis vpx webgl webkit wxwidgets x264 xcb xft xinerama xml xv xvid zlib zsh-completion" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock 
itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="evdev synaptics" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="en de" PHP_TARGETS="php5-3" PYTHON_TARGETS="python3_2 python2_7" RUBY_TARGETS="ruby18 ruby19" USERLAND="GNU" VIDEO_CARDS="intel" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Comment 1 Tassilo Horn 2012-10-23 08:42:07 UTC
Graphics card information from lshw:

     *-pci
          description: Host bridge
          product: Mobile PM965/GM965/GL960 Memory Controller Hub
          vendor: Intel Corporation
          physical id: 100
          bus info: pci@0000:00:00.0
          version: 0c
          width: 32 bits
          clock: 33MHz
          configuration: driver=agpgart-intel
          resources: irq:0
        *-display:0
             description: VGA compatible controller
             product: Mobile GM965/GL960 Integrated Graphics Controller (primary)
             vendor: Intel Corporation
             physical id: 2
             bus info: pci@0000:00:02.0
             version: 0c
             width: 64 bits
             clock: 33MHz
             capabilities: msi pm vga_controller bus_master cap_list rom
             configuration: driver=i915 latency=0
             resources: irq:45 memory:f8100000-f81fffff memory:e0000000-efffffff ioport:1800(size=8)
        *-display:1 UNCLAIMED
             description: Display controller
             product: Mobile GM965/GL960 Integrated Graphics Controller (secondary)
             vendor: Intel Corporation
             physical id: 2.1
             bus info: pci@0000:00:02.1
             version: 0c
             width: 64 bits
             clock: 33MHz
             capabilities: pm bus_master cap_list
             configuration: latency=0
             resources: memory:f8200000-f82fffff
Comment 2 Chí-Thanh Christopher Nguyễn gentoo-dev 2012-10-23 08:51:28 UTC
> All I can do is power off the system using the Magic SysRq keys.

Then the system is not completely frozen (probably only X server or kernel drm) and it may still be possible to get dmesg/Xorg.0.log or other debug information.
Comment 3 Tassilo Horn 2012-10-23 10:55:19 UTC
> Then the system is not completely frozen (probably only X server or kernel drm)
> and it may still be possible to get dmesg/Xorg.0.log or other debug information.

Hm, right.  Unfortunately, after the freeze I restarted the system, emerged the older driver, and restarted again, thus the relevant Xorg.0.log is gone now.  Is there an option for specifying how many log files X should keep?

Anyway, I'll try to upgrade again and provoke the freezes later today.  Which other debug information besides Xorg.0.log would be helpful?
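
On the question of keeping older logs: as far as I know, the X server only renames the previous log to Xorg.0.log.old at startup, so two restarts are enough to lose the interesting one. A minimal sketch for saving a timestamped copy before restarting follows. The function name `save_xorg_log` and the default destination directory are hypothetical and not part of this report.

```shell
#!/bin/sh
# Hypothetical helper: copy an X server log to a timestamped file so that
# later restarts of X cannot overwrite it.
# Usage: save_xorg_log [LOGFILE] [DESTDIR]
save_xorg_log() {
    log="${1:-/var/log/Xorg.0.log}"
    dest="${2:-$HOME/xorg-logs}"      # assumed location, any directory works
    [ -f "$log" ] || { echo "no such log: $log" >&2; return 1; }
    mkdir -p "$dest" || return 1
    stamp=$(date +%Y%m%d-%H%M%S)
    target="$dest/$(basename "$log").$stamp"
    cp -p "$log" "$target" || return 1
    # Print the path of the saved copy so it can be attached to a report.
    echo "$target"
}
```

Running `save_xorg_log` over ssh right after a freeze, before rebooting, should leave a copy that survives the downgrade-and-restart cycle described above.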
Comment 4 Chí-Thanh Christopher Nguyễn gentoo-dev 2012-10-23 11:35:34 UTC
* dmesg
* Xorg.0.log
* attach gdb to X server when it hangs and provide a stack trace (may need to build xorg-server and xf86-video-intel with CFLAGS="-ggdb" and FEATURES="splitdebug" to get meaningful output)
* what happens when you run "chvt 1"
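
Because a hung X server or DRM driver can make diagnostic commands block as well, it may help to run them under a time limit from the ssh session. A minimal sketch, assuming GNU coreutils `timeout` is installed; the helper name `probe` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command, give up after SECS seconds, and
# report whether it finished or had to be killed.
# Usage: probe SECS COMMAND [ARGS...]
probe() {
    secs="$1"; shift
    timeout "$secs" "$@"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        # GNU timeout exits with 124 when the time limit was reached.
        echo "hung (killed after ${secs}s): $*"
    else
        echo "exit status $rc: $*"
    fi
    return "$rc"
}

# Example use on the affected machine (as root, over ssh):
#   probe 10 chvt 1
#   dmesg > dmesg-after-hang.txt
```

One caveat: if the command is stuck in uninterruptible kernel sleep, the signal sent by `timeout` may not take effect, so the probe itself can block. In that case a second ssh session is the safer way to keep working.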
Comment 5 Tassilo Horn 2012-10-23 12:00:15 UTC
> * dmesg
> * Xorg.0.log
> * what happens when you run "chvt 1"

Ok, that shouldn't be hard to provide.

> * attach gdb to X server when it hangs and provide a stack trace
> (may need to build xorg-server and xf86-video-intel with CFLAGS="-ggdb"
> and FEATURES="splitdebug" to get meaningful output)

Ok, that's "gdb /usr/bin/X `pgrep X`", right?  I hope it's not a Heisenbug that disappears once you try to debug it.
Comment 6 Chí-Thanh Christopher Nguyễn gentoo-dev 2012-10-23 12:02:03 UTC
You need to attach gdb to the X server only after it has hung (via ssh or so).
Comment 7 Tassilo Horn 2012-10-23 12:08:40 UTC
(In reply to comment #6)
> You need to attach gdb to the X server only after it has hung (via ssh or
> so).

Sure.  I meant that many bugs disappear once you compile with "-O0 -ggdb" to get helpful backtraces.  Or should I stick with my usual CFLAGS "-march=native -O2 -pipe -fomit-frame-pointer" and just add -ggdb?  (I think I need to remove at least -fomit-frame-pointer to get somewhat meaningful backtraces.)
Comment 8 Chí-Thanh Christopher Nguyễn gentoo-dev 2012-10-23 13:32:23 UTC
-fomit-frame-pointer is not very useful on amd64 anyway, you may want to consider dropping it.

Otherwise, just adding -ggdb and setting FEATURES="splitdebug" should result in the exact same code being built. -O2 can stay for now.
Comment 9 Tassilo Horn 2012-10-23 16:03:16 UTC
(In reply to comment #8)
> -fomit-frame-pointer is not very useful on amd64 anyway, you may want to
> consider dropping it.

Ok, dropped it.

> Otherwise, just adding -ggdb and setting FEATURES="splitdebug" should result
> in the exact same code being built. -O2 can stay for now.

Ok, so I created a /etc/portage/env/debug.conf with the following content:

--------------------------------------------
USE="debug"
CFLAGS="-march=native -pipe -g -ggdb"
CXXFLAGS="${CFLAGS}"
FEATURES="splitdebug"
--------------------------------------------

and added these entries to /etc/portage/package.env

--------------------------------------------
x11-drivers/xf86-video-intel	debug.conf
x11-base/xorg-server		debug.conf
--------------------------------------------

and recompiled these packages (intel driver 2.20.12, Xorg 1.13.0).
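
The per-package setup above can be sketched as a small script. This is only an illustration of the two files quoted in this comment: the function name `setup_debug_env` and its optional root argument are hypothetical, and it assumes /etc/portage/package.env is a plain file (Portage also accepts a directory there).

```shell
#!/bin/sh
# Sketch: write the per-package debug settings from this comment.
# The optional ROOT argument allows a dry run in a scratch directory;
# without it the real /etc/portage is used (needs root).
setup_debug_env() {
    root="${1:-}"
    envdir="$root/etc/portage/env"
    pkgenv="$root/etc/portage/package.env"
    mkdir -p "$envdir" || return 1
    # Quoted heredoc delimiter keeps ${CFLAGS} literal in the file.
    cat > "$envdir/debug.conf" <<'EOF'
USE="debug"
CFLAGS="-march=native -pipe -g -ggdb"
CXXFLAGS="${CFLAGS}"
FEATURES="splitdebug"
EOF
    # Append each package once; assumes package.env is a file, not a directory.
    for pkg in x11-drivers/xf86-video-intel x11-base/xorg-server; do
        grep -qs "^$pkg[[:space:]]" "$pkgenv" ||
            printf '%s\tdebug.conf\n' "$pkg" >> "$pkgenv"
    done
}

# Afterwards the two packages would be rebuilt, for example with:
#   emerge --oneshot x11-drivers/xf86-video-intel x11-base/xorg-server
```

Trying it first as `setup_debug_env /tmp/scratch` shows the generated files without touching the live configuration.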

Then I provoked another freeze.  A reliable method seems to be to simply open a terminal window and press and hold F11 (in GNOME) which toggles fullscreen over and over again.

The first time, I couldn't ssh into the machine when the lockup occurred.  But the second time I could.  I'll attach the dmesg output, the Xorg.0.log, and the GDB backtrace as attachments in a minute.

While the system was locked up, the command "chvt 1" simply hung.
Comment 10 Tassilo Horn 2012-10-23 16:03:58 UTC
Created attachment 327240 [details]
The Xorg.0.log file after the system locked up
Comment 11 Tassilo Horn 2012-10-23 16:04:32 UTC
Created attachment 327242 [details]
dmesg output after the lockup
Comment 12 Tassilo Horn 2012-10-23 16:05:44 UTC
Created attachment 327244 [details]
"bt full" in GDB attached to the frozen X process

As root:

  $ gdb /usr/bin/X `pgrep X`
  (gdb) bt full
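
When ssh access during a lockup is rare, a non-interactive variant of the same step can save time. A minimal sketch using gdb's batch mode follows; the function name `dump_backtrace` and the default output file name are hypothetical, and the process name to look up (X or Xorg) depends on the setup.

```shell
#!/bin/sh
# Hypothetical helper: attach gdb to a running process, write a full
# backtrace to a file, and detach again without an interactive session.
# Usage: dump_backtrace PID [OUTFILE]
dump_backtrace() {
    pid="$1"
    out="${2:-backtrace-$pid.txt}"
    # Refuse anything that is not a plain process ID.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: dump_backtrace PID [OUTFILE]" >&2
            return 2
            ;;
    esac
    # -batch exits after the commands given with -ex have run.
    gdb -batch -ex "bt full" -p "$pid" > "$out" 2>&1
}

# On the hung machine, as root (over ssh), assuming the server process is named X:
#   dump_backtrace "$(pgrep -o -x X)" X-backtrace.txt
```

Here `pgrep -o -x X` picks the oldest process whose name is exactly X, which avoids handing gdb several PIDs if other process names happen to contain an X.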
Comment 13 Chí-Thanh Christopher Nguyễn gentoo-dev 2012-11-14 00:15:38 UTC
It seems to wait for some kind of input event. If possible, please report a bug at https://bugs.freedesktop.org/ and add the URL to this bug.
Comment 14 Tassilo Horn 2012-11-14 10:17:41 UTC
(In reply to comment #13)
> It seems to wait for some kind of input event. If possible, please report a
> bug at https://bugs.freedesktop.org/ and add the URL to this bug.

There are already some server bugs that look related.

But I'm not sure if the backtrace actually shows the problem this report is all about.  I had some more lockups in the meantime (also with x11-drivers/xf86-video-intel-2.20.13), but except for the one time above, I wasn't able to log into the machine using SSH anymore, so I couldn't generate a backtrace.

So maybe the backtrace actually shows another problem which doesn't happen that frequently.  At least, it doesn't seem to be related to the intel driver...

I'll try out the suggestions at the X.org wiki ServerDebugging page.  Maybe that will let me collect more information.
Comment 15 Tassilo Horn 2012-11-14 15:22:40 UTC
I had another lockup and was able to gather another backtrace.  It was the same as attached here.  So I went ahead and wrote a bug report at freedesktop.org.

  https://bugs.freedesktop.org/show_bug.cgi?id=57118
Comment 16 Rémi Cardona (RETIRED) gentoo-dev 2012-11-28 20:31:52 UTC
Upstream says the bug is now fixed. Could you try the latest ~arch ebuild and let us know if it works ok for you?

Thanks
Comment 17 Tassilo Horn 2012-11-28 21:02:46 UTC
(In reply to comment #16)
> Upstream says the bug is now fixed. Could you try the latest ~arch ebuild
> and let us know if it works ok for you?

I'm already running it.  If the bug is still there, it should at least trigger tomorrow @work with my dual screen setup.  I'll report back.
Comment 18 Tassilo Horn 2012-11-29 07:29:09 UTC
> I'm already running it.  If the bug is still there, it should at least
> trigger tomorrow @work with my dual screen setup.  I'll report back.

I had another freeze with 2.20.14, so the bug doesn't seem to be fixed.  I've reopened the upstream issue.
Comment 19 Rémi Cardona (RETIRED) gentoo-dev 2012-12-01 16:49:50 UTC
Closing as issue is being handled upstream.

Thanks