Bug 266330 - xserver freezes (intel graphics), [mi] EQ overflowing. The server is probably stuck in an infinite loop. - evdev, miPolyArc
Summary: xserver freezes (intel graphics), [mi] EQ overflowing. The server is probably...
Status: RESOLVED UPSTREAM
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Gentoo X packagers
URL: http://bugs.freedesktop.org/show_bug....
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-04-16 05:14 UTC by Cyp
Modified: 2009-05-25 19:35 UTC
CC List: 1 user

See Also:
Package list:
Runtime testing required: ---


Attachments
Xorg.0.log from freeze (Xorg.0.log.old,371.93 KB, text/plain)
2009-04-16 05:15 UTC, Cyp
xorg.conf (xorg.conf,15.96 KB, text/plain)
2009-04-16 05:16 UTC, Cyp
Call graph generated by manually adding debug trace to code. (callTree.txt,4.51 KB, text/plain)
2009-05-21 11:17 UTC, Cyp
Call graph with more trace (callTree2.txt,6.09 KB, text/plain)
2009-05-23 09:32 UTC, Cyp
Yet more detailed trace... (callTree3.txt,9.97 KB, text/plain)
2009-05-24 20:50 UTC, Cyp

Description Cyp 2009-04-16 05:14:44 UTC
Seems there are many EQ overflowing bug reports, but I think mine is different, because the others don't mention miPolyArc. I wasn't sure what an appropriate summary was.

The backtraces always mention evdev and miPolyArc, which to me seem to be unrelated.

I'm not using compiz, and am on kde 3.5.10.

The freezes occur when using completely unrelated programs (mainly inkscape, I think). I think (not sure) that there are usually bézier curves on the screen when it happens.

The freezes have only occurred when using the mouse (and maybe keyboard).

I'm not sure when it started, but I think it was after I upgraded both xorg and the kernel (at the same time).


From the log:

(EE) Logitech USB Receiver: Read error: Resource temporarily unavailable
[mi] EQ overflowing. The server is probably stuck in an infinite loop.

Backtrace:
0: /usr/bin/X(xorg_backtrace+0x26) [0x4e9e86]
1: /usr/bin/X(mieqEnqueue+0x271) [0x4caf81]
2: /usr/bin/X(xf86PostMotionEventP+0xc4) [0x471554]
3: /usr/lib64/xorg/modules/input//evdev_drv.so [0x7f629882c872]
4: /usr/bin/X [0x487975]
5: /usr/bin/X [0x46fc66]
6: /lib/libpthread.so.0 [0x7f62af6f3a00]
7: /lib/libc.so.6 [0x7f62ad61a2cb]
8: /lib/libc.so.6(memmove+0x18a) [0x7f62ad61897a]
9: /usr/bin/X [0x4b94c6]
10: /usr/bin/X [0x4b9a31]
11: /usr/bin/X [0x4bb354]
12: /usr/bin/X [0x4bbb74]
13: /usr/bin/X(miPolyArc+0x85) [0x4bc905]
14: /usr/bin/X [0x52b3d2]
15: /usr/bin/X(ProcPolyArc+0x105) [0x4475c5]
16: /usr/bin/X(Dispatch+0x364) [0x44a104]
17: /usr/bin/X(main+0x44d) [0x43094d]
18: /lib/libc.so.6(__libc_start_main+0xe6) [0x7f62ad5ba5c6]
19: /usr/bin/X [0x42fd39]
Comment 1 Cyp 2009-04-16 05:15:29 UTC
Created attachment 188533 [details]
Xorg.0.log from freeze
Comment 2 Cyp 2009-04-16 05:16:00 UTC
Created attachment 188534 [details]
xorg.conf
Comment 3 Cyp 2009-04-16 05:17:23 UTC
P.S. I'm on amd64, so my pointers are bigger than my ints.
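To illustrate the remark: on amd64 Linux a C int is 4 bytes while a pointer is 8 bytes, so any code that squeezes a pointer into an int loses the upper half. The Python sketch below is not part of the original report and nothing here shows that this is what goes wrong in the X server. The helper name truncate_to_int32 is made up for illustration; the two addresses are frames 3 and 0 of the first backtrace.

```python
import struct

POINTER_BYTES = struct.calcsize("P")  # size of a C pointer on this machine
INT_BYTES = struct.calcsize("i")      # size of a C int on this machine


def truncate_to_int32(value):
    """Keep only the low 32 bits, read as a signed two's complement int."""
    low = value & 0xFFFFFFFF
    return low - 2**32 if low >= 2**31 else low


evdev_frame = 0x7F629882C872  # frame 3 of the first backtrace (evdev_drv.so)
x_frame = 0x4E9E86            # frame 0 of the first backtrace (/usr/bin/X)

print(POINTER_BYTES, INT_BYTES)
print(hex(truncate_to_int32(evdev_frame) & 0xFFFFFFFF))  # 0x9882c872
print(truncate_to_int32(x_frame) == x_frame)             # True
```

Addresses inside /usr/bin/X happen to fit in 32 bits, while the shared library addresses do not, which is why this kind of bug can stay hidden for a long time.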
Comment 4 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2009-04-18 16:38:47 UTC
Please post your "emerge --info".
Comment 5 Cyp 2009-04-18 17:54:33 UTC
I sometimes see some graphics corruption (graphics data from the wrong window/desktop displayed with the wrong pitch) in place of qt4 widgets. The cursor in a qt4 QSpinBox usually looks like a mostly black or grey square with some corruption.

Just now I saw some corruption in a 3.5.10 konsole, which isn't a qt4 widget, so maybe it's not a qt 4.5 specific bug after all.

Don't know if it's relevant, but thought it was worth noting in case.


(In reply to comment #4)
> Please post your "emerge --info".
> 

Portage 2.1.6.7 (default/linux/amd64/2008.0/desktop, gcc-4.3.3, glibc-2.9_p20081201-r2, 2.6.28-gentoo-r2 x86_64)
=================================================================
System uname: Linux-2.6.28-gentoo-r2-x86_64-Intel-R-_Core-TM-2_Quad_CPU_Q6600_@_2.40GHz-with-glibc2.2.5
Timestamp of tree: Fri, 17 Apr 2009 05:00:17 +0000
ccache version 2.4 [disabled]
app-shells/bash:     3.2_p39
dev-java/java-config: 1.3.7-r1, 2.1.7
dev-lang/python:     2.5.2-r7
dev-python/pycrypto: 2.0.1-r8
dev-util/ccache:     2.4-r7
dev-util/cmake:      2.6.3-r1
sys-apps/baselayout: 1.12.11.1
sys-apps/sandbox:    1.6-r2
sys-devel/autoconf:  2.13, 2.63
sys-devel/automake:  1.4_p6, 1.5, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.2
sys-devel/binutils:  2.18-r3
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.27-r2
ACCEPT_KEYWORDS="amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=core2 -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/lib/X11/xkb /usr/share/config"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-march=core2 -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks fixpackages parallel-fetch protect-owned sandbox sfperms splitdebug strict unmerge-orphans userfetch"
GENTOO_MIRRORS=" http://gentoo.tiscali.nl/ http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/ #http://pandemonium.tiscali.de/pub/gentoo/ http://ds.thn.htu.se/linux/gentoo"
LANG="en_US.UTF8"
LDFLAGS="-Wl,-O1"
LINGUAS="en da pt_BR"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage /usr/portage/local/layman/java-overlay /usr/portage/local/layman/science"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="3dnow 3dnowext X a52 aac aalib acl acpi alisp alsa amd64 apache2 ares bash-completion bazaar bdf berkdb bluetooth branding bzip2 cairo cdr cli cracklib crypt cups curl cvs d dbus djvu dk doc dri dvd dvdr dvdread editor eds emboss encode esd evo exif fam ffmpeg firefox flac foomaticdb fortran gcj gd gdbm gif git glitz gnutls gpm graphviz gstreamer gtk hal iconv imlib isdnlog ithreads jadetex java jpeg jpeg2k kde kerberos kpathsea ldap libcaca libnotify live logitech-mouse lzma mad mercurial midi mikmod mjpeg mmx mmxext mng mp2 mp3 mpeg mudflap multilib mysql ncurses network nls nodrm nptl nptlonly nsplugin objc objc++ objc-gc offensive ogg opengl openmp oss pam pcre pdf perl plotutils png povray ppds pppd python qt3 qt3support qt4 quicktime readline reflection rle samba scanner sdl se_swedb server session sift solver speex spell spl sse sse2 ssl ssse3 startup-notification subversion svg sysfs tcpd tga theora threads tiff timidity tk tokenizer truetype unicode usb v4l v4l2 vcd vorbis webkit wma x264 xcomposite xine xml xorg xscreensaver xulrunner xv xvid xvmc zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_dbd authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident imagemap include info log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif so speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" 
INPUT_DEVICES="evdev keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en da pt_BR" USERLAND="GNU" VIDEO_CARDS="s3 v4l vesa vga via i810 intel"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 6 Cyp 2009-04-18 18:11:19 UTC
Not again... Oh well, here's another backtrace, in case it helps.

Lines 0-6 and 12-19 are identical, but the middle lines 7-11 are different...


Backtrace:
0: /usr/bin/X(xorg_backtrace+0x26) [0x4e9e86]
1: /usr/bin/X(mieqEnqueue+0x271) [0x4caf81]
2: /usr/bin/X(xf86PostMotionEventP+0xc4) [0x471554]
3: /usr/lib64/xorg/modules/input//evdev_drv.so [0x7f93cbcd3872]
4: /usr/bin/X [0x487975]
5: /usr/bin/X [0x46fc66]
6: /lib/libpthread.so.0 [0x7f93e2b9aa00]
7: /lib/libm.so.6(scalbn+0xe5) [0x7f93e0fca045]
8: /lib/libm.so.6(ldexp+0x47) [0x7f93e0fca197]
9: /lib/libm.so.6(cbrt+0x100) [0x7f93e0fb9160]
10: /usr/bin/X [0x4b8412]
11: /usr/bin/X [0x4bb39e]
12: /usr/bin/X [0x4bbb74]
13: /usr/bin/X(miPolyArc+0x85) [0x4bc905]
14: /usr/bin/X [0x52b3d2]
15: /usr/bin/X(ProcPolyArc+0x105) [0x4475c5]
16: /usr/bin/X(Dispatch+0x364) [0x44a104]
17: /usr/bin/X(main+0x44d) [0x43094d]
18: /lib/libc.so.6(__libc_start_main+0xe6) [0x7f93e0a615c6]
19: /usr/bin/X [0x42fd39]
Comment 7 Cyp 2009-05-06 07:13:19 UTC
Is there anything I can do to help debug this?

I thought about putting an 'ulimit -c unlimited' in whichever file starts X (not sure which file), and making some script to 'killall -SIGSEGV X' when '[mi] EQ overflowing.' appears in the log, to get a core dump. (Would that be a sensible idea?)

Upgrading the kernel to 2.6.29-gentoo-r1, and running without an /etc/X11/xorg.conf didn't help. I'm on the latest xorg-server-1.5.3-r5, xf86-video-intel-2.6.3-r1 and xf86-input-evdev-2.2.1. Anything relevant to try upgrading?

Here's another backtrace, in case more backtraces help.
I have FEATURES="splitdebug", so I don't know why the backtrace doesn't give filename/line numbers.

0: /usr/bin/X(xorg_backtrace+0x26) [0x4e9e86]
1: /usr/bin/X(mieqEnqueue+0x271) [0x4caf81]
2: /usr/bin/X(xf86PostMotionEventP+0xc4) [0x471554]
3: /usr/lib64/xorg/modules/input//evdev_drv.so [0x7fde8c620c78]
4: /usr/bin/X [0x487975]
5: /usr/bin/X [0x46fc66]
6: /lib/libpthread.so.0 [0x7fdea3658a00]
7: /usr/bin/X [0x4b9878]
8: /usr/bin/X [0x4bb354]
9: /usr/bin/X [0x4bbb74]
10: /usr/bin/X(miPolyArc+0x85) [0x4bc905]
11: /usr/bin/X [0x52b3d2]
12: /usr/bin/X(ProcPolyArc+0x105) [0x4475c5]
13: /usr/bin/X(Dispatch+0x364) [0x44a104]
14: /usr/bin/X(main+0x44d) [0x43094d]
15: /lib/libc.so.6(__libc_start_main+0xe6) [0x7fdea151f5c6]
16: /usr/bin/X [0x42fd39]
Comment 8 Cyp 2009-05-13 20:09:50 UTC
I made the script to killall -SIGSEGV X, when EQ overflowing appears in the log.
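The script itself was not posted. A minimal sketch of such a watcher could look like the Python below. It assumes the log lives at /var/log/Xorg.0.log and that the server process is named X; both are assumptions, as are the function names. The killall invocation is the one quoted in comment 7.

```python
import os
import subprocess
import time

LOG_PATH = "/var/log/Xorg.0.log"  # assumed log location
MARKER = "[mi] EQ overflowing"    # message quoted in the report


def is_overflow_line(line):
    """True if a log line contains the EQ overflow message."""
    return MARKER in line


def follow(path, poll_seconds=0.5):
    """Yield lines appended to `path` from now on, similar to tail -f."""
    with open(path, "r", errors="replace") as log:
        log.seek(0, os.SEEK_END)  # skip everything already in the log
        while True:
            line = log.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)


def watch(path=LOG_PATH):
    """Send SIGSEGV to X the first time the overflow message appears."""
    for line in follow(path):
        if is_overflow_line(line):
            subprocess.run(["killall", "-SIGSEGV", "X"], check=False)
            return
```

Calling watch() from a root shell starts the loop. A core file is only written if X was started with 'ulimit -c unlimited' in effect, as mentioned in comment 7.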

I got a core dump. Afterwards, I realised that to get line numbers, I also need to build with "-g" in CFLAGS. I rebuilt with -g, hope that didn't mess up the core dump - it seems the same as it was before, just with more useful information.

Some of the parameters look a little odd - I wonder if they could be out of range somehow...

Any more information I can give that would be relevant?


(gdb) bt
#0  0x00007fcb737651e5 in raise () from /lib/libc.so.6
#1  0x00007fcb73766703 in abort () from /lib/libc.so.6
#2  0x0000000000465189 in ddxGiveUp () at xf86Init.c:1472
#3  0x00000000004f3d6d in AbortServer () at log.c:407
#4  0x00000000004f4415 in FatalError (f=0x5709a0 "Caught signal %d.  Server aborting\n") at log.c:553
#5  0x0000000000487899 in xf86SigHandler (signo=11) at xf86Events.c:593
#6  <signal handler called>
#7  0x00007fcb737b12d3 in ?? () from /lib/libc.so.6
#8  0x00007fcb737af97a in memmove () from /lib/libc.so.6
#9  0x00000000004b94c6 in newFinalSpan (y=-224046, xmin=-2147483485, xmax=162) at /usr/include/bits/string3.h:59
#10 0x00000000004b9a31 in arcSpan (y=-224111, lx=-2147483647, lw=<value optimized out>, rx=-2147483647, rw=0, def=0x3, bounds=0x7fff7e0be1c0,
    acc=0x7fff7e0be130, mask=<value optimized out>) at miarc.c:2984
#11 0x00000000004bb354 in drawArc (tarc=0x7fff7e0be360, l=1, a0=<value optimized out>, a1=<value optimized out>, right=0x0, left=0x0)
    at miarc.c:3686
#12 0x00000000004bbb74 in miArcSegment (pDraw=0x16cbfc0, pGC=0x151dcd0, tarc=
      {x = 152, y = 55, width = 20, height = 20, angle1 = 0, angle2 = 23040}, right=0x0, left=0x0) at miarc.c:312
#13 0x00000000004bc905 in miPolyArc (pDraw=0x16cbfc0, pGC=0x151dcd0, narcs=1, parcs=0x8dac434) at miarc.c:1070
#14 0x000000000052b3d2 in damagePolyArc (pDrawable=0x16cbfc0, pGC=0x151dcd0, nArcs=1, pArcs=0x8dac434) at damage.c:1239
#15 0x00000000004475c5 in ProcPolyArc (client=0xcff480) at dispatch.c:1744
#16 0x000000000044a104 in Dispatch () at dispatch.c:454
#17 0x000000000043094d in main (argc=8, argv=0x7fff7e0bec88, envp=<value optimized out>) at main.c:438
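As a rough sanity check on the odd-looking values in the backtrace above, the following Python sketch (not part of the original report) compares them with the lower limit of a 32-bit signed int. The helper names are made up for illustration; the numbers are copied from frames #9, #10 and #12. X11 arc angles are given in units of 1/64 degree, so angle2 = 23040 is simply a full circle.

```python
INT_MIN = -2**31  # smallest value of a 32-bit signed int

# Copied from frames #9, #10 and #12 of the gdb backtrace.
arc = {"x": 152, "y": 55, "width": 20, "height": 20,
       "angle1": 0, "angle2": 23040}
suspect = {
    "newFinalSpan xmin": -2147483485,
    "arcSpan lx": -2147483647,
    "arcSpan rx": -2147483647,
}


def distance_from_int_min(value):
    """How far a value lies above the smallest 32-bit signed int."""
    return value - INT_MIN


def sweep_degrees(arc):
    """Convert the X11 angle2 field (1/64 degree units) to degrees."""
    return arc["angle2"] / 64


print(sweep_degrees(arc))  # 360.0
for name, value in suspect.items():
    print(name, distance_from_int_min(value))
```

The request itself looks ordinary: a full 360 degree arc in a 20x20 box at (152, 55). The span coordinates, on the other hand, are 163 and 1 above INT_MIN. That might hint at an integer overflow or an invalid floating point to int conversion somewhere below drawArc, but that is only a guess.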
Comment 9 Cyp 2009-05-13 20:17:21 UTC
P.S. Is it normal to change the status to ASSIGNED, without changing the assignee or adding to CC, or is it a mistake?
Comment 10 Sebastian Luther (few) 2009-05-18 18:46:17 UTC
(In reply to comment #9)
> P.S. Is it normal to change the status to ASSIGNED, without changing the
> assignee or adding to CC, or is it a mistake?
> 

Bugs get assigned to bug-wranglers as long as they are considered incomplete.
Comment 11 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2009-05-18 22:06:37 UTC
Please don't put me on CC just for getting your bug worked on further...
Comment 12 Cyp 2009-05-19 09:35:27 UTC
(In reply to comment #11)
> Please don't put me on CC just for getting your bug worked on further...

Sorry, won't happen again. I thought you meant to assign yourself, which I now know was wrong, as per comment #10.
Comment 13 Cyp 2009-05-21 11:17:44 UTC
Created attachment 192027 [details]
Call graph generated by manually adding debug trace to code.

The parameter values from the core dump didn't make much sense, so I tried adding some trace to mi/miarc.c, showing which functions call which other functions, and how many times.

The data in callTree.txt comes from a _single_ call to miPolyArc, which never returned.

Something seems a bit suspect about drawQuadrant calling arcSpan 164626412 times. The call to drawQuadrant did not return, at the time of the last dump before I rebooted the computer, so I think it was stuck in an infinite loop.

In callTree.txt, all parameters are given in the form [min, last, max], where 'min' and 'max' are the smallest and largest values the parameter took across all calls to the function, and 'last' is its value at the very last call. (For functions that were only called once, the three numbers are the same.)
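The actual trace code was added in C inside mi/miarc.c and is not shown in this report. The Python sketch below only illustrates the [min, last, max] bookkeeping described above; the class name ParamTrace and the sample values are hypothetical.

```python
class ParamTrace:
    """Tracks call count and [min, last, max] for one traced parameter."""

    def __init__(self):
        self.calls = 0
        self.min = None
        self.last = None
        self.max = None

    def record(self, value):
        """Register the value the parameter had in one more call."""
        self.calls += 1
        self.last = value
        if self.min is None or value < self.min:
            self.min = value
        if self.max is None or value > self.max:
            self.max = value

    def __str__(self):
        return f"[{self.min}, {self.last}, {self.max}]"


# Hypothetical values of one parameter over three calls.
y = ParamTrace()
for value in (5, -3, 2):
    y.record(value)
print(y.calls, y)  # 3 [-3, 2, 5]
```

Keeping only three numbers and a counter per parameter is what makes it feasible to summarise millions of calls in a small text file.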

Later, I plan to dump *def, *acc and *spdata, not just the addresses of the pointers...

Hope this information helps.
Comment 14 Cyp 2009-05-23 09:32:19 UTC
Created attachment 192172 [details]
Call graph with more trace

This is getting stranger...

When miArcSegment is called, tarc.height == 20 .
When drawQuadrant is called, def->h == 0.000000 .
Before the call to drawQuadrant, there is a call to computeAcc, which unfortunately I didn't add to the call tree trace.
In computeAcc, there is the statement "def->h = ((double) tarc->height) / 2.0;".

That statement seems to have failed to execute, somehow... (Suggesting that it's a compiler problem.) I haven't found a way of reliably reproducing that failure. (Suggesting that it's not a compiler problem.)

Once def->h == 0, later calculations end up with NANs all over the place, and loops apparently become infinite.
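Why a NaN can turn a loop into an infinite one can be shown with a small Python sketch. This is a generic illustration, not the code from miarc.c, and the function name is made up. Every ordered comparison involving NaN is false, so an exit test such as "x >= limit" never fires once either side is NaN. The sketch caps the number of iterations so that it terminates. (In C, 0.0 / 0.0 quietly yields NaN; Python raises ZeroDivisionError instead, so math.nan is used directly.)

```python
import math


def steps_until_done(start, limit, step, max_steps=1000):
    """Count iterations of a loop that exits once x >= limit.

    Returns None if the exit test never fires within max_steps.
    """
    x = start
    for n in range(max_steps):
        if x >= limit:  # always False when x or limit is NaN
            return n
        x += step
    return None


print(steps_until_done(0.0, 10.0, 1.0))       # 10
print(steps_until_done(0.0, math.nan, 1.0))   # None
print(steps_until_done(math.nan, 10.0, 1.0))  # None
```

Without the max_steps cap, the last two calls would spin forever, which matches the symptom of a server that stops processing input events.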

I guess I'll test whether tracing out the values of tarc->height and def->h just before and after the statement gives anything useful.

I'm confused why I can't deterministically reproduce the freeze (suggesting that it's a threading, kernel or hardware problem), and yet I don't get random freezes/crashes anywhere except in that particular function (suggesting that it's not a threading/kernel/hardware problem).
Comment 15 Cyp 2009-05-24 20:50:01 UTC
Created attachment 192333 [details]
Yet more detailed trace...

Aaargh!

This time, now that I trace out before and after the assignment, def->h = 10.000000, which is correct. The problem has now moved to the computeBound function, where bound.ellipse.max should clearly have been assigned the value 10, but somehow gets assigned 0 instead. (def->a1 is set to 90 just before the call to computeBound.)

Does anyone have ideas on what to try, or theories about what could cause this kind of problem?

I guess I'll wait and see if it's random which value mutates into a 0, or if it depends on where I add trace... I'm not sure what else to trace out right now.
Comment 16 Rémi Cardona (RETIRED) gentoo-dev 2009-05-25 11:03:19 UTC
You might want to open a bug in FreeDesktop's bugzilla [1] to get input from upstream devs.

Please paste the url here once you've opened the bug.

Thanks

[1] https://bugs.freedesktop.org
Comment 17 Cyp 2009-05-25 16:33:30 UTC
(In reply to comment #16)
> You might want to open a bug in FreeDesktop's bugzilla [1] to get input from
> upstream devs.
> 
> Please paste the url here once you've opened the bug.
> 
> Thanks
> 
> [1] https://bugs.freedesktop.org

Thanks for the suggestion.

http://bugs.freedesktop.org/show_bug.cgi?id=21931
Comment 18 Rémi Cardona (RETIRED) gentoo-dev 2009-05-25 19:35:15 UTC
Thanks, let's track the bug upstream then, because this is beyond my knowledge of the X code.

Thanks