Bug 192048 - x11-base/xorg-server-1.3.0.0 and above crashes after update (gcc-4.2)
Summary: x11-base/xorg-server-1.3.0.0 and above crashes after update (gcc-4.2)
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High normal
Assignee: Gentoo X packagers
URL:
Whiteboard:
Keywords:
Duplicates: 193138 194056 194155 194557
Depends on:
Blocks: gcc-4.2
Reported: 2007-09-10 21:12 UTC by Rafał Mużyło
Modified: 2008-10-29 14:56 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Fix SSE test (sse.pixman.patch,596 bytes, patch)
2007-09-19 19:58 UTC, Alan Hourihane
The same patch for Xorg (sse.xorg.patch,2.05 KB, patch)
2007-09-19 19:59 UTC, Alan Hourihane

Description Rafał Mużyło 2007-09-10 21:12:02 UTC
With the recent stabilization of x11-base/xorg-server-1.3.0.0 X stopped working for me.
When I tried to use 1.4 the error is almost the same.

I'm getting a crash with a backtrace:
X(xf86SigHandler+0x7e) [0x80d674e]
[0xb7f29420]
/usr/lib/libpixman-1.so.0(pixman_image_composite+0x55c) [0xb7e680dc]
/usr/lib/xorg/modules//libfb.so(fbComposite+0x1f1) [0xb7124601]
/usr/lib/xorg/modules/drivers//nvidia_drv.so(_nv000681X+0x498) [0xb71e38f8]

However it's not a nvidia problem - I got similar effects when I tried "fbdev" and "vesa" drivers.
I don't have the error from 1.3.0.0 but as far as I remember it was about the same (without the libpixman line).
xorg-server-1.2 worked fine, but after unmasking all those packages for 1.4 I don't think downgrade is an option anymore.
Comment 1 Donnie Berkholz (RETIRED) gentoo-dev 2007-09-10 21:26:23 UTC
Please show me an error using the nv/fbdev/vesa drivers instead.
Comment 2 Rafał Mużyło 2007-09-11 04:59:48 UTC
I don't have nv built.

"fbdev" gives me:
0: X(xf86SigHandler+0x7e) [0x80d674e]
1: [0xb7ef4420]
2: /usr/lib/libpixman-1.so.0(pixman_image_composite+0x55c) [0xb7e330dc]
3: /usr/lib/xorg/modules//libfb.so(fbComposite+0x1f1) [0xb7173601]
4: X [0x816628d]
5: X(CompositePicture+0x150) [0x814d630]
6: X(miGlyphs+0x3a8) [0x814a328]
7: X [0x816609a]
8: X(CompositeGlyphs+0x9a) [0x814d7aa]
9: X [0x8155081]
10: X [0x81504e5]
11: X [0x8143c9e]
12: X(Dispatch+0x2ce) [0x808579e]
13: X(main+0x486) [0x806cce6]
14: /lib/libc.so.6(__libc_start_main+0xe0) [0xb7cc69d0]
15: X(FontFileCompleteXLFD+0xa5) [0x806c051]

"vesa" gives me:
0: X(xf86SigHandler+0x7e) [0x80d674e]
1: [0xb7f1f420]
2: /usr/lib/libpixman-1.so.0(pixman_image_composite+0x55c) [0xb7e5e0dc]
3: /usr/lib/xorg/modules//libfb.so(fbComposite+0x1f1) [0xb7255601]
4: X [0x816628d]
5: X(CompositePicture+0x150) [0x814d630]
6: X(miGlyphs+0x3a8) [0x814a328]
7: X [0x816609a]
8: X(CompositeGlyphs+0x9a) [0x814d7aa]
9: X [0x8155081]
10: X [0x81504e5]
11: X [0x8143c9e]
12: X(Dispatch+0x2ce) [0x808579e]
13: X(main+0x486) [0x806cce6]
14: /lib/libc.so.6(__libc_start_main+0xe0) [0xb7cf19d0]
15: X(FontFileCompleteXLFD+0xa5) [0x806c051]
Comment 3 Torsten Kaiser 2007-09-11 05:57:00 UTC
I have seen something similar on my ~amd64 system.
After upgrading from xorg-server-1.3.0.0 to 1.4, xdm will start up, but a few seconds after login X will die.

My backtrace:
Backtrace:
0: /usr/bin/X(xf86SigHandler+0x6a) [0x4996ba]
1: /lib/libc.so.6 [0x7f8fdf4696d0]
2: /usr/lib/libpixman-1.so.0 [0x7f8fe00551f3]
3: /usr/lib/libpixman-1.so.0 [0x7f8fe005a741]
4: /usr/lib/libpixman-1.so.0(pixman_composite_rect_general+0x39a) [0x7f8fe005c21a]
5: /usr/lib/libpixman-1.so.0 [0x7f8fe00614a3]
6: /usr/lib/libpixman-1.so.0(pixman_image_composite+0x706) [0x7f8fe0061ed6]
7: /usr/lib64/xorg/modules//libfb.so(fbComposite+0x20d) [0x7f8fdcf0360d]
8: /usr/lib64/xorg/modules//libexa.so(ExaCheckComposite+0x109) [0x7f8fdcce8b29]
9: /usr/lib64/xorg/modules//libexa.so(exaComposite+0x3fc) [0x7f8fdcce6c5c]
10: /usr/bin/X [0x538b8e]
11: /usr/bin/X [0x529976]
12: /usr/bin/X(Dispatch+0x2e8) [0x4508d8]
13: /usr/bin/X(main+0x47b) [0x4371fb]
14: /lib/libc.so.6(__libc_start_main+0xf4) [0x7f8fdf456b74]
15: /usr/bin/X(FontFileCompleteXLFD+0x259) [0x436519]

Note: I am using the masked xf86-video-ati beta drivers. (Both 6.6.193 and 6.7.192 crashed with xorg-server-1.4)
The above was a try with EXA enabled, but I first got this error with XAA. Disabling all acceleration did not fix it either.
I remerged xf86-input-mouse, xf86-input-keyboard, the ati drivers and x11-libs/pixman-0.9.5 after the 1.4 upgrade without success.

I now use xorg-server-1.3.0.0 (remerged from a saved tbz2) with the 6.6.193 driver, that is stable for me.
Comment 4 Lars T. Mikkelsen 2007-09-13 16:05:41 UTC
I experienced a similar issue with both xorg-server-1.3.0.0 and xorg-server-1.4-r1 using xf86-video-ati-6.6.3.

From xorg-server-1.3.0.0:

Backtrace:
0: /usr/bin/X(xf86SigHandler+0x81) [0x80c7eb1]
1: [0xb7f77420]
2: /usr/lib/xorg/modules//libfb.so(fbComposite+0x55a) [0xb7ab484a]
3: /usr/lib/xorg/modules//libxaa.so(XAAComposite+0x224) [0xb7a75374]
4: /usr/bin/X [0x817417f]
5: /usr/bin/X(CompositePicture+0x150) [0x815ad90]
6: /usr/bin/X [0x8161056]
7: /usr/bin/X [0x815df05]
8: /usr/bin/X [0x815134e]
9: /usr/bin/X(Dispatch+0x19f) [0x808ea9f]
10: /usr/bin/X(main+0x47b) [0x807678b]
11: /lib/libc.so.6(__libc_start_main+0xe0) [0xb7d039e0]
12: /usr/bin/X(FontFileCompleteXLFD+0x1e9) [0x8075b01]

Fatal server error:
Caught signal 4.  Server aborting

From xorg-server-1.4-r1:

Backtrace:
0: /usr/bin/X(xf86SigHandler+0x7e) [0x80c712e]
1: [0xb7fda420]
2: /usr/lib/libpixman-1.so.0(pixman_image_composite+0x55c) [0xb7e961bc]
3: /usr/lib/xorg/modules//libfb.so(fbComposite+0x1f1) [0xb7a99581]
4: /usr/lib/xorg/modules//libxaa.so(XAAComposite+0x224) [0xb7a62ae4]
5: /usr/lib/xorg/modules//libxaa.so [0xb7a7dc56]
6: /usr/bin/X [0x817059d]
7: /usr/bin/X(CompositePicture+0x150) [0x8157940]
8: /usr/bin/X [0x815d956]
9: /usr/bin/X [0x815a7f5]
10: /usr/bin/X [0x814dfae]
11: /usr/bin/X(Dispatch+0x2ce) [0x808d6de]
12: /usr/bin/X(main+0x486) [0x8074c26]
13: /lib/libc.so.6(__libc_start_main+0xe0) [0xb7cee9e0]
14: /usr/bin/X(FontFileCompleteXLFD+0x21d) [0x8073f91]

Fatal server error:
Caught signal 4.  Server aborting

Notice the SIGILL signal. I suspect this to be a bug in gcc-4.2.0, as reemerging xorg-server-1.3.0.0 with gcc-4.1.2 solved the issue with xorg-server-1.3.0.0 and likewise reemerging pixman-0.9.5 with gcc-4.1.2 solved the issue with xorg-server-1.4-r1.

Relevant system information:

$ uname -mrp
2.6.22.1 i686 AMD Athlon(tm) Processor

$ cat /proc/cpuinfo 
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 4
model name      : AMD Athlon(tm) Processor
stepping        : 4

CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=athlon-tbird -pipe"
CHOST="i686-pc-linux-gnu"
Comment 5 Rafał Mużyło 2007-09-14 05:47:08 UTC
(In reply to comment #4)
 
> Notice the SIGILL signal. I suspect this to be a bug in gcc-4.2.0, as
> reemerging xorg-server-1.3.0.0 with gcc-4.1.2 solved the issue with
> xorg-server-1.3.0.0 and likewise reemerging pixman-0.9.5 with gcc-4.1.2 solved
> the issue with xorg-server-1.4-r1.

Well, you may be onto something.
My processor is a Duron 650, gcc is 4.2.0, and the system is mostly stable with several ~x86 packages in package.keywords.
Comment 6 Rafał Mużyło 2007-09-14 10:34:47 UTC
You were right, recompiling x11-libs/pixman with gcc-4.1.2 solved it for me, too.
Comment 7 Donnie Berkholz (RETIRED) gentoo-dev 2007-09-15 00:00:46 UTC
Could you try with gcc 4.2, but add -fno-tree-vrp to your CFLAGS?
Comment 8 Lars T. Mikkelsen 2007-09-15 10:01:20 UTC
(In reply to comment #7)
> Could you try with gcc 4.2, but add -fno-tree-vrp to your CFLAGS?

I'm sorry, X still crashes with -fno-tree-vrp added to my CFLAGS. It actually crashes even when compiling pixman without optimization, i.e. CFLAGS="-march=athlon-tbird -pipe".
Comment 9 Lars T. Mikkelsen 2007-09-15 11:20:53 UTC
I investigated this a bit further. Using a core dump of the X server, I identified the illegal instruction to be 'movss', for instance used here (in libpixman-1.so):

00029d00 <fbCompositeSrc_8888RevNPx8888mmx>:
...
   29f82:       8b 55 c4                mov    -0x3c(%ebp),%edx
   29f85:       0f 6e ce                movd   %esi,%mm1
   29f88:       f3 0f 10 85 44 ff ff    movss  -0xbc(%ebp),%xmm0
   29f8f:       ff 
   29f90:       0f 6f 42 f8             movq   -0x8(%edx),%mm0

The 'movss' instruction is part of the SSE instruction set. The SSE instruction set is, however, not available in the Athlon Thunderbird (and Duron Spitfire) core. This is confirmed by my cpuflags:

$ cat /proc/cpuinfo | grep flags
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow

Adding -mno-sse to my CFLAGS solved the issue - I was able to reemerge pixman with gcc-4.2.0 and run X without crashes. I do think, however, that GCC shouldn't generate SSE instructions when using -march=athlon-tbird.
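
The cpuinfo check above can be scripted. This is only a minimal sketch, assuming a POSIX shell; the function name has_cpu_flag is made up for illustration and is not part of any tool mentioned in this bug. It matches whole words only, so that "sse" is not confused with "pse" and "mmx" is not confused with "mmxext":

```shell
# has_cpu_flag FLAG LINE
# Succeeds if FLAG appears as a whole word in LINE, where LINE is a
# "flags : ..." line as printed by /proc/cpuinfo.
# (Hypothetical helper, written for illustration only.)
has_cpu_flag() {
    # Drop everything up to the first colon, then pad with spaces so that
    # the first and last flags can be matched as whole words too.
    case " ${2#*:} " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# The flags line from the Athlon Thunderbird above.
tbird_flags='flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow'

if has_cpu_flag sse "$tbird_flags"; then
    echo "sse: present"
else
    echo "sse: missing"
fi
```

On a live system the line could be taken from /proc/cpuinfo, for example with grep '^flags' /proc/cpuinfo | head -n 1.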
Comment 10 Torsten Kaiser 2007-09-15 15:56:44 UTC
I don't know what fixed it for me.
My working 1.3.0.0-server was compiled with gcc-4.2.1, so at least on my ~amd64 system it did not miscompile it. But on the other hand, my system supports SSE.

Yesterday I tried the xorg-server-1.4-r1 again and now it works.
The only difference between the first, non-working case and the now-working one is that I emerged pixman before server 1.4 and only tried a later merge of pixman with the 6.7 beta drivers.

So remerging xorg-server-1.4-r1, pixman-0.9.5, xf86-input-mouse-1.2.2, xf86-video-ati-6.6.193 and xf86-input-keyboard-1.2.2 in that order with gcc-4.2.1 gives me a working setup.

The only toolchain-related merge between both tries was a downgrade from sys-devel/binutils-2.18.50.0.1 to sys-devel/binutils-2.18, as this ebuild now again no longer has any ~ keywords.

Beta-Binutils-Bug instead of GCC-4.2-Bug?
Comment 11 Lars T. Mikkelsen 2007-09-15 17:08:26 UTC
I'm no longer sure whether this is a bug in gcc-4.2.0 or pixman-0.9.5. The pixman configure script checks for "MMX/SSE intrinsics in the compiler". This test succeeds on my system and as a result pixman/pixman-mmx.c is compiled with the MMX_CFLAGS - that contain -msse. As mentioned gcc-4.2.0 generates the 'movss' instruction while gcc-4.1.2 doesn't. I think _mm_cvtsi32_si64() is the function that generates the 'movss' instructions.

In xorg-server-1.3.0.0 the issue is similar, although the file in question is fb/fbmmx.c.
Comment 12 Lars T. Mikkelsen 2007-09-15 18:16:11 UTC
Okay, I think the issue can be summarized as follows. 

$ cat > conftest.c << EOF
#include <mmintrin.h>
int main () {
    __m64 v = _mm_cvtsi32_si64(1);
    return 0;
}
EOF

$ /usr/i686-pc-linux-gnu/gcc-bin/4.1.2/gcc -msse -march=athlon conftest.c
$ ./a.out
$

$ /usr/i686-pc-linux-gnu/gcc-bin/4.2.0/gcc -march=athlon conftest.c
$ ./a.out
$

$ /usr/i686-pc-linux-gnu/gcc-bin/4.2.0/gcc -msse -march=athlon conftest.c
$ ./a.out
Illegal instruction
$

If -march=athlon should take precedence over -msse, then it's a bug in gcc-4.2.0, if not, then it's a bug in the pixman-0.9.5 configure script.
Comment 13 Ryan Hill (RETIRED) gentoo-dev 2007-09-16 04:39:08 UTC
i would think -march=athlon implies -mno-sse.  i've sent a mail upstream asking them.
Comment 14 Ryan Hill (RETIRED) gentoo-dev 2007-09-16 20:03:49 UTC
the reply:

No.  -march should only override previous -march options.  Doing
anything else is complicated and error-prone.

looks like it's pixman that's at fault.
Comment 15 Rémi Cardona (RETIRED) gentoo-dev 2007-09-18 21:49:08 UTC
Confirming this bug on my trusty duron box ... any news from upstream on this? (I haven't seen anything on xorg-devel about this lately)
Comment 16 Alan Hourihane 2007-09-18 21:55:13 UTC
For those with this problem can you try editing pixman's configure.ac and change the line that says....

CFLAGS="$CFLAGS $MMX_CFLAGS"

to

CFLAGS="$MMX_CFLAGS $CFLAGS"

does that fix the ordering problem ??
Comment 17 Lars T. Mikkelsen 2007-09-19 08:20:51 UTC
(In reply to comment #15)
> Confirming this bug on my trusty duron box ... any news from upstream on this?
> (I haven't seen anything on xorg-devel about this lately)

By upstream, I think Ryan was referring to GCC:

http://gcc.gnu.org/ml/gcc-help/2007-09/msg00185.html
Comment 18 Lars T. Mikkelsen 2007-09-19 08:31:13 UTC
(In reply to comment #16)
> For those with this problem can you try editing pixman's configure.ac and
> change the line that says....
> 
> CFLAGS="$CFLAGS $MMX_CFLAGS"
> 
> to
> 
> CFLAGS="$MMX_CFLAGS $CFLAGS"
> 
> does that fix the ordering problem ??

I'm sorry, the change doesn't fix the problem. I think the CFLAGS in configure.ac are only affecting the configure test though. When compiling pixman-mmx.c the MMX_CFLAGS were already preceding the CFLAGS.
Comment 19 Donnie Berkholz (RETIRED) gentoo-dev 2007-09-19 08:53:31 UTC
(In reply to comment #18)
> (In reply to comment #16)
> > For those with this problem can you try editing pixman's configure.ac and
> > change the line that says....
> > 
> > CFLAGS="$CFLAGS $MMX_CFLAGS"
> > 
> > to
> > 
> > CFLAGS="$MMX_CFLAGS $CFLAGS"
> > 
> > does that fix the ordering problem ??
> 
> I'm sorry, the change doesn't fix the problem. I think the CFLAGS in
> configure.ac are only affecting the configure test though. When compiling
> pixman-mmx.c the MMX_CFLAGS were already preceding the CFLAGS.

Did you run autoreconf after making the change, then rerun configure?
Comment 20 Lars T. Mikkelsen 2007-09-19 09:01:50 UTC
(In reply to comment #19)
> Did you run autoreconf after making the change, then rerun configure?

Yes. Here's the output from config.log:

configure:19815: checking For MMX/SSE intrinsics in the compiler
configure:19838: gcc -c -mmmx -msse -Winline --param inline-unit-growth=10000 --param large-function-growth=10000 -O2 -march=athlon -pipe -Wall  conftest.c >&5
configure:19844: $? = 0
configure:19859: result: yes
Comment 21 Alan Hourihane 2007-09-19 09:12:21 UTC
for this are you using gcc 4.1.2 or 4.2.0 ?
Comment 22 Lars T. Mikkelsen 2007-09-19 09:35:01 UTC
(In reply to comment #21)
> for this are you using gcc 4.1.2 or 4.2.0 ?

I'm using 4.2.0. Note that the conftest compiles with 4.2.0, but it doesn't actually run - and I don't think the configure script checks whether the a.out runs or not.
Comment 23 Alan Hourihane 2007-09-19 09:43:19 UTC
ah. that now makes sense. i'll look into that.
Comment 24 Alan Hourihane 2007-09-19 10:01:25 UTC
O.k, so change the line that says....

AC_COMPILE_IFELSE

to 

AC_RUN_IFELSE

that should make it execute conftest. Does that work now ?
Comment 25 Alan Hourihane 2007-09-19 10:03:25 UTC
Mmm. But I'm actually thinking we could really do with a runtime test in the code. 

It's probably unacceptable for other distros to have this in the upstream code, as they may build on an architecture that doesn't support SSE but want the result to run on those that do.
Comment 26 Alan Hourihane 2007-09-19 10:09:32 UTC
so what does /proc/cpuinfo show on athlon ???
Comment 27 Lars T. Mikkelsen 2007-09-19 10:32:26 UTC
(In reply to comment #24)
> O.k, so change the line that says....
> 
> AC_COMPILE_IFELSE
> 
> to 
> 
> AC_RUN_IFELSE
> 
> that should make it execute conftest. Does that work now ?

Yes, with CFLAGS="-march=athlon -pipe" it works. The conftest fails and the final libpixman-1.so doesn't contain the movss instruction. But to make matters worse, it doesn't work with -O2 in my CFLAGS - I noticed this earlier, but forgot to mention it... the -O2 seems to optimize the conftest to do basically nothing and thus succeed.
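
Whether a built library still contains SSE code can be checked from its disassembly rather than by waiting for a crash. A rough sketch, assuming objdump's default AT&T syntax as in comment #9; the function name count_sse_insns is hypothetical. It counts lines that reference an %xmm register, which is what movss uses, while plain MMX code only touches %mm registers:

```shell
# count_sse_insns
# Reads "objdump -d" output on stdin and prints the number of lines that
# reference an %xmm register. (Hypothetical helper, for illustration only.)
count_sse_insns() {
    # grep -c exits non-zero when the count is 0; "|| :" keeps that from
    # being treated as a failure by the caller.
    grep -c '%xmm[0-9]' || :
}

# Three lines of the disassembly quoted in comment #9.
sample='   29f85:       0f 6e ce                movd   %esi,%mm1
   29f88:       f3 0f 10 85 44 ff ff    movss  -0xbc(%ebp),%xmm0
   29f90:       0f 6f 42 f8             movq   -0x8(%edx),%mm0'

printf '%s\n' "$sample" | count_sse_insns
```

Against a real build this would be run as objdump -d /usr/lib/libpixman-1.so.0 | count_sse_insns. A non-zero count is only a problem on CPUs without SSE; on amd64 it is expected.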
Comment 28 Lars T. Mikkelsen 2007-09-19 10:33:19 UTC
(In reply to comment #26)
> so what does /proc/cpuinfo show on athlon ???

$ cat /proc/cpuinfo 
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 4
model name      : AMD Athlon(tm) Processor
stepping        : 4
cpu MHz         : 1333.035
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips        : 2668.79
clflush size    : 32
Comment 29 Alan Hourihane 2007-09-19 19:19:18 UTC
So after this line....

CFLAGS="$MMX_CFLAGS $CFLAGS"

Can you then insert this additional line.

CFLAGS=`echo $CFLAGS | sed 's/-O[[0-9]]//'`

that should turn the -O2 into just -O, but will actually turn any -OX into -O
Comment 30 Alan Hourihane 2007-09-19 19:37:46 UTC
ugh, late night for me. What I meant to say is that it will remove any -O flags and turn off optimization completely. It obviously won't have any -O on the compile line so it should work now.
Comment 31 Alan Hourihane 2007-09-19 19:58:19 UTC
Created attachment 131331 [details, diff]
Fix SSE test

This patch should fix the pixman SSE detection troubles.
Comment 32 Alan Hourihane 2007-09-19 19:59:08 UTC
Created attachment 131333 [details, diff]
The same patch for Xorg 

The same patch for Xorg as it relies on MMX/SSE for fb/fbmmx.c too, and others seem to be hitting this along with pixman.
Comment 33 Alan Hourihane 2007-09-19 20:47:45 UTC
Just spoke with the author of pixman and he can't remember why he added -msse now as the code just uses -mmmx. So......

The other quick solution is to remove -msse from the CFLAGS which should avoid the troubles too in both xserver 1.3 and pixman.

Comment 34 Lars T. Mikkelsen 2007-09-19 20:50:12 UTC
(In reply to comment #31)
> Created an attachment (id=131331) [edit]
> Fix SSE test
> 
> This patch should fix the pixman SSE detection troubles.

The unfortunate thing about this patch is that it will also completely disable the use of MMX in pixman on a system that supports MMX but not SSE (like the Athlon Thunderbird) - and as you mentioned, a binary built on a system that doesn't support SSE won't be able to use MMX/SSE on a system that does support it.

I'm currently investigating the issue further. Pixman does actually have a runtime test for MMX support in pixman-pict.c. The intention of the conftest seems to be to check whether the compiler (and not the architecture) supports the MMX extensions. The gcc man page states that "Applications which perform runtime CPU detection must compile separate files for each supported architecture, using the appropriate flags." and I think that's the issue. Pixman-mmx.c is only compiled for an MMX/SSE architecture and not for an MMX-only architecture.
Comment 35 Lars T. Mikkelsen 2007-09-19 20:53:50 UTC
(In reply to comment #33)
> The other quick solution is to remove -msse from the CFLAGS which should avoid
> the troubles too in both xserver 1.3 and pixman.

I think that will be a better solution. Pixman-mmx.c does contain some SSE-only optimizations, but I think they will be enabled on amd64/x86-64 architectures regardless of the -msse flag.
Comment 36 Alan Hourihane 2007-09-19 21:06:55 UTC
(In reply to comment #34)
> (In reply to comment #31)
> > Created an attachment (id=131331) [edit]
> > Fix SSE test
> > 
> > This patch should fix the pixman SSE detection troubles.
> 
> The unfortunate thing about this patch is, that it will also completely disable
> the use of MMX in pixman on a system that supports MMX but not SSE (like the
> Athlon Thunderbird) - and as you mentioned, a binary build on a system that
> doesn't support SSE won't be able to use MMX/SSE on a system that does support
> it.

Exactly my point, which is why the patches are not being committed upstream.
Comment 37 Alan Hourihane 2007-09-19 21:08:30 UTC
(In reply to comment #35)
> (In reply to comment #33)
> > The other quick solution is to remove -msse from the CFLAGS which should avoid
> > the troubles too in both xserver 1.3 and pixman.
> 
> I think that will be a better solution. Pixman-mmx.c does contain some SSE-only
> optimizations, but I think they will be enabled on amd64/x86-64 architectures
> regardless of the -msse flag.

Right. 

I agree that removing -msse is the best course of action.
Comment 38 Eugene St Leger 2007-09-20 13:03:04 UTC
*** Bug 193138 has been marked as a duplicate of this bug. ***
Comment 39 Erik 2007-09-21 00:10:50 UTC
(In reply to comment #29)
> CFLAGS=`echo $CFLAGS | sed 's/-O[[0-9]]//'`
> 
> that should turn the -O2 into just -O, but will actually turn any -OX into -O
> 

I suppose you meant this:
sed "s/-O\(s\|[0-9]*\)//"
Comment 40 Alan Hourihane 2007-09-21 00:25:28 UTC
No, I did mean what I said in the sed script, but my comment confused this.

The sed command will remove any -Ox and will not replace it, therefore compiling with no optimizations.

See comment #30
Comment 41 James Brown 2007-09-21 00:40:35 UTC
Possible note of interest: I am getting the same crash on my ~amd64 system (which does indeed support SSE and SSE2). The crash is intermittent, but the backtrace is the same. Thoughts?
Comment 42 Erik 2007-09-21 01:05:44 UTC
(In reply to comment #40)
> No, I did mean what I said in the sed script, but my comment confused this.
> 
> The sed command with remove any -Ox and will not replace it, therefore
> compiling with no optimizations.

Except that your sed command did not remove "-O", which is equivalent to "-O1". It did not remove "-Os" either. And it replaced "-O89" (which is equivalent to "-O3") with "9".
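
The three cases named here are easy to reproduce in a plain shell, where the pattern is written [0-9] (the doubled brackets in configure.ac are there for m4 quoting). As a sketch only, a word-by-word filter avoids all three; the function name strip_opt_flags is hypothetical and nothing like it was committed to pixman:

```shell
# Behaviour of the sed expression from comment #29 on the cases above.
echo "-O89 -pipe" | sed 's/-O[0-9]//'   # prints "9 -pipe"
echo "-Os -pipe"  | sed 's/-O[0-9]//'   # prints "-Os -pipe"
echo "-O -pipe"   | sed 's/-O[0-9]//'   # prints "-O -pipe"

# strip_opt_flags "FLAGS..."
# Prints FLAGS with every -O, -Os and -O<digits> word removed.
# (Hypothetical helper, for illustration only.)
strip_opt_flags() {
    out=
    # $1 is deliberately unquoted so it splits into individual flags.
    for flag in $1; do
        case $flag in
            -O|-Os|-O[0-9]*) ;;                    # drop optimization levels
            *) out="$out${out:+ }$flag" ;;         # keep everything else
        esac
    done
    printf '%s\n' "$out"
}

strip_opt_flags "-O89 -Os -O -march=athlon-tbird -pipe"
```

Filtering whole words rather than substrings is the design choice that matters here: it cannot leave a stray "9" behind, and it cannot alter a flag that merely contains "-O" somewhere inside it.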
Comment 43 Eugene St Leger 2007-09-21 01:18:36 UTC
In reply to comment #41:
Do you get the signal 4 which is SIGILL, and means 'illegal instruction'?
Comment 44 James Brown 2007-09-21 01:25:58 UTC
(In reply to comment #43)
> In reply to comment #41:
> Do you get the signal 4 which is SIGILL, and means 'illegal instruction'?
> 

Actually, you're right, at least for my most recent crash:

Backtrace:
0: /usr/bin/X(xf86SigHandler+0x6a) [0x48bb3a]
1: /lib/libc.so.6 [0x2b2b414b06d0]
2: /usr/lib/libpixman-1.so.0 [0x2b2b407906d3]
3: /usr/lib/libpixman-1.so.0 [0x2b2b40796402]
4: /usr/lib/libpixman-1.so.0(pixman_composite_rect_general+0x38c) [0x2b2b407945fc]
5: /usr/lib/libpixman-1.so.0 [0x2b2b4079ca93]
6: /usr/lib/libpixman-1.so.0(pixman_image_composite+0x68d) [0x2b2b4079c23d]
7: /usr/lib64/xorg/modules//libfb.so(fbComposite+0x20d) [0x2b2b43b06c6d]
8: /usr/lib64/xorg/modules/drivers//nvidia_drv.so(_nv000848X+0x413) [0x2b2b435150b3]

Fatal server error:
Caught signal 11.  Server aborting

Does this look like the same bug, or should I start a new bug?
Comment 45 Eugene St Leger 2007-09-21 05:05:53 UTC
(In reply to comment #44)
> (In reply to comment #43)
> > In reply to comment #41:
> > Do you get the signal 4 which is SIGILL, and means 'illegal instruction'?
> > 
> 
> Actually, you're right, at least for my most recent crash:
> 
> Backtrace:
> 0: /usr/bin/X(xf86SigHandler+0x6a) [0x48bb3a]
> 1: /lib/libc.so.6 [0x2b2b414b06d0]
> 2: /usr/lib/libpixman-1.so.0 [0x2b2b407906d3]
> 3: /usr/lib/libpixman-1.so.0 [0x2b2b40796402]
> 4: /usr/lib/libpixman-1.so.0(pixman_composite_rect_general+0x38c)
> [0x2b2b407945fc]
> 5: /usr/lib/libpixman-1.so.0 [0x2b2b4079ca93]
> 6: /usr/lib/libpixman-1.so.0(pixman_image_composite+0x68d) [0x2b2b4079c23d]
> 7: /usr/lib64/xorg/modules//libfb.so(fbComposite+0x20d) [0x2b2b43b06c6d]
> 8: /usr/lib64/xorg/modules/drivers//nvidia_drv.so(_nv000848X+0x413)
> [0x2b2b435150b3]
> 
> Fatal server error:
> Caught signal 11.  Server aborting
> 
> Does this look like the same bug, or should I start a new bug?
> 

I don't know.  Comment #3 also mentions ~amd64.  Perhaps a few bugs are being confused as one.

I noticed 'nvidia_drv.so'.  If you haven't already, try the VESA and nv (open source nvidia) drivers.
Comment 46 Rémi Cardona (RETIRED) gentoo-dev 2007-09-21 05:39:49 UTC
That last line of the backtrace is inside nvidia's driver, not the X server. That looks like a whole different bug. I'd suggest opening a new one.
Comment 47 Alan Hourihane 2007-09-21 08:10:43 UTC
(In reply to comment #42)
> (In reply to comment #40)
> > No, I did mean what I said in the sed script, but my comment confused this.
> > 
> > The sed command with remove any -Ox and will not replace it, therefore
> > compiling with no optimizations.
> 
> Except that your sed command did not remove "-O", which is equivalent to "-O1".
> It did not remove "-Os" either. And it replaced "-O89" (which is equivalent to
> "-O3") with "9".

Good point. But it's a little academic now. The best solution is to remove -msse and forget the patches.
Comment 48 James Brown 2007-09-21 08:30:45 UTC
Sorry for bothering everybody. You can ignore my report -- it turned out that avant-window-navigator was causing the crash. I know, weird. But when it bounced an icon (which it does when the title of a window changes), it caused that segfault. A recompile of a-w-n fixed the problem. Which is doubly odd because awn doesn't link in any of the pixman libs. But whatever.
Comment 49 Torsten Kaiser 2007-09-22 15:25:19 UTC
(In reply to comment #45)
>I don't know.  Comment #3 also mentions ~amd64.  Perhaps a few bugs are being
>confused as one.

See Comment #10.

I think my problem was sys-devel/binutils-2.18.50.0.1 . This package was keyworded ~amd64 for a short time. After that keyword was removed and I remerged these packages, it now works for me.
(But still using only xf86-video-ati-6.6.193 until I figure out how to switch from MergedFB to RandR1.2 without losing acceleration on both displays. ;) )
Comment 50 Alan Hourihane 2007-09-24 09:47:29 UTC
Donnie - just a note to remove -msse from pixman's configure.ac for xorg 1.4 and -msse in fb/Makefile.am in xorg 1.3
Comment 51 Alan Hourihane 2007-09-24 09:49:17 UTC
Oh, the patches (I guess) fit gentoo's model of building for the target host, but these won't go upstream. 

So the choice is either use the patches, or remove -msse.

The first patch is for pixman for xorg 1.4, the second is for xorg 1.3.
Comment 52 Alan Hourihane 2007-09-24 09:51:24 UTC
To add further: the patches would also disable MMX on systems that don't have SSE.
Comment 53 Alan Hourihane 2007-09-24 09:52:59 UTC
I've obsoleted the patches, I really don't like them. Removing -msse is best.

I'll upload patches for those shortly if you don't beat me to it.
Comment 54 Donnie Berkholz (RETIRED) gentoo-dev 2007-09-24 09:53:42 UTC
(In reply to comment #51)
> Oh, the patches (I guess) fit gentoo's model of building for the target host,
> but these won't go upstream. 

We do try to support cross-compiling. Although it's not working great with X at the moment, I'd prefer to not break it any more than it is already. As a general philosophy, I also have an extreme dislike for adding any patches that cannot go upstream. I'm even averse to adding patches that haven't already been committed upstream.
Comment 55 Alan Hourihane 2007-09-24 10:24:29 UTC
O.k. If there's no problem waiting a little longer for an upstream fix, let's do that. 
Comment 56 James 2007-09-24 20:04:25 UTC
(In reply to comment #55)
> O.k. If there's no problem waiting a little longer for an upstream fix, let's
> do that. 
> 

I could use a fix sooner, if possible.  I authored bug 190914, (https://bugs.gentoo.org/show_bug.cgi?id=190914) which seems similar to this issue, I would like to try your solution, but I'm not certain what I should do to test it. Could someone please explain in simple terms the fix here?  Thanks
James

Comment 57 Lars T. Mikkelsen 2007-09-24 20:14:12 UTC
(In reply to comment #56)
> I could use a fix sooner, if possible.  I authored bug 190914,
> (https://bugs.gentoo.org/show_bug.cgi?id=190914) which seems similar to this
> issue, I would like to try your solution, but I'm not certain what I should do
> to test it. Could someone please explain in simple terms the fix here?  Thanks
> James

James, the easiest workaround right now is to add "-mno-sse" to your CFLAGS in /etc/make.conf and reemerge xorg-server (as you're using xorg-server-1.3.0.0). This should override the -msse flag that causes the crash (at least in this bug).
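
Why the workaround helps can be pictured with a deliberately simplified model of the compile line. It assumes two things taken from this thread: the build puts its own -mmmx -msse before the user's CFLAGS (comments #18 and #20), and -march does not reset -msse (the GCC reply in comment #14). It further assumes that the last of -msse / -mno-sse on the line wins. The function name sse_state is hypothetical, and this is not how GCC parses options internally; related flags such as -msse2 are ignored:

```shell
# sse_state "FLAGS..."
# Prints "on", "off" or "default" depending on the last -msse / -mno-sse
# word in FLAGS. (Simplified, hypothetical model for illustration only.)
sse_state() {
    state=default
    # $1 is deliberately unquoted so it splits into individual flags.
    for flag in $1; do
        case $flag in
            -msse)    state=on ;;
            -mno-sse) state=off ;;
        esac
    done
    printf '%s\n' "$state"
}

build_flags="-mmmx -msse"              # added by the build system
user_flags="-O2 -march=athlon -pipe"   # from /etc/make.conf

sse_state "$build_flags $user_flags"            # prints "on"
sse_state "$build_flags $user_flags -mno-sse"   # prints "off"
```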
Comment 58 Donnie Berkholz (RETIRED) gentoo-dev 2007-09-25 00:59:53 UTC
One thing we could start trying, if anyone wants to spend the time on it, is adding fixes that are unsuitable for the main tree into the x11 overlay instead. (Of course this relies on the overlay getting fixed, again).
Comment 59 Stefan de Konink 2007-09-28 00:24:48 UTC
Today I hit this bug on my AMD64. After upgrading some packages (including GTK+), the upgraded evince (PDF document viewer) triggers, after some browsing, the same pixman error already mentioned here.
Comment 60 James Brown 2007-09-28 00:42:00 UTC
(In reply to comment #59)
> Today I hit this bug on my AMD64 after having upgraded some packages (including
> GTK+) the upgraded package evince (PDF document viewer) results (after some
> browsing) in the same pixman error as already mentioned here.
> 

Actually, I'm now getting the same thing with evince 2.20/poppler 0.6 any time I resize the poppler window. A downgrade solves the problem. Since I've seen the same error with multiple (presumably) unrelated software packages now, I'm thinking this might be an X problem after all.

(for recap, I'm running ~amd64 on an Athlon 64 X2, so I most definitely have SSE support...)
Comment 61 Stefan de Konink 2007-09-28 13:09:00 UTC
(In reply to comment #60)
> (In reply to comment #59)
> > Today I hit this bug on my AMD64 after having upgraded some packages (including
> > GTK+) the upgraded package evince (PDF document viewer) results (after some
> > browsing) in the same pixman error as already mentioned here.
> > 
> 
> Actually, I'm now getting the same thing with evince 2.20/poppler 0.6 any time
> I resize the poppler window. A downgrade solves the problem. Since I've seen
> the same error with multiple (presumably) unrelated software packages now, I'm
> thinking this might be an X problem after all.
> 
> (for recap, I'm running ~amd64 of an Athlon 64 X2, so I most definitely have
> SSE support...)

EXACTLY the same for me!

This error on horizontal resize:

** (xfwm4:24896): WARNING **: The display does not support the XRandr extension.
picked i=0, /home/skinkie/Wodkaland/innabg.png
** Message: Orage **: Default timezone set to floating. Do not use timezones when setting appointments, it does not make sense without proper local timezone.
** Message: Orage **: Build alarm list: Added 0 alarms. Processed 1 events.
** Message: Orage **:   Found 0 alarms of which 0 are active. (Searched 0 recurring alarms.)

Backtrace:
0: X(xf86SigHandler+0x6a) [0x48371a]
1: /lib/libc.so.6 [0x2b7dd2b806d0]
2: /usr/lib/libpixman-1.so.0 [0x2b7dd2158b83]
3: /usr/lib/libpixman-1.so.0 [0x2b7dd215e8a2]
4: /usr/lib/libpixman-1.so.0(pixman_composite_rect_general+0x38c) [0x2b7dd215ca9c]
5: /usr/lib/libpixman-1.so.0 [0x2b7dd2164dd3]
6: /usr/lib/libpixman-1.so.0(pixman_image_composite+0x6de) [0x2b7dd216418e]
7: /usr/lib64/xorg/modules//libfb.so(fbComposite+0x20d) [0x2b7dd54db3bd]
8: /usr/lib64/xorg/modules/drivers//nvidia_drv.so(_nv000848X+0x413) [0x2b7dd4ce40b3]


Comment 62 Jakub Moc (RETIRED) gentoo-dev 2007-09-29 21:46:04 UTC
*** Bug 194155 has been marked as a duplicate of this bug. ***
Comment 63 Lars T. Mikkelsen 2007-09-29 22:49:02 UTC
(In reply to comment #62)
> *** Bug 194155 has been marked as a duplicate of this bug. ***

As Eugene mentioned in comment #45, I think two or more bugs are being confused as one.

The bug reported by the original reporter, me (comment #4), Rémi (comment #15), and Eugene (bug 193138) occurs on early Athlon/Duron cores without SSE support. The X server aborts with signal 4 (SIGILL). We identified the -msse flag to be the issue.

The bug (bugs?) reported by Torsten (comment #3), James B (comment #44), Stefan (comment #61), Jakub (bug 194155) occurs on ~amd64 that does have SSE support. Although the backtrace is very similar to the -msse bug, the X server aborts with signal 11 (SIGSEGV). It thus seems likely that this is a different issue. Torsten suggested that binutils-2.18.50.0.1 could be the problem (comment #49).

The bug reported by James (bug 190914) also occurs on an early Athlon and the X server aborts with signal 4. The backtrace does not contain any reference to pixman, but it seems likely to be the -msse bug.

Perhaps bug 194155 should be reopened and investigated as a separate issue.
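
The distinction drawn in this comment can be sketched as a small triage helper. This is only an illustration, assuming Linux signal numbering (4 = SIGILL, 11 = SIGSEGV); the function name `triage_x_crash` is hypothetical, and the labels simply restate the two crash families described above rather than diagnose anything.

```python
import signal


def triage_x_crash(signal_number: int) -> str:
    """Map the signal that killed X to one of the crash families in this bug.

    Hypothetical helper: the returned labels only restate the thread's
    working theory, they are not a diagnosis.
    """
    if signal_number == signal.SIGILL:
        # Signal 4: illegal instruction, e.g. SSE code on a CPU without SSE.
        return "SIGILL: likely the -msse build issue on a pre-SSE CPU"
    if signal_number == signal.SIGSEGV:
        # Signal 11: segmentation fault, seen on ~amd64 machines with SSE.
        return "SIGSEGV: likely a separate issue"
    return "other: not covered by this bug"
```

For example, `triage_x_crash(4)` selects the -msse family, while `triage_x_crash(11)` points at the separate ~amd64 crash.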
Comment 64 Donnie Berkholz (RETIRED) gentoo-dev 2007-10-01 05:49:22 UTC
*** Bug 194056 has been marked as a duplicate of this bug. ***
Comment 65 Jakub Moc (RETIRED) gentoo-dev 2007-10-02 21:37:21 UTC
*** Bug 194557 has been marked as a duplicate of this bug. ***
Comment 66 Jakub Moc (RETIRED) gentoo-dev 2007-10-02 21:46:42 UTC
*** Bug 194557 has been marked as a duplicate of this bug. ***
Comment 67 James Brown 2007-10-02 21:55:31 UTC
Okay, since apparently it's too confusing to have multiple bugs for different crashes, let's summarize:

xorg-server-1.3 and higher crashes on pre-SSE Athlons all the time because it's built with SSE even though it's not supposed to be. X sends a Signal 4 in this case.

xorg-server 1.4 only crashes on Athlon X2's (which have SSE) when certain applications (such as evince 2.20) are opened, for reasons which are as of yet a mystery. X sends a Signal 11 in this case.

Clearly, these are the same bug...
Comment 68 M. Edward Borasky 2007-10-03 02:11:02 UTC
(In reply to comment #67)
> Okay, since apparently it's too confusing to have multiple bugs for different
> crashes, let's summarize:
> 
> xorg-server-1.3 and higher crashes on pre-SSE Athlons all the time because it's
> built with SSE even though it's not supposed to be. X sends a Signal 4 in this
> case.
> 
> xorg-server 1.4 only crashes on Athlon X2's (which have SSE) when certain
> applications (such as evince 2.20) are opened, for reasons which are as of yet
> a mystery. X sends a Signal 11 in this case.
> 
> Clearly, these are the same bug...

How are two different signals (4 and 11) the same bug? Does the "-mno-sse" flag get rid of the signal 11 case? I have an Athlon TBird, an Athlon XP and an Athlon64 X2. All three are now happily running xorg-server 1.4 and *only* the TBird required "-mno-sse". Moreover, I have never seen a signal 11.

Since I have an Athlon64 X2 running ~amd64, perhaps someone can give me instructions for reproducing the signal 11 case.

Comment 69 James Brown 2007-10-03 02:42:20 UTC
I was being sarcastic. They're probably not the same bug. The signal 11 is the one that several people have reported for evince 2.20, which you can read all about in bug #194557 (which was closed because it's supposedly a duplicate of this bug). The -mno-sse fix only applies to the bug that affects Athlon Thunderbird and older Duron processors.
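
Whether a given machine falls into the -mno-sse case can be checked from the `flags` line of /proc/cpuinfo. The following is a minimal sketch, assuming the usual Linux x86 cpuinfo layout; the helper name `cpu_has_flag` is hypothetical.

```python
def cpu_has_flag(cpuinfo_text: str, flag: str = "sse") -> bool:
    """Return True if `flag` appears as a whole word on the first 'flags' line.

    Assumes the Linux x86 /proc/cpuinfo layout ("flags : fpu vme ...").
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Split into whole words so that "sse2" does not count as "sse".
            return flag in value.split()
    return False
```

On a running system it could be called as `cpu_has_flag(open("/proc/cpuinfo").read())`; a False result would suggest the CPU belongs to the pre-SSE group discussed here.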
Comment 70 Eugene St Leger 2007-10-03 09:08:19 UTC
Regarding the signal 11 bug, I keep seeing 'nvidia_drv.so' in the crash logs.  Have you guys tried nv (open source driver) or vesa?  Do any non-nvidia users get the signal 11?
Comment 71 Tobias Klausmann (RETIRED) gentoo-dev 2007-10-03 14:55:19 UTC
I've got the following setup: amd64 box, gcc 4.2.0 and xorg-server 1.4-r2.

Opening a particular PDF file makes the X server die with a backtrace similar to the one in comment #44.

So I tried Eugene's suggestion of using the nv driver (as opposed to the binary lump by nvidia). Voila, no crash anymore. This (to me) proves it's a bug in the nvidia driver.

Unfortunately, there is only one version of said driver that works with recent xorg-x11. So I can now choose between looking at PDFs and using OpenGL.

Bottom line: the evince-kills-X11 crash is a different bug from the SSE one. They may be related, but I doubt it.
Comment 72 Giacomo Graziosi 2007-10-04 10:02:04 UTC
Same problem here with x11-base/xorg-server-1.4-r2 and app-text/evince-2.20.0.
I'm on a Core2 CPU T7200 @ 2.00GHz and Gentoo AMD64.
Comment 73 Giacomo Graziosi 2007-10-04 10:04:35 UTC
No NVIDIA GPU here:

# lspci
00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub (rev 03)
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03)
00:07.0 Performance counters: Intel Corporation Unknown device 27a3 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
00:1f.0 ISA bridge: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 02)
01:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 22)
02:00.0 Network controller: Atheros Communications, Inc. AR5418 802.11a/b/g/n Wireless PCI Express Adapter (rev 01)
03:03.0 FireWire (IEEE 1394): Agere Systems FW323 (rev 61)
Comment 74 Eugene St Leger 2007-10-04 11:40:30 UTC
(In reply to comment #73)
Are you using a binary driver though (as opposed to an open-source one)?  Try vesa, or an open-source graphics driver for your Intel graphics chipset (if one even exists).

Perhaps Xorg has broken the ABI (Application Binary Interface), so all the binary drivers are broken for now.
Comment 75 Giacomo Graziosi 2007-10-07 17:19:30 UTC
I am using the open driver (x11-drivers/xf86-video-i810).
Comment 76 Eugene St Leger 2007-10-08 15:43:59 UTC
(In reply to comment #75)
> I am using the open driver (x11-drivers/xf86-video-i810).
> 

Oh that seems to be the open-source driver.  Have you tried reemerging it?
Comment 77 Giacomo Graziosi 2007-10-12 21:52:19 UTC
(In reply to comment #76)
> (In reply to comment #75)
> > I am using the open driver (x11-drivers/xf86-video-i810).
> > 
> 
> Oh that seems to be the open-source driver.  Have you tried reemerging it?
> 

Just tried, it doesn't work.
Comment 78 Giacomo Graziosi 2007-10-12 21:55:59 UTC
app-text/epdfview-0.1.6-r1 works fine.
Comment 79 James Brown 2007-10-18 19:45:25 UTC
At least for me, this issue has been completely fixed through the combination of pixman-0.9.5-r1, poppler-0.6.1, and evince-2.20.1. Woo!
Comment 80 Tobias Klausmann (RETIRED) gentoo-dev 2007-10-18 20:47:48 UTC
Re Comment #79: Same here.

I only needed to update pixman and evince. Since X11 still crashed right after the update (before X11 was restarted) and no longer crashed after a restart, I suspect pixman to be the culprit, not evince. But that's really just a guess. I suspect this difference between 0.9.5 and 0.9.5-r1:

PATCHES="${FILESDIR}/${PV}-pixman-compose-fix.patch"

Anyway, thanks to whoever fixed it.
Comment 81 Mark Wagner 2007-11-02 02:54:49 UTC
It's still broken for x11-base/xorg-server-1.3.0.0-r2.
Comment 82 Account removed 2007-11-05 17:51:46 UTC
(In reply to comment #79)
> At least for me, this issue has been completely fixed through the combination
> of pixman-0.9.5-r1, poppler-0.6.1, and evince-2.20.1. Woo!
> 

Wow! This really seems to have fixed the problems here too :)
Thank you very much! :)
Comment 83 Stefan de Konink 2007-11-05 18:06:35 UTC
Sure, but does this also fix the potential vulnerability in X?
Comment 84 Rafał Mużyło 2007-11-12 16:27:19 UTC
The original issue with pixman is solved in pixman 0.9.6 (upstream split the MMX and SSE tests).
Comment 85 Jakub Moc (RETIRED) gentoo-dev 2008-03-16 14:16:36 UTC
Nothing for ~4 months, perhaps this should be just closed...
Comment 86 Stefan de Konink 2008-03-16 14:39:13 UTC
(In reply to comment #85)
> Nothing for ~4 months, perhaps this should be just closed...

I still have this issue on any attempt to use 1.4.
Comment 87 Rafał Mużyło 2008-05-28 18:04:34 UTC
As far as I'm concerned, this bug is closed: I've been using 1.4.0.90
for quite a while, and anyway, for me it was that -msse thing.

So, feel free to close this.
Comment 88 Redeeman 2008-06-14 13:55:21 UTC
Just want to let you know, I ran into this issue with the stable xorg-server update just now, with today's -r6. I compiled it with gcc 4.3.1 with cflags: "-O2 -mtune=core2 -march=core2 -msse -msse2 -msse3 -mssse3 -msse4.1 -ftree-vectorize -fomit-frame-pointer -pipe"
and it broke. I then did lots of debugging, and finally tried compiling with gcc 4.1.2 and cflags: "-O2 -march=nocona -fomit-frame-pointer -pipe", and it works. This is also what I had compiled with before. This is a newest-generation 45nm Core 2, so it really shouldn't be a matter of unsupported SSE.
Comment 89 Rémi Cardona (RETIRED) gentoo-dev 2008-10-29 14:56:58 UTC
This bug should now be fixed with newer versions of GCC (4.3) and pixman (0.12.0).

If some people still have issues, please open new bugs so we can let this one rest in peace.

Thanks