616402 – >=x11-libs/libxcb-1.12[abi_x86_32] optimizations above -O1 causing multiple applications to stop working (e.g. Civilization 5)

Status:	RESOLVED UPSTREAM

Alias:	None

Product:	Gentoo Linux
Classification:	Unclassified
Component:	Current packages
Hardware:	All Linux

Importance:	Normal major with 2 votes
Assignee:	Gentoo X packagers

URL:	https://bugs.freedesktop.org/show_bug...
Whiteboard:	Workaround: use glibc[stack-realign]
Keywords:

Duplicates (1):	660362
Depends on:
Blocks:

Reported:	2017-04-23 17:08 UTC by Rasmus Thomsen
Modified:	2023-10-01 06:35 UTC
CC List:	16 users

See Also:	https://bugzilla.redhat.com/show_bug.cgi?id=1471427 592222 677852 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38496
Package list:
Runtime testing required:	---

Attachments
Contents of bug report filed with Asypr. (asypr-bug-report, 3.65 KB, text/plain) 2020-06-25 23:36 UTC, Richard Yao (RETIRED)

Description Rasmus Thomsen 2017-04-23 17:08:08 UTC

Hello,
32-bit versions of >=x11-libs/libxcb-1.12 don't seem to work properly with some programs if they were compiled with optimization levels above -O1.
See: https://bugs.archlinux.org/task/49560

Comment 1 Alexander Tsoy 2017-08-09 01:47:20 UTC

Confirming this. Segfault occurs in libxcb if it was compiled with gcc-6 and -O2 or higher optimisation level. Backtrace is the same as was reported a year ago to the ML:

https://lists.freedesktop.org/archives/xcb/2016-June/010815.html


Core was generated by `/home/gamer/.local/share/Steam/steamapps/common/Sid Meier's Civilization V/./Ci'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0xf6b98b44 in remove_finished_readers (completed=<optimized out>, prev_reader=0xe437b1e0) at /home/tmp/portage/x11-libs/libxcb-1.12-r2/work/libxcb-1.12/src/xcb_in.c:107
107         while(*prev_reader && XCB_SEQUENCE_COMPARE((*prev_reader)->request, <=, completed))
[Current thread is 1 (Thread 0xee185b40 (LWP 19796))]
(gdb) bt full
#0  0xf6b98b44 in remove_finished_readers (completed=<optimized out>, prev_reader=0xe437b1e0) at /home/tmp/portage/x11-libs/libxcb-1.12-r2/work/libxcb-1.12/src/xcb_in.c:107
No locals.
#1  read_packet (c=<optimized out>) at /home/tmp/portage/x11-libs/libxcb-1.12-r2/work/libxcb-1.12/src/xcb_in.c:223
        lastread = <optimized out>
        length = 32
        buf = <optimized out>
        eventlength = 0
        nfd = 0
        bufsize = <optimized out>
        pend = 0x0
        event = <optimized out>
#2  _xcb_in_read (c=<optimized out>) at /home/tmp/portage/x11-libs/libxcb-1.12-r2/work/libxcb-1.12/src/xcb_in.c:1012
        n = <optimized out>
        iov = {iov_base = 0xe437a1b0, iov_len = 4096}
        cmsgbuf = {cmsghdr = {cmsg_len = 262149, cmsg_level = 4096, cmsg_type = -466095248, __cmsg_data = 0xee181a10 ""},
          buf = "\005\000\004\000\000\020\000\000p\363\067\344\000\000\060\344\350-\000\000@\000\200\354@7#\367\000\000\000\000\330-\000\000\333=\000\000\000 \b\000qJ\017\367\231\000\000\000p\033\030\356\006\000\000\000p\000\000\000 \000$
000\000a\272\027\367\020\000\200", <incomplete sequence \354>}
        msg = {msg_name = 0x0, msg_namelen = 0, msg_iov = 0xee1819e0, msg_iovlen = 1, msg_control = 0xee181a04, msg_controllen = 0, msg_flags = 0}
#3  0xf6b95a4f in _xcb_conn_wait (c=0xe437a158, cond=0xee181b80, vector=0x0, count=0x0) at /home/tmp/portage/x11-libs/libxcb-1.12-r2/work/libxcb-1.12/src/xcb_conn.c:515
        may_read = 1
        ret = 1
        fd = {fd = 153, events = 1, revents = 1}
#4  0xf6b97c85 in wait_for_reply (c=c@entry=0xe437a158, request=<optimized out>, e=0x0) at /home/tmp/portage/x11-libs/libxcb-1.12-r2/work/libxcb-1.12/src/xcb_in.c:516
        cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0}, __size = '\000' <repeats 47 times>, __align = 0}
        reader = {request = 1, data = 0xee181b80, next = 0x0}
        ret = 0x0
#5  0xf6b97e23 in xcb_wait_for_reply (c=0xe437a158, request=1, e=0x0) at /home/tmp/portage/x11-libs/libxcb-1.12-r2/work/libxcb-1.12/src/xcb_in.c:546
        ret = <optimized out>
#6  0xf6b9fcc8 in xcb_intern_atom_reply (c=0xe437a158, cookie=..., e=0x0) at xproto.c:3250
No locals.
...
...

Comment 2 happycorsair 2017-08-27 14:07:13 UTC

I confirm this bug.

TeamViewer 12 doesn't launch unless one builds libxcb with -O1 (Bug 621918).

Comment 3 Jonas Stein gentoo-dev 2017-12-11 02:59:13 UTC

The (treecleaned) acroread-9.5.5 shows the same effect.
Workaround:
CFLAGS="-march=native -O1 -pipe" emerge -1 x11-libs/libxcb

Comment 4 Jonas Stein gentoo-dev 2017-12-12 14:40:28 UTC

These packages depend on x11-libs/libxcb
https://qa-reports.gentoo.org/output/genrdeps/rindex/x11-libs/libxcb

Chances are high that these are affected too.

Adding CC to ryao, the maintainer of app-emulation/crossover-bin so that he is informed too.

Comment 5 Mike Lothian 2018-01-25 14:37:55 UTC

Curious to know which version of GCC you're using. I have issues with 32bit binaries with GCC 7 when built with -march=native, I pass in -mno-bmi to fix it in my make.conf

CFLAGS_x86="${CFLAGS_x86} -mno-bmi"
CXXFLAGS_x86="${CXXFLAGS_x86} -mno-bmi"

Comment 6 Jonas Stein gentoo-dev 2018-01-25 20:44:41 UTC

(In reply to Mike Lothian from comment #5)
> Curious to know which version of GCC you're using. 

Alexander and I used gcc-6 (now gcc-6.4.0-r1)

Comment 7 Mike Lothian 2018-01-27 09:53:22 UTC

I'm not too sure, I had no issues with GCC 6 and have been using -mno-bmi with GCC 7 to work around issues until this patch

Maybe get the folks to test -mno-bmi to see if it fixes the issue for them

Comment 8 Alexander Tsoy 2018-01-27 18:32:55 UTC

(In reply to Mike Lothian from comment #5)
> Curious to know which version of GCC you're using. I have issues with 32bit
> binaries with GCC 7 when built with -march=native, I pass in -mno-bmi to fix
> it in my make.conf
> 
> CFLAGS_x86="${CFLAGS_x86} -mno-bmi"
> CXXFLAGS_x86="${CXXFLAGS_x86} -mno-bmi"

gcc-6.4.0-r1. Disabling BMI doesn't help.

$ sudo zgrep CFLAGS /var/log/portage/build/x11-libs/libxcb-1.12-r2\:20180127-182357.log.gz
  Used CFLAGS:
    CFLAGS..............: -O2 -march=bdver2 -mtune=bdver2 -mno-tbm -mno-fma4 -mno-xop -mno-lwp -pipe -mno-bmi
  Used CFLAGS:
    CFLAGS..............: -O2 -march=bdver2 -mtune=bdver2 -mno-tbm -mno-fma4 -mno-xop -mno-lwp -pipe -mno-bmi

Comment 9 Johannes Hirte 2018-03-13 20:50:04 UTC

I think the bug is about stack alignment. Adding -mstackrealign should fix it.

Comment 10 Jason Oliveira 2018-04-06 01:10:55 UTC

(In reply to Johannes Hirte from comment #9)
> I think the bug is about stack alignment. Adding -mstackrealign should fix
> it.

Adding -mstackrealign to CFLAGS solved Civ V crashing. Thank you so much for this.

Comment 11 Alexander Tsoy 2018-06-19 21:31:19 UTC

(In reply to Jason Oliveira from comment #10)
> Adding -mstackrealign to CFLAGS solved Civ V crashing. Thank you so much for
> this.

I have a different story. After recompiling libxcb with -mstackrealign Civ5 started crashing in libpulse. After recompiling pulseaudio with -mstackrealign Civ5 started crashing in libpthread... And I gave up on this step. :) So Civ5 seems really broken.

Comment 12 Alexander Tsoy 2018-06-19 23:08:41 UTC

Some info from redhat bugzilla partially related to this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1471427
Looks like they are compiling most of the 32-bit userland without SSE2 support and 32-bit glibc without the IFUNC feature. This (mostly) avoids problems with binaries that do not keep the stack aligned to a 16-byte boundary.

Comment 13 Johannes Hirte 2018-06-20 13:25:18 UTC

(In reply to Alexander Tsoy from comment #11)
> (In reply to Jason Oliveira from comment #10)
> > Adding -mstackrealign to CFLAGS solved Civ V crashing. Thank you so much for
> > this.
> I have a different story. After recompiling libxcb with -mstackrealign Civ5
> started crashing in libpulse. After recompiling pulseaudio with
> -mstackrealign Civ5 started crashing in libpthread... And I gave up on this
> step. :) So Civ5 seems really broken.

No, it's just escalating through all the parts without proper stack alignment.
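
For readers following the thread: a realigning prologue of the kind -mstackrealign asks GCC to emit amounts to rounding the stack pointer down to a 16-byte boundary on function entry. The sketch below models that as plain arithmetic in Python. The addresses are made-up examples, and the exact instructions GCC emits are not shown anywhere in this report. Since the rounding happens only inside functions compiled with the flag, a library built without it still sees whatever alignment the game handed over, which would be consistent with the crash moving from libxcb to libpulse to libpthread.

```python
STACK_ALIGNMENT = 16  # bytes; what SSE-using 32-bit x86 code expects


def realign(esp: int) -> int:
    """Round a stack pointer down to the nearest 16-byte boundary."""
    return esp & ~(STACK_ALIGNMENT - 1)


def is_aligned(esp: int) -> bool:
    """True if the stack pointer sits on a 16-byte boundary."""
    return esp % STACK_ALIGNMENT == 0


# Hypothetical stack pointers, as a caller that only keeps 4-byte
# alignment might leave them. Only one residue in four is already safe.
for esp in (0xFFFFD000, 0xFFFFD004, 0xFFFFD008, 0xFFFFD00C):
    print(f"{esp:#x}: aligned={is_aligned(esp)} -> realigned {realign(esp):#x}")
```

The stack grows downwards on x86, so rounding down only wastes a few bytes and never overwrites the caller's frame.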

Comment 14 Matt Turner gentoo-dev 2018-07-07 18:27:51 UTC

*** Bug 660362 has been marked as a duplicate of this bug. ***

Comment 15 Michał Dec 2018-07-09 01:07:37 UTC

I made the following changes:

/etc/portage/package.env/civ5:
x11-libs/libxcb         stackrealign.cfg

/etc/portage/env/stackrealign.cfg:
CFLAGS="${CFLAGS} -mstackrealign"

Then I recompiled x11-libs/libxcb and the issue is gone. My default CFLAGS are "-O2 -pipe -march=znver1 -ggdb". What applications require stepping down to -O1?

I tried to report various issues to Aspyr several months ago, but their response is generally "if it doesn't trigger on Ubuntu, we are not fixing it". Would it be considered foul play to set up Ubuntu, recompile libxcb our way and then insist they fix the bug because we managed to trigger it on Ubuntu?

Comment 16 Jonas Stein gentoo-dev 2018-09-21 18:42:11 UTC

@moog, no, that would be fair. Please do so.

Comment 17 Michał Dec 2018-12-08 13:07:11 UTC

I tried to recompile:
media-libs/alsa-lib
media-sound/pulseaudio
x11-libs/libxcb
sys-libs/glibc

with "-O1 -pipe -ggdb -mstackrealign" and here's the output I get from GDB while starting the game:

#0  0x0885bca7 in ?? ()
#1  0x0887d4b7 in ?? ()
#2  0x0887d408 in ?? ()
#3  0x0887d2ea in ?? ()
#4  0x0887c549 in cvLandmarkVisSystem::LandmarkRenderJob::Execute(unsigned int) ()
#5  0x08dc2970 in ?? ()
#6  0xf7d89adf in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all (this=0xdc51d600, parent=..., child=0x885c0b20) at ../../src/tbb/custom_scheduler.h:481
#7  0xf7d84804 in tbb::internal::arena::process (this=this@entry=0xde5e2b00, s=...) at ../../src/tbb/arena.cpp:102
#8  0xf7d83f67 in tbb::internal::market::process (this=0xde5a5c80, j=...) at ../../src/tbb/market.cpp:481
#9  0xf7d7fd60 in tbb::internal::rml::private_worker::run (this=this@entry=0xde5e2200) at ../../src/tbb/private_server.cpp:281
#10 0xf7d7ff8f in tbb::internal::rml::private_worker::thread_routine (arg=0xde5e2200) at ../../src/tbb/private_server.cpp:234
#11 0xf7b6b69f in start_thread (arg=<optimized out>) at pthread_create.c:463
#12 0xf79b0756 in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:108

1. What should I change to rectify this behavior?
2. I'm in the middle of making Gentoo-grade Ubuntu packages that will hopefully trigger the bug on Ubuntu and therefore force Aspyr to react.

Comment 18 matt 2019-06-30 05:59:31 UTC

(In reply to Johannes Hirte from comment #9)
> I think the bug is about stack alignment. Adding -mstackrealign should fix
> it.

This fix worked for me, but *not* in libxcb.  I had to compile glibc with -mstackrealign, but after doing so, I can run the game with no issues.

Comment 19 Richard Yao (RETIRED) gentoo-dev 2020-06-25 23:36:51 UTC

Created attachment 646456 [details]
Contents of bug report filed with Asypr.

I just hit this today:

I attached gdb and got the following:

Thread 3 "Civ5XP" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xf40a9b40 (LWP 20552)]
0xec8bd9e5 in pa_smoother_new (adjust_time=1000000, history_time=5000000, monotonic=true, smoothing=true, min_history=4, time_offset=31853400925, paused=true) at /usr/src/debug/media-sound/pulseaudio-13.0/pulseaudio-13.0/src/pulsecore/time-smoother.c:102
102         pa_assert(adjust_time > 0);
(gdb) bt
#0  0xec8bd9e5 in pa_smoother_new (adjust_time=1000000, history_time=5000000, monotonic=true, smoothing=true, min_history=4, time_offset=31853400925, paused=true) at /usr/src/debug/media-sound/pulseaudio-13.0/pulseaudio-13.0/src/pulsecore/time-smoother.c:102
#1  0xf34d3db2 in create_stream (direction=PA_STREAM_PLAYBACK, s=0xebbf26c0, dev=0xebbf2380 "alsa_output.pci-0000_0b_00.1.hdmi-stereo-extra2", attr=0xebd06d60, flags=<optimized out>, volume=0x0, sync_stream=0x0) at /usr/src/debug/media-sound/pulseaudio-13.0/pulseaudio-13.0/src/pulse/stream.c:1257
#2  0xf77d109b in ?? () from ./libopenal.so.1
#3  0xf77d132c in ?? () from ./libopenal.so.1
#4  0xf77a89a3 in alcCreateContext () from ./libopenal.so.1
#5  0x09126f4a in YUV12 ()
#6  0x091264a2 in YUV12 ()
#7  0x09113bee in check_for_pending_io ()
#8  0x09114188 in BinkOpen ()
#9  0x085f7553 in ASL::PlayBinkMovieGL(char const*, float, unsigned int, unsigned int, bool*) ()
#10 0x0884c26c in PlayMovieState::Begin() ()
#11 0x086e0fc3 in Civ5App::PlayOpeningMovie() ()
#12 0x086e1c46 in Civ5App::Init(char const*) ()
#13 0x0865b3ed in WinMain ()
#14 0x085f5487 in ?? ()
#15 0x085d8e3e in ThreadHANDLE::ThreadProc(void*) ()
#16 0xf7b21204 in start_thread (arg=<optimized out>) at pthread_create.c:479
#17 0xf7965a26 in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:108
(gdb) disassemble 
Dump of assembler code for function pa_smoother_new:
   0xec8bd9b0 <+0>:     push   %ebp
   0xec8bd9b1 <+1>:     push   %edi
   0xec8bd9b2 <+2>:     push   %esi
   0xec8bd9b3 <+3>:     push   %ebx
   0xec8bd9b4 <+4>:     call   0xec880f10 <__x86.get_pc_thunk.bx>
   0xec8bd9b9 <+9>:     add    $0x40647,%ebx
   0xec8bd9bf <+15>:    sub    $0x3c,%esp
   0xec8bd9c2 <+18>:    mov    0x58(%esp),%edi
   0xec8bd9c6 <+22>:    mov    0x50(%esp),%eax
   0xec8bd9ca <+26>:    mov    0x54(%esp),%edx
   0xec8bd9ce <+30>:    mov    0x6c(%esp),%esi
   0xec8bd9d2 <+34>:    mov    %edi,0x10(%esp)
   0xec8bd9d6 <+38>:    mov    0x60(%esp),%edi
   0xec8bd9da <+42>:    mov    %eax,(%esp)
   0xec8bd9dd <+45>:    mov    0x5c(%esp),%ebp
   0xec8bd9e1 <+49>:    mov    %edx,0x4(%esp)
=> 0xec8bd9e5 <+53>:    movdqa (%esp),%xmm0
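
The faulting instruction is `movdqa`, which requires its memory operand to be 16-byte aligned. Adding up the stack adjustments in the prologue above shows why the fault depends entirely on the caller. The Python sketch below only does that arithmetic. The byte counts are read off the disassembly, while the caller stack pointer values are hypothetical.

```python
# Stack adjustments between the caller's `call` and the faulting movdqa,
# read off the pa_smoother_new prologue in the disassembly above.
RETURN_ADDRESS = 4        # pushed by the caller's `call`
SAVED_REGISTERS = 4 * 4   # push %ebp, %edi, %esi, %ebx
LOCALS = 0x3C             # sub $0x3c,%esp


def esp_at_movdqa(caller_esp: int) -> int:
    """%esp when `movdqa (%esp),%xmm0` runs, given %esp just before the call."""
    return caller_esp - RETURN_ADDRESS - SAVED_REGISTERS - LOCALS


def movdqa_faults(caller_esp: int) -> bool:
    """movdqa faults when its memory operand is not 16-byte aligned."""
    return esp_at_movdqa(caller_esp) % 16 != 0


# The adjustments total 80 bytes, a multiple of 16, so the callee keeps
# exactly the alignment (mod 16) that the caller had before the call.
print(RETURN_ADDRESS + SAVED_REGISTERS + LOCALS)

# Hypothetical caller stack pointers: one 16-byte aligned, one only 4-byte aligned.
for caller_esp in (0xFFFFD000, 0xFFFFD004):
    print(f"{caller_esp:#x}: faults={movdqa_faults(caller_esp)}")
```

In other words, pa_smoother_new is correct if it is entered with a 16-byte-aligned stack, and it faults when the calling binary only guarantees 4-byte alignment.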

I filed a bug report with Aspyr through their support system. I advised them that I see two ways of recompiling Civilization V to fix this:

1. Add `-march=nocona` to CFLAGS/CXXFLAGS, recompile and raise the minimum CPU requirement to match Steam's:

https://github.com/ValveSoftware/steam-for-linux

2. Add `-mpreferred-stack-boundary=4` to CFLAGS and CXXFLAGS and recompile:

https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-mstackrealign
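
A note on the number in option 2: GCC documents `-mpreferred-stack-boundary=num` as keeping the stack aligned to 2 raised to `num` bytes, so a value of 4 requests the 16-byte alignment that `movdqa` needs. A minimal sketch of that conversion (the helper name is made up for illustration):

```python
def stack_boundary_bytes(num: int) -> int:
    """Byte alignment requested by -mpreferred-stack-boundary=num (2 raised to num)."""
    return 2 ** num


# 2 corresponds to 4-byte alignment, 4 to the 16-byte alignment SSE code expects.
for num in (2, 4):
    print(f"-mpreferred-stack-boundary={num} -> {stack_boundary_bytes(num)}-byte alignment")
```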

Anyway, I attached the contents of my bug report to this issue since it really should be public. I cannot change the status of this bug, but after I submit this comment, I am going to change it to RESOLVED UPSTREAM.

Comment 20 Michał Dec 2020-06-26 05:41:23 UTC

> I filed a bug report with Aspyr through their support system.

Been there and done that years ago. They'll tell you they will do nothing about it because you're not encountering this on a supported version of Ubuntu.

Comment 21 Larry the Git Cow gentoo-dev 2022-06-25 21:40:02 UTC

The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=020b1514bcb86d96700d81ff5ad82ec698b45311

commit 020b1514bcb86d96700d81ff5ad82ec698b45311
Author: Sam James <sam@gentoo.org>
AuthorDate: 2022-06-25 21:36:50 +0000
Commit: Sam James <sam@gentoo.org>
CommitDate: 2022-06-25 21:39:41 +0000

sys-libs/ncurses: Add stack-realign flag for compat with old 32-bit x86 binaries

Older 32-bit x86 binaries aligned the stack to 4 bytes, whereas modern
binaries align to 16 bytes. These older binaries sometimes segfault when
newer libraries use SSE instructions. This is becoming increasingly
common. Applying the -mstackrealign flag to the 32-bit build works
around the issue but at a performance cost. Other popular
distributions always apply this.

[sam: There's no good choices here. As Ionen pointed out (I'd missed
any reports of this), this ends up getting worse with GCC 12's
default-on vectorisation at -O2. Let's make it optional for now for
32-bit/x86 (irrelevant for other arches, it's specific to x86 ABI).

ncurses is going to need similar treatment. If we end up having
to do this for far more packages, we may revisit and e.g.
just append-flags in ebuilds for right ABI and tell users
to set -mno-stackrealign, or similar.

Another option would be to set this globally by default (again,
this is only ever for x86), but it'd possibly be a big performance
hit (and bad enough doing it in glibc, but it's unavoidable).

The only saving grace here is that there aren't _that_ many
libraries with such longevity & ABI stability from back then
that older applications are using.]

Bug: https://bugs.gentoo.org/616402
Bug: https://github.com/taviso/123elf/issues/12
See: 02aa6328a720c
Signed-off-by: Sam James <sam@gentoo.org>

profiles/arch/amd64/no-multilib/package.use.mask | 1 +
profiles/arch/amd64/package.use | 1 +
profiles/arch/amd64/package.use.mask | 1 +
profiles/arch/x86/package.use.mask | 1 +
profiles/base/package.use.mask | 1 +
sys-libs/ncurses/metadata.xml | 4 ++++
sys-libs/ncurses/ncurses-6.3_p20220423-r1.ebuild | 10 ++++++++--
sys-libs/ncurses/ncurses-6.3_p20220423.ebuild | 10 ++++++++--
8 files changed, 25 insertions(+), 4 deletions(-)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=02aa6328a720c86d0157c4582f7e5bac72ae9296

commit 02aa6328a720c86d0157c4582f7e5bac72ae9296
Author: James Le Cuirot <chewi@gentoo.org>
AuthorDate: 2022-06-11 21:11:12 +0000
Commit: Sam James <sam@gentoo.org>
CommitDate: 2022-06-25 21:39:28 +0000

sys-libs/glibc: Add stack-realign flag for compat with old 32-bit x86 binaries

Another option would be to set this globally by default (again,
this is only ever for x86), but it'd possibly be a big performance
hit (and bad enough doing it in glibc, but it's unavoidable).

The only saving grace here is that there aren't _that_ many
libraries with such longevity & ABI stability from back then
that older applications are using.]

Bug: https://bugs.gentoo.org/616402
Bug: https://github.com/taviso/123elf/issues/12
Signed-off-by: James Le Cuirot <chewi@gentoo.org>
Closes: https://github.com/gentoo/gentoo/pull/25858
Signed-off-by: Sam James <sam@gentoo.org>

sys-libs/glibc/glibc-2.35-r7.ebuild | 31 ++++++++++++++++++-------------
sys-libs/glibc/glibc-9999.ebuild | 31 ++++++++++++++++++-------------
sys-libs/glibc/metadata.xml | 1 +
3 files changed, 37 insertions(+), 26 deletions(-)