Bug 270120 (PR40838) - [4.4/bad-code] -ftree-vectorize causes segfaults on x86 due to stack misalignment
Summary: [4.4/bad-code] -ftree-vectorize causes segfaults on x86 due to stack misalign...
Status: RESOLVED FIXED
Alias: PR40838
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Unspecified
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo Toolchain Maintainers
URL: http://gcc.gnu.org/PR40838
Whiteboard:
Keywords:
Duplicates: 256677 265986 278798 281758 282341 283183 283220 283487 286189 308133 317603 323431 326579 333307 341725 356159
Depends on:
Blocks: gcc-4.4
Reported: 2009-05-17 03:58 UTC by Evan Teran
Modified: 2016-07-28 16:31 UTC
CC List: 31 users

See Also:
Package list:
Runtime testing required: ---


Attachments
gcc-4.4-pr40838-1.patch (gcc-4.4-pr40838-1.patch,6.62 KB, patch)
2009-08-19 18:59 UTC, Denis Kaganovich
sse & 32bit -> -mstackrealign (sse-stackrealign-1.patch,2.57 KB, patch)
2009-09-25 11:26 UTC, Denis Kaganovich

Description Evan Teran 2009-05-17 03:58:17 UTC
Compiling sys-libs/zlib-1.2.3-r1 with gcc-4.4 and -O3 (specifically, the -ftree-vectorize flag) causes applications to crash.

I've noticed this in both mozilla-firefox and mozilla-thunderbird. Compiling with:
CFLAGS="-march=native -O2 -fomit-frame-pointer -pipe"

causes no problems. Additionally -O3 works fine on the stable gcc, so this appears to be a regression of some kind.

Reproducible: Always

Steps to Reproduce:
1. install gcc-4.4.0

2. 
set CFLAGS="-march=native -O3 -fomit-frame-pointer -pipe"
or
set CFLAGS="-march=native -O2 -ftree-vectorize -fomit-frame-pointer -pipe"

3. attempt to run firefox
Actual Results:  
segmentation fault (appears to be a null pointer)


emerge --info
Portage 2.2_rc28 (default/linux/x86/2008.0/desktop, gcc-4.4.0, glibc-2.8_p20080602-r1, 2.6.28-gentoo-r5 i686)
=================================================================                                            
System uname: Linux-2.6.28-gentoo-r5-i686-Intel-R-_Core-TM-2_Duo_CPU_T7700_@_2.40GHz-with-gentoo-1.12.11.1   
Timestamp of tree: Sat, 16 May 2009 17:30:01 +0000                                                           
app-shells/bash:     3.2_p39                                                                                 
dev-java/java-config: 2.1.7                                                                                  
dev-lang/python:     2.6.2                                                                                   
dev-util/cmake:      2.6.2-r1                                                                                
sys-apps/baselayout: 1.12.11.1                                                                               
sys-apps/sandbox:    1.6-r2                                                                                  
sys-devel/autoconf:  2.13, 2.63                                                                              
sys-devel/automake:  1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.2
sys-devel/binutils:  2.18-r3
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.27-r2
ACCEPT_KEYWORDS="x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-march=native -O2 -fomit-frame-pointer -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/kde/4.2/env /usr/kde/4.2/share/config /usr/kde/4.2/shutdown /usr/share/config /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-march=native -O2 -fomit-frame-pointer -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="collision-protect distlocks fixpackages parallel-fetch preserve-libs protect-owned sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LANG="en_US.UTF-8"
LDFLAGS="-Wl,-O1"
LINGUAS="en_US en"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="   "
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X a52 aac accessibility acl acpi alsa apache2 arts avahi bash-completion berkdb bluetooth boost branding bzip2 cairo captury cdr chroot cleartype cli cracklib crypt cups curl cvs dbus debugger dell dhcp divx doc dri dvd dvdr dvdread eds emboss encode esd evo examples expat fam fat ffmpeg firefox flac gdbm gif glibc-omitfp glitz gmp gnome gnutls google-gadgets gpm graphite graphviz gstreamer gtk hal htmlhandbook iconv imagemagick innodb inotify ipod ipv6 ipw3945 isdnlog jadetex java java6 jpeg jpeg2k kde kdeprefix kpathsea kqemu lame ldap lesstif libnotify libwww lm_sensors mad mdnsresponder-compat midi mikmod mjpeg mmap mmx mng mono mp3 mp4 mpeg mpeg2 mplayer mudflap mysql ncurses network-cron nls nptl nptlonly nsplugin ntfs nvidia ogg openal openexr opengl openmp openssl oss pam pango pcap pch pcre pdf perl phonon php physfs plasma pmu png posix ppds pppd python qt3 qt3support qt4 quicktime readline reflection rss rtc samba sdk sdl session smp sockets spell spl sqlite sqlite3 sse sse2 ssl ssse3 startup-notification subversion svg sysfs tcl tcpd templates theora threads thumbnail tiff tivo tk truetype unicode usb userlocales utempter v4l vcd vim-syntax vnc vorbis webkit wifi win32codecs wireshark wmf wmp wxwindows x86 xanim xcomposite xft xine xinerama xml xorg xpm xrandr xrender xscreensaver xulrunner xv xvid zeroconf zip zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_dbd authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock dbd deflate dir disk_cache env 
expires ext_filter file_cache filter headers ident imagemap include info log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif so speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard mouse synaptics evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en_US en" USERLAND="GNU" VIDEO_CARDS="nvidia"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 SpanKY gentoo-dev 2009-05-18 03:52:36 UTC
last time this came up was Bug 151394.  but just like there, firefox isn't really a useful test case.  we need something reduced or the bug report is going to sit around indefinitely.
Comment 2 Evan Teran 2009-05-18 04:39:12 UTC
Fair enough, I'll try to come up with a minimal program that triggers the bug when I have a chance.
Comment 3 Ryan Hill (RETIRED) gentoo-dev 2009-05-18 05:48:06 UTC
seems to be x86 specific.  i can't reproduce here.
Comment 4 Samuli Suominen gentoo-dev 2009-08-10 13:19:20 UTC
(In reply to comment #3)
> seems to be x86 specific.  i can't reproduce here.
> 

Good. I want GCC 4.4 unmasked for amd64 today.
Comment 5 Mark Loeser (RETIRED) gentoo-dev 2009-08-17 15:12:42 UTC
*** Bug 281758 has been marked as a duplicate of this bug. ***
Comment 6 Denis Kaganovich 2009-08-18 11:04:19 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > seems to be x86 specific.  i can't reproduce here.
> Good. I want GCC 4.4 unmasked for amd64 today.

Not 4.4.0. IMHO.

For 4.4.1, a test for this bug exists (I have not tried it, only the Mozilla applications).
See Bug 281758 (links to tests too).
Comment 7 Denis Kaganovich 2009-08-19 18:59:04 UTC
Created attachment 201741 [details, diff]
gcc-4.4-pr40838-1.patch

This new patch appears to work for me on x86_32 with gcc 4.4.1, the vectorizer and SSE, at least as well as 4.3.3 (in theory it should be better than 4.3.*...).
From http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40838, with fixed paths.
Comment 8 Ryan Hill (RETIRED) gentoo-dev 2009-09-05 09:05:46 UTC
*** Bug 283487 has been marked as a duplicate of this bug. ***
Comment 9 SpanKY gentoo-dev 2009-09-06 16:33:15 UTC
*** Bug 265986 has been marked as a duplicate of this bug. ***
Comment 10 SpanKY gentoo-dev 2009-09-06 16:33:24 UTC
*** Bug 256677 has been marked as a duplicate of this bug. ***
Comment 11 SpanKY gentoo-dev 2009-09-06 16:33:46 UTC
*** Bug 283183 has been marked as a duplicate of this bug. ***
Comment 12 SpanKY gentoo-dev 2009-09-06 16:33:48 UTC
*** Bug 278798 has been marked as a duplicate of this bug. ***
Comment 13 SpanKY gentoo-dev 2009-09-06 16:36:04 UTC
perhaps we should add arch/x86/profile.bashrc like the amd64 one and have it whine/barf when someone is using -ftree-vectorize in CFLAGS.  stack alignment is known to be screwed with x86 and optimization and sse instructions for pretty much all versions of gcc.
Comment 14 Ryan Hill (RETIRED) gentoo-dev 2009-09-06 20:22:32 UTC
it works well enough in 4.3, if just because it disables itself when it sees anything scary, and hopefully 4.5 will be fixed.  a big fat warning would probably help in the meantime though.
Comment 15 Nirbheek Chauhan (RETIRED) gentoo-dev 2009-09-07 19:37:01 UTC
*** Bug 282341 has been marked as a duplicate of this bug. ***
Comment 16 Nirbheek Chauhan (RETIRED) gentoo-dev 2009-09-08 04:29:41 UTC
*** Bug 283220 has been marked as a duplicate of this bug. ***
Comment 17 Denis Kaganovich 2009-09-23 11:23:23 UTC
Try using "-mstackrealign" in CFLAGS whenever SSE is enabled (-msse, -msse*, -march=pentium4, etc.). I use this system-wide together with the published patch (the patch is good enough), but IMHO "-mstackrealign" should work without it.

IMHO it would be a good idea to add -mstackrealign to the wiki pages about "safe cflags" for 32-bit SSE targets. I have read about bugs with -mstackrealign in old gcc versions, but those are too old to be of interest to me.

Additional link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41025 - gcc fails to compile itself with "-mstackrealign", but no problems with other packages have been found so far.
Comment 18 Denis Kaganovich 2009-09-23 11:40:30 UTC
(In reply to comment #17)

> Additional link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41025 - gcc
> self-compiling failed with "-mstackrealign", but other packages problems not
> found while.

Sorry, the correct link is: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41156
Comment 19 Ryan Hill (RETIRED) gentoo-dev 2009-09-25 00:06:30 UTC
yes, lets add more random CFLAGS.

you don't need to add anything to the Safe CFLAGs page.  it recommends -O2 which will not trigger this.
Comment 20 Andrew Savchenko gentoo-dev 2009-09-25 06:45:20 UTC
(In reply to comment #19)
> yes, lets add more random CFLAGS.
> 
> you don't need to add anything to the Safe CFLAGs page.  it recommends -O2
> which will not trigger this.

The flags enabled by -O3, like those enabled by any -Ox level, should be considered safe; they are NOT random. If they are not safe, this is a serious gcc bug.
Comment 21 Denis Kaganovich 2009-09-25 11:21:33 UTC
(In reply to comment #19)
> yes, lets add more random CFLAGS.
> 
> you don't need to add anything to the Safe CFLAGs page.  it recommends -O2
> which will not trigger this.
> 

This problem is not specific to -O3 or the vectorizer; those cases only make it easy to see. Even with -O2, SSE code may be dangerous.
Comment 22 Denis Kaganovich 2009-09-25 11:26:06 UTC
Created attachment 205198 [details, diff]
sse & 32bit -> -mstackrealign

This is my experimental patch to make stack realignment the default when SSE is enabled on 32-bit targets (gcc's own libraries are excluded for now, which may be fixable: "Stack alignment in unwind library is unsupported."). IMHO it works.
Comment 23 Ryan Hill (RETIRED) gentoo-dev 2009-09-25 20:26:59 UTC
when it goes in upstream, we'll add it.
Comment 24 Frédéric COIFFIER 2009-10-02 07:14:41 UTC
*** Bug 286189 has been marked as a duplicate of this bug. ***
Comment 25 Denis Kaganovich 2009-10-09 14:27:58 UTC
I found that, at least on new AMD CPUs, this problem is absent. You can run the same "broken" 32-bit code on AMD without crashes. I found no documentation, only the Athlon 7550 and /proc/cpuinfo examples on the net. IMHO this is indicated by the CPU flag "misalignsse"; also IMHO sse4a accompanies this feature (and may serve as an alias for it).

New patches with descriptions posted here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41156

PS These changes do not look relevant for 64-bit CPUs, but 32-bit code is less RAM-hungry...
Comment 26 Mark R. Pariente 2010-04-28 08:15:13 UTC
*** Bug 317603 has been marked as a duplicate of this bug. ***
Comment 27 Alexander Bezrukov 2010-06-10 15:18:07 UTC
*** Bug 323431 has been marked as a duplicate of this bug. ***
Comment 28 Alexander Bezrukov 2010-06-10 15:34:51 UTC
(In reply to comment #25)
> I found, at least on new AMD CPUs this problem absent. 

Denis, on some cores aligned accesses are aliases to unaligned ones. Other cores would trigger an exception. For example, with AMD, this is one of the differences between Athlons and Opterons.
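To make the aligned versus unaligned distinction concrete, here is a minimal sketch in C, assuming an x86 compiler that provides the SSE2 intrinsics in `emmintrin.h`; the helper names `is_aligned16` and `zero16` are made up for illustration. The aligned store (typically emitted as movdqa) raises a general protection fault on cores that enforce alignment, which Linux reports as a segmentation fault, while the unaligned store (movdqu) accepts any address.

```c
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Nonzero if p sits on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: zero 16 bytes at dst, picking the store by alignment. */
void zero16(void *dst)
{
#if defined(__SSE2__)
    __m128i z = _mm_setzero_si128();
    if (is_aligned16(dst))
        _mm_store_si128((__m128i *)dst, z);    /* aligned store, requires dst % 16 == 0 */
    else
        _mm_storeu_si128((__m128i *)dst, z);   /* unaligned store, any address */
#else
    memset(dst, 0, 16);                        /* portable fallback without SSE2 */
#endif
}
```

Vectorized code generated by the compiler does not carry such a runtime check; it assumes the alignment the ABI promises, which is why a misaligned incoming stack only shows up as a crash.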

What makes me very unhappy is that I vastly depend on the vectorizer in some computational code. Most of the time I run it on amd64 but sometimes on x86. Enabling -ftree-vectorize for anything in sci-libs/ would be inelegant.

In my opinion, having this kind of problems is enough reason to hard mask the package.
Comment 29 Alexander Bezrukov 2010-06-10 19:10:34 UTC
The GCC people are discussing the situation where the stack is unaligned on entry. The problem I observe (see Bug #323431) happens even if the stack is aligned on entry to the function. Correct me if I miscounted:

inflate_table:
.LFB45:
        .file 1 "inftrees.c"
        .loc 1 39 0
.LVL0:
        pushl   %ebp                     ; -4
.LCFI0:
        .loc 1 108 0
        pxor    %xmm0, %xmm0
        .loc 1 39 0
        movl    %esp, %ebp
.LCFI1:
        pushl   %edi
.LCFI2:
        pushl   %esi
.LCFI3:
        pushl   %ebx
.LCFI4:
        call    .L101
.L101:
        popl    %ebx
        addl    $_GLOBAL_OFFSET_TABLE_+[.-.L101], %ebx
        subl    $188, %esp
.LCFI5:
        .loc 1 108 0
        movdqa  %xmm0, -56(%ebp)         ; -56-4=-60, unaligned

The problem that is closer to the one I have (that is, when everything is compiled with gcc-4.4.3-r2 with -ftree-vectorize) is discussed here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41156
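For anyone who wants something smaller than zlib to look at: a minimal sketch, assuming the store at line 108 in the listing above comes from a loop that zeroes a small on-stack table. The names `count_lengths` and `MAXBITS_SKETCH` are made up here and this is not zlib's code. With -O3 or -ftree-vectorize on x86 with SSE2, gcc may turn such a zeroing loop into 16-byte aligned stores to the stack, like the movdqa shown above.

```c
#include <stddef.h>

/* Hypothetical table size: 16 entries of unsigned short are 32 bytes,
 * i.e. two 16-byte vector stores if the zeroing loop is vectorized. */
#define MAXBITS_SKETCH 15

/* Counts how often each value 0..MAXBITS_SKETCH occurs in lens[0..n-1]
 * and returns the count for `want` (0 if `want` is out of range). */
unsigned count_lengths(const unsigned char *lens, size_t n, unsigned want)
{
    unsigned short count[MAXBITS_SKETCH + 1];   /* lives on the stack */
    unsigned len;
    size_t i;

    /* Zeroing loop: the candidate for vectorization. */
    for (len = 0; len <= MAXBITS_SKETCH; len++)
        count[len] = 0;

    /* Histogram of the input; out-of-range values are ignored. */
    for (i = 0; i < n; i++)
        if (lens[i] <= MAXBITS_SKETCH)
            count[lens[i]]++;

    return want <= MAXBITS_SKETCH ? count[want] : 0;
}
```

Whether the compiler actually emits aligned vector stores depends on the gcc version and flags; comparing the output of `gcc -S` with and without -ftree-vectorize is the way to check.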
Comment 30 Ed Catmur 2010-06-10 21:34:22 UTC
Alexander, you're forgetting that call pushes the 4-byte return address onto the stack. Thus on entry the stack is already offset by -4 from a 16-byte boundary, so gcc is correct.

If the crash is happening in firefox, it's because some hand-rolled asm (usually in script languages' native interfaces, e.g. JS XPCOM, or Java JNI) is failing to preserve alignment.  If you climb the stack, looking at %ebp, you should be able to find the bad code.  The correct alignment of %ebp is -8.  I've found that a combination of fixing asm where possible, and applying -mstackrealign where not, suffices to prevent all crashes.
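The arithmetic can be checked mechanically. A small sketch in C, assuming the prologue from the listing in comment #29 (`pushl %ebp; movl %esp, %ebp`, then a store to `-56(%ebp)`); the function names are made up for illustration, and all values are taken modulo 16.

```c
/* x modulo 16 as a value in 0..15, also for negative x. */
int mod16(long x)
{
    return (int)(((x % 16) + 16) % 16);
}

/* %ebp (mod 16) after the prologue, given the caller's %esp (mod 16)
 * just before the call instruction. */
int ebp_after_prologue(long caller_esp)
{
    long esp = caller_esp;
    esp -= 4;            /* call pushes the 4-byte return address */
    esp -= 4;            /* pushl %ebp */
    return mod16(esp);   /* movl %esp, %ebp */
}

/* Address (mod 16) of the slot written by movdqa %xmm0, -56(%ebp). */
int store_slot(long caller_esp)
{
    return mod16(ebp_after_prologue(caller_esp) - 56);
}
```

With a caller that keeps %esp 16-byte aligned at the call, `ebp_after_prologue(0)` is 8 (the same as -8 modulo 16) and `store_slot(0)` is 0, so the store is aligned. If the caller is off by 4, `store_slot(4)` is 4 and the movdqa faults.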
Comment 31 Alexander Bezrukov 2010-06-11 01:41:50 UTC
Thank you, Ed. I missed that. Life on caffeine doesn't promote a sharp mind :)
Could you share why the problem is x86-specific? Is it by chance, or is there some reason the stack is never misaligned on amd64?
Comment 32 Ed Catmur 2010-06-11 17:31:10 UTC
Stack alignment (especially in userspace) appears to be a matter of historical consensus (and lack thereof) between compiler and OS vendors, rather than of any design or standard. The fact that vendors have settled on 16-byte alignment on x86-64 seems driven by the preexistence of SSE2 and other 16-byte-aligned instruction sets; I haven't found any reference to it in AMD material (admittedly, I haven't looked too hard).

Sadly, I expect that when some future super-vectorised instruction set extension requires 32- or 64-byte alignment, we're going to see this issue all over again :)
Comment 33 Roger 2010-06-13 02:00:12 UTC
Upgraded to gcc-4.4.3-r2 & recompiled system.  Seamonkey & Icecat, basically all Mozilla GUI browsers segfault with little to no explanation.

After recompiling sys-libs/zlib-1.2.3-r1 using debug cflags, my  @#$@#$ unexplained segfaults were immediately resolved.

Is there a standard work around for this?? Can a user upgrade to 1.2.4 or 1.2.5 to evade this GCC bug with zlib?
Comment 34 SpanKY gentoo-dev 2010-06-13 02:12:18 UTC
there is no "working" version of gcc, thus masking any package or version makes no sense.  the workaround is to not use -ftree-vectorize.

complaining on this bug will also make no difference to the issue.  we know the issue exists, we aren't experts (or even passingly knowledgeable) in the code in question to attempt changing anything, nor are there any real patches that can be considered for us to merge.  so track it in the already linked upstream bug if you want to keep up-to-date.
Comment 35 Roger 2010-06-13 05:02:56 UTC
aka "-fno-tree-vectorize"

# cat /etc/portage/env/sys-libs/zlib
CFLAGS="-march=pentium3 -O3 -pipe -fomit-frame-pointer -fno-tree-vectorize"
CXXFLAGS="${CFLAGS}"
LDFLAGS=""

However, as you stated, the bug still exists, and this works around only one package, as opposed to recompiling the entire system with "-fno-tree-vectorize".

Using 32 bit x86 (P3) here.

Watching http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41156 here.
Comment 36 Ed Catmur 2010-06-13 07:21:03 UTC
Roger: Please check that you have the patch from https://bugzilla.mozilla.org/show_bug.cgi?id=552627 - this should fix xulrunner (firefox etc.) crashes that are caused by JS reflection.
Comment 37 Andrew Savchenko gentoo-dev 2010-06-13 08:05:04 UTC
(In reply to comment #35)
> aka "-fno-tree-vectorize"

Do not forget that -fno-tree-vectorize is not a complete workaround. The tree vectorizer merely tends to produce misaligned SSE and similar code on x86 more often. There is no guarantee at all that misalignment will not occur under some other conditions.

-mstackrealign is not an ultimate solution either, as described above. So the only solution is to pray for the gcc guys to fix this someday before the end of the world 8-).
Comment 38 Ed Catmur 2010-06-13 08:54:51 UTC
(In reply to comment #37)
> -mstackrealing is not an ultimate solution also as described above. So the only
> solution is to pray for gcc guys fixing this someday before the end of the
> world 8-).

Waiting for changes in gcc is unlikely to achieve anything; the ABI is now 16 bytes and packages are expected to work with it or use -mstackrealign (what are you referring to "above"?).

There's not much need for this bug to remain open now; what we need is a tracker bug to individually fix broken packages (mozilla, OOo, java, etc.).
Comment 39 Samuli Suominen gentoo-dev 2010-07-02 09:20:46 UTC
*** Bug 326579 has been marked as a duplicate of this bug. ***
Comment 40 Torsten Kurbad 2010-08-18 14:50:30 UTC
*** Bug 333307 has been marked as a duplicate of this bug. ***
Comment 41 Samuli Suominen gentoo-dev 2010-09-14 13:10:59 UTC
*** Bug 308133 has been marked as a duplicate of this bug. ***
Comment 42 Samuli Suominen gentoo-dev 2010-10-19 08:07:41 UTC
*** Bug 341725 has been marked as a duplicate of this bug. ***
Comment 43 Tiago Marques 2010-12-10 23:25:51 UTC
I tested everything, and what is breaking it for me is -funroll-loops. The rest, including -ftree-vectorize, has no influence. I have also not observed the crashes reported and I've recompiled the whole installation with:
-O2 -march=i686 -mtune=athlon-xp -funroll-loops -ftree-vectorize

Please check if -funroll-loops is also breaking it for you.

Best regards,
Tiago
Comment 44 Andrew Savchenko gentoo-dev 2010-12-12 19:51:20 UTC
(In reply to comment #43)
> Please check if -funroll-loops is also breaking it for you.

I do not use -funroll-loops. As of gcc-4.5.1-r1 -ftree-vectorize is still broken, or precisely speaking SSE alignment is still broken. If you have not hit this you may be just lucky.

However, -ftree-vectorize together with -mstackrealign (as described in the upstream bug) works fine for me on both Athlon-XP and Atom N270.
Comment 45 SpanKY gentoo-dev 2011-03-13 23:16:56 UTC
*** Bug 356159 has been marked as a duplicate of this bug. ***
Comment 46 SpanKY gentoo-dev 2012-03-13 22:02:35 UTC
hopefully should be addressed in gcc-4.5+, and since 4.5.3 is stable now, close this out
Comment 47 Roger 2012-03-13 23:30:18 UTC
confirmed: sys-libs/zlib-1.2.5-r2 compiled fine here using CFLAGS -fno-tree-vectorize with gcc-4.5.3 (i686-pc-linux-gnu-4.5.3)
Comment 48 Andrew Savchenko gentoo-dev 2012-03-13 23:45:32 UTC
With gcc-4.5.3 -ftree-vectorize works for me only with -mstackrealign, otherwise random packages fail.
Comment 49 Petr Zima 2012-03-14 11:22:46 UTC
The bug is NOT FIXED in gcc-4.5. I am using ~x86 on a P4 and still need a workaround. The CFLAGS -fno-tree-vectorize and -mstackrealign have always been just workarounds.

Also note that the upstream bug referred to in the URL is still open, but there has been no activity for a year now. Maybe closing this bug as WONTFIX would be appropriate, since it affects custom CFLAGS (-O3 or -ftree-vectorize) only.
Comment 50 SpanKY gentoo-dev 2012-03-15 19:04:32 UTC
the upstream bug is not relevant.  that talks about assembly functions that misalign the stack in violation of the x86 ABI.  -mstackrealign is a cudgel to workaround those.  the upstream reporter wants all functions to check their stack alignment instead of requiring hand written asm to realign their stack.

if zlib is misaligning the stack in assembly code, then that's a bug in zlib.  if other projects are misaligning the stack before calling zlib and then crashing, that's a bug in those other projects.  if gcc is actually misaligning the stack, then that's a bug in gcc.

so at this point in time, we need to narrow down exactly what's crashing.
Comment 51 Ryan Hill (RETIRED) gentoo-dev 2012-03-16 04:28:41 UTC
(In reply to comment #50)
> if zlib is misaligning the stack in assembly code, then that's a bug in
> zlib.  if other projects are misaligning the stack before calling zlib and
> then crashing, that's a bug in those other projects.  if gcc is actually
> misaligning the stack, then that's a bug in gcc.
> 
> so at this point in time, we need to narrow down exactly what's crashing.

As mentioned in comment #30, the problem is that other projects are misaligning the stack before calling zlib.  There are several bugs open in Mozilla's Bugzilla regarding stack misalignment and we continue to have issues because of it (e.g. avx).
Comment 52 SpanKY gentoo-dev 2012-03-16 05:06:54 UTC
(In reply to comment #51)

great.  so we agree zlib isn't broken (mozilla is), and the right thing to do is fix those projects.  nothing for gcc to do here.

people who want to use full tree vectorize flags can turn on -mstackrealign in zlib via /etc/portage/env/.  i don't think using that flag in firefox would help if the crash occurs when calling zlib since that flag realigns stack upon function entry.
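-mstackrealign applies to every function in the files compiled with it. For a single entry point, GCC on x86 also documents a function attribute, `force_align_arg_pointer`, that generates the realigning prologue for just that function. A minimal sketch, assuming a hypothetical library entry point `checksum_entry` and a made-up macro name `REALIGN_ON_ENTRY`; the macro expands to nothing on other targets so the file still compiles there.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Realign the stack in this function's prologue, whatever the caller did. */
#define REALIGN_ON_ENTRY __attribute__((force_align_arg_pointer))
#else
#define REALIGN_ON_ENTRY
#endif

/* Hypothetical entry point that callers with a misaligned stack may reach.
 * Sums the bytes of buf[0..n-1] using a small on-stack accumulator array. */
REALIGN_ON_ENTRY
unsigned long checksum_entry(const unsigned char *buf, size_t n)
{
    unsigned int lanes[4] = {0, 0, 0, 0};   /* 16 bytes on the stack */
    size_t i;

    for (i = 0; i < n; i++)
        lanes[i & 3] += buf[i];

    return (unsigned long)lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

As with the flag, this protects the callee: it has to be applied on the side that runs the vectorized code, not in the caller that misaligns the stack.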

my understanding of the new AVX insns is that their alignment requirements are much less than that of SSE.  so as people convert to AVX, the issue will get better.  this doesn't help the poor saps on 32bit installs though :).
Comment 53 Petr Zima 2012-03-19 13:08:45 UTC
(In reply to comment #52)
> (In reply to comment #51)
> 
> great.  so we agree zlib isn't broken (mozilla is), and the right thing to
> do is fix those projects.  nothing for gcc to do here.
> 
> people who want to use full tree vectorize flags can turn on -mstackrealign
> in zlib via /etc/portage/env/.  i don't think using that flag in firefox
> would help if the crash occurs when calling zlib since that flag realigns
> stack upon function entry.
> 
> my understanding of the new AVX insns is that their alignment requirements
> are much less than that of SSE.  so as people convert to AVX, the issue will
> get better.  this doesn't help the poor saps on 32bit installs though :).

Ok, seems you are right, since all the problematic packages (callers of zlib) afaik mess with asm. So we are supposed to file bugs for firefox and the like. Sorry for the noise.
Comment 54 Andrew Savchenko gentoo-dev 2012-05-31 21:38:34 UTC
-ftree-vectorize is badly broken on ~x86 again. Even together with -mstackrealign it miscompiles code. At least 3 cases found:

1) net-misc/rsync-3.0.9.r2
Daemon part hangs during data exchange over ssh.

2) dev-lang/python-3.2.3-r1
./python -E ./setup.py build
loops forever during package build at 100% CPU usage.

3) www-client/chromium-20.0.1132.17
fails during linking.

All issues above were fixed by purging -ftree-vectorize -mstackrealign from *FLAGS.
Comment 55 Andrew Savchenko gentoo-dev 2012-05-31 21:39:28 UTC
... with gcc-4.6.3.
Comment 56 SpanKY gentoo-dev 2012-06-01 05:42:48 UTC
(In reply to comment #54)

file a new bug.  this one has gone on long enough.