Bug 409471 - dev-python/pypy-1.8-r1 CFLAGS=-march=native fails to compile
Summary: dev-python/pypy-1.8-r1 CFLAGS=-march=native fails to compile
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Dirkjan Ochtman (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-03-23 21:00 UTC by Mads
Modified: 2012-06-19 21:40 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Attachments
build.log (build.log.gz,62.83 KB, application/x-gzip)
2012-03-30 22:45 UTC, Mads
Upstream patch that fixes unrecognized AVX ops bug (1.8-unknown-opcodes.patch,654 bytes, patch)
2012-04-22 05:49 UTC, WANG Xuerui

Description Mads 2012-03-23 21:00:23 UTC
If you try to compile PyPy with -march=native in CFLAGS on an AMD Bulldozer CPU, the build crashes when it starts compiling the generated .c files.

Reproducible: Always

Steps to Reproduce:
1. Have -march=native in CFLAGS and a Bulldozer CPU
2. Try to compile pypy-1.8-r1

Actual Results:  
It crashes just after trying to compile the generated .c-files.

Expected Results:  
PyPy 1.8 gets installed.

The CFLAGS used when using march=native on a Bulldozer CPU is:

# cc -march=native -E -v - </dev/null 2>&1 | grep cc1
 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.6.2/cc1 -E -quiet -v - -march=bdver1 -mcx16 -msahf -mno-movbe -maes -mpclmul -mpopcnt -mabm -mlwp -mno-fma -mfma4 -mxop -mno-bmi -mno-tbm -mavx -msse4.2 -msse4.1 --param l1-cache-size=16 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver1
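
To make the expansion above easier to read, here is a minimal Python sketch that splits such a cc1 line into enabled and disabled instruction-set flags. The helper name `split_isa_flags` is hypothetical; the input string is the cc1 line from this report.

```python
import shlex

# cc1 line copied from the report above (GCC 4.6.2 on an FX-8120).
CC1_LINE = (
    "/usr/libexec/gcc/x86_64-pc-linux-gnu/4.6.2/cc1 -E -quiet -v - "
    "-march=bdver1 -mcx16 -msahf -mno-movbe -maes -mpclmul -mpopcnt -mabm "
    "-mlwp -mno-fma -mfma4 -mxop -mno-bmi -mno-tbm -mavx -msse4.2 -msse4.1 "
    "--param l1-cache-size=16 --param l1-cache-line-size=64 "
    "--param l2-cache-size=2048 -mtune=bdver1"
)


def split_isa_flags(cc1_line):
    """Return (enabled, disabled) -m flag names found in a cc1 command line.

    Hypothetical helper for illustration; -march= and -mtune= are skipped
    because they select a CPU model rather than a single extension.
    """
    enabled, disabled = [], []
    for token in shlex.split(cc1_line):
        if not token.startswith("-m") or "=" in token:
            continue
        name = token[2:]
        if name.startswith("no-"):
            disabled.append(name[3:])
        else:
            enabled.append(name)
    return enabled, disabled


enabled, disabled = split_isa_flags(CC1_LINE)
print("enabled: ", " ".join(enabled))
print("disabled:", " ".join(disabled))
```

Each enabled entry is a candidate to switch off with the matching -mno-<name> flag when bisecting for the culprit.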
Comment 1 Mads 2012-03-23 21:06:14 UTC
If you remove march=native from CFLAGS, it works.

Sorry for not providing logs... but it seems it's easy reproducible at least. It takes so long for the PyPy compile to crash, I just thought I should report this bug before I forgot.

I tried this with GCC 4.6.2, GCC 4.7, and even Path64. The same error happened every time.

I guess I'll upload the build.log later.
Comment 2 Jeroen Roovers (RETIRED) gentoo-dev 2012-03-24 16:19:15 UTC
1) Please attach the entire build log to this bug report.
2) Please post your `emerge --info' output in a comment.
Comment 3 Mads 2012-03-30 22:45:56 UTC
Created attachment 307243 [details]
build.log

# emerge --info
Portage 2.2.0_alpha96 (default/linux/amd64/10.0, gcc-4.6.2, glibc-2.14.1-r2, 3.3.0-gentoo x86_64)
=================================================================
System uname: Linux-3.3.0-gentoo-x86_64-AMD_FX-tm-8120_Eight-Core_Processor-with-gentoo-2.1
Timestamp of tree: Fri, 30 Mar 2012 20:30:01 +0000
app-shells/bash:          4.2_p24
dev-java/java-config:     2.1.11-r3
dev-lang/python:          2.7.2-r3, 3.2.2-r1
dev-util/cmake:           2.8.7-r5
dev-util/pkgconfig:       0.26
sys-apps/baselayout:      2.1
sys-apps/openrc:          0.9.9.3
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.13, 2.68
sys-devel/automake:       1.11.3
sys-devel/binutils:       2.22-r1
sys-devel/gcc:            4.6.2
sys-devel/gcc-config:     1.6
sys-devel/libtool:        2.4.2
sys-devel/make:           3.82-r3
sys-kernel/linux-headers: 3.3 (virtual/os-headers)
sys-libs/glibc:           2.14.1-r2
Repositories: gentoo x11 xhub rion lxde local
Installed sets: 
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA Oracle-BCLA-JavaSE AdobeFlash-10.3"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -march=native -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/themes/oxygen-gtk/gtk-2.0"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=native -pipe"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--quiet-build=n"
FEATURES="assume-digests binpkg-logs distlocks ebuild-locks fixlafiles news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans"
FFLAGS=""
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="en en_US"
MAKEOPTS="-j8"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/lib/layman/x11 /var/lib/layman/xhub /var/lib/layman/rion /var/lib/layman/lxde /usr/portage/local"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="3dnow 3dnowext X acl aes-ni alsa amd64 apng avx bzip2 cairo cli consolekit cracklib crypt cups cxx dbus dri dvb egl flac fortran gdbm gif gpm hvm iconv icu jpeg kde libnotify lxde lzma mmx mmxext modules mp3 mudflap multilib ncurses nls nptl nptlonly ogg opengl openmp optimized-qmake pam pcre pgo png policykit pppd pvr qt3support qt4 readline samba semantic-desktop session sse sse2 sse3 sse4 sse4_1 sse4_2 sse4a sse5 ssl ssse3 startup-notification svg sysfs system-sqlite tcpd threads tiff truetype udev unicode v4l vaapi vdpau vorbis xcb xcomposite xinerama xorg xv xvmc zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en en_US" PHP_TARGETS="php5-3" RUBY_TARGETS="ruby18" USERLAND="GNU" 
VIDEO_CARDS="fglrx radeon" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 4 Mads 2012-03-30 22:48:31 UTC
The logfile is gzipped.
Comment 5 Jeroen Roovers (RETIRED) gentoo-dev 2012-04-02 01:41:42 UTC
(In reply to comment #0)
> The CFLAGS used when using march=native on a Bulldozer CPU is:
> 
> # cc -march=native -E -v - </dev/null 2>&1 | grep cc1
>  /usr/libexec/gcc/x86_64-pc-linux-gnu/4.6.2/cc1 -E -quiet -v - -march=bdver1
> -mcx16 -msahf -mno-movbe -maes -mpclmul -mpopcnt -mabm -mlwp -mno-fma -mfma4
> -mxop -mno-bmi -mno-tbm -mavx -msse4.2 -msse4.1 --param l1-cache-size=16
> --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver1

So which of these is the culprit?

Looks like it also fails to respect the local CC and inserts its own -O3.
Comment 6 Ian Delaney (RETIRED) gentoo-dev 2012-04-03 17:50:38 UTC
CFLAGS="-march=native -pipe -O2"
CHOST="x86_64-pc-linux-gnu"
CXXFLAGS="${CFLAGS}"

archtester virtuoso-server # eix pypy
[I] dev-python/pypy
     Available versions:  

     Installed versions:  1.7-r2(1.7)(02:51:48 09/02/12)(bzip2 jit ncurses sqlite ssl xml -doc -examples -sandbox) 1.8-r1(1.8)(21:30:29 28/02/12)(bzip2 jit ncurses shadowstack sqlite ssl xml -doc -examples -sandbox

Don't be so quick to generalise. I have an AMD CPU, and not one but two instances of pypy emerged happily.
Comment 7 Marien Zwart (RETIRED) gentoo-dev 2012-04-03 18:34:02 UTC
Looks like this is https://bugs.pypy.org/issue1095 upstream, and we should backport the upstream change adding 'v' to the list in trackgcroot.py (or rather: we should just grab all upstream additions to IGNORE_OPS_WITH_PREFIXES).

Mads: this is only "easy reproducible" if you actually have a Bulldozer CPU (or a recent Intel one, from the looks of it) to run the build on. Forcing -march=<some architecture other than my own> is unlikely to work (some of the binaries built before pypy itself are executed). Therefore please do include build logs (all we need to see to match this to an upstream bug is the "__main__.UnrecognizedOperation: vmulsd" line, in this case), or send all developers sufficiently recent CPUs so they can actually reproduce the problem :)
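
For readers unfamiliar with the mechanism, the following is a simplified sketch of why adding 'v' to a prefix list helps. It is not the actual trackgcroot.py code: the names `IGNORE_OPS_WITH_PREFIXES` and `UnrecognizedOperation` come from this thread, while `check_opname` and the list contents are illustrative assumptions.

```python
# Illustrative entries only; these are NOT copied from PyPy. The real
# IGNORE_OPS_WITH_PREFIXES list in trackgcroot.py is longer.
IGNORE_OPS_WITH_PREFIXES = ["cmp", "test", "mul", "div", "cvt"]


class UnrecognizedOperation(Exception):
    """Raised for an assembler mnemonic that no rule covers."""


def check_opname(opname, prefixes):
    """Return the first prefix that covers opname, or raise.

    Hypothetical stand-in for the lookup that fails in the build log.
    """
    for prefix in prefixes:
        if opname.startswith(prefix):
            return prefix
    raise UnrecognizedOperation(opname)


# AVX (VEX-encoded) forms of SSE instructions carry a leading "v", so a
# rule for "mul" covers "mulsd" but not "vmulsd".
for name in ("mulsd", "vmulsd"):
    for label, prefixes in (
        ("original", IGNORE_OPS_WITH_PREFIXES),
        ("with 'v'", IGNORE_OPS_WITH_PREFIXES + ["v"]),
    ):
        try:
            print(name, label, "->", check_opname(name, prefixes))
        except UnrecognizedOperation as exc:
            print(name, label, "-> UnrecognizedOperation:", exc)
```

A blanket "v" prefix only helps when the ignored instructions really are irrelevant to the analysis, which is a judgment the upstream change makes, not this sketch.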
Comment 8 Mads 2012-04-04 08:41:52 UTC
Line 4394 in the attached build.log:

[platform:Error] __main__.UnrecognizedOperation: vmulsd

The build.log.gz file attached is correct, it's just gzipped twice. I guess that happened because I uploaded it as "build.log" when it should be "build.log.gz", and then later Jeroen Roovers set the mime type to be application/x-gzip which I guess compressed it one more time, somehow.

> gunzip build.log.gz
> mv build.log build.log.gz
> gunzip build.log.gz

Should do the trick :)
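
The same unwrapping can be sketched in Python without renaming files. This assumes only that each layer starts with the gzip magic bytes; the function name `gunzip_all` and the layer limit are hypothetical.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def gunzip_all(data, max_layers=5):
    """Decompress repeatedly while the data still looks gzipped.

    Returns (payload, layers_removed). max_layers is an arbitrary safety
    limit chosen for this sketch.
    """
    layers = 0
    while data.startswith(GZIP_MAGIC) and layers < max_layers:
        data = gzip.decompress(data)
        layers += 1
    return data, layers


# Demonstration with an in-memory stand-in for the attachment.
line = b"[platform:Error] __main__.UnrecognizedOperation: vmulsd\n"
twice_compressed = gzip.compress(gzip.compress(line))
payload, layers = gunzip_all(twice_compressed)
print(layers, "layer(s) removed")
```

For the real attachment, read the file in binary mode and pass its contents to the function.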
Comment 9 WANG Xuerui gentoo-dev 2012-04-22 05:48:17 UTC
The unrecognized operation is from the AVX instruction set, hence all CPUs with an "avx" cpuinfo flag are affected. The bug can be reproduced by adding "-mavx" to CFLAGS.

The upstream bug has been fixed for a while; I integrated their patch into the Portage tree for testing, and pypy-1.8-r1 compiled successfully.
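
To see whether a given machine falls into the affected group, a small sketch along these lines can check for the "avx" flag. It assumes a Linux x86 /proc/cpuinfo with a "flags" line; the helper names `cpu_flags` and `has_avx` are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(cpuinfo_text):
    """True if the cpuinfo text lists the 'avx' flag."""
    return "avx" in cpu_flags(cpuinfo_text)


try:
    with open("/proc/cpuinfo") as handle:
        text = handle.read()
except OSError:
    text = ""  # not Linux, or /proc is unavailable

print("avx flag present:", has_avx(text))
```

Per the later comments, SSE4.1 and SSE4.2 instructions can trigger the same failure, so the "sse4_1" and "sse4_2" entries are worth checking the same way.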
Comment 10 WANG Xuerui gentoo-dev 2012-04-22 05:49:51 UTC
Created attachment 309767 [details, diff]
Upstream patch that fixes unrecognized AVX ops bug
Comment 11 Mads 2012-05-18 21:41:04 UTC
That patch unfortunately doesn't do the trick. I guess there's another CFLAG interfering when you have a Bulldozer CPU. I tested the build with the CFLAGS-setting I pasted earlier minus the "-mavx", but the compile still crashes.

(Of course I also tried applying the patch, same crash.)

I'll update the bug later with what I find. If anyone knows which flag might be the culprit, please do tell.
Comment 12 Mads 2012-05-21 09:59:21 UTC
It seems that -msse4.2 or -msse4.1 might be the culprit. These flags work:

CFLAGS="-O3 -march=native -pipe -mno-avx -mno-sse4.2 -mno-sse4.1"

(without any patch).

I'm trying with only no-sse4.2 as we speak to see if I can narrow it down even more.
Comment 13 Mads 2012-05-21 16:45:53 UTC
Done testing. Both -msse4.1 and -msse4.2 make the compile crash; it doesn't matter whether you use just one of them or both. The compile crashes because of an unrecognized opname, in the same way as when compiling with -mavx without the patch:

[translation:ERROR]       File "/tmp/portage/dev-python/pypy-1.8-r1/work/pypy1.8/pypy/translator/c/gcc/trackgcroot.py", line 234, in find_missing_visit_method
[translation:ERROR]         raise UnrecognizedOperation(opname)
[translation:ERROR]     __main__.UnrecognizedOperation: movups

So, with a Bulldozer CPU you have to use "-mno-avx -mno-sse4.2 -mno-sse4.1" when using -march=native.
Comment 14 WANG Xuerui gentoo-dev 2012-05-29 16:21:46 UTC
(In reply to comment #13)
> So, with a Bulldozer CPU you have to use "-mno-avx -mno-sse4.2 -mno-sse4.1"
> when using -march=native.

Or add those insns you encountered to the mentioned list of ignored insn prefixes.

My processor is a Core i7-2670QM which has a different insn encoding format for AVX, so the names for those AVX ops are just different. Obviously we need to further extend that list in the patch... disabling potential vectorization merely for passing compilation without patching just isn't worth it, IMO.
Comment 15 Francesco Riosa 2012-06-09 20:43:58 UTC
The bug is present in dev-python/pypy-1.9 too.



Please, please add a check earlier in the ebuild for known-broken CFLAGS components. The problem is that it currently takes more than an hour to reach the failing point.

A five-second wait and a big fat warning would be best.



is-flag() from "flag-o-matic.eclass" seems apt for the job:

# @FUNCTION: is-flag
# @USAGE: <flag>
# @DESCRIPTION:
# Echo's "true" if flag is set in {C,CXX,F,FC}FLAGS.  Accepts shell globs.
is-flag() 

Per other comments, at least -msse4.* and -mavx* are incompatible (this seems confirmed here: 1.9 failed with "__main__.UnrecognizedOperation: roundsd", and roundsd is an SSE4.1 instruction).

If -march is also involved, things get a bit more complicated, but it may be feasible anyway.
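
As a rough illustration of such an early check, here is a Python sketch rather than ebuild code. It matches flags against shell-style globs, which is how the is-flag description says it behaves. The pattern list is taken from this comment and should be treated as an assumption; `SUSPECT_PATTERNS` and `suspect_flags` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Globs based on the flags reported as problematic in this bug.
SUSPECT_PATTERNS = ["-msse4*", "-mavx*"]


def suspect_flags(cflags, patterns=SUSPECT_PATTERNS):
    """Return the flags in a CFLAGS string that match any suspect glob."""
    return [
        flag
        for flag in cflags.split()
        if any(fnmatchcase(flag, pattern) for pattern in patterns)
    ]


for cflags in (
    "-O2 -march=native -pipe",
    "-O2 -march=bdver1 -mavx -msse4.2 -msse4.1",
    "-O3 -march=native -pipe -mno-avx -mno-sse4.2 -mno-sse4.1",
):
    hits = suspect_flags(cflags)
    print(cflags, "->", hits if hits else "no literal match")
```

The first case shows the limitation: -march=native enables these extensions without naming them, so a literal check would have to expand it through the compiler first, as in the cc1 output in the description.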

---- SUCCESSFUL FLAGS: ----
ECFLAGSbase='-O2 -march=core2'
ECFLAGSnative='-mno-movbe -mno-aes -mno-pclmul -mno-popcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm -mno-avx -mno-sse4.2 -mno-sse4.1 --param l1-cache-size=32 --param=l1-cache-line-size=64 --param=l2-cache-size=2048 -mtune=core2'
ECFLAGSo3='-fgcse-after-reload -fpredictive-commoning -ftree-vectorize -funswitch-loops'
ECFLAGSpypy='-Wno-error -Wno-array-bounds -Wno-pointer-sign -Wno-pointer-to-int-cast -Wno-implicit-function-declaration -Wno-strict-overflow'

CFLAGS="${ECFLAGSbase} ${ECFLAGSnative} ${ECFLAGSo3} ${ECFLAGSpypy}"
CXXFLAGS="${CFLAGS} -fpermissive"


---- FAILING FLAGS: ----
ECFLAGSbase='-O2 -march=corei7-avx -pipe -frecord-gcc-switches'
ECFLAGSnative='-mno-movbe -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=8192 -mtune=corei7-avx'
ECFLAGSo3='-fgcse-after-reload -fpredictive-commoning -ftree-vectorize -funswitch-loops'
ECFLAGSgraphite='-fgraphite-identity -floop-block -floop-interchange -floop-strip-mine'
ECFLAGSdbug='-ggdb -gdwarf-4 -fvar-tracking-assignments'
ECFLAGSlto=''
CFLAGS="${ECFLAGSbase} ${ECFLAGSnative} ${ECFLAGSo3} ${ECFLAGSgraphite} ${ECFLAGSlto} ${ECFLAGSdbug}"
CHOST="x86_64-pc-linux-gnu"
CXXFLAGS="${CFLAGS} -fvisibility-inlines-hidden"
Comment 16 Marien Zwart (RETIRED) gentoo-dev 2012-06-18 17:02:19 UTC
roundsd should be fixed in 1.9-r1. Please reopen if 1.9-r1 still fails. Adding a check is rather difficult: we obviously should not forbid -march=native, and figuring out if your gcc with the flags you're using may under some conditions generate code pypy's script doesn't accept is not all that easy (patches accepted, but I don't really see how you'd do it).
Comment 17 Francesco Riosa 2012-06-19 21:40:31 UTC
(In reply to comment #16)
> roundsd should be fixed in 1.9-r1. Please reopen if 1.9-r1 still fails.

confirmed fixed, thanks

> Adding a check is rather difficult: we obviously should not forbid
> -march=native, and figuring out if your gcc with the flags you're using may
> under some conditions generate code pypy's script doesn't accept is not all
> that easy (patches accepted, but I don't really see how you'd do it).

Understandable, not to mention that testing requires a very recent CPU.