During stabilization of blas-atlas-3.6.0 it was discovered that the code fails to compile on some PPC machines. Currently it looks as if xemit_mm creates "bad" code of the form

----------------- SNIP ----------------------------
void ATL_dpKBmm_b0
   (const int M, const int N, const int K, const double alpha,
    const double *A, const int lda, const double *B, const int ldb,
    const double beta, double *C, const int ldc)
{
   else
   {
      ATL_dupKBmm1_1_1_b0(M, N, K, alpha, A, lda, B, ldb, beta, C, ldc);
   }
   else if (K == (((((K) >> 1)) << 1)))
   {
      ATL_dupKBmm0_2_0_b0(M, N, K, alpha, A, lda, B, ldb, beta, C, ldc);
   }
}
-------------------------------------------------------------------------------------

A bug has been filed upstream to hopefully resolve this issue:
https://sourceforge.net/tracker/index.php?func=detail&aid=1417683&group_id=23725&atid=379483
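For anyone reading along: the generated function in the report cannot compile because its body opens with an `else` that has no preceding `if`, which is exactly what gcc's "parse error before else" complains about. The sketch below is not ATLAS code. It only shows the shape a well-formed dispatcher of this kind would have. The enum, the function names `k_is_multiple_of_2` and `choose_kernel`, and the mapping of branches to kernels are assumptions made for illustration; which condition xemit_mm dropped is not known from this report. The one piece taken from the snippet is the test `K == ((K >> 1) << 1)`, which holds exactly when a non-negative K is even.

```c
/* Illustrative only: hypothetical stand-ins, not the ATLAS kernels. */
enum kernel { KERNEL_GENERAL = 0, KERNEL_K_EVEN = 1 };

/* Same test as in the generated snippet: shifting right then left clears
 * the lowest bit, so the result equals K only when K is even.
 * Assumes K >= 0, as for a matrix dimension. */
int k_is_multiple_of_2(int K)
{
    return K == ((K >> 1) << 1);
}

/* A well-formed chain: every `else` follows an `if`. The condition that
 * the generator lost is unknown, so this sketch uses only the one
 * condition that is visible in the report. */
enum kernel choose_kernel(int K)
{
    if (k_is_multiple_of_2(K)) {
        return KERNEL_K_EVEN;    /* stands in for the unrolled-by-2 kernel */
    } else {
        return KERNEL_GENERAL;   /* stands in for the general cleanup kernel */
    }
}
```

Calling `choose_kernel(4)` selects the even-K branch and `choose_kernel(7)` falls through to the general one. The point is only that the emitter has to produce the `if` before any `else`.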
Folks,

the ATLAS developers have gotten back to me and here is the relevant quote from their response:

----------- SNIP --------------------------------
First, let me confirm that this is a known bug in the ATLAS framework. I think it's still in the newest dev release as well. I've known about it for about 5 years :)

OK, so why isn't it fixed? The reason is that it only rarely happens, and never when the search has found a really good solution (i.e. the only times I observed it, the search had gone awry and was timing non-optimal cases). For machines where it happened, I intervened by providing architectural defaults that provided better performance, as well as not producing illegal code :-}
-------------------------------------------------------------------------------------

Hence we've hit a known issue with some ppc machines and this is probably the reason why it works for some people and not for others.

@ppc team: Here are my thoughts on how to deal with this. Please let me know what you think.

Due to this issue I don't feel comfortable moving blas-atlas into ppc, but I would be fine having it in ~ppc since it seems to work in most cases. If that sounds like a reasonable solution for now I'll close this bug as WONTFIX.

Thanks,
Markus
Bum! I have been bitten by exactly this.

/usr/lib/ccache/bin/gcc -DL2SIZE=4194304 -I/var/tmp/portage/blas-atlas-3.6.0-r1/work/ATLAS/include -I/var/tmp/portage/blas-atlas-3.6.0-r1/work/ATLAS/include/Linux_UNKNOWNAltiVec -I/var/tmp/portage/blas-atlas-3.6.0-r1/work/ATLAS/include/contrib -DAdd__ -DStringSunStyle -DATL_OS_Linux -DATL_AltiVec -DATL_AVgcc -DATL_GAS_LINUX_PPC -DATL_UCLEANM -DATL_UCLEANN -DATL_UCLEANK -c -fomit-frame-pointer -O -maltivec -mabi=altivec ATL_dupKBmm_b0.c -fPIC -DPIC -o .libs/ATL_dupKBmm_b0.o
ATL_dupKBmm_b0.c: In function `ATL_dpKBmm_b0':
ATL_dupKBmm_b0.c:26: error: parse error before "else"
make[8]: *** [ATL_dupKBmm_b0.o] Error 1
---------------

For some reason I didn't have this problem with 3.6.0 (I think); it only surfaced when 3.6.0-r1 appeared in my tree. I will check 3.6.0 shortly.

For info, my machine is an iMac G4, second revision (I think).

emerge info:

Portage 2.0.54 (default-linux/ppc/2005.1/ppc/G4, gcc-3.4.4, glibc-2.3.5-r3, 2.6.14-gentoo-r5 ppc)
=================================================================
System uname: 2.6.14-gentoo-r5 ppc 7450, altivec supported
Gentoo Base System version 1.6.14
ccache version 2.3 [enabled]
dev-lang/python: 2.3.5-r2, 2.4.2
sys-apps/sandbox: 1.2.12
sys-devel/autoconf: 2.13, 2.59-r6
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils: 2.16.1
sys-devel/libtool: 1.5.22
virtual/os-headers: 2.6.11-r2
ACCEPT_KEYWORDS="ppc"
AUTOCLEAN="yes"
CBUILD="powerpc-unknown-linux-gnu"
CFLAGS="-O2 -mcpu=7450 -pipe -maltivec -mabi=altivec -mpowerpc-gfxopt -fsigned-char -frename-registers -fweb -fno-strict-aliasing"
CHOST="powerpc-unknown-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.4/env /usr/kde/3.4/share/config /usr/kde/3.4/shutdown /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -mcpu=7450 -pipe -maltivec -mabi=altivec -mpowerpc-gfxopt -fsigned-char -frename-registers -fweb -fno-strict-aliasing"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig ccache distlocks fixpackages sandbox sfperms strict"
GENTOO_MIRRORS="ftp://ftp.vic.keypoint.com.au http://mirrors.tds.net/gentoo ftp://mirrors.tds.net/gentoo http://mirror.tucdemonic.org/gentoo/"
LC_ALL="en_NZ.UTF-8"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.au.gentoo.org/gentoo-portage"
USE="ppc X a52 aac aalib alsa altivec arts audiofile berkdb bitmap-fonts bzip2 cairo cddb cdf cdparanoia cdr crypt cups curl dts dv dvd dvdr dvdread emboss encode esd exif expat f77 fam fbcon ffmpeg flac foomaticdb fortran gd gdbm ggi gif glut gmp gpm graphviz gstreamer gtk gtk2 gtkhtml guile hal idn ieee1394 imagemagick imlib imlib2 ipv6 java jbig jpeg jpeg2k kde kdexdeltas lcms libwww live lzo mad matroska mikmod mjpeg mng motif mp3 mpeg ncurses netcdf network nls nptl ogg oggvorbis openexr opengl oss pam pcre pdflib perl plotutils png ppds python qt rdesktop readline samba sdl slang slp spell ssl svg szip tcltk tcpd tetex theora tiff truetype truetype-fonts type1-fonts udev unicode usb vorbis wmf xine xml xml2 xmms xv xvid zeroconf zlib userland_GNU kernel_linux elibc_glibc"
Unset: ASFLAGS, CTARGET, LANG, LDFLAGS, LINGUAS, MAKEOPTS, PORTDIR_OVERLAY
I successfully built 3.6.0 on my machine. Only 3.6.0-r1 shows the problem. I have masked 3.6.0-r1 locally. What is so essential in -r1 that it breaks the build process?
(In reply to comment #3)
> I succesfully build 3.6.0 on my machine. Only 3.6.0-r1 show the
> problem. I personally masked 3.6.0-r1 on my machine.
> What is so essential in -r1 that breaks the build process?
>

Hi Francois,

3.6.0-r1 has a fix to remove insecure RUNPATHS (see bug #114587). The error that you posted in #2 should be completely unrelated to this. However, I fixed a small bug in the 3.6.0-r1 ebuild that caused it to ignore some of the user's CFLAGS. Please try again and report back.

best,
Markus
(In reply to comment #4)
> Hi Francois,
>
> 3.6.0-r1 has a fix to remove insecure RUNPATHS (see bug #114587).
> The error that you posted in #2 should be completely
> unrelated to this. However, I fixed a small bug in the 3.6.0-r1 ebuild
> that caused it to ignore some of the user's CFLAGS. Please try again
> and report back.
>

Sorry for taking so long to answer; I wasn't on the CC list for this bug. (Did Bugzilla's behavior change? I thought you used to be put on the CC list automatically when contributing to a bug.)

I just had a go now, which failed miserably because I moved to gcc-4.1.0, which includes gfortran, while atlas expects g77:

which: no g77 in (/sbin:/usr/sbin:/usr/lib/portage/bin:/bin:/usr/bin:/opt/bin:/usr/powerpc-unknown-linux-gnu/gcc-bin/4.1.0:/opt/ibm-jdk-bin-1.4.2.03/bin:/opt/ibm-jdk-bin-1.4.2.03/jre/bin:/usr/kde/3.5/sbin:/usr/kde/3.5/bin:/usr/qt/3/bin:/usr/kde/3.4/sbin:/usr/kde/3.4/bin)
 * No fortran compiler found on the system!
 * Please add fortran to your USE flags and reemerge gcc!
-------------

I will give it another go with my gcc-3.4.5 profile later in the evening so that it is an overnight compile. But if we are to move to gcc-4.*, something will have to be done.
The ~arch ebuilds (3.7.x) are fixed for gcc-4.
(In reply to comment #4)
> Hi Francois,
>
> 3.6.0-r1 has a fix to remove insecure RUNPATHS (see bug #114587).
> The error that you posted in #2 should be completely
> unrelated to this. However, I fixed a small bug in the 3.6.0-r1 ebuild
> that caused it to ignore some of the user's CFLAGS. Please try again
> and report back.
>

Hi,

I tried to compile 3.6.0-r1 with my gcc-3.4.5 profile and it now fails at install time:

chmod 644 /var/tmp/portage/blas-atlas-3.6.0-r1/work/ATLAS/gentoo/libs/libatlas.a
powerpc-unknown-linux-gnu-ranlib /var/tmp/portage/blas-atlas-3.6.0-r1/work/ATLAS/gentoo/libs/libatlas.a
libtool: install: warning: remember to run `libtool --finish /usr/lib'
cd gentoo/libf77blas.a ; \
libtool --mode=link --tag=CC /usr/lib/ccache/bin/gcc -o libblas.la ../libs/libatlas.la *.lo \
-rpath /usr/lib/blas/atlas -lg2c ; \
rm -f .libs/libblas.so.0.0.0; \
/usr/lib/ccache/bin/gcc --shared .libs/*.o -lg2c /var/tmp/portage/blas-atlas-3.6.0-r1/work/ATLAS/gentoo/libs/libatlas.so -Wl,-soname -Wl,libblas.so.0 -o .libs/libblas.so.0.0.0; \
libtool --mode=install install -s libblas.la /var/tmp/portage/blas-atlas-3.6.0-r1/work/ATLAS/gentoo/libs
/bin/sh: line 0: cd: gentoo/libf77blas.a: No such file or directory
libtool: link: `*.lo' is not a valid libtool object
gcc: .libs/*.o: No such file or directory
libtool: install: `libblas.la' is not a valid libtool archive
Try `libtool --help --mode=install' for more information.
make[1]: *** [libblas.so] Error 1
make[1]: Leaving directory `/var/tmp/portage/blas-atlas-3.6.0-r1/work/ATLAS'
make: *** [shared-strip] Error 2
Hi Francois, You are suffering from the same problem that originally triggered this bug (see https://bugs.gentoo.org/show_bug.cgi?id=114587#c12) You could try upgrading to version 3.7.11 and see if this fixes the problem for you. Best, Markus
(In reply to comment #8)
> Hi Francois,
>
> You are suffering from the same problem that originally triggered this
> bug (see https://bugs.gentoo.org/show_bug.cgi?id=114587#c12)
>
> You could try upgrading to version 3.7.11 and see if this fixes
> the problem for you.
>

Yes, I noticed that some time this morning. I will try 3.7.11 shortly. It didn't work the last time I tried, a month ago, but things may have changed.

Thanks.
(In reply to comment #9)
> (In reply to comment #8)
> > Hi Francois,
> >
> > You are suffering from the same problem that originally triggered this
> > bug (see https://bugs.gentoo.org/show_bug.cgi?id=114587#c12)
> >
> > You could try upgrading to version 3.7.11 and see if this fixes
> > the problem for you.
> >
> Yes I have noticed that some time this morning. I will try 3.7.11
> shortly. I didn't work last time I tried 1 month ago but thing may
> have changed.
>

Just a quick note to say that 3.7.11 works for me. I had to upgrade to blas-lapack-3.7.11 afterward but everything went OK.
(In reply to comment #7) > I tried to compile 3.6.0-r1 with my gcc-3.4.5 profile and it now fails > at install time: Same problem here. Athlon XP (x86) + gcc 3.4.5.
(In reply to comment #11) > > Same problem here. Athlon XP (x86) + gcc 3.4.5. > Please try the latest version in ~arch (3.7.11) which has much better support for newer processors and report back. Thanks, Markus
I've got the same problem on an iBook G4. Apparently it needs better defaults, as comment #1 suggests.
Created attachment 89028: 3.7.11-add-ppc-processors.patch

Luca pointed out that during interactive configuration, an incorrect "unknown" processor was the default selection. Thus during the autoconfig, it would be selected rather than the G4/G5 setting. Turns out this is because config.c checks /proc/cpuinfo, but it's missing the CPU IDs for some of the newer PPC chips. Here's a patch to fix a couple I'm aware of.
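To illustrate the failure mode described in this comment, here is a minimal sketch of table-driven detection from the "cpu" line of /proc/cpuinfo. It is not the code from ATLAS's config.c and not the attached patch: the enum, the table contents and the function name `classify_cpu_line` are all hypothetical, and the model list is deliberately incomplete. It only shows why a model string that is missing from the table ends up as "unknown", and why adding an entry fixes it.

```c
#include <string.h>

/* Hypothetical classification; these names are not from ATLAS's config.c. */
enum ppc_family { PPC_UNKNOWN = 0, PPC_G4, PPC_G5 };

struct cpu_id {
    const char *model;        /* substring expected in the "cpu" line */
    enum ppc_family family;
};

/* Assumed, incomplete table. A chip whose model string is absent here is
 * reported as PPC_UNKNOWN, which is the situation the comment describes. */
static const struct cpu_id known_cpus[] = {
    { "7450",   PPC_G4 },
    { "7455",   PPC_G4 },
    { "7447",   PPC_G4 },
    { "PPC970", PPC_G5 },
};

/* Classify one line of /proc/cpuinfo, e.g. "cpu : 7450, altivec supported".
 * Lines that do not start with "cpu" or lack a ':' are treated as unknown. */
enum ppc_family classify_cpu_line(const char *line)
{
    const char *value;
    size_t i;

    if (strncmp(line, "cpu", 3) != 0)
        return PPC_UNKNOWN;
    value = strchr(line, ':');
    if (value == NULL)
        return PPC_UNKNOWN;
    for (i = 0; i < sizeof known_cpus / sizeof known_cpus[0]; i++) {
        if (strstr(value, known_cpus[i].model) != NULL)
            return known_cpus[i].family;
    }
    return PPC_UNKNOWN;
}
```

With this table, the "7450, altivec supported" string from the emerge info earlier in this bug would classify as G4, while any model not listed would fall back to unknown until an entry is added for it.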
Created attachment 89069: Yet another patch

Another generalized patch
Created attachment 89354: s/bits/Bits

That one works on .7.11 correctly
Had time to test it a bit; it seems to work. Committed.