Bug 83036 - fftw3 library causes segfaults when compiled with USE=sse (Pentium M)
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Library (show other bugs)
Hardware: x86 All
Importance: High major
Assignee: Gentoo Science Related Packages
URL:
Whiteboard:
Keywords: InVCS
Depends on:
Blocks:
 
Reported: 2005-02-22 23:55 UTC by Adam Piątyszek
Modified: 2005-04-21 23:41 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Description Adam Piątyszek 2005-02-22 23:55:13 UTC
I would like to report a problem with the fftw3 library compiled on my laptop (Pentium M 1.6 GHz, Centrino). After emerging it with the 'sse' USE flag, one of my C++ applications (a communication system simulation) that is linked statically with libfftw3.a segfaults.
After recompiling the fftw3 library with "USE=-sse", the problem no longer occurs.

Here is my 'emerge info' output:

Portage 2.0.51-r15 (default-linux/x86/2004.3, gcc-3.3.5, glibc-2.3.4.20040808-r1, 2.6.10-gentoo-r6 i686)
=================================================================
System uname: 2.6.10-gentoo-r6 i686 Intel(R) Pentium(R) M processor 1.60GHz
Gentoo Base System version 1.4.16
Python:              dev-lang/python-2.3.4-r1 [2.3.4 (#1, Feb 13 2005, 11:22:34)]
dev-lang/python:     2.3.4-r1
sys-devel/autoconf:  2.59-r6, 2.13
sys-devel/automake:  1.7.9-r1, 1.8.5-r3, 1.5, 1.4_p6, 1.6.3, 1.9.4
sys-devel/binutils:  2.15.92.0.2-r1
sys-devel/libtool:   1.5.10-r4
virtual/os-headers:  2.4.21-r1
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr -frename-registers -fprefetch-loop-arrays -falign-functions=64"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr -frename-registers -fprefetch-loop-arrays -falign-functions=64"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs autoconfig ccache distlocks sandbox sfperms"
GENTOO_MIRRORS="http://trumpetti.atm.tut.fi/gentoo/ http://ftp.uni-erlangen.de/pub/mirrors/gentoo/ http://src.gentoo.pl http://gentoo.zie.pg.gda.pl"
LANG="pl_PL"
LC_ALL="pl_PL"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.de.gentoo.org/gentoo-portage"
USE="x86 X acpi acpi4linux alsa apache2 auctex bidi bitmap-fonts cddb cdparanoia cups dhcp divx4linux dvd dvdread edl emacs extras f77 fbcon fftw flac fortran freetype gd gimpprint gpm gtk gtk2 i8x0 jabber java jpeg jpeg2k kadu-modules kadu-voice lcd leim live maildir mailwrapper md5sum mmx motif mozilla moznocompose moznoirc moznomail mpeg mplayer network nls objc oggvorbis opengl pdflib pic plotutils png ppds quicktime readline real rtc sasl sdl sis spell sse ssl tetex tiff truetype truetype-fonts type1 type1-fonts unicode usb wmf xfs xmms xosd xv xvid xvmc video_cards_i810 linguas_pl"
Unset:  ASFLAGS, CBUILD, CTARGET, LDFLAGS

/ediap
Comment 1 Patrick Kursawe (RETIRED) gentoo-dev 2005-03-01 02:36:59 UTC
Can you reproduce this with less aggressive CFLAGS (for example, -O2)?
Comment 2 Adam Piątyszek 2005-03-01 02:51:16 UTC
I have been using the -O2 optimisation for my system. See the 'emerge info' output in my previous post ;-)

CFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr -frename-registers -fprefetch-loop-arrays -falign-functions=64"
Comment 3 Patrick Kursawe (RETIRED) gentoo-dev 2005-03-01 04:48:07 UTC
I noticed that. I meant "-O2 without all that -fwhatever" - I know the flags you are using are considered mostly harmless, but I want to be sure.
Comment 4 Adam Piątyszek 2005-03-01 12:25:40 UTC
Strange... I followed your suggestion and recompiled fftw with USE=sse and CFLAGS="-march=pentium3 -O2 -pipe" only. As you might expect, I noticed no problems using it in my simulator. So I added more and more of my previous flags until I ended up with
CFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr -frename-registers -fprefetch-loop-arrays -falign-functions=64".
Unfortunately this time I haven't noticed any segfaults caused by fftw. Maybe it was a false alarm. Sorry for that.
Should any segfaults occur in the future, I will let you know by reopening this bug report. For now, we can close this with the resolution "WORKSFORME".
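The flag-by-flag rebuild described in this comment can be sketched as a small shell helper. This is only an illustration: `print_flag_steps` is a hypothetical name, and it merely prints the rebuild commands (one per cumulative flag set) rather than running emerge.

```shell
#!/bin/bash
# Hypothetical helper: prints one rebuild command per cumulative CFLAGS set.
# It only echoes the commands; nothing is actually emerged.
print_flag_steps() {
    local cflags="$1"; shift
    local extra
    # Baseline build with the minimal flag set.
    echo "USE=sse CFLAGS=\"${cflags}\" emerge --oneshot fftw"
    # Add the remaining flags one at a time.
    for extra in "$@"; do
        cflags="${cflags} ${extra}"
        echo "USE=sse CFLAGS=\"${cflags}\" emerge --oneshot fftw"
    done
}

print_flag_steps "-march=pentium3 -O2 -pipe" \
    -fomit-frame-pointer -fforce-addr -frename-registers \
    -fprefetch-loop-arrays -falign-functions=64
```

Running each printed command in turn and re-testing the simulator after every rebuild would show at which flag, if any, the segfault appears.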
Comment 5 Patrick Kursawe (RETIRED) gentoo-dev 2005-03-02 00:41:51 UTC
OK
Comment 6 Adam Piątyszek 2005-03-02 05:32:10 UTC
I have encountered the segfault problem once again.
Now I have:
USE=sse
CFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr -frename-registers -fprefetch-loop-arrays -falign-functions=64"
and my program segfaults, which is shown here:

Before FFT operation (N_FFT  = 2048)
After FFT operation

Before FFT operation (N_FFT  = 2048)
Segmentation fault

The "Before FFT..." and "After FFT..." strings are written to std::cerr just before and after executing the FFT operation in my C++ simulator.

  std::cerr << "Before FFT operation (N_FFT  = " 
	    << fft_length * upsampl << ")" << endl;
  cvec H_vec = fft(h_vec, fft_length * upsampl);
  std::cerr << "After FFT operation" << endl << endl;

I get the same result with:
USE=sse
CFLAGS="-march=pentium3 -O2 -pipe".

But when I disable the sse flag:
USE=-sse
CFLAGS="-march=pentium3 -O2 -pipe"
the FFT operation does not segfault my program:

Before FFT operation (N_FFT  = 2048)
After FFT operation

Before FFT operation (N_FFT  = 2048)
After FFT operation

Finally, I have tried the fftw with: 
USE=-sse
CFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr -frename-registers -fprefetch-loop-arrays -falign-functions=64" 
and it also worked OK. 

So my proposal is to disable the 'sse' flag for fftw entirely. As a reminder, my machine is a Pentium M (Centrino), but since gcc-3.3.5 has no dedicated optimisation flag for that processor, I use -march=pentium3.
Comment 7 Adam Piątyszek 2005-03-02 05:37:25 UTC
One more comment. From the documentation of the FFTW:

--enable-sse, --enable-sse2, --enable-k7, --enable-altivec: Enable the compilation of SIMD code for SSE (Pentium III+), SSE2 (Pentium IV+), 3dNow! (AMD K7 and others), or AltiVec (PowerPC G4+). SSE, 3dNow!, and AltiVec only work with --enable-float (above), while SSE2 only works in double precision (the default). The resulting code will still work on earlier CPUs lacking the SIMD extensions (SIMD is automatically disabled, although the FFTW library is still larger).

Which library is single precision and which is double precision in Gentoo? Maybe that is the reason it segfaults... I will check it later with my program and comment on it.
Comment 8 Adam Piątyszek 2005-03-02 23:45:37 UTC
I think the problem is in the ebuild:

    if use sse; then
        myconfsingle="$myconfsingle --enable-sse"
        myconfdouble="$myconfdouble --enable-sse2"
    elif [...]

SSE can be used on P3+ processors, but SSE2 is only for P4+ processors. So I suggest adding a separate 'sse2' USE flag and rewriting this part of the ebuild as:

    if use sse; then
        myconfsingle="$myconfsingle --enable-sse"
    elif use sse2; then
        myconfsingle="$myconfsingle --enable-sse"
        myconfdouble="$myconfdouble --enable-sse2"
    elif [...]
 
What do you think?
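Whether a given machine reports SSE or SSE2 can be checked from the "flags" line of /proc/cpuinfo on x86 Linux. A minimal sketch follows; `cpu_has_flag` is a hypothetical helper name, and the optional second argument (an alternative file to read) exists only to make the function easy to try out.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the named flag appears as a whole word
# in the first "flags" line of a cpuinfo-style file.
cpu_has_flag() {
    local flag="$1" file="${2:-/proc/cpuinfo}"
    # -w matches whole words only, so "sse" does not match "sse2".
    grep -m1 '^flags' "$file" 2>/dev/null | grep -qw -- "$flag"
}

for f in sse sse2; do
    if cpu_has_flag "$f"; then
        echo "$f: reported by the CPU"
    else
        echo "$f: not reported"
    fi
done
```

This only shows what the kernel reports for the CPU; it says nothing about which instructions the compiler emits for a given -march setting.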
Comment 9 Patrick Kursawe (RETIRED) gentoo-dev 2005-03-03 00:30:22 UTC
Sounds reasonable... there are already two other packages which have this as a local USE flag. But I am afraid that, the way you proposed it, sse2 will not be evaluated if sse is in USE.
Comment 10 Marcus D. Hanwell (RETIRED) gentoo-dev 2005-04-21 14:51:57 UTC
This should be fixed now - just flip the conditionals around and it evaluates everything fine :) Please test out sci-libs/fftw-3.0.1-r1 and let me know if that solves your problems. Only -O2 is allowed currently - I would like to test with less filtering when I get the chance, as the ebuild mentions GCC 3.2 as the reason for filtering when the sse flag is set.
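One possible reading of "flip the conditionals around" is to test sse2 before sse, so that sse2 is still honoured when both flags are in USE. The sketch below is an assumption and not the committed ebuild: `fftw_simd_conf` is a hypothetical helper, and the `use` function is a stand-in for the helper Portage provides inside a real ebuild, defined here only so the snippet runs on its own.

```shell
#!/bin/bash
# Stand-in for Portage's `use` helper (assumption: a real ebuild gets this
# from Portage). It checks whether a flag is a whole word in $USE.
use() {
    case " ${USE} " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: prints the SIMD configure options for the
# single- and double-precision builds, given a USE string.
fftw_simd_conf() {
    local USE="$1"
    local myconfsingle="" myconfdouble=""
    # Check the more specific flag first, so USE="sse sse2" enables SSE2.
    if use sse2; then
        myconfsingle="--enable-sse"
        myconfdouble="--enable-sse2"
    elif use sse; then
        myconfsingle="--enable-sse"
    fi
    echo "single:${myconfsingle} double:${myconfdouble}"
}

fftw_simd_conf "sse"
fftw_simd_conf "sse sse2"
```

With this ordering, USE="sse" alone leaves the double-precision build without --enable-sse2, which matches the intent of the proposal in Comment 8.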
Comment 11 Adam Piątyszek 2005-04-21 23:41:43 UTC
Thanks for the ebuild. It seems that now it is OK. After having compiled using "USE=sse emerge fftw" on my Pentium M platform, there are no segfaults when I link this library to my C++ simulation program.

One question, by the way. I have set a global 'sse' flag in the make.conf:
#v+
USE="-* X acpi acpi4linux alsa apache2 auctex avi bash-completion berkdb bidi
     bitmap-fonts blas cddb cdparanoia cscope cups dhcp divx4linux dvd
     dvdread edl emacs extras f77 fbcon fftw flac fortran freetype gcj
     gd gif gimpprint gpm gtk gtk2 i8x0 jabber java jpeg jpeg2k
     kadu-modules kadu-voice lcd leim live mad maildir mailwrapper
     md5sum mmx motif mozilla moznocompose moznoirc moznomail mp3 mpeg
     mplayer network nls objc oggvorbis opengl pdflib perl pic plotutils
     png ppds quicktime readline real rtc sasl sdl sis spell sse ssl
     tetex tiff truetype truetype-fonts type1 type1-fonts unicode usb
     userlocales wmf xfs xosd xv xvid xvmc zlib"
#v-

But after performing the following command:

#v+
ediap@lespaul etc $ emerge -pv fftw

These are the packages that I would merge, in order:

Calculating dependencies ...done!
[ebuild   R   ] sci-libs/fftw-3.0.1-r1  -3dnow (-altivec) -debug -mpi -sse* -sse2 0 kB 

Total size of downloads: 0 kB
#v-

The 'sse' flag does not seem to be set. Do you happen to know why that is?

/ediap