Bug#: 83036
Status: RESOLVED
Resolution: FIXED
Assigned To: Gentoo Science Related Packages <sci@gentoo.org>
Reporter: Adam Piątyszek <ediap@users.sourceforge.net>


Description:   Opened: 2005-02-22 23:55 0000
I would like to report a problem with the fftw3 library compiled on my laptop
(Pentium M 1.6 GHz, Centrino). After emerging it with the 'sse' USE flag, one of
my C++ applications (a communication system simulation) that is linked
statically with libfftw3.a segfaults.
After recompiling the fftw3 library with "USE=-sse" the problem no longer
occurs.

Here is my 'emerge info' output:

Portage 2.0.51-r15 (default-linux/x86/2004.3, gcc-3.3.5,
glibc-2.3.4.20040808-r1, 2.6.10-gentoo-r6 i686)
=================================================================
System uname: 2.6.10-gentoo-r6 i686 Intel(R) Pentium(R) M processor 1.60GHz
Gentoo Base System version 1.4.16
Python:              dev-lang/python-2.3.4-r1 [2.3.4 (#1, Feb 13 2005,
11:22:34)]
dev-lang/python:     2.3.4-r1
sys-devel/autoconf:  2.59-r6, 2.13
sys-devel/automake:  1.7.9-r1, 1.8.5-r3, 1.5, 1.4_p6, 1.6.3, 1.9.4
sys-devel/binutils:  2.15.92.0.2-r1
sys-devel/libtool:   1.5.10-r4
virtual/os-headers:  2.4.21-r1
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr
-frename-registers -fprefetch-loop-arrays -falign-functions=64"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config
/usr/lib/X11/xkb /usr/share/config /usr/share/texmf/dvipdfm/config/
/usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/
/usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr
-frename-registers -fprefetch-loop-arrays -falign-functions=64"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs autoconfig ccache distlocks sandbox sfperms"
GENTOO_MIRRORS="http://trumpetti.atm.tut.fi/gentoo/
http://ftp.uni-erlangen.de/pub/mirrors/gentoo/ http://src.gentoo.pl
http://gentoo.zie.pg.gda.pl"
LANG="pl_PL"
LC_ALL="pl_PL"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.de.gentoo.org/gentoo-portage"
USE="x86 X acpi acpi4linux alsa apache2 auctex bidi bitmap-fonts cddb
cdparanoia cups dhcp divx4linux dvd dvdread edl emacs extras f77 fbcon fftw
flac fortran freetype gd gimpprint gpm gtk gtk2 i8x0 jabber java jpeg jpeg2k
kadu-modules kadu-voice lcd leim live maildir mailwrapper md5sum mmx motif
mozilla moznocompose moznoirc moznomail mpeg mplayer network nls objc oggvorbis
opengl pdflib pic plotutils png ppds quicktime readline real rtc sasl sdl sis
spell sse ssl tetex tiff truetype truetype-fonts type1 type1-fonts unicode usb
wmf xfs xmms xosd xv xvid xvmc video_cards_i810 linguas_pl"
Unset:  ASFLAGS, CBUILD, CTARGET, LDFLAGS

/ediap

------- Comment #1 From Patrick Kursawe 2005-03-01 02:36:59 0000 -------
Can you reproduce this with less aggressive CFLAGS (for example, -O2)?

------- Comment #2 From Adam Piątyszek 2005-03-01 02:51:16 0000 -------
I have been using the -O2 optimisation for my system. See the 'emerge info'
output in my previous post ;-)

CFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr
-frename-registers -fprefetch-loop-arrays -falign-functions=64"

------- Comment #3 From Patrick Kursawe 2005-03-01 04:48:07 0000 -------
I noticed that. I meant "-O2 without all that -fwhatever" - I know the flags
you are using are considered mostly harmless, but I want to be sure.

------- Comment #4 From Adam Piątyszek 2005-03-01 12:25:40 0000 -------
Strange... I followed your suggestion and recompiled fftw with USE=sse and
CFLAGS="-march=pentium3 -O2 -pipe" only. As you might expect, I noticed no
problems using it in my simulator. So I added more and more of my previous
flags until I ended up with
CFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr
-frename-registers -fprefetch-loop-arrays -falign-functions=64".
This time, however, I have not noticed any segfaults caused by fftw. Maybe
it was a false alarm. Sorry for that.
Should any segfaults occur in the future, I will let you know by reopening this
bug report. For now, we can close it with the resolution "WORKSFORME".

------- Comment #5 From Patrick Kursawe 2005-03-02 00:41:51 0000 -------
OK

------- Comment #6 From Adam Piątyszek 2005-03-02 05:32:10 0000 -------
I have run into the segfault problem once again.
Now I have:
USE=sse
CFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr -frename-registers -fprefetch-loop-arrays -falign-functions=64"
and my program segfaults, as shown here:

Before FFT operation (N_FFT  = 2048)
After FFT operation

Before FFT operation (N_FFT  = 2048)
Segmentation fault

The "Before FFT..." and "After FFT..." strings are written to std::cerr just before and after the FFT operation is executed in my C++ simulator.

  std::cerr << "Before FFT operation (N_FFT  = " 
	    << fft_length * upsampl << ")" << endl;
  cvec H_vec = fft(h_vec, fft_length * upsampl);
  std::cerr << "After FFT operation" << endl << endl;

I get the same result with:
USE=sse
CFLAGS="-march=pentium3 -O2 -pipe".

But when I disable the sse flag:
USE=-sse
CFLAGS="-march=pentium3 -O2 -pipe"
the FFT operation does not segfault my program:

Before FFT operation (N_FFT  = 2048)
After FFT operation

Before FFT operation (N_FFT  = 2048)
After FFT operation

Finally, I tried fftw with:
USE=-sse
CFLAGS="-march=pentium3 -O2 -pipe -fomit-frame-pointer -fforce-addr -frename-registers -fprefetch-loop-arrays -falign-functions=64" 
and it also worked OK. 

So my proposal is to disable the 'sse' flag for fftw as a whole. As a reminder, my machine has a Pentium M (Centrino), but since gcc-3.3.5 has no dedicated optimisation flag for this processor, I use -march=pentium3.

------- Comment #7 From Adam Piątyszek 2005-03-02 05:37:25 0000 -------
One more comment. From the documentation of the FFTW:

--enable-sse, --enable-sse2, --enable-k7, --enable-altivec: Enable the compilation of SIMD code for SSE (Pentium III+), SSE2 (Pentium IV+), 3dNow! (AMD K7 and others), or AltiVec (PowerPC G4+). SSE, 3dNow!, and AltiVec only work with --enable-float (above), while SSE2 only works in double precision (the default). The resulting code will still work on earlier CPUs lacking the SIMD extensions (SIMD is automatically disabled, although the FFTW library is still larger).

Which library is single precision and which is double precision in Gentoo? Maybe this is why it segfaults... I will check it later with my program and report back.

------- Comment #8 From Adam Piątyszek 2005-03-02 23:45:37 0000 -------
I think the problem is in the ebuild:

    if use sse; then
        myconfsingle="$myconfsingle --enable-sse"
        myconfdouble="$myconfdouble --enable-sse2"
    elif [...]

SSE can be used on P3+ processors, but SSE2 is only for P4+ processors. So I suggest adding a separate 'sse2' USE flag and rewriting this part of the ebuild as:

    if use sse; then
        myconfsingle="$myconfsingle --enable-sse"
    elif use sse2; then
        myconfsingle="$myconfsingle --enable-sse"
        myconfdouble="$myconfdouble --enable-sse2"
    elif [...]
 
What do you think?

------- Comment #9 From Patrick Kursawe 2005-03-03 00:30:22 0000 -------
Sounds reasonable... there are already two other packages which have this as a
local USE flag. But I am afraid the way you proposed it, sse2 will not be
evaluated if sse is in USE?
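
To illustrate the concern, here is a minimal sketch of the ordering proposed in comment #8. It assumes a stand-in `use` function that simply looks for a flag in a space-separated USE variable (in a real ebuild Portage provides `use`); the stub and the function name `configure_proposed` are hypothetical and are not taken from the fftw ebuild.

```shell
# Stand-in for Portage's `use` helper (assumption, for illustration only):
# succeeds when the named flag appears in the space-separated USE variable.
use() {
    case " ${USE} " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Ordering as proposed in comment #8: sse is tested before sse2.
configure_proposed() {
    myconfsingle=""
    myconfdouble=""
    if use sse; then
        myconfsingle="${myconfsingle} --enable-sse"
    elif use sse2; then
        # Never reached when sse is also set, because the first branch wins.
        myconfsingle="${myconfsingle} --enable-sse"
        myconfdouble="${myconfdouble} --enable-sse2"
    fi
}

USE="sse sse2"
configure_proposed
echo "single:${myconfsingle}"
echo "double:${myconfdouble}"
```

With both flags in USE, only the first branch is taken, so `myconfdouble` stays empty and the double-precision library would be built without `--enable-sse2`.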

------- Comment #10 From Marcus D. Hanwell 2005-04-21 14:51:57 0000 -------
This should be fixed now - just flip the conditionals around and it evaluates
everything fine :) Please test out sci-libs/fftw-3.0.1-r1 and let me know if
that solves your problems. Only -O2 is allowed currently - I would like to test
with less filtering when I get the chance as it mentions GCC 3.2 as the reason
for filtering when the sse flag is set.
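
A minimal sketch of what "flipping the conditionals" could look like. This is not the actual text of the fftw-3.0.1-r1 ebuild; the `use` stub and the function name `configure_flipped` are hypothetical stand-ins so the snippet runs outside Portage.

```shell
# Stand-in for Portage's `use` helper (assumption, for illustration only):
# succeeds when the named flag appears in the space-separated USE variable.
use() {
    case " ${USE} " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Flipped ordering: the more specific sse2 flag is tested first,
# so it still takes effect when sse is set as well.
configure_flipped() {
    myconfsingle=""
    myconfdouble=""
    if use sse2; then
        myconfsingle="${myconfsingle} --enable-sse"
        myconfdouble="${myconfdouble} --enable-sse2"
    elif use sse; then
        # SSE only: single precision gets SIMD, double precision does not.
        myconfsingle="${myconfsingle} --enable-sse"
    fi
}

for USE in "sse" "sse sse2"; do
    configure_flipped
    echo "USE='${USE}' single:${myconfsingle} double:${myconfdouble}"
done
```

With this ordering, USE="sse" alone enables SIMD only for the single-precision build, while USE="sse sse2" enables it for both.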

------- Comment #11 From Adam Piątyszek 2005-04-21 23:41:43 0000 -------
Thanks for the ebuild. It seems that now it is OK. After having compiled using
"USE=sse emerge fftw" on my Pentium M platform, there are no segfaults when I
link this library to my C++ simulation program.

One question, by the way. I have set a global 'sse' flag in the make.conf:
#v+
USE="-* X acpi acpi4linux alsa apache2 auctex avi bash-completion berkdb bidi
     bitmap-fonts blas cddb cdparanoia cscope cups dhcp divx4linux dvd
     dvdread edl emacs extras f77 fbcon fftw flac fortran freetype gcj
     gd gif gimpprint gpm gtk gtk2 i8x0 jabber java jpeg jpeg2k
     kadu-modules kadu-voice lcd leim live mad maildir mailwrapper
     md5sum mmx motif mozilla moznocompose moznoirc moznomail mp3 mpeg
     mplayer network nls objc oggvorbis opengl pdflib perl pic plotutils
     png ppds quicktime readline real rtc sasl sdl sis spell sse ssl
     tetex tiff truetype truetype-fonts type1 type1-fonts unicode usb
     userlocales wmf xfs xosd xv xvid xvmc zlib"
#v-

But after performing the following command:

#v+
ediap@lespaul etc $ emerge -pv fftw

These are the packages that I would merge, in order:

Calculating dependencies ...done!
[ebuild   R   ] sci-libs/fftw-3.0.1-r1  -3dnow (-altivec) -debug -mpi -sse*
-sse2 0 kB 

Total size of downloads: 0 kB
#v-

The 'sse' flag does not seem to be set. Do you happen to know why that is?

/ediap
