Bug 445053 - sci-libs/fftw-3.3.3 fma USE flag ambiguous (fma3 or fma4)
Summary: sci-libs/fftw-3.3.3 fma USE flag ambiguous (fma3 or fma4)
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Library
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Science Related Packages
Keywords: InVCS
Reported: 2012-11-28 09:57 UTC by Duncan
Modified: 2013-02-19 05:13 UTC
0 users

Description Duncan 2012-11-28 09:57:13 UTC
sci-libs/fftw-3.3.3 has a new fma USE flag, but...

There are two different and incompatible fma instruction sets, fma3 and fma4.  See http://en.wikipedia.org/wiki/FMA_instruction_set .  Which one will this activate?  There's no clue.

I have a bdver1 (Bulldozer) with fma4, but neither the USE flag name, nor its description, nor the ebuild (which simply use-enables fma without saying which one) gives any hint of which variant gets enabled.  Is it fma3, which AMD's bdver2 (Piledriver, from Oct 2012) and "Trinity" APUs (from June 2012) and Intel's Haswell (2013) are supposed to support?  Or is it fma4, which AMD's bdver1 (Bulldozer, from 2011, and what I have, with fma4 listed in /proc/cpuinfo) and possibly later CPUs/APUs support, according to Wikipedia?

Or maybe both are supported and it detects which one to use at runtime?  That'd be the only way that an unqualified fma USE flag would really make sense.  But even then, to avoid ambiguity the USE flag description should specify that it supports both.

(Set to minor severity as the flag's off by default and can simply be left that way, as I'm doing ATM, if there's any doubt.  But this does need to be fixed before use of this flag spreads and people start reporting broken packages because they enabled the flag on CPUs supporting the other variant.)
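
For anyone unsure which variant their own CPU advertises, here is a minimal sketch that reads the flags line of /proc/cpuinfo. It assumes the usual Linux naming, where FMA3 shows up as the plain flag "fma" and FMA4 as "fma4"; the function name fma_variants is made up for illustration and is not part of any package.

```shell
#!/bin/sh
# Report which FMA variants a CPU advertises, based on a cpuinfo-style file.
# Assumption: Linux lists FMA3 as the flag "fma" and FMA4 as "fma4".
# fma_variants is a hypothetical helper name, not an existing tool.
fma_variants() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Nothing to report if the file is missing (e.g. not on Linux).
    [ -r "$cpuinfo" ] || return 0
    # Use the first "flags" line; all cores normally report the same set.
    flags=$(awk -F: '/^flags/ {print $2; exit}' "$cpuinfo")
    for f in $flags; do
        case "$f" in
            fma)  echo fma3 ;;
            fma4) echo fma4 ;;
        esac
    done
}

fma_variants
```

No output means the CPU advertises neither variant, in which case USE=fma should presumably stay off.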
Comment 1 Christoph Junghans (RETIRED) gentoo-dev 2012-11-28 16:43:16 UTC
(In reply to comment #0)
> sci-libs/fftw-3.3.3 has a new fma USE flag, but...
I had a look at the code and it seems to be a generic feature for all fma versions. However, I don't have a machine to test it on.

What do you suggest as the description for the fma USE flag?
Comment 2 Duncan 2012-11-29 16:08:48 UTC
(In reply to comment #1)
> I had a look at the code and it seems to be a generic feature for all fma
> version. However I don't have a machine to test it.

Thanks.

> What do you suggest as description for the fma use flag?

Current description for reference:

Use the Fused Multiply Add instruction set

Ultimately I'd suggest simply adding "(fma3/fma4 either one)", making the description:

Use the Fused Multiply Add instruction set (fma3/fma4 either one)

But either confirming with upstream or testing to be sure would be good, first.  (This assumes configure --help or whatever is similarly ambiguous, which I'd guess it to be given the choice of generic "fma" as the configure option.)  I'll build with FEATURES=test both with and without USE=fma here, thus hopefully confirming fma4 one way or the other.

Assuming that works, an interim plan might be to add "(fma4 tested, should work with fma3 as well)" to the description instead, then add an einfo (conditional on USE=fma so as not to bother people without it turned on) asking anyone with fma3 to run the tests both with and without the flag and report the results.  (Maybe link the Wikipedia entry too, for anyone who enables the flag but might be mixed up.)

Of course if my tests show it doesn't work for fma4, I'd suggest making the flag fma3 instead of generic fma, with a description to match.  Similarly, make it fma4 if other people's tests ultimately discover that it works for fma4 but not fma3.

Based on the Wikipedia entry, I'd guess fma3 will ultimately dominate (unless both end up supported).  If only one's supported, I'd guess it to be fma3, unless the implementor had only fma4 hardware and simply wasn't aware of fma3, which is possible given the earlier availability of fma4 hardware.  I guess we'll see...

Out to run some tests, now. =:^)
Comment 3 Duncan 2012-11-29 17:40:16 UTC
FWIW, ebuild ... test passed both ways (and I verified the fma as passed to configure was disabled/enabled appropriately), so things look good.
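
For anyone wanting to repeat the both-ways run, a rough sketch follows. It only prints the two invocations rather than running them, since a real run needs a Gentoo box with the ebuild available. The helper name fftw_test_cmds and the ebuild path are assumptions for illustration; adjust the path to wherever the tree lives.

```shell
#!/bin/sh
# Print (rather than run) the test invocation for each USE setting.
# fftw_test_cmds is a hypothetical helper, not part of Portage.
fftw_test_cmds() {
    ebuild_path="$1"
    for flag in fma -fma; do
        printf 'FEATURES=test USE="%s" ebuild %s clean test\n' "$flag" "$ebuild_path"
    done
}

# Assumed tree location; the actual path may differ on a given system.
fftw_test_cmds /usr/portage/sci-libs/fftw/fftw-3.3.3.ebuild
```

Checking the build log for the fma option passed to configure in each run confirms the flag really was toggled.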

... And the warning about the test taking 30 minutes... seems a bit dated.  More like one (it took about three to unpack, configure, build AND test, bulldozer 3-cluster/6-core @ 3.6GHz).

I su-ed to the portage user, sourced the environment, and am now running emake bigcheck (with USE=fma), to be safe.  That *IS* taking rather longer (I'm still in the first subdir, double, with single and long-double to go, so it'll be hours...).  If I get some failures I'll try the same with USE=-fma.  But the smallcheck passed just fine both ways, so fma4 at least is looking good so far. =:^)
Comment 4 Duncan 2012-11-30 17:36:09 UTC
OK, the "bigcheck" finished and passed for all three subdirs (single/double/long-double), with USE=fma, on my fma4 machine.  So it definitely seems to work with fma4.

Now if someone with fma3 could confirm that it works there...  AFAIK, that'd be a brand new amd "piledriver", or one of the "trinity" apus, available since June.  (AFAIK from the wikipedia article, Intel's fma-supporting hardware will all be fma3, but won't be available until next year.)

Maybe I'll ask on the amd64 list...
Comment 5 Christoph Junghans (RETIRED) gentoo-dev 2013-02-19 02:33:06 UTC
So what was the conclusion? What would be the better description?
Comment 6 Duncan 2013-02-19 04:54:58 UTC
(In reply to comment #5)
> So what was the conclusion? What would be the better description?

Thanks for the bump.  I had forgotten about this bug...

I did ask on the amd64 list, but got no replies.  Seems it's not so active these days.  I could resubscribe to user and ask there, but...

Given that fma4 is likely to be the rarer version and I know it passed there, I'd suggest the same wording as I suggested earlier.

Just add "(fma3/fma4 either one)", making the description:

Use the Fused Multiply Add instruction set (fma3/fma4 either one)

If that's incorrect and fma3 does NOT work, I guess we'll know soon enough, as someone will surely bug it.

Alternatively, CYA a bit better with something like this:

Use the Fused Multiply Add instruction set (fma3 untested, fma4 tested)

An elog, conditional on USE=fma, could then ask fma3 users to report their results on this bug.  After that, the USE description could be changed to either the "either one" wording above or to fma4 only, possibly with a USE flag name change, although of course there's a --newuse cost to that.
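
A sketch of what such a conditional elog might look like in the ebuild is below. This is not the actual fftw ebuild, and the message wording is only a suggestion; `use` and `elog` are the helpers Portage supplies when an ebuild runs, so the fragment is not a standalone script.

```shell
# Hypothetical ebuild fragment, not taken from the real fftw ebuild.
# `use` and `elog` are provided by Portage at ebuild run time.
pkg_postinst() {
    if use fma; then
        elog "USE=fma has been tested on FMA4 hardware only."
        elog "If your CPU supports FMA3, please run the test suite with"
        elog "and without USE=fma and report the results on bug #445053."
    fi
}
```

With the flag off, nothing is logged, so users who never enabled fma aren't bothered.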

Or if you think that's too much bother, just close the bug.  I know it works here now and am long since over my initial irritation at the lack of specificity, and judging by the lack of CCs, nobody else has been similarly irritated, so...

Thanks for the work you put in on gentoo and fftw.  I do appreciate the fact that without you devs, there'd not BE a gentoo to run OR to file bugs against!
Comment 7 Christoph Junghans (RETIRED) gentoo-dev 2013-02-19 05:13:33 UTC
(In reply to comment #6)
> Just add "(fma3/fma4 either one)", making the description:
> 
> Use the Fused Multiply Add instruction set (fma3/fma4 either one)
I went with this one. Somebody will report it if it's wrong, but I don't think it will cause much trouble.

Thanks for your contribution to Gentoo.