Bug 654598 - dev-libs/crypto++-7.0.0 - In file included from gcm-simd.cpp:39:0: /usr/lib/gcc/x86_64-pc-linux-gnu/7.3.0/include/wmmintrin.h:116:1: error: inlining failed in call to always_inline ‘__m128i _mm_clmulepi64_si128(__m128i, __m128i, int)’: target specific opt
Summary: dev-libs/crypto++-7.0.0 - In file included from gcm-simd.cpp:39:0: /usr/lib/g...
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages (show other bugs)
Hardware: All Linux
Importance: Normal normal
Assignee: Crypto team [DISABLED]
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-02 11:48 UTC by email200202
Modified: 2018-05-05 10:20 UTC (History)
3 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info '=dev-libs/crypto++-7.0.0::gentoo' (file_654598.txt,6.40 KB, text/plain)
2018-05-02 11:48 UTC, email200202
Details
emerge -pqv '=dev-libs/crypto++-7.0.0::gentoo' (file_654598.txt,428 bytes, text/plain)
2018-05-02 11:51 UTC, email200202
Details
build.log (file_654598.txt,3.53 KB, text/plain)
2018-05-02 11:53 UTC, email200202
Details
build.log without -s (file_654598.txt,9.85 KB, text/plain)
2018-05-03 08:24 UTC, email200202
Details

Description email200202 2018-05-02 11:48:37 UTC
Created attachment 529336 [details]
emerge --info '=dev-libs/crypto++-7.0.0::gentoo'

dev-libs/crypto++-7.0.0 failed to emerge with multiple messages like:

In file included from gcm-simd.cpp:39:0:
/usr/lib/gcc/x86_64-pc-linux-gnu/7.3.0/include/wmmintrin.h:116:1: error: inlining failed in call to always_inline ‘__m128i _mm_clmulepi64_si128(__m128i, __m128i, int)’: target specific option mismatch
 _mm_clmulepi64_si128 (__m128i __X, __m128i __Y, const int __I)
 ^~~~~~~~~~~~~~~~~~~~
Comment 1 email200202 2018-05-02 11:51:38 UTC
Created attachment 529338 [details]
emerge -pqv '=dev-libs/crypto++-7.0.0::gentoo'
Comment 2 email200202 2018-05-02 11:53:06 UTC
Created attachment 529340 [details]
build.log
Comment 3 Alon Bar-Lev (RETIRED) gentoo-dev 2018-05-02 21:09:27 UTC
Hi,
Please remove -s from MAKE_OPTS when filing bugs.
Please reattach build.log.
Thanks
Comment 4 Jeffrey Walton 2018-05-03 01:47:26 UTC
This sounds like architectural flags/options are missing from the *-simd.cpp files. For example, sse-simd.cpp gets compiled with the user's options plus -msse2. As another example, rijndael-simd.cpp gets compiled with the user's options plus -msse4.1 -maes. And the file in question, gcm-simd.cpp, gets compiled with the user's options plus -mssse3 -mpclmul.

There is a wiki page covering the topic at https://www.cryptopp.com/wiki/BASE+SIMD. It provides a list of the *-simd.cpp files and specifies the architectural flags needed for the files.

If someone can provide instructions for working with Gentoo's build system then I can help fill in some of the missing pieces. Preferably the instructions would be prescriptive, like (1) "git clone ..." and (2) "edit the file like so". I can produce a diff or pull request.

You can also email me offline at noloader, gmail account. I can also provide my cell and home numbers for a call, if needed.
Comment 5 email200202 2018-05-03 08:24:35 UTC
Created attachment 529520 [details]
build.log without -s
Comment 6 Alon Bar-Lev (RETIRED) gentoo-dev 2018-05-03 09:37:28 UTC
Fails when:
CXXFLAGS="-march=nehalem -DCRYPTOPP_DISABLE_AESNI"

Succeeds when:
USE="cpu_flags_x86_aes"

@noloader, you can reproduce this in your environment:

CXXFLAGS="-march=nehalem -DCRYPTOPP_DISABLE_AESNI" make -f GNUmakefile clean all

How do you suggest to solve it?
Comment 7 Jeffrey Walton 2018-05-03 10:50:52 UTC
(In reply to Alon Bar-Lev from comment #6)
> Fails when:
> CXXFLAGS="-march=nehalem -DCRYPTOPP_DISABLE_AESNI"
> 
> Succeeds when:
> USE="cpu_flags_x86_aes"
> 
> @noloader, you can reproduce this in your environment:
> 
> CXXFLAGS="-march=nehalem -DCRYPTOPP_DISABLE_AESNI" make -f GNUmakefile clean
> all

Yes:

$ CXXFLAGS="-march=nehalem -DCRYPTOPP_DISABLE_AESNI" make -f GNUmakefile
g++ -march=nehalem -DCRYPTOPP_DISABLE_AESNI -fPIC -pthread -pipe -c gcm-simd.cpp
gcm-simd.cpp: In function ‘__m128i CryptoPP::GCM_Reduce_CLMUL(__m128i, __m128i, __m128i, const __m128i&)’:
gcm-simd.cpp:528:23: error: ‘__builtin_ia32_pclmulqdq128’ needs isa option -m32 -mpclmul
     c1 = _mm_xor_si128(c1, _mm_clmulepi64_si128(c0, r, 0x10));
          ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

> How do you suggest to solve it?

OK, so the back story here is that both AESNI and PCLMULQDQ used to be tied to CRYPTOPP_AESNI_AVAILABLE. We caught a bug report for an odd OEM CPU that had SSE2, SSE3, SSSE3 and AESNI but nothing else. So we added separate flags, CRYPTOPP_AESNI_AVAILABLE and CRYPTOPP_CLMUL_AVAILABLE, but kept them coupled.

However, this is the problem in config.h:

// Couple to CRYPTOPP_DISABLE_AES, but use CRYPTOPP_CLMUL_AVAILABLE so we can selectively
//  disable for misbehaving platofrms and compilers, like Solaris or some Clang.
#if defined(CRYPTOPP_DISABLE_AES)
	#define CRYPTOPP_DISABLE_CLMUL 1
#endif

It seems we called them two different things.

I'm thinking CRYPTOPP_DISABLE_AESNI is the better choice. Do you mind if we change CRYPTOPP_DISABLE_AES -> CRYPTOPP_DISABLE_AESNI?
Comment 8 Alon Bar-Lev (RETIRED) gentoo-dev 2018-05-03 11:45:46 UTC
(In reply to Jeffrey Walton from comment #7)
>
> I'm thinking CRYPTOPP_DISABLE_AESNI is the better choice. Do you mind if we
> change CRYPTOPP_DISABLE_AES -> CRYPTOPP_DISABLE_AESNI?

No problem.
Comment 9 Patrick Fourniols 2018-05-03 22:06:10 UTC
Hello, I have no clue what happened:
I went into the directory /var/tmp/portage/dev-libs/crypto++-7.0.0-r1/work.
I looked at GNUmakefile and closed it without changes.
I ran make just to see what would happen.
It compiled OK ;)
I left the work directory.
I ran ebuild $(equery which crypto++) compile install qmerge.
Everything worked fine.
I did the same on another computer where emerging crypto++ had failed with the error reported here, and got the same result.
So if anybody can explain why, I'm listening ;)
Comment 10 Jeffrey Walton 2018-05-04 01:31:23 UTC
(In reply to Alon Bar-Lev from comment #8)
> (In reply to Jeffrey Walton from comment #7)
> >
> > I'm thinking CRYPTOPP_DISABLE_AESNI is the better choice. Do you mind if we
> > change CRYPTOPP_DISABLE_AES -> CRYPTOPP_DISABLE_AESNI?
> 
> No problem.

Changed at https://github.com/weidai11/cryptopp/commit/5422f0c13a57 .
Comment 11 email200202 2018-05-04 02:05:41 UTC
dev-libs/crypto++-7.0.0-r1  did not fix the problem.
Comment 12 email200202 2018-05-04 02:24:57 UTC
I confirm Patrick Fourniols results.

- If I run emerge command for dev-libs/crypto++-7.0.0-r1, it fails to compile
- But if I run make command from the work directory, it compiles ok.
Comment 13 Alon Bar-Lev (RETIRED) gentoo-dev 2018-05-04 04:45:55 UTC
(In reply to Jeffrey Walton from comment #10)
> (In reply to Alon Bar-Lev from comment #8)
> > (In reply to Jeffrey Walton from comment #7)
> > >
> > > I'm thinking CRYPTOPP_DISABLE_AESNI is the better choice. Do you mind if we
> > > change CRYPTOPP_DISABLE_AES -> CRYPTOPP_DISABLE_AESNI?
> > 
> > No problem.
> 
> Changed at https://github.com/weidai11/cryptopp/commit/5422f0c13a57 .

We now have the problem in a different file:

x86_64-pc-linux-gnu-g++ -O2 -pipe -march=nehalem -mno-popcnt -DCRYPTOPP_DISABLE_AESNI -fPIC -pthread -pipe -c sha.cpp

In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/immintrin.h:71:0,
                 from sha-simd.cpp:16:
/usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/shaintrin.h: In function ‘void CryptoPP::SHA1_HashMultipleBlocks_SHANI(CryptoPP::word32*, const word32*, size_t, CryptoPP::ByteOrder)’:
/usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/shaintrin.h:53:1: error: inlining failed in call to always_inline ‘__m128i _mm_sha1nexte_epu32(__m128i, __m128i)’: target specific option mismatch
 _mm_sha1nexte_epu32 (__m128i __A, __m128i __B)
 ^~~~~~~~~~~~~~~~~~~
sha-simd.cpp:384:46: note: called from here
         E0 = _mm_sha1nexte_epu32(E0, E0_SAVE);

...

Also with -march=native; this is new:

x86_64-pc-linux-gnu-g++ -O2 -march=native -fomit-frame-pointer -pipe -DCRYPTOPP_DISABLE_AESNI -fPIC -pthread -pipe -c sha.cpp
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/immintrin.h:71:0,
                 from sha-simd.cpp:16:
/usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/shaintrin.h: In function ‘void CryptoPP::SHA1_HashMultipleBlocks_SHANI(CryptoPP::word32*, const word32*, size_t, CryptoPP::ByteOrder)’:
/usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/shaintrin.h:53:1: error: inlining failed in call to always_inline ‘__m128i _mm_sha1nexte_epu32(__m128i, __m128i)’: target specific option mismatch
 _mm_sha1nexte_epu32 (__m128i __A, __m128i __B)
 ^~~~~~~~~~~~~~~~~~~
sha-simd.cpp:384:46: note: called from here
         E0 = _mm_sha1nexte_epu32(E0, E0_SAVE);
Comment 14 Jeffrey Walton 2018-05-04 07:03:13 UTC
(In reply to Alon Bar-Lev from comment #13)
> ...


Intel SHA is available on Goldmont and Goldmont+. -march=native would clear the SHA errors if you had, say, a Celeron J3455 (https://www.amazon.com/dp/B01LYCDG4H) because the Celeron is Goldmont.
Comment 15 Jeffrey Walton 2018-05-04 07:04:44 UTC
(In reply to email200202 from comment #12)
> I confirm Patrick Fourniols results.
> 
> - If I run emerge command for dev-libs/crypto++-7.0.0-r1, it fails to compile
> - But if I run make command from the work directory, it compiles ok.

It sounds like Emerge is _not_ adding the architecture-specific flags required to build the *-simd.cpp files. Also see https://www.cryptopp.com/wiki/BASE+SIMD.
Comment 16 Jeffrey Walton 2018-05-04 07:05:15 UTC
(In reply to Patrick Fourniols from comment #9)
> hello, i have no clue of what happened:
> i go in directory /var/tmp/portage/dev-libs/crypto++-7.0.0-r1/work
> i look in GNUMakefile and close it with no change.
> i do make just for see what happen.
> compile ok ;)
> i leave directory work
> i  do ebuild $(equery which crypto++) compile install qmerge
> all worked fine
> i do the same on another computer where emerge crypto failed with error like
> noticed here and same result
> so if anybody can explain me why i'll listen ;)

It sounds like Emerge is _not_ adding the architecture-specific flags required to build the *-simd.cpp files. Also see https://www.cryptopp.com/wiki/BASE+SIMD.
Comment 17 Jeffrey Walton 2018-05-04 07:45:56 UTC
Hi Everyone. Here is more on the BASE+SIMD compilation (https://www.cryptopp.com/wiki/BASE+SIMD). It is the cause of the problems in this report.

Crypto++ switched to BASE+SIMD at Crypto++ 6.0 to better support distros. If you recall, Crypto++ and the Makefile used to compile with -march=native. This was bad for distros for several reasons.

In BASE+SIMD the straight C++ implementation of an algorithm uses CXXFLAGS. The CXXFLAGS are the ones a user supplies via Emerge. Some source files need architecture-specific options in addition to the user's CXXFLAGS. The source files that need them are the *-simd.cpp files.

This bug report lists two or three of the problems for i686/x86_64. The workarounds are rather easy once you know what the problem is. gcm-simd.cpp needs `$CXXFLAGS -mssse3 -mpclmul`. rijndael-simd.cpp needs `$CXXFLAGS -msse4.1 -maes`. And sha-simd.cpp needs `$CXXFLAGS -msse4.2 -msha`.

-----

The Makefile adds the architecture flags for the *-simd.cpp files for you. Ports to other build systems like Emerge and Bazel need to add them. This is a one-time setup task.

In the Makefile here are the recipes for the object files that need the extra flags. Our task is to get this logic added to Emerge along with the ISA options. Also see https://github.com/weidai11/cryptopp/blob/master/GNUmakefile#L1040 :

# SSSE3 or NEON available
aria-simd.o : aria-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(ARIA_FLAG) -c) $<

# SSE4.1 or ARMv8a available
blake2-simd.o : blake2-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(BLAKE2_FLAG) -c) $<

# SSE2 on i586
sse-simd.o : sse-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(SSE_FLAG) -c) $<

# SSE4.2 or ARMv8a available
crc-simd.o : crc-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(CRC_FLAG) -c) $<

# PCLMUL or ARMv7a/ARMv8a available
gcm-simd.o : gcm-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(GCM_FLAG) -c) $<

# NEON available
neon-simd.o : neon-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(NEON_FLAG) -c) $<

# AltiVec, Power7, Power8 available
ppc-simd.o : ppc-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(ALTIVEC_FLAG) -c) $<

# AESNI or ARMv7a/ARMv8a available
rijndael-simd.o : rijndael-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(AES_FLAG) -c) $<

# SSE4.2/SHA-NI or ARMv8a available
sha-simd.o : sha-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(SHA_FLAG) -c) $<

# SSE4.2/SHA-NI or ARMv8a available
shacal2-simd.o : shacal2-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(SHA_FLAG) -c) $<

# SSSE3 or NEON available
simon-simd.o : simon-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(SIMON_FLAG) -c) $<

# SSSE3 or NEON available
speck-simd.o : speck-simd.cpp
	$(CXX) $(strip $(CXXFLAGS) $(SPECK_FLAG) -c) $<

The values for ARIA_FLAG, BLAKE2_FLAG, ..., GCM_FLAG, AES_FLAG, ... are given at https://www.cryptopp.com/wiki/BASE+SIMD#Arch_Options .
Comment 18 Jeffrey Walton 2018-05-04 07:58:56 UTC
(In reply to Jeffrey Walton from comment #17)
> ...

I believe this is the documentation on Emerge: https://wiki.gentoo.org/wiki/Basic_guide_to_write_Gentoo_Ebuilds . It lacks a treatment of "build this source file with these flags".

If someone can provide cryptopp.ebuild with things stubbed out then I am happy to fill in the pieces. I don't have experience with Emerge so I don't know how to do it.
Comment 19 Alon Bar-Lev (RETIRED) gentoo-dev 2018-05-04 10:24:13 UTC
This has nothing to do with Gentoo. I clearly showed in comment #13 that it has nothing to do with -march=native. What I showed you is that not only does it fail for the reproduction case in comment #6, it is also a regression for -march=native.

I also do not agree with your statements about -march=native, but let's ignore that for now and focus.

Same reproduction as in comment #6, now with your patch; maybe another patch is missing.

curl -L https://github.com/weidai11/cryptopp/commit/5422f0c13a57.patch | patch -p1
CXXFLAGS="-O2 -pipe -march=nehalem -DCRYPTOPP_DISABLE_AESNI" make -f GNUmakefile

g++ -O2 -pipe -march=nehalem -DCRYPTOPP_DISABLE_AESNI -fPIC -pthread -pipe -c sha-simd.cpp

In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/immintrin.h:71:0,
                 from sha-simd.cpp:16:
/usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/shaintrin.h: In function ‘void CryptoPP::SHA1_HashMultipleBlocks_SHANI(CryptoPP::word32*, const word32*, size_t, CryptoPP::ByteOrder)’:
/usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/shaintrin.h:53:1: error: inlining failed in call to always_inline ‘__m128i _mm_sha1nexte_epu32(__m128i, __m128i)’: target specific option mismatch
 _mm_sha1nexte_epu32 (__m128i __A, __m128i __B)
 ^~~~~~~~~~~~~~~~~~~
sha-simd.cpp:384:46: note: called from here
         E0 = _mm_sha1nexte_epu32(E0, E0_SAVE);
Comment 20 Jeffrey Walton 2018-05-04 14:58:37 UTC
(In reply to Alon Bar-Lev from comment #19)
> ...
> g++ -O2 -pipe -march=nehalem -DCRYPTOPP_DISABLE_AESNI -fPIC -pthread -pipe
> -c sha-simd.cpp
> 
> In file included from
> /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/immintrin.h:71:0,
>                  from sha-simd.cpp:16:
> /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/shaintrin.h: In function
> ‘void CryptoPP::SHA1_HashMultipleBlocks_SHANI(CryptoPP::word32*, const
> word32*, size_t, CryptoPP::ByteOrder)’:
> /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/shaintrin.h:53:1: error:
> inlining failed in call to always_inline ‘__m128i
> _mm_sha1nexte_epu32(__m128i, __m128i)’: target specific option mismatch
>  _mm_sha1nexte_epu32 (__m128i __A, __m128i __B)
>  ^~~~~~~~~~~~~~~~~~~
> sha-simd.cpp:384:46: note: called from here
>          E0 = _mm_sha1nexte_epu32(E0, E0_SAVE);

We just checked in a change for CRYPTOPP_DISABLE_SHANI similar to the one for CRYPTOPP_DISABLE_AESNI. Also see https://github.com/weidai11/cryptopp/commit/188e0df65008.

In general I am not sure if the strategy is the best one, though. If I am parsing things correctly (which I may not be doing), it seems like the hardware acceleration is being selectively disabled.

I think the "disable hardware" strategy was required for Crypto++ 5.6.5 and below because of [incorrect] assumptions we made. Crypto++ 6.0 changed that by switching to BASE+SIMD. BASE+SIMD allows us to cater to distros by compiling for a minimal machine in BASE. BASE is just a vanilla C++ implementation. If available at runtime, the faster implementations from SIMD (like AESNI or SHANI) are used.

We probably did not do a good job of communicating it to folks like Gentoo. Part of the problem is that I don't use Emerge, so I can't look at things and say "hey, X can probably be improved by using Y".
Comment 21 Jeffrey Walton 2018-05-04 15:07:06 UTC
@Alon,

I see there is a build log at https://bugs.gentoo.org/attachment.cgi?id=529340 . However, it is missing the invocation of the compiler. I have a couple of basic questions.

First, what compiler and version is being used?

Second, what is the command being used to compile source files?

Third, is it possible to add a V=1 to provide verbose output?

For item (3) I'm guessing I want something like the following to get a verbose output:

    make -j5 -s -f GNUmakefile all shared V=1

Sorry to rewind to the basics. I'm trying to understand what is going on so I can advise you on potential strategies that are available to you.
Comment 22 Alon Bar-Lev (RETIRED) gentoo-dev 2018-05-04 15:26:51 UTC
(In reply to Jeffrey Walton from comment #21)
> @Alon,
> 
> I see there is a build log at
> https://bugs.gentoo.org/attachment.cgi?id=529340 . However, it is missing
> the invocation of the compiler. I have a couple of basic questions.

I asked for a log without -s; see attachment #529520 [details].

> First, what compiler and version is being used?

You can see in attachment#529336 [details]:
sys-devel/gcc:            7.3.0-r1::gentoo

I can reproduce all using:
sys-devel/gcc:            6.4.0-r1::gentoo
 
> Second, what is the command being used to compile source files?

You can see this here:
https://github.com/gentoo/gentoo/blob/master/dev-libs/crypto%2B%2B/crypto%2B%2B-7.0.0-r1.ebuild#L29
 
> Third, is it possible to add a V=1 to provide verbose output?

Sure, but this does nothing that I can notice; removing the "-s" should have the same effect.

> For item (3) I'm guessing I want something like the following to get a
> verbose output:
> 
>     make -j5 -s -f GNUmakefile all shared V=1
> 
> Sorry to rewind to the basics. I'm trying to understand what is going on so
> I can advise you on potential strategies that are available to you.


That's OK. However, I did provide you with a way to reproduce this without using emerge. Emerge is just an automation of what people could do manually. Keep in mind that, unlike other distributions, Gentoo leverages the upstream build system to achieve maximum optimization of the build, which is why you see problems that are triggered in Gentoo but are unrelated to Gentoo.
Comment 23 Alon Bar-Lev (RETIRED) gentoo-dev 2018-05-04 15:36:53 UTC
(In reply to Jeffrey Walton from comment #20)
> (In reply to Alon Bar-Lev from comment #19)
> > ...
> > g++ -O2 -pipe -march=nehalem -DCRYPTOPP_DISABLE_AESNI -fPIC -pthread -pipe
> > -c sha-simd.cpp
> > 
> > In file included from
> > /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/immintrin.h:71:0,
> >                  from sha-simd.cpp:16:
> > /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/shaintrin.h: In function
> > ‘void CryptoPP::SHA1_HashMultipleBlocks_SHANI(CryptoPP::word32*, const
> > word32*, size_t, CryptoPP::ByteOrder)’:
> > /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/shaintrin.h:53:1: error:
> > inlining failed in call to always_inline ‘__m128i
> > _mm_sha1nexte_epu32(__m128i, __m128i)’: target specific option mismatch
> >  _mm_sha1nexte_epu32 (__m128i __A, __m128i __B)
> >  ^~~~~~~~~~~~~~~~~~~
> > sha-simd.cpp:384:46: note: called from here
> >          E0 = _mm_sha1nexte_epu32(E0, E0_SAVE);
> 
> We just checked in a change for CRYPTOPP_DISABLE_SHANI similar to the one
> for CRYPTOPP_DISABLE_AESNI. Also see
> https://github.com/weidai11/cryptopp/commit/188e0df65008.

Thanks, but I can see the same.

> In general I am not sure if the strategy is the best one, though. If I am
> parsing things correctly (which I may not be doing), it seems like the
> hardware acceleration is being selectively disabled.

Oh sure! You can see the "-DCRYPTOPP_DISABLE_AESNI" in my reproduction case. Gentoo is about optimization: the user can disable a hardware feature if the package uses it. Your package allows disabling hardware acceleration features, and we trigger that based on a USE flag.

> I think the "disable hardware" strategy was required for Crypto++ 5.6.5 and
> below because of [incorrect] assumptions we made. Crypto++ 6.0 changed that
> by switching to BASE+SIMD. BASE+SIMD allows us to cater to distros by
> compiling for a minimal machine in BASE. BASE is just a vanilla C++
> implementation. If available at runtime, then the faster implementation from
> SIMD are used (like AESNI or SHANI).

OK, so you perform runtime detection of CPU features and execute the matching code? This is slower than selecting at build time, but it will let me remove the use of the CRYPTOPP_DISABLE_* macros.

> We probably did not do a good job of communicating it to folks like Gentoo.
> Part of the problem is, I don't Emerge so I can't look at things and say
> "hey, X can probably be improved by using Y".

In the changelog you can announce the switch to runtime detection of optimizations and deprecate the CRYPTOPP_DISABLE_* flags. You could also emit a warning if these are set in the build system :)

Please confirm that I understood correctly: the package will be compatible with any CPU when built without special flags, and will use CPU features if available.
Comment 24 Jeffrey Walton 2018-05-04 18:22:29 UTC
(In reply to Alon Bar-Lev from comment #23)
> (In reply to Jeffrey Walton from comment #20)
> > ...
> > 
> > We just checked in a change for CRYPTOPP_DISABLE_SHANI similar to the one
> > for CRYPTOPP_DISABLE_AESNI. Also see
> > https://github.com/weidai11/cryptopp/commit/188e0df65008.
> 
> Thanks, but I can see the same.

Thanks. It sounds like something is sideways on our side.

I need to get a couple of tests written to cover cases like this.

> > In general I am not sure if the strategy is the best one, though. If I am
> > parsing things correctly (which I may not be doing), it seems like the
> > hardware acceleration is being selectively disabled.
> 
> Oh sure! you can see the "-DCRYPTOPP_DISABLE_AESNI" in my reproduction case.
> Gentoo is about optimization, user can disable hardware feature if used by
> package, your package enables disabling hardware acceleration feature, we
> trigger that based on USE flag.
> 
> > I think the "disable hardware" strategy was required for Crypto++ 5.6.5 and
> > below because of [incorrect] assumptions we made. Crypto++ 6.0 changed that
> > by switching to BASE+SIMD. BASE+SIMD allows us to cater to distros by
> > compiling for a minimal machine in BASE. BASE is just a vanilla C++
> > implementation. If available at runtime, then the faster implementation from
> > SIMD are used (like AESNI or SHANI).
> 
> OK, you perform runtime detection of CPU features and executing the correct
> code? This is slower than doing build-time, but will let me remove the use
> of the CRYPTOPP_DISABLE_* macros.

Ah, yes. This is what I am looking for. Don't disable anything. Crypto++ 6.0 should properly detect the compile-time and runtime environments.

There will be a small penalty when a compare is performed (pseudo-code):

    if (HasAESNI())
    {
        Encrypt_AESNI(...);
    }
    else
    {
        Encrypt_CXX(...);
    }

The last time I benchmarked it the penalty was not measurable. However, since then Spectre and Meltdown have appeared. It may be measurable nowadays due to restrictions on branch prediction and speculative execution.

> > We probably did not do a good job of communicating it to folks like Gentoo.
> > Part of the problem is, I don't Emerge so I can't look at things and say
> > "hey, X can probably be improved by using Y".
> 
> In the changelog you can announce that switching to runtime optimization
> detection and deprecate the CRYPTOPP_DISABLE_* flags, you could also have
> warning if these set in build system :)

Yeah, let me think about that. Our makefile does have a few warnings, and this may be another good one to have since Crypto++ 6.0 changed a lot of things.

> Please acknowledge that I understood correct and package will be compatible
> for any cpu when built without special flags and will use cpu features if
> available.

Yes, please try without the CRYPTOPP_DISABLE_* flags. If you have to use a CRYPTOPP_DISABLE_* flag with Crypto++ 6+ and a newer toolchain then something is definitely broken on our end. You may need it with, say, Fedora 15, due to an old linker or assembler, but you should not need it on a modern setup.

We still have to plot a course for the issue raised at https://github.com/weidai11/cryptopp/issues/653, but that is a separate problem.
Comment 25 Jeffrey Walton 2018-05-04 18:45:07 UTC
(In reply to Jeffrey Walton from comment #24)
> (In reply to Alon Bar-Lev from comment #23)
> ...
> > Please acknowledge that I understood correct and package will be compatible
> > for any cpu when built without special flags and will use cpu features if
> > available.
> 
> Yes, please try without the CRYPTOPP_DISABLE_* flags. If you have to use a
> CRYPTOPP_DISABLE_* flag with Crypto++ 6+ and a newer toolchain then
> something is definitely broke on our end. You may need it with say, Fedora
> 15, due to an old linker or assembler but you should not need it on a modern
> setup.

By the way... 

If possible, please test with both GCC and Clang. Both should work as expected. (I presume Gentoo supports both GCC and Clang).
Comment 26 Alon Bar-Lev (RETIRED) gentoo-dev 2018-05-04 19:22:08 UTC
(In reply to Jeffrey Walton from comment #25)
> (In reply to Jeffrey Walton from comment #24)
> > (In reply to Alon Bar-Lev from comment #23)
> > ...
> > > Please acknowledge that I understood correct and package will be compatible
> > > for any cpu when built without special flags and will use cpu features if
> > > available.
> > 
> > Yes, please try without the CRYPTOPP_DISABLE_* flags. If you have to use a
> > CRYPTOPP_DISABLE_* flag with Crypto++ 6+ and a newer toolchain then
> > something is definitely broke on our end. You may need it with say, Fedora
> > 15, due to an old linker or assembler but you should not need it on a modern
> > setup.

Done.

> 
> By the way... 
> 
> If possible, please test with both GCC and Clang. Both should work as
> expected. (I presume Gentoo supports both GCC and Clang).

Works.

Thanks!
Comment 27 Larry the Git Cow gentoo-dev 2018-05-04 19:22:49 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=b973b43800f3114ed819fc32e3024bc7bbe0c0ed

commit b973b43800f3114ed819fc32e3024bc7bbe0c0ed
Author:     Alon Bar-Lev <alonbl@gentoo.org>
AuthorDate: 2018-05-04 19:06:41 +0000
Commit:     Alon Bar-Lev <alonbl@gentoo.org>
CommitDate: 2018-05-04 19:22:34 +0000

    dev-libs/crypto++: remove cpu-flag USE
    
    Closes: https://bugs.gentoo.org/654598
    Package-Manager: Portage-2.3.24, Repoman-2.3.6

 .../crypto++/{crypto++-7.0.0-r1.ebuild => crypto++-7.0.0-r2.ebuild} | 6 ------
 1 file changed, 6 deletions(-)
Comment 28 email200202 2018-05-05 10:20:16 UTC
dev-libs/crypto++-7.0.0-r2 compiles ok.