| Summary: | dev-libs/ncnn-20200413 - In file included from .../work/ncnn-20200413/src/layer/x86/convolution_x86.cpp:22: /usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/include/fmaintrin.h:63:1: error: inlining failed in call to always_inline ‘__m256 _mm256_fmadd_ps(__m256 ... | ||
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | Sergey 'L29Ah' Alirzaev <zl29ah> |
| Component: | Current packages | Assignee: | Piotr Karbowski (RETIRED) <slashbeast> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | ionen, sam |
| Priority: | Normal | ||
| Version: | unspecified | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Package list: | Runtime testing required: | --- | |
| Attachments: |
build log
emerge --info |
||
Created attachment 647418 [details]
build log
Created attachment 647420 [details]
emerge --info
It seems that march=ivybridge triggers it; it works here without march and with march=znver2. Can you confirm that dropping march fixes your issue? If so, then we need to bring it up upstream. There's a new ncnn version that I plan to package this weekend, so maybe that one will have better results.
Yeah, dropping march=ivybridge makes it build.
This still seems to be present in version 20210525.
-march=ivybridge -mno-avx will also work for this.
The current case here seems to be caused by the lack of FMA, but any case where AVX is available without AVX2 will lead to similar issues. I don't think this code gets tested much on AVX without AVX2.
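To illustrate the AVX-without-FMA case, here is a minimal sketch of a guard that keeps the fused intrinsic behind `__FMA__` and falls back to a separate multiply and add otherwise. This is not upstream's code or fix. The function name `multiply_accumulate` and the loop shape are made up for the example; only the intrinsics and the predefined macros are real.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical helper: acc[i] += a[i] * b[i] for n floats.
// n must be a multiple of 8 so the AVX path needs no tail handling.
void multiply_accumulate(const float* a, const float* b, float* acc, std::size_t n)
{
    assert(n % 8 == 0);
#if defined(__AVX__)
    for (std::size_t i = 0; i < n; i += 8)
    {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        __m256 vacc = _mm256_loadu_ps(acc + i);
#if defined(__FMA__)
        // FMA enabled (e.g. -march=haswell or -mfma): fused intrinsic is usable.
        vacc = _mm256_fmadd_ps(va, vb, vacc);
#else
        // AVX without FMA (e.g. -march=ivybridge): separate multiply and add.
        vacc = _mm256_add_ps(_mm256_mul_ps(va, vb), vacc);
#endif
        _mm256_storeu_ps(acc + i, vacc);
    }
#else
    // No AVX at all: plain scalar loop.
    for (std::size_t i = 0; i < n; ++i)
        acc[i] += a[i] * b[i];
#endif
}
```

Built with -march=ivybridge this takes the multiply-and-add branch, so `_mm256_fmadd_ps` is never referenced and the "target specific option mismatch" error should not occur.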
If we wanted to work around this from the ebuild until upstream gets to revising these, the "dirty" way would be either:
A) find src -type f -exec sed -i s/__AVX__/__AVX2__/g {} + || die
or
B) use cpu_flags_x86_avx2 || append-flags -mno-avx
For 'B' I don't particularly like the idea of introducing a USE for a workaround, 'A' might be worth some consideration (not that I like it much either).
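To make the effect of option 'A' concrete, the sketch below assumes the AVX code in the ncnn sources sits behind `#if __AVX__` style guards, which the sed expression suggests but this report does not show. All names (`Path`, `path_with_original_guard`, `path_after_option_a`, `option_a_still_exposed`) are hypothetical and only model which branch the preprocessor would pick for a given set of compiler flags.

```cpp
// Which branch a guarded block would compile to for the current build flags.
enum class Path { Avx, Fallback };

// Guard as assumed in the unpatched sources: `#if __AVX__`.
constexpr Path path_with_original_guard()
{
#if defined(__AVX__)
    return Path::Avx;
#else
    return Path::Fallback;
#endif
}

// Guard after option 'A' rewrites __AVX__ to __AVX2__.
constexpr Path path_after_option_a()
{
#if defined(__AVX2__)
    return Path::Avx;
#else
    return Path::Fallback;
#endif
}

// True when the rewritten guard would still compile FMA intrinsics
// without FMA being enabled (e.g. -mavx2 given without -mfma).
constexpr bool option_a_still_exposed()
{
#if defined(__AVX2__) && !defined(__FMA__)
    return true;
#else
    return false;
#endif
}

// Enabling AVX2 also enables AVX, so the rewritten guard can only ever
// narrow the set of builds that take the AVX branch.
static_assert(path_after_option_a() == Path::Fallback
                  || path_with_original_guard() == Path::Avx,
              "AVX2 without AVX is not expected");
```

With -march=ivybridge the original guard selects the AVX branch and the rewritten one selects the fallback, which is why the sed avoids the error there. Note that GCC's -mavx2 on its own does not turn on -mfma, so option 'A' relies on typical -march values that enable AVX2 also enabling FMA.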
The bug has been closed via the following commit(s):
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=182a324f44e60684c6c72750ead69fb7149b3012
commit 182a324f44e60684c6c72750ead69fb7149b3012
Author: Ionen Wolkens <sudinave@gmail.com>
AuthorDate: 2021-05-29 07:38:30 +0000
Commit: Piotr Karbowski <slashbeast@gentoo.org>
CommitDate: 2021-05-31 19:30:38 +0000
dev-libs/ncnn: add 20210525
ebuild updates:
- respect $(get_libdir)
- build shared library over static (this new version also exports additional symbols needed by waifu2x-ncnn-vulkan for shared linking)
- install more docs
- add IUSE for tools and vulkan
- add ZLIB license for *_mathfun.h
- add temporary workaround for bug 730468
- scrubbed previous patch and added upstream issue link (still needed)
Closes: https://bugs.gentoo.org/730468
Signed-off-by: Ionen Wolkens <sudinave@gmail.com>
Signed-off-by: Piotr Karbowski <slashbeast@gentoo.org>
dev-libs/ncnn/Manifest | 1 +
dev-libs/ncnn/files/ncnn-fix-glslang-include.patch | 10 +--
dev-libs/ncnn/metadata.xml | 4 ++
dev-libs/ncnn/ncnn-20210525.ebuild | 76 ++++++++++++++++++++++
4 files changed, 84 insertions(+), 7 deletions(-)
The bug has been referenced in the following commit(s):
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=92a9da73390b1f2d90011e6b6a6fa8cf25512cec
commit 92a9da73390b1f2d90011e6b6a6fa8cf25512cec
Author: Ionen Wolkens <ionen@gentoo.org>
AuthorDate: 2022-04-20 01:06:53 +0000
Commit: Ionen Wolkens <ionen@gentoo.org>
CommitDate: 2022-04-20 02:39:39 +0000
dev-libs/ncnn: add 20220419
Also remove now unnecessary avx workaround wrt bug #730468, entire usage was refactored upstream and the sed now cause issues instead.
Bug: https://bugs.gentoo.org/730468
Signed-off-by: Ionen Wolkens <ionen@gentoo.org>
dev-libs/ncnn/Manifest | 1 +
dev-libs/ncnn/ncnn-20220419.ebuild | 73 ++++++++++++++++++++++++++++++++++++++
2 files changed, 74 insertions(+) |
FAILED: src/CMakeFiles/ncnn.dir/layer/x86/convolution_x86.cpp.o
/usr/lib/ccache/bin/x86_64-pc-linux-gnu-g++ -I/var/tmp/portage/dev-libs/ncnn-20200413/work/ncnn-20200413/src -Isrc -I/var/tmp/portage/dev-libs/ncnn-20200413/work/ncnn-20200413/src/layer -O2 -pipe -march=ivybridge -mtune=native -ftree-vectorize -malign-data=cacheline -mtls-dialect=gnu2 -fPIC -Wall -Wextra -Wno-unused-function -fvisibility=hidden -fvisibility-inlines-hidden -fopenmp -MD -MT src/CMakeFiles/ncnn.dir/layer/x86/convolution_x86.cpp.o -MF src/CMakeFiles/ncnn.dir/layer/x86/convolution_x86.cpp.o.d -o src/CMakeFiles/ncnn.dir/layer/x86/convolution_x86.cpp.o -c /var/tmp/portage/dev-libs/ncnn-20200413/work/ncnn-20200413/src/layer/x86/convolution_x86.cpp
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/include/immintrin.h:107,
                 from /var/tmp/portage/dev-libs/ncnn-20200413/work/ncnn-20200413/src/layer/x86/convolution_x86.cpp:22:
/usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/include/fmaintrin.h: In function ‘_ZN4ncnnL24conv3x3s1_winograd23_sseERKNS_3MatERS0_S2_S2_RKNS_6OptionE._omp_fn.1’:
/usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/include/fmaintrin.h:63:1: error: inlining failed in call to always_inline ‘__m256 _mm256_fmadd_ps(__m256, __m256, __m256)’: target specific option mismatch
   63 | _mm256_fmadd_ps (__m256 __A, __m256 __B, __m256 __C)
      | ^~~~~~~~~~~~~~~
In file included from /var/tmp/portage/dev-libs/ncnn-20200413/work/ncnn-20200413/src/layer/x86/convolution_x86.cpp:32:
/var/tmp/portage/dev-libs/ncnn-20200413/work/ncnn-20200413/src/layer/x86/convolution_3x3.h:506:45: note: called from here
  506 | _sum3n = _mm256_fmadd_ps(_r0n, _k3n, _sum3n);
      |          ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
and a lot of similar errors