Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 916069 - sys-libs/compiler-rt: dev-util/hipcc fails to compile sensible FP16 code if not setting -O3 or -march
Summary: sys-libs/compiler-rt: dev-util/hipcc fails to compile sensible FP16 code if n...
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages (show other bugs)
Hardware: All Linux
: Normal normal (vote)
Assignee: LLVM support project
URL: https://github.com/ROCmSoftwarePlatfo...
Whiteboard:
Keywords: PATCH
Depends on:
Blocks:
 
Reported: 2023-10-21 12:13 UTC by Yiyang Wu
Modified: 2023-11-24 20:19 UTC (History)
6 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
patch used by fedora (0001-compiler-rt-Fix-FLOAT16-feature-detection.patch,1.30 KB, patch)
2023-10-21 12:16 UTC, Yiyang Wu
Details | Diff
compiler-rt-17.0.4 build and test (build-compiler-rt-17.0.4.log,136.29 KB, text/x-log)
2023-11-18 15:16 UTC, Yiyang Wu
Details
compiler-rt-16.0.6 build and test (build-compiler-rt-16.0.6.log,141.28 KB, text/x-log)
2023-11-18 15:16 UTC, Yiyang Wu
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Yiyang Wu 2023-10-21 12:13:33 UTC
Compile an pure c++ source with hipcc:

```c++
#include <iostream>
#include <algorithm>
#include <vector>

int main()
{
  const size_t count = 4;

  std::vector<_Float16> f16(count);
  std::vector<float> f32(count);

  for(size_t i = 0 ; i < count; ++i)
    f16[i] = static_cast<_Float16>(0.123) * static_cast<_Float16>(i);
  
  std::copy_n(f16.begin(), count, f32.begin());

  for(size_t i = 0; i < count; ++i)
    {
      std::cout << "loop f16=" << static_cast<float>(f16[i]) << " f32=" << f32[i] << std::endl;
    }

  _Float16 one_f16 = 0.123;
  float one_f32 = one_f16;
  std::cout << "one f32=" << one_f32 << std::endl;
  return 0;
}
```

and the program gives wrong result, if not specifying -march or -O3. specifying -march=native on Zen3 and Zen4 machine, or specify -O3, does not have this issue.

And this issue can be solved by a fedora patch: https://src.fedoraproject.org/fork/tstellar/rpms/compiler-rt/raw/0459cbc5d9eb15f1ad51d74707b4988049183708/f/0001-compiler-rt-Fix-FLOAT16-feature-detection.patch

According to https://github.com/ROCmSoftwarePlatform/rocFFT/issues/439#issuecomment-1714779352, it seems that llvm upstream does not consider merge fedora's patch. We have to apply it in Gentoo in order to resolve the bug.  

Reproducible: Always

Steps to Reproduce:
1. emerge =dev-util/hip-5.7.1
2. hipcc float16.cpp -o float16
3. ./float16
Comment 2 Yiyang Wu 2023-10-21 12:16:12 UTC
Created attachment 873194 [details, diff]
patch used by fedora
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-10-21 12:18:17 UTC
The patch looks OK to pull in, but I'm wondering if tstellar tried to submit it somewhere or not. Could you ask him?
Comment 4 Yiyang Wu 2023-10-21 12:22:54 UTC
(In reply to Sam James from comment #3)
> The patch looks OK to pull in, but I'm wondering if tstellar tried to submit
> it somewhere or not. Could you ask him?

https://github.com/ROCmSoftwarePlatform/rocFFT/issues/439#issuecomment-1714779352

About one months ago, I've asked, and he does not intend to upstream the patch. Though I does not understand why it's not an upstream bug, I either does not understand the details of the bug, so I believed his/her judge.
Comment 5 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-10-21 12:26:17 UTC
Ah, thanks. I suspect it wouldn't be rejected, but rather finding a reviewer is difficult. But it's not a blocker for us pulling it in IMO.
Comment 6 Yiyang Wu 2023-10-21 12:39:54 UTC
(In reply to Sam James from comment #5)
> Ah, thanks. I suspect it wouldn't be rejected, but rather finding a reviewer
> is difficult. But it's not a blocker for us pulling it in IMO.

Maybe, I think it it should be raised, and let upstream decide if the patch should be merged.

I have checked the patch. Basically it's an issue of whether compile llvm-project as a whole or compile each component separately. If compiled as a whole, the CMAKE_TRY_COMPILE_TARGET_TYPE is correctly set to STATIC and there's no issue. I'm not sure why upstream make CMAKE_TRY_COMPILE_TARGET_TYPE non STATIC for non-whole-llvm compilation, maybe there's a reason
Comment 7 Yiyang Wu 2023-11-18 14:56:19 UTC
https://github.com/llvm/llvm-project/pull/69842 is still in reviewing, while Mgorny already approved it, and it is currently clear that at least fix this issue in Gentoo. Shall we pull it into our patch set first?

Currently >=compiler-rt-16 needs this.
Comment 8 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-11-18 15:00:26 UTC
(In reply to Yiyang Wu from comment #7)
> https://github.com/llvm/llvm-project/pull/69842 is still in reviewing, while
> Mgorny already approved it, and it is currently clear that at least fix this
> issue in Gentoo. Shall we pull it into our patch set first?
> 
> Currently >=compiler-rt-16 needs this.

compiler-rt-16 tests fail with it which made us feel unsure about what to do with it even for 17. Please investigate.
Comment 9 Yiyang Wu 2023-11-18 15:16:10 UTC
Created attachment 875146 [details]
compiler-rt-17.0.4 build and test
Comment 10 Yiyang Wu 2023-11-18 15:16:38 UTC
Created attachment 875147 [details]
compiler-rt-16.0.6 build and test
Comment 11 Yiyang Wu 2023-11-18 15:18:16 UTC
(In reply to Sam James from comment #8)
> 
> compiler-rt-16 tests fail with it which made us feel unsure about what to do
> with it even for 17. Please investigate.

On my system 16.0.6 failed three tests and 17.0.4 is fine with the patch:

Failed Tests (2):                                                                                                                                                          
  Builtins-x86_64-linux :: extendhfsf2_test.c                                                                                                                              
  Builtins-x86_64-linux :: truncdfhf2_test.c

Is this the same failure you encountered? I will take a look at this.
Comment 12 Yiyang Wu 2023-11-18 15:30:53 UTC
I pick up extendhfsf2_test to take a look:

error in test__extendhfsf2(0x7e00) = 0.000000, expected nan

while in compiler-rt-17.0.4, the test past with consistent value nan.
Comment 13 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-11-18 16:29:24 UTC
(In reply to Yiyang Wu from comment #11)
> (In reply to Sam James from comment #8)
> > 
> > compiler-rt-16 tests fail with it which made us feel unsure about what to do
> > with it even for 17. Please investigate.
> 
> On my system 16.0.6 failed three tests and 17.0.4 is fine with the patch:
> 
> Failed Tests (2):                                                           
> 
>   Builtins-x86_64-linux :: extendhfsf2_test.c                               
> 
>   Builtins-x86_64-linux :: truncdfhf2_test.c
> 
> Is this the same failure you encountered? I will take a look at this.

Yeah, that's it. Thanks!
Comment 14 Yiyang Wu 2023-11-24 13:33:40 UTC
I figured out what cause the different behavior between compiler-rt-16 and 17:

The executed commands to compile truncdfhf2_test.c are

```
/opt/gentoo/usr/lib/llvm/16/bin/clang   -gline-tables-only  -m64  -fno-builtin -I /run/user/18014/portage/sys-libs/compiler-rt-16.0.6-r1/work/compiler-rt/lib/builtins -nodefaultlibs /run/user/18014/portage/sys-libs/compiler-rt-16.0.6-r1/work/compiler-rt/test/builtins/Unit/extendhfsf2_test.c /run/user/18014/portage/sys-libs/compiler-rt-16.0.6-r1/work/compiler-rt-16.0.6_build/lib/linux/libclang_rt.builtins-x86_64.a -lc -lm -o /run/user/18014/portage/sys-libs/compiler-rt-16.0.6-r1/work/compiler-rt-16.0.6_build/test/builtins/Unit/X86_64LinuxConfig/Output/extendhfsf2_test.c.tmp.2

/opt/gentoo/usr/lib/llvm/17/bin/clang   -gline-tables-only  -m64 -DCOMPILER_RT_HAS_FLOAT16  -fno-builtin -I /run/user/18014/portage/sys-libs/compiler-rt-17.0.4-r1/work/compiler-rt/lib/builtins -nodefaultlibs /run/user/18014/portage/sys-libs/compiler-rt-17.0.4-r1/work/compiler-rt/test/builtins/Unit/truncdfhf2_test.c /run/user/18014/portage/sys-libs/compiler-rt-17.0.4-r1/work/compiler-rt-17.0.4_build/lib/linux/libclang_rt.builtins-x86_64.a -lc -lm -o /run/user/18014/portage/sys-libs/compiler-rt-17.0.4-r1/work/compiler-rt-17.0.4_build/test/builtins/Unit/X86_64LinuxConfig/Output/truncdfhf2_test.c.tmp
```

The error is caused by missing `-DCOMPILER_RT_HAS_FLOAT16` when testing compiler-rt-16. With the patch, and with `-DCOMPILER_RT_HAS_FLOAT16` manually added, the error disappears. =

Interestingly, without that patch, but manually adding `-DCOMPILER_RT_HAS_FLOAT16`, the error comes out. And this is also true for compiler-rt-17.

So for both 16 and 17:

no patch + no -DCOMPILER_RT_HAS_FLOAT16 = pass
patch + -DCOMPILER_RT_HAS_FLOAT16 = pass
patch + no -DCOMPILER_RT_HAS_FLOAT16 = fail
no patch + -DCOMPILER_RT_HAS_FLOAT16 = fail

However the test suite automatically add `-DCOMPILER_RT_HAS_FLOAT16` if the patch is applied, and that's why it pass the test
Comment 15 Yiyang Wu 2023-11-24 13:44:32 UTC
(In reply to Yiyang Wu from comment #14)
> 
> So for both 16 and 17:
> 
> no patch + no -DCOMPILER_RT_HAS_FLOAT16 = pass
> patch + -DCOMPILER_RT_HAS_FLOAT16 = pass
> patch + no -DCOMPILER_RT_HAS_FLOAT16 = fail
> no patch + -DCOMPILER_RT_HAS_FLOAT16 = fail
> 
> However the test suite automatically add `-DCOMPILER_RT_HAS_FLOAT16` if the
> patch is applied, and that's why it pass the test

And I confirmed that https://github.com/llvm/llvm-project/commit/f2bb57c19455e77fd71be782f9ee9092f31bd2a9 fix the broken test of compiler-rt-16, by automatically append `-DCOMPILER_RT_HAS_FLOAT16` when building truncdfhf2_test.c, which makes 16 and 17 behaves the same.

So my suggestion is, the patch is OK, and it's only one problem of testing in compiler-rt-16 which is fixed by https://github.com/llvm/llvm-project/commit/f2bb57c19455e77fd71be782f9ee9092f31bd2a9 in compiler-rt-17.
Comment 16 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-11-24 19:34:45 UTC
Thank you! Let's add it then.
Comment 17 Larry the Git Cow gentoo-dev 2023-11-24 20:19:17 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=866069267e1b71a27b1dc1d718180c731696fd92

commit 866069267e1b71a27b1dc1d718180c731696fd92
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2023-11-24 20:18:33 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2023-11-24 20:19:14 +0000

    sys-libs/compiler-rt: Add FP16 fixes
    
    Closes: https://bugs.gentoo.org/916069
    Signed-off-by: Michał Górny <mgorny@gentoo.org>

 sys-libs/compiler-rt/Manifest                      |   2 +
 ...-17.0.5.ebuild => compiler-rt-16.0.6-r4.ebuild} |   1 +
 sys-libs/compiler-rt/compiler-rt-17.0.5-r1.ebuild  | 178 +++++++++++++++++++++
 3 files changed, 181 insertions(+)