Summary: | dev-lang/perl-5.34.1-r3: build issue | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Jocelyn Mayer <l_indien> |
Component: | Current packages | Assignee: | Gentoo Perl team <perl> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | eschwartz93, l_indien |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: |
https://bugs.gentoo.org/show_bug.cgi?id=821577 https://bugs.gentoo.org/show_bug.cgi?id=856112 https://github.com/Perl/perl5/issues/14556 |
||
Attachments: |
Build log for the "S0" server
output from emerge --info ==dev-lang/perl-5.34.1-r3 on "S0" server
Build log for the "S1" server
Output from emerge --info =dev-lang/perl-5.34.1-r3 on "S1" server
Build log for the laptop
Output from emerge --info =dev-lang/perl-5.34.1-r3 on the laptop |
Description
Jocelyn Mayer
2022-10-19 15:17:49 UTC
Created attachment 824755 [details]
Build log for the "S0" server
Created attachment 824757 [details]
output from emerge --info ==dev-lang/perl-5.34.1-r3 on "S0" server
Created attachment 824759 [details]
Build log for the "S1" server
Created attachment 824761 [details]
Output from emerge --info =dev-lang/perl-5.34.1-r3 on "S1" server
Created attachment 824763 [details]
Build log for the laptop
Created attachment 824765 [details]
Output from emerge --info =dev-lang/perl-5.34.1-r3 on the laptop
Note that I took the emerge --info output for a previous version on the laptop because it seems that, when a package fails to merge, the package settings are not included at the end of the log.

My bet is this is somehow related to bug 821577, i.e. the -fno-strict-aliasing is getting dropped somewhere.

Is that enough of a hint for now for you to poke at it more? I don't have time atm unfortunately.

(In reply to Sam James from comment #8)
> My bet is this is somehow related to bug 821577, i.e. the
> -fno-strict-aliasing is getting dropped somewhere.
>
> Is that enough of a hint for now for you to poke at it more? I don't have
> time atm unfortunately.

You're right, it seems to be the exact same bug, the only difference being that I don't get a segfault after the crash.
I will try to investigate the build flags, as you suggest, thanks!

(In reply to Jocelyn Mayer from comment #9)
> (In reply to Sam James from comment #8)
> > My bet is this is somehow related to bug 821577, i.e. the
> > -fno-strict-aliasing is getting dropped somewhere.
> >
> > Is that enough of a hint for now for you to poke at it more? I don't have
> > time atm unfortunately.
>
> You're right, it seems to be the exact same bug, the only difference being
> that I don't get a segfault after the crash.
> I will try to investigate the build flags, as you suggest, thanks!

I can confirm that the problem is due to a strict-aliasing issue.
The '-fno-strict-aliasing' flag is not dropped by the build system, but I don't enable it as a global compilation flag.
Forcing it to be used (via a dedicated /etc/portage/env file) allowed me to build dev-lang/perl on 5 different systems with different configurations (hardened enabled / disabled, LTO enabled / disabled, ...), all amd64 based. It seems that the tests would fail for at least some configurations, but I did not investigate this.
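The per-package override mentioned above could look roughly like this. It is only a sketch: the reporter did not post the actual file, so the file name `no-strict-aliasing.conf` and the choice to touch only CFLAGS are assumptions.

```shell
# /etc/portage/env/no-strict-aliasing.conf  (hypothetical file name)
# Append the flag to whatever CFLAGS make.conf already provides.
CFLAGS="${CFLAGS} -fno-strict-aliasing"

# The env file is then tied to the package with one line in
# /etc/portage/package.env (this line is not shell syntax):
#   dev-lang/perl no-strict-aliasing.conf
```

With this in place the flag applies to dev-lang/perl only and the global flags in make.conf stay untouched.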
I noticed that there are still numerous compilation warnings about potential undefined behavior, including, but not limited to, code like:
s += SOMETHING(s);

For the issue I reported, forcing -fno-strict-aliasing in the ebuild seems to me to be a reasonable fix.

(In reply to Jocelyn Mayer from comment #10)
> (In reply to Jocelyn Mayer from comment #9)
> > (In reply to Sam James from comment #8)
> > > My bet is this is somehow related to bug 821577, i.e. the
> > > -fno-strict-aliasing is getting dropped somewhere.
> > >
> > > Is that enough of a hint for now for you to poke at it more? I don't have
> > > time atm unfortunately.
> >
> > You're right, it seems to be the exact same bug, the only difference being
> > that I don't get a segfault after the crash.
> > I will try to investigate the build flags, as you suggest, thanks!
>
> I can confirm that the problem is due to a strict-aliasing issue.
> The '-fno-strict-aliasing' flag is not dropped by the build system, but I
> don't enable it as a global compilation flag.

The weird thing is it's enabled by -O2. I was suspicious that perhaps your(?) "-Wstrict-aliasing" was confusing the (very weird) Perl build system (maybe it greps for "strict-aliasing" and then skips adding -fno-strict-aliasing if present).

> Forcing it to be used (via a dedicated /etc/portage/env file) allowed me
> to build dev-lang/perl on 5 different systems with different configurations
> (hardened enabled / disabled, LTO enabled / disabled, ...), all amd64 based.
> It seems that the tests would fail for at least some configurations, but I
> did not investigate this.

Fantastic! Yeah, it's obviously failing for some weird reason.

> I noticed that there are still numerous compilation warnings about potential
> undefined behavior, including, but not limited to, code like:
> s += SOMETHING(s);
>
> For the issue I reported, forcing -fno-strict-aliasing in the ebuild seems to
> me to be a reasonable fix.
Given this is the second time it's come up, I agree, let's just bung it in append-flags.

The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=28d674673b7e0198d2770ec5a780966555dbbc6a

commit 28d674673b7e0198d2770ec5a780966555dbbc6a
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2022-10-28 12:36:37 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2022-10-28 12:37:31 +0000

    dev-lang/perl: always pass -fno-strict-aliasing

    We keep getting these bugs where it turns out the build system didn't
    pass it for us, so just unconditionally do it.

    Closes: https://bugs.gentoo.org/877659
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-lang/perl/{perl-5.34.1-r3.ebuild => perl-5.34.1-r4.ebuild} | 5 ++++-
 dev-lang/perl/{perl-5.36.0.ebuild => perl-5.36.0-r1.ebuild}     | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

This was bugging me so came back to it.

For me, it does:
```
Checking if your compiler accepts -fno-strict-aliasing
Yes, it does.
Leaving current flags -O3 -march=native -mtls-dialect=gnu2 -flto=jobserver -fno-semantic-interposition -pipe -fcf-protection=none -fdiagnostics-color=always -fdiagnostics-urls=never -frecord-gcc-switches -Wa,-O2 -Wa,-mtune=znver2 -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wbuiltin-declaration-mismatch -ggdb3 -Wformat -Wformat-security -Waddress -Warray-bounds -Wfree-nonheap-object -Wint-to-pointer-cast -Wmain -Wnonnull -Wodr -Wreturn-type -Wsizeof-pointer-memaccess -Wstrict-aliasing -Wstring-compare -Wuninitialized -Wvarargs -DNO_PERL_RAND_SEED -fwrapv alone.
```

With a debug print added:
```
Checking for -O3 -march=native -mtls-dialect=gnu2 -flto=jobserver -fno-semantic-interposition -pipe -fcf-protection=none -fdiagnostics-color=always -fdiagnostics-urls=never -frecord-gcc-switches -Wa,-O2 -Wa,-mtune=znver2 -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wbuiltin-declaration-mismatch -ggdb3 -Wformat -Wformat-security -Waddress -Warray-bounds -Wfree-nonheap-object -Wint-to-pointer-cast -Wmain -Wnonnull -Wodr -Wreturn-type -Wsizeof-pointer-memaccess -Wstrict-aliasing -Wstring-compare -Wuninitialized -Wvarargs -DNO_PERL_RAND_SEED -fwrapv in *strict-aliasing*
```

i.e. it does:
```
Checking for ... -Werror=strict-aliasing ... in *strict-aliasing*
```

and obviously it matches.

The relevant snippet in Configure is:
```
: argument order is deliberate, as the flag will start with - which set could
: think is an option
checkccflag='check=$1; flag=$2; callback=$3;
echo " ";
echo "Checking if your compiler accepts $flag" >&4;
[ "X$sysroot" != "X" ] && echo "For sysroot = $sysroot";
echo "int main(void) { return 0; }" > gcctest.c;
if $cc $_sysroot -O2 $flag -o gcctest gcctest.c 2>gcctest.out && $run ./gcctest; then
    echo "Yes, it does." >&4;
    if $test -s gcctest.out ; then
        echo "But your platform does not like it:";
        cat gcctest.out;
    else
        case "$ccflags" in
        *$check*)
            echo "Leaving current flags $ccflags alone." >&4
            ;;
        *) dflt="$dflt $flag";
            eval $callback
            ;;
        esac
    fi
else
    echo "Nope, it does not, but that is ok." >&4;
fi
'
```

So, essentially, you got penalised for trying to do the right thing (by using -Werror=x). It is unfortunate that it strips the initial -fno- prefix in the glob it checks with...

Now, all that said, I don't think we can actually do anything about this except unconditionally continue to pass -fno-strict-aliasing. But at least we know why we're doing it now.
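The substring match described above can be reproduced outside Configure. The following is a minimal sketch, not upstream code: `already_has` is a hypothetical helper that mirrors only the `case "$ccflags" in *$check*)` branch of `checkccflag` and skips the compiler probe entirely.

```shell
# already_has CHECK CCFLAGS
# Hypothetical helper: reports what the glob test in checkccflag would decide.
already_has() {
    check=$1
    ccflags=$2
    case "$ccflags" in
        *$check*) echo "leaving flags alone" ;;   # any substring hit counts
        *)        echo "would append flag" ;;
    esac
}

already_has strict-aliasing "-O2 -pipe"                          # would append flag
already_has strict-aliasing "-O2 -pipe -fno-strict-aliasing"     # leaving flags alone
already_has strict-aliasing "-O2 -pipe -Werror=strict-aliasing"  # leaving flags alone
```

The last call is the case from this bug: a warning option that merely contains the string "strict-aliasing" is enough to suppress the addition of -fno-strict-aliasing.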
The build system gets confused by -Werror=x and it'll wrongly not add it for us then.

There's one other consideration, maybe: I think if **we** append -fno-strict-aliasing like we are now, it ends up in the "Perl global flags", but if we let Perl's configure add it, I think it might be only for Perl itself.

So, that's a reason to figure out a way to fix it. Maybe given we know Perl will pass -fno-sa, we can filter -Werror=strict-aliasing (ew)?

Would be quite hilarious if one day they stop requiring fno-strict-aliasing...

This sounds like an upstream bug that they check against "strict-aliasing" instead of "fno-strict-aliasing", can that be fixed upstream? Do they document a reason for this?

Do they want to catch people passing -fstrict-aliasing and not override their choice? That sounds misguided given they "know" it doesn't work. But then they should still check for both exactly.

(In reply to Eli Schwartz from comment #15)
> Would be quite hilarious if one day they stop requiring
> fno-strict-aliasing...
>
> This sounds like an upstream bug that they check against "strict-aliasing"
> instead of "fno-strict-aliasing", can that be fixed upstream? Do they
> document a reason for this?
>
> Do they want to catch people passing -fstrict-aliasing and not override
> their choice? That sounds misguided given they "know" it doesn't work. But
> then they should still check for both exactly.

```
?*) set strict-aliasing -fno-strict-aliasing
    eval $checkccflag
```

Then checkccflag.. checks against the name (first arg)...

From skimming other examples of checkccflag use, this appears to be an unfortunate accident? I think?
|
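For illustration, the "check for both exactly" idea from comment #15 could be sketched as follows. This is not a proposed upstream patch: `has_exact_flag` and `needs_append` are hypothetical helpers, and the sketch assumes the flags are separated by single spaces.

```shell
# has_exact_flag FLAG CCFLAGS
# Hypothetical helper: succeeds only if FLAG is a whole word in CCFLAGS.
has_exact_flag() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# needs_append CCFLAGS
# Leaves the flags alone only when the user chose an aliasing mode explicitly.
needs_append() {
    if has_exact_flag -fno-strict-aliasing "$1" ||
       has_exact_flag -fstrict-aliasing "$1"; then
        echo "leave alone"
    else
        echo "append -fno-strict-aliasing"
    fi
}

needs_append "-O2 -Werror=strict-aliasing"   # append -fno-strict-aliasing
needs_append "-O2 -fno-strict-aliasing"      # leave alone
```

Matching whole words means -Wstrict-aliasing and -Werror=strict-aliasing no longer count as a user decision about aliasing, which is the behaviour the thread expected from Configure.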