The build of this dev-lang/perl version fails on all my systems while launching miniperl, which stops with the message:
Attempt to free unreferenced scalar: SV 0x561970a28330 at lib/strict.pm

Reproducible: Always

Steps to Reproduce:
1. Try to build dev-lang/perl-5.34.1-r3
2. Wait for the message above

Actual Results:
Build fails

Expected Results:
Build should succeed

I'm not familiar with Perl, so I don't feel able to investigate this problem myself. As it seems that nobody has seen this bug before, it may be due to some global system configuration such as hardening (on servers), kernel memory mapping randomization, or something else.

Logs follow for 3 systems:
S0: online server, hardened, minimal configuration with lots of security features
S1: local server, hardened, with more features and software enabled than S0
other: a laptop, not hardened, with a lot of software installed
Created attachment 824755 [details] Build log for the "S0" server
Created attachment 824757 [details] Output from emerge --info =dev-lang/perl-5.34.1-r3 on "S0" server
Created attachment 824759 [details] Build log for the "S1" server
Created attachment 824761 [details] Output from emerge --info =dev-lang/perl-5.34.1-r3 on "S1" server
Created attachment 824763 [details] Build log for the laptop
Created attachment 824765 [details] Output from emerge --info =dev-lang/perl-5.34.1-r3 on the laptop
Note that for the laptop I took the emerge --info output for a previous version, because it seems that when a package fails to merge, the package settings are not included at the end of the log.
My bet is this is somehow related to bug 821577, i.e. the -fno-strict-aliasing is getting dropped somewhere. Is that enough of a hint for now for you to poke at it more? I don't have time atm unfortunately.
(In reply to Sam James from comment #8)
> My bet is this is somehow related to bug 821577, i.e. the
> -fno-strict-aliasing is getting dropped somewhere.
>
> Is that enough of a hint for now for you to poke at it more? I don't have
> time atm unfortunately.

You're right, it seems to be the exact same bug, the only difference being that I don't get a segfault after the crash.
I will try to investigate the build flags as you suggest, thanks!
(In reply to Jocelyn Mayer from comment #9)
> (In reply to Sam James from comment #8)
> > My bet is this is somehow related to bug 821577, i.e. the
> > -fno-strict-aliasing is getting dropped somewhere.
> >
> > Is that enough of a hint for now for you to poke at it more? I don't have
> > time atm unfortunately.
>
> You're right, it seems to be the exact same bug with the only difference as
> I don't get a segfault after the crash.
> I will try to investigate the build flags, as you suggest, thanks !

I can confirm that the problem is due to a strict aliasing issue.
The '-fno-strict-aliasing' flag is not dropped by the build system, but I don't enable it as a global compilation flag.
Forcing it (using a dedicated /etc/portage/env file) allowed me to build dev-lang/perl on 5 different systems with different configurations (hardened enabled / disabled, LTO enabled / disabled, ...), all amd64 based. It seems that tests would fail for at least some configurations, but I did not investigate this.
I noticed that there are still numerous compilation warnings about potential undefined behavior including, but not limited to, code like: s += SOMETHING(s);

For the issue I reported, forcing -fno-strict-aliasing in the ebuild seems to me to be a reasonable fix.
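For anyone wanting to reproduce the workaround, a per-package environment override along these lines should do it. This is only a sketch: the file name `no-strict-aliasing.conf` is an assumption (any name works), and the exact file used in the comment above was not posted.

```shell
# /etc/portage/env/no-strict-aliasing.conf  (hypothetical file name)
# Append the flag to whatever make.conf already sets.
CFLAGS="${CFLAGS} -fno-strict-aliasing"
CXXFLAGS="${CXXFLAGS} -fno-strict-aliasing"

# Then enable it for Perl only, with this line in /etc/portage/package.env:
#   dev-lang/perl no-strict-aliasing.conf
```

Keeping the override per-package avoids disabling the optimization for the whole system.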
(In reply to Jocelyn Mayer from comment #10)
> (In reply to Jocelyn Mayer from comment #9)
> > (In reply to Sam James from comment #8)
> > > My bet is this is somehow related to bug 821577, i.e. the
> > > -fno-strict-aliasing is getting dropped somewhere.
> > >
> > > Is that enough of a hint for now for you to poke at it more? I don't have
> > > time atm unfortunately.
> >
> > You're right, it seems to be the exact same bug with the only difference as
> > I don't get a segfault after the crash.
> > I will try to investigate the build flags, as you suggest, thanks !
>
> I can confirm that the problem is due to strict aliasing issue.
> The '-fno-strict-aliasing' flag is not dropped by the build system but I
> don't enable it as global compilation flag.

The weird thing is it's enabled by -O2. I was suspicious that perhaps your(?) "-Wstrict-aliasing" was confusing the (very weird) Perl build system (maybe it greps for "strict-aliasing" and then skips adding -fno-strict-aliasing if present).

> Forcing it to be used (using a dedicated /etc/portage/env file) made me able
> to build dev-lang/perl on 5 different systems with different configurations
> (hardened enabled / disabled, lto en / dis, ...), all amd64 based. It seems
> that tests would fail at least for some configuration but I did not
> investigate this.

Fantastic! Yeah, it's obviously failing for some weird reason.

> I noticed that there still are numerous compilation warnings about potential
> undefined behavior including, but not only, code like : s += SOMETHING(s);
>
> For the issue I reported, forcing fno-strict-aliasing in the ebuild seems to
> me to be a reasonable fix.

Given this is the second time it's come up, I agree, let's just bung it in append-flags.
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=28d674673b7e0198d2770ec5a780966555dbbc6a

commit 28d674673b7e0198d2770ec5a780966555dbbc6a
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2022-10-28 12:36:37 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2022-10-28 12:37:31 +0000

    dev-lang/perl: always pass -fno-strict-aliasing

    We keep getting these bugs where it turns out the build system didn't pass it
    for us, so just unconditionally do it.

    Closes: https://bugs.gentoo.org/877659
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-lang/perl/{perl-5.34.1-r3.ebuild => perl-5.34.1-r4.ebuild} | 5 ++++-
 dev-lang/perl/{perl-5.36.0.ebuild => perl-5.36.0-r1.ebuild}    | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)
This was bugging me so came back to it. For me, it does:
```
Checking if your compiler accepts -fno-strict-aliasing
Yes, it does.
Leaving current flags -O3 -march=native -mtls-dialect=gnu2 -flto=jobserver -fno-semantic-interposition -pipe -fcf-protection=none -fdiagnostics-color=always -fdiagnostics-urls=never -frecord-gcc-switches -Wa,-O2 -Wa,-mtune=znver2 -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wbuiltin-declaration-mismatch -ggdb3 -Wformat -Wformat-security -Waddress -Warray-bounds -Wfree-nonheap-object -Wint-to-pointer-cast -Wmain -Wnonnull -Wodr -Wreturn-type -Wsizeof-pointer-memaccess -Wstrict-aliasing -Wstring-compare -Wuninitialized -Wvarargs -DNO_PERL_RAND_SEED -fwrapv alone.
```

With a debug print added:
```
Checking for -O3 -march=native -mtls-dialect=gnu2 -flto=jobserver -fno-semantic-interposition -pipe -fcf-protection=none -fdiagnostics-color=always -fdiagnostics-urls=never -frecord-gcc-switches -Wa,-O2 -Wa,-mtune=znver2 -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wbuiltin-declaration-mismatch -ggdb3 -Wformat -Wformat-security -Waddress -Warray-bounds -Wfree-nonheap-object -Wint-to-pointer-cast -Wmain -Wnonnull -Wodr -Wreturn-type -Wsizeof-pointer-memaccess -Wstrict-aliasing -Wstring-compare -Wuninitialized -Wvarargs -DNO_PERL_RAND_SEED -fwrapv in *strict-aliasing*
```

i.e. it does:
```
Checking for ... -Werror=strict-aliasing ... in *strict-aliasing*
```
and obviously it matches.
The relevant snippet in Configure is:
```
: argument order is deliberate, as the flag will start with - which set could
: think is an option
checkccflag='check=$1; flag=$2; callback=$3;
echo " ";
echo "Checking if your compiler accepts $flag" >&4;
[ "X$sysroot" != "X" ] && echo "For sysroot = $sysroot";
echo "int main(void) { return 0; }" > gcctest.c;
if $cc $_sysroot -O2 $flag -o gcctest gcctest.c 2>gcctest.out && $run ./gcctest; then
    echo "Yes, it does." >&4;
    if $test -s gcctest.out ; then
        echo "But your platform does not like it:";
        cat gcctest.out;
    else
        case "$ccflags" in
        *$check*)
            echo "Leaving current flags $ccflags alone." >&4
            ;;
        *) dflt="$dflt $flag";
            eval $callback
            ;;
        esac
    fi
else
    echo "Nope, it does not, but that is ok." >&4;
fi
'
```

So, essentially, you got penalised for trying to do the right thing (by using -Werror=x). It is unfortunate that it strips the initial -fno- prefix in the glob it checks with...

Now, all that said, I don't think we can actually do anything about this except unconditionally continue to pass -fno-strict-aliasing. But at least we know why we're doing it now. The build system gets confused by -Werror=x and it'll wrongly not add it for us then.
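The false match is easy to reproduce outside Configure. A minimal sketch, assuming a heavily shortened flag string rather than the full set from the log; the `would_skip` helper is hypothetical and only mirrors the `case "$ccflags" in *$check*)` test:

```shell
#!/bin/sh
# Hypothetical helper mirroring the glob test in checkccflag.
# Returns 0 (true) when Configure would leave the flags alone.
would_skip() {
    check=$1
    ccflags=$2
    case "$ccflags" in
        *$check*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Shortened stand-in for the real flags (assumption, not the full log).
flags='-O3 -pipe -Werror=strict-aliasing -fwrapv'

# Checking against the bare name matches the -Werror= variant.
if would_skip strict-aliasing "$flags"; then
    echo "strict-aliasing: matched, -fno-strict-aliasing is not added"
fi

# Checking against the full flag would not match here.
if ! would_skip -fno-strict-aliasing "$flags"; then
    echo "-fno-strict-aliasing: no match, flag would be added"
fi
```

With the bare name as the pattern, any flag containing "strict-aliasing" (including -Wstrict-aliasing) is enough to suppress the addition.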
There's one other consideration, maybe: I think if **we** append -fno-strict-aliasing like we are now, it ends up in the "Perl global flags", but if we let Perl's configure add it, I think it might be only for Perl itself. So, that's a reason to figure out a way to fix it. Maybe given we know Perl will pass -fno-sa, we can filter -Werror=strict-aliasing (ew)?
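To get a feel for what filtering would involve, here is a plain-shell sketch. The `drop_flag` helper is hypothetical; in an ebuild one would presumably reach for `filter-flags` from flag-o-matic.eclass instead, and the flag string is a shortened stand-in for the real one.

```shell
#!/bin/sh
# Hypothetical helper: print the flags in $2 with every word that is
# exactly equal to $1 removed. Relies on word splitting, so it assumes
# flags contain no embedded whitespace or glob characters.
drop_flag() {
    out=
    for word in $2; do
        [ "$word" = "$1" ] || out="${out:+$out }$word"
    done
    printf '%s\n' "$out"
}

flags='-O3 -Wstrict-aliasing -Werror=strict-aliasing -fwrapv'
flags=$(drop_flag -Werror=strict-aliasing "$flags")
printf '%s\n' "$flags"

# -Wstrict-aliasing still contains the substring, so a *strict-aliasing*
# glob keeps matching until that flag is dropped as well.
case "$flags" in
    *strict-aliasing*) echo "still matches" ;;
    *)                 echo "no longer matches" ;;
esac
```

Note that filtering only -Werror=strict-aliasing would not be enough for the flags in the log above, since -Wstrict-aliasing is present too and matches the same glob.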
Would be quite hilarious if one day they stop requiring fno-strict-aliasing... This sounds like an upstream bug that they check against "strict-aliasing" instead of "fno-strict-aliasing", can that be fixed upstream? Do they document a reason for this? Do they want to catch people passing -fstrict-aliasing and not override their choice? That sounds misguided given they "know" it doesn't work. But then they should still check for both exactly.
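A sketch of what "check for both exactly" could look like, purely as an illustration and not a proposed upstream patch. Both helper names are hypothetical, and the comparison is word-by-word rather than a substring glob:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if one of the whitespace-separated
# words in $2 is exactly $1.
has_exact_flag() {
    for word in $2; do
        [ "$word" = "$1" ] && return 0
    done
    return 1
}

# Hypothetical decision helper: print "leave" when the user already
# chose either -fstrict-aliasing or -fno-strict-aliasing, else "add".
aliasing_decision() {
    if has_exact_flag -fno-strict-aliasing "$1" ||
       has_exact_flag -fstrict-aliasing "$1"; then
        echo "leave"
    else
        echo "add"
    fi
}

# Warning flags no longer count as a user choice.
aliasing_decision '-O3 -Werror=strict-aliasing -Wstrict-aliasing'
# An explicit choice is respected.
aliasing_decision '-O2 -fno-strict-aliasing'
```

Whether an explicit -fstrict-aliasing should really be respected is exactly the open question raised above; the sketch only shows that the two cases can be told apart from -W flags.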
(In reply to Eli Schwartz from comment #15)
> Would be quite hilarious if one day they stop requiring
> fno-strict-aliasing...
>
> This sounds like an upstream bug that they check against "strict-aliasing"
> instead of "fno-strict-aliasing", can that be fixed upstream? Do they
> document a reason for this?
>
> Do they want to catch people passing -fstrict-aliasing and not override
> their choice? That sounds misguided given they "know" it doesn't work. But
> then they should still check for both exactly.

```
?*) set strict-aliasing -fno-strict-aliasing
    eval $checkccflag
```

Then checkccflag.. checks against the name (first arg)...

From skimming other examples of checkccflag use, this appears to be an unfortunate accident? I think?
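The `set` / `eval` calling convention can be seen in isolation with a cut-down stand-in. `show_args` is hypothetical and replaces the real checkccflag body with a single `echo`:

```shell
#!/bin/sh
# Simplified stand-in for checkccflag: it only reports which positional
# parameter ends up in which variable.
show_args='check=$1; flag=$2; echo "check=$check flag=$flag"'

# Same calling convention as in Configure: the name goes first so that
# `set` does not mistake the flag, which starts with "-", for an option.
set strict-aliasing -fno-strict-aliasing
eval "$show_args"
```

The first argument is the bare name, and that bare name is what later gets wrapped in `*...*` for the glob comparison.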
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=69d4cc312072ad28491d2dce5798dd49d63713e5

commit 69d4cc312072ad28491d2dce5798dd49d63713e5
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2024-05-22 01:40:06 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2024-05-22 01:43:56 +0000

    perl-module.eclass: don't set 'ccflags' for Module::Build

    TL;DR: If we set 'ccflags', we're clobbering the Perl default. We're already
    setting 'optimize' which is what we're supposed to use here.

    We set ccflags *and* optimize for Module::Build (which dev-perl/Net-DNS uses),
    while we only set OPTIMIZE (case is fine) for MM (which dev-perl/Net-LibIDN2
    uses).

    ccflags clobbers the Perl default, while optimize appends. We should just
    set optimize - to not clobber what Perl sets, but also for consistency
    between the two build systems.

    (Unfortunately, this does mean we also inherit things we don't really want
    to, which don't affect ABI, like -fno-strict-aliasing, but let's live with
    it for now...)

    Bug: https://bugs.gentoo.org/261375
    Bug: https://bugs.gentoo.org/877659
    Closes: https://bugs.gentoo.org/932176
    Signed-off-by: Sam James <sam@gentoo.org>

 eclass/perl-module.eclass | 1 -
 1 file changed, 1 deletion(-)