Bug 877659 - dev-lang/perl-5.34.1-r3: build issue
Summary: dev-lang/perl-5.34.1-r3: build issue
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Perl team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-10-19 15:17 UTC by Jocelyn Mayer
Modified: 2024-05-12 02:30 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Build log for the "S0" server (S0_perl-5.34.1-r3:20221001-071921.log, 86.74 KB, text/plain)
2022-10-19 15:19 UTC, Jocelyn Mayer
Output from emerge --info =dev-lang/perl-5.34.1-r3 on the "S0" server (S0-perl_infos, 21.91 KB, text/plain)
2022-10-19 15:20 UTC, Jocelyn Mayer
Build log for the "S1" server (S1-perl-5.34.1-r3:20221001-053006.log, 100.99 KB, text/plain)
2022-10-19 15:21 UTC, Jocelyn Mayer
Output from emerge --info =dev-lang/perl-5.34.1-r3 on the "S1" server (S1-perl_infos, 23.32 KB, text/plain)
2022-10-19 15:23 UTC, Jocelyn Mayer
Build log for the laptop (perl-5.34.1-r3:20220924-055601.log, 144.75 KB, text/x-log)
2022-10-19 15:25 UTC, Jocelyn Mayer
Output from emerge --info =dev-lang/perl-5.34.1-r3 on the laptop (perl-infos, 35.80 KB, text/plain)
2022-10-19 15:26 UTC, Jocelyn Mayer

Description Jocelyn Mayer 2022-10-19 15:17:49 UTC
The build of this dev-lang/perl version fails on all my systems when it launches miniperl, which stops with the message:
Attempt to free unreferenced scalar: SV 0x561970a28330 at lib/strict.pm

Reproducible: Always

Steps to Reproduce:
1. Try to build dev-lang/perl-5.34.1-r3
2. Wait for the error message above to appear
Actual Results:  
Build fails

Expected Results:  
Build should succeed

I'm not familiar with Perl, so I don't feel able to investigate this problem myself.
As nobody seems to have seen this bug before, it may be due to some global system configuration such as hardening (on the servers), kernel address space randomization, or something else.
Logs for three systems follow:
S0: online server, hardened, minimal configuration with lots of security features
S1: local server, hardened, with more enabled features / software than S0
other: a laptop, not hardened with a lot of software installed
Comment 1 Jocelyn Mayer 2022-10-19 15:19:45 UTC
Created attachment 824755 [details]
Build log for the "S0" server
Comment 2 Jocelyn Mayer 2022-10-19 15:20:32 UTC
Created attachment 824757 [details]
Output from emerge --info =dev-lang/perl-5.34.1-r3 on the "S0" server
Comment 3 Jocelyn Mayer 2022-10-19 15:21:33 UTC
Created attachment 824759 [details]
Build log for the "S1" server
Comment 4 Jocelyn Mayer 2022-10-19 15:23:04 UTC
Created attachment 824761 [details]
Output from emerge --info =dev-lang/perl-5.34.1-r3 on "S1" server
Comment 5 Jocelyn Mayer 2022-10-19 15:25:50 UTC
Created attachment 824763 [details]
Build log for the laptop
Comment 6 Jocelyn Mayer 2022-10-19 15:26:39 UTC
Created attachment 824765 [details]
Output from emerge --info =dev-lang/perl-5.34.1-r3 on the laptop
Comment 7 Jocelyn Mayer 2022-10-19 15:29:21 UTC
Note that for the laptop I took the emerge --info output from a previous version, because it seems that when the package fails to merge, the package settings are not included at the end of the log.
Comment 8 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-10-28 07:07:15 UTC
My bet is this is somehow related to bug 821577, i.e. the -fno-strict-aliasing is getting dropped somewhere.

Is that enough of a hint for now for you to poke at it more? I don't have time atm unfortunately.
Comment 9 Jocelyn Mayer 2022-10-28 08:57:05 UTC
(In reply to Sam James from comment #8)
> My bet is this is somehow related to bug 821577, i.e. the
> -fno-strict-aliasing is getting dropped somewhere.
> 
> Is that enough of a hint for now for you to poke at it more? I don't have
> time atm unfortunately.

You're right, it seems to be exactly the same bug, the only difference being that I don't get a segfault after the crash.
I will try to investigate the build flags, as you suggest, thanks!
Comment 10 Jocelyn Mayer 2022-10-28 12:22:43 UTC
(In reply to Jocelyn Mayer from comment #9)
> (In reply to Sam James from comment #8)
> > My bet is this is somehow related to bug 821577, i.e. the
> > -fno-strict-aliasing is getting dropped somewhere.
> > 
> > Is that enough of a hint for now for you to poke at it more? I don't have
> > time atm unfortunately.
> 
> You're right, it seems to be the exact same bug with the only difference as
> I don't get a segfault after the crash.
> I will try to investigate the build flags, as you suggest, thanks !

I can confirm that the problem is due to a strict-aliasing issue.
The '-fno-strict-aliasing' flag is not dropped by the build system, but I don't enable it as a global compilation flag.
Forcing it (using a dedicated /etc/portage/env file) allowed me to build dev-lang/perl on 5 different systems with different configurations (hardening enabled / disabled, LTO enabled / disabled, ...), all amd64 based. It seems that the tests would fail for at least some configurations, but I did not investigate this.
I noticed that there are still numerous compilation warnings about potential undefined behavior, including, but not limited to, code like: s += SOMETHING(s);

For the issue I reported, forcing fno-strict-aliasing in the ebuild seems to me to be a reasonable fix.
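
For readers unfamiliar with the per-package override mentioned above, it can be sketched roughly as follows. The file name `perl-no-strict-aliasing.conf` is an assumption made for illustration; the exact file the reporter used is not shown in this bug.

```shell
# /etc/portage/env/perl-no-strict-aliasing.conf  (hypothetical file name)
# Append the flag to whatever CFLAGS the global configuration already set.
CFLAGS="${CFLAGS} -fno-strict-aliasing"

# /etc/portage/package.env would then need a line such as the following
# to apply the file above to dev-lang/perl only:
#   dev-lang/perl perl-no-strict-aliasing.conf
```

This keeps the global flags untouched and only affects the one package, which matches the workaround described in this comment.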
Comment 11 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-10-28 12:26:25 UTC
(In reply to Jocelyn Mayer from comment #10)
> (In reply to Jocelyn Mayer from comment #9)
> > (In reply to Sam James from comment #8)
> > > My bet is this is somehow related to bug 821577, i.e. the
> > > -fno-strict-aliasing is getting dropped somewhere.
> > > 
> > > Is that enough of a hint for now for you to poke at it more? I don't have
> > > time atm unfortunately.
> > 
> > You're right, it seems to be the exact same bug with the only difference as
> > I don't get a segfault after the crash.
> > I will try to investigate the build flags, as you suggest, thanks !
> 
> I can confirm that the problem is due to strict aliasing issue.
> The '-fno-strict-aliasing' flag is not dropped by the build system but I
> don't enable it as global compilation flag.

The weird thing is it's enabled by -O2. I was suspicious that perhaps your(?) "-Wstrict-aliasing" was confusing the (very weird) Perl build system (maybe it greps for "strict-aliasing" and then skips adding -fno-strict-aliasing if present).

> Forcing it to be used (using a dedicated /etc/portage/env file) made me able
> to build dev-lang/perl on 5 different systems with different configurations
> (hardened enabled / disabled, lto en / dis, ...), all amd64 based. It seems
> that test would failed at least for some configuration but I did not
> investigate this.

Fantastic! Yeah, it's obviously failing for some weird reason.

> I noticed that there still are numerous compilation warnings about potential
> undefinied behavior including, but not only, code like : s += SOMETHING(s);
> 
> For the issue I reported, forcing fno-strict-aliasing in the ebuild seems to
> me to be a reasonable fix.

Given this is the second time it's come up, I agree, let's just bung it in append-flags.
Comment 12 Larry the Git Cow gentoo-dev 2022-10-28 12:37:52 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=28d674673b7e0198d2770ec5a780966555dbbc6a

commit 28d674673b7e0198d2770ec5a780966555dbbc6a
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2022-10-28 12:36:37 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2022-10-28 12:37:31 +0000

    dev-lang/perl: always pass -fno-strict-aliasing
    
    We keep getting these bugs where it turns out the build system
    didn't pass it for us, so just unconditionally do it.
    
    Closes: https://bugs.gentoo.org/877659
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-lang/perl/{perl-5.34.1-r3.ebuild => perl-5.34.1-r4.ebuild} | 5 ++++-
 dev-lang/perl/{perl-5.36.0.ebuild => perl-5.36.0-r1.ebuild}    | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)
Comment 13 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-05-12 01:05:55 UTC
This was bugging me so came back to it.

For me, it does:
```
Checking if your compiler accepts -fno-strict-aliasing
Yes, it does.
Leaving current flags  -O3 -march=native -mtls-dialect=gnu2 -flto=jobserver -fno-semantic-interposition -pipe -fcf-protection=none -fdiagnostics-color=always -fdiagnostics-urls=never -frecord-gcc-switches -Wa,-O2 -Wa,-mtune=znver2 -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wbuiltin-declaration-mismatch -ggdb3 -Wformat -Wformat-security -Waddress -Warray-bounds -Wfree-nonheap-object -Wint-to-pointer-cast -Wmain -Wnonnull -Wodr -Wreturn-type -Wsizeof-pointer-memaccess -Wstrict-aliasing -Wstring-compare -Wuninitialized -Wvarargs -DNO_PERL_RAND_SEED -fwrapv alone.
```

With a debug print added:
```
Checking for  -O3 -march=native -mtls-dialect=gnu2 -flto=jobserver -fno-semantic-interposition -pipe -fcf-protection=none -fdiagnostics-color=always -fdiagnostics-urls=never -frecord-gcc-switches -Wa,-O2 -Wa,-mtune=znver2 -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wstrict-aliasing -Wfree-nonheap-object -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=odr -Wbuiltin-declaration-mismatch -ggdb3 -Wformat -Wformat-security -Waddress -Warray-bounds -Wfree-nonheap-object -Wint-to-pointer-cast -Wmain -Wnonnull -Wodr -Wreturn-type -Wsizeof-pointer-memaccess -Wstrict-aliasing -Wstring-compare -Wuninitialized -Wvarargs -DNO_PERL_RAND_SEED -fwrapv in *strict-aliasing*
```

i.e. it does:
```
Checking for ... -Werror=strict-aliasing ... in *strict-aliasing*
```

and obviously it matches.

The relevant snippet in Configure is:
```
: argument order is deliberate, as the flag will start with - which set could
: think is an option
checkccflag='check=$1; flag=$2; callback=$3;
echo " ";
echo "Checking if your compiler accepts $flag" >&4;
[ "X$sysroot" != "X" ] && echo "For sysroot = $sysroot";
echo "int main(void) { return 0; }" > gcctest.c;
if $cc $_sysroot -O2 $flag -o gcctest gcctest.c 2>gcctest.out && $run ./gcctest; then
    echo "Yes, it does." >&4;
    if $test -s gcctest.out ; then
        echo "But your platform does not like it:";
        cat gcctest.out;
    else
        case "$ccflags" in
        *$check*)
            echo "Leaving current flags $ccflags alone." >&4
            ;;
        *) dflt="$dflt $flag";
            eval $callback
            ;;
        esac
    fi
else
    echo "Nope, it does not, but that is ok." >&4;
fi
'
```

So, essentially, you got penalised for trying to do the right thing (by using -Werror=x).

It is unfortunate that it strips the initial -fno- prefix in the glob it checks with...

Now, all that said, I don't think we can actually do anything about this except unconditionally continue to pass -fno-strict-aliasing. But at least we know why we're doing it now. The build system gets confused by -Werror=x and it'll wrongly not add it for us then.
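
The substring match described in this comment can be reproduced in isolation. The sketch below is not Configure's code; `would_add_flag` is a hypothetical helper that only mirrors the `case "$ccflags" in *$check*)` test from the snippet above, and the flag strings are shortened examples.

```shell
would_add_flag() {
    # $1: the "check" name (e.g. strict-aliasing)
    # $2: the current ccflags string
    # Prints "yes" if the flag would be appended, "no" if ccflags is left alone.
    case "$2" in
    *$1*) echo "no" ;;   # any flag containing the substring matches
    *)    echo "yes" ;;
    esac
}

# No mention of strict-aliasing: the flag would be added.
would_add_flag strict-aliasing "-O2 -pipe"

# A warning flag contains the substring, so the flag is wrongly skipped.
would_add_flag strict-aliasing "-O2 -pipe -Werror=strict-aliasing"
```

Under these assumptions, plain `-Wstrict-aliasing` would presumably trip the same match, since it also contains the substring.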
Comment 14 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-05-12 01:08:25 UTC
There's one other consideration, maybe: I think if **we** append -fno-strict-aliasing like we are now, it ends up in the "Perl global flags", but if we let Perl's configure add it, I think it might be only for Perl itself.

So, that's a reason to figure out a way to fix it. Maybe given we know Perl will pass -fno-sa, we can filter -Werror=strict-aliasing (ew)?
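
As a rough illustration of the filtering idea, the following plain-shell sketch drops warning flags that mention a given substring while keeping everything else. `drop_flags_containing` is a hypothetical helper written for this example; in an ebuild one would more likely reach for the `filter-flags` helper from flag-o-matic, and whether that is the right fix here is left open in this bug.

```shell
drop_flags_containing() {
    # $1: substring to look for
    # $2: whitespace-separated flag string
    out=""
    # Intentionally unquoted so the string is split into individual flags.
    for f in $2; do
        case "$f" in
        -W*"$1"*) ;;                        # drop -Wfoo and -Werror=foo
        *) out="${out:+$out }$f" ;;         # keep everything else
        esac
    done
    printf '%s\n' "$out"
}

drop_flags_containing strict-aliasing \
    "-O2 -Wstrict-aliasing -Werror=strict-aliasing -pipe"
```

Only flags starting with `-W` are dropped, so an explicit `-fno-strict-aliasing` or `-fstrict-aliasing` would pass through unchanged.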
Comment 15 Eli Schwartz 2024-05-12 02:01:48 UTC
Would be quite hilarious if one day they stop requiring fno-strict-aliasing...

This sounds like an upstream bug that they check against "strict-aliasing" instead of "fno-strict-aliasing", can that be fixed upstream? Do they document a reason for this?

Do they want to catch people passing -fstrict-aliasing and not override their choice? That sounds misguided given they "know" it doesn't work. But then they should still check for both exactly.
Comment 16 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-05-12 02:30:09 UTC
(In reply to Eli Schwartz from comment #15)
> Would be quite hilarious if one day they stop requiring
> fno-strict-aliasing...
> 
> This sounds like an upstream bug that they check against "strict-aliasing"
> instead of "fno-strict-aliasing", can that be fixed upstream? Do they
> document a reason for this?
> 
> Do they want to catch people passing -fstrict-aliasing and not override
> their choice? That sounds misguided given they "know" it doesn't work. But
> then they should still check for both exactly.

```
        ?*)     set strict-aliasing -fno-strict-aliasing
                eval $checkccflag
```

Then checkccflag.. checks against the name (first arg)...

From skimming other examples of checkccflag use, this appears to be an unfortunate accident? I think?
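
A stricter check along the lines suggested in comment 15 (matching both `-f` forms exactly) might look like the sketch below. This is not an upstream patch; `has_exact_flag` and `needs_no_strict_aliasing` are hypothetical names, and the sketch assumes flags are separated by whitespace and contain no glob characters.

```shell
has_exact_flag() {
    # $1: exact flag to look for
    # $2: whitespace-separated flag string
    wanted=$1
    # Intentionally unquoted so the string is split into individual flags.
    for f in $2; do
        [ "$f" = "$wanted" ] && return 0
    done
    return 1
}

needs_no_strict_aliasing() {
    # Succeeds when the user has made no explicit choice either way.
    if has_exact_flag -fno-strict-aliasing "$1" ||
       has_exact_flag -fstrict-aliasing "$1"; then
        return 1
    fi
    return 0
}

if needs_no_strict_aliasing "-O2 -Werror=strict-aliasing"; then
    echo "would add -fno-strict-aliasing"
fi
```

With whole-word comparison, warning flags such as `-Werror=strict-aliasing` no longer suppress the addition, while an explicit `-fstrict-aliasing` or `-fno-strict-aliasing` is still respected.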