789105 – >=dev-lang/spidermonkey-78.10.1 with lto has a 5 times longer compile time when compiled with gcc-11.1.0

Status:	RESOLVED FIXED

Alias:	None

Product:	Gentoo Linux
Classification:	Unclassified
Component:	Current packages
Hardware:	All Linux

Importance:	Normal normal
Assignee:	Mozilla Gentoo Team

URL:
Whiteboard:	no stable blocker
Keywords:

Depends on:
Blocks:	gcc-11

Reported:	2021-05-09 18:25 UTC by tt_1
Modified:	2023-04-19 09:59 UTC
CC List:	3 users

See Also:
Package list:
Runtime testing required:	---

Attachments
output from emerge --info (emerge-info, 5.09 KB, text/plain) 2021-05-09 18:25 UTC, tt_1

Description tt_1 2021-05-09 18:25:46 UTC

Created attachment 706725
output from emerge --info

compile time with gcc-11.1.0 and USE="+jit -clang -debug -lto -test" is 5 minutes 50 seconds with -j12

compile time with gcc-11.1.0 and USE="+jit +lto -clang -debug -test" is 27 minutes and 41 seconds

That's nearly five times as long, and it seems to be due to LTO not using all cores but rather struggling along with only one job.

See emerge --info for a few details; gcc-11.1.0 is the active version of gcc.

What can I do to debug this further? It's not a compile failure in the end.

Comment 1 Jonas Stein gentoo-dev 2021-05-09 20:36:42 UTC

It is sad to read that you have problems compiling the software. The situation seems to be a bit more complicated and requires some analysis.
We cannot help you efficiently via the bug tracker. The bug tracker is aimed at specific problems in ebuilds rather than at individual systems.

I have had very good experiences on the Gentoo IRC channels [1] with questions like this. Of course there are also forums and mailing lists [2,3].
I hope you understand that I will therefore close the bug here, and I wish you good luck on one of the mentioned channels [4].
Please reopen the ticket if you can point to a specific error in an ebuild or any Gentoo-related product.

[1] https://www.gentoo.org/get-involved/irc-channels/
[2] https://forums.gentoo.org/
[3] https://www.gentoo.org/get-involved/mailing-lists/all-lists.html
[4] https://www.gentoo.org/support/

Comment 2 John Helmert III archtester 2021-05-09 21:22:17 UTC

The ebuild actually does have logic to do LTO parallelization:

sed -i \
	-e "s/multiprocessing.cpu_count()/$(makeopts_jobs)/" \
	build/moz.configure/lto-pgo.configure \
	|| die "sed failed to set num_cores"

Maybe this bug is indeed valid.
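
To illustrate what that substitution does, here is a minimal sketch that applies the same sed expression to a single sample line instead of the real file. The sample line and the job count of 12 are assumptions for illustration only: the actual content of build/moz.configure/lto-pgo.configure is not quoted in this bug, and $(makeopts_jobs) is only available inside an ebuild environment.

```shell
# Hypothetical stand-in for a line in build/moz.configure/lto-pgo.configure;
# the real file content is not quoted in this bug.
sample='num_cores = multiprocessing.cpu_count()'

# Stand-in for $(makeopts_jobs), which only exists inside an ebuild environment.
jobs=12

# Same sed expression as in the ebuild, applied to the sample line.
patched=$(printf '%s\n' "$sample" | sed -e "s/multiprocessing.cpu_count()/${jobs}/")
printf '%s\n' "$patched"
```

For the sample line this prints `num_cores = 12`, i.e. the Python call is replaced by a fixed job count at patch time.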

Comment 3 Thomas Deutschmann (RETIRED) gentoo-dev 2021-05-10 01:24:30 UTC

I don't share your observations:

GCC:
real    2m11,911s
user    22m58,006s
sys     2m40,440s

LTO, GCC:
real    3m7,879s
user    28m44,101s
sys     6m22,420s

LTO, CLANG:
real    3m21,129s
user    29m19,864s
sys     3m33,536s

At first I would have guessed that ld.gold is called without the --thread-count argument, which could be the limiting factor, but clang's lld uses threads and is still slower.

So start with profiling and see where it spends time.
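
One rough first check before real profiling is the ratio (user + sys) / real, which approximates how many cores were busy on average during a build. The sketch below computes it from the timings in this comment. The helper names to_seconds and avg_cores are hypothetical, and the parsing assumes the "XmY,ZZZs" format shown above.

```shell
# Hypothetical helper: convert a bash `time` value such as "2m11,911s"
# (comma as decimal separator) to seconds.
to_seconds() {
    printf '%s\n' "$1" | tr ',' '.' |
        LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Hypothetical helper: average number of busy cores = (user + sys) / real.
avg_cores() {
    LC_ALL=C awk -v r="$(to_seconds "$1")" -v u="$(to_seconds "$2")" \
        -v s="$(to_seconds "$3")" 'BEGIN { printf "%.1f\n", (u + s) / r }'
}

avg_cores 2m11,911s 22m58,006s 2m40,440s   # GCC, no LTO
avg_cores 3m7,879s 28m44,101s 6m22,420s    # GCC, LTO
avg_cores 3m21,129s 29m19,864s 3m33,536s   # clang, LTO
```

With the figures from this comment the sketch prints 11.7, 11.2 and 9.8, so all three builds kept most cores busy on average. A long single-job LTO phase like the one described by the reporter would pull this ratio down noticeably.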

Comment 4 tt_1 2021-05-10 15:23:02 UTC

It's using only one core for all the LTO-related work in this gcc-11 tinderbox chroot, while it uses all 12 cores in the C/C++ part before the LTO linking. Were there any changes in the way gcc-11.1.0 expects -flto=$(makeopts_jobs) to be handed over?

P.S.: I tried dev-lang/rust-1.51.0-r2 instead, with the same long compile time.

Comment 5 tt_1 2021-05-10 16:03:54 UTC

output from emerge -pc gcc: 

[ebuild   R   ~] sys-devel/gcc-11.1.0:11::gentoo  USE="(cxx) fortran (multilib) nls nptl openmp (pie) sanitize ssp zstd (-ada) -custom-cflags -d -debug -doc (-fixed-point) -go -graphite (-hardened) -jit (-libssp) -lto -objc -objc++ -objc-gc -pch -pgo -systemtap -test -valgrind -vanilla -vtv" 0 KiB

Total: 1 package (1 reinstall), Size of downloads: 0 KiB

Comment 6 Thomas Deutschmann (RETIRED) gentoo-dev 2021-05-10 16:08:29 UTC

Mh, not sure: https://dev.gentoo.org/~whissi/stuff/bug789105.webm

At 2m15 it is starting with a single lto1 process but at 2m24 you see multiple processes and at 2m25 the lto-wrapper has spawned 42 jobs...
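
The same observation can be collected without a screen recording by sampling the process table while the build runs. This is only a sketch: the helper names count_lto1 and sample_lto1 are hypothetical, the one-second interval is arbitrary, and it assumes a Linux ps that prints bare command names for `-o comm=`.

```shell
# Hypothetical helper: count how many lines on stdin are exactly "lto1",
# the process name mentioned in this comment.
count_lto1() {
    grep -c -x 'lto1' || true   # grep exits 1 on zero matches but still prints 0
}

# Hypothetical helper: print a timestamp and the number of running lto1
# processes once per second, for the given number of samples.
sample_lto1() {
    i=0
    while [ "$i" -lt "$1" ]; do
        printf '%s %s\n' "$(date +%T)" "$(ps -e -o comm= | count_lto1)"
        sleep 1
        i=$((i + 1))
    done
}

# Demonstration on fixed input instead of a live process table.
printf 'cc1plus\nlto1\nlto1\n' | count_lto1
```

Running `sample_lto1 60` in a second terminal during the LTO phase would show whether the count stays at 1 or climbs as lto-wrapper spawns more jobs.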

Comment 7 tt_1 2021-06-29 18:42:46 UTC

(In reply to Thomas Deutschmann from comment #6)
> Mh, not sure: https://dev.gentoo.org/~whissi/stuff/bug789105.webm
> 
> At 2m15 it is starting with a single lto1 process but at 2m24 you see
> multiple processes and at 2m25 the lto-wrapper has spawned 42 jobs...

I reproduced it again with a fresh stage3, using llvm-11.1.0 and rust-1.51.0.

I will try again with the new stable llvm-12 toolchain and rust-1.52.1. If this combination has the same problem, can I ask you to spend a few CPU cycles to reproduce it, provided that I supply a simple reproducer? Thanks.

Comment 8 tt_1 2021-08-07 16:24:30 UTC

I'm seeing similar behavior when cross-compiling firefox[lto] with gcc-9.4.0 as the cross compiler, but only *after* I made the switch to dev-util/pkgconf.

Comment 9 tt_1 2021-08-07 17:22:19 UTC

(In reply to tt_1 from comment #8)
> I'm seeing similar behavior when cross-compiling firefox[lto] with
> gcc-9.4.0 as the cross compiler, but only *after* I made the switch to
> dev-util/pkgconf.

The compile time of firefox doesn't double. The lto-trans process stays at one job for about three minutes, then goes to the full -j12, and LTO linking finishes in around double the usual time.

In all, it's up from 46 to 49 minutes of compile time for firefox[lto].

I'm not really sure if I want to spend many hours of cpu time with debugging and hunting this heisenbug :D

Comment 10 Sam James archtester 2023-04-19 06:12:42 UTC

Is this still an issue?

Comment 11 tt_1 2023-04-19 09:59:48 UTC

Probably a bug deep in the toolchain; countless binutils and llvm LTO bugs have been fixed since I initially filed this bug. It's not a problem anymore with recent gcc-11 or gcc-12 toolchains, that's for sure.