ebuild(5) says binchecks Disable all QA checks for binaries. This should ONLY be used in packages for which binary checks make no sense (linux-headers and kernel-sources, for example, can safely be skipped since they have no binaries). If the binary checks need to be skipped for other reasons (such as proprietary binaries), see the QA CONTROL VARIABLES section for more specific exemptions. and kernel-2.eclass has # No need to run scanelf/strip on kernel sources/headers (bug #134453). RESTRICT="binchecks strip" But https://gitweb.gentoo.org/proj/portage.git/commit/?id=61571e4e8ee3f4f999782c542b4be84d8be8c729 intentionally removed this functionality. Our stage-building powerpc VMs have slow network storage, and the longest part of the ISO build is installing Gentoo sources, and a massive part of that is unnecessarily running scanelf over the entire source tree, which RESTRICT=binchecks was supposed to avoid. How can we fix this?
Created attachment 666137 [details, diff] pax-utils patch to use mmap MAP_SHARED instead of MAP_PRIVATE I suspect that the root cause of the performance problem may be related to mmap usage in the pax-utils _readelf_fd function (it's sub-optimal if the mmap usage causes the whole file to be transferred over the network when only the ELF header is needed). I wonder if MAP_SHARED would perform better than MAP_PRIVATE for this slow network storage scenario. So, I've prepared the attached patch for us to test with.
There's another recursive scanelf call inside bin/estrip. If we generate the NEEDED.ELF.2 file just a little earlier, then we can use the NEEDED.ELF.2 content to avoid the need to call scanelf inside estrip. Also, we can potentially minimize scanelf mmap calls by feeding it a list of ELF files that we've identified earlier by reading only the first 4 bytes of each file.
Thanks. I applied the patch to pax-utils, and I still see a significant amount of time spent running scanelf. Perhaps that's because of the other instance you found.
I think some madvise(2) calls with with DONTNEED advise based on a heuristic could possibly prevent retrieval of each whole file.
Created attachment 666413 [details, diff] pax-utils patch to use mmap MAP_SHARED + MADV_DONTNEED (In reply to Fabian Groffen from comment #4) > I think some madvise(2) calls with with DONTNEED advise based on a heuristic > could possibly prevent retrieval of each whole file. Thanks! Patch updated.
On bogsucker, I can measure no difference with either Zac's patch nor Sam's patch (https://github.com/gentoo/portage/pull/794). Running this command, 3 times each: # time emerge gentoo-sources --quiet --nodeps baseline real 4m42.458s 4m45.121s 4m45.477s user 3m9.676s 3m12.809s 3m12.877s sys 1m44.897s 1m45.134s 1m45.443s zac's pax-utils patch real 4m46.385s 4m46.696s 4m48.320s user 3m13.302s 3m13.751s 3m15.885s sys 1m45.790s 1m45.841s 1m45.104s sam's portage patch real 4m55.568s 4m47.790s 4m45.641s user 3m22.925s 3m14.849s 3m12.768s sys 1m45.434s 1m45.992s 1m45.568s but none of that made any sense, so I added a return as the first statement of install_qa_check() and I see only a very small improvement: real 4m39.837s 4m39.181s 4m39.394s user 3m12.140s 3m12.548s 3m12.380s sys 1m40.550s 1m39.447s 1m39.978s I wish I'd taken some benchmarks when I filed this bug. I feel like the scanelf invocation itself was taking minutes then.
I realized that we've since moved /var/tmp/portage to a tmpfs mount, which helps things, so I reran the benchmark with /var/tmp/portage just on the regular network-backed file system: real 5m0.065s 4m57.032s 4m55.467s user 3m9.064s 3m7.733s 3m4.978s sys 2m9.380s 2m7.549s 2m8.210s sam's patch real 4m57.560s 4m57.770s 4m58.006s user 3m6.511s 3m8.077s 3m8.158s sys 2m9.251s 2m7.882s 2m8.039s Still now seeing the kind of improvement I'd expect, and also not seeing huge amounts of time spent in scanelf. Unsure what's going on.
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage.git/commit/?id=bb88e766897f5e7e0b0a10c48cf99a04edb73a40 commit bb88e766897f5e7e0b0a10c48cf99a04edb73a40 Author: Sam James <sam@gentoo.org> AuthorDate: 2022-03-05 05:32:45 +0000 Commit: Sam James <sam@gentoo.org> CommitDate: 2022-03-28 06:20:45 +0000 estrip: avoid calling scanelf twice We can use the previous scanelf data to not call it again to find all of the dynamically linked executables/libraries which need stripping. Bug: https://bugs.gentoo.org/749624 Closes: https://github.com/gentoo/portage/pull/794 Reviewed-by: Zac Medico <zmedico@gentoo.org> Signed-off-by: Sam James <sam@gentoo.org> bin/estrip | 3 ++- bin/misc-functions.sh | 29 +++++++++++++++-------------- 2 files changed, 17 insertions(+), 15 deletions(-)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage.git/commit/?id=095d8c5b040eb399c245db1d3923d6a85747bbcf commit 095d8c5b040eb399c245db1d3923d6a85747bbcf Author: Benda Xu <heroxbd@gentoo.org> AuthorDate: 2022-07-18 15:04:02 +0000 Commit: Sam James <sam@gentoo.org> CommitDate: 2022-07-18 15:16:46 +0000 estrip: fix double prefix The content of build-info/NEEDED is with EPREFIX, the following sed should only prepend the paths with D. Bug: https://bugs.gentoo.org/749624 Bug: https://bugs.gentoo.org/858818 Reference: https://github.com/gentoo/portage/pull/794 Signed-off-by: Benda Xu <heroxbd@gentoo.org> Signed-off-by: Sam James <sam@gentoo.org> bin/estrip | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage.git/commit/?id=e8f0e06784549fc4c8a06e0a43ff946d6a895683 commit e8f0e06784549fc4c8a06e0a43ff946d6a895683 Author: Sam James <sam@gentoo.org> AuthorDate: 2022-08-01 01:44:02 +0000 Commit: Sam James <sam@gentoo.org> CommitDate: 2022-08-06 21:00:28 +0000 estrip: apply scanelf optimisation to EAPI 7+ / dostrip (No need to do ${D%/} etc here as dostrip is in EAPI 7+ only.) See: bb88e766897f5e7e0b0a10c48cf99a04edb73a40 Bug: https://bugs.gentoo.org/749624 Bug: https://bugs.gentoo.org/862606 Signed-off-by: Sam James <sam@gentoo.org> Closes: https://github.com/gentoo/portage/pull/879 Signed-off-by: Sam James <sam@gentoo.org> bin/estrip | 41 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 40 insertions(+), 1 deletion(-) https://gitweb.gentoo.org/proj/portage.git/commit/?id=7f6e83707daf4a7d8231b968078a1ba94d80c697 commit 7f6e83707daf4a7d8231b968078a1ba94d80c697 Author: Sam James <sam@gentoo.org> AuthorDate: 2022-08-01 00:05:47 +0000 Commit: Sam James <sam@gentoo.org> CommitDate: 2022-08-06 21:00:28 +0000 estrip: avoid spurious NEEDED warning when no ELF installed (< EAPI 7) If no ELF files are installed, then we're obviously not going to have a NEEDED file. The reason this didn't get hit sooner is because apparently very few of us are actually testing EAPI 6 actively - which figures. We only take this path for < EAPI 7 (non-dostrip) which means we're actually missing the estrip optimisation in bug #749624 for newer EAPIs! (Will be fixed in a followup). Bug: https://bugs.gentoo.org/749624 Bug: https://bugs.gentoo.org/862606 Fixes: bb88e766897f5e7e0b0a10c48cf99a04edb73a40 Signed-off-by: Sam James <sam@gentoo.org> bin/estrip | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)