Summary: | [Tracker] >=sys-libs/glibc-2.28 changes behavior of getdents() readdir() and friends, breaking qemu emulation | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | David Flogeras <dflogeras2> |
Component: | Current packages | Assignee: | Gentoo Toolchain Maintainers <toolchain> |
Status: | CONFIRMED --- | ||
Severity: | major | CC: | alexander, chewi, sam |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: | https://bugs.gentoo.org/show_bug.cgi?id=471102 https://bugs.gentoo.org/show_bug.cgi?id=722158 https://bugs.gentoo.org/show_bug.cgi?id=922642 | ||
Whiteboard: | patched in gentoo | ||
Package list: | Runtime testing required: | --- | |
Bug Depends on: | 681894, 681896, 684884, 686502, 833191 | ||
Bug Blocks: | 663916 |
Description
David Flogeras
2019-03-26 10:37:25 UTC
Interesting. Given the structure of the problem that you describe, we would need to fix the programs using glibc... So let's make this a tracker and file bugs for the packages that are affected. Practical question: how "usual" is it to still have a 32bit OS *not* using LFS? (Don't we kind of force this on by now?)

Where is it forced on? I'm certainly not well versed in this, and perhaps my chroot is just in a bad state and it's not as big an issue as it would seem?

_FILE_OFFSET_BITS=64 is set on a package-by-package basis. On autoconf build systems, the AC_SYS_LARGEFILE macro enables passing the proper flags. https://bugs.gentoo.org/471102 has more details.

Hmm, I didn't see that other bug initially. Should we close this as a duplicate and add bugs to the other tracker? Or just proceed with attaching them here?

There's a relevant discussion on LKML: https://lkml.org/lkml/2018/12/27/155 It seems to have more to do with calling getdents on a 64bit kernel, where the guest expects 32bit behaviour. It definitely affects 32bit qemu; I'm not sure yet whether it affects a 32bit native (i686) chroot without qemu emulation.

And here's the glibc bug: https://sourceware.org/bugzilla/show_bug.cgi?id=23960 There seems to be a conundrum: qemu thinks glibc should fix it, glibc thinks the kernel should fix it, and the kernel did nothing. Now glibc thinks distros should fix it by forcing LFS on. I haven't had luck building sandbox with LFS turned on (see #681892).

Hello. I have created a set of patches that add a new "getdents64_x32" syscall: https://bugzilla.kernel.org/show_bug.cgi?id=205957 They are hard to apply, so I've provided the order in which to apply them. I am now able to build applications inside an ARM environment (via qemu user) without any problem (no LFS workarounds required).

Thanks for the patches, Andrew! We are unlikely to apply them in Gentoo, as they change the syscall surface quite a bit. But it's a nice illustration of a problem in the kernel/user interface.

I've been in contact with the glibc developer who made the initial changes that broke this.
I've tested a preliminary patch he's working on to correct this behaviour (or at least put it back to the semi-broken way it was before glibc-2.28). The initial patch makes it possible to work around the issue with qemu, but it also seems to break something on ARM. He has also put the patch forward for review by the glibc maintainers. I'll update here as things progress.

PS As discussed here (I just noticed the link was broken when I first reported): https://bugs.launchpad.net/qemu/+bug/1805913 Some people have had success working around the issue by moving their chroots to a non-ext4 filesystem in the interim. The issue stems from the fact that ext4 always stores a 64bit hash in the d_off field (of type off_t) of struct dirent, and when a 64bit host qemu gets the 64bit value from the 64bit kernel, it cannot fit it into the 32bit field in the guest. Alternatively, you could compile qemu as a 32bit application. YMMV

Wow, I dropped the ball on this. A patchset was put forward in October 2020: https://sourceware.org/pipermail/libc-alpha/2020-October/118274.html I have tested it, and it seems to reverse the bad behaviour discussed here in a 32bit qemu chroot running atop a 64bit kernel, on an ext4 filesystem that uses a 64bit hash in d_off. Maybe if more of us replied to the thread on the ML (it seems to have been reviewed, then fell off the horizon), we could get this accepted into mainline glibc. Currently I have been hand patching my chroot to use 2.33 plus those patches, but it's only a matter of time before the patchset falls behind.

The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=33d1e367129c3ddef0558e989fbd75a8a39738ea

commit 33d1e367129c3ddef0558e989fbd75a8a39738ea
Author: Andreas K. Hüttel <dilfridge@gentoo.org>
AuthorDate: 2022-01-12 22:00:13 +0000
Commit: Andreas K. Hüttel <dilfridge@gentoo.org>
CommitDate: 2022-01-12 23:00:07 +0000

    sys-libs/glibc: 2.34 patchset 11 and re-keyword

    Bug: https://bugs.gentoo.org/681790
    Package-Manager: Portage-3.0.30, Repoman-3.0.3
    Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>

 sys-libs/glibc/Manifest | 2 +-
 sys-libs/glibc/glibc-2.34-r6.ebuild | 5 ++---
 2 files changed, 3 insertions(+), 4 deletions(-)
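
For readers new to this tracker, the failure mode described in the comments above (a 64bit directory offset from ext4 that has to land in a 32bit field in the guest) can be sketched in a few lines of Python. This is only a model of the arithmetic, not glibc or qemu code: the function names and the sample values are hypothetical, and reporting the failure as EOVERFLOW is an assumption based on the linked glibc bug rather than something stated in this thread.

```python
import errno

# Range of a signed 32-bit off_t, as used by a non-LFS 32-bit guest.
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def narrow_d_off(d_off64: int) -> int:
    """Narrow a 64-bit directory offset to a signed 32-bit value.

    Raises OSError(EOVERFLOW) when the value does not fit. The choice of
    errno is an assumption made for illustration.
    """
    if not INT32_MIN <= d_off64 <= INT32_MAX:
        raise OSError(errno.EOVERFLOW, "d_off does not fit in a 32-bit off_t")
    return d_off64


def try_narrow(d_off64: int) -> str:
    """Return a short status string instead of raising, for easy printing."""
    try:
        return f"ok: {narrow_d_off(d_off64)}"
    except OSError as exc:
        return f"error: {errno.errorcode[exc.errno]}"


# Hypothetical values: a small sequential offset, and a 64-bit hash-like
# cookie of the kind the thread says ext4 places in d_off.
small_offset = 4096
hash_like_cookie = 0x7A3F9C5D12E4B801

print(try_narrow(small_offset))
print(try_narrow(hash_like_cookie))
```

A program built with LFS avoids the narrowing step because its off_t is 64 bits wide, which is why forcing LFS on is discussed above as one way out.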