Bug 681790 - [Tracker] >=sys-libs/glibc-2.28 changes behavior of getdents() readdir() and friends, breaking qemu emulation
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal major
Assignee: Gentoo Toolchain Maintainers
URL:
Whiteboard: patched in gentoo
Keywords:
Depends on: 681894 681896 684884 686502 glibc-2.34-stable
Blocks: glibc-2.28
Reported: 2019-03-26 10:37 UTC by David Flogeras
Modified: 2024-01-22 09:40 UTC (History)
CC List: 3 users

See Also:
Package list:
Runtime testing required: ---


Description David Flogeras 2019-03-26 10:37:25 UTC
It seems that this commit

https://sourceware.org/git/?p=glibc.git;a=commit;h=298d0e3129c0b5137f4989275b13fe30d0733c4d

has changed the old (wrong) behavior of getdents() on a 32bit OS not using Large File Support (LFS): it now correctly fails instead of "wrapping around" on >32bit inodes and offsets.  This also affects readdir().

However, this correction can manifest in horrible ways.  Software compiled without _FILE_OFFSET_BITS=64 et al. used to 'work' because it never actually looked at the d_ino or d_off fields of struct dirent; now it gets back a NULL pointer from readdir(), with errno set to EOVERFLOW.  At the very least, this causes the following:

 'shared-mime-info' fails to generate its database without any error message, breaking any consumers of the database

 'update-ca-certificates' calls c_rehash and fails silently, which breaks https for applications that use it.

 sandbox uses readdir() and can print warnings like the following (I'm not sure of the effect, but the build is probably not sandboxed):

       (sandbox) error: in /var/tmp/portage/sys-apps/sandbox-2.13/work/sandbox-2.13/libsbutil/src/file.c, function rc_ls_dir(), line 122:
       (sandbox)        strerror() = 'Value too large for defined data type'
       (sandbox)        Failed to readdir() '/etc/sandbox.d'!



I suspect that the above (and other undiscovered) programs will need to be patched with LFS support.
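For anyone auditing their own code for the silent failures described above: readdir() returns NULL both at the end of a directory and on error, so the two cases can only be told apart through errno. Below is a minimal sketch of a loop that does this. The function name count_dir_entries is made up for illustration; it is not taken from sandbox, shared-mime-info, or c_rehash.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Count the entries in `path` (hypothetical helper, for illustration only).
 * Returns the count, or -1 with errno set if opendir() or readdir() failed. */
long count_dir_entries(const char *path)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long count = 0;
    for (;;) {
        errno = 0;                       /* readdir() leaves errno alone at end of directory */
        struct dirent *ent = readdir(dir);
        if (ent == NULL) {
            if (errno != 0) {            /* a real error, e.g. EOVERFLOW */
                int saved = errno;
                fprintf(stderr, "readdir(%s): %s\n", path, strerror(saved));
                closedir(dir);
                errno = saved;           /* closedir() may have changed errno */
                return -1;
            }
            break;                       /* errno still 0: end of directory */
        }
        count++;
    }
    closedir(dir);
    return count;
}
```

Code that treats every NULL from readdir() as "end of directory" is what turns EOVERFLOW into a silent, partial directory listing.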

Several of us in #gentoo-arm have already seen this behavior inside 32bit chroots via qemu-static wrapper, but I imagine it will also affect anyone left using x86 32bit userland with filesystems that use >32bit for inodes such as ext4 and xfs.

See qemu bug filed here (although it may in fact not simply be a qemu bug):
https://bugs.launchpad.net/qemu/+bug/1805913it

Reproducible: Always
Comment 1 Andreas K. Hüttel archtester gentoo-dev 2019-03-26 11:11:30 UTC
Interesting. Given the structure of the problem that you describe, we would need to fix the programs using glibc... So let's make this a tracker and file bugs for the packages that are affected.

Practical question: how "usual" is it to still have a 32bit OS *not* using LFS?
(Don't we kind of force this on by now?)
Comment 2 David Flogeras 2019-03-26 21:36:14 UTC
Where is it forced on?  I'm certainly not well versed in this, and perhaps my chroot is just in a bad state and it's not as big an issue as it would seem?
Comment 3 Sergei Trofimovich (RETIRED) gentoo-dev 2019-03-26 22:53:47 UTC
_FILE_OFFSET_BITS is set on a package-by-package basis. In autoconf build systems, the AC_SYS_LARGEFILE macro takes care of passing the proper flags.

https://bugs.gentoo.org/471102 has more details.
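To make the per-package nature concrete: the macro glibc actually checks is _FILE_OFFSET_BITS, and it only takes effect if it is defined before the first system header is included (or passed as -D_FILE_OFFSET_BITS=64 on the compiler command line). A minimal sketch of a compile-time check follows. The helper name dirent_is_lfs is made up for illustration, and d_off is a Linux/glibc field rather than something POSIX guarantees.

```c
#define _FILE_OFFSET_BITS 64   /* must come before any system header */

#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <sys/types.h>

/* With LFS enabled, off_t is 64 bits wide even on 32-bit targets.
 * The build fails here if the define above did not take effect. */
static_assert(sizeof(off_t) == 8, "off_t is not 64-bit; LFS is not enabled");

/* Hypothetical helper: returns 1 if the struct dirent fields that caused
 * the EOVERFLOW failures are 64 bits wide in this build, 0 otherwise.
 * Assumes Linux/glibc, where struct dirent has a d_off member. */
int dirent_is_lfs(void)
{
    return sizeof(((struct dirent *)NULL)->d_ino) == 8
        && sizeof(((struct dirent *)NULL)->d_off) == 8;
}
```

On a 64bit userland the check passes with or without the define, so it only tells you something when built for a 32bit target.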
Comment 4 David Flogeras 2019-03-27 09:21:22 UTC
Hmm, I didn't see that other bug initially.  Should we close this as a duplicate and add bugs to the other tracker?  Or just proceed attaching them here?
Comment 5 David Flogeras 2019-04-16 13:22:55 UTC
There's a relevant discussion on LKML

https://lkml.org/lkml/2018/12/27/155

It seems to have more to do with calling getdents on a 64bit kernel where the guest expects 32bit behaviour.  It definitely affects 32bit qemu; I'm not sure yet whether it affects a native 32bit (i686) chroot without qemu emulation.
Comment 6 David Flogeras 2019-05-01 12:21:21 UTC
And here's the glibc bug

https://sourceware.org/bugzilla/show_bug.cgi?id=23960

It seems there's a conundrum of: qemu thinks glibc should fix it, glibc thinks kernel should fix it, kernel did nothing.  Now glibc thinks distros should fix it by forcing LFS on.

I haven't had any luck building sandbox with LFS turned on (see #681892).
Comment 7 Andrew Aladjev 2019-12-25 23:01:40 UTC
Hello. I have created a set of patches that add a new "getdents64_x32" syscall; they are here: https://bugzilla.kernel.org/show_bug.cgi?id=205957. They are hard to apply, so I've provided the order in which to apply them.

Now I am able to build applications inside arm (via qemu user) without any problem (no LFS workarounds required).
Comment 8 Sergei Trofimovich (RETIRED) gentoo-dev 2019-12-26 11:29:51 UTC
Thanks for the patches, Andrew! We are unlikely to apply them in Gentoo, as they change the syscall surface quite a bit. But it's a nice illustration of a problem in the kernel/user interface.
Comment 9 David Flogeras 2020-05-10 09:54:16 UTC
I've been in contact with the glibc developer who made the initial changes that broke this.  I've tested a preliminary patch he's working on to correct this behaviour (or at least put it back to the semi-broken way it was before glibc-2.28).

The initial patch makes it possible to work around the issue with qemu, but it does also seem to break something on ARM.   He has also put the patch forward for review to the glibc maintainers.  I'll update here as things progress.
Comment 10 David Flogeras 2020-05-10 09:58:35 UTC
PS

As discussed here (I just noticed that the link was broken when I first reported this)

https://bugs.launchpad.net/qemu/+bug/1805913

Some people have had success working around the issue by just moving their chroots to a non-ext4 filesystem in the interim.  The issue stems from the fact that ext4 always stores a 64bit hash in the d_off field (an off_t) of struct dirent, and when a 64bit host qemu gets that 64bit value from the 64bit kernel, it cannot fit it properly into the 32bit field in the guest.  Alternatively, you could compile qemu as a 32bit application.  YMMV
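To illustrate the "cannot fit" part: any non-LFS code path has to squeeze the kernel's 64bit value into a 32bit field, and the only honest options are to truncate silently (the old behaviour) or to fail with EOVERFLOW (the new one). The sketch below shows the failing variant. It is not glibc's or qemu's actual code; guest_off_t and narrow_d_off are names made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the d_off field of a non-LFS 32-bit struct dirent. */
typedef int32_t guest_off_t;

/* Store a 64-bit kernel d_off into a 32-bit field.
 * Returns 0 on success, or -1 with errno = EOVERFLOW if the value
 * would not survive the conversion. */
int narrow_d_off(int64_t kernel_d_off, guest_off_t *out)
{
    assert(out != NULL);

    if (kernel_d_off < INT32_MIN || kernel_d_off > INT32_MAX) {
        errno = EOVERFLOW;               /* a 64-bit ext4 hash typically ends up here */
        return -1;
    }
    *out = (guest_off_t)kernel_d_off;    /* small offsets pass through unchanged */
    return 0;
}
```

A small sequential offset passes through untouched, while a hash with any of the upper 32 bits set is rejected, which would explain why moving the chroot off ext4 helps.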
Comment 11 David Flogeras 2021-05-22 19:15:54 UTC
Wow I dropped the ball on this.  A patchset was put forward in October (2020).

https://sourceware.org/pipermail/libc-alpha/2020-October/118274.html

I have tested it, and it seems to reverse the bad behaviour discussed here in a 32bit qemu-chroot running atop a 64bit kernel, on an ext4 filesystem that uses a 64bit hash in the off_t.

Maybe if some more of us replied to the thread on the ML (it seems to have been reviewed, then fell off the horizon), we could get this accepted into mainline glibc.  Currently I have been hand-patching my chroot to use 2.33 + those patches, but it's only a matter of time before the patchset falls behind.
Comment 12 Larry the Git Cow gentoo-dev 2022-01-12 23:00:17 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=33d1e367129c3ddef0558e989fbd75a8a39738ea

commit 33d1e367129c3ddef0558e989fbd75a8a39738ea
Author:     Andreas K. Hüttel <dilfridge@gentoo.org>
AuthorDate: 2022-01-12 22:00:13 +0000
Commit:     Andreas K. Hüttel <dilfridge@gentoo.org>
CommitDate: 2022-01-12 23:00:07 +0000

    sys-libs/glibc: 2.34 patchset 11 and re-keyword
    
    Bug: https://bugs.gentoo.org/681790
    Package-Manager: Portage-3.0.30, Repoman-3.0.3
    Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>

 sys-libs/glibc/Manifest             | 2 +-
 sys-libs/glibc/glibc-2.34-r6.ebuild | 5 ++---
 2 files changed, 3 insertions(+), 4 deletions(-)