Summary: | sys-libs/glibc-2.22-r4: various crashes when using prelink | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Anton Kuleshov <anton.wd> |
Component: | Current packages | Assignee: | Gentoo Toolchain Maintainers <toolchain> |
Status: | RESOLVED WONTFIX | ||
Severity: | normal | CC: | anton.wd, bill, egorov_egor, gentoo, pacho, shuber |
Priority: | Normal | Keywords: | PMASKED |
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
URL: | https://sourceware.org/ml/libc-alpha/2016-05/msg00034.html | ||
Whiteboard: | prelink | ||
Package list: | Runtime testing required: | --- | |
Attachments: |
emerge --info with glibc-2.21-r2
.xsession-errors demonstrate problem
demo: ld_so+prelink-failure.tar.gz
glibc-fix-local-scope-for-prelink-conflicts.patch |
I can confirm this. The problem is caused by prelink misbehaving with glibc-2.22. See the attached .xsession-errors file; note the line "unnamed app(XXXX): KUniqueApplication: Pipe closed unexpectedly." As a workaround, disable prelink (set PRELINKING="no" in /etc/conf.d/prelink) and undo prelinking (run /etc/cron.daily/prelink).

Created attachment 430302 [details]
.xsession-errors demonstrate problem
unnamed app(XXXX): KUniqueApplication: Pipe closed unexpectedly.
What version of prelink are you using? Have you tried the latest in the tree?

I have used both the latest stable and the latest testing versions (2013 and 2015). In the first case I get a black screen after logging in. In the second case Plasma started, but many apps did not.

GUI apps not starting isn't terribly interesting. Look at the logs and try running programs yourself to see if they're still crashing with the newer prelink.

Exactly the same behaviour here. I even tried applying all patches from #579388. Reproducible with both 20130503 and 20151030 as long as glibc 2.22-r4 is used. I also tried to un-prelink the full system, switch to 20151030, and then do a full re-prelink. It does not affect just KDElibs-related apps. I observed a similar effect using git-svn:

$ git svn rebase
error closing pipe: No child processes at /usr/lib64/perl5/vendor_perl/5.20.2/Error.pm line 421.
error closing pipe: No child processes at /usr/lib64/perl5/vendor_perl/5.20.2/Error.pm line 421.
Use of uninitialized value $git_dir in concatenation (.) or string at /usr/libexec/git-core/git-svn line 352.
error closing pipe: No child processes at /usr/lib64/perl5/vendor_perl/5.20.2/Error.pm line 421.
error closing pipe: No child processes at /usr/lib64/perl5/vendor_perl/5.20.2/Error.pm line 421.
fatal: Not a git repository: . at /usr/lib64/perl5/vendor_perl/5.20.2/Git.pm line 210.

Re-doing "unprelink" => "prelink" after removing "-m" from prelink-opts still left me in a broken state, though I now get:

$ git svn rebase
Can't fork: Success at /usr/lib64/perl5/vendor_perl/5.20.2/Git.pm line 1587.

All in all it seems that for some combinations of libraries, fork is broken. The only "fix" seems to be to unprelink the full system (or, of course, downgrade glibc).

I have this in a kvm/qemu Gentoo VM running LXC containers and glibc-2.22-r4. I can only create two containers in a series ... then:

lxc-create -B btrfs -n dav -t /usr/share/lxc/templates/lxc-gentoo -f /home/config/etc/lxc/seed.dav
lxc-create: lxc_create.c: main: 274 Error creating container dav

Solution was to unprelink.

BillK

Created attachment 432494 [details]
demo: ld_so+prelink-failure.tar.gz

I have identified the culprit for this and bug #579546. Glibc's dynamic linker uses the wrong local symbol search scope for libraries when it computes conflict fixups for prelink, and therefore misses some conflicts. I've created a small test case that demonstrates the problem.

In the case of the KDE apps, libkdeui's fork gets prelinked to libpthread's fork_resolve (which is a strange design decision anyway) and the required conflict fixups are missed. In older glibc, libpthread's fork wasn't an IFUNC, so things would still work (assuming the bug existed before).

To fix the problem, _dl_build_local_scope() in elf/dl-deps.c must be converted to use breadth-first search. No patch yet, I need some sleep first.

Created attachment 432574 [details, diff]
glibc-fix-local-scope-for-prelink-conflicts.patch
This patch for glibc should fix the issue. You have to re-prelink your system for the changes to take effect.
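
For readers who want to see why the search order matters, here is a toy model of the breadth-first conversion proposed above for _dl_build_local_scope(). It is only a sketch: it is not glibc code and not the attached patch. The library names, the NEEDED graph and the DEFINES table are all hypothetical, and it assumes the unpatched walk is effectively depth-first, which is what the analysis above implies.

```python
from collections import deque

# Hypothetical DT_NEEDED graph; names are illustrative, not from the demo attachment.
NEEDED = {
    "app": ["libA.so", "libB.so"],
    "libA.so": ["libC.so"],
    "libB.so": [],
    "libC.so": [],
}

# Hypothetical table of which objects define which symbols.
DEFINES = {
    "libB.so": {"fork"},
    "libC.so": {"fork"},
}


def depth_first_scope(root, needed):
    """Recursive walk: each object is followed directly by its own dependencies."""
    scope = []

    def visit(obj):
        if obj in scope:
            return
        scope.append(obj)
        for dep in needed[obj]:
            visit(dep)

    visit(root)
    return scope


def breadth_first_scope(root, needed):
    """Level-by-level walk: all direct dependencies come before indirect ones."""
    scope = [root]
    queue = deque([root])
    while queue:
        obj = queue.popleft()
        for dep in needed[obj]:
            if dep not in scope:
                scope.append(dep)
                queue.append(dep)
    return scope


def resolve(symbol, scope, defines):
    """Return the first object in scope order that defines the symbol, or None."""
    for obj in scope:
        if symbol in defines.get(obj, ()):
            return obj
    return None


if __name__ == "__main__":
    dfs = depth_first_scope("app", NEEDED)
    bfs = breadth_first_scope("app", NEEDED)
    print(dfs, "->", resolve("fork", dfs, DEFINES))
    print(bfs, "->", resolve("fork", bfs, DEFINES))
```

In this toy graph the two walks disagree about which object provides fork, which is the kind of mismatch that would make a conflict computation miss a fixup if it used a different order than the one used at prelink time.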
*** Bug 579546 has been marked as a duplicate of this bug. ***

Tested - works for me. I was able to consistently build a string of LXC instances without the fork failures that were occurring before adding the patch.

BillK

(In reply to Alexander Miller from comment #9)
Assuming you wrote this patch, please post it to the glibc libc-alpha list for discussion: https://www.gnu.org/software/libc/development.html

Thanks ... hopefully we can get some guidance on whether this is the right course. I'll probably do another 2.22 to pick this up and another security fix.

This must be somewhat related. After unprelinking my libdbusmenu-qt* as suggested on #579546, I'm now getting the following after running emerge:
>>> Regenerating /etc/ld.so.cache...
/sbin/ldconfig: /usr/lib64/libdbusmenu-qt5.so.2 is not a symbolic link
/sbin/ldconfig: /usr/lib64/libdbusmenu-qt.so.2 is not a symbolic link
$ ls -lh /usr/lib64/libdbusmenu-qt*
-rwxr-xr-x 1 root root 208K May 17 03:21 /usr/lib64/libdbusmenu-qt5.so
-rwxr-xr-x 1 root root 212K May 17 03:21 /usr/lib64/libdbusmenu-qt5.so.2
-rwxr-xr-x 1 root root 208K May 17 03:21 /usr/lib64/libdbusmenu-qt5.so.2.6.0
-rwxr-xr-x 1 root root 212K May 17 03:21 /usr/lib64/libdbusmenu-qt.so
-rwxr-xr-x 1 root root 212K May 17 03:21 /usr/lib64/libdbusmenu-qt.so.2
-rwxr-xr-x 1 root root 212K May 17 03:21 /usr/lib64/libdbusmenu-qt.so.2.6.0
^^^ note the different size of libdbusmenu-qt5.so.2
I think the symlinks were replaced when I undid prelinking. I have now unprelinked the whole system just in case (everything was OK after prelinking those libraries) and no other symlinks were replaced.
Using prelink-20130503 and glibc-2.22-r4.
The patch also fixes a problem with Hugin (panorama alignment freezes).

I can confirm this patch fixes this problem for me with tk/wish (used by e.g. gitk and git-gui). Git-gui would say (see https://bugzilla.altlinux.org/show_bug.cgi?id=31739):

---8<----------
Cannot determine Git version:
error waiting for process to exit: child process lost (is SIGCHLD ignored or trapped?)
Git Gui requires Git 1.5.0 or later.
--->8---------

I used:
- sys-libs/glibc-2.22
- dev-lang/tk-8.5.17
- sys-devel/prelink-20130503
- sys-devel/gcc-5.4.0

Is there any particular reason this patch is not yet applied?

(In reply to Da Fox from comment #16)
> Is there any particular reason this patch is not yet applied?

I guess because upstream hasn't responded yet, but this bug may need some revivification. Alexander or SpanKY, could you try to ping upstream again on the mailing list?

Has an upstream bug already been filed? (https://sourceware.org/glibc/wiki/FilingBugs)

Is this still a problem with glibc 2.23 or later?

(In reply to Andreas K. Hüttel from comment #18)
> Is this still a problem with glibc 2.23 or later?

I'm using 2.23-r4 and prelinking has not caused any issues that I have noticed so far. (I actually don't know when I re-enabled prelinking.)

(In reply to Andreas K. Hüttel from comment #18)
> Is this still a problem with glibc 2.23 or later?

More or less, probably. IIRC most (all?) of the reported crashes were related to fork(), and libpthread's fork() is no longer an IFUNC, it seems. So it may no longer be a big problem in practice. BUT the underlying bug has NOT been fixed! My patch still applies, and there has been no reaction to my mail¹ to libc-alpha, Mark Hatle's upstream bug report², or Mark's mail³. A program relying on tricky cases of symbol superposition with libraries will still fail. Moreover, there are still IFUNCs in libpthread (system, vfork, longjmp) that may cause problems. I haven't tested any of this because I'm running a patched glibc.
Anyone who wants to verify that the bug is still present in unpatched glibc can try the demo from attachment 432494 [details]. It may be interesting to modify it to see if you can get a crash with libpthread's system() vs. libc's.

¹) https://sourceware.org/ml/libc-alpha/2016-05/msg00034.html
²) https://sourceware.org/bugzilla/show_bug.cgi?id=20488
³) https://sourceware.org/ml/libc-alpha/2016-08/msg00673.html

Prelink support is being dropped upstream in glibc-2.35; sys-devel/prelink has been masked for removal.

(In reply to Andreas K. Hüttel from comment #21)
> Prelink support is being dropped upstream in glibc-2.35; sys-devel/prelink
> has been masked for removal.

Changed to 2.36 quite late on, for the record, but people should still be migrating off it. |
Created attachment 429948 [details]
emerge --info with glibc-2.21-r2

After updating glibc to the latest stable version I get a black screen after logging in to KDE. Masking 2.22-r4 and reinstalling 2.21-r2 solves the issue. Should I attach emerge --info for the 2.22-r4 version as well? Please let me know if you need more information.

dmesg:
Apr 08 22:46:57 [kernel] [ 46.583157] ------------[ cut here ]------------
Apr 08 22:46:57 [kernel] [ 46.583175] WARNING: CPU: 0 PID: 3479 at drivers/gpu/drm/drm_irq.c:1326 drm_wait_one_vblank+0x13e/0x190()
Apr 08 22:46:57 [kernel] [ 46.583176] vblank wait timed out on crtc 0
Apr 08 22:46:57 [kernel] [ 46.583189] Modules linked in: vboxnetflt(O) vboxnetadp(O) vboxdrv(O) wl(PO) radeon input_leds ttm
Apr 08 22:46:57 [kernel] [ 46.583195] CPU: 0 PID: 3479 Comm: X Tainted: P O 4.4.6-gentoo #9
Apr 08 22:46:57 [kernel] [ 46.583197] Hardware name: Acer Aspire 7750G/JE70_HR, BIOS V1.21 08/09/2012
Apr 08 22:46:57 [kernel] [ 46.583203] 0000000000000000 ffff880446a7b888 ffffffff8136cab8 ffff880446a7b8d0
Apr 08 22:46:57 [kernel] [ 46.583207] ffffffff81c4bab7 ffff880446a7b8c0 ffffffff8107f2d1 ffff88044c0c0000
Apr 08 22:46:57 [kernel] [ 46.583211] 0000000000000000 0000000000000aa4 ffff88044c0bc408 0000000000000000
Apr 08 22:46:57 [kernel] [ 46.583212] Call Trace:
Apr 08 22:46:57 [kernel] [ 46.583222] [<ffffffff8136cab8>] dump_stack+0x4d/0x65
Apr 08 22:46:57 [kernel] [ 46.583230] [<ffffffff8107f2d1>] warn_slowpath_common+0x81/0xc0
Apr 08 22:46:57 [kernel] [ 46.583242] [<ffffffff8107f357>] warn_slowpath_fmt+0x47/0x50
Apr 08 22:46:57 [kernel] [ 46.583244] [<ffffffff810b5400>] ? finish_wait+0x50/0x60
Apr 08 22:46:57 [kernel] [ 46.583246] [<ffffffff8145d93e>] drm_wait_one_vblank+0x13e/0x190
Apr 08 22:46:57 [kernel] [ 46.583247] [<ffffffff810b55b0>] ? wait_woken+0x80/0x80
Apr 08 22:46:57 [kernel] [ 46.583249] [<ffffffff814efd10>] intel_atomic_commit+0x440/0x13a0
Apr 08 22:46:57 [kernel] [ 46.583252] [<ffffffff8147507f>] ? drm_atomic_check_only+0x13f/0x5d0
Apr 08 22:46:57 [kernel] [ 46.583253] [<ffffffff81474e72>] ? drm_atomic_add_affected_connectors+0x22/0xf0
Apr 08 22:46:57 [kernel] [ 46.583254] [<ffffffff81475542>] drm_atomic_commit+0x32/0x50
Apr 08 22:46:57 [kernel] [ 46.583256] [<ffffffff81453e62>] restore_fbdev_mode+0x232/0x260
Apr 08 22:46:57 [kernel] [ 46.583257] [<ffffffff81455fee>] drm_fb_helper_restore_fbdev_mode_unlocked+0x2e/0x70
Apr 08 22:46:57 [kernel] [ 46.583258] [<ffffffff81456058>] drm_fb_helper_set_par+0x28/0x50
Apr 08 22:46:57 [kernel] [ 46.583260] [<ffffffff81505895>] intel_fbdev_set_par+0x15/0x60
Apr 08 22:46:57 [kernel] [ 46.583262] [<ffffffff813f2ecf>] fb_set_var+0x19f/0x3e0
Apr 08 22:46:57 [kernel] [ 46.583264] [<ffffffff810a9d39>] ? check_preempt_wakeup+0x199/0x230
Apr 08 22:46:57 [kernel] [ 46.583266] [<ffffffff810a22d9>] ? check_preempt_curr+0x79/0x90
Apr 08 22:46:57 [kernel] [ 46.583268] [<ffffffff813ef8aa>] fbcon_blank+0x20a/0x2e0
Apr 08 22:46:57 [kernel] [ 46.583271] [<ffffffff813ca205>] do_unblank_screen+0xb5/0x1c0
Apr 08 22:46:57 [kernel] [ 46.583273] [<ffffffff813c0bf4>] complete_change_console+0x54/0xd0
Apr 08 22:46:57 [kernel] [ 46.583274] [<ffffffff813c1d95>] vt_ioctl+0x1125/0x12f0
Apr 08 22:46:57 [kernel] [ 46.583277] [<ffffffff811405ff>] ? unlock_page+0x4f/0x60
Apr 08 22:46:57 [kernel] [ 46.583280] [<ffffffff81166fdb>] ? do_wp_page+0x20b/0x5d0
Apr 08 22:46:57 [kernel] [ 46.583282] [<ffffffff813b53df>] tty_ioctl+0x3cf/0xbe0
Apr 08 22:46:57 [kernel] [ 46.583285] [<ffffffff81308227>] ? selinux_file_ioctl+0xf7/0x1c0
Apr 08 22:46:57 [kernel] [ 46.583286] [<ffffffff811a21b5>] do_vfs_ioctl+0x2b5/0x480
Apr 08 22:46:57 [kernel] [ 46.583289] [<ffffffff81301a0e>] ? security_file_ioctl+0x3e/0x60
Apr 08 22:46:57 [kernel] [ 46.583289] [<ffffffff811a23f4>] SyS_ioctl+0x74/0x80
Apr 08 22:46:57 [kernel] [ 46.583292] [<ffffffff81898197>] entry_SYSCALL_64_fastpath+0x12/0x6a
Apr 08 22:46:57 [kernel] [ 46.583292] ---[ end trace 00e8d2554546b76d ]---