Bug 579374

Summary: sys-libs/glibc-2.22-r4: various crashes when using prelink
Product: Gentoo Linux Reporter: Anton Kuleshov <anton.wd>
Component: Current packages Assignee: Gentoo Toolchain Maintainers <toolchain>
Status: RESOLVED WONTFIX    
Severity: normal CC: anton.wd, bill, egorov_egor, gentoo, pacho, shuber
Priority: Normal Keywords: PMASKED
Version: unspecified   
Hardware: All   
OS: Linux   
URL: https://sourceware.org/ml/libc-alpha/2016-05/msg00034.html
Whiteboard: prelink
Package list:
Runtime testing required: ---
Attachments: emerge --info with glibc-2.21-r2
.xsession-errors demonstrate problem
demo: ld_so+prelink-failure.tar.gz
glibc-fix-local-scope-for-prelink-conflicts.patch

Description Anton Kuleshov 2016-04-08 20:31:30 UTC
Created attachment 429948
emerge --info with glibc-2.21-r2

After updating glibc to the latest stable version, I got a black screen after logging in to KDE.
Masking 2.22-r4 and reinstalling 2.21-r2 solves the issue.

Should I attach emerge --info with the 2.22-r4 version?

Please let me know if you need more information.

dmesg:

Apr 08 22:46:57 [kernel] [   46.583157] ------------[ cut here ]------------
Apr 08 22:46:57 [kernel] [   46.583175] WARNING: CPU: 0 PID: 3479 at drivers/gpu/drm/drm_irq.c:1326 drm_wait_one_vblank+0x13e/0x190()
Apr 08 22:46:57 [kernel] [   46.583176] vblank wait timed out on crtc 0
Apr 08 22:46:57 [kernel] [   46.583189] Modules linked in: vboxnetflt(O) vboxnetadp(O) vboxdrv(O) wl(PO) radeon input_leds ttm
Apr 08 22:46:57 [kernel] [   46.583195] CPU: 0 PID: 3479 Comm: X Tainted: P           O    4.4.6-gentoo #9
Apr 08 22:46:57 [kernel] [   46.583197] Hardware name: Acer Aspire 7750G/JE70_HR, BIOS V1.21 08/09/2012
Apr 08 22:46:57 [kernel] [   46.583203]  0000000000000000 ffff880446a7b888 ffffffff8136cab8 ffff880446a7b8d0
Apr 08 22:46:57 [kernel] [   46.583207]  ffffffff81c4bab7 ffff880446a7b8c0 ffffffff8107f2d1 ffff88044c0c0000
Apr 08 22:46:57 [kernel] [   46.583211]  0000000000000000 0000000000000aa4 ffff88044c0bc408 0000000000000000
Apr 08 22:46:57 [kernel] [   46.583212] Call Trace:
Apr 08 22:46:57 [kernel] [   46.583222]  [<ffffffff8136cab8>] dump_stack+0x4d/0x65
Apr 08 22:46:57 [kernel] [   46.583230]  [<ffffffff8107f2d1>] warn_slowpath_common+0x81/0xc0
Apr 08 22:46:57 [kernel] [   46.583242]  [<ffffffff8107f357>] warn_slowpath_fmt+0x47/0x50
Apr 08 22:46:57 [kernel] [   46.583244]  [<ffffffff810b5400>] ? finish_wait+0x50/0x60
Apr 08 22:46:57 [kernel] [   46.583246]  [<ffffffff8145d93e>] drm_wait_one_vblank+0x13e/0x190
Apr 08 22:46:57 [kernel] [   46.583247]  [<ffffffff810b55b0>] ? wait_woken+0x80/0x80
Apr 08 22:46:57 [kernel] [   46.583249]  [<ffffffff814efd10>] intel_atomic_commit+0x440/0x13a0
Apr 08 22:46:57 [kernel] [   46.583252]  [<ffffffff8147507f>] ? drm_atomic_check_only+0x13f/0x5d0
Apr 08 22:46:57 [kernel] [   46.583253]  [<ffffffff81474e72>] ? drm_atomic_add_affected_connectors+0x22/0xf0
Apr 08 22:46:57 [kernel] [   46.583254]  [<ffffffff81475542>] drm_atomic_commit+0x32/0x50
Apr 08 22:46:57 [kernel] [   46.583256]  [<ffffffff81453e62>] restore_fbdev_mode+0x232/0x260
Apr 08 22:46:57 [kernel] [   46.583257]  [<ffffffff81455fee>] drm_fb_helper_restore_fbdev_mode_unlocked+0x2e/0x70
Apr 08 22:46:57 [kernel] [   46.583258]  [<ffffffff81456058>] drm_fb_helper_set_par+0x28/0x50
Apr 08 22:46:57 [kernel] [   46.583260]  [<ffffffff81505895>] intel_fbdev_set_par+0x15/0x60
Apr 08 22:46:57 [kernel] [   46.583262]  [<ffffffff813f2ecf>] fb_set_var+0x19f/0x3e0
Apr 08 22:46:57 [kernel] [   46.583264]  [<ffffffff810a9d39>] ? check_preempt_wakeup+0x199/0x230
Apr 08 22:46:57 [kernel] [   46.583266]  [<ffffffff810a22d9>] ? check_preempt_curr+0x79/0x90
Apr 08 22:46:57 [kernel] [   46.583268]  [<ffffffff813ef8aa>] fbcon_blank+0x20a/0x2e0
Apr 08 22:46:57 [kernel] [   46.583271]  [<ffffffff813ca205>] do_unblank_screen+0xb5/0x1c0
Apr 08 22:46:57 [kernel] [   46.583273]  [<ffffffff813c0bf4>] complete_change_console+0x54/0xd0
Apr 08 22:46:57 [kernel] [   46.583274]  [<ffffffff813c1d95>] vt_ioctl+0x1125/0x12f0
Apr 08 22:46:57 [kernel] [   46.583277]  [<ffffffff811405ff>] ? unlock_page+0x4f/0x60
Apr 08 22:46:57 [kernel] [   46.583280]  [<ffffffff81166fdb>] ? do_wp_page+0x20b/0x5d0
Apr 08 22:46:57 [kernel] [   46.583282]  [<ffffffff813b53df>] tty_ioctl+0x3cf/0xbe0
Apr 08 22:46:57 [kernel] [   46.583285]  [<ffffffff81308227>] ? selinux_file_ioctl+0xf7/0x1c0
Apr 08 22:46:57 [kernel] [   46.583286]  [<ffffffff811a21b5>] do_vfs_ioctl+0x2b5/0x480
Apr 08 22:46:57 [kernel] [   46.583289]  [<ffffffff81301a0e>] ? security_file_ioctl+0x3e/0x60
Apr 08 22:46:57 [kernel] [   46.583289]  [<ffffffff811a23f4>] SyS_ioctl+0x74/0x80
Apr 08 22:46:57 [kernel] [   46.583292]  [<ffffffff81898197>] entry_SYSCALL_64_fastpath+0x12/0x6a
Apr 08 22:46:57 [kernel] [   46.583292] ---[ end trace 00e8d2554546b76d ]---
Comment 1 Eugeny Shkrigunov 2016-04-13 07:58:13 UTC
I confirm.

The problem is caused by prelink working incorrectly with glibc-2.22.
See the attached .xsession-errors file. Note the line:
unnamed app(XXXX): KUniqueApplication: Pipe closed unexpectedly.

As a workaround, disable prelink (set PRELINKING="no" in /etc/conf.d/prelink) and undo prelinking (run /etc/cron.daily/prelink).
Comment 2 Eugeny Shkrigunov 2016-04-13 07:59:53 UTC
Created attachment 430302
.xsession-errors demonstrate problem

unnamed app(XXXX): KUniqueApplication: Pipe closed unexpectedly.
Comment 3 SpanKY gentoo-dev 2016-04-13 14:12:37 UTC
what version of prelink are you using ?  have you tried the latest in the tree ?
Comment 4 Anton Kuleshov 2016-04-13 14:20:18 UTC
I have used both the latest stable and the latest testing versions (2013 and 2015). In the first case, I get a black screen after logging in. In the second case, Plasma starts but many apps do not.
Comment 5 SpanKY gentoo-dev 2016-04-13 14:22:25 UTC
gui apps not starting isn't terribly interesting.  look at the logs and try running programs yourself to see if they're crashing still w/newer prelink.
Comment 6 Oliver Freyermuth 2016-04-14 20:38:36 UTC
Exactly the same behaviour here. 
I even tried applying all patches from #579388 . 
Reproducible both with 20130503 and 20151030 as long as glibc 2.22-r4 is used. 

I also tried to un-prelink the full system, switch to 20151030, and then do a full re-prelinking.

It does not affect just KDElibs-related apps. I observed a similar effect using git-svn: 

$ git svn rebase
error closing pipe: No child processes at /usr/lib64/perl5/vendor_perl/5.20.2/Error.pm line 421.
error closing pipe: No child processes at /usr/lib64/perl5/vendor_perl/5.20.2/Error.pm line 421.
Use of uninitialized value $git_dir in concatenation (.) or string at /usr/libexec/git-core/git-svn line 352.
error closing pipe: No child processes at /usr/lib64/perl5/vendor_perl/5.20.2/Error.pm line 421.
error closing pipe: No child processes at /usr/lib64/perl5/vendor_perl/5.20.2/Error.pm line 421.
fatal: Not a git repository: . at /usr/lib64/perl5/vendor_perl/5.20.2/Git.pm line 210.

Re-doing "unprelink"=>"prelink" after removing "-m" from prelink-opts still left me in a broken state, though I now get:
$ git svn rebase
Can't fork: Success at /usr/lib64/perl5/vendor_perl/5.20.2/Git.pm line 1587.

All in all it seems that for some combinations of libraries, fork is broken. 

The only "fix" seems to be to unprelink the full system (or, of course, downgrade glibc).
Comment 7 Bill Kenworthy 2016-04-24 08:13:22 UTC
I have this in a kvm/qemu gentoo vm running LXC containers and glibc-2.22-r4.  I can only create two containers in a series ... then:

lxc-create -B btrfs -n dav -t /usr/share/lxc/templates/lxc-gentoo -f /home/config/etc/lxc/seed.dav
lxc-create: lxc_create.c: main: 274 Error creating container dav

Solution was to unprelink.

BillK
Comment 8 Alexander Miller 2016-04-29 02:06:56 UTC
Created attachment 432494
demo: ld_so+prelink-failure.tar.gz

I have identified the culprit for this and bug #579546. Glibc's dynamic linker uses a wrong local symbol search scope for libraries when it computes conflict fixups for prelink and therefore misses some conflicts. I've created a small test case that demonstrates the problem.
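
A minimal sketch of the idea, not glibc's or prelink's actual code: a conflict exists when a symbol resolves to one library in a library's local scope and to a different one in the program's global scope. All names below (the libraries, `resolve`, `find_conflicts`, the scope lists) are hypothetical and chosen for illustration only. If the conflict computation uses a local scope order different from the one the prelinked binding was based on, the lookup can agree with the global scope by accident and the conflict goes unreported.

```python
def resolve(symbol, scope, defines):
    """Return the first library in scope order that defines symbol, else None."""
    for lib in scope:
        if symbol in defines.get(lib, ()):
            return lib
    return None


def find_conflicts(symbols, local_scope, global_scope, defines):
    """Map each symbol whose local and global lookups differ to both results."""
    conflicts = {}
    for sym in symbols:
        local_def = resolve(sym, local_scope, defines)
        global_def = resolve(sym, global_scope, defines)
        if local_def != global_def:
            conflicts[sym] = (local_def, global_def)
    return conflicts


# Hypothetical toy setup: both libraries export "fork".
DEFINES = {"libc.so": {"fork"}, "libpthread.so": {"fork"}}
GLOBAL_SCOPE = ["app", "libc.so", "libgui.so", "libpthread.so"]

# Assumed order on which the prelinked binding of libgui.so was based.
PRELINK_LOCAL = ["libgui.so", "libpthread.so", "libc.so"]
# Assumed different order used when the conflicts are computed.
WRONG_LOCAL = ["libgui.so", "libc.so", "libpthread.so"]

expected = find_conflicts(["fork"], PRELINK_LOCAL, GLOBAL_SCOPE, DEFINES)
reported = find_conflicts(["fork"], WRONG_LOCAL, GLOBAL_SCOPE, DEFINES)
print(expected)  # the conflict that needs a fixup
print(reported)  # empty: the conflict is missed
```

In this toy model the library stays bound to libpthread.so's definition while the rest of the process uses libc.so's, and no fixup is recorded to correct that.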

In the case of the KDE apps, libkdeui's fork gets prelinked to libpthread's fork_resolve (which is a strange design decision anyway) and the required conflict fixups are missed. In older glibc, libpthread's fork wasn't an IFUNC, so things would still work (assuming the bug existed before).

To fix the problem, _dl_build_local_scope() in elf/dl-deps.c must be converted to use breadth-first search. No patch yet, I need some sleep first.
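
To illustrate why the traversal order matters, here is a small sketch under stated assumptions. It is a toy model in Python, not the code in elf/dl-deps.c, and the dependency graph, library names, and helper functions are all hypothetical. It builds a search scope for one library by depth-first and by breadth-first traversal of its dependencies, then looks up `fork` in each.

```python
from collections import deque

# Hypothetical dependency lists, in the order a library names its dependencies.
NEEDED = {
    "libgui.so": ["libcore.so", "libpthread.so", "libc.so"],
    "libcore.so": ["libc.so"],
    "libpthread.so": ["libc.so"],
    "libc.so": [],
}
# Hypothetical exported symbols: both libraries define "fork".
DEFINES = {"libpthread.so": {"fork"}, "libc.so": {"fork"}}


def scope_depth_first(root):
    """Pre-order depth-first traversal of the dependency graph."""
    order, seen = [], set()

    def visit(lib):
        if lib in seen:
            return
        seen.add(lib)
        order.append(lib)
        for dep in NEEDED[lib]:
            visit(dep)

    visit(root)
    return order


def scope_breadth_first(root):
    """Breadth-first traversal: direct dependencies before indirect ones."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        lib = queue.popleft()
        order.append(lib)
        for dep in NEEDED[lib]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order


def resolve(symbol, scope):
    """Return the first library in scope order that defines symbol, else None."""
    for lib in scope:
        if symbol in DEFINES.get(lib, ()):
            return lib
    return None


dfs_scope = scope_depth_first("libgui.so")
bfs_scope = scope_breadth_first("libgui.so")
print(dfs_scope, resolve("fork", dfs_scope))
print(bfs_scope, resolve("fork", bfs_scope))
```

With this graph the two traversals order libc.so and libpthread.so differently, so the same lookup of `fork` lands in a different library depending on how the scope was built.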
Comment 9 Alexander Miller 2016-04-29 15:30:27 UTC
Created attachment 432574
glibc-fix-local-scope-for-prelink-conflicts.patch

This patch for glibc should fix the issue. You have to re-prelink your system for the changes to take effect.
Comment 10 Johannes Huber (RETIRED) gentoo-dev 2016-05-02 20:41:03 UTC
*** Bug 579546 has been marked as a duplicate of this bug. ***
Comment 11 Bill Kenworthy 2016-05-03 12:04:25 UTC
Tested - works for me.

I was able to consistently build a string of LXC instances without the fork failures that were occurring before adding the patch.

BillK
Comment 12 SpanKY gentoo-dev 2016-05-03 19:31:09 UTC
(In reply to Alexander Miller from comment #9)

assuming you wrote this patch, please post it to the glibc libc-alpha list for discussion:
https://www.gnu.org/software/libc/development.html
Comment 13 SpanKY gentoo-dev 2016-05-05 05:50:25 UTC
thanks ... hopefully we can get some guidance whether this is the right course.  i'll prob do another 2.22 to pick this up and another security fix.
Comment 14 Fernando Rodriguez 2016-05-19 14:26:24 UTC
This must be somewhat related. After unprelinking my libdbusmenu-qt* as suggested on #579546 I'm now getting the following after running emerge:

>>> Regenerating /etc/ld.so.cache...
/sbin/ldconfig: /usr/lib64/libdbusmenu-qt5.so.2 is not a symbolic link
/sbin/ldconfig: /usr/lib64/libdbusmenu-qt.so.2 is not a symbolic link


$ ls -lh /usr/lib64/libdbusmenu-qt*
-rwxr-xr-x 1 root root 208K May 17 03:21 /usr/lib64/libdbusmenu-qt5.so
-rwxr-xr-x 1 root root 212K May 17 03:21 /usr/lib64/libdbusmenu-qt5.so.2
-rwxr-xr-x 1 root root 208K May 17 03:21 /usr/lib64/libdbusmenu-qt5.so.2.6.0
-rwxr-xr-x 1 root root 212K May 17 03:21 /usr/lib64/libdbusmenu-qt.so
-rwxr-xr-x 1 root root 212K May 17 03:21 /usr/lib64/libdbusmenu-qt.so.2
-rwxr-xr-x 1 root root 212K May 17 03:21 /usr/lib64/libdbusmenu-qt.so.2.6.0

^^^ note the different size of libdbusmenu-qt5.so.2

I think the symlinks were replaced when I undid prelinking. Now I unprelinked the whole system just in case (everything was ok after prelinking those libraries) and no other symlinks were replaced.

Using prelink-20130503 and glibc-2.22-r4.
Comment 15 Lukas Turek 2016-05-29 16:48:27 UTC
The patch also fixes a problem with Hugin (panorama alignment freezes).
Comment 16 Da Fox 2016-07-07 16:44:07 UTC
I can confirm this patch fixes this problem for me with tk/wish (used by e.g. gitk and git-gui). Git-gui would say (see https://bugzilla.altlinux.org/show_bug.cgi?id=31739):
---8<----------
Cannot determine Git version:

error waiting for process to exit: child process lost (is SIGCHLD ignored or
trapped?)

Git Gui requires Git 1.5.0 or later.
--->8---------


I used:
- sys-libs/glibc-2.22
- dev-lang/tk-8.5.17
- sys-devel/prelink-20130503
- sys-devel/gcc-5.4.0

Is there any particular reason this patch is not yet applied?
Comment 17 Erik Quaeghebeur 2016-12-13 09:14:56 UTC
(In reply to Da Fox from comment #16)
>
> Is there any particular reason this patch is not yet applied?
I guess because upstream hasn't responded yet, but this bug may need some revivification. Alexander or SpanKY, could you try to ping upstream again on the mailing list?

Has an upstream bug already been filed?
(https://sourceware.org/glibc/wiki/FilingBugs)
Comment 18 Andreas K. Hüttel archtester gentoo-dev 2017-08-15 12:19:29 UTC
Is this still a problem with glibc 2.23 or later?
Comment 19 Erik Quaeghebeur 2017-08-21 08:05:59 UTC
(In reply to Andreas K. Hüttel from comment #18)
> Is this still a problem with glibc 2.23 or later?
I'm using 2.23-r4, and prelinking has not caused any issues that I have noticed so far. (I actually don't know when I re-enabled prelinking.)
Comment 20 Alexander Miller 2017-08-27 00:08:50 UTC
(In reply to Andreas K. Hüttel from comment #18)
> Is this still a problem with glibc 2.23 or later?

More or less, probably.

IIRC most (all?) of the reported crashes were related to fork(),
and libpthread's fork() is no longer an IFUNC, it seems. So
it may no longer be a big problem in practice.

BUT the underlying bug has NOT been fixed!

My patch still applies, and there has been no reaction on my
mail¹ to libc-alpha, Mark Hatle's upstream bug report² or
Mark's mail³. A program relying on tricky cases of symbol
superposition with libraries will still fail. Moreover, there
are still IFUNCs in libpthread (system, vfork, longjmp) that
may cause problems.

I haven't tested any of this because I'm running a patched
glibc. Anyone who wants to verify the bug is still there on
the unpatched glibc can try the demo from attachment 432494.
It may be interesting to modify it to see if you can get a
crash with libpthread's system() vs. libc's.

¹) https://sourceware.org/ml/libc-alpha/2016-05/msg00034.html
²) https://sourceware.org/bugzilla/show_bug.cgi?id=20488
³) https://sourceware.org/ml/libc-alpha/2016-08/msg00673.html
Comment 21 Andreas K. Hüttel archtester gentoo-dev 2022-01-22 01:00:45 UTC
Prelink support is being dropped upstream in glibc-2.35; sys-devel/prelink has been masked for removal.
Comment 22 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-02-07 02:53:13 UTC
(In reply to Andreas K. Hüttel from comment #21)
> Prelink support is being dropped upstream in glibc-2.35; sys-devel/prelink
> has been masked for removal.

Changed to 2.36 quite late on, for the record, but people should still be migrating off it.