Summary: | >=sys-libs/glibc-2.12 breaks sys-apps/sed[acl] on sh | | |
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Raúl Porcel (RETIRED) <armin76> |
Component: | [OLD] Core system | Assignee: | Gentoo Toolchain Maintainers <toolchain> |
Status: | RESOLVED WONTFIX | | |
Severity: | normal | CC: | sh+disabled |
Priority: | Normal | | |
Version: | unspecified | | |
Hardware: | sh | | |
OS: | Linux | | |
Whiteboard: | | | |
Package list: | | Runtime testing required: | --- |
Bug Depends on: | | | |
Bug Blocks: | 430346 | | |
Description
Raúl Porcel (RETIRED)
does it crash if sed is built USE=-acl ?

(In reply to comment #1)
> does it crash if sed is built USE=-acl ?

Having a broken system, i've done the following:
-extract binpkg of sys-apps/acl
-USE="-acl" emerge sed && emerge acl
-sed works
-emerge sed
-sed still works

(In reply to comment #2)
> (In reply to comment #1)
> > does it crash if sed is built USE=-acl ?
>
> Having a broken system, i've done the following:
> -extract binpkg of sys-apps/acl
> -USE="-acl" emerge sed && emerge acl
> -sed works
> -emerge sed
> -sed still works

Okay, after doing some debugging, the way to reproduce:
-emerge >=glibc-2.12*
-emerge sed
-emerge acl attr (this also pulls in autoconf-wrapper, m4 and autoconf)
-sed segfaults

(In reply to comment #3)
i can verify this behavior using cross-compilers & qemu. so that's good :). the odd thing is that you need to rebuild & deploy all three for it to crash. if you deploy only some of them, it'll run OK.

hmm, i think it's related to the __gmon_start__() changes ... before glibc-2.12, that symbol was defined in all ELFs, but then it moved to being undefined. i think the core logic is supposed to behave like "if symbol is not defined, skip it", but there's probably a bug in that somewhere.

if i create a dummy lib like so:
    echo '__gmon_start__(){}' | gcc -xc - -fPIC -shared -o libfoo.so
then preload that:
    LD_PRELOAD=./libfoo.so sed
it seems to work fine.

it seems that glibc-2.16 has this fixed (whether on purpose or by accident, i know not). if you could try upgrading your system to that, that'd be great. if it does turn out to work, i'd say let's just wait for that version.

(In reply to comment #6)
> it seems that glibc-2.16 has this fixed (whether on purpose or by accident,
> i know not). if you could try upgrading your system to that, that'd be
> great. if it does turn out to work, i'd say let's just wait for that
> version.

Nope, same issue

I've bisected it:

9b2f1d4b58f192445db38d5bfe5de0eff2dc3b27 is the first bad commit
commit 9b2f1d4b58f192445db38d5bfe5de0eff2dc3b27
Author: Kaz Kojima <kkojima@rr.iij4u.or.jp>
Date:   Sun Dec 13 09:43:51 2009 -0800

    Update sysdeps/sh/elf/initfini.c.

:100644 100644 93602d1935a3af3ba011395b209d6e445390c676 db9354489220ea8ba07af1f22e5764139b1fa7c6 M ChangeLog
:040000 040000 7568da3c9ea33e3b4f8777bc413cae54c9957c81 33b95a035e7d118d8ac1d96b9a6e3801fe13b479 M sysdeps

I'll investigate...

(In reply to comment #8)
yes, that is the commit i was referring to when i said the gmon symbol changed. that has no direct relevance in the latest versions since initfini.c has been thrown out in favor of crt[in].S. i'm not familiar enough with SuperH asm to walk the code and see if there is an error (checking to see if the symbol is non-NULL before attempting to call it).

I'd try masking acl and attr USE-flags, like HPPA does.

I tried disabling acl on sed but then after emerging ncurses some portage commands started failing. I'll try with the masking until we have an answer from the glibc maintainers...

i suspect acl/attr aren't the problem. they're just the first libs you ran into.

Thanks to the investigation of Yoshii-san from Renesas we already got the confirmation that the segfault is what is expected. The problem here is that it's not possible to upgrade glibc without having to rebuild all the libraries in a specific order (-evuDN world doesn't work here). I'm not sure how we can proceed here considering i haven't got any answer from the glibc SH maintainers. The only way i can think of is providing an upgrade guide explaining all the steps the users have to do. Of course those who prefer could start from scratch with the new stage3 we would provide.

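To tell which side of the __gmon_start__ change a given binary is on, its dynamic symbol table can be inspected with objdump -T. The following is only a sketch, not something from the thread: the helper name gmon_status is made up for illustration, and it assumes objdump marks undefined symbols with *UND* in the section column and prints the symbol name as the last field.

```shell
#!/bin/sh
# gmon_status (hypothetical helper): classify __gmon_start__ from
# `objdump -T` output read on stdin.
#   defined   - the binary carries its own (old-style) definition
#   undefined - the binary only references the symbol (new-style)
#   absent    - no __gmon_start__ entry in the dynamic symbol table
gmon_status() {
    awk '
        $NF == "__gmon_start__" {
            found = 1
            # objdump prints *UND* for undefined symbols; plain UND is
            # accepted too in case the asterisks were stripped in transit.
            for (i = 1; i < NF; i++)
                if ($i == "*UND*" || $i == "UND") und = 1
        }
        END {
            if (!found)   print "absent"
            else if (und) print "undefined"
            else          print "defined"
        }
    '
}
```

Typical use would be something like `objdump -T /bin/sed | gmon_status`, run once for the executable and once for each library it links against.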
Oops, the upgrade path in a clean stage3 would be this (thanks, Yoshii-san):

emerge attr acl
emerge ncurses readline --nodeps
emerge bzip2 cracklib libpcre xz-utils libffi db gdbm tcp-wrappers dev-libs/glib
emerge gmp mpfr mpc libxml2
emerge file gettext e2fsprogs-libs e2fsprogs
emerge expat
emerge openssl
emerge curl
emerge perl
emerge python

i'm going to call it. i don't have the time to investigate and fix this, and at this point, glibc-2.12 was released like 3 years ago. i'm not sure it's even really worth our time either considering SuperH's dead status in the larger scheme of things.

i've marked glibc-2.15-r3 stable for sh now:
http://sources.gentoo.org/sys-libs/glibc/glibc-2.15-r3.ebuild?r1=1.13&r2=1.14

for posterity, here is what Yoshii had to say:

-------- Original Message --------
Subject: Re: Fwd: >=glibc-2.12 issues on gentoo
Date: Mon, 15 Oct 2012 15:52:18 +0900
From: Takashi Yoshii <takashi.yoshii.zj@renesas.com>
To: Raúl Porcel <armin76@gentoo.org>

Sorry for my slow progress. No news is bad news, generally. One month has passed so quickly.
I have not gotten the issue fixed, but it has become quite obvious what is going on.
Anyway, I'm now using a glibc-2.15-r2 system built with a not-so-smart workaround.

Here is a report of what I know currently, though you might already know it.

Terms in the following context...
- "exe" is executable binaries (for example, /bin/sed)
- "lib" is shared libraries (for example, /lib/libattr.so)
- "the change" is commit 9b2f1d4b58f192445db38d5bfe5de0eff2dc3b27 of glibc
- "old" and "new" mean built before and after the change.

* What is the issue?
This is a transition issue between glibc versions before and after the change. They are not compatible.
The bad case is a combination of (New exe linked with old lib) + New lib.
All the other cases, including everything being new, should work.
This hits when you update an exe before updating the libs it depends on. You must update lib first, then exe.
But that is not easy, because Gentoo does not have separate packages for exe and lib (unlike other distributions, which have XXX and XXX-dev).

* Simplified procedure to reproduce the issue.
1. Build lib in old system.
2. Update glibc.
3. Build exe, using old lib.
4. Rebuild lib.
5. Execution of exe fails.

Example:
## (setup stage3-sh4-20120307, then)
$ echo 'l1(){}' | gcc -xc - -fPIC -shared -o libl1.so
$ sudo emerge -K =sys-libs/glibc-2.15-r2 # (package prepared in advance)
$ echo 'main(){l1();}' | gcc -xc - -L. -ll1
$ echo 'l1(){}' | gcc -xc - -fPIC -shared -o libl1.so
$ LD_LIBRARY_PATH=. ./a.out
Segmentation fault

* Detailed description
The symbol __gmon_start__ is for libc internal use, and most binaries have a reference to it. It may be left unresolved when it is not used.
Before the change, both exe and lib had a local weak definition of __gmon_start__ (this was somewhat SH specific).
The change removes this weak definition, so a new exe or a new lib no longer contains it.
If an exe is built before the lib is rebuilt, the compiled obj (.o) no longer has the definition but the lib still does, so the exe gets its __gmon_start__ reference resolved against the lib at link time, and it is successfully bound to the lib on execution. Nothing happens at this point.
But then, when the lib is rebuilt, the definition in the lib disappears. After that, symbol binding fails while loading, the first init call results in a jump to 0, and you get SEGV.

Example: __gmon_start__ status in the example procedure above.
## Before glibc update.
$ objdump -T a.old |grep gmon
004006a0  w   DF .text  00000000  Base        __gmon_start__
$ objdump -T libl1.so.old |grep gmon
00000500  w   DF .text  00000000  Base        __gmon_start__
## Build exe after glibc update
$ objdump -T a.out |grep gmon
00400460  w   DF *UND*  00000000              __gmon_start__
## Build lib after glibc update
$ objdump -T libl1.so |grep gmon
00000000  w   D  *UND*  00000000              __gmon_start__
## Build exe after building lib
$ objdump -T a.out |grep gmon
00000000  w   D  *UND*  00000000              __gmon_start__

* Fix
Not yet found.
I guess ldso can do some more tests to safely ignore this, or possibly we can get the symbol left unresolved at link time.
At the least, I believe the SEGV should be considered a bug in ldso. It should be "symbol not found" or so, I think.

* Workaround.
Simply update the libraries first, then update the executable binaries.

1. Careful step-by-step emerge works. My bash history says I did...
| emerge attr acl
| emerge ncurses readline --nodeps
| emerge bzip2 cracklib libpcre xz-utils libffi db gdbm tcp-wrappers dev-libs/glib
| emerge gmp mpfr mpc libxml2
| emerge file gettext e2fsprogs-libs e2fsprogs
| emerge expat
| emerge openssl
| emerge curl
| emerge perl
| emerge python
to rebuild all libs in stage3.
I think this one is far overkill; it can be automated more. But ebuild's dependency analysis is not enough, as you know.

2. Cross build works, because a corrupted exe never prevents the build process. You might need to build twice in such a case, though.

3. Bootstrapping should work. I haven't done it yet, though.

I do not think this issue matters much, because it is a transition issue.

/yoshii
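
The step-by-step workaround above could be wrapped in a small script that runs the rebuild steps strictly in the given order and stops at the first failure. This is a minimal sketch under stated assumptions, not a tested upgrade tool: the function name run_steps and the DRY_RUN variable are made up for illustration, and the package groups are supplied by the caller exactly as they would be typed after emerge.

```shell
#!/bin/sh
# run_steps (hypothetical helper): run each rebuild step in the order given.
# Each argument is one step, i.e. the argument list for a single emerge call.
# With DRY_RUN set to a non-empty value the commands are only printed.
run_steps() {
    for step in "$@"; do
        echo "emerge $step"
        if [ -z "${DRY_RUN:-}" ]; then
            # $step is left unquoted on purpose so that one step expands
            # into several emerge arguments.
            emerge $step || { echo "step failed: $step" >&2; return 1; }
        fi
    done
}
```

For instance, `DRY_RUN=1 run_steps "attr acl" "ncurses readline --nodeps"` would print the first two steps from the list above without running them. Stopping at the first failure is deliberate here, since continuing past a failed library rebuild would defeat the library-first ordering.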