Bug 246757 - [Tracker] Packages with weak linkage having undefined references
Summary: [Tracker] Packages with weak linkage having undefined references
Status: CONFIRMED
Alias: None
Product: Quality Assurance
Classification: Unclassified
Component: Trackers
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Quality Assurance Team
URL:
Whiteboard:
Keywords: Tracker
Depends on: 246715 246727 246733 246742 246745 246747 246748 246749 246756 246758 246767 250747 250757 250759
Blocks:
Reported: 2008-11-14 17:10 UTC by Jeroen Roovers (RETIRED)
Modified: 2021-07-21 01:06 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Example project with broken buildsystem (test-1.0.tar.gz,703 bytes, application/x-gzip)
2008-11-15 12:17 UTC, Robert Wohlrab
Fix for example project (buildsystem-fix.patch,737 bytes, patch)
2008-11-15 12:20 UTC, Robert Wohlrab
listing of warnings (undefined_warnings,2.67 KB, text/plain)
2008-11-17 21:45 UTC, Robert Wohlrab
Details

Description Jeroen Roovers (RETIRED) gentoo-dev 2008-11-14 17:10:37 UTC
.
Comment 1 Rémi Cardona (RETIRED) gentoo-dev 2008-11-14 17:41:03 UTC
You might want to get in touch with Mandriva's devs. They've been working on some of this stuff for the past couple of releases. It might be a good opportunity to collaborate instead of fixing this on our own.

@Diego, I think you might be interested in this bug.
Comment 2 Diego Elio Pettenò (RETIRED) gentoo-dev 2008-11-14 17:49:46 UTC
Indeed I am! Thanks Rémi, I'll check this out and see to get in touch with somebody at Mandriva next week :)
Comment 3 Jeroen Roovers (RETIRED) gentoo-dev 2008-11-14 17:54:00 UTC
(In reply to comment #2)
> Indeed I am! Thanks Rémi, I'll check this out and see to get in touch with
> somebody at Mandriva next week :)

Great. Would you please reassign this bug appropriately, then? I was thinking toolchain@ but I'm not sure.
Comment 4 Robert Wohlrab 2008-11-14 18:20:38 UTC
Copy from bug 246757 (with more sane names):

I am testing packages for their symbol linking in shared libraries. I need to build some of the packages on a system without the Linux-like dynamic loading behavior (that is, without the weak kind of dynamic linking where symbols are only looked up once everything has been loaded, across all of the different ELF files). In my situation a library or program must know where it needs to search for a symbol.

And these are not only possible --as-needed bugs but also bad behavior that needs to be worked around by other packages. Let's say library libmain uses library libuseful but doesn't link against it. Now program prog_z wants to use library libmain. The natural way would be for prog_z to link against libmain, and everything would be fine. But in this situation it must link against both libmain and libuseful, even though prog_z is not using libuseful directly.

Now let's suppose libmain notices that libuseful is completely broken and implements the functionality itself in a more sane way. But prog_z is still linked against libuseful. Which symbol do you think will be used in place of the previously faulty functions from libuseful? I would bet on Murphy and say that the faulty functions will still be used by prog_z.

Or the other way around: libmain starts using new functions from library libextra but doesn't link against it. Then your prog_z binary will be instantly broken, even though the use of libmain's API should be transparent to prog_z.


What should be done? Relatively easy: it must be checked whether it is a Gentoo-only problem (a patch or misuse of the ebuild system) or whether it is in upstream. If it is in upstream, inform them that they forgot to link their library x against lib y and show them how they can test it. If they release a patch for it, integrate that patch in your ebuilds.
But there is of course the possibility that it was built that way and must be used that way for a good reason.
Comment 5 Rémi Cardona (RETIRED) gentoo-dev 2008-11-14 19:00:08 UTC
My gut feeling tells me this should be reassigned to QA, but I think this really is something everyone should be involved with. => gentoo-devs@gentoo.org :)
Comment 6 Robert Wohlrab 2008-11-15 12:17:45 UTC
Created attachment 171799 [details]
Example project with broken buildsystem

A build on a normal i386 gentoo system without special setting will run "fine":
$ make
cc -fPIC -shared   main.c -o libmain.so -Wl,-rpath,/home/test/libtest
cc -fPIC -shared   useful.c -o libuseful.so  -Wl,-rpath,/home/test/libtest
cc   prog_z.c -o prog_z -rdynamic libmain.so libuseful.so -Wl,-rpath,/home/test/libtest

When we simulate stricter linking it will break:
$ make LDFLAGS="-Wl,-z,-defs -Wl,--no-undefined"
cc -fPIC -shared  -Wl,-z,-defs -Wl,--no-undefined main.c -o libmain.so -Wl,-rpath,/home/test/libtest
/tmp/ccW1jr9I.o: In function `main_function':
main.c:(.text+0xa): undefined reference to `useful_function'
collect2: ld returned 1 exit status
make: *** [libmain.so] Error 1
Comment 7 Robert Wohlrab 2008-11-15 12:20:40 UTC
Created attachment 171800 [details, diff]
Fix for example project

This small patch fixes the buildsystem of the example project

$ make LDFLAGS="-Wl,-z,-defs -Wl,--no-undefined"
cc -fPIC -shared  -Wl,-z,-defs -Wl,--no-undefined useful.c -o libuseful.so  -Wl,-rpath,/home/test/libtest
cc -fPIC -shared  -Wl,-z,-defs -Wl,--no-undefined main.c -o libmain.so libuseful.so -Wl,-rpath,/home/test/libtest
cc  -Wl,-z,-defs -Wl,--no-undefined prog_z.c -o prog_z -rdynamic libmain.so -Wl,-rpath,/home/test/libtest
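The difference between the two builds above can be modelled in a few lines. This is only a sketch under stated assumptions: `ElfObject`, `unresolved_at_link_time` and `build_example` are hypothetical names, the symbol tables are written by hand rather than read from real ELF files, and the check is simplified to "every undefined symbol must be defined by a library named directly on the link line", which is roughly what the stricter link enforces for libmain.so.

```python
from dataclasses import dataclass, field


@dataclass
class ElfObject:
    """Toy stand-in for a shared object; not a real ELF parser."""
    name: str
    defined: set = field(default_factory=set)     # symbols this object exports
    undefined: set = field(default_factory=set)   # symbols it references but lacks
    needed: list = field(default_factory=list)    # names it was linked against


def unresolved_at_link_time(obj, objects):
    """Return symbols obj references that none of its own link-line libraries define."""
    provided = set()
    for dep_name in obj.needed:
        provided |= objects[dep_name].defined
    return obj.undefined - provided


def build_example(link_libmain_to_libuseful):
    """Model the example project before (False) and after (True) the build system fix."""
    libuseful = ElfObject("libuseful.so", defined={"useful_function"})
    libmain = ElfObject(
        "libmain.so",
        defined={"main_function"},
        undefined={"useful_function"},
        needed=["libuseful.so"] if link_libmain_to_libuseful else [],
    )
    return {obj.name: obj for obj in (libuseful, libmain)}


if __name__ == "__main__":
    for patched in (False, True):
        objects = build_example(patched)
        missing = unresolved_at_link_time(objects["libmain.so"], objects)
        print("patched" if patched else "original", sorted(missing))
```

In this model the original build leaves useful_function unresolved in libmain.so, while the patched build leaves nothing unresolved, matching the two transcripts.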
Comment 8 Rémi Cardona (RETIRED) gentoo-dev 2008-11-16 15:03:31 UTC
Robert, is there a way for ld to only _warn_ about undefined references instead of an error?

This way, we could see the full build and all potential undefined refs instead of just the first one.

Could be worth it, what do you think?

Cheers
Comment 9 Robert Wohlrab 2008-11-16 15:08:04 UTC
Yes, it would be worth it. I saw that there are Gentoo QA checks for different compiler warnings (like implicit declaration of functions, for example). I will check whether there is a way to make that undefined reference output of ld non-fatal.
Comment 10 Diego Elio Pettenò (RETIRED) gentoo-dev 2008-11-16 15:10:22 UTC
As far as I can see, binutils 2.19 does not have it available as a warning; could be worth requesting such a warning upstream.
Comment 11 Robert Wohlrab 2008-11-16 15:34:57 UTC
Sorry, but I must correct you. Try
 make LDFLAGS="-Wl,--no-undefined -Wl,--warn-unresolved-symbols"
with my test project.
Comment 12 Robert Wohlrab 2008-11-16 15:38:46 UTC
It is in binutils since 2.15
Comment 13 Diego Elio Pettenò (RETIRED) gentoo-dev 2008-11-16 15:52:37 UTC
Oh, I missed the combined use. I did know about --warn-unresolved-symbols (which alone does not warn about this), but I never tried combining the two of them. Quite useful indeed.
Comment 14 Rémi Cardona (RETIRED) gentoo-dev 2008-11-16 16:23:21 UTC
Great, it's definitely worth making a QA test for this. And since it needs to be added to the user's LDFLAGS, we'll likely not get a flood of bugs :)

Do we ping Zac?
Comment 15 Zac Medico gentoo-dev 2008-11-17 21:16:29 UTC
I can add a build-log check, but first I need some sample output of what the warning looks like.
Comment 16 Robert Wohlrab 2008-11-17 21:45:16 UTC
Created attachment 172129 [details]
listing of warnings

Only the output?
My test project, for example, gives the following warning:

/tmp/cc4UXMlj.o: In function `main_function':
main.c:(.text+0xa): warning: undefined reference to `useful_function'

Other examples are from the logs of alsa-lib and openjpeg
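A build-log check for this message could look roughly like the following. It is a minimal sketch, not Portage's actual QA code: the function name `undefined_symbols` and the regular expression are assumptions based only on the sample warning above, and other ld versions or locales may format the line differently.

```python
import re

# Matches the non-fatal form only; the fatal error line lacks "warning:".
UNDEFINED_REF = re.compile(r"warning: undefined reference to [`'](?P<symbol>[^']+)'")


def undefined_symbols(log_text):
    """Return symbols named in ld's undefined-reference warnings, in first-seen order."""
    symbols = []
    for line in log_text.splitlines():
        match = UNDEFINED_REF.search(line)
        if match and match.group("symbol") not in symbols:
            symbols.append(match.group("symbol"))
    return symbols


if __name__ == "__main__":
    sample = (
        "/tmp/cc4UXMlj.o: In function `main_function':\n"
        "main.c:(.text+0xa): warning: undefined reference to `useful_function'\n"
    )
    for symbol in undefined_symbols(sample):
        print(symbol)
```

Matching on the literal "warning:" prefix keeps the check from also firing on the fatal error that ld prints when --warn-unresolved-symbols is not in use.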
Comment 17 SpanKY gentoo-dev 2008-11-17 21:51:02 UTC
i dont think that's a good idea.  having undefined refs in a shared library is not necessarily a bug, and some packages do this on purpose with their plugins or libraries.
Comment 18 SpanKY gentoo-dev 2008-11-17 21:55:03 UTC
actually, an even better case is where a plugin cannot define all references.  for example, if a program itself loads a plugin on the fly, and the plugin uses symbols which are only available in the program.  theres nothing to link against in order to satisfy those symbol references.
Comment 19 Rémi Cardona (RETIRED) gentoo-dev 2008-11-17 22:40:34 UTC
Then perhaps we should do what Mandriva is currently trying to do : set the default symbol visibility to none.

Anyhow, this is a really big QA topic, I'm not sure bugzilla is the best place to discuss all this :)
Comment 20 SpanKY gentoo-dev 2008-11-17 22:52:34 UTC
how is this a "really big QA topic" ?  name one package where these undefined references in reality caused a problem.  every bug linked here will not show up at runtime (mostly because it's undefined references in plugins, not a shared library itself).

if packages arent actually breaking, then this is merely an educational exercise.

as for visibility, i dont follow.  what is this "none" you speak of ?  there is no such gcc visibility.  plus, some links would be useful ...
Comment 21 Diego Elio Pettenò (RETIRED) gentoo-dev 2008-11-17 23:03:31 UTC
The cases where this is going to be a problem are the ones that are going to be reported as --as-needed failures, actually, which includes, for instance, allegro (bug #247256) which I reported today.

When I suggested using --no-undefined, I was sincerely thinking of using it as a way to show upstream the way, to make sure they don't introduce further bugs after fixing --as-needed, rather than an exercise to put in use in the tree.

Certainly, it takes much more source-digging than reporting --as-needed failures since you have to decide on a case by case basis whether the undefined symbols are an issue or not.

For instance, just looking at a couple of the bugs still open and attached here, the alsa-lib one (#246715) is certainly not fatal (alsa-lib itself links to libdl); while it would be a very good idea to get upstream to fix their buildsystem so that it actually links to libdl, it's not something we should spend time on, IMHO.

On the other hand bug #246727 shows that openjpeg fails to link to libm, which _is_ an issue, which should be solved; packages linking to openjpeg are most likely to fail when built with forced --as-needed, unless they use libm themselves so it works incidentally.

So yeah, it takes quite a bit more time to identify which failures/warnings are real issues and which ones are false positives, like glibc or zsh.
Comment 22 SpanKY gentoo-dev 2008-11-17 23:24:22 UTC
yes, the openjpeg bug could be problematic.  but otherwise, your comment shows that this is indeed a minor issue.  i dont consider --as-needed to be a front & center issue that people should be falling over themselves to fix.
Comment 23 Rémi Cardona (RETIRED) gentoo-dev 2008-11-18 10:04:28 UTC
(In reply to comment #20)
> how is this a "really big QA topic" ?

I meant "big" as in scope, not as in priority.

As for the visibility, I mean "hidden". That's an interesting goal because it reduces the number of exported symbols and thus reduces run-time linking time. However, this needs to be explicitly supported by libraries. So this is a whole other issue.
Comment 24 SpanKY gentoo-dev 2008-11-18 14:05:23 UTC
yes, forcing visibility globally is broken.  if you want to start an initiative to "clean up library symbols", then more power to you as i'm sure there are a ton of people who'd appreciate it, but that is way more work than we have man power for.
Comment 25 SpanKY gentoo-dev 2008-11-18 22:03:45 UTC
Zac: the other thing to keep in mind is that there is no passive message portage can search for.  the logs provided show messages that only exist when a link failure occurs.  and when said link failure occurs, the emerge fails already.  simply outputting a slightly nicer message in that case seems pretty redundant.  the only way you'd be able to actively detect things is to do ELF symbol resolution the same way as the linker: scan every ELF for undefined symbols, load and scan every shared object referred to by the DT_NEEDED tag (as well as PT_INTERP), and do this recursively.

in other words, a ton of overhead for negligible return as well as plenty of false positives (which is the worst thing we can do to a developer -- make them research a "bug" which is no bug at all).
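The recursive scan described here, and the false positive it produces for plugins, can be sketched on hand-written symbol tables. Everything in this model is an assumption made for illustration: `needed_closure`, `unresolved_symbols`, the `OBJECTS` table, and the `plugin.so` / `host_register` / `plugin_init` names are hypothetical, prog_z is assumed to call main_function, PT_INTERP is ignored, and no real ELF file is read.

```python
def needed_closure(root, objects):
    """Return root plus every object reachable through DT_NEEDED entries."""
    seen = []
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.append(name)
        stack.extend(objects[name]["needed"])
    return seen


def unresolved_symbols(root, objects):
    """Return undefined symbols that nothing in root's DT_NEEDED closure defines."""
    closure = needed_closure(root, objects)
    defined = set()
    for name in closure:
        defined |= objects[name]["defined"]
    missing = set()
    for name in closure:
        missing |= objects[name]["undefined"] - defined
    return missing


# Hand-written stand-ins for the original (unpatched) example project plus
# a hypothetical plugin that relies on a symbol exported by its host program.
OBJECTS = {
    "prog_z": {"defined": {"main", "host_register"},
               "undefined": {"main_function"},
               "needed": ["libmain.so", "libuseful.so"]},
    "libmain.so": {"defined": {"main_function"},
                   "undefined": {"useful_function"},
                   "needed": []},
    "libuseful.so": {"defined": {"useful_function"},
                     "undefined": set(),
                     "needed": []},
    "plugin.so": {"defined": {"plugin_init"},
                  "undefined": {"host_register"},
                  "needed": []},
}

if __name__ == "__main__":
    for name in ("prog_z", "libmain.so", "plugin.so"):
        print(name, sorted(unresolved_symbols(name, OBJECTS)))
```

Scanned from prog_z, nothing is missing, which is why the underlinked libmain.so still works at runtime. Scanned on its own, plugin.so reports host_register even though the host program would supply it once the plugin is loaded, which is the kind of false positive mentioned above.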
Comment 26 Diego Elio Pettenò (RETIRED) gentoo-dev 2008-11-18 22:07:11 UTC
No, actually there is a passive message (I also didn't know about it before and thought it could only be a failure message).

But I agree, it's a bit of a problem, since enabling the warning for everybody means asking people to go looking for bugs; and if one has to add the warning flags to one's own flags or spec file, then one can also take care of adding a bashrc snippet to find the packages more quickly.
Comment 27 Robert Wohlrab 2008-11-18 22:08:24 UTC
It seems that you haven't read everything - it doesn't fail. It is a warning
Comment 28 Zac Medico gentoo-dev 2008-11-18 22:26:27 UTC
Okay, I'll just remove dev-portage from CC for now and you guys can ping me if you find some sort of QA test that doesn't have lots of false positives.
Comment 29 SpanKY gentoo-dev 2008-11-18 22:35:07 UTC
yes, the flag in question was mentioned before i noticed this bug

while it makes the process simpler, it still doesnt address the last point i raised.  if QA team wishes to perform audits and do some research, then great.  but dumping bugs that arent bugs on end developers is bad.
Comment 30 SpanKY gentoo-dev 2008-11-18 22:42:04 UTC
certainly a page covering this process would be good.  or perhaps integrating it into the current as-needed document.  that way we have accepted and documented methods to assist people in the audit procedure and once a potential real bug has been deduced, the page exists to pass knowledge on to the developer and/or maintainer who will be actually fixing / reviewing things.
Comment 31 Diego Elio Pettenò (RETIRED) gentoo-dev 2008-11-18 22:53:37 UTC
I'll see about adding a paragraph about this to the --as-needed documentation tomorrow then.
Comment 32 Diego Elio Pettenò (RETIRED) gentoo-dev 2008-11-18 22:59:06 UTC
(I'm on qa@)