Hi, I was thinking about a tool for emerge. (Sorry if this is not the right place in the bug tracker.) Currently, ebuild dependencies are written by hand, by reading the configure output and options or by checking the project's website, and developers often forget some dependencies. I'm not sure, but I think Debian packages run ldd on the binary files, look up the packages that contain the listed libraries, and then reduce the result using the dependency tree. I think this could be implemented as an external utility that displays the runtime packages required for a particular set of USE flags. Runtime dependencies that come directly from a USE flag are already set correctly in the ebuild, but the common runtime dependencies, at least, could be shown by an automated tool. For special packages that need to be 100% correct, the utility could build every possible USE combination to discover the runtime dependencies. On the subject of ebuilds, it would also be good if all packages were tested on a dedicated box that automatically files bugs or sends emails (or maybe adds the ebuild to package.mask) when it detects errors in packages, such as sandbox violations, miscompilation, the missing runtime dependencies described above, or lintian errors. Ideas... ideas... ideas... Slt! :]
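As a rough illustration of the ldd step proposed above, here is a minimal sketch that turns ldd output into a list of resolved library paths. The function name ldd_paths is hypothetical and not part of portage. The sketch only covers the "which files does this binary link against" half; mapping those files to packages is a separate step.

```shell
#!/bin/sh
# Hypothetical helper (not part of portage): read `ldd` output on stdin and
# print the resolved library paths, one per line, sorted and de-duplicated.
ldd_paths() {
    awk '
        # "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)"; skips "=> not found"
        $2 == "=>" && $3 ~ /^\// { print $3; next }
        # "/lib/ld-linux.so.2 (0x...)"; skips the "/path/to/binary:" headers
        # that ldd prints when it is given several files
        $1 ~ /^\// && $1 !~ /:$/ { print $1 }
    ' | LC_ALL=C sort -u
}

# Possible use on an installed image (the IMAGE variable is illustrative):
#   find "$IMAGE" -type f -perm -100 -exec ldd {} + 2>/dev/null | ldd_paths
```

Libraries that a program only loads at run time with dlopen do not appear in ldd output, so a list produced this way is a lower bound rather than a complete RDEPEND.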
Related, and noting it here so I don't forget: sandbox could record every file that is read during the compile, which would tell us what the build dependencies are.
I'm not going to get into whether I think this is needed or not, but can we get it either resolved or closed?
Ok, then I propose that we get it resolved :)
I don't have time to work on this, but I think it is too important to close. Anyway, some pointers for implementing this: to be totally sure which dependencies were used in building a package, you need to keep a list of all files that are opened. The easiest way may be to start strace -f -e open from within ebuild.sh and then generate a list of used files from the output. The best way would be to have sandbox keep a hash of opened files and dump it at the end of the run. Takers welcome.
Update: Here's how to get a package list that can help you write a DEPEND line for an ebuild (using distcc as an example).
Generate the list of files:
  # strace -qf -esignal=\!all -eopen -o distcc.out /usr/lib/portage/bin/ebuild /usr/portage/sys-devel/distcc/distcc-2.0.1-r1.ebuild compile
  [...] output deleted
Get the unique list of successfully opened files that are important, and find out which packages they are from:
  # awk -F\" '/open/&&!/ENOENT/&&$2!~/\/tmp\//&&$2~/^\//{files[$2]++}END{for(f in files){print f}}' distcc.out | /usr/lib/portage/bin/find-packages
  baselayout-1.8.6.4-r1 fileutils-4.1.11 gcc-3.2.2-r1 glibc-2.3.2 popt-1.7-r1 portage-2.0.47-r10 zlib-1.1.4-r1
(zlib is probably a portage dependency or something like that. Running strace within ebuild.sh should give better results.)
The RDEPEND could get hints from the following (after install):
  # find /var/tmp/portage/distcc-2.0.1-r1/image -type f -perm -100 | /usr/lib/portage/bin/find-requires | /usr/lib/portage/bin/find-packages
  glibc-2.3.2 popt-1.7-r1
To get more accurate results, the strace calls should be included in ebuild.sh. To do that, the dyn_compile function should probably attach strace to the current process (strace -p $$ ...), run src_compile, and then kill the strace. This could be controlled by a --find-depends option. The RDEPEND part could be run from the dyn_install function. It would be nicer if sandbox itself wrote down the files, since it knows which files are inside the sandbox and which are not, and since it would be faster.
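To make the awk filter in the comment above easier to follow, here is a minimal sketch that wraps it in a function and spells out each condition. The function name opened_files and the final sort are additions for illustration; the filter conditions are the ones from the comment.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk filter quoted above.
# $1: an strace log written with -o.
# Prints each absolute path that was opened, once, in sorted order.
opened_files() {
    awk -F'"' '
        # With a double quote as the field separator, $2 is the path argument.
        /open/ &&            # only open calls
        !/ENOENT/ &&         # drop lookups of files that do not exist
        $2 !~ /\/tmp\// &&   # drop build and temp dirs such as /var/tmp/portage
        $2 ~ /^\// {         # drop relative paths inside the source tree
            files[$2]++
        }
        END { for (f in files) print f }
    ' "$1" | LC_ALL=C sort
}

# Possible use with the log from the comment above:
#   opened_files distcc.out | /usr/lib/portage/bin/find-packages
```

On newer systems the same call often shows up as openat(...) in the trace. The /open/ pattern still matches it, and the path is still the second quote-delimited field.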
So here's a quick patch that makes ebuild tell you what packages were used during the compile. It still needs to be switched on conditionally, but it's a good first step, I'd say. I found for instance that distcc needs linux-headers to compile, which is missing from the DEPEND line. See, it already does useful stuff ;)
Note that find-packages needs to have the little arrow removed in the grep for this to work properly. No idea why it's there. Now it just searches for symlinks. Also note that I needed to add exec tracing as well.
So, please, someone take this ball and run with it. Still needed:
- option --find-depends in the ebuild executable, to make it conditional
- saving the results in a file
- the abovementioned method to get an RDEPEND approximation
- make it nicely integrated with portage.
--- ebuild.sh.orig 2003-05-04 23:59:47.000000000 +0200
+++ ebuild.sh 2003-05-05 00:10:56.000000000 +0200
@@ -650,7 +650,12 @@
 #scripts, so set it to $T.
 export TMP="${T}"
 export TMPDIR="${T}"
- src_compile
+ strace -qf -e signal=\!all -e open,execve -o "${T}/openedfiles.log" -p $$ &
+ local strace_pid=$!
+ src_compile
+ kill $strace_pid
+ ewarn Packages that were used for compilation:
+ awk -F\" '!/ENOENT/&&$2!~/\/tmp\//&&$2~/^\//{files[$2]++}END{for(f in files){print f}}' "${T}/openedfiles.log" | /usr/lib/portage/bin/find-packages
 #|| abort_compile "fail"
 cd ${BUILDDIR}
 touch .compiled
Created attachment 11510
Patch to get full dependencies on compiling
Looks like I'm spending time on it after all :) This patch doesn't need find-packages any more and is a lot faster and more accurate. Still to do:
- integrate in ebuild with a --find-deps option. Currently, you define FINDDEPS=true.
- find the rdepends on installation
- save the results in a file, like COMPILE_DEPENDS or something like that.
- bug-test and tell people to use it / make the GRP builds use it.
Created attachment 11561
Small update
Two little edits to make things clearer.
Created attachment 11872
patch for libsandbox.c to support tracing
This will enable SANDBOX_TRACE and SANDBOX_TRACE_LOG to create the file list that is normally created with strace. Advantages: paths are normalized, you only see the important paths, it's faster than strace, and it doesn't use strace. My feeling about this patch is that the code is very stable. Of course, testing can never hurt :) Carpaski, if you don't want to apply it, can you at least add the documentation comments I added in front?
Created attachment 11873
ebuild.sh changes to use the libsandbox patch
As you can see, very minimal. In the long run, the sort -u should stay, and an external program should be used to generate the package list. This could then also be used to generate package lists from stracing a running binary, giving a precise RDEPEND. More ideas:
- put the tracing around all ebuild stages. This won't help for stages that don't use sandbox, like postinstall, but for these the author knows what he uses.
- mark the packages that are in the system package list so that no dependencies get added needlessly. (This should go in that external app.)
Comments very welcome...
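The external program that turns a file list into a package list could be sketched roughly as below. This is not the code from the attachments. It assumes the installed-package database layout of one CONTENTS file per package at <db root>/<category>/<package-version>/CONTENTS, with lines beginning "obj <path>" or "sym <path>". The function name find_owners is hypothetical, and paths containing spaces are not handled.

```shell
#!/bin/sh
# Hypothetical sketch of a file-list-to-package-list tool.
# $1:     root of the installed-package database (assumed: /var/db/pkg)
# stdin:  absolute file paths, one per line
# stdout: category/package-version of every package owning one of the paths
find_owners() {
    vdb=$1
    list=$(mktemp) || return 1
    cat > "$list"
    awk '
        # First operand is the path list: remember every wanted path.
        FILENAME == ARGV[1] { want[$0] = 1; next }
        # Remaining operands are CONTENTS files: field 2 is the installed path.
        ($1 == "obj" || $1 == "sym") && ($2 in want) {
            n = split(FILENAME, part, "/")
            print part[n-2] "/" part[n-1]   # category/package-version
        }
    ' "$list" "$vdb"/*/*/CONTENTS | LC_ALL=C sort -u
    rm -f "$list"
}

# Possible use with a trace log that holds one path per line
# (TRACE_LOG is an illustrative variable name):
#   sort -u "$TRACE_LOG" | find_owners /var/db/pkg
```

Marking packages that belong to the system set would be one more filter on this output, which is why it fits in the same external tool.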
It seems like nothing has happened on this since May? Why hasn't the patch been applied? It seems pretty harmless since it needs the FINDDEPS variable to be set.
Technically an enhancement, marking it as such. Meanwhile, is this still even kicking?
So... no longer kicking. Beyond that, this approach has major issues. How will dlopen trickery (kdelibs, for example) be handled automatically? If autogeneration of RDEPEND is implemented, it pretty much has to be foolproof, because it would become heavily relied upon. Autogeneration of RDEPEND also fails pretty miserably if the source makes any exec calls to external utilities, and that's not even counting the problem of auto-determining what scripts require (quilt, for example).