Bug 158829 - app-portage/portage-utils: qfile -f file
Summary: app-portage/portage-utils: qfile -f file
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High enhancement
Assignee: Portage Utils Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 160735
Reported: 2006-12-22 06:26 UTC by TGL
Modified: 2007-01-07 18:31 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
qfile--from-file.patch (17.72 KB, patch)
2007-01-07 16:38 UTC, TGL
man--move-config-orphans-script.patch (1.01 KB, patch)
2007-01-07 16:40 UTC, TGL
man/include/qfile-04-from-file.include (qfile-04-from-file.include, 2.16 KB, text/plain)
2007-01-07 16:41 UTC, TGL
bench.sh (757 bytes, text/plain)
2007-01-07 16:48 UTC, TGL
bench.log.with-vdb-contents-whitespace-crap (1.53 KB, text/plain)
2007-01-07 16:51 UTC, TGL
bench.log.improved-vdb-contents-whitespace-crap (2.38 KB, text/plain)
2007-01-07 16:53 UTC, TGL
bench.log.without-vdb-contents-whitespace-crap (2.15 KB, text/plain)
2007-01-07 16:54 UTC, TGL
bench.log.short-lists (3.44 KB, text/plain)
2007-01-07 16:56 UTC, TGL

Description TGL 2006-12-22 06:26:00 UTC
Hi Solar,

About two weeks ago you sent a request to portage-utils@ for a qfile enhancement: adding a "-f file" option, so that the list of files to query is read from a file instead of the command-line args (with "-f -" for stdin). You also asked for a bug report, to track progress.  I'm sorry for the late reply, I've been busy these last weeks, and offline most of the time.
Anyway, I finally found time to implement the feature yesterday. I can't send a patch now though, since I didn't have a recent original file to diff against... I'll download one from cvsweb today, and will attach a patch here the next time I get internet access, sometime next week.
Comment 1 solar (RETIRED) gentoo-dev 2006-12-22 08:08:03 UTC
Thanks TGL..
Comment 2 TGL 2007-01-07 16:36:52 UTC
Here is (finally) the patch for --from-file support.  Sorry for sending it late again, I've been offline longer than expected.

The patch is a bit big, because it also reorganizes some of the qfile code:
 - the qfile(...) function now takes a structure as its argument, instead of the numerous arguments it was using before.  This struct (qfile_args_t) holds all the various arrays that are needed (the basenames of query items, their dirnames, etc.)
 - whereas before all these arrays were prepared in qfile_main(...), this is now done in a new separate function, which is meant to fill the qfile_args_t struct from a list of query items (argv).
 - there are also new functions to create and free such structures.
This helps keep the code a bit more readable I think, although it is a bit longer than before.  And I've also moved a few "free(something)" lines around, to make valgrind happy.
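
For illustration, the reorganization described above could look roughly like the following C sketch. Only the qfile_args_t name and the idea of holding basenames and dirnames arrays come from the comment. The field names, the helper names (create_qfile_args, prepare_qfile_args, reset_qfile_args, destroy_qfile_args, dup_n) and the path-splitting rules are assumptions made for this example, and the actual patch may be organized differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the real qfile_args_t in the patch may differ. */
typedef struct {
    size_t length;     /* number of query items currently stored */
    char **basenames;  /* last path component of each item */
    char **dirnames;   /* leading directory of each item, NULL if none */
} qfile_args_t;

/* Allocate an empty, zero-initialized structure. */
qfile_args_t *create_qfile_args(void)
{
    return calloc(1, sizeof(qfile_args_t));
}

/* Free the arrays but keep the struct, so it can be refilled. */
void reset_qfile_args(qfile_args_t *args)
{
    size_t i;
    for (i = 0; i < args->length; i++) {
        free(args->basenames[i]);
        free(args->dirnames[i]);
    }
    free(args->basenames);
    free(args->dirnames);
    args->basenames = NULL;
    args->dirnames = NULL;
    args->length = 0;
}

/* Free everything, including the struct itself. */
void destroy_qfile_args(qfile_args_t *args)
{
    if (args == NULL)
        return;
    reset_qfile_args(args);
    free(args);
}

/* Duplicate the first n bytes of s as a NUL-terminated string. */
char *dup_n(const char *s, size_t n)
{
    char *p = malloc(n + 1);
    if (p != NULL) {
        memcpy(p, s, n);
        p[n] = '\0';
    }
    return p;
}

/* Fill args from a list of query items.
 * Returns 0 on success, -1 on allocation failure.
 * Items ending in '/' are not special-cased in this sketch. */
int prepare_qfile_args(int argc, const char **argv, qfile_args_t *args)
{
    int i;
    assert(args != NULL);
    reset_qfile_args(args);
    if (argc <= 0)
        return 0;
    args->basenames = calloc((size_t)argc, sizeof(char *));
    args->dirnames = calloc((size_t)argc, sizeof(char *));
    if (args->basenames == NULL || args->dirnames == NULL) {
        reset_qfile_args(args);
        return -1;
    }
    for (i = 0; i < argc; i++) {
        const char *slash = strrchr(argv[i], '/');
        const char *base = (slash != NULL) ? slash + 1 : argv[i];
        args->basenames[i] = dup_n(base, strlen(base));
        args->dirnames[i] = NULL;
        if (slash != NULL) {
            /* keep "/" as the dirname of items directly under the root */
            size_t dlen = (slash == argv[i]) ? 1 : (size_t)(slash - argv[i]);
            args->dirnames[i] = dup_n(argv[i], dlen);
        }
        args->length = (size_t)i + 1;
        if (args->basenames[i] == NULL ||
            (slash != NULL && args->dirnames[i] == NULL)) {
            reset_qfile_args(args);
            return -1;
        }
    }
    return 0;
}
```

Keeping a separate reset step lets one structure be reused for each successive group of query items, with a single destroy call at the end, which is the kind of pairing valgrind checks for.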

About --from-file support, the way it works is straightforward: instead of using argv for query arguments, it first builds a list by calling fgets on the input file.  The only trick is that there is a limit on the number of lines which are read from this file and processed at one time.  This was required for handling huge "find ... | qfile -f -" queries, so that memory consumption stays bounded.
To give you a rough idea, handling 100 000 files at a time would have consumed ~20MB of RAM, meaning that a "find / | qfile -o -f -" on my system would have consumed ~100MB.  Also, with overly long lists of query items, performance drops because of poor cache usage.
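
A minimal C sketch of this kind of bounded, batch-wise reading is shown here. It is not taken from the patch: the function names (read_batch, free_batch, for_each_batch), the process callback and the LINE_BUF_SIZE value are all hypothetical, and the real code may treat empty or overly long lines differently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_BUF_SIZE 4096  /* assumed buffer size, not from the patch */

/* Read up to max_args non-empty lines from fp into items[].
 * Returns the number of items stored; 0 means end of input
 * (or an allocation failure on the first item).
 * Lines longer than the buffer are split into several items here. */
size_t read_batch(FILE *fp, char **items, size_t max_args)
{
    char line[LINE_BUF_SIZE];
    size_t n = 0;

    assert(fp != NULL && items != NULL);
    while (n < max_args && fgets(line, sizeof(line), fp) != NULL) {
        size_t len = strcspn(line, "\n");
        line[len] = '\0';            /* strip the trailing newline */
        if (len == 0)
            continue;                /* skip empty lines */
        items[n] = malloc(len + 1);
        if (items[n] == NULL)
            break;
        memcpy(items[n], line, len + 1);
        n++;
    }
    return n;
}

/* Free the n items filled in by read_batch. */
void free_batch(char **items, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        free(items[i]);
}

/* Process the whole stream, at most max_args items at a time.
 * 'process' is a hypothetical callback standing in for the query step.
 * Returns the total number of items read. */
size_t for_each_batch(FILE *fp, size_t max_args,
                      void (*process)(char **items, size_t n))
{
    char **items;
    size_t n, total = 0;

    assert(max_args > 0);
    items = calloc(max_args, sizeof(char *));
    if (items == NULL)
        return 0;
    while ((n = read_batch(fp, items, max_args)) > 0) {
        if (process != NULL)
            process(items, n);
        free_batch(items, n);        /* memory stays bounded by max_args */
        total += n;
    }
    free(items);
    return total;
}
```

With this shape, fp would be stdin when the argument is "-" and the result of fopen otherwise, and at most max_args item strings are held in memory at any moment.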

After running a few benchmarks (for which I will attach the results), I've chosen a default limit of 5 000.  It keeps memory consumption very low, performs reasonably here, and also has the benefit of displaying results at a regular rate during a huge --orphans query (because with --orphans, the results for a group of query items are all displayed at once, when the search ends). Anyway, this default value can be changed with a new option (--max-args), if anyone cares.

Talking about performance, I've noticed that a change made in CVS (and 0.1.22) a few months ago makes reading the vdb CONTENTS files quite slow:
http://sources.gentoo.org/viewcvs.py/gentoo-projects/portage-utils/main.c?r1=1.124&r2=1.125
This has a big influence on what a correct value for the above-mentioned --max-args limit is.  My 5 000 value is based on an optimized version of this code chunk, for which I have opened bug #160725.

Finally, I've also written a new section about --from-file usage for the qfile.1 manpage.
Comment 3 TGL 2007-01-07 16:38:23 UTC
Created attachment 105843 [details, diff]
qfile--from-file.patch

Adds --from-file/-f (and --max-args/-m) support for qfile.
Comment 4 TGL 2007-01-07 16:40:00 UTC
Created attachment 105845 [details, diff]
man--move-config-orphans-script.patch

A small patch for qfile-02-orphans.include, which removes an example script that I've moved to a new man section.
Comment 5 TGL 2007-01-07 16:41:09 UTC
Created attachment 105849 [details]
man/include/qfile-04-from-file.include

A new qfile.1 section about the --from-file option.
Comment 6 TGL 2007-01-07 16:48:31 UTC
Created attachment 105855 [details]
bench.sh

For what it's worth, the script I've used to benchmark "qfile -f file -m XXX".
Usage is as follows:

 - create some "something.list" files in the current directory, which are lists of various sizes. I've used:
# echo -e "/usr/bin/vi\n/usr/bin/vim" > 00-very-short.list
# find /bin > 01-bin.list
# find /usr/bin > 02-usr-bin.list
# find /usr/share/man > 03-usr-share-man.list
# find /usr/lib > 04-usr-lib.list

 - launch the script, and go make some coffee if you have some lists with tens of thousands of entries.  It will produce a "bench.log" file.
Comment 7 TGL 2007-01-07 16:51:12 UTC
Created attachment 105859 [details]
bench.log.with-vdb-contents-whitespace-crap

Here are some benchmark results with the current main.c code for reading CONTENTS files (before bug #160725).
Comment 8 TGL 2007-01-07 16:53:18 UTC
Created attachment 105867 [details]
bench.log.improved-vdb-contents-whitespace-crap

Here are some benchmark results with the slow chunk of main.c rewritten (after bug #160725).
Comment 9 TGL 2007-01-07 16:54:40 UTC
Created attachment 105869 [details]
bench.log.without-vdb-contents-whitespace-crap

And for reference, here are benchmark results with the offending chunk removed (i.e., similar to 0.1.21).
Comment 10 TGL 2007-01-07 16:56:53 UTC
Created attachment 105873 [details]
bench.log.short-lists

Finally, here is a quick benchmark I made to check that using overly high --max-args values doesn't kill the performance of small queries like it does on big ones.
Comment 11 solar (RETIRED) gentoo-dev 2007-01-07 18:10:35 UTC
I'm going to commit this with minor changes to the option name.
Comment 12 solar (RETIRED) gentoo-dev 2007-01-07 18:31:26 UTC
cvs ci -m "- qfile -f file support. TGL bug #158829"
? scripts
/var/cvsroot/gentoo-projects/portage-utils/qfile.c,v  <--  qfile.c
new revision: 1.39; previous revision: 1.38
/var/cvsroot/gentoo-projects/portage-utils/man/mkman.sh,v  <--  man/mkman.sh
new revision: 1.10; previous revision: 1.9
/var/cvsroot/gentoo-projects/portage-utils/man/qfile.1,v  <--  man/qfile.1
new revision: 1.21; previous revision: 1.20
/var/cvsroot/gentoo-projects/portage-utils/man/include/qfile-02-orphans.include,v  <--  man/include/qfile-02-orphans.include
new revision: 1.2; previous revision: 1.1
/var/cvsroot/gentoo-projects/portage-utils/man/include/qfile-04-from.include,v  <--  man/include/qfile-04-from.include
initial revision: 1.1
/var/cvsroot/gentoo-projects/portage-utils/man/include/qfile-99-authors.include,v  <--  man/include/qfile-99-authors.include
initial revision: 1.1