Bug 319069 - app-portage/portage-utils-0.3.1: qfile no longer understand dirs symlinks
Summary: app-portage/portage-utils-0.3.1: qfile no longer understand dirs symlinks
Status: RESOLVED OBSOLETE
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Tools
Hardware: All Linux
Importance: High normal
Assignee: Portage Utils Team
Reported: 2010-05-09 15:58 UTC by TGL
Modified: 2012-10-28 09:45 UTC


Attachments
qfile-fix-realpath-checks.patch (526 bytes, patch)
2010-05-09 15:59 UTC, TGL
qfile-optimized-fix-for-realpath-checks.patch (2.87 KB, patch)
2010-06-08 20:27 UTC, TGL

Description TGL 2010-05-09 15:58:08 UTC
With portage-utils-0.3.1, I get no result for this query:
 % qfile /usr/lib64/debug/sbin/tune2fs.debug 
 %

With portage-utils-0.2.1, it was working fine:
 % qfile /usr/lib64/debug/sbin/tune2fs.debug 
 sys-fs/e2fsprogs (/usr/lib/debug/sbin/tune2fs.debug)
 %

Looking into recent changes in CVS, I found that the regression was introduced in qfile.c revision 1.51.  The part that checked for matches under "realpath(dirname(CONTENTS entry))" is now skipped with the default ROOT.

The attached patch fixes that.
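
For readers unfamiliar with the check, here is a minimal sketch of the idea in Python. It is not the attached patch and not the C code from qfile.c; the names `entry_matches` and `demo` are made up for illustration, and the temporary directory with a `lib` -> `lib64` symlink is an assumed stand-in for the /usr/lib layout in the report.

```python
import os
import tempfile


def entry_matches(query_path, contents_entry):
    """Return True if contents_entry names the queried file.

    Falls back to resolving symlinks in the entry's directory when the
    basenames agree but the directory strings differ.
    """
    query_dir, query_base = os.path.split(query_path)
    entry_dir, entry_base = os.path.split(contents_entry)
    if query_base != entry_base:
        return False
    if query_dir == entry_dir:
        return True
    # Same idea as "dir_name == realpath(dirname(CONTENTS entry))".
    return query_dir == os.path.realpath(entry_dir)


def demo():
    """Build lib64/ plus a lib -> lib64 symlink and compare both spellings."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.realpath(tmp)  # canonical base, so only "lib" is a symlink
        os.makedirs(os.path.join(root, "lib64", "debug", "sbin"))
        os.symlink("lib64", os.path.join(root, "lib"))
        entry = os.path.join(root, "lib", "debug", "sbin", "tune2fs.debug")
        query = os.path.join(root, "lib64", "debug", "sbin", "tune2fs.debug")
        # (plain string comparison, comparison with the realpath fallback)
        return query == entry, entry_matches(query, entry)
```

`demo()` compares the two spellings first as plain strings and then through `entry_matches`; only the second comparison takes the directory symlink into account, which is the behavior that was lost when the block is skipped.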

Reproducible: Always

Portage 2.2_rc67 (default/linux/amd64/10.0, gcc-4.4.3, glibc-2.11.1-r0, 2.6.33-gentoo-r2-1 x86_64)
=================================================================
System uname: Linux-2.6.33-gentoo-r2-1-x86_64-Intel-R-_Core-TM-2_Duo_CPU_E8500_@_3.16GHz-with-gentoo-2.0.1
Timestamp of tree: Sun, 09 May 2010 13:15:02 +0000
app-shells/bash:     4.1_p5
dev-java/java-config: 2.1.11
dev-lang/python:     2.6.5-r2, 3.1.2-r3
dev-python/pycrypto: 2.1.0
dev-util/cmake:      2.8.1-r1
sys-apps/baselayout: 2.0.1
sys-apps/openrc:     0.6.1-r1
sys-apps/sandbox:    2.2
sys-devel/autoconf:  2.13, 2.65
sys-devel/automake:  1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils:  2.20.1-r1
sys-devel/gcc:       4.4.3-r2
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.6b
virtual/os-headers:  2.6.30-r1
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA dlj-1.1 sun-bcla-java-vm"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=core2 -O2 -ggdb -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/X11/xkb /usr/share/config /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/eselect/postgresql /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/splash /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=core2 -O2 -ggdb -pipe"
DISTDIR="/var/portage/distfiles"
FEATURES="assume-digests buildpkg distlocks fixpackages news parallel-fetch preserve-libs protect-owned sandbox sfperms splitdebug strict unmerge-logs unmerge-orphans userfetch usersync"
GENTOO_MIRRORS="http://mirror.ovh.net/gentoo-distfiles/ ftp://ftp.free.fr/mirrors/ftp.gentoo.org/ ftp://ftp.first-world.info/ "
LANG="fr_FR.UTF-8"
LDFLAGS="-Wl,-O1,--hash-style=gnu,--sort-common -Wl,--as-needed"
LINGUAS="en_US en fr_FR fr"
MAKEOPTS="-j3"
PKGDIR="/var/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/var/portage/tree"
PORTDIR_OVERLAY="/var/portage/overlays/tgl /var/portage/overlays/bugzilla /var/portage/layman/sunrise /var/portage/layman/nx /var/portage/layman/mrpouet /var/portage/layman/xwing /var/portage/layman/java-overlay"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="X a52 aac acl acpi akonadi alsa amd64 apache2 bash-completion berkdb branding bzip2 cairo cdda cddb cdparanoia cdr cli consolekit cracklib crypt cups cvs cxx dbus dga dri dts dv dvd dvdr encode exif fam ffmpeg flac fontconfig fuse gdbm gif gimp git glib gnome gnome-keyring gnutls gpm graphviz gstreamer gtk hal iconv id3tag ieee1394 imagemagick imap java java5 java6 jpeg jpeg2k latex lcms libnotify logrotate lua mad matroska mikmod mmx mng modules mp3 mpeg mudflap multilib musepack musicbrainz nautilus ncurses network nls nntp nptl nptlonly ogg openexr opengl openmp pam pango pch pcre pdf pg-intdatetime plasma plotutils png policykit pppd python qt3support qt4 raw readline reflection sasl sdl semantic-desktop session sndfile spell spl sse sse2 ssl startup-notification subversion svg sysfs taglib tcpd theora threads tiff truetype udev unicode usb v4l2 vim-syntax vorbis wavpack wma wmf x264 xattr xcb xcomposite xface xinerama xml xmp xorg xosd xpm xulrunner xv xvid xvmc zlib" ALSA_CARDS="hda-intel" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en_US en fr_FR fr" RUBY_TARGETS="ruby18" SANE_BACKENDS="epson" USERLAND="GNU" VIDEO_CARDS="i810 intel" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit 
sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" 
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 TGL 2010-05-09 15:59:33 UTC
Created attachment 230867
qfile-fix-realpath-checks.patch

Patch against CVS HEAD.
Comment 2 SpanKY gentoo-dev 2010-06-08 05:10:12 UTC
i don't think that's the way we want to address this.  the old code was resolving things by accident because of the new root logic being added.  i disabled that as an optimization.  that block of code does a lot more than just resolve symlinks.

i also dislike not being able to query for symlinks themselves.  i guess we need a new option to control symlink behavior, and then key off of that in a new "else" case where it *only* does a realpath() ... not the ROOT related checks too.
Comment 3 TGL 2010-06-08 09:03:09 UTC
(In reply to comment #2)
> the old code was resolving things by accident because of the new root logic being added.

That's not true.

I introduced this "dir_name == realpath(dirname(CONTENTS))" check in bug #130004 (r1.30), and the goal was exactly that: not missing actual matches because of directory symlinks.

I then improved this check so that it also behaves correctly when "ROOT != /" in bug #142217 (r1.35), but it was not by accident that the "ROOT == /" case was still handled correctly.


> i disabled that as an optimization.  that block of code does a lot more 
> than just resolve symlinks.

I know what it does.  Disabling it gives incorrect results, and is also not really a significant optimization.

Remember that you only enter this block when the basenames of the CONTENTS entry and of the target file match, but their dirnames do not.  This is a rare corner case.  On my system, the worst occurrence is when searching for a file named "README.bz2" (802 entries in my VDB).
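
To make the "rare corner case" point more concrete, here is a rough Python sketch (again, not code from qfile.c) that sorts CONTENTS paths into three buckets for one query. The name `classify_entries` and the bucket labels are hypothetical; only entries landing in the "needs_realpath" bucket would cost a realpath() call.

```python
import os


def classify_entries(query_path, contents_entries):
    """Count how each CONTENTS path relates to the queried path.

    skipped        -- basename differs, nothing more to do
    direct         -- basename and dirname are identical strings
    needs_realpath -- basename matches but dirname differs, so the
                      entry's directory would have to be resolved
    """
    query_dir, query_base = os.path.split(query_path)
    counts = {"skipped": 0, "direct": 0, "needs_realpath": 0}
    for entry in contents_entries:
        entry_dir, entry_base = os.path.split(entry)
        if entry_base != query_base:
            counts["skipped"] += 1
        elif entry_dir == query_dir:
            counts["direct"] += 1
        else:
            counts["needs_realpath"] += 1
    return counts
```

Fed with every path recorded in the VDB, the "needs_realpath" count for a README.bz2 query would correspond to the 802 entries mentioned above (minus the one direct match, if any), while most queries would leave that bucket empty or nearly so.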

Here are some benchmarks of this worst case:

=== without the attached patch ===

% sync; echo 3 > /proc/sys/vm/drop_caches

% time ./q file /usr/share/doc/vlc-1.0.6/README.bz2
media-video/vlc (/usr/share/doc/vlc-1.0.6/README.bz2)

real	0m13.696s
user	0m0.152s
sys	0m0.296s

% time ./q file /usr/share/doc/vlc-1.0.6/README.bz2
media-video/vlc (/usr/share/doc/vlc-1.0.6/README.bz2)

real	0m0.190s
user	0m0.160s
sys	0m0.024s


=== with the attached patch ===

% sync; echo 3 > /proc/sys/vm/drop_caches

% time ./q file /usr/share/doc/vlc-1.0.6/README.bz2
media-video/vlc (/usr/share/doc/vlc-1.0.6/README.bz2)

real	0m14.622s
user	0m0.120s
sys	0m0.356s

% time ./q file /usr/share/doc/vlc-1.0.6/README.bz2
media-video/vlc (/usr/share/doc/vlc-1.0.6/README.bz2)

real	0m0.182s
user	0m0.156s
sys	0m0.020s


And here is another benchmark with many files:

% find /usr/lib* -type f -o -type l > /tmp/libs.list
% wc -l /tmp/libs.list 
70815 /tmp/libs.list

=== without the attached patch ===

% sync; echo 3 > /proc/sys/vm/drop_caches

% time ./q file -o -f /tmp/libs.list | wc -l
29788

real	1m47.681s
user	1m25.441s
sys	0m1.308s

% time ./q file -o -f /tmp/libs.list | wc -l
29788

real	1m23.295s
user	1m22.545s
sys	0m0.700s

=== with the attached patch ===

% sync; echo 3 > /proc/sys/vm/drop_caches

% time ./q file -o -f /tmp/libs.list | wc -l
15137

real	1m55.669s
user	1m27.021s
sys	0m3.804s

% time ./q file -o -f /tmp/libs.list | wc -l
15137

real	1m26.454s
user	1m23.217s
sys	0m3.132s


Sure, your version is a bit faster here, but at the price of about 50% false positives in its output (29788 files reported, versus 15137 with the patch).


> i also dislike not being able to query for symlinks themselves.

Huh?  You can query for symlinks themselves; that's exactly what "qfile /path/to/the/symlink" does.  What you can't do is automagically query for its target.  But I must be misunderstanding something here, because you know that already (bug #317471).

> i guess we need a new option to control symlink behavior, and then key off of 
> that in a new "else" case where it *only* does a realpath() ... not the ROOT 
> related checks too.

I don't understand what you're suggesting.  Could you give an example of this alternative behavior?
Comment 4 TGL 2010-06-08 20:25:19 UTC
I've had an idea to reduce the slight performance penalty introduced by the "realpath(dirname(CONTENTS))" check I want to reactivate: reuse the previously calculated realpath when the dirname has not changed since the previous entry.  Because CONTENTS files are sorted, this can save a good part of the "realpath()" calls.
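
As an illustration of that idea only (this is a Python sketch, not the patch attached to this bug), a one-slot cache keyed on the last dirname could look like the following. `DirnameResolver` and `resolve_sorted` are hypothetical names, and the `resolve` parameter is an assumption added so the sketch can be exercised without touching the filesystem.

```python
import os


class DirnameResolver:
    """Resolve directory names, remembering only the most recent one."""

    def __init__(self, resolve=os.path.realpath):
        self._resolve = resolve   # injectable so the sketch can be tested
        self._last_dir = None
        self._last_real = None
        self.calls = 0            # number of actual resolve() invocations

    def realpath(self, dirname):
        # Sorted input keeps entries of one directory adjacent, so a
        # single remembered result is enough to skip repeated lookups.
        if dirname != self._last_dir:
            self._last_real = self._resolve(dirname)
            self._last_dir = dirname
            self.calls += 1
        return self._last_real


def resolve_sorted(entries, resolver):
    """Return the resolved directory of each entry, in input order."""
    return [resolver.realpath(os.path.dirname(e)) for e in entries]
```

The sketch resolves the directory of every entry to keep it short; a real scan would only do so for entries whose basename matches the query. A single remembered slot is chosen over a full dictionary because sorted input makes consecutive repeats the common case, and it needs no extra memory management.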

Here are the figures, to compare with the ones from my previous comment.

=== with new patch ===

% sync; echo 3 > /proc/sys/vm/drop_caches

% time ./q file -o -f /tmp/libs.list | wc -l
15137

real	1m55.341s
user	1m27.741s
sys	0m1.596s

% time ./q file -o -f /tmp/libs.list | wc -l
15137

real	1m25.486s
user	1m24.477s
sys	0m0.948s

Comment 5 TGL 2010-06-08 20:27:08 UTC
Created attachment 234585
qfile-optimized-fix-for-realpath-checks.patch

The patch implementing the little optimization explained in my previous comment.
Comment 6 SpanKY gentoo-dev 2010-06-10 01:06:20 UTC
you're assuming that i keep track of all this.  i don't use/develop portage-utils every day anymore, which means i forget history.  every once in a while i pound through the bugs since people do use these utilities.

there is a performance regression with the root code handling due to the constant alloc/string copies (xasprintf) which i fixed.  that code branch looks like it existed only to support a ROOT which most commonly is nothing.  you should probably add a comment as to the purpose of the branch if you don't want it being changed again.
Comment 7 SpanKY gentoo-dev 2012-10-28 09:45:52 UTC
seems to work with portage-utils-0.11.  if you're still seeing issues, please check out latest cvs.