Bug 689290 - =app-portage/portage-utils-0.80_pre20190620: 'qfile -o' regressed and does not find some files.
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Fabian Groffen
Reported: 2019-07-04 22:35 UTC by Sergei Trofimovich (RETIRED)
Modified: 2019-07-14 16:30 UTC

Attachments
var-db-pkg-dev-haskell-void-0.7.2.tar.gz (var-db-pkg-dev-haskell-void-0.7.2.tar.gz,27.07 KB, application/gzip)
2019-07-04 22:37 UTC, Sergei Trofimovich (RETIRED)

Description Sergei Trofimovich (RETIRED) gentoo-dev 2019-07-04 22:35:58 UTC
The following file is mis-flagged as an orphan:

$ fgrep -R /usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-void-0.7.2.conf /var/db/pkg/dev-haskell/void-0.7.2/
/var/db/pkg/dev-haskell/void-0.7.2/CONTENTS:obj /usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-void-0.7.2.conf d41d8cd98f00b204e9800998ecf8427e 1562125360

$ qfile -o /usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-void-0.7.2.conf
/usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-void-0.7.2.conf

I noticed this when haskell-updater started reporting a ton of orphan files that are supposed to be tracked by the package manager.

I'm not sure yet what is causing the misdetection. Bisecting locally.
Comment 1 Sergei Trofimovich (RETIRED) gentoo-dev 2019-07-04 22:37:35 UTC
Created attachment 581932
var-db-pkg-dev-haskell-void-0.7.2.tar.gz

var-db-pkg-dev-haskell-void-0.7.2.tar.gz is a compressed tarball of /var/db/pkg/dev-haskell/void-0.7.2/
Comment 2 Sergei Trofimovich (RETIRED) gentoo-dev 2019-07-04 22:51:37 UTC
Bisect converged at f855d0f4f7c3e6e570a1ad3dc98d737e78996e4a.
Reverting this commit on top of master restores 'qfile -o'.

$ git checkout . && git bisect bad
Updated 0 paths from the index
f855d0f4f7c3e6e570a1ad3dc98d737e78996e4a is the first bad commit
commit f855d0f4f7c3e6e570a1ad3dc98d737e78996e4a
Author: Fabian Groffen <grobian@gentoo.org>
Date:   Fri May 10 17:23:25 2019 +0200

    libq/tree: make pkg sorting based on atom_compare
    
    Using alphasort on pkgs makes little sense because they include version
    information that needs careful extraction and matching rules as
    implemented by atom_compare.
    
    In order to use atom_compare efficiently, that is, reusing the
    atom_explode work done for the elements while running qsort, use
    tree_get_atom, which caches the retrieved atom.  Extra bonus is that any
    function that retrieves the atom afterwards gets it for free.  This
    speeds up significantly apps that need to construct atoms, such as
    qkeywords.
    
    Signed-off-by: Fabian Groffen <grobian@gentoo.org>

 libq/tree.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++--------------
 libq/tree.h |  2 +-
 2 files changed, 51 insertions(+), 16 deletions(-)

$ git bisect log
# bad: [9836a593874dd3459c8ec1035635ede29d3afbfa] main: default main_overlay to first overlay
# good: [5539ab9cf34f303b7e11c5989d8cde8f1ed57043] rm_rf_at: ensure return code makes sense
git bisect start 'master' 'v0.74'
# good: [7cf702111a7350b17443f4d9d0d76138b503dac3] libq/tree: merge vdb and cache
git bisect good 7cf702111a7350b17443f4d9d0d76138b503dac3
# good: [7cf702111a7350b17443f4d9d0d76138b503dac3] libq/tree: merge vdb and cache
git bisect good 7cf702111a7350b17443f4d9d0d76138b503dac3
# bad: [4551aa7c34b13bf71359d288cd2bee39eed61590] tests/qmanifest: attempt to verify gpg setup
git bisect bad 4551aa7c34b13bf71359d288cd2bee39eed61590
# bad: [4dc16c6cfcf2c3a4d2a439ee93999a4cd1864af2] qmerge: ensure we respect SLOT while finding candidate package to unmerge
git bisect bad 4dc16c6cfcf2c3a4d2a439ee93999a4cd1864af2
# bad: [0486e2a62cdce058a9a569940a93dae1f0442f1a] libq/atom: split out SLOT and SUBSLOT for atom_format
git bisect bad 0486e2a62cdce058a9a569940a93dae1f0442f1a
# good: [c64290307919bcd65e816277914309b4423be308] gitignore: ignore generated files
git bisect good c64290307919bcd65e816277914309b4423be308
# bad: [c7e04780a3161d6e8785c175680751839f9d768b] qgrep: use tree_get_atom
git bisect bad c7e04780a3161d6e8785c175680751839f9d768b
# bad: [2977f24478a673ff869bb6d26bf69b90b099deb5] qkeyword: optimise away redundant atom_explode calls
git bisect bad 2977f24478a673ff869bb6d26bf69b90b099deb5
# bad: [f855d0f4f7c3e6e570a1ad3dc98d737e78996e4a] libq/tree: make pkg sorting based on atom_compare
git bisect bad f855d0f4f7c3e6e570a1ad3dc98d737e78996e4a
# first bad commit: [f855d0f4f7c3e6e570a1ad3dc98d737e78996e4a] libq/tree: make pkg sorting based on atom_compare
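
The rationale in the bisected commit message (plain alphabetical sorting versus version-aware ordering) can be illustrated with a small sketch. This is not atom_compare from portage-utils: `cmp_dotted` is a hypothetical, much simplified stand-in that only understands dotted decimal versions and ignores suffixes, revisions and the other rules a real atom comparison needs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of libq): compare two dotted decimal
 * version strings such as "0.7.2" and "0.7.10" component by component.
 * Missing components count as 0, so "1.0" equals "1".
 * Returns <0, 0 or >0 like strcmp. */
int
cmp_dotted(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);

	for (;;) {
		char *ea, *eb;
		unsigned long na = strtoul(a, &ea, 10);
		unsigned long nb = strtoul(b, &eb, 10);

		if (na != nb)
			return na < nb ? -1 : 1;
		/* stop when neither string has a further ".N" component */
		if (*ea != '.' && *eb != '.')
			return 0;
		/* step over the separator where there is one */
		a = (*ea == '.') ? ea + 1 : ea;
		b = (*eb == '.') ? eb + 1 : eb;
	}
}
```

With this helper, cmp_dotted("0.10", "0.9") is positive, while a byte-wise comparison such as strcmp orders "0.10" before "0.9" because '1' sorts before '9'. That difference is the reason alphabetical sorting is a poor fit for package names that carry a version.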
Comment 3 Fabian Groffen gentoo-dev 2019-07-05 05:36:02 UTC
Interesting, thanks for the bisect. Apparently the tests don't test the same thing.
Comment 4 Fabian Groffen gentoo-dev 2019-07-05 06:46:21 UTC
Disturbing:

% env ROOT=${PWD}/vdb Q_VDB=/ ./qlist -Iv
dev-haskel/void-0.7.2
% env ROOT=${PWD}/vdb Q_VDB=/ ./qfile -o /usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-void-0.7.2.conf
% env ROOT=${PWD}/vdb Q_VDB=/ ./qfile /usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-void-0.7.2.conf
dev-haskel/void: /usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-void-0.7.2.conf
Comment 5 Fabian Groffen gentoo-dev 2019-07-05 06:49:58 UTC
Out of curiosity, does tests/qfile run successfully on your system?  I'm a bit confused as to how the pkg sorting order can influence this bug.
Comment 6 Sergei Trofimovich (RETIRED) gentoo-dev 2019-07-05 08:23:43 UTC
Seems to pass with both. When run against the git tree (make check):

/home/slyfox/dev/git/portage-utils/tests/qfile/dotest
PASS: q file -Cq /bin/bash /bin/XXXXX
PASS: q file -Co /bin/bash /bin/XXXXX
PASS: q file -Co -x bash /bin/bash
PASS: q file -Co -x app-shells/bash /bin/bash
PASS: q file -Co -x bash:0 /bin/bash
PASS: q file -Co -x app-shells/bash:0 /bin/bash
PASS: (cd /home/slyfox/dev/git/portage-utils/tests/qfile/root/bin; q file -RCq bash)
PASS: (cd /home/slyfox/dev/git/portage-utils/tests/qfile/root/; q file -Co whatever)
qfile: 8 passes / 0 fails

and against the ::gentoo ebuild (FEATURES=test emerge -v1 app-portage/portage-utils):

/tmp/portage-tmpdir/portage/app-portage/portage-utils-0.80_pre20190620/work/portage-utils-0.80_pre20190620/tests/qfile/dotest
PASS: q file -Cq /bin/bash /bin/XXXXX
PASS: q file -Co /bin/bash /bin/XXXXX
PASS: q file -Co -x bash /bin/bash
PASS: q file -Co -x app-shells/bash /bin/bash
PASS: q file -Co -x bash:0 /bin/bash
PASS: q file -Co -x app-shells/bash:0 /bin/bash
PASS: (cd /tmp/portage-tmpdir/portage/app-portage/portage-utils-0.80_pre20190620/work/portage-utils-0.80_pre20190620/tests/qfile/root/bin; q file -RCq bash)
PASS: (cd /tmp/portage-tmpdir/portage/app-portage/portage-utils-0.80_pre20190620/work/portage-utils-0.80_pre20190620/tests/qfile/root/; q file -Co whatever)
qfile: 8 passes / 0 fails

No failures reported.
Comment 7 Sergei Trofimovich (RETIRED) gentoo-dev 2019-07-05 08:36:47 UTC
asan detects only a memory leak:

$ ./q qfile -v -o /usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-void-0.7.2.conf
/usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-void-0.7.2.conf

=================================================================
==31380==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 23 byte(s) in 1 object(s) allocated from:
    #0 0x7fe03746d838 in __interceptor_malloc /usr/src/debug/sys-devel/gcc-9.1.0-r1/gcc-9.1.0/libsanitizer/asan/asan_malloc_linux.cc:144
    #1 0x5628d44f8aca in xmalloc (/home/slyfox/dev/git/portage-utils/q+0xcbaca)
    #2 0x5628d44f8c2f in xmemdup (/home/slyfox/dev/git/portage-utils/q+0xcbc2f)
    #3 0x5628d44f8c72 in xstrdup (/home/slyfox/dev/git/portage-utils/q+0xcbc72)
    #4 0x5628d4478d99 in set_portage_env_var (/home/slyfox/dev/git/portage-utils/q+0x4bd99)
    #5 0x5628d447a098 in read_portage_env_file (/home/slyfox/dev/git/portage-utils/q+0x4d098)
    #6 0x5628d4479720 in read_portage_env_file (/home/slyfox/dev/git/portage-utils/q+0x4c720)
    #7 0x5628d4479720 in read_portage_env_file (/home/slyfox/dev/git/portage-utils/q+0x4c720)
    #8 0x5628d447bee1 in initialize_portage_env (/home/slyfox/dev/git/portage-utils/q+0x4eee1)
    #9 0x5628d447d106 in main (/home/slyfox/dev/git/portage-utils/q+0x50106)
    #10 0x7fe036dfaf2a in __libc_start_main ../csu/libc-start.c:308

ubsan detects something minor:

$ ./configure CFLAGS='-fsanitize=undefined' LDFLAGS='-fsanitize=undefined'
$ make
$ tree.c:363:5: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-void-0.7.2.conf

The following gets rid of the warning but does not fix the problem:

@@ -359,7 +359,7 @@ tree_next_pkg_int(tree_cat_ctx *cat_ctx)
                                        cat_ctx->pkg_cnt--;
                        }
 
-                       if (cat_ctx->ctx->pkgsortfunc != NULL) {
+                       if (cat_ctx->pkg_ctxs != NULL && cat_ctx->ctx->pkgsortfunc != NULL) {
                                qsort(cat_ctx->pkg_ctxs, cat_ctx->pkg_cnt,
                                                sizeof(*cat_ctx->pkg_ctxs), cat_ctx->ctx->pkgsortfunc);
                        }
Comment 8 Fabian Groffen gentoo-dev 2019-07-05 09:36:33 UTC
(In reply to Sergei Trofimovich from comment #7)
> asan detects only a memory leak:
> 
>     #8 0x5628d447bee1 in initialize_portage_env

This is an expected/intended leak, unless I interpret asan's output incorrectly.  I'll run valgrind later on this to confirm.

> ubsan detects something minor:
> 
> $ ./configure CFLAGS='-fsanitize=undefined' LDFLAGS='-fsanitize=undefined'
> $ make
> $ tree.c:363:5: runtime error: null pointer passed as argument 1, which is
> declared to never be null
> /usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-void-0.7.2.conf
> 
> The following gets rid of the warning but does not fix the problem:
> 
> @@ -359,7 +359,7 @@ tree_next_pkg_int(tree_cat_ctx *cat_ctx)
>                                         cat_ctx->pkg_cnt--;
>                         }
>  
> -                       if (cat_ctx->ctx->pkgsortfunc != NULL) {
> +                       if (cat_ctx->pkg_ctxs != NULL &&

At first sight, this cannot be the right approach. It indicates some bigger problem (like an unhandled empty-dir scenario or something).  I'll have to look at it later as well.

Like you reported, it isn't the problem you're seeing.  Somehow the sort order either makes the problematic case (as here) appear too early, so that the code doesn't "see" this entry, or something else is messing it up.

I'm leaving for holidays now, so I cannot commit to a timeframe for this, sorry.  Thanks for your evaluations/help so far!
Comment 9 Larry the Git Cow gentoo-dev 2019-07-12 18:04:51 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage-utils.git/commit/?id=c720262bfd9a31512b03f2e3129839886d9d27c6

commit c720262bfd9a31512b03f2e3129839886d9d27c6
Author:     Fabian Groffen <grobian@gentoo.org>
AuthorDate: 2019-07-12 17:59:20 +0000
Commit:     Fabian Groffen <grobian@gentoo.org>
CommitDate: 2019-07-12 17:59:20 +0000

    libq/tree: avoid calling qsort with empty set
    
    Encountering an empty directory in tree_next_pkg_int will not populate
    cat_ctx->pkg_ctxs, so avoid calling qsort with it.  While at it, avoid
    calling it for a single entry too.
    
    Bug: https://bugs.gentoo.org/689290#c7
    Signed-off-by: Fabian Groffen <grobian@gentoo.org>

 libq/tree.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
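
The guard described in this commit message can be sketched as below. This is an illustration only, not the code in libq/tree.c: `sort_if_needed` and `cmp_int` are hypothetical names, and the exact condition used upstream is not shown in this report. The idea is simply that qsort is skipped when the array was never populated (NULL base) or holds fewer than two entries, which also avoids the ubsan "null pointer passed as argument 1" report from comment #7.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: call qsort() only when there is something to
 * sort.  A NULL base (nothing was ever stored) and arrays of zero or
 * one element are left untouched. */
void
sort_if_needed(void *base, size_t cnt, size_t size,
		int (*cmp)(const void *, const void *))
{
	if (base == NULL || cnt < 2 || cmp == NULL)
		return;
	assert(size > 0);
	qsort(base, cnt, size, cmp);
}

/* Example comparison callback for arrays of int. */
int
cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a;
	int y = *(const int *)b;

	return (x > y) - (x < y);
}
```

Calling sort_if_needed(NULL, 0, sizeof(int), cmp_int) is then a no-op, while a populated array is sorted as usual.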
Comment 10 Fabian Groffen gentoo-dev 2019-07-12 18:09:41 UTC
Can I have your entire /var/db/pkg?  I need to reproduce this issue.  It's probably some corner case I haven't seen yet.  Experimenting here has been unsuccessful so far.
Comment 11 Sergei Trofimovich (RETIRED) gentoo-dev 2019-07-13 10:47:48 UTC
My /var/db/pkg is a bit big. I've adapted your command from comment #4 to run on every 'obj' entry from CONTENTS in a copy of /var/db/pkg as:

  $ cat bug.bash 
  #!/bin/bash

  # fetch all object names (files). caveat: space is handled incorrectly
  find dev-haskell -name CONTENTS -exec grep ^obj '{}' \; |
    awk '{print $2}' |
    # query files against db
    ROOT=$(pwd) Q_VDB=/ xargs qfile -o

It's a bit hacky, as it breaks on spaces, but it is good enough to remove most of the seemingly unrelated content:

   $ ./bug.bash
   /usr/lib64/ghc-8.6.5/gentoo/gentoo-dev-haskell-libxml-sax-0.7.5-libxml-sax-0.7.5.conf
   /usr/lib64/ghc-8.6.5/gentoo/gentoo-empty-dev-haskell-libxml-sax-0.7.5.conf
   /usr/lib64/ghc-8.6.5/package.conf.d/libxml-sax-0.7.5-4zvThBcDksZKjZMmnQUocC.conf
   /usr/lib64/libxml-sax-0.7.5/ghc-8.6.5/Text/XML/LibXML/SAX.dyn_hi
   ...

I've uploaded a self-contained subset of /var/db/pkg and the script as:
    
   https://dev.gentoo.org/~slyfox/bugs/vdb-bug-b689290.tar.bz2

It's 20 MB.

It looks like removing enough files makes all the orphans go away, which hints at the amount of input data being the problem and not necessarily a specific entry.
Comment 12 Larry the Git Cow gentoo-dev 2019-07-13 15:37:41 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage-utils.git/commit/?id=61c865750c821381f2f8d3bc93b4e149127d2fdb

commit 61c865750c821381f2f8d3bc93b4e149127d2fdb
Author:     Fabian Groffen <grobian@gentoo.org>
AuthorDate: 2019-07-13 15:32:46 +0000
Commit:     Fabian Groffen <grobian@gentoo.org>
CommitDate: 2019-07-13 15:32:46 +0000

    libq/tree: ensure we don't work on garbage on sorted pkg trees
    
    The contents of dirents come from a static buffer that may get
    re-purposed when the next readdir call is made, so we cannot rely on it
    staying around.  In particular on large directories, the entries will
    get recycled, and hence garbage appear.  Thus, we need to copy the
    entries, and free those copies.  The behaviour before
    f855d0f4f7c3e6e570a1ad3dc98d737e78996e4a was actually using scandir
    which allocates space for all dirents.  We now basically just copy the
    bit we need, instead of the full dirent.
    
    Thanks Sergei Trofimovich (slyfox) for digging into this case and
    providing a VDB which displayed the problem.
    
    Bug: https://bugs.gentoo.org/689290
    Signed-off-by: Fabian Groffen <grobian@gentoo.org>

 libq/tree.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
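
The failure mode described in this commit message can be sketched as follows. This is a minimal illustration under assumptions, not the actual libq/tree.c change: `collect_names`, `free_names` and `cmp_names` are hypothetical helpers. The point it shows is that each name is copied while iterating, because the struct dirent returned by readdir may be overwritten by a later readdir call on the same directory stream.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort() over an array of char *. */
static int
cmp_names(const void *a, const void *b)
{
	return strcmp(*(char * const *)a, *(char * const *)b);
}

/* Hypothetical helper: collect and sort the entry names of a directory.
 * Returns a malloc'ed array of *cnt strings, or NULL (with *cnt == 0)
 * when nothing was collected.  On allocation failure the names gathered
 * so far are returned. */
char **
collect_names(const char *path, size_t *cnt)
{
	DIR *dir;
	struct dirent *de;
	char **names = NULL;
	size_t n = 0, cap = 0;

	assert(path != NULL && cnt != NULL);
	*cnt = 0;
	if ((dir = opendir(path)) == NULL)
		return NULL;

	while ((de = readdir(dir)) != NULL) {
		if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
			continue;
		if (n == cap) {
			size_t ncap = cap == 0 ? 16 : cap * 2;
			char **tmp = realloc(names, ncap * sizeof(*tmp));
			if (tmp == NULL)
				break;
			names = tmp;
			cap = ncap;
		}
		/* copy the name: keeping de->d_name itself would be the bug,
		 * as that storage may be reused by the next readdir() */
		if ((names[n] = strdup(de->d_name)) == NULL)
			break;
		n++;
	}
	closedir(dir);

	if (n == 0) {
		free(names);
		return NULL;
	}
	if (n > 1)
		qsort(names, n, sizeof(*names), cmp_names);
	*cnt = n;
	return names;
}

/* Release an array returned by collect_names(). */
void
free_names(char **names, size_t cnt)
{
	size_t i;

	for (i = 0; i < cnt; i++)
		free(names[i]);
	free(names);
}
```

A caller would pass a category directory (for example a path under /var/db/pkg) and release the result with free_names once done. With small directories the uncopied variant can appear to work, which matches the observation in comment #11 that the amount of input data mattered.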
Comment 13 Fabian Groffen gentoo-dev 2019-07-13 15:41:21 UTC
It turned out to be an embarrassing bug. I'm sure this patch should fix it, but one can never be certain, so confirmation would be appreciated.
Comment 14 Sergei Trofimovich (RETIRED) gentoo-dev 2019-07-13 17:03:27 UTC
Checked on my full vdb locally. Works fine now. Thank you!

One more unrelated valgrind complaint: when 'qfile' is run through xargs (as in the script from comment #11), valgrind detects use of an uninitialised variable due to a failing ioctl():

    ==29323== Conditional jump or move depends on uninitialised value(s)
    ==29323==    at 0x11C7C2: main (main.c:784)

   778  int main(int argc, char **argv)
   779  {
   780          struct stat st;
   781          struct winsize winsz;
   782
   783          ioctl(0, TIOCGWINSZ, &winsz);
   784          twidth = winsz.ws_col > 0 ? (int)winsz.ws_col : 80;
Comment 15 Larry the Git Cow gentoo-dev 2019-07-14 08:37:16 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage-utils.git/commit/?id=339297b3247c8850194a48edf1ce03dbfdef337a

commit 339297b3247c8850194a48edf1ce03dbfdef337a
Author:     Fabian Groffen <grobian@gentoo.org>
AuthorDate: 2019-07-14 08:34:31 +0000
Commit:     Fabian Groffen <grobian@gentoo.org>
CommitDate: 2019-07-14 08:34:31 +0000

    main: rework terminal-based settings somewhat
    
    As pointed out by slyfox, the result from ioctl was ignored and its
    result used anyway.  While at it to fix this, rework the logic somewhat,
    such that terminal width and colours are always disabled when we're not
    dealing with a TTY.
    
    Bug: https://bugs.gentoo.org/689290#c14
    Signed-off-by: Fabian Groffen <grobian@gentoo.org>

 main.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)
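
The issue reported in comment #14 (using winsz after a failed ioctl) can be avoided along the following lines. This is a sketch under assumptions rather than the reworked main.c: `term_width` is a hypothetical helper, and the fallback of 80 columns is taken from the snippet quoted in comment #14.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: return the terminal width for fd, or fallback
 * when fd is not a TTY, the ioctl() fails, or the reported width is 0.
 * winsz is only read after ioctl() has succeeded. */
int
term_width(int fd, int fallback)
{
	struct winsize winsz;

	assert(fallback > 0);
	if (!isatty(fd))
		return fallback;
	if (ioctl(fd, TIOCGWINSZ, &winsz) != 0 || winsz.ws_col == 0)
		return fallback;
	return (int)winsz.ws_col;
}
```

When stdin is a pipe, as under xargs, term_width(0, 80) returns the fallback without ever reading the uninitialised structure.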
Comment 16 Larry the Git Cow gentoo-dev 2019-07-14 16:30:52 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=631b38ecd0e5d3724590ddd53c0ff453d5ddcf45

commit 631b38ecd0e5d3724590ddd53c0ff453d5ddcf45
Author:     Fabian Groffen <grobian@gentoo.org>
AuthorDate: 2019-07-14 16:30:33 +0000
Commit:     Fabian Groffen <grobian@gentoo.org>
CommitDate: 2019-07-14 16:30:46 +0000

    app-portage/portage-utils: bump 0.80 pre
    
    Closes: https://bugs.gentoo.org/688442
    Closes: https://bugs.gentoo.org/689290
    Signed-off-by: Fabian Groffen <grobian@gentoo.org>
    Package-Manager: Portage-2.3.66, Repoman-2.3.11

 app-portage/portage-utils/Manifest                    |  2 +-
 app-portage/portage-utils/metadata.xml                |  1 +
 ...0.ebuild => portage-utils-0.80_pre20190714.ebuild} | 19 ++++++++++++++++++-
 app-portage/portage-utils/portage-utils-9999.ebuild   | 19 ++++++++++++++++++-
 4 files changed, 38 insertions(+), 3 deletions(-)