Created attachment 590978 [details, diff] Fix "qpkg -c" always fails when called. # qpkg -c free(): double free detected in tcache 2 Aborted I've attached a patch removing the second free(). Looks like tree_get_atoms() is already calling tree_close() and we then are calling it again after the function... Here the debug output leading to it: # gdb /usr/bin/qpkg /usr/portage/packages/core GNU gdb (Gentoo 8.3 vanilla) 8.3 Copyright (C) 2019 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-pc-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <https://bugs.gentoo.org/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from /usr/bin/qpkg... warning: core file may not match specified executable file. [New LWP 30827] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". Core was generated by `qpkg -c'. Program terminated with signal SIGABRT, Aborted. #0 0x00007f24301d95fb in raise () from /lib64/libc.so.6 (gdb) bt #0 0x00007f24301d95fb in raise () from /lib64/libc.so.6 #1 0x00007f24301c2539 in abort () from /lib64/libc.so.6 #2 0x00007f243021ffa9 in ?? () from /lib64/libc.so.6 #3 0x00007f2430227eaa in ?? () from /lib64/libc.so.6 #4 0x00007f2430229a7d in ?? () from /lib64/libc.so.6 #5 0x0000557d0290208f in qpkg_clean (dirp=0x557d03a27450 "/usr/portage/packages") at qpkg.c:146 #6 qpkg_main (argc=2, argv=0x7ffc4b1cb788) at qpkg.c:356 #7 0x0000557d028ee7fb in main (argc=2, argv=0x7ffc4b1cb788) at main.c:833 (gdb) frame 5 #5 0x0000557d0290208f in qpkg_clean (dirp=0x557d03a27450 "/usr/portage/packages") at qpkg.c:146 146 } (gdb) list 141 142 t = tree_open_vdb(portroot, portvdb); 143 if (t != NULL) { 144 vdb = tree_get_atoms(t, true, vdb); 145 tree_close(t); 146 } 147 148 if (eclean) { 149 size_t n; 150 const char *overlay; (gdb)
the fix is incomplete, more is broken in the new version: While qpkg no longer crashes it's now removing all binary packages, regardless if the package is installed or not. compared to app-portage/portage-utils-0.74 this is a regression.
I looked into the remaining issue but that one is more invasive to fix: The problem seems to be that we now map all installed packages of a system into a set with max 128 entries. qpkg can therefore any see max 128 packages as installed while my system has e.g. 5580. This mapping is of course also not collision free, so whatever packets are finally detected as "installed" is basically random. So the real second bug here is, that all but max 128 binary packages are removed when calling "qpkg -c" when the double free has been fixed. More technically the problem is, that the new portage-utils is using sets - add_set_unique() - to detect installed packages and that sets in the new version are using hashes with a max spread of 128. (Not sure if other users of add_set_unique() are all ok with _SET_HASH_SIZE set to 128 but at least for qpkg it can't work.) Now we could drastically increase the _SET_HASH_SIZE, to an value we have some confidence that it will cover all packages collision free. But then just stop using hashes in add_set_unique() - or a replacement function for qpkg only - again seems to be better approach. Also I'm kind of surprised that two such obvious bugs were not found in ~amd64. Are there other tools you can clean up uninstalled packages on a binhost or why I'm the only ones using qpkg -c regularly?
It's a chained hash, so the set isn't limited to 128 unique entries.
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage-utils.git/commit/?id=5826ee815228775b61ed1587e1346f764dc93945 commit 5826ee815228775b61ed1587e1346f764dc93945 Author: Fabian Groffen <grobian@gentoo.org> AuthorDate: 2019-09-25 15:04:23 +0000 Commit: Fabian Groffen <grobian@gentoo.org> CommitDate: 2019-09-25 15:04:23 +0000 libq/tree: don't close tree in tree_get_atoms The caller passes tree, so the caller should clean it up. This fixes double frees observed in bug #695586 Bug: https://bugs.gentoo.org/695586 Signed-off-by: Fabian Groffen <grobian@gentoo.org> libq/tree.c | 1 - 1 file changed, 1 deletion(-)
I don't understand the code, yet. Guess I have to study it a bit more. But just found something really curious: As soon as I use a pipe it's not planning to delete the package. Here a sample with just cat. It's 100% replicable: Perry ~ # qpkg -pc | cat * Total space that would be freed in packages directory: 0 K Perry ~ # qpkg -pc [ 168 K ] app-portage/portage-utils-0.80 * Total space that would be freed in packages directory: 168 K Perry ~ # Since qpkg -c already purged all installed packages I have now just one - portage-utils-0.80 - as binary package left.
Just to be clear. I'm terribly sorry that qpkg misbehaved on your system. Thanks for your investigations, I'm taking this seriously, and if I can reproduce I'm sure I can fix the problem!
No harm done on any of my systems, I even could recover most of the binary packages and compile the few missing ones new. Just don't need them any longer. I'm sure I can track the issue down if you have problems reproducing it. But that redirecting the output to a pipe is changing the packages which gets deleted is ... insane. Really curios how that can happen. Here some more samples: Perry ~ # qpkg -pc [ 168 K ] app-portage/portage-utils-0.80 * Total space that would be freed in packages directory: 168 K Perry ~ # qpkg -pc | cat * Total space that would be freed in packages directory: 0 K Perry ~ # qpkg -pc > 1; cat 1 * Total space that would be freed in packages directory: 0 K Perry ~ # qpkg -pc 2> 1; cat 1 [ 168 K ] app-portage/portage-utils-0.80 * Total space that would be freed in packages directory: 168 K Perry ~ # Just dug a bit deeper here: When I redirect the output portage-utils-0.80 gets assigned pos 22, which is the "correct" one. Without redirection it's pos 113! But the string we hash is identical. So the hash function is acting differently?!? And indeed that seems to be the issue: Without output redirection we hash the following bytes for "portage-utils-0.80": 1b 5b 30 30 3b 30 30 6d 1b 5b 33 36 3b 30 31 6d 70 6f 72 74 61 67 65 2d 75 74 69 6c 73 1b 5b 30 30 3b 30 30 6d 1b 5b 30 30 3b 33 36 6d 2d 30 2e 38 30 1b 5b 30 30 3b 30 30 6d Which translates to the following string with hexdump:".[00;00m.[36;01mportage-utils.[00;00m.[00;36m-0.80.[00;00m" With output redirection it's: 70 6f 72 74 61 67 65 2d 75 74 69 6c 73 2d 30 2e 38 30 Which is just plain ascii for "portage-utils-0.80 Looks like there are still some colouring instructions or similar applied and we have to normalize the input first?
ah, yes, I should've used atom_to_string, not atom_format, which includes colour escapes...
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage-utils.git/commit/?id=84c003caddc16e7455d6d8d07d60d1c868ea5aa1 commit 84c003caddc16e7455d6d8d07d60d1c868ea5aa1 Author: Fabian Groffen <grobian@gentoo.org> AuthorDate: 2019-09-26 12:59:46 +0000 Commit: Fabian Groffen <grobian@gentoo.org> CommitDate: 2019-09-26 12:59:46 +0000 libq/tree: fix tree_get_atoms not to use atom_format atom_format uses colour escapes, which messes up the hashing used in the set. Thanks Alexander Wetzel. Bug: https://bugs.gentoo.org/695586 Signed-off-by: Fabian Groffen <grobian@gentoo.org> libq/tree.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage-utils.git/commit/?id=ea7f00529fa689a6624f6b26b5d3b7197f4fbfd5 commit ea7f00529fa689a6624f6b26b5d3b7197f4fbfd5 Author: Fabian Groffen <grobian@gentoo.org> AuthorDate: 2019-09-28 13:18:31 +0000 Commit: Fabian Groffen <grobian@gentoo.org> CommitDate: 2019-09-28 13:18:31 +0000 qpkg: don't emit hypothetical messages when real work has been done drop the "would be" part of how many bytes were freed when we're not pretending Bug: https://bugs.gentoo.org/695586 Signed-off-by: Fabian Groffen <grobian@gentoo.org> applets.h | 2 +- man/qpkg.1 | 4 ++-- qpkg.c | 8 +++++--- 3 files changed, 8 insertions(+), 6 deletions(-)
The bug has been closed via the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=0b04f542749e2fb95c627364bf58b075c164c4f6 commit 0b04f542749e2fb95c627364bf58b075c164c4f6 Author: Fabian Groffen <grobian@gentoo.org> AuthorDate: 2019-10-20 10:25:14 +0000 Commit: Fabian Groffen <grobian@gentoo.org> CommitDate: 2019-10-20 10:25:38 +0000 app-portage/portage-utils: version bump to v0.81 - 697094: qfile incorrectly matching /usr/lib - 697068: qlop -r not showing ongoing merges from parallel merges - 696078: qgrep not matching revisioned ebuilds - 695586: qpkg double frees and incorrect unpacking - 694972: qlop -r/-a speedups - 692224: qlop support for alternate ROOT - 677982: qfile report matches from prune lib registry Closes: https://bugs.gentoo.org/697094 Closes: https://bugs.gentoo.org/697068 Closes: https://bugs.gentoo.org/696078 Closes: https://bugs.gentoo.org/695586 Closes: https://bugs.gentoo.org/694972 Closes: https://bugs.gentoo.org/692224 Closes: https://bugs.gentoo.org/677982 Package-Manager: Portage-2.3.76, Repoman-2.3.16 Signed-off-by: Fabian Groffen <grobian@gentoo.org> app-portage/portage-utils/Manifest | 1 + .../portage-utils/portage-utils-0.81.ebuild | 87 ++++++++++++++++++++++ 2 files changed, 88 insertions(+)