
Bug 872617

Summary: sys-apps/portage: auto-deduplicate (CoW) against specified tree
Product: Portage Development
Reporter: Michał Górny <mgorny>
Component: Conceptual/Abstract Ideas
Assignee: Portage team <dev-portage>
Status: CONFIRMED
Severity: enhancement
CC: flow
Priority: Normal    
Version: unspecified   
Hardware: All   
OS: Linux   
See Also: https://bugs.gentoo.org/show_bug.cgi?id=722270
Whiteboard:
Package list:
Runtime testing required: ---

Description Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2022-09-24 07:46:45 UTC
Let's consider the following scenario.  I have a development container in /var/lib/machines/gentoo-amd64 where I build most of the software.  Then I use the binpkgs there to speed up upgrades of my main system.  As a result, most (but not all) files from rootfs are duplicates of files in the development container.

I would find it really helpful to have an auto-deduplicate function in Portage.  Basically, I'd set something like:

  AUTO_DEDUPLICATE="/var/lib/machines/gentoo-amd64"

and while merging, Portage would compare every file being installed against "${AUTO_DEDUPLICATE}/${path}" and, if the two were identical, use the equivalent of `cp --reflink=auto` from that path instead of installing the file from the image.
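
To make the idea concrete, here is a minimal sketch of the per-file step, not Portage code. The function name `install_with_dedupe` and its parameters are hypothetical; the sketch assumes Linux, uses the FICLONE ioctl as the reflink primitive, and falls back to a plain copy wherever reflinking is not possible (different filesystem, no CoW support), which mirrors `--reflink=auto`.

```python
import fcntl
import filecmp
import os
import shutil

# Linux ioctl request number for FICLONE (_IOW(0x94, 9, int)):
# make the destination share all extents of the source file.
FICLONE = 0x40049409


def install_with_dedupe(image_file, dest_file, dedupe_root, rel_path):
    """Install image_file at dest_file (hypothetical helper).

    If dedupe_root/rel_path is a regular file with identical content,
    try to reflink from it; otherwise copy from the image.
    Returns "reflink" or "copy" to report which path was taken.
    """
    candidate = os.path.join(dedupe_root, rel_path.lstrip("/"))
    if (
        os.path.isfile(candidate)
        and not os.path.islink(candidate)
        and filecmp.cmp(image_file, candidate, shallow=False)  # byte-wise compare
    ):
        try:
            with open(candidate, "rb") as src, open(dest_file, "wb") as dst:
                fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            # Metadata (mode, timestamps) still comes from the image.
            shutil.copystat(image_file, dest_file)
            return "reflink"
        except OSError:
            # Reflink unsupported here; fall through to a normal copy.
            pass
    shutil.copy2(image_file, dest_file)
    return "copy"
```

A real implementation would also have to handle ownership, symlinks, and atomic replacement of the destination, all of which this sketch leaves out.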

This should yield lasting space savings without having to run duperemove repeatedly and pay its performance cost each time.
Comment 1 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2022-09-24 08:27:09 UTC
Hmm, or instead of copying from the dedupe source, it could try calling the kernel's deduplication ioctl (FIDEDUPERANGE); that would probably be both easier and safer.