Bug 45221 - "selective sync" in portage
Summary: "selective sync" in portage
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All All
Importance: High enhancement
Assignee: Portage team
Duplicates: 44526
Depends on:
Reported: 2004-03-20 09:51 UTC by Lorenz Kiefner
Modified: 2007-01-11 12:54 UTC


Description Lorenz Kiefner 2004-03-20 09:51:37 UTC
The Portage tree is growing fast. In the last 6 months about 10,000 files were added, and the directory "/usr/portage" now consumes about 280 MB. An additional 80 MB or so is consumed in "/var/db".
On the other hand, many installations are on routers or other machines with limited resources, on which there is no need for X and 20+ window managers, for example.
I know that "rsync_excludes" exists, but I think it is hard for individuals to keep up to date.
I propose implementing a "selective sync" based on predefined profiles, e.g. "system", "gnome", "kde", "games", "dev", "security" and "net". A target "all" should exist to bypass the selective sync.
These profiles could be stored on the rsync mirrors and fetched before the sync. They could be in the format of rsync_excludes, but managed by the Portage team. It would, for example, be easy to put all GLSAs into the "security" profile. It would also help save bandwidth and time on the user side and reduce load on the rsync mirrors.
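
A minimal sketch of how such profiles might be turned into an exclude list, assuming each profile simply names the categories it needs. The profile contents, the `PROFILE_CATEGORIES` table and the `build_excludes` helper are all hypothetical, and nothing like this exists in Portage. The output uses rsync's "directory/" pattern form as one plausible format.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical profile definitions: each profile lists the categories it needs.
# In the proposal these would be maintained centrally, not by each user.
PROFILE_CATEGORIES: Dict[str, Set[str]] = {
    "system": {"sys-apps", "sys-libs", "sys-devel"},
    "net": {"net-misc", "net-firewall"},
    "gnome": {"gnome-base", "x11-libs"},
    "kde": {"kde-base", "x11-libs"},
}


def build_excludes(all_categories: Iterable[str], selected: Iterable[str]) -> List[str]:
    """Return rsync-style exclude patterns for every category that no
    selected profile needs. The special profile "all" disables exclusion."""
    selected = list(selected)
    if "all" in selected:
        return []
    wanted: Set[str] = set()
    for name in selected:
        wanted |= PROFILE_CATEGORIES[name]
    # A trailing slash restricts the rsync pattern to directories.
    return sorted(f"{cat}/" for cat in set(all_categories) - wanted)


if __name__ == "__main__":
    tree = ["sys-apps", "net-misc", "x11-wm", "games-arcade", "x11-libs"]
    for pattern in build_excludes(tree, ["system", "net"]):
        print(pattern)
```

Selecting several profiles takes the union of what they need, so a category is excluded only if none of the chosen profiles wants it.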

Comment 1 Brian Harring (RETIRED) gentoo-dev 2004-08-02 03:57:10 UTC
Klieber, you still watching this at all?
Offhand, I'm against doing any form of exclusion to what's synced- you would basically have to build a graph of everything that must be synced for each 'sync profile', which is a mild pita.
Comment 2 Kurt Lieber (RETIRED) gentoo-dev 2004-08-02 08:14:51 UTC
Actually, I had overlooked this one.  After reading it, I'm not particularly keen on it, either.  Regardless, it's a portage issue, so kicking over to Nick and crew.
Comment 3 Brian Harring (RETIRED) gentoo-dev 2005-02-28 00:15:57 UTC
*** Bug 44526 has been marked as a duplicate of this bug. ***
Comment 4 Gavin 2005-02-28 10:00:31 UTC
FYI, I'm currently using a technique somewhere in between the extremes I've seen posted in various duplicates of this bug entry.  Direct support for my technique in portage would offer benefits, but such support is not required.  Anyway, I posted this a long time ago on the dev list:

I'm currently using RSYNC_EXCLUDEFROM in /etc/make.conf with a list of patterns describing files that are irrelevant to my platform/config/needs/etc., to almost cut in half the number of files sync'd by 'emerge sync'.  I first weed out certain packages and categories of packages using various scripts.  For example, I don't use X, so I listed these in my RSYNC_EXCLUDEFROM file.
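
For illustration, a rough sketch of the kind of script that could generate such an exclude file. It is not the script referred to above. The helper names, the glob list and the output path are assumptions, and the "category/" line format is one plausible rsync pattern form, not a confirmed Portage convention.

```python
import fnmatch
import os
from typing import Iterable, List


def matching_categories(tree: str, unwanted_globs: Iterable[str]) -> List[str]:
    """List top-level directories of the tree whose names match any glob."""
    globs = list(unwanted_globs)
    result = []
    for entry in sorted(os.listdir(tree)):
        if not os.path.isdir(os.path.join(tree, entry)):
            continue  # skip plain files at the top of the tree
        if any(fnmatch.fnmatchcase(entry, g) for g in globs):
            result.append(entry)
    return result


def write_exclude_file(path: str, categories: Iterable[str]) -> None:
    """Write one rsync exclude pattern per line; the trailing slash
    makes each pattern match directories only."""
    with open(path, "w") as fh:
        for cat in categories:
            fh.write(f"{cat}/\n")


if __name__ == "__main__":
    # Hypothetical choices for a machine without X or games.
    unwanted = ["x11-*", "games-*", "kde-*", "gnome-*"]
    cats = matching_categories("/usr/portage", unwanted)
    write_exclude_file("rsync_excludes.generated", cats)
```

The generated file would then be the one RSYNC_EXCLUDEFROM points at in /etc/make.conf. Regenerating it after each full sync picks up newly added categories that match the globs, but it still says nothing about dependencies that live in an excluded category.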

Is there a better way of avoiding the wasted bandwidth incurred by emerge sync'ing without resorting to a manually maintained list referenced by RSYNC_EXCLUDEFROM?  If there were a mechanism for assigning multiple classifications (hereafter referred to as "dimensions") instead of the one-dimensional /usr/portage/<classification>, then an exclusion mechanism (similar to using RSYNC_EXCLUDEFROM) might simply list the "dimensions" that are irrelevant to a particular system and safe to exclude when performing "emerge sync".

If every ebuild had one or more type/kind/purpose/etc. "dimension" classifications, then I could define combinations of "dimensions" with different update (emerge sync) priorities/frequencies, vastly reducing both bandwidth consumption and the time I need to maintain this bandwidth reduction system for emerge syncs.  Obviously, such a system needs "smarts" to know which ebuilds must be fetched regardless of classification priorities (e.g. depending on which packages are already emerged and their current/new dependencies).

If I define /usr/portage/<categories> as a set of categorized ebuilds for various software "packages", then I can define "dimensions" as abstract groupings of existing Gentoo portage "<categories>".  Using this terminology, for the purposes of my prior suggestion/question regarding filtering out the "bloat" (unnecessary bandwidth consumed) during an "emerge sync", I further define the following as descriptions of candidate "dimensions":

o TYPE, as in "drivers-${TYPE}"
o "<categories>"
o requires X
o requires KDE
o requires Gnome
o requires "platform"
o requires a specific type of kernel (e.g. linux, freebsd, openbsd), or subtype (e.g. linux/mm)
o many other possibilities
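
To make the idea concrete, here is a small sketch of dimension-based exclusion together with the "smarts" mentioned earlier: anything reachable from the installed packages through dependencies is kept, whatever its dimensions say. The dimension tags, the dependency table and both function names are hypothetical. No such metadata exists in the tree today, and real dependency resolution (USE flags, version ranges, virtuals) is far more involved.

```python
from typing import Dict, Iterable, List, Set


def required_closure(installed: Iterable[str], deps: Dict[str, List[str]]) -> Set[str]:
    """Return the installed packages plus everything they depend on, transitively."""
    needed: Set[str] = set()
    stack = list(installed)
    while stack:
        pkg = stack.pop()
        if pkg in needed:
            continue
        needed.add(pkg)
        stack.extend(deps.get(pkg, ()))
    return needed


def packages_to_exclude(
    dimensions: Dict[str, Set[str]],
    excluded_dims: Set[str],
    installed: Iterable[str],
    deps: Dict[str, List[str]],
) -> Set[str]:
    """Exclude packages tagged with an unwanted dimension, unless an
    installed package needs them directly or indirectly."""
    needed = required_closure(installed, deps)
    return {
        pkg
        for pkg, dims in dimensions.items()
        if dims & excluded_dims and pkg not in needed
    }


if __name__ == "__main__":
    # Illustrative data only; the tags and the dependency are made up.
    dims = {
        "x11-wm/fluxbox": {"requires-x"},
        "x11-libs/libX11": {"requires-x"},
        "net-misc/openssh": {"server"},
    }
    deps = {"net-misc/openssh": ["x11-libs/libX11"]}
    print(sorted(packages_to_exclude(dims, {"requires-x"}, ["net-misc/openssh"], deps)))
```

The catch is that the closure has to be computed against the new dependency data, which is exactly what a partial sync has not fetched yet.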

I'm primarily interested in using Gentoo as a server platform.  Large portions of the portage tree might be irrelevant to those with various specialized purposes.  My hypothesis centers around the idea that "<categories>" form an inadequate set of "dimensions" for users to apply as exclusion criteria during "emerge sync".  Furthermore, as more "<categories>" and ebuilds are added, the manual effort required to update and sanity check an RSYNC_EXCLUDEFROM list rises along with the bandwidth consumed.

Several others have documented various techniques to resolve the "changing dependencies" issue with using "partial sync's" of the portage tree.
Comment 5 Jason Stubbs (RETIRED) gentoo-dev 2005-07-28 07:25:18 UTC
Putting a hold on feature requests for portage as they are drowning out the bugs. Most of these features should be available in the next major version of portage. But for the time being, they are just drowning out the major bugs and delaying the next version's progress.
Any bugs that contain patches and any bugs for etc-update or dispatch-conf can be reopened. Sorry, I'm just not good enough with bugzilla. ;)
Comment 6 Marius Mauch (RETIRED) gentoo-dev 2007-01-11 12:54:50 UTC
Closing due to old age. If anyone wants to revive it do it on gentoo-dev, not bugzilla.