While working on several enhancements to `emerge sync` I realized that
it had to be refactored or it would become harder and harder to add new
This bug is for tracking the progress of the rewrite; the other bugs about
`emerge sync` that I was working on will depend on this one.
*** Bug 35540 has been marked as a duplicate of this bug. ***
Created attachment 22067 [details, diff]
patch for emerge
This patch makes emerge use the new sync module when you run `emerge --sync`.
Other modifications are support for overlay syncs and moving the cache update
to its own function.
Overlay sync works as follows:
You create a file /etc/portage/overlays that contains lines in the format
name syncuri overlay
where name is an alphanumeric identifier (excluding the special keyword all),
syncuri is a valid SYNC url (see comment on sync.py) and overlay is the
directory to use for that url. It's not necessary to be listed in the
Then you can run `emerge --sync name` where name is an entry from the
overlays file or the special keyword "all" which syncs all repositories,
including the default SYNC/PORTDIR. `emerge --sync` without an argument
will behave as currently, syncing SYNC/PORTDIR.
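To make the format concrete, here is a minimal sketch of how the overlays file and the `--sync` argument could be interpreted. It is not the code from the attached patch: `OverlayEntry`, `parse_overlays` and `select_targets` are hypothetical names, skipping `#` comment lines is an assumption, and the default SYNC/PORTDIR pair is simply passed in as a parameter.

```python
from typing import Dict, List, NamedTuple, Optional


class OverlayEntry(NamedTuple):
    name: str
    syncuri: str
    overlay: str  # directory the syncuri is synced into


def parse_overlays(text: str) -> Dict[str, OverlayEntry]:
    """Parse 'name syncuri overlay' lines into a dict keyed by name."""
    entries: Dict[str, OverlayEntry] = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Assumption: blank lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 2)
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 'name syncuri overlay'")
        name, syncuri, overlay = fields
        # The name must be alphanumeric and must not be the keyword "all".
        if not name.isalnum() or name == "all":
            raise ValueError(f"line {lineno}: invalid overlay name {name!r}")
        entries[name] = OverlayEntry(name, syncuri, overlay)
    return entries


def select_targets(
    entries: Dict[str, OverlayEntry],
    default: OverlayEntry,
    arg: Optional[str] = None,
) -> List[OverlayEntry]:
    """Map the argument of `emerge --sync` to the repositories to sync."""
    if arg is None:
        return [default]  # plain `emerge --sync`: SYNC/PORTDIR only
    if arg == "all":
        return [default] + list(entries.values())
    if arg not in entries:
        raise KeyError(f"unknown overlay name: {arg}")
    return [entries[arg]]
```

With this sketch, `select_targets(entries, default, "all")` returns the default repository followed by every entry of the overlays file, in file order.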
Created attachment 22068 [details]
new sync module
This is the new sync module used by the patched emerge.
It features classes for cvs, rsync and snapshot syncs and a factory class
to create a Connection instance for a given syncuri by looking up the protocol
part in a table (http:// and ftp:// are used for snapshots) and creating an
instance of the correct subclass. All Connection classes provide a setup()
method that checks the creation parameters and a sync() method that does the
Most of the code for the RsyncConnection and CvsConnection classes was copied
from the current emerge code, the SnapshotConnection class is basically a
ported version of emerge-webrsync.
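The design described above can be sketched roughly as follows. This is only an illustration and not the attached sync.py: a plain function stands in for the factory class, `create_connection` and `PROTOCOLS` are made-up names, the constructor arguments are assumed, and the actual sync work is left out.

```python
class Connection:
    """Base class: setup() validates parameters, sync() does the transfer."""

    def __init__(self, syncuri: str, target: str):
        self.syncuri = syncuri
        self.target = target

    def setup(self) -> None:
        # Check the creation parameters before any work is attempted.
        if not self.syncuri or not self.target:
            raise ValueError("syncuri and target directory are required")

    def sync(self) -> None:
        raise NotImplementedError("subclasses implement the actual sync")


class RsyncConnection(Connection):
    """Placeholder for the rsync-based sync."""


class CvsConnection(Connection):
    """Placeholder for the cvs-based sync."""


class SnapshotConnection(Connection):
    """Placeholder for the snapshot sync (emerge-webrsync style)."""


# Protocol part of the syncuri -> class; http and ftp both mean snapshots.
PROTOCOLS = {
    "rsync": RsyncConnection,
    "cvs": CvsConnection,
    "http": SnapshotConnection,
    "ftp": SnapshotConnection,
}


def create_connection(syncuri: str, target: str) -> Connection:
    """Look up the protocol part of syncuri and instantiate its class."""
    scheme, sep, _rest = syncuri.partition("://")
    if not sep:
        raise ValueError(f"not a valid sync uri: {syncuri!r}")
    try:
        cls = PROTOCOLS[scheme.lower()]
    except KeyError:
        raise ValueError(f"unsupported protocol: {scheme!r}") from None
    conn = cls(syncuri, target)
    conn.setup()
    return conn
```

Keeping the table separate from the classes means a new protocol only needs one more entry, which is the point of the lookup approach.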
Created attachment 22069 [details, diff]
patch for emerge
missed a return value check
I should say that this is alpha code and missing several checks and documentation.
*** Bug 28128 has been marked as a duplicate of this bug. ***
Created attachment 23407 [details]
some functions not directly related to sync code
Created attachment 23408 [details]
Created attachment 23409 [details]
Created attachment 23410 [details]
This is still lacking some support for 3rd-party snapshots for overlays
Created attachment 23411 [details]
new sync module using separated files
This code scans /usr/lib/portage/pym/sync/*.py for available sync modules and
provides some primitive register/unregister functions for protocol handlers.
(the emerge patch needs a trivial change: `connection` is now `Connection`)
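As a rough sketch of the scan-and-register idea (not the attached code): every `*.py` file in a directory is imported, and each module gets the chance to register its protocol handlers. The `register_handlers(register)` hook and the function names `load_modules` and `lookup` are assumptions made for this example; the directory is a parameter rather than the hard-coded path.

```python
import importlib.util
from pathlib import Path
from typing import Dict, List, Optional

_handlers: Dict[str, type] = {}


def register(protocol: str, handler: type) -> None:
    """Associate a protocol name (e.g. 'rsync') with a handler class."""
    _handlers[protocol] = handler


def unregister(protocol: str) -> None:
    """Remove a protocol handler; unknown protocols are ignored."""
    _handlers.pop(protocol, None)


def lookup(protocol: str) -> Optional[type]:
    """Return the handler class for protocol, or None if not registered."""
    return _handlers.get(protocol)


def load_modules(directory: str) -> List[str]:
    """Import every *.py file in directory and let it register handlers."""
    loaded = []
    for path in sorted(Path(directory).glob("*.py")):
        if path.name == "__init__.py":
            continue
        spec = importlib.util.spec_from_file_location(
            f"sync_plugin_{path.stem}", str(path)
        )
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Hypothetical convention: a module exposes register_handlers().
        hook = getattr(module, "register_handlers", None)
        if callable(hook):
            hook(register)
        loaded.append(path.stem)
    return loaded
```

A module such as rsync.py would then contain its Connection subclass plus a small `register_handlers` function that calls `register("rsync", ...)`.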
*** Bug 50785 has been marked as a duplicate of this bug. ***
Just as a notice:
I've hacked together two alternative emerge-webrsync replacements, called
In this setup, one tar.bz2 for each package in each category is kept
up-to-date in a repository (i.e., some directories) on a static web server.
A manifest file containing md5sums for each package is kept in the root
directory. An md5sum of the manifest is kept in a separate file in the root.
Upon sync, the client first calculates an md5sum for each of its local
packages (this can be, and is, easily cached); if properly cached, it amounts
1) If no files have changed locally since the last sync (i.e., nobody touched
/usr/portage), and the last sync was done with synctool, the md5sum of the
manifest is downloaded from the server and compared against the md5sum of
the local manifest. If they differ, proceed to (2).
2) Download the manifest from the server, and compare each local package's
md5sum (either cached, or calculated on the spot, then cached). For each
package that's different, get the server tar.bz2.
3) No local packages exist; download all packages referenced in the server
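The client-side comparison in these steps could look roughly like the sketch below. It is not taken from synctool: the function names are invented, and manifests are assumed to be already parsed into dicts mapping a package name to its md5sum.

```python
import hashlib
from typing import Dict, List


def md5_hex(data: bytes) -> str:
    """Hex md5sum of a blob, e.g. of the raw manifest file."""
    return hashlib.md5(data).hexdigest()


def manifest_changed(local_manifest: bytes, server_manifest_md5: str) -> bool:
    """Step 1: compare only the md5sum of the manifest, not its content."""
    return md5_hex(local_manifest) != server_manifest_md5


def packages_to_fetch(local: Dict[str, str], server: Dict[str, str]) -> List[str]:
    """Steps 2 and 3: packages whose server md5sum differs from the local one.

    A package that is missing locally counts as different, so an empty
    `local` dict yields every package in the server manifest.
    """
    return sorted(pkg for pkg, digest in server.items() if local.get(pkg) != digest)
```

Only the packages returned by `packages_to_fetch` need their tar.bz2 downloaded; when `manifest_changed` is false, nothing has to be fetched at all.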
B) incremental update over http:
A cron job on the server keeps track of which files are created and removed
in PORTDIR. Every day, a tar.bz2 (called a 'daily delta') is created containing
all the changes from the previous 24 hours, along with a complete manifest of
all files that are supposed to be in the tree, with their md5sums.
When a client syncs against the server for the first time, it takes note of
the time. Subsequent syncs will only need to get sufficient daily deltas to
bring the client up-to-date.
The server is free to collapse daily deltas into weekly deltas and monthly
deltas; typically, the server will keep daily deltas going back one to two
weeks, then weekly deltas for four to six weeks, then monthly deltas for two
months.
The client is able to pick the correct combination of past daily, weekly and
monthly deltas to bring itself back into sync.
The requirement is that none of the files in /usr/portage change between
calls to synctool. If that happens, a fallback to method (A) is necessary.
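One way the client could pick its combination of deltas is a greedy walk from the last sync date to today, sketched below. This is an illustration only, not synctool's actual logic: `Delta` and `pick_deltas` are hypothetical names, each delta is modelled as a half-open date range, and it is assumed that applying a delta that reaches further back than the last sync is harmless because deltas carry whole files.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Delta(NamedTuple):
    start: date  # first day whose changes are included
    end: date    # first day NOT included (exclusive)
    name: str    # e.g. the file name on the server


def pick_deltas(
    available: List[Delta], last_sync: date, today: date
) -> Optional[List[Delta]]:
    """Return deltas covering [last_sync, today), or None if there is a gap.

    At each position the delta that reaches furthest forward is chosen,
    so weekly and monthly deltas are preferred over chains of dailies.
    """
    chain: List[Delta] = []
    pos = last_sync
    while pos < today:
        candidates = [d for d in available if d.start <= pos < d.end]
        if not candidates:
            return None  # no delta covers this day: fall back to method (A)
        best = max(candidates, key=lambda d: d.end)
        chain.append(best)
        pos = best.end
    return chain
```

A `None` result corresponds to the fallback case: the server no longer has deltas reaching back far enough, so the client would have to do a full comparison instead.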
In both cases A and B, the server can be a stupid http server serving static
pages. This will allow any old P500 to serve thousands of clients at the
same time; all the logic is in the client.
Furthermore, all traffic goes across port 80, so firewalls are practically
not a problem at all (and it can easily be proxied and cached with squid).
Is this something I should squeeze into gentoolkit (or an app-portage/synctool
package), or should I work with you on trying to integrate it into portage proper?
Karl, I think you could turn these into sync modules like rsync.py and snapshot.py.
Could you post your tool or mail it to me, please?
"import sync" should be moved inside the `if myaction == "sync"` block.
Stylistically, you should import everything in the global space.
If it has a major time impact, you are probably running code in
the global space and should correct that.
Hmmm... this is already in cvs head.
I took a slightly different approach in writing it. Mainly, it's less dynamic in determining which sync protocol maps to which class (you have to add them to an intermediate func), but it's implemented and in cvs for the next major release.
*** Bug 105261 has been marked as a duplicate of this bug. ***
*** Bug 110753 has been marked as a duplicate of this bug. ***
> Hmmm... this is already in cvs head
> but it's implemented/incvs for next major release.
Is this in Portage now? If so, this bug could be marked as resolved.
(In reply to comment #20)
> > Hmmm... this is already in cvs head
> > ...
> > but it's implemented/incvs for next major release.
> Is this in Portage now? If so, this bug could be marked as resolved.
It's in the old 2.1-experimental branch which was abandoned: