Portage Package List

In this document, I describe a package-list scheme to replace the rsync of ebuilds. It is inspired by Debian GNU/Linux and is intended to provide several advantages over the current method.

The first advantage is that Portage would no longer require a lengthy `emerge sync` of all ebuilds and metadata. Instead, an `emerge sync` would retrieve only the package lists and other important data from the mirror. This would reduce both filesystem usage and rsync load.

The second advantage is that, with proper tweaking, this scheme could make binary package acquisition easier to implement in Portage. This would also require stricter dependency control; but if done properly, the system could react automatically to USE flag changes.

== Structure ==

The current Portage tree structure is sketched below. This is not an exact representation of the structure, just a general idea.

 /usr/portage/
 /usr/portage/cat-egory/
 /usr/portage/cat-egory/package/
 /usr/portage/cat-egory/package/ebuilds.ebuild
 /usr/portage/cat-egory/package/Manifest
 /usr/portage/cat-egory/package/ChangeLog
 /usr/portage/cat-egory/package/files/
 /usr/portage/cat-egory/package/files/digest
 /usr/portage/cat-egory/package/files/patches.diff
 /usr/portage/eclass/
 /usr/portage/eclass/some.eclass
 /usr/portage/profiles/
 /usr/portage/profiles/*

The new structure would be as below.

 /usr/portage/
 /usr/portage/categories/
 /usr/portage/categories/cat-egory.list.gz
 /usr/portage/eclass/
 /usr/portage/eclass/some.eclass
 /usr/portage/profiles/
 /usr/portage/profiles/*

cat-egory.list.gz (or .bz2, or just plaintext) would contain all metadata about each ebuild in the category that is relevant to the dependency calculation process. This data would be used to determine which packages are needed, what to download, and when to download it. The ebuilds would no longer be in the Portage tree.
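
To make the idea of a per-category list concrete, here is a minimal Python sketch of how such a file might be read. The record layout is purely an assumption for illustration (this proposal does not define one): one tab-separated line per ebuild with the fields name, version, dependencies, and USE flags. The function names `parse_category_list` and `load_category_list` are likewise hypothetical.

```python
import gzip

def parse_category_list(stream):
    """Parse a category list from an open text stream.

    Assumed (hypothetical) layout, one record per line:
    name<TAB>version<TAB>space-separated deps<TAB>space-separated USE flags
    """
    packages = {}
    for raw in stream:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, version, depend, iuse = line.split("\t")
        packages.setdefault(name, []).append({
            "version": version,
            "depend": depend.split(),
            "iuse": iuse.split(),
        })
    return packages

def load_category_list(path):
    """Read a gzip-compressed category list such as cat-egory.list.gz."""
    with gzip.open(path, "rt", encoding="utf-8") as stream:
        return parse_category_list(stream)
```

With a layout along these lines, dependency calculation would only need the in-memory dictionary; no ebuild would have to be present locally until a package is actually merged.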
Upon merge, Portage would download the ebuild, Manifest, digest, and appropriate patches to /tmp, and then execute the ebuild.

In this manner, the list of files to hash and transfer (now almost 100000 entries) becomes much shorter, although the individual files to be hashed and transferred become larger. Compressing cat-egory.list will reduce both the network overhead of the transfer and the CPU overhead of hashing, by reducing the size of the data to be checked. This will incur a one-time overhead for compression.

The use of a cat-egory.list file will incur overhead from indexing the CVS tree before it is pushed to the rsync mirrors. I suggest that all rsync mirrors be two steps behind: while the rsync mirrors are grabbing the data for the tree as of (-1), the current tree (0) is being scanned and indexed. This requires that a snapshot of the CVS tree can be taken for indexing; otherwise there will be either a loss of service (CVS commits must not occur during indexing) or a lack of consistency in cat-egory.list.

Users may benefit from holding full descriptions in cat-egory.list, as Debian does in Packages.gz. This will, of course, increase file size.
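
The indexing step described above could look roughly like the following Python sketch, which scans a snapshot of the tree and writes one compressed list per category. It is an illustration under stated assumptions, not a definitive implementation: the function names, the two-field `package<TAB>version` record, and the temporary-file naming are all hypothetical, and extracting real dependency metadata from the ebuilds is left out entirely.

```python
import gzip
import os

def index_category(snapshot_root, category):
    """Return sorted 'package<TAB>version' lines for one category of a snapshot."""
    lines = []
    cat_dir = os.path.join(snapshot_root, category)
    for package in sorted(os.listdir(cat_dir)):
        pkg_dir = os.path.join(cat_dir, package)
        if not os.path.isdir(pkg_dir):
            continue
        prefix, suffix = package + "-", ".ebuild"
        for filename in sorted(os.listdir(pkg_dir)):
            # Ebuilds are named <package>-<version>.ebuild; skip everything else.
            if filename.startswith(prefix) and filename.endswith(suffix):
                version = filename[len(prefix):-len(suffix)]
                lines.append(f"{package}\t{version}")
    return lines

def write_category_list(snapshot_root, category, out_dir):
    """Write <category>.list.gz, replacing an old list only once the new one is complete."""
    os.makedirs(out_dir, exist_ok=True)
    final_path = os.path.join(out_dir, category + ".list.gz")
    tmp_path = final_path + ".tmp"  # hypothetical temporary name
    with gzip.open(tmp_path, "wt", encoding="utf-8") as out:
        for line in index_category(snapshot_root, category):
            out.write(line + "\n")
    os.replace(tmp_path, final_path)  # swap the finished file into place
    return final_path
```

Because the indexer reads only from a snapshot and swaps the finished file into place at the end, commits to the live tree during indexing would not leave a half-written list for the mirrors to pick up.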