Since we will need to change our syncing scripts anyway to get proper GPG signatures on uncompressed tarballs, I think it would be nice to rename the tarballs as well, to gentoo-YYYYMMDD.tar*, following our effort to remove the conflicting use of the term 'portage'.
*** Bug 679814 has been marked as a duplicate of this bug. ***
I guess what we have to do is to generate extra snapshots alongside the old ones. While at it, we may also keep .bz2 for the old snapshots only.
Plus, the top-level directory should be renamed from portage to gentoo.
(In reply to Ulrich Müller from comment #3)
> Plus, the top-level directory should be renamed from portage to gentoo.

To gentoo-YYYYMMDD even, to match the tarball name.
In order to rename the top-level directory, tools like emerge-webrsync and emerge-delta-webrsync need to be updated to use tar commands with --strip-components 1.
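For illustration only, a minimal sketch of what extraction with --strip-components could look like, assuming GNU tar. The helper name extract_stripped and the throwaway snapshot built here are hypothetical and not taken from emerge-webrsync.

```shell
#!/bin/sh
# Sketch: unpack a snapshot without caring what its top-level
# directory is called. Assumes GNU tar.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical helper, not part of emerge-webrsync.
# $1: snapshot tarball, $2: existing target directory
extract_stripped() {
    # Drop the first path component (portage/, gentoo-YYYYMMDD/, ...).
    tar --strip-components=1 -C "$2" -xf "$1"
}

# Build a tiny stand-in snapshot with a dated top-level directory.
mkdir -p "$workdir/src/gentoo-20200112/metadata" "$workdir/repo"
echo dummy > "$workdir/src/gentoo-20200112/metadata/timestamp.x"
tar -C "$workdir/src" -cf "$workdir/snap.tar" gentoo-20200112

extract_stripped "$workdir/snap.tar" "$workdir/repo"
ls "$workdir/repo"
```

Because the top-level entry itself is stripped, the target directory keeps whatever ownership and mode it already had.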
Also, if the top-level directory name is not constant, emerge-webrsync and emerge-delta-webrsync will have to find an alternative to this command that's used to extract the timestamp:

> tar --to-stdout -x portage/metadata/timestamp.x

It seems that --strip-components 1 is not helpful for this particular operation, but this works:

> tar --to-stdout --wildcards -x '*/metadata/timestamp.x'
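A small self-contained check of the wildcard approach, assuming GNU tar. The helper name snapshot_timestamp and the dummy timestamp content are made up for the demo.

```shell
#!/bin/sh
# Sketch: read metadata/timestamp.x from a snapshot whose top-level
# directory name is not known in advance. Assumes GNU tar.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical helper, not part of emerge-webrsync.
# $1: snapshot tarball
snapshot_timestamp() {
    # --wildcards lets '*' stand in for the unknown top-level directory.
    tar --to-stdout --wildcards -xf "$1" '*/metadata/timestamp.x'
}

# Build a tiny stand-in snapshot; the file content is a dummy value.
mkdir -p "$workdir/src/gentoo-20200112/metadata"
echo 'dummy-timestamp' > "$workdir/src/gentoo-20200112/metadata/timestamp.x"
tar -C "$workdir/src" -cf "$workdir/snap.tar" gentoo-20200112

snapshot_timestamp "$workdir/snap.tar"
```

The same call should work unchanged on a tarball that still uses portage/ as its top-level directory.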
No progress on this? What are the blockers? If nobody is using the snapshots, in favour of emerge-webrsync then surely these can be dropped? Otherwise, the tarballs still contain a 'portage' subpath which doesn't match new locations vis https://bugs.gentoo.org/show_bug.cgi?id=662982 .

Looks like the master scripts are in https://gitweb.gentoo.org/infra/mastermirror-scripts.git

From comments looks like three discrete steps - fix emerge-xxx scripts/functions, update root path, rename tar-balls, unless I'm mistaken?
Created attachment 589026
mastermirror.diff

Here's a cheap patch I proposed at some point.
(In reply to Michael 'veremitz' Everitt from comment #7)
> No progress on this? What are the blockers? If nobody is using the
> snapshots, in favour of emerge-webrsync then surely these can be dropped?
> Otherwise, the tarballs still contain a 'portage' subpath which doesn't
> match new locations vis https://bugs.gentoo.org/show_bug.cgi?id=662982 .
>
> Looks like the master scripts are in
> https://gitweb.gentoo.org/infra/mastermirror-scripts.git
>
> From comments looks like three discrete steps - fix emerge-xxx
> scripts/functions, update root path, rename tar-balls, unless I'm mistaken?

On infra's end we need to have both filenames for some time (possibly forever, but hopefully not).

Mgorny's patch is nice (it's straightforward), but it increases the disk requirement for Gentoo mirrors. If we can get away with this, perhaps we should apply the patch.

Otherwise I'm not super interested in writing a bunch of complex shell to sustain a link farm just so a file has the correct name. That isn't enough value for the complexity required, IMHO.

-A
infra/mastermirror-scripts.git

tag 20191207T072430Z contains cleanups just prior to this new work.
tag 20191207T075744Z implements this bug.

20191207T072430Z is being deployed tonight, 20191207T075744Z will be deployed tomorrow (2019/12/07).

mgorny: thanks for the clear example patch, I picked an alternate way of doing it that's much less IO intensive (transform the old uncompressed tarball instead of creating a new one).
ulm: tag 20191208T062327Z is in place for the run that will take place at 2019/12/09 00:45 UTC
In portage-2.3.82, emerge-webrsync supports gentoo-YYYYMMDD snapshots:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=b39c1731abd2b16e31e98e8deba9699dd73bbf7c

commit b39c1731abd2b16e31e98e8deba9699dd73bbf7c
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2019-12-15 08:14:00 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2019-12-15 08:26:21 +0000

    emerge-webrsync: support gentoo-YYYYMMDD snapshots

    Support gentoo-YYYYMMDD snapshots for forward compatibility,
    and portage-YYYYMMDD snapshots for backward compatibility.

    Bug: https://bugs.gentoo.org/693454
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 bin/emerge-webrsync | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)
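The commit itself is not reproduced here. As a rough sketch of the idea only (new name first, old name as fallback), something like the following could generate the file names to try. The helper name snapshot_candidates and the .tar.xz extension are assumptions for the example, not the actual emerge-webrsync code.

```shell
#!/bin/sh
# Sketch: list snapshot file names to try for a given date,
# new naming first, old naming as a fallback.
set -eu

# Hypothetical helper, not the real emerge-webrsync logic.
# $1: snapshot date as YYYYMMDD
snapshot_candidates() {
    for prefix in gentoo portage; do
        # The .tar.xz extension is an assumption for this example.
        echo "${prefix}-${1}.tar.xz"
    done
}

snapshot_candidates 20191215
```

A caller would try each name in order and stop at the first one that downloads and verifies.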
A seamless transition is not possible for emerge-delta-webrsync, since it's not possible for the same deltas to apply to both a gentoo-YYYYMMDD snapshot and a portage-YYYYMMDD snapshot. Perhaps the best way to trigger the transition will be with a USE flag that we add to package.use.force when we want to force the transition.
zmedico: the bdelta performance on the gentoo* tarballs is ~100% larger than before, because the filename prefix keeps changing.

Switching to another alg supported by differ limits it to at most 50% larger than the existing deltas.

Here's the size in bytes for the different algs:
8667219 snapshot-gentoo-20200111-20200112.bdelta
4638241 snapshot-gentoo-20200111-20200112.bdelta.bz2
4568484 snapshot-gentoo-20200111-20200112.bdelta.xz
4649757 snapshot-portage-20200111-20200112.bdelta
3997388 snapshot-portage-20200111-20200112.bdelta.bz2
3971992 snapshot-portage-20200111-20200112.bdelta.xz
8959287 snapshot-gentoo-20200111-20200112.bdiff
5287214 snapshot-gentoo-20200111-20200112.bdiff.bz2
4779192 snapshot-gentoo-20200111-20200112.bdiff.xz
4573762 snapshot-portage-20200111-20200112.bdiff
4119809 snapshot-portage-20200111-20200112.bdiff.bz2
4052188 snapshot-portage-20200111-20200112.bdiff.xz
6984950 snapshot-gentoo-20200111-20200112.gdiff
5262625 snapshot-gentoo-20200111-20200112.gdiff.bz2
4791808 snapshot-gentoo-20200111-20200112.gdiff.xz
4502358 snapshot-portage-20200111-20200112.gdiff
4109229 snapshot-portage-20200111-20200112.gdiff.bz2
4040656 snapshot-portage-20200111-20200112.gdiff.xz
6394907 snapshot-gentoo-20200111-20200112.switching
4693603 snapshot-gentoo-20200111-20200112.switching.bz2
4618812 snapshot-gentoo-20200111-20200112.switching.xz
4428992 snapshot-portage-20200111-20200112.switching
4013645 snapshot-portage-20200111-20200112.switching.bz2
3991848 snapshot-portage-20200111-20200112.switching.xz
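To make the comparison easier to read, the relative overhead of the gentoo-* deltas over the portage-* ones can be computed from the sizes above. This is just a sketch: the helper name delta_overhead is hypothetical, and each input row is "alg gentoo_raw portage_raw gentoo_xz portage_xz" copied from the listing.

```shell
#!/bin/sh
# Sketch: percentage by which each gentoo-* delta is larger than the
# matching portage-* delta, uncompressed and xz-compressed.
set -eu

# Hypothetical helper. Reads rows of the form:
#   alg gentoo_raw portage_raw gentoo_xz portage_xz
delta_overhead() {
    awk '{ printf "%s raw +%.1f%% xz +%.1f%%\n", $1, ($2 / $3 - 1) * 100, ($4 / $5 - 1) * 100 }'
}

# Sizes in bytes, taken from the listing above.
delta_overhead <<'EOF'
bdelta    8667219 4649757 4568484 3971992
bdiff     8959287 4573762 4779192 4052188
gdiff     6984950 4502358 4791808 4040656
switching 6394907 4428992 4618812 3991848
EOF
```

The bz2 columns are left out to keep the example short; they can be added as further fields in the same way.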
hi, any news here in this bug? what's the current status? asking for council meeting =)
(In reply to Robin Johnson from comment #14)
> zmedico: the bdelta performance on the gentoo* tarballs is ~100% larger than
> before, because the filename prefix keeps changing.
>
> Switching to another alg supported by differ limits it to at most 50% larger
> than the existing deltas.
>
> Here's the size in bytes for the different algs:

I think any one of the algorithms will be fine, since the diffs compress reasonably well regardless of the algorithm.
(In reply to Robin Johnson from comment #14)
> zmedico: the bdelta performance on the gentoo* tarballs is ~100% larger than
> before, because the filename prefix keeps changing.

Since there is little progress here, how about the obvious solution? Namely, use the constant name "gentoo" for the top-level directory.
Maybe we should last-rite emerge-delta-webrsync, given that sync-type = git is a viable alternative for minimal-bandwidth sync these days.
(In reply to Zac Medico from comment #18)
> Maybe we should last-rite emerge-delta-webrsync, given that sync-type = git
> is a viable alternative for minimal-bandwidth sync these days.

Maybe have a forums poll to find out if people still use it?
(In reply to Ulrich Müller from comment #17)
> (In reply to Robin Johnson from comment #14)
> > zmedico: the bdelta performance on the gentoo* tarballs is ~100% larger than
> > before, because the filename prefix keeps changing.
>
> Since there is little progress here, how about the obvious solution? Namely,
> use the constant name "gentoo" for the top-level directory.

That means carrying a series of tarballs with the undated directory in them; if zmedico is fine with that (last time he wasn't, and wanted to keep the date in the prefix path inside gentoo-* tarballs for good reason), then this entire bug becomes moot.

(In reply to Ulrich Müller from comment #19)
> (In reply to Zac Medico from comment #18)
> > Maybe we should last-rite emerge-delta-webrsync, given that sync-type = git
> > is a viable alternative for minimal-bandwidth sync these days.
>
> Maybe have a forums poll to find out if people still use it?

ulm: can council please start a formal poll on it, announced widely on -dev & -dev-announce; just please use somewhere other than Forums ;-)

I think getting rid of emerge-delta-webrsync, to be replaced with some incremental git bundles, would probably work, but some details need to be worked out.
(In reply to Robin Johnson from comment #20)
> > > Maybe we should last-rite emerge-delta-webrsync, given that sync-type = git
> > > is a viable alternative for minimal-bandwidth sync these days.
> >
> > Maybe have a forums poll to find out if people still use it?
> ulm: can council please start a formal poll on it, announced widely on -dev
> & -dev-announce; just please use somewhere other than Forums ;-)

IIUC, we won't need a poll if we change to an undated top-level directory. So I'd suggest that we settle that other question first.
I don't like the use of "--strip-components 1", because you then use the properties of the existing top-level directory. If, for example, /var/db/repos/gentoo is owned by root:root and you extract with that flag, root ownership is retained for that folder, making portage execute the fetch as root.

I recommend something like:

tar \
  --transform 's#^portage/#gentoo/#' \
  --transform 's#^portage$#gentoo#' \
  --transform 's#^gentoo-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/#gentoo/#' \
  --transform 's#^gentoo-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$#gentoo#' \
  -C /mnt/gentoo/var/db/repos/ -xvpJf <tar of your choice>

With the above command, ownership of /var/db/repos/gentoo is changed to portage:portage.
Even better:

tar \
  --transform 's#^\(portage\|gentoo-[0-9]\{8\}\)#gentoo#' \
  -C /mnt/gentoo/var/db/repos/ \
  -xvpJf <gentoo-latest.tar.xz or portage-latest.tar.xz>

thx @ulm for the tip
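A quick way to convince yourself that the single --transform expression handles both layouts is to try it on two throwaway tarballs. This is only a sketch, assuming GNU tar; the helper name extract_as_gentoo is hypothetical, and uncompressed tarballs are used (so no -J) to keep the demo free of an xz dependency.

```shell
#!/bin/sh
# Sketch: the same --transform expression renames either a portage/ or
# a gentoo-YYYYMMDD/ top-level directory to gentoo/. Assumes GNU tar.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical helper wrapping the expression from the comment above.
# $1: snapshot tarball, $2: destination directory (e.g. /var/db/repos)
extract_as_gentoo() {
    tar --transform 's#^\(portage\|gentoo-[0-9]\{8\}\)#gentoo#' \
        -C "$2" -xpf "$1"
}

for top in portage gentoo-20200112; do
    # Build a tiny stand-in snapshot for each naming scheme.
    mkdir -p "$workdir/src/$top/metadata" "$workdir/dest-$top"
    echo dummy > "$workdir/src/$top/metadata/timestamp.x"
    tar -C "$workdir/src" -cf "$workdir/$top.tar" "$top"

    extract_as_gentoo "$workdir/$top.tar" "$workdir/dest-$top"
    ls "$workdir/dest-$top"
done
```

Both destinations should end up with a single gentoo directory, whichever name the tarball used internally.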