Bug 726334 - Suggestion: use git sync method by default
Summary: Suggestion: use git sync method by default
Status: CONFIRMED
Alias: None
Product: Gentoo Release Media
Classification: Unclassified
Component: Stages
Hardware: All Linux
Importance: Normal enhancement
Assignee: Gentoo Release Team
URL:
Whiteboard:
Keywords:
Depends on: 560518 673412
Blocks:
Reported: 2020-05-31 03:27 UTC by Raymond Jennings
Modified: 2024-02-19 02:39 UTC

See Also:
Package list:
Runtime testing required: ---


Description Raymond Jennings 2020-05-31 03:27:43 UTC
I've been using the sync repo via anongit for quite a while now, and it's working remarkably well. I'd like to suggest that the release media/stage3 migrate to using it by default.
Comment 1 Joonas Niilola gentoo-dev 2020-05-31 16:51:02 UTC
However, anongit is not the place to sync your Portage tree from, since it doesn't contain metadata, so you would need to generate it yourself. rsync is therefore better; the alternative is https://github.com/gentoo-mirror/gentoo, but people could be ... outraged or confused to find that they are using GitHub involuntarily. The installation guide should point out this alternative to rsync, though.
Comment 2 Raymond Jennings 2020-05-31 18:09:15 UTC
I would also like to point out that git://anongit.gentoo.org/repo/sync/gentoo.git is a sync-friendly mirror specifically designed to be used for this and actually *does* contain the metadata you speak of.
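For illustration, a minimal sketch of what a repos.conf entry pointing at that mirror could look like, generated with Python's configparser. The sync URI is the one given above; the repository location and the auto-sync setting are assumptions for the example, and the helper name render_git_repos_conf is hypothetical, not something releng uses.

```python
import configparser
import io


def render_git_repos_conf(sync_uri, location):
    """Return repos.conf text for a git-synced 'gentoo' repository."""
    config = configparser.ConfigParser()
    config["DEFAULT"] = {"main-repo": "gentoo"}
    config["gentoo"] = {
        "location": location,      # where the tree is checked out
        "sync-type": "git",        # use the git sync module instead of rsync
        "sync-uri": sync_uri,      # mirror to clone and pull from
        "auto-sync": "yes",        # assumption: sync this repo on "emerge --sync"
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


REPOS_CONF = render_git_repos_conf(
    "git://anongit.gentoo.org/repo/sync/gentoo.git",
    "/var/db/repos/gentoo",  # assumed location; adjust to the local setup
)
print(REPOS_CONF)
```

The printed text would go in a file under /etc/portage/repos.conf/ on a system that wants to try this.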
Comment 3 Joonas Niilola gentoo-dev 2020-06-01 08:16:24 UTC
Indeed, I've seen /repo/ instead of /sync/. I tried syncing from it and it took a long time, so maybe it's not ready for mass-use? Some steps are needed to package git portage tree initially. 

Another thing that comes to my mind is that dev-vcs/git is nearly 30 MB by itself, and probably has more deps than net-misc/rsync (which is <900 KB), while stage3 builds are shrunk to minimum. People are of course free to switch to git later on.

I'd also like to see git become the main way of syncing tree. I tried to get this bug assigned for releng so it finds right hands, but had some problems. See the blocker bug.
Comment 4 Raymond Jennings 2020-06-01 08:45:35 UTC
(In reply to Joonas Niilola from comment #3)
> Indeed, I've seen /repo/ instead of /sync/. I tried syncing from it and it
> took a long time, so maybe it's not ready for mass-use? Some steps are
> needed to package git portage tree initially. 

This is probably due to the size of the repo: the initial clone takes a while. However, I believe that, amortized over time, git is actually faster than rsync, since once the full clone is complete, incrementally pulling further commits has LESS overhead than rsync walking the entire tree.

That said, a good repack upstream on infra should reduce the cost of even the initial clone.
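To make the amortization argument concrete, here is a toy cost model. All numbers are made up for illustration (they are not measurements of anongit or of any rsync mirror), and the function names are hypothetical.

```python
import math


def cumulative_cost(initial, per_sync, syncs):
    """Total cost of one initial fetch plus `syncs` later updates."""
    return initial + per_sync * syncs


def break_even_syncs(git_initial, git_per_sync, rsync_initial, rsync_per_sync):
    """Smallest number of syncs after which git's total cost <= rsync's.

    Returns None if git never catches up under this model.
    """
    extra_upfront = git_initial - rsync_initial
    if extra_upfront <= 0:
        return 0
    saving_per_sync = rsync_per_sync - git_per_sync
    if saving_per_sync <= 0:
        return None
    return math.ceil(extra_upfront / saving_per_sync)


# Hypothetical costs in seconds: a slow full clone but cheap pulls for git,
# a faster first sync but a full tree walk on every update for rsync.
n = break_even_syncs(git_initial=600, git_per_sync=5,
                     rsync_initial=120, rsync_per_sync=45)
print(n, cumulative_cost(600, 5, n), cumulative_cost(120, 45, n))
```

Whether git actually comes out ahead depends entirely on the real per-sync costs, which this sketch does not measure.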
 
> Another thing that comes to my mind is that dev-vcs/git is nearly 30 MB by
> itself, and probably has more deps than net-misc/rsync (which is <900 KB),

Most, if not all, of git's deps are due to optional USE flags.

> while stage3 builds are shrunk to minimum. People are of course free to
> switch to git later on.
> 
> I'd also like to see git become the main way of syncing tree. I tried to get
> this bug assigned for releng so it finds right hands, but had some problems.
> See the blocker bug.
Comment 5 Raymond Jennings 2020-06-01 23:21:22 UTC
Also, anongit was specifically designed to be mirrored and scaled up with demand, should infra ever see the need to invest more in it. By contrast, git.gentoo.org was meant to be the only writable version, going straight to the repo.
Comment 6 Matt Turner gentoo-dev 2020-08-26 20:25:30 UTC
Cc'ing dev-portage@.

(Obviously switching to git by default would require a wider discussion, which I think is better had elsewhere)

If we were to make this change, the mechanics of generating the configuration file (per bug 560518) are clear to me.

What isn't clear is how we would pull git into the depgraph. Zac, do you have a suggestion? Add IUSE=+git to sys-apps/portage?
Comment 7 Zac Medico gentoo-dev 2020-08-26 21:13:37 UTC
(In reply to Matt Turner from comment #6)
> What isn't clear is how we would pull git into the depgraph. Zac, do you
> have a suggestion? Add IUSE=+git to sys-apps/portage?

Either that, or add git to the @system set.

I think we'd probably want to enable shallow sync as well (bug 673412), in order to keep disk space usage in check.
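As a rough sketch of what enabling shallow sync could mean in configuration terms: the snippet below adds Portage's clone-depth and sync-depth repos.conf settings to an existing git entry. Using a depth of 1 is an assumption for the example, as is the idea that bug 673412 would be resolved this way; the helper name with_shallow_depth is hypothetical.

```python
import configparser
import io

# Example input: a git-synced entry without any depth settings.
BASE_REPOS_CONF = """\
[gentoo]
sync-type = git
sync-uri = git://anongit.gentoo.org/repo/sync/gentoo.git
"""


def with_shallow_depth(repos_conf_text, repo="gentoo", depth=1):
    """Return repos.conf text with shallow clone/sync depth set for `repo`."""
    config = configparser.ConfigParser()
    config.read_string(repos_conf_text)
    config[repo]["clone-depth"] = str(depth)  # history kept by the initial clone
    config[repo]["sync-depth"] = str(depth)   # history kept by later syncs
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


SHALLOW_REPOS_CONF = with_shallow_depth(BASE_REPOS_CONF)
print(SHALLOW_REPOS_CONF)
```

Keeping only the most recent history is what limits disk usage; the trade-off is that git log and similar tools see little or no history locally.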
Comment 8 Andreas K. Hüttel archtester gentoo-dev 2022-04-09 16:42:09 UTC
For the record, I'm against this. rsync is still way more reliable.

Why? I'm regularly coming across machines where the gentoo tree is a git checkout and the git sync mechanism has "wedged" somehow, requiring manual intervention. I.e., a merge commit on top, errors when pulling, ...

Most of the time that's easy to fix with a few git commands, but that is not the point. You should be able to run "emerge --sync" from a cron job without ever needing intervention.
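For reference, the kind of manual fix described above might look like the following sketch: fetch from the remote, then hard-reset the checkout to the remote branch. This is an illustration using plain git commands, not what Portage itself runs; the function names are hypothetical, and the remote and branch names must match the local checkout. Note that git reset --hard throws away any local commits and uncommitted changes in the tree.

```python
import subprocess


def reset_to_upstream_commands(repo_dir, remote, branch):
    """Build the git commands that discard local state in favor of upstream."""
    git = ["git", "-C", repo_dir]
    return [
        git + ["fetch", remote],
        # Drops a stray merge commit or local edits: destructive by design.
        git + ["reset", "--hard", f"{remote}/{branch}"],
    ]


def reset_to_upstream(repo_dir, remote, branch):
    """Run the commands in order, stopping at the first failure."""
    for command in reset_to_upstream_commands(repo_dir, remote, branch):
        subprocess.run(command, check=True)


# Print the commands instead of running them; the path and branch name
# below are assumptions for the example.
for command in reset_to_upstream_commands("/var/db/repos/gentoo", "origin", "stable"):
    print(" ".join(command))
```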
Comment 9 John Helmert III archtester Gentoo Infrastructure gentoo-dev Security 2023-02-21 00:02:53 UTC
(In reply to Andreas K. Hüttel from comment #8)
> For the record, I'm against this. rsync is still way more reliable.
> 
> Why? I'm regularly coming across machines where the gentoo tree is a git
> checkout and the git sync mechanism has "wedged" somehow, requiring manual
> intervention. I.e., a merge commit on top, errors when pulling, ...
> 
> Most of the time that's easy to fix with a few git commands, but that is not
> the point. You should be able to run "emerge --sync" from a cron job without
> ever needing intervention.

Is this still a concern after portage-3.0.40 (which has the forced-reset-to-upstream for this git shortcoming)?
Comment 10 John Helmert III archtester Gentoo Infrastructure gentoo-dev Security 2023-02-21 00:03:40 UTC
(In reply to John Helmert III from comment #9)
> (In reply to Andreas K. Hüttel from comment #8)
> > For the record, I'm against this. rsync is still way more reliable.
> > 
> > Why? I'm regularly coming across machines where the gentoo tree is a git
> > checkout and the git sync mechanism has "wedged" somehow, requiring manual
> > intervention. I.e., a merge commit on top, errors when pulling, ...
> > 
> > Most of the time that's easy to fix with a few git commands, but that is not
> > the point. You should be able to run "emerge --sync" from a cron job without
> > ever needing intervention.
> 
> Is this still a concern after portage-3.0.40 (which has the
> forced-reset-to-upstream for this git shortcoming)?

.. that being the changes here: https://github.com/gentoo/portage/pull/931