Change the default sync method to git. This would probably cover:
* the installation instructions provided in the handbook etc.
* setup of the stage3 tarballs
* the default sync method for the emerge --sync operation
* adding git to the @system set
I think the main issue with git is the increased disk space usage required for the .git directory. Since bug 552814 it's possible to use shallow pull, but it would be nice to see some stats on garbage collection of unused objects.
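One rough way to gather the kind of numbers asked for here is to compare the size of the .git directory with the rest of the checkout, once before and once after running `git gc`. The following is only a minimal sketch, assuming a git checkout at a path such as /usr/portage. The helper names `dir_size` and `git_overhead` are hypothetical, and the figures are apparent file sizes, not allocated disk blocks, so they understate real usage on a tree with many small files.

```python
import os


def dir_size(path):
    """Sum the apparent size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so a link target is never counted by mistake.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def git_overhead(repo):
    """Return (git_bytes, worktree_bytes) for the checkout at repo."""
    git_bytes = dir_size(os.path.join(repo, ".git"))
    # dir_size(repo) also walks .git, so subtract it to get the work tree.
    worktree_bytes = dir_size(repo) - git_bytes
    return git_bytes, worktree_bytes
```

Calling `git_overhead("/usr/portage")` before and after a garbage collection run would give two pairs of numbers to compare.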
Would it be desirable to enable shallow pull by default, and if so, how effective is garbage collection?
(In reply to Zac Medico from comment #2)
> Would it be desirable to enable shallow pull by default, and if so, how
> effective is garbage collection?

I think shallow pull is already the default if you use git sync.
For shallow pull, you have to set "sync-depth = 1" in repos.conf (default is 0 which means unlimited depth): https://gitweb.gentoo.org/proj/portage.git/commit/?id=903c4b1a67689c4b8cc59113a56d58575cf7db8e
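As an illustration of the setting described here, the sketch below adds "sync-depth = 1" to a repository section with Python's standard configparser module. It is not how Portage itself reads or writes repos.conf. The function name `with_sync_depth` is hypothetical, and the sample section reuses the repos.conf values quoted in comment #6 of this report.

```python
import configparser
import io

# Sample repos.conf content, taken from the configuration quoted in comment #6.
REPOS_CONF = """\
[DEFAULT]
main-repo = gentoo

[gentoo]
location = /usr/portage
sync-type = git
sync-uri = git://anongit.gentoo.org/repo/sync/gentoo.git
"""


def with_sync_depth(conf_text, repo, depth):
    """Return conf_text with sync-depth set to depth in section repo."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # configparser stores every value as a string.
    parser[repo]["sync-depth"] = str(depth)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


print(with_sync_depth(REPOS_CONF, "gentoo", 1))
```

Note that configparser drops comments when it rewrites a file, so editing a real repos.conf by hand is usually the simpler choice.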
(In reply to Zac Medico from comment #4)
> For shallow pull, you have to set "sync-depth = 1" in repos.conf (default is
> 0 which means unlimited depth):
>
> https://gitweb.gentoo.org/proj/portage.git/commit/?id=903c4b1a67689c4b8cc59113a56d58575cf7db8e

This was news to me, thanks for the update.
(In reply to Zac Medico from comment #4)
> For shallow pull, you have to set "sync-depth = 1" in repos.conf (default is
> 0 which means unlimited depth):
>
> https://gitweb.gentoo.org/proj/portage.git/commit/?id=903c4b1a67689c4b8cc59113a56d58575cf7db8e

Actually I just double checked. Are you sure? Something funky's going on. I have a config file like this:

[DEFAULT]
main-repo = gentoo
[gentoo]
location = /usr/portage
sync-type = git
sync-uri = git://anongit.gentoo.org/repo/sync/gentoo.git

But the pull does this:

metalhead 2 ~ # emerge --sync
>>> Syncing repository 'gentoo' into '/usr/portage'...
/usr/bin/git clone --depth 1 git://anongit.gentoo.org/repo/sync/gentoo.git .
Cloning into '.'...
remote: Enumerating objects: 151409, done.
remote: Counting objects: 100% (151409/151409), done.
remote: Compressing objects: 100% (134925/134925), done.
remote: Total 151409 (delta 48075), reused 74037 (delta 15243)
Receiving objects: 100% (151409/151409), 65.29 MiB | 4.74 MiB/s, done.
Resolving deltas: 100% (48075/48075), done.
Checking out files: 100% (135041/135041), done.
=== Sync completed for gentoo
q: Updating ebuild cache in /usr/portage ...
q: Finished 36144 entries in 1.106590 seconds

Something's making it do --depth 1 by default.
For the initial clone, it defaults to --depth=1. These are the relevant options as shown in `man 5 portage`:

clone-depth
    Specifies clone depth to use for DVCS repositories. Defaults to 1 (only the newest commit). If set to 0, the depth is unlimited.

sync-depth
    Specifies sync depth to use for DVCS repositories. If set to 0, the depth is unlimited. Defaults to 0.
(In reply to Zac Medico from comment #7)
> For the initial clone, it defaults to --depth=1. These are the relevant
> options as shown in `man 5 portage`:
>
> clone-depth
> Specifies clone depth to use for DVCS repositories. Defaults to 1
> (only the newest commit). If set to 0, the depth is unlimited.
>
> sync-depth
> Specifies sync depth to use for DVCS repositories. If set to 0, the
> depth is unlimited. Defaults to 0.

I ran across this recently when converting over to using repos.conf and git sync for my portage tree. Maybe I'm not seeing it, but the documentation seems to say 'clone-depth' defaults to 1 (shallow), while 'sync-depth' defaults to 0 (unlimited depth). This is confusing: which one overrides the other?

I had set my repos.conf up without including either configuration option, removed /usr/portage, and then ran 'emerge --sync'. When it was done I ended up with a shallow git checkout.

I'm thinking the documentation needs to be more clear on the man page. Maybe clone-depth should reference sync-depth and vice versa? Why are there even 2 options for configuring depth? Is that an oversight?
(In reply to Doug Miller from comment #8)
> Maybe I'm not seeing it, but the documentation seems to say 'clone-depth'
> defaults to 1 (shallow), while 'sync-depth' defaults to 0 (unlimited depth).
> This is confusing, which one overrides the other?

The clone-depth setting is for the initial git clone, and sync-depth is for each sync thereafter.

> I had set my repos.conf up without including either configuration option, rm
> /usr/portage, and then 'emerge --sync' and when done I ended up with a
> shallow git checkout.

Correct, that's the expected behavior for the default clone-depth setting. With the default unlimited sync-depth setting, the history will grow with each sync.

> I'm thinking the documentation needs to be more clear on the man page.
> Maybe clone-depth should reference sync-depth and vice versa?

Yes, that would be a good idea.

> Why are there even 2 options for configuring depth? Is that an oversight?

Shallow clone is very simple to do. Shallow sync is much more complicated because it involves multiple git commands that are not necessarily obvious. Our first shallow sync implementation had problems, but the current one seems to work reliably.
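The split between the two options can be modelled in a few lines. This is only a sketch of the documented behavior and not Portage's actual code: the function name `effective_depth` is hypothetical, and the defaults are the ones quoted from `man 5 portage` in comment #7.

```python
import configparser

# Defaults as documented in `man 5 portage` (see comment #7).
DEFAULT_CLONE_DEPTH = 1  # initial clone fetches only the newest commit
DEFAULT_SYNC_DEPTH = 0   # later syncs are unlimited


def effective_depth(conf_text, repo, already_cloned):
    """Return the git depth that applies, or None for unlimited depth.

    clone-depth governs the initial clone; sync-depth governs every
    sync after that. Neither option overrides the other.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    section = parser[repo]
    if already_cloned:
        depth = section.getint("sync-depth", fallback=DEFAULT_SYNC_DEPTH)
    else:
        depth = section.getint("clone-depth", fallback=DEFAULT_CLONE_DEPTH)
    # A value of 0 means unlimited depth.
    return None if depth == 0 else depth
```

With neither option set, this returns 1 for the first clone and None (unlimited) for later syncs, which matches the shallow checkout that grows over time as described above.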
(In reply to Zac Medico from comment #9)
> (In reply to Doug Miller from comment #8)
> > I'm thinking the documentation needs to be more clear on the man page.
> > Maybe clone-depth should reference sync-depth and vice versa?
>
> Yes, that would be a good idea.

I believe I understand the configuration distinction between the 'initial clone' and the 'following syncs', and why there are two configuration options.

I've been meaning to get more involved with Gentoo. I'll see if I can get a merge request worked up that just clarifies the documentation in the man page a bit (and cross-references those 2 configuration options). Also, maybe I can add some more information to the wiki. As this was a point of confusion for me, hopefully this will help out others later.
Just for the record I also ran across problems with attempting shallow syncs. In short, it somehow managed to derange the current HEAD or something involving branching. Since the syncable version of the portage tree is a perpetually merged QA enhanced version of the dev tree (i.e., with QA checks and manifests and so on), and thus every main commit is a merge commit, it might be confusing the shallow sync process.
(In reply to Raymond Jennings from comment #11)
> Just for the record I also ran across problems with attempting shallow syncs.
>
> In short, it somehow managed to derange the current HEAD or something
> involving branching.
>
> Since the syncable version of the portage tree is a perpetually merged QA
> enhanced version of the dev tree (i.e., with QA checks and manifests and so
> on), and thus every main commit is a merge commit, it might be confusing the
> shallow sync process.

We haven't had any bug reports against the *second* implementation (tracked by bug 552814), which was introduced in portage-2.3.42.