For example, temporary immutable repository snapshots can be used as a lock-free means to avoid interference from concurrent repository sync operations, as in bug 639374. The snapshots could be created using an abstraction layer with plugins for various snapshot implementations based on things like gitfs, btrfs, zfs, device-mapper, and overlayfs.
We can easily implement a read-copy-update (RCU) mechanism that works on any POSIX-compliant file system, using hardlinks to share files between snapshots, and symlinks for atomic snapshot updates. Readers of a particular immutable snapshot would resolve its canonical path via a symlink, and when the symlink is atomically updated it will only affect new readers. Old immutable snapshots can be safely garbage collected when there are no more readers. An update starts by cloning the latest snapshot using a command like cp -rl, and programs like rsync and git are smart enough to break a hardlink when updating a file, ensuring copy-on-write behavior.
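To make the idea concrete, here is a minimal Python sketch of the hardlink-and-symlink scheme described above. It is not the code from my branch: the function names (clone_snapshot, publish, update_file, demo) and the directory layout are made up for illustration, and garbage collection of old snapshots is left out.

```python
import os
import shutil
import tempfile


def clone_snapshot(src, dst):
    """Clone src into dst, hardlinking regular files (similar to cp -rl)."""
    shutil.copytree(src, dst, copy_function=os.link, symlinks=True)


def publish(link_path, snapshot_dir):
    """Atomically repoint link_path at snapshot_dir."""
    tmp_link = link_path + ".new"  # hypothetical temporary name
    if os.path.lexists(tmp_link):
        os.unlink(tmp_link)
    os.symlink(snapshot_dir, tmp_link)
    # rename(2) replaces the old symlink atomically on POSIX file systems.
    os.replace(tmp_link, link_path)


def update_file(path, data):
    """Write a new file and rename it into place, which breaks the hardlink
    instead of modifying the inode shared with older snapshots."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w") as f:
        f.write(data)
    os.replace(tmp, path)


def read_text(path):
    with open(path) as f:
        return f.read()


def demo():
    store = tempfile.mkdtemp()
    link = os.path.join(store, "latest")

    # Initial snapshot.
    snap1 = os.path.join(store, "snap1")
    os.mkdir(snap1)
    with open(os.path.join(snap1, "a.txt"), "w") as f:
        f.write("v1")
    publish(link, snap1)

    # A reader resolves the canonical path once, before reading anything.
    reader_path = os.path.realpath(link)

    # The updater clones the latest snapshot, modifies the clone, publishes it.
    snap2 = os.path.join(store, "snap2")
    clone_snapshot(snap1, snap2)
    update_file(os.path.join(snap2, "a.txt"), "v2")
    publish(link, snap2)

    old = read_text(os.path.join(reader_path, "a.txt"))
    new = read_text(os.path.join(os.path.realpath(link), "a.txt"))
    shutil.rmtree(store)
    return old, new
```

In this sketch the earlier reader still sees "v1" through the path it resolved, while a new reader that resolves the symlink after the update sees "v2". The write-to-temporary-file-then-rename step in update_file stands in for what rsync does by default when it replaces a file.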
I have a working implementation based on hardlinks in this branch, but I want to make some changes before I consider it complete: https://github.com/zmedico/portage/tree/bug_662070_immutable_repo_snapshot
Patches posted for review:
https://archives.gentoo.org/gentoo-portage-dev/message/57cb0f69e9ba73e8125c3aa7484e8ce9
https://github.com/gentoo/portage/pull/352
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=36f50e3b64756179758a8e3a11a3c6c666550cf5

commit 36f50e3b64756179758a8e3a11a3c6c666550cf5
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-07-31 07:28:45 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-09-24 05:52:52 +0000

    Add sync-rcu support for rsync (bug 662070)

    Add a boolean sync-rcu repos.conf setting that behaves as follows:

    Enable read-copy-update (RCU) behavior for sync operations. The
    current latest immutable version of a repository will be referenced
    by a symlink found where the repository would normally be located
    (see the location setting). Repository consumers should resolve
    the cannonical path of this symlink before attempt to access
    the repository, and all operations should be read-only, since
    the repository is considered immutable. Updates occur by atomic
    replacement of the symlink, which causes new consumers to use the
    new immutable version, while any earlier consumers continue to
    use the cannonical path that was resolved earlier. This option
    requires sync-allow-hardlinks and sync-rcu-store-dir options to
    be enabled, and currently also requires that sync-type is set
    to rsync. This option is disabled by default, since the symlink
    usage would require special handling for scenarios involving bind
    mounts and chroots.

    Bug: https://bugs.gentoo.org/662070
    Reviewed-by: Brian Dolbec <dolsen@gentoo.org>
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/repository/config.py               |  36 +++-
 lib/portage/repository/storage/hardlink_rcu.py | 251 +++++++++++++++++++++++++
 lib/portage/sync/syncbase.py                   |   4 +-
 lib/portage/tests/sync/test_sync_local.py      |  40 +++-
 man/portage.5                                  |  35 ++++
 5 files changed, 360 insertions(+), 6 deletions(-)

https://gitweb.gentoo.org/proj/portage.git/commit/?id=884ad951d700d1871cab2e321e4d8635b1a0f698

commit 884ad951d700d1871cab2e321e4d8635b1a0f698
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-07-30 06:21:30 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-09-24 04:24:10 +0000

    rsync: split out repo storage framework

    Since there are many ways to manage repository storage, split out a repo
    storage framework. The HardlinkQuarantineRepoStorage class implements
    the existing default behavior, and the InplaceRepoStorage class
    implements the legacy behavior (when sync-allow-hardlinks is disabled in
    repos.conf).

    Each class implements RepoStorageInterface, which uses coroutine methods
    since coroutines are well-suited to the I/O bound tasks that these
    methods perform. The _sync_decorator is used to convert coroutine
    methods to synchronous methods, for smooth integration into the
    surrounding synchronous code.

    Bug: https://bugs.gentoo.org/662070
    Reviewed-by: Brian Dolbec <dolsen@gentoo.org>
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/repository/storage/__init__.py         |  2 +
 .../repository/storage/hardlink_quarantine.py      | 95 ++++++++++++++++++++++
 lib/portage/repository/storage/inplace.py          | 49 +++++++++++
 lib/portage/repository/storage/interface.py        | 87 ++++++++++++++++++++
 lib/portage/sync/controller.py                     |  1 +
 lib/portage/sync/modules/rsync/rsync.py            | 85 +++++--------------
 lib/portage/sync/syncbase.py                       | 53 +++++++++++-
 7 files changed, 306 insertions(+), 66 deletions(-)

https://gitweb.gentoo.org/proj/portage.git/commit/?id=de0b60ff277311e780102131dce3111b4db1c196

commit de0b60ff277311e780102131dce3111b4db1c196
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-07-28 13:53:11 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-09-24 03:41:32 +0000

    Add _sync_decorator module

    Add functions that decorate coroutine methods and functions for
    synchronous usage, allowing coroutines to smoothly blend with
    synchronous code. This eliminates clutter that might otherwise
    discourage the proliferation of coroutine usage for I/O bound tasks.

    In the next commit, _sync_decorator will be used for smooth
    integration of new classes that have coroutine methods.

    Bug: https://bugs.gentoo.org/662070
    Reviewed-by: Brian Dolbec <dolsen@gentoo.org>
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 .../tests/util/futures/test_compat_coroutine.py    | 14 ++++++
 lib/portage/util/futures/_sync_decorator.py        | 54 ++++++++++++++++++++++
 2 files changed, 68 insertions(+)
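
For readers unfamiliar with the pattern behind the _sync_decorator commit, the general idea of wrapping a coroutine for synchronous callers can be sketched as below. This is only an illustration using the standard asyncio module, not the code in lib/portage/util/futures/_sync_decorator.py; the names sync_function and fetch_update_location are hypothetical.

```python
import asyncio
import functools


def sync_function(coro_func):
    """Wrap a coroutine function so that it can be called synchronously.

    Hypothetical helper: it runs the coroutine to completion on a fresh
    event loop, so it must not be called from code that is already
    running inside an event loop.
    """
    @functools.wraps(coro_func)
    def wrapper(*args, **kwargs):
        return asyncio.run(coro_func(*args, **kwargs))
    return wrapper


@sync_function
async def fetch_update_location(name):
    # Stand-in for an I/O bound task such as preparing a storage directory.
    await asyncio.sleep(0)
    return "update-dir-for-" + name
```

Synchronous code can then call fetch_update_location("gentoo") like an ordinary function and receive the coroutine's return value, without any event loop handling at the call site.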