Bug 662070 - sys-apps/portage: use temporary immutable repository snapshots to avoid interference (repos.conf sync-rcu)
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core
Hardware: All All
Importance: Normal enhancement
Assignee: Portage team
URL:
Whiteboard:
Keywords: InVCS
Depends on:
Blocks: 240187 666956
Reported: 2018-07-25 07:18 UTC by Zac Medico
Modified: 2022-07-28 05:16 UTC
CC List: 1 user

See Also:
Package list:
Runtime testing required: ---
Description Zac Medico gentoo-dev 2018-07-25 07:18:57 UTC
For example, temporary immutable repository snapshots can be used as a lock-free means to avoid interference from concurrent repository sync operations, as in bug 639374. The snapshots could be created using an abstraction layer with plugins for various snapshot implementations based on things like gitfs, btrfs, zfs, device-mapper, and overlayfs.
Comment 1 Zac Medico gentoo-dev 2018-07-27 18:58:54 UTC
We can easily implement a read-copy-update (RCU) mechanism that works on any POSIX-compliant file system, using hardlinks to share files between snapshots, and symlinks for atomic snapshot updates. Readers of a particular immutable snapshot would resolve its canonical path via a symlink, and when the symlink is atomically updated it will only affect new readers. Old immutable snapshots can be safely garbage collected when there are no more readers. An update starts by cloning the latest snapshot using a command like cp -rl, and programs like rsync and git are smart enough to break a hardlink when updating a file, ensuring copy-on-write behavior.
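The mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the code from the branch or from Portage; the function names (clone_snapshot, publish, update_file) are hypothetical, and it assumes a POSIX file system where renaming a symlink over another symlink is atomic.

```python
import os
import tempfile


def clone_snapshot(src, dst):
    """Clone src into dst, hardlinking every regular file (similar to cp -rl)."""
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_root = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            # Both names now refer to the same inode; no file data is copied.
            os.link(os.path.join(root, name), os.path.join(target_root, name))


def publish(snapshot, link_path):
    """Atomically repoint link_path at snapshot by renaming a temporary symlink."""
    tmp = link_path + ".new"  # hypothetical temporary name
    if os.path.lexists(tmp):
        os.unlink(tmp)
    os.symlink(snapshot, tmp)
    os.replace(tmp, link_path)


def update_file(path, data):
    """Write a new file and rename it over path, which breaks the hardlink.

    The older snapshot keeps the old inode, giving copy-on-write behavior.
    """
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w") as f:
        f.write(data)
    os.replace(tmp, path)
```

A reader would call os.path.realpath() on the symlink once and keep using the resolved path, so a later publish() only affects readers that resolve the symlink afterwards.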
Comment 2 Zac Medico gentoo-dev 2018-07-30 08:10:14 UTC
I have a working implementation based on hardlinks in this branch, but I want to make some changes before I consider it complete:

https://github.com/zmedico/portage/tree/bug_662070_immutable_repo_snapshot
Comment 4 Larry the Git Cow gentoo-dev 2018-09-24 06:12:10 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=36f50e3b64756179758a8e3a11a3c6c666550cf5

commit 36f50e3b64756179758a8e3a11a3c6c666550cf5
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-07-31 07:28:45 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-09-24 05:52:52 +0000

    Add sync-rcu support for rsync (bug 662070)
    
    Add a boolean sync-rcu repos.conf setting that behaves as follows:
    
        Enable read-copy-update (RCU) behavior for sync operations. The
        current latest immutable version of a repository will be referenced
        by a symlink found where the repository would normally be located
        (see the location setting). Repository consumers should resolve
        the canonical path of this symlink before attempting to access
        the repository, and all operations should be read-only, since
        the repository is considered immutable. Updates occur by atomic
        replacement of the symlink, which causes new consumers to use the
        new immutable version, while any earlier consumers continue to
        use the canonical path that was resolved earlier. This option
        requires sync-allow-hardlinks and sync-rcu-store-dir options to
        be enabled, and currently also requires that sync-type is set
        to rsync. This option is disabled by default, since the symlink
        usage would require special handling for scenarios involving bind
        mounts and chroots.
    
    Bug: https://bugs.gentoo.org/662070
    Reviewed-by: Brian Dolbec <dolsen@gentoo.org>
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/repository/config.py               |  36 +++-
 lib/portage/repository/storage/hardlink_rcu.py | 251 +++++++++++++++++++++++++
 lib/portage/sync/syncbase.py                   |   4 +-
 lib/portage/tests/sync/test_sync_local.py      |  40 +++-
 man/portage.5                                  |  35 ++++
 5 files changed, 360 insertions(+), 6 deletions(-)
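
The option dependencies stated in the commit message above (sync-rcu needs sync-allow-hardlinks, sync-rcu-store-dir, and sync-type set to rsync) can be illustrated with a small check. This is only a sketch: the repos.conf paths are made-up examples, check_rcu_settings is a hypothetical helper, and Portage's own validation in lib/portage/repository/config.py may differ.

```python
import configparser

# Example repos.conf section; both paths are illustrative assumptions.
REPOS_CONF = """
[gentoo]
location = /var/db/repos/gentoo
sync-type = rsync
sync-allow-hardlinks = yes
sync-rcu = yes
sync-rcu-store-dir = /var/cache/portage-rcu/gentoo
"""


def check_rcu_settings(section):
    """Return a list of problems with the sync-rcu related settings."""
    problems = []
    if not section.getboolean("sync-rcu", fallback=False):
        return problems  # sync-rcu is disabled by default, nothing to check
    if section.get("sync-type") != "rsync":
        problems.append("sync-rcu currently requires sync-type = rsync")
    # Hardlink usage is described as the existing default, hence fallback=True.
    if not section.getboolean("sync-allow-hardlinks", fallback=True):
        problems.append("sync-rcu requires sync-allow-hardlinks")
    if not section.get("sync-rcu-store-dir"):
        problems.append("sync-rcu requires sync-rcu-store-dir")
    return problems


parser = configparser.ConfigParser()
parser.read_string(REPOS_CONF)
problems = check_rcu_settings(parser["gentoo"])
```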

https://gitweb.gentoo.org/proj/portage.git/commit/?id=884ad951d700d1871cab2e321e4d8635b1a0f698

commit 884ad951d700d1871cab2e321e4d8635b1a0f698
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-07-30 06:21:30 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-09-24 04:24:10 +0000

    rsync: split out repo storage framework
    
    Since there are many ways to manage repository storage, split out a repo
    storage framework. The HardlinkQuarantineRepoStorage class implements
    the existing default behavior, and the InplaceRepoStorage class
    implements the legacy behavior (when sync-allow-hardlinks is disabled in
    repos.conf).
    
    Each class implements RepoStorageInterface, which uses coroutine methods
    since coroutines are well-suited to the I/O bound tasks that these
    methods perform. The _sync_decorator is used to convert coroutine
    methods to synchronous methods, for smooth integration into the
    surrounding synchronous code.
    
    Bug: https://bugs.gentoo.org/662070
    Reviewed-by: Brian Dolbec <dolsen@gentoo.org>
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/repository/storage/__init__.py         |  2 +
 .../repository/storage/hardlink_quarantine.py      | 95 ++++++++++++++++++++++
 lib/portage/repository/storage/inplace.py          | 49 +++++++++++
 lib/portage/repository/storage/interface.py        | 87 ++++++++++++++++++++
 lib/portage/sync/controller.py                     |  1 +
 lib/portage/sync/modules/rsync/rsync.py            | 85 +++++--------------
 lib/portage/sync/syncbase.py                       | 53 +++++++++++-
 7 files changed, 306 insertions(+), 66 deletions(-)

https://gitweb.gentoo.org/proj/portage.git/commit/?id=de0b60ff277311e780102131dce3111b4db1c196

commit de0b60ff277311e780102131dce3111b4db1c196
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-07-28 13:53:11 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-09-24 03:41:32 +0000

    Add _sync_decorator module
    
    Add functions that decorate coroutine methods and functions for
    synchronous usage, allowing coroutines to smoothly blend with
    synchronous code. This eliminates clutter that might otherwise
    discourage the proliferation of coroutine usage for I/O bound tasks.
    
    In the next commit, _sync_decorator will be used for smooth
    integration of new classes that have coroutine methods.
    
    Bug: https://bugs.gentoo.org/662070
    Reviewed-by: Brian Dolbec <dolsen@gentoo.org>
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 .../tests/util/futures/test_compat_coroutine.py    | 14 ++++++
 lib/portage/util/futures/_sync_decorator.py        | 54 ++++++++++++++++++++++
 2 files changed, 68 insertions(+)
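
The idea of decorating a coroutine for synchronous use can be sketched with the standard library alone. The name run_sync is hypothetical, and this version relies on asyncio.run, so it assumes no event loop is already running in the calling thread; the actual _sync_decorator module works with Portage's own event loop handling and may behave differently.

```python
import asyncio
import functools


def run_sync(coro_func):
    """Wrap a coroutine function so that it can be called like a normal function."""

    @functools.wraps(coro_func)
    def wrapper(*args, **kwargs):
        # Run the coroutine to completion and hand back its result.
        return asyncio.run(coro_func(*args, **kwargs))

    return wrapper


@run_sync
async def add_later(a, b):
    """Toy coroutine standing in for an I/O bound task."""
    await asyncio.sleep(0)
    return a + b
```

Synchronous callers can then write add_later(2, 3) without touching the event loop themselves.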