Bug 660410 - sys-apps/portage: use rsync --link-dest to implement atomic repository updates (and abort if signature verification fails)
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core
Hardware: All All
Importance: Normal normal
Assignee: Portage team
URL:
Whiteboard:
Keywords: InVCS
Depends on:
Blocks: 240187 659322
Reported: 2018-07-05 00:30 UTC by Zac Medico
Modified: 2018-10-12 19:32 UTC

Description Zac Medico gentoo-dev 2018-07-05 00:30:45 UTC
We can use rsync with --link-dest to sync to a temporary location that contains hardlinks to the previous state for any files that are unchanged. This gives copy-on-write behavior on any filesystem that supports hardlinks. It's safe, as long as we trust rsync not to modify the previous state in place. This is inspired by the technique known as the "rsync time machine", and it works great:

   https://blog.interlinked.org/tutorials/rsync_time_machine.html

This will allow us to abort the atomic repository update if signature verification fails, acting as an effective quarantine mechanism.
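
As a rough illustration of the idea above (not the code that landed in portage), the rsync invocation could be assembled along these lines. The function names, the chosen flags other than --link-dest, and the paths are all assumptions made for the example:

```python
import os
import subprocess


def build_link_dest_cmd(source, previous, quarantine):
    """Return an rsync argv that syncs `source` into `quarantine`,
    hardlinking files that are unchanged relative to `previous`.

    Hypothetical helper for illustration; not part of portage.
    """
    # A relative --link-dest is interpreted relative to the destination
    # directory, so pass an absolute path to avoid surprises.
    previous = os.path.abspath(previous)
    return [
        "rsync",
        "--archive",
        "--delete",
        "--link-dest=" + previous,
        source.rstrip("/") + "/",      # trailing slash: transfer the contents
        quarantine.rstrip("/") + "/",
    ]


def sync_to_quarantine(source, previous, quarantine):
    """Run the sync; the previous state is only read, never written."""
    os.makedirs(quarantine, exist_ok=True)
    subprocess.run(build_link_dest_cmd(source, previous, quarantine), check=True)
```

Since only the quarantine location is written to, the previous state stays usable until the new one has been checked.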
Comment 2 Zac Medico gentoo-dev 2018-07-08 02:02:40 UTC
The new behavior may conflict with configurations that restrict the use of hardlinks, such as overlay filesystems. Therefore, users will have to set "sync-allow-hardlinks = no" in repos.conf if they have a configuration that prevents the use of hardlinks, but this should not be very common.
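
For anyone unsure whether their setup allows hardlinks, a small probe like the following can tell. This is only a sketch under my own assumptions (the function name is made up, and it is not how portage itself decides); it creates a temporary file in the given directory and tries to link it:

```python
import os
import tempfile


def hardlinks_supported(directory):
    """Return True if a hardlink can be created inside `directory`.

    Hypothetical helper for illustration; not part of portage.
    """
    fd, src = tempfile.mkstemp(dir=directory)
    os.close(fd)
    dst = src + ".link"
    try:
        os.link(src, dst)
    except OSError:
        # Filesystem or security policy refused the link.
        return False
    else:
        os.unlink(dst)
        return True
    finally:
        # Always remove the probe file itself.
        os.unlink(src)
```

If this returns False for the repository location, setting "sync-allow-hardlinks = no" for that repository in repos.conf is the setting described above.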
Comment 3 Zac Medico gentoo-dev 2018-07-08 06:11:20 UTC
News item sent for review:

https://archives.gentoo.org/gentoo-dev/message/004d9ca4c918e55638a919b87b5d52dd
Comment 4 Larry the Git Cow gentoo-dev 2018-07-10 06:38:02 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=84822ef7a21494d3f044c2ffa7b112e4d29665ab

commit 84822ef7a21494d3f044c2ffa7b112e4d29665ab
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-07-05 13:10:43 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-07-10 05:03:53 +0000

    rsync: quarantine data prior to verification (bug 660410)
    
    Sync into a quarantine subdirectory, using the rsync --link-dest option
    to create hardlinks to identical files in the previous snapshot of the
    repository. If hardlinks are not supported, then show a warning message
    and sync directly to the normal repository location.
    
    If verification succeeds, then the quarantine subdirectory is synced
    to the normal repository location, and the quarantine subdirectory
    is deleted. If verification fails, then the quarantine directory is
    preserved for purposes of analysis.
    
    Even if verification happens to be disabled, the quarantine directory
    is still useful for making the repository update more atomic, so that
    it is less likely that normal repository location will be observed in
    a partially synced state.
    
    The new behavior may conflict with configurations that restrict the
    use of hardlinks, such as overlay filesystems. Therefore, users will
    have to set "sync-allow-hardlinks = no" in repos.conf if they have
    a configuration that prevents the use of hardlinks, but this should
    not be very common.
    
    Bug: https://bugs.gentoo.org/660410

 cnf/repos.conf                          |  1 +
 man/portage.5                           |  8 +++
 pym/portage/repository/config.py        |  7 ++-
 pym/portage/sync/modules/rsync/rsync.py | 87 ++++++++++++++++++++++++++++++---
 4 files changed, 94 insertions(+), 9 deletions(-)
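
The flow described in the commit message above can be sketched roughly as follows. This is a simplified illustration, not the actual rsync.py code: the `sync` and `verify` callables, the return values, and the use of a sibling quarantine directory are assumptions, and the final copy stands in for the second sync that the commit message describes (it does not remove files that were deleted upstream):

```python
import shutil


def update_repository(repo_dir, quarantine_dir, sync, verify, allow_hardlinks=True):
    """Sync into quarantine, verify, then commit or preserve.

    `sync(dest)` fetches the new state into `dest`; `verify(path)` returns
    True when verification passes. Both are hypothetical callables.
    """
    if not allow_hardlinks:
        # Fallback from the commit message: warn and sync in place.
        print("warning: hardlinks unavailable, syncing directly")
        sync(repo_dir)
        return "synced-directly"

    sync(quarantine_dir)
    if not verify(quarantine_dir):
        # Keep the quarantine directory around for analysis.
        return "quarantined"

    # Verification passed: publish the new state, then drop the quarantine.
    shutil.copytree(quarantine_dir, repo_dir, dirs_exist_ok=True)
    shutil.rmtree(quarantine_dir)
    return "committed"
```

Note that `dirs_exist_ok` requires Python 3.8 or later. The point of the ordering is that the normal repository location is not touched until verification has succeeded.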
Comment 5 Larry the Git Cow gentoo-dev 2018-07-11 09:05:33 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/data/gentoo-news.git/commit/?id=e398a453a62490fd3ff168887f761e0c91ab50e9

commit e398a453a62490fd3ff168887f761e0c91ab50e9
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-07-08 05:51:09 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-07-11 08:54:05 +0000

    2018-07-11-portage-sync-allow-hardlinks: Add
    
    Bug: https://bugs.gentoo.org/660410

 .../2018-07-11-portage-sync-allow-hardlinks.en.txt | 34 ++++++++++++++++++++++
 1 file changed, 34 insertions(+)
Comment 6 Spyros Papanastasiou 2018-09-28 04:45:47 UTC
rsync needs --one-file-system, or should just ignore distfiles, in case it is a separate mount as on my system. (Maybe I should instruct portage to look elsewhere for distfiles using PORTDIR :)
Comment 7 Zac Medico gentoo-dev 2018-09-28 05:54:10 UTC
(In reply to Spyros Papanastasiou from comment #6)
> rsync needs --one-file-system, or should just ignore distfiles, in case it
> is a separate mount as on my system. (Maybe I should instruct portage to
> look elsewhere for distfiles using PORTDIR :)

Thanks for reporting, I've added distfiles to the default excludes now:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=b587fc874ce95064139ba85552e146da957cce9e

commit b587fc874ce95064139ba85552e146da957cce9e
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-09-28 05:32:31 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-09-28 05:43:26 +0000

    HardlinkQuarantineRepoStorage: exclude distfiles and packages (bug 666554)
    
    These paths have been excluded for a long time, and they are also
    whitelisted in the top-level Manifest.files.gz:
    
        IGNORE distfiles
        IGNORE local
        IGNORE lost+found
        IGNORE packages
    
    Bug: https://bugs.gentoo.org/666554
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/repository/storage/hardlink_quarantine.py | 2 ++
 1 file changed, 2 insertions(+)
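
For illustration only, excludes for the IGNORE entries listed above could be turned into rsync arguments like this. The constant and function names are made up, and exactly which entries hardlink_quarantine.py passes to rsync is not shown in this bug:

```python
# Top-level names taken from the IGNORE entries quoted in the commit message.
IGNORED_TOP_LEVEL = ("distfiles", "local", "lost+found", "packages")


def exclude_args(names=IGNORED_TOP_LEVEL):
    """Build rsync --exclude arguments for top-level directories.

    Hypothetical helper for illustration; not part of portage.
    """
    # A leading slash anchors the pattern at the root of the transfer,
    # so only the top-level directory of that name is skipped.
    return ["--exclude=/" + name for name in names]
```

With such excludes in place, a distfiles directory on a separate mount is never traversed, which covers the case reported in comment 6 without needing --one-file-system.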