We can use rsync with --link-dest to sync to a temporary location that contains hardlinks to the previous state for files that are unchanged. This gives copy-on-write-like behavior on any filesystem that supports hardlinks. It's safe, as long as we trust rsync not to corrupt the previous state. This is inspired by the concept known as "rsync time machine", which works well in practice: https://blog.interlinked.org/tutorials/rsync_time_machine.html This will allow us to abort the atomic repository update if signature verification fails, so the temporary location acts as an effective quarantine mechanism.
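As a rough illustration of the --link-dest idea described above, here is a minimal Python sketch that builds such an rsync command line. The function names and the example paths are hypothetical and are not taken from the Portage patch; only the rsync options themselves (--archive, --delete, --link-dest) are real.

```python
import os
import subprocess


def build_link_dest_command(source, previous, staging):
    """Build an rsync argv that syncs `source` into `staging`.

    Files that are identical to the copy under `previous` are
    hardlinked into `staging` instead of being transferred again.
    """
    # rsync interprets a relative --link-dest path relative to the
    # destination directory, so an absolute path is less surprising.
    previous = os.path.abspath(previous)
    return [
        "rsync",
        "--archive",
        "--delete",
        f"--link-dest={previous}",
        # The trailing slash makes rsync copy the directory contents.
        source.rstrip("/") + "/",
        staging,
    ]


def sync_to_staging(source, previous, staging):
    """Run the command; raises CalledProcessError if rsync fails."""
    subprocess.run(build_link_dest_command(source, previous, staging), check=True)
```

Because the previous state is only read and hardlinked, a failed or rejected sync can be discarded by deleting the staging directory.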
Patch posted for review: https://github.com/gentoo/portage/pull/334 https://archives.gentoo.org/gentoo-portage-dev/threads/2018-07/
The new behavior may conflict with configurations that restrict the use of hardlinks, such as overlay filesystems. Therefore, users will have to set "sync-allow-hardlinks = no" in repos.conf if they have a configuration that prevents the use of hardlinks, but this should not be very common.
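For illustration, the snippet below shows what such a repos.conf entry could look like and reads it back with Python's configparser. This is only a sketch: the section contents are an example, the helper name is hypothetical, the default of "yes" is an assumption inferred from the description above, and Portage's own parsing code may differ.

```python
import configparser

# Example repos.conf content; location and sync-uri are illustrative.
SAMPLE_REPOS_CONF = """
[gentoo]
location = /var/db/repos/gentoo
sync-type = rsync
sync-uri = rsync://rsync.gentoo.org/gentoo-portage
sync-allow-hardlinks = no
"""


def hardlinks_allowed(conf_text, repo, default=True):
    """Return the sync-allow-hardlinks setting for `repo`.

    `default` is used when the option is absent (assumed to be
    enabled, since users must opt out explicitly).
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getboolean(repo, "sync-allow-hardlinks", fallback=default)
```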
News item sent for review: https://archives.gentoo.org/gentoo-dev/message/004d9ca4c918e55638a919b87b5d52dd
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=84822ef7a21494d3f044c2ffa7b112e4d29665ab

commit 84822ef7a21494d3f044c2ffa7b112e4d29665ab
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-07-05 13:10:43 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-07-10 05:03:53 +0000

    rsync: quarantine data prior to verification (bug 660410)

    Sync into a quarantine subdirectory, using the rsync --link-dest option
    to create hardlinks to identical files in the previous snapshot of the
    repository. If hardlinks are not supported, then show a warning message
    and sync directly to the normal repository location.

    If verification succeeds, then the quarantine subdirectory is synced
    to the normal repository location, and the quarantine subdirectory
    is deleted. If verification fails, then the quarantine directory is
    preserved for purposes of analysis.

    Even if verification happens to be disabled, the quarantine directory
    is still useful for making the repository update more atomic, so that
    it is less likely that normal repository location will be observed in
    a partially synced state.

    The new behavior may conflict with configurations that restrict the
    use of hardlinks, such as overlay filesystems. Therefore, users will
    have to set "sync-allow-hardlinks = no" in repos.conf if they have
    a configuration that prevents the use of hardlinks, but this should
    not be very common.

    Bug: https://bugs.gentoo.org/660410

 cnf/repos.conf                          |  1 +
 man/portage.5                           |  8 +++
 pym/portage/repository/config.py        |  7 ++-
 pym/portage/sync/modules/rsync/rsync.py | 87 ++++++++++++++++++++++++++++++---
 4 files changed, 94 insertions(+), 9 deletions(-)
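The quarantine flow in the commit message above can be sketched roughly as follows. This is not the code from the commit: the function name, the callbacks `fetch` and `verify`, and the use of a sibling temporary directory are all assumptions made for illustration, and the final step uses shutil.copytree (Python 3.8+) instead of a second rsync run, so it does not remove stale files from the repository.

```python
import shutil
import tempfile
from pathlib import Path


def quarantined_update(repo_dir, fetch, verify):
    """Sync into a quarantine directory and commit only if verified.

    fetch(path) populates the quarantine directory; verify(path)
    returns True when the data passes verification.
    Returns (ok, quarantine_path); quarantine_path is None on success.
    """
    repo_dir = Path(repo_dir)
    # Hypothetical layout: quarantine lives next to the repository.
    quarantine = Path(tempfile.mkdtemp(prefix=".quarantine-", dir=repo_dir.parent))
    fetch(quarantine)
    if not verify(quarantine):
        # Verification failed: leave the repository untouched and
        # keep the quarantine directory for later analysis.
        return False, quarantine
    # Verification succeeded: copy into the normal location and
    # delete the quarantine directory.
    shutil.copytree(quarantine, repo_dir, dirs_exist_ok=True)
    shutil.rmtree(quarantine)
    return True, None
```

The point of the structure is that the normal repository location is only written to after verification has passed.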
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/data/gentoo-news.git/commit/?id=e398a453a62490fd3ff168887f761e0c91ab50e9

commit e398a453a62490fd3ff168887f761e0c91ab50e9
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-07-08 05:51:09 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-07-11 08:54:05 +0000

    2018-07-11-portage-sync-allow-hardlinks: Add

    Bug: https://bugs.gentoo.org/660410

 .../2018-07-11-portage-sync-allow-hardlinks.en.txt | 34 ++++++++++++++++++++++
 1 file changed, 34 insertions(+)
rsync needs --one-file-system, or should just ignore distfiles, in case it is mounted separately like on my system. (Maybe I should instruct portage to look elsewhere for distfiles using PORTDIR :)
(In reply to Spyros Papanastasiou from comment #6)
> rsync needs --one-file-system or just ignore distfiles, in case it is
> mounted like in my system. (maybe I should instruct portage to look elsewhere
> for distfiles using PORTDIR :)

Thanks for reporting, I've added distfiles to the default excludes now:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=b587fc874ce95064139ba85552e146da957cce9e

commit b587fc874ce95064139ba85552e146da957cce9e
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2018-09-28 05:32:31 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2018-09-28 05:43:26 +0000

    HardlinkQuarantineRepoStorage: exclude distfiles and packages (bug 666554)

    These paths have been excluded for a long time, and they are also
    whitelisted in the top-level Manifest.files.gz:

        IGNORE distfiles
        IGNORE local
        IGNORE lost+found
        IGNORE packages

    Bug: https://bugs.gentoo.org/666554
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/repository/storage/hardlink_quarantine.py | 2 ++
 1 file changed, 2 insertions(+)
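To make the exclusion concrete, here is a small sketch of how exclude options for rsync could be generated for the two paths named in the commit title. The constant and function names are hypothetical, and the actual change in hardlink_quarantine.py may be written differently; the leading slash relies on rsync's rule that such a pattern is anchored to the root of the transfer.

```python
# Paths named in the commit title; both are assumed to sit at the
# top level of the repository directory.
EXCLUDED_PATHS = ("distfiles", "packages")


def exclude_args(names=EXCLUDED_PATHS):
    """Return rsync --exclude options anchored at the transfer root."""
    return [f"--exclude=/{name}" for name in names]
```

With these options rsync skips the excluded directories entirely, so a distfiles directory on a separate mount is neither hardlinked nor copied.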