Bug 671864 - sys-apps/portage: vardbapi aux_update transactions with write-ahead logging
Status: CONFIRMED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core
Hardware: All
OS: All
Importance: Normal enhancement
Assignee: Portage team
URL:
Whiteboard:
Keywords: PATCH
Depends on:
Blocks: 193766 605082
Reported: 2018-11-25 13:15 UTC by Zac Medico
Modified: 2021-03-17 06:01 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Add write ahead log (integrity-w-write-ahead.patch,5.75 KB, patch)
2018-12-09 22:51 UTC, Sam
Add write ahead log using aux_update (integrity-w-write-ahead-0.2.patch,26.81 KB, patch)
2019-01-30 23:03 UTC, Sam
Add write ahead log using aux_update - fixed (integrity-w-write-ahead-0.3.patch,31.68 KB, patch)
2019-01-30 23:11 UTC, Sam

Description Zac Medico gentoo-dev 2018-11-25 13:15:20 UTC
When vardbapi manages multiple files containing metadata for installed files (see bug 605082), it's important to ensure that those files remain consistent with each other. Write-ahead logging (WAL) can be used to record transactions before they are applied, making it possible to roll back or roll forward a previously interrupted transaction in order to achieve a consistent state. For the vardbapi aux_update method, a transaction WAL can be implemented with a procedure like this:

1) Create replacement metadata files in a transaction/new/ directory, and a Manifest.

2) Create hardlinks of the old metadata files in a transaction/old/ directory, and a Manifest.

3) Use SyncfsProcess to sync the staged transaction to disk.

4) Apply changes from the transaction/new/ directory, using a series of atomic operations, like `ln transaction/new/foo foo.new && mv foo.new foo`.

5) Use SyncfsProcess to sync changes to disk.

6) `mv transaction transaction.complete && rm -rf transaction.complete`
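
The procedure above can be sketched in Python roughly as follows. This is only an illustration under several assumptions, not the attached patch: `write_manifest`, `sync_to_disk` and `apply_transaction` are hypothetical names, the Manifest is a simplified "digest name" listing rather than Portage's real Manifest format, and `os.sync()` stands in for SyncfsProcess.

```python
import hashlib
import os
import shutil


def write_manifest(directory):
    """Write a simplified Manifest: one 'sha256 name' line per file (assumed format)."""
    lines = []
    for name in sorted(os.listdir(directory)):
        if name == "Manifest":
            continue
        with open(os.path.join(directory, name), "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        lines.append(f"{digest} {name}\n")
    with open(os.path.join(directory, "Manifest"), "w") as f:
        f.writelines(lines)


def sync_to_disk():
    # Stand-in for Portage's SyncfsProcess, which is not reproduced here.
    os.sync()


def apply_transaction(pkg_dir, updates):
    """Apply updates (file name -> new bytes) to pkg_dir through a staged transaction."""
    txn = os.path.join(pkg_dir, "transaction")
    new_dir = os.path.join(txn, "new")
    old_dir = os.path.join(txn, "old")
    # makedirs fails if a transaction directory already exists, which would
    # indicate an earlier interrupted transaction that needs recovery first.
    os.makedirs(new_dir)
    os.makedirs(old_dir)

    # Stage the replacement files and their Manifest.
    for name, data in updates.items():
        with open(os.path.join(new_dir, name), "wb") as f:
            f.write(data)
    write_manifest(new_dir)

    # Hardlink the current files so that they survive being replaced.
    for name in updates:
        current = os.path.join(pkg_dir, name)
        if os.path.exists(current):
            os.link(current, os.path.join(old_dir, name))
    write_manifest(old_dir)

    sync_to_disk()

    # Apply each file atomically: link under a temporary name, then rename.
    for name in updates:
        target = os.path.join(pkg_dir, name)
        tmp = target + ".new"
        os.link(os.path.join(new_dir, name), tmp)
        os.replace(tmp, target)

    sync_to_disk()

    # Mark the transaction complete with one rename, then discard it.
    done = txn + ".complete"
    os.rename(txn, done)
    shutil.rmtree(done)
```

Renaming the directory before deleting it means a crash during cleanup leaves only a transaction.complete directory behind, which can be removed without any recovery decision.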

If the transaction is interrupted, it's possible to roll back if transaction/old/ contains a set of files with a valid Manifest. If transaction/new/ contains a set of files with a valid Manifest, it's also possible to roll forward.
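
A recovery pass along those lines might look like the sketch below. The names `manifest_is_valid`, `restore_from` and `recover` are hypothetical, the Manifest is assumed to be a simplified "sha256 name" listing, and preferring roll forward over roll back when both are possible is an assumption made here, not something this bug specifies.

```python
import hashlib
import os
import shutil


def manifest_is_valid(directory):
    """Return True if every Manifest entry matches a file in directory."""
    manifest = os.path.join(directory, "Manifest")
    if not os.path.isfile(manifest):
        return False
    with open(manifest) as f:
        entries = [line.split(None, 1) for line in f.read().splitlines() if line]
    for entry in entries:
        if len(entry) != 2:
            return False
        digest, name = entry
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            return False
        with open(path, "rb") as f:
            if hashlib.sha256(f.read()).hexdigest() != digest:
                return False
    return True


def restore_from(source_dir, pkg_dir):
    """Atomically put each file from source_dir in place inside pkg_dir."""
    for name in os.listdir(source_dir):
        if name == "Manifest":
            continue
        tmp = os.path.join(pkg_dir, name + ".new")
        if os.path.lexists(tmp):
            os.unlink(tmp)  # leftover from the interrupted apply step
        os.link(os.path.join(source_dir, name), tmp)
        os.replace(tmp, os.path.join(pkg_dir, name))
        # rename() does nothing when both names already refer to the same
        # inode, so remove the temporary name if it is still there.
        if os.path.lexists(tmp):
            os.unlink(tmp)


def recover(pkg_dir):
    """Finish or undo an interrupted transaction; return what was done."""
    txn = os.path.join(pkg_dir, "transaction")
    if not os.path.isdir(txn):
        return "clean"
    new_dir = os.path.join(txn, "new")
    old_dir = os.path.join(txn, "old")
    if manifest_is_valid(new_dir):
        restore_from(new_dir, pkg_dir)
        outcome = "rolled forward"
    elif manifest_is_valid(old_dir):
        restore_from(old_dir, pkg_dir)
        outcome = "rolled back"
    else:
        raise RuntimeError("no valid Manifest in transaction/new or transaction/old")
    shutil.rmtree(txn)
    return outcome
```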

Using symlinks, it's possible to make the entire transaction appear to be atomic for readers, but this requires retention of old files until a TTL has expired (like in the repos.conf sync-rcu implementation).
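
As a rough illustration of the symlink approach (not the sync-rcu code itself), the sketch below assumes a hypothetical layout in which each complete set of files lives in its own snapshot directory and readers go through a `current` symlink. The function names, the `current` link name and the use of directory mtime as the snapshot age are all assumptions.

```python
import os
import shutil
import time


def publish_snapshot(base_dir, snapshot_name):
    """Atomically repoint base_dir/current at the named snapshot directory."""
    link = os.path.join(base_dir, "current")
    tmp = link + ".new"
    if os.path.lexists(tmp):
        os.unlink(tmp)
    os.symlink(snapshot_name, tmp)  # relative target inside base_dir
    os.replace(tmp, link)  # readers see either the old or the new snapshot


def prune_snapshots(base_dir, ttl_seconds, now=None):
    """Remove snapshots older than the TTL, never the one currently published."""
    now = time.time() if now is None else now
    current = os.readlink(os.path.join(base_dir, "current"))
    removed = []
    for name in sorted(os.listdir(base_dir)):
        path = os.path.join(base_dir, name)
        if name == current or os.path.islink(path) or not os.path.isdir(path):
            continue
        if now - os.path.getmtime(path) > ttl_seconds:
            shutil.rmtree(path)
            removed.append(name)
    return removed
```

The TTL exists because a reader that resolved the old symlink target just before the switch may still be reading files in the old snapshot, so that snapshot cannot be deleted immediately.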
Comment 1 Sam 2018-12-09 22:51:03 UTC
Created attachment 557478
Add write ahead log

Proposed an implementation of Zac's suggestion.

What do you think of this patch?
Comment 2 Zac Medico gentoo-dev 2018-12-10 09:44:59 UTC
(In reply to Sam from comment #1)
> Created attachment 557478
> Add write ahead log
> 
> Proposed an implementation of Zac's suggestion.
> 
> What do you think of this patch?

Thanks, I'll take a look at this soon.

I'm thinking that we'll probably want to encapsulate the transaction implementation in a plugin so that it's possible to choose an implementation that's optimal for the user's environment. For example, the repository storage framework has multiple implementations that are encapsulated in plugins:

https://gitweb.gentoo.org/proj/portage.git/tree/lib/portage/repository/storage

I would like to try AcidFS to see how it performs:

https://acidfs.readthedocs.io/en/latest/
https://github.com/Pylons/acidfs

A nice feature of AcidFS is that readers can use AcidFS to obtain a consistent view even while transactions are in progress. The history is stored as a git repository, and each reader accesses an immutable snapshot (git tree object) via the AcidFS API.
Comment 3 Zac Medico gentoo-dev 2018-12-10 18:36:39 UTC
(In reply to Sam from comment #1)
> Created attachment 557478
> Add write ahead log
> 
> Proposed an implementation of Zac's suggestion.
> 
> What do you think of this patch?

Please use the aux_update method to apply the transaction.
Comment 4 Sam 2018-12-29 22:58:17 UTC
Okay, I'll have a go at it.

Grepping through dbapi/ I couldn't find calls to vardbapi's aux_update. Does that mean I can change it freely? Or are there other users of it?
Comment 5 Zac Medico gentoo-dev 2018-12-29 23:33:21 UTC
(In reply to Sam from comment #4)
> Okay, I'll have a go at it.

Great, thanks!

> Grepping through dbapi/ I couldn't find calls to vardbapi's aux_update. Does
> that mean I can change it freely? Or are there other users of it?

It should remain compatible, since we do have some consumers. It's mainly used in lib/portage/dbapi/__init__.py to apply metadata updates for package moves and renames. It's also used in these files:

bin/quickpkg
lib/_emerge/FakeVartree.py
lib/_emerge/PackageVirtualDbapi.py
lib/_emerge/resolver/DbapiProvidesIndex.py
Comment 6 Sam 2019-01-30 23:03:05 UTC
Created attachment 563338
Add write ahead log using aux_update

Right, I've got something I'm happy with that now makes use of aux_update and encapsulates both updates to CONTENTS and updates to CONTENTS_*.

(I'm not sure whether it's getting a bit muddy whether changes are in scope of bug 605082 or this bug.)

The attached patch contains the following:
- Move startContentsUpdate(), stopContentsUpdate() to dblink ('pkg').
- Add abortContentsUpdate(), which performs roll-back, to dblink.
- Create a copy of aux_update named aux_update_pkg(), which resides in dblink, as opposed to the still existing aux_update() that resides in vardbapi. Since aux_update only operates on a dblink, and only wants a cpv so that it can immediately fetch a dblink, I was comfortable doing this. But maybe there are concerns I'm not aware of that keep it in vardbapi.
- Move writeContentsToContentsFile() to dblink
- Change writeContentsToContentsFile() to also write CONTENTS_* files, and to perform its writes using the pattern try {start(), update(), stop()} except {abort()}.
- Change vardbapi.removeFromContents() to:
  1) grab data using pkg.getcontents(), pkg.getContentsIndices(), pkg.getContentsMetadata()
  2) call pkg.writeContentsToContentsFile to write out the updates to CONTENTS and CONTENTS_* files.
(BTW, shouldn't removeFromContents() also be moved to dblink?)
- Change pkg._add_preserve_libs_to_contents() to make use of new pkg.writeContentsToContentsFile.
- Miscellaneous small changes.
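
For readers who have not opened the patch, the start/update/stop/abort write pattern described above can be sketched as follows. This is a simplified stand-in rather than the patch's code: `ContentsUpdate`, `write_contents_files` and the `contents-backup` directory are hypothetical, and the backup is a plain copy instead of the hardlink-and-Manifest scheme from the description.

```python
import os
import shutil


class ContentsUpdate:
    """Hypothetical stand-in for the start/stop/abort helpers named in the patch."""

    def __init__(self, pkg_dir):
        self.pkg_dir = pkg_dir
        self.backup_dir = os.path.join(pkg_dir, "contents-backup")

    def start(self):
        # Save a copy of every CONTENTS* file before anything is modified.
        os.mkdir(self.backup_dir)
        for name in os.listdir(self.pkg_dir):
            if name.startswith("CONTENTS"):
                shutil.copy2(os.path.join(self.pkg_dir, name), self.backup_dir)

    def update(self, name, text):
        with open(os.path.join(self.pkg_dir, name), "w") as f:
            f.write(text)

    def stop(self):
        # All updates succeeded, so the saved copies are no longer needed.
        shutil.rmtree(self.backup_dir)

    def abort(self):
        # Roll back: restore the saved copies. Files newly created by a
        # failed update are not removed in this simplified sketch.
        if not os.path.isdir(self.backup_dir):
            return
        for name in os.listdir(self.backup_dir):
            shutil.copy2(os.path.join(self.backup_dir, name),
                         os.path.join(self.pkg_dir, name))
        shutil.rmtree(self.backup_dir)


def write_contents_files(pkg_dir, files):
    """Write several files (name -> text) as one unit, rolling back on error."""
    txn = ContentsUpdate(pkg_dir)
    try:
        txn.start()
        for name, text in files.items():
            txn.update(name, text)
        txn.stop()
    except Exception:
        txn.abort()
        raise
```

Re-raising after abort() keeps the failure visible to the caller while still leaving the files on disk in their pre-update state.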
Comment 7 Sam 2019-01-30 23:11:04 UTC
Created attachment 563340
Add write ahead log using aux_update - fixed

Small fix. The patch now applies to vartree.py after applying the patch from bug 605082.