Bug 700138 - If PORTAGE_TMPDIR is on btrfs, use subvolume instead of directory for performance
Status: CONFIRMED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Enhancement/Feature Requests
Hardware: All Linux
Importance: Normal normal
Assignee: Portage team
Blocks: 835380
Reported: 2019-11-15 03:03 UTC by adebeus
Modified: 2023-05-23 09:45 UTC
Description adebeus 2019-11-15 03:03:14 UTC
If PORTAGE_TMPDIR is on btrfs, portage should make "${PORTAGE_TMPDIR}/portage/${CATEGORY}/${P}" a subvolume instead of a directory to improve performance.

Currently, "rm -rf" is used for cleanup, which can be slow on large source trees (i.e., exactly the ones where PORTAGE_TMPDIR needs to be on disk instead of tmpfs in the first place), especially on btrfs. On any filesystem, "rsync -a --delete /tmp/empty/ /dir/to/delete/" is faster than "rm -rf" since it deletes the files in the optimal order, so that would also be a potential performance enhancement. On btrfs, however, deleting a subvolume is by far the fastest way to delete a large number of files.

Apart from cleanup, I think using a subvolume might also help the btrfs block allocator make better decisions, especially if other I/O is going on at the same time as the build.

There might be permission issues with this unless the filesystem is mounted with user_subvol_rm_allowed, but in that case it should be possible to fall back to using directories, just as if a filesystem other than btrfs were in use.
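
As a rough illustration of the fallback described above, here is a minimal Python sketch. It is not Portage code: the helper names make_build_dir and remove_build_dir are hypothetical, and it assumes the btrfs command-line tool from btrfs-progs may or may not be installed. It simply tries "btrfs subvolume create" and falls back to a plain directory when that fails for any reason (non-btrfs filesystem, missing tool, or insufficient permissions).

```python
import os
import shutil
import subprocess


def _run_btrfs(*args):
    """Run `btrfs <args>`; return True only if the command succeeded."""
    try:
        result = subprocess.run(
            ["btrfs", *args],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # btrfs-progs is not installed or not executable.
        return False
    return result.returncode == 0


def make_build_dir(path):
    """Create `path` as a btrfs subvolume if possible, else as a directory.

    Hypothetical helper. Returns "subvolume" or "directory" so the caller
    knows which kind was created.
    """
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    if not os.path.exists(path) and _run_btrfs("subvolume", "create", path):
        return "subvolume"
    os.makedirs(path, exist_ok=True)
    return "directory"


def remove_build_dir(path):
    """Delete `path`, trying subvolume deletion before a recursive remove."""
    if _run_btrfs("subvolume", "delete", path):
        return "subvolume"
    # Not a subvolume, or deletion was not permitted: behave like "rm -rf".
    shutil.rmtree(path)
    return "directory"
```

Because both helpers fall back silently, a caller on ext4 or tmpfs would see the same behaviour as today; only the return value reveals which path was taken.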
Comment 1 Enne Eziarc 2019-11-15 10:14:17 UTC
I'd be interested to see the standard set of kernel benchmark tables for this, including the medium-term effect it has on the whole system.

In my own real-world experience with a point-in-time rollback system that cycled through only about 50 subvols per hour, we ended up needing a workaround that used carefully spaced deletes and sync calls; the latency spikes were unacceptable otherwise.

So, I doubt this pattern actually makes things faster. It might look that way when you're not paying attention to anything but emerge output scrolling past, but in fact it just defers the space deallocation to Btrfs' internal queue. The block dealloc IO then happens all at once during the next flush, at VFS commit priority instead of under portage's ionice class and cgroup.
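
The "carefully spaced deletes and sync calls" workaround mentioned above might look roughly like the following Python sketch. The function name delete_spaced and its parameters are hypothetical, the pause length is an arbitrary placeholder, and the actual delete operation is passed in by the caller, so this shows only the pacing logic and not how the original rollback system was implemented.

```python
import os
import time


def delete_spaced(paths, delete, pause_seconds=5.0, sync=os.sync, sleep=time.sleep):
    """Delete each path in turn, syncing after each and pausing in between.

    `delete` is any callable that removes one path (for example, a wrapper
    around "btrfs subvolume delete"). `sync` and `sleep` are injectable so
    the pacing can be exercised without touching a real filesystem.
    """
    paths = list(paths)
    deleted = []
    for index, path in enumerate(paths):
        delete(path)
        sync()  # request a flush now rather than letting work pile up
        deleted.append(path)
        if index < len(paths) - 1:
            sleep(pause_seconds)  # spread the resulting IO out over time
    return deleted
```

Note that os.sync() only requests that pending writes be flushed; it does not wait for btrfs to finish reclaiming a deleted subvolume's space in the background. Waiting for that would need something like "btrfs subvolume sync <path>", which this sketch does not call.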
Comment 2 Zac Medico gentoo-dev 2019-11-15 20:07:53 UTC
We can have plugins to support various implementations. A plugin based on the containers storage library would use the storage driver you have configured in /etc/containers/storage.conf. Available drivers include btrfs, devmapper, and zfs.

https://github.com/containers/storage/tree/master/drivers
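
To make the plugin idea concrete, here is a minimal Python sketch of what such an interface could look like. Everything in it is an assumption for illustration: BuildDirBackend, DirectoryBackend, register_backend and get_backend are hypothetical names, not existing Portage APIs, and the sketch does not call the containers storage library at all.

```python
import os
import shutil
from abc import ABC, abstractmethod


class BuildDirBackend(ABC):
    """Hypothetical plugin interface for creating and removing build dirs."""

    name = "base"

    @abstractmethod
    def create(self, path):
        """Create the build directory at `path`."""

    @abstractmethod
    def remove(self, path):
        """Remove the build directory at `path` and everything in it."""


class DirectoryBackend(BuildDirBackend):
    """Default backend: plain directories, removed recursively."""

    name = "directory"

    def create(self, path):
        os.makedirs(path, exist_ok=True)

    def remove(self, path):
        shutil.rmtree(path, ignore_errors=True)


_BACKENDS = {}


def register_backend(cls):
    """Register a backend class under its `name` attribute."""
    _BACKENDS[cls.name] = cls
    return cls


def get_backend(name, default="directory"):
    """Return an instance of the named backend, or of the default one."""
    cls = _BACKENDS.get(name) or _BACKENDS[default]
    return cls()


register_backend(DirectoryBackend)
```

A btrfs, devmapper, or zfs backend would then be another subclass registered under its own name, with the plain-directory backend remaining the fallback when the requested one is unavailable.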
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-05-23 09:45:00 UTC
If nothing else, using rsync more is not necessarily a bad idea, fwiw.
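
For reference, the rsync-based cleanup from the description ("rsync -a --delete /tmp/empty/ /dir/to/delete/") can be wrapped as in the Python sketch below. The function name remove_tree_with_rsync is hypothetical and this is not how Portage currently cleans up; the sketch falls back to shutil.rmtree when rsync is missing or fails, and it makes no claim about which path is faster on a given system.

```python
import os
import shutil
import subprocess
import tempfile


def remove_tree_with_rsync(target):
    """Empty `target` by syncing an empty directory over it, then remove it.

    Returns "rsync" if rsync did the work, or "rmtree" if the fallback ran.
    """
    empty = tempfile.mkdtemp(prefix="empty-")
    try:
        try:
            # Trailing separators make rsync operate on directory contents.
            result = subprocess.run(
                ["rsync", "-a", "--delete", empty + os.sep, target + os.sep],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                check=False,
            )
            succeeded = result.returncode == 0
        except OSError:
            # rsync is not installed or not executable.
            succeeded = False
        if succeeded:
            os.rmdir(target)  # target is now empty; remove the directory itself
            return "rsync"
        shutil.rmtree(target)
        return "rmtree"
    finally:
        os.rmdir(empty)
```

Any comparison against "rm -rf" would still need to be measured on the filesystem in question, including the medium-term effects raised in Comment 1.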