An imported ZFS pool must never be modified outside its current session. If a pool has been imported (opened), it must be exported before it can be safely accessed again. Hibernating a system does not export a pool; it remains imported and must not be modified outside the hibernated system. Modifying metadata on a pool that has not been exported (closed) will likely corrupt it. This causes data loss.

Steps to reproduce:
1. Set up a machine with a zpool.
2. Hibernate it.
3. Boot another OS and modify the pool (even a clean import/export is sufficient).
4. Attempt to resume the original system.

The resumed system will expect the pool to be intact, exactly as it was when hibernated; if the pool was imported from the outside in the meantime, it becomes corrupt.

genkernel's initrd does exactly the above. start_volumes imports all ZFS pools before it checks whether we are resuming or not. Importing a pool prior to resume, and then resuming the session where it was already imported, will corrupt it.

start_volumes runs multiple blocks for different volume backends; the ZFS-related block is here:
https://gitweb.gentoo.org/proj/genkernel.git/tree/defaults/initrd.scripts#n1726

The entire ZFS block must run after the resume checks. The attached patch does not modify the logic of volume init but splits the ZFS block into a separate function, start_zfs. start_volumes will run as before, minus the ZFS block, and start_zfs will run after do_resume in:
https://gitweb.gentoo.org/proj/genkernel.git/tree/defaults/linuxrc#n728

The same issue has been reported before, twice, in
https://bugs.gentoo.org/577484
https://bugs.gentoo.org/827281
and seemingly abandoned after a lengthy (and largely off-topic) discussion.

Upstream knows about this quite well:
https://github.com/openzfs/zfs/issues/12842
https://github.com/openzfs/zfs/issues/14118

Other distributions identified and fixed an identical issue in their init scripts:
https://github.com/NixOS/nixpkgs/pull/208037

I'd like to point out that this issue has nothing to do with swap on zvol or other exotic cases; this is purely an incorrect sequence of steps in genkernel's volume init script, and the use case is supported by upstream if done right.

Reproducible: Sometimes
Created attachment 875866: safe-zpool-import.patch
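
For readers without the attachment at hand, the ordering change described in this report can be sketched roughly as follows. This is a simplified model, not the patch itself and not genkernel's real code: the function bodies are stand-ins that only record the call order, and record, init_sequence, CALL_ORDER and HIBERNATION_IMAGE_PRESENT are hypothetical names introduced for illustration. Only start_volumes, do_resume and start_zfs come from the report.

```shell
#!/bin/sh
# Simplified model of the proposed init ordering. All function bodies are
# placeholders; nothing here touches real pools or a real resume device.

# Hypothetical switch standing in for "a hibernation image was found".
HIBERNATION_IMAGE_PRESENT="${HIBERNATION_IMAGE_PRESENT:-no}"
CALL_ORDER=""

# Hypothetical helper: append a step name to the recorded call order.
record() {
    CALL_ORDER="${CALL_ORDER:+${CALL_ORDER} }$1"
}

start_volumes() {
    # Non-ZFS volume backends are still initialized here, as before.
    record start_volumes
}

do_resume() {
    record do_resume
    if [ "${HIBERNATION_IMAGE_PRESENT}" = "yes" ]; then
        # Stand-in for the hibernated session taking over: in this model
        # a non-zero status means "nothing after this point may run".
        return 1
    fi
    return 0
}

start_zfs() {
    # The ZFS block, split out of start_volumes. It is reached only when
    # no session was resumed, so a hibernated session's pools stay untouched.
    record start_zfs
}

init_sequence() {
    CALL_ORDER=""
    start_volumes
    do_resume || return 0
    start_zfs
}

init_sequence
echo "${CALL_ORDER}"
```

With the default setting the model prints the three steps in order; with HIBERNATION_IMAGE_PRESENT=yes it stops after do_resume and start_zfs is never reached, which is the property the report asks for.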