Bug 827281

Summary: sys-kernel/genkernel-4.2.5: Decrypting crypt_root before crypt_swap will corrupt zfs/zpools
Product: Gentoo Hosted Projects
Reporter: Daniel Morlock <info>
Component: genkernel
Assignee: Gentoo Genkernel Maintainers <genkernel>
Status: UNCONFIRMED
Severity: normal
CC: gyakovlev, sam
Priority: Normal
Keywords: PATCH
Version: unspecified
Hardware: All
OS: Linux
URL: https://github.com/openzfs/zfs/issues/260
Whiteboard:
Package list:
Runtime testing required: ---
Attachments: Patches initrd to decrypt swap before decrypting root

Description Daniel Morlock 2021-11-25 12:16:37 UTC
I'm using zfs on root and a swap partition outside the zfs filesystem for hibernating (to disk).

My kernel command line looks as follows:

options dozfs crypt_root=UUID=612a36bf-607c-4c8f-8dfd-498b87ea6b7f crypt_swap=UUID=8d173ef7-2af5-4ae5-9b7f-ad06985b1dd0 root=ZFS=rpool_ws1/system/root resume=UUID=74ef965e-688b-495d-95b4-afc449c15750 systemd.unified_cgroup_hierarchy=0
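To make the role of each parameter easier to see, here is a minimal sketch of how an initramfs script could pull individual values out of such a command line. The helper name get_cmdline_value is hypothetical and is not taken from genkernel; the command line itself is the one quoted above, minus the leading "options" keyword.

```shell
#!/bin/sh
# Hypothetical helper, not part of genkernel.
# get_cmdline_value KEY CMDLINE
#   Prints the value of the first KEY=... token in CMDLINE.
#   Returns 1 if no such token exists.
get_cmdline_value() {
    key=$1
    # Intentionally unquoted: split the command line on whitespace.
    for token in $2; do
        case $token in
            "$key"=*)
                # Strip only the leading "KEY="; later "=" signs stay in the value.
                printf '%s\n' "${token#"$key"=}"
                return 0
                ;;
        esac
    done
    return 1
}

CMDLINE='dozfs crypt_root=UUID=612a36bf-607c-4c8f-8dfd-498b87ea6b7f crypt_swap=UUID=8d173ef7-2af5-4ae5-9b7f-ad06985b1dd0 root=ZFS=rpool_ws1/system/root resume=UUID=74ef965e-688b-495d-95b4-afc449c15750 systemd.unified_cgroup_hierarchy=0'

get_cmdline_value crypt_swap "$CMDLINE"
get_cmdline_value resume "$CMDLINE"
```

Note that crypt_swap names the encrypted container while resume names the swap device inside it, so the container has to be opened before the resume device can be inspected.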


So for a normal boot, the order is as follows:

1. Decrypt crypt_root
2. Import zpool
3. Decrypt crypt_swap
4. Determine that the resume device is empty
5. Boot the system from the imported zpool

But if my system was hibernated to disk, the resume looks as follows:

1. Decrypt crypt_root
2. Import zpool
3. Decrypt crypt_swap
4. Determine that the resume device is NON-empty
5. Resume into the hibernated system, which already has the zpool imported

Running "zpool import" twice like this will corrupt a zpool in a way that it cannot be recovered without a backup; see https://github.com/openzfs/zfs/issues/260 for more details.
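The ordering that avoids the second import can be sketched as follows. This is only an illustration of the idea, not the attached patch: all function names are hypothetical stand-ins that record the order of operations instead of touching real devices, and it assumes that the resumed kernel brings back its own dm-crypt mapping and pool state, so neither crypt_root nor the zpool needs to be touched on the resume path.

```shell
#!/bin/sh
# Hypothetical sketch of a resume-safe ordering; none of these names
# come from genkernel. Each stub only records that it was called.

STEPS=""
log_step() { STEPS="${STEPS:+$STEPS,}$1"; }

open_crypt_swap()  { log_step "open_swap"; }
open_crypt_root()  { log_step "open_root"; }
import_zpool()     { log_step "import_zpool"; }
# On a real system a successful resume never returns to the initramfs;
# the stub just records the attempt.
resume_from_swap() { log_step "resume"; }

# boot_sequence HAS_IMAGE
#   HAS_IMAGE=1 means the resume device holds a hibernation image.
boot_sequence() {
    STEPS=""
    open_crypt_swap
    if [ "$1" = "1" ]; then
        # Resume first: the pool must not be imported on this path.
        resume_from_swap
        return 0
    fi
    # No image found, so it is safe to open root and import the pool.
    open_crypt_root
    import_zpool
    log_step "boot"
}

boot_sequence 1; echo "resume path: $STEPS"
boot_sequence 0; echo "normal path: $STEPS"
```

The point of the sketch is simply that the pool import sits behind the resume decision on every path.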

Reproducible: Always

Steps to Reproduce:
1. Hibernate to disk
2. Resume the system
Actual Results:  
System is resumed and the zpool is imported twice.
Comment 1 Daniel Morlock 2021-11-25 13:08:37 UTC
Created attachment 756280 [details, diff]
Patches initrd to decrypt swap before decrypting root
Comment 2 Daniel Morlock 2021-11-25 13:10:35 UTC
I've attached a patch for genkernel (initrd.scripts and linuxrc) that tries to decrypt the swap device and resume from it before proceeding with the default order. I don't know what impact this has on filesystems other than zfs or on other boot operations, so this is just a proposal.
Comment 3 Georgy Yakovlev archtester gentoo-dev 2021-11-25 20:11:45 UTC
I don't speak for genkernel, leaving that for maintainer.

But I will just note that hibernating with zfs is unsupported, discouraged and will eventually lead to data loss.
Comment 4 Daniel Morlock 2021-11-26 07:54:45 UTC
According to a zfs maintainer, hibernation with zfs should be fine as long as the zfs kthreads can be frozen during the hibernation process, i.e. the hibernation image should be stored on a swap partition/file that is outside the zfs filesystem. This is not 100% bulletproof, but there is no reason why zfs should not work with hibernation.
One nasty trap is the double import of a zpool, which can immediately corrupt the zpool without a chance for recovery. And the genkernel initrd.scripts (start_volumes()) will double-import a zpool when crypt root and crypt swap are used. I think that the zfs handling in start_volumes() is wrong: start_volumes() should only prepare the volumes, e.g. for lvm this is done by scanning for lvm volumes. In the case of zfs, a "zpool import" is invoked, which might already mount some volumes. Since this import happens before the resume from swap is attempted, the root zpool is imported a second time on every resume.
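One way to enforce that separation would be a guard that refuses to import a pool until the resume check has run. The following is a minimal sketch under that assumption; check_resume, import_root_pool and both variables are hypothetical names and do not exist in genkernel's initrd.scripts.

```shell
#!/bin/sh
# Hypothetical guard, not genkernel code: the pool import is only
# allowed after the resume check has been performed.

RESUME_CHECKED=0
POOL_IMPORTED=0

check_resume() {
    # Stand-in for the real resume attempt. If control returns here,
    # no hibernation image was resumed.
    RESUME_CHECKED=1
}

import_root_pool() {
    if [ "$RESUME_CHECKED" != "1" ]; then
        echo "refusing to import pool before resume check" >&2
        return 1
    fi
    # Stand-in for the actual pool import.
    POOL_IMPORTED=1
}

import_root_pool || echo "import blocked"
check_resume
import_root_pool && echo "import allowed"
```

With such a guard, a volume-preparation step that runs too early fails loudly instead of silently importing the pool.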