827281 – sys-kernel/genkernel-4.2.5: Decrypting crypt_root before crypt_swap will corrupt zfs/zpools


Status:	UNCONFIRMED

Alias:	None

Product:	Gentoo Hosted Projects
Classification:	Unclassified
Component:	genkernel
Hardware:	All Linux

Importance:	Normal normal
Assignee:	Gentoo Genkernel Maintainers

URL:	https://github.com/openzfs/zfs/issues...
Whiteboard:
Keywords:	PATCH

Depends on:
Blocks:

Reported:	2021-11-25 12:16 UTC by Daniel Morlock
Modified:	2023-12-14 08:07 UTC
CC List:	3 users

See Also:	918688
Package list:
Runtime testing required:	---

Attachments
Patches initrd to decrypt swap before decrypting root (initrd-fix-decrypt-swap-before-root.patch, 2.79 KB, patch) 2021-11-25 13:08 UTC, Daniel Morlock


Description Daniel Morlock 2021-11-25 12:16:37 UTC

I'm using zfs on root and a swap partition outside the zfs filesystem for hibernating (to disk).

My kernel commandline looks as follows:

options dozfs crypt_root=UUID=612a36bf-607c-4c8f-8dfd-498b87ea6b7f crypt_swap=UUID=8d173ef7-2af5-4ae5-9b7f-ad06985b1dd0 root=ZFS=rpool_ws1/system/root resume=UUID=74ef965e-688b-495d-95b4-afc449c15750 systemd.unified_cgroup_hierarchy=0


So for a normal boot, the order is as follows:

1. Decrypt crypt_root
2. Import zpool
3. Decrypt crypt_swap
4. Determine that the resume device is empty
5. Boot the system from the imported zpool

But if my system was hibernated to disk, the resume looks as follows:

1. Decrypt crypt_root
2. Import zpool
3. Decrypt crypt_swap
4. Determine that the resume device is NON-empty
5. Resume into a system that had already imported the zpool before hibernating

Doing "zpool import" twice will corrupt a zpool in such a way that it cannot be recovered without a backup; see https://github.com/openzfs/zfs/issues/260 for more details.

Reproducible: Always

Steps to Reproduce:
1. Hibernate to disk
2. Resume the system
Actual Results:  
System is resumed and the zpool is imported twice.
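
One way to avoid the second import would be to check the resume device for a hibernation image before touching the pool. The following is only a sketch under assumptions, not genkernel code: it assumes the in-kernel hibernation format, which to my understanding marks the swap header with the signature "S1SUSPEND" in the last 10 bytes of the first page, and it assumes a 4096-byte page size. The function names has_hibernate_image and safe_to_import are hypothetical.

```shell
#!/bin/sh
# Sketch: report whether a (decrypted) swap device appears to hold a
# hibernation image.
# Assumption: 4096-byte pages, so the 10-byte signature starts at offset 4086.
has_hibernate_image() {
    dev="$1"
    # Read 9 bytes; the 10th byte of the signature field is a trailing NUL.
    sig="$(dd if="$dev" bs=1 skip=4086 count=9 2>/dev/null)"
    [ "$sig" = "S1SUSPEND" ]
}

# Hypothetical guard: only allow the pool import when no image is waiting.
safe_to_import() {
    if has_hibernate_image "$1"; then
        echo "hibernation image found on $1; skipping zpool import" >&2
        return 1
    fi
    return 0
}
```

In an initramfs this check would have to run against the opened crypt_swap mapping, i.e. after decrypting swap but before any "zpool import".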

Comment 1 Daniel Morlock 2021-11-25 13:08:37 UTC

Created attachment 756280
Patches initrd to decrypt swap before decrypting root

Comment 2 Daniel Morlock 2021-11-25 13:10:35 UTC

I've attached a patch for genkernel, i.e. for initrd.scripts and linuxrc, that tries to decrypt and resume from swap before proceeding with the default order. I don't know what impact this has on filesystems other than zfs and/or on other boot operations, so this is just a proposal.
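
The ordering proposed in this comment can be sketched roughly as follows. This is not the attached patch: every function here is a hypothetical stand-in that only records its name, and HAS_IMAGE is a test knob rather than a real genkernel variable. The point is just that the resume attempt happens before crypt_root is opened and before the pool is imported.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real initramfs steps; each only logs its name.
ORDER=""
log_step() { ORDER="${ORDER:+$ORDER }$1"; }

open_crypt_swap() { log_step open_swap; }
open_crypt_root() { log_step open_root; }
import_zpool()    { log_step import_zpool; }

# Assumption: succeeds when the resume device holds a hibernation image.
has_hibernate_image() { [ "${HAS_IMAGE:-0}" = "1" ]; }

# In a real initramfs a successful resume never returns; here it just logs.
try_resume() {
    if has_hibernate_image; then
        log_step resume
        return 0
    fi
    return 1
}

boot_sequence() {
    ORDER=""
    open_crypt_swap
    # Resume first: the restored kernel already has the pool imported,
    # so the initramfs must not import it again.
    if try_resume; then
        return 0
    fi
    open_crypt_root
    import_zpool
    log_step boot
}
```

With HAS_IMAGE=1 the recorded order stops after the resume step; with HAS_IMAGE=0 it continues with the usual root decryption and pool import.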

Comment 3 Georgy Yakovlev archtester 2021-11-25 20:11:45 UTC

I don't speak for genkernel; I'll leave that to the maintainer.

But I will just note that hibernating with zfs is unsupported, discouraged and will eventually lead to data loss.

Comment 4 Daniel Morlock 2021-11-26 07:54:45 UTC

According to a zfs maintainer, hibernation with zfs should be fine as long as the zfs kthreads can be frozen during the hibernation process, i.e. the hibernation image should be stored on a swap partition/file that is outside the zfs filesystem. This is not 100% bulletproof, but there is no reason why zfs should not work with hibernation.
One nasty trap is the double import of a zpool, which can immediately corrupt the pool without a chance of recovery. The genkernel initrd.scripts (start_volumes()) will double-import a zpool when both crypt_root and crypt_swap are used. I think that the zfs handling in start_volumes() is wrong: start_volumes() should only prepare the volumes; for lvm, for example, this is done by scanning for lvm volumes. In the case of zfs, however, a "zpool import" is invoked, which might already mount some volumes. Since this import happens before a resume from swap can take place, the root zpool will be imported twice on every resume.