== Installed Software

zfs 0.7.0
zfs-kmod 0.7.0
spl 0.7.0
linux 4.9.34-gentoo
genkernel 3.5.1.1

== Description

genkernel 3.4.52.4-r2 produced a faulty initramfs that couldn't import a zfs root filesystem. That was fixed in genkernel 3.5.1.1. After installing genkernel 3.5.1.1, I hit a new problem: it takes ~36 seconds to import a zfs root filesystem in the initramfs generated by genkernel 3.5.1.1.

Reproducible: Didn't try

Actual Results:
It takes ~36 seconds to import a zfs root filesystem.

Expected Results:
It should take less than 2 seconds.
I installed sys-kernel/dracut-045-r2 and executed `dracut --hostonly`. Dracut's initramfs doesn't add the 36-second delay to the boot process.
In my environment (zfs-root pool with separate /boot), using zfs-0.7.1 (kmod and spl too) and genkernel-3.5.1.1, I have this 40-second boot delay too. Booting with zfs 0.6.5.11, there is no delay.
Can confirm. Having the same issue since I switched to genkernel 3.5.1.1, which I had to switch to because of bug 626362
Created attachment 497568 [details]
zpool.strace.tar.xz

chris@hamiltonshells did some debugging on IRC with me, and we got to the bottom of it.

The delay is zpool trying to look for udev files under /run/udev/data/b*, which don't exist.

straces of 'zpool import' inside genkernel from Chris are attached.
Key pieces from the strace:

$ grep /run/udev zpool.strace.*[0-9] -A1 |cut -d' ' -f2- |sort |uniq -c
...
   6415 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
   3282 open("/run/udev/data/b259:0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
   3272 open("/run/udev/data/b259:2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    139 sched_yield() = 0

It tries to open a file, then waits 10ms, and repeats until ~30 seconds are up (roughly 3,280 attempts at 10ms each per device).
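As a rough cross-check of the strace counts, the time spent polling one device can be estimated by multiplying the number of failed open() attempts by the sleep between them. This is only a back-of-the-envelope sketch: `poll_wait_ms` is a hypothetical helper, and it ignores the time spent in the syscalls themselves.

```shell
#!/bin/sh
# Estimate how long one device is polled, given the number of retries
# and the sleep between retries in milliseconds.
# poll_wait_ms is a hypothetical helper name, not part of zfs or genkernel.
poll_wait_ms() {
    attempts=$1
    sleep_ms=$2
    echo $(( attempts * sleep_ms ))
}

# 3282 failed opens of /run/udev/data/b259:0, each followed by a
# 10 ms nanosleep (tv_nsec=10000000), as counted in the strace summary.
poll_wait_ms 3282 10    # prints 32820
```

That is about 33 seconds for a single device, which is in the same range as the boot delay reported in this bug.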
Re-assigning to ZFS maintainers. I think that the older ZFS code didn't wait for udev like this, which is why the wait seems new. Genkernel's initramfs will probably NEVER get udev in it: As the genkernel maintainer, I want it to move to dracut for the initramfs portion, so either ZFS in the existing initramfs needs to not wait for udev like this, or the ZFS users should go to dracut.
(In reply to Robin Johnson from comment #6)
> Genkernel's initramfs will probably NEVER get udev in it: As the genkernel
> maintainer, I want it to move to dracut for the initramfs portion, so either
> ZFS in the existing initramfs needs to not wait for udev like this, or the
> ZFS users should go to dracut.

Just to understand/clarify: you're saying/proposing to remove the initramfs creation from genkernel and ask users to switch to use dracut to create their initramfs. Is that correct?

So genkernel will only do the kernel building and there will be no initramfs related actions in genkernel?
(In reply to Simon from comment #7)
> (In reply to Robin Johnson from comment #6)
> > Genkernel's initramfs will probably NEVER get udev in it: As the genkernel
> > maintainer, I want it to move to dracut for the initramfs portion, so either
> > ZFS in the existing initramfs needs to not wait for udev like this, or the
> > ZFS users should go to dracut.
>
> Just to understand/clarify: you're saying/proposing to remove the initramfs
> creation from genkernel and ask users to switch to use dracut to create
> their initramfs. Is that correct?
>
> So genkernel will only do the kernel building and there will be no initramfs
> related actions in genkernel?

In the short term, users will need to call dracut directly if they have something that's not really resolvable in genkernel as-is.

Further out, genkernel will call dracut to generate the initramfs; it might also inject extra dracut modules for compatibility/easier migration.
Hi. I ran into this issue and have an alternate solution:

zpool import -d ${ZPOOL_IMPORT_PATH_1} -d ${ZPOOL_IMPORT_PATH_2} ...

where ${ZPOOL_IMPORT_PATH_n} is populated from a kernel command line argument like "/dev/sda;/dev/sdb".

By default, we leave the current behaviour. However, if you pass the kernel command line argument, the scan will be a lot faster because only the specified devices are scanned.
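A minimal sketch of how such a kernel command line argument could be turned into -d flags in an initramfs script. The parameter name `zpool_import_paths` and both function names are assumptions made up for illustration; genkernel does not define them.

```shell
#!/bin/sh
# Hypothetical helpers; neither the parameter name nor the functions
# exist in genkernel.

# Print the value of key=value from a kernel command line string.
get_cmdline_value() {
    key=$1
    cmdline=$2
    for word in $cmdline; do
        case $word in
            "$key"=*) echo "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

# Turn "/dev/sda;/dev/sdb" into "-d /dev/sda -d /dev/sdb".
build_import_dir_args() {
    paths=$1
    args=""
    old_ifs=$IFS
    IFS=';'
    for p in $paths; do
        # Skip empty entries such as a trailing ';'
        [ -n "$p" ] && args="${args:+$args }-d $p"
    done
    IFS=$old_ifs
    echo "$args"
}

# Possible use inside the initramfs (not executed here):
#   paths=$(get_cmdline_value zpool_import_paths "$(cat /proc/cmdline)")
#   zpool import $(build_import_dir_args "$paths") -N "$pool"
```

If the argument is absent, `build_import_dir_args` prints an empty string, so the import falls back to the default scan, matching the "leave the current behaviour by default" idea.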
Another option would be to bundle the ZFS cachefile on the initramfs, then run

zpool import -c /path/to/cachefile

This will have the effect of automatically speeding up imports until fundamental zpool device membership changes, at which point I guess it will regress or fail. Maybe a genkernel initramfs --bundle-zpool-cachefile option?
For the dracut approach: note that while there is apparently no mention of ZFS in Gentoo's dracut wiki page at https://wiki.gentoo.org/wiki/Dracut or the package USE flags, after installing there is code in /usr/lib/dracut/modules.d/90zfs and /usr/lib/dracut/modules.d/02zfsexpandknowledge

Possible documentation (maybe another implementation) at https://github.com/zfsonlinux/zfs/blob/master/contrib/dracut/README.dracut.markdown
I think I found the source of the problem.

genkernel creates its initramfs using the file /usr/share/genkernel/gen_initramfs.sh

This file currently assumes your ZFS pool is called zpool, and has a cachefile property set to /etc/zfs/zpool.cache ... see the line:

"for i in zdev.conf zpool.cache"

Unfortunately, this is not something that can be assumed. It is perfectly normal to run without a cache file, or with a cache file in another location, or with a ZFS pool with a different name.

Instead, genkernel should read the list of current pool cachefiles from the environment as follows (xargs appends each pool name as the final argument):

zpool list -H |cut -f1 |xargs -n1 zpool get -H cachefile |cut -f3

This list of cachefiles is what should really be copied to the initramfs.
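A small sketch of how the cachefile list could be filtered before copying anything into the initramfs. It assumes the tab-separated name/property/value/source layout printed by `zpool get -H`, and that an unset cachefile is shown as "-" or "none"; `list_cachefiles` is a hypothetical helper, not genkernel code.

```shell
#!/bin/sh
# Read `zpool get -H cachefile <pool>` output on stdin and print the
# distinct, explicitly set cachefile paths.
# list_cachefiles is a hypothetical helper name.
list_cachefiles() {
    # Field 3 is the property value; skip "-" (default) and "none".
    awk -F '\t' '$2 == "cachefile" && $3 != "-" && $3 != "none" { print $3 }' | sort -u
}

# Possible use on a running system (not executed here):
#   zpool list -H | cut -f1 | xargs -n1 zpool get -H cachefile | list_cachefiles
```

Pools that rely on the default location print nothing here, so a real implementation would still need to decide whether to copy /etc/zfs/zpool.cache for them.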
# VERIFIED SHORT TERM FIX
# (WARNING: NOT SAFE FOR MULTIPLE POOLS - ONLY USE ON SINGLE POOL SYSTEMS)
# to fix ZFS slow boot
zpool get cachefile pool                       # verify no cachefile set
zpool set cachefile=/etc/zfs/cachefile pool    # set cachefile
ls -al /etc/zfs/cachefile                      # verify creation
sed -i -e 's/zpool.cache/cachefile/' /usr/share/genkernel/gen_initramfs.sh
# then edit the file
#   /usr/share/genkernel/defaults/initrd.scripts
# and add "-c /zfs/cachefile" to the 'Importing ZFS pool ' line, before the final argument.

# LONG TERM FIX
# upgrade genkernel_initramfs to:
#  1. save the cachefiles to locations defined by pool name, eg. in /etc/zfs/pool-caches/<poolname>.cache
#  2. have the /usr/share/genkernel/defaults/initrd.scripts file load these
#     by default from the same location on the initramfs
#  3. for best possible outcome, test failure conditions on device change
#     after initramfs creation with former cachefile
PS. zdev.conf doesn't even exist on my system and is definitely old cruft. This thing definitely needs a refresher. I can try submitting a patch if there is an up to date github repo.
OK I assume I should try submitting a pull request at https://github.com/robbat2/genkernel ?
Pull request submitted https://github.com/robbat2/genkernel/pull/15
(In reply to Walter from comment #13)
> zpool set cachefile=/etc/zfs/cachefile pool # set cachefile
> # then edit the file
> # /usr/share/genkernel/defaults/initrd.scripts
> # and add "-c /zfs/cachefile" to the 'Importing ZFS pool ' line, before the

Thanks, it works.

Note: it has to be /etc/zfs/cachefile, of course, in the initrd.scripts file.
Thanks for confirming the fix. Pull request has been sitting idle for 3+ weeks, it seems robbat2 is on holiday or in hospital or prison or something!
Unfortunately, I am very far behind on my bug mail. I am playing catch up and going through the backlog now. I expect to have this squashed within 72 hours.
Alternative, simpler solution:
https://github.com/robbat2/genkernel/pull/17

If someone wants to test, you can use this diff:
https://patch-diff.githubusercontent.com/raw/robbat2/genkernel/pull/17.diff

rename it to genkernel-dozfs-cache.patch
put it into /etc/portage/patches/sys-kernel/genkernel/
re-emerge genkernel and re-generate the initrd
make sure boot parameters include dozfs=cache

It takes a couple of seconds to import with my 4 pools, instead of a minute or so.
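The patch placement step can be sketched as a small shell function. `install_genkernel_patch` and its optional root argument are hypothetical and only exist to make the step easy to try against a scratch directory; the target path and file name are the ones given above.

```shell
#!/bin/sh
# Copy an already downloaded diff into the Portage user-patches directory
# for sys-kernel/genkernel.
# install_genkernel_patch is a hypothetical helper; the second argument is
# an optional alternative root directory, useful for a dry run.
install_genkernel_patch() {
    patch_file=$1
    root=${2:-/}
    dest_dir="${root%/}/etc/portage/patches/sys-kernel/genkernel"
    mkdir -p "$dest_dir" &&
        cp "$patch_file" "$dest_dir/genkernel-dozfs-cache.patch"
}

# Possible use as root, after saving the diff as 17.diff (not executed here):
#   install_genkernel_patch 17.diff
```

After that, genkernel still has to be re-emerged and the initrd regenerated, as described above, before dozfs=cache has any effect.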
Created attachment 535016 [details, diff] patch to add dozfs=cache
Created attachment 535020 [details, diff] patch to add dozfs=cache
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/proj/genkernel.git/commit/?id=5c5c32aa7261a29a3ac48035086bb59449d3804d

commit 5c5c32aa7261a29a3ac48035086bb59449d3804d
Author:     Georgy Yakovlev <ya@sysdump.net>
AuthorDate: 2018-06-06 02:33:07 +0000
Commit:     Georgy Yakovlev <ya@sysdump.net>
CommitDate: 2018-06-06 06:02:35 +0000

    Add option to force importing zpool using cache

    Add simple option to pass to kernel via loader.
    dozfs=cache will use /etc/zfs/zpool.cache
    avoiding 30+ second wait for udev in zpool import

    Also it's possible to use both cache and force at the same time:
    dozfs=force,cache (order is not important) will force import and use cache.

    Closes: https://bugs.gentoo.org/627320
    Signed-off-by: Georgy Yakovlev <ya@sysdump.net>

 defaults/initrd.scripts |  6 +++---
 defaults/linuxrc        | 19 +++++++++++++++----
 doc/genkernel.8.txt     |  6 +++---
 3 files changed, 21 insertions(+), 10 deletions(-)
How am I supposed to use genkernel? Do I need no change to the genkernel command that I run to generate initramfs?
(In reply to crocket from comment #24)
> How am I supposed to use genkernel? Do I need no change to the genkernel
> command that I run to generate initramfs?

No, initramfs generation stays the same, but you need to generate a new one if the disk configuration changes and zpool.cache is updated.

Procedure:
1) Install genkernel-9999; I'll ask robbat to package a new release soon.
2) Make sure /etc/zfs/zpool.cache is present and up to date. It should be fine by default. It will end up in the initramfs.
3) Generate a zfs-aware initramfs as usual.
4) Update your bootloader to pass "dozfs=cache" to the kernel.

If you ever run into a situation where the zpool.cache inside the initramfs is stale and the system fails to boot, you can just remove the =cache portion from dozfs. The boot will take 30+ seconds, but it will boot by scanning, without relying on the cache.

I'm changing this bug status so it stays visible until genkernel is tagged.
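To illustrate the option handling, here is a sketch of how a comma-separated dozfs value such as "force,cache" could be mapped to zpool import flags regardless of order. This is not the code from the genkernel commit; `dozfs_import_flags` is a hypothetical name, and only the -f and -c flags of zpool import are relied on.

```shell
#!/bin/sh
# Map the value of dozfs= (e.g. "cache", "force", "force,cache") to
# zpool import flags. Hypothetical sketch, not genkernel's implementation.
dozfs_import_flags() {
    value=$1
    use_cache=0
    use_force=0
    old_ifs=$IFS
    IFS=','
    for opt in $value; do
        case $opt in
            cache) use_cache=1 ;;
            force) use_force=1 ;;
        esac
    done
    IFS=$old_ifs

    # Build the flags after the loop so the result does not depend on
    # the order in which the options were given.
    flags=""
    [ "$use_force" -eq 1 ] && flags="-f"
    [ "$use_cache" -eq 1 ] && flags="${flags:+$flags }-c /etc/zfs/zpool.cache"
    echo "$flags"
}

# Possible use inside the initramfs (not executed here):
#   zpool import $(dozfs_import_flags "$dozfs_value") -N "$pool"
```

Removing "cache" from the value drops the -c flag again, which corresponds to the fallback of booting by scanning when the bundled zpool.cache is stale.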
Closing this bug now; genkernel-v4.0.0_beta2 hit the Gentoo repository and contains that fix.
Just FYI, the new genkernel contains a fix using ZPOOL_IMPORT_UDEV_TIMEOUT_MS:
https://gitweb.gentoo.org/proj/genkernel.git/commit/?id=2eb1d04cfbfa397b58a0b388f8ed28688fd114d8

zfs-0.8.2, and soon 0.7.13, will contain this patch:
https://github.com/zfsonlinux/zfs/commit/803884217f9b9b5fb235d7c5e78a809d271f6387

So the dozfs=cache workaround will not be required, and zfs on genkernel will just work with no delay, like it used to for 0.6.x.
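On a zfs build that includes the patch, the udev wait can be shortened by setting the environment variable for the import command. The sketch below is an assumption about usage, not a copy of what genkernel does: the value 0 is only an example, and `ZPOOL_CMD` is a hypothetical override so the function can be tried without a real pool.

```shell
#!/bin/sh
# Import a pool without mounting it (-N), with the udev wait set to 0 ms.
# ZPOOL_CMD is a hypothetical override for dry runs; it defaults to zpool.
# The timeout value 0 is an assumption chosen for illustration.
import_pool_without_udev_wait() {
    pool=$1
    zpool_cmd=${ZPOOL_CMD:-zpool}
    ZPOOL_IMPORT_UDEV_TIMEOUT_MS=0 "$zpool_cmd" import -N "$pool"
}

# Dry run that only prints the arguments (not executed here):
#   ZPOOL_CMD=echo; import_pool_without_udev_wait tank
```

On zfs versions without the patch the variable is simply ignored, so the import would still wait for udev.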
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=0d44f61a3dcb6e943edb307c7196cd0f7044af79

commit 0d44f61a3dcb6e943edb307c7196cd0f7044af79
Author:     Georgy Yakovlev <gyakovlev@gentoo.org>
AuthorDate: 2019-11-26 20:13:37 +0000
Commit:     Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2019-11-26 20:30:41 +0000

    sys-fs/zfs: unkeyworded revbump of 0.7.13, udev timeout patch

    Bug: https://bugs.gentoo.org/627320
    Package-Manager: Portage-2.3.79, Repoman-2.3.18
    Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>

 .../0.7.13-ZPOOL_IMPORT_UDEV_TIMEOUT_MS.patch      |  70 +++++++
 sys-fs/zfs/zfs-0.7.13-r2.ebuild                    | 220 +++++++++++++++++++++
 2 files changed, 290 insertions(+)