Bug 627320 - genkernel-3.5.1.1 produces a faulty initramfs where it takes ~36 seconds to import zfs root filesystem: zfs needs support for pre-udev environments.
Summary: genkernel-3.5.1.1 produces a faulty initramfs where it takes ~36 seconds to i...
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Hosted Projects
Classification: Unclassified
Component: genkernel
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Richard Yao (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-08-08 13:30 UTC by anonymous
Modified: 2019-11-26 20:32 UTC
CC List: 8 users

See Also:
Package list:
Runtime testing required: ---


Attachments
zpool.strace.tar.xz (zpool.strace.tar.xz,199.12 KB, application/x-xz)
2017-10-03 20:58 UTC, Robin Johnson
patch to add dozfs=cache (genkernel-dozfs-cache.patch,1.85 KB, patch)
2018-06-06 03:09 UTC, Georgy Yakovlev
patch to add dozfs=cache (genkernel-dozfs-cache.diff,1.99 KB, patch)
2018-06-06 06:08 UTC, Georgy Yakovlev

Description anonymous 2017-08-08 13:30:06 UTC
== Installed Software

zfs 0.7.0
zfs-kmod 0.7.0
spl 0.7.0
linux 4.9.34-gentoo
genkernel 3.5.1.1


== Description

genkernel 3.4.52.4-r2 produced a faulty initramfs that could not import a ZFS root filesystem. That was fixed in genkernel 3.5.1.1.

After I installed genkernel 3.5.1.1, a new problem appeared.

With an initramfs generated by genkernel 3.5.1.1, importing the ZFS root filesystem takes ~36 seconds.

Reproducible: Didn't try

Actual Results:  
It takes ~36 seconds to import a root zfs filesystem.

Expected Results:  
It should take less than 2 seconds.
Comment 1 anonymous 2017-08-12 00:25:31 UTC
I installed sys-kernel/dracut-045-r2 and executed `dracut --hostonly`.
Dracut's initramfs does not add the 36-second delay to the boot process.
Comment 2 Jochen Schlick 2017-08-20 21:33:36 UTC
In my environment (ZFS root pool with a separate /boot), using zfs-0.7.1 (kmod and spl too) and genkernel-3.5.1.1, I see this 40-second boot delay as well.
When booting with zfs 0.6.5.11 there is no delay.
Comment 3 Simon 2017-09-19 22:23:42 UTC
Can confirm. Having the same issue since I switched to genkernel 3.5.1.1, which I had to switch to because of bug 626362
Comment 4 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2017-10-03 20:58:19 UTC
Created attachment 497568 [details]
zpool.strace.tar.xz

chris@hamiltonshells did some debugging on IRC with me, and we got to the bottom of it.

The delay is zpool trying to look for udev files under /run/udev/data/b*, which don't exist.

straces of 'zpool import' inside genkernel from Chris are attached.
Comment 5 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2017-10-03 21:29:28 UTC
Key pieces from the strace

$ grep /run/udev zpool.strace.*[0-9] -A1 |cut -d' ' -f2-  |sort |uniq -c
...
   6415 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
   3282 open("/run/udev/data/b259:0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
   3272 open("/run/udev/data/b259:2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    139 sched_yield()         = 0

It tries to open a file, then waits 10ms, repeats until ~3 seconds are up.
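The pattern in the strace can be sketched in shell as a plain polling loop. This is only an illustration of the behaviour described above, not the actual zpool code; the helper name wait_for_file is made up, and fractional `sleep` is assumed to be available (GNU coreutils and busybox both accept it).

```shell
# Hypothetical helper: poll for a file every 10 ms until timeout_ms has passed.
# Returns 0 as soon as the file exists, 1 if the timeout expires first.
wait_for_file() {
    path=$1
    timeout_ms=$2
    waited_ms=0
    while [ "$waited_ms" -lt "$timeout_ms" ]; do
        if [ -e "$path" ]; then
            return 0
        fi
        sleep 0.01                      # the 10 ms nanosleep seen in the strace
        waited_ms=$((waited_ms + 10))
    done
    return 1
}
```

Calling `wait_for_file /run/udev/data/b259:0 3000` in an initramfs without udev would spend the full ~3 seconds and then fail, which is the per-device cost that adds up to the observed delay.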
Comment 6 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2017-10-03 21:31:37 UTC
Re-assigning to ZFS maintainers.

I think that the older ZFS code didn't wait for udev like this, which is why the wait seems new.

Genkernel's initramfs will probably NEVER get udev in it: As the genkernel maintainer, I want it to move to dracut for the initramfs portion, so either ZFS in the existing initramfs needs to not wait for udev like this, or the ZFS users should go to dracut.
Comment 7 Simon 2017-10-04 11:10:19 UTC
(In reply to Robin Johnson from comment #6) 
> Genkernel's initramfs will probably NEVER get udev in it: As the genkernel
> maintainer, I want it to move to dracut for the initramfs portion, so either
> ZFS in the existing initramfs needs to not wait for udev like this, or the
> ZFS users should go to dracut.

Just to understand/clarify: you're saying/proposing to remove the initramfs creation from genkernel and ask users to switch to use dracut to create their initramfs. Is that correct?

So genkernel will only do the kernel building and there will be no initramfs related actions in genkernel?
Comment 8 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2017-10-04 16:28:26 UTC
(In reply to Simon from comment #7)
> (In reply to Robin Johnson from comment #6) 
> > Genkernel's initramfs will probably NEVER get udev in it: As the genkernel
> > maintainer, I want it to move to dracut for the initramfs portion, so either
> > ZFS in the existing initramfs needs to not wait for udev like this, or the
> > ZFS users should go to dracut.
> 
> Just to understand/clarify: you're saying/proposing to remove the initramfs
> creation from genkernel and ask users to switch to use dracut to create
> their initramfs. Is that correct?
> 
> So genkernel will only do the kernel building and there will be no initramfs
> related actions in genkernel?

In the short term, users will need to call dracut directly if they have something that's not really resolvable in genkernel as-is.

Further out, genkernel will call dracut to generate the initramfs; it might also inject extra dracut modules for compatibility/easier migration.
Comment 9 Walter 2018-01-22 23:07:28 UTC
Hi. I ran into this issue and have an alternate solution.

zpool import -d ${ZPOOL_IMPORT_PATH_1} -d ${ZPOOL_IMPORT_PATH_2} ...

Where ${ZPOOL_IMPORT_PATH_n} is populated from a kernel command line argument like "/dev/sda;/dev/sdb".

By default, we leave the current behaviour.

However, if you pass the kernel command line argument then it will scan a lot faster as only those devices specified will be scanned.
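A rough sketch of how the semicolon-separated value could be turned into `-d` arguments follows. The function name is hypothetical, no such kernel parameter exists in genkernel as far as this bug shows, and paths containing spaces or glob characters are not handled.

```shell
# Hypothetical helper: turn "/dev/sda;/dev/sdb" into "-d /dev/sda -d /dev/sdb".
zpool_import_dir_args() {
    old_ifs=$IFS
    IFS=';'                             # split the argument on semicolons
    args=
    for dev in $1; do
        if [ -n "$dev" ]; then
            args="${args:+$args }-d $dev"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$args"
}
```

With an empty argument the function prints nothing, so something like `zpool import $(zpool_import_dir_args "$paths") "$pool"` would fall back to the current scanning behaviour.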
Comment 10 Walter 2018-01-22 23:55:02 UTC
Another option would be to bundle the ZFS cachefile on the initramfs, then run zpool import -c /path/to/cachefile

This will automatically speed up imports until the fundamental zpool device membership changes, at which point I guess it will regress or fail. Maybe a genkernel initramfs --bundle-zpool-cachefile option?
Comment 11 Walter 2018-01-23 00:05:38 UTC
For the dracut approach: note that while there is apparently no mention of ZFS in gentoo's dracut wiki page at https://wiki.gentoo.org/wiki/Dracut or the package USE flags, after installing there is code in /usr/lib/dracut/modules.d/90zfs and /usr/lib/dracut/modules.d/02zfsexpandknowledge

Possible documentation (maybe another implementation) at https://github.com/zfsonlinux/zfs/blob/master/contrib/dracut/README.dracut.markdown
Comment 12 Walter 2018-01-24 20:19:43 UTC
I think I found the source of the problem.

genkernel creates its initramfs using the file /usr/share/genkernel/gen_initramfs.sh

This file currently assumes your ZFS pool is called zpool, and has a cachefile property set to /etc/zfs/zpool.cache ... see the line: "for i in zdev.conf zpool.cache"

Unfortunately, this is not something that can be assumed. It is perfectly normal to run without a cache file, or with a cache file in another location, or with a ZFS pool with a different name.

Instead, genkernel should read the list of current pool cachefiles from the environment as follows:

zpool list -H | cut -f1 | xargs -n1 zpool get -H cachefile | cut -f3

This list of cachefiles is what should really be copied to the initramfs.
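As a sketch of the filtering step only: the helper below reads tab-separated `zpool get -H cachefile` output on stdin and prints the distinct cachefile paths. The function name is made up, and it assumes that an unset property is printed as "-" and disabled caching as "none".

```shell
# Hypothetical filter: keep only real cachefile paths from
# "name<TAB>property<TAB>value<TAB>source" lines read on stdin.
cachefiles_from_zpool_get() {
    awk -F '\t' '$3 != "" && $3 != "-" && $3 != "none" { print $3 }' | sort -u
}
```

It could be fed with `zpool list -H | cut -f1 | xargs -n1 zpool get -H cachefile | cachefiles_from_zpool_get`, and the resulting paths are what would be copied into the initramfs.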
Comment 13 Walter 2018-01-24 20:41:36 UTC
# VERIFIED SHORT TERM FIX
#  (WARNING: NOT SAFE FOR MULTIPLE POOLS - ONLY USE ON SINGLE POOL SYSTEMS)
# to fix ZFS slow boot
zpool get cachefile pool   # verify no cachefile set
zpool set cachefile=/etc/zfs/cachefile pool   # set cachefile
ls -al /etc/zfs/cachefile                     # verify creation
sed -i -e 's/zpool.cache/cachefile/' /usr/share/genkernel/gen_initramfs.sh
# then edit the file
#  /usr/share/genkernel/defaults/initrd.scripts 
# and add "-c /zfs/cachefile" to the 'Importing ZFS pool ' line, before the final argument.


# LONG TERM FIX
#  upgrade genkernel_initramfs to:
#   1. save the cachefiles to locations defined by pool name, eg. in /etc/zfs/pool-caches/<poolname>.cache
#   2. have the /usr/share/genkernel/defaults/initrd.scripts file load these
#      by default from the same location on the initramfs
#   3. for best possible outcome, test failure conditions on device change
#      after initramfs creation with former cachefile
Comment 14 Walter 2018-01-24 20:42:43 UTC
PS. zdev.conf doesn't even exist on my system and is definitely old cruft. This thing definitely needs a refresher. I can try submitting a patch if there is an up to date github repo.
Comment 15 Walter 2018-01-28 00:42:46 UTC
OK I assume I should try submitting a pull request at https://github.com/robbat2/genkernel ?
Comment 16 Walter 2018-01-28 03:12:37 UTC
Pull request submitted https://github.com/robbat2/genkernel/pull/15
Comment 17 Markus Osterhoff 2018-02-18 19:20:43 UTC
(In reply to Walter from comment #13)
> zpool set cachefile=/etc/zfs/cachefile pool   # set cachefile
> # then edit the file
> #  /usr/share/genkernel/defaults/initrd.scripts 
> # and add "-c /zfs/cachefile" to the 'Importing ZFS pool ' line, before the

Thanks, it works.

Note: it has to be /etc/zfs/cachefile, of course, in the initrd.scripts file.
Comment 18 Walter 2018-02-21 02:42:00 UTC
Thanks for confirming the fix.

Pull request has been sitting idle for 3+ weeks, it seems robbat2 is on holiday or in hospital or prison or something!
Comment 19 Richard Yao (RETIRED) gentoo-dev 2018-06-03 21:52:57 UTC
Unfortunately, I am very far behind on my bug mail. I am playing catch up and going through the backlog now. I expect to have this squashed within 72 hours.
Comment 20 Georgy Yakovlev archtester gentoo-dev 2018-06-06 03:00:23 UTC
An alternative, simpler solution:
https://github.com/robbat2/genkernel/pull/17

If someone wants to test, you can use this diff:
https://patch-diff.githubusercontent.com/raw/robbat2/genkernel/pull/17.diff

Rename it to genkernel-dozfs-cache.patch,
put it into /etc/portage/patches/sys-kernel/genkernel/,
then re-emerge genkernel and regenerate the initrd.

Make sure the boot parameters include dozfs=cache.

With my 4 pools the import takes a couple of seconds, instead of a minute or so.
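The copy-and-rename step can be sketched as a small helper. The function name and the optional root prefix (handy for trying it out in a scratch directory) are assumptions; only the patch directory and file name come from the steps above.

```shell
# Hypothetical helper: place an already downloaded diff where Portage
# picks up user patches for sys-kernel/genkernel.
# $1 = path to the diff, $2 = optional root prefix (empty for the real system).
install_user_patch() {
    diff_file=$1
    root=${2:-}
    patch_dir="$root/etc/portage/patches/sys-kernel/genkernel"
    mkdir -p "$patch_dir" &&
        cp "$diff_file" "$patch_dir/genkernel-dozfs-cache.patch"
}
```

After `install_user_patch 17.diff` (as root), re-emerging genkernel applies the patch, and the initrd still has to be regenerated by hand.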
Comment 21 Georgy Yakovlev archtester gentoo-dev 2018-06-06 03:09:11 UTC
Created attachment 535016 [details, diff]
patch to add dozfs=cache
Comment 22 Georgy Yakovlev archtester gentoo-dev 2018-06-06 06:08:44 UTC
Created attachment 535020 [details, diff]
patch to add dozfs=cache
Comment 23 Larry the Git Cow gentoo-dev 2018-06-12 21:26:19 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/proj/genkernel.git/commit/?id=5c5c32aa7261a29a3ac48035086bb59449d3804d

commit 5c5c32aa7261a29a3ac48035086bb59449d3804d
Author:     Georgy Yakovlev <ya@sysdump.net>
AuthorDate: 2018-06-06 02:33:07 +0000
Commit:     Georgy Yakovlev <ya@sysdump.net>
CommitDate: 2018-06-06 06:02:35 +0000

    Add option to force importing zpool using cache
    
    Add simple option to pass to kernel via loader.
    
    dozfs=cache will use /etc/zfs/zpool.cache
    avoiding 30+ second wait for udev in zpool import
    
    Also it's possible to use both cache and force
    at the same time:
    dozfs=force,cache (order is not important) will
    force import and use cache.
    
    Closes: https://bugs.gentoo.org/627320
    Signed-off-by: Georgy Yakovlev <ya@sysdump.net>

 defaults/initrd.scripts |  6 +++---
 defaults/linuxrc        | 19 +++++++++++++++----
 doc/genkernel.8.txt     |  6 +++---
 3 files changed, 21 insertions(+), 10 deletions(-)
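For readers who want to see what "order is not important" means in practice, here is a minimal sketch of parsing the option from a kernel command line. It is not the code from defaults/linuxrc; the function and variable names are made up, and a bare `dozfs` is assumed to enable ZFS without extra options.

```shell
# Hypothetical parser: report which dozfs options a kernel command line carries.
parse_dozfs() {
    use_zfs=0 zfs_force=0 zfs_cache=0
    for word in $1; do                  # split the command line on whitespace
        case $word in
            dozfs)
                use_zfs=1
                ;;
            dozfs=*)
                use_zfs=1
                opts=${word#dozfs=}
                # Wrap in commas so each option matches in any position.
                case ",$opts," in *,force,*) zfs_force=1 ;; esac
                case ",$opts," in *,cache,*) zfs_cache=1 ;; esac
                ;;
        esac
    done
    printf 'zfs=%s force=%s cache=%s\n' "$use_zfs" "$zfs_force" "$zfs_cache"
}
```

Both `dozfs=force,cache` and `dozfs=cache,force` yield the same result, e.g. `parse_dozfs "$(cat /proc/cmdline)"`.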
Comment 24 anonymous 2018-06-13 13:26:56 UTC
How am I supposed to use genkernel now? Do I need to change the genkernel command that I run to generate the initramfs?
Comment 25 Georgy Yakovlev archtester gentoo-dev 2018-07-08 20:48:42 UTC
(In reply to crocket from comment #24)
> How am I supposed to use genkernel now? Do I need to change the genkernel
> command that I run to generate the initramfs?

No, initramfs generation stays the same.
But you need to generate a new one whenever the disk configuration changes and zpool.cache is updated.

Procedure:
1) Install genkernel-9999; I'll ask robbat to package a new release soon.
2) Make sure /etc/zfs/zpool.cache is present and up to date. It should be fine by default, and it will end up in the initramfs.
3) Generate a zfs-aware initramfs as usual.
4) Update your bootloader to pass "dozfs=cache" to the kernel.


If you ever run into a situation where the zpool.cache inside the initramfs is stale and the system fails to boot, you can just remove the =cache portion from dozfs. The boot will take 30+ seconds, but it will import by scanning, without relying on the cache.

I'm changing this bug status so it stays visible until genkernel is tagged.
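Steps 2 and 4 of the procedure above can be checked with a small script. This is only a sketch: the function name is hypothetical, and /proc/cmdline shows the parameters of the currently running boot, so the second check is meaningful only after rebooting with the updated bootloader entry.

```shell
# Hypothetical pre-flight check for the dozfs=cache setup.
# $1 = cache file (default /etc/zfs/zpool.cache)
# $2 = file holding the kernel command line (default /proc/cmdline)
preflight_dozfs_cache() {
    cache=${1:-/etc/zfs/zpool.cache}
    cmdline=${2:-/proc/cmdline}
    rc=0
    if [ ! -s "$cache" ]; then
        echo "missing or empty cache file: $cache"
        rc=1
    fi
    # Accept dozfs=cache as well as lists such as dozfs=force,cache.
    if ! grep -Eq '(^|[[:space:]])dozfs=([a-z]+,)*cache(,|[[:space:]]|$)' "$cmdline"; then
        echo "dozfs=cache not found in $cmdline"
        rc=1
    fi
    return "$rc"
}
```

Run it without arguments on the live system; a non-zero exit status means one of the two prerequisites is missing.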
Comment 26 Thomas Deutschmann (RETIRED) gentoo-dev 2019-07-15 15:43:33 UTC
Closing this bug now, genkernel-v4.0.0_beta2 hit Gentoo repository and contains that fix.
Comment 27 Georgy Yakovlev archtester gentoo-dev 2019-11-26 20:18:53 UTC
Just FYI, the new genkernel contains a fix that uses ZPOOL_IMPORT_UDEV_TIMEOUT_MS:

https://gitweb.gentoo.org/proj/genkernel.git/commit/?id=2eb1d04cfbfa397b58a0b388f8ed28688fd114d8

zfs-0.8.2 and, soon, 0.7.13 will contain this patch:

https://github.com/zfsonlinux/zfs/commit/803884217f9b9b5fb235d7c5e78a809d271f6387

So the dozfs=cache workaround will no longer be required, and zfs on genkernel will just work as it did with 0.6.x, with no delay.
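For anyone wanting to try the variable by hand on a patched zfs, a wrapper along these lines should do. The wrapper name is made up, and the value that genkernel itself sets is not stated in this bug, so the 0 in the usage note is just an example.

```shell
# Hypothetical wrapper: run a command with the udev wait limited to timeout_ms.
# $1 = timeout in milliseconds, remaining arguments = command to run.
with_udev_timeout() {
    timeout_ms=$1
    shift
    env ZPOOL_IMPORT_UDEV_TIMEOUT_MS="$timeout_ms" "$@"
}
```

For example, `with_udev_timeout 0 zpool import "$pool"` asks zpool not to keep waiting for udev data that a genkernel initramfs will never provide.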
Comment 28 Larry the Git Cow gentoo-dev 2019-11-26 20:32:33 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=0d44f61a3dcb6e943edb307c7196cd0f7044af79

commit 0d44f61a3dcb6e943edb307c7196cd0f7044af79
Author:     Georgy Yakovlev <gyakovlev@gentoo.org>
AuthorDate: 2019-11-26 20:13:37 +0000
Commit:     Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2019-11-26 20:30:41 +0000

    sys-fs/zfs: unkeyworded revbump of 0.7.13, udev timeout patch
    
    Bug: https://bugs.gentoo.org/627320
    Package-Manager: Portage-2.3.79, Repoman-2.3.18
    Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>

 .../0.7.13-ZPOOL_IMPORT_UDEV_TIMEOUT_MS.patch      |  70 +++++++
 sys-fs/zfs/zfs-0.7.13-r2.ebuild                    | 220 +++++++++++++++++++++
 2 files changed, 290 insertions(+)