Bug 647688 - sys-fs/zfs - /etc/init.d/zfs-mount fails to unmount during shutdown
Summary: sys-fs/zfs - /etc/init.d/zfs-mount fails to unmount during shutdown
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Richard Yao (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-15 05:49 UTC by anonymous
Modified: 2023-12-14 07:32 UTC (History)
2 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info (emerge.info,5.69 KB, text/plain)
2018-02-16 10:56 UTC, anonymous

Description anonymous 2018-02-15 05:49:40 UTC
During every shutdown, I see the following messages.

zfs-mount          |umount: /root: target is busy.
zfs-mount          |cannot unmount '/root': umount failed
zfs-mount          |umount: /mnt/data: target is busy.
zfs-mount          |cannot unmount '/mnt/data': umount failed
zfs-mount          |umount: /home: target is busy.
zfs-mount          |cannot unmount '/home': umount failed
zfs-mount          |umount: /: target is busy.
zfs-mount          |cannot unmount '/': umount failed
 [ !! ]
dhcpcd             | * Stopping DHCP Client Daemon ...
 [ ok ]
 [ ok ]
 [ ok ]
 [ ok ]
 [ ok ]
localmount         | * Unmounting loop devices
localmount         | * Unmounting filesystems
localmount         | *   Unmounting /tmp ...
 [ ok ]
localmount         | *   Unmounting /boot/efi ...
 [ ok ]
localmount         | *   Unmounting /var/tmp/portage ...
 [ ok ]
localmount         | *   Unmounting /root ...
 [ ok ]
localmount         | *   Unmounting /mnt/data ...
 [ ok ]
localmount         | *   Unmounting /home ...
 [ ok ]

/etc/init.d/zfs-mount belongs to sys-fs/zfs package.
Here's the content of /etc/fstab

PARTUUID="some_uuid" none swap sw 0 0
PARTUUID="some_uuid" none swap sw 0 0
tmpfs /var/tmp/portage tmpfs size=14G,uid=portage,gid=portage,mode=775,noatime 0 0
PARTUUID="some_uuid" /boot/efi vfat defaults,noatime 0 0
tmpfs /tmp tmpfs size=8G,mode=1777,noatime 0 0

`/`, `/home`, `/mnt/data`, and `/root` are zfs mounts.
Comment 1 Jeroen Roovers (RETIRED) gentoo-dev 2018-02-16 07:01:47 UTC
1) Please post your `emerge --info` output in a comment.
2) Please post your `emerge -vpq /etc/init.d/zfs-mount` output in a comment.
Comment 2 anonymous 2018-02-16 10:56:07 UTC
Created attachment 519700 [details]
emerge --info
Comment 3 anonymous 2018-02-16 10:56:38 UTC
$ emerge -vpq /etc/init.d/zfs-mount
[ebuild   R   ] sys-fs/zfs-0.7.6  USE="rootfs -custom-cflags -debug (-kernel-builtin) -static-libs -test-suite" PYTHON_TARGETS="python2_7 python3_5 -python3_4"
Comment 4 Georgy Yakovlev archtester gentoo-dev 2018-03-12 16:41:10 UTC
just add 

ZFS_UNMOUNT='no'

to /etc/conf.d/zfs

Filesystems will still be unmounted correctly, just not all at once with `zfs unmount -a`.


I haven't really looked at internals, but to me it looks like it should be the default with USE=rootfs.
Comment 5 anonymous 2018-03-19 12:02:34 UTC
After setting ZFS_UNMOUNT="no", Gentoo seems to occasionally fail to import the zfs pool containing rootfs.
Comment 6 anonymous 2018-04-14 22:41:01 UTC
It turns out that dracut fails to import a zfs pool from time to time regardless of the value of ZFS_UNMOUNT. I think it is safe to set ZFS_UNMOUNT to "no".
Comment 7 Georgy Yakovlev archtester gentoo-dev 2018-04-14 23:29:06 UTC
I recently also fought with dracut and inconsistent initrd.

What ultimately solved the intermittent issues was properly generating /etc/hostid and making sure dracut includes it in the initramfs (forcing it to include it).
It can be created using /bin/zgenhostid(8) 

and forcibly included using dracut.conf.d stanza

install_items+=" /etc/hostid "
^^ don't omit the spaces before and after the filename.

I have no idea why sometimes it fails to include it by default.

What you need to achieve is that both dracut and the rest of the system import pools using the same hostid.
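A minimal sketch of writing the stanza above as a dracut drop-in. The file name `zfs-hostid.conf` and the helper `write_hostid_stanza` are made up for illustration, and the demo writes into a temporary directory rather than /etc/dracut.conf.d.

```shell
# Hypothetical helper: write the install_items stanza into the given
# dracut.conf.d-style directory. The file name is an assumption.
write_hostid_stanza() {
    conf_dir=$1
    # Keep the spaces around the filename, as noted above.
    printf 'install_items+=" /etc/hostid "\n' > "$conf_dir/zfs-hostid.conf"
}

# Demo against a scratch directory so nothing on the system is touched.
demo_dir=$(mktemp -d)
write_hostid_stanza "$demo_dir"
cat "$demo_dir/zfs-hostid.conf"

# On a real system the rough sequence would be (run as root, not executed here):
#   zgenhostid                          # create /etc/hostid if it is missing
#   write_hostid_stanza /etc/dracut.conf.d
#   dracut --force                      # regenerate the initramfs
```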
Comment 8 anonymous 2018-04-18 00:24:01 UTC
I executed zgenhostid and then dracut.
I haven't yet needed to force dracut to include /etc/hostid in initramfs.
Comment 9 Jonathan Vasquez (RETIRED) gentoo-dev 2018-04-19 01:23:15 UTC
ZFS on Linux (At least on Gentoo) doesn't require hostid for anything. bliss-initramfs doesn't use hostid at all, and I don't think genkernel does either. You can ignore hostid.
Comment 10 Jonathan Vasquez (RETIRED) gentoo-dev 2018-04-19 01:25:40 UTC
Also I forgot to mention that this is a known problem that you can disregard, since ZFS uses atomic transactions and copy-on-write (COW). For this problem to be properly fixed, you would need to write some custom code where, after init is completely done, it would go to some temporary area and finish unmounting the rootfs (basically the reverse of initramfs). I believe systemd could do this. ryao and I experimented with this a few years ago but decided not to pursue it. I also won't be spending time on this, so if someone wants to spend time getting a proper openrc/systemd shutdown implementation to completely unmount the zfs devices, feel free to do that.
Comment 11 anonymous 2018-04-20 08:43:36 UTC
But, booting for the first time after making /etc/hostid actually failed because hostid changed.

It seems hostid affects importing zpool during boot in some way.

I guess zpool import failed intermittently because hostid changed randomly without the presence of /etc/hostid.
Comment 12 anonymous 2018-04-20 08:45:23 UTC
I'm currently using dracut, and dracut ZFS module puts /etc/hostid into initramfs.
Comment 13 Georgy Yakovlev archtester gentoo-dev 2018-04-20 09:26:26 UTC
(In reply to crocket from comment #11)
> But, booting for the first time after making /etc/hostid actually failed
> because hostid changed.
> 
> It seems hostid affects importing zpool during boot in some way.
> 
> I guess zpool import failed intermittently because hostid changed randomly
> without the presence of /etc/hostid.

Exactly. If /etc/hostid is not present, the hostid gets generated from the IP address.
Usually it is all zeros or  007f0001 -> 7f.00.00.01  = 127.0.0.1

but stupid dracut can generate a random one. have no idea why.

to boot after changing hostid you may need to pass "spl_hostid=0xNEWHOSTID zfs_force=1 zfsforce=1 zfs.force=1" to the kernel.

After you boot that way a couple of times and all the hostids match, it'll be fine.


there are at least 4 places to get hostid
kernel cmdline -> dracut hostid -> system hostid after it booted -> zpool recorded hostid.
cmdline can be omitted and if present it overrides everything, but the rest HAVE TO MATCH at all steps.

it's a bit stupid, yes, but as soon as you get it straight it should boot every time without any problem

As an alternative, just boot with this trio "zfs_force=1 zfsforce=1 zfs.force=1" and never have a problem with a mismatching hostid.
Comment 14 Jonathan Vasquez (RETIRED) gentoo-dev 2018-04-21 18:02:59 UTC
From what I understand, genkernel/bliss-initramfs and I think ZoL by default uses all zeroes for hostid because we don't really care about it. We also don't use /etc/hostid IIRC, so our initramfs shouldn't have this file. Your boot could also fail because of zpool.cache being inconsistent between initramfs environment and live system. bliss-initramfs doesn't have this problem because it:

1. doesn't contain a zpool.cache inside the initramfs
2. bliss-initramfs will dynamically fetch the zpool.cache from the live system from the initramfs environment and then boot with the actual cache.

Example:

1. You tell bliss-initramfs what pool contains your rootfs
2. bliss-initramfs will then mount your rootfs pool as Read Only and copy the zpool.cache it has temporarily into the temp initramfs environment.
3. It will then umount your root pool
4. It will then remount the pool using your zpool.cache
5. Then it switches to the live system.
Comment 15 Georgy Yakovlev archtester gentoo-dev 2018-04-21 23:05:58 UTC
(In reply to Jonathan Vasquez from comment #14)
> From what I understand, genkernel/bliss-initramfs and I think ZoL by default
> uses all zeroes for hostid because we don't really care about it. We also
> don't use /etc/hostid IIRC, so our initramfs shouldn't have this file. Your
> boot could also fail because of zpool.cache being inconsistent between
> initramfs environment and live system. bliss-initramfs doesn't have this
> problem because it:
> 
> 1. doesn't contain a zpool.cache inside the initramfs
> 2. bliss-initramfs will dynamically fetch the zpool.cache from the live
> system from the initramfs environment and then boot with the actual cache.
> 
> Example:
> 
> 1. You tell bliss-initramfs what pool contains your rootfs
> 2. bliss-initramfs will then mount your rootfs pool as Read Only and copy
> the zpool.cache it has temporarily into the temp initramfs environment.
> 3. It will then umount your root pool
> 4. It will then remount the pool using your zpool.cache
> 5. Then it switches to the live system.

I think you are right, all zeroes by default, BUT dracut may mess it up.
not sure what triggers it to use hostid but it came out of nowhere and I never took a closer look at it.
It just randomly failed at boot until I synced hostid at every step.

I used genkernel before until #627320, but looks like it's being fixed soon in https://bugs.gentoo.org/627320

here is a quick grep analysis:

90zfs/module-setup.sh
37:	dracut_install hostid
80:	# Synchronize initramfs and system hostid
81:	AA=`hostid | cut -b 1,2`
82:	BB=`hostid | cut -b 3,4`
83:	CC=`hostid | cut -b 5,6`
84:	DD=`hostid | cut -b 7,8`
85:	printf "\x${DD}\x${CC}\x${BB}\x${AA}" > "${initdir}/etc/hostid"

so it looks like if an ip address changes, dracut will put a changed hostid into initramfs

situation:
1. you have ip address of 192.168.0.1
2. do not have /etc/hostid file on the host
dracut will calculate hostid dynamically and put one into initramfs

and if you ever generate the initrd with a different ip address, the hostid inside the initrd will change!
and zpool will fail to import on the next boot.

zfs only uses hostid if the file is present. so dracut module kinda forces zfs to use hostid.

how hostid command works:
if /etc/hostid is present, hostid command will always return fixed value, without taking ip address into account.
if /etc/hostid is absent, hostid command will dynamically calculate the value depending on current ip address.


so it looks like it's a bug in dracut zfs module...
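To make the byte shuffling in the grep output above easier to follow, here is a small sketch that applies the same `cut` splitting to a hostid string. The helper name `hostid_le_bytes` is hypothetical, and unlike the dracut module it prints the hex pairs instead of writing raw bytes; the sample value is the one mentioned in comment 13.

```shell
# Hypothetical helper: split an 8-digit hex hostid into four byte pairs and
# print them in the order the module writes them (last pair first).
hostid_le_bytes() {
    id=$1
    aa=$(printf '%s' "$id" | cut -b 1,2)
    bb=$(printf '%s' "$id" | cut -b 3,4)
    cc=$(printf '%s' "$id" | cut -b 5,6)
    dd=$(printf '%s' "$id" | cut -b 7,8)
    printf '%s %s %s %s\n' "$dd" "$cc" "$bb" "$aa"
}

hostid_le_bytes 007f0001   # prints: 01 00 7f 00
```

Because the input comes from whatever `hostid` prints when the initramfs is built, a different result at build time means different bytes end up in the initramfs copy of /etc/hostid.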
Comment 16 Larry the Git Cow gentoo-dev 2019-08-18 01:19:27 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=17b9ca6a5d3e11ddc9ba372f9d50104d7a6a5a71

commit 17b9ca6a5d3e11ddc9ba372f9d50104d7a6a5a71
Author:     Georgy Yakovlev <gyakovlev@gentoo.org>
AuthorDate: 2019-08-18 00:19:26 +0000
Commit:     Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2019-08-18 01:04:10 +0000

    sys-fs/zfs: update live ebuild
    
    remove *.la files if no static-libs requested
    
    clarify genkernel encryption support with genkernel-4
    
    remove obsolete zfs initscript checks, those were here since cvs times.
    Those initscripts long gone and systems already migrated to new scripts.
    
    remove obsolete systemd-reenable calls, those were needed with earlier
    versions incorrectly installing systemd units to wrong location.
    it has been more than a year since those versions are gone.
    
    Bug: https://bugs.gentoo.org/647688
    Package-Manager: Portage-2.3.71, Repoman-2.3.17
    Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>

 sys-fs/zfs/zfs-9999.ebuild | 73 ++++++++++++----------------------------------
 1 file changed, 18 insertions(+), 55 deletions(-)
Comment 17 anonymous 2019-08-18 03:32:55 UTC
I think ZFS_UNMOUNT='no' should be the default in /etc/conf.d/zfs since it does the same job as /etc/init.d/localmount during shutdown.
Comment 18 Georgy Yakovlev archtester gentoo-dev 2019-08-18 05:07:12 UTC
it will be now with rootfs use flag set (the default), just forgot to mention it in the commit message, but the diff has it.
Comment 19 anonymous 2019-08-18 07:25:33 UTC
Whether or not rootfs is enabled, localmount is going to unmount any ZFS mount point during shutdown.
Comment 20 Georgy Yakovlev archtester gentoo-dev 2019-08-18 07:32:05 UTC
that's true. I'll likely move it out of use-conditional. thanks for your input.
Comment 21 anonymous 2020-05-26 01:37:41 UTC
/etc/init.d/zfs-mount now disregards ZFS_UNMOUNT variable in /etc/conf.d/zfs
Now, ZFS_UNMOUNT is invariably set to yes in /etc/zfs/zfs-functions.

I think it's better to just prevent zfs-mount from unmounting anything during shutdown.
Comment 22 Georgy Yakovlev archtester gentoo-dev 2020-05-26 05:38:01 UTC
By now, you mean something changed in git master? I’ll go check.
I haven’t noticed changes in 0.8.4 but I don’t reboot that often tbh.
Comment 23 anonymous 2020-05-26 06:16:14 UTC
ZFS_UNMOUNT is set in /etc/zfs/zfs-functions from 0.8.4
Comment 24 Georgy Yakovlev archtester gentoo-dev 2020-05-26 22:19:33 UTC
aha, we have a bug here

# Source zfs configuration, overriding the defaults
if [ -f @initconfdir@/zfs ]; then
	. @initconfdir@/zfs
fi



gets expanded to

# Source zfs configuration, overriding the defaults
if [ -f /zfs ]; then
        . /zfs
fi


thus options from /etc/conf.d/zfs are not applied.

initconfdir is lost.

I'll look at it.
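A rough sketch of why the lost directory matters, assuming the default-then-override pattern shown above. The function `load_zfs_unmount` and the temporary config directory are made up for illustration and are not the real zfs-functions code.

```shell
# Scratch directory standing in for /etc/conf.d, with a user override.
conf_dir=$(mktemp -d)
printf "ZFS_UNMOUNT='no'\n" > "$conf_dir/zfs"

# Hypothetical helper: set the built-in default, then source the override
# file from the given directory if it exists, and print the result.
load_zfs_unmount() {
    ZFS_UNMOUNT='yes'
    if [ -f "$1/zfs" ]; then
        . "$1/zfs"
    fi
    printf '%s\n' "$ZFS_UNMOUNT"
}

load_zfs_unmount "$conf_dir"   # override is found, prints: no
load_zfs_unmount ""            # looks for /zfs instead; the default stays
                               # in effect unless a /zfs file happens to exist
```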
Comment 25 Georgy Yakovlev archtester gentoo-dev 2020-05-26 23:10:33 UTC
In the Makefile I see the correct

338:DEFAULT_INITCONF_DIR = /etc/conf.d

and

572:initconfdir = $(DEFAULT_INITCONF_DIR)

but

sed is replacing it with an empty string

>          $(SED) \
>                 -e 's,@bindir\@,$(bindir),g' \
>                 -e 's,@sbindir\@,$(sbindir),g' \
>                 -e 's,@udevdir\@,$(udevdir),g' \
>                 -e 's,@udevruledir\@,$(udevruledir),g' \
>                 -e 's,@sysconfdir\@,$(sysconfdir),g' \
>                 -e 's,@initconfdir\@,$(initconfdir),g' \
>                 -e 's,@initdir\@,$(initdir),g' \
>                 -e 's,@runstatedir\@,$(runstatedir),g' \
>                 -e "s,@SHELL\@,$$SHELL,g" \
>                 -e "s,@NFS_SRV\@,$$NFS_SRV,g" \
>                 $< >'$@'; \
>                chmod +x '$@')


make[3]: Entering directory '/var/tmp/portage/sys-fs/zfs-0.8.4/work/zfs-0.8.4/etc/zfs'
(if [ -e /etc/debian_version ]; then \
        NFS_SRV=nfs-kernel-server; \
  else \
        NFS_SRV=nfs; \
  fi; \
  if [ -e /sbin/openrc-run ]; then \
        SHELL=/sbin/openrc-run; \
  else \
        SHELL=/bin/sh; \
  fi; \
  /bin/sed \
         -e 's,@sbindir\@,/sbin,g' \
         -e 's,@sysconfdir\@,/etc,g' \
         -e 's,@initconfdir\@,,g' \
         zfs-functions.in >'zfs-functions'; \
  [ 'zfs-functions' = 'zfs-functions' ] || \
        chmod +x 'zfs-functions')
Comment 26 Georgy Yakovlev archtester gentoo-dev 2020-05-26 23:18:07 UTC
ok straight after ./configure step, but before make, initconfdir is not defined in etc/zfs/Makefile

but it's defined in 

etc/init.d/Makefile

this would explain the behavior.
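The effect can be reproduced in isolation with a sketch like the following. The `expand` helper and the one-line template are illustrative only; the substitution mirrors the sed expression from the build log, with the directory passed in as an argument.

```shell
# One line of the template, as quoted in comment 24.
template='if [ -f @initconfdir@/zfs ]; then'

# Hypothetical helper: substitute @initconfdir@ with the given value.
expand() {
    printf '%s\n' "$template" | sed -e "s,@initconfdir@,$1,g"
}

expand ""              # prints: if [ -f /zfs ]; then
expand "/etc/conf.d"   # prints: if [ -f /etc/conf.d/zfs ]; then
```

With the variable undefined in etc/zfs/Makefile, make expands it to nothing, which corresponds to the first call.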
Comment 27 Georgy Yakovlev archtester gentoo-dev 2020-05-26 23:32:35 UTC
https://github.com/openzfs/zfs/issues/10375
Comment 28 Larry the Git Cow gentoo-dev 2020-05-26 23:58:03 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=20207419457c676f156e212bf03cf27982eea494

commit 20207419457c676f156e212bf03cf27982eea494
Author:     Georgy Yakovlev <gyakovlev@gentoo.org>
AuthorDate: 2020-05-26 23:56:58 +0000
Commit:     Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2020-05-26 23:57:11 +0000

    sys-fs/zfs: revbump 0.8.4, fix not loading /etc/conf.d/zfs
    
    Upstream issue: https://github.com/openzfs/zfs/issues/10375
    Upstream issue: https://github.com/openzfs/zfs/issues/10341
    
    Bug: https://bugs.gentoo.org/647688
    Package-Manager: Portage-2.3.100, Repoman-2.3.22
    Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>

 sys-fs/zfs/files/0.8.4-initconfdir.patch           | 35 ++++++++++++++++++++++
 .../zfs/{zfs-0.8.4.ebuild => zfs-0.8.4-r1.ebuild}  |  5 +++-
 2 files changed, 39 insertions(+), 1 deletion(-)
Comment 29 Richard Yao (RETIRED) gentoo-dev 2020-05-27 00:07:19 UTC
(In reply to Georgy Yakovlev from comment #24)
> aha, we have a bug here
> 
> # Source zfs configuration, overriding the defaults
> if [ -f @initconfdir@/zfs ]; then
> 	. @initconfdir@/zfs
> fi
> 
> 
> 
> gets expanded to
> 
> # Source zfs configuration, overriding the defaults
> if [ -f /zfs ]; then
>         . /zfs
> fi
> 
> 
> thus options from /etc/conf.d/zfs are not applied.
> 
> initconfdir is lost.
> 
> I'll look at it.

Nice find. I just did a positive review at upstream. :)
Comment 30 anonymous 2020-06-01 00:56:26 UTC
sys-fs/zfs-0.8.4-r1 fixes the issue.
Comment 31 Georgy Yakovlev archtester gentoo-dev 2020-06-10 21:55:55 UTC
no affected versions left in the tree, closing.