Bug 830342 - app-containers/lxd-4.0.8 removes zfs pool at service stop
Summary: app-containers/lxd-4.0.8 removes zfs pool at service stop
Status: RESOLVED UPSTREAM
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Joonas Niilola
URL: https://github.com/lxc/lxd/issues/9739
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-12-31 08:18 UTC by Stephen Bosch
Modified: 2022-01-09 08:31 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Description Stephen Bosch 2021-12-31 08:18:58 UTC
On an OpenRC system, when the lxd service is stopped:

'rc-service lxd stop'

it detaches previously imported pools, so that they are not available at next reboot:

 4800.pts-1.t470 root@t470 2021-12-31 01:11:01 /var/db/repos/gentoo # zpool status
no pools available
 4800.pts-1.t470 root@t470 2021-12-31 01:11:04 /var/db/repos/gentoo # zpool import -d /var/lib/zfs_img/ zfs_lxd
 4800.pts-1.t470 root@t470 2021-12-31 01:11:27 /var/db/repos/gentoo # zpool status
  pool: zfs_lxd
 state: ONLINE
  scan: resilvered 45K in 00:00:00 with 0 errors on Thu Aug 19 07:00:27 2021
config:

        NAME                           STATE     READ WRITE CKSUM
        zfs_lxd                        ONLINE       0     0     0
          mirror-0                     ONLINE       0     0     0
            /var/lib/zfs_img/zfs0.img  ONLINE       0     0     0
            /var/lib/zfs_img/zfs1.img  ONLINE       0     0     0
          mirror-1                     ONLINE       0     0     0
            /var/lib/zfs_img/zfs2.img  ONLINE       0     0     0
            /var/lib/zfs_img/zfs3.img  ONLINE       0     0     0

errors: No known data errors
 4800.pts-1.t470 root@t470 2021-12-31 01:11:28 /var/db/repos/gentoo # lxc list
Error: Get "http://unix.socket/1.0": dial unix /var/lib/lxd/unix.socket: connect: no such file or directory
 4800.pts-1.t470 root@t470 2021-12-31 01:11:39 /var/db/repos/gentoo # rc-service lxd start
 * Starting lxd service ...                                                                                   [ ok ]
 4800.pts-1.t470 root@t470 2021-12-31 01:11:53 /var/db/repos/gentoo # rc-service lxd status
 * status: started
 4800.pts-1.t470 root@t470 2021-12-31 01:11:57 /var/db/repos/gentoo # lxc list
+---------------+---------+----------------------+-----------------------------------------------+-----------+-----------+
|     NAME      |  STATE  |         IPV4         |                     IPV6                      |   TYPE    | SNAPSHOTS |
+---------------+---------+----------------------+-----------------------------------------------+-----------+-----------+
| gentoo-PG-C01 | RUNNING | 10.248.20.110 (eth0) | fd42:9c78:69e0:463f:216:3eff:fe80:df1d (eth0) | CONTAINER | 0         |
+---------------+---------+----------------------+-----------------------------------------------+-----------+-----------+
| gentoo-WS-C01 | RUNNING | 10.248.20.100 (eth0) | fd42:9c78:69e0:463f:216:3eff:fea4:13e5 (eth0) | CONTAINER | 1         |
+---------------+---------+----------------------+-----------------------------------------------+-----------+-----------+
| ubuntu-PG-C01 | RUNNING | 10.248.20.120 (eth0) | fd42:9c78:69e0:463f:216:3eff:fe8f:3d93 (eth0) | CONTAINER | 0         |
+---------------+---------+----------------------+-----------------------------------------------+-----------+-----------+
 4800.pts-1.t470 root@t470 2021-12-31 01:12:00 /var/db/repos/gentoo # zpool status
  pool: zfs_lxd
 state: ONLINE
  scan: resilvered 45K in 00:00:00 with 0 errors on Thu Aug 19 07:00:27 2021
config:

        NAME                           STATE     READ WRITE CKSUM
        zfs_lxd                        ONLINE       0     0     0
          mirror-0                     ONLINE       0     0     0
            /var/lib/zfs_img/zfs0.img  ONLINE       0     0     0
            /var/lib/zfs_img/zfs1.img  ONLINE       0     0     0
          mirror-1                     ONLINE       0     0     0
            /var/lib/zfs_img/zfs2.img  ONLINE       0     0     0
            /var/lib/zfs_img/zfs3.img  ONLINE       0     0     0

errors: No known data errors
 4800.pts-1.t470 root@t470 2021-12-31 01:12:22 /var/db/repos/gentoo # rc-service lxd stop
 * Stopping lxd service and containers, waiting 40s ...                                                       [ ok ]
 4800.pts-1.t470 root@t470 2021-12-31 01:12:43 /var/db/repos/gentoo # zpool status
no pools available
 4800.pts-1.t470 root@t470 2021-12-31 01:12:46 /var/db/repos/gentoo # zpool get cachefile
 4800.pts-1.t470 root@t470 2021-12-31 01:12:50 /var/db/repos/gentoo # 


Reproducible: Always

Actual Results:  
The zfs pools are detached at every shutdown so that they must be manually added upon reboot.

Expected Results:  
The zfs pools should persist across reboots.

# emerge --info
Portage 3.0.28 (python 3.9.9-final-0, default/linux/amd64/17.1/desktop/plasma, gcc-11.2.0, glibc-2.33-r7, 5.4.156-gentoo x86_64)
=================================================================
System uname: Linux-5.4.156-gentoo-x86_64-Intel-R-_Core-TM-_i7-7600U_CPU_@_2.80GHz-with-glibc2.33
KiB Mem:    32752272 total,  28508556 free
KiB Swap:   32813020 total,  32813020 free
Timestamp of repository gentoo: Wed, 29 Dec 2021 07:22:01 +0000
Head commit of repository gentoo: 877105acb404e9d4c085976fd8d545f7592cf696

sh bash 5.1_p8
ld GNU ld (Gentoo 2.37_p1 p0) 2.37
ccache version 4.4.2 [disabled]
app-misc/pax-utils:        1.3.3::gentoo
app-shells/bash:           5.1_p8::gentoo
dev-java/java-config:      2.3.1::gentoo
dev-lang/perl:             5.34.0-r3::gentoo
dev-lang/python:           2.7.18_p13::gentoo, 3.9.9::gentoo, 3.10.0_p1::gentoo
dev-lang/rust:             1.56.1::gentoo
dev-util/ccache:           4.4.2::gentoo
dev-util/cmake:            3.21.4::gentoo
dev-util/meson:            0.59.4::gentoo
sys-apps/baselayout:       2.7-r3::gentoo
sys-apps/openrc:           0.44.10::gentoo
sys-apps/sandbox:          2.25::gentoo
sys-devel/autoconf:        2.13-r1::gentoo, 2.71-r1::gentoo
sys-devel/automake:        1.13.4-r2::gentoo, 1.16.4::gentoo
sys-devel/binutils:        2.37_p1::gentoo
sys-devel/binutils-config: 5.4::gentoo
sys-devel/clang:           13.0.0::gentoo
sys-devel/gcc:             11.2.0::gentoo
sys-devel/gcc-config:      2.4::gentoo
sys-devel/libtool:         2.4.6-r6::gentoo
sys-devel/lld:             13.0.0::gentoo
sys-devel/llvm:            13.0.0::gentoo
sys-devel/make:            4.3::gentoo
sys-kernel/linux-headers:  5.15-r1::gentoo (virtual/os-headers)
sys-libs/glibc:            2.33-r7::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/gentoo.git
    priority: -1000

localrepo
    location: /var/db/repos/localrepo
    masters: gentoo

haskell
    location: /var/lib/layman/haskell
    masters: gentoo
    priority: 50

jorgicio
    location: /var/lib/layman/jorgicio
    masters: gentoo
    priority: 50

science
    location: /var/lib/layman/science
    masters: gentoo
    priority: 50

torbrowser
    location: /var/lib/layman/torbrowser
    masters: gentoo
    priority: 50

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="@FREE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=skylake -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/maven-bin-3.8/conf"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=skylake -O2 -pipe"
DISTDIR="/var/cache/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-march=skylake -O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-march=skylake -O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="de_DE.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="de en fr nb es"
MAKEOPTS="-j2"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/bash"
USE="X a52 aac acl acpi activities alsa amd64 bluetooth branding bzip2 cairo cdda cdr cli crypt cups dbus declarative dri dts dvd dvdr elogind emboss encode exif flac fortran gdbm gif gpm gtk gui iconv icu ipv6 jpeg kde kipi kwallet lcms libglvnd libnotify libtirpc mad mng mp3 mp4 mpeg multilib ncurses networkmanager nls nptl ogg opengl openmp pam pango pcre pdf plasma png policykit ppds pulseaudio qml qt5 readline sdl seccomp semantic-desktop spell split-usr ssl startup-notification svg tiff truetype udev udisks unicode upower usb vorbis widgets wxwidgets x264 xattr xcb xml xv xvid zlib" ABI_X86="64" ADA_TARGET="gnat_2020" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput" KERNEL="linux" L10N="de de-DE en-CA en-GB fr-CA fr nb es-ES es" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-3 php7-4" POSTGRES_TARGETS="postgres12 postgres13" PYTHON_SINGLE_TARGET="python3_9" PYTHON_TARGETS="python3_9" RUBY_TARGETS="ruby26 ruby27" USERLAND="GNU" 
VIDEO_CARDS="amdgpu fbdev intel nouveau radeon radeonsi vesa dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LEX, LFLAGS, LIBTOOL, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS
Comment 1 Stephen Bosch 2022-01-03 18:37:12 UTC
Upstream has informed me that this is expected behaviour.

If supplied with a complete ZFS pool, LXD will export it on shutdown for safety.

On startup, LXD only looks for pools in /var/lib/lxd/disks/ or in block devices (visible with lsblk).

The suggested solutions (quoted from https://discuss.linuxcontainers.org/t/lxd-exports-zfs-pools-at-shutdown-but-does-not-import-them-properly-at-startup/13031/2):

- Put an init job that runs prior to LXD and runs the correct zpool import for your setup, or

- Relocate your zpool backing files to /var/lib/lxd/disks. Note that this option isn’t exactly ideal as LXD generally expects that directory to be used for loop-backed pools that it itself manages, so it may get confused by files that don’t line up.

I'll put a note to this effect in the Gentoo LXD wiki page.
Comment 2 Stephen Bosch 2022-01-03 21:48:04 UTC
I thought about this and it's crazy to me that upstream hard-codes this. Shouldn't it allow variable search paths?

Feel free to close again if you feel the bug is invalid. But I think we can do better here.
Comment 3 Georgy Yakovlev archtester gentoo-dev 2022-01-03 22:37:35 UTC
Docker does indeed allow relocating its storage, which I always use for ZFS.



At least on systemd you can add an explicit generator for this use case.

This is what I do on systemd machines:

zfs set org.openzfs.systemd:required-by=docker.service pool/dataset/docker


That way it is mounted before docker starts, and unmounted after it stops too.



On an OpenRC system you could add a start_pre() function to /etc/conf.d/lxd that takes care of importing your pool.

An example:

start_pre() {
   if ! grep -q zfs_lxd <(zpool list); then
      zpool import zfs_lxd
   fi
}
Comment 4 Stephen Bosch 2022-01-05 23:46:09 UTC
Hi Georgy, thanks for this.

(In reply to Georgy Yakovlev from comment #3)
> on openrc system you could add a start_pre() function to /etc/conf.d/lxd
> that takes care of importing your pool.
> 
> an example:
> 
> start_pre() {
>    if ! grep -q zfs_lxd <(zpool list); then
>       zpool import zfs_lxd
>    fi
> }

This solution is a good start but is not sufficient because LXD exports the pool on shutdown, meaning it will not appear in the output of zpool list:

> # zpool list
> no pools available
> # zpool import zfs_lxd
> cannot import 'zfs_lxd': no such pool available

I've filed issue 9739 with the upstream maintainers, but I'd like to see if we can do something sooner for Gentoo because we will be waiting some time for this fix.

Another problem I'm encountering: If the only active network interface is wireless and the system is suspended, NetworkManager becomes inactive and lxd will stop, exporting the pool as it does so. The result is that the pools are gone when the machine resumes from suspend and I have to import manually again.
Comment 5 Georgy Yakovlev archtester gentoo-dev 2022-01-06 01:35:01 UTC
(In reply to Stephen Bosch from comment #4)
> Hi Georgy, thanks for this.
> 
> (In reply to Georgy Yakovlev from comment #3)
> > on openrc system you could add a start_pre() function to /etc/conf.d/lxd
> > that takes care of importing your pool.
> > 
> > an example:
> > 
> > start_pre() {
> >    if ! grep -q zfs_lxd <(zpool list); then
> >       zpool import zfs_lxd
> >    fi
> > }
> 
> This solution is a good start but is not sufficient because LXD exports the
> pool on shutdown, meaning it will not appear in the output of zpool list:
> 
> > # zpool list
> > no pools available
> > # zpool import zfs_lxd
> > cannot import 'zfs_lxd': no such pool available

I believe that's an issue specific to your configuration.
Normally 'zpool import poolname' is enough to import a pool if it's not imported.

The guard here is to prevent an import attempt if the pool is already imported, so it does exactly what it was intended to do.

You obviously import the pool differently, so extend the command to match your requirements,
like

zpool import -d /var/lib/zfs_img zfs_lxd


btw, why are you using such an unusual ZFS setup?
It would work much better on partitions than on disk images stored on a non-ZFS filesystem.

> 
> I've filed issue 9739 with the upstream maintainers, but I'd like to see if
> we can do something sooner for Gentoo because we will be waiting some time
> for this fix.
> 
> Another problem I'm encountering: If the only active network interface is
> wireless and the system is suspended, NetworkManager becomes inactive and
> lxd will stop, exporting the pool as it does so. The result is that the
> pools are gone when the machine resumes from suspend and I have to import
> manually again.
Comment 6 Georgy Yakovlev archtester gentoo-dev 2022-01-06 01:39:49 UTC
And just to prevent possible confusion:
exported pools never appear in zpool list output.


if ! grep -q zfs_lxd <(zpool list); then

^ This line checks the zpool list output and makes sure zfs_lxd is NOT there, i.e. not imported.
If that is the case, it runs the import command.


Since you use disk images, you need to specify the directory to search for block devices (images in your case).


So zpool import -d /var/lib/zfs_img ...

does exactly that.

It tells zpool to scan /var/lib/zfs_img, and not /dev, for pools.
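
Putting the guard and the -d option together, the hook could look roughly like the sketch below. It is only an illustration under the assumptions of this report: the pool name and image directory come from the output above, the variable names LXD_ZPOOL and LXD_ZPOOL_DIR are made up, and it assumes the lxd init script does not define its own start_pre(). It asks zpool about the pool by name instead of grepping the full listing, which avoids bash-only process substitution and accidental substring matches.

```sh
# Hypothetical fragment for /etc/conf.d/lxd.
# Pool name and image directory are the ones from this report;
# the variable names are invented for illustration.
LXD_ZPOOL="zfs_lxd"
LXD_ZPOOL_DIR="/var/lib/zfs_img"

start_pre() {
    # "zpool list <pool>" exits non-zero when the pool is not imported.
    if ! zpool list -H -o name "${LXD_ZPOOL}" >/dev/null 2>&1; then
        # -d makes zpool search this directory for the file-backed vdevs.
        zpool import -d "${LXD_ZPOOL_DIR}" "${LXD_ZPOOL}"
    fi
}
```

Because the check runs on every service start, it should also cover the case where lxd is stopped and started again without a reboot.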
Comment 7 Stephen Bosch 2022-01-06 21:39:35 UTC
I have now tested your function in /etc/conf.d/lxd, Georgy:

start_pre() {
   if ! grep -q zfs_lxd <(zpool list); then
      zpool import zfs_lxd
   fi
}

and can confirm it works beautifully. It also works even if the lxd service has to wait for NetworkManager to come up (as might be the case on wireless connections). Many thanks!
Comment 8 Stephen Bosch 2022-01-06 21:40:17 UTC
I will close this bug now, as we are just waiting for upstream to add this feature.
Comment 9 Joonas Niilola gentoo-dev 2022-01-07 13:44:47 UTC
Sorry I'm a bit late to the party, but will upstream's fix achieve the same result as the openrc conf.d edit? In other words, is the conf.d modification still needed _after_ upstream fixes this?

And huge thanks to Georgy for handling it. While I do use ZFS nowadays, it's not in use where my LXD containers are.
Comment 10 Georgy Yakovlev archtester gentoo-dev 2022-01-07 15:48:44 UTC
This is a system-specific configuration trick and is unlikely to be needed once upstream adds a knob to avoid exporting the pool on LXD shutdown. The pool will then just remain active if LXD goes down.


It MAY still be needed on this particular setup, where ZFS is used on top of raw files instead of disks, but that's very non-standard and there is nothing the init.d script can do about it in a generic way. The conf.d hook is the perfect solution.
Comment 11 Georgy Yakovlev archtester gentoo-dev 2022-01-07 15:56:59 UTC
There's one thing I can think of that could somewhat improve the situation:


adding an ordering dependency on the zfs services in the appropriate files (this affects systemd too).


similar bug for docker:

https://bugs.gentoo.org/680094

In the comments there I suggest that the user add
rc_need="zfs-mount", but that's a hard dependency, specific to that system.

The init.d script could specify rc_after for zfs-mount or zfs-import.

This will still work on zfs-less systems, but will order services properly if zfs is present.


The same goes for the systemd unit:

After=zfs.target

should prevent races and ordering issues without hard dependencies.

I'm not familiar with lxd at all, and I don't know if this should go to lxd or lxcfs scripts.
But it could make it just a little bit more robust.
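
Until something like this lands in the packaged script, the soft ordering could be tried locally from conf.d. A minimal sketch, assuming the OpenRC service names zfs-import and zfs-mount mentioned above and that /etc/conf.d/lxd is the right place for a local override:

```sh
# Hypothetical local override in /etc/conf.d/lxd.
# Soft ordering only: lxd starts after the ZFS services when they are
# scheduled to start, but does not pull them in the way rc_need would.
rc_after="zfs-import zfs-mount"
```

This only affects start order; it does not import a file-backed pool from a non-default directory by itself.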
Comment 12 Georgy Yakovlev archtester gentoo-dev 2022-01-07 15:59:38 UTC
It may already be covered by the dependency on localmount / local-fs.target,

because zfs is ordered before that.
But again, it's not critical, just a minor improvement I could think of.
Comment 13 Joonas Niilola gentoo-dev 2022-01-09 08:31:00 UTC
There's https://bugs.gentoo.org/817287 for the init.d/service files update pending - I should try to get these both done at once.