On an OpenRC system, when the lxd service is stopped ('rc-service lxd stop'), it detaches previously imported pools, so that they are not available at the next reboot:

4800.pts-1.t470 root@t470 2021-12-31 01:11:01 /var/db/repos/gentoo # zpool status
no pools available
4800.pts-1.t470 root@t470 2021-12-31 01:11:04 /var/db/repos/gentoo # zpool import -d /var/lib/zfs_img/ zfs_lxd
4800.pts-1.t470 root@t470 2021-12-31 01:11:27 /var/db/repos/gentoo # zpool status
  pool: zfs_lxd
 state: ONLINE
  scan: resilvered 45K in 00:00:00 with 0 errors on Thu Aug 19 07:00:27 2021
config:

	NAME                           STATE     READ WRITE CKSUM
	zfs_lxd                        ONLINE       0     0     0
	  mirror-0                     ONLINE       0     0     0
	    /var/lib/zfs_img/zfs0.img  ONLINE       0     0     0
	    /var/lib/zfs_img/zfs1.img  ONLINE       0     0     0
	  mirror-1                     ONLINE       0     0     0
	    /var/lib/zfs_img/zfs2.img  ONLINE       0     0     0
	    /var/lib/zfs_img/zfs3.img  ONLINE       0     0     0

errors: No known data errors
4800.pts-1.t470 root@t470 2021-12-31 01:11:28 /var/db/repos/gentoo # lxc list
Error: Get "http://unix.socket/1.0": dial unix /var/lib/lxd/unix.socket: connect: no such file or directory
4800.pts-1.t470 root@t470 2021-12-31 01:11:39 /var/db/repos/gentoo # rc-service lxd start
 * Starting lxd service ...
 [ ok ]
4800.pts-1.t470 root@t470 2021-12-31 01:11:53 /var/db/repos/gentoo # rc-service lxd status
 * status: started
4800.pts-1.t470 root@t470 2021-12-31 01:11:57 /var/db/repos/gentoo # lxc list
+---------------+---------+----------------------+-----------------------------------------------+-----------+-----------+
|     NAME      |  STATE  |         IPV4         |                     IPV6                      |   TYPE    | SNAPSHOTS |
+---------------+---------+----------------------+-----------------------------------------------+-----------+-----------+
| gentoo-PG-C01 | RUNNING | 10.248.20.110 (eth0) | fd42:9c78:69e0:463f:216:3eff:fe80:df1d (eth0) | CONTAINER | 0         |
+---------------+---------+----------------------+-----------------------------------------------+-----------+-----------+
| gentoo-WS-C01 | RUNNING | 10.248.20.100 (eth0) | fd42:9c78:69e0:463f:216:3eff:fea4:13e5 (eth0) | CONTAINER | 1         |
+---------------+---------+----------------------+-----------------------------------------------+-----------+-----------+
| ubuntu-PG-C01 | RUNNING | 10.248.20.120 (eth0) | fd42:9c78:69e0:463f:216:3eff:fe8f:3d93 (eth0) | CONTAINER | 0         |
+---------------+---------+----------------------+-----------------------------------------------+-----------+-----------+
4800.pts-1.t470 root@t470 2021-12-31 01:12:00 /var/db/repos/gentoo # zpool status
  pool: zfs_lxd
 state: ONLINE
  scan: resilvered 45K in 00:00:00 with 0 errors on Thu Aug 19 07:00:27 2021
config:

	NAME                           STATE     READ WRITE CKSUM
	zfs_lxd                        ONLINE       0     0     0
	  mirror-0                     ONLINE       0     0     0
	    /var/lib/zfs_img/zfs0.img  ONLINE       0     0     0
	    /var/lib/zfs_img/zfs1.img  ONLINE       0     0     0
	  mirror-1                     ONLINE       0     0     0
	    /var/lib/zfs_img/zfs2.img  ONLINE       0     0     0
	    /var/lib/zfs_img/zfs3.img  ONLINE       0     0     0

errors: No known data errors
4800.pts-1.t470 root@t470 2021-12-31 01:12:22 /var/db/repos/gentoo # rc-service lxd stop
 * Stopping lxd service and containers, waiting 40s ...
 [ ok ]
4800.pts-1.t470 root@t470 2021-12-31 01:12:43 /var/db/repos/gentoo # zpool status
no pools available
4800.pts-1.t470 root@t470 2021-12-31 01:12:46 /var/db/repos/gentoo # zpool get cachefile
4800.pts-1.t470 root@t470 2021-12-31 01:12:50 /var/db/repos/gentoo #

Reproducible: Always

Actual Results:
The zfs pools are detached at every shutdown so that they must be manually added upon reboot.

Expected Results:
The zfs pools should persist across reboots.

# emerge --info
Portage 3.0.28 (python 3.9.9-final-0, default/linux/amd64/17.1/desktop/plasma, gcc-11.2.0, glibc-2.33-r7, 5.4.156-gentoo x86_64)
=================================================================
System uname: Linux-5.4.156-gentoo-x86_64-Intel-R-_Core-TM-_i7-7600U_CPU_@_2.80GHz-with-glibc2.33
KiB Mem:    32752272 total,  28508556 free
KiB Swap:   32813020 total,  32813020 free
Timestamp of repository gentoo: Wed, 29 Dec 2021 07:22:01 +0000
Head commit of repository gentoo: 877105acb404e9d4c085976fd8d545f7592cf696
sh bash 5.1_p8
ld GNU ld (Gentoo 2.37_p1 p0) 2.37
ccache version 4.4.2 [disabled]
app-misc/pax-utils:        1.3.3::gentoo
app-shells/bash:           5.1_p8::gentoo
dev-java/java-config:      2.3.1::gentoo
dev-lang/perl:             5.34.0-r3::gentoo
dev-lang/python:           2.7.18_p13::gentoo, 3.9.9::gentoo, 3.10.0_p1::gentoo
dev-lang/rust:             1.56.1::gentoo
dev-util/ccache:           4.4.2::gentoo
dev-util/cmake:            3.21.4::gentoo
dev-util/meson:            0.59.4::gentoo
sys-apps/baselayout:       2.7-r3::gentoo
sys-apps/openrc:           0.44.10::gentoo
sys-apps/sandbox:          2.25::gentoo
sys-devel/autoconf:        2.13-r1::gentoo, 2.71-r1::gentoo
sys-devel/automake:        1.13.4-r2::gentoo, 1.16.4::gentoo
sys-devel/binutils:        2.37_p1::gentoo
sys-devel/binutils-config: 5.4::gentoo
sys-devel/clang:           13.0.0::gentoo
sys-devel/gcc:             11.2.0::gentoo
sys-devel/gcc-config:      2.4::gentoo
sys-devel/libtool:         2.4.6-r6::gentoo
sys-devel/lld:             13.0.0::gentoo
sys-devel/llvm:            13.0.0::gentoo
sys-devel/make:            4.3::gentoo
sys-kernel/linux-headers:  5.15-r1::gentoo (virtual/os-headers)
sys-libs/glibc:            2.33-r7::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/gentoo.git
    priority: -1000

localrepo
    location: /var/db/repos/localrepo
    masters: gentoo

haskell
    location: /var/lib/layman/haskell
    masters: gentoo
    priority: 50

jorgicio
    location: /var/lib/layman/jorgicio
    masters: gentoo
    priority: 50

science
    location: /var/lib/layman/science
    masters: gentoo
    priority: 50

torbrowser
    location: /var/lib/layman/torbrowser
    masters: gentoo
    priority: 50

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="@FREE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=skylake -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/maven-bin-3.8/conf"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=skylake -O2 -pipe"
DISTDIR="/var/cache/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-march=skylake -O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-march=skylake -O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="de_DE.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="de en fr nb es"
MAKEOPTS="-j2"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/bash"
USE="X a52 aac acl acpi activities alsa amd64 bluetooth branding bzip2 cairo cdda cdr cli crypt cups dbus declarative dri dts dvd dvdr elogind emboss encode exif flac fortran gdbm gif gpm gtk gui iconv icu ipv6 jpeg kde kipi kwallet lcms libglvnd libnotify libtirpc mad mng mp3 mp4 mpeg multilib ncurses networkmanager nls nptl ogg opengl openmp pam pango pcre pdf plasma png policykit ppds pulseaudio qml qt5 readline sdl seccomp semantic-desktop spell split-usr ssl startup-notification svg tiff truetype udev udisks unicode upower usb vorbis widgets wxwidgets x264 xattr xcb xml xv xvid zlib"
ABI_X86="64"
ADA_TARGET="gnat_2020"
APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias"
CALLIGRA_FEATURES="karbon sheets words"
COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog"
CPU_FLAGS_X86="mmx mmxext sse sse2"
ELIBC="glibc"
GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx"
INPUT_DEVICES="libinput"
KERNEL="linux"
L10N="de de-DE en-CA en-GB fr-CA fr nb es-ES es"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer"
LUA_SINGLE_TARGET="lua5-1"
LUA_TARGETS="lua5-1"
OFFICE_IMPLEMENTATION="libreoffice"
PHP_TARGETS="php7-3 php7-4"
POSTGRES_TARGETS="postgres12 postgres13"
PYTHON_SINGLE_TARGET="python3_9"
PYTHON_TARGETS="python3_9"
RUBY_TARGETS="ruby26 ruby27"
USERLAND="GNU"
VIDEO_CARDS="amdgpu fbdev intel nouveau radeon radeonsi vesa dummy v4l"
XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LEX, LFLAGS, LIBTOOL, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS
Upstream has informed me that this is expected behaviour. If supplied with a complete ZFS pool, LXD will export it on shutdown for safety. On startup, LXD only looks for pools in /var/lib/lxd/disks/ or in block devices (visible with lsblk).

The suggested solutions (quoted from https://discuss.linuxcontainers.org/t/lxd-exports-zfs-pools-at-shutdown-but-does-not-import-them-properly-at-startup/13031/2 ):

- Put an init job that runs prior to LXD and runs the correct zpool import for your setup, or
- Relocate your zpool backing files to /var/lib/lxd/disks. Note that this option isn’t exactly ideal as LXD generally expects that directory to be used for loop-backed pools that it itself manages, so it may get confused by files that don’t line up.

I'll put a note to this effect in the Gentoo LXD wiki page.
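For what it's worth, the first option could be sketched as a small OpenRC service of its own. This is only an illustration for my setup: the script name is made up, and the pool name and image directory come from the transcript above.

```shell
#!/sbin/openrc-run
# /etc/init.d/zfs-lxd-import (hypothetical name): import the file-backed
# pool before lxd starts, so lxd finds its storage at boot.

description="Import the zfs_lxd pool before lxd starts"

depend() {
    # Backing files live on a local filesystem, so wait for localmount,
    # and make sure this runs before the lxd service.
    need localmount
    before lxd
}

start() {
    ebegin "Importing zfs_lxd pool"
    # -d points zpool at the directory containing the backing files
    # instead of scanning /dev.
    zpool import -d /var/lib/zfs_img zfs_lxd
    eend $?
}
```

After `rc-update add zfs-lxd-import default`, the import would happen automatically at boot.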
I thought about this and it's crazy to me that upstream hard-codes this. Shouldn't it allow variable search paths? Feel free to close again if you feel the bug is invalid. But I think we can do better here.
Docker allows relocation of its storage indeed, which I always use for zfs.

On systemd you can at least add an explicit generator for this use case; this is what I do on systemd machines:

zfs set org.openzfs.systemd:required-by=docker.service pool/dataset/docker

so it will be mounted before docker starts, and unmounted after it stops too.

On an openrc system you could add a start_pre() function to /etc/conf.d/lxd that takes care of importing your pool. An example:

start_pre() {
    if ! grep -q zfs_lxd <(zpool list); then
        zpool import zfs_lxd
    fi
}
Hi Georgy, thanks for this.

(In reply to Georgy Yakovlev from comment #3)
> on openrc system you could add a start_pre() function to /etc/conf.d/lxd
> that takes care of importing your pool.
>
> an example:
>
> start_pre() {
>     if ! grep -q zfs_lxd <(zpool list); then
>         zpool import zfs_lxd
>     fi
> }

This solution is a good start but is not sufficient, because LXD exports the pool on shutdown, meaning it will not appear in the output of zpool list:

> # zpool list
> no pools available
> # zpool import zfs_lxd
> cannot import 'zfs_lxd': no such pool available

I've filed issue 9739 with the upstream maintainers, but I'd like to see if we can do something sooner for Gentoo, because we will be waiting some time for this fix.

Another problem I'm encountering: if the only active network interface is wireless and the system is suspended, NetworkManager becomes inactive and lxd stops, exporting the pool as it does so. The result is that the pools are gone when the machine resumes from suspend, and I have to import them manually again.
(In reply to Stephen Bosch from comment #4)
> Hi Georgy, thanks for this.
>
> (In reply to Georgy Yakovlev from comment #3)
> > on openrc system you could add a start_pre() function to /etc/conf.d/lxd
> > that takes care of importing your pool.
> >
> > an example:
> >
> > start_pre() {
> >     if ! grep -q zfs_lxd <(zpool list); then
> >         zpool import zfs_lxd
> >     fi
> > }
>
> This solution is a good start but is not sufficient because LXD exports the
> pool on shutdown, meaning it will not appear in the output of zpool list:
>
> > # zpool list
> > no pools available
> > # zpool import zfs_lxd
> > cannot import 'zfs_lxd': no such pool available

I believe that's an issue specific to your configuration. Normally 'zpool import poolname' is enough to import a pool if it's not imported. The guard here is to prevent an import attempt if it's already imported, so it does exactly what it was intended to do.

You obviously import your pool differently, so extend the command with your requirements, like:

zpool import -d /var/lib/zfs_img zfs_lxd

By the way, why are you using such a weird zfs setup? It will work much better with partitions than with disk images on a non-zfs filesystem.

> I've filed issue 9739 with the upstream maintainers, but I'd like to see if
> we can do something sooner for Gentoo because we will be waiting some time
> for this fix.
>
> Another problem I'm encountering: If the only active network interface is
> wireless and the system is suspended, NetworkManager becomes inactive and
> lxd will stop, exporting the pool as it does so. The result is that the
> pools are gone when the machine resumes from suspend and I have to import
> manually again.
And just to prevent possible confusion: exported pools never appear in zpool list output.

if ! grep -q zfs_lxd <(zpool list); then

^ This line checks the zpool list output and makes sure zfs_lxd is NOT there, i.e. not imported. If that's the case, it runs the import command.

And since you use disk images, you need to specify the directory in which to look for block devices (images in your case), so

zpool import -d /var/lib/zfs_img ...

does exactly that: it tells zpool to scan /var/lib/zfs_img, and not /dev, for pools.
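Putting the two pieces together, a conf.d hook adapted to a file-backed pool could look like this (a sketch only; the pool name and image directory are taken from the setup described in this bug):

```shell
# /etc/conf.d/lxd -- start_pre() is run by OpenRC before starting lxd.
start_pre() {
    # Only attempt the import if the pool is not already imported;
    # exported pools never show up in `zpool list`.
    if ! grep -q zfs_lxd <(zpool list); then
        # -d tells zpool to scan the image directory instead of /dev.
        zpool import -d /var/lib/zfs_img zfs_lxd
    fi
}
```

This keeps the guard from comment #3 and adds the -d flag needed when the vdevs are raw files rather than block devices.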
I have now tested your function in /etc/conf.d/lxd, Georgy:

start_pre() {
    if ! grep -q zfs_lxd <(zpool list); then
        zpool import zfs_lxd
    fi
}

and can confirm it works beautifully. It also works even if the lxd service has to wait for NetworkManager to come up (as might be the case on wireless connections). Many thanks!
I will close this bug now, as we are just waiting for upstream to add this feature.
Sorry I'm a bit late to the party, but will upstream's fix achieve the same result as the openrc conf.d edit? In other words, is the conf.d modification still needed _after_ upstream fixes this?

And huge thanks to Georgy for handling it. While I do use ZFS nowadays, it's not in use where my LXD containers are.
This is a system-specific configuration trick and is unlikely to be needed after upstream adds a knob to avoid exporting the pool on lxd shutdown. The pool will just remain active if lxd goes down.

It MAY still be needed on this particular setup, where zfs is used on top of raw files instead of disks, but that's very, very non-standard, and there is nothing you can do for that in an init.d script in a generic way. The conf.d hook is the perfect solution.
There's one thing I can think of that can somewhat improve the situation: adding an ordering dependency on the zfs services in the appropriate files. This affects systemd too.

A similar bug for docker: https://bugs.gentoo.org/680094 — in the comments I suggest that the user add rc_need="zfs-mount", but that's a hard dependency, specific to one system.

The init.d script could instead specify rc_after for zfs-mount or zfs-import. This will still work on zfs-less systems, but will order services properly if zfs is present.

The same goes for the systemd unit: After=zfs.target should prevent races and ordering issues without hard dependencies.

I'm not familiar with lxd at all, and I don't know if this should go into the lxd or lxcfs scripts. But it could make things just a little bit more robust.
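To illustrate the soft-ordering idea, the init script change might look roughly like this (a sketch; the actual dependency list of the Gentoo lxd init script is not shown here, and service names may differ between setups):

```shell
# In /etc/init.d/lxd, extend depend() with a soft ordering on the zfs
# services. "after" is only an ordering hint: on systems without zfs
# these services simply don't exist and the hint is ignored.
depend() {
    # ... existing dependencies of the lxd service stay here ...
    after zfs-import zfs-mount
}

# The systemd equivalent would be a drop-in, e.g.
# /etc/systemd/system/lxd.service.d/zfs-order.conf:
#
#   [Unit]
#   After=zfs.target
```

Unlike rc_need, neither form creates a hard dependency, so the scripts remain usable on zfs-less systems.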
It may already be covered by the dependency on localmount / local-fs.target, because zfs is ordered before that. But again, it's not critical, just a tiny improvement I could think of.
There's https://bugs.gentoo.org/817287 for the init.d/service files update pending - I should try to get these both done at once.