Bug 351861 - genkernel should support native ZFS on Linux
Summary: genkernel should support native ZFS on Linux
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High normal with 1 vote
Assignee: Gentoo Genkernel Maintainers
Reported: 2011-01-16 18:20 UTC by devsk
Modified: 2012-06-17 01:19 UTC

Attachments
Patch to ebuild for zfs USE flag (genkernel-ebuild.patch, 506 bytes, patch)
2011-05-31 03:06 UTC, Zachary Bedell
Patch for first attempt at initramfs ZFS support using genkernel (genkernel-zfs-draft1.patch, 17.61 KB, patch)
2011-05-31 03:07 UTC, Zachary Bedell
Demo script which dangerously sets up a sample ZFS root config (zfsinst.sh, 12.62 KB, application/x-sh)
2011-05-31 03:07 UTC, Zachary Bedell

Description devsk 2011-01-16 18:20:23 UTC
This is a placeholder for 3.4.12 features we need to have.

Native ZFS on Linux is real. Gentoo can be the first distro to release booting from a native ZFS rootfs. I have it working, but the patch is a little dirty. I will be cleaning it up and attaching it here by today.
Comment 1 Alexey Shvetsov archtester gentoo-dev 2011-05-01 10:08:40 UTC
Hmm. So how about a patch for genkernel? Alternatively, you could fix the dracut stuff from zfs to make it work with current dracut, and use genkernel from the aidecoe overlay.
Comment 2 devsk 2011-05-01 15:17:57 UTC
Actually, "here by today" sounds silly now, because my experiments with KQI's ZFS failed. Eventually, I picked up LLNL's ZFS code and have been using it for a while. Seems stable. The good thing is that RuddO wrote a dracut module for it, and it even works for booting from a ZFS rootfs. I have ditched genkernel (I used it mainly for initrd) and completely moved to dracut (dracut is how genkernel should have handled initrd). So, this bug can be closed now.
Comment 3 Alexey Shvetsov archtester gentoo-dev 2011-05-01 17:58:11 UTC
The dracut code shipped with zfs from llnl doesn't work with the latest dracut revision from portage. It simply cannot find the zfs root. Or can you write some kind of howto for booting from a zfs root?
Comment 4 devsk 2011-05-01 19:12:34 UTC
(In reply to comment #3)
> dracut code shipped with zfs from llnl doesnt work with latest dracut revision
> from portage. It simply cannot find zfs root. Or can you write some kind of
> howto to boot from zfs root?

I haven't tested the latest dracut yet, but I had a version of dracut working with a ZFS rootfs. I would definitely write a howto, but the biggest issue is time. So much to do, so little time.
Comment 5 Alexey Shvetsov archtester gentoo-dev 2011-05-01 19:51:57 UTC
(In reply to comment #4)
> I haven't tested the latest dracut yet. But I had a version of dracut working
> with ZFS rootfs. I would definitely write a howto. But the biggest issue is
> time. So much to do so little time.

OK, but which dracut version works for you, and what cmdline do you use?
Comment 6 devsk 2011-05-01 20:02:00 UTC
dracut-008-r1 and the command line I use to create initrd is 

time dracut -k /lib/modules/${kversion} -d "zfs zcommon znvpair zavl zunicode spl zlib_deflate" --install 'nano lsmod find grep df bash lsof' --lzma --force -o "mdraid dm dmraid crypt i18n" /boot/initramfs-${kver} ${kversion}

kver is x86_64-2.6.38.4
kversion is 2.6.38.4
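
The comment gives values for the two variables but not how they are produced. As a minimal sketch, assuming kver is simply "<arch>-<kversion>" (which matches the values quoted above) and that `uname` describes the kernel you want an image for, the command line could be assembled like this. The function names are hypothetical, the `time` prefix is dropped, and the script only prints the dracut command rather than running it:

```shell
#!/bin/sh
# Sketch only: derive the names used in the dracut command quoted above.
# Assumption: kver is "<arch>-<kversion>", as the quoted values suggest.

# Build the image name component from an architecture and a kernel version.
make_kver() {
    printf '%s-%s\n' "$1" "$2"
}

# Print (do not run) the dracut command for a given arch and kernel version.
print_dracut_cmd() {
    kversion=$2
    kver=$(make_kver "$1" "$kversion")
    printf '%s\n' "dracut -k /lib/modules/${kversion} -d \"zfs zcommon znvpair zavl zunicode spl zlib_deflate\" --install 'nano lsmod find grep df bash lsof' --lzma --force -o \"mdraid dm dmraid crypt i18n\" /boot/initramfs-${kver} ${kversion}"
}

# Example: use the running kernel; pass explicit values for another kernel.
print_dracut_cmd "$(uname -m)" "$(uname -r)"
```

Calling `print_dracut_cmd x86_64 2.6.38.4` reproduces the paths from the comment, so the printed line can be checked by eye before it is run for real.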

The grub entry is:

title=Gentoo Linux x86_64-2.6.38.4 ZFS
        find --set-root /signv
        kernel /x86_64-2.6.38.4 root=ZFS=rpool/root resume=LABEL=swap \
                splash=quiet,theme:livecd-2007.0 console=tty1 quiet
        initrd /initramfs-x86_64-2.6.38.4

This is the setup on my laptop.

Hope that helps.
Comment 7 Zachary Bedell 2011-05-31 03:06:44 UTC
Created attachment 275259
Patch to ebuild for zfs USE flag
Comment 8 Zachary Bedell 2011-05-31 03:07:16 UTC
Created attachment 275261
Patch for first attempt at initramfs ZFS support using genkernel
Comment 9 Zachary Bedell 2011-05-31 03:07:54 UTC
Created attachment 275263
Demo script which dangerously sets up a sample ZFS root config
Comment 10 Zachary Bedell 2011-05-31 03:08:15 UTC
Just wanted to chime in that I'm working on getting genkernel to produce ZFS-capable initramfs images.  I've managed to get it "working" insofar as it'll boot a system, but to say that it's a bit rough would be overly polite...

I'll attach patches in case anyone wants to point & laugh.  This is my first non-trivial foray into proffering patches for Gentoo's guts, so please be kind about all the policies & practices I've no doubt trampled in the process.


Implementing the change so far has required a minor change to genkernel's ebuild to add a new 'zfs' USE flag and trigger dependency on sys-devel/spl and sys-fs/zfs.  Both of those need to be pulled in from the 'science' layman overlay at this point, so it definitely won't work out-of-box.  The changes to genkernel itself have been surprisingly minor, though the end results aren't stellar.

Assuming this patch builds for anyone else, using it would entail:
 * Add ZFS="yes" to your /etc/genkernel.conf
 * Add the 'science' layman overlay and make sure sys-fs/zfs emerges correctly.  I'm currently using the direct git pull (9999) version rather than attempting to use one of the releases.  I don't think anything in my genkernel mods depends on latest git, but I did find that some bugs that affected me were resolved since the last rc release.
 * Build a new initramfs with genkernel.  You may need to run `genkernel ... all` rather than just `initramfs`: in at least one case, module version mismatches killed startup, and rebuilding the kernel helped.
 * Add something like the following to grub.conf: (Note no real_root param)

title Gentoo Linux ZFS
  root (hd0,0)
  kernel /boot/kernel-... root=/dev/ram0 dozfs
  initrd /boot/initramfs-...

 * Ensure the pool you want as root has a bootfs attribute set.
 * Set the mountpoint on that fs to 'legacy'.
 * Set the mountpoint for the root of the pool to '/'.
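
The last three steps map onto ordinary `zpool set` / `zfs set` calls. A minimal sketch, assuming a pool named "rpool" with a root dataset "rpool/ROOT" (names borrowed from the listing later in this comment) and a hypothetical helper that only prints the commands instead of executing them:

```shell
#!/bin/sh
# Sketch only: print the property changes the last three steps describe.
# Assumptions: pool "rpool", root dataset "rpool/ROOT"; nothing is executed.

print_zfs_root_setup() {
    pool=$1
    rootfs=$2
    # Mark the dataset the initramfs should pick up as root.
    echo "zpool set bootfs=${rootfs} ${pool}"
    # The root dataset is mounted by hand (mount -t zfs), hence 'legacy'.
    echo "zfs set mountpoint=legacy ${rootfs}"
    # Remaining datasets inherit their mountpoints from the pool root.
    echo "zfs set mountpoint=/ ${pool}"
}

print_zfs_root_setup rpool rpool/ROOT
```

On a running system you would probably want the pool imported under an alternate root (for example `zpool import -R /mnt/gentoo`) before setting a mountpoint of '/', so that nothing gets mounted over the live filesystem.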

See attached zfsinst.sh script for a sure-fire way to toast your root install and/or migrate it to ZFS root with the above (and possibly other) assumptions in place...

Once the above is satisfied, you should be able to reboot, pick your newly created Grub entry, and with any luck, ZFS will be mounted as root and Gentoo will boot off it.  To the best of my knowledge, leaving 'dozfs' off the kernel command line will completely deactivate the changes I've made, so it should be safe to use the same initramfs for both ZFS and non-ZFS booting.  That said, making a copy of a working initramfs and stashing it in a failsafe Grub entry is probably a GoodIdea.

As I might have mentioned a few times, this is pretty rough.  Known issues include:

 * zfsinst.sh should NOT be run on any non-junk system.  It makes a boatload of assumptions that are specific to the VM I'm testing in and does very little sanity checking to ensure those assumptions are the least bit valid.  Specifically, it will repartition your /dev/sdb and /dev/sdc, reinstall Grub, destroy and recreate any existing pool named 'rpool', and do a bunch of other VeryBadThings(tm).  That said, it does embed all the requirements to make this work, and the code's probably (slightly) clearer than my rambling...  If you wanted to set up a test VM, just create something with three scsi drives, put your Gentoo install on sda, and sdb/c will be used for a mirrored zpool by this script.

 * ZFS doesn't currently build statically, so a bunch of dylibs are pulled into the initramfs to support it.  It's a fair bit of extra rd size (+~4.2MB LZMA'd).  I borrowed (under GPL-2) dylib resolution scripts from Dracut in order to figure out what a binary needs, resolve symlinks, etc.  Those routines are added as gen_dylibs.sh and are minimally changed from the Dracut code in order to shim in genkernel's logging & such.

 * The initramfs will attempt to find a rootfs by one of two means:  (assumes dozfs on kernel command line)
   > If real_root is set and in the form ZFS=pool/fs, it will parse it, import the named pool, mount the fs as /newroot and then mount all other non-legacy zfs' on that pool using their configured mountpoints relative to /newroot.

   > If real_root is set and NOT in the form ZFS=*, ZFS ignores it and the normal rootfs finding procedure applies.

   > If real_root is NOT set, it will import all pools on your system (using -f) and scan for the first one that has the bootfs attribute set.  It will then set real_root to that fs and proceed as above.  If no importable pools have bootfs set, it will emit a warning and fall back to normal rootfs finding.


 * The initramfs currently mounts ALL non-legacy filesystems on the pool containing the bootfs.  It imports the pool with `zpool import -f -N -R /newroot`, then mounts the rootfs with `mount -t zfs ${parsed_real_root} /newroot`.  So far that's okay, but...

 * Then it does `zfs mount -a` to pull in everything else and hopes for the best when switch_root flops everything over.  This is bad, wrong, and probably not what you want, but it's what I need for the way I lay out my disks.  Specifically, I create multiple zfs in the rpool for different purposes:

NAME         USED  AVAIL  REFER  MOUNTPOINT
rpool       2.50G  56.1G    21K  /mnt/gentoo
rpool/ROOT  93.1M  56.1G  93.1M  legacy
rpool/boot  19.6M  56.1G  19.6M  /mnt/gentoo/boot
rpool/etc   1.88M  56.1G  1.88M  /mnt/gentoo/etc
rpool/home    30K  56.1G    30K  /mnt/gentoo/home
rpool/root  10.5M  56.1G  10.5M  /mnt/gentoo/root
rpool/usr   2.31G  56.1G  2.31G  /mnt/gentoo/usr
rpool/var   68.8M  56.1G  68.8M  /mnt/gentoo/var

I got in the habit of that layout from FreeBSD, and I like it because you can snap configuration files, set different compression on logs in /var, etc.  Creating sub-zfs' a little deeper lets you do things like keep your /(?:usr|opt)/local and homes safe during OS upgrades, keep the portage package tree compressed, etc.  It's very handy, and the simple layout above is just the start of what can be done.
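
For anyone wanting to reproduce a layout like the one listed above, here is a rough sketch that prints the corresponding `zfs create` commands. It assumes the pool is called "rpool" and has its mountpoint set to '/', so that child filesystems inherit /boot, /etc and so on; `compression=on` for var is only an illustrative choice, and the function name is hypothetical. Nothing is executed:

```shell
#!/bin/sh
# Sketch only: print zfs create commands for a layout like the one above.
# Assumptions: the pool has mountpoint=/ so children inherit /boot, /etc, ...;
# compression=on for var is an illustrative setting, not taken from the post.

print_layout_cmds() {
    pool=$1
    # The root dataset is mounted manually by the initramfs.
    echo "zfs create -o mountpoint=legacy ${pool}/ROOT"
    for fs in boot etc home root usr var; do
        case $fs in
            var) opts="-o compression=on " ;;
            *)   opts="" ;;
        esac
        echo "zfs create ${opts}${pool}/${fs}"
    done
}

print_layout_cmds rpool
```

Deeper filesystems (a separate one for the portage tree, say) would follow the same pattern with their own `-o` options.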

The problem is that to boot the system, I need at least ROOT, etc, usr, and var mounted before the system rc scripts could start running.  init.d scripts could likely take care of /home /root, and anything else, but the OS guts need to be mounted before that can happen.  

My current "solution" is to mount everything on the pool, which gets the job done but has undesirable side effects.  Specifically:

 * Everything shows up in `zfs list` after boot as mounted under "/newroot"
 * The initramfs isn't able to completely clean up the rootfs ramdisk
 * The init.d scripts for ZFS lose their mind when they come up
 * Something segfaults trying to umount everything at shutdown.
 * Pool usually needs '-f' to import after reboot.
 * zpool.cache isn't used properly.
Minor stuff, ya know.....
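
To make the behaviour behind those side effects concrete, the real_root handling and the import/mount sequence described earlier in this comment can be sketched as below. This is not the code from the attached patch: the function names are hypothetical, the pool name is appended to the import command for illustration, and the commands are only printed. The bootfs scan for the "real_root not set" case is left out.

```shell
#!/bin/sh
# Sketch only: the real_root handling described earlier in this comment.
# Not the code from the attached patch; function names are hypothetical.

# Classify a real_root value as "scan" (unset), "zfs" (ZFS=pool/fs) or "other".
real_root_kind() {
    case $1 in
        "")    echo "scan" ;;    # unset: import pools and look for bootfs
        ZFS=*) echo "zfs" ;;     # explicit ZFS dataset
        *)     echo "other" ;;   # leave to the normal rootfs lookup
    esac
}

# Strip the ZFS= prefix, leaving pool/fs.
zfs_dataset_of() {
    echo "${1#ZFS=}"
}

# The pool is the first component of the dataset name.
zfs_pool_of() {
    ds=$(zfs_dataset_of "$1")
    echo "${ds%%/*}"
}

# Print (do not run) the import and mount sequence for a ZFS root.
print_mount_sequence() {
    echo "zpool import -f -N -R /newroot $(zfs_pool_of "$1")"
    echo "mount -t zfs $(zfs_dataset_of "$1") /newroot"
    echo "zfs mount -a"
}

real_root="ZFS=rpool/ROOT"
if [ "$(real_root_kind "$real_root")" = "zfs" ]; then
    print_mount_sequence "$real_root"
fi
```

The final `zfs mount -a` is the step that drags in every non-legacy filesystem under /newroot, which is where the issues listed above come from.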

I suspect (though have not confirmed in source) that the way FreeBSD and Solaris both handle this is to have the bootloader capable of reading /etc/zfs/zpool.cache from the bootfs and only mounting what was mounted at shutdown.  My FreeBSD system does actually have /etc as a simple subdirectory of rpool/ROOT (not a sub-zfs), so it's likely the cache could be read from there, and it's even possible that the system might be able to get far enough in boot with just /etc, /bin, and /sbin (not /var or /usr) to run an init script to bring ZFS up properly.  I'm tempted to go take a peek at FreeBSD, but on the other hand, I haven't (yet...) tainted myself by looking at any CDDL code, and I'm not sure if that might be a necessary state to be in given that Grub and genkernel both need to be GPL-2.

Speaking of Grub, try as I might, I've not managed to get grub2's ZFS support to work, so that's not yet an option for reading zpool.cache.  That's probably also not a great option given the separation of grub and genkernel and the significantly tightened dependency between them if they need to work together to extract boot time filesystems.  The above work is all done on Grub 0.97 using an ext2 based boot partition.  

And speaking of partitions... The zfsinst.sh does things a bit oddly there.  It leaves about 4MB of space before the first partition as extra embedding space required by my attempts to use grub2.  Then it creates 64MB boot ext2 partitions on both mirror devs and DOESN'T md mirror them as I had no luck getting Grub to boot off mdraid (probably a grub2 bug, but I didn't go back and try to fix it with grub1).  Also, there's no swap, so beware if you try to use the script.


So all told, it's probably a ham-fisted start, but I figured I'd flush it out to the intarwebs and see if anyone else could make something out of it.  I'm not ready to give up yet, but as I have run out of three-day weekend, I figured I'd post what I've got at least.

Any input on how to improve the patches or resolve the multi-fs issue would be most welcome.  I believe Gentoo & genkernel in particular work on an author-keeps-copyright approach, and if so, then I'll assert here that I'm the author of these changes (save for the library code pulled from Dracut under GPL-2), and that I license my changes under GPL-2.  If these changes ever go anywhere and anyone needs copyright assignment, I'd be willing to do that as well.

Best regards,
Zac Bedell
Comment 11 Sebastian Pipping gentoo-dev 2011-05-31 03:20:16 UTC
Zachary, genkernel is moving towards using Dracut in the long run, so it would make sense to team up with developer aidecoe on the Dracut side of this for the future.  Without a counterpart in Dracut, a feature like ZFS would mean one more piece on aidecoe's stack of things to port to Dracut.

Btw if there's anything else that we can involve you with in genkernel land, please let us know at genkernel@g.o.  We need hands like yours.
Comment 12 Richard Yao (RETIRED) gentoo-dev 2012-06-17 01:19:03 UTC
ZFS support entered genkernel in version 3.4.24. It is in good shape as of 3.4.32.

Closing as fixed.