With zfs-kmod-0.6.1-r1, boot hangs after displaying "Mounting ZFS filesystems". I have also tried to run "zfs mount" from single-user mode, and that also hangs. Downgrading to 0.6.1 helps. Since I have one zvol, I suspect it has something to do with zfs-kmod-0.6.1-fix-zvol-initialization.patch.

    $ LANG=C sudo zfs get all pool0/ibdata
    NAME          PROPERTY              VALUE                  SOURCE
    pool0/ibdata  type                  volume                 -
    pool0/ibdata  creation              Wed Apr 24 23:35 2013  -
    pool0/ibdata  used                  851M                   -
    pool0/ibdata  available             166G                   -
    pool0/ibdata  referenced            813M                   -
    pool0/ibdata  compressratio         1.00x                  -
    pool0/ibdata  reservation           none                   default
    pool0/ibdata  volsize               800M                   local
    pool0/ibdata  volblocksize          4K                     -
    pool0/ibdata  checksum              on                     default
    pool0/ibdata  compression           off                    default
    pool0/ibdata  readonly              off                    default
    pool0/ibdata  copies                1                      default
    pool0/ibdata  refreservation        851M                   local
    pool0/ibdata  primarycache          all                    default
    pool0/ibdata  secondarycache        all                    default
    pool0/ibdata  usedbysnapshots       0                      -
    pool0/ibdata  usedbydataset         813M                   -
    pool0/ibdata  usedbychildren        0                      -
    pool0/ibdata  usedbyrefreservation  37.9M                  -
    pool0/ibdata  logbias               latency                default
    pool0/ibdata  dedup                 off                    inherited from pool0
    pool0/ibdata  mlslabel              none                   default
    pool0/ibdata  sync                  standard               default
    pool0/ibdata  refcompressratio      1.00x                  -
    pool0/ibdata  written               813M                   -
    pool0/ibdata  snapdev               hidden                 default

pool0 has mountpoint=/ but canmount=off. My real / is ext3.

Please ask for any additional information you think is relevant.

Reproducible: Always
Oh, the kernel version might be relevant. I'm running a home-built 3.9.4.
Code involving zvols should not be able to cause a hang at "Mounting ZFS filesystems". Would you try removing zfs from the boot runlevel, rebooting, running `zpool import -N rpool0`, and then running `zfs mount -a`?
I checked again, and the hang is actually at "Importing ZFS pools". I'm very sorry about that; I waited a couple of days before reporting it and apparently remembered wrong.

Anyway, I have now tested your suggestion. If I boot single-user, the system hangs when I run "zpool import -N rpool0".

I have also tested this on a second machine by creating a zvol, and I have no problem at all there.

The kernel is now 3.9.5.
I have done some more testing. I stowed away the zvol and removed it:

    sudo dd bs=4096 conv=sparse if=/dev/pool0/ibdata of=/slask/ibdata
    sudo zfs destroy pool0/ibdata

After that, I don't get the hang. And when I later added the zvol back, the hang reappeared.
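For reference, `conv=sparse` makes GNU dd seek over all-zero output blocks instead of writing them, so holes in the source are preserved in the copy. A minimal runnable sketch of the same kind of copy on an ordinary file (hypothetical temp paths, not the real /dev/pool0/ibdata):

```shell
# Create a 4 MiB file whose second half is a hole (reads back as zeros).
tmp=$(mktemp -d)
dd if=/dev/urandom of="$tmp/src" bs=4096 count=512 2>/dev/null
truncate -s 4M "$tmp/src"

# Copy block by block; conv=sparse seeks over all-zero output blocks,
# so the destination stays sparse where the source had holes.
dd bs=4096 conv=sparse if="$tmp/src" of="$tmp/copy" 2>/dev/null

# Verify the copy is byte-identical to the source.
result=$(cmp -s "$tmp/src" "$tmp/copy" && echo identical)
echo "$result"
```

The copy reads back identically even though the destination may occupy far fewer disk blocks than its apparent size.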
Created attachment 350538 [details, diff]
Patch that might fix the issue

(In reply to Christer Ekholm from comment #3)
> I checked again. And the hang is actually at
> "Importing ZFS pools". I'm very sorry about that. I waited a couple of
> days between before I reported it. And apparently remembered wrong.
>
> Anyway, I have tested your suggestion now.
> If I boot sinle user, the system hang when I run
> "zfs import -N rpool0"
>
> I have also tested this on a second machine by creating a zvol. And I
> have no problem at all there.
>
> The kernel is now 3.9.5

I am attaching a patch. Would you place it at /etc/portage/patches/sys-fs/zfs-kmod-0.6.1-r1/zfs-kmod-0.6.1-zvol-initialization.patch, rebuild sys-fs/zfs-kmod-0.6.1-r1, and let me know if it works?
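For anyone unfamiliar with Portage's user-patch mechanism, the directory layout it expects looks like this. A runnable sketch using a scratch directory in place of the real /etc (on a real system the root is /, the patch file is the attachment from this bug, and you would follow up with `emerge --oneshot sys-fs/zfs-kmod`):

```shell
# Scratch root so the sketch is safe to run anywhere; on a real system this is /.
root=$(mktemp -d)

# Portage picks up user patches from /etc/portage/patches/<category>/<package-version>/
patchdir="$root/etc/portage/patches/sys-fs/zfs-kmod-0.6.1-r1"
mkdir -p "$patchdir"

# Placeholder standing in for the patch attached to this bug.
printf 'placeholder patch body\n' > "$patchdir/zfs-kmod-0.6.1-zvol-initialization.patch"

ls "$patchdir"
```

The patch is then applied automatically during the package's prepare phase when the package is rebuilt.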
Created attachment 350540 [details, diff]
Patch that might fix the issue

The previous patch included code from another patch by mistake. I am attaching a new version that removes it. Please use this one instead.
No, that didn't help.
I have experimented a bit by adding some printk calls to the code:

    --- /tmp/portage/sys-fs/zfs-kmod-0.6.1-r1/work/zfs-zfs-0.6.1/module/zfs/zvol.c 2013-06-14 23:36:31.547361236 +0200
    +++ zvol.c      2013-06-14 23:41:33.443424000 +0200
    @@ -102,6 +102,7 @@
            if (*minor >= (1 << MINORBITS))
                    return ENXIO;
    +       printk(KERN_ALERT "ZFS: zvol_find_minor: %u\n", *minor);
            return 0;
     }

    @@ -1213,6 +1214,7 @@
            zvol_state_t *zv;
            int error = 0;
    +       printk(KERN_ERR "ZFS: zvol_alloc: %s", name);
            zv = kmem_zalloc(sizeof (zvol_state_t), KM_SLEEP);
            if (zv == NULL)
                    goto out;

    @@ -1481,6 +1483,7 @@
            spa_t *spa = NULL;
            int error = 0;
    +       printk(KERN_ALERT "ZFS: zvol_create_minors: %s\n",pool);
            if (zvol_inhibit_dev)
                    return (0);

    @@ -1502,6 +1505,7 @@
            }
            mutex_exit(&zvol_state_lock);
    +       printk(KERN_ALERT "ZFS: zvol_create_minors: done\n");
            return error;
     }

    @@ -1569,6 +1573,7 @@
     zvol_init(void)
     {
            int error;
    +       printk(KERN_ALERT "ZFS: zvol_init\n");
            list_create(&zvol_state_list, sizeof (zvol_state_t),

If I boot single-user and do

    # echo 7 > /proc/sys/kernel/printk
    # zpool import

I get this output (manually typed):

    ZFS: zvol_create_minors: pool0
    ZFS: zvol_find_minor: 0
    ZFS: zvol_alloc: pool0/ibdata
    ZFS: zvol_create_minors: done
    SPL: using hostid 0x00000000

After that the machine hangs. No zvol_init, apparently? I don't know if this is useful.
Good news: I have reproduced the hang on my other machine. I compared my kernel settings and played with them until I got the hang on the other server as well. I found that with CONFIG_PREEMPT_NONE=y I get the hang, but with CONFIG_PREEMPT=y I don't.

I have not tested the reverse on my first machine; I can't boot it right now, but I will as soon as possible.
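To compare the preemption model between two kernels, one can grep their configs. A small runnable sketch against a mock config file (hypothetical path; on a live system the real sources are /proc/config.gz or /boot/config-$(uname -r)):

```shell
# Mock kernel config standing in for /boot/config-$(uname -r).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
EOF

# Exactly one of the three preemption models is =y in any given config.
model=$(grep -E '^CONFIG_PREEMPT(_NONE|_VOLUNTARY)?=y$' "$cfg")
echo "$model"
```

Running the same grep against the configs of both machines makes the CONFIG_PREEMPT_NONE vs. CONFIG_PREEMPT difference easy to confirm.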
I have now tested with CONFIG_PREEMPT=y on the first machine also. And yes, that helped.
This bug should have received much more attention, but I have been busy with things offline. With that said, would you try regenerating your initramfs with the following:

    genkernel all --no-clean --zfs --callback="env ACCEPT_KEYWORDS=** EGIT_BRANCH=gentoo-next spl_LIVE_REPO='https://github.com/ryao/spl.git' zfs_kmod_LIVE_REPO='https://github.com/ryao/zfs.git' zfs_LIVE_REPO='https://github.com/ryao/zfs.git' emerge --oneshot --nodeps sys-kernel/spl sys-fs/zfs-kmod sys-fs/zfs"

You might need to make some minor adjustments for your setup. That will pull my latest development code from git, and knowing whether or not your issue is resolved by it will help me figure out what is wrong here.
I don't use an initramfs at all; is that a problem?

I have tested approximately what you suggested by adding to /etc/portage/package.keywords:

    # ZFS testing
    sys-kernel/spl **
    sys-fs/zfs-kmod **
    sys-fs/zfs **

and to /etc/portage/make.conf:

    # ZFS testing
    EGIT_BRANCH=gentoo-next
    spl_LIVE_REPO='https://github.com/ryao/spl.git'
    zfs_kmod_LIVE_REPO='https://github.com/ryao/zfs.git'
    zfs_LIVE_REPO='https://github.com/ryao/zfs.git'

and then rebuilding zfs, zfs-kmod and spl. The commits used, according to the build logs, are:

    GIT update -->
       repository:          https://github.com/ryao/zfs.git
       at the commit:       cdc7fc1523ee428fab03b3285c94d135f39e4c61
       branch:              gentoo-next
       storage directory:   "/usr/portage/distfiles/egit-src/zfs.git"
       checkout type:       bare repository

    GIT update -->
       repository:          https://github.com/ryao/spl.git
       at the commit:       198b2763b3aa7d802d101a3acfa4958c075fe85b
       branch:              gentoo-next
       storage directory:   "/usr/portage/distfiles/egit-src/spl.git"
       checkout type:       bare repository

The server still hangs with CONFIG_PREEMPT_NONE=y and not with CONFIG_PREEMPT=y.

The kernel version is by now 3.9.7.
(In reply to Christer Ekholm from comment #12)
> I don't use a initramfs at all, is that a problem?

It is only a problem when using ZFS as your rootfs.

> I have tested aproximately what you suggested by:
>
> Adding to /etc/portage/package.keywords
>
> # ZFS testing
> sys-kernel/spl **
> sys-fs/zfs-kmod **
> sys-fs/zfs **
>
> Adding to /etc/portage/make.conf
>
> #ZFS-testing
> EGIT_BRANCH=gentoo-next
> spl_LIVE_REPO='https://github.com/ryao/spl.git'
> zfs_kmod_LIVE_REPO='https://github.com/ryao/zfs.git'
> zfs_LIVE_REPO='https://github.com/ryao/zfs.git'
>
> And rebuilt zfs zfs-kmod and spl
>
> The commit-point used according to the build-logs are:
>
> GIT update -->
>    repository:          https://github.com/ryao/zfs.git
>    at the commit:       cdc7fc1523ee428fab03b3285c94d135f39e4c61
>    branch:              gentoo-next
>    storage directory:   "/usr/portage/distfiles/egit-src/zfs.git"
>    checkout type:       bare repository
>
> GIT update -->
>    repository:          https://github.com/ryao/spl.git
>    at the commit:       198b2763b3aa7d802d101a3acfa4958c075fe85b
>    branch:              gentoo-next
>    storage directory:   "/usr/portage/distfiles/egit-src/spl.git"
>    checkout type:       bare repository
>
> The server still hangs when CONFIG_PREEMPT_NONE=y and not when
> CONFIG_PREEMPT=y
>
> The kernel-version is by now 3.9.7

At this point, I am going to suggest filing an upstream bug. Keeping the bug here basically limits its attention to me. My time is extremely limited, and we would make more progress on this issue if upstream and I collaborated on it, as we do on other issues.

https://github.com/zfsonlinux/zfs/issues/new
Ok. I have reported this at: https://github.com/zfsonlinux/zfs/issues/1574
This is now fixed upstream in zfs-0.6.2-129-gba6a240.
This has been fixed in Gentoo since 0.6.2-r5. Closing as resolved upstream.