Bug 548452 - Will not boot past ">> Using mount -t auto -o ro" with genkernel image(and grub2) after lost power once
Status: RESOLVED OBSOLETE
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Hardened
Hardware: AMD64 Linux
Importance: Normal critical
Assignee: The Gentoo Linux Hardened Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-05-03 07:23 UTC by abandoned account
Modified: 2017-09-07 20:18 UTC (History)
4 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
screenshot (stuck at mount ro.png,13.34 KB, image/png)
2015-05-03 07:24 UTC, abandoned account
mount command is the culprit(?) (mount gets stuck.png,11.31 KB, image/png)
2015-05-03 07:54 UTC, abandoned account
the actual error after a # dmesg -n 8 (after_dmesg_-n_8.png,13.82 KB, image/png)
2015-05-03 08:12 UTC, abandoned account
tried mount -o rw too (tried rw too.png,12.77 KB, image/png)
2015-05-03 08:21 UTC, abandoned account

Description abandoned account 2015-05-03 07:23:22 UTC
After power is lost (e.g. a blackout), the next time gentoo boots it doesn't get past this:

>> Using mount -t auto -o ro

(screenshot of this inside virtualbox included in next comment, and is reproducible with the steps below)



Reproducible: Always

Steps to Reproduce:
1. boot system
2. startx
3. run a terminal
4. hard reset or unplug power cord (assuming non-battery system)

Actual Results:  
stuck at:
...
>> Using mount -t auto -o ro

the system just hangs there, apparently using almost no CPU (6% of 4 cores, and that's inside virtualbox)
It has been stuck at this point for at least 20 minutes. Screen blanking occurs after a while, and pressing a key snaps out of it.

Expected Results:  
system boots until login prompt

The steps that I used to install gentoo on a desktop are these: https://github.com/emanueLczirai/coostomhuston/blob/b0849f32580589f543cc6d7ae684fad131127a42/texts/gentoo_desktop.wofl

The steps that I used to install gentoo inside virtualbox are these: https://github.com/emanueLczirai/coostomhuston/blob/a36822b234c2c82cbaf3d38f3208792b521d3364/texts/gentoo_vm.wofl

Initially the issue was seen on a desktop. I've then reproduced it inside virtualbox.

I have used genkernel to generate initramfs and kernel images and then grub2 ...

I have a saved snapshot in virtualbox from when the system was working fine, so I can go back to it and reproduce this as many times as needed. If you have any ideas on what to try, or need other information, please let me know; I am here and ready to hack at this, just say the word.

Thanks in advance!
Cheers,
EmanueL.
Comment 1 abandoned account 2015-05-03 07:24:12 UTC
Created attachment 402474 [details]
screenshot
Comment 2 abandoned account 2015-05-03 07:54:37 UTC
Created attachment 402480 [details]
mount command is the culprit(?)

By typing my luks password wrong 3 times, I managed to invoke the "shell" and luksOpen the luks, then the lvm volumes and I can see that the mount command gets stuck while attempting to mount root partition...
Comment 3 abandoned account 2015-05-03 08:12:16 UTC
Created attachment 402486 [details]
the actual error after a # dmesg -n 8


I had to pretend I didn't know the luks password 3 times, then enter "shell" and run:
# dmesg -n 8
then exit, press q a few times to skip, enter the luks password, and now I can see the actual dmesg output when the error happens (see screenshot).


That is mount of BusyBox:
# mount --help
BusyBox v1.20.2 (2015-04-15 00:51:05 CEST) multi-call binary.
Comment 4 abandoned account 2015-05-03 08:21:29 UTC
Created attachment 402496 [details]
tried mount -o rw too

basically, there's no error (it just tries different filesystems until it gets the right one: btrfs)

but the issue is that, for some reason, the 'mount' command (busybox) gets stuck and doesn't exit after mounting. (I'm unsure whether it completes the mount and then gets stuck, or never completes it and that's why it is stuck.)
Comment 5 abandoned account 2015-05-03 08:45:58 UTC
kernel 3.18.9-hardened
(stable at the time of compilation: 15th April)

I also want to mention that appending an & to the mount command and then trying to kill it with kill -9 or kill -11 has no effect.
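
A process that ignores SIGKILL is usually blocked inside the kernel in uninterruptible sleep (state D). A minimal sketch for checking that from the rescue shell, assuming /proc is mounted there; the helper name proc_state is made up for illustration and is not part of genkernel or BusyBox:

```shell
# proc_state FILE: print the one-letter state from a /proc/<pid>/status file.
# Hypothetical helper; pure shell so it does not depend on awk or sed applets.
proc_state() {
    while read -r key value _rest; do
        if [ "$key" = "State:" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done < "$1"
    return 1
}

# Example (in the rescue shell): start the mount in the background, then run
#   proc_state "/proc/$!/status"
# A result of "D" would mean uninterruptible sleep, which signals cannot interrupt.
```

If the state is D, the hang is on the kernel side rather than in the mount binary itself, which would fit with kill having no effect.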

Looks like reproduction steps can be reduced to:
1. boot until login prompt
2. hard reset (or unplug power cord)  [or maybe just don't cleanly unmount the btrfs file system?]
3. next boot, stuck at: Using mount -t auto -o ro

this might only affect btrfs on lvm on luks systems, or maybe any btrfs root filesystem using kernel 3.18.9-hardened and genkernel generated image with busybox v1.20.2
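
The reduced steps above could be scripted against VirtualBox roughly as follows. This is only a sketch: the VM name "gentoo-vm", the snapshot name "working" and the 120 second wait are placeholders, and it assumes the VBoxManage command-line tool is used instead of the GUI. By default it only prints the commands.

```shell
# DRY_RUN=1 (the default here) prints each command instead of running it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# reproduce_once VM SNAPSHOT BOOT_WAIT_SECONDS
reproduce_once() {
    vm=$1 snapshot=$2 boot_wait=$3
    run VBoxManage snapshot "$vm" restore "$snapshot"   # VM must be powered off
    run VBoxManage startvm "$vm" --type headless        # step 1: boot
    run sleep "$boot_wait"                              # crude wait for the login prompt
    run VBoxManage controlvm "$vm" reset                # step 2: hard reset
}

# Placeholder names; adjust to the actual VM and snapshot.
reproduce_once gentoo-vm working 120
```

After the reset the VM boots again on its own; whether it hangs at "Using mount -t auto -o ro" (step 3) still has to be checked on the console.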
Comment 6 abandoned account 2015-05-03 10:49:58 UTC
tl;dr: upgrading to kernel 3.19.6-hardened-r1 (from kernel 3.18.9-hardened)  avoids this issue!
-------
I had genkernel 3.4.51.2 which was latest ~amd64 (aka unstable)
I am now emerging the current amd64 stable one which is 3.4.49.2  and regenerating the kernel and initramfs images, rerun grub2-mkconfig ....

(I also updated @world to latest but that only installed 5 irrelevant packages, like git and x11vnc etc.)

# cryptsetup --verbose luksOpen /dev/sda2 luks_on_sda2_boot
Enter passphrase for /dev/sda2: 
Key slot 0 unlocked.
Command successful.
# mount /but

# time FEATURES="-ccache" genkernel  all --bootdir="/but" --install --symlink --no-splash --no-mountboot --makeopts="-j4 V=0" --no-keymap --lvm  --no-mdadm --no-dmraid --no-zfs --no-multipath --no-iscsi --disklabel --luks --no-gpg --no-netboot --no-unionfs --kernname=genkernel --no-firmware --no-integrated-initramfs --compress-initramfs --compress-initrd --compress-initramfs-type=best --loglevel=5 --color --no-mrproper --no-clean --no-postclear --oldconfig --no-mountboot
...
* Make sure you have the latest ~arch genkernel before reporting bugs.

real	3m9.893s
user	2m13.180s
sys	0m44.380s

on the other hand, 3.18.9-hardened is still latest stable kernel at this time.

# grub2-mkconfig -o /but/grub/grub.cfg 2>&1
Generating grub configuration file ...
Found linux image: /but/kernel-genkernel-x86_64-3.18.9-hardened
Found initrd image: /but/initramfs-genkernel-x86_64-3.18.9-hardened
Found linux image: /but/kernel-genkernel-x86_64-3.18.9-hardened.old
Found initrd image: /but/initramfs-genkernel-x86_64-3.18.9-hardened.old
done

# umount /but
# cryptsetup --verbose luksClose /dev/mapper/luks_on_sda2_boot
Command successful.
# poweroff & exit

(saved a new virtualbox snapshot)

1. booting until login prompt
2. hard reset
3. booting until, confirmed stuck at "Using mount -t auto -o ro" again

restoring snapshot

booting until login prompt

attempting to emerge updated busybox
# emerge -av --update busybox
nope, it was already up to date. In fact, even though mount --help reports 1.20.2, the busybox package in gentoo is (per emerge --info busybox):
...
sys-apps/busybox-1.23.1-r1::gentoo was built with the following:
USE="pam -debug -ipv6 -livecd -make-symlinks -math -mdev -savedconfig (-selinux) -sep-usr -static -syslog -systemd"
CFLAGS="-O2 -pipe -march=native -Wstack-protector -fstack-protector-all -g3 -fno-strict-aliasing"
CXXFLAGS="-O2 -pipe -march=native -Wstack-protector -fstack-protector-all -g3 -fno-strict-aliasing"

there isn't a newer one except -9999

emerging -9999 then:
# time emerge -av --update =busybox-9999
required this line:
=sys-apps/busybox-9999 **
in file: /etc/portage/package.accept_keywords
ok done.
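
Since the same kind of line gets added for several packages in this comment, here is a small sketch for adding it without creating duplicates. It assumes /etc/portage/package.accept_keywords is a plain file, as it is here (on other systems it may be a directory); the function name accept_keyword is made up.

```shell
# accept_keyword FILE LINE: append LINE to FILE unless that exact line is already there.
# Hypothetical helper, not a portage tool.
accept_keyword() {
    file=$1
    line=$2
    [ -f "$file" ] || : > "$file"          # create the file if it is missing
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (as root):
#   accept_keyword /etc/portage/package.accept_keywords '=sys-apps/busybox-9999 **'
```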

doing all the genkernel and grub steps from above...

poweroff, saving snapshot...
1. booting until login prompt
2. hard reset
3. booting until, confirmed stuck at "Using mount -t auto -o ro" again

It is perhaps worth mentioning that on one run I waited 5-10 minutes at the login prompt (step 1) before issuing the hard reset (step 2), and step 3 actually worked: it didn't get stuck and continued to the login prompt. I then repeated from step 2 without the 5-10 minute wait, and just as expected it got stuck at step 3. So whatever that extra time did (perhaps it gave the system enough time to sync everything to disk), the hard reset had no bad effect afterwards.

mount --help reports the exact same version though, maybe it's using busybox 1.20.2 that comes with genkernel, that'd explain it.
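
To compare the applet version in the rescue shell with the installed package, the version can be pulled out of the banner line. A small sketch; the function name busybox_banner_version is made up, and it assumes the banner has the "BusyBox vX.Y.Z (...)" form shown in comment 3:

```shell
# busybox_banner_version: read applet help text on stdin, print the BusyBox version.
# Hypothetical helper; prints nothing if no banner line is found.
busybox_banner_version() {
    sed -n 's/^BusyBox v\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Example (2>&1 in case the banner is written to stderr):
#   mount --help 2>&1 | busybox_banner_version
```

If this prints 1.20.2 while emerge --info shows a newer sys-apps/busybox, the initramfs is carrying its own copy.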

restoring snapshot,
emerging genkernel -9999
# time emerge -av --update =genkernel-9999
before emerge, required line:
=sys-kernel/genkernel-9999 **
in file: /etc/portage/package.accept_keywords
done.
# genkernel --version
3.4.51

mounting /but, running genkernel, running grub2-mkconfig, unmount, rebooting, saving snapshot

1. booting until login prompt
2. hard reset
3. booting until, confirmed stuck at "Using mount -t auto -o ro" again

restoring snapshot, booting normally,
emerging a non-stable kernel:
# time emerge -nav =hardened-sources-3.19.6-r1
the following line:
=sys-kernel/hardened-sources-3.19.6-r1 ~amd64
was needed before emerge, in file: /etc/portage/package.accept_keywords
done.

# eselect kernel list
Available kernel symlink targets:
  [1]   linux-3.18.9-hardened *
  [2]   linux-3.19.6-hardened-r1

# cp /usr/src/linux/.config /usr/src/linux-3.19.6-hardened-r1/
# eselect kernel set 2
# cryptsetup --verbose luksOpen /dev/sda2 luks_on_sda2_boot
Enter passphrase for /dev/sda2: 
Key slot 0 unlocked.
Command successful.
# mount /but
# time FEATURES="-ccache" genkernel  all --bootdir="/but" --install --symlink --no-splash --no-mountboot --makeopts="-j4 V=0" --no-keymap --lvm  --no-mdadm --no-dmraid --no-zfs --no-multipath --no-iscsi --disklabel --luks --no-gpg --no-netboot --no-unionfs --kernname=genkernel --no-firmware --no-integrated-initramfs --compress-initramfs --compress-initrd --compress-initramfs-type=best --loglevel=5 --color --no-mrproper --no-clean --no-postclear --oldconfig --no-mountboot
...
real	28m10.452s
user	77m52.350s
sys	18m35.090s
# grub2-mkconfig -o /but/grub/grub.cfg 2>&1
Generating grub configuration file ...
Found linux image: /but/kernel-genkernel-x86_64-3.19.6-hardened-r1
Found initrd image: /but/initramfs-genkernel-x86_64-3.19.6-hardened-r1
Found linux image: /but/kernel-genkernel-x86_64-3.18.9-hardened
Found initrd image: /but/initramfs-genkernel-x86_64-3.18.9-hardened
Found linux image: /but/kernel-genkernel-x86_64-3.18.9-hardened.old
Found initrd image: /but/initramfs-genkernel-x86_64-3.18.9-hardened.old
done
# umount /but
# cryptsetup --verbose luksClose /dev/mapper/luks_on_sda2_boot
Command successful.
# poweroff & exit

save (virtualbox)snapshot
1. booting until login prompt
2. hard reset
3. booting until login prompt again, no hang at all
so it must be the kernel upgrade that fixed it (right?)
retrying these steps 3 more times to be sure

yup, still not getting stuck anymore...

So, upgrading to kernel 3.19.6-hardened-r1 prevents this issue.

But now, how do I fix an already stuck one, if I don't have a bootable device for it...and it's far away... grrr.
Comment 7 abandoned account 2015-05-03 11:11:01 UTC
apparently it's because of this:

Stable kernel version 3.19.1+ can cause a deadlock at mount time
Fixed in 3.19.5, 3.14.39
workaround: boot with older kernel, or run btrfs-zero-log to clear the log. This will lose up to the last 30 seconds of writes to the filesystem. You will have to reboot after running the btrfs-zero-log command, to clear the jammed locks.
fix: scheduled for 3.19.5, or apply 9c4f61f01d269815bb7c37.
also affected: 3.14.35+, 3.18.9+

src: https://btrfs.wiki.kernel.org/index.php/Gotchas
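
The ranges in that entry can be turned into a quick check. This is a sketch under assumptions: the function name is_affected_kernel is made up, and because the quoted text gives no fixed version for the 3.18 series, every 3.18.9 and later 3.18.x release is treated as affected here.

```shell
# is_affected_kernel RELEASE: return 0 if RELEASE (e.g. "3.18.9-hardened") is in a
# range the quoted Gotchas entry lists as affected, non-zero otherwise.
is_affected_kernel() {
    ver=${1%%-*}                          # drop suffixes such as "-hardened-r1"
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    patch=${rest#*.}
    [ "$patch" = "$rest" ] && patch=0     # "3.19" has no patch level
    patch=${patch%%.*}
    [ "$major" -eq 3 ] || return 1
    case "$minor" in
        14) [ "$patch" -ge 35 ] && [ "$patch" -lt 39 ] ;;   # fixed in 3.14.39
        18) [ "$patch" -ge 9 ] ;;                           # assumption: no upper bound known
        19) [ "$patch" -ge 1 ] && [ "$patch" -lt 5 ] ;;     # fixed in 3.19.5
        *)  return 1 ;;
    esac
}

# Example:
#   is_affected_kernel "$(uname -r)" && echo "kernel is in an affected range"
```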


I would try it, but there is no btrfs-zero-log command (or even a plain btrfs command) inside the tiny shell that I can invoke by typing my luks pwd wrong 3 times at boot.

In virtualbox I can try it by booting from admincd.iso then,
# cryptsetup --verbose --allow-discards luksOpen /dev/sda3 lvm_on_luks_on_sda3_root
# lvm lvchange --verbose -a y vgall
# btrfs-zero-log /dev/mapper/vgall-rootlvol 
(there's no output)
# sync
# lvm lvchange -v -a n vgall
# cryptsetup --verbose luksClose /dev/mapper/lvm_on_luks_on_sda3_root
Command successful.
# sync && reboot & exit

boot from normal (virtualbox's) HDD (not from the .iso)
and it works! It no longer gets stuck! great!
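
For reference, the recovery sequence above as a small wrapper. It is a sketch, not a tested tool: the function names are made up, the device and volume names are the ones from this VM, and by default it only prints the commands (set DRY_RUN=0 to actually run them from a rescue environment). Newer btrfs-progs releases provide the same operation as "btrfs rescue zero-log"; the sketch keeps the command used in this report.

```shell
# DRY_RUN=1 (the default here) prints each command instead of running it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# zero_log_recovery LUKS_DEVICE MAPPER_NAME VOLUME_GROUP ROOT_DEVICE
zero_log_recovery() {
    luks_dev=$1 mapper=$2 vg=$3 root_dev=$4
    run cryptsetup --verbose --allow-discards luksOpen "$luks_dev" "$mapper"
    run lvm lvchange --verbose -a y "$vg"
    run btrfs-zero-log "$root_dev"        # clears the btrfs log; recent writes may be lost
    run sync
    run lvm lvchange --verbose -a n "$vg"
    run cryptsetup --verbose luksClose "/dev/mapper/$mapper"
}

# Names as used in this report; adjust for other systems.
zero_log_recovery /dev/sda3 lvm_on_luks_on_sda3_root vgall /dev/mapper/vgall-rootlvol
```

As the wiki entry says, a reboot is still needed afterwards to clear the jammed locks.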
Comment 8 Ben Kohler gentoo-dev 2017-09-07 20:18:49 UTC
Looks like this issue is resolved, please reopen if this is still an issue.

Thanks