Bug 323765

Summary: sys-fs/lvm2-2.02.67: breakage in /etc/init.d/lvm when using autodetected RAID (but there is more)
Product: Gentoo Linux
Reporter: Francis Galiegue <fgaliegue>
Component: [OLD] Core system
Assignee: Robin Johnson <robbat2>
Status: RESOLVED NEEDINFO
Severity: critical
CC: agk, cardoe, ralfglauberman
Priority: High
Version: 2008.0
Hardware: AMD64
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:
Bug Blocks: 324485

Description Francis Galiegue 2010-06-13 11:36:27 UTC
Two systems, both unstable AMD64, with the following characteristics:

* one is RAID1 over two disks;
* the other is RAID5 over three disks.

In BOTH cases:

* one disk is larger than the others;
* I autodetect RAID partitions at boot time;
* the root filesystems are not on LVM...
* but all other filesystems are.

In both cases, using a vanilla 2.6.34 kernel (even though I think it's not relevant in this case) and the latest LVM update, /etc/init.d/lvm fails to complete at boot, rendering the system... Well... Non booting.

I first saw this on the two-disk RAID1 setup. Now I see this on the three-disk RAID5 setup as well. In order to work around the problem, I rebooted with the same kernel and init=/bin/bb and had to rc-update del lvm boot. Then it booted fine but, of course, LV-based filesystems were not mounted. Not a big deal, at least it booted.

After the system boots, I run /etc/init.d/lvm start on both. Sometimes it completes on the first attempt and sometimes it does not; when it does not, running /etc/init.d/lvm start a second time completes.

[NOTE: in both cases, I have elected to update the init scripts - not doing so would have made things worse, I surmised]


Reproducible: Always
Comment 1 Michael Weber (RETIRED) gentoo-dev 2010-06-14 21:46:59 UTC
Can you please provide the exact version of lvm2 you are using?
Comment 2 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-06-14 21:52:16 UTC
Information required:
0. exact version of lvm2, udev, mdadm.
1. emerge --info
2. Are you using an initrd? If so, what source and version?
3. What type of RAID?
4. Please include your startup order, if using md-softraid, is it starting before the LVM service?
5. You say LVM is failing to start, please include complete error output.
Comment 3 Francis Galiegue 2010-06-14 22:24:19 UTC
(In reply to comment #2)

OK, I'll answer all questions, but some of the questions asked make me think that my original post wasn't read thoroughly: many questions were already answered.

As this bug can prevent setups such as mine from booting normally at all, I bump up the severity again from normal to critical.

> Information required:
> 0. exact version of lvm2, udev, mdadm.

As I said, I'm using ~amd64 on both setups with the latest versions (as of 20100615):

lvm2: 2.02.67-r1
mdadm: 3.1.2
udev: 154

As mdadm and udev haven't been updated since my last emerge --sync and then emerge -uvtDN system (I do this once every two weeks on average), this problem is more than probably related to the lvm2 package update.

> 1. emerge --info

The same on both machines:

Portage 2.1.8.3 (default/linux/amd64/10.0, gcc-4.4.4, glibc-2.11.2-r0, 2.6.34-radeon x86_64)
=================================================================
System uname: Linux-2.6.34-radeon-x86_64-Intel-R-_Core-TM-2_Quad_CPU_@_2.40GHz-with-gentoo-2.0.1
Timestamp of tree: Thu, 10 Jun 2010 17:15:02 +0000
ccache version 2.4 [disabled]
app-shells/bash:     4.1_p7
dev-java/java-config: 2.1.11
dev-lang/python:     2.6.5-r2, 3.1.2-r3
dev-util/ccache:     2.4-r8
dev-util/cmake:      2.8.1-r2
sys-apps/baselayout: 2.0.1
sys-apps/openrc:     0.6.1-r1
sys-apps/sandbox:    2.2
sys-devel/autoconf:  2.13, 2.65
sys-devel/automake:  1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils:  2.20.1-r1
sys-devel/gcc:       4.4.4
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.10
virtual/os-headers:  2.6.34
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/X11/xkb /usr/share/config"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="assume-digests distlocks fixpackages news parallel-fetch protect-owned sandbox sfperms strict unmerge-logs unmerge-orphans userfetch"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.UTF-8"
LC_ALL="en_US.UTF-8"
LDFLAGS="-Wl,-O1"
LINGUAS="fr"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/lib/layman/x11"
SYNC="rsync://89.206.169.171/gentoo-portage"
USE="X aac acl acpi aiglx alsa amd64 avahi bash-completion berkdb bzip2 cleartype cli consolekit cracklib crypt css cups cxx dbus dga dri dvd dvdr encode encore ffmpeg flac gdbm gpm gstreamer hal iconv ipv6 java6 jpeg kde lame mad mdnsrespondercompat mmx modules mp3 mudflap multilib ncurses nls nptl nptlonly nsplugin ogg opengl openmp pam pcre png policykit pppd qt3support qt4 readline reflection semantic-desktop sensors session source spl sse sse2 ssl sysfs syslog tcpd theora threads udev unicode v4l2 vim-syntax vorbis x264 xattr xcb xcomposite xinerama xorg xscreensaver zeroconf zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="fr" QEMU_SOFTMMU_TARGETS="i386 x86_64" QEMU_USER_TARGETS="i386 x86_64" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="radeon" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" 
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

> 2. Are you using an initrd? If so, what source and version?

I do NOT use an initrd. As I said in the initial post, I use Linux' automatic RAID array detection at boot time. I do NOT use /etc/init.d/mdadm, and this has never been a cause for trouble... Until the last lvm2 update!

> 3. What type of RAID?

_AGAIN_, the original post said it all: one machine is RAID 1 over 2 disks, the other is RAID 5 over 3 disks.

> 4. Please include your startup order, if using md-softraid, is it starting
> before the LVM service?

I do not use md-softraid. I do not use mdraid either. And, see below.

> 5. You say LVM is failing to start, please include complete error output.
> 

Aaargh... See original post again! THERE. IS. NO. ERROR. OUTPUT. It just hangs. On both setups.

On both setups, I had to boot with init=/bin/bb, remove the lvm service from the boot runlevel, and only then could I get a prompt, and then I logged in as root, and then did /etc/init.d/lvm start, and it would (or not) succeed at the first attempt... Again, see the original post.
Comment 4 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-06-14 22:42:45 UTC
> OK, I'll answer all questions, but some of the questions asked make me think
> that my original post wasn't read thoroughly: many questions were already
> answered.
No, your prior answers left open possibilities.

> I do NOT use an initrd. As I said in the initial post, I use Linux' automatic
> RAID array detection at boot time. I do NOT use /etc/init.d/mdadm, and this has
> never been a cause for trouble... Until the last lvm2 update!
If you downgrade back to the 2.02.67 does the problem vanish?

> 
> > 3. What type of RAID?
> _AGAIN_, the original post said it all: one machine is RAID 1 over 2 disks, the
> other is RAID 5 over 3 disks.
I personally use: 3ware, LSI, Adaptec, Compaq, MD raid.
I did not ask what RAID layout you were using, simply what type of RAID, similar to my usage above.

> > 4. Please include your startup order, if using md-softraid, is it starting
> > before the LVM service?
> I do not use md-softraid. I do not use mdraid either. And, see below.
Your startup order still wasn't included. State the exact contents of your sysinit and boot runlevels, and the order in which you see services start BEFORE the lvm init script hangs.


> > 5. You say LVM is failing to start, please include complete error output.
> > 
> 
> Aaargh... See original post again! THERE. IS. NO. ERROR. OUTPUT. It just hangs.
> On both setups.
You're going to have to go to interactive mode right when your system starts to boot, start all services prior to LVM, and when prompted to start LVM, drop to the shell.

Run the contents of /lib64/rcscripts/addons/lvm-start.sh manually, we need to know which command there is hanging. Alternatively, edit that script and replace the >/dev/null with --verbose and more debug output.
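
One way to find the hanging command, sketched here under assumptions: wrap each command in coreutils `timeout` so that a stuck step is reported instead of blocking the shell. The helper name `run_step` and the example LVM invocations in the comments are illustrative only; the commands actually present in lvm-start.sh may differ, and a process blocked in uninterruptible I/O may not react to the kill signal.

```shell
#!/bin/sh
# run_step SECONDS COMMAND [ARGS...]
# Runs one command under a time limit and reports whether it finished,
# failed, or had to be killed because it did not return in time.
run_step() {
    limit=$1
    shift
    echo ">>> $*"
    # Capture the exit status without aborting if the caller uses "set -e".
    timeout "$limit" "$@" && rc=0 || rc=$?
    if [ "$rc" -eq 124 ]; then
        # coreutils timeout exits with 124 when the time limit was hit.
        echo "HANG after ${limit}s: $*"
    elif [ "$rc" -ne 0 ]; then
        echo "FAIL (exit $rc): $*"
    else
        echo "OK: $*"
    fi
    return "$rc"
}

# Hypothetical usage; substitute the commands found in lvm-start.sh:
#   run_step 60 vgscan --verbose
#   run_step 60 vgchange --verbose -a y
```

Running the steps one at a time in this way also leaves the verbose output of the last command on screen, which is the information requested for this bug.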
Comment 5 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-06-14 22:45:16 UTC
Your answers above are also contradictory:
1.
> I do not use md-softraid. I do not use mdraid either. And, see below.
2.
> As I said in the initial post, I use Linux' automatic
> RAID array detection at boot time.

That automatic detection (CONFIG_MD_AUTODETECT) is MD raid, which you claim to not use.
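
Whether a given kernel was built with this option can be checked against its config file. The sketch below assumes the config lives at /usr/src/linux/.config (the usual Gentoo symlink, which may not match the running kernel); the helper name `kconfig_enabled` is made up for illustration.

```shell
#!/bin/sh
# kconfig_enabled OPTION [CONFIG_FILE]
# Succeeds if OPTION is built in (=y) in the given kernel config file.
# The default path is an assumption, not something stated in this bug.
kconfig_enabled() {
    opt=$1
    cfg=${2:-/usr/src/linux/.config}
    grep -q "^${opt}=y\$" "$cfg"
}

# Example call (not run here):
#   kconfig_enabled CONFIG_MD_AUTODETECT && echo "kernel autodetects MD arrays"
```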
Comment 6 Francis Galiegue 2010-06-14 22:52:22 UTC
> > As I said in the initial post, I use Linux' automatic
> > RAID array detection at boot time.
> 
> That automatic detection (CONFIG_MD_AUTODETECT) is MD raid, which you claim to
> not use.
> 

Well, I don't use /etc/init.d/mdraid, I rely on the kernel for RAID autodetection... Which should, to my eyes, amount to the same thing. Since all RAID arrays are detected before the init kicks in, my statement holds true.
Comment 7 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-06-14 22:56:48 UTC
Ok, when you drop into the shell to debug and run the LVM commands manually, I'd also like the contents of /proc/mdstat at that exact point.
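
The full `cat /proc/mdstat` output is what is being asked for here. As a quick supplementary check, a small helper (hypothetical name `md_states`) can list each array with its state, which shows at a glance whether the arrays were assembled before LVM runs:

```shell
#!/bin/sh
# md_states [FILE]
# Prints "<array> <state>" for every md array in a /proc/mdstat-style file.
# Array lines have the form "mdN : active|inactive ...".
md_states() {
    awk '$1 ~ /^md/ && $2 == ":" { print $1, $3 }' "${1:-/proc/mdstat}"
}

# Example call at the debug shell (not run here):
#   md_states
```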
Comment 8 Francis Galiegue 2010-06-14 23:06:13 UTC
(In reply to comment #4)

> > I do NOT use an initrd. As I said in the initial post, I use Linux' automatic
> > RAID array detection at boot time. I do NOT use /etc/init.d/mdadm, and this has
> > never been a cause for trouble... Until the last lvm2 update!
> If you downgrade back to the 2.02.67 does the problem vanish?
> 

I haven't tried... I'll try that next.

> > 
> > > 3. What type of RAID?
> > _AGAIN_, the original post said it all: one machine is RAID 1 over 2 disks, the
> > other is RAID 5 over 3 disks.
> I personally use: 3ware, LSI, Adaptec, Compaq, MD raid.
> I did not ask what RAID layout you were using, simply what type of RAID,
> similar to my usage above.
> 

OK, my bad. This is Linux software RAID every time.

> > > 4. Please include your startup order, if using md-softraid, is it starting
> > > before the LVM service?
> > I do not use md-softraid. I do not use mdraid either. And, see below.
> Your startup order still wasn't included. State the exact contents of your
> sysinit and boot runlevels, and what order you see stuff running BEFORE the lvm
> init script hangs.
> 

The lvm service is added at the boot runlevel. How can I display the order in which services at the boot runlevel are launched? "rc-update show boot" displays a list of services, not their order...

[...]
> You're going to have to go to interactive mode right when your system starts to
> boot, start all services prior to LVM, and when prompted to start LVM, drop to
> the shell.
> 

Can I do that with a boot runlevel service? I have tried pressing "i" like a monkey on steroids but could not achieve interactive mode. Missing configuration option?

> Run the contents of /lib64/rcscripts/addons/lvm-start.sh manually, we need to
> know which command there is hanging. Alternatively, edit that script and
> replace the >/dev/null with --verbose and more debug output.
> 

Will do, if downgrading to plain 2.02.67 doesn't fix it.
Comment 9 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-06-14 23:17:15 UTC
(In reply to comment #8)
> OK, my bad. This is Linux software RAID every time.
Ok, trying to narrow the problem scope down here.

> The lvm service is added at the boot runlevel. How can I display the order in
> which services at the boot runlevel are launched? "rc-update show boot"
> displays a list of services, not their order...
Just write them down when you see them booting, or capture the output another way.

> Can I do that with a boot runlevel service? I have tried pressing "i" like a
> monkey on steroids but could not achieve interactive mode. Missing
> configuration option?
1. rc_interactive="YES" in /etc/rc.conf
2. Capital "I", not lowercase (and beware Turkish keyboards with 4 I's).
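
Put together, the relevant /etc/rc.conf lines might look like the fragment below. The `rc_interactive` setting comes from the comment above; `rc_logger` is an assumption about the installed OpenRC version and is offered only as one possible way to capture the boot output for the startup-order question.

```shell
# /etc/rc.conf (fragment)
# Allow pressing capital "I" during boot to step through services one by one.
rc_interactive="YES"
# Assumption: this OpenRC version supports rc_logger, which records the boot
# output to /var/log/rc.log so the service order can be read afterwards.
rc_logger="YES"
```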
Comment 10 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-06-17 19:01:41 UTC
Have you run the tests I asked for?
Comment 11 N. Andrew Walsh 2010-06-23 21:14:45 UTC
Can you elaborate a bit on what you mean by "doesn't start"? I recently rebooted for the first time in 15 days (during which time I also updated lvm), and I see the following:

/sys/devices/virtual/block/md{n} (m)

many times (I assume that it is the output of some message that scrolled off the screen too fast to read). 

Then the boot process reaches the line
"Setting up the Logical Volume Manager"

and reads the disks forever. It's been running for 40 minutes now and hasn't stopped. Since I can't get to a prompt or ssh in, I can't really debug it, so I'm just waiting.

My setup is the latest ~amd64 current to about five days ago.

Thanks for the help.

A
Comment 12 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-06-24 03:37:35 UTC
fgaliegue:
If there is no output added to this bug, I'm going to close RESO NEEDINFO on June 25th.

n.andrew.walsh:
please follow the instructions for interactive boot I provided on this bug, and figure out which of the commands is causing that output.
Comment 13 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-07-07 17:39:03 UTC
Can't reproduce, no response from users.
Comment 14 Francis Galiegue 2010-09-10 08:53:04 UTC
Sorry, I was away for a looong time

And since then, I've solved the problem!

The problem is that as I had auto RAID detection at boot, I figured that I didn't need the mdadm service. Well, I was wrong.

I ran:

rc-update add mdadm boot

and it now works.

Sorry for the noise.
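
To confirm that the service really is part of the boot runlevel after running the command above, one can look at the runlevel directory. This is a minimal sketch, assuming the standard OpenRC layout where runlevel membership is a symlink under /etc/runlevels/<runlevel>/; the helper name `service_in_runlevel` is hypothetical.

```shell
#!/bin/sh
# service_in_runlevel SERVICE RUNLEVEL [RUNLEVEL_DIR]
# Succeeds if SERVICE has an entry in RUNLEVEL.
# RUNLEVEL_DIR defaults to /etc/runlevels (assumed OpenRC layout).
service_in_runlevel() {
    dir=${3:-/etc/runlevels}
    [ -L "$dir/$2/$1" ] || [ -e "$dir/$2/$1" ]
}

# Example check after the fix (not run here):
#   service_in_runlevel mdadm boot && echo "mdadm is in the boot runlevel"
```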