The simplified lvm2-stop.sh introduced in 2.02.67-r1 is unable to cleanly release volume groups on shutdown. Apparently some volume (likely the one containing / or /var) is still activated and/or in use. Using lvm2-stop.sh-2.02.49-r3 fixes the problem.

My setup:

md1 (RAID1 sda3 + sdb3)
+- local (VG)
   +- distfiles (LV, /usr/portage/distfiles)
   +- home (LV, /home)
   +- portage (LV, /usr/portage)
   +- root (LV, /)
   +- usrlocal (LV, /usr/local)
   +- var (LV, /var)

It also happens on my second install, where /var is on the same partition as /. What should I add to lvm2-stop.sh to debug the problem so that you can deal with it properly?
Are you really sure that the older one actually shuts them down? The change was made to speed up the shutdown process per bug 319017. Can you please provide full shutdown output from your system for both the new and old scripts? With both the old and new scripts, a setup such as yours should give an error when shutting down the root LV (since it's in use), but then continue.
It's expected for the root volume and volume group not to shut down if root itself is located on LVM. It does not hurt LVM or the filesystems, as they get properly unmounted (or remounted read-only). It may very well be that the new lvm shutdown script simply gives a different error message than before, since the message now comes directly from the LVM tools themselves, whereas before it was a message from the shutdown script. Is there an actual issue with the current behaviour that needs fixing?
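One quick way to confirm that only the root LV is holding the volume group open is to look at the device-mapper open counts. A minimal sketch; dmsetup may be missing or require root on a given box, and the fallback message is mine, not anything the scripts print:

```shell
# Show each dm device with its open count; any device with open > 0
# (typically the root LV) is what blocks "vgchange -a n" at shutdown.
out=$(dmsetup info -c --noheadings -o name,open 2>/dev/null || true)
[ -n "$out" ] || out="no dm devices visible (dmsetup missing or not root)"
echo "$out"
```

With root on LVM you would expect the root LV to keep an open count of 1 right up to the point where the init system remounts / read-only, so the "can't deactivate" message is unavoidable.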
> Can you please provide full shutdown output from your system for both
> the new and old scripts?

I would like to know how to obtain those, since no fs is supposed to be available at this point, imho. I suppose I need to make the lvm stop script more verbose; hints welcome. Anyway, releasing does give an error in the new script (no error indication in the old one) and it does continue in both cases, so it's not a fatal issue.
(In reply to comment #3)
> I would like to know how to obtain those since no fs is supposed to be
> available at this point imho. I suppose I need to make lvm stop script more
> verbose, hints welcome.

As for capturing the output, possible methods:
- take a photo of the screen and run it through OCR (I use this for kernel panics often)
- serial console
- testing in a VM

Add some einfo output for your debugging.

> Anyway releasing does give and error in new script (no error indication in old
> one) and at does continue in both cases so it's not a fatal issue.

In the old script, you should have gotten a number of messages: "Unable to shutdown: LV_NAME". The old version did dump stdout to /dev/null, which we do not do in the new version. The eend should have given a fail output in both, however.
Created attachment 246186 [details] lvm_start output
Created attachment 246188 [details] lvm-stop output

The one actually showing error messages. The screens above show what it looks like with the lvm scripts unmodified on my setup (no additional debugging etc):
- baselayout2, rc.conf: rc_parallel="NO", rc_interactive="YES"
- conf.d/device-mapper: RC_AFTER="lvm" (could it be a problem?)
- conf.d/lvm: RC_AFTER="mdraid" (this one is fine I suppose, given that the lvm volume groups sit on top of raid1)
The "maps lock" error is supposedly a harmless kernel bug fixed in 2.6.35.3 (http://www.spinics.net/lists/lvm/msg19932.html).
And the second set of error messages, "... was not removed by udev" and "... should have been removed by udev", appears because udev doesn't know about the device nodes/symlinks involved. If you ask udev what it knows about lvm devices immediately after boot, it'll tell you that the only DEVLINK it knows about is /dev/block/253:x

I think this will only happen when booting from an initramfs with 'dolvm' enabled. Why? Because inside the initramfs, mdev is used to create the device nodes. When the system pivots over to real_root and starts udev, the nodes/symlinks (/dev/vg/foo and /dev/mapper/vg-foo, for example) already exist, and udev doesn't pick up that they're associated with the real device-mapper/lvm device.

So, as an interesting test (on an initramfs & dolvm system):

1. Create a new lvm device and then reboot.
2. "udevadm info --query=all --path=/devices/virtual/block/dm-?" (where ? is the minor number of your new device). You'll notice that DEVLINKS only contains /dev/block/253:?.
3. "lvchange -a n /dev/vg/name_of_new_device". Notice that you'll get complaints about removing the device nodes.
4. "lvchange -a y /dev/vg/name_of_new_device"
5. "udevadm info --query=all --path=/devices/virtual/block/dm-?". Now DEVLINKS should be full of interesting nodes/symlinks (including /dev/vg/name_of_new_device).
6. "lvchange -a n /dev/vg/name_of_new_device". No errors this time, yes? That's because udev was able to properly clean up after the device.

So the solution is to either start using real udev in the initramfs or teach udev about already-created block nodes/symlinks when it comes online.
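The udev query from steps 2 and 5 can be wrapped in a tiny helper for repeated checks. This is my own convenience sketch, not part of any Gentoo script; dm_minor=0 is just a placeholder, and the fallback line covers machines where that device (or udevadm) is absent:

```shell
# Print the DEVLINKS udev has recorded for a given dm minor number.
dm_minor=0   # placeholder: substitute the minor of the LV under test
links=$(udevadm info --query=property \
        --path="/devices/virtual/block/dm-${dm_minor}" 2>/dev/null \
        | grep '^DEVLINKS=' || true)
result="${links:-DEVLINKS=(none recorded for dm-${dm_minor})}"
echo "$result"
```

On an affected initramfs+dolvm system, the broken state shows only the /dev/block/253:? link here; after the lvchange -a n / -a y cycle you should see the /dev/vg/* and /dev/mapper/* links as well.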
(In reply to comment #8)
> So the solution is to either start using real udev in the initramfs or teach
> udev about already created block nodes/symlinks when it comes online.

Poking around a bit I found bug 330651, comment 36, which suggests that running "udevadm trigger --action=change --attr-match=dm/name" can solve these types of problems, and my test system agrees. After running the above, the udev database now lists all kinds of nice things in DEVLINKS, and rebooting ends up looking much cleaner.
My home system agrees as well. DEVLINKS are created, and on shutdown there are no complaints wrt devlink removal. I suppose running this trigger early enough (init-wise) should make the warnings wrt devlink creation disappear as well?

The last problem, "Can't deactivate volume group "local" with 1 open logical volume(s)", remains however. I'll dig into this when I collect more free time. (The "maps lock" messages are of course gone with recent gentoo-sources.)
(In reply to comment #9)
> Poking around a bit I found bug 330651, comment 36 which suggests that adding
> "udevadm trigger --action=change --attr-match=dm/name" can solve these types of
> problems and my test system agrees.

Hi, which script did you change to add this command?

Anyway, is there a solution for the "Can't deactivate volume group "local" with 1 open logical volume(s)" problem? I have the same issue with my root partition on LVM.
(In reply to comment #11)
> Hi, which script do you have changed to add this command?

If you want to run this command on startup, you can put it in your local startup script.

> Anyway, is there a solution for the "Can't deactivate volume group "local" with
> 1 open logical volume(s)" problem?

That error message is entirely harmless and expected if you have root on LVM.
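For completeness, here is roughly what such a local start hook could look like on OpenRC. The filename and the settle timeout are my assumptions; only the udevadm trigger line comes from comment #9:

```shell
# /etc/local.d/dm-udev-resync.start (hypothetical filename)
# Ask udev to re-process device-mapper nodes created earlier by mdev in
# the initramfs, so their DEVLINKS get recorded in the udev database.
if command -v udevadm >/dev/null 2>&1; then
    udevadm trigger --action=change --attr-match=dm/name 2>/dev/null || true
    udevadm settle --timeout=30 2>/dev/null || true
    status="resync requested"
else
    status="skipped (udevadm not available)"
fi
echo "dm udev ${status}"
```

Running it from local.d means it fires after udev is up but before most other services, which should be early enough to also silence the creation-side warnings mentioned in comment #10.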
I'm thinking of changing these to always eend 0 instead of listening to the exit code. Especially with the new /usr requirement, this is almost guaranteed to never shut down all LVs/VGs.
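A sketch of what that change could look like in the stop function. The ebegin/eend stubs exist only so the snippet runs outside OpenRC, and the vgchange invocation is my guess at a typical call, not the script's actual contents:

```shell
# Stand-ins for OpenRC's ebegin/eend when run outside an init script.
type ebegin >/dev/null 2>&1 || ebegin() { printf ' * %s ...\n' "$*"; }
type eend >/dev/null 2>&1 || eend() {
    [ "${1:-0}" -eq 0 ] && echo '   [ ok ]' || echo '   [ !! ]'
}

ebegin "Shutting down the Logical Volume Manager"
# Deactivate whatever can be deactivated; with root (and now /usr) on
# LVM this can never fully succeed, so the exit code is deliberately
# ignored and success is reported unconditionally.
vgchange --ignorelockingfailure -a n >/dev/null 2>&1 || true
eend 0
```

The trade-off is that a genuine deactivation failure on a non-root VG would also be reported as ok; since the messages are harmless noise in the common root-on-LVM case, that seems an acceptable cost.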
In 2.02.105, please test.
Test what, specifically? You mentioned eend 0, but I don't see it in 2.02.105-r1.

Shutdown seems to work fine (root on lvm), but that's no change from earlier versions.

lvmetad fails to start: invalid option --pidfile; should that be -p? I'm unfamiliar with lvmetad, sorry.