There is a Gentoo-local patch to lvm.conf which sets pvmetadatacopies to 2, where the upstream default is 1. This prevents the sensible use of pvresize to resize partitions smaller. This is because when 2 copies of the metadata are created, one copy is placed at the beginning of the PV and the other at the end. pvresize does not move the end-copy of the metadata, which means a subsequent resize of the underlying partition cuts off the second copy. This results in complaints of invalid arguments to lseek on all LVM operations, as they try to seek to examine the second copy of the metadata.
Setting pvmetadatacopies to 1 leaves the only copy of the metadata at the beginning of the partition, which makes shrinking partitions work perfectly.
The ideal fix is to have pvresize move the backup metadata copy to an appropriate new location. The more-easily-applied immediate fix is to NOT patch lvm.conf, such that pvmetadatacopies remains equal to 1 and the problem does not appear. This will not, unfortunately, fix existing installations with 2 metadata copies.
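For anyone who wants to find out which existing PVs are affected, here is a minimal sketch. It assumes an LVM2 version whose pvs reports the pv_mda_count field; the helper name two_mda_pvs is made up for illustration and only filters pvs output, it does not touch any device.

```shell
#!/bin/sh
# Hypothetical helper: read "pv_name pv_mda_count" pairs on stdin and
# print the names of PVs that carry two (or more) metadata areas.
two_mda_pvs() {
    awk 'NF >= 2 && $2 >= 2 { print $1 }'
}
```

Typical use, as root, would be something like `pvs --noheadings -o pv_name,pv_mda_count | two_mda_pvs`. Any PV it prints has a metadata copy at the end and should not have its partition shrunk on affected versions.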
Steps to Reproduce:
1. Create a partition
2. Create a PV on it
3. pvresize the PV down
4. fdisk the partition down
Actual Results:
Lots of complaints from vgscan about lseek: invalid argument.
Expected Results:
Should work fine, assuming the PV is resized to less than the new size of the partition.
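The steps above can be sketched without risking a real disk. This is only an approximation under stated assumptions: it uses a scratch image file on a loop device instead of a partition, and it shrinks the backing file with truncate and losetup -c instead of fdisk. The sizes, the reproduce function, and the run wrapper are all illustrative. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root, on a scratch machine) to actually run them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = scratch image file (hypothetical path), $2 = a free loop device
reproduce() {
    img=$1
    loop=$2
    run truncate -s 1G "$img"                          # stand-in for "create a partition"
    run losetup "$loop" "$img"
    run pvcreate --metadatacopies 2 "$loop"            # second copy lands at the end
    run pvresize --setphysicalvolumesize 400M "$loop"  # shrink the PV first
    run truncate -s 512M "$img"                        # stand-in for "fdisk the partition down"
    run losetup -c "$loop"                             # make the loop device notice the new size
    run vgscan                                         # affected versions complain about lseek here
}
```

Calling `reproduce /tmp/pv.img /dev/loop7` prints the seven commands in order. On versions where pvresize refuses PVs with two metadata areas, or handles them, the fourth step will behave differently and the lseek errors should not appear.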
Doesn't the gentoo patch just change the setting in an example config file?
Anyways, assigning to ebuild maintainers, who at least might want to add a comment pointing out the trade-off between ability to shrink partitions and the extra safety.
Yes, you are correct that the Gentoo-specific patch only changes the value in the configuration file. However, Gentoo bugzilla is the regular first point of contact for bugs, whether upstream's fault or Gentoo's, and the real bug IMO is that pvresize doesn't move the second metadata copy.
We use the 'Fedora rawhide' bugzilla as upstream so this can go there, with an 'external bug location' back to this.
I doubt it'll be a quick one to fix, but the code needs to detect the condition and give appropriate messages for now.
Reported at <https://bugzilla.redhat.com/show_bug.cgi?id=477891>.
First of all - Grrr! EVIL PATCH!! :[
Now, here's the section from "man pvcreate" stating *the defaults* (note the inconsistency):
The number of metadata areas to set aside on each PV. Currently this can be 0, 1 or 2. If set to 2, two copies of the volume group metadata are held on the PV, one at the front of the PV and one at the end. If set to 1 (the default), one copy is kept at the front of the PV (starting in the 5th sector). If set to 0, no copies are kept on this PV - you might wish to use this with VGs containing large numbers of PVs. But if you do this and then later use vgsplit you must ensure that each VG is still going to have a suitable number of copies of the metadata after the split!
PS: Is there at least some simple and safe workaround, like dropping the second metadata block or something, so that I can fix my disk partitioning in a space-constrained environment?
There are indeed workarounds. If you have sufficient free space, just ferry stuff through another PV (pvcreate the temp PV, vgextend the VG over the temp PV, pvmove all the LVs onto the temp PV, vgreduce the main PV out of the VG, re-pvcreate the main PV with one metadata copy, vgextend the VG over the main PV, pvmove all the LVs back onto the main PV, vgreduce the temp PV out of the VG, and you're done).
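Spelled out as commands, the ferrying sequence above looks roughly like the sketch below. The ferry function and the run wrapper are illustrative names, not part of LVM, and the script only prints the commands unless DRY_RUN=0 is set. It assumes the temp PV is at least as large as the allocated space on the main PV.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = volume group, $2 = main PV (has 2 metadata copies), $3 = temporary PV
ferry() {
    vg=$1
    main=$2
    temp=$3
    run pvcreate --metadatacopies 1 "$temp"
    run vgextend "$vg" "$temp"
    run pvmove "$main" "$temp"               # move all extents off the main PV
    run vgreduce "$vg" "$main"
    run pvcreate --metadatacopies 1 "$main"  # re-create with a single copy at the front
    run vgextend "$vg" "$main"
    run pvmove "$temp" "$main"               # move everything back
    run vgreduce "$vg" "$temp"
}
```

For example, `ferry vg0 /dev/sda2 /dev/sdb1` prints the eight steps in order. If any step fails when run for real, stop there rather than continuing, since the later steps assume the earlier ones succeeded.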
If you don't have enough space to store a second copy of the data, then you can take the extremely dangerous and terrifying route around this bug: find your LVM backup file in /etc/lvm/backup, then run pvcreate *against the existing PV* with --metadatacopies 1, --restorefile set to the backup file, and --uuid set to the UUID of the PV (you can find it in the backup file). Then run vgcfgrestore to reinstantiate your LVs. Backups are highly recommended before trying this... but then again, if you had space to store said backups, you'd just use that space as a temporary PV and go with the more pleasant ferrying option.
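A rough sketch of the dangerous route, for illustration only. It assumes the usual text layout of the files in /etc/lvm/backup (an `id = "..."` line followed by a `device = "..."` line inside each PV block). The function names and the run wrapper are made up. The -ff flag is my assumption, on the grounds that pvcreate normally refuses to re-initialise a PV that belongs to a VG, and the VG probably has to be deactivated first, which is not shown. It prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the UUID recorded for device $2 in backup file $1.
# Relies on "id" preceding "device" inside each PV block.
pv_uuid_from_backup() {
    awk -v dev="$2" '
        $1 == "id"     { id = $3; gsub(/"/, "", id) }
        $1 == "device" { d = $3; gsub(/"/, "", d); if (d == dev) print id }
    ' "$1"
}

# $1 = volume group, $2 = PV device, $3 = backup file (optional)
restore_single_mda() {
    vg=$1
    pv=$2
    backup=${3:-/etc/lvm/backup/$vg}
    uuid=$(pv_uuid_from_backup "$backup" "$pv")
    [ -n "$uuid" ] || { echo "no UUID for $pv in $backup" >&2; return 1; }
    # -ff is an assumption, see the note above the script.
    run pvcreate -ff --metadatacopies 1 --restorefile "$backup" --uuid "$uuid" "$pv"
    run vgcfgrestore -f "$backup" "$vg"
}
```

Running `restore_single_mda vg0 /dev/sda2` in dry-run mode is a cheap way to confirm that the UUID was picked up correctly before committing to anything.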
I also found the following patch, which *might* solve this:
Also, the LVM2 upstream changelog entry for version 2.02.40 states "Fix pvresize to not allow resize if PV has two metadata areas."
I want more changes to that patch to extend the internal interface: I want to have functions to disable/enable and add/remove mdas separately. It would then disable/remove the 2nd one, pvresize, then add/enable a replacement 2nd one.
The pvcreate/restorefile/vgcfgrestore method is the right workaround and could, in fact, be turned into a little script fairly easily if someone wanted to do that.
Nevertheless - currently Gentoo applies a patch which changes the default behaviour from one metadata area to two, but "man pvcreate" still says that the default is one copy. I think that this change MUST be reflected in the man-page also, or as an alternative the patch should be removed.
(In reply to comment #9)
> also, or as an alternative the patch should be removed.
I agree with removing that part of the patch. Until lvm2/pvresize is "fixed" to handle 2 metadatacopies, Gentoo should *not* change the default behaviour. The example config file should have sane defaults.
(In reply to comment #10)
> (In reply to comment #9)
> > also, or as an alternative the patch should be removed.
> I agree with removing that part of the patch. Until lvm2/pvresize is "fixed"
> to handle 2 metadatacopies Gentoo should *not* change the default behaviour.
> The example config file should have sane defaults.
I agree with this suggested policy. A major part of LVM -is- the ability to resize, and this breaks that functionality while requiring an even -more- dangerous action (removing the second copy of the metadata) to resolve the problem.
I presume that since no one has yet written a -simple- tool to just remove the second metadata copy that it isn't as trivial as it seems at first glance?
For .51, I've changed the default for the copies back to 1.
Nobody has written a script to do it because they haven't been suitably motivated. It's not hard, as the patch already linked here does it in lvm; upstream just hasn't merged that patch yet.
Can you please merge the PV metadata resize fixup patch:
anyway? From there I think functions to handle the MDAs can be extracted more easily, and get exposed probably under pvchange.
(In reply to comment #12)
> Can you please merge the PV metadata resize fixup patch:
(considering our upstream)
This patch works, but it has one little snag - it fails when using metadata copies on disk (the lvm.conf/dirs setting). This setting is supposed to be for debugging purposes mainly, but anyway you should know about it when trying to use it (it's quite outdated - there have been some other changes made in pv_write code anyway in the meantime).
I've made an update for the patch afterwards that solves this situation but we need to restructure the code a little internally. The newer patch was not sent to lvm-devel iirc, I've discussed it with Alasdair only. Then we concluded we would need more changes.
Also, while we're digging here, we'd like to solve this in a way that it would be a first step for a more general work on metadata handling.
I'll dive into this more once we have the udev work sorted out, which is what I've been working on till now. But yes, I've been looking into this again over the last couple of days. So a better solution is on its way...
prajnoha: ping on status.
Unfortunately, I kept getting interrupted by udev problems, and I also concentrated on other things... In the meantime, I think there were a few changes in that part of the code, so I need to look at it again. I'll try to get to it as soon as possible. Thanks for reminding me :)
I'm assuming you've tested now, and there are no issues left.
I'm not sure if "test request" is the right answer here. This bug no longer causes catastrophic failures; pvresize just plain refuses to do the work (which doesn't really mean "fixed", it just means "no longer eating your drives"; "fixed", as per the title, would mean "pvresize now works with 2 MDAs"). The upstream bug is still ASSIGNED, and the most recent comment (on May 17) indicates that work on it is still intended.
(In reply to comment #17)
> I'm not sure if "test request" is the right answer here. This bug no longer
> causes catastrophic failures, pvresize just plain refuses the work (which
> doesn't really mean "fixed", it just means "no longer eating your
> drives"—"fixed", as per the title, would mean "pvresize now works with 2
> MDAs"). The upstream bug is still ASSIGNED, and the most recent comment (on May
> 17) indicates that it's still intending to be worked on.
I guess this bug should either be marked as RESOLVED UPSTREAM, or reopened until this is actually fixed.
I think Jaak's comment #18 is appropriate.
Stable LVM appears to contain the complete fix from upstream; in fact, given that this is fixed, it may be a suitable time to switch back to two metadata copies by default in lvm.conf, or if not that then at least remove the comment "but PV resize is then disabled".