Hi. It seems that the vgchange command in halt.sh (baselayout 1.10.4) fails when /var is on an LVM volume: it complains about locking during deactivation. The volume presumably becomes 'busy' because vgchange keeps a lock file open on it, which in turn means it can't deactivate the volume group. The best workaround I could find is to change the lock directory (in /etc/lvm/lvm.conf) to something directly under "/", e.g. "/.lvm.locks". Note that even /tmp (which is otherwise permissible according to the comments in lvm.conf) doesn't work, because /tmp might itself be on an LVM volume. So my proposal is that the lock directory be changed in the default /etc/lvm/lvm.conf (possibly with a comment explaining the situation), so that others aren't tripped up by this. Cheers,
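For reference, the change I'm proposing would look roughly like this in /etc/lvm/lvm.conf. This is only a sketch: the option name (locking_dir) and the default path shown are from my reading of the stock file and may differ between LVM2 versions.

```
# /etc/lvm/lvm.conf (sketch; option name and default path may vary
# between LVM2 versions)
global {
    # The stock default keeps lock files under /var, which breaks
    # deactivation when /var is itself an LVM volume:
    #locking_dir = "/var/lock/lvm"

    # Keep lock files directly under the root filesystem instead, so
    # vgchange can still take its locks after /var and /tmp are
    # unmounted:
    locking_dir = "/.lvm.locks"
}
```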
Can you give me the kernel version, version of baselayout, and version of lvm you are running, please? I have always put /var on LVM and it's never been a problem for me.
Ah, yes, sorry about the lack of version numbers:
- baselayout 1.10.4
- sys-kernel/development-sources-2.6.9_rc1
- sys-fs/lvm2-2.00.15
Interesting -- did this problem occur with a <2.6.9 kernel? That's the second odd LVM bug I've seen from 2.6.9 users. Is it possible for you to try a 2.6.7 kernel?
I've only tried LVM2 on 2.6.9-rc1, but I'll see if I can try 2.6.7 a bit later today. I'll also try removing the checkfs udevstart workaround from bug #62679, to see whether the system also boots properly on this kernel without modification to the checkfs/rc scripts.
many thanks :)
2.6.7 vs. 2.6.9-rc1 doesn't change anything with respect to this bug, so I'll just add my test results to the other bug. However, I *think* I've diagnosed the problem here: it was not LVM's own lock file, but a file opened by VMware(!). It seems that one of the vmnet modules opens a lock file somewhere under /var -- which, of course, happens *after* /var is mounted. Since I haven't compiled module-unloading support into the kernel, the module wasn't unloaded when the 'vmware' service was stopped. That left files open and left vgchange unable to deactivate the VG.

The one thing that might contradict the above explanation is that the filesystems themselves *seem* to unmount fine; there are no umount-related errors during shutdown. I don't have a concrete explanation for how that's possible, but it just might be that kernel modules can keep files open across umounts...?

I'll try disabling vmware completely to see if that fixes it, and I'll report back if/when something interesting happens. (Beware that I *normally* don't reboot all that often, so it might take a while; I'll try rebooting once a day just to see.)
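If it does turn out to be module-related, one quick check (a sketch -- 'vmnet' is just the module name I suspect above) is the module's use count in /proc/modules:

```shell
# The third field in /proc/modules is the module's use (reference)
# count; a non-zero count after the 'vmware' service has stopped would
# support the theory that the module is still holding files open.
grep '^vmnet' /proc/modules || echo "vmnet not loaded"
```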
I've done some testing, and vmware does not seem to be the problem after all (I haven't had the modules loaded at all). During a reboot, 'umount /usr' failed with a "busy file system" message, and consequently vgchange couldn't deactivate the VGs. I was dropped to a shell, but strangely, '# lsof | grep /usr' produced nothing, and the process list was almost empty. Even so, I still couldn't remount /usr read-only. I've attached the process list and the list of loaded modules in case there is something useful there -- I couldn't see anything interesting, but I digress... (Btw, I forgot to save the lsof output to a file; I'll try to remember next time it happens.)
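For next time this happens, here's a fallback I plan to try if lsof again shows nothing: walking /proc directly to find processes with open file descriptors under the mount point. This is only a sketch -- scan_mount is just a name I made up, and /usr is the example target:

```shell
#!/bin/sh
# Scan every process's open file descriptors via /proc and report any
# whose target lies under the given mount point. A crude fallback for
# when lsof's output is empty (or lsof itself lives on the stuck fs).
scan_mount() {
    mnt=$1
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails for vanished processes or unreadable fds;
        # just skip those entries.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$mnt"/*)
                # /proc/<pid>/fd/<n> -> extract <pid>
                pid=${fd#/proc/}
                pid=${pid%%/*}
                echo "$pid $target"
                ;;
        esac
    done
}

scan_mount /usr
```

Note this only sees file descriptors held by user-space processes; references held inside the kernel (e.g. by a module) won't show up here either, which would itself be a useful data point.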
Bugger. Sorry, but I've just accidentally deleted the files containing the process list and module list before attaching them. (Stupid, stupid, stupid me...) Anyway, this is probably not something that can be trivially diagnosed or fixed, so I'll just close the bug until I come up with a way to reproduce it reliably.