| Summary: | sys-fs/lvm2-2.02.98 - lvm segfaults during autoactivation via udev+lvmetad | | |
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | Alexander Tsoy <alexander> |
| Component: | [OLD] Core system | Assignee: | Robin Johnson <robbat2> |
| Status: | RESOLVED UPSTREAM | | |
| Severity: | normal | CC: | agk, cardoe, poncho, prajnoha |
| Priority: | Normal | | |
| Version: | unspecified | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Package list: | | Runtime testing required: | --- |
| Attachments: | emerge --info lvm2; lsblk; lvm backtrace (lvm2-2.02.98); lvm backtrace (lvm2-2.02.99); lvm2-lvmetad-pvscan-cache-fix-segv.patch | | |
**Description** (Alexander Tsoy, 2013-05-01 21:15:07 UTC)
- Created attachment 347074 [details]: emerge --info lvm2
- Created attachment 347076 [details]: lsblk
- Created attachment 347078 [details]: lvm backtrace (lvm2-2.02.98)
- Created attachment 347080 [details]: lvm backtrace (lvm2-2.02.99)
- Created attachment 347264 [details, diff]: lvm2-lvmetad-pvscan-cache-fix-segv.patch
Maybe this patch doesn't fix the root of the problem, but it fixes the segfaults for me. All LVs get activated during boot now.
Thanks for the report! We'll investigate this and sort out an upstream fix!

(In reply to comment #5)
> Created attachment 347264 [details, diff]
> lvm2-lvmetad-pvscan-cache-fix-segv.patch
>
> Maybe this patch doesn't fix the root of the problem, but it fixes the
> segfaults for me. All LVs get activated during boot now.

Well, yes, we need to find the real source of the problem here... The situation described here should not normally happen: if the PV belongs to some VG, then *all* metadata areas on that PV should refer to the same existing VG. What happened here is that the vg_read call for one of the metadata areas returned no VG, while the call for *another* metadata area bound to the same PV did return a VG. This is clearly an inconsistent lvmetad/lvmcache state.

For starters, could you please try the following:

- use the original code without that patch
- set use_lvmetad=0 in lvm.conf
- reboot
- after reboot, set use_lvmetad=1 and start the lvm2-lvmetad.service
- run "pvscan --cache /dev/<PV_device>" for each device directly on the command line - can you reproduce the segfault this way?

Thanks.

(In reply to comment #7)
> For starters, could you please try the following:
> [...]

No segfaults, but on the first attempt it tries to open the cdrom and sdd (a flash card reader); both are empty.
```
$ sudo LANG=C pvscan --cache /dev/md2
  /dev/cdrom: open failed: No medium found
  /dev/sdd: open failed: No medium found
$ sudo LANG=C pvscan --cache /dev/md3
$ sudo LANG=C pvscan --cache /dev/md2
$ sudo LANG=C pvscan --cache /dev/md3
```

This also "solves" the problem with segfaults :)

```
$ diff -u /etc/lvm/lvm.conf{.dist,}
--- /etc/lvm/lvm.conf.dist	2013-04-25 22:42:19.000000000 +0400
+++ /etc/lvm/lvm.conf	2013-05-07 23:58:26.755399905 +0400
@@ -87,7 +87,7 @@
     # global_filter. The syntax is the same as for normal "filter"
     # above. Devices that fail the global_filter are not even opened by LVM.
 
-    # global_filter = []
+    global_filter = [ "r|/dev/cdrom|", "r|/dev/sdd|" ]
 
     # The results of the filtering are cached on disk to avoid
     # rescanning dud devices (which can take a very long time).
@@ -495,7 +495,7 @@
     #
     # If lvmetad has been running while use_lvmetad was 0, it MUST be stopped
     # before changing use_lvmetad to 1 and started again afterwards.
-    use_lvmetad = 0
+    use_lvmetad = 1
 
     # Full path of the utility called to check that a thin metadata device
     # is in a state that allows it to be used.
```

(In reply to comment #9)
> This also "solves" the problem with segfaults :)
> [...]

Hmm... this doesn't completely solve the problem. Segfaults still occur occasionally.
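The manual diagnostic sequence from comment #7 can be sketched as a small script. This is a hedged sketch, not part of the original report: the PV device names (/dev/md2, /dev/md3) are taken from this bug's sessions, and the `run` helper with DRY_RUN=1 only prints each command so the sequence can be reviewed without modifying a live system.

```shell
# Sketch of the lvmetad diagnostic steps from comment #7 (assumptions:
# PVs are /dev/md2 and /dev/md3, as in this report). With DRY_RUN=1 the
# commands are printed instead of executed.
DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"    # show what would be executed
    else
        "$@"
    fi
}

# 1. Disable lvmetad in lvm.conf, then reboot with the unpatched code.
run sed -i 's/use_lvmetad = 1/use_lvmetad = 0/' /etc/lvm/lvm.conf
echo "(reboot here)"

# 2. After reboot: re-enable lvmetad and start the daemon.
run sed -i 's/use_lvmetad = 0/use_lvmetad = 1/' /etc/lvm/lvm.conf
run systemctl start lvm2-lvmetad.service

# 3. Feed each PV to lvmetad by hand and watch for a segfault.
for pv in /dev/md2 /dev/md3; do
    run pvscan --cache "$pv"
done
```

Setting DRY_RUN=0 (as root, on a test machine) would actually perform the steps; the point of the sequence is to trigger the pvscan/lvmetad interaction outside of udev autoactivation.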
Then in the emergency shell I do the following:

```
home ~ # systemctl start lvm2-lvmetad.service
home ~ # ls -l
total 16
-rw-r--r-- 1 root root    0 May  9 19:16 typescript
home ~ # pvscan --cache /dev/md2
  �EF: open failed: Is a directory
home ~ # ls -l
total 20
-rw-r--r-- 1 root root    0 May  9 19:16 typescript
drwx------ 2 root root 4096 May  9 19:17 ??E?F
```

A directory with a strange name got created. O_o

I just tried the latest snapshot from Git again. The problem still exists. Any ideas?

I can't reproduce these segfaults with the 2.02.99 release, so it looks like this bug was fixed.

Closing as upstream per comment #12.