Bug 242778 - sys-fs/cryptsetup-1.0.6 calls udevsettle even for mdev systems
Summary: sys-fs/cryptsetup-1.0.6 calls udevsettle even for mdev systems
Status: RESOLVED OBSOLETE
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High minor
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Depends on: 273039
Blocks:
Reported: 2008-10-19 14:53 UTC by Sven E.
Modified: 2012-10-22 04:50 UTC (History)
CC List: 8 users

See Also:
Package list:
Runtime testing required: ---


Attachments
cryptsetup-1.0.6-svn-udev.patch (7.99 KB, patch)
2008-10-26 10:57 UTC, SpanKY

Description Sven E. 2008-10-19 14:53:17 UTC
cryptsetup-1.0.6 calls out to udevadm settle. A program calling another program like this is a no-go, and it is absolutely pointless if another hotplug processor is used. The race condition this call tries to fix should be handled somewhere else, and if it is udev-specific, cryptsetup should first:
 a) check whether udev is used at all, and
 b) only then use a udev-specific fix.


Reproducible: Always

Steps to Reproduce:
1. Compile cryptsetup.
2. Run cryptsetup.

Actual Results:
cryptsetup makes udev-specific calls.

Expected Results:
No udev-specific calls if udev is not used.

I don't know whether the race condition that this behavior tries to avoid is specific to udev. If not, then maybe udev and other hotplug processors need a common interface to check their queue and wait for settling. If it is udev-specific, then udev probably needs fixing, or else the code in libdevmapper that seems to produce the race (as noted in the cryptsetup source).
Comment 1 Mike Auty (RETIRED) gentoo-dev 2008-10-19 21:02:27 UTC
Perhaps you could explain why this is a major problem, for instance, what difficulties does it cause for other systems?  Do you have a system that's affected by this call?  Is it a security issue?
Comment 2 Sven E. 2008-10-19 21:43:34 UTC
Well, in my case it's an in-kernel initramfs built on busybox (bb) and its mdev.

Though the devices seem to get created, it's just not right to get error messages about an error that never occurred (or that, if it could have occurred, is not handled properly because udev is not used). What I mean by severely broken is that handling it this way is utterly wrong.

Either cryptsetup needs to do udev specific tasks when udev is running - or it might be udev in general having a design problem.

I don't know the exact specification of the kernel hotplug interface, but if the kernel passes events out to an external program (udev, mdev or any other), the interface needs to specify whether they may be processed asynchronously. If they may not, and udev does so anyway (and thus keeps its own queue), udev needs fixing. If the kernel does allow a hotplug processor to work through the event queue asynchronously, the processor may return before it is done (as udev does), but then the kernel interface needs to offer a way to signal that events are still pending, and with it the ability to wait for them to complete. If the kernel doesn't have this, then the kernel is broken.

And depending on the actual specification, cryptsetup should consult the proper interface if it needs to wait for events to be handled, not call a userspace program that only one hotplug processor offers. If the problem is valid for all hotplug processors, then some common interface is needed to wait for 'settling', or cryptsetup might need to handle each of them separately.

Shell scripts call other executables; programs should not. They should use libraries and kernel functions.

I am well aware that this probably needs to be taken upstream for proper fixing and yes this could be a problem that needs fixing in all 3 components, kernel, cryptsetup and the hotplug processor ... but cryptsetup certainly goes the completely wrong way.

It might be interesting to know exactly why cryptsetup waits for udev to settle at this place in the source; maybe waiting is just the wrong fix.
Comment 3 Mike Auty (RETIRED) gentoo-dev 2008-10-20 09:16:03 UTC
The reason it settles is that it uses devmapper to create the new mapped device. As you pointed out, udev can run asynchronously, and hence other programs can continue running whilst the device-mapper device is still being set up. This means that even after the call to create the device has been made, the device may not be ready, and cryptsetup would fail.
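
One way to cope with that window without depending on a particular device manager is to wait for the device node itself. The following is only a polling sketch in Python, not what cryptsetup does; the function name wait_for_node and its defaults are hypothetical, and the existence of the node does not prove that the device manager has finished with it.

```python
import os
import time


def wait_for_node(path, timeout=5.0, interval=0.05):
    """Poll until `path` exists or `timeout` seconds pass.

    Returns True if the path appeared in time, False otherwise.
    Hypothetical helper for illustration; not part of cryptsetup.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would pass the expected node, for example a path under /dev/mapper, and treat a False result as "still racing" rather than as a hard failure.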

Rather than a program, would you prefer a udev-specific API which forced all other hotplugging mechanisms to implement the same API for waiting?  That seems just as restrictive, but rather than replacing a symlink/file, it requires completely recompiling the code.

I've got a couple more questions for you, since I think the actual problem may be that the initial ramdisk doesn't have the right symlinks set up for udev (i.e., I think mdev provides a udevsettle-like program, but that it hasn't been properly added to the initrd).

So, do you use genkernel and if so which version, and are you using LUKS for your encrypted system, or some other encryption that requires cryptsetup?
Comment 4 Sven E. 2008-10-20 13:30:42 UTC
I am not using genkernel. As I pointed out, the initramfs is embedded into the kernel and thus created by the kernel build itself.

As the busybox documentation shows, there's only mdev -s (for seeding); in all other cases mdev is called by the kernel as the hotplug processor, whereas udev reads the kernel's event queue directly. I doubt that the kernel calls out to the processor in /proc/sys/kernel/hotplug and returns immediately; I assume it does this synchronously (if I am wrong here, then the kernel is even more broken than I feared).
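
Whether mdev is the configured hotplug processor can be read from /proc/sys/kernel/hotplug. The following Python sketch shows one way to check; the function names are hypothetical, and it assumes that the file, where present, contains a single helper path.

```python
import os


def hotplug_helper(proc_file="/proc/sys/kernel/hotplug"):
    """Return the helper path the kernel runs per uevent, or None.

    None means the file is missing, unreadable, or empty (no helper set).
    """
    try:
        with open(proc_file) as f:
            helper = f.read().strip()
    except OSError:
        return None
    return helper or None


def uses_mdev(helper):
    """True if the configured helper's file name is mdev."""
    return helper is not None and os.path.basename(helper) == "mdev"
```

On a system like the initramfs described here, uses_mdev(hotplug_helper()) would be expected to be true; an empty or missing file only says that no helper is set, not that udev is running.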

In contrast, udev reading the kernel queue in an unordered, asynchronous fashion is obviously prone to such a race condition.

By the way, the symlinks for busybox are created with --install -s.

Anyway, if the kernel's uevent interface is meant to be handled by directly reading its queue, then udev should send an event to the kernel after settling, and the kernel should have an interface to wait for this, as a proper generic approach to this problem.

I am thinking of something like:

From cryptsetup's POV:

instruct libdevmapper to create inodes etc.
call something like wait_for_devicemanager_to_settle.

From udev's point of view:

process the event queue; when it is empty, send an event/notification to the kernel.

For the time being, a workaround might look like this (pseudocode):

if (exists("/sbin/udevadm")) {
   call "udevadm settle" and check result
} else if (exists("/sbin/udevsettle")) {
   call udevsettle and check result
}
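
The pseudocode above can be made runnable roughly as follows. This is a Python sketch for illustration, not the patch that was eventually applied; the names CANDIDATES, pick_settle_command and settle are hypothetical, and the two paths are taken from the pseudocode and may differ on a real system.

```python
import os
import subprocess

# Candidate settle commands, newest tool first. The paths come from the
# pseudocode above; real systems may install the binaries elsewhere.
CANDIDATES = (
    ["/sbin/udevadm", "settle"],
    ["/sbin/udevsettle"],
)


def pick_settle_command(is_executable=lambda p: os.access(p, os.X_OK)):
    """Return the first candidate whose binary is executable, else None."""
    for cmd in CANDIDATES:
        if is_executable(cmd[0]):
            return list(cmd)
    return None


def settle():
    """Run the settle command if there is one.

    Returns True on success or when no udev tool is installed (for
    example on an mdev system), False if the command failed.
    """
    cmd = pick_settle_command()
    if cmd is None:
        return True  # nothing udev-specific to wait for
    return subprocess.run(cmd).returncode == 0
```

The is_executable parameter exists only so the selection logic can be exercised without touching /sbin.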

Now here's another reason why calling udevadm is wrong: there might be systems where a udev with udevsettle is in use (though they are certainly rare).

And yes, I am using luks.

Bottom line: If the kernel uevent interface is meant to be asynchronous, there need to be proper mechanisms for synchronization; otherwise it's broken by design. And both udev and cryptsetup should make proper use of them.

(Additional thought: who guarantees udevadm is installed where cryptsetup is looking for it? As I said, calling another program is a rather disturbing idea.)

BTW: What exactly does cryptsetup do, after waiting for udev to settle?


Comment 5 Sven E. 2008-10-20 15:31:30 UTC
After some additional research I found out that the kernel apparently sends uevents to a netlink socket as well as to the helper set in /proc/sys/kernel/hotplug.
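
For reference, a kernel uevent read from the netlink socket is an "action@devpath" header followed by NUL-separated KEY=VALUE fields, one of which is SEQNUM. The Python sketch below parses such a payload; the function names are hypothetical, and it assumes the kernel's own message format rather than messages re-sent by a udev daemon.

```python
def parse_uevent(payload):
    """Split a raw kernel uevent (bytes) into a dict of KEY=VALUE fields.

    The first NUL-terminated segment is the "action@devpath" header and
    is skipped; segments without "=" (such as the empty trailer) are ignored.
    """
    fields = {}
    for part in payload.split(b"\x00")[1:]:
        key, sep, value = part.partition(b"=")
        if sep:
            fields[key.decode()] = value.decode()
    return fields


def seqnum(payload):
    """Return the event's SEQNUM as an int, or None if it is absent."""
    value = parse_uevent(payload).get("SEQNUM")
    return int(value) if value is not None else None
```

A helper started through /proc/sys/kernel/hotplug receives the same keys as environment variables instead of a payload.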

The events are unordered but numbered.

For the processor set via /proc, this implies the following:
1. Remember the last processed uevent and hold back those that arrive out of order until the missing one is present.
2. If spawned several times, synchronize ...
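
Step 1 amounts to a small reorder buffer. The Python sketch below shows the idea inside a single process; the class name Reorderer is hypothetical, and it assumes the first sequence number is known in advance, which is not a given for a real hotplug helper.

```python
class Reorderer:
    """Release numbered events strictly in sequence, holding back early arrivals."""

    def __init__(self, first_seq):
        self.next_seq = first_seq  # the sequence number we are waiting for
        self.pending = {}          # out-of-order events keyed by sequence number

    def push(self, seq, event):
        """Accept one event and return the events that are now deliverable."""
        if seq < self.next_seq:
            return []  # stale or duplicate event, already delivered
        self.pending[seq] = event
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

A helper that is spawned once per event would have to keep next_seq and pending in shared storage under a lock, which is the synchronization that step 2 refers to.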

Calling a process that is in itself stateless but requiring it to keep some sort of state is really braindead. One could do it with a lock and a state file (plain ugly, but okay), but this only works if the lock file is created with the first sequence number, by the first uevent. Even if the first event were guaranteed to be number one, a race might still occur, unless the kernel waits for each callout to finish and the events are ordered. So this interface is broken by design.

Now, for netlink: a daemon listening on netlink can order the events, but it needs to know which one was the first. All that is then needed is for any potential listener to acknowledge the sequence numbers by sending a special event back to the kernel.

So, for example, a program wants to wait for the device manager (whatever it may be) to settle. After sending the events to create inodes, the program would call wait_for_ack(number of the last event sent), which would return only after the device manager, whatever it is, has finished.
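
The proposed semantics can be modelled inside one process as follows. This is only a Python sketch of the idea: no such kernel interface is described in this report, the class name AckTracker is hypothetical, and threads stand in for the device manager and the waiting program.

```python
import threading


class AckTracker:
    """Model of the proposal: the device manager acks sequence numbers
    and callers block until a given number has been acknowledged."""

    def __init__(self):
        self._cond = threading.Condition()
        self._acked = -1  # highest sequence number acknowledged so far

    def ack(self, seq):
        """Device manager side: event `seq` and all earlier ones are handled."""
        with self._cond:
            if seq > self._acked:
                self._acked = seq
                self._cond.notify_all()

    def wait_for_ack(self, seq, timeout=None):
        """Caller side: block until `seq` is acknowledged.

        Returns True once it is, or False if `timeout` seconds pass first.
        """
        with self._cond:
            return self._cond.wait_for(lambda: self._acked >= seq, timeout)
```

The timeout is there because a caller that waits forever on a device manager that never acknowledges would simply trade the race for a hang.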

As I see it, cryptsetup should not make the call to udevadm; instead libdevmapper/dm-crypt should wait for the request to finish, and the proper interfaces need to be created. If the kernel does not have a wait() function for events to be processed, then that's where the fixing needs to start ... and when those interfaces are in place, cryptsetup should adhere to them.


Comment 6 Mike Auty (RETIRED) gentoo-dev 2008-10-20 15:59:09 UTC
Ok, but every one of the changes you've requested needs to be taken upstream, since they're not problems with Gentoo or the way it packages the programs, but with the programs themselves.  So I'd go mention this at one of:

http://luks.endorphin.org/dm-crypt
https://www.redhat.com/mailman/listinfo/dm-devel
http://lkml.org/
Comment 7 Sven E. 2008-10-20 16:39:28 UTC
Okay, makes sense; I will try to start with the luks/dm-crypt people.

Thanks for advising.
Comment 8 Sven E. 2008-10-24 23:17:44 UTC
http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/2804

Discusses the problem and includes a patch.

The patch should really be included in cryptsetup to get a properly working core system.

And the patch has been committed to the upstream svn in the meantime.

So two options arise: include the proper fix instead of changing the udevsettle call to udevadm, or create a build for the svn version.

Thanks
Comment 9 SpanKY gentoo-dev 2008-10-26 04:08:10 UTC
where do you see their svn ?  i cant seem to find it
Comment 10 Mike Auty (RETIRED) gentoo-dev 2008-10-26 10:29:39 UTC
Hmmm, I managed to track it down.  It seems they set themselves up on googlecode without mentioning it anywhere on the (two or three) old homepages...

http://code.google.com/p/cryptsetup/
Comment 11 SpanKY gentoo-dev 2008-10-26 10:57:29 UTC
Created attachment 169916
cryptsetup-1.0.6-svn-udev.patch

thanks ... can someone please test out this patch
Comment 12 Mike Auty (RETIRED) gentoo-dev 2008-10-26 11:17:28 UTC
Ok, this patch has to replace the existing cryptsetup-1.0.6-udevsettle.patch (which changed "udevsettle" into "udevadm settle" and checked the return value), but after that it'll apply fine.

It seemed to boot my rootfs fine, and mount the encrypted swap ok too.  I'm not sure how to cause the race condition, or know how common it was, but certainly initially it works fine...
Comment 13 Mike Auty (RETIRED) gentoo-dev 2008-10-26 11:19:20 UTC
Sorry, I forgot to mention: since I'm using genkernel and LUKS, the rootfs was mounted under mdev, and the swapfs was mounted under udev.  In fact, under mdev the previously patched cryptsetup (which called "udevadm settle") used to leave two error messages about not being able to protect against the race condition, and those are now gone...
Comment 14 Jürgen Geuter 2008-10-30 11:31:33 UTC
I do get a race condition with UDEV/dmcrypt.

I have an encrypted /home partition. When I boot my system (genkernel generated 2.6.27-gentoo-r1) the system "hangs" after I enter my passphrase for dmcrypt. I can CTRL-C, and if I then restart dmcrypt after the system has booted I have no problems at all. I could try waiting for timeouts if that would help.

sys-fs/cryptsetup is 1.0.6-r2
sys-fs/udev       is 130-r1
Comment 15 Sven E. 2008-10-30 17:14:24 UTC
I have tested the patch in the meantime, and the complaints about the missing udev are indeed gone. I use mdev for hotplugging on my initramfs, just to mention it.

Aside from that there are no races and no other problems for any of the logical volumes on the dm-crypt device so far.

Jürgen, did you apply the patch posted here to cryptsetup? Did you not have races before using the patch?

http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/2804

Talks about a deadlock that might occur without the new patch. If you did not apply it, can you do that and check if your problem is solved?
Comment 16 Jürgen Geuter 2008-10-30 17:19:03 UTC
The weird thing is that I don't have the issue with a 2.6.26-kernel (I used the 2.6.26 config with genkernel to build the 2.6.27 image).
I can't right now, but I can check whether the patch fixes the issue tomorrow, I have not tested it yet.
Comment 17 W. Elschner 2008-11-23 08:51:00 UTC
(In reply to comment #11)
> > cryptsetup-1.0.6-svn-udev.patch
> 
> thanks ... can someone please test out this patch

Works fine here with udev-133 on AMD64 with 2.6.27.7 kernel. I replaced the existing cryptsetup-1.0.6-udevsettle.patch with your patch.
I got these races with a udev rule that calls "cryptsetup luksOpen...". No problems anymore with your patch.


Comment 18 Mike Auty (RETIRED) gentoo-dev 2008-12-08 23:42:03 UTC
Seems there haven't been any problems with this patch.  Any news on getting it in the tree?
Comment 19 Elias Probst 2009-02-15 05:35:21 UTC
The patch solved the problem here for me too on 4 boxes.
Please get it into the tree unless there are any reasons left not to do it.

Thanks a lot

Elias P.
Comment 20 Greg Fitzgerald 2009-02-16 00:17:23 UTC
This patch is working well here, ~amd64.
Comment 21 Alex Legler (RETIRED) archtester gentoo-dev Security 2009-02-16 20:50:26 UTC
Works like a charm (~x86), thanks vapier!
Comment 22 Felix Tiede 2009-09-07 07:17:31 UTC
Works very well here (~amd64). Thanks.

Also solved my problem:
Automounting a LUKS-encrypted USB-Key long after system has started. Took at least two minutes before.
Comment 23 SpanKY gentoo-dev 2012-10-22 04:50:37 UTC
shouldn't be an issue with latest versions of cryptsetup