On some fibre-channel hardware (in my case a Sun StorEdge A5000 with a Qlogic2100 controller), there is a very long delay between the kernel informing udev that the device is coming online and the device actually being ready to probe. The ioerr_cnt check covers only part of the problem, because the drive has not even spun up at that point. In hotplug's scsi.agent, there is a check for the type attribute with a 10 second timeout, which needs to be greatly extended to pick up my hardware. If it is not extended, scsi.agent times out, then scsi_id fails to get data, and then path_id does not run because ID_TYPE is not set: a wonderful cascading failure. I've produced a patch for this (attached), and I'll apply it in the tree if there is no response by the end of the week. Over 100 test passes, the device took an average of 30 seconds to come online, with a worst case of 45 seconds. To leave some margin, I've set the timeout to 60 seconds, roughly 30% above the worst case. This change does not cause delays for other hardware, since their type attribute already appears well within the existing 10 second timeout.
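For anyone who wants to see the shape of the change without opening the attachment, here is a minimal sketch of a bounded wait of this kind. It is not the actual scsi.agent code or the attached patch; the function name wait_for_attr and its arguments are made up for illustration. It polls once per second and returns as soon as the file exists, which is why a larger limit costs nothing on hardware whose type attribute shows up quickly.

```shell
#!/bin/sh
# Hypothetical helper, not taken from scsi.agent: wait until a sysfs
# attribute file exists, giving up after a fixed number of seconds.
# Usage: wait_for_attr FILE TIMEOUT_SECONDS
wait_for_attr() {
    attr_file=$1
    timeout=$2
    waited=0
    # Returns immediately if the attribute is already present.
    while [ ! -e "$attr_file" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # timed out, attribute never appeared
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Assuming the agent sees the device path in DEVPATH, a call would look something like `wait_for_attr "/sys$DEVPATH/type" 60 || exit 1`.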
Created attachment 82182: Increases timeout to 60 seconds.
Committed now, after ACK by gregkh.