Bug 37308 - wrong dependency type in init-script of drbd 0.6.6-r2
Summary: wrong dependency type in init-script of drbd 0.6.6-r2
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High critical
Assignee: Gentoo Cluster Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-01-05 08:37 UTC by JG
Modified: 2010-09-10 18:59 UTC
CC: 1 user

See Also:
Package list:
Runtime testing required: ---


Description JG 2004-01-05 08:37:33 UTC
When starting drbd at boot, it hangs and is unable to sync its disks, because it needs the network up and running.
Hence "net" should be a NEED dependency instead of a USE dependency, so that the network is started before drbd.
If drbd cannot sync its data, data loss is to be expected.

Reproducible: Always
Steps to Reproduce:
1. Emerge drbd and configure it.
2. Add it to the default runlevel: rc-update add drbd default
3. Reboot the server.
4. drbd will hang if there is no network connection and ask the user to force the primary state (which causes data loss if the other server in the cluster was primary).

Actual Results:  
drbd hangs and waits for the user to force the primary state (which can cause data loss if the other server was primary and hence has the most recent data).

Expected Results:  
The network should be started before drbd, so that drbd can find the other server/node and sync with it.
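The change being requested amounts to a one-word edit in the init script's depend() function. A sketch of the idea (the actual /etc/init.d/drbd script shipped by the 0.6.6-r2 ebuild may differ in detail):

```sh
# /etc/init.d/drbd (OpenRC init script, excerpt)
depend() {
        # Original: "use net" only orders drbd after the network *if*
        # the network is scheduled to start anyway, so drbd can come
        # up with no network present.
        #use net

        # Proposed: "need net" makes drbd require the network and wait
        # for it, so it can reach its peer before attempting to sync.
        need net
}
```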
Comment 1 Jan Krueger 2004-01-05 17:19:32 UTC
Thank you for reporting this "issue".

Data loss:
This should never happen, as your cluster manager (heartbeat or something else) should be able to detect such a condition and resolve it together with drbd.
Remember: you should have at least 2 (different) connections between the nodes to allow the CM to detect such a condition.

NEED dependency:
drbd is often used in multi-link setups with one or more cluster-internal links and one or more links to the outside world. In such a setup, the interfaces to the outside world could carry the NEED flag and the cluster-internal links a USE flag, since the internal link or one of the 2 nodes is expected to fail (a failure of the external link should in this case cause a node failure [which currently doesn't happen], so that drbd does not start). For example:
 NEED net.eth0
 USE net.eth1
So drbd can start even if net.eth1 fails, and the node can boot even without that link (which might be a failed WAN link to the other node).
A lot of other configurations are possible depending on individual requirements.
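In OpenRC terms, the multi-link setup above might look like this in the init script's depend() function (a sketch under the assumptions above; the interface names are examples, not the ebuild's defaults):

```sh
# /etc/init.d/drbd (excerpt) -- hypothetical multi-link configuration
depend() {
        # External link: required; without it the node cannot serve
        # clients, so refuse to start drbd if it is down.
        need net.eth0
        # Cluster-internal link: may be down because the peer node has
        # failed; start drbd anyway and let it sync once the link is up.
        use net.eth1
}
```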

If you specify only NEED net, the node will fail to start if one of the links or the other node has failed. That does not provide high availability.

In other words:
IMHO, with the current Gentoo startup-script environment it is not possible to specify one setup that works in almost all cases; the individual requirements differ too much. The sysadmin is responsible for configuring (editing the drbd startup script to match the individual requirements) and for testing.

Please check your configuration and change it to meet your node-failure requirements. You can specify a timeout so that no user interaction is required; drbd then starts up, USEing the net, and as soon as the net is up and the CM is there, synchronization will proceed.
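The timeout mentioned above is set in /etc/drbd.conf. A sketch of the idea — note that the parameter names below (wfc-timeout, degr-wfc-timeout) are from later drbd releases, and drbd 0.6.x may spell them differently, so check the documentation for your version:

```sh
# /etc/drbd.conf (excerpt) -- boot-time wait behaviour, hypothetical values
resource r0 {
  startup {
    # Seconds to wait for the peer at boot before continuing without
    # user interaction; 0 means wait forever.
    wfc-timeout      120;
    # Shorter wait when this node was degraded before the reboot and
    # its data may be outdated.
    degr-wfc-timeout 60;
  }
}
```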

No need to fix anything.

Feel free to take over maintainership of this ebuild.
Comment 2 Michael Imhof (RETIRED) gentoo-dev 2004-01-07 16:20:45 UTC
So I think you solved this problem?
Comment 3 JG 2004-01-08 03:16:28 UTC
@michael
Yes, sorry for not answering; I didn't have internet access over the last few days.
You can close this "bug".

@jan
Thank you for your fast and very detailed answer! When implementing drbd for our little (student) project, we didn't think of setups like the ones you described. As you mentioned, we just edited the init script for ourselves.
Sorry for the inconvenience!

JG
Comment 4 Michael Imhof (RETIRED) gentoo-dev 2004-01-08 04:11:11 UTC
Thanks for your response.
So I'm closing this bug now.