Summary: | sys-cluster/ceph-0.80.5 - init script breaks standard ceph conventions; fails to start daemons properly | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Aaron Ten Clay <gentoo-bugzilla> |
Component: | [OLD] Server | Assignee: | Patrick McLean <chutzpah> |
Status: | RESOLVED OBSOLETE | ||
Severity: | normal | CC: | cluster, dlan, mgorny |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- |
Description
Aaron Ten Clay
2014-08-19 00:06:04 UTC
Workaround for now is:

echo "/usr/lib/ceph/ceph_init.sh start" >> /etc/local.d/ceph.start
echo "/usr/lib/ceph/ceph_init.sh stop" >> /etc/local.d/ceph.stop
chmod +x /etc/local.d/ceph.{start,stop}

Yes, we expect the user to create the additional symbolic link. Also, I'd leave "mounting OSDs" out of the ceph script; can you handle it with a proper /etc/fstab?

Still, I haven't seen what the specific error is. Is it just a failure to mount the OSDs?

Closed as invalid; we'd leave mounting file systems out of the ceph init script.

(In reply to Yixun Lan from comment #2)
> yes, we expect user to create additional symbol link.
> Also, I'd leave "mounting OSDs" out of ceph script, can you handle it with
> proper /etc/fstab?
>
> still, I've not seen what is the specific error? just mounting OSDs failure

Sorry, Bugzilla didn't email me when you replied this time, only the subsequent time, so I couldn't reply promptly.

Ceph's approach to managing OSDs (and all daemons) is a bit more comprehensive than most init scripts because the Ceph ecosystem is complex. Having to update fstab and make a bunch of symlinks whenever one piece changes is quite cumbersome and inelegant compared to what the Ceph init script already supports. Ceph is responsible for knowing the correct mount-time options, filesystem type, etc. for getting an OSD online. Adding that detail to fstab would be error-prone at best, and could have data-loss consequences at worst.

As long as the Ceph init script is still shipped with the Ceph ebuild at /usr/lib/ceph/ceph_init.sh, there is no problem for me - the workaround is simple enough. But the changes you've made to the Gentoo init script will make it much more difficult for people new to Ceph to get up and running with Ceph on Gentoo.

Also, there is still a bug because the "--cluster" argument is not provided to daemons when they are started by the Gentoo-provided init script.

After second thoughts, I re-opened this bug.
(In reply to Aaron Ten Clay from comment #4)
> Ceph's approach to managing OSDs (and all daemons) is a bit more
> comprehensive than most init scripts because the Ceph ecosystem is complex.

Sigh. The ceph init script is quite large and comprehensive, and upstream's philosophy is "put all functions in one script (mon, mds, osd), make it work", while my initial motivation was to convert it to a Gentoo-style init script: do one thing in one script, make it clean and neat. Now, I can't say I've succeeded at this.

> Having to update fstab, and make a bunch of symlinks, whenever one piece
> changes, is quite cumbersome and inelegant compared to what the Ceph init
> script already supports. Ceph is responsible for knowing the correct
> mount-time options, filesystem type, etc. for getting an OSD online. Adding
> that detail to fstab would be error-prone, at best, and could have data-loss
> consequences at worst.

Your arguments are reasonable here; let us leave the ceph script's design philosophy aside. Following upstream is always good: they've already tested the script, so it should work out of the box.

>
> As long as the Ceph init script is still shipped with the Ceph ebuild at
> /usr/lib/ceph/ceph_init.sh, there is no problem for me - the workaround is
> simple enough.

Your workaround is exactly the same as the old Gentoo init script (before I converted it).

> But the changes you've made to the Gentoo init script will
> make it much more difficult for people new to Ceph to get up and running
> with Ceph on Gentoo.

I can restore the old ceph init script logic (which is ceph upstream's version) until we fully convert it to Gentoo style (huge work, and potentially out of sync with upstream's version).

Summary of my plan:
1) Restore the ceph upstream init script (/usr/lib/ceph/ceph_init.sh) and make it the default.
2) Keep the current Gentoo init script logic if possible.
3) Try to stay compatible with the current init script, e.g. a link to ceph-osd.0 still works.
(In reply to Yixun Lan from comment #5)
> I can restore the old ceph init script logic (which is ceph upstream's
> version), before we fully convert it to Gentoo style (huge work, potential
> out of sync with upstream's version).
>
> sum of my plan
> 1) restore ceph upstream init script(/usr/lib/ceph/ceph_init.sh), make it
> default
> 2) keep current Gentoo init script logic if possible.
> 3) try to make it compatible with current init script, eg. link to
> ceph-osd.0 still works.

I would love to see a more "Gentoo" approach, personally, and I applaud the efforts - unfortunately, I can't think of any way to simplify the process. I think Ceph will improve the init process over time, and maybe the Gentoo architecture can help guide that.

If you're interested, I would suggest having the Gentoo init script call the Ceph-distributed init script. If the Gentoo script is invoked through a specific symlink, e.g. as '/etc/init.d/ceph-osd.0 <verb>', it would call the Ceph init script as '/usr/lib/ceph/ceph_init.sh <verb> osd.0'.

To clarify:
(Gentoo) '/etc/init.d/ceph <verb>' calls '/usr/lib/ceph/ceph_init.sh <verb>',
(Gentoo) '/etc/init.d/ceph-osd.0 <verb>' calls '/usr/lib/ceph/ceph_init.sh <verb> osd.0'

I believe this would satisfy both the upstream use case of one init script doing everything based on ceph.conf, as well as the Gentoo style of service-specific symlinks.

What I'm not sure about is how to incorporate the potential "--cluster" parameter, since that could be very important for some users. Maybe if there is a dot in the symlink before any dashes, that is the cluster name? e.g. 'ceph.<cluster>-osd.0' or 'ceph.<cluster>'? That follows the OpenVPN init symlink naming convention.

I'm happy to discuss further if you'd like. I frequent the Ceph IRC channel and mailing list; perhaps I can assist with the efforts. Just let me know.
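The symlink naming idea in the comment above could be sketched in POSIX shell roughly as follows. This is only an illustration under stated assumptions: `parse_ceph_service` is a hypothetical helper, not code from the Gentoo ebuild or from the upstream script. It assumes "ceph" as the cluster name when the symlink does not name one, and it assumes cluster names contain no dashes.

```shell
# Hypothetical helper (not part of any shipped init script): split a
# service name such as "ceph.backup-osd.0" into a cluster name and a
# daemon name, following the naming scheme proposed in the comment above.
# Note: plain POSIX sh has no "local", so svc/cluster/daemon/rest are globals.
parse_ceph_service() {
    svc=$1
    cluster=ceph   # assumption: cluster name used when none is given
    daemon=        # empty means "every daemon listed in ceph.conf"

    case $svc in
        ceph.*)
            # A dot before any dash introduces the cluster name.
            rest=${svc#ceph.}
            case $rest in
                *-*) cluster=${rest%%-*}; daemon=${rest#*-} ;;
                *)   cluster=$rest ;;
            esac
            ;;
        ceph-*)
            daemon=${svc#ceph-}
            ;;
        ceph)
            ;;
        *)
            # Not a ceph service name at all.
            return 1
            ;;
    esac

    printf 'cluster=%s daemon=%s\n' "$cluster" "$daemon"
}

# parse_ceph_service ceph-osd.0         prints "cluster=ceph daemon=osd.0"
# parse_ceph_service ceph.backup-osd.0  prints "cluster=backup daemon=osd.0"
```

A wrapper could use the daemon part to build a call such as '/usr/lib/ceph/ceph_init.sh <verb> osd.0'. How the cluster name is then handed to the upstream script is left open here, as it is in the discussion.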
(In reply to Aaron Ten Clay from comment #6)
>
> To clarify:
> (Gentoo) '/etc/init.d/ceph <verb>' calls '/usr/lib/ceph/ceph_init.sh <verb>',
> (Gentoo) '/etc/init.d/ceph-osd.0 <verb>' calls '/usr/lib/ceph/ceph_init.sh
> <verb> osd.0'

Yeah, this is exactly what I planned. I'm just not sure whether the ceph upstream init script already supports the following:
'/usr/lib/ceph/ceph_init.sh <verb> osd.0' -> one specific osd daemon

From my reading of the code, it should support the use case of '/usr/lib/ceph/ceph_init.sh <verb> [mon|mds|osd]': it parses /etc/ceph/ and gets all IDs of one specific type.

>
> I believe this would satisfy both the upstream use case of one init script
> doing everything based on ceph.conf, as well as the Gentoo style of
> service-specific symlinks.
>
> What I'm not sure about is how to incorporate the potential "--cluster"
> parameter, since that could be very important for some users. Maybe if there
> is a dot in the symlink before any dashes, that is the cluster name? e.g.
> 'ceph.<cluster>-osd.0' or 'ceph.<cluster>'? That follows the OpenVPN init
> symlink naming convention.
>

I haven't looked into this, but it sounds good to me. Is "--cluster" an option that can be switched on and off? We could control it via /etc/conf.d/ceph or something similar.

> I'm happy to discuss further if you'd like. I frequent the Ceph IRC channel
> and mailing list, perhaps I can assist with the efforts. Just let me know.

That's good, help is always welcome! If you are willing to push this forward, I'd just say "go ahead"; I'd be more than happy to review and test it. Thanks very much.

Created attachment 588264 [details]
554
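The suggestion in the thread to switch "--cluster" on and off through /etc/conf.d/ceph could look roughly like the sketch below. Both `CEPH_CLUSTER` and `cluster_args` are hypothetical names chosen for illustration; neither is an existing setting of the Gentoo init script.

```shell
# Hypothetical sketch: CEPH_CLUSTER is an assumed variable that an
# /etc/conf.d/ceph file might define. Leaving it unset or empty switches
# the option off, so no "--cluster" argument is emitted at all.
cluster_args() {
    if [ -n "${CEPH_CLUSTER:-}" ]; then
        printf '%s %s\n' "--cluster" "${CEPH_CLUSTER}"
    fi
}

# With CEPH_CLUSTER=backup, cluster_args prints "--cluster backup".
# With CEPH_CLUSTER unset or empty, it prints nothing.
```

The result would be appended to whatever daemon command line the init script builds, so a configuration without the variable behaves as it does today.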