I have a /etc/init.d/radosgw.$(hostname -f) -> ceph symlink. It appears that the radosgw daemon will not correctly identify itself without the --cluster and --name arguments. Correct identification is necessary for it to appear as a uniquely distinguishable rgw instance in the `ceph status` output.

The issue is easiest to see when all keyring settings are removed from ceph.conf, default keyring locations are not available in ceph.conf, and cephx is enabled. In this case the log shows an error like this:

> auth: unable to find a keyring on /var/lib/ceph/radosgw/-admin/keyring: (2) No such file or directory

With these arguments added:

command_args+=" --cluster $cluster --name client.radosgw.$(hostname -f)"

it chooses the "/var/lib/ceph/radosgw/$cluster-radosgw.$(hostname -f)/keyring" location for the keyring, which confirms that it is correctly identifying itself.
The key difference is that radosgw does not set its "name" parameter from the --id argument given by the init script:

> $ radosgw -i foo -c /dev/null --show_config | grep '^name ='
> name = client.admin
> $ ceph-mgr -i foo -c /dev/null --show_config | grep '^name ='
> name = mgr.foo
> $ ceph-mon -i foo -c /dev/null --show_config | grep '^name ='
> name = mon.foo
> $ ceph-osd -i foo -c /dev/null --show_config | grep '^name ='
> name = osd.foo

Changing the behavior of the radosgw program itself is probably not desirable, since it would affect the way that radosgw identifies itself in existing clusters (radosgw would no longer use the client.admin key by default). Modifying the init script to always pass the --name parameter would cause a similar problem for existing users of the init script. So it seems that the best solution is to provide a way for the user to indicate that the init script should pass a --name parameter to radosgw.
Created attachment 553844
Add confd RADOSGW_WANT_NAME_PARAM=<y|n> setting

With radosgw behavior as it is, I think the behavior added in the patch is pretty reasonable.

Meanwhile, I'll look into patching radosgw to behave better in the absence of a --cluster argument. It seems like an obvious bug, since it works fine in the absence of a --cluster argument if these settings are added to the ceph config file:

> [client]
> rgw_data = /var/lib/ceph/radosgw/$cluster-$id
> keyring = /var/lib/ceph/radosgw/$cluster-$id/keyring
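For readers without the attachment at hand, here is a rough sketch of how an opt-in setting like RADOSGW_WANT_NAME_PARAM could be turned into extra radosgw arguments. This is not the code from the attached patch: the helper name radosgw_name_args, the accepted spellings of "yes", and the host name in the example are all assumptions made for illustration. Only the setting name and the --cluster/--name arguments come from this bug.

```sh
#!/bin/sh
# Hypothetical helper, NOT taken from the attached patch.
# Prints the extra radosgw arguments for a given setting value,
# cluster name and daemon id; prints an empty line when opted out.
radosgw_name_args() {
    _want_name="$1"
    _cluster="$2"
    _daemon_id="$3"
    case "${_want_name}" in
        [Yy]|[Yy][Ee][Ss])
            # Opt-in: identify as client.<daemon_id> in the named cluster.
            printf '%s\n' "--cluster ${_cluster} --name client.${_daemon_id}"
            ;;
        *)
            # Default: keep the existing behavior and add nothing.
            printf '\n'
            ;;
    esac
}

# Example values (assumed); a real init script would take these from
# conf.d and from the service name.
RADOSGW_WANT_NAME_PARAM="y"
radosgw_name_args "${RADOSGW_WANT_NAME_PARAM}" "ceph" "radosgw.gw1.example.com"
```

Keeping the default branch empty is what preserves the current behavior for existing installations that rely on the client.admin key.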
Note that daemon_id="${RC_SVCNAME#ceph-*.}" causes ${daemon_id} and ${RC_SVCNAME} to be equivalent for radosgw.* service instances, so --name client.${daemon_id} is equivalent to --name client.${RC_SVCNAME} for radosgw.
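The prefix stripping can be checked in any POSIX shell. In this small sketch, strip_ceph_prefix is a hypothetical helper that wraps the same expansion with the service name passed as an argument instead of RC_SVCNAME, and the service names are made-up examples:

```sh
#!/bin/sh
# Hypothetical helper: applies the same "${VAR#ceph-*.}" expansion that the
# init script uses for daemon_id, with the service name given as $1.
strip_ceph_prefix() {
    _svcname="$1"
    printf '%s\n' "${_svcname#ceph-*.}"
}

# Starts with "ceph-", so the shortest prefix matching "ceph-*." is removed.
strip_ceph_prefix "ceph-osd.0"

# Does not start with "ceph-", so the pattern cannot match and the name
# is left unchanged.
strip_ceph_prefix "radosgw.gw1.example.com"
```

The `#` form removes only a prefix, and the pattern has to match from the first character, which is why radosgw.* names pass through untouched.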
(In reply to Zac Medico from comment #2)
> Meanwhile, I'll look into patching radosgw to behave better in the absence
> of a --cluster argument. It seems like an obvious bug, since it works fine
> in the absence of a --cluster argument if these settings are added to the
> ceph config file:
>
> > [client]
> > rgw_data = /var/lib/ceph/radosgw/$cluster-$id
> > keyring = /var/lib/ceph/radosgw/$cluster-$id/keyring

The odd behavior in absence of a --cluster argument can be traced back to the commit that added libradosgw:

https://github.com/ceph/ceph/commit/08b89a5f6a0310552fb1f3a02534f4df82b0c221

> /* alternative default for module */
> vector<const char *> def_args;
> def_args.push_back("--debug-rgw=1/5");
> def_args.push_back("--keyring=$rgw_data/keyring");
> def_args.push_back("--log-file=/var/log/radosgw/$cluster-$name.log");

The def_args settings are evaluated very early in the global_pre_init function, apparently before the cluster name from the config file is available for use by expand_meta:

> if (alt_def_args)
>   conf->parse_argv(*alt_def_args); // alternative default args

A possible fix might be to implement lazy expand_meta evaluation.
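To illustrate the ordering problem (this is only a shell analogy, not Ceph's expand_meta code), the sketch below expands the same path template once before the cluster name is known and once after. The expand_template helper and the variable names are hypothetical. The template is the keyring path quoted in this bug.

```sh
#!/bin/sh
# Shell analogy only; NOT how Ceph implements expand_meta.
# Hypothetical helper: substitutes the literal tokens $cluster and $id in a
# template at call time. Values containing "|" or "&" are not handled.
expand_template() {
    _tmpl="$1"
    _cluster="$2"
    _id="$3"
    printf '%s\n' "${_tmpl}" |
        sed -e 's|[$]cluster|'"${_cluster}"'|g' -e 's|[$]id|'"${_id}"'|g'
}

# Single quotes keep $cluster and $id as literal text in the template.
template='/var/lib/ceph/radosgw/$cluster-$id/keyring'

# Eager: expanded before the cluster name has been read, so it is empty.
cluster_name=""
eager_path="$(expand_template "${template}" "${cluster_name}" "admin")"

# Later, the cluster name becomes available (assumed value "foo").
cluster_name="foo"

# Lazy: the template is kept unexpanded and only resolved when needed.
lazy_path="$(expand_template "${template}" "${cluster_name}" "admin")"

printf 'eager: %s\n' "${eager_path}"
printf 'lazy:  %s\n' "${lazy_path}"
```

The eager result has the same "-admin" shape as the keyring path in the error from the original report, which is consistent with the cluster name being empty when the default arguments are parsed.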
Behavior with ceph-13.2.2 is similar to ceph-12.2.x, but without the --cluster argument it behaves like --cluster=ceph, as we can see here:

> # cat /etc/ceph/ceph.conf
> [global]
> cluster = foo
> # radosgw -c /etc/ceph/ceph.conf --no-mon-config --show_config | grep '^rgw_data ='
> rgw_data = /var/lib/ceph/radosgw/ceph-admin
> # radosgw -c /etc/ceph/ceph.conf --no-mon-config --show_config | grep '^keyring ='
> keyring = /var/lib/ceph/radosgw/ceph-admin/keyring
Created attachment 556548
Add confd RADOSGW_WANT_NAME_PARAM=<y|n> setting

Moved command_args adjustment to start_pre instead of global scope.
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=bbbd36a51a7a23e7d7d40e881e4d703a3a4ac25f

commit bbbd36a51a7a23e7d7d40e881e4d703a3a4ac25f
Author:     Patrick McLean <chutzpah@gentoo.org>
AuthorDate: 2018-11-29 23:56:27 +0000
Commit:     Patrick McLean <chutzpah@gentoo.org>
CommitDate: 2018-11-29 23:56:27 +0000

    sys-cluster/ceph: Version bump to 12.2.10

    Also update init script for radosgw args (bug #670066) and bluefs
    mounting (bug #661912)

    Bug: https://bugs.gentoo.org/670066
    Bug: https://bugs.gentoo.org/661912
    Package-Manager: Portage-2.3.52, Repoman-2.3.12
    Signed-off-by: Patrick McLean <chutzpah@gentoo.org>

 sys-cluster/ceph/Manifest             |   1 +
 sys-cluster/ceph/ceph-12.2.10.ebuild  | 308 ++++++++++++++++++++++++++++++++++
 sys-cluster/ceph/files/ceph.confd-r5  |  15 ++
 sys-cluster/ceph/files/ceph.initd-r10 | 108 ++++++++++++
 4 files changed, 432 insertions(+)
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=96df26d385f072bf63b8884b26129f726f3b2e14

commit 96df26d385f072bf63b8884b26129f726f3b2e14
Author:     Patrick McLean <chutzpah@gentoo.org>
AuthorDate: 2018-11-30 01:28:27 +0000
Commit:     Patrick McLean <chutzpah@gentoo.org>
CommitDate: 2018-11-30 01:28:27 +0000

    sys-cluster/ceph: Revision bump to 13.2.2-r3

    Update to new version of confd and init scripts.

    Closes: https://bugs.gentoo.org/670066
    Closes: https://bugs.gentoo.org/661912
    Package-Manager: Portage-2.3.52, Repoman-2.3.12
    Signed-off-by: Patrick McLean <chutzpah@gentoo.org>

 sys-cluster/ceph/{ceph-13.2.2-r2.ebuild => ceph-13.2.2-r3.ebuild} | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)