Bug 670066 - sys-cluster/ceph-12.2.8: init script radosgw command_args insufficient
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Patrick McLean
URL:
Whiteboard:
Keywords: PATCH
Depends on:
Blocks:
 
Reported: 2018-11-01 05:23 UTC by Zac Medico
Modified: 2018-11-30 01:28 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Add confd RADOSGW_WANT_NAME_PARAM=<y|n> setting (0001-sys-cluster-ceph-Add-confd-RADOSGW_WANT_NAME_PARAM-y.patch,2.71 KB, patch)
2018-11-02 04:10 UTC, Zac Medico
Add confd RADOSGW_WANT_NAME_PARAM=<y|n> setting (0001-sys-cluster-ceph-Add-confd-RADOSGW_WANT_NAME_PARAM-y.patch,2.53 KB, patch)
2018-11-28 02:03 UTC, Zac Medico

Description Zac Medico gentoo-dev 2018-11-01 05:23:39 UTC
I have an /etc/init.d/radosgw.$(hostname -f) -> ceph symlink. It appears that the radosgw daemon does not correctly identify itself without the --cluster and --name arguments, and correct identification is necessary for it to appear as a uniquely distinguishable rgw instance in the `ceph status` output.

The issue is easiest to see when cephx is enabled and ceph.conf contains no keyring settings at all (neither explicit ones nor the default keyring locations). In this case the log shows an error like this:

> auth: unable to find a keyring on /var/lib/ceph/radosgw/-admin/keyring: (2) No such file or directory

With command_args+=" --cluster $cluster --name client.radosgw.$(hostname -f)" added to the init script, radosgw chooses the "/var/lib/ceph/radosgw/$cluster-radosgw.$(hostname -f)/keyring" location for the keyring, which confirms that it is correctly identifying itself.
Comment 1 Zac Medico gentoo-dev 2018-11-02 02:38:25 UTC
The key difference is that radosgw does not set its "name" parameter from the --id argument given by the init script:

> $ radosgw -i foo -c /dev/null --show_config | grep '^name ='
> name = client.admin
> $ ceph-mgr -i foo -c /dev/null --show_config | grep '^name ='
> name = mgr.foo
> $ ceph-mon -i foo -c /dev/null --show_config | grep '^name ='
> name = mon.foo
> $ ceph-osd -i foo -c /dev/null --show_config | grep '^name ='
> name = osd.foo

Changing the behavior of the radosgw program itself is probably not desirable, since it would affect the way that radosgw identifies itself in existing clusters (radosgw would no longer use the client.admin key by default). Modifying the init script to pass the --name parameter unconditionally would cause a similar problem for existing users of the init script.

So, it seems that the best solution is to provide a way for the user to indicate that the init script should pass a --name parameter to radosgw.
Comment 2 Zac Medico gentoo-dev 2018-11-02 04:10:03 UTC
Created attachment 553844
Add confd RADOSGW_WANT_NAME_PARAM=<y|n> setting

With radosgw behavior as it is, I think the behavior added in the patch is pretty reasonable.

Meanwhile, I'll look into patching radosgw to behave better in the absence of a --cluster argument. It seems like an obvious bug, since it works fine in the absence of a --cluster argument if these settings are added to the ceph config file:

> [client]
> rgw_data = /var/lib/ceph/radosgw/$cluster-$id
> keyring = /var/lib/ceph/radosgw/$cluster-$id/keyring
Comment 3 Zac Medico gentoo-dev 2018-11-02 04:17:20 UTC
Note that daemon_id="${RC_SVCNAME#ceph-*.}" causes ${daemon_id} and ${RC_SVCNAME} to be equivalent for radosgw.* service instances, so --name client.${daemon_id} is equivalent to --name client.${RC_SVCNAME} for radosgw.
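
As a quick check of that expansion, here is a minimal sketch that applies the same ${...#ceph-*.} prefix removal to a few service names. The helper name daemon_id_for and the example service names are hypothetical and chosen only for illustration.

```shell
# Hypothetical helper: apply the same prefix removal as
# daemon_id="${RC_SVCNAME#ceph-*.}" to the service name given as $1.
daemon_id_for() {
    svc="$1"
    # "#" removes the shortest leading match of the pattern "ceph-*.";
    # a name that does not start with "ceph-" is left unchanged.
    printf '%s\n' "${svc#ceph-*.}"
}

daemon_id_for "ceph-mon.foo"              # prints: foo
daemon_id_for "ceph-osd.0"                # prints: 0
daemon_id_for "radosgw.gw1.example.com"   # prints: radosgw.gw1.example.com
```

Because a radosgw.* name never matches the ceph-*. prefix, the whole service name survives as the daemon id.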
Comment 4 Zac Medico gentoo-dev 2018-11-05 09:40:22 UTC
(In reply to Zac Medico from comment #2)
> Meanwhile, I'll look into patching radosgw to behave better in the absence
> of a --cluster argument. It seems like an obvious bug, since it works fine
> in the absence of a --cluster argument if these settings are added to the
> ceph config file:
> 
> > [client]
> > rgw_data = /var/lib/ceph/radosgw/$cluster-$id
> > keyring = /var/lib/ceph/radosgw/$cluster-$id/keyring

The odd behavior in absence of a --cluster argument can be traced back to the commit that added libradosgw: https://github.com/ceph/ceph/commit/08b89a5f6a0310552fb1f3a02534f4df82b0c221

>  /* alternative default for module */
>  vector<const char *> def_args;
>  def_args.push_back("--debug-rgw=1/5");
>  def_args.push_back("--keyring=$rgw_data/keyring");
>  def_args.push_back("--log-file=/var/log/radosgw/$cluster-$name.log");

The def_args settings are evaluated very early in the global_pre_init function, apparently before the cluster name from the config file is available for use by expand_meta:

>  if (alt_def_args)
>    conf->parse_argv(*alt_def_args);  // alternative default args

A possible fix might be to implement lazy expand_meta evaluation.
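
To illustrate why the evaluation order matters, here is a toy model in shell (not Ceph's C++ code). It expands the same template once before the cluster name is known and once afterwards. The expand_meta function below is a hypothetical stand-in that only handles $cluster and $id, the template collapses rgw_data and keyring into one string for brevity, and the cluster name foo is an example value.

```shell
# Toy model only: a hypothetical shell stand-in for metavariable
# expansion, handling just $cluster and $id in the template given as $1.
expand_meta() {
    printf '%s\n' "$1" |
        sed -e 's|\$cluster|'"${cluster}"'|g' -e 's|\$id|'"${id}"'|g'
}

# Single quotes keep $cluster and $id as literal placeholders.
template='/var/lib/ceph/radosgw/$cluster-$id/keyring'

# Eager: expanded while the cluster name is still empty.
cluster=""
id="admin"
eager=$(expand_meta "$template")

# Lazy: expanded only after the config file has supplied the cluster name.
cluster="foo"
lazy=$(expand_meta "$template")

echo "eager: $eager"   # eager: /var/lib/ceph/radosgw/-admin/keyring
echo "lazy:  $lazy"    # lazy:  /var/lib/ceph/radosgw/foo-admin/keyring
```

The eager result has the same shape as the keyring path in the error message from the original description, which is consistent with the expansion happening before the cluster name is available.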
Comment 5 Zac Medico gentoo-dev 2018-11-05 21:00:22 UTC
Behavior with ceph-13.2.2 is similar to ceph-12.2.x, except that without the --cluster argument it behaves like --cluster=ceph, even when ceph.conf sets a different cluster name, as we can see here:

> # cat /etc/ceph/ceph.conf
> [global]
> cluster = foo
> # radosgw -c /etc/ceph/ceph.conf --no-mon-config --show_config | grep '^rgw_data ='
> rgw_data = /var/lib/ceph/radosgw/ceph-admin
> # radosgw -c /etc/ceph/ceph.conf --no-mon-config --show_config | grep '^keyring ='
> keyring = /var/lib/ceph/radosgw/ceph-admin/keyring
Comment 6 Zac Medico gentoo-dev 2018-11-28 02:03:39 UTC
Created attachment 556548
Add confd RADOSGW_WANT_NAME_PARAM=<y|n> setting

Moved command_args adjustment to start_pre instead of global scope.
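
For readers without the attachment at hand, the general shape of such an opt-in can be sketched as follows. This is an illustration under stated assumptions, not the attached patch: the helper names want_name_param and radosgw_adjust_args are hypothetical, the fallback cluster name ceph is an assumption, and only RADOSGW_WANT_NAME_PARAM, RC_SVCNAME, daemon_id, command_args, and start_pre come from this bug.

```shell
# Hypothetical sketch, not the attached patch.

# Succeed only when the conf.d setting is "y" (unset is treated as "n").
want_name_param() {
    case "${RADOSGW_WANT_NAME_PARAM:-n}" in
        y) return 0 ;;
        *) return 1 ;;
    esac
}

# Append --cluster and --name for radosgw.* instances that opted in.
# The "ceph" fallback for an unset $cluster is an assumption.
radosgw_adjust_args() {
    daemon_id="${RC_SVCNAME#ceph-*.}"
    case "${RC_SVCNAME}" in
        radosgw.*)
            if want_name_param; then
                command_args="${command_args} --cluster ${cluster:-ceph} --name client.${daemon_id}"
            fi
            ;;
    esac
}

# Adjust the arguments when the service starts rather than at global scope.
start_pre() {
    radosgw_adjust_args
}
```

With RADOSGW_WANT_NAME_PARAM=y and a service named radosgw.gw1.example.com, this sketch appends --cluster ceph --name client.radosgw.gw1.example.com to command_args. With the setting at n or unset, command_args is left untouched, so existing setups keep their current behavior.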
Comment 7 Larry the Git Cow gentoo-dev 2018-11-29 23:57:40 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=bbbd36a51a7a23e7d7d40e881e4d703a3a4ac25f

commit bbbd36a51a7a23e7d7d40e881e4d703a3a4ac25f
Author:     Patrick McLean <chutzpah@gentoo.org>
AuthorDate: 2018-11-29 23:56:27 +0000
Commit:     Patrick McLean <chutzpah@gentoo.org>
CommitDate: 2018-11-29 23:56:27 +0000

    sys-cluster/ceph: Version bump to 12.2.10
    
    Also update init script for radosgw args (bug #670066) and bluefs
    mounting (bug #661912)
    
    Bug: https://bugs.gentoo.org/670066
    Bug: https://bugs.gentoo.org/661912
    Package-Manager: Portage-2.3.52, Repoman-2.3.12
    Signed-off-by: Patrick McLean <chutzpah@gentoo.org>

 sys-cluster/ceph/Manifest             |   1 +
 sys-cluster/ceph/ceph-12.2.10.ebuild  | 308 ++++++++++++++++++++++++++++++++++
 sys-cluster/ceph/files/ceph.confd-r5  |  15 ++
 sys-cluster/ceph/files/ceph.initd-r10 | 108 ++++++++++++
 4 files changed, 432 insertions(+)
Comment 8 Larry the Git Cow gentoo-dev 2018-11-30 01:28:46 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=96df26d385f072bf63b8884b26129f726f3b2e14

commit 96df26d385f072bf63b8884b26129f726f3b2e14
Author:     Patrick McLean <chutzpah@gentoo.org>
AuthorDate: 2018-11-30 01:28:27 +0000
Commit:     Patrick McLean <chutzpah@gentoo.org>
CommitDate: 2018-11-30 01:28:27 +0000

    sys-cluster/ceph: Revision bump to 13.2.2-r3
    
    Update to new version of confd and init scripts.
    
    Closes: https://bugs.gentoo.org/670066
    Closes: https://bugs.gentoo.org/661912
    Package-Manager: Portage-2.3.52, Repoman-2.3.12
    Signed-off-by: Patrick McLean <chutzpah@gentoo.org>

 sys-cluster/ceph/{ceph-13.2.2-r2.ebuild => ceph-13.2.2-r3.ebuild} | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)