Bug 143951 - baselayout-1.12-r3 rc_start_daemon gets confused by daemons that change names
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] baselayout
Hardware: All Other
Importance: High normal
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-08-14 16:30 UTC by Dustin J. Mitchell
Modified: 2006-09-06 07:33 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
patch to /lib/rcscripts/sh/rc-daemon.sh to use ${name} as the manpage says it will (rc-daemon.patch, 3.27 KB, patch)
2006-08-15 12:35 UTC, Dustin J. Mitchell
smaller patch (x, 1.01 KB, patch)
2006-08-15 13:38 UTC, Roy Marples (RETIRED)

Description Dustin J. Mitchell 2006-08-14 16:30:19 UTC
I have an (in-house) daemon where my initscript does the following:

- uses start-stop-daemon to start a shell script which
- performs some initialization and
- exec's a python script which
- performs some initialization and
- changes its name from 'python /path/to/foo.py' to 'foo'

Thus the same PID will be '/bin/bash /path/to/wrapper', then 'python /path/to/foo.py', then 'foo' at fairly unpredictable times.  This was fine with the previous version of baselayout (in fact, we didn't do the last name change; it was added to try to fix this), but it produces a race condition with this version.

Specifically, if I use "--name foo" in the s-s-d invocation, then *if* the daemon gets to its name-changing within the one second allowed by rc_start_daemon, things will work.  Otherwise, s-s-d kills the nascent daemon.

It seems that start-stop-daemon should be configurable to look *only* at the PID of the daemon it's starting, for cases just like this one.  Another option might be to make the wait-for-name-stabilization timeout configurable (a la RC_RETRY_TIMEOUT/RC_RETRY_COUNT).

I'd appreciate either an idea on how to fix this or a suggestion of a future-portable way for me to fix my app.
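
The PID-only check suggested above could look roughly like the following. This is only a sketch under stated assumptions: `pid_survives` is a hypothetical helper that does not exist in baselayout, and its count and delay arguments merely stand in for the kind of configurable timeout mentioned above.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of baselayout: succeed only if the given
# PID stays alive across every check, ignoring the process name entirely.
#   $1 = pid to watch
#   $2 = number of checks to perform
#   $3 = delay between checks, in seconds
pid_survives() {
    pid=$1 count=$2 delay=$3
    while [ "$count" -gt 0 ]; do
        # kill -0 delivers no signal; it only tests that the PID exists
        kill -0 "$pid" 2>/dev/null || return 1
        sleep "$delay"
        count=$((count - 1))
    done
    return 0
}
```

Because the name is never consulted, a process that goes from '/bin/bash /path/to/wrapper' to 'foo' under the same PID would pass this check at any point in that sequence.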
Comment 1 Roy Marples (RETIRED) gentoo-dev 2006-08-15 03:06:37 UTC
(In reply to comment #0)
> - uses start-stop-daemon to start a shell script which
> - performs some initialization and
> - exec's a python script which
> - performs some initialization and
> - changes its name from 'python /path/to/foo.py' to 'foo'

start-stop-daemon starts and stops daemons, not shell scripts. If you have a shell script that does initialisation and then calls another script (i.e. python), why even use start-stop-daemon?

> 
> thus, the same PID will be '/bin/bash /path/to/wrapper', then 'python
> /path/to/foo', then 'foo' at fairly unpredictable times.  This was fine with
> the previous version of baselayout (in fact, we didn't do the last name-change
> -- it was added to try to fix this), but produces a race condition with this
> version.

s-s-d now monitors daemons started and stopped so we know if a daemon exited when it shouldn't have done.
Comment 2 Dustin J. Mitchell 2006-08-15 12:34:27 UTC
s-s-d has the advantage of tight integration with the Gentoo initscripts, which is helpful.  I took the time to rewrite the Python scripts as daemons (using e.g., http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/278731).

The scripts are invoked as e.g., '/lib/daemons/bin/foobar', but run with the following in /proc/*/cmdline:
 python /lib/daemons/bin/foobar --daemonize=TRUE --daemon-log-file=/var/log/foobar_daemon.log --pid-file=/var/run/foobar.pid
/proc/*/stat contains 'python2.4'. 

It would seem from the manual page that s-s-d will use --name as the argument to 'pidof':
       -n|--name process-name
              Check for processes with the name process-name (according to /proc/pid/stat).
yet in /lib/rcscripts/sh/rc-daemon.sh, rc_start_daemon calls is_daemon_running with ${cmd}, not ${name}, meaning that it runs 'pidof /lib/daemons/bin/foobar', which finds no matching process.  I consider this a bug, though admittedly not the one I originally reported.  I've attached a patch to fix it.

With this patch in place, and with a change to my daemon such that it changes its cmdline and stat to 'foobar', I can effectively use

  start-stop-daemon ... --name foobar --startas /lib/daemons/bin/foobar

to control the daemon.  Without this patch, I must use:

  start-stop-daemon --startas python -- /lib/daemons/bin/foobar

which is obviously less than ideal if there are multiple python-based daemons, since they would all be matched as 'python'.

I'm sorry this took me a while to get around to the central problem -- it's rather new territory to me.
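
The behaviour argued for in this comment, preferring ${name} over ${cmd} when deciding what to look for, can be sketched as below. This is a hypothetical illustration and not the code in rc-daemon.sh or in either attached patch; `match_target` is an invented name, and the argument handling is simplified to the options that appear in this report.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real rc-daemon.sh: given start-stop-daemon
# style arguments, print the string a "did it start?" check should match.
# A --name value wins; otherwise fall back to the --exec/--startas command.
match_target() {
    name= cmd=
    while [ $# -gt 0 ]; do
        case $1 in
            --) break ;;    # everything after -- belongs to the daemon
            -n|--name)
                [ $# -ge 2 ] || break
                name=$2; shift ;;
            -x|--exec|-a|--startas)
                [ $# -ge 2 ] || break
                cmd=$2; shift ;;
        esac
        shift
    done
    printf '%s\n' "${name:-$cmd}"
}
```

For the two invocations quoted above, this sketch would pick 'foobar' for the first and 'python' for the second, which is why the second form cannot tell several python-based daemons apart.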
Comment 3 Dustin J. Mitchell 2006-08-15 12:35:10 UTC
Created attachment 94344
patch to /lib/rcscripts/sh/rc-daemon.sh to use ${name} as the manpage says it will
Comment 4 Roy Marples (RETIRED) gentoo-dev 2006-08-15 13:38:38 UTC
Created attachment 94347
smaller patch

I cannot fault your argument, and thanks for the patch :)

This patch is smaller and less intrusive but should achieve the same thing. Please test that it works for you.
Comment 5 Dustin J. Mitchell 2006-08-15 14:24:09 UTC
Looks great -- thanks!
Comment 6 Roy Marples (RETIRED) gentoo-dev 2006-08-15 14:45:58 UTC
Fixed in baselayout-1.12.4-r6 :)
Comment 7 Erik Wasser 2006-09-06 04:01:08 UTC
Sorry, I don't understand the solution to this problem. Since the upgrade, s-s-d has failed to start my perl daemon. I use the following line:

start-stop-daemon --start --quiet --pidfile /var/run/xrd.pid \
--exec /usr/sbin/xrd.pl -- --daemonize

The '/usr/sbin/xrd.pl' is a perl daemon which changes its name to 'xrd' and forks into the background.

'/etc/init.d/iqadm-xrd' fails to start the daemon, but entering the above command directly works. Two things are worth mentioning:

a) Why the difference between the command in the startup script and the command on the command line?

b) Why is the start of the service failing? A better explanation would be useful here: 'Failed to start xrd' is a little bit (too) short. B-)
Comment 8 Roy Marples (RETIRED) gentoo-dev 2006-09-06 07:33:01 UTC
(In reply to comment #7)
> Sorry I don't get the solution for this problem here. Since the upgrade s-s-d
> failed to start my perl daemon. I use the following line:
> 
> start-stop-daemon --start --quiet --pidfile /var/run/xrd.pid \
> --exec /usr/sbin/xrd.pl -- --daemonize
> 
> The '/usr/sbin/xrd.pl' is a perl daemon which changes its name to 'xrd' and
> forks into the background.

So use the --name directive

> '/etc/init.d/iqadm-xrd' fails to start the daemon but entering the above
> command directly will work. Two things are worth mentioning:
> 
> a) Why the difference between the command in the startup script and the command
> on the command line?
> 
> b) Why is the start of the service failing? A better explanation would be
> useful here: 'Failed to start xrd' is a little bit (too) short. B-)

Init scripts have a bash wrapper around s-s-d that checks the daemon in question really did start and did not bail out because of a configuration issue. The downside is that you now need to use the --name directive for some perl/python/bash scripts.
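
To see what a name-based check would be comparing against, one can read the name recorded in /proc/pid/stat, the file the manual page excerpt in comment 2 refers to. The helpers below are a hypothetical sketch for inspection only; `stat_name` and `stat_name_is` are invented names and are not part of baselayout or start-stop-daemon.

```shell
#!/bin/sh
# Hypothetical helper: print the process name from one line of
# /proc/<pid>/stat supplied on standard input. The name is the second
# field and is wrapped in parentheses.
stat_name() {
    sed 's/^[0-9][0-9]* (\(.*\)) .*$/\1/'
}

# Hypothetical check: succeed if the stat line on standard input carries
# exactly the expected name given as $1.
stat_name_is() {
    [ "$(stat_name)" = "$1" ]
}
```

Assuming the pidfile from comment 7 exists, something like `stat_name < "/proc/$(cat /var/run/xrd.pid)/stat"` would show the name the running daemon currently reports, which is the value to pass to --name.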