Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 601540 - start-stop-daemon ignores multiple pids
Summary: start-stop-daemon ignores multiple pids
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Hosted Projects
Classification: Unclassified
Component: OpenRC (show other bugs)
Hardware: All Linux
: Normal major (vote)
Assignee: OpenRC Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 584410
  Show dependency tree
 
Reported: 2016-12-03 21:48 UTC by Robin Johnson
Modified: 2016-12-14 00:01 UTC (History)
2 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2016-12-03 21:48:51 UTC
Manpage start-stop-daemon(8):
  -p, --pidfile pidfile
  ... When stopping we only stop the *pid(s)* listed in the pidfile.

Emphasis mine, for multiple PIDs.

This is crucial for haproxy with the nbprocs option, where multiple PIDs are written to the pidfile (see bug 584410).

static pid_t get_pid(const char *pidfile) only returns a SINGLE pid, instead of all of the PIDs.

I think it should be modified to return RC_PIDLIST instead, and all of the usages of it should operate on all elements of the list.
Comment 1 Mike Gilbert gentoo-dev 2016-12-04 16:33:47 UTC
How are the PIDs delimited? Newlines? Spaces? Both?

Is there any documented format for that?
Comment 2 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2016-12-04 17:11:48 UTC
haproxy writes one pid per line.

If you don't think it's appropriate, please correct the documentation to take out the plural form, and I'll find another solution for haproxy (probably using the systemd wrapper's generic support).
Comment 3 Mike Gilbert gentoo-dev 2016-12-04 18:14:47 UTC
It just seems a little risky to make any assumptions about a file format that is not well-defined beyond a single pid at offset 0.
Comment 4 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2016-12-04 22:39:50 UTC
While there are other programs that have multiple pids, it seems that the --pidfile implementations NEVER supported multiple PIDs. The documentation however has used the plural form for 15+ years.

Debian's original start-stop-daemon.pl circa 1996 only did a single pidfile, but the early docs used the plural form for processes from the pidfile, possibly as a linguistic byproduct (possibly not english-first-language documentation author).

It got rewritten as public-domain C, and later imported to Gentoo in 2001.
https://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-src/rc-scripts/sbin/start-stop-daemon.c?hideattic=0&revision=1.1&view=markup

OpenRC did lose the original attribution at some point.

Proposed resolution:
change the manpage to use the singular form and description that only the first number will be used.
Comment 5 William Hubbs gentoo-dev 2016-12-05 18:33:00 UTC
https://github.com/openrc/openrc/commit/3552f0a

This commit fixes the man page.

The attribution of start-stop-daemon.c  as originally imported into
OpenRC is two-clause bsd (C) Roy Marples.

https://github.com/openrc/openrc/commit/ac21d7
Comment 6 calimeroteknik 2016-12-06 10:00:30 UTC
Shouldn't the implementation be fixed to follow the 15-year old documentation instead?

PID separators would "evidently" the shell's word separators. (ASCII whitespace)

I propose the same behaviour as this (with the default IFS):
for pid in $(<pidfile);do kill "$pid"; done
Comment 7 Mike Gilbert gentoo-dev 2016-12-06 17:15:30 UTC
(In reply to calimeroteknik from comment #6)
> Shouldn't the implementation be fixed to follow the 15-year old
> documentation instead?

How is that preferable to updating the documentation to reflect reality?

I can only reiterate comment 3: it is risky to change the behavior here, since there is no documented standard to following regarding the format of PID files.
Comment 8 calimeroteknik 2016-12-07 15:36:12 UTC
(In reply to Mike Gilbert from comment #7)
> (In reply to calimeroteknik from comment #6)
> > Shouldn't the implementation be fixed to follow the 15-year old
> > documentation instead?
> 
> How is that preferable to updating the documentation to reflect reality?

Indeed, changes in default behaviour must be considered with extreme caution.

We have two options:
Documentation updated, all service file maintainers need to be informed (sometimes the service file is even from upstream), those who need multiple PID support each roll their own implementation for the stop() function.

Implementation updated to match the documentation, no need to inform maintainers (who referred to the docs to write their service files), and apparently no breakage to anticipate (see below), only desirable 'fixage'.

I do believe that this second option is preferrable, considering the following:

> I can only reiterate comment 3: it is risky to change the behavior here,
> since there is no documented standard to following regarding the format of
> PID files.

Precisely, and that means there is no documented standard to determine what the "first" PID in the pidfile is, either.

We can dig out the current de-facto format.
It turns out fscanf(f, "%d", &pid) is used:
https://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-src/rc-scripts/sbin/start-stop-daemon.c?hideattic=0&revision=1.1&view=markup#l599

fscanf will skip any whitespace (which means the PID does not actually have to be at offset 0); it will accept a minus sign; and it will consider it has found its integer when it sees anything else than a digit.

If I'm not mistaken, pidfiles usually contain one PID, so for most services it would change nothing to support several.

For services that write several PIDs, supporting them (say, separated by whitespace) would only cause 'fixage' if anything for e.g. haproxy.
If some weird service tries to use commas or whatever, it will stay broken the same as before, since fscanf() will stop finding integers as soon as it encounters garbage.

Is there any case of possible breakage this could logically introduce?
Comment 9 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2016-12-07 19:46:50 UTC
As an example of something using multiple PIDS and could be problematic, see:
https://issues.apache.org/jira/browse/FLINK-2302

The format of each line is:
"PID description\n"

The description example given is 'tm1', 'tm2'.

If the description is just a number, say '1', we must be VERY VERY careful not to parse that part (sending SIGTERM to PID 1 would be bad...).

If there is an app that writes an integer other than a PID to the second line, that could also cause problems.

Furthermore, I think documentation is the correct answer, because Debian's start-stop-daemon, that the Gentoo one is based on, never actually implemented multiple PIDs in a single pidfile, AND fixed their documentation to use the singular format.

If a single app wants to support multiple PIDs in a pidfile, they can implement it themselves like this (this wouldn't work for the description case above):
t=$(mktemp)
for pid in $(cat $PIDFILE) ; do
  echo $pid > $t
  start-stop-daemon --pidfile $t ...
done
rm -f $t

Similarly, if there
Comment 10 calimeroteknik 2016-12-08 09:59:51 UTC
(In reply to Robin Johnson from comment #9)
> As an example of something using multiple PIDS and could be problematic, see:
> https://issues.apache.org/jira/browse/FLINK-2302
> 
> The format of each line is:
> "PID description\n"
> 
> If the description is just a number, say '1', we must be VERY VERY careful
> not to parse that part (sending SIGTERM to PID 1 would be bad...).

That was an excellent counter-example, which is sufficient in itself.
This could give inspiration for a standard that I think should be made; more on this later in this post.

> Furthermore, I think documentation is the correct answer, because Debian's
> start-stop-daemon, that the Gentoo one is based on, never actually
> implemented multiple PIDs in a single pidfile, AND fixed their documentation
> to use the singular format.

Agreed. If we want to reach a standard, everybody needs to do the same thing.

However, I find the current resolution of this bug unsatisfactory.
Just fixing the docs is something nobody will notice, and trying to tell maintainers to check their services is tedious.

Therefore:

## Proposal 1/2 ##

start-stop-deamon must not change its behaviour at all (what it *does*).

However, it should *print a warning* if the pidfile contains anything else than a PID and ASCII whitespace, instead of silently ignoring any garbage (as it currently does).

##

That way, users and maintainers will notice the warning and fix the services.

I believe this alone is entirely satisfactory, as opposed to .

> 
> If a single app wants to support multiple PIDs in a pidfile, they can
> implement it themselves like this (this wouldn't work for the description
> case above): [redacted for concision]
> 
> Similarly, if there
[assuming this was what you meant to reference:]
> If there is an app that writes an integer other than a PID to the second
> line, that could also cause problems.



## Proposal 2/2: ##

Document a standard for pidfiles with multiple PIDs, and implement it in start-stop-daemon as a new option.

In order for most services to not have to roll their own implementation, and since pidfiles with one PID per line are common, the following sounds sane:

One PID at the beginning of each line (after possible ASCII horizontal whitespace), and an optional description after it.
The description is separated from the PID by ASCII horizontal whitespace.

Print an error and exit with an error code if the pidfile is not in that format.

##

pidfiles not following that format will require manual implementation, but that should hopefully weed out the vast majority of multi-PID pidfiles.

I want to put emphasis on proposal 1/2. (warn on fishy pidfile formats)
Comment 11 Mike Gilbert gentoo-dev 2016-12-08 15:43:15 UTC
(In reply to calimeroteknik from comment #10)

I think you are turning a molehill into a mountain here.

We already know that PID files containing multiple PIDs are quite rare, and altering the OpenRC code to check for it seems excessive to me.
Comment 12 Mike Gilbert gentoo-dev 2016-12-08 15:44:31 UTC
I think the most this change warrants is an entry in any release notes that are published.
Comment 13 calimeroteknik 2016-12-08 17:13:54 UTC
Sorry, I tend to overdo things.

Printing a warning when the pid files contains more than just the expected PID sounds good, though. Can we focus on that?
Comment 14 Mike Gilbert gentoo-dev 2016-12-08 17:22:58 UTC
(In reply to calimeroteknik from comment #13)

There are cases where it is appropriate to grab the first PID only, and discard the rest.

For example, postgresql writes the PID if its main daemon first, followed by a bunch of other postgres-related information.

There is no need to warn about this.
Comment 15 Austin English (RETIRED) gentoo-dev 2016-12-14 00:01:10 UTC
(In reply to Mike Gilbert from comment #14)
> (In reply to calimeroteknik from comment #13)
> 
> There are cases where it is appropriate to grab the first PID only, and
> discard the rest.
> 
> For example, postgresql writes the PID if its main daemon first, followed by
> a bunch of other postgres-related information.
> 
> There is no need to warn about this.

That's an argument for providing a way to disable the warning, for valid usecases. But a warning presumably would've helped Robin identify his problem sooner.