Bug 593794 - net-misc/tor: openrc script does not do a gracefulstop correctly
Summary: net-misc/tor: openrc script does not do a gracefulstop correctly
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Anthony Basile
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-14 18:18 UTC by Toralf Förster
Modified: 2016-09-24 01:05 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
testcase.tar.bz2 (779 bytes, application/x-tar), 2016-09-16 17:55 UTC, William Hubbs
tor (1.28 KB, text/plain), 2016-09-17 21:30 UTC, William Hubbs
old.log (61.34 KB, text/plain), 2016-09-18 16:44 UTC, Toralf Förster
old.log (61.34 KB, text/plain), 2016-09-18 18:12 UTC, Toralf Förster

Description Toralf Förster gentoo-dev 2016-09-14 18:18:24 UTC
"/etc/init.d/tor gracefulstop" always sets the status to "crashed"

From a discussion with blueness on IRC:

[19:42] <toralf> blueness: the return code is 0 - I checked it, but every time I stopped my tor relay with "gracefulstop" I had to "zap stop" it afterwards b/c the status is "crashed"
[19:47] <sklv> nvm got it
[19:48] <blueness> toralf: let me test
[19:50] <maolang> emerging 198 of 198, then i wil re-run @world update
[19:50] <blueness> toralf: something is messed up with start-stop-daemon
[19:51] <blueness> its not properly doing the signaling
[19:54] <toralf> yep, funny thing, "graceful" works 
[19:58] <blueness> toralf: i think its a bug in openrc
[20:00] <toralf> blueness: /me too, but wasn't sure
[20:00] <blueness> toralf: SIGINT works manually
[20:01] <blueness> on tor
[20:01] <sklv> is there something like corenet_tcp_sendrecv_all_ports but not for all ports
[20:02] <toralf> blueness: yes - tested it too
[20:03] <blueness> toralf: its a bug in openrc
Comment 1 dwfreed 2016-09-16 03:28:33 UTC
OpenRC is behaving as well as it can here.  You are using a function that OpenRC has no way of knowing stops the service; that function asks start-stop-daemon to try to stop the daemon.  For tor, this stop request frequently does not succeed in the time allowed by the service script, but it will always result in the daemon stopping eventually.  Because start-stop-daemon does not see the daemon stop in the time allowed, it gives up and assumes the daemon is still running.  When the daemon eventually stops, the next time OpenRC checks the status of the service, it sees that the daemon is gone and marks the service crashed.  If the daemon does stop before the timeout, start-stop-daemon cleans up its tracking information, but gracefulstop does not mark the service as stopped.  The next time OpenRC checks the status of the service, it has no daemon to check, so it relies only on its internal status, which is started.
Comment 2 Anthony Basile gentoo-dev 2016-09-16 10:34:59 UTC
(In reply to dwfreed from comment #1)
> OpenRC is behaving as well as it can here.  You are using a function that
> OpenRC has no way of knowing stops the service; that function asks
> start-stop-daemon to try to stop the daemon.  For tor, this stop request
> frequently does not succeed in the time allowed by the service script, but
> it will always result in the daemon stopping eventually.  Because
> start-stop-daemon does not see the daemon stop in the time allowed, it gives
> up and assumes the daemon is still running.  When the daemon eventually
> stops, the next time OpenRC checks the status of the service, it sees that
> the daemon is gone and marks the service crashed.  If the daemon does stop
> before the timeout, start-stop-daemon cleans up its tracking information,
> but gracefulstop does not mark the service as stopped.  The next time OpenRC
> checks the status of the service, it has no daemon to check, so it relies
> only on its internal status, which is started.

The line looks as follows:

start-stop-daemon -P --stop --signal INT -R 60 ...

yet start-stop-daemon does not wait 60s.  Instead, it immediately returns a failure.  This is not the expected behavior.  Before I jump into the C code, can someone from the openrc team look at this and confirm my understanding of what should happen: with -R 60, start-stop-daemon should wait up to 60s for tor to exit.
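
For reference, the behavior expected from -R 60 can be approximated in plain shell. This is only a minimal sketch and not how start-stop-daemon is implemented: stop_with_timeout is a hypothetical helper that sends one signal and then polls once per second until the process is gone or the timeout expires.

```shell
#!/bin/sh
# Sketch only: approximates "send a signal, then wait up to N seconds".
# stop_with_timeout is a hypothetical helper, not part of OpenRC.
stop_with_timeout() {
    pid=$1
    sig=$2
    timeout=$3

    # Sending the signal fails if the process does not exist.
    kill -s "$sig" "$pid" 2>/dev/null || return 1

    waited=0
    # kill -0 sends no signal; it only tests whether the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        # Timeout reached: give up and report the process as still running.
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done

    # The process exited within the allowed time.
    return 0
}
```

Under these assumptions, the tor case would correspond to something like stop_with_timeout "$(cat /var/run/tor/tor.pid)" INT 60, returning non-zero only after the full 60 seconds have passed.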
Comment 3 dwfreed 2016-09-16 13:49:20 UTC
start-stop-daemon -P --stop --signal INT -R 60 works just fine here.  Please run the initscript directly with -v (ie, /etc/init.d/tor -v gracefulstop *not* service -v tor gracefulstop), with stdout directed to a file (if it's working normally, it will spam *a lot*), and attach the resulting file to this bug.
Comment 4 dwfreed 2016-09-16 13:59:06 UTC
Follow-on: include stderr as well, please.
Comment 5 William Hubbs gentoo-dev 2016-09-16 17:55:53 UTC
Created attachment 446010 [details]
testcase.tar.bz2

This is a test case based on Doug's work which works here.

1. Put the service script in /etc/init.d.
2. Compile the daemon and put the binary in /tmp.
3. Start, then stop the service.

You should see that it starts and stops successfully.
Comment 6 Anthony Basile gentoo-dev 2016-09-17 13:05:10 UTC
This is a reproducible bug with tor's init script.  If openrc doesn't feel it is their problem then fine, but don't close a verifiable and reproducible bug.
Comment 7 William Hubbs gentoo-dev 2016-09-17 21:30:42 UTC
Created attachment 446220 [details]
tor

This is an updated tor service script.

I took several ideas from the systemd service and from the tor man page.

You don't need to hard-code the configuration file name. The default
within tor is /etc/tor/torrc-defaults, /etc/tor/torrc or $HOME/.torrc,
in that order. Check the man page and let me know if you don't like
the choice I made to not mess with the config files.

stop is now graceful (again taken from the systemd service), and
there is no non-graceful stop. This also means that restart is graceful.
If you want a separate command to do a non-graceful stop I can add that,
but I think it should not be the default.

I saw several places in the script where stderr/stdout were redirected
to /dev/null. When I looked at the tor man page, I found the --hush
command line option. It seems to be a better way to handle this, so
I used it.

Let me know your thoughts.
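
For readers without access to the attachment, the approach described in this comment might look roughly like the following. This is a hedged sketch and not the attached script: the pidfile path is taken from the daemon file shown later in this bug, the 60-second default is an assumption based on the -R 60 used earlier, and stopsig and retry are the openrc-run variables that the default stop() is assumed to honor.

```shell
#!/sbin/openrc-run
# Sketch only: NOT the attached script. The paths and the timeout default
# below are assumptions made for illustration.

command=/usr/bin/tor
pidfile=/var/run/tor/tor.pid

# No -f option: tor falls back to its default configuration files.
# --hush replaces the redirections of stdout/stderr to /dev/null.
command_args="--hush --runasdaemon 1 --PidFile ${pidfile}"

# Make the default stop() graceful: send SIGINT instead of SIGTERM and
# wait up to GRACEFUL_TIMEOUT seconds (60 if unset) for tor to exit.
stopsig=INT
retry=${GRACEFUL_TIMEOUT:-60}
```

Because stop() itself is graceful in this layout, restart is graceful too, and OpenRC records the stopped state without needing any extra command.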
Comment 8 Toralf Förster gentoo-dev 2016-09-18 14:26:37 UTC
(In reply to William Hubbs from comment #7)
works fine here at my Tor exit AFAICT - tested "start", "stop" and "restart".
Comment 9 Anthony Basile gentoo-dev 2016-09-18 15:35:58 UTC
(In reply to Toralf Förster from comment #8)
> (In reply to William Hubbs from comment #7)
> works fine here at my Tor exit AFAICT - tested "start", "stop" and "restart".

Let me work on figuring out what the initial problem was before we go with this solution.  So don't close this bug just yet.
Comment 10 Toralf Förster gentoo-dev 2016-09-18 16:44:29 UTC
Created attachment 446448 [details]
old.log

/etc/init.d/tor -v gracefulstop 2>&1 | tee /tmp/old.log
Comment 11 Toralf Förster gentoo-dev 2016-09-18 16:59:12 UTC
FWIW, while debugging this issue, /etc/init.d/tor.old is the older init.d script that caused this bug report; tor.new on my system contains the new one (attached in comment #7).

mr-fox init.d # cat /run/openrc/daemons/tor.old/001
exec=/usr/bin/tor
argv_0=/usr/bin/tor
argv_1=-f
argv_2=/etc/tor/torrc
argv_3=--runasdaemon
argv_4=1
argv_5=--PidFile
argv_6=/var/run/tor/tor.pid
pidfile=/var/run/tor/tor.pid

mr-fox init.d # ls -l /run/openrc/daemons/tor.old/001
-rw-r--r-- 1 root root 174 Sep 18 18:56 /run/openrc/daemons/tor.old/001
Comment 12 Toralf Förster gentoo-dev 2016-09-18 17:10:53 UTC
(In reply to Toralf Förster from comment #11)
Just for completeness, I changed gracefulstop() slightly
from:
    eend "done"
to:
    eend "done rc=$rc"
Comment 13 dwfreed 2016-09-18 17:36:02 UTC
FOUND IT!  The current tor initscript provides start-stop-daemon with --exec and a partial argv in both stop and gracefulstop, in addition to a pidfile.  After start-stop-daemon has successfully done what you asked, it calls rc_service_daemon_set with this information, whose job is to delete the daemon file.  rc_service_daemon_set uses all of the information given to try to match the daemon file to delete, but nothing matches, because the stop-time information is not a proper subset of the arguments given when starting the daemon.

(In this sense, proper subset has its usual meaning, except for argv.  For argv, you can remove as many of the rightmost parameters as you wish; e.g., you could provide only the -f for argv_1, but you can't omit argv_1 through argv_4 and specify only argv_5 and argv_6, as you're doing now.  If --exec is provided, that automatically becomes argv_0.)

Because the daemon file is not deleted, OpenRC checks for the daemon whenever it next makes a status check of the service, sees that it's gone, and marks the service crashed.  There are two ways to fix this: give start-stop-daemon the same argv for --stop as you do for --start, or better yet, leave out --exec and argv entirely and just use --pidfile, which will work correctly.
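
The argv rule described above amounts to a leftmost-prefix check. The following is a minimal sketch of that rule only, not OpenRC's C code; argv_is_prefix is a hypothetical helper whose first argument is the number of stop-time arguments, followed by those arguments and then the start-time argv.

```shell
#!/bin/sh
# Sketch only: models the matching rule as described in this comment.
# Usage: argv_is_prefix N stop_arg_1 ... stop_arg_N start_arg_1 ...
argv_is_prefix() {
    n=$1
    shift
    # "$@" now holds the N stop-time args followed by the start-time args.
    # A stop argv longer than the start argv can never match.
    [ "$n" -le $(( $# - n )) ] || return 1

    i=1
    while [ "$i" -le "$n" ]; do
        # Compare stop arg i with start arg i (stored at position i + n).
        eval "a=\${$i}; b=\${$((i + n))}"
        [ "$a" = "$b" ] || return 1
        i=$((i + 1))
    done
    return 0
}

# Start-time argv as recorded in the daemon file from comment 11.
set -- /usr/bin/tor -f /etc/tor/torrc --runasdaemon 1 --PidFile /var/run/tor/tor.pid

# Only rightmost parameters dropped: matches.
argv_is_prefix 3 /usr/bin/tor -f /etc/tor/torrc "$@" && echo "match"

# --exec plus only the --PidFile pair, as in the old initscript: no match.
argv_is_prefix 3 /usr/bin/tor --PidFile /var/run/tor/tor.pid "$@" || echo "no match"
```

The second call shows why the daemon file survived the stop: argv_1 is --PidFile at stop time but -f at start time.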

There are other errors in the tor initscript resulting in confusion.  If start-stop-daemon does not itself print an error message of either "no matching processes found" or "X process(es) refused to stop" then it completed successfully.  However, the current initscript does 'eend "done"' after the start-stop-daemon call; this is in error, because the first parameter to eend is a return status, and giving a non-numeric string there instead behaves the same as giving a non-zero return status.  This results in a '[!!]' followed by a '[ok]' in the output.  The second output indicates the start-stop-daemon return status, and the first will always be '[!!]'.  (Of note: eend is a binary, and thus does not have any access to $?, which is why it expects it as the first parameter).
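
The eend pitfall can be shown with a small stand-in. This is a sketch that only mimics the behavior described above: my_eend is hypothetical and is not OpenRC's eend, and the exact output strings are assumptions.

```shell
#!/bin/sh
# Sketch only: my_eend is a hypothetical stand-in for eend.
my_eend() {
    rc=$1
    case $rc in
        # An empty or non-numeric first argument is treated as a failure.
        ''|*[!0-9]*) rc=1 ;;
    esac
    if [ "$rc" -eq 0 ]; then
        echo "[ok]"
    else
        echo "[!!]"
    fi
    return "$rc"
}

# Correct: pass the exit status of the previous command.
true
my_eend $?

# Incorrect: a message in the status position is reported as a failure,
# even though the previous command succeeded.
true
my_eend "done" || echo "reported as failed"
```

The first call reports success and the second reports failure, which matches the misleading '[!!]' seen with the old initscript.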
Comment 14 Toralf Förster gentoo-dev 2016-09-18 18:12:38 UTC
Created attachment 446466 [details]
old.log

/etc/init.d/tor -v gracefulstop 2>&1 | tee /tmp/old.log
Comment 15 Anthony Basile gentoo-dev 2016-09-21 14:19:50 UTC
(In reply to dwfreed from comment #13)

Do you want to suggest a patch?  I'm still busy with real life.
Comment 16 Anthony Basile gentoo-dev 2016-09-23 22:25:29 UTC
(In reply to Anthony Basile from comment #15)
> (In reply to dwfreed from comment #13)
> 
> Do you want to suggest a patch?  I'm still busy with real life.

I've tried the following for gracefulstop() and it sends the right signal, but doesn't mark the service as stopped.  If I put the same code in stop() it does.  That's rather annoying.  Am I right that a process is only marked as stopped if stop() is executed correctly?  Is there any way of marking a service as stopped manually?

gracefulstop() {
        ebegin "Gracefully stopping Tor: max ${GRACEFUL_TIMEOUT} seconds"
        # Quote the expansions so values containing whitespace stay single arguments.
        start-stop-daemon --stop -P --signal INT -R "${GRACEFUL_TIMEOUT}" --pidfile "${PIDFILE}"
        eend $?
}
Comment 17 William Hubbs gentoo-dev 2016-09-23 22:40:40 UTC
(In reply to Anthony Basile from comment #16)
> (In reply to Anthony Basile from comment #15)
> > (In reply to dwfreed from comment #13)
> > 
> > Do you want to suggest a patch?  I'm still busy with real life.
> 
> I've tried the following for gracefulstop() and it sends the right signal,
> but doesn't mark the service as stopped.  If I put the same code in stop()
> it does.  That's rather annoying.  Am I right that a process is only marked
> as stopped if stop() is executed correctly?  Is there any way of marking a
> service as stopped manually?

You are correct. There is no way for openrc to know which functions other than stop() are meant to mark a service as stopped.

You may be able to call mark_service_stopped to manually mark the service stopped, but I suspect that will not handle dependencies.

The other advantage of using the stop() function instead of gracefulstop() is that you then don't need graceful() because restart becomes a graceful restart.
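
If a separate gracefulstop() were kept anyway, the mark_service_stopped idea mentioned above might be sketched as follows. This is an untested assumption, not a recommendation: GRACEFUL_TIMEOUT and PIDFILE come from the function in comment 16, RC_SVCNAME is the service name variable set by OpenRC, and, as noted above, dependencies are probably not handled.

```shell
# Sketch only: a fragment for an OpenRC service script, not a standalone program.
# Assumption: mark_service_stopped may be called from an extra command.
gracefulstop() {
    ebegin "Gracefully stopping Tor: max ${GRACEFUL_TIMEOUT} seconds"
    start-stop-daemon --stop -P --signal INT -R "${GRACEFUL_TIMEOUT}" --pidfile "${PIDFILE}"
    rc=$?
    # Only update OpenRC's recorded state if the daemon really went away.
    if [ "$rc" -eq 0 ]; then
        mark_service_stopped "${RC_SVCNAME}"
    fi
    eend "$rc"
}
```

Services that depend on tor would still be left as they are, which is one more reason to prefer a graceful stop().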
Comment 18 Anthony Basile gentoo-dev 2016-09-24 01:05:16 UTC
Okay, I've gone with making INT the default stop signal, following William's example above.  It's in tor-0.2.8.8-r1.ebuild and tor-0.2.9.3_alpha-r1.ebuild.  Please test and reopen if this still has issues.  I've tested and found no problems.