Bug 435398 - net-firewall/ipsec-tools-0.8.0-r4: init-scripts for racoon not working reliably
Summary: net-firewall/ipsec-tools-0.8.0-r4: init-scripts for racoon not working reliably
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All Linux
Importance: Normal normal
Assignee: Anthony Basile
Depends on: 435174
Reported: 2012-09-18 10:41 UTC by cilly
Modified: 2012-09-28 07:29 UTC (History)
Description cilly 2012-09-18 10:41:46 UTC
Right after a reboot, the init-scripts do not work correctly:

1. reboot
2. racoon is running and has a pid
3. the init-script fails to mark racoon as started because racoon's pid already exists

Restarting racoon with the init-script does not work because of the existing pid.

You need to manually killall racoon and then start the init-script.
Comment 1 cilly 2012-09-18 10:42:56 UTC
The problem was introduced with the init-script changes from:

https://bugs.gentoo.org/435174
Comment 2 Diego Elio Pettenò (RETIRED) gentoo-dev 2012-09-18 14:32:02 UTC
Uhm, wait, what do you mean by "cause of already existing pid of racoon"? If you're saying that the pid from before the reboot is still there, that's a problem for most init scripts; that's why /var/run is cleaned up and (in the most recent OpenRC) handled in tmpfs.
Comment 3 cilly 2012-09-18 15:04:26 UTC
(In reply to comment #2)
> Uhm, wait, what you mean "cause of already existing pid of racoon" — if
> you're saying that the pid from the previous reboot is already there, that's
> a problem for most init scripts, that's why /var/run is cleaned up and (in
> the most recent OpenRC) handled in tmpfs.

No. I mean racoon is running with a valid pid-file while the init-script reports status "stopped".
Comment 4 Anthony Basile gentoo-dev 2012-09-18 17:23:57 UTC
(In reply to comment #3)
> No. I mean racoon is running with valid pid-file and the init-script has
> status "stopped".

cilly, I haven't been able to reproduce that.  How did the valid pid get there?  Did you start racoon without the init script?
Comment 5 Anthony Basile gentoo-dev 2012-09-18 17:33:26 UTC
(In reply to comment #4)
> cilly i haven't been able to reproduce that.  How did the valid pid get
> there?  Did you start racoon without the init script?

cilly, does it happen every time?  I think I'm going to set up a VM and just reboot a few times.
Comment 6 cilly 2012-09-18 20:58:39 UTC
Yes, it happens every time. The previous init-scripts didn't have this problem.
Comment 7 cilly 2012-09-18 21:00:36 UTC
(In reply to comment #4)

> cilly i haven't been able to reproduce that.  How did the valid pid get
> there?  Did you start racoon without the init script?

I have no idea how racoon got started; the init-script should have started it. I didn't start anything manually.
Comment 8 Diego Elio Pettenò (RETIRED) gentoo-dev 2012-09-18 21:20:29 UTC
Do you have the actual message that gets printed? I think I have a hunch about what it could be: the time racoon takes to start might be longer than the default timeout that s-s-d (start-stop-daemon) expects, so it reports the start as failed even though racoon is running... I had that problem with another service before; the trick is to increase the timeout.
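
To make the suspected race concrete, here is a minimal shell sketch. It is not the actual start-stop-daemon logic: a hypothetical helper, wait_for_pidfile, polls for a pidfile that names a live process and gives up after a timeout in milliseconds. The helper name, the 100 ms polling step, and the fractional sleep (supported by GNU coreutils and BusyBox, not required by POSIX) are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only (not part of OpenRC or ipsec-tools).
# Polls until PIDFILE names a live process or roughly TIMEOUT_MS have elapsed.
# Returns 0 if a live pid was found, 1 on timeout.
wait_for_pidfile() {
    pidfile=$1
    remaining_ms=$2
    while :; do
        # -s: the file exists and is non-empty; kill -0 only probes the process.
        if [ -s "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
            return 0
        fi
        [ "$remaining_ms" -le 0 ] && return 1
        sleep 0.1    # fractional sleep is an assumption (GNU/BusyBox)
        remaining_ms=$((remaining_ms - 100))
    done
}
```

With a daemon that needs about half a second to write its pidfile, a call with a timeout of 0 would report failure while a call with 1000 would succeed, even though the daemon ends up running in both cases. That is the kind of false "failed to start" report the hunch above describes.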
Comment 9 cilly 2012-09-19 07:39:14 UTC
(In reply to comment #8)
> Do you have the actual message that gets printed? I think I have a hunch of
> what could be: the time to start racoon might be higher than the default
> timeout that s-s-d expects so it reports it as failed even though it's
> started... I got that problem with another service before, the trick is to
> increase the timeout.

Sep 18 14:22:49 pluto /etc/init.d/racoon[4284]: start-stop-daemon: did not create a valid pid in `/var/run/racoon.pid'
Sep 18 14:22:49 pluto /etc/init.d/racoon[4265]: ERROR: racoon failed to start

But that message is bs, since the pid is valid and racoon is running.

I don't see the new init-scripts as an improvement. Please make them more robust. IPsec is an important service for me and must be available right after boot, since I am hosting some clients.

So why isn't it working reliably? Is this the new init-system? Or is it missing some checks in the init-script?
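
For anyone who wants to verify by hand that the pid-file really is valid, a small check along these lines may help. This is a sketch under assumptions: the function name pidfile_is_live is made up, and it only tests that the recorded pid belongs to some existing process, not that the process is actually racoon.

```shell
#!/bin/sh
# Hypothetical check, for illustration: does PIDFILE contain the pid of a
# process that currently exists? Run as root (or as the daemon's owner),
# because kill -0 also fails when the caller lacks permission to signal.
pidfile_is_live() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric content before handing it to kill.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill -0 "$pid" 2>/dev/null
}
```

Running pidfile_is_live /var/run/racoon.pid and comparing the result with /etc/init.d/racoon status would show the mismatch described in this report: a live pid on one side, a "stopped" service on the other.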
Comment 10 Anthony Basile gentoo-dev 2012-09-19 10:15:39 UTC
> missing some checks in the init-script?

cilly, let's test Diego's suggestion.  Try adding this line:

   start_stop_daemon_args="--wait 1000"

You can add it just before start_pre(), like this:

    command=/usr/sbin/racoon
    command_args="-f ${RACOON_CONF} ${RACOON_OPTS}"
    pidfile=/var/run/racoon.pid
    start_stop_daemon_args="--wait 1000"

If it still doesn't work, try an insanely high value like 10000 just to make sure this isn't the issue.
Comment 11 Diego Elio Pettenò (RETIRED) gentoo-dev 2012-09-19 17:12:16 UTC
FWIW, without the pidfile checking in s-s-d, racoon can crash and you have no way of being notified, as OpenRC won't monitor it. That's why I needed those changes.

By the way, Anthony, can't you close the depending bug and just have this one open until fixed?
Comment 12 cilly 2012-09-20 13:00:35 UTC
(In reply to comment #10)
> > missing some checks in the init-script?
> 
> cilly, let's test diego's suggestion.  Try adding this line:
> 
>    start_stop_daemon_args="--wait 1000"
> 
> You can add it just before start_pre(), like this:
> 
>     command=/usr/sbin/racoon
>     command_args="-f ${RACOON_CONF} ${RACOON_OPTS}"
>     pidfile=/var/run/racoon.pid
>     start_stop_daemon_args="--wait 1000"
> 
> If it still doesn't work, try an insanely high value like 10000 just to make
> sure this isn't the issue.

This seems to fix it.
Comment 13 Anthony Basile gentoo-dev 2012-09-20 14:06:40 UTC
(In reply to comment #12)
> This seems to fix it.

cilly, before I commit this, can you try smaller values and see where it breaks for you?  Sorry to throw more work at you, but there is a trade-off here: the longer I make the wait, the greater the guarantee it will work, but the longer the delay in booting/restarting.  The value is in milliseconds, so 1000 = 1 second.
Comment 14 cilly 2012-09-27 11:49:49 UTC
blueness: The value of 1000 = 1 second is okay. With shorter values it sometimes works and sometimes doesn't.
Comment 15 Anthony Basile gentoo-dev 2012-09-28 01:08:50 UTC
(In reply to comment #14)
> blueness: The value of 1000 = 1 second is okay. With shorter values it
> sometimes works and sometimes doesn't.

Fixed in the tree with ipsec-tools-0.8.0-r5.  You can adjust the wait time by editing /etc/conf.d/racoon.  I've set the default to

    RACOON_WAIT="1000"

Please test and reopen if there's a problem.
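
For readers following along, the wiring between the two files might look roughly like this. The RACOON_WAIT name and its default come from the comment above; how the -r5 init script actually consumes it is not shown in this report, so the start_stop_daemon_args line is an assumption modelled on the snippet from comment 10.

```shell
# /etc/conf.d/racoon
# Time in milliseconds for start-stop-daemon to wait before checking the daemon.
RACOON_WAIT="1000"

# /etc/init.d/racoon (assumed fragment; the real -r5 script may differ)
# Fall back to 1000 ms if conf.d leaves RACOON_WAIT unset or empty.
start_stop_daemon_args="--wait ${RACOON_WAIT:-1000}"
```

On slower machines, raising RACOON_WAIT in /etc/conf.d/racoon trades a longer start or restart for a more reliable status check, as discussed in comment 13.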
Comment 16 cilly 2012-09-28 07:29:56 UTC
(In reply to comment #15)
> Fixed in the tree with ipsec-tools-0.8.0-r5.  You can adjust the wait time
> by editing /etc/conf.d/racoon.  I've set the default to
> 
>     RACOON_WAIT="1000"
> 
> Please test and reopen if there's a problem.

Working perfectly!