Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 927552 - net-dns/nsd: some OpenRC service script improvements
Summary: net-dns/nsd: some OpenRC service script improvements
Status: IN_PROGRESS
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages (show other bugs)
Hardware: All Linux
: Normal normal (vote)
Assignee: Joshua Kinard
URL: https://github.com/NLnetLabs/nsd/issu...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-03-22 18:40 UTC by Michael Orlitzky
Modified: 2024-04-09 19:19 UTC (History)
2 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
nsd init script (nsd,1.30 KB, text/plain)
2024-03-25 11:51 UTC, Michael Orlitzky
Details
nsd init script v2 (nsd,1.10 KB, text/plain)
2024-03-25 12:43 UTC, Michael Orlitzky
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Orlitzky gentoo-dev 2024-03-22 18:40:00 UTC
I'm in the market for a new authoritative DNS server and was snooping around. I noticed a few things that could be simplified in the nsd init script, some of which are covered in the service script guide: https://github.com/OpenRC/openrc/blob/master/service-script-guide.md

The easy ones:

1. "need net" is probably wrong (covered in the guide)
2. The "test -e" for $NSD_CONFIG in checkconfig() can be handled with "required_files" (man openrc-run)
3. The "kill" command in reload() can be done better with SSD: start-stop-daemon --signal HUP --pidfile "${pidfile}"

The hard one:

nsd creates its PID file as nsd:nsd, which makes it unsafe to use in the init script. start-stop-daemon runs (and kills things) as root, and the nsd user can write to the PID file, meaning that he can write (for example) "1" to it, causing the system to be rebooted when nsd is stopped. (This is a problem for systemd, too).

I don't see an easy way to fix this right now because nsd forks, and if we let OpenRC create its own PID file, it will contain the wrong PID. There's also no option to run nsd in the foreground. This is a confusing one to report upstream, but I'll have to try.
Comment 1 Michael Orlitzky gentoo-dev 2024-03-25 11:51:40 UTC
Created attachment 888522 [details]
nsd init script

I missed the "-d" flag that can be used to run the daemon in the foreground. It still creates an (insecure) PID file, but it can be worked around. Here's a version of the init script that lets OpenRC handle it.
Comment 2 Michael Orlitzky gentoo-dev 2024-03-25 12:43:46 UTC
Created attachment 888535 [details]
nsd init script v2

v2: passing -P '' stops it from creating the extra PID file when it's running in the foreground
Comment 3 Michael Orlitzky gentoo-dev 2024-03-27 13:54:59 UTC
The upstream issue is now fixed, so the PID file will remain root:root after creation and there is no risk in later killing the PID it contains as root.

The fork/pid approach can therefore stick around after the next release, although running in the foreground is probably simpler anyway. In any case the comment about the insecure PID file in my v2 is obsolete.
Comment 4 Joshua Kinard gentoo-dev 2024-04-07 04:45:06 UTC
Do you also have a systemd setup and could maybe look at the service file in $FILESDIR and compare it to the upstream-provided service file in 'contrib/' and see if we can drop ours?  I only use OpenRC, so vetting systemd stuff is out of my purview.  Thanks!
Comment 5 Michael Orlitzky gentoo-dev 2024-04-07 11:40:13 UTC
(In reply to Joshua Kinard from comment #4)
> Do you also have a systemd setup and could maybe look at the service file in
> $FILESDIR and compare it to the upstream-provided service file in 'contrib/'
> and see if we can drop ours?  I only use OpenRC, so vetting systemd stuff is
> out of my purview.  Thanks!

Uh oh, I also use OpenRC exclusively. I know a bit about how the service scripts work... I just can't test them. With that said, I think the upstream script should work as well as the one in $FILESDIR. The upstream copy has the small issue that it will still create a PID file even though the daemon is running in the foreground (type=simple, in systemd slang). As a result, it needs /run/nsd to exist to hold the PID file. But really, nothing is going to use the PID file, so we shouldn't bother:

  https://github.com/NLnetLabs/nsd/issues/317#issuecomment-2041438207

In Gentoo we could address that with 

  --with-pidfile=""

in src_configure(), for both OpenRC and systemd. In that case, the -P "" that I added to the OpenRC script is not needed.
Comment 6 Michael Orlitzky gentoo-dev 2024-04-07 11:51:16 UTC
Another thing I thought of in the meantime:

My version of the OpenRC script avoids creating /run/nsd for the PID file, because a PID file is no longer created.

However, some users may use nsd-control with a local socket. Where would they have put it? My guess is that they would have put it in /run/nsd. And in any case, we should probably support users who want to run nsd-control locally by providing a standard location with the correct permissions to hold the control socket.

It's not safe to parse the config file and chown whatever path is in there to nsd:nsd. So instead, I would suggest that we create /run/nsd (owned by nsd) unconditionally, and document that if you want to use a local path for nsd-control, that you should use /run/nsd/something.

We had this exact issue with clamav and its socket. It's not pretty, but using sed to add e.g.

  # control-interface: /run/nsd/nsd-control.sock

into the sample config file has avoided most of the socket-related issues we used to have. So long as that example path works out-of-the-box, users don't try too hard to rock the boat.
Comment 7 Michael Orlitzky gentoo-dev 2024-04-08 11:29:17 UTC
The upstream systemd service script now avoids creating a PID file, so no tmpfiles.d entry (and no --with-pidfile="") is needed: https://github.com/NLnetLabs/nsd/commit/1366b6a9f3528ab6438013be8a62ddd3e64a210f
Comment 8 Joshua Kinard gentoo-dev 2024-04-09 00:32:31 UTC
(In reply to Michael Orlitzky from comment #7)
> The upstream systemd service script now avoids creating a PID file, so no
> tmpfiles.d entry (and no --with-pidfile="") is needed:
> https://github.com/NLnetLabs/nsd/commit/
> 1366b6a9f3528ab6438013be8a62ddd3e64a210f

The only bits I think I need to copy over from our shipped copy of 'nsd.service' are the two flags to set the user and the group, and then that can be sunset w/ 4.8.x.  For the nsd-control socket, I'll look into patching the sample config file over sed, so that it's a bit cleaner.  Am thinking we really need to mod the build system to pass in a path for that, but that'll be a more intrusive change, so I can make that longer-term.

That said, I have a cleaned-up 4.9.1 ebuild almost done.  I've exposed more of the configure toggles as local USE flags, including making some of the forced settings in 4.8.x (ipv6, relro, largefile, etc) togglable.  I need to add an exclusion for musl for year2038 support -- been seeing that toggle turn up lately in configure scripts, but only o0n glibc systems.  It disappears under musl.  The munin and dnstap bits I'll have to just keep my fingers crossed, as I don't use either of those, so hard to test.
Comment 9 Michael Orlitzky gentoo-dev 2024-04-09 00:49:02 UTC
(In reply to Joshua Kinard from comment #8)
> 
> The only bits I think I need to copy over from our shipped copy of
> 'nsd.service' are the two flags to set the user and the group, and then that
> can be sunset w/ 4.8.x. 

Here's where it would be great if either of us could test the systemd service. The ./configure output doesn't mention it, but if you check configure.ac (in 4.9.1 at least), the default value for --with-user is "nsd". So I think in the absence of the -o and -g flags, it will use nsd:nsd automatically. In other words, the upstream .service file might work without modification.
Comment 10 Joshua Kinard gentoo-dev 2024-04-09 02:52:44 UTC
(In reply to Michael Orlitzky from comment #9)
> (In reply to Joshua Kinard from comment #8)
> > 
> > The only bits I think I need to copy over from our shipped copy of
> > 'nsd.service' are the two flags to set the user and the group, and then that
> > can be sunset w/ 4.8.x. 
> 
> Here's where it would be great if either of us could test the systemd
> service. The ./configure output doesn't mention it, but if you check
> configure.ac (in 4.9.1 at least), the default value for --with-user is
> "nsd". So I think in the absence of the -o and -g flags, it will use nsd:nsd
> automatically. In other words, the upstream .service file might work without
> modification.

Yeah, I unfortunately have some...strong opinions...about systemd, so that's why I'm on OpenRC :)

That said, I found that it does set the default user context to "nsd".  After configure runs, it defines this in config.h:
> /* the user name to drop privileges to */
> #define USER "nsd"
Nothing about the default group context, though.  I guess it just blindly assumes to use "nsd" also, which I suppose might be a minor issue, cause not everything out there uses the "group is same as user" setup.

As for the socket...it looks like configure has a trick for that, too:
>  --localstatedir=DIR     modifiable single-machine data [PREFIX/var]
>  --runstatedir=DIR       modifiable per-process data [LOCALSTATEDIR/run]
It's kludgy, because runstatedir is off of $localstatedir, which by default, is /var.  But we have a /var/run symlink that points back to /run, so I think setting runstatedir to "LOCALSTATEDIR/run/nsd" might give nsd a safe place to store its nsd-control socket and thus, make it usable.  Might cause issues with PID files created by NSD, though, but since we're intentionally disabling that, might not be an issue.

Testing this is going to be interesting.  I have two NSD instances already running on my home network, but they both live in FreeBSD jails on two different machines.  Thinking of stringing up a third instance on my dev box and tying that in, then configuring an AXFR key to see if it receives local domain updates.  If it can get that far and answer some queries about my local zone, then it should be good.
Comment 11 Michael Orlitzky gentoo-dev 2024-04-09 19:19:38 UTC
(In reply to Joshua Kinard from comment #10)
> 
> As for the socket...it looks like configure has a trick for that, too:
> >  --localstatedir=DIR     modifiable single-machine data [PREFIX/var]
> >  --runstatedir=DIR       modifiable per-process data [LOCALSTATEDIR/run]
> It's kludgy, because runstatedir is off of $localstatedir, which by default,
> is /var.  But we have a /var/run symlink that points back to /run, so I
> think setting runstatedir to "LOCALSTATEDIR/run/nsd" might give nsd a safe
> place to store its nsd-control socket and thus, make it usable.  Might cause
> issues with PID files created by NSD, though, but since we're intentionally
> disabling that, might not be an issue.

Maybe we shouldn't worry about the socket for 4.9.1. The sample config file uses autoconf templating, so later on if we upstream the OpenRC script and have it also use autoconf templating, we can get everything just right:

  * --runstatedir=/run in the ebuild
  * checkpath -o @user@ -d @runstatedir@/${RC_SVCNAME} for the socket directory in the openrc script
  * # control-interface: @runstatedir@/nsd as an example in the config file
  * command=@sbindir@/nsd (et cetera) in the openrc script rather than hard-coding the paths

If you can get the rest to a place where you're happy with it (and nobody reports any bugs), I'll work on a PR eventually.