Hi, following the nginx mailing list today I saw $URL: > [...] > > I don't think that calling "nginx -t" as a mandatory step before > configuration reload is a good idea: nginx binary running and > nginx binary on disk might be different, and "nginx -t" result > might be incorrect because of this, in some cases rejecting valid > configurations. > > Additionally, it does duplicate work by parsing/loading a > configuration which will be again parsed by a master process > during configuration reload. While in most cases it's not > significant, I've seen configurations taking more than 1m to load > due to big geo module bases used. > > [...] In addition: Our reload function just validates kill's return value. But the kill command may succeed, but nginx may refuse to load the new configuration. In the beginning of the same thread, upstream says: > Reloading is enough. What is very wrong is to assume that sending > a HUP signal to nginx is enough for a reload. For various > reasons, ranging from configuration syntax errors to out of memory > problems, configuration reload might fail. > > [...] So for me it seems to be wrong to tell the user "Refreshing nginx was successful", because we cannot tell. A better message would be "Sending SIGHUP to nginx" using ebegin and when succeed, show an einfo or ewarn telling the user to watch logs for errors, due to nginx may have refused to reload, see http://mailman.nginx.org/pipermail/nginx/2013-August/040186.html Reproducible: Always
Sounds ok to me (if it only applies to reload/upgrade). Patch?
(In reply to Johan Bergström from comment #1) > Sounds ok to me (if it only applies to reload/upgrade). Patch? Wouldn't it be more reasonable to change the nginx return value if the reload isn't ok (for instance, configuration syntax error)?
Which return value? We are sending a signal using "kill". So all we will get is a return value from "kill"... we never know (without checking nginx's error log) if HUP'ing was really successful. Or am I missing something?
We changed it to use `nginx -t` at some point and evaluate this return value. So I guess we can close this as FIXED.
This bug seems to have been mistakenly closed as FIXED. The argument was to *remove* a config check (using nginx -t), not to add it. Assuming the arguments presented in the original report and the referenced bug thread still hold (and I'd say they do), the original proposed resolution looks good to me, at least for the 'reload' action. That is: > A better message would be "Sending SIGHUP to nginx" using > ebegin and when succeed, show an einfo or ewarn telling the user to watch logs for > errors, due to nginx may have refused to reload, see http://mailman.nginx.org/pipermail > /nginx/2013-August/040186.html
So let's re-opening this bug for further discussions: The problem: ============ Problem 1: `nginx -t` isn't reliable/can lead to unexpected results The command will always use the binary from disk. This version don't have to match which the version of the running nginx master process. That means even if `nginx -t` succeeded (because your configuration is valid with the nginx version on disk) you could have an invalid configuration for the _running_ nginx version. (and that's why https://bugs.gentoo.org/show_bug.cgi?id=481456#c4 wasn't a fix for this bug) Problem 2: Duplicated time consuming work when dealing with large configurations It is duplicated work because the nginx master process will do the same check when receiving the HUP signal. Regarding problem 1: We could record nginx' version in OpenRC (service_set_value) and only allow reload when the nginx version from disk matches the one running. Then we should never hit such a situation. However we should indicate in the output that we only sent the command to reload but cannot tell if nginx is now really using the new configuration (only nginx can you tell). I.e. it is possible that nginx rejects reloading and keep using the old configuration from memory: # grep `pgrep --uid 0 nginx` /var/log/nginx/error_log 2016/03/04 14:17:03 [emerg] 26354#26354: unknown directive "this_will_break_configuration" in /etc/nginx/nginx.conf:67 The only reliable way, which can be used in scripts/CFM tools, seems to be the "upgrade" command. Does anybody knows a reason why you shouldn't always use "upgrade"? Regarding problem 2: I am not sure if we need to address this problem. While I agree that this can double nginx startup time I prefer a response I can use in scripts. But we could add "skip_cfgcheck_on_{reload,restart,start}" in /etc/conf.d/nginx to allow users to bypass these checks.