Bug 720094 - app-admin/restart-services-0.14.4 does not restart couple of services
Summary: app-admin/restart-services-0.14.4 does not restart couple of services
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Marc Schiffbauer
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-04-30 08:05 UTC by Klemen Mihevc
Modified: 2020-05-11 17:59 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
lib_users.log (lib_users-failed.log,2.67 KB, text/plain)
2020-05-03 06:08 UTC, Klemen Mihevc
restart-services debug (restart-services-failed.log,6.03 KB, text/plain)
2020-05-03 06:08 UTC, Klemen Mihevc
0.15pre1 log (0.15_pre1.log,27.34 KB, text/plain)
2020-05-04 06:49 UTC, Klemen Mihevc
rc1 log (0.15.0_rc1.log,61.62 KB, application/octet-stream)
2020-05-04 20:38 UTC, Klemen Mihevc

Description Klemen Mihevc 2020-04-30 08:05:15 UTC
I have issues with restarting a few of the services.

=> Warning: The following running processes did not match any known service but have been updated or deleted (or some deps):
supervising syslog-ng (pid: 2765, process-root: /)
bpfilter_umh (pid: 2, process-root: /)
init [3] (pid: 1, process-root: /)
/usr/sbin/dovecot -c /etc/dovecot/dovecot.conf (pid: 5401, process-root: /)
sshd: /usr/sbin/sshd -o PidFile=/run/sshd.pid -f /etc/ssh/sshd_config [listener] 0 of 10-100 startups (pid: 5067, process-root: /)
/usr/libexec/postfix/master -w (pid: 5681, process-root: /)

and this is my local config:

SV_ALWAYS='(acpid|cronie|dhcpd|dhcpd6|miniupnpd|monit|ntpd|qbittorrent|radvd|sshd|udpxy|vnstatd|wsdd)'
SV_ALWAYS_WITH_NODEPS='(syslog-ng|udev)'
SV_CRITICAL='(apache2|dovecot|ipsec|mysql|named|opendkim|postfix|samba)'
SV_IGNORE='(ntp-client)'
INITTAB_KILLALL='(/sbin/agetty)'

I also tried putting * around dovecot/postfix, without luck...

Also, bpfilter_umh comes from the kernel, so that process could be ignored, and sshd belongs to my current shell, so that's (usually) OK as well. The other services work fine. These warnings appear after a glibc recompile, when it basically wants to restart everything. I usually just reboot the server at that point, but the point of the tool is to collect all warnings about processes...

Reproducible: Always
Comment 1 Tomáš Mózes 2020-04-30 18:31:32 UTC
Same here, it fails to restart syslog-ng, postfix, dovecot, postgresql, openldap..
Comment 2 Marc Schiffbauer gentoo-dev 2020-05-03 04:04:12 UTC
Thanks for your report! 

It would be helpful for me to have the debug output of restart-services when it fails to find the correct init script for some service.

Are you able to add it to the bug? TIA!
Comment 3 Marc Schiffbauer gentoo-dev 2020-05-03 04:30:30 UTC
(In reply to Tomáš Mózes from comment #1)
> Same here, it fails to restart syslog-ng, postfix, dovecot, postgresql,
> openldap..

Please, can you also add debug output of restart-services

and maybe 'lib_users -m' could also be useful.

thanks
Comment 4 Klemen Mihevc 2020-05-03 06:08:08 UTC
Created attachment 635770 [details]
lib_users.log
Comment 5 Klemen Mihevc 2020-05-03 06:08:51 UTC
Created attachment 635772 [details]
restart-services debug

Sure, you can get both...
Comment 6 Klemen Mihevc 2020-05-03 09:30:02 UTC
I checked /proc/<pid>/environ, for example for dovecot, and it's empty, so RC_SERVICE can't be parsed from it. Why it's empty, I have no clue, since the init script looks more or less the same as other scripts whose processes do have a populated environ file...
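For anyone who wants to repeat this check, here is a minimal sketch. It assumes a Linux /proc layout in which environ holds NUL-separated NAME=value entries; the helper name rc_service_from_environ is made up for illustration and is not part of restart-services.

```shell
# Print the value of RC_SERVICE found in an environ-style file
# (NUL-separated NAME=value entries, as in /proc/<pid>/environ).
# Prints nothing when the variable is absent or the file is empty.
# rc_service_from_environ is a hypothetical helper name.
rc_service_from_environ() {
    [ -r "$1" ] || return 1
    # Turn NUL separators into newlines, then keep only the RC_SERVICE value.
    tr '\0' '\n' < "$1" | sed -n 's/^RC_SERVICE=//p'
}

# Example (as root): rc_service_from_environ /proc/5401/environ
```

Empty output for a running daemon means the variable is not visible in that process, which is the situation described above for dovecot.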
Comment 7 Tomáš Mózes 2020-05-03 09:30:58 UTC
(In reply to Marc Schiffbauer from comment #3)
> (In reply to Tomáš Mózes from comment #1)
> > Same here, it fails to restart syslog-ng, postfix, dovecot, postgresql,
> > openldap..
> 
> Please, can you also add debug output of restart-services
> 
> and maybe 'lib_users -m' could also be useful.
> 
> thanks

# emerge -v1 syslog-ng postfix

# restart-services -d
debug: loading config file: /etc/restart-services.conf
debug: loading config file: /etc/restart-services.d/00-local.conf
debug: loading config file: /etc/restart-services.d/10-local.conf
-> Searching for services that need to be restarted ...  debug: analyzing lib_users output ...
debug: ignoring false positive (proc=bpfilter_umh with pid(s) 259): seems to be a kernel process
debug: 30611: checking pid 30611
debug: 30611: lu_pid set to its ppid (was 30611)
debug: 30611: process root is /
debug: 30611: no direct hit: process supervising syslog-ng has no RC_SERVICE env set, adding to TODO_PROCESSES
debug: 30612: checking pid 30612
debug: 30611: lu_pid set to its ppid (was 30612)
debug: 30611: skipping /usr/sbin/syslog-ng with: we already found that ppid
debug: 30731: checking pid 30731
debug: 30731: lu_pid set to its ppid (was 30731)
debug: 30731: process root is /
debug: 30731: no direct hit: process /usr/libexec/postfix/master -w has no RC_SERVICE env set, adding to TODO_PROCESSES
debug: 30732: checking pid 30732
debug: 30731: lu_pid set to its ppid (was 30732)
debug: 30731: skipping /usr/libexec/postfix/master with: we already found that ppid
debug: 30733: checking pid 30733
debug: 30731: lu_pid set to its ppid (was 30733)
debug: 30731: skipping /usr/libexec/postfix/master with: we already found that ppid
debug: 30731: analyzing remaining processes (not direct hits) ...
debug: 30731: TODO_PROCESSES_EXE: /usr/libexec/postfix/master /usr/sbin/syslog-ng
debug: 30731: checking exe /usr/libexec/postfix/master
debug: 30731: process does not look like a script
debug: 30731: found package: mail-mta/postfix:
debug: 30731: no init scripts found.
debug: 30731: adding to UNKNOWN_PROCESSES: /usr/libexec/postfix/master
debug: 30611: checking exe /usr/sbin/syslog-ng
debug: 30611: process does not look like a script
debug: 30611: found package: app-admin/syslog-ng:
debug: 30611: no init scripts found.
debug: 30611: adding to UNKNOWN_PROCESSES: /usr/sbin/syslog-ng                                                                                                                                                                               done
No known services need to be restarted.
debug: looking for false positives in unknown processes
debug: looking at unknown process /usr/libexec/postfix/master -w (30731)
debug: looking at unknown process supervising syslog-ng (30611)

=> Warning: The following running processes did not match any known service but have been updated or deleted (or some deps):
/usr/libexec/postfix/master -w (pid: 30731, process-root: /)
supervising syslog-ng (pid: 30611, process-root: /)


# lib_users -m
259;/;bpfilter_umh
30611;/usr/lib64/libevtlog-3.26.so.0.0.0,/usr/lib64/libsecret-storage.so.0.0.0,/usr/lib64/libsyslog-ng-3.26.so.0.0.0,/usr/sbin/syslog-ng;supervising syslog-ng
30612;/usr/lib64/libevtlog-3.26.so.0.0.0,/usr/lib64/libsecret-storage.so.0.0.0,/usr/lib64/libsyslog-ng-3.26.so.0.0.0,/usr/lib64/syslog-ng/libaffile.so,/usr/lib64/syslog-ng/libafsocket.so,/usr/lib64/syslog-ng/libafuser.so,/usr/lib64/syslog-ng/libappmodel.so,/usr/lib64/syslog-ng/libbasicfuncs.so,/usr/lib64/syslog-ng/libconfgen.so,/usr/lib64/syslog-ng/libcsvparser.so,/usr/lib64/syslog-ng/libkvformat.so,/usr/lib64/syslog-ng/liblinux-kmsg-format.so,/usr/lib64/syslog-ng/libsyslogformat.so,/usr/lib64/syslog-ng/libsystem-source.so,/usr/sbin/syslog-ng;/usr/sbin/syslog-ng --cfgfile /etc/syslog-ng/syslog-ng.conf --control /run/syslog-ng.ctl --persist-file /var/lib/syslog-ng/syslog-ng.persist --pidfile /run/syslog-ng.pid
30731;/usr/lib64/postfix/3.4.9/libpostfix-global.so,/usr/lib64/postfix/3.4.9/libpostfix-util.so,/usr/libexec/postfix/master;/usr/libexec/postfix/master -w
30732;/usr/lib64/postfix/3.4.9/libpostfix-global.so,/usr/lib64/postfix/3.4.9/libpostfix-master.so,/usr/lib64/postfix/3.4.9/libpostfix-util.so,/usr/libexec/postfix/pickup;pickup -l -t unix -u
30733;/usr/lib64/postfix/3.4.9/libpostfix-global.so,/usr/lib64/postfix/3.4.9/libpostfix-master.so,/usr/lib64/postfix/3.4.9/libpostfix-util.so,/usr/libexec/postfix/qmgr;qmgr -l -t unix -u
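
The 'lib_users -m' lines above follow the shape pid;files;command. As a rough sketch of how such a line can be split in a script (the helper name lib_users_pid_and_cmd is hypothetical, and the field layout is inferred only from the output shown here):

```shell
# Split one line of the form
#   pid;file1,file2,...;command line
# and print "pid<TAB>command".
# lib_users_pid_and_cmd is a hypothetical helper name.
lib_users_pid_and_cmd() {
    printf '%s\n' "$1" | {
        # With IFS=';' the last variable receives the rest of the line,
        # so a command line containing ';' stays in one piece.
        IFS=';' read -r pid _files cmd
        printf '%s\t%s\n' "$pid" "$cmd"
    }
}

# Example: lib_users -m | while IFS= read -r line; do lib_users_pid_and_cmd "$line"; done
```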
Comment 8 Marc Schiffbauer gentoo-dev 2020-05-03 16:19:07 UTC
(In reply to Klemen Mihevc from comment #6)
> I checked environ from /proc/pid/ for example for dovecot and its empty so
> it cant parse RC_SERVICE... why its empty... no clue since actual init
> script seems to look more or less similar as other scripts that do set
> environ file...

Thank you!
Interesting… and can you see if any child-process has RC_SERVICE set?
Comment 9 Marc Schiffbauer gentoo-dev 2020-05-03 16:22:17 UTC
(In reply to Tomáš Mózes from comment #7)
> (In reply to Marc Schiffbauer from comment #3)
> > (In reply to Tomáš Mózes from comment #1)
> > > Same here, it fails to restart syslog-ng, postfix, dovecot, postgresql,
> > > openldap..
> > 
> > Please, can you also add debug output of restart-services
> > 
> > and maybe 'lib_users -m' could also be useful.
> > 
> > thanks
> 
> # emerge -v1 syslog-ng postfix
> 
> # restart-services -d
> debug: loading config file: /etc/restart-services.conf
[…]

Thank you! How about postgresql and openldap? These two would be very interesting. Though I could understand if you cannot rebuild/restart them now...
Comment 10 Klemen Mihevc 2020-05-03 16:33:48 UTC
mih ~ # ps xau | grep dovecot
root      5401  0.0  0.0   4312  2688 ?        Ss   11:33   0:00 /usr/sbin/dovecot -c /etc/dovecot/dovecot.conf
dovecot   5407  0.0  0.0   4012  1044 ?        S    11:33   0:00 dovecot/anvil
root      5409  0.0  0.0   4272  2972 ?        S    11:33   0:00 dovecot/log
root      5411  0.0  0.0   6352  4708 ?        S    11:33   0:00 dovecot/config
dovecot   6416  0.0  0.0   7828  3996 ?        S    12:02   0:00 dovecot/stats

mih ~ # cat /proc/5401/environ
mih ~ # cat /proc/5407/environ
mih ~ # cat /proc/5409/environ
mih ~ # cat /proc/5411/environ

all empty :/

sshd:

mih /etc/init.d # cat /proc/5067/environ -A
onfig [listener] 0 of 10-100 startupsn^@

yes this mess...

postfix:

mih /etc/init.d # cat /proc/5682/environ -A
meta_directory=/etc/postfix^@manpage_directory=/usr/share/man^@PWD=/var/spool/postfix^@MAIL_LOGTAG=postfix^@shlib_directory=/usr/lib64/postfix/3.5.1^@sendmail_path=/usr/sbin/sendmail^@queue_directory=/var/spool/postfix^@config_directory=/etc/postfix^@LANG=C^@setgid_group=postdrop^@readme_directory=no^@data_directory=/var/lib/postfix^@newaliases_path=/usr/bin/newaliases^@MAIL_CONFIG=/etc/postfix^@SHLVL=1^@command_directory=/usr/sbin^@mail_owner=postfix^@html_directory=no^@daemon_directory=/usr/libexec/postfix^@PATH=/bin:/usr/bin:/sbin:/usr/sbin^@mailq_path=/usr/bin/mailq^@sample_directory=/etc/postfix^@OLDPWD=/etc/postfix^@_=/usr/libexec/postfix/master^@


syslog:

mih /etc/init.d # ps xau | grep syslog
root      2766  0.0  0.0   8500   528 ?        S    11:33   0:00 supervising syslog-ng
root      2767  0.0  0.0 371764  7348 ?        Ss   11:33   0:00 /usr/sbin/syslog-ng --cfgfile /etc/syslog-ng/syslog-ng.conf --control /run/syslog-ng.ctl --persist-file /var/lib/syslog-ng/syslog-ng.persist --pidfile /run/syslog-ng.pid

The first PID (2766) has the same mess as sshd; however, 2767 seems correct:

mih /etc/init.d # cat /proc/2767/environ -A
LESS=-R -M --shift 5^@ROOTPATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin^@CONFIG_PROTECT_MASK=/etc/sandbox.d /etc/php/cli-php7.4/ext-active/ /etc/php/cgi-php7.4/ext-active/ /etc/php/apache2-php7.4/ext-active/ /etc/fonts/fonts.conf /etc/gentoo-release /etc/terminfo /etc/ca-certificates.conf /etc/revdep-rebuild^@LE_CONFIG_HOME=/etc/acme-sh/^@SVCNAME=syslog-ng^@PWD=/^@CONFIG_PROTECT=/var/bind /usr/share/gnupg/qualified.txt^@MANPATH=/usr/share/gcc-data/x86_64-pc-linux-gnu/9.3.0/man:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.34/man:/usr/lib64/php7.4/man/:/usr/local/share/man:/usr/share/man^@OPENCL_PROFILE=nvidia^@LANG=sl_SI.utf8^@RC_SERVICE=/etc/init.d/syslog-ng^@INFOPATH=/usr/share/gcc-data/x86_64-pc-linux-gnu/9.3.0/info:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.34/info:/usr/share/info^@TERM=linux^@LESSOPEN=|lesspipe %s^@LE_WORKING_DIR=/etc/acme-sh/^@MANPAGER=manpager^@SHLVL=1^@LC_MESSAGES=en_US.utf8^@GCC_SPECS=^@LC_TIME=en_GB.utf8^@RC_SVCNAME=syslog-ng^@EINFO_LOG=/etc/init.d/syslog-ng^@LC_COLLATE=POSIX^@_=/sbin/start-stop-daemon^@HOME=/root^@USER=root^@PATH=/bin:/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/bin^@
Comment 11 Klemen Mihevc 2020-05-03 16:35:07 UTC
Damn, this comment turned into a real mess with these long lines :/
Comment 12 Marc Schiffbauer gentoo-dev 2020-05-03 19:57:54 UTC
(In reply to Klemen Mihevc from comment #11)
> damn this comment was real mess with this long lines :/

Thanks! It's helpful!
Comment 13 Marc Schiffbauer gentoo-dev 2020-05-04 06:30:32 UTC
I uploaded 0.15.0_pre1 to https://dev.gentoo.org/~mschiff/tmp/restart-services
(sha256sum f0927e486ab219ef0434a6903717fe419fd9810397b7a3fede6a343cdc0d8463)

Can you give this a try?

It should already address most of the issues...
Comment 14 Klemen Mihevc 2020-05-04 06:49:17 UTC
Created attachment 635918 [details]
0.15pre1 log

Here is the log...

Also, by the way, I just noticed that the main sshd process can/should be restarted.

root      5061  0.0  0.0   8280  3144 ?        Ss   00:04   0:00 sshd: /usr/sbin/sshd -o PidFile=/run/sshd.pid -f /etc/ssh/sshd_config [listener] 0 of 10-100 startups
root     15610  0.0  0.0   9112  7016 ?        Ss   08:40   0:00 sshd: root@pts/0

Because as long as the PTS process stays up you won't get disconnected, it's safe to restart sshd via its init script while logged in.
Comment 15 Marc Schiffbauer gentoo-dev 2020-05-04 14:51:51 UTC
(In reply to Klemen Mihevc from comment #14)
> Created attachment 635918 [details]
> 0.15pre1 log
> 
> Here, log...
> 
> Also btw just noticed main sshd process can/should be restarted.
> 
> root      5061  0.0  0.0   8280  3144 ?        Ss   00:04   0:00 sshd:
> /usr/sbin/sshd -o PidFile=/run/sshd.pid -f /etc/ssh/sshd_config [listener] 0
> of 10-100 startups
> root     15610  0.0  0.0   9112  7016 ?        Ss   08:40   0:00 sshd:
> root@pts/0
> 
> because as long as PTS process is up you wont get disconnected or something.
> Its safe to restart sshd in init script while logged in.

Thank you.

One issue I noticed that I cannot reproduce right now:

Your debug output has package names that always end in a colon, which must be a bug.

The command:

  qfile --nocolor -- "/usr/sbin/dovecot"|awk '{print $1}'

seems to output "net-mail/dovecot:" instead of "net-mail/dovecot"

Can you confirm that? And if so, what awk package/version are you using?


If I can fix that, then postfix and dovecot should work out of the box again.


Furthermore, for version 0.15 I plan to implement additional ways to find an init script if the current methods do not succeed:
- try to find init script with name == package name
- try to find init script based on process name

But first I want to be able to reproduce the bug with the package name...
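
The two fallbacks listed above could look roughly like the sketch below. It is only an illustration of the idea: the function name guess_init_script, its arguments, and the order in which the candidates are tried are assumptions, and the actual 0.15 logic may differ.

```shell
# Guess an init script for a process when RC_SERVICE is unavailable.
#   $1 = package without version (e.g. net-mail/dovecot)
#   $2 = path of the executable  (e.g. /usr/sbin/dovecot)
#   $3 = init script directory   (defaults to /etc/init.d)
# guess_init_script is a hypothetical name used for illustration.
guess_init_script() {
    pkg_name=${1##*/}     # strip the category: net-mail/dovecot -> dovecot
    proc_name=${2##*/}    # strip the directory: /usr/sbin/dovecot -> dovecot
    init_dir=${3:-/etc/init.d}
    # Try the package name first, then the process name.
    for candidate in "$pkg_name" "$proc_name"; do
        if [ -n "$candidate" ] && [ -x "$init_dir/$candidate" ]; then
            printf '%s\n' "$init_dir/$candidate"
            return 0
        fi
    done
    return 1
}

# Example: guess_init_script mail-mta/postfix /usr/libexec/postfix/master
```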
Comment 16 Klemen Mihevc 2020-05-04 15:17:23 UTC
It does in fact end in a colon for me...

mih ~ # qfile --nocolor -- "/usr/sbin/dovecot"|awk '{print $1}'
net-mail/dovecot:
mih ~ # eix sys-apps/gawk
[I] sys-apps/gawk
     Available versions:  4.2.1-r1 (~)5.0.1 (~)5.1.0 {mpfr nls readline}
     Installed versions:  5.1.0(09:25:07 29/04/20)(nls readline -mpfr)
     Homepage:            https://www.gnu.org/software/gawk/gawk.html
     Description:         GNU awk pattern-matching language

I tried rolling back to 4.2.1-r1 (stable) and it still ends in a colon, so I'm not sure what that's about...
Comment 17 Klemen Mihevc 2020-05-04 15:27:17 UTC
Found the culprit... it's the portage-utils change from 0.74 to 0.80+

mih ~ # qfile --nocolor -- "/usr/sbin/dovecot"|awk '{print $1}'
net-mail/dovecot
mih ~ # eix portage-utils
[U] app-portage/portage-utils
     Available versions:  0.74 0.80 (~)0.86 **9999*l {libressl nls openmp +qmanifest +qtegrity static}
     Installed versions:  0.74(17:25:03 04/05/20)(nls -static)
     Homepage:            https://wiki.gentoo.org/wiki/Portage-utils
     Description:         Small and fast Portage helper tools written in C


but when I run the latest one (0.86), I get a semicolon....
Comment 18 Marc Schiffbauer gentoo-dev 2020-05-04 15:40:57 UTC
(In reply to Klemen Mihevc from comment #17)
> Found out the culprit... its portage-utils change from 0.74 to 0.80+
> 
> mih ~ # qfile --nocolor -- "/usr/sbin/dovecot"|awk '{print $1}'
> net-mail/dovecot
> mih ~ # eix portage-utils
> [U] app-portage/portage-utils
>      Available versions:  0.74 0.80 (~)0.86 **9999*l {libressl nls openmp
> +qmanifest +qtegrity static}
>      Installed versions:  0.74(17:25:03 04/05/20)(nls -static)
>      Homepage:            https://wiki.gentoo.org/wiki/Portage-utils
>      Description:         Small and fast Portage helper tools written in C
> 
> 
> but when i run latest one (0.86) you get a semicolon....


Yes, I just found the same: portage-utils >= 0.80 outputs a colon.

And a ";" with 0.86? Wow.. OK, I will find a solution for that. Thanks!
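
One way to make the awk step tolerate both output styles is to strip an optional trailing colon from the first field. This is only a sketch of the idea, not necessarily how the fix will be done in restart-services; the helper name first_field_without_colon is made up.

```shell
# Read qfile-style lines on stdin and print the first field with an
# optional trailing colon removed, so that output with and without
# the colon gives the same package name.
# first_field_without_colon is a hypothetical helper name.
first_field_without_colon() {
    awk '{ sub(/:$/, "", $1); print $1 }'
}

# Example: qfile --nocolor -- "/usr/sbin/dovecot" | first_field_without_colon
```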
Comment 19 Klemen Mihevc 2020-05-04 15:42:12 UTC
No, sorry! It's a colon :) My mistake!

Also, now that I've noticed that restart-services depends on portage-utils, you should maybe add it to DEPEND in the ebuild to make sure it's installed...
Comment 20 Marc Schiffbauer gentoo-dev 2020-05-04 20:20:43 UTC
(In reply to Klemen Mihevc from comment #19)
> no sorry! its a colon :) my mistake !
> 
> Also now that i noticed that restart-services is depended on portage-utils,
> you should maybe throw it in to DEPEND in ebuild to make sure its
> installed...

Yes, it's added for the next version.

Can you give 0.15.0_rc1 a try?

sha256sum da151c49255883b9e9911e7ee39dcff49e448b42e5757298fce26a8b664ad295

(same URL as before)

Changelog:
FIX: ignore more kernel threads showing up in lib_users
FIX: support newer versions of portage-utils (qfile)
MOD: improve output of unknown processes
NEW: Added new feature CUSTOM_PROCESS_MAP which allows
     definition of custom process<->service mappings
NEW: add some kind of fuzzy search for better service recognition

And you can create your own CUSTOM_PROCESS_MAP for custom (and unknown) processes if there are more weird things (although I am still interested in things that will not be found as expected)

You can add to your config:

# Assign processes to (custom) services manually.
# A match in CUSTOM_PROCESS_MAP takes precedence over any automatically found
# service.
# Format of each entry is [<unknown proc as in restart-services -l>]=<service name>
# Defined services will then be processed as any automatically found service.
# proc will be matched using regular bash pattern matching.
# To use an extended regular expression (as in regex(3)) for a proc prepend it with 'E@'
# Examples:
#   [/usr/local/bin/myproc*]=myproc
#   [E@/usr/(local)?/bin/myproc[0-9]*]=myproc
#CUSTOM_PROCESS_MAP='( )'

Feedback welcome ;-)
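
To illustrate the two key styles described in the config comments, here is a small standalone sketch of glob versus 'E@' regex matching. It is not the code used by restart-services: the helper name proc_matches_key is hypothetical, and the choice to match globs against the whole string and regexes anywhere in the string is an assumption.

```shell
# Return 0 if a process string matches a map key.
# Keys starting with "E@" are treated as extended regular expressions,
# all other keys as shell glob patterns.
# proc_matches_key is a hypothetical helper; anchoring is an assumption.
proc_matches_key() {
    proc=$1
    key=$2
    case $key in
        E@*)
            # ERE match anywhere in the string (unanchored).
            printf '%s\n' "$proc" | grep -Eq -- "${key#E@}"
            ;;
        *)
            # Glob match against the whole string; $key is deliberately
            # left unquoted so that it acts as a pattern.
            case $proc in
                $key) return 0 ;;
                *)    return 1 ;;
            esac
            ;;
    esac
}

# Example: proc_matches_key "/usr/local/bin/myproc --foo" "/usr/local/bin/myproc*"
```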
Comment 21 Klemen Mihevc 2020-05-04 20:38:43 UTC
Created attachment 636092 [details]
rc1 log

rc1 log of a full restart after a glibc recompile... seems to work fine, just add /sbin/init to some ignore list :)

And about CUSTOM_PROCESS_MAP: I guess if you have multiple entries, you separate them with | like for the other options?
Comment 22 Marc Schiffbauer gentoo-dev 2020-05-05 05:44:59 UTC
(In reply to Klemen Mihevc from comment #21)
> Created attachment 636092 [details]
> rc1 log
> 
> rc1 log of full restart after glibc recompile... seems to work fine, just
> add /sbin/init on some ignore list :)

Thanks again! I will include something for init as well and then release and upload version 0.15.0 which will then close this bug.

> 
> and about CUSTOM_PROCESS_MAP i guess if you have multiple, you separate them
> with | like for other options?

No, just with a space. You can also use one line per entry.

CUSTOM_PROCESS_MAP=(
...
...
...
)

-Marc
Comment 23 Marc Schiffbauer gentoo-dev 2020-05-05 06:27:28 UTC
Version 0.15.0 now released and uploaded

Thanks again for reporting!
Comment 24 Tomáš Mózes 2020-05-11 17:59:01 UTC
Thanks, version 0.15.0 looks much better!