restart-services incorrectly restarts postgresql if postgresql uses hugepages. lib_users returns:
# lib_users -m
8366;/anon_hugepage;postgres: autovacuum launcher process
8360;/anon_hugepage;/usr/lib64/postgresql-9.5/bin/postgres -D /etc/postgresql-9.5 --data-directory=/dane/bazy/postgresql/9.5/ --unix-socket-directories=/run/postgresql
28294;/anon_hugepage;postgres: dovecot exim 127.0.0.1(7244) idle
8365;/anon_hugepage;postgres: wal writer process
8363;/anon_hugepage;postgres: checkpointer process
28295;/anon_hugepage;postgres: dovecot exim 127.0.0.1(7246) idle
8364;/anon_hugepage;postgres: writer process
28298;/anon_hugepage;postgres: dovecot exim 127.0.0.1(7250) idle
28297;/anon_hugepage;postgres: dovecot exim 127.0.0.1(7248) idle
Hi Marcin, thanks for reporting! What's the output of lib_users without "-m"?
Hi Marc! This is the output of lib_users:
12398,9415 "-su"
14670 "postgres: dovecot exim 127.0.0.1(23974) idle"
14671 "postgres: dovecot exim 127.0.0.1(23976) idle"
9427 "postgres: writer process"
14191 "/sbin/agetty 38400 tty2 linux"
14206 "/sbin/agetty 38400 tty1 linux --noclear"
14147 "/sbin/agetty 38400 tty4 linux"
22326 "ekg -i -u 1034266"
14179 "/sbin/agetty 38400 tty3 linux"
5059 "tmux"
3669 "/usr/sbin/openvpn --config /etc/openvpn/mejor-serwer.conf --writepid /var/run/openvpn.mejor-serwer.pid --daemon --setenv SVCNAME openvpn.mejor-serwer --cd /etc/openvpn --setenv PEER_DNS yes"
12368,20739,5066,5072,5080,5083,5086 "-bash"
9429 "postgres: autovacuum launcher process"
12388,9414 "su -"
9422 "/usr/lib64/postgresql-9.5/bin/postgres -D /etc/postgresql-9.5 --data-directory=/dane/bazy/postgresql/9.5/ --unix-socket-directories=/run/postgresql"
9428 "postgres: wal writer process"
5124 "irssi"
9426 "postgres: checkpointer process"
14131 "/sbin/agetty 38400 tty6 linux"
Sorry for the delay. Is this still an issue for you? I don't see anything unusual in the lib_users output :-/ Please attach the log of "restart-services -dl".
Yes, nothing changed, the problem still exists on my box. Please look at:
# lsof -n | grep postgres | grep -i del
postgres 462 postgres DEL REG 0,14 189137919 /anon_hugepage
postgres 462 postgres DEL REG 0,5 557056 /SYSV0052e2c1
postgres 982 postgres DEL REG 0,14 189137919 /anon_hugepage
postgres 982 postgres DEL REG 0,5 557056 /SYSV0052e2c1
postgres 1001 postgres DEL REG 0,14 189137919 /anon_hugepage
postgres 1001 postgres DEL REG 0,5 557056 /SYSV0052e2c1
postgres 15687 postgres DEL REG 0,14 189137919 /anon_hugepage
postgres 15687 postgres DEL REG 0,5 557056 /SYSV0052e2c1
postgres 15690 postgres DEL REG 0,14 189137919 /anon_hugepage
postgres 15690 postgres DEL REG 0,5 557056 /SYSV0052e2c1
postgres 15691 postgres DEL REG 0,14 189137919 /anon_hugepage
postgres 15691 postgres DEL REG 0,5 557056 /SYSV0052e2c1
postgres 15692 postgres DEL REG 0,14 189137919 /anon_hugepage
postgres 15692 postgres DEL REG 0,5 557056 /SYSV0052e2c1
postgres 15693 postgres DEL REG 0,14 189137919 /anon_hugepage
postgres 15693 postgres DEL REG 0,5 557056 /SYSV0052e2c1
Maybe this triggers the restart of postgresql?
And this is what I get if I disable hugepages in postgres:
# lsof -n | grep postgres | grep -i del
postgres 9667 postgres DEL REG 0,5 189348486 /dev/zero
postgres 9667 postgres DEL REG 0,5 589824 /SYSV0052e2c1
postgres 9670 postgres DEL REG 0,5 189348486 /dev/zero
postgres 9670 postgres DEL REG 0,5 589824 /SYSV0052e2c1
postgres 9671 postgres DEL REG 0,5 189348486 /dev/zero
postgres 9671 postgres DEL REG 0,5 589824 /SYSV0052e2c1
postgres 9672 postgres DEL REG 0,5 189348486 /dev/zero
postgres 9672 postgres DEL REG 0,5 589824 /SYSV0052e2c1
postgres 9673 postgres DEL REG 0,5 189348486 /dev/zero
postgres 9673 postgres DEL REG 0,5 589824 /SYSV0052e2c1
postgres 9978 postgres DEL REG 0,5 189348486 /dev/zero
postgres 9978 postgres DEL REG 0,5 589824 /SYSV0052e2c1
# restart-services
-> Searching for services that need to be restarted ... done
=> sshd: Not restarting unclassified service (requires -u)
...
Now it is OK.
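The DEL entries that lsof shows above correspond to memory mappings that the kernel marks as "(deleted)". As a rough illustration of why such entries can be mistaken for replaced libraries, here is a minimal Python sketch that pulls "(deleted)" paths out of /proc/<pid>/maps-style text and separates shared-memory pseudo-paths from real files. The function names, the pattern list, and the sample text are assumptions made for this note (the addresses and the libssl/libc lines are invented); this is not lib_users' actual code.

```python
import re

# Pseudo-paths seen in the lsof output in this report. The list is an
# assumption for illustration, not an exhaustive or official one.
PSEUDO_PATTERNS = [
    r"^/anon_hugepage$",
    r"^/SYSV[0-9a-fA-F]{8}$",
    r"^/dev/zero$",
]

DELETED_SUFFIX = " (deleted)"


def deleted_mappings(maps_text):
    """Return the set of paths marked '(deleted)' in maps-style text."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, device, inode, path (may hold spaces)
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a path
        path = fields[5].strip()
        if path.endswith(DELETED_SUFFIX):
            paths.add(path[: -len(DELETED_SUFFIX)])
    return paths


def is_pseudo(path):
    """True if the path looks like a shared-memory pseudo-file."""
    return any(re.match(pattern, path) for pattern in PSEUDO_PATTERNS)


# Invented sample in the layout of /proc/<pid>/maps.
SAMPLE_MAPS = """\
7f0000000000-7f0040000000 rw-s 00000000 00:0e 189137919 /anon_hugepage (deleted)
7f0040000000-7f0040001000 rw-s 00000000 00:05 557056 /SYSV0052e2c1 (deleted)
7f0050000000-7f0050200000 r-xp 00000000 08:01 1234 /usr/lib64/libssl.so.1.0.0 (deleted)
7f0060000000-7f0060200000 r-xp 00000000 08:01 5678 /lib64/libc-2.26.so
"""

real_deleted = sorted(p for p in deleted_mappings(SAMPLE_MAPS) if not is_pseudo(p))
```

With the sample above, only the libssl path remains in real_deleted; the hugepage and SysV entries are treated as noise.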
Maybe I need to add another exception for postgres here. Can you please attach the debug output (restart-services -ld) when huge_pages is enabled? TIA
Here it is:
# restart-services -ld
debug: loading config file: /etc/restart-services.conf
debug: loading config file: /etc/restart-services.d/00-local.conf
-> Searching for services that need to be restarted ...
debug: analyzing lib_users output ...
Warning: Some files could not be read. Note that lib_users has to be run as root to get a full list of deleted in-use libraries.
debug: checking pid 1575
debug: lu_pid set to its ppid: 1575
debug: process root is /
debug: no direct hit: process "/usr/lib64/postgresql-9.5/bin/postgres -D /etc/postgresql-9.5 --data-directory=/dane/bazy/postgresql/9.5/ --unix-socket-directories=/run/postgresql" (1575) has no RC_SERVICE env set, adding to TODO_PROCESSES
debug: checking pid 1591
debug: lu_pid set to its ppid: 1575
debug: skipping /usr/lib/postgresql-9.5/bin/postgres with pid 1575: we already found that ppid
debug: checking pid 1592
debug: lu_pid set to its ppid: 1575
debug: skipping /usr/lib/postgresql-9.5/bin/postgres with pid 1575: we already found that ppid
debug: checking pid 1593
debug: lu_pid set to its ppid: 1575
debug: skipping /usr/lib/postgresql-9.5/bin/postgres with pid 1575: we already found that ppid
debug: checking pid 1594
debug: lu_pid set to its ppid: 1575
debug: skipping /usr/lib/postgresql-9.5/bin/postgres with pid 1575: we already found that ppid
debug: checking pid 3766
debug: lu_pid set to its ppid: 3766
debug: process root is /
debug: no direct hit: process "ssh: /home/marcin/.ssh/sockets/root@10.10.10.33-22 [mux]" (3766) has no RC_SERVICE env set, adding to TODO_PROCESSES
debug: checking pid 7087
debug: lu_pid set to its ppid: 6014
debug: process root is /
debug: no direct hit: process "irssi" (6014) has no RC_SERVICE env set, adding to TODO_PROCESSES
debug: checking pid 7184
debug: lu_pid set to its ppid: 6014
debug: skipping /usr/bin/tmux with pid 6014: we already found that ppid
debug: checking pid 7272
debug: lu_pid set to its ppid: 6014
debug: skipping /usr/bin/tmux with pid 6014: we already found that ppid
debug: checking pid 7672
debug: lu_pid set to its ppid: 6014
debug: skipping /usr/bin/tmux with pid 6014: we already found that ppid
debug: checking pid 25708
debug: lu_pid set to its ppid: 25708
debug: process root is /
debug: no direct hit: process "ssh: /home/marcin/.ssh/sockets/root@eos.sprawy.it-22 [mux]" (25708) has no RC_SERVICE env set, adding to TODO_PROCESSES
debug: analyzing remaining processes (not direct hits) ...
debug: TODO_PROCESSES_EXE: /usr/bin/tmux /usr/bin/ssh /usr/lib/postgresql-9.5/bin/postgres /usr/bin/ssh
debug: checking exe /usr/bin/tmux (pid 6014)
debug: pid 6014 does not look like a script
debug: found package: app-misc/tmux
debug: no init scripts found.
debug: adding to UNKNOWN_PROCESSES: /usr/bin/tmux (6014)
debug: checking exe /usr/bin/ssh (pid 3766)
debug: pid 3766 does not look like a script
debug: found package: net-misc/openssh
debug: found init scripts: /etc/init.d/sshd
debug: found started service, adding init script: /etc/init.d/sshd
debug: checking exe /usr/lib/postgresql-9.5/bin/postgres (pid 1575)
debug: pid 1575 does not look like a script
debug: found package: dev-db/postgresql
debug: found init scripts: /etc/init.d/postgresql-10 /etc/init.d/postgresql-9.5
debug: no started service found.
debug: found started service, adding init script: /etc/init.d/postgresql-9.5
debug: checking exe /usr/bin/ssh (pid 25708)
debug: pid 25708 does not look like a script
debug: found package: net-misc/openssh
debug: found init scripts: /etc/init.d/sshd
debug: found started service, adding init script: /etc/init.d/sshd
done
=> Found 2 services that need to be restarted or reloaded
=-> unclassified services (2): postgresql-9.5 sshd
=> Found 0 inittab processes that need to be restarted
debug: looking for false positives in unknown processes
debug: looking at unknown process "irssi" (6014)
=> Warning: The following running processes did not match any init script and have been updated or deleted:
"irssi" (6014)
Hi Marcin, I could now reproduce this here on my machine. Unfortunately this seems to be a bug in lib_users, which reports "/anon_hugepage" as a deleted library. Postgres with "huge_pages = on":
~ # lib_users -m|grep postgres
30014;/anon_hugepage;postgres: wal writer process
30013;/anon_hugepage;postgres: writer process
30008;/anon_hugepage;/usr/lib64/postgresql-9.6/bin/postgres -p 5432 -D /var/lib/postgresql/9.6/data
30012;/anon_hugepage;postgres: checkpointer process
30015;/anon_hugepage;postgres: autovacuum launcher process
With "huge_pages = off", there is no output for postgres at all. Let's see what Tobias thinks about this.
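To make the false positive concrete, here is a small Python sketch that parses lines in the "pid;path;command" layout shown above and drops entries whose only reported path is /anon_hugepage. It is only an illustration under assumptions: the function names are invented for this note, the comma-separated handling of the first two fields is a guess based on the non "-m" output earlier in this report, and the sshd/libssl sample line is made up to show a genuine hit surviving the filter.

```python
# Paths to treat as noise; taken from this report.
IGNORED_PATHS = {"/anon_hugepage"}


def parse_machine_line(line):
    """Split one 'pids;paths;command' line into its three parts.

    Assumption: pids and paths are comma-separated inside their fields.
    The command may itself contain ';', so split at most twice.
    """
    pid_field, path_field, command = line.rstrip("\n").split(";", 2)
    pids = [int(pid) for pid in pid_field.split(",")]
    paths = path_field.split(",")
    return pids, paths, command


def real_hits(lines, ignored=IGNORED_PATHS):
    """Keep only entries that still report a path after filtering."""
    hits = []
    for line in lines:
        if not line.strip():
            continue
        pids, paths, command = parse_machine_line(line)
        remaining = [path for path in paths if path not in ignored]
        if remaining:
            hits.append((pids, remaining, command))
    return hits


SAMPLE = [
    "30014;/anon_hugepage;postgres: wal writer process",
    "30008;/anon_hugepage;/usr/lib64/postgresql-9.6/bin/postgres -p 5432 -D /var/lib/postgresql/9.6/data",
    # Invented line, not from the report: a process with a genuinely replaced library.
    "4242;/usr/lib64/libssl.so.1.0.0;/usr/sbin/sshd",
]

filtered = real_hits(SAMPLE)
```

Run against the sample, the two postgres lines disappear and only the invented sshd entry is left, which matches the "no output for postgres at all" behaviour seen with huge_pages = off.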
Fixed in git: https://github.com/klausman/lib_users/commit/0586f1605aed0f56a957032dda704a43d40f6bf9 This will be in the next release. Until then, lib_users can be made to ignore /anon_hugepage by using the -I commandline option.
The bug has been referenced in the following commit(s):
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=3a03a4a7ea282a7fd9da74be187b15826fb654c2
commit 3a03a4a7ea282a7fd9da74be187b15826fb654c2
Author: Tobias Klausmann <klausman@gentoo.org>
AuthorDate: 2018-11-19 09:25:34 +0000
Commit: Tobias Klausmann <klausman@gentoo.org>
CommitDate: 2018-11-19 09:25:34 +0000
app-admin/lib_users: Add v0.13, drop v0.12
Bug: https://bugs.gentoo.org/648356
Package-Manager: Portage-2.3.51, Repoman-2.3.12
Signed-off-by: Tobias Klausmann <klausman@gentoo.org>
app-admin/lib_users/Manifest | 2 +-
app-admin/lib_users/{lib_users-0.12.ebuild => lib_users-0.13.ebuild} | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
(In reply to Tobias Klausmann from comment #9)
> Fixed in git:
>
> https://github.com/klausman/lib_users/commit/
> 0586f1605aed0f56a957032dda704a43d40f6bf9
>
> This will be in the next release. Until then, lib_users can be made to
> ignore /anon_hugepage by using the -I commandline option.
Thanks Tobias for this and the new release ;-) I added "-I /anon_hugepage" to the lib_users call in restart-services 0.14.2, which I will remove again in a later version once lib_users <0.13 should have vanished from most systems. And thanks to Marcin for reporting!
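For anyone scripting around an older lib_users, the workaround described above amounts to adding "-I /anon_hugepage" to the call. The following Python sketch shows one way to assemble and run such a command. The helper names are invented for this note, and combining "-m" with "-I" in one call is an assumption based on the commands quoted in this report; it is not the code restart-services uses.

```python
import subprocess


def build_lib_users_cmd(ignore_pattern="/anon_hugepage", machine_readable=True):
    """Assemble the argument list for a lib_users call.

    Only options that appear in this report are used: -m and -I.
    """
    cmd = ["lib_users"]
    if machine_readable:
        cmd.append("-m")
    if ignore_pattern:
        cmd.extend(["-I", ignore_pattern])
    return cmd


def run_lib_users(cmd):
    """Run the command and return its stdout as a list of lines.

    Needs lib_users installed; run as root for complete results.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=False,
    )
    return result.stdout.splitlines()
```

Calling run_lib_users(build_lib_users_cmd()) on an affected box should then no longer list the postgres processes solely because of /anon_hugepage.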
Quick fix! Thank you.