Bug 264734 - app-admin/eselect does not use portable 'ps'
Summary: app-admin/eselect does not use portable 'ps'
Status: RESOLVED FIXED
Alias: None
Product: Gentoo/Alt
Classification: Unclassified
Component: Prefix Support
Hardware: All IRIX
Importance: High minor
Assignee: Gentoo eselect Team
Keywords: InVCS
Reported: 2009-04-03 09:44 UTC by Stuart Shelton
Modified: 2009-05-18 21:28 UTC
Description Stuart Shelton 2009-04-03 09:44:13 UTC
On many (all?) prefix platforms, sys-process/procps (supplying the GNU 'ps' command) is unavailable.

Running 'eselect' with a module which doesn't exist results in:

$ eselect badmodule
!!! Error: Can't load module badmodule
UX:ps: ERROR: Illegal option -- x
UX:ps: INFO: Usage: ps [ -edalfcjMPT ] [ -t termlist ] [ -u uidlist ] [ -o format ]
UX:ps: INFO:           [ -U userlist ] [ -G grplist ] [ -p proclist ] [ -g grplist ]
UX:ps: INFO:           [ -s sidlist ] [ -J jidlist ]
Killed

... because IRIX 'ps' lacks a '-x' option.  This doesn't appear to affect correct invocations, but it should probably be fixed regardless.

The only file using 'ps' in this way appears to be ${EPREFIX}/usr/share/eselect/libs/core.bash
Comment 1 Fabian Groffen gentoo-dev 2009-04-03 10:06:23 UTC
Known issue.  It fails nearly everywhere.  I hope that the eselect rewrite will take portability a bit more seriously.

Since it usually just bombs out and terminates anyway, I prefer to leave it as is: eselect is in a zombie state maintainer-wise, and concrete plans to replace it are being worked on as we speak.
Comment 2 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2009-04-03 14:59:42 UTC
I would like to leave this open as a known issue so people can reference one of the problems with eselect.
Comment 3 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2009-04-03 15:00:48 UTC
(In reply to comment #2)
> I would like to leave this open as a known issue so people can reference one of
> the problems with eselect.
> 

Even though it may not get fixed soon or at all.
Comment 4 Fabian Groffen gentoo-dev 2009-04-03 15:02:52 UTC
I think the ps was introduced by one of our own patches, because the original code uses pgrep and pkill, which are even less portable.
Comment 5 Timothy Redaelli (RETIRED) gentoo-dev 2009-04-03 15:31:00 UTC
(In reply to comment #4)
> I think the ps is introduced by one of our own patches, because it originally
> uses pgrep and pkill, which are even less portable.

Err, pgrep is more portable.
It's present in Linux, Solaris and FreeBSD (and maybe more)
Comment 6 Fabian Groffen gentoo-dev 2009-04-03 15:36:12 UTC
Well, pgrep and pkill are originally Solaris utilities.  A simulation has been added to procps.  The supposed Darwin support in some package was totally broken, and I'm not sure you can call BSD, Linux and Solaris "more portable".

Anyway, there's always a platform that lacks the one, or has the other implemented just a bit differently.  And all that just to kill the parent group or something (I don't recall exactly any more).
Comment 7 Stuart Shelton 2009-04-03 17:26:41 UTC
No 'pgrep' here (on IRIX) I'm afraid ...
Comment 8 Stuart Shelton 2009-04-03 17:28:24 UTC
... no 'pkill' either.  There is '/sbin/killall', which works in the Linux sense (pkill) rather than the Solaris sense (halt the system, IIRC)
Comment 9 Ulrich Müller gentoo-dev 2009-04-14 21:12:34 UTC
As far as I can see, there are no occurrences of "ps" anywhere in the eselect source.
"pgrep" is called once, namely from the "die" function (in libs/core.bash).

And this call is sort of optional: "die" sends a SIGTERM to the pid of the eselect process. Then it sends a SIGKILL to it and to all its children. And then it calls "exit". ;-) I think that except for some pathological cases, a SIGTERM to the parent should be enough. (And in fact, that's what the "die" implementation of Portage does.)

How about the following? It is not a perfect fix, but will suppress the error message if "pgrep" is missing (and "childs" will be empty in that case).

-    childs=$(pgrep -P ${ESELECT_KILL_TARGET})
+    childs=$(pgrep -P ${ESELECT_KILL_TARGET} 2>/dev/null)
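A minimal sketch of the idea behind the diff above, assuming a POSIX-style shell such as bash. The helper name list_children is hypothetical and not part of eselect. With stderr discarded, a missing or failing pgrep yields an empty child list instead of an error message:

```shell
# Hypothetical helper, not the actual eselect code.
# Prints the PIDs of the direct children of PID $1, one per line.
# If pgrep is not installed or finds nothing, it prints nothing.
list_children() {
    pgrep -P "$1" 2>/dev/null
    return 0    # a missing pgrep is deliberately not treated as an error
}

# ESELECT_KILL_TARGET is the variable from the diff; fall back to the
# current shell's PID here so that the sketch runs on its own.
childs=$(list_children "${ESELECT_KILL_TARGET:-$$}")
```

On a platform without pgrep, "childs" is simply empty, which matches the behaviour described for the one-line fix.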
Comment 10 Fabian Groffen gentoo-dev 2009-04-14 21:14:51 UTC
why can't it use jobs (bash built-in) for instance to get the child pids?
Comment 11 Ulrich Müller gentoo-dev 2009-04-14 23:08:41 UTC
(In reply to comment #10)
> why can't it use jobs (bash built-in) for instance to get the child pids?

Because job control is disabled for non-interactive shells. One would have to explicitly enable it ("set -m"), and I don't think that's a good idea.

Does the following work (should be pure POSIX, as far as ps is concerned):

    ps -o pid= -o ppid= | while read pid ppid; do
        [[ ${ppid} = ${ESELECT_KILL_TARGET} ]] \
            && childs="${childs} ${pid}"
    done
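One caveat when trying the loop above: in bash, each part of a pipeline normally runs in a subshell, so a value assigned to "childs" inside the piped "while" loop is lost once the loop ends. A minimal sketch that keeps the variable in the calling shell by feeding the loop from a here-document instead of a pipe. The helper name collect_children is hypothetical, and the sketch says nothing about how portable the ps invocation itself is:

```shell
# Hypothetical helper, not the actual eselect code.
# Reads "pid ppid" pairs on stdin and stores in the global variable
# "childs" every pid whose parent is $1.
collect_children() {
    local pid ppid
    childs=
    while read -r pid ppid; do
        if [ "${ppid}" = "$1" ]; then
            childs="${childs} ${pid}"
        fi
    done
}

# Feed the loop through a here-document rather than a pipe, so that it
# runs in the current shell and "childs" is still set afterwards.
# ESELECT_KILL_TARGET comes from the snippet above; $$ is only a fallback
# so that the sketch runs on its own.
collect_children "${ESELECT_KILL_TARGET:-$$}" <<EOF
$(ps -o pid= -o ppid= 2>/dev/null)
EOF
```

If ps is missing or rejects the options, its output is empty and "childs" stays empty as well.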
Comment 12 Fabian Groffen gentoo-dev 2009-04-15 09:16:04 UTC
(Open)Solaris:
% echo $$
24546
% ps -o pid= -o ppid=
20275 24546
24546 12598

(so that's ps and the shell itself)

Linux:
% echo $$
23627
% ps -o pid= -o ppid=
 6405  6404
23627 23626
23758 23627

(something, the shell, and ps)

Darwin:
% echo $$
28750
% ps -o pid= -o ppid=
     
 7780
 9094
 9095
 9225
 7889
 9362
 9228
 9364
 9516
10989
28750
20675

(hmmmm?  an empty line and no parent pids?)


FreeBSD:
% echo $$
51732
% ps -o pid= -o ppid=
86008 86006
92376 86008
92377 92376
51732 51730
51735 51732

(no clue)
Comment 13 Michael Haubenwallner (RETIRED) gentoo-dev 2009-04-15 13:10:51 UTC
AIX 5.3:

$ echo $$
1863010
$ ps -o pid= -o ppid=   
1863010 1380876
2290882 1863010

(shell, ps)


HP-UX:

$ echo $$
20576

$ ps -o pid= -o ppid=
ps: illegal option -- o
usage: ps [-edaxzflP] [-u ulist] [-g glist] [-p plist] [-t tlist] [-R prmgroup] [-Z psetidlist]

But wait (reading manpages): starting with hpux10, there is the UNIX95 environment variable, which needs to be set (the value is irrelevant):

$ UNIX95= ps -o pid= -o ppid=
20576 20574
20769 20576

(shell, ps)

And since hpux11.31, more of them are known, as seen in http://docs.hp.com/en/B2355-60130/standards.5.html
UNIX95=
UNIX_STD=95
UNIX_STD=1995
UNIX_STD=2003
Comment 14 Stuart Shelton 2009-04-15 15:45:56 UTC
IRIX:

$ echo $$
2761
$ ps -o pid= -o ppid=         
      2761       2748 
  62103828       2761 

(shell, ps.  System has been up for 99 days, so the ps PID is a little high ;)
Comment 15 Ulrich Müller gentoo-dev 2009-04-15 22:36:52 UTC
(In reply to comment #12)
> Darwin:
> [...] 
> (hmmmm?  an empty line and no parent pids?)

I don't really expect that "LC_ALL=POSIX ps -o pid= -o ppid=" would work, but maybe you could try it anyway?
Comment 16 Ulrich Müller gentoo-dev 2009-04-17 04:29:18 UTC
Looks like ps is also not portable enough and cannot be used. Therefore, I've committed the fix from comment #9 to SVN trunk (r429) and 1.0.x branch (r430).

And I've locally tested a version that doesn't do any pgrep, but just kills the parent process. Even that seems to work fine. But let's be conservative and leave the pgrep in, for archs that support it.

(In reply to comment #1)
> Since it usually just bombs out and terminates anyway, [...]

Are there any known cases where it fails to die?

> since eselect is in a zombie state maintainer wise, and concrete plans to
> replace it are being worked on as we speak.

I have my own opinion on this one. ;-) But a bug report is not the right place for this, so let's have such general discussions elsewhere.
Comment 17 Fabian Groffen gentoo-dev 2009-04-17 05:59:05 UTC
Sorry for the delay; the LC_ALL=POSIX trick doesn't work on Darwin either.
Comment 18 Ulrich Müller gentoo-dev 2009-04-18 09:09:27 UTC
(In reply to comment #16)
> Looks like ps is also not portable enough and cannot be used. Therefore, I've 
> committed the fix from comment #9 to SVN trunk (r429) and 1.0.x branch (r430).

This is in 1.0.12. 

In the SVN trunk, I've eliminated pgrep altogether (r457). This will require some testing; eselect-9999 is a live ebuild for the brave.
Comment 19 Ulrich Müller gentoo-dev 2009-05-18 20:21:52 UTC
(In reply to comment #18)
> In the SVN trunk, I've eliminated pgrep altogether (r457). This will require
> some testing; eselect-9999 is a live ebuild for the brave.

eselect-1.1_rc1 was released today. Closing.
Comment 20 Fabian Groffen gentoo-dev 2009-05-18 21:28:13 UTC
cool, thanks!