The wrapper script used for python executables fails if a wrapped python script forks and executes another wrapped python script. In such cases, sys.argv[0] is set to the parent process for some reason (although the correct child wrapper is called), causing the child wrapper to execute the parent executable instead of the child executable. One case where this occurs is the ipcluster command of ipython, which executes ipcontroller and ipengine processes.

A test case demonstrating the problem is easy to make. Copy a wrapper script to files test1, test2. Create a test1-2.6 (if python 2.6 is installed) containing:

#!/your_prefix/usr/bin/python2.6
import os
print "test1 executed"
newargs = ['./test2']
os.execvpe('./test2', newargs, os.environ)

and a test2-2.6 containing:

#!/your_prefix/usr/bin/python2.6
print "test2 executed"

Now call ./test1-2.6 to see that the script functions. Calling ./test1 however will throw the script in a loop (which is easy to break as we intentionally left out the fork command (which makes no difference to the behaviour)).

Possible solutions:
- set the script name as a constant upon generation of the wrapper
- a more general solution is in the diff below:

--- ipcluster   2010-04-17 23:54:31.788047026 +0200
+++ test        2010-04-18 08:49:16.469013350 +0200
@@ -5,6 +5,7 @@
 import re
 import subprocess
 import sys
+import inspect
 
 EPYTHON_re = re.compile(r"^python(\d+\.\d+)$")
 python_shebang_re = re.compile(r"^#! *(/data/home/mhulsman/gentoo/usr/bin/python|(/data/home/mhulsman/gentoo)?/usr/bin/env +(/data/home/mhulsman/gentoo/usr/bin/)?python)")
@@ -31,7 +32,7 @@
     sys.stderr.write("'eselect python show --python2' printed unrecognized value '%s'\n" % EPYTHON)
     sys.exit(1)
 
-wrapper_script_path = os.path.realpath(sys.argv[0])
+wrapper_script_path = os.path.realpath(inspect.getfile( inspect.currentframe()))
 target_executable_path = "%s-%s" % (wrapper_script_path, PYTHON_ABI)
 os.environ["GENTOO_PYTHON_PROCESS_NAME"] = os.path.basename(sys.argv[0])
 os.environ["GENTOO_PYTHON_WRAPPER_SCRIPT_PATH"] = sys.argv[0]

Tested on amd64-linux

Reproducible: Always
An even better solution: use __file__ instead of the inspect machinery
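For illustration, the __file__ variant could look roughly like the sketch below. The helper name target_executable_path is made up for this example and is not part of the actual wrapper; the only point is that the path is derived from the wrapper file itself rather than from sys.argv[0].

```python
import os


def target_executable_path(wrapper_file, python_abi):
    """Return the versioned executable that sits next to the wrapper.

    Hypothetical helper: the real wrapper computes this inline.
    """
    # realpath() resolves symlinks, as the wrapper already does for sys.argv[0].
    wrapper_script_path = os.path.realpath(wrapper_file)
    return "%s-%s" % (wrapper_script_path, python_abi)


# Inside the wrapper the call would then be (PYTHON_ABI as in the wrapper):
#     target = target_executable_path(__file__, PYTHON_ABI)
```

Since __file__ names the script file that is being executed, it should not be affected by whatever ends up in sys.argv[0].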
This could be the same as bug #313639, but I'm not sure
(In reply to comment #2)
> This could be the same as bug #313639, but I'm not sure

I think you're right. Replacing test1 with the following solves the problem too:

#!/data/home/mhulsman/gentoo/usr/bin/python2.6
import os, sys
newargs = ['./test2']
env = os.environ.copy()
env.pop('GENTOO_PYTHON_WRAPPER_SCRIPT_PATH',None)
os.execvpe('./test2', newargs, env)

Apparently python patch 61_all_process_data.patch causes this: it changes the sys module such that it replaces sys.argv[0] with GENTOO_PYTHON_WRAPPER_SCRIPT_PATH. This means that every python child process which is executed from a wrapped python script and inherits the parent environment will have its sys.argv[0] set incorrectly. For the wrapper scripts themselves this is easy to fix; however, this leaves the non-wrapped child processes. I guess those can only be fixed by patching the exec* calls of python such that they remove the GENTOO_PYTHON vars from the environment.
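Until python itself is fixed, a calling script could strip the variables by hand before starting a child. A minimal sketch, assuming every variable of interest starts with GENTOO_PYTHON_ (the helper names without_wrapper_vars and run_child are invented for this example):

```python
import os
import subprocess

GENTOO_PREFIX = "GENTOO_PYTHON_"


def without_wrapper_vars(env):
    """Return a copy of env with all GENTOO_PYTHON_* variables removed."""
    return dict((key, value) for key, value in env.items()
                if not key.startswith(GENTOO_PREFIX))


def run_child(argv, env=None):
    """Run argv with a cleaned environment and return its exit status."""
    if env is None:
        env = os.environ
    # An explicit env is passed, so the child does not inherit the parent's.
    return subprocess.call(argv, env=without_wrapper_vars(env))
```

This only helps for scripts that can be edited; anything that passes os.environ along unchanged would still leak the variables to its children.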
Looked at it a bit more. The real problem is that makeargvobject (which unsets the environment variable) is called after convertenviron in posixmodule.c. This function converts the environment to a python dictionary for use in the os module, which is later on used in the exec calls, still containing the GENTOO_PYTHON_WRAPPER_SCRIPT_PATH variable.
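The ordering issue can be modelled in a few lines of plain python. This is only a toy model: the function names mirror the C functions mentioned above, but the bodies are assumptions about their behaviour and not the real interpreter code.

```python
VAR = "GENTOO_PYTHON_WRAPPER_SCRIPT_PATH"


def convert_environ(process_env):
    """Toy convertenviron(): snapshot the process environment into a dict."""
    return dict(process_env)


def make_argv(process_env, argv):
    """Toy patched makeargvobject(): take argv[0] from VAR, then unset VAR."""
    path = process_env.pop(VAR, None)
    if path is None:
        return list(argv)
    return [path] + list(argv[1:])


def startup(process_env, argv, snapshot_first):
    """Run both steps in the given order; return (sys.argv, os.environ)."""
    process_env = dict(process_env)  # leave the caller's dict untouched
    if snapshot_first:
        # Order described above: the snapshot still contains VAR.
        os_environ = convert_environ(process_env)
        sys_argv = make_argv(process_env, argv)
    else:
        # Reversed order: VAR is already gone when the snapshot is taken.
        sys_argv = make_argv(process_env, argv)
        os_environ = convert_environ(process_env)
    return sys_argv, os_environ
```

With snapshot_first=True the returned dict still carries the variable, so an exec call that passes it on hands the variable to the child; with snapshot_first=False it does not.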
Arfrever: do you agree that bug #313639 is the same as this bug?
*** Bug 313639 has been marked as a duplicate of this bug. ***
Arfrever: can you please solve this annoying problem? It has brought down a substantial number of build nodes by now. Mark gave a fairly precise pointer to the cause of this problem in comment #4, so I'd expect you to be able to release a fix easily, wouldn't you?
It has been breaking things badly for quite some time now.
(In reply to comment #7 and comment #8) I was busy with other bugs.
Created attachment 229849 [details, diff] python-2.6-os.environ.patch Please test this patch.
seems not to help:

running autogen
/scratch/tmp/monetdb-buildbot/slave-sparc/sparc-solaris10/build/sources/buildtools/autogen/autogen.py: Unknown command: .
Usage: buildbot <command> [command options]
Options:
  --version
  --help Display this help and exit.
  --verbose
Commands:

I'm not sure if these two in the environment cause the trouble:

GENTOO_PYTHON_PROCESS_NAME=buildbot
GENTOO_PYTHON_TARGET_SCRIPT_PATH=/scratch/tmp/gentoo/usr/bin/buildbot-2.6

If I change the patch to read:

+ if (strncmp("GENTOO_PYTHON_", *e, strlen("GENTOO_PYTHON_")) == 0)
+     continue;

then they are all gone, and the autogen job in buildbot appears to work again.
Could you test the following executables:

$ cat test1.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    const char* test_variable = getenv("TEST_VARIABLE");
    if (test_variable)
        printf("test1: TEST_VARIABLE=\"%s\"\n", test_variable);
    putenv("TEST_VARIABLE");
    execv("./test2", argv);
}
$ cat test2.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    const char* test_variable = getenv("TEST_VARIABLE");
    if (test_variable)
        printf("test2: TEST_VARIABLE=\"%s\"\n", test_variable);
    return 0;
}
$ gcc -o test1 test1.c
$ gcc -o test2 test2.c
$ TEST_VARIABLE=1 ./test1
test1: TEST_VARIABLE="1"
% env TEST_VARIABLE=1 ./test1
test1: TEST_VARIABLE="1"
test2: TEST_VARIABLE="1"

putenv - change or add value to environment

The putenv() function makes the value of the environment variable name equal to value by altering an existing variable or creating a new one. In either case, the string pointed to by string becomes part of the environment, so altering the string will change the environment.
My /usr/include/stdlib.h contains:

#if defined __USE_SVID || defined __USE_XOPEN
/* The SVID says this is in <stdio.h>, but this seems a better place. */
/* Put STRING, which is of the form "NAME=VALUE", in the environment.
   If there is no `=', remove NAME from the environment. */
extern int putenv (char *__string) __THROW __nonnull ((1));
#endif

So it seems that your libc is broken.
No, my libc just has a different contract than yours.
I will use unsetenv() instead of putenv().
Neither unsetenv nor putenv is guaranteed to exist. putenv may leak memory on some hosts. unsetenv and setenv don't exist on this host. Look at the python code and documentation to see all the ifdefs and the strong arguments for just passing a new environment array to exec. I'm still wondering what is so special about the wrapper that it cannot impersonate itself to the python process.
configure.in already has a check for unsetenv(). unsetenv() will be used when HAVE_UNSETENV is defined. Otherwise putenv() will be used.
great, so it still won't work, because unsetenv doesn't exist, and putenv doesn't work as you expect.
unsetenv() will be used when HAVE_UNSETENV is defined. putenv() will not be used when HAVE_UNSETENV is not defined. GENTOO_PYTHON_* variables will be ignored in convertenviron().
Fixed in 2.6.5-r2 and 3.1.2-r3.