| Summary: | dev-python/pyopencl should make selection of OpenCL version possible | ||
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | hangglider |
| Component: | Current packages | Assignee: | Marek Szuba <marecki> |
| Status: | RESOLVED INVALID | ||
| Severity: | enhancement | ||
| Priority: | Normal | ||
| Version: | unspecified | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Package list: | Runtime testing required: | --- | |
|
Description
hangglider
2020-09-23 17:54:39 UTC
In general I am not against the idea, but then again I would like to know for sure that it is actually necessary. Could you please demonstrate how using pyopencl in its default configuration (i.e. declaring support for whatever the highest OpenCL version supported by opencl-headers is) on hardware only supporting an older OpenCL actually causes problems?

/usr/include/CL/cl_version.h even supports setting a default version (see lines 20..35 there).

I'm using a Thinkpad W541 currently (IMHO not so historical hardware), containing an NVIDIA Quadro K2100M, where the latest driver version is -418.113, so the latest in portage is -390.138-r4, allowing for OpenCL platform version 1.2 according to clinfo; even if I hacked -418.113, there's no benefit.

If I install pyopencl and try to run an example program (e.g. python /usr/share/doc/pyopencl-2020.2.2/examples/demo_mandelbrot.py), I get an error that symbol OPENCL_2_1 is not available in _cl.cpython-37m-x86_64-linux-gnu.so (will have to re-install the original tonight to re-generate the full message). Only if I modify the ebuild to additionally set --cl-pretend-version=1.2 in my case does the example execute correctly.

If it were possible to select that at runtime (by an env variable or symlink that I'm maybe just not aware of), I'd be completely happy with that - no need to increase the complexity.

(In reply to hangglider from comment #2)
> /usr/include/CL/cl_version.h even supports setting a default version (see
> lines 20..35 there).

Yes, pyopencl uses this at build time.

> I'm using a Thinkpad W541 currently (IMHO not so historical hardware),
> containing an NVIDIA Quadro K2100M, where the latest driver version is
> -418.113, so the latest in portage is -390.138-r4, allowing for OpenCL
> platform version 1.2 according to clinfo; even if I hacked -418.113,
> there's no benefit.

The reason you are stuck with OpenCL 1.2 is not due to either the age of the hardware or the driver, it's because NVidia have chosen not to implement SVM support in their OpenCL runtime - meaning it can never be fully compliant with 2.x.

> If I install pyopencl and try to run an example program (e.g. python
> /usr/share/doc/pyopencl-2020.2.2/examples/demo_mandelbrot.py), I get an
> error that symbol OPENCL_2_1 is not available in
> _cl.cpython-37m-x86_64-linux-gnu.so (will have to re-install the original
> tonight to re-generate the full message). Only if I modify the ebuild to
> additionally set --cl-pretend-version=1.2 in my case does the example
> execute correctly.

Okay, now this is interesting - not only is this a linker issue rather than the code itself failing to run, OPENCL_2_1 and the like should come from libOpenCL.so itself rather than from the pyopencl Python extension. Which OpenCL ICD loader do you use, dev-libs/ocl-icd or dev-libs/opencl-icd-loader? Could you run 'nm' on your libOpenCL.so and see what symbols beginning with OPENCL it contains?

> If it were possible to select that at runtime (by an env variable or
> symlink that I'm maybe just not aware of), I'd be completely happy with
> that - no need to increase the complexity.

Alas, as far as I can tell changing this is only possible at build time.

Clarification in case you haven't used nm before: you will need to invoke it with the option -D in order to see the dynamic symbols. By default it shows something else - or nothing in the case of stripped binaries, which will likely be the case for your libOpenCL.

Thanks, Marek, for the rapid response.

# nm -D /usr/lib64/libOpenCL.so | grep OPENCL_
0000000000000000 A OPENCL_1.0
0000000000000000 A OPENCL_1.1
0000000000000000 A OPENCL_1.2
0000000000000000 A OPENCL_2.0
0000000000000000 A OPENCL_2.1
0000000000000000 A OPENCL_2.2

and I'm using dev-libs/ocl-icd.

Shame on NVIDIA, but I fear that many people share that... so a solution would probably keep such behaviour, but help some on the other hand.

OK, that looks reasonable. Next step, let's see if this is in fact the OpenCL library pyopencl attempts to use :-) From the same environment as where you can observe errors in pyopencl examples, run 'ldd /usr/lib/python3.7/site-packages/pyopencl/_cl.cpython-37m-x86_64-linux-gnu.so' and check what the full path to libOpenCL.so is.

I have managed to reproduce the problem using =x11-drivers/nvidia-drivers-450.66. It happens if the system uses the libOpenCL provided by NVidia (which does not export the symbols "OPENCL_2.1" and "OPENCL_2.2") rather than the one from an ICD loader; when the latter is used, pyopencl works fine without version overrides.

Marking this issue as INVALID because since May 2020 Gentoo officially only supports using the libOpenCL provided by one of the two ICD loaders in the tree; feel free to reopen it if it occurs on your system even in a supported configuration. |
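The symbol check discussed in this thread can be scripted. Below is a minimal Python sketch, not part of pyopencl or any Gentoo tooling, that parses `nm -D` output and reports which `OPENCL_x.y` version nodes a given libOpenCL.so lacks. The function names `opencl_version_nodes` and `missing_nodes` are made up for illustration. The first sample is the output quoted above for dev-libs/ocl-icd; the second is a hypothetical listing that assumes a vendor library exporting only the nodes up to 2.0, since the thread states only that 2.1 and 2.2 are absent.

```python
import re

# Matches version-node names such as "OPENCL_1.2" in a line of `nm -D` output.
_VERSION_NODE = re.compile(r"\bOPENCL_(\d+)\.(\d+)\b")


def opencl_version_nodes(nm_output):
    """Return the sorted (major, minor) version nodes listed in nm_output."""
    found = set()
    for line in nm_output.splitlines():
        match = _VERSION_NODE.search(line)
        if match:
            found.add((int(match.group(1)), int(match.group(2))))
    return sorted(found)


def missing_nodes(nm_output, required):
    """Return the entries of `required` that nm_output does not list."""
    present = set(opencl_version_nodes(nm_output))
    return [node for node in required if node not in present]


# Output quoted in the thread for the libOpenCL.so from dev-libs/ocl-icd.
ICD_LOADER_NM = """\
0000000000000000 A OPENCL_1.0
0000000000000000 A OPENCL_1.1
0000000000000000 A OPENCL_1.2
0000000000000000 A OPENCL_2.0
0000000000000000 A OPENCL_2.1
0000000000000000 A OPENCL_2.2
"""

# Hypothetical listing (an assumption, not taken from a real driver) for a
# vendor-provided library that lacks the 2.1 and 2.2 nodes.
VENDOR_NM = """\
0000000000000000 A OPENCL_1.0
0000000000000000 A OPENCL_1.1
0000000000000000 A OPENCL_1.2
0000000000000000 A OPENCL_2.0
"""

# Nodes that the thread reports as missing in the failing configuration.
REQUIRED = [(2, 1), (2, 2)]

if __name__ == "__main__":
    print("ICD loader is missing:", missing_nodes(ICD_LOADER_NM, REQUIRED))
    print("Vendor library is missing:", missing_nodes(VENDOR_NM, REQUIRED))
```

On a live system the same functions could be given the real output of `nm -D` on whichever libOpenCL.so path `ldd` reports for the pyopencl extension, for example captured with `subprocess.run`.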