561706 – x11-drivers/nvidia-drivers should install CUDA/OpenCL libraries separately from USE=X


Status:	RESOLVED FIXED

Alias:	None

Product:	Gentoo Linux
Classification:	Unclassified
Component:	Current packages
Hardware:	All Linux

Importance:	Normal normal
Assignee:	David Seifert

URL:
Whiteboard:
Keywords:	PullRequest

Duplicates (3):	592984 600182 663830
Depends on:
Blocks:

Reported:	2015-09-28 08:32 UTC by Mirko Guenther
Modified:	2021-03-21 15:53 UTC
CC List:	13 users

See Also:	https://github.com/gentoo/gentoo/pull/19812
Package list:
Runtime testing required:	---

Attachments
system.info (system.info, 10.00 KB, text/plain) 2016-03-19 18:10 UTC, Sabayonino
debug-nvidia-drivers.log.tbz2 (debug-nvidia-drivers.log.tbz2, 146 bytes, application/x-bzip-compressed-tar) 2016-03-20 22:59 UTC, Sabayonino
nvidia-drivers-361.28.ebuild patch (nv.diff, 685 bytes, patch) 2016-06-28 11:04 UTC, frank
nvidia-drivers-361.28.ebuild.patch (file_561706.txt, 2.15 KB, patch) 2016-08-26 15:10 UTC, frank
nvidia-drivers-381.22.ebuild.patch (nvidia-drivers-381.22.ebuild.patch, 3.28 KB, patch) 2017-06-06 21:21 UTC, Tommie
nvidia-drivers-430.40.ebuild (nvidia-drivers-430.40.ebuild.patch, 3.64 KB, patch) 2019-08-06 18:17 UTC, Liam Shepherd

Description Mirko Guenther 2015-09-28 08:32:29 UTC

With older driver versions it was possible to install and use CUDA/OpenCL without installing the X system, i.e. without enabling the 'X' USE flag.

The logic in the new ebuild was rewritten so that the 'X' USE flag must now be enabled to get CUDA/OpenCL support.

Reproducible: Always

Comment 1 Sabayonino 2016-03-19 18:10:31 UTC

Created attachment 428604 [details]
system.info

Comment 2 Sabayonino 2016-03-19 18:12:45 UTC

Same issue on 4 PCs without an X server.
OpenCL stopped working with nvidia-drivers >352.xx.

Only one PC runs OpenCL fine, with the latest nvidia-drivers and an X server running.

Regards

Comment 3 Sabayonino 2016-03-20 22:59:41 UTC

Created attachment 428686 [details]
debug-nvidia-drivers.log.tbz2

Comment 4 Sabayonino 2016-03-20 23:00:35 UTC

I tried installing newer nvidia-drivers versions (361.18, 358.15, 355.11).

The drivers seem to install successfully, but:

"* Adding module to moduledb.
!!! Error: Unrecognized option: nvidia
exiting"

#########


>>> Installing (1 of 1) x11-drivers/nvidia-drivers-358.16-r1::gentoo
 * Removing /usr/share/doc
 * >>> SetUID: [chmod go-r] /opt/bin/nvidia-modprobe ... [ ok ]
 * Removing x11-drivers/nvidia-drivers-352.63 from moduledb.
 * Updating module dependencies for 4.1.15-gentoo-r1 ... [ ok ]
 * Adding module to moduledb.
!!! Error: Unrecognized option: nvidia
exiting
 * You have elected to not install the X.org driver. Along with
 * this the OpenGL libraries and VDPAU libraries were not
 * installed. Additionally, once the driver is loaded your card
 * and fan will run at max speed which may not be desirable.
 * Use the 'nvidia-smi' init script to have your card and fan
 * speed scale appropriately.
 * 
 * USE=tools controls whether the nvidia-settings application
 * is installed. If you would like to use it, enable that
 * flag and re-emerge this ebuild. Optionally you can install
 * media-video/nvidia-settings
 * 

 * Messages for package x11-drivers/nvidia-drivers-358.16-r1:

 * You have elected to not install the X.org driver. Along with
 * this the OpenGL libraries and VDPAU libraries were not
 * installed. Additionally, once the driver is loaded your card
 * and fan will run at max speed which may not be desirable.
 * Use the 'nvidia-smi' init script to have your card and fan
 * speed scale appropriately.
 * 
 * USE=tools controls whether the nvidia-settings application
 * is installed. If you would like to use it, enable that
 * flag and re-emerge this ebuild. Optionally you can install
 * media-video/nvidia-settings
 * 
>>> Auto-cleaning packages...

>>> No outdated packages were found on your system.

 * GNU info directory index is up-to-date.


No NVIDIA OpenCL implementation is available:

eselect opencl list
Available OpenCL implementations:
  [1]   mesa


Merging nvidia-drivers with the --debug option gives:
+ //usr/bin/eselect opencl set --use-old nvidia
!!! Error: Unrecognized option: nvidia
exiting
See the attachment for the full log.

Comment 5 Zoltan Puskas 2016-03-30 09:41:29 UTC

Same with nvidia-drivers-364.12-r1. Is there any particular reason for this? Why is a full X server installation needed?

Comment 6 Jeroen Roovers (RETIRED) gentoo-dev 2016-04-16 11:08:53 UTC

Comment on attachment 428686 [details]
debug-nvidia-drivers.log.tbz2

I have no idea what this is.

Comment 7 frank 2016-06-28 11:04:36 UTC

Created attachment 439072 [details, diff]
nvidia-drivers-361.28.ebuild patch

An easy workaround for installing the (OpenCL) libraries without enabling the X USE flag and its dependencies...

Comment 8 Sabayonino 2016-07-08 22:41:12 UTC

Hi Frank, thanks for this, but OpenCL still does not work.

nvidia-drivers-361.28 is installed and loaded:

lsmod  | grep nvidia
nvidia               9305080  0


but no OpenCL implementation is available for nvidia:

eselect opencl list
Available OpenCL implementations:
  [1]   intel



 equery u nvidia-drivers
[ Legend : U - final flag setting for installation]
[        : I - package is installed with flag     ]
[ Colors : set, unset                             ]
 * Found these USE flags for x11-drivers/nvidia-drivers-361.28:
 U I
 - - X           : Install the X.org driver, OpenGL libraries, XvMC libraries, and VDPAU libraries
 + + acpi        : Add support for Advanced Configuration and Power Interface
 - - compat      : Install non-GLVND libGL for backwards compatibility
 + - driver      : Install the kernel driver module
 - - gtk3        : Install nvidia-settings with support for GTK+ 3
 + - kms         : Enable support for kernel mode setting (KMS)
 + + multilib    : On 64bit systems, if you want to be able to compile 32bit and 64bit binaries
 - - pax_kernel  : PaX patches from the PaX project
 - - static-libs : Build static versions of dynamic libraries as well
 - - tools       : Install additional tools such as nvidia-settings
 + - uvm         : Install the Unified Memory kernel module (nvidia-uvm) for sharing memory between CPU and GPU in CUDA programs

Any suggestions?
:)
Cheers

Comment 9 frank 2016-07-08 22:56:25 UTC

(In reply to Sabayonino from comment #8)
> Hi Frank, thanks for this but opencl still not works
> 
> nvidia-drivers-361.28 installed and loaded
> 
> lsmod  | grep nvidia
> nvidia               9305080  0
> 
> 
> but no opencl available for nvidia
> 
> eselect opencl list
> Available OpenCL implementations:
>   [1]   intel
> 
> 
> 
>  equery u nvidia-drivers
> [ Legend : U - final flag setting for installation]
> [        : I - package is installed with flag     ]
> [ Colors : set, unset                             ]
>  * Found these USE flags for x11-drivers/nvidia-drivers-361.28:
>  U I
>  - - X           : Install the X.org driver, OpenGL libraries, XvMC
> libraries, and VDPAU libraries
>  + + acpi        : Add support for Advanced Configuration and Power Interface
>  - - compat      : Install non-GLVND libGL for backwards compatibility
>  + - driver      : Install the kernel driver module
>  - - gtk3        : Install nvidia-settings with support for GTK+ 3
>  + - kms         : Enable support for kernel mode setting (KMS)
>  + + multilib    : On 64bit systems, if you want to be able to compile 32bit
> and 64bit binaries
>  - - pax_kernel  : PaX patches from the PaX project
>  - - static-libs : Build static versions of dynamic libraries as well
>  - - tools       : Install additional tools such as nvidia-settings
>  + - uvm         : Install the Unified Memory kernel module (nvidia-uvm) for
> sharing memory between CPU and GPU in CUDA programs
> 
> any suggestions ?
> :)
> cheers

You should apply the patch attached above to the ebuild and then emerge it with the opencl USE flag enabled.

Comment 10 Jeroen Roovers (RETIRED) gentoo-dev 2016-07-09 06:40:13 UTC

Comment on attachment 439072 [details, diff]
nvidia-drivers-361.28.ebuild patch

1. Inverted patch.
2. Looks like a hack: why would you want all of the GLX libraries without USE=X?

Wouldn't it be better to separate out the OpenCL specific libraries using a new USE flag, and then change virtual/opencl to depend on x11-drivers/nvidia-drivers[opencl]?
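
A minimal sketch of the split proposed here, written as plain shell rather than as real ebuild code. The helper name select_libs is hypothetical, the compute library names are the ones listed later in comment 13, and the X-side names (libGL.so, libvdpau_nvidia.so) are placeholders for "whatever stays behind USE=X"; none of this is taken from an actual nvidia-drivers ebuild.

```shell
#!/bin/sh
# Sketch only: flag names follow the thread (X, opencl); the library
# lists are illustrative and not copied from any real ebuild.

# Libraries the thread reports as needed for headless compute (comment 13).
COMPUTE_LIBS="libnvidia-compiler.so libnvidia-ml.so libnvidia-opencl.so"
# Placeholder names for libraries that would remain tied to USE=X.
X_LIBS="libGL.so libvdpau_nvidia.so"

# select_libs FLAG... : print the libraries to install, one per line,
# for the enabled flags given as arguments.
select_libs() {
    for flag in "$@"; do
        case "$flag" in
            opencl) printf '%s\n' $COMPUTE_LIBS ;;
            X)      printf '%s\n' $X_LIBS ;;
        esac
    done
}
```

With only opencl enabled, select_libs prints the three compute libraries and nothing from the X list, which is the behaviour a headless machine would want.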

Comment 11 frank 2016-07-09 10:35:36 UTC

Of course, mine was just a quick workaround, not a proposal...

Comment 12 Sabayonino 2016-07-13 17:17:50 UTC

nvidia-drivers-352.63 is the latest driver version with working OpenCL.
With it I can see the /etc/OpenCL/* and /usr/lib{32,64}/OpenCL/* directories.

With >=nvidia-drivers-352.79 there is no /usr/lib32/OpenCL directory;
the /usr/lib64/OpenCL directory exists.

There is a broken symlink at
/usr/lib32/libOpenCL.so, which points to OpenCL/vendors/nvidia/libOpenCL.so.1.0.0

Forcing ABI_X86="32 64" does not solve it.


The directory contents are as follows:
*** nvidia-352.63 *** (OpenCL working)
/usr/lib32/OpenCL/vendors/nvidia

libOpenCL.so -> libOpenCL.so.1
libOpenCL.so.1 -> libOpenCL.so.1.0.0
libOpenCL.so.1.0.0

/usr/lib64/OpenCL/global/include/CL
cl_ext.h  cl_gl_ext.h  cl_gl.h  cl.h  cl.hpp  cl_platform.h  opencl.h



 *** nvidia 367.27 or lower ***

/usr/lib64/OpenCL/global/include/CL
cl_ext.h  cl_gl_ext.h  cl_gl.h  cl.h  cl.hpp  cl_platform.h  opencl.h

/usr/lib32 has no OpenCL directory.

:)
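
A dangling link like the /usr/lib32/libOpenCL.so one described above can be found with a short shell check. This is only a sketch: list_dangling is a hypothetical helper name, and it assumes the readlink utility is available.

```shell
#!/bin/sh
# Sketch only: list_dangling is a hypothetical helper, not a Gentoo tool.
# list_dangling DIR : print every symlink directly inside DIR whose
# target does not exist, together with the target it points to.
list_dangling() {
    for entry in "$1"/*; do
        # -L: entry is a symlink; ! -e: following it leads nowhere
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s -> %s\n' "$entry" "$(readlink "$entry")"
        fi
    done
}
```

Running list_dangling /usr/lib32 on an affected system would be expected to list libOpenCL.so with its missing OpenCL/vendors/nvidia target.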

Comment 13 Sabayonino 2016-07-13 17:46:45 UTC

Sorry for the additional post.

*** 352.63 ***
/usr/lib32 contents

ls | grep nvidia
libnvidia-compiler.so
libnvidia-compiler.so.1
libnvidia-compiler.so.352.63
libnvidia-ml.so
libnvidia-ml.so.1
libnvidia-ml.so.352.63
libnvidia-opencl.so
libnvidia-opencl.so.1
libnvidia-opencl.so.352.63

*** >352.63 ***
Missing


Is ABI_X86="32 64" or the "multilib" USE flag not working properly?

:) cheers

Comment 14 Zoltan Puskas 2016-07-24 01:50:05 UTC

The need for glx, opencl, cuda libraries without X usually arises on headless servers running (usually some form of scientific) computation on the GPU.

Comment 15 Sabayonino 2016-07-26 17:23:53 UTC

(In reply to Zoltan Puskas from comment #14)
> The need for glx, opencl, cuda libraries without X usually arises on
> headless servers running (usually some form of scientific) computation on
> the GPU.

I'm running BOINC (and only this) on several PCs without an X server. I don't need "X".
I manage all BOINC clients remotely or from the command line.

Comment 16 frank 2016-08-26 15:10:09 UTC

Created attachment 444198 [details, diff]
nvidia-drivers-361.28.ebuild.patch

(In reply to Jeroen Roovers from comment #10)
> Comment on attachment 439072 [details, diff]
> nvidia-drivers-361.28.ebuild patch
> 
> 1. Inverted patch.
> 2. Looks like a hack: why would you want all of the GLX libraries without
> USE=X?
> 
> Wouldn't it be better to separate out the OpenCL specific libraries using a
> new USE flag, and then change virtual/opencl to depend on
> x11-drivers/nvidia-drivers[opencl]?

Here it goes again...
This ebuild patch adds an "opencl" USE flag (enabled by default) which installs only the needed OpenCL/CUDA libraries (leaving out OpenGL and other unneeded stuff).
Again, this works for me, though I am not sure exactly what I did.

Comment 17 Brian Munro 2016-09-06 13:13:44 UTC

*** Bug 592984 has been marked as a duplicate of this bug. ***

Comment 18 Sabayonino 2016-10-05 10:44:13 UTC

(In reply to frank from comment #16)
> Created attachment 444198 [details, diff]
> nvidia-drivers-361.28.ebuild.patch
> 
> (In reply to Jeroen Roovers from comment #10)
> > Comment on attachment 439072 [details, diff]
> > nvidia-drivers-361.28.ebuild patch
> > 
> > 1. Inverted patch.
> > 2. Looks like a hack: why would you want all of the GLX libraries without
> > USE=X?
> > 
> > Wouldn't it be better to separate out the OpenCL specific libraries using a
> > new USE flag, and then change virtual/opencl to depend on
> > x11-drivers/nvidia-drivers[opencl]?
> 
> here it goes again...
> this ebuild patch adds an "opencl" use flag (enabled by default) which only
> installs needed opencl/cuda libraries (leaving out opengl and unneeded
> stuff).
> again, this works for me and i don't have idea what i did

Hi,
I've tried your patch, but I still have no OpenCL implementation for nvidia.

The drivers are installed.

I can't run OpenCL apps:

# eselect opencl list
Available OpenCL implementations:
  [1]   mesa

[?] x11-drivers/nvidia-drivers (361.28-r100(0/361){tbz2}@10/05/2016 -> 361.28(0/361)^msd{tbz2}): NVIDIA Accelerated Graphics Driver


:(

Comment 19 Brian Munro 2016-10-05 10:59:59 UTC

For nvidia-drivers 367.44, I had to use the above patch plus include the following two files under NV_OPENCL_LIBRARIES

libnvidia-compiler.so
libnvidia-fatbinaryloader.so

Only then was I able to run the cuda 8 sdk samples without X.

Comment 20 Sabayonino 2016-10-05 11:11:54 UTC

Your patch was applied,

but I can't see the opencl flag after installing the drivers:

Installed versions:  361.28-r100(0/361)^msd{tbz2}[1](01:07:32 PM 10/05/2016)(acpi driver kms multilib uvm -X -gtk3 -pax_kernel -static-libs -tools 

[1] "local" /usr/local/portage


and I still can't run OpenCL:

eselect opencl list
Available OpenCL implementations:
  [1]   mesa

Comment 21 Sabayonino 2016-10-05 11:13:23 UTC

(In reply to Brian Munro from comment #19)
> For nvidia-drivers 367.44, I had to use the above patch plus include the
> following two files under NV_OPENCL_LIBRARIES
> 
> libnvidia-compiler.so
> libnvidia-fatbinaryloader.so
> 
> Only then was I able to run the cuda 8 sdk samples without X.

I'll try asap

tnx

Comment 22 Marius Brehler 2017-02-24 14:23:47 UTC

Any news on this? I would like to drop the dependency on the X USE flag in nvidia-cuda-toolkit.

Comment 23 Tommie 2017-06-06 21:21:05 UTC

Created attachment 475426 [details, diff]
nvidia-drivers-381.22.ebuild.patch

Comment 24 Tommie 2017-06-06 21:24:03 UTC

Suggesting the 381.22 patch, from https://github.com/tommie/portage-overlay/blob/eae88761e1e22f17b7faffac3b46c299ff6773e7/x11-drivers/nvidia-drivers/nvidia-drivers-381.22-r1.ebuild

I didn't add the opencl USE flag (no regression). Tried to keep it a minimal patch. The library rearrangement is based on doing ldd to see what libraries depend on libX11 et al.
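
The ldd-based check mentioned above can be sketched roughly as follows. The helper name links_against_x11 is hypothetical, and matching only libX11 and libXext is an assumption made for illustration; the actual patch may have looked at a wider set of X libraries.

```shell
#!/bin/sh
# Sketch only: links_against_x11 is a hypothetical helper, not part of the patch.
# It reads `ldd` output on stdin and succeeds if libX11 or libXext
# (chosen here as examples of X11 client libraries) is among the dependencies.
links_against_x11() {
    grep -Eq 'libX(11|ext)\.so'
}

# Possible use inside an unpacked driver directory (not executed here):
#   for lib in ./*.so*; do
#       ldd "$lib" 2>/dev/null | links_against_x11 && echo "needs X: $lib"
#   done
```

Libraries that never show up in the "needs X" list are the candidates for installing regardless of USE=X.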

Comment 25 Tommie 2017-07-24 08:06:12 UTC

Bumped to 384.47 https://github.com/gentoo/gentoo/pull/5188

Comment 26 Marius Brehler 2018-08-17 07:52:25 UTC

*** Bug 663830 has been marked as a duplicate of this bug. ***

Comment 27 Nathan Lewis 2018-08-23 21:34:24 UTC

bumping this as well

Comment 28 Liam Shepherd 2019-08-06 18:17:53 UTC

Created attachment 585928 [details, diff]
nvidia-drivers-430.40.ebuild

Based on Tommie's patches but also fixes the modprobe config

Comment 29 Klemen Mihevc 2020-03-28 15:17:16 UTC

Why is this still a thing? The CUDA/OpenCL libraries are tied to X, and installing libnvidia-ml.so is tied to X as well, so running nvidia-smi complains that it can't find the library. However, if you remove all X dependencies, nvidia-smi and OpenCL seem to work. I'm using an NVIDIA graphics card in a server without X, but I still want the CUDA/OpenCL libraries and basic access to temperatures.

Comment 30 Alex Orange 2020-11-20 19:07:13 UTC

Again, very much need this fix. The patch seems to work for me (with a slight tweak to drop fatbinary), is there anything more that we (or I) can do to push this along?

P.S. I'd really like to confirm this as both a. it's a problem and b. the patch seems to work for me.

P.P.S. In the meantime, does anyone have a solution to mask the nvidia::gentoo packages and still yell loudly when nvidia has been updated, so that the new version can be patched?
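
One possible approach to the "yell loudly" part, offered only as a sketch: compare the newest nvidia-drivers ebuild in the main tree with the newest patched copy in a local overlay. The helper names newest_version and overlay_is_stale are hypothetical, the directories are passed in by the caller, and ordering relies on `sort -V` from GNU coreutils, which only approximates Portage's own version comparison.

```shell
#!/bin/sh
# Sketch only: newest_version and overlay_is_stale are hypothetical helpers.

# newest_version DIR : print the highest nvidia-drivers version found in
# DIR, with any -rN revision suffix removed.
newest_version() {
    for f in "$1"/nvidia-drivers-*.ebuild; do
        [ -e "$f" ] || continue          # glob matched nothing
        v=${f##*/nvidia-drivers-}        # strip path and package name
        v=${v%.ebuild}                   # strip file extension
        printf '%s\n' "${v%-r[0-9]*}"    # strip revision, if any
    done | sort -V | tail -n 1
}

# overlay_is_stale TREE_DIR OVERLAY_DIR : succeed (and warn on stderr)
# when the tree carries a newer version than the patched overlay copy.
overlay_is_stale() {
    tree=$(newest_version "$1")
    mine=$(newest_version "$2")
    [ "$tree" != "$mine" ] || return 1
    newest=$(printf '%s\n%s\n' "$tree" "$mine" | sort -V | tail -n 1)
    [ "$newest" = "$tree" ] || return 1
    echo "nvidia-drivers $tree is in the tree; overlay still has $mine" >&2
}
```

Something like this could be run after each sync, with the first argument pointing at the x11-drivers/nvidia-drivers directory of the main repository and the second at the same directory in the overlay; both locations are assumptions that depend on the local setup.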

Comment 31 Ionen Wolkens gentoo-dev 2021-03-02 23:29:50 UTC

(In reply to Alex Orange from comment #30)
> Again, very much need this fix.
You may still have to be patient, but just to say this isn't forgotten and will get fixed (I know waiting sounds unreasonable when the fix is simple, but there's some re-organization going on).

Comment 32 Ionen Wolkens gentoo-dev 2021-03-02 23:44:33 UTC

*** Bug 600182 has been marked as a duplicate of this bug. ***

Comment 33 Larry the Git Cow gentoo-dev 2021-03-21 15:53:31 UTC

The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=26146d1510fd678538b7d02400c1eb8e66e20212

commit 26146d1510fd678538b7d02400c1eb8e66e20212
Author:     Ionen Wolkens <sudinave@gmail.com>
AuthorDate: 2021-03-21 15:52:10 +0000
Commit:     David Seifert <soap@gentoo.org>
CommitDate: 2021-03-21 15:52:10 +0000

    x11-drivers/nvidia-drivers: bump to 460.67 with refactored ebuild
    
    ebuild carries a lot of history and, rather than cleanups, it needed
    something closer to a rewrite.
    
    Bugfixes:
    - Removed all udev rules to solve long standing issues (bug #454740)
    - Install libraries with no X11 dependencies with USE=-X,
      notably for headless OpenCL/CUDA (bug #561706)
    - Install systemd unit for persistenced + nvpd user (bug #591638)
    - Add custom error message for DRM_KMS_HELPER and ensure driver
      doesn't attempt building DRM support without it (bug #603818)
    - Warn about AMD SME if enabled by default (bug #652408)
    - Distribute extra sources to lift RESTRICT="bindist mirror", the
      nvidia-driver.eclass is no longer used (bug #732702)
    - Build modprobe and persistenced from source (bug #747145)
    - Use system locations for vulkan icd/layers (bug #749600)
    
    Others:
    - Dropped IUSE=compat/multilib/kms/uvm/wayland
      > compat: was for non-GLVND variants and currently a no-op
      > multilib: obsolete, abi_x86_32 does all that's needed
      > kms/uvm: modules are loaded by nvidia-modprobe as-needed and
        there's not much sense in skipping installation. Will also save
        OpenCL/CUDA packages from having to depend on [uvm]
      > wayland: library is provided by gui-libs/egl-wayland instead which
        now also provides pkgconfig files and can be a newer version.
        optfeature warning was added for awareness.
    - Dropped REQUIRED_USE, all USE can now be used independently, e.g.
      now possible to get libXNVCtrl.a (static-libs) without the
      deps-heavy USE=tools
    - Dropped locale patch, the offending code it was meant to fix is gone.
    - Dropped linker patch, uses right linker even with -native-symlinks.
    - Added modprobe.d .conf to blacklist nouveau by default.
    - Patched nvidia-modprobe to respect nvidia.conf's permissions when
      creating uvm devices, was previously created as world read-write.
    - No longer installing libOpenCL.so loader (not needed to use OpenCL,
      was used by the no longer available eselect-opencl).
    - nvidia-persistenced init script simplified and updated for nvpd user.
    - nvidia-smi init script removed (all it did was query cards every 300
      seconds), mentioned behavior is no longer observable (fan scales
      normally without X) and it wasn't intended for this purpose.
    - Removed I2C_NVIDIA_GPU check as it caused unnecessary noise for
      gentoo-kernel-bin users (built as module), and being a bad thing
      even if loaded is questionable.
    - Attempt to reduce message noise. The only fatal CONFIG_CHECK is
      fairly rare so there's little reason to check twice with pkg_pretend.
    - ... but added new conditional messages to explain important things
      often seen as common sense but that a new user likely won't know.
    - Replaced the nvidia-driver.eclass legacy test with a compact version
      that reads supported-gpus.json (usable on >450).
    - More strict deps, some may sound strange but nvidia-settings only
      use headers for some of these (dbus/Xrandr/Xv/vdpau).
      > X? libs kept separate as it's the only one needing multilib deps.
      > pax-utils now unconditional for scanelf as libraries are always
        installed. Alternatively could've generated those, but prefer to
        leave it easier to maintain for future generations.
      > virtual/opencl removed, no sense in the drivers depending on this
        and it's instead applications using opencl that should.
      > Added MODULES_OPTIONAL_USE="driver" to handle linux-mod deps
    - Added MIT license for persistenced
    - Added ZLIB license for supported-gpus.json
    - NV_KERNEL_MAX (previously NV_KV_MAX_PLUS) set to be <=5.11 form
      rather than <5.12 given that often confused users thinking it meant
      5.12 support from quick looks.
    - arm64 support "should" work but runtime untested
    - And a long list of cleanups that "hopefully" won't cause new issues.
    
    Closes: https://bugs.gentoo.org/454740
    Closes: https://bugs.gentoo.org/561706
    Closes: https://bugs.gentoo.org/591638
    Closes: https://bugs.gentoo.org/603818
    Closes: https://bugs.gentoo.org/652408
    Closes: https://bugs.gentoo.org/732702
    Closes: https://bugs.gentoo.org/747145
    Closes: https://bugs.gentoo.org/749600
    Signed-off-by: Ionen Wolkens <sudinave@gmail.com>
    Signed-off-by: David Seifert <soap@gentoo.org>

 x11-drivers/nvidia-drivers/Manifest                |   7 +
 .../files/nvidia-blacklist-nouveau.conf            |   3 +
 .../files/nvidia-modprobe-390.141-uvm-perms.patch  |  12 +
 .../nvidia-drivers/files/nvidia-persistenced.confd |   7 +
 .../nvidia-drivers/files/nvidia-persistenced.initd |  12 +
 .../nvidia-drivers/nvidia-drivers-460.67.ebuild    | 391 +++++++++++++++++++++
 6 files changed, 432 insertions(+)