
Bug 505092

Summary: x11-drivers/nvidia-drivers-334.21 - unprivileged CUDA apps have no access when nvidia_uvm is not loaded
Product: Gentoo Linux Reporter: whgentoo
Component: [OLD] Library Assignee: Jeroen Roovers (RETIRED) <jer>
Status: RESOLVED FIXED    
Severity: normal CC: kazbanov, kparent
Priority: Normal    
Version: unspecified   
Hardware: All   
OS: Linux   
URL: https://bbs.archlinux.org/viewtopic.php?pid=1389012

Description whgentoo 2014-03-19 15:30:51 UTC
This bug comes from NVIDIA. After upgrading the driver to 334.21, CUDA programs fail with the following error message:

CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL

Some discussions can be found at https://bbs.archlinux.org/viewtopic.php?pid=1389012 and 
https://devtalk.nvidia.com/default/topic/699610/linux/334-21-driver-returns-999-on-cuinit-cuda-/ 


Reproducible: Always

Steps to Reproduce:
1. Upgrade to nvidia-drivers-334.21
2. Run any CUDA program, for example 'deviceQuery' from the CUDA SDK

Actual Results:  
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL
Comment 1 Jeroen Roovers (RETIRED) gentoo-dev 2014-03-19 15:36:32 UTC
Works for me. After the upgrade, did you happen to re-run eselect opengl/opencl yet?
Comment 2 Jeroen Roovers (RETIRED) gentoo-dev 2014-03-19 15:37:26 UTC
Please post your `emerge --info x11-drivers/nvidia-drivers' output in a comment.
Comment 3 whgentoo 2014-03-19 15:46:34 UTC
Portage 2.2.8-r1 (default/linux/amd64/13.0/desktop/kde, gcc-4.7.3,
glibc-2.17, 3.10.32-gentoo x86_64)
=================================================================
                        System Settings
=================================================================
System uname: Linux-3.10.32-gentoo-x86_64-AMD_FX-tm-6100_Six-Core_Processor-with-gentoo-2.2
KiB Mem:     8177336 total,   6617028 free
KiB Swap:     524284 total,    524284 free
Timestamp of tree: Tue, 18 Mar 2014 23:15:01 +0000
ld GNU ld (GNU Binutils) 2.23.2
app-shells/bash:          4.2_p45
dev-lang/python:          2.7.5-r3, 3.3.3
dev-util/cmake:           2.8.11.2
dev-util/pkgconfig:       0.28
sys-apps/baselayout:      2.2
sys-apps/openrc:          0.12.4
sys-apps/sandbox:         2.6-r1
sys-devel/autoconf:       2.13, 2.69
sys-devel/automake:       1.11.6, 1.12.6, 1.13.4
sys-devel/binutils:       2.23.2
sys-devel/gcc:            4.7.3-r1
sys-devel/gcc-config:     1.7.3
sys-devel/libtool:        2.4.2
sys-devel/make:           3.82-r4
sys-kernel/linux-headers: 3.9 (virtual/os-headers)
sys-libs/glibc:           2.17
Repositories: gentoo science gentoo-zh
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA NVIDIA-CUDA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d
/etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release
/etc/revdep-rebuild /etc/sandbox.d /etc/terminfo
/etc/texmf/language.dat.d /etc/texmf/language.def.d
/etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified
distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch
preserve-libs protect-owned sandbox sfperms strict
unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv
usersandbox usersync"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://gentoo.cites.uiuc.edu/pub/gentoo/"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j7"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times
--omit-dir-times --compress --force --whole-file --delete --stats
--human-readable --timeout=180 --exclude=/distfiles --exclude=/local
--exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/lib/layman/science /var/lib/layman/gentoo-zh"
SYNC="rsync://rsync25.us.gentoo.org/gentoo-portage"
USE="X a52 aac acl acpi alsa amd64 avx berkdb bindist bluetooth
branding bzip2 cairo cdda cdr cli consolekit cracklib crypt cups cxx
dbus declarative dri dts dvd dvdr emboss encode exif fam firefox flac
fma4 fortran gdbm gif gpm gtk iconv ipv6 jpeg kde kipi latex lcms ldap
libnotify mad mmx mng modules mp3 mp4 mpeg multilib ncurses nls nptl
ogg opengl openmp pam pango pcre pdf phonon plasma png policykit ppds
qt3support qt4 readline sdl semantic-desktop session spell sse sse2
sse3 sse4 ssl ssse3 startup-notification svg tcpd tiff truetype udev
udisks unicode upower usb vorbis wxwidgets x264 xcb xcomposite
xinerama xml xscreensaver xv xvid zlib" ABI_X86="64"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci
emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0
intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions
alias auth_basic authn_alias authn_anon authn_dbm authn_default
authn_file authz_dbm authz_default authz_groupfile authz_host
authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock
deflate dir disk_cache env expires ext_filter file_cache filter
headers include info log_config logio mem_cache mime mime_magic
negotiation rewrite setenvif speling status unique_id userdir
usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets
stage tables krita karbon braindump author" CAMERAS="ptp2"
COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog"
ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18
garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver
oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip
tripmate tnt ublox ubx" INPUT_DEVICES="keyboard mouse evdev"
KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216
lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console
presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice"
PHP_TARGETS="php5-5" PYTHON_SINGLE_TARGET="python2_7"
PYTHON_TARGETS="python2_7 python3_3" RUBY_TARGETS="ruby19 ruby20"
USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock
lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee
tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL,
PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS,
PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON

=================================================================
                        Package Settings
=================================================================

x11-drivers/nvidia-drivers-334.21 was built with the following:
USE="X acpi (multilib) tools -pax_kernel -uvm"

(In reply to Jeroen Roovers from comment #2)
> Please post your `emerge --info x11-drivers/nvidia-drivers' output in a
> comment.
Comment 4 whgentoo 2014-03-19 15:47:35 UTC
Yes I did re-run them.

(In reply to Jeroen Roovers from comment #1)
> Works for me. After the upgrade, did you happen to re-run eselect
> opengl/opencl yet?
Comment 5 Jeroen Roovers (RETIRED) gentoo-dev 2014-03-19 15:49:16 UTC
Does it help to rebuild with USE=uvm?
Comment 6 Jeroen Roovers (RETIRED) gentoo-dev 2014-03-19 15:49:53 UTC
That's what the Arch Linux chat suggests...
Comment 7 whgentoo 2014-03-19 16:01:29 UTC
Hi Jeroen,

Thanks for the advice. Building with 'uvm' partially solves the problem. However, as mentioned in the Arch thread, I still need to run a CUDA app as root first before I can run one as a normal user.
(In reply to Jeroen Roovers from comment #6)
> That's what the Arch Linux chat suggests...
Comment 8 Jeroen Roovers (RETIRED) gentoo-dev 2014-03-19 16:08:08 UTC
A strace of deviceQuery tells me this:

1) It checks in /proc/modules whether nvidia-uvm is loaded.
2) ...
3) It tries to run </usr/bin/nvidia-modprobe> when it should be trying </opt/bin/nvidia-modprobe> which is where we install it.

Maybe, perhaps, we should install it in /usr/bin instead. Or a symlink perhaps.
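
For anyone who wants to repeat step 1 without strace, here is a minimal sketch of the same kind of check. It assumes the usual /proc/modules layout (module name as the first field on each line) and that the kernel lists the module as nvidia_uvm, with an underscore. The function name module_loaded is only illustrative and is not taken from the driver.

```python
def module_loaded(proc_modules_text, name):
    """Return True if `name` appears as a module in /proc/modules-style text.

    Assumption: the first whitespace-separated field of each line is the
    module name, and hyphens in module names are listed as underscores.
    """
    wanted = name.replace("-", "_")
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            text = handle.read()
    except OSError:
        # /proc may be unavailable (e.g. non-Linux systems).
        text = ""
    print("nvidia_uvm loaded:", module_loaded(text, "nvidia-uvm"))
```

If this prints False for an unprivileged user, the CUDA runtime has to fall back to the nvidia-modprobe helper mentioned in step 3.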
Comment 9 Jeroen Roovers (RETIRED) gentoo-dev 2014-03-19 16:10:33 UTC
(In reply to Jeroen Roovers from comment #8)
> Maybe, perhaps, we should install it in /usr/bin instead. Or a symlink
> perhaps.

No, that doesn't work for unprivileged users either.
Comment 10 whgentoo 2014-03-19 16:21:55 UTC
Looks like nvidia-modprobe needs to be setuid.

Quote from http://us.download.nvidia.com/XFree86/Linux-x86_64/334.21/README/faq.html

"If the user-space NVIDIA driver component cannot load the kernel module or create the device files itself, it will attempt to invoke the setuid root nvidia-modprobe utility, which will perform these operations on behalf of the non-privileged driver."


(In reply to Jeroen Roovers from comment #9)
> (In reply to Jeroen Roovers from comment #8)
> > Maybe, perhaps, we should install it in /usr/bin instead. Or a symlink
> > perhaps.
> 
> No, that doesn't work for unprivileged users either.
Comment 11 Jeroen Roovers (RETIRED) gentoo-dev 2014-03-19 16:28:51 UTC
But that would still fail if it's in /opt/bin instead of /usr/bin, right?
Comment 12 whgentoo 2014-03-19 16:35:51 UTC
Yes, I got it to work now by making nvidia-modprobe setuid and creating a symlink under /usr/bin.
 
(In reply to Jeroen Roovers from comment #11)
> But that would still fail if it's in /opt/bin instead of /usr/bin, right?
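
A rough way to confirm that workaround on a given system is sketched below. It only inspects the two paths named in this bug (/usr/bin/nvidia-modprobe and /opt/bin/nvidia-modprobe) and reports where each one resolves and whether the setuid bit is set. The helper names are illustrative, and whether the file also needs to be owned by root is assumed from the NVIDIA FAQ text quoted in comment 10.

```python
import os
import stat


def mode_is_setuid(mode):
    """Return True if the setuid bit is set in a st_mode value."""
    return bool(mode & stat.S_ISUID)


def describe_helper(path):
    """Describe the file at `path`, following symlinks.

    Returns None if the path does not exist, otherwise a dict with the
    resolved target, whether the setuid bit is set, and the owner uid.
    """
    try:
        st = os.stat(path)  # os.stat follows symlinks
    except OSError:
        return None
    return {
        "target": os.path.realpath(path),
        "setuid": mode_is_setuid(st.st_mode),
        "owner_uid": st.st_uid,
    }


if __name__ == "__main__":
    # Paths taken from this bug report; adjust if your install differs.
    for candidate in ("/usr/bin/nvidia-modprobe", "/opt/bin/nvidia-modprobe"):
        print(candidate, describe_helper(candidate))
```

With the workaround in place, the /usr/bin entry should resolve to the /opt/bin binary and report setuid as True with owner_uid 0.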
Comment 13 Jeroen Roovers (RETIRED) gentoo-dev 2014-03-20 16:04:03 UTC
Fixed in 331.49-r1 and 334.21-r1.
Comment 14 Jeroen Roovers (RETIRED) gentoo-dev 2014-03-20 16:04:27 UTC
Note that your user still needs to be in the video group for this to work.
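
A quick sketch for checking that from the affected user's session follows. It compares the gid of the video group against the process's effective and supplementary groups. The function names are illustrative. Keep in mind that a change to group membership normally only takes effect after logging in again.

```python
import grp
import os


def gid_in_process_groups(gid, effective_gid, supplementary_gids):
    """Return True if `gid` is the effective gid or a supplementary gid."""
    return gid == effective_gid or gid in supplementary_gids


def current_process_in_group(name):
    """Return True if the current process belongs to the named group.

    Returns False if the group does not exist on this system.
    """
    try:
        gid = grp.getgrnam(name).gr_gid
    except KeyError:
        return False
    return gid_in_process_groups(gid, os.getegid(), os.getgroups())


if __name__ == "__main__":
    print("in video group:", current_process_in_group("video"))
```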
Comment 15 Jeroen Roovers (RETIRED) gentoo-dev 2014-03-31 12:23:51 UTC
*** Bug 506360 has been marked as a duplicate of this bug. ***