953074 – sci-ml/ollama: ollama not added to video group automatically

Status:	RESOLVED FIXED

Alias:	None

Product:	GURU
Classification:	Unclassified
Component:	Package issues
Hardware:	All Linux

Importance:	Normal normal
Assignee:	Paul Zander

Reported:	2025-04-03 10:16 UTC by knifed
Modified:	2025-04-08 19:33 UTC
CC List:	2 users

Attachments
Debug log (ollama-bug, 3.26 KB, text/plain) 2025-04-03 10:18 UTC, knifed

Description knifed 2025-04-03 10:16:57 UTC

I have tried this on both my desktop (GTX 970) and my laptop (some newer GTX from the past year or so, Ampere?). The cuda USE flag is enabled. When ollama starts, it reports that it cannot find any hardware through the CUDA driver, and the debug logs show that cudart failed to initialize with error code 100. Ollama treats error code 100 from CUDA as meaning that no CUDA devices were found.

Reproducible: Always

Steps to Reproduce:
1. Build ollama with the cuda USE flag enabled.
2. Start ollama.

Actual Results:
Ollama cannot find any NVIDIA devices via its CUDA implementation.

Expected Results:  
Ollama finds the CUDA devices.

Comment 1 knifed 2025-04-03 10:18:08 UTC

This is happening for me with NVIDIA drivers 570, with the kernel-open USE flag set. I have not tested older versions, and turning off kernel-open causes issues with suspend/resume on my laptop (it has two GPUs).

Comment 2 knifed 2025-04-03 10:18:36 UTC

Created attachment 923542
Debug log

Comment 3 Paul Zander 2025-04-03 11:14:36 UTC

Please make sure you don't set CUDA_VISIBLE_DEVICES somewhere. ollama should run on cuda 12.8 and nvidia-drivers-570 without issues.

Comment 4 knifed 2025-04-03 11:24:33 UTC

(In reply to Paul Zander from comment #3)
> Please make sure you don't set CUDA_VISIBLE_DEVICES somewhere. ollama should
> run on cuda 12.8 and nvidia-drivers-570 without issues.

Here is the log line for the environment variables when ollama starts. CUDA_VISIBLE_DEVICES is not set.


-----

6062:Apr  3 08:59:43 charon ollama: 2025/04/03 08:59:43 routes.go:1230: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:true OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/var/lib/ollama/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"

-----

Here's output of lsmod | grep nvidia:

nvidia_uvm           3010560  0
nvidia_modeset       2088960  6
nvidia              12685312  94 nvidia_uvm,nvidia_modeset
video                  73728  4 thinkpad_acpi,xe,i915,nvidia_modeset

Comment 5 Paul Zander 2025-04-03 11:48:22 UTC

What version of ollama is this? systemd or openrc?

Comment 6 knifed 2025-04-03 11:51:23 UTC

OpenRC. My system has USE=-systemd set. Ollama config in /etc/conf.d is:

export OLLAMA_FLASH_ATTENTION=1
export OLLAMA_INTEL_GPU=1

Comment 7 Paul Zander 2025-04-03 13:28:15 UTC

What version of ollama is this?

Comment 8 knifed 2025-04-03 13:29:17 UTC

It should be 0.6.3, although for whatever reason the compiled binary reports its version as 0.0.0. It is the ~amd64 version of the GURU package.

Comment 9 Paul Zander 2025-04-06 11:15:25 UTC

What does nvidia-smi report? And try sudo -u ollama -g ollama nvidia-smi as well please.

I don't think it's an issue with ollama per se. Might be easier to debug on irc as well.

Comment 10 knifed 2025-04-07 07:26:22 UTC

Excellent point! This is the one thing I did not check. The ollama user was not in the video group, so running nvidia-smi as that user failed with permission denied. So maybe the actual 'bug report' is to print an info message about adding the ollama user to the video group, or to make it happen in the ebuild. I don't know whether that is possible.
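
The check described in this comment can be sketched as a small shell helper. This is a minimal sketch, not part of the ebuild or of ollama: the function names has_group and check_user_group are hypothetical, and it assumes the service account is named ollama and that access to the NVIDIA device nodes is granted through the video group, as reported above.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of any package.

# Return 0 if the space-separated group list in $1 contains exactly the group $2.
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Look up a user's groups with `id -nG` and report whether $2 is among them.
# Returns 0 if the user is a member, 1 if not, 2 if the user does not exist.
check_user_group() {
    user=$1
    group=$2
    list=$(id -nG "$user" 2>/dev/null) || {
        echo "no such user: $user"
        return 2
    }
    if has_group "$list" "$group"; then
        echo "$user is in $group"
    else
        # Assumption: gpasswd from the shadow suite is available; run it as root.
        echo "$user is NOT in $group; as root, try: gpasswd -a $user $group"
        return 1
    fi
}
```

Running check_user_group ollama video on an affected system should print the "NOT in" message. Adding the user by hand is only a stopgap, since the fix referenced later in this bug handles group membership through the ebuild.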

Comment 11 Paul Zander 2025-04-07 18:27:06 UTC

I originally fixed that in f0ba99ae524b3c6fae9696507590e9a5376de095. Seems I missed the non-9999 version. It's in dev now.

Comment 12 Larry the Git Cow 2025-04-08 19:33:44 UTC

The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/proj/guru.git/commit/?id=2b8d876eb74b90e7f4925304df1e5df6193b1529

commit 2b8d876eb74b90e7f4925304df1e5df6193b1529
Author:     Paul Zander <negril.nx+gentoo@gmail.com>
AuthorDate: 2025-04-07 18:08:46 +0000
Commit:     Paul Zander <negril.nx+gentoo@gmail.com>
CommitDate: 2025-04-07 18:12:53 +0000

    sci-ml/ollama: add missing USE-dep to 0.6.3 to put user in video group
    
    Closes: https://bugs.gentoo.org/953074
    Signed-off-by: Paul Zander <negril.nx+gentoo@gmail.com>

 sci-ml/ollama/ollama-0.6.3.ebuild | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)