Created attachment 851494 [details]
gentoo-sources-6.1.12 .config

Hello,

I recently upgraded my kernel to the new release, sys-kernel/gentoo-sources-6.1.12. After installing x11-drivers/nvidia-drivers-525.85.05 into /lib/modules/6.1.12-gentoo/, I restarted my machine. My video output freezes at "Populating /dev with existing devices through uevents...". I can still press CTRL+ALT+DEL to reboot or shut down, so the system itself remains responsive. My last working kernel was 5.15.88-gentoo.

https://i.imgur.com/yX56oCX.jpeg

I pass through one of my graphics cards using vfio-pci. My other graphics card, in slot 2, is my primary video out. Nouveau is blacklisted in my configuration.

/etc/modprobe.d/vfio.conf

    softdep nouveau pre: vfio-pci
    softdep nvidia pre: vfio-pci
    softdep nvidia* pre: vfio-pci
    alias char-major-195 nvidia
    alias /dev/nvidiactl char-major-195
    options vfio-pci ids=10de:2504,10de:228e

/etc/modprobe.d/nvidia.conf

    # NVIDIA drivers options
    # See /usr/share/doc/nvidia-drivers-*/README.txt* for more information.

    # nvidia-drivers and nouveau cannot be used at same time.
    # Comment out the following line if you wish to allow nouveau.
    blacklist nouveau

    # Kernel Mode Setting (notably needed for EGLStream/Wayland)
    # Enabling may possibly cause issues with SLI and Reverse PRIME.
    options nvidia-drm modeset=1

    # Suspend options. Allocations=0 recommended over =1 unless enable nvidia's
    # systemd sleep services (nvidia-hibernate, nvidia-resume, nvidia-suspend),
    # but even then may lead to issues on some setups (keep 0 if in doubt).
    options nvidia \
        NVreg_PreserveVideoMemoryAllocations=0 \
        NVreg_TemporaryFilePath=/var/tmp

    # !!! Security Warning !!!
    # Do not change the DeviceFile options unless you know what you are doing.
    # Only add trusted users to the 'video' group, these users may be able to
    # crash, compromise, or irreparably damage the machine.
    options nvidia \
        NVreg_DeviceFileGID=27 \
        NVreg_DeviceFileMode=432 \
        NVreg_DeviceFileUID=0 \
        NVreg_ModifyDeviceFiles=1

    # Should be no need to touch anything below.
    alias char-major-195 nvidia
    alias /dev/nvidiactl char-major-195
    remove nvidia modprobe -r --ignore-remove nvidia-drm nvidia-modeset nvidia-uvm nvidia
Created attachment 851496 [details]
lspci -k

The 1660 Ti is the video out; the 3060 is the vfio passthrough card.
Created attachment 851498 [details] syslog snippet
Does it work without your vfio setup? May want to check if it works normally at least. Perhaps changes in the kernel w/ vfio are making nvidia think the card is in use (rather than blacklisted nouveau), not that it's something I've kept up with.

vfio passthrough works fine for me still, but I only have one nvidia card, can't test two nor am I familiar with using two at once (so can't say what module options or load order may help).
(In reply to Ionen Wolkens from comment #3)
> Does it work without your vfio setup? May want to check if it works normally
> at least. Perhaps changes in the kernel w/ vfio is making nvidia think the
> card is in use (rather than blacklisted nouveau), not that it's something
> I've kept up with.
>
> vfio passthrough works fine for me still, but I only have one nvidia card,
> can't test two nor am I familiar with using two at once (so can't say what
> module options or load order may help).

I disabled the vfio.conf. Now it does not freeze when I start up the new kernel.

How did you manage to pass in only one card and still use your Linux host as normal? AFAIK if you pass in one card, you have to switch to the virtual machine and then back to the host once the virtual machine is down. In the end I like the two-GPU method because it means I can use my Linux host and the VM at the same time via the shared video memory buffer of looking-glass-client.

But back to why my vfio.conf doesn't work for my configuration with this new kernel: I need to see what changed in the kernel and how it has affected the process.
Tried adding:

    options nvidia ids=10de:2182,10de:1aeb,10de:1aec

to the vfio.conf; it still won't work. My video just ends up freezing. I'm really not sure now...
Ok, got it to work by adding a custom dracut module in /usr/lib/dracut/20vfio-override/

module-setup.sh

    #!/usr/bin/bash

    check() {
        return 0
    }

    depends() {
        return 0
    }

    install() {
        inst_hook pre-udev 00 "$moddir/vfio-pci-override.sh"
    }

vfio-pci-override.sh

    #!/bin/sh

    DEVICES=(
        "0000:01:00.0 " # 3060 (VGA)
        "0000:01:00.1 " # 3060 (audio)
    )

    for dev in ${DEVICES[@]}; do
        echo "vfio-pci" > /sys/bus/pci/devices/$dev/driver_override
    done

    modprobe -i vfio-pci

and adding

    drivers+=" vfio vfio-pci vfio_iommu_type1 "

to /etc/dracut.conf.d/* as well as adding

    rd.driver.pre=vfio-pci

to GRUB_CMDLINE_LINUX_DEFAULT.

Note lspci -k did not show the vfio-pci module was in use until I started using it... and this is fine. I guess they changed how the vfio-pci module is loaded.

    01:00.0 VGA compatible controller: NVIDIA Corporation GA106 [GeForce RTX 3060 Lite Hash Rate] (rev a1)
        Subsystem: eVga.com. Corp. GA106 [GeForce RTX 3060 Lite Hash Rate]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau, nvidia_drm, nvidia
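The override script above hardcodes both functions of the 3060 (VGA and audio). As a rough aid for finding which devices belong together, the sketch below lists everything that shares an IOMMU group with a given PCI address by reading the `iommu_group/devices` directory in sysfs. The function name `iommu_group_devices` and the optional sysfs-root argument are made up for illustration; the second argument exists only so the function can be exercised against a fake directory tree.

```shell
#!/bin/sh
# List every PCI device that shares an IOMMU group with the given device.
# $1: PCI address, e.g. 0000:01:00.0
# $2: sysfs root; defaults to /sys (hypothetical override, for testing only)
iommu_group_devices() {
    group_dir="${2:-/sys}/bus/pci/devices/$1/iommu_group/devices"
    if [ ! -d "$group_dir" ]; then
        echo "no IOMMU group for $1 (is the IOMMU enabled?)" >&2
        return 1
    fi
    for entry in "$group_dir"/*; do
        # Skip the literal pattern if the directory is empty.
        [ -e "$entry" ] || continue
        basename "$entry"
    done
}
```

Called as `iommu_group_devices 0000:01:00.0`, it prints one PCI address per line. Whether every device in the group must be handed to vfio-pci depends on the setup, so treat the output as a starting point rather than a definitive list.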
Oh yeah, and I went ahead and removed the vfio.conf file in /etc/modprobe.d/ from earlier, since I am now using a dracut module to assign vfio-pci to the specified devices.
I was more specifically referring to two nvidia cards, there can be other non-nvidia cards in that statement. Single gpu passthrough is possible, but that does need workarounds to control the host. But anyhow, glad you got it to work :)
    drivers+=" vfio-pci "

is all I really need loaded in my initrd.

What I meant is lspci will look like

    01:00.0 VGA compatible controller: NVIDIA Corporation GA106 [GeForce RTX 3060 Lite Hash Rate] (rev a1)
        Subsystem: eVga.com. Corp. GA106 [GeForce RTX 3060 Lite Hash Rate]
        Kernel modules: nouveau, nvidia_drm, nvidia

with no driver in use, which is good. Once you start a VM, vfio-pci will start being in use.

Maybe I could get vfio on a single GPU, but then I would not have two GPUs in my PC. lel
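For checking the "no driver in use" state without parsing lspci output, the bound driver can also be read from the `driver` symlink in sysfs. This is only a sketch: the function name `pci_driver_in_use` and the optional sysfs-root argument are hypothetical, and it assumes `readlink` is available (it is in coreutils and busybox).

```shell
#!/bin/sh
# Print the kernel driver currently bound to a PCI device, or "none".
# $1: PCI address, e.g. 0000:01:00.0
# $2: sysfs root; defaults to /sys (hypothetical override, for testing only)
pci_driver_in_use() {
    dev_dir="${2:-/sys}/bus/pci/devices/$1"
    if [ ! -d "$dev_dir" ]; then
        echo "no such device: $1" >&2
        return 1
    fi
    if [ -L "$dev_dir/driver" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$dev_dir/driver")"
    else
        echo "none"
    fi
}
```

With the setup described here, `pci_driver_in_use 0000:01:00.0` would be expected to print `none` before a VM is started and `vfio-pci` afterwards, matching the two lspci listings in this thread.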
Also, use #!/bin/sh, not #!/usr/bin/bash. I was following some other dude's guide, but for real, if you switch to dash you are gonna want to rewrite all your scripts anyway. Just saying.
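On the dash point: the override script posted earlier uses a bash array (`DEVICES=( ... )`), which a strictly POSIX /bin/sh will reject. A possible rewrite, offered as a sketch rather than a tested drop-in, passes the PCI addresses as positional arguments instead. The function name `vfio_override` and the sysfs-root argument are invented for this example; the existence check is an addition that the original script does not have.

```shell
#!/bin/sh
# Set driver_override to vfio-pci for each listed PCI address (POSIX sh only).
# $1: sysfs root (normally /sys; parameterised here only to allow testing)
# remaining arguments: PCI addresses such as 0000:01:00.0
vfio_override() {
    sysfs_root="$1"
    shift
    for dev in "$@"; do
        override="$sysfs_root/bus/pci/devices/$dev/driver_override"
        if [ -e "$override" ]; then
            echo "vfio-pci" > "$override"
        else
            echo "skipping $dev: $override not found" >&2
        fi
    done
}

# In the initramfs hook this might be called as:
#   vfio_override /sys 0000:01:00.0 0000:01:00.1
#   modprobe -i vfio-pci
```

Skipping missing devices instead of failing keeps the boot going if a card is removed, at the cost of only warning on stderr; a stricter hook could return non-zero instead.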