Summary: | x11-drivers/nvidia-drivers-334.21 - efi mode boot problem with /opt/bin/nvidia-smi | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Ulenrich <ulenrich> |
Component: | Current packages | Assignee: | David Seifert <soap> |
Status: | RESOLVED DUPLICATE | ||
Severity: | normal | CC: | ionen, qrilka, stijn+gentoo, xarthisius, zerochaos |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | | Runtime testing required: | --- |
Description
Ulenrich
2014-03-12 15:58:55 UTC
Can you attach a screenshot showing the big scattering bar? Also,
1) Please post your `emerge --info' output in a comment.
2) Tell us what graphics cards are installed on that system.

I cannot provide a screenshot because it is somehow input related: it appears when I move the mouse or type some text into a Konsole window that is big enough to reach the place of the scattering bar.

I have an older macMini:
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation C79 [GeForce 9400] [10de:0861] (rev b1)

I found out it is not directly /opt/bin/nvidia-smi related: the first login without having that executable works perfectly, but with any relogin into kde I get the same quirky screen. In my experience, ignoring this can crash my system after some hours.

I have a Gentoo~unstable systemd-208.9999 system (I tried this git version because of this video error). For Mesa I tried the stable one and the very new mesa-10.1 version. It is not related to my kernel, because I have copied all of a derivative Debian-siduction kernel, which results in the same behavior. But starting the Debian installation with that linux-13.6 is without quirks. Debian-sid has gcc-4.8, systemd-204 and mesa-9.2.2 installed but otherwise is quite the same as my Gentoo~unstable.

It may be more generally efi nvidia related, only in efi mode boot: because of this video error I installed a very new partition having Gentoo+stable without proprietary nvidia. The first boot stalls with some video error on the console long before X: a crash with no input possible. Any reboot as well. But if I boot any other partition successfully and then WARM reboot, my new Gentoo+stable nouveau installation works well. This doesn't happen when booted in hybrid-mbr mode.

(In reply to Ulenrich from comment #3)
> I cannot provide a screenshot because it is somehow input related:

Use a (camera) phone, maybe? Or describe in (more) words what you see. Right now we have no idea what it is you are seeing.
I wonder how I can make this bug report a valuable one: I have written too much. At first I thought I knew the cause, but only now do I know:
It is that proprietary nvidia-drivers do not fully support efi:
It is only possible to load the xorg-server once.
Thus if there is a udev rule which pulls in the nvidia kernel module early, then on the second start the loading of kdm is scrambled, something memory or input related.

Workaround for me:
1. copy /etc/udev/rules.d/99-nvidia.rules
commenting out 1. line which indirectly started /opt/bin/smi
2. /usr/share/config/kdm/kdmrc
TerminateServer=false
This way I can do relogins with X-kdm

Maybe this is only an issue for older efi (I have a MacMini of 2009) and not modern uefi versions.

(In reply to Ulenrich from comment #6)
> It is that proprietary nvidia-drivers do not fully support efi:
> It is only possible to load the xorg-server once.
> Thus if there is a udev rule which pulls in the nvidia kernel module early,
> then on the second start the loading of kdm is scrambled, something memory or
> input related.

Trying to load nvidia.ko a second time is a NOOP. It looks like your system is perhaps trying to start two X servers, which has nothing to do with nvidia.ko.

> Workaround for me:
> 1. copy /etc/udev/rules.d/99-nvidia.rules
> commenting out 1. line which indirectly started /opt/bin/smi.

x11-drivers/nvidia-drivers does /not/ install /etc/udev/rules.d/99-nvidia.rules
It does install /lib/udev/rules.d/99-nvidia.rules which does not include that line. If you have that file in /etc, then you should remove it anyway or make sure it works properly yourself.

> 2. /usr/share/config/kdm/kdmrc
> TerminateServer=false
> This way I can do relogins with X-kdm

I cannot tell what you actually changed there, or how it directly relates to x11-drivers/nvidia-drivers.

> Maybe this is only an issue for older efi (I have a MacMini of 2009)
> and not modern uefi versions.
You keep mentioning this, but EFI and UEFI shouldn't affect what the operating system does while starting up services.

@Jeroen
I copied this file
/usr/lib64/udev/rules.d/99-nvidia.rules
to
/etc/udev/rules.d/99-nvidia.rules
to override it and comment out this first line:
ACTION=="add", DEVPATH=="/module/nvidia", SUBSYSTEM=="module", RUN+="nvidia-udev.sh $env{ACTION}"

Which runs /usr/lib64/udev/nvidia-udev.sh with this action in the pre-X boot stage:
add|ADD)
/opt/bin/nvidia-smi > /dev/null

Seeing this problem as well, with nvidia-drivers-346.47. Disabling nvidia-smi in nvidia-udev.sh solves the problem, otherwise I see a stack trace during boot and pretty soon after the system freezes. Please fix this, this causes headaches after every kernel update.

--- nvidia-udev.sh.bak 2015-03-24 00:51:43.262017277 +0100
+++ nvidia-udev.sh 2015-03-24 00:44:44.543335853 +0100
@@ -7,7 +7,7 @@

 case $1 in
 add|ADD)
- /opt/bin/nvidia-smi > /dev/null
+ #/opt/bin/nvidia-smi > /dev/null
 ;;
 remove|REMOVE)
 rm -f /dev/nvidia*

Once again bitten by this bug, caused by the udev script that runs "nvidia-smi" after modprobe nvidia:

jun 28 20:27:54 taz kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
jun 28 20:27:54 taz kernel: IP: [<ffffffff815498a0>] __down+0x3b/0x8e
jun 28 20:27:54 taz kernel: PGD 6603db067 PUD 306120067 PMD 0
jun 28 20:27:54 taz kernel: Oops: 0002 [#1] PREEMPT SMP
jun 28 20:27:54 taz kernel: Modules linked in: nvidia(PO+) dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag bnep bluetooth xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntr
jun 28 20:27:54 taz kernel: snd_pcm e1000e snd_timer ptp snd i2c_i801 lpc_ich firewire_ohci pps_core xhci_pci soundcore mei_me shpchp tpm_infineon tpm_tis processor tpm button nls_iso8859_1 nls_cp437 vfat fat openvswitch pptp gre pppox ppp_generic slhc netconsole vhba(O) vhost_net tun vhost
jun 28 20:27:54 taz kernel: hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech hid_generic usbhid ohci_pci ohci_hcd uhci_hcd usb_storage hid arcmsr sg ehci_pci xhci_hcd ehci_hcd sr_mod cdrom firewire_core crc_itu_t usbcore usb_common [last unloaded: nvidia]
jun 28 20:27:54 taz kernel: CPU: 9 PID: 32388 Comm: nvidia-smi Tainted: P O 4.0.5-gentoo #1
jun 28 20:27:54 taz kernel: Hardware name: System manufacturer System Product Name/P9X79 WS, BIOS 4701 08/26/2014
jun 28 20:27:54 taz kernel: task: ffff880612a26340 ti: ffff88082633c000 task.ti: ffff88082633c000
jun 28 20:27:54 taz kernel: RIP: 0010:[<ffffffff815498a0>] [<ffffffff815498a0>] __down+0x3b/0x8e
jun 28 20:27:54 taz kernel: RSP: 0018:ffff88082633fb38 EFLAGS: 00010092
jun 28 20:27:54 taz kernel: RAX: 0000000000000000 RBX: 7fffffffffffffff RCX: ffffffffa24d74a0
jun 28 20:27:54 taz kernel: RDX: ffff88082633fb38 RSI: ffffffff817b579b RDI: ffffffffa24d7480
jun 28 20:27:54 taz kernel: RBP: ffff88082633fb78 R08: 000060f7b0006580 R09: 0000000000000292
jun 28 20:27:54 taz kernel: R10: 00000000000000d0 R11: ffffffffa21f4866 R12: ffffffffa24d7480
jun 28 20:27:54 taz kernel: R13: ffff880612a26340 R14: 00000000000000ff R15: ffff88030422d538
jun 28 20:27:54 taz kernel: FS: 00007f4cbbb2c700(0000) GS:ffff88084fd20000(0000) knlGS:0000000000000000
jun 28 20:27:54 taz kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
jun 28 20:27:54 taz kernel: CR2: 0000000000000000 CR3: 0000000580fd7000 CR4: 00000000000407e0
jun 28 20:27:54 taz kernel: Stack:
jun 28 20:27:54 taz kernel: ffffffffa24d74a0 0000000000000000 0000000000000003 00000000000000ff
jun 28 20:27:54 taz kernel: ffff88082633fb78 ffffffffa24d7480 ffff88030cea0000 0000000000000003
jun 28 20:27:54 taz kernel: ffff88082633fba8 ffffffff8108281c 000060f7b0006580 0000000000000292
jun 28 20:27:54 taz kernel: Call Trace:
jun 28 20:27:54 taz kernel: [<ffffffff8108281c>] down+0x3c/0x50
jun 28 20:27:54 taz kernel: [<ffffffffa21f4b8f>] nvidia_open+0x3af/0x8c0 [nvidia]
jun 28 20:27:54 taz kernel: [<ffffffffa21f3b28>] nvidia_frontend_open+0x48/0xa0 [nvidia]
jun 28 20:27:54 taz kernel: [<ffffffff8114a7da>] chrdev_open+0x9a/0x1d0
jun 28 20:27:54 taz kernel: [<ffffffff8114a740>] ? cdev_put+0x30/0x30
jun 28 20:27:54 taz kernel: [<ffffffff81144047>] do_dentry_open.isra.13+0xf7/0x320
jun 28 20:27:54 taz kernel: [<ffffffff811442e9>] vfs_open+0x49/0x50
jun 28 20:27:54 taz kernel: [<ffffffff81151e42>] do_last+0x132/0xdf0
jun 28 20:27:54 taz kernel: [<ffffffff8115464b>] path_openat+0x7b/0x630
jun 28 20:27:54 taz kernel: [<ffffffff810d8f07>] ? acct_account_cputime+0x17/0x20
jun 28 20:27:54 taz kernel: [<ffffffff81155dd5>] do_filp_open+0x35/0x90
jun 28 20:27:54 taz kernel: [<ffffffff8154b2c9>] ? _raw_spin_unlock+0x9/0x20
jun 28 20:27:54 taz kernel: [<ffffffff8116224f>] ? __alloc_fd+0x9f/0x130
jun 28 20:27:54 taz kernel: [<ffffffff81145344>] do_sys_open+0x124/0x220
jun 28 20:27:54 taz kernel: [<ffffffff8101094d>] ? syscall_trace_enter_phase1+0x10d/0x180
jun 28 20:27:54 taz kernel: [<ffffffff81145459>] SyS_open+0x19/0x20
jun 28 20:27:54 taz kernel: [<ffffffff8154b9b6>] system_call_fastpath+0x16/0x1b
jun 28 20:27:54 taz kernel: Code: 49 89 fc 53 48 bb ff ff ff ff ff ff ff 7f 48 83 ec 28 48 89 4d c0 48 8b 47 28 48 89 57 28 65 4c 8b 2c 25 00 aa 00 00 48 89 45 c8 <48> 89 10 4c 89 6d d0 c6 45 d8 00 4c 89 e7 49 c7 45 00 02 00 00
jun 28 20:27:54 taz kernel: RIP [<ffffffff815498a0>] __down+0x3b/0x8e
jun 28 20:27:54 taz kernel: RSP <ffff88082633fb38>
jun 28 20:27:54 taz kernel: CR2: 0000000000000000
jun 28 20:27:54 taz kernel: ---[ end trace 84cb727e5ed71186 ]---
jun 28 20:27:54 taz kernel: note: nvidia-smi[32388] exited with preempt_count 1

So should we add a USE=efi

(In reply to Stijn Tintel from comment #9)
> Seeing this problem as well, with nvidia-drivers-346.47. Disabling
> nvidia-smi in nvidia-udev.sh solves the problem, otherwise I see a stack
> trace during boot and pretty soon after the system freezes. Please fix this,
> this causes headaches after every kernel update.
>
> --- nvidia-udev.sh.bak 2015-03-24 00:51:43.262017277 +0100
> +++ nvidia-udev.sh 2015-03-24 00:44:44.543335853 +0100
> @@ -7,7 +7,7 @@
>
> case $1 in
> add|ADD)
> - /opt/bin/nvidia-smi > /dev/null
> + #/opt/bin/nvidia-smi > /dev/null
> ;;
> remove|REMOVE)
> rm -f /dev/nvidia*

That's not a bug fix. I can't fix problems in nvidia-smi: please talk to Nvidia directly about that.

(In reply to Jeroen Roovers from comment #11)
> So should we add a USE=efi
> (In reply to Stijn Tintel from comment #9)
> > Seeing this problem as well, with nvidia-drivers-346.47. Disabling
> > nvidia-smi in nvidia-udev.sh solves the problem, otherwise I see a stack
> > trace during boot and pretty soon after the system freezes. Please fix this,
> > this causes headaches after every kernel update.
> >
> > --- nvidia-udev.sh.bak 2015-03-24 00:51:43.262017277 +0100
> > +++ nvidia-udev.sh 2015-03-24 00:44:44.543335853 +0100
> > @@ -7,7 +7,7 @@
> >
> > case $1 in
> > add|ADD)
> > - /opt/bin/nvidia-smi > /dev/null
> > + #/opt/bin/nvidia-smi > /dev/null
> > ;;
> > remove|REMOVE)
> > rm -f /dev/nvidia*
>
> That's not a bug fix. I can't fix problems in nvidia-smi: please talk to
> Nvidia directly about that.

The nvidia-udev.sh script doesn't seem to come with the driver, but was added to fix #376527. I am inclined to say that the bug is caused by the udev script, and that it should be fixed there.

The problem doesn't always occur, so I suspect that sometimes nvidia-smi is being run too soon, when the driver isn't fully initialized yet and thus causing problems. Fix could be as simple as adding "sleep 2" or so before running nvidia-smi.

Other option would indeed be a USE flag to optionally install the udev script.

(In reply to Stijn Tintel from comment #12)
> The nvidia-udev.sh script doesn't seem to come with the driver, but was
> added to fix #376527. I am inclined to say that the bug is caused by the
> udev script, and that it should be fixed there.
>
> The problem doesn't always occur, so I suspect that sometimes nvidia-smi is
> being run too soon, when the driver isn't fully initialized yet and thus
> causing problems. Fix could be as simple as adding "sleep 2" or so before
> running nvidia-smi.
>
> Other option would indeed be a USE flag to optionally install the udev
> script.

the udev script should be run only when the module is inserted, you can try adding some sleep in there to test your theory of nvidia-smi loading before the driver is ready to deal with it. while I don't love the idea of adding random sleep in there I would be open to a small sleep if it fixes your bug.

(In reply to Rick Farina (Zero_Chaos) from comment #13)
> the udev script should be run only when the module is inserted, you can try
> adding some sleep in there to test your theory of nvidia-smi loading before
> the driver is ready to deal with it. while I don't love the idea of adding
> random sleep in there I would be open to a small sleep if it fixes your bug.

I've added "sleep 1" after I wrote my last comment, and have not seen this bug since.

this should be fixed if Jer ever stabilizes the fixed udev script.
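
As an illustration of the "run too soon" theory discussed above, here is a minimal sketch of a bounded wait that could stand in for a fixed sleep. It is not part of x11-drivers/nvidia-drivers and nobody in this thread tested it. It assumes that /sys/module/nvidia/initstate reading "live" is a good enough sign that the module has finished initializing; the `+` in `nvidia(PO+)` in the oops above marks a module that was still loading, which is at least consistent with that assumption. The function name wait_for_module_live and its parameters are hypothetical.

```shell
#!/bin/sh
# Sketch only. wait_for_module_live is a hypothetical helper, not part of
# x11-drivers/nvidia-drivers.
# $1 = initstate file to poll, e.g. /sys/module/nvidia/initstate
# $2 = maximum number of polls (default 10)
# $3 = seconds to sleep between polls (default 1)
wait_for_module_live() {
	wfm_file=$1
	wfm_tries=${2:-10}
	wfm_delay=${3:-1}
	while [ "$wfm_tries" -gt 0 ]; do
		# The kernel reports "coming", "live" or "going" in initstate.
		if [ -r "$wfm_file" ] && [ "$(cat "$wfm_file" 2>/dev/null)" = "live" ]; then
			return 0
		fi
		wfm_tries=$((wfm_tries - 1))
		sleep "$wfm_delay"
	done
	# Give up after the last poll so a udev RUN program never blocks forever.
	return 1
}

# Possible use inside the add|ADD) branch of nvidia-udev.sh (untested):
#   if wait_for_module_live /sys/module/nvidia/initstate; then
#       /opt/bin/nvidia-smi > /dev/null
#   fi
```

Compared with "sleep 1", this only waits as long as needed and skips nvidia-smi entirely if the module never becomes live, at the cost of a few more lines in the script.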
Another update of nvidia-drivers, and I run into this problem again:

feb 12 15:30:37 taz kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
feb 12 15:30:37 taz kernel: IP: [<ffffffff8156838c>] __down+0x3c/0xa0
feb 12 15:30:37 taz kernel: PGD 8195cd067 PUD 8195cc067 PMD 0
feb 12 15:30:37 taz kernel: Oops: 0002 [#1] PREEMPT SMP
feb 12 15:30:37 taz kernel: Modules linked in: iTCO_wdt iTCO_vendor_support intel_rapl iosf_mbi acpi_cpufreq(-) x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul nvidia(PO+) c
feb 12 15:30:37 taz kernel: xts gf128mul aes_x86_64 cbc sha512_generic sha256_generic sha1_generic iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi e1000 fuse overlay nfs lockd grace s
feb 12 15:30:37 taz kernel: CPU: 4 PID: 2040 Comm: nvidia-smi Tainted: P O 4.4.0-gentoo #1
feb 12 15:30:37 taz kernel: Hardware name: System manufacturer System Product Name/P9X79 WS, BIOS 4701 08/26/2014
feb 12 15:30:37 taz kernel: task: ffff8800ad8d5580 ti: ffff88081877c000 task.ti: ffff88081877c000
feb 12 15:30:37 taz kernel: RIP: 0010:[<ffffffff8156838c>] [<ffffffff8156838c>] __down+0x3c/0xa0
feb 12 15:30:37 taz kernel: RSP: 0018:ffff88081877fbc0 EFLAGS: 00010086
feb 12 15:30:37 taz kernel: RAX: 0000000000000000 RBX: 7fffffffffffffff RCX: 00000000000000ff
feb 12 15:30:37 taz kernel: RDX: ffffffffa1a7ab60 RSI: ffffffff817d1fbc RDI: ffffffffa1a7ab40
feb 12 15:30:37 taz kernel: RBP: ffff88081877fc00 R08: 000060f7c0008370 R09: 000000000000003d
feb 12 15:30:37 taz kernel: R10: 0000000000000000 R11: ffffffffa11a979e R12: ffffffffa1a7ab40
feb 12 15:30:37 taz kernel: R13: ffff8800ad8d5580 R14: 0000000000000003 R15: ffff880817c8a1c8
feb 12 15:30:37 taz kernel: FS: 00007f6e1b07d700(0000) GS:ffff88083fc80000(0000) knlGS:0000000000000000
feb 12 15:30:37 taz kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
feb 12 15:30:37 taz kernel: CR2: 0000000000000000 CR3: 0000000819552000 CR4: 00000000000406e0
feb 12 15:30:37 taz kernel: Stack:
feb 12 15:30:37 taz kernel: ffffffffa1a7ab60 0000000000000000 ffff8800bb6a4600 ffff8808188c8000
feb 12 15:30:37 taz kernel: 0000000000000003 ffffffffa1a7ab40 ffff8800bb6a4600 ffff8808188c8000
feb 12 15:30:37 taz kernel: ffff88081877fc20 ffffffff8108b1cc 0000000000000282 ffff8800bb6a4600
feb 12 15:30:37 taz kernel: Call Trace:
feb 12 15:30:37 taz kernel: [<ffffffff8108b1cc>] down+0x3c/0x50
feb 12 15:30:37 taz kernel: [<ffffffffa11a9977>] nvidia_open+0x257/0x300 [nvidia]
feb 12 15:30:37 taz kernel: [<ffffffffa11a8328>] nvidia_frontend_open+0x58/0xc0 [nvidia]
feb 12 15:30:37 taz kernel: [<ffffffff8115996a>] chrdev_open+0x9a/0x1c0
feb 12 15:30:37 taz kernel: [<ffffffff811598d0>] ? cdev_put+0x20/0x20
feb 12 15:30:37 taz kernel: [<ffffffff81153a49>] do_dentry_open.isra.13+0x149/0x2d0
feb 12 15:30:37 taz kernel: [<ffffffff8115488a>] vfs_open+0x4a/0x50
feb 12 15:30:37 taz kernel: [<ffffffff81162492>] path_openat+0x352/0x1140
feb 12 15:30:37 taz kernel: [<ffffffff81164599>] do_filp_open+0x79/0xd0
feb 12 15:30:37 taz kernel: [<ffffffff81569e99>] ? _raw_spin_unlock+0x9/0x20
feb 12 15:30:37 taz kernel: [<ffffffff81170307>] ? __alloc_fd+0xb7/0x170
feb 12 15:30:37 taz kernel: [<ffffffff81154bd0>] do_sys_open+0x120/0x210
feb 12 15:30:37 taz kernel: [<ffffffff81154cd9>] SyS_open+0x19/0x20
feb 12 15:30:37 taz kernel: [<ffffffff8156a3db>] entry_SYSCALL_64_fastpath+0x16/0x6e
feb 12 15:30:37 taz kernel: Code: bb ff ff ff ff ff ff ff 7f 65 4c 8b 2c 25 80 ae 00 00 48 83 e4 f0 48 83 ec 20 48 8b 47 28 48 89 14 24 48 89 67 28 48 89 44 24 08 <48> 89 20 4c 89 6c 24 10
feb 12 15:30:37 taz kernel: RIP [<ffffffff8156838c>] __down+0x3c/0xa0
feb 12 15:30:37 taz kernel: RSP <ffff88081877fbc0>
feb 12 15:30:37 taz kernel: CR2: 0000000000000000
feb 12 15:30:37 taz kernel: ---[ end trace 1d67269097ae32d1 ]---
feb 12 15:30:37 taz kernel: note: nvidia-smi[2040] exited with preempt_count 1

As I mentioned before, adding "sleep 1" in /lib/udev/nvidia-udev.sh before running /opt/bin/nvidia-smi fixes the problem.
#!/bin/sh

if [ $# -ne 1 ]; then
	echo "Invalid args" >&2
	exit 1
fi

case $1 in
	add|ADD)
		#hopefully this prevents infinite loops like bug #454740
		if lsmod | grep -iq nvidia; then
			sleep 1
			/opt/bin/nvidia-smi > /dev/null
		fi
		;;
	remove|REMOVE)
		rm -f /dev/nvidia*
		;;
esac
exit 0

Please add the sleep 1 in the script. Thanks.

And once again:

vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=io+mem
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8156a86c>] __down+0x3c/0xa0
PGD 234cf8067 PUD 23487a067 PMD 0
Oops: 0002 [#1] PREEMPT SMP
Modules linked in: nvidia(PO+) rfcomm xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag snd_pcm snd_timer snd mei_me soundcore shpchp nls_iso8859_1 nls_cp437 vfat fat processor tpm_infineon tpm_tis tpm button sch_fq_codel openvswitch nf_defrag_ipv6 hid_gyration hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech hid_generic usbhid ohci_pci ohci_hcd uhci_hcd usb_storage hid arcmsr s
CPU: 1 PID: 21613 Comm: nvidia-smi Tainted: P O 4.4.10-gentoo #1
Hardware name: System manufacturer System Product Name/P9X79 WS, BIOS 4701 08/26/2014
task: ffff880818195580 ti: ffff8802321a8000 task.ti: ffff8802321a8000
RIP: 0010:[<ffffffff8156a86c>] [<ffffffff8156a86c>] __down+0x3c/0xa0
RSP: 0018:ffff8802321abbc0 EFLAGS: 00010086
RAX: 0000000000000000 RBX: 7fffffffffffffff RCX: 00000000000000ff
RDX: ffffffffa27e5420 RSI: ffffffff817d28fc RDI: ffffffffa27e5400
RBP: ffff8802321abc00 R08: 000060f7c0008040 R09: 0000000000000013
R10: 0000000000000000 R11: ffffffffa1f0e75e R12: ffffffffa27e5400
R13: ffff880818195580 R14: 0000000000000003 R15: ffff880817db7a08
FS: 00007fa506fea700(0000) GS:ffff88083fc20000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000007b1084000 CR4: 00000000000406e0
Stack:
ffffffffa27e5420 0000000000000000 ffff880233e9e600 ffff880232198000
0000000000000003 ffffffffa27e5400 ffff880233e9e600 ffff880232198000
ffff8802321abc20 ffffffff8108b1ec 0000000000000282 ffff880233e9e600
Call Trace:
[<ffffffff8108b1ec>] down+0x3c/0x50
[<ffffffffa1f0e937>] nvidia_open+0x257/0x300 [nvidia]
[<ffffffffa1f0d323>] nvidia_frontend_open+0x53/0xa0 [nvidia]
[<ffffffff81159f6a>] chrdev_open+0x9a/0x1c0
[<ffffffff81159ed0>] ? cdev_put+0x20/0x20
[<ffffffff81154049>] do_dentry_open.isra.13+0x149/0x2d0
[<ffffffff81154e8a>] vfs_open+0x4a/0x50
[<ffffffff81162ca7>] path_openat+0x557/0x10e0
[<ffffffff81164b69>] do_filp_open+0x79/0xd0
[<ffffffff8156c369>] ? _raw_spin_unlock+0x9/0x20
[<ffffffff81170917>] ? __alloc_fd+0xb7/0x170
[<ffffffff811551d0>] do_sys_open+0x120/0x210
[<ffffffff811552d9>] SyS_open+0x19/0x20
[<ffffffff8156c89b>] entry_SYSCALL_64_fastpath+0x16/0x6e
Code: bb ff ff ff ff ff ff ff 7f 65 4c 8b 2c 25 80 ae 00 00 48 83 e4 f0 48 83 ec 20 48 8b 47 28 48 89 14 24 48 89 67 28 48 89 44 24 08 <48> 89 20 4c 89 6c 24 10
RIP [<ffffffff8156a86c>] __down+0x3c/0xa0
RSP <ffff8802321abbc0>
CR2: 0000000000000000
---[ end trace 2302e17022023252 ]---
note: nvidia-smi[21613] exited with preempt_count 1

Please fix this. There are 2 solutions offered: sleep, which is ugly, or making installation of this script optional via a USE flag. Make a choice and fix it. I am getting really pissed off that I need to hard reset my box every time nvidia-drivers is updated.

Old ticket, just chipping in my 2 cents as I found it while writing a new report about something related:

01:00.0 3D controller: NVIDIA Corporation TU117GLM [Quadro T2000 Mobile / Max-Q] (rev a1)
Subsystem: Dell TU117GLM [Quadro T2000 Mobile / Max-Q]
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia_drm, nvidia

I cannot reproduce this issue or get any stacktrace whatsoever.

nvidia-udev.sh will likely be removed, so if there are still any issues here they'll likely go away with it

*** This bug has been marked as a duplicate of bug 454740 ***
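
For readers still affected, the override described earlier in this thread (copying 99-nvidia.rules into /etc/udev/rules.d and commenting out the line that calls nvidia-udev.sh) can be sketched as below. This is only an illustration of that workaround, not a fix endorsed by the maintainers. The function name make_rules_override is hypothetical, and whether the packaged rules file contains such a line at all depends on the installed nvidia-drivers version.

```shell
#!/bin/sh
# Sketch only. make_rules_override is a hypothetical helper, not part of
# any package.
# $1 = packaged rules file, e.g. /lib/udev/rules.d/99-nvidia.rules
# $2 = override file to write, e.g. /etc/udev/rules.d/99-nvidia.rules
make_rules_override() {
	mro_src=$1
	mro_dst=$2
	[ -r "$mro_src" ] || return 1
	# Comment out every rule that calls nvidia-udev.sh. Lines that already
	# have a '#' before the script name are left unchanged.
	sed '/^[^#]*nvidia-udev\.sh/ s/^/#/' "$mro_src" > "$mro_dst"
}

# Possible use, as root (untested):
#   make_rules_override /lib/udev/rules.d/99-nvidia.rules \
#       /etc/udev/rules.d/99-nvidia.rules
```

A file in /etc/udev/rules.d takes precedence over a file of the same name in /lib/udev/rules.d, so the override has to be regenerated or reviewed whenever the packaged rules change.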