My log is full of:

фев 02 19:06:15 home64 kernel: BUG: using smp_processor_id() in preemptible [00000000] code: CPU 1/KVM/12349
фев 02 19:06:15 home64 kernel: caller is single_task_running+0x5/0x20
фев 02 19:06:15 home64 kernel: CPU: 2 PID: 12349 Comm: CPU 1/KVM Tainted: P U W O 4.14.14-ck #4
фев 02 19:06:15 home64 kernel: Hardware name: Gigabyte Technology Co., Ltd. Z370 AORUS Ultra Gaming/Z370 AORUS Ultra Gaming-CF, BIOS F6 10/31/2017
фев 02 19:06:15 home64 kernel: Call Trace:
фев 02 19:06:15 home64 kernel: dump_stack+0x46/0x65
фев 02 19:06:15 home64 kernel: check_preemption_disabled+0xd3/0xe0
фев 02 19:06:15 home64 kernel: single_task_running+0x5/0x20
фев 02 19:06:15 home64 kernel: kvm_vcpu_block+0x278/0x310
фев 02 19:06:15 home64 kernel: kvm_arch_vcpu_ioctl_run+0x12d/0x1680
фев 02 19:06:15 home64 kernel: ? kvm_arch_vcpu_load+0x64/0x230
фев 02 19:06:15 home64 kernel: ? kvm_arch_vcpu_load+0x7f/0x230
фев 02 19:06:15 home64 kernel: ? kvm_vcpu_ioctl+0x27b/0x5e0
фев 02 19:06:15 home64 kernel: kvm_vcpu_ioctl+0x27b/0x5e0
фев 02 19:06:15 home64 kernel: ? skiplist_insert+0x57/0xf0
фев 02 19:06:15 home64 kernel: ? timerqueue_del+0x1e/0x40
фев 02 19:06:15 home64 kernel: ? timerqueue_add+0x52/0x80
фев 02 19:06:15 home64 kernel: ? enqueue_hrtimer+0x37/0x90
фев 02 19:06:15 home64 kernel: ? _raw_spin_unlock_irqrestore+0xf/0x30
фев 02 19:06:15 home64 kernel: ? hrtimer_start_range_ns+0x1ad/0x330
фев 02 19:06:15 home64 kernel: do_vfs_ioctl+0x88/0x5d0
фев 02 19:06:15 home64 kernel: ? security_file_ioctl+0x39/0x50
фев 02 19:06:15 home64 kernel: SyS_ioctl+0x6f/0x80
фев 02 19:06:15 home64 kernel: ? exit_to_usermode_loop+0x83/0x90
фев 02 19:06:15 home64 kernel: entry_SYSCALL_64_fastpath+0x1d/0x76
фев 02 19:06:15 home64 kernel: RIP: 0033:0x7f0e728643e7
фев 02 19:06:15 home64 kernel: RSP: 002b:00007f0e665ef918 EFLAGS: 00000246
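When a log is flooded like this, it can help to know how many warnings there are and whether they all come from the same caller. The following is only a minimal sketch, not something from the original report: the helper name `summarize_preempt_bugs` and both regular expressions are assumptions made for illustration, written against the two line formats visible in the trace above.

```python
import re
from collections import Counter

# Matches the warning line as it appears in the trace above.
BUG_RE = re.compile(
    r"BUG: using smp_processor_id\(\) in preemptible \[\w+\] code: (?P<task>.+)$"
)
# The line after the warning names the function that made the call.
CALLER_RE = re.compile(r"caller is (?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_preempt_bugs(lines):
    """Count warnings and tally them per calling function.

    Hypothetical helper for illustration; it is not a kernel or systemd tool.
    """
    bugs = 0
    callers = Counter()
    for line in lines:
        if BUG_RE.search(line):
            bugs += 1
        match = CALLER_RE.search(line)
        if match:
            callers[match.group("func")] += 1
    return bugs, callers


# Two lines copied from the report above.
sample = [
    "фев 02 19:06:15 home64 kernel: BUG: using smp_processor_id() in "
    "preemptible [00000000] code: CPU 1/KVM/12349",
    "фев 02 19:06:15 home64 kernel: caller is single_task_running+0x5/0x20",
]
bugs, callers = summarize_preempt_bugs(sample)
print(bugs, dict(callers))
```

In practice the lines would come from a saved kernel log rather than an inline list. If every warning names `single_task_running` as the caller, that points at one call site rather than a general problem.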
Please provide your /usr/src/linux/.config for this kernel. The KVM reference has me wondering: are you getting this bug on bare-metal hardware, or under KVM? Either way, we will likely need to report this bug upstream after testing.
Created attachment 518360 .config
The bug happens on the host - the machine running the KVM hypervisor. I have another laptop running the same kernel and using KVM for VMs, but it has no IOMMU. The bug doesn't happen there.
(In reply to Anton Gubarkov from comment #3)
> The bug happens on the host - the machine running the KVM hypervisor.
> I have another laptop running the same kernel and using KVM for VMs, but it
> has no iommu. The bug doesn't happen there.

I can't say for certain, but this might be related to compatibility between KVM and a non-CFS scheduler. Also, based on the config, it looks like you have cgroups support enabled.

Because you've mentioned that other hardware (a different KVM host) doesn't have these issues, I'd like to specifically rule out this possibility: please build a KVM guest kernel which uses the CFS scheduler instead of MuQSS. If CFS makes this issue go away, please also try a rebuild with the BFS scheduler (the other choice for ck-sources) instead of MuQSS or CFS.

This is probably not a Gentoo-specific issue, but once the specific details are better understood, I'll assist with relaying the information upstream (and you can follow up there, if desired).

Also please note: there are some unrelated issues preventing a release of ">sys-kernel/ck-sources-4.14.14" (Gentoo-specific packaging), so limited time will likely mean this edge case doesn't see a timely resolution. Sorry about that.

---
Reference #1
http://madeforcloud.com/post/kvm-vcpu-scheduling/

Within KVM, each vcpu is mapped to a Linux process which in turn utilises hardware assistance to create the necessary ‘smoke and mirrors’ for virtualisation. As such, a vcpu is just another process to the CFS and also importantly to cgroups which, as a resource manager, allows Linux to manage allocation of resources - typically proportionally in order to set constraint allocations. cgroups also apply to Memory, network and I/O.

---
Reference #2
/usr/src/linux-4.14.14-ck/Documentation/scheduler/sched-MuQSS.txt

What MuQSS does _not_ now feature is support for CGROUPS. The average user should neither need to know what these are, nor should they need to be using them to have good desktop behaviour. However since some applications refuse to work without cgroups, one can enable them with MuQSS as a stub and the filesystem will be created which will allow the applications to work.
---
I've rebuilt the kernel on the affected host with MuQSS deselected. The BUG went away. I can't find the option to enable BFS.
Some of the remarks about the host left me confused. I think you're referring to the hardware, not the kernel (see below).

A) MuQSS enabled on host, KVM guest uses kernel with MuQSS disabled
B) MuQSS disabled on host, KVM guest uses kernel without MuQSS
C) MuQSS enabled on host, as well as KVM guest (both using MuQSS)
D) Not using MuQSS at all (neither on the host, nor on the KVM guest)

Please clarify which test case(s) have the bug vs. working normally.

The output of:
$ cat /proc/version
(on the host, as well as the KVM guest) should specify which compiler (incl. version) was used to build the kernel, and/or other metadata which may be relevant to document the issue.

Also, what is the output of:
$ cat /proc/cpuinfo

This bug is probably not a Gentoo-specific failure. I'll relay the info upstream.
I have different cases. My guests are running Windows.

So I have:

A) Host home64 with MuQSS enabled exhibits the BUG when a KVM guest is running
B) Host home64 without MuQSS enabled shows no BUG
C) Host r9-008cln never exhibited the BUG (but has no iommu hardware)

My info from home64:

home64 ~ # cat /proc/version
Linux version 4.14.14-ck (root@home64) (gcc version 7.3.0 (Gentoo 7.3.0 p1.0)) #5 SMP PREEMPT Wed Feb 7 21:56:38 MSK 2018

home64 ~ # cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
stepping : 10
microcode : 0x80
cpu MHz : 4196.557
cache size : 9216 KB
physical id : 0
siblings : 6
core id : 0
cpu cores : 6
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti retpoline intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs : cpu_meltdown spectre_v1 spectre_v2
bogomips : 7200.00
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
stepping : 10
microcode : 0x80
cpu MHz : 4193.025
cache size : 9216 KB
physical id : 0
siblings : 6
core id : 1
cpu cores : 6
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti retpoline intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs : cpu_meltdown spectre_v1 spectre_v2
bogomips : 7200.00
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
stepping : 10
microcode : 0x80
cpu MHz : 4183.938
cache size : 9216 KB
physical id : 0
siblings : 6
core id : 2
cpu cores : 6
apicid : 4
initial apicid : 4
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti retpoline intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs : cpu_meltdown spectre_v1 spectre_v2
bogomips : 7200.00
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
stepping : 10
microcode : 0x80
cpu MHz : 4169.763
cache size : 9216 KB
physical id : 0
siblings : 6
core id : 3
cpu cores : 6
apicid : 6
initial apicid : 6
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti retpoline intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs : cpu_meltdown spectre_v1 spectre_v2
bogomips : 7200.00
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

processor : 4
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
stepping : 10
microcode : 0x80
cpu MHz : 4166.248
cache size : 9216 KB
physical id : 0
siblings : 6
core id : 4
cpu cores : 6
apicid : 8
initial apicid : 8
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti retpoline intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs : cpu_meltdown spectre_v1 spectre_v2
bogomips : 7200.00
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

processor : 5
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
stepping : 10
microcode : 0x80
cpu MHz : 4178.815
cache size : 9216 KB
physical id : 0
siblings : 6
core id : 5
cpu cores : 6
apicid : 10
initial apicid : 10
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti retpoline intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs : cpu_meltdown spectre_v1 spectre_v2
bogomips : 7200.00
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
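The `/proc/version` line posted above carries the three facts that matter for this report: the kernel release, the compiler version, and whether the build is preemptible. As a rough sketch only, they could be pulled out as follows. The function name `parse_proc_version` and the regular expression are assumptions for illustration, and the pattern is written against the `(gcc version X.Y.Z ...)` layout shown above, which other kernel versions may not use.

```python
import re

# Written against the layout of the /proc/version line quoted in this report;
# other kernels may format the compiler part differently (assumption).
VERSION_RE = re.compile(
    r"Linux version (?P<kernel>\S+) .*?\(gcc version (?P<gcc>[\d.]+)"
)


def parse_proc_version(text):
    """Extract kernel release, gcc version and the PREEMPT marker.

    Hypothetical helper; returns None when the layout is not recognised.
    """
    match = VERSION_RE.search(text)
    if match is None:
        return None
    return {
        "kernel": match.group("kernel"),
        "gcc": match.group("gcc"),
        "preempt": " PREEMPT " in text,
    }


# The exact string reported for home64 above.
home64 = (
    "Linux version 4.14.14-ck (root@home64) "
    "(gcc version 7.3.0 (Gentoo 7.3.0 p1.0)) "
    "#5 SMP PREEMPT Wed Feb 7 21:56:38 MSK 2018"
)
info = parse_proc_version(home64)
print(info)
```

Running the same helper on the output from each host would make it easy to compare the builds side by side.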
(In reply to Anton Gubarkov from comment #7)
> I have different cases. My guests are running Windows.
>
> so I have
>
> A) Host home64 with MuQSS enabled exhibits the BUG when a KVM guest is
> running
> B) Host home64 without MuQSS enabled shows no BUG
> C) Host r9-008cln never exhibited the BUG (but has no iommu hardware)
[...]

I believe the issue might be caused by KVM (on the host) not supporting MuQSS.

Please specify SPECIFICALLY whether home64 is using:

A) MuQSS enabled on host, KVM guest uses kernel with MuQSS disabled
B) MuQSS disabled on host, KVM guest uses kernel without MuQSS
C) MuQSS enabled on host, as well as KVM guest (both using MuQSS)

I need to know whether MuQSS is on the host and/or guest.

I also see that you've built the kernel using a compiler which isn't yet keyword-stabilized. Please try to reproduce this issue with a kernel built using gcc version 6.4.0-r1. gcc-config will let you activate a different slot, so fortunately you won't need to rebuild gcc 7.3.x if you switch back to it.
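One way to answer the "is MuQSS on this kernel" question without ambiguity is to read it straight from the kernel's .config. Below is a minimal sketch, with several assumptions stated up front: `config_options` is a hypothetical helper, the sample text is made up for illustration and is NOT the reporter's attached .config, and `CONFIG_SCHED_MUQSS` is assumed to be the symbol the ck patch set uses for the scheduler. `CONFIG_DEBUG_PREEMPT` is included because that option enables the preemption check (`check_preemption_disabled` in the trace) that prints the warning.

```python
import re


def config_options(config_text, names):
    """Return {name: value or None} for the given CONFIG_ symbols.

    Hypothetical helper. A symbol that is absent or commented out as
    '# CONFIG_FOO is not set' maps to None.
    """
    found = {}
    for line in config_text.splitlines():
        match = re.match(r"(CONFIG_\w+)=(.*)$", line)
        if match:
            found[match.group(1)] = match.group(2)
    return {name: found.get(name) for name in names}


# Illustrative sample only; not taken from the attachment in this report.
sample = """\
CONFIG_SCHED_MUQSS=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT=y
CONFIG_DEBUG_PREEMPT=y
"""
opts = config_options(
    sample,
    [
        "CONFIG_SCHED_MUQSS",    # assumed symbol name for the MuQSS scheduler
        "CONFIG_PREEMPT",
        "CONFIG_DEBUG_PREEMPT",  # enables the check that prints the BUG line
        "CONFIG_PREEMPT_NONE",
    ],
)
print(opts)
```

Reporting these few values for each kernel under test would remove any doubt about which case (A, B or C) applies.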
MuQSS on the KVM host seems to have performance regressions:
http://ck-hack.blogspot.com/2017/05/linux-411-ck2-muqss-version-0156.html?showComment=1496771913408#c2549435917673112633
^ Thread from upstream; it should still be relevant if attempting to use MuQSS on the host, rather than in KVM guests.

I'd recommend against using MuQSS on the KVM host, as CFS (the default Linux scheduler) is known to be a stable configuration with lower overhead for KVM.

TL;DR - your performance will only go down if you attempt to use MuQSS on the KVM host. Further, MuQSS <--> KVM interactions are upstream bugs which won't (can't) be fixed by making Gentoo-specific changes.