Bug 644942 - 4.14.x-gentoo - Kernel BUG. invalid opcode. list_del_entry_valid
Status: RESOLVED UPSTREAM
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL: https://bugzilla.kernel.org/show_bug....
Whiteboard:
Keywords:
Depends on:
Blocks: 649198
Reported: 2018-01-18 13:50 UTC by Alexander Miroshnichenko
Modified: 2018-04-10 04:27 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Kernel config (config-4.14.14-gentoo,125.03 KB, text/plain)
2018-01-18 13:50 UTC, Alexander Miroshnichenko
Description Alexander Miroshnichenko 2018-01-18 13:50:24 UTC
Created attachment 515234
Kernel config

The guest VM hangs when kernel 4.14.x, built with the attached config, runs on both the host and the guest.
I tried 4.14.8, 4.14.12 and 4.14.13 and hit the same issue with each.

QEMU-KVM: app-emulation/qemu-2.10.1-r1

How to reproduce:
- Compile the kernel from gentoo-sources with the attached kernel config.
- Install the compiled kernel on both the host and the guest system.
- Compile something big in the guest and wait a while (from a couple of minutes to a couple of hours).
- The guest VM becomes unresponsive; its kernel hangs.

[ 8204.339820] invalid opcode: 0000 [#1] SMP PTI
[ 8204.341308] Modules linked in: netconsole rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc binfmt_misc virtio_balloon virtio_net iTCO_wdt iTCO_vendor_support shpchp i2c_i801 lpc_ich input_leds intel_agp ghash_clmulni_intel btrfs xor zstd_compress raid6_pq zstd_decompress xxhash dm_crypt usbhid xhci_plat_hcd ohci_pci ohci_hcd uhci_hcd usb_storage ehci_pci ehci_hcd sg ata_generic sata_nv ata_piix sd_mod ahci libahci virtio_scsi xhci_pci xhci_hcd qxl libata drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops intel_gtt led_class scsi_mod ttm usbcore virtio_pci drm usb_common virtio_ring virtio agpgart
[ 8204.347292] CPU: 1 PID: 8022 Comm: cc1plus Not tainted 4.14.14-gentoo #1
[ 8204.348022] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1.fc27 04/01/2014
[ 8204.348676] task: ffff8c39fa703000 task.stack: ffffa0d3c1860000
[ 8204.349536] RIP: 0010:__list_del_entry_valid+0x81/0x90
[ 8204.350551] RSP: 0000:ffffa0d3c1863bf8 EFLAGS: 00010082
[ 8204.351494] RAX: 0000000000000054 RBX: 0000000000000370 RCX: 0000000000000000
[ 8204.352628] RDX: 0000000000000000 RSI: ffff8c3a2ae56538 RDI: ffff8c3a2ae56538
[ 8204.353508] RBP: ffff8c3a2b1ef000 R08: 0000000000000001 R09: 000000000000027c
[ 8204.354488] R10: 0000000000000000 R11: 000000000000027c R12: 0000000000000010
[ 8204.355496] R13: ffffc64600c1ffc0 R14: 000000000000000a R15: ffffc64600c00020
[ 8204.356861] FS:  00007f39fb626400(0000) GS:ffff8c3a2ae40000(0000) knlGS:0000000000000000
[ 8204.357862] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8204.358893] CR2: 00007f39f9120000 CR3: 0000000035f58001 CR4: 00000000001606e0
[ 8204.359869] Call Trace:
[ 8204.360812]  ? __rmqueue+0xbd/0x570
[ 8204.361434]  ? current_time+0x3b/0x70
[ 8204.362068]  ? get_page_from_freelist+0xac9/0xbd0
[ 8204.362694]  ? __alloc_pages_nodemask+0x103/0x260
[ 8204.363302]  ? alloc_pages_vma+0x7c/0x1c0
[ 8204.363908]  ? __handle_mm_fault+0xc46/0x1000
[ 8204.364534]  ? handle_mm_fault+0xe7/0x190
[ 8204.365326]  ? __do_page_fault+0x1c0/0x410
[ 8204.366371]  ? async_page_fault+0x36/0x60
[ 8204.367075]  ? async_page_fault+0x4c/0x60
[ 8204.367773] Code: a8 47 d8 86 e8 0f 0f c5 ff 0f 0b 48 89 fe 48 c7 c7 e0 47 d8 86 e8 fe 0e c5 ff 0f 0b 48 89 fe 48 c7 c7 20 48 d8 86 e8 ed 0e c5 ff <0f> 0b 90 90 90 90 90 90 90 90 90 90 90 90 90 48 85 d2 41 55 41
[ 8204.369059] RIP: __list_del_entry_valid+0x81/0x90 RSP: ffffa0d3c1863bf8
[ 8204.369715] ---[ end trace 72be4cd0d441c6bf ]---
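
For anyone comparing several of these traces, the `symbol+0xoffset/0xsize` tokens can be pulled out mechanically. The snippet below is only a minimal sketch: the helper name `parse_frame` and the regular expression are illustrative assumptions, not part of any kernel tooling. It also records the leading `?`, which the kernel prints for stack entries the unwinder could not verify.

```python
import re

# Matches "symbol+0xoffset/0xsize" as printed in oops traces.
# An optional leading "? " marks an entry the unwinder could not verify.
SYM_RE = re.compile(r"(\?\s+)?([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frame(line):
    """Return (symbol, offset, size, reliable) for a trace line, or None.

    Hypothetical helper for illustration only.
    """
    m = SYM_RE.search(line)
    if m is None:
        return None
    unreliable, symbol, offset, size = m.groups()
    return symbol, int(offset, 16), int(size, 16), unreliable is None


rip = parse_frame("[ 8204.349536] RIP: 0010:__list_del_entry_valid+0x81/0x90")
frame = parse_frame("[ 8204.360812]  ? __rmqueue+0xbd/0x570")
print(rip)
print(frame)
```

In this trace the faulting instruction sits 0x81 bytes into a 0x90-byte function, and every entry under "Call Trace" carries the `?` marker.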
Comment 1 Alexander Miroshnichenko 2018-01-25 21:10:41 UTC
Here it is again, a new crash:

[ 8404.955744] invalid opcode: 0000 [#1] SMP
[ 8404.956584] Modules linked in: netconsole rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc binfmt_misc iTCO_wdt iTCO_vendor_support virtio_net virtio_balloon input_leds i2c_i801 intel_agp lpc_ich ghash_clmulni_intel shpchp btrfs xor zstd_compress raid6_pq zstd_decompress xxhash dm_crypt usbhid xhci_plat_hcd ohci_pci ohci_hcd uhci_hcd usb_storage ehci_pci ehci_hcd sg ata_generic sata_nv ata_piix sd_mod qxl ahci libahci drm_kms_helper virtio_scsi syscopyarea sysfillrect sysimgblt fb_sys_fops ttm xhci_pci libata xhci_hcd scsi_mod led_class virtio_pci virtio_ring usbcore drm virtio usb_common intel_gtt agpgart
[ 8404.961382] CPU: 0 PID: 6482 Comm: kworker/0:2H Not tainted 4.14.13-gentoo #1
[ 8404.962183] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1.fc27 04/01/2014
[ 8404.963091] task: ffffa3a5a18f0000 task.stack: ffffad9e4165c000
[ 8404.963961] RIP: 0010:__list_del_entry_valid+0x86/0x90
[ 8404.964837] RSP: 0018:ffffad9e4165fea0 EFLAGS: 00010086
[ 8404.965702] RAX: 0000000000000054 RBX: ffffa3a5aae33580 RCX: 0000000000000000
[ 8404.966564] RDX: 0000000000000000 RSI: ffffa3a5aae16538 RDI: ffffa3a5aae16538
[ 8404.967367] RBP: ffffad9e4165fea0 R08: 0000000000000001 R09: 0000000000000313
[ 8404.968258] R10: ffffa3a5aae1d100 R11: 0000000000000313 R12: ffffa3a5aae33580
[ 8404.969229] R13: ffffa3a59e6bc7b0 R14: ffffa3a5aae335a0 R15: ffffa3a59e6bc780
[ 8404.970089] FS:  0000000000000000(0000) GS:ffffa3a5aae00000(0000) knlGS:0000000000000000
[ 8404.970959] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8404.971822] CR2: 00000000000000b8 CR3: 000000000401c004 CR4: 00000000001606f0
[ 8404.972857] Call Trace:
[ 8404.973790]  worker_thread+0x110/0x3c0
[ 8404.974700]  kthread+0x108/0x140
[ 8404.975670]  ? process_one_work+0x380/0x380
[ 8404.976779]  ? kthread_create_on_node+0x70/0x70
[ 8404.977752]  ? SyS_exit_group+0x14/0x20
[ 8404.978599]  ret_from_fork+0x1f/0x30
[ 8404.979553] Code: 78 40 d8 ba e8 0d 3b c4 ff 0f 0b 48 89 fe 48 c7 c7 b0 40 d8 ba e8 fc 3a c4 ff 0f 0b 48 89 fe 48 c7 c7 f0 40 d8 ba e8 eb 3a c4 ff <0f> 0b 90 90 90 90 90 90 90 90 55 48 85 d2 48 89 e5 41 56 41 55
[ 8404.982311] RIP: __list_del_entry_valid+0x86/0x90 RSP: ffffad9e4165fea0
[ 8404.983123] ---[ end trace 29abfd8cb53c692c ]---
[ 8404.983926] Kernel panic - not syncing: Fatal exception
[ 8404.984928] Kernel Offset: 0x39000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 8404.985645] Rebooting in 1 seconds..
[ 8406.002651] ACPI MEMORY or I/O RESET_REG.
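
The "Kernel Offset" line above makes it possible to map addresses from this crash back to link-time addresses. As a worked example, the bytes `48 c7 c7 f0 40 d8 ba` just before the faulting `<0f> 0b` (`ud2`, hence "invalid opcode") in the Code line encode `mov $imm32, %rdi` with a sign-extended 32-bit immediate. The sketch below undoes the KASLR slide for that immediate; the function names are illustrative assumptions, and what the address points to (presumably a message string) is not confirmed by the log.

```python
KERNEL_TEXT_BASE = 0xffffffff81000000  # "from" value in the Kernel Offset line
KASLR_OFFSET = 0x39000000              # "Kernel Offset" value from the same line


def imm32_to_address(le_bytes):
    """Sign-extend a little-endian 32-bit immediate to a 64-bit address."""
    value = int.from_bytes(le_bytes, "little", signed=True)
    return value & 0xffffffffffffffff


def unslide(runtime_address, offset=KASLR_OFFSET):
    """Map a runtime address back to its link-time address (hypothetical helper)."""
    return runtime_address - offset


# Immediate operand taken from the Code line: f0 40 d8 ba
runtime = imm32_to_address(bytes.fromhex("f040d8ba"))
print(hex(runtime))                          # 0xffffffffbad840f0
print(hex(unslide(runtime)))                 # 0xffffffff81d840f0
print(hex(KERNEL_TEXT_BASE + KASLR_OFFSET))  # 0xffffffffba000000
```

The unslid value is what one would look up in the `System.map` or `vmlinux` of the matching build.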
Comment 2 Alexander Miroshnichenko 2018-01-25 21:13:47 UTC
Sometimes applications crash before the kernel does:

[ 7726.832065] emerge[8621]: segfault at ab ip 00007f6d4ec8b29b sp 00007fffeac7f820 error 4 in libpython2.7.so.1.0[7f6d4ebf9000+1a8000]
[ 7741.857922] ebuild.sh[7895]: segfault at 18 ip 00007fb7de9cea49 sp 00007ffd673ce530 error 4 in libc-2.25.so[7fb7de94b000+1b2000]
[ 8045.178786] traps: seq[22965] trap invalid opcode ip:55b339e3a002 sp:7ffc4aa14618 error:0 in seq[55b339e30000+c000]
[ 8045.209669] autom4te-2.69[22964]: segfault at 7faedcb0ff68 ip 00007faedcb0ff68 sp 00007ffdbb6b4180 error 7 in libperl.so.5.24.3[7faedc9cc000+1ed000]
[ 8176.989699] traps: seq[26454] trap invalid opcode ip:55fbd5588002 sp:7ffdddc481b8 error:0 in seq[55fbd557e000+c000]
[ 8177.184795] autom4te-2.69[26452]: segfault at 7f0137032f68 ip 00007f0137032f68 sp 00007ffef6cdfd10 error 7 in libperl.so.5.24.3[7f0136eef000+1ed000]
[ 8178.060351] traps: seq[27150] trap invalid opcode ip:557dee012002 sp:7ffc4f63acd8 error:0 in seq[557dee008000+c000]
[ 8178.068409] aclocal-1.15[27147]: segfault at 7fc8cc68bf68 ip 00007fc8cc68bf68 sp 00007ffcb032ee50 error 7 in libperl.so.5.24.3[7fc8cc548000+1ed000]
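
The `error N` field in the segfault lines above is the x86 page-fault error code, printed in hex: bit 0 is set for a protection violation (clear for a not-present page), bit 1 for a write, bit 2 for a user-mode access, and bit 4 for an instruction fetch. The following is a rough sketch for decoding these lines and computing the offset of the faulting instruction inside the mapped object; `parse_segfault` and `decode_error` are hypothetical helper names, not existing tools.

```python
import re

SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r" in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_error(code):
    """Decode the low bits of an x86 page-fault error code."""
    flags = [
        "protection violation" if code & 0x1 else "page not present",
        "write" if code & 0x2 else "read",
        "user mode" if code & 0x4 else "kernel mode",
    ]
    if code & 0x10:
        flags.append("instruction fetch")
    return flags


def parse_segfault(line):
    """Return a summary dict for a 'segfault at' log line, or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "comm": m.group("comm"),
        "object": m.group("obj"),
        "offset": ip - base,  # offset of the faulting instruction in the mapping
        "flags": decode_error(int(m.group("err"), 16)),
    }


line = ("[ 7726.832065] emerge[8621]: segfault at ab ip 00007f6d4ec8b29b "
        "sp 00007fffeac7f820 error 4 in libpython2.7.so.1.0[7f6d4ebf9000+1a8000]")
info = parse_segfault(line)
print(info["comm"], info["object"], hex(info["offset"]), info["flags"])
```

For the first line this gives offset 0x9229b in libpython2.7.so.1.0 with a user-mode read of a not-present page, while the `error 7` lines decode as user-mode writes that hit a protection violation. The `traps:` lines use a different format and are not matched by this pattern.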
Comment 3 Tomáš Mózes 2018-01-26 04:39:14 UTC
Can you please try with pti=off? Does it happen with one machine or it's reproducible on multiple ones?

Try building the vanilla sources and please report upstream if it happens again.
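
Note that `pti=off` (like `nopti`) is a boot parameter, so it can be tested without rebuilding the kernel. A small sketch for checking whether a given command line carries either token follows; the function name and the example command lines are made up for illustration, and on a running system the string would come from `/proc/cmdline`.

```python
def pti_disabled_by_cmdline(cmdline):
    """Return True if the kernel command line turns page table isolation off.

    Hypothetical helper: only looks for the "pti=off" and "nopti" tokens.
    """
    tokens = cmdline.split()
    return "nopti" in tokens or "pti=off" in tokens


# Example command lines (invented for illustration, not taken from this bug).
print(pti_disabled_by_cmdline("root=/dev/vda2 ro pti=off"))  # True
print(pti_disabled_by_cmdline("root=/dev/vda2 ro quiet"))    # False

# On a live system:
# with open("/proc/cmdline") as f:
#     print(pti_disabled_by_cmdline(f.read()))
```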
Comment 4 Alexander Miroshnichenko 2018-01-26 09:48:09 UTC
>Can you please try with pti=off?

I have configured the kernel with
CONFIG_PAGE_TABLE_ISOLATION=n
CONFIG_RETPOLINE=n

and the kernel still crashes.

>Does it happen with one machine or it's reproducible on multiple ones?
The bug appears only in QEMU-KVM guests. All 4 of my physical PCs work fine.
Comment 5 Alexander Miroshnichenko 2018-01-26 09:49:01 UTC
The issue appeared when I migrated from kernel 4.9.
With kernel 4.9.x there are no problems.
Comment 6 Tomáš Mózes 2018-01-26 10:29:32 UTC
Please try the vanilla 4.14.x sources and report upstream.
Comment 7 Alice Ferrazzi Gentoo Infrastructure gentoo-dev 2018-02-01 05:50:14 UTC
Please post the thread URL here after reporting upstream.
Comment 8 Alexander Miroshnichenko 2018-02-04 08:43:40 UTC
Upstream bug https://bugzilla.kernel.org/show_bug.cgi?id=198659
Comment 9 Alice Ferrazzi Gentoo Infrastructure gentoo-dev 2018-02-07 07:47:57 UTC
Thank you :)
Comment 10 Alice Ferrazzi Gentoo Infrastructure gentoo-dev 2018-02-20 04:06:37 UTC
Could you follow Andrew Morton's suggestion and send the bug to the mailing list?

Please post the thread link if you did; I couldn't find it.
Comment 11 Alexander Miroshnichenko 2018-02-20 07:13:16 UTC
I can find only this one: https://marc.info/?l=linux-mm&m=151787609332462&w=2
Comment 12 Alice Ferrazzi Gentoo Infrastructure gentoo-dev 2018-03-12 11:25:14 UTC
Any news?
Comment 13 Alexander Miroshnichenko 2018-03-19 10:49:20 UTC
There is no information from the upstream kernel team.
Comment 14 Alexander Miroshnichenko 2018-04-10 04:22:39 UTC
The 4.14.31 kernel seems to work fine, without this issue.