Bug 642384 - sys-kernel/gentoo-sources-4.14.9 CPU hard and soft lockups before initramfs on boot
Status: RESOLVED DUPLICATE of bug 642268
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal major
Assignee: Gentoo Linux bug wranglers
Reported: 2017-12-27 09:57 UTC by Jaak Ristioja
Modified: 2017-12-27 18:35 UTC

Attachments
xz-compressed .config for gentoo-sources-4.14.9 (config.xz,21.12 KB, application/x-xz)
2017-12-27 09:57 UTC, Jaak Ristioja
crash1.jpg - first screenshot of first crash (crash1.jpg,108.33 KB, image/jpeg)
2017-12-27 09:59 UTC, Jaak Ristioja
crash1b.jpg - second screenshot of first crash (crash1b.jpg,98.99 KB, image/jpeg)
2017-12-27 10:00 UTC, Jaak Ristioja
crash2.jpg - first screenshot of second crash (crash2.jpg,125.30 KB, image/jpeg)
2017-12-27 10:00 UTC, Jaak Ristioja
crash2b.jpg - second screenshot of second crash (crash2b.jpg,96.99 KB, image/jpeg)
2017-12-27 10:01 UTC, Jaak Ristioja

Description Jaak Ristioja 2017-12-27 09:57:36 UTC
Created attachment 511720 [details]
xz-compressed .config for gentoo-sources-4.14.9

sys-kernel/gentoo-sources-4.14.9 fails to boot on a Lenovo ThinkPad T440p laptop (20AN006VMS, BIOS GLET90WW 2.44) due to CPU lockups. With sys-kernel/gentoo-sources-4.14.8-r1 it boots well.

It seems to lock up before busybox in the initramfs has the opportunity to execute "echo" commands from the init script.

There seem to be two versions of the crash. The first one (see screenshots crash1.jpg and crash1b.jpg) results in CPU hard lockups in do_double_fault+0x0/0x30 ("NMI watchdog: Watchdog detected hard LOCKUP on cpu 1"):

   <#DF>
   do_double_fault+0xb/0xc0
   </#DF>

In this case the system echoes keyboard input to the screen and responds to SysRq, which I can use to emergency sync, remount read-only and reboot the system. Given enough entropy from the keyboard, it even outputs "random: crng init done".

The second version of this results in soft lockups (see screenshots crash2.jpg and crash2b.jpg) in multi_cpu_stop+0x4b/0xd0 ("watchdog: BUG: soft lockup - CPU#7 stuck for 23s! [migration/7:52]"):

    ? cpu_stop_queue_work+0x90/0x90
    cpu_stopper_thread+0x8e/0x110
    smpboot_thread_fn+0x100/0x1e0
    kthread+0x101/0x140
    ? sort_range+0x20/0x20
    ? __kthread_create_on_node+0x180/0x180
    ret_from_fork+0x1f/0x30

In this case the system does not echo keyboard input to the screen, but does respond to SysRq as in the first case. Since the error messages were many and long and scrolled by instantly, I only managed to photograph the last screenful. These lockups occurred at the same point in the boot as the hard lockup in the first case.

See the screenshots for more details.
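
For readers less familiar with the trace notation above: each frame is printed as symbol+offset/size, where offset is the position inside the function and size is the function's length, both in hexadecimal. A leading "?" marks an address that was found on the stack but that the unwinder could not confirm as part of the call chain. A minimal shell sketch that splits one frame into those parts (the helper name parse_frame is hypothetical, not part of any kernel tooling):

```shell
#!/bin/sh
# parse_frame is a hypothetical helper for illustration only.
# It splits one call-trace frame of the form "symbol+0xOFF/0xSIZE"
# and prints the offset and size in decimal.
parse_frame() {
    frame=$1
    note=reliable
    case $frame in
        '? '*)
            # "? " prefix: address seen on the stack, not verified by the unwinder
            frame=${frame#'? '}
            note=unreliable
            ;;
    esac
    sym=${frame%%+*}     # text before the first "+"
    rest=${frame#*+}     # "0xOFF/0xSIZE"
    off=${rest%%/*}      # offset into the function
    size=${rest#*/}      # total size of the function
    printf '%s offset=%d size=%d (%s)\n' "$sym" "$off" "$size" "$note"
}

parse_frame 'multi_cpu_stop+0x4b/0xd0'
parse_frame '? sort_range+0x20/0x20'
```

Applied to multi_cpu_stop+0x4b/0xd0, this reports that the CPU was 75 bytes into a 208-byte function when the soft lockup was detected.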
Comment 1 Jaak Ristioja 2017-12-27 09:59:25 UTC
Created attachment 511722 [details]
crash1.jpg - first screenshot of first crash
Comment 2 Jaak Ristioja 2017-12-27 10:00:00 UTC
Created attachment 511724 [details]
crash1b.jpg - second screenshot of first crash
Comment 3 Jaak Ristioja 2017-12-27 10:00:54 UTC
Created attachment 511726 [details]
crash2.jpg - first screenshot of second crash
Comment 4 Jaak Ristioja 2017-12-27 10:01:18 UTC
Created attachment 511728 [details]
crash2b.jpg - second screenshot of second crash
Comment 5 Tomáš Mózes 2017-12-27 17:00:00 UTC
4.14.9 seems like a bad release, please use 4.14.8-r1 or wait for 4.14.10.
Comment 6 Jaak Ristioja 2017-12-27 18:17:26 UTC
The kernel was built with GCC 6.4.0 on a hardened/linux/amd64/no-multilib profile. When rebuilt using GCC 7.2.0 I still got a CPU soft lockup on boot. When rebuilt with GCC 5.4.0 the kernel boots and runs without this issue.

I think this is a duplicate of bug #642268.

*** This bug has been marked as a duplicate of bug 642268 ***
Comment 7 Jaak Ristioja 2017-12-27 18:35:03 UTC
(In reply to Michael Cook from https://bugs.gentoo.org/642268#c9)
> I booted fine compiling with 7.2.0, so maybe whatever bug it is, is fixed in
> 7.2?

In response to the above comment on the duplicate, I double-checked whether it boots when compiled with GCC 7.2.0. I ran `make mrproper`, restored the .config, recompiled with GCC 7.2.0, removed old kernel versions from /boot/, ran `make install` and booted. I still got the issue. This time, after the boot delay, I saw a do_double_fault error followed by the soft lockup messages a few seconds later.
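
For anyone wanting to repeat this test, the rebuild steps described above could be scripted roughly as follows. This is only a sketch under assumptions: the source directory, the location of the saved .config and the gcc-config profile name are guesses, and the `olddefconfig` and `modules_install` steps are not mentioned in the comment. Removing old kernels from /boot/ is left out because the file names are not given. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of a clean kernel rebuild with a specific GCC version.
# All three values below are assumptions; adjust them for the local system.
KERNEL_DIR=${KERNEL_DIR:-/usr/src/linux}               # assumed source location
SAVED_CONFIG=${SAVED_CONFIG:-/root/config-4.14.9}      # assumed copy of the .config
GCC_PROFILE=${GCC_PROFILE:-x86_64-pc-linux-gnu-7.2.0}  # assumed gcc-config profile
DRY_RUN=${DRY_RUN:-1}                                  # 1 = print commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

rebuild_kernel() {
    run gcc-config "$GCC_PROFILE" &&                 # select the compiler
    run . /etc/profile &&                            # pick up the new compiler in this shell
    run make -C "$KERNEL_DIR" mrproper &&            # wipe build products and .config
    run cp "$SAVED_CONFIG" "$KERNEL_DIR/.config" &&  # put the saved .config back
    run make -C "$KERNEL_DIR" olddefconfig &&        # assumption: refresh config non-interactively
    run make -C "$KERNEL_DIR" &&                     # build
    run make -C "$KERNEL_DIR" modules_install &&     # assumption: not mentioned in the report
    run make -C "$KERNEL_DIR" install                # copy the kernel to /boot
}

rebuild_kernel
```

Running it with `DRY_RUN=0` as root would execute the steps for real. On a kernel that does boot, `cat /proc/version` shows which compiler version it was built with, which is a quick way to confirm that the intended GCC was actually used.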