
Bug 615736

Summary: sys-apps/kexec-tools on hardened-musl "Cannot load /boot/vmlinuz"; fails with no explanation
Product: Gentoo Linux
Reporter: Chad Joan <chadjoan>
Component: Current packages
Assignee: Gentoo's Team for Core System packages <base-system>
Status: RESOLVED UPSTREAM
Severity: normal
CC: musl, tsmksubc
Priority: Normal
Version: unspecified
Hardware: All
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 430702    
Attachments: kexec-tools-2.0.13-r1-sscanf.patch

Description Chad Joan 2017-04-16 07:18:33 UTC
Created attachment 470112 [details, diff]
kexec-tools-2.0.13-r1-sscanf.patch

For version 2.0.13-r1, the problem looks like this:
$ /etc/init.d/kexec start
 * Caching service dependencies ...                                       [ ok ]
 * Using kernel image /boot/vmlinuz (with /boot/initrd) for kexec ...
Cannot load /boot/vmlinuz                                                 [ !! ]
 * ERROR: kexec failed to start

For version 2.0.14, the problem looks like this:
$ /etc/init.d/kexec start
 * Caching service dependencies ...                                       [ ok ]
 * Using kernel image /boot/vmlinuz (with /boot/initrd) for kexec ...
Cannot get kernel page_offset_base symbol address
Cannot load /boot/vmlinuz                                                 [ !! ]
 * ERROR: kexec failed to start

Note that the "Cannot get kernel page_offset_base symbol address" error message is unrelated to the load failure.  Neither version I tried gives any indication of why /boot/vmlinuz could not be loaded, and the message that 2.0.14 prints is a red herring.


Requisite conditions:
- Use musl libc. Any other libc that strictly adheres to ISO C should do as well ;)
- Have sys-apps/kexec-tools-2.0.14 or earlier installed (as of this writing, this is the latest version that isn't upstream's git HEAD).
- Use a crash kernel to catch kernel panics: More specifically, pass -p to kexec; I set KEXEC_OPT_ARGS="-p" in /etc/conf.d/kexec to get my bug on.


Solution:
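This problem was caused by kexec using %Lx specifiers in sscanf format strings when %llx was intended.  ISO C defines the L length modifier only for floating-point conversions (long double); glibc accepts it as a synonym for ll on integer conversions, but a strictly conforming libc such as musl does not have to, so the parsed addresses never get stored.  The bug is fixed in this commit: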
https://github.com/horms/kexec-tools/commit/47cc70157c6610d4c01a7ab4f8f0d24ab054b43b
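As a minimal sketch of the corrected pattern (this is not the actual kexec-tools code), the function below parses one "start-end : name" line of the kind found in /proc/iomem using %llx.  The name parse_iomem_line and its signature are made up for illustration, and the exact format string is an assumption modelled on the description above.

```c
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only: parse one line such as
 *   "2c000000-33ffffff : Crash kernel"
 * Returns 0 on success and -1 if the line does not start with a hex range.
 * On success, *name_offset is the index in `line` where the name begins.
 */
static int parse_iomem_line(const char *line,
                            unsigned long long *start,
                            unsigned long long *end,
                            int *name_offset)
{
    int consumed = 0;

    /*
     * %llx is the portable conversion for unsigned long long.
     * "%Lx" would only be reliable on libcs that treat L as ll
     * for integer conversions; ISO C does not define that.
     * %n stores the number of characters read so far and does not
     * count towards sscanf's return value.
     */
    int count = sscanf(line, "%llx-%llx : %n", start, end, &consumed);
    if (count != 2)
        return -1;

    *name_offset = consumed;
    return 0;
}
```

With the input "2c000000-33ffffff : Crash kernel", this should store 0x2c000000 and 0x33ffffff and leave the offset pointing at "Crash kernel".  Checking sscanf's return value also means a line that fails to parse is reported, instead of silently leaving start and end untouched.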

The "Cannot get kernel page_offset_base symbol address" error is another bug that upstream is aware of, and it seems to be benign.  Details:
https://bugzilla.redhat.com/show_bug.cgi?id=1428246
https://bugzilla.redhat.com/show_bug.cgi?id=1432322

The attached patch is based on the above commit that fixes the sscanf format strings.  With this patch my kexec kernel is no longer homeless :)
I tested it on versions 2.0.13-r1 and 2.0.14.  It might also work for earlier versions.

(I wish my Google searches had brought up that commit at the beginning of this... instead I ended up spending some hours figuring out that the crash_reserved_mem array always described one region with 0 as its start and end address, then figuring out /why/ the crash_reserved_mem array ended up that way.  Once I learned that it was about format specifiers, the upstream commit was easy to find, of course.  o.O  )

Hope that helps.
Comment 1 Brian Evans Gentoo Infrastructure gentoo-dev 2017-10-24 16:50:08 UTC
This was included with kexec-tools-2.0.15