Since upgrading from xen-4.7.3, I have been getting Xen kernel panics, usually ending in a stack dump and the double fault handler. Sometimes it gets past the Xen kernel and begins to boot DOM0, but it usually hangs very early in the boot process. If I boot DOM0 directly from grub, it works fine. I cannot downgrade Xen back to 4.7 since it doesn't seem to build against glibc-2.25. I will attempt to locate RS232 level converters in the next few days to get a useful stack trace log from Xen. The hardware is an Intel Q9450 running on an Asus P5QL Pro motherboard. I have Xen set to load early microcode for the CPU, and have tried disabling that to no avail. The system is otherwise stable, running gentoo-sources-4.12.12 with the recommended kernel settings from the Gentoo Xen wiki.

Reproducible: Always
What do you mean by "booting dom0 directly from grub"? Which other method are you using to boot?
Normally when I boot Xen, I use the following grub entry:

    root (hd0,0)
    kernel /boot/xen.gz ucode=scan
    module /boot/bzImage root=/dev/vol00/root rootfstype=ext4 rd.lvm.vg=vol00
    module /boot/initramfs

But what I mean is that if I boot the DOM0 Gentoo OS directly (bypassing Xen), it boots fine with:

    root (hd0,0)
    kernel /boot/bzImage root=/dev/vol00/root rootfstype=ext4 rd.lvm.vg=vol00
    initrd /boot/initramfs
Created attachment 502926 [details]
Serial log dump of Xen boot

This log was captured by adding

    loglvl=all com1=115200,8n1 console=com1

to my kernel line in grub.
Are you running a hardened profile? Could you provide the output of 'emerge --info'? I'm not sure whether that is the cause, but we have a history of the hardened toolchain causing issues. By the way, I do not have a system running a hardened profile.

From the stack dump, it looks related to serial/interrupt handling. It also doesn't make sense that the code at [1] is called several times in this single call stack.

[1] (XEN) [<ffff82d0802413bf>] common_interrupt+0x5f/0x70
I am running a hardened profile, and have been on this machine since 2014. I have never seen anything like this on this setup. Not saying you are wrong, I just don't know how to prove it without rebuilding world without hardened. I think the serial interrupt might be a red herring; another run and I get a different backtrace that has no serial interrupt, which I will attach next. emerge --info is as follows: Portage 2.3.8 (python 3.4.5-final-0, hardened/linux/amd64, gcc-5.4.0, glibc-2.25-r8, 4.12.12-gentoo x86_64) ================================================================= System uname: Linux-4.12.12-gentoo-x86_64-Intel-R-_Core-TM-2_Quad_CPU_Q9450_@_2.66GHz-with-gentoo-2.4.1 KiB Mem: 8179980 total, 8018120 free KiB Swap: 0 total, 0 free Timestamp of repository gentoo: Sat, 04 Nov 2017 20:00:02 +0000 Head commit of repository gentoo: 519dfbcd97ff3b205959c9a85cfcab06d9d0e21b sh bash 4.3_p48-r1 ld GNU ld (Gentoo 2.28.1 p1.0) 2.28.1 app-shells/bash: 4.3_p48-r1::gentoo dev-lang/perl: 5.24.3::gentoo dev-lang/python: 2.7.14::gentoo, 3.4.5::gentoo dev-util/cmake: 3.8.2::gentoo dev-util/pkgconfig: 0.29.2::gentoo sys-apps/baselayout: 2.4.1-r2::gentoo sys-apps/openrc: 0.32.1::gentoo sys-apps/sandbox: 2.10-r4::gentoo sys-devel/autoconf: 2.69::gentoo sys-devel/automake: 1.15-r2::gentoo sys-devel/binutils: 2.28.1::gentoo sys-devel/gcc: 5.4.0-r3::gentoo sys-devel/gcc-config: 1.8-r1::gentoo sys-devel/libtool: 2.4.6-r3::gentoo sys-devel/make: 4.2.1::gentoo sys-kernel/linux-headers: 4.4::gentoo (virtual/os-headers) sys-libs/glibc: 2.25-r8::gentoo Repositories: gentoo location: /usr/portage sync-type: rsync sync-uri: rsync://rsync.gentoo.org/gentoo-portage priority: -1000 ACCEPT_KEYWORDS="amd64" ACCEPT_LICENSE="* -@EULA" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-O2 -march=native -pipe" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo" 
CXXFLAGS="-O2 -march=native -pipe" DISTDIR="/usr/portage/distfiles" FCFLAGS="-O2 -pipe" FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync multilib-strict news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr" FFLAGS="-O2 -pipe" GENTOO_MIRRORS="http://mirror.its.dal.ca/gentoo" LANG="en_US.utf8" LDFLAGS="-Wl,-O1 -Wl,--as-needed" MAKEOPTS="-j8" PKGDIR="/usr/portage/packages" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git" PORTAGE_TMPDIR="/var/tmp" USE="acl amd64 berkdb bzip2 cli cracklib crypt cxx dri gdbm hardened iconv justify modules multilib ncurses nls nptl openmp pam pcre pie readline seccomp session ssl ssp tcpd threads unicode urandom xattr xtpax zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2 sse3 sse4_1 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm 
earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput keyboard mouse" KERNEL="linux" L10N="en" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="en" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6" POSTGRES_TARGETS="postgres9_5" PYTHON_SINGLE_TARGET="python3_4" PYTHON_TARGETS="python2_7 python3_4" RUBY_TARGETS="ruby22" USERLAND="GNU" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" Unset: CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Created attachment 503080 [details] Another serial log dump of Xen kernel panic
I can confirm on two systems:
1. Intel i7-7700 on high end Asus gaming board
2. Supermicro a1sri /w atom c2758

[1] Xen kernel panic with the message "Kernel panic on CPU: X. Double fault"; [2] passes memory scrubbing, then reboots.

As for compiling xen-tools 4.7.3 against glibc 2.25, you can do it by applying a custom patch similar to https://patchwork.kernel.org/patch/9624953/ to the ebuild. You can do that via /etc/portage/patches/app-emulation/xen-tools-4.7.3/glibc-custom.patch

These systems are critical to me, so I found a workaround: using gcc-config, change your compiler to the vanilla one and rebuild Xen. It should work then. Afterwards, change the compiler back to hardened. It is beyond my skill to find out why it crashes, but I hope this helps a bit :)
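For anyone wanting to script the gcc-config workaround above, here is a minimal sketch. It assumes a hardened gcc-5.4 that still ships a "-vanilla" subprofile, and that `gcc-config -C -l` prints one profile per line as " [N] <name>" (the -C is meant to turn off colour so the names parse cleanly). The function names are my own and are not part of gcc-config or portage.

```shell
#!/bin/sh
# Sketch of the workaround: build Xen with the vanilla gcc profile, then
# switch back to whatever profile was active before.

# Read `gcc-config -l` style output on stdin and print the first profile
# name ending in "-vanilla". Exits non-zero if there is none.
pick_vanilla_profile() {
    awk '$2 ~ /-vanilla$/ { print $2; found = 1; exit } END { exit !found }'
}

# Hypothetical helper: switch to vanilla, rebuild Xen, switch back.
rebuild_xen_with_vanilla() {
    original=$(gcc-config -c) || return 1
    vanilla=$(gcc-config -C -l | pick_vanilla_profile) || {
        echo "no vanilla gcc profile found" >&2
        return 1
    }
    gcc-config "$vanilla" || return 1
    . /etc/profile
    emerge --oneshot app-emulation/xen
    status=$?
    # Always switch back, even if the build failed.
    gcc-config "$original"
    . /etc/profile
    return "$status"
}
```

After sourcing the file, run rebuild_xen_with_vanilla as root. It only touches app-emulation/xen; everything else stays built with the hardened profile.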
(In reply to Krzysztof from comment #7)
> I can confirm on two systems:
> 1. Intel i7-7700 on high end Asus gaming board
> 2. Supermicro a1sri /w atom c2758

Thanks for the input Krzysztof, you saved me a lot of time. I had just switched profiles to non-hardened, rebuilt gcc, and was about to rebuild @world with --emptytree. This is a much quicker workaround! I forgot you could switch to the vanilla compiler via gcc-config.
Just a note for others to beware: since gcc-6 includes pie/stack protector by default, Gentoo no longer supplies the subprofiles (as selected by gcc-config). Normally this would be ok, since you can always add -no-pie and -fno-stack-protector to your per-package flags as documented in the hardened wiki. However, Xen is special and uses its own ldflags/cflags. There is an unsupported USE flag for user-supplied custom-cflags, but I didn't fare well with it. My solution for the time being is to keep gcc-5.4 around so I can switch profiles with gcc-config as per the above comments.
I fixed xen-tools-4.7.3 to be compatible with the current glibc (see https://raw.githubusercontent.com/fnordpipe/fnordpipe-overlay/master/patches/app-emulation/xen-tools-4.7.3/0001-fix-glibc-2.25-compat-issues.patch). So now I know that up to 4.7.3, Xen had no issues with the hardened profile. Additionally, I tried xen-4.9.0 today to see if there is a fix. There isn't: xen-4.9.0 ends in the same double fault.
Keep in mind <xen-4.8.2-r1 has CVEs, so keeping yourself on 4.7 might not be a great idea depending on what you have deployed. https://bugs.gentoo.org/631366
I can confirm this as well. Getting the "double fault" error during boot with both 4.8.2 and 4.9.1 with hardened profile, but compiling 4.9.1 with vanilla gcc-5.4.0 works.
I can confirm the bug on fresh installation from "stage3-amd64-hardened-20180104T214501Z.tar.xz". I recompiled gcc without hardening features:

# USE='-hardened' emerge -1qv =sys-devel/gcc-6.4.0-r1 && emerge -1qv =app-emulation/xen-4.9.1-r1

and dom0 booted without error:

# xl info
xen_major              : 4
xen_minor              : 9
xen_extra              : .1
xen_version            : 4.9.1
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
cc_compiler            : x86_64-pc-linux-gnu-gcc (Gentoo 6.4.0-r1 p1.3) 6.4.0
cc_compile_by          :
cc_compile_domain      :
cc_compile_date        : Fri Jan 26 00:33:12 CET 2018
try to compile it with -fstack-check=no in CFLAGS
(In reply to Magnus Granberg from comment #14)
> try to compile it with -fstack-check=no in CFLAGS

Thanks for the suggestion. Can anyone please test it?
(In reply to Magnus Granberg from comment #14)
> try to compile it with -fstack-check=no in CFLAGS

I've tried and it does NOT help.
As I found out in comment #9, CFLAGS is not honoured by Xen; it has its own idea of cflags/ldflags.
This is an issue in Xen 4.9.2 as well. I'm having to switch away from the hardened profile entirely.
(In reply to Eric Gisse from comment #18)
> This is an issue in Xen 4.9.2 as well.
>
> I'm having to switch away from the hardened profile entirely.

I was hit by this too, but instead of changing away from the hardened profile I skipped forward to 4.11.0-r2 which works.

release                : 4.14.65-gentoo
version                : #1 SMP Mon Aug 20 20:19:09 BST 2018
machine                : x86_64
...
xen_major              : 4
xen_minor              : 11
xen_extra              : .0
xen_version            : 4.11.0
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          :
xen_commandline        : placeholder dom0_mem=2048M
cc_compiler            : x86_64-pc-linux-gnu-gcc (Gentoo Hardened 7.3.0-r3 p1.4) 7.3.0
cc_compile_by          :
cc_compile_domain      :
cc_compile_date        : Fri Sep 14 19:54:01 BST 2018
build_id               : 06d505ecda655363134b2fbba025fd3af84ef98e
xend_config_format     : 4
I have been running 4.9.2 for a few days without issues. Perhaps it had something to do with a specific gcc version?

# xl info
release                : 4.14.52-gentoo
version                : #1 SMP Wed Jul 18 23:10:20 UTC 2018
machine                : x86_64
virt_caps              : hvm
xen_major              : 4
xen_minor              : 9
xen_extra              : .2
xen_version            : 4.9.2
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          :
cc_compiler            : x86_64-pc-linux-gnu-gcc (Gentoo Hardened 7.3.0-r3 p1.4) 7.3.0
cc_compile_by          :
cc_compile_date        : Thu Jul 19 17:28:29 UTC 2018
xend_config_format     : 4
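To compare reports, the cc_compiler line of `xl info` is the field that says which compiler built the running Xen. A small sketch for pulling it out follows. The function names are hypothetical, and the "Hardened" match is only a heuristic based on the compiler banners posted in this bug. As far as I know the banner reflects how gcc itself was built, so with gcc-5.4 it would not show whether the vanilla subprofile was selected at build time.

```shell
#!/bin/sh
# Read `xl info` output on stdin and print the value of cc_compiler.
xen_cc_compiler() {
    sed -n 's/^cc_compiler[[:space:]]*:[[:space:]]*//p'
}

# Succeed if the cc_compiler banner on stdin mentions "Hardened".
# Heuristic only: it matches banners like "(Gentoo Hardened 7.3.0-r3 p1.4)".
is_hardened_build() {
    xen_cc_compiler | grep -q 'Hardened'
}
```

On a booted dom0 this would be used as `xl info | xen_cc_compiler`, or `xl info | is_hardened_build && echo hardened`.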
Thanks for the report, closing for now as 2 people have 4.9.2 and 4.11.0 working on hardened.