Created attachment 770963
xen serial console capture of crash

Using gdbsx to debug a guest system with xen-4.15 or xen-4.16 leads to a RIP panic in the Xen hypervisor (RIP info attached).

Steps to reproduce:

Start up a Linux guest system (you will need a copy of the Linux source tree used to build the kernel on the guest).

In the dom0 host:

  gdbsx -a {domid} 64 9999

On a separate system that has the source tree of the Linux kernel for the guest:

  cd {linux src tree}
  gdb vmlinux
  target remote {xen system}:9999
  lx-symbols

--- at this point the Xen system will have crashed.

It appears that the problem is that CONFIG_GDBSX is not being set in xen/.config (this could very well be a bug in upstream Xen). Upstream Xen has changed the way the Xen hypervisor is configured (it's now via Kconfig). Even though they've documented that CONFIG_GDBSX=y is a default, it's not being configured.

I was able to build a Xen hypervisor that doesn't crash when gdbsx is used by performing the following steps:

  ebuild xen-4.16.0-r5.ebuild configure
  pushd /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen
  make menuconfig
    --- enable Debugging Options ---> Developer Checks
    ** this should show that Guest debugging with gdbsx is now enabled **
    save/exit from menuconfig
  popd
  ebuild xen-4.16.0-r5.ebuild install
  ebuild xen-4.16.0-r5.ebuild qmerge

After doing the above build, I was unable to reproduce the crash.

Additional information:
The Xen commit that changed how gdbsx gets configured is:
xen: make gdbsx support configurable (e726a82ca0)
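A quick way to confirm the symptom described in this report is to inspect the generated xen/.config before installing. The following is only a minimal sketch: `kconfig_state` is a hypothetical helper name (not part of Xen or Portage), and it assumes the usual Kconfig file conventions (`CONFIG_FOO=y` for enabled, `# CONFIG_FOO is not set` for disabled).

```shell
#!/bin/sh
# Report the state of a Kconfig symbol in a .config-style file.
# kconfig_state is a hypothetical helper, introduced here for illustration only.
# Usage: kconfig_state <config-file> <symbol-without-CONFIG_-prefix>
kconfig_state() {
    config_file=$1
    symbol=$2
    if grep -q "^CONFIG_${symbol}=y\$" "$config_file"; then
        # Symbol is explicitly enabled.
        echo "enabled"
    elif grep -q "^# CONFIG_${symbol} is not set\$" "$config_file"; then
        # Symbol is known to Kconfig but switched off.
        echo "disabled"
    else
        # Symbol does not appear in the file at all.
        echo "missing"
    fi
}

# Example, using the work directory from the report:
#   kconfig_state /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen/.config GDBSX
```

If this reports anything other than "enabled" for GDBSX, the resulting hypervisor would be expected to lack gdbsx support.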
Upstream Xen has created a patch for the crash:
https://lists.xenproject.org/archives/html/xen-devel/2022-04/msg01175.html

While the above patch fixes the crash, it will leave a Xen system without gdbsx support unless it's configured back in. I guess a feature request is needed for an ebuild USE setting to enable gdbsx.
If I am not mistaken, then you get gdbsx by setting the 'debug' use flag on app-emulation/xen.
(In reply to Florian Schmaus from comment #2)
> If I am not mistaken, then you get gdbsx by setting the 'debug' use flag on
> app-emulation/xen.

Not anymore. I tried setting USE=debug, which passes the option to configure. There's now a message:

With USE=debug

>>> Source prepared.
>>> Configuring source in /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0 ...
>>> Source configured.
>>> Compiling source in /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0 ...
make -j8 V=1 CC=x86_64-pc-linux-gnu-gcc LDFLAGS= LD=x86_64-pc-linux-gnu-ld -C xen debug=y
make: Entering directory '/var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen'
Makefile:65: "You must use e.g. 'make menuconfig' to enable/disable debug now."
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=22b2b448d0abacef7fabcd77a0dde9d6a15f1339

commit 22b2b448d0abacef7fabcd77a0dde9d6a15f1339
Author:     Florian Schmaus <flow@gentoo.org>
AuthorDate: 2022-04-17 21:34:02 +0000
Commit:     Florian Schmaus <flow@gentoo.org>
CommitDate: 2022-04-17 21:51:58 +0000

    app-emulation/xen: fix debug use flag

    Upstream changed to a kconfig build system for the Xen hypervisor. Even though still documented, passing 'debug=y' as make argument does not enable a debug build. We now create a Gentoo specific kconfig that is merged into upstream's default configuration.

    This also allows to drop the flask patch.

    Bug: https://bugs.gentoo.org/838730
    Signed-off-by: Florian Schmaus <flow@gentoo.org>

 app-emulation/xen/xen-4.16.0-r5.ebuild | 45 +++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 12 deletions(-)
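The commit message describes writing a Gentoo-specific Kconfig fragment and merging it into the default configuration. A minimal sketch of that idea follows. It is an illustration under stated assumptions, not the ebuild's actual code: `write_debug_fragment` and its `yes`/`no` argument are hypothetical names, and only CONFIG_DEBUG is emitted because that is the one symbol the build log later in this thread shows being redefined by the fragment.

```shell
#!/bin/sh
# Emit a Kconfig fragment for a debug build on stdout.
# write_debug_fragment is a hypothetical helper for illustration; the real
# ebuild may be structured quite differently.
# Usage: write_debug_fragment yes|no
write_debug_fragment() {
    use_debug=$1
    if [ "$use_debug" = "yes" ]; then
        # Per the report, enabling Developer Checks also turns on gdbsx by default.
        echo "CONFIG_DEBUG=y"
    fi
}

# Possible use inside the xen/ source directory; the merge command is the one
# that appears in the build log in this thread:
#   write_debug_fragment yes > gentoo-config
#   ./tools/kconfig/merge_config.sh -m -r .config gentoo-config
```

Keeping the fragment generation separate from the merge step makes it easy to inspect what a given USE setting would request before anything touches .config.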
Came across another problem with recent Xen and gdbsx: it turns out that an additional .config setting is needed, CONFIG_CRASH_DEBUG=y.

Without CONFIG_CRASH_DEBUG, it's impossible to set a breakpoint within a guest via gdbsx. The breakpoint itself is "set", but when the breakpoint is hit, the trap is not passed back to gdbsx; instead the trap is handed off to the guest, which results in "odd" errors within the guest system.

The upstream Xen commit that I believe is the cause of this behavior is:
"xen: put more code under CONFIG_CRASH_DEBUG" (137a233186b6d436)

Not sure if you want me to open another ticket for this or not; let me know. I haven't reported this to the Xen mailing list yet.

I believe the behavior of the settings for gdbsx is inconsistent. Some of the commits seem to be related to debugging the Xen hypervisor, and others to gdbsx. gdbsx isn't really used to debug the Xen hypervisor; it is used for attaching a debugger to a guest system so that it's possible to run a debugging session against the guest's kernel.

Also I want to mention that with the patch "app-emulation/xen: fix debug use flag" applied, there are some "error" messages early on in the compile step. They don't seem to be causing any problem, but they complain about a read error/early end of file when processing the .config file, which I think causes a retry to rebuild the .config file. The Gentoo-specific settings are present even with the error messages.
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=37851981078ad4f6194cfa07041a2bcfc26d6110

commit 37851981078ad4f6194cfa07041a2bcfc26d6110
Author:     Florian Schmaus <flow@gentoo.org>
AuthorDate: 2022-04-21 09:30:32 +0000
Commit:     Florian Schmaus <flow@gentoo.org>
CommitDate: 2022-04-21 09:32:55 +0000

    app-emulation/xen: enable CONFIG_CRASH_DEBUG on USE=debug

    Bug: https://bugs.gentoo.org/838730
    Signed-off-by: Florian Schmaus <flow@gentoo.org>

 app-emulation/xen/{xen-4.16.0-r5.ebuild => xen-4.16.0-r6.ebuild} | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
As requested via IRC, here are the error messages that are generated:

---
Developer Checks (DEBUG) [Y/n/?] y
Crash Debugging Support (CRASH_DEBUG) [N/y/?] (NEW)
Error in reading or end of file.
(repeated for different options)
---

As mentioned above, the build does finish correctly (with the correct configuration).

Below are the messages in context:

---
$ ebuild xen-4.16.0-r5.ebuild compile
....
>>> Source prepared.
>>> Configuring source in /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0 ...
make -j8 defconfig
set -e; { echo 'CONFIG_XSM_FLASK_POLICY=n'; :; } > .allconfig.tmp
make -f /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen/tools/kconfig/Makefile.kconfig ARCH=x86_64 SRCARCH=x86 HOSTCC="gcc" HOSTCXX="g++" KCONFIG_ALLCONFIG=.allconfig.tmp defconfig
make[1]: Entering directory '/var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen'
gcc -Wp,-MD,tools/kconfig/.conf.o.d -c -o tools/kconfig/conf.o tools/kconfig/conf.c
gcc -Wp,-MD,tools/kconfig/.confdata.o.d -c -o tools/kconfig/confdata.o tools/kconfig/confdata.c
gcc -Wp,-MD,tools/kconfig/.expr.o.d -c -o tools/kconfig/expr.o tools/kconfig/expr.c
flex -otools/kconfig/lexer.lex.c -L tools/kconfig/lexer.l
bison -o tools/kconfig/parser.tab.c --defines=tools/kconfig/parser.tab.h -t -l tools/kconfig/parser.y
gcc -Wp,-MD,tools/kconfig/.preprocess.o.d -c -o tools/kconfig/preprocess.o tools/kconfig/preprocess.c
gcc -Wp,-MD,tools/kconfig/.symbol.o.d -c -o tools/kconfig/symbol.o tools/kconfig/symbol.c
gcc -Wp,-MD,tools/kconfig/.lexer.lex.o.d -I /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen/tools/kconfig -c -o tools/kconfig/lexer.lex.o tools/kconfig/lexer.lex.c
gcc -Wp,-MD,tools/kconfig/.parser.tab.o.d -I /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen/tools/kconfig -c -o tools/kconfig/parser.tab.o tools/kconfig/parser.tab.c
gcc -o tools/kconfig/conf tools/kconfig/conf.o tools/kconfig/confdata.o tools/kconfig/expr.o tools/kconfig/lexer.lex.o tools/kconfig/parser.tab.o tools/kconfig/preprocess.o tools/kconfig/symbol.o
tools/kconfig/conf --defconfig=arch/x86/configs/x86_64_defconfig Kconfig
#
# configuration written to .config
#
make[1]: Leaving directory '/var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen'
 * ./tools/kconfig/merge_config.sh -m -r .config gentoo-config
Using .config as base
Merging gentoo-config
Value of CONFIG_DEBUG is redefined by fragment gentoo-config:
Previous value: # CONFIG_DEBUG is not set
New value: CONFIG_DEBUG=y

#
# merged configuration written to .config (needs make)
#
>>> Source configured.
>>> Compiling source in /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0 ...
make -j8 V=1 CC=x86_64-pc-linux-gnu-gcc LDFLAGS= LD=x86_64-pc-linux-gnu-ld -C xen
make: Entering directory '/var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen'
make -f /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen/tools/kconfig/Makefile.kconfig ARCH=x86_64 SRCARCH=x86 HOSTCC="x86_64-pc-linux-gnu-gcc" HOSTCXX="g++" syncconfig
make[1]: Entering directory '/var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen'
x86_64-pc-linux-gnu-gcc -Wp,-MD,tools/kconfig/.conf.o.d -c -o tools/kconfig/conf.o tools/kconfig/conf.c
x86_64-pc-linux-gnu-gcc -Wp,-MD,tools/kconfig/.confdata.o.d -c -o tools/kconfig/confdata.o tools/kconfig/confdata.c
x86_64-pc-linux-gnu-gcc -Wp,-MD,tools/kconfig/.expr.o.d -c -o tools/kconfig/expr.o tools/kconfig/expr.c
flex -otools/kconfig/lexer.lex.c -L tools/kconfig/lexer.l
bison -o tools/kconfig/parser.tab.c --defines=tools/kconfig/parser.tab.h -t -l tools/kconfig/parser.y
x86_64-pc-linux-gnu-gcc -Wp,-MD,tools/kconfig/.preprocess.o.d -c -o tools/kconfig/preprocess.o tools/kconfig/preprocess.c
x86_64-pc-linux-gnu-gcc -Wp,-MD,tools/kconfig/.symbol.o.d -c -o tools/kconfig/symbol.o tools/kconfig/symbol.c
x86_64-pc-linux-gnu-gcc -Wp,-MD,tools/kconfig/.lexer.lex.o.d -I /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen/tools/kconfig -c -o tools/kconfig/lexer.lex.o tools/kconfig/lexer.lex.c
x86_64-pc-linux-gnu-gcc -Wp,-MD,tools/kconfig/.parser.tab.o.d -I /var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen/tools/kconfig -c -o tools/kconfig/parser.tab.o tools/kconfig/parser.tab.c
x86_64-pc-linux-gnu-gcc -o tools/kconfig/conf tools/kconfig/conf.o tools/kconfig/confdata.o tools/kconfig/expr.o tools/kconfig/lexer.lex.o tools/kconfig/parser.tab.o tools/kconfig/preprocess.o tools/kconfig/symbol.o
tools/kconfig/conf --syncconfig Kconfig
*
* Restart config...
*
*
* Debugging Options
*
Developer Checks (DEBUG) [Y/n/?] y
Crash Debugging Support (CRASH_DEBUG) [N/y/?] (NEW)
Error in reading or end of file.
Guest debugging with gdbsx (GDBSX) [Y/n/?] (NEW)
Error in reading or end of file.
Compile Xen with debug info (DEBUG_INFO) [Y/n/?] (NEW)
Error in reading or end of file.
Compile Xen with frame pointers (FRAME_POINTER) [Y/n/?] (NEW)
Error in reading or end of file.
Lock Profiling (DEBUG_LOCK_PROFILE) [N/y/?] (NEW)
Error in reading or end of file.
Lock debugging (DEBUG_LOCKS) [Y/n/?] (NEW)
Error in reading or end of file.
Performance Counters (PERF_COUNTERS) [N/y/?] (NEW)
Error in reading or end of file.
Verbose debug messages (VERBOSE_DEBUG) [Y/n/?] (NEW)
Error in reading or end of file.
Page scrubbing test (SCRUB_DEBUG) [Y/n/?] (NEW)
Error in reading or end of file.
Undefined behaviour sanitizer (UBSAN) [N/y/?] (NEW)
Error in reading or end of file.
Debug trace support (DEBUG_TRACE) [N/y/?] (NEW)
Error in reading or end of file.
Poison free xenpool blocks (XMEM_POOL_POISON) [Y/n/?] (NEW)
Error in reading or end of file.
make[1]: Leaving directory '/var/tmp/portage/app-emulation/xen-4.16.0-r5/work/xen-4.16.0/xen'
make -f Rules.mk _build
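The prompts marked (NEW) in the log above are symbols that have no value in the merged .config. A likely reading is that syncconfig asks about each one, finds nothing to read on standard input, prints "Error in reading or end of file." and falls back to the default shown in brackets. To see which symbols are affected, the prompts can be filtered out of a captured log. This is a rough sketch: `list_new_symbols` is a hypothetical helper name, and it assumes the log was saved with one prompt per line.

```shell
#!/bin/sh
# Read syncconfig output on stdin and print the name of every symbol that
# was prompted for as (NEW), one per line.
# list_new_symbols is a hypothetical helper, introduced here for illustration.
list_new_symbols() {
    # A prompt looks like: "Some description (SYMBOL_NAME) [Y/n/?] (NEW)".
    # Capture the parenthesised symbol name that precedes the [..] choices.
    sed -n 's/.*(\([A-Z0-9_]*\)) \[[^]]*\] (NEW).*/\1/p'
}

# Example, with build.log being a saved copy of the compile output:
#   list_new_symbols < build.log
```

Any symbol in that list could be given an explicit value in the configuration fragment. In Kconfig-based trees such prompts are also commonly avoided by resolving new symbols to their defaults non-interactively before the build (for example with an olddefconfig-style target, if the tree provides one); this thread does not say which approach the later ebuild revision took.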
The xen-4.16.0-r7 ebuild fixed the error messages and the .config file contains the correct configuration. Thanks