Bug 643208 - When using Debian Xen DOM0, the Gentoo DOMU crashes after a day-two when using new kernels
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Xen Devs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-03 07:04 UTC by Vladimir Romanov (RETIRED)
Modified: 2018-02-23 06:12 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
dmesg -w results (dmesg.png,162.37 KB, image/png)
2018-01-03 07:05 UTC, Vladimir Romanov (RETIRED)

Description Vladimir Romanov (RETIRED) gentoo-dev 2018-01-03 07:04:36 UTC
I'm using Debian Jessie as the Xen dom0, with numerous Gentoo HVM domUs. All of them run gentoo-sources in various versions; most are on gentoo-sources 4.4.6 and run fine. I updated one of them to kernel version 4.14.8-r1, and that domU then began to crash about once a day.
It doesn't report anything, it just freezes. I can't log in to it over SSH, or even ping it.
I ran dmesg -w and the results are attached (the output mentions handle_irq in the 8139cp module).

The Debian dom0 has kernel 3.16.0-4-amd64 and Xen version 4.4.1. The VM's emerge --info is attached, but it was taken with kernel 4.4.6 (the kernel is the only difference), because it is a working VM and I can't stop it.

Reproducible: Always

Steps to Reproduce:
1. Run a Gentoo domU for a day or two with a kernel newer than 4.4.6
Actual Results:  
VM Crashes

Expected Results:  
VM Works
Comment 1 Vladimir Romanov (RETIRED) gentoo-dev 2018-01-03 07:05:01 UTC
Portage 2.3.13 (python 2.7.14-final-0, default/linux/amd64/17.0, gcc-4.9.4, glibc-2.25-r9, 4.4.6-gentoo x86_64)
=================================================================
System uname: Linux-4.4.6-gentoo-x86_64-Intel-R-_Core-TM-_i7-4770_CPU_@_3.40GHz-with-gentoo-2.4.1
KiB Mem:     8132700 total,    233444 free
KiB Swap:    4985852 total,   4985852 free
Timestamp of repository gentoo: Mon, 25 Dec 2017 14:06:45 +0000
Head commit of repository gentoo: c409c95d1d987ea1f15457bce167a49a2566ce1c

sh bash 4.3_p48-r1
ld GNU ld (Gentoo 2.25.1 p1.1) 2.25.1
app-shells/bash:          4.3_p48-r1::gentoo
dev-lang/perl:            5.24.3::gentoo
dev-lang/python:          2.7.14-r1::gentoo, 3.4.5::gentoo, 3.5.4-r1::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.4.1-r2::gentoo
sys-apps/openrc:          0.34.11::gentoo
sys-apps/sandbox:         2.10-r4::gentoo
sys-devel/autoconf:       2.69::gentoo
sys-devel/automake:       1.14.1::gentoo, 1.15.1-r1::gentoo
sys-devel/binutils:       2.25.1-r1::gentoo, 2.26.1::gentoo, 2.29.1-r1::gentoo
sys-devel/gcc:            4.9.4::gentoo, 6.4.0::gentoo
sys-devel/gcc-config:     1.8-r1::gentoo
sys-devel/libtool:        2.4.6-r3::gentoo
sys-devel/make:           4.2.1::gentoo
sys-kernel/linux-headers: 4.4::gentoo (virtual/os-headers)
sys-libs/glibc:           2.25-r9::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/gentoo
    priority: -1000

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.6/ext-active/ /etc/php/apache2-php7.0/ext-active/ /etc/php/apache2-php7.1/ext-active/ /etc/php/cgi-php5.6/ext-active/ /etc/php/cgi-php7.0/ext-active/ /etc/php/cgi-php7.1/ext-active/ /etc/php/cli-php5.6/ext-active/ /etc/php/cli-php7.0/ext-active/ /etc/php/cli-php7.1/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync multilib-strict news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="ru_RU.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="acl amd64 apache2 berkdb bindist bzip2 cli cracklib crypt curl cxx dri firmware fortran gd gdbm iconv imagemagick ipv6 mmx modules multilib ncurses nls nptl openmp pam pcre pdo php png postgres readline seccomp session sse sse2 ssl tcpd threads truetype unicode x264 xattr xml zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6 php7-0" POSTGRES_TARGETS="postgres9_5" PYTHON_SINGLE_TARGET="python3_5" PYTHON_TARGETS="python2_7 python3_5" RUBY_TARGETS="ruby22" USERLAND="GNU" VIDEO_CARDS="amdgpu fbdev intel nouveau radeon radeonsi vesa dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit 
sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, MAKEOPTS, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 2 Vladimir Romanov (RETIRED) gentoo-dev 2018-01-03 07:05:56 UTC
Created attachment 512956
dmesg -w results
Comment 3 Tomáš Mózes 2018-01-03 15:24:54 UTC
Does it also happen with the latest 4.4.11?
Comment 4 Vladimir Romanov (RETIRED) gentoo-dev 2018-01-05 06:01:07 UTC
I don't see 4.4.11 in sys-kernel/gentoo-sources. Only 4.4.109 and 4.14.11. Do you mean 4.14.11?
Comment 5 Tomáš Mózes 2018-01-05 06:12:42 UTC
Yes, sorry, the latest 4.14 (currently 4.14.11-r2).
Comment 6 Vladimir Romanov (RETIRED) gentoo-dev 2018-01-05 10:04:48 UTC
Testing it. I need several days to test.
Comment 7 Tomáš Mózes 2018-01-08 05:45:08 UTC
I haven't observed any crashes yet with Xen 4.9/4.10, but sometimes processes get stuck in D state (uninterruptible sleep), and only a domU restart helps.
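
To check whether a domU has processes stuck in D state, something like the following could be run inside the guest. This is only a rough sketch: the function names are made up for illustration, and it assumes a Linux /proc where the state letter is the first field after the parenthesised command name in /proc/<pid>/stat.

```python
import os


def parse_state(stat_line: str) -> str:
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so split on the last ')'.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def uninterruptible_pids(proc_root: str = "/proc") -> list:
    """List PIDs currently in state D under proc_root (hypothetical helper)."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as handle:
                if parse_state(handle.read()) == "D":
                    pids.append(int(entry))
        except OSError:
            # The process exited between listdir() and open().
            continue
    return sorted(pids)


if __name__ == "__main__":
    print(uninterruptible_pids())
```

A process that shows up here once is not necessarily a problem, since D state is normal during short I/O waits; PIDs that stay in the list across repeated runs are the ones worth reporting.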

So you use the Debian kernel for the dom0 and custom Gentoo-built kernels for the domUs?
Comment 8 Tomáš Mózes 2018-01-10 08:31:08 UTC
This may be related to:
https://www.novell.com/support/kb/doc.php?id=7018590
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=880554

Try raising the Xen boot parameter to gnttab_max_frames=256 on your dom0.
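
On a Debian dom0 that boots Xen through GRUB 2, the hypervisor options are usually kept in GRUB_CMDLINE_XEN_DEFAULT in /etc/default/grub. The sketch below shows one way the option could be added to that line without touching other options. It is an illustration only: the helper name set_xen_option is hypothetical, it works on the file contents as a string, and it assumes the variable is written on a single double-quoted line.

```python
import re

VAR = "GRUB_CMDLINE_XEN_DEFAULT"


def set_xen_option(grub_text: str, key: str, value: str) -> str:
    """Return grub_text with key=value present in GRUB_CMDLINE_XEN_DEFAULT."""
    pattern = re.compile(r'^%s="([^"]*)"[ \t]*$' % VAR, re.MULTILINE)
    match = pattern.search(grub_text)
    if match is None:
        # Variable not set yet: append a new line at the end.
        sep = "" if (not grub_text or grub_text.endswith("\n")) else "\n"
        return '%s%s%s="%s=%s"\n' % (grub_text, sep, VAR, key, value)
    # Drop any earlier setting of the same key, then add the new one.
    options = [o for o in match.group(1).split()
               if not o.startswith(key + "=")]
    options.append("%s=%s" % (key, value))
    new_line = '%s="%s"' % (VAR, " ".join(options))
    return grub_text[:match.start()] + new_line + grub_text[match.end():]


if __name__ == "__main__":
    # Example input; dom0_mem=2048M is a made-up existing option.
    sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_XEN_DEFAULT="dom0_mem=2048M"\n'
    print(set_xen_option(sample, "gnttab_max_frames", "256"))
```

After changing the real file, update-grub has to be run and the dom0 rebooted; the xen_commandline line in the output of xl info should then show the new value.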

Also, we could backport the xen-diag tool. I've tested it with tools 4.9.1-r1 and it works fine; it could help us track down these issues:
https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=df36d82e3fc91bee2ff1681fd438c815fa324b6a
https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=04e0457e1e786a3a33d1854275fd6cd7ba6306f7

In general, since my domUs went from 4.1 to <4.12, I have occasionally observed hangs. Now on 4.14 it seems better, but some other issues have arisen :)
Comment 9 Tomáš Mózes 2018-01-23 13:21:58 UTC
Please reopen if the workaround does not work.