On an amd64 server, kernel 3.14.13 and xen 4.3.2, the domU sometimes freezes (for example when doing an emerge with distfiles on glusterfs). When shutting down the domU, it remains in "--ps-d" state. It doesn't happen on all servers running 3.14.13.

Already reported on xen-devel:
http://www.gossamer-threads.com/lists/xen/devel/334071
http://lists.xen.org/archives/html/xen-devel/2014-06/msg01317.html

And fixed in:
http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/drivers/net/xen-netback/netback.c?id=59ae9fc67007da8b5aea7b0a31c3607745cfbfee

I tried to apply the patch on 3.14.13, it works so far.

Jul 23 09:05:49 host kernel: [ 1273.627267] ------------[ cut here ]------------
Jul 23 09:05:49 host kernel: [ 1273.627971] kernel BUG at /usr/src/linux-3.14.13-gentoo/drivers/net/xen-netback/netback.c:540!
Jul 23 09:05:49 host kernel: [ 1273.629274] invalid opcode: 0000 [#1] SMP
Jul 23 09:05:49 host kernel: [ 1273.629919] Modules linked in: xt_physdev xt_LOG xt_limit xt_multiport xt_conntrack tun x86_pkg_temp_thermal [last unloaded: microcode]
Jul 23 09:05:49 host kernel: [ 1273.631892] CPU: 4 PID: 3631 Comm: vif3.0 Not tainted 3.14.13-gentoo #1
Jul 23 09:05:49 host kernel: [ 1273.632888] Hardware name: /DH77EB, BIOS EBH7710H.86A.0071.2012.0426.1942 04/26/2012
Jul 23 09:05:49 host kernel: [ 1273.634280] task: ffff8800d35308a0 ti: ffff8800d1960000 task.ti: ffff8800d1960000
Jul 23 09:05:49 host kernel: [ 1273.635412] RIP: e030:[<ffffffff8170c783>] [<ffffffff8170c783>] xenvif_rx_action+0x853/0x860
Jul 23 09:05:49 host kernel: [ 1273.636718] RSP: e02b:ffff8800d1961d78 EFLAGS: 00010297
Jul 23 09:05:49 host kernel: [ 1273.637517] RAX: 0000000000000001 RBX: ffff8800c4a40780 RCX: 3fffffffffffff00
Jul 23 09:05:49 host kernel: [ 1273.638592] RDX: 0000000000000013 RSI: ffff8800d19456e8 RDI: 0000000000000000
Jul 23 09:05:49 host kernel: [ 1273.639667] RBP: ffff8800d1961e48 R08: ffff8800d1961dc4 R09: 0000000000000001
Jul 23 09:05:49 host kernel: [ 1273.640379] R10: ffff8800c4a40780 R11: 0000000000000000 R12: ffff8800d1961dc4
Jul 23 09:05:49 host kernel: [ 1273.640866] R13: 0000000000000003 R14: ffff8800d19456e8 R15: ffff8800c4a40780
Jul 23 09:05:49 host kernel: [ 1273.641355] FS: 00007fb946c3a700(0000) GS:ffff880121700000(0000) knlGS:0000000000000000
Jul 23 09:05:49 host kernel: [ 1273.641907] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 23 09:05:49 host kernel: [ 1273.642299] CR2: 00007f5db8c34000 CR3: 00000000d067b000 CR4: 0000000000042660
Jul 23 09:05:49 host kernel: [ 1273.642785] Stack:
Jul 23 09:05:49 host kernel: [ 1273.642921] ffff8800d1961dc4 ffff8800d3530e08 ffff8800d1961db8 ffff8800c4a472a0
Jul 23 09:05:49 host kernel: [ 1273.643468] 000000030001add3 ffff8800c4a40780 0000001200000000 ffff8800d1961dd8
Jul 23 09:05:49 host kernel: [ 1273.643997] ffff8800d2c18d00 0000000000000001 ffff8800d1961dc8 ffff8800d1961dc8
Jul 23 09:05:49 host kernel: [ 1273.644525] Call Trace:
Jul 23 09:05:49 host kernel: [ 1273.644693] [<ffffffff8170e164>] xenvif_kthread+0xa4/0x230
Jul 23 09:05:49 host kernel: [ 1273.645074] [<ffffffff81098da0>] ? finish_wait+0x80/0x80
Jul 23 09:05:49 host kernel: [ 1273.645476] [<ffffffff8170e0c0>] ? xenvif_stop_queue+0x60/0x60
Jul 23 09:05:49 host kernel: [ 1273.645950] [<ffffffff8107c774>] kthread+0xc4/0xe0
Jul 23 09:05:49 host kernel: [ 1273.646296] [<ffffffff81010000>] ? ftrace_raw_event_xen_mc_callback+0x90/0x160
Jul 23 09:05:49 host kernel: [ 1273.646806] [<ffffffff8107c6b0>] ? flush_kthread_worker+0x70/0x70
Jul 23 09:05:49 host kernel: [ 1273.647238] [<ffffffff818eab0c>] ret_from_fork+0x7c/0xb0
Jul 23 09:05:49 host kernel: [ 1273.647615] [<ffffffff8107c6b0>] ? flush_kthread_worker+0x70/0x70
Jul 23 09:05:49 host kernel: [ 1273.648045] Code: 42 0c 00 00 00 00 e9 64 fe ff ff c6 85 60 ff ff ff 00 e9 78 f9 ff ff 83 c8 03 e9 f2 fa ff ff 31 db b8 04 00 00 00 e9 cb fa ff ff <0f> 0b 0f 0b 0f 0b 0f 1f 80 00 00 00 00 48 8b 97 18 6b 00 00 55
Jul 23 09:05:49 host kernel: [ 1273.649791] RIP [<ffffffff8170c783>] xenvif_rx_action+0x853/0x860
Jul 23 09:05:49 host kernel: [ 1273.650230] RSP <ffff8800d1961d78>
Jul 23 09:05:49 host kernel: [ 1273.938192] ---[ end trace c4abd69fee6a7f72 ]---

Portage 2.2.8-r1 (default/linux/amd64/13.0, gcc-4.7.3, glibc-2.17, 3.14.13-gentoo x86_64)
=================================================================
System uname: Linux-3.14.13-gentoo-x86_64-Intel-R-_Core-TM-_i7-3770_CPU_@_3.40GHz-with-gentoo-2.2
KiB Mem: 3558520 total, 2583072 free
KiB Swap: 0 total, 0 free
Timestamp of tree: Wed, 23 Jul 2014 01:45:01 +0000
ld GNU ld (GNU Binutils) 2.23.2
app-shells/bash: 4.2_p45
dev-lang/python: 2.7.6, 3.3.3
dev-util/cmake: 2.8.12.2
dev-util/pkgconfig: 0.28-r1
sys-apps/baselayout: 2.2
sys-apps/openrc: 0.12.4
sys-apps/sandbox: 2.6-r1
sys-devel/autoconf: 2.69
sys-devel/automake: 1.13.4
sys-devel/binutils: 2.23.2
sys-devel/gcc: 4.7.3-r1
sys-devel/gcc-config: 1.7.3
sys-devel/libtool: 2.4.2-r1
sys-devel/make: 3.82-r4
sys-kernel/linux-headers: 3.13 (virtual/os-headers)
sys-libs/glibc: 2.17
Repositories: gentoo hydra
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-mtune=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt /var/bind"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-mtune=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://tux.rainside.sk/gentoo/ http://gentoo.wheel.sk/"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/lib/layman/hydra"
SYNC="rsync://rsync.sk.gentoo.org/gentoo-portage"
USE="acl amd64 berkdb bzip2 cli cracklib crypt cxx dri fortran gdbm iconv mmx modules multilib ncurses nls nptl openmp pam pcre readline session sse sse2 ssl tcpd unicode xattr zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="pc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-5" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_3" RUBY_TARGETS="ruby19 ruby20" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga nouveau nv r128 radeon savage sis tdfx trident vesa via vmware dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON

app-emulation/xen-4.3.2-r4 was built with the following:
USE="-custom-cflags -debug -efi -flask -xsm" ABI_X86="64"
CFLAGS=""

app-emulation/xen-tools-4.3.2-r5 was built with the following:
USE="pam screen -api -custom-cflags -debug -doc -flask -hvm (-ocaml) -pygrub -python -qemu -static-libs -system-seabios -xend" ABI_X86="64" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7"
CFLAGS=""
CXXFLAGS="-mtune=native -O2 -pipe -fno-strict-overflow"
LDFLAGS=""
Thanks for this report. I'm CCing the @kernel team.
I found that this fix is already included in sys-kernel/gentoo-sources-3.16.1, but not in sys-kernel/gentoo-sources-3.14.17 or sys-kernel/gentoo-sources-3.15.10. I think it would be more appropriate for upstream to backport it, or for it to be reported to the longterm kernel maintainer (at least for the 3.14.x series), greg-kh?
I've applied the patch by Zoltan Kiss on a system with 3.14.13 and it has been working fine for 60 days so far. I'll try to mail greg-kh to ask whether it would be possible to incorporate the patch into 3.14.x.
It won't make it into 3.14, but it's in later kernels...