Bug 564276 - <sys-kernel/gentoo-sources-4.4: KVM live migration cause CPU steal time stuck at 100%
Status: RESOLVED OBSOLETE
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL: https://bugs.launchpad.net/qemu/+bug/...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-27 17:23 UTC by Zentoo
Modified: 2019-03-06 10:21 UTC

See Also:
Package list:
Runtime testing required: ---


Description Zentoo 2015-10-27 17:23:52 UTC
With recent kernels, KVM live migration causes the guest CPU to be stuck at 100% usage, with 100% of the time reported as steal time. The problem occurs almost every time once the guest has some RAM to transfer during migration (for example buffers/cache).

You can see it easily with top/htop (the "st" column) and with vmstat -s (the "stolen cpu ticks" line).
vmstat -S M 1 10 reports it clearly (the "st" column).
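
For scripted monitoring, the same figure can be derived inside the guest from the aggregate "cpu" line of /proc/stat, where steal is the eighth counter. The following is only a minimal sketch of that calculation, not what top or vmstat do internally; the function names are made up for this example.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat (see proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Return a dict of tick counters from a /proc/stat "cpu ..." line."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:]]
    # Older kernels expose fewer columns; treat the missing ones as 0.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def steal_percent(before, after):
    """Percentage of elapsed ticks accounted as steal between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    # guest and guest_nice are already counted in user and nice,
    # so only the first eight counters make up the total.
    counted = FIELDS[:8]
    total = sum(b[k] - a[k] for k in counted)
    if total <= 0:
        return 0.0
    return 100.0 * (b["steal"] - a["steal"]) / total


def sample_steal(interval=1.0, path="/proc/stat"):
    """Read the "cpu" line twice, `interval` seconds apart (Linux only)."""
    def first_line():
        with open(path) as stat:
            return stat.readline()
    before = first_line()
    time.sleep(interval)
    return steal_percent(before, first_line())
```

Calling sample_steal() inside an affected guest after a migration should return a value close to 100, and close to 0 on a healthy guest.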

The problem has been observed here with the 4.0.x kernel series used on the guest.
The problem doesn't happen with the 3.14.x kernel series.

The only solution is to reboot the guest.

The problem has been reported upstream:
https://bugs.launchpad.net/qemu/+bug/1494350

And there is a patch to solve this problem:
http://www.spinics.net/lists/kvm/msg122175.html

Since this problem is really important for virtualization environments/clusters, I think it would be great to ship this patch in portage until a patched upstream version is available.

Reproducible: Always

Steps to Reproduce:
1. Run a KVM guest with a 3.18.x or 4.0.x kernel under qemu-2.[1-4]
2. Use some memory in the guest (cat /dev/vda > /dev/null to fill the buffers)
3. Live-migrate the guest
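
Step 2 only needs the guest to hold a fair amount of buffer/cache memory before the migration starts. A rough equivalent of the cat command is sketched below, assuming ordinary buffered reads (not O_DIRECT) and enough privileges to open the device; read_through is a name invented for this example.

```python
def read_through(path, chunk_size=1024 * 1024):
    """Read `path` sequentially and discard the data; return bytes read.

    Ordinary buffered reads like these leave the data in the kernel's
    buffer/page cache, which is what step 2 relies on.
    """
    total = 0
    with open(path, "rb") as source:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

For example, read_through("/dev/vda") run as root inside the guest has the same effect as the cat command in step 2.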

Actual Results:  
$ vmstat -S M 1 10
=> the CPU "st" column is at 100%


Expected Results:  
$ vmstat -S M 1 10
=> the CPU "st" column is at 0%
Comment 1 SpanKY gentoo-dev 2015-11-06 15:57:58 UTC
this is a bug in the kernel, not qemu
Comment 2 Zentoo 2015-12-17 11:02:22 UTC
I've finally found the exact problem: it is located on the host kernel side (not the guest kernel side).


The fix has been reviewed and accepted:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7cae2bedcbd4680b155999655e49c27b9cf020fa

Related thread:
https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg04306.html

Since the patch is really simple, it would be a really good thing to backport it to the Gentoo kernel patchset.

I don't know if Gentoo kernel policy allows such a backport, but this bug is really a critical one.
It prevents admins from using Gentoo as a KVM host hypervisor, since you need kernel 4.4 to avoid it, and some technologies such as DRBD can't be used with a recent kernel because their user-space tools need a specific kernel version.
Comment 3 Zentoo 2015-12-17 11:05:47 UTC
(In reply to Zentoo from comment #2)
>  I've finally find the exact problem and the problem is located on kernel
> host side (not guest kernel side).

I forgot to mention that the fix should be included in kernel 4.4.
So backporting the patch to older kernels is the perfect solution.
Comment 4 Mike Pagano gentoo-dev 2015-12-17 17:39:19 UTC
Any version preferences for this?
Comment 5 Zentoo 2015-12-17 17:57:47 UTC
The current stable versions, starting from the 3.14 series, would be a must
(so 3.14.56).

I didn't expect such a question :)
Comment 6 Mike Pagano gentoo-dev 2015-12-17 23:20:15 UTC
(In reply to Zentoo from comment #5)

> 
> I didn't expect such question :)


I don't understand this statement.
Comment 7 Zentoo 2016-01-25 12:05:14 UTC
I've checked the latest gentoo-sources:
sys-kernel/gentoo-sources-4.4.0-r1 has the patch.

So it would be great to backport it to older versions, since this patch has no dependencies and can't hurt anything else.
Comment 8 Dan Moulding 2016-02-03 00:00:54 UTC
This bug was never present in Linux 4.4. The patch was accepted and merged well before v4.4 was tagged and released.
Comment 9 Zentoo 2016-02-03 09:48:44 UTC
(In reply to Dan Moulding from comment #8)
> This bug was never present in Linux 4.4. The patch was accepted and merged
> well before v4.4 was tagged and released.

Yes, the patch has been included since the first 4.4 kernel, but the purpose of this bug is to get the patch backported to older kernel versions (3.14/3.18/4.0/4.1 series).

Why do users need it? Because most virtualisation environments can't use the latest 4.4 kernel: they depend on a specific version of some kernel component, DRBD for example. So users have to stick to a specific kernel version.
As this patch has no dependencies and changes nothing except fixing this specific KVM bug, it's easy to apply to any kernel version (I actually use it in production on the 3.14 series).
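
To decide which hosts would need the backport, an admin could compare the host kernel release against 4.4, the first release that carries the fix. This is only a rough sketch with invented helper names: it cannot tell whether an older kernel already carries a backported patch, and it says nothing about which older series are actually affected.

```python
import platform
import re

# First mainline release that contains the fix, per this bug report.
FIXED_IN = (4, 4)


def parse_kernel_version(release):
    """Extract (major, minor) from a release string such as '3.14.56-gentoo'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def predates_mainline_fix(release, fixed_in=FIXED_IN):
    """True if the release is older than the first version with the fix."""
    return parse_kernel_version(release) < fixed_in


def running_kernel_predates_fix():
    """Apply the check to the kernel this script runs on (the KVM host)."""
    return predates_mainline_fix(platform.release())
```

The comparison is done on integer tuples rather than strings, so a release such as 4.10 is correctly treated as newer than 4.4.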
Comment 10 Zentoo 2018-01-10 17:40:19 UTC
There is no kernel in the portage tree with this bug anymore.