Bug 434976 - sys-kernel/gentoo-sources-3.4.9 - arch/x86/xen/enlighten.c:860 xen_apic_write+0x15/0x17() : warn_slowpath_common+0x80/0x98
Summary: sys-kernel/gentoo-sources-3.4.9 - arch/x86/xen/enlighten.c:860 xen_apic_write...
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL: https://patchwork.kernel.org/patch/16...
Reported: 2012-09-14 07:19 UTC by Konstantin Agouros
Modified: 2013-02-09 12:15 UTC


Attachments
Kernel-Config (.config,80.86 KB, text/plain)
2012-09-14 07:19 UTC, Konstantin Agouros

Description Konstantin Agouros 2012-09-14 07:19:27 UTC
Created attachment 323732
Kernel-Config

During boot of the Dom0 with 3.4.9, I find the following warning 6 times (I guess once for each CPU core):

[    2.358574] ------------[ cut here ]------------
[    2.358613] WARNING: at arch/x86/xen/enlighten.c:860 xen_apic_write+0x15/0x17()
[    2.358642] Hardware name: To Be Filled By O.E.M.
[    2.358667] Modules linked in:
[    2.358713] Pid: 0, comm: swapper/5 Tainted: G        W    3.4.9-gentoo-64bit #2
[    2.358743] Call Trace:
[    2.358768]  <IRQ>  [<ffffffff8104071e>] warn_slowpath_common+0x80/0x98
[    2.358818]  [<ffffffff8104074b>] warn_slowpath_null+0x15/0x17
[    2.358845]  [<ffffffff81003411>] xen_apic_write+0x15/0x17
[    2.358873]  [<ffffffff8101fd6e>] perf_events_lapic_init+0x2e/0x30
[    2.358900]  [<ffffffff8101ff39>] x86_pmu_enable+0x1c9/0x243
[    2.358927]  [<ffffffff810b3851>] perf_pmu_enable+0x21/0x23
[    2.358953]  [<ffffffff8101e9c9>] x86_pmu_commit_txn+0x84/0x9a
[    2.358980]  [<ffffffff81032725>] ? pvclock_clocksource_read+0x48/0xb8
[    2.359007]  [<ffffffff81032725>] ? pvclock_clocksource_read+0x48/0xb8
[    2.359034]  [<ffffffff81032725>] ? pvclock_clocksource_read+0x48/0xb8
[    2.359061]  [<ffffffff810b47c8>] ? event_sched_in+0x7c/0x10e
[    2.359088]  [<ffffffff810b48e2>] group_sched_in+0x88/0x127
[    2.359115]  [<ffffffff810b4e4c>] __perf_event_enable+0xcf/0x123
[    2.359141]  [<ffffffff810b217d>] remote_function+0x3c/0x43
[    2.359169]  [<ffffffff81366209>] ? _raw_spin_lock_irq+0xb/0x24
[    2.360503]  [<ffffffff81081db7>] generic_smp_call_function_single_interrupt+0xc7/0xea
[    2.360533]  [<ffffffff8100ee64>] xen_call_function_single_interrupt+0xe/0x22
[    2.360560]  [<ffffffff81095b4e>] handle_irq_event_percpu+0x5a/0x196
[    2.360587]  [<ffffffff81098299>] handle_percpu_irq+0x39/0x4d
[    2.360614]  [<ffffffff8124ebd0>] __xen_evtchn_do_upcall+0x147/0x1e3
[    2.360641]  [<ffffffff8125078f>] xen_evtchn_do_upcall+0x2a/0x3c
[    2.360668]  [<ffffffff8136810e>] xen_do_hypervisor_callback+0x1e/0x30
[    2.360694]  <EOI>  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[    2.360743]  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[    2.360770]  [<ffffffff810081d0>] ? xen_safe_halt+0x10/0x18
[    2.360796]  [<ffffffff81018f58>] ? default_idle+0xb1/0x14c
[    2.360823]  [<ffffffff81019856>] ? cpu_idle+0xb3/0xd2
[    2.360850]  [<ffffffff81008819>] ? xen_irq_enable_direct_reloc+0x4/0x4
[    2.360877]  [<ffffffff81357857>] ? cpu_bringup_and_idle+0xe/0x10
[    2.360904] ---[ end trace 41ef0ee79c2c0f37 ]---
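
To check the guess that the warning fires once per CPU core, the traces can be counted in a saved copy of the kernel log. This is only a minimal sketch that assumes the log format shown above; the helper names and the /tmp/dmesg.log path are hypothetical.

```shell
# Sketch only: helper names are made up for illustration.
# Both helpers expect a file holding saved kernel log output,
# e.g. created with: dmesg > /tmp/dmesg.log

# Print how many xen_apic_write warnings the log contains.
count_xen_apic_warnings() {
    grep -c 'WARNING: at arch/x86/xen/enlighten\.c:[0-9]* xen_apic_write' "$1"
}

# Print the CPU numbers (taken from "comm: swapper/<cpu>") that hit the warning.
xen_apic_warning_cpus() {
    awk '
        /WARNING: at arch\/x86\/xen\/enlighten\.c:[0-9]+ xen_apic_write/ { pending = 1; next }
        pending && match($0, /comm: swapper\/[0-9]+/) {
            # skip the 14-character "comm: swapper/" prefix
            print substr($0, RSTART + 14, RLENGTH - 14)
            pending = 0
        }
    ' "$1" | sort -n | uniq
}
```

If the count equals the number of lines printed by xen_apic_warning_cpus, each core reported the warning exactly once.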
Comment 1 Konstantin Agouros 2012-09-14 07:20:16 UTC
I do not know if this is related, but the box crashes at least once a week.
It seems that if the free memory (without subtracting buffers/cached) reaches 0, something goes horribly wrong.
Comment 2 Tom Wijsman (TomWij) (RETIRED) gentoo-dev 2013-01-20 17:47:55 UTC
According to bug #435546 this might have been fixed already.

Can you try more recent kernels like stable gentoo-sources 3.6.11 and development git-sources-3.8_rc3?
Comment 3 Konstantin Agouros 2013-01-20 17:51:02 UTC
3.6.11 just bombed on me again today.

I am currently thinking the problem is xen and not really the kernel.
Comment 4 Tom Wijsman (TomWij) (RETIRED) gentoo-dev 2013-01-20 18:03:25 UTC
https://patchwork.kernel.org/patch/1636911/ landed in linux 3.7-rc3 and happens at the same place in the code arch/x86/xen/enlighten.c:860 xen_apic_write+0x15/0x17()

Please try a more recent unstable 3.7 or development 3.8 kernel, take your pick:

Unstable
--------

    echo "sys-kernel/gentoo-sources" >> /etc/portage/package.accept_keywords
    emerge -uDN gentoo-sources
    eselect kernel set linux-3.7.2-gentoo

Development
-----------

    echo "sys-kernel/git-sources" >> /etc/portage/package.accept_keywords
    emerge git-sources
    eselect kernel set linux-3.8-rc3

Then follow the kernel upgrade guide like usual.
Comment 5 Konstantin Agouros 2013-01-20 18:07:57 UTC
Ah, there is a bit of information I forgot:

What I said about free memory reaching 0 is not correct. The box crashes shortly after the amount of swapped-out memory passes 512 MB.
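
The amount of swapped-out memory can be watched through the SwapTotal and SwapFree fields of /proc/meminfo. The following is a rough sketch, not something taken from this report; swap_used_mb is a hypothetical helper name, and the optional file argument exists only so the function can be tried on a saved copy.

```shell
# Sketch only: swap_used_mb is a hypothetical helper.
# Prints the swap currently in use, in MB (rounded down).
# $1: optional meminfo-format file, defaults to /proc/meminfo
swap_used_mb() {
    awk '
        $1 == "SwapTotal:" { total = $2 }   # value in kB
        $1 == "SwapFree:"  { free = $2 }    # value in kB
        END { printf "%d\n", (total - free) / 1024 }
    ' "${1:-/proc/meminfo}"
}

# Example: log the value once a minute while trying to reproduce the crash
#   while sleep 60; do echo "$(date) swap used: $(swap_used_mb) MB"; done
```

Logging this next to the serial console output would show how close to the 512 MB mark the crash actually happens.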
Comment 6 Konstantin Agouros 2013-01-20 18:10:19 UTC
Also for 3.6.11 the messages look a bit different. Now I get:

(XEN) physdev.c:155: dom0: wrong map_pirq type 3
(XEN) Xen WARN at msi.c:659
(XEN) ----[ Xen-4.1.1  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82c48016d95c>] pci_enable_msi+0x6fc/0x910
(XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
(XEN) rax: 00000000ffffffff   rbx: 0000000000002003   rcx: 00000000000fe3dc
(XEN) rdx: 0000000000000029   rsi: ffff82c4802380d6   rdi: ffff83021ec34c9c
(XEN) rbp: ffff83012846d0e0   rsp: ffff82c480297d28   r8:  0000000000000001
(XEN) r9:  0000000000000000   r10: 0000000000000008   r11: 0000000000000000
(XEN) r12: ffff82c480297e20   r13: 000000000000000a   r14: ffff83012846ddb0
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 0000000215556000   cr2: 00007f633e50e000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen stack trace from rsp=ffff82c480297d28:
(XEN)    00000000ffffffff 0000000000000001 0000000000000149 00000000fe3dc000
(XEN)    0000000000000000 0000000000000001 0000006000000004 00000000000fe3dc
(XEN)    0000006200000111 00000000fe3dc000 ffff830200000003 0000000000000111
(XEN)    0000000000000111 00000000000fe3dc ffff83012846d178 000982c4801828ae
(XEN)    0000000000000cfc 0000000000000111 000000000000001e ffff830219172000
(XEN)    00000000ffffffed 0000000000000111 000000000000001e ffff82c480170955
Comment 7 Tom Wijsman (TomWij) (RETIRED) gentoo-dev 2013-01-20 18:16:50 UTC
I found no commits related to that message; it still makes me wonder whether that other fix fixes this as well...
Comment 8 Konstantin Agouros 2013-01-20 18:23:13 UTC
I have a serial console attached to the Xen console, so when the crash happens, what I see is a general protection fault from Xen. Can this really be caused by a Dom0 kernel?
Comment 9 Tom Wijsman (TomWij) (RETIRED) gentoo-dev 2013-01-20 19:04:40 UTC
Your error message contains

> dom0: wrong map_pirq type 3

so I suppose this is the hypervisor reporting that there is something going wrong with the Dom0 kernel.
Comment 10 Konstantin Agouros 2013-01-20 19:16:17 UTC
Well any hint on _what_ is going wrong would be great.

Also, gentoo-sources 3.5.7 was crashing much faster than 3.4.9 or 3.6.11.
Comment 11 Tom Wijsman (TomWij) (RETIRED) gentoo-dev 2013-01-20 19:51:19 UTC
(In reply to comment #10)
> Well any hint on _what_ is going wrong would be great.

(Comment #4)
> https://patchwork.kernel.org/patch/1636911/ landed in linux 3.7-rc3 and
> happens at the same place in the code arch/x86/xen/enlighten.c:860
> xen_apic_write+0x15/0x17()

Well, you haven't tried a kernel that incorporates this fix yet, unless the fix has been backported (check whether the changes from that patch were applied). As long as that bug is still around, we can't assume that you are experiencing an independent new bug with Xen...

The new error you posted hasn't been patched as far as I can see, unless the patch from above covers it as well, so you might want to report it upstream at https://bugzilla.kernel.org/ so they can take a look at it. But that assumes you have tried the development kernel; otherwise there is a chance they will ask you to do that.

Can you please leave a link to the upstream bug here if you do that?

Good luck!
Comment 12 Konstantin Agouros 2013-01-21 22:45:24 UTC
Would 3.7.3-gentoo do as well for the test?
Comment 13 Tom Wijsman (TomWij) (RETIRED) gentoo-dev 2013-01-22 02:16:38 UTC
(In reply to comment #12)
> Would 3.7.3-gentoo do as well for the test?

Yes, that should suffice as well; any 3.7 version would include that patch.
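
A quick way to confirm that the running kernel is at least 3.7 is to compare the output of `uname -r` against that version. This is a minimal sketch under a few assumptions: kernel_at_least is a hypothetical helper, it relies on `sort -V` from GNU coreutils, and it drops suffixes such as "-gentoo" or "-rc3", so a 3.7 release candidate is treated like the final 3.7 release.

```shell
# Sketch only: kernel_at_least is a hypothetical helper.
# $1: kernel release string, e.g. the output of `uname -r`
# $2: minimum version, e.g. 3.7
# Succeeds (exit status 0) when $1 is the same as or newer than $2.
kernel_at_least() {
    kal_version=${1%%-*}    # drop suffixes such as "-gentoo" or "-rc3"
    # With version sort, the minimum comes out first only if $1 is not older.
    [ "$(printf '%s\n%s\n' "$2" "$kal_version" | sort -V | head -n 1)" = "$2" ]
}

# Example:
#   kernel_at_least "$(uname -r)" 3.7 && echo "kernel is 3.7 or newer"
```

The check says nothing about backports; an older kernel with the patch applied by hand would still be reported as too old.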
Comment 14 Konstantin Agouros 2013-01-23 22:18:00 UTC
gentoo-sources-3.7.3 standing by for next reboot
Comment 15 Konstantin Agouros 2013-02-05 12:26:14 UTC
OK 

I booted 3.7.3 with Xen 4.2.

Swapped out 1 GB, no panic.
It seems this solves the issue.

xl dmesg does not show any errors from the boot, so I hope all is well, case closed.
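
For anyone repeating this check, the hypervisor log can be scanned for the two messages quoted earlier in this report. This is only a sketch: xen_log_problems is a hypothetical helper, the /tmp/xl-dmesg.log path is an example, and the patterns cover just the "Xen WARN at" and "wrong map_pirq type" lines seen here, not every possible Xen error.

```shell
# Sketch only: xen_log_problems is a hypothetical helper.
# $1: file with saved hypervisor log output,
#     e.g. created with: xl dmesg > /tmp/xl-dmesg.log
# Prints the matching lines; exit status is non-zero when nothing matches.
xen_log_problems() {
    grep -E '^\(XEN\) (Xen WARN at |.*wrong map_pirq type)' "$1"
}

# Example:
#   xen_log_problems /tmp/xl-dmesg.log || echo "no known warnings found"
```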