| Summary: | Xen DomU hardened kernel 3.1.8 and later crashes on boot | | |
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | Roman Avdanin <ravd> |
| Component: | Hardened | Assignee: | The Gentoo Linux Hardened Team <hardened> |
| Status: | RESOLVED TEST-REQUEST | | |
| Severity: | normal | CC: | lorand.kelemen, pageexec, ravd, xen |
| Priority: | Normal | | |
| Version: | unspecified | | |
| Hardware: | AMD64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Package list: | | Runtime testing required: | --- |

Description
Roman Avdanin
2012-02-07 04:31:18 UTC
I can confirm the PAX_MEMORY_STACKLEAK crash. Also, to save some time for those who find this bug: the kernel also panics when booting in PV mode with CONFIG_TCG_TPM set. (Our kernels mimic production settings except for the Xen / physical drivers; that's why TPM was set. It's not needed.)

Tested on the following kernel versions:
hardened-sources-3.2.2-r1
hardened-sources-3.2.14-r1
hardened-sources-3.3.1-r1

On these hypervisors:
XenServer 6.0 build 50762p (uses Xen hypervisor 4.1.1) [1]
XenServer 5.6.0 build 31188p (uses Xen hypervisor 3.4.2-5.6.0.597.20014) [1]

[1] http://support.citrix.com/article/CTX122443

Console output:

Linux version 3.2.14-hardened-r1 (root@vshop-humc-1) (gcc version 4.5.3 (Gentoo Hardened 4.5.3-r2 p1.1, pie-0.4.7) ) #13 SMP Fri Apr 13 13:27:41 CEST 2012
Command line: root=/dev/xvda3
Released 0 pages of unused memory
Set 0 page(s) to 1-1 mapping
BIOS-provided physical RAM map:
Xen: 0000000000000000 - 00000000000a0000 (usable)
Xen: 00000000000a0000 - 0000000000100000 (reserved)
Xen: 0000000000100000 - 0000000060000000 (usable)
NX (Execute Disable) protection: active
DMI not present or invalid.
last_pfn = 0x60000 max_arch_pfn = 0x400000000
init_memory_mapping: 0000000000000000-0000000060000000
Zone PFN ranges:
DMA 0x00000010 -> 0x00001000
DMA32 0x00001000 -> 0x00100000
Normal empty
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
0: 0x00000010 -> 0x000000a0
0: 0x00000100 -> 0x00060000
SMP: Allowing 1 CPUs, 0 hotplug CPUs
No local APIC present
APIC: disable apic facility
APIC: switched to apic NOOP
Allocating PCI resources starting at 60000000 (gap: 60000000:a0000000)
Booting paravirtualized kernel on Xen
Xen version: 4.1.1 (preserve-AD)
setup_percpu: NR_CPUS:8 nr_cpumask_bits:8 nr_cpu_ids:1 nr_node_ids:1
PERCPU: Embedded 21 pages/cpu @ffff88005ffd7000 s62016 r0 d24000 u86016
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 386209
Kernel command line: root=/dev/xvda3
PID hash table entries: 4096 (order: 3, 32768 bytes)
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Memory: 1530572k/1572864k available (3408k kernel code, 448k absent, 41844k reserved, 1238k data, 436k init)
SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Hierarchical RCU implementation.
NR_IRQS:4352 nr_irqs:24 16
Console: colour dummy device 80x25
console [tty0] enabled
console [hvc0] enabled
installing Xen timer for CPU 0
Detected 2327.570 MHz processor.
Calibrating delay loop (skipped), value calculated using timer frequency.. 4655.14 BogoMIPS (lpj=23275700)
pid_max: default: 32768 minimum: 501
Mount-cache hash table entries: 256
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
SMP alternatives: switching to UP code
Freeing SMP alternatives: 24k freed
cpu 0 spinlock event irq 17
Performance Events: unsupported p6 CPU model 15 no PMU driver, software events only.
Brought up 1 CPUs
Grant table initialized
NET: Registered protocol family 16
bio: create slab <bio-0> at 0
xen/balloon: Initialising balloon driver.
xen-balloon: Initialising balloon driver.
Switching to clocksource xen
NET: Registered protocol family 2
IP route cache hash table entries: 65536 (order: 7, 524288 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
UDP hash table entries: 1024 (order: 3, 32768 bytes)
UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes)
NET: Registered protocol family 1
platform rtc_cmos: registered platform RTC device (no PNP device found)
microcode: CPU0 sig=0x6fb, pf=0x40, revision=0xbc
microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
Intel AES-NI instructions are not detected.
HugeTLB registered 2 MB page size, pre-allocated 0 pages
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
msgmni has been set to 2989
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
io scheduler noop registered
io scheduler deadline registered (default)
io scheduler cfq registered
Event-channel device installed.
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
BUG: unable to handle kernel paging request at ffff88005d827140
IP: [<ffffffff810057d3>] xen_set_pte_at+0x23/0x40
PGD 1365067 PUD ee9067 PMD fd6067 PTE 801000005d827065
Oops: 0003 [#1] SMP
CPU 0
Pid: 1, comm: swapper/0 Not tainted 3.2.14-hardened-r1 #13
RIP: e030:[<ffffffff810057d3>] [<ffffffff810057d3>] xen_set_pte_at+0x23/0x40
RSP: e02b:ffff88005d843d20 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88005d827140 RCX: 80000000fed40463
RDX: 0000000000000000 RSI: 80000000fed40463 RDI: ffff88005d827140
RBP: 80000000fed40463 R08: 00000000000000d0 R09: ffff88005da637c0
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000fed40
R13: 8000000000000573 R14: ffffc9000002d000 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff88005ffd7000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff88005d827140 CR3: 0000000001364000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/0 (pid: 1, threadinfo ffff88005d8443d8, task ffff88005d844000)
Stack:
00003ffffffff000 ffffc90000028000 ffff88005d827140 ffffffff811bda71
00000000000fed45 0000000000000001 0000000000008000 ffff88005da637c0
ffffc9000002d000 ffff88005d826000 ffffc9000002cfff 00003700fed18000
Call Trace:
[<ffffffff811bda71>] ? ioremap_page_range+0x271/0x350
[<ffffffff8102ca5d>] ? __ioremap_caller+0x2bd/0x3a0
[<ffffffff8134e19f>] ? _raw_spin_lock+0xf/0x20
[<ffffffff814ba9f9>] ? init_tis+0xc6/0x62e
[<ffffffff810076ef>] ? xen_restore_fl_direct_reloc+0x4/0x4
[<ffffffff8134e299>] ? _raw_spin_unlock_irqrestore+0x29/0x30
[<ffffffff814ba933>] ? misc_init+0xb7/0xb7
[<ffffffff8149cbab>] ? do_one_initcall+0x75/0x13a
[<ffffffff8149cd05>] ? kernel_init+0x95/0x118
[<ffffffff81350184>] ? kernel_thread_helper+0x4/0x10
[<ffffffff8134ef73>] ? int_ret_from_sys_call+0x7/0x18
[<ffffffff8134e6f6>] ? retint_restore_args+0x5/0x6
[<ffffffff81350180>] ? gs_change+0x13/0x13
Code: 0b 0f 0b 0f 0b 0f 1f 00 48 83 ec 18 48 89 ce 48 89 5c 24 08 48 89 6c 24 10 48 89 d3 48 89 cd 48 89 d7 e8 c1 fe ff ff 84 c0 75 03 <48> 89 2b 48 8b 5c 24 08 48 8b 6c 24 10 48 83 c4 18 c3 66 66 2e
RIP [<ffffffff810057d3>] xen_set_pte_at+0x23/0x40
RSP <ffff88005d843d20>
CR2: ffff88005d827140
---[ end trace 936528fd05cf662e ]---
Kernel panic - not syncing: Attempted to kill init!
Lost connection to the server.

(In reply to comment #0)
> [<ffffffff819db746>] xen_load_gdt_boot+0x45

what happens here is that this function has a local variable-length-array variable that at runtime causes a call to alloca and the PaX specific check function. however it's apparently so early during boot that the per-cpu variables are not available, hence the page fault. i've just fixed this by allocating a fixed sized array (there's no point in saving a few bytes of stack space over this) so the next patches will fix this crash but i'm sure xen has other surprises ;).

(In reply to comment #1)
> BUG: unable to handle kernel paging request at ffff88005d827140
> IP: [<ffffffff810057d3>] xen_set_pte_at+0x23/0x40
> PGD 1365067 PUD ee9067 PMD fd6067 PTE 801000005d827065
> Oops: 0003 [#1] SMP
> CPU 0
> Pid: 1, comm: swapper/0 Not tainted 3.2.14-hardened-r1 #13
> RIP: e030:[<ffffffff810057d3>] [<ffffffff810057d3>] xen_set_pte_at+0x23/0x40

this looks like an attempt to update a read-only (even in the guest) page table. it would be nice to know if you can still reproduce it with our latest 3.2 or 3.7 patches. also open a new bug for this please ;).
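
For readers decoding the trace themselves: the "read-only page table" reading can be cross-checked from two values in the oops, the error code (`Oops: 0003`) and the printed PTE (`801000005d827065`). The following is a minimal sketch in C, not kernel code. The bit positions are the architectural x86 ones; the macro and function names are made up for this illustration.

```c
#include <stdint.h>

/* x86 page-fault error code bits (architectural). */
#define PF_PROT  (1u << 0)  /* 1 = protection violation on a present page */
#define PF_WRITE (1u << 1)  /* 1 = the faulting access was a write */
#define PF_USER  (1u << 2)  /* 1 = the access came from user mode */

/* x86-64 page-table entry flag bits (architectural). */
#define PTE_PRESENT (1ull << 0)
#define PTE_RW      (1ull << 1)   /* 1 = writable */
#define PTE_NX      (1ull << 63)  /* 1 = no-execute */

/* Hypothetical helper: returns 1 when the error code and the PTE together
 * describe a write to a page that is mapped but not writable. */
int is_write_to_readonly_page(unsigned int err, uint64_t pte)
{
    return (err & PF_PROT) && (err & PF_WRITE) &&
           (pte & PTE_PRESENT) && !(pte & PTE_RW);
}

/* Hypothetical helper: returns 1 when the mapping is marked no-execute. */
int pte_is_no_execute(uint64_t pte)
{
    return (pte & PTE_NX) != 0;
}
```

With the values from the oops, `is_write_to_readonly_page(0x0003, 0x801000005d827065ull)` returns 1: the error code has the protection and write bits set with the user bit clear (a kernel-mode write), and the low byte of the PTE, 0x65, has the present bit set and the R/W bit clear.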
I will do these, just give me some time to catch up with work-work.

Sorry for the delay in reporting back. I no longer have access to the mentioned environments, so I cannot move this issue further. Hopefully it was already sorted out.

Sounds like this might have been resolved by the PaX folks.
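
As an illustration of the fix described in the reply to comment #0 (replacing a variable-length array with a fixed-size one), here is a minimal user-space sketch in C. It is not the actual PaX or Xen patch: the function names, the constants, and the dummy frame values are all hypothetical. The upper bound of 16 pages is chosen on the assumption that the table is a GDT, whose architectural maximum size is 64 KiB.

```c
#include <stddef.h>

#define PAGE_SIZE 4096u  /* assumed page size for this sketch */
#define MAX_PAGES 16u    /* hypothetical bound: 64 KiB / 4 KiB */

/* Number of pages needed to hold table_bytes bytes, rounded up. */
static size_t pages_for(size_t table_bytes)
{
    return (table_bytes + PAGE_SIZE - 1) / PAGE_SIZE;
}

/* Before: the array length is only known at run time, so the compiler
 * emits a dynamic stack allocation (the alloca-style path that the
 * reply says triggers the extra check). */
unsigned long sum_frames_vla(size_t table_bytes)
{
    size_t pages = pages_for(table_bytes);
    unsigned long sum = 0;

    if (pages == 0)
        return 0;  /* a zero-length VLA is undefined behaviour */

    unsigned long frames[pages];  /* variable-length array */
    for (size_t i = 0; i < pages; i++)
        frames[i] = i + 1;        /* dummy frame numbers */
    for (size_t i = 0; i < pages; i++)
        sum += frames[i];
    return sum;
}

/* After: a fixed-size array sized for the worst case, so the stack
 * frame has a constant size and no dynamic allocation is needed. */
unsigned long sum_frames_fixed(size_t table_bytes)
{
    size_t pages = pages_for(table_bytes);
    unsigned long frames[MAX_PAGES];
    unsigned long sum = 0;

    if (pages > MAX_PAGES)
        return 0;  /* reject inputs beyond the fixed bound */

    for (size_t i = 0; i < pages; i++)
        frames[i] = i + 1;        /* dummy frame numbers */
    for (size_t i = 0; i < pages; i++)
        sum += frames[i];
    return sum;
}
```

Both versions return the same result for any table of up to 16 pages. The fixed-size version costs a little extra stack (128 bytes here on a 64-bit build), which matches the remark in the reply that there is no point in saving a few bytes of stack space.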