Created attachment 326090 [details]
.config for 3.5.4

On an x86 system with an AthlonXP 3000 and 2GB RAM, when kernel 3.4.5 or 3.5.4-r1 is built with CONFIG_TRANSPARENT_HUGEPAGE=y, I see these kernel BUGs shortly after system boot:

kern.alert: BUG: unable to handle kernel NULL pointer dereference at (nil)
kern.alert: IP: [<c101b29e>] do_page_fault+0x23e/0x530
kern.warn: *pde = 00000000
kern.warn: Oops: 0000 [#1]
kern.warn: Modules linked in:
kern.warn:
kern.warn: Pid: 1953, comm: display Not tainted 3.4.5-hardened #1 ASUSTeK Computer INC. A7N8X-E/A7N8X-E
kern.warn: EIP: 0060:[<c101b29e>] EFLAGS: 00210246 CPU: 0
kern.warn: EAX: 00000000 EBX: f3b8df1c ECX: c16f7920 EDX: 00000000
kern.warn: ESI: 00000007 EDI: 00000002 EBP: f3b8df14 ESP: f3b8dea0
kern.warn: DS: 0068 ES: 0068 FS: 0000 GS: 00e0 SS: 0068
kern.warn: CR0: 8005003b CR2: 00000000 CR3: 33b93000 CR4: 000007d0
kern.warn: DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
kern.warn: DR6: ffff0ff0 DR7: 00000400
<0>Process display (pid: 1953, ti=f3afe38c task=f3afe120 task.ti=f3afe38c)
<0>Stack:
kern.warn: f3b8deb0 c1009aa8 ffffb000 00000400 f3afe120 f4325774 00000029 f4325740
kern.warn: a3800004 f3b8dee0 c1093d64 a3800000 00000001 f404c860 794000e3 f4325740
kern.warn: f3b8df0c c10a1b2d 00000000 00000029 f404c860 f6e71380 f7725000 a3800000
<0>Call Trace:
kern.warn: [<c1009aa8>] ? kernel_fpu_begin+0x18/0xf0
kern.warn: [<c1093d64>] ? page_add_new_anon_rmap+0xb4/0xc0
kern.warn: [<c10a1b2d>] ? do_huge_pmd_anonymous_page+0x19d/0x300
kern.warn: [<c101b060>] ? vmalloc_sync_all+0xe0/0xe0
kern.warn: [<c150be7b>] error_code+0x6b/0x70
kern.warn: [<c101b060>] ? vmalloc_sync_all+0xe0/0xe0
kern.warn: [<c150bbc6>] ? restore_all_pax+0x7/0x7
kern.warn: [<c101b060>] ? vmalloc_sync_all+0xe0/0xe0
kern.warn: [<c150bbc6>] ? restore_all_pax+0x7/0x7
<0>Code: fe ff ff 85 d2 0f 84 a8 fe ff ff c1 e8 0c c1 e0 05 03 05 88 cd 77 c1 e8 e1 ee 06 00 8b 55 ac c1 ea 0a 81 e2 fc 0f 00 00 8d 14 10 <8b> 02 a8 01 0f 84 7e fe ff ff a8 04 0f 85 76 fe ff ff 85 ff 0f
<0>EIP: [<c101b29e>] do_page_fault+0x23e/0x530 SS:ESP 0068:f3b8dea0
kern.warn: CR2: 0000000000000000
kern.warn: ---[ end trace 7b9325d7a3f5361f ]---
kern.alert: BUG: unable to handle kernel NULL pointer dereference at 00000f50
kern.alert: IP: [<c101b29e>] do_page_fault+0x23e/0x530
kern.warn: *pde = 00000000
kern.warn: Oops: 0000 [#2]
kern.warn: Modules linked in:
kern.warn:
kern.warn: Pid: 1950, comm: opera Tainted: G D 3.4.5-hardened #1 ASUSTeK Computer INC. A7N8X-E/A7N8X-E
kern.warn: EIP: 0060:[<c101b29e>] EFLAGS: 00210206 CPU: 0
kern.warn: EAX: 00000000 EBX: f3b8ff6c ECX: c177ba10 EDX: 00000f50
kern.warn: ESI: 00000005 EDI: 00000000 EBP: f3b8ff64 ESP: f3b8fef0
kern.warn: DS: 0068 ES: 0068 FS: 0000 GS: 00e0 SS: 0068
kern.warn: CR0: 80050033 CR2: 00000f50 CR3: 33886000 CR4: 000007d0
kern.warn: DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
kern.warn: DR6: ffff0ff0 DR7: 00000400
<0>Process opera (pid: 1950, ti=f55a1d2c task=f55a1ac0 task.ti=f55a1d2c)
<0>Stack:
kern.warn: f3b8ff1c c1003d73 f3b8ff28 c10482e2 f55a1ac0 f56ee774 00000028 f56ee740
kern.warn: 0cbd45a0 f3b8ff1c c150b3dd f55a1d2c c150bbc6 0c1d3c48 0bb13fb0 00000000
kern.warn: befd1d1c 000003fd befd1cf8 04b8121b 0000007b 0000007b 00000000 00000033
<0>Call Trace:
kern.warn: [<c1003d73>] ? do_IRQ+0x43/0xa0
kern.warn: [<c10482e2>] ? ktime_get_ts+0xe2/0x110
kern.warn: [<c150b3dd>] ? schedule+0x1d/0x50
kern.warn: [<c150bbc6>] ? restore_all_pax+0x7/0x7
kern.warn: [<c101b060>] ? vmalloc_sync_all+0xe0/0xe0
kern.warn: [<c150be7b>] error_code+0x6b/0x70
<0>Code: fe ff ff 85 d2 0f 84 a8 fe ff ff c1 e8 0c c1 e0 05 03 05 88 cd 77 c1 e8 e1 ee 06 00 8b 55 ac c1 ea 0a 81 e2 fc 0f 00 00 8d 14 10 <8b> 02 a8 01 0f 84 7e fe ff ff a8 04 0f 85 76 fe ff ff 85 ff 0f
<0>EIP: [<c101b29e>] do_page_fault+0x23e/0x530 SS:ESP 0068:f3b8fef0
kern.warn: CR2: 0000000000000f50
kern.warn: ---[ end trace 7b9325d7a3f53620 ]---
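As a reading aid for the "Oops: 0000" lines in the report: on x86 that number is the page-fault error code, a bit mask. Below is a minimal Python sketch that decodes the low bits. The function and table names are made up for illustration and are not part of any kernel tooling.

```python
# Illustrative helper (hypothetical name), not part of any kernel tool.
# Bit meanings follow the x86 page-fault error code layout.
PF_BITS = [
    # (mask, meaning if set, meaning if clear or None)
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_error_code(code_hex):
    """Turn the hex string after 'Oops:' into a list of descriptions."""
    code = int(code_hex, 16)
    parts = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            parts.append(if_set)
        elif if_clear is not None:
            parts.append(if_clear)
    return parts


if __name__ == "__main__":
    print(decode_oops_error_code("0000"))
```

For "0000" this reads as a kernel-mode read of a not-present page, which is consistent with a NULL pointer dereference happening inside do_page_fault itself.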
BTW, I also have several other systems with a very similar setup (nearly the same Gentoo configuration and packages, and nearly the same kernel config), all with:

CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set

but they don't have this kernel BUG. I suppose this is because their hardware is very different from the system with that kernel BUG:
- Intel Core i7-2600K, 8GB RAM
- Intel Xeon X5680, 16GB RAM
- Intel Xeon E5310, 4GB RAM
- Intel Core2 Duo E4500, 2GB RAM
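When comparing machines like this, it can help to confirm what each running kernel actually uses instead of relying on the .config alone. Kernels built with THP expose the active mode in /sys/kernel/mm/transparent_hugepage/enabled, with the current choice shown in square brackets. A small Python sketch follows; the helper names are invented for this example.

```python
from pathlib import Path

# sysfs file present on kernels built with CONFIG_TRANSPARENT_HUGEPAGE=y
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the active mode, i.e. the word shown in [brackets]."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED):
    """Read the mode from sysfs; None if the file is not available."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        # Kernel built without THP, or sysfs not mounted.
        return None


if __name__ == "__main__":
    print(current_thp_mode())
```

With CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y the expected content is "[always] madvise never", so the sketch would report "always" unless the setting was changed at runtime.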
(In reply to comment #1)
> BTW, I've also several other systems with very similar setup (nearly same
> gentoo configuration and packages, and nearly same kernel config), all with:
> CONFIG_TRANSPARENT_HUGEPAGE=y
> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
> but they didn't have this kernel BUG. I suppose this is because their
> hardware is very different from system with that kernel BUG:
> - Intel Core i7-2600K, 8GB RAM
> - Intel Xeon X5680, 16GB RAM
> - Intel Xeon E5310, 4GB RAM
> - Intel Core2 Duo E4500, 2GB RAM

I don't think this is a hardened issue. Can you compare hardened-sources to vanilla with the closest possible config and let me know?
(In reply to comment #2)
> I don't think this is a harened issue, can you compare hardened-sources to
> vanilla with the closest possible config and let me know.

I was sure of this too… until 10 minutes ago. I've now booted vanilla 3.5.4 with CONFIG_TRANSPARENT_HUGEPAGE=y several times and it doesn't have this issue.

Here are the differences in .config:

--- config-3.5.4-hardened-r1 2012-10-09 16:54:38.786032008 +0300
+++ linux-3.5.4/.config 2012-10-11 19:44:55.901116082 +0300
@@ -1,6 +1,6 @@
 #
 # Automatically generated file; DO NOT EDIT.
-# Linux/i386 3.5.4-hardened-r1 Kernel Configuration
+# Linux/i386 3.5.4 Kernel Configuration
 #
 # CONFIG_64BIT is not set
 CONFIG_X86_32=y
@@ -129,6 +129,7 @@
 CONFIG_RT_GROUP_SCHED=y
 CONFIG_BLK_CGROUP=y
 # CONFIG_DEBUG_BLK_CGROUP is not set
+# CONFIG_CHECKPOINT_RESTORE is not set
 CONFIG_NAMESPACES=y
 # CONFIG_UTS_NS is not set
 # CONFIG_IPC_NS is not set
@@ -212,7 +213,6 @@
 CONFIG_MODULE_FORCE_UNLOAD=y
 # CONFIG_MODVERSIONS is not set
 # CONFIG_MODULE_SRCVERSION_ALL is not set
-CONFIG_STOP_MACHINE=y
 CONFIG_BLOCK=y
 CONFIG_LBDAF=y
 CONFIG_BLK_DEV_BSG=y
@@ -329,7 +329,6 @@
 CONFIG_X86_INVLPG=y
 CONFIG_X86_BSWAP=y
 CONFIG_X86_POPAD_OK=y
-CONFIG_X86_ALIGNMENT_16=y
 CONFIG_X86_INTEL_USERCOPY=y
 CONFIG_X86_USE_PPRO_CHECKSUM=y
 CONFIG_X86_USE_3DNOW=y
@@ -393,7 +392,9 @@
 CONFIG_DEFAULT_MMAP_MIN_ADDR=65536
 CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
 # CONFIG_MEMORY_FAILURE is not set
-# CONFIG_TRANSPARENT_HUGEPAGE is not set
+CONFIG_TRANSPARENT_HUGEPAGE=y
+CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
+# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
 CONFIG_CROSS_MEMORY_ATTACH=y
 CONFIG_NEED_PER_CPU_KM=y
 CONFIG_CLEANCACHE=y
@@ -421,6 +422,7 @@
 CONFIG_PHYSICAL_START=0x1000000
 # CONFIG_RELOCATABLE is not set
 CONFIG_PHYSICAL_ALIGN=0x100000
+CONFIG_COMPAT_VDSO=y
 # CONFIG_CMDLINE_BOOL is not set
 CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y

@@ -507,6 +509,7 @@
 CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE=y
 # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
 CONFIG_HAVE_AOUT=y
+# CONFIG_BINFMT_AOUT is not set
 # CONFIG_BINFMT_MISC is not set
 CONFIG_HAVE_ATOMIC_IOMAP=y
 CONFIG_HAVE_TEXT_POKE_SMP=y
@@ -1290,11 +1293,7 @@
 # CONFIG_NOZOMI is not set
 # CONFIG_N_GSM is not set
 # CONFIG_TRACE_SINK is not set
-
-#
-# KCopy
-#
-CONFIG_KCOPY=y
+# CONFIG_DEVKMEM is not set

 #
 # Serial drivers
@@ -1339,6 +1338,7 @@
 # CONFIG_HANGCHECK_TIMER is not set
 # CONFIG_TCG_TPM is not set
 # CONFIG_TELCLOCK is not set
+CONFIG_DEVPORT=y
 CONFIG_I2C=y
 CONFIG_I2C_BOARDINFO=y
 # CONFIG_I2C_COMPAT is not set
@@ -2278,7 +2278,9 @@
 # Pseudo filesystems
 #
 CONFIG_PROC_FS=y
+# CONFIG_PROC_KCORE is not set
 CONFIG_PROC_SYSCTL=y
+CONFIG_PROC_PAGE_MONITOR=y
 CONFIG_SYSFS=y
 CONFIG_TMPFS=y
 CONFIG_TMPFS_POSIX_ACL=y
@@ -2398,6 +2400,7 @@
 CONFIG_X86_VERBOSE_BOOTUP=y
 CONFIG_EARLY_PRINTK=y
 # CONFIG_EARLY_PRINTK_DBGP is not set
+# CONFIG_DEBUG_SET_MODULE_RONX is not set
 CONFIG_DOUBLEFAULT=y
 # CONFIG_IOMMU_STRESS is not set
 CONFIG_HAVE_MMIOTRACE_SUPPORT=y
...
[large GRSEC-related block removed]
...
@@ -2575,6 +2426,7 @@
 # CONFIG_SECURITY_PATH is not set
 # CONFIG_SECURITY_TOMOYO is not set
 # CONFIG_SECURITY_APPARMOR is not set
+# CONFIG_SECURITY_YAMA is not set
 # CONFIG_IMA is not set
 CONFIG_DEFAULT_SECURITY_DAC=y
 CONFIG_DEFAULT_SECURITY=""
Your diff doesn't make sense because there should be a set of deleted lines for CONFIG_GRKERNSEC_* and another set for CONFIG_PAX_*. These correspond to the defines that are in the hardened sources but not in vanilla. Nonetheless, I'm cc-ing upstream to see what they think.

(In reply to comment #3)
> (In reply to comment #2)
> > I don't think this is a harened issue, can you compare hardened-sources to
> > vanilla with the closest possible config and let me know.
>
> I was sure in this too… 10 minutes ago. Now I've boot several times vanilla
> 3.5.4 with CONFIG_TRANSPARENT_HUGEPAGE=y and it doesn't have this issue.
>
> Here is differences in .config:
> [...]
(In reply to comment #4)
> You diff doesn't make sense because there should be a set of deleted lines
> for CONFIG_GRKERNSEC_* and another set for CONFIG_PAX_*. These correspond
> to the defines that are in the hardened sources but not in vanilla.

I didn't intend that diff to be used for patching .config. I posted it just as an easy-to-read overview of the changes between hardened and vanilla, and removed the large grsec/pax-related block because we all know these defines are absent in vanilla.

If you want the full .config files, I'll just attach them instead. The one already attached is for the hardened kernel with the extra BFS and BFQ patches. I'll add the .config for hardened without BFS/BFQ (which has the same issue) and for vanilla (which doesn't have this issue).
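For anyone who wants the same kind of overview without hand-editing a diff, here is a rough Python sketch that compares two .config files option by option and skips the CONFIG_GRKERNSEC*/CONFIG_PAX* entries, since those exist only in hardened sources. The function names and the choice to report unset options as "n" are assumptions made for this example.

```python
def parse_kconfig(text):
    """Map option name -> value; '# CONFIG_X is not set' becomes 'n'."""
    suffix = " is not set"
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            opts[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            # Drop the leading "# " and the trailing " is not set".
            opts[line[2:-len(suffix)]] = "n"
    return opts


# Options that only exist in hardened sources (assumed prefixes).
IGNORED_PREFIXES = ("CONFIG_GRKERNSEC", "CONFIG_PAX")


def config_diff(a_text, b_text, ignored=IGNORED_PREFIXES):
    """Return {option: (value_in_a, value_in_b)} for differing options.

    An option missing from one file is reported as None on that side.
    """
    a, b = parse_kconfig(a_text), parse_kconfig(b_text)
    diff = {}
    for name in sorted(set(a) | set(b)):
        if name.startswith(ignored):
            continue
        if a.get(name) != b.get(name):
            diff[name] = (a.get(name), b.get(name))
    return diff
```

Feeding it the contents of the hardened and vanilla .config files should list CONFIG_TRANSPARENT_HUGEPAGE, CONFIG_KCOPY and the other non-grsec differences without the large removed block.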
Created attachment 326324 [details] .config for 3.5.4-hardened-r1 .config for 3.5.4-hardened-r1 without BFS/BFQ patches; CONFIG_TRANSPARENT_HUGEPAGE disabled (enabling it will result in this issue)
Created attachment 326326 [details] .config for 3.5.4-vanilla .config for 3.5.4-vanilla; CONFIG_TRANSPARENT_HUGEPAGE enabled, this issue doesn't happen
is this still a problem in 3.7?
(In reply to comment #8)
> is this still a problem in 3.7?

Yes, this is still a problem in 3.8.4-hardened. Here is an example:

2013-03-25_20:37:28.74076 kern.info: devtmpfs: mounted
2013-03-25_20:37:28.74077 kern.info: Freeing unused kernel memory: 412k freed
2013-03-25_20:37:28.74078 daemon.info: systemd-udevd[835]: starting version 197
...
2013-03-25_20:37:28.74096 kern.info: Adding 4194300k swap on /dev/sda1. Priority:-1 extents:1 across:4194300k
2013-03-25_20:37:31.16148 kern.alert: BUG: unable to handle kernel NULL pointer dereference at (nil)
2013-03-25_20:37:31.16163 kern.alert: IP: [<c1024c8d>] __do_page_fault+0x26d/0x560
2013-03-25_20:37:31.16172 kern.warn: *pde = 00000000
2013-03-25_20:37:31.16180 kern.warn: Oops: 0000 [#1] PREEMPT
2013-03-25_20:37:31.16188 kern.warn: Modules linked in: skge forcedeth
2013-03-25_20:37:31.16197 kern.warn: Pid: 1307, comm: mysqld Not tainted 3.8.4-hardened #1 ASUSTeK Computer INC. A7N8X-E/A7N8X-E
2013-03-25_20:37:31.16206 kern.warn: EIP: 0060:[<c1024c8d>] EFLAGS: 00210202 CPU: 0
2013-03-25_20:37:31.16214 kern.warn: EAX: f30dacf4 EBX: 00000002 ECX: 00000000 EDX: 00000000
2013-03-25_20:37:31.16222 kern.warn: ESI: f31d5f94 EDI: c10250b0 EBP: f31d5f84 ESP: f31d5f10
2013-03-25_20:37:31.16230 kern.warn: DS: 0068 ES: 0068 FS: 0000 GS: 00e0 SS: 0068
2013-03-25_20:37:31.16237 kern.warn: CR0: 8005003b CR2: 00000000 CR3: 33a16000 CR4: 000007d0
2013-03-25_20:37:31.16245 kern.warn: DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
2013-03-25_20:37:31.16253 kern.warn: DR6: ffff0ff0 DR7: 00000400
2013-03-25_20:37:31.16261 <0>Process mysqld (pid: 1307, ti=f30dacf4 task=f30daa80 task.ti=f30dacf4)
2013-03-25_20:37:31.16269 <0>Stack:
2013-03-25_20:37:31.16277 kern.warn: 1ff8db10 f30daa80 f2e94c78 f30daa80 f2e94c78 f2e94c40 00000029 a5400000
2013-03-25_20:37:31.16287 kern.warn: 00000007 00000000 1ffa8d04 a53ff000 bf82d748 00000000 1ff8db10 1ffa8d04
2013-03-25_20:37:31.16296 kern.warn: c10250b0 f31d5f5c c10250b8 f30dacf4 c15838a9 1ff8db10 0013b880 00000000
2013-03-25_20:37:31.16305 <0>Call Trace:
2013-03-25_20:37:31.16312 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16323 kern.warn: [<c10250b8>] ? do_page_fault+0x8/0x10
2013-03-25_20:37:31.16331 kern.warn: [<c15838a9>] ? restore_all_pax+0x7/0x7
2013-03-25_20:37:31.16338 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16347 kern.warn: [<c10250b8>] do_page_fault+0x8/0x10
2013-03-25_20:37:31.16355 kern.warn: [<c1583cab>] error_code+0x6b/0x70
2013-03-25_20:37:31.16363 <0>Code: ff 89 d0 c1 e8 0c c1 e0 05 03 05 48 52 83 c1 e8 6a e4 07 00 8b 55 a8 c1 ea 0a 81 e2 fc 0f 00 00 8d 0c 10 a1 d0 bf 76 c1 ff 40 10 <8b> 11 f6 c2 01 0f 84 72 02 00 00 f6 c2 04 0f 85 69 02 00 00 85
2013-03-25_20:37:31.16372 <0>EIP: [<c1024c8d>] __do_page_fault+0x26d/0x560 SS:ESP 0068:f31d5f10
2013-03-25_20:37:31.16380 kern.warn: CR2: 0000000000000000
2013-03-25_20:37:31.16388 kern.warn: ---[ end trace 2f2199060472fa74 ]---
2013-03-25_20:37:31.16397 kern.info: note: mysqld[1307] exited with preempt_count 1
2013-03-25_20:37:31.16405 kern.err: BUG: scheduling while atomic: mysqld/1307/0x10000002
2013-03-25_20:37:31.16413 kern.warn: Modules linked in: skge forcedeth
2013-03-25_20:37:31.16421 kern.warn: Pid: 1307, comm: mysqld Tainted: G D 3.8.4-hardened #1
2013-03-25_20:37:31.16429 kern.warn: Call Trace:
2013-03-25_20:37:31.16437 kern.warn: [<c157dade>] __schedule_bug+0x47/0x53
2013-03-25_20:37:31.16445 kern.warn: [<c15829d9>] __schedule+0x509/0x5b0
2013-03-25_20:37:31.16452 kern.warn: [<c1058744>] ? ktime_get+0x54/0xf0
2013-03-25_20:37:31.16460 kern.warn: [<c101e256>] ? lapic_next_event+0x16/0x20
2013-03-25_20:37:31.16468 kern.warn: [<c105ec52>] ? clockevents_program_event+0xa2/0x150
2013-03-25_20:37:31.16476 kern.warn: [<c1052266>] __cond_resched+0x16/0x20
2013-03-25_20:37:31.16486 kern.warn: [<c1582ae5>] _cond_resched+0x25/0x30
2013-03-25_20:37:31.16495 kern.warn: [<c10a5228>] unmap_single_vma+0x238/0x4a0
2013-03-25_20:37:31.16503 kern.warn: [<c10950f8>] ? pagevec_lru_move_fn+0xb8/0xf0
2013-03-25_20:37:31.16511 kern.warn: [<c10a5c0b>] unmap_vmas+0x3b/0x50
2013-03-25_20:37:31.16519 kern.warn: [<c10aa8bf>] exit_mmap+0x6f/0x110
2013-03-25_20:37:31.16527 kern.warn: [<c10c0811>] ? __khugepaged_exit+0xd1/0x100
2013-03-25_20:37:31.16535 kern.warn: [<c102b68b>] mmput+0x3b/0xd0
2013-03-25_20:37:31.16543 kern.warn: [<c1032c2e>] do_exit+0x1ae/0x800
2013-03-25_20:37:31.16551 kern.warn: [<c157d832>] ? printk+0x38/0x3a
2013-03-25_20:37:31.16559 kern.warn: [<c1031299>] ? kmsg_dump+0xb9/0xd0
2013-03-25_20:37:31.16567 kern.warn: [<c10333dd>] do_group_exit+0x2d/0x80
2013-03-25_20:37:31.16575 kern.warn: [<c10053f0>] oops_end+0x60/0x80
2013-03-25_20:37:31.16582 kern.warn: [<c157d238>] no_context+0x1df/0x1e7
2013-03-25_20:37:31.16590 kern.warn: [<c157d3bf>] __bad_area_nosemaphore+0x17f/0x19d
2013-03-25_20:37:31.16600 kern.warn: [<c1035206>] ? irq_exit+0x46/0x60
2013-03-25_20:37:31.16608 kern.warn: [<c101e999>] ? smp_apic_timer_interrupt+0x49/0x80
2013-03-25_20:37:31.16615 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16623 kern.warn: [<c157d3ef>] bad_area_nosemaphore+0x12/0x14
2013-03-25_20:37:31.16631 kern.warn: [<c1024abf>] __do_page_fault+0x9f/0x560
2013-03-25_20:37:31.16639 kern.warn: [<c10956ad>] ? lru_cache_add_lru+0x1d/0x40
2013-03-25_20:37:31.16647 kern.warn: [<c10ade60>] ? page_add_new_anon_rmap+0x70/0xf0
2013-03-25_20:37:31.16655 kern.warn: [<c10c0407>] ? do_huge_pmd_anonymous_page+0x1b7/0x3d0
2013-03-25_20:37:31.16663 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16684 kern.warn: [<c10250b8>] do_page_fault+0x8/0x10
2013-03-25_20:37:31.16690 kern.warn: [<c1583cab>] error_code+0x6b/0x70
2013-03-25_20:37:31.16696 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16701 kern.warn: [<c1024c8d>] ? __do_page_fault+0x26d/0x560
2013-03-25_20:37:31.16708 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16714 kern.warn: [<c10250b8>] ? do_page_fault+0x8/0x10
2013-03-25_20:37:31.16719 kern.warn: [<c15838a9>] ? restore_all_pax+0x7/0x7
2013-03-25_20:37:31.16724 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16729 kern.warn: [<c10250b8>] do_page_fault+0x8/0x10
2013-03-25_20:37:31.16734 kern.warn: [<c1583cab>] error_code+0x6b/0x70
2013-03-25_20:37:31.16739 kern.err: BUG: scheduling while atomic: mysqld/1307/0x10000002
2013-03-25_20:37:31.16745 kern.warn: Modules linked in: skge forcedeth
2013-03-25_20:37:31.16750 kern.warn: Pid: 1307, comm: mysqld Tainted: G D W 3.8.4-hardened #1
2013-03-25_20:37:31.16756 kern.warn: Call Trace:
2013-03-25_20:37:31.16761 kern.warn: [<c157dade>] __schedule_bug+0x47/0x53
2013-03-25_20:37:31.16766 kern.warn: [<c15829d9>] __schedule+0x509/0x5b0
2013-03-25_20:37:31.16771 kern.warn: [<c10cbdd0>] ? pipe_release+0x90/0xb0
2013-03-25_20:37:31.16776 kern.warn: [<c10dfce0>] ? mntput_no_expire+0x40/0xf0
2013-03-25_20:37:31.16781 kern.warn: [<c10dfda8>] ? mntput+0x18/0x30
2013-03-25_20:37:31.16786 kern.warn: [<c10c464e>] ? __fput+0x13e/0x200
2013-03-25_20:37:31.16792 kern.warn: [<c1052266>] __cond_resched+0x16/0x20
2013-03-25_20:37:31.16799 kern.warn: [<c1582ae5>] _cond_resched+0x25/0x30
2013-03-25_20:37:31.16804 kern.warn: [<c104693e>] task_work_run+0x7e/0xa0
2013-03-25_20:37:31.16809 kern.warn: [<c1032c4f>] do_exit+0x1cf/0x800
2013-03-25_20:37:31.16814 kern.warn: [<c157d832>] ? printk+0x38/0x3a
2013-03-25_20:37:31.16819 kern.warn: [<c1031299>] ? kmsg_dump+0xb9/0xd0
2013-03-25_20:37:31.16825 kern.warn: [<c10333dd>] do_group_exit+0x2d/0x80
2013-03-25_20:37:31.16830 kern.warn: [<c10053f0>] oops_end+0x60/0x80
2013-03-25_20:37:31.16835 kern.warn: [<c157d238>] no_context+0x1df/0x1e7
2013-03-25_20:37:31.16840 kern.warn: [<c157d3bf>] __bad_area_nosemaphore+0x17f/0x19d
2013-03-25_20:37:31.16845 kern.warn: [<c1035206>] ? irq_exit+0x46/0x60
2013-03-25_20:37:31.16850 kern.warn: [<c101e999>] ? smp_apic_timer_interrupt+0x49/0x80
2013-03-25_20:37:31.16855 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16860 kern.warn: [<c157d3ef>] bad_area_nosemaphore+0x12/0x14
2013-03-25_20:37:31.16866 kern.warn: [<c1024abf>] __do_page_fault+0x9f/0x560
2013-03-25_20:37:31.16871 kern.warn: [<c10956ad>] ? lru_cache_add_lru+0x1d/0x40
2013-03-25_20:37:31.16876 kern.warn: [<c10ade60>] ? page_add_new_anon_rmap+0x70/0xf0
2013-03-25_20:37:31.16881 kern.warn: [<c10c0407>] ? do_huge_pmd_anonymous_page+0x1b7/0x3d0
2013-03-25_20:37:31.16886 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16892 kern.warn: [<c10250b8>] do_page_fault+0x8/0x10
2013-03-25_20:37:31.16899 kern.warn: [<c1583cab>] error_code+0x6b/0x70
2013-03-25_20:37:31.16904 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16909 kern.warn: [<c1024c8d>] ? __do_page_fault+0x26d/0x560
2013-03-25_20:37:31.16939 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16945 kern.warn: [<c10250b8>] ? do_page_fault+0x8/0x10
2013-03-25_20:37:31.16950 kern.warn: [<c15838a9>] ? restore_all_pax+0x7/0x7
2013-03-25_20:37:31.16979 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:31.16985 kern.warn: [<c10250b8>] do_page_fault+0x8/0x10
2013-03-25_20:37:31.16990 kern.warn: [<c1583cab>] error_code+0x6b/0x70
2013-03-25_20:37:31.48095 kern.alert: grsec: denied resource overstep by requesting 21 for RLIMIT_NICE against limit 0 for /usr/bin/xinit[xinit:1367] uid/euid:1001/1001 gid/egid:100/100, parent /usr/bin/startx[startx:1259] uid/euid:1001/1001 gid/egid:100/100
2013-03-25_20:37:32.27156 kern.alert: BUG: unable to handle kernel NULL pointer dereference at (nil)
2013-03-25_20:37:32.27171 kern.alert: IP: [<c1024c8d>] __do_page_fault+0x26d/0x560
2013-03-25_20:37:32.27179 kern.warn: *pde = 00000000
2013-03-25_20:37:32.27187 kern.warn: Oops: 0000 [#2] PREEMPT
2013-03-25_20:37:32.27199 kern.warn: Modules linked in: skge forcedeth
2013-03-25_20:37:32.27208 kern.warn: Pid: 1503, comm: mysqld Tainted: G D W 3.8.4-hardened #1 ASUSTeK Computer INC. A7N8X-E/A7N8X-E
2013-03-25_20:37:32.27216 kern.warn: EIP: 0060:[<c1024c8d>] EFLAGS: 00210202 CPU: 0
2013-03-25_20:37:32.27224 kern.warn: EAX: f30d9374 EBX: 00000002 ECX: 00000000 EDX: 00000000
2013-03-25_20:37:32.27232 kern.warn: ESI: f3945ec4 EDI: c10250b0 EBP: f3945eb4 ESP: f3945e40
2013-03-25_20:37:32.27240 kern.warn: DS: 0068 ES: 0068 FS: 0000 GS: 00e0 SS: 0068
2013-03-25_20:37:32.27248 kern.warn: CR0: 8005003b CR2: 00000000 CR3: 340cb000 CR4: 000007d0
2013-03-25_20:37:32.27256 kern.warn: DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
2013-03-25_20:37:32.27264 kern.warn: DR6: ffff0ff0 DR7: 00000400
2013-03-25_20:37:32.27272 <0>Process mysqld (pid: 1503, ti=f30d9374 task=f30d9100 task.ti=f30d9374)
2013-03-25_20:37:32.27280 <0>Stack:
2013-03-25_20:37:32.27287 kern.warn: f39b4f60 00000080 f3945e50 f30d9100 f395b038 f395b000 00000029 a4400000
2013-03-25_20:37:32.27297 kern.warn: 00000007 f30d9374 f3945e9c c10c0407 00000000 f39bfffc f40cba40 00000029
2013-03-25_20:37:32.27305 kern.warn: f6e70720 f7755000 f395b000 a4400000 f40cba44 f39b4f60 f395b000 f3945ec8
2013-03-25_20:37:32.27313 <0>Call Trace:
2013-03-25_20:37:32.27321 kern.warn: [<c10c0407>] ? do_huge_pmd_anonymous_page+0x1b7/0x3d0
2013-03-25_20:37:32.27331 kern.warn: [<c10a6c10>] ? handle_mm_fault+0x190/0x1d0
2013-03-25_20:37:32.27339 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:32.27347 kern.warn: [<c10250b8>] do_page_fault+0x8/0x10
2013-03-25_20:37:32.27355 kern.warn: [<c1583cab>] error_code+0x6b/0x70
2013-03-25_20:37:32.27363 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:32.27371 kern.warn: [<c10250b8>] ? do_page_fault+0x8/0x10
2013-03-25_20:37:32.27378 kern.warn: [<c15838a9>] ? restore_all_pax+0x7/0x7
2013-03-25_20:37:32.27386 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:32.27395 kern.warn: [<c10250b8>] ? do_page_fault+0x8/0x10
2013-03-25_20:37:32.27404 kern.warn: [<c15838a9>] ? restore_all_pax+0x7/0x7
2013-03-25_20:37:32.27412 kern.warn: [<c10250b0>] ? vmalloc_sync_all+0x130/0x130
2013-03-25_20:37:32.27419 kern.warn: [<c10250b8>] ? do_page_fault+0x8/0x10
2013-03-25_20:37:32.27427 kern.warn: [<c15838a9>] ? restore_all_pax+0x7/0x7
2013-03-25_20:37:32.27435 <0>Code: ff 89 d0 c1 e8 0c c1 e0 05 03 05 48 52 83 c1 e8 6a e4 07 00 8b 55 a8 c1 ea 0a 81 e2 fc 0f 00 00 8d 0c 10 a1 d0 bf 76 c1 ff 40 10 <8b> 11 f6 c2 01 0f 84 72 02 00 00 f6 c2 04 0f 85 69 02 00 00 85
2013-03-25_20:37:32.27444 <0>EIP: [<c1024c8d>] __do_page_fault+0x26d/0x560 SS:ESP 0068:f3945e40
2013-03-25_20:37:32.27452 kern.warn: CR2: 0000000000000000
2013-03-25_20:37:32.27460 kern.warn: ---[ end trace 2f2199060472fa75 ]---
2013-03-25_20:37:32.27467 kern.info: note: mysqld[1503] exited with preempt_count 1
2013-03-25_20:37:32.27476 kern.err: BUG: scheduling while atomic: mysqld/1503/0x10000002
2013-03-25_20:37:32.27486 kern.warn: Modules linked in: skge forcedeth
2013-03-25_20:37:32.27495 kern.warn: Pid: 1503, comm: mysqld Tainted: G D W 3.8.4-hardened #1
2013-03-25_20:37:32.27504 kern.warn: Call Trace:
2013-03-25_20:37:32.27512 kern.warn: [<c157dade>] __schedule_bug+0x47/0x53
2013-03-25_20:37:32.27519 kern.warn: [<c15829d9>] __schedule+0x509/0x5b0
2013-03-25_20:37:32.27527 kern.warn: [<c1058744>] ? ktime_get+0x54/0xf0
2013-03-25_20:37:32.27535 kern.warn: [<c101e256>] ? lapic_next_event+0x16/0x20
2013-03-25_20:37:32.27543 kern.warn: [<c105ec52>] ? clockevents_program_event+0xa2/0x150
2013-03-25_20:37:32.27551 kern.warn: [<c1052266>] __cond_resched+0x16/0x20
2013-03-25_20:37:32.27559 kern.warn: [<c1582ae5>] _cond_resched+0x25/0x30
2013-03-25_20:37:32.27567 kern.warn: [<c10a5228>] unmap_single_vma+0x238/0x4a0
2013-03-25_20:37:32.27575 kern.warn: [<c10950f8>] ? pagevec_lru_move_fn+0xb8/0xf0
2013-03-25_20:37:32.27582 kern.warn: [<c10a5c0b>] unmap_vmas+0x3b/0x50
2013-03-25_20:37:32.27590 kern.warn: [<c10aa8bf>] exit_mmap+0x6f/0x110
2013-03-25_20:37:32.27601 kern.warn: [<c10c0811>] ? __khugepaged_exit+0xd1/0x100
2013-03-25_20:37:32.27609 kern.warn: [<c102b68b>] mmput+0x3b/0xd0
2013-03-25_20:37:32.27617 kern.warn: [<c1032c2e>] do_exit+0x1ae/0x800
2013-03-25_20:37:32.27625 kern.warn: [<c157d832>] ? printk+0x38/0x3a
2013-03-25_20:37:32.27633 kern.warn: [<c1031299>] ? kmsg_dump+0xb9/0xd0
...
can you send me the corresponding vmlinux (from the build root) please?
(In reply to comment #10)
> can you send me the corresponding vmlinux (from the build root) please?

I've already switched off CONFIG_TRANSPARENT_HUGEPAGE and rebuilt the kernel. Is it OK to just switch it on again, rebuild the kernel and send you that vmlinux without actually booting it, or do you need the vmlinux plus kernel logs from exactly that build (so I'd have to boot it again and provide it together with the new logs from that boot attempt)?
(In reply to comment #11) > (In reply to comment #10) > Is it ok to just switch on it again, rebuild kernel and send you that > vmlinux without actually booting it send it to me for now and if the disasm doesn't match up with the oops report, i'll let you know.
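One way to check whether a rebuilt vmlinux "matches up" with an oops is to compare the bytes on the oops "Code:" line with the bytes at the same address in the disassembly (for example from objdump -d). The Python sketch below pulls the bytes out of a "Code:" line and reports which one is marked with <..>, which is the byte at the faulting EIP. The function name is hypothetical and the parsing is only as strict as this log format needs.

```python
import re

_HEX_BYTE = re.compile(r"[0-9a-fA-F]{2}")
_MARKED_BYTE = re.compile(r"<([0-9a-fA-F]{2})>")


def parse_oops_code(line):
    """Return (code_bytes, index_of_faulting_byte) from an oops 'Code:' line.

    The index is None if no <..> marker was found.
    """
    _, _, tail = line.partition("Code:")
    data = bytearray()
    fault_index = None
    for tok in tail.split():
        marked = _MARKED_BYTE.fullmatch(tok)
        if marked:
            fault_index = len(data)
            tok = marked.group(1)
        if not _HEX_BYTE.fullmatch(tok):
            break  # reached the text that follows the byte dump
        data.append(int(tok, 16))
    return bytes(data), fault_index
```

With the 3.8.4 oops, the marked byte is 8b at EIP c1024c8d, so the instruction at that address in the rebuilt vmlinux should start with the same byte and be preceded by the same byte sequence.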
(In reply to comment #12) > send it to me for now and if the disasm doesn't match up with the oops > report, i'll let you know. vmlinux built with: CONFIG_TRANSPARENT_HUGEPAGE=y CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y I can't attach it here (because of 1MB limit), so you can download it from http://powerman.name/tmp/vmlinux
so the problem is that you're using the old PAGEEXEC method for non-exec pages and that code has never been updated to work with hugepages (there's one less level of page tables there that the current code doesn't take into account). i can try to fix it but quite frankly i'd rather not deal with something that obsolete (and i don't even know if the TLB hack works with hugepages). is there a reason you can't just use SEGMEXEC?
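To illustrate the "one less level of page tables" point, here is a toy Python model of the classic non-PAE i386 walk (10-bit directory index, 10-bit table index, 12-bit offset), where a directory entry with the PS bit set maps a 4MB page directly and has no page table behind it. This is a simplified simulation written for this thread, not the PaX or kernel code, and every name in it is made up. A walker that always treats the directory entry as a pointer to a page table would read garbage in the huge-page case.

```python
PRESENT = 0x001        # entry is valid
PAGE_SIZE_BIT = 0x080  # PS: directory entry maps a 4MB page directly


def split_address(addr):
    """Split a 32-bit linear address for 2-level (non-PAE) paging."""
    return addr >> 22, (addr >> 12) & 0x3FF, addr & 0xFFF


def translate(page_dir, page_tables, addr):
    """Toy page walk.

    page_dir:    list of 1024 directory entries (ints)
    page_tables: dict mapping a table's base address to its 1024 entries
    Returns the physical address, or None if the walk hits a hole.
    """
    pde_index, pte_index, offset = split_address(addr)
    pde = page_dir[pde_index]
    if not pde & PRESENT:
        return None
    if pde & PAGE_SIZE_BIT:
        # Huge page: there is no second level, the low 22 bits are the offset.
        return (pde & 0xFFC00000) | (addr & 0x003FFFFF)
    pte = page_tables[pde & 0xFFFFF000][pte_index]
    if not pte & PRESENT:
        return None
    return (pte & 0xFFFFF000) | offset
```

The early return for PAGE_SIZE_BIT is the step that code written before hugepages would be missing.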
(In reply to comment #14)
> is there a reason you can't just use SEGMEXEC?

Not sure, it was configured many years ago on this system. I'll try switching to SEGMEXEC.

Just to make things clear, is this correct?
1) On a modern CPU with NX-bit support (both 32/64-bit), PAGEEXEC should be used.
2) On an old CPU without NX-bit support, SEGMEXEC should be used.
3) On an old CPU without NX-bit support, the PAGEEXEC implementation is outdated and not supported.
(In reply to comment #15)
> Just to make things clear, is this correct?
> 1) On modern CPU with NX-bit support (both 32/64-bit) PAGEEXEC should be
> used.

correct.

> 2) On old CPU without NX-bit support SEGMEXEC should be used.

correct.

> 3) On old CPU without NX-bit support PAGEEXEC implementation outdated and
> not supported.

well, it should still work, i just don't want to keep adding more code to it. so for the next patch i implemented a workaround that, similarly to SEGMEXEC, will disable the use of THP when the old PAGEEXEC method is used. it should at least fix the crash you saw.
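The rule of thumb confirmed here (NX available: PAGEEXEC; no NX on 32-bit x86: SEGMEXEC) can be checked against /proc/cpuinfo, where CPUs with NX support list "nx" on the flags line. A minimal Python sketch, with invented function names and assuming a 32-bit x86 system as in this report:

```python
def has_nx(cpuinfo_text):
    """True if the first 'flags' line of /proc/cpuinfo lists 'nx'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return "nx" in value.split()
    return False


def suggested_pax_method(cpuinfo_text):
    """Apply the rule of thumb from this thread (assumes 32-bit x86)."""
    return "PAGEEXEC" if has_nx(cpuinfo_text) else "SEGMEXEC"


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print(suggested_pax_method(f.read()))
```

On a CPU without the nx flag this would suggest SEGMEXEC, which is the configuration that was tried next.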
(In reply to comment #16)
> is there a reason you can't just use SEGMEXEC?

Maybe there was one several years ago when I last configured this. But it looks like SEGMEXEC works fine now, and there is no kernel BUG with TRANSPARENT_HUGEPAGE=y.

Thanks.
Resolved

(In reply to comment #17)
> (In reply to comment #16)
> > is there a reason you can't just use SEGMEXEC?
>
> Maybe it was one several years ago when I configure this last time.
> But looks like now SEGMEXEC works ok, and there is no kernel BUG with
> TRANSPARENT_HUGEPAGE=y.
>
> Thanks.

Okay, closing this as resolved.