Bug 536040 - =sys-kernel/hardened-sources-3.17.7-r1 kernel oops on high disk usage
Summary: =sys-kernel/hardened-sources-3.17.7-r1 kernel oops on high disk usage
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Hardened
Hardware: All Linux
Importance: Normal critical
Assignee: The Gentoo Linux Hardened Kernel Team (OBSOLETE)
URL: http://xena.ww7.be/oops012014.jpg
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-08 16:03 UTC by William Waisse
Modified: 2015-11-21 23:34 UTC (History)
CC List: 3 users

See Also:
Package list:
Runtime testing required: ---


Attachments
kernel config (config_3.17.7-hardened-r1ww7_r10b.txt,93.65 KB, text/plain)
2015-02-05 01:14 UTC, William Waisse

Description William Waisse 2015-01-08 16:03:14 UTC
I have the oops shown on this screenshot:
http://xena.ww7.be/oops012014.jpg

it happens more or less once a day

the banned user 81 shown on the screenshot is apache, which is the process using the most disk I/O

the server is a DELL R210 II with bios 2.7.0

the "banned user" message on the screenshot appears only on the console, which I can see through the iDRAC console, but it is NOT written to any logs, so it seems the disks/RAID are no longer accessible at that point.


Reproducible: Always

Steps to Reproduce:
1.Dell R210 II server
2.kernel linux-3.17.7-hardened-r1
3.seems to crash on high disk usage
Actual Results:  
kernel oops

Expected Results:  
no oops ;) or an MCE if it's a hardware fault (not sure yet, but it could be)
Comment 1 Matthew Thode ( prometheanfire ) archtester Gentoo Infrastructure gentoo-dev Security 2015-01-09 00:45:45 UTC
is there anything special that apache is doing?
Comment 2 Matthew Thode ( prometheanfire ) archtester Gentoo Infrastructure gentoo-dev Security 2015-01-09 00:46:19 UTC
also, a disk testing tool like fio should reproduce it if you are willing to try.
Comment 3 William Waisse 2015-01-18 09:44:16 UTC
(In reply to Matthew Thode ( prometheanfire ) from comment #1)
> is there anything special that apache is doing?

nothing special that I know of, but I suspect some kind of 0day...

I had another oops on the same server / hardware; this one I have in the logs:

Jan 15 22:16:17 gemelos kernel: divide error: 0000 [#1] SMP
Jan 15 22:16:17 gemelos kernel: CPU: 2 PID: 18340 Comm: mysqld Not tainted 3.17.7-hardened-r1ww7_r10b #1
Jan 15 22:16:17 gemelos kernel: Hardware name: Dell Inc. PowerEdge R210 II/03X6X0, BIOS 2.7.0 11/15/2013
Jan 15 22:16:17 gemelos kernel: task: ee0c0930 ti: ee0c0c94 task.ti: ee0c0c94
Jan 15 22:16:17 gemelos kernel: EIP: 0060:[<00249241>] EFLAGS: 00210246 CPU: 2
Jan 15 22:16:17 gemelos kernel: EAX: 0000003a EBX: ffff66bd ECX: 00000000 EDX: 00000000
Jan 15 22:16:17 gemelos kernel: ESI: 0000003a EDI: 0000003a EBP: c230fc6c ESP: c230fc48
Jan 15 22:16:17 gemelos kernel: DS: 0068 ES: 0068 FS: 00d8 GS: 007b SS: 0068
Jan 15 22:16:17 gemelos kernel: CR0: 80050033 CR2: 204bc454 CR3: 01a04080 CR4: 001407f0
Jan 15 22:16:17 gemelos kernel: Stack:
Jan 15 22:16:17 gemelos kernel: 00000542 00000000 c230fc74 00000000 00000000 00000000 00000000 00000000
Jan 15 22:16:17 gemelos kernel: 00000000 c230fca8 000c1e69 00000000 00000000 02a70000 00000000 00000000
Jan 15 22:16:17 gemelos kernel: 00000000 000002a7 0000003b 00000000 00000001 00000065 00000000 ee3d49c4
Jan 15 22:16:17 gemelos kernel: Call Trace:
Jan 15 22:16:17 gemelos kernel: [<000c1e69>] bdi_position_ratio+0x181/0x1dd
Jan 15 22:16:17 gemelos kernel: [<000c2fc5>] balance_dirty_pages_ratelimited+0x43f/0x739
Jan 15 22:16:17 gemelos kernel: [<00498fe8>] ? nft_target_init+0x6b/0x17b
Jan 15 22:16:17 gemelos kernel: [<00498fe8>] ? nft_target_init+0x6b/0x17b
Jan 15 22:16:17 gemelos kernel: [<001a784d>] ? __ext4_journal_stop+0x53/0x6c
Jan 15 22:16:17 gemelos kernel: [<00017ffe>] ? intel_pmu_hw_config+0xa7/0xca
Jan 15 22:16:17 gemelos kernel: [<000bb9ff>] generic_perform_write+0x172/0x1af
Jan 15 22:16:17 gemelos kernel: [<003c0000>] ? bnx2x_queue_comp_cmd+0xcf/0x12d
Jan 15 22:16:17 gemelos kernel: [<000bcbc0>] __generic_file_write_iter+0x444/0x4c5
Jan 15 22:16:17 gemelos kernel: [<003c1000>] ? bnx2x_func_send_cmd+0xc7/0x459
Jan 15 22:16:17 gemelos kernel: [<00200246>] ? sha256_transform+0x19e0/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<003c0000>] ? bnx2x_queue_comp_cmd+0xcf/0x12d
Jan 15 22:16:17 gemelos kernel: [<00180245>] ext4_file_write_iter+0x3b2/0x473
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<000ee35d>] new_sync_write+0x5c/0x83
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<0000f000>] ? init_intel_cacheinfo+0x291/0x3bd
Jan 15 22:16:17 gemelos kernel: [<003c0000>] ? bnx2x_queue_comp_cmd+0xcf/0x12d
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<000ee301>] ? do_sync_readv_writev+0x70/0x70
Jan 15 22:16:17 gemelos kernel: [<000eee57>] vfs_write+0xe8/0x1c8
Jan 15 22:16:17 gemelos kernel: [<000ef286>] SyS_write+0x3f/0x7f
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<003c0000>] ? bnx2x_queue_comp_cmd+0xcf/0x12d
Jan 15 22:16:17 gemelos kernel: [<00510b09>] syscall_call+0x7/0x7
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010033>] ? print_cpu_info+0x19/0xb0
Jan 15 22:16:17 gemelos kernel: [<00200293>] ? sha256_transform+0x1a2d/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00200293>] ? sha256_transform+0x1a2d/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00020033>] ? smp_trace_threshold_interrupt+0x13/0x85
Jan 15 22:16:17 gemelos kernel: [<00200293>] ? sha256_transform+0x1a2d/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00200033>] ? sha256_transform+0x17cd/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00200293>] ? sha256_transform+0x1a2d/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00200033>] ? sha256_transform+0x17cd/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00200293>] ? sha256_transform+0x1a2d/0x24a2
Jan 15 22:16:17 gemelos kernel: Code: 89 f9 83 ec 18 89 d7 8b 51 04 8b 01 85 d2 89 45 e8 89 d0 89 55 ec 75 2e 8b 4d e8 89 f3 89 fe 39 ce 73 04 31 f6 eb 10 89 f0 31 d2 <f7> f1 31 d2 89 c6 89 f8 f7 f1 89 d7 89 d8 89 fa 89 f3 f7 f1 89
Jan 15 22:16:17 gemelos kernel: EIP: [<00249241>] div64_u64+0x36/0x106 SS:ESP 0068:c230fc48
Jan 15 22:16:17 gemelos kernel: divide error: 0000 [#2]
Jan 15 22:16:17 gemelos kernel: ---[ end trace 16e28ee794763227 ]---
Jan 15 22:16:17 gemelos kernel: grsec: banning user with uid 60 until system restart for suspicious kernel crash
Jan 15 22:16:17 gemelos kernel: SMP
Jan 15 22:16:17 gemelos kernel: CPU: 4 PID: 18516 Comm: apache2 Tainted: G      D        3.17.7-hardened-r1ww7_r10b #1
Jan 15 22:16:17 gemelos kernel: Hardware name: Dell Inc. PowerEdge R210 II/03X6X0, BIOS 2.7.0 11/15/2013
Jan 15 22:16:17 gemelos kernel: task: ee0eced0 ti: ee0ed234 task.ti: ee0ed234
Jan 15 22:16:17 gemelos kernel: EIP: 0060:[<00249241>] EFLAGS: 00210246 CPU: 4
Jan 15 22:16:17 gemelos kernel: EAX: 0000003a EBX: ffff6647 ECX: 00000000 EDX: 00000000
Jan 15 22:16:17 gemelos kernel: ESI: 0000003a EDI: 0000003a EBP: c23ebc58 ESP: c23ebc34
Jan 15 22:16:17 gemelos kernel: DS: 0068 ES: 0068 FS: 00d8 GS: 007b SS: 0068
Jan 15 22:16:17 gemelos kernel: CR0: 80050033 CR2: a18c2000 CR3: 01a04100 CR4: 001407f0
Jan 15 22:16:17 gemelos kernel: Stack:
Jan 15 22:16:17 gemelos kernel: 00000542 00000000 c23ebc60 00000000 00000000 00000000 00000000 00000000
Jan 15 22:16:17 gemelos kernel: 00000000 c23ebc94 000c1e69 00000000 00000000 02a70000 00000000 00000000
Jan 15 22:16:17 gemelos kernel: 00000000 000002a7 0000003b 00000000 00000001 00000065 00000000 ee3d49c4
Jan 15 22:16:17 gemelos kernel: Call Trace:
Jan 15 22:16:17 gemelos kernel: [<000c1e69>] bdi_position_ratio+0x181/0x1dd
Jan 15 22:16:17 gemelos kernel: [<000c2fc5>] balance_dirty_pages_ratelimited+0x43f/0x739
Jan 15 22:16:17 gemelos kernel: [<00498fe8>] ? nft_target_init+0x6b/0x17b
Jan 15 22:16:17 gemelos kernel: [<00498fe8>] ? nft_target_init+0x6b/0x17b
Jan 15 22:16:17 gemelos kernel: [<001a784d>] ? __ext4_journal_stop+0x53/0x6c
Jan 15 22:16:17 gemelos kernel: [<00017ffe>] ? intel_pmu_hw_config+0xa7/0xca
Jan 15 22:16:17 gemelos kernel: [<000bb9ff>] generic_perform_write+0x172/0x1af
Jan 15 22:16:17 gemelos kernel: [<000bcbc0>] __generic_file_write_iter+0x444/0x4c5
Jan 15 22:16:17 gemelos kernel: [<00200246>] ? sha256_transform+0x19e0/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00180245>] ext4_file_write_iter+0x3b2/0x473
Jan 15 22:16:17 gemelos kernel: [<000ee35d>] new_sync_write+0x5c/0x83
Jan 15 22:16:17 gemelos kernel: [<000ee301>] ? do_sync_readv_writev+0x70/0x70
Jan 15 22:16:17 gemelos kernel: [<000eee57>] vfs_write+0xe8/0x1c8
Jan 15 22:16:17 gemelos kernel: [<000ef391>] SyS_pwrite64+0x52/0x79
Jan 15 22:16:17 gemelos kernel: [<00510b09>] syscall_call+0x7/0x7
Jan 15 22:16:17 gemelos kernel: [<00200246>] ? sha256_transform+0x19e0/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00510b29>] ? restore_all_pax+0xc/0xc
Jan 15 22:16:17 gemelos kernel: [<0051007b>] ? ldsem_down_read+0x3b/0x163
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00200202>] ? sha256_transform+0x199c/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00200033>] ? sha256_transform+0x17cd/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00200286>] ? sha256_transform+0x1a20/0x24a2
Jan 15 22:16:17 gemelos kernel: Code: 89 f9 83 ec 18 89 d7 8b 51 04 8b 01 85 d2 89 45 e8 89 d0 89 55 ec 75 2e 8b 4d e8 89 f3 89 fe 39 ce 73 04 31 f6 eb 10 89 f0 31 d2 <f7> f1 31 d2 89 c6 89 f8 f7 f1 89 d7 89 d8 89 fa 89 f3 f7 f1 89
Jan 15 22:16:17 gemelos kernel: EIP: [<00249241>] div64_u64+0x36/0x106 SS:ESP 0068:c23ebc34
Jan 15 22:16:17 gemelos kernel: divide error: 0000 [#3]
Jan 15 22:16:17 gemelos kernel: ---[ end trace 16e28ee794763228 ]---
Jan 15 22:16:17 gemelos kernel: grsec: banning user with uid 81 until system restart for suspicious kernel crash
Jan 15 22:16:17 gemelos kernel: SMP
Jan 15 22:16:17 gemelos kernel: CPU: 6 PID: 18483 Comm: mysqld Tainted: G      D        3.17.7-hardened-r1ww7_r10b #1
Jan 15 22:16:17 gemelos kernel: Hardware name: Dell Inc. PowerEdge R210 II/03X6X0, BIOS 2.7.0 11/15/2013
Jan 15 22:16:17 gemelos kernel: task: ee0a4e10 ti: ee0a5174 task.ti: ee0a5174
Jan 15 22:16:17 gemelos kernel: EIP: 0060:[<00249241>] EFLAGS: 00210246 CPU: 6
Jan 15 22:16:17 gemelos kernel: EAX: 0000003a EBX: ffff64e5 ECX: 00000000 EDX: 00000000
Jan 15 22:16:17 gemelos kernel: ESI: 0000003a EDI: 0000003a EBP: ebe7bbfc ESP: ebe7bbd8
Jan 15 22:16:17 gemelos kernel: DS: 0068 ES: 0068 FS: 00d8 GS: 007b SS: 0068
Jan 15 22:16:17 gemelos kernel: CR0: 80050033 CR2: a3400000 CR3: 01a04180 CR4: 001407f0
Jan 15 22:16:17 gemelos kernel: Stack:
Jan 15 22:16:17 gemelos kernel: 00000542 00000000 ebe7bc04 00000000 00000000 00000000 00000000 00000000
Jan 15 22:16:17 gemelos kernel: 00000000 ebe7bc38 000c1e69 00000000 00000000 02a70000 00000000 00000000
Jan 15 22:16:17 gemelos kernel: 00000000 000002a7 0000003b 00000000 00000001 00000065 00000000 ee3d49c4
Jan 15 22:16:17 gemelos kernel: Call Trace:
Jan 15 22:16:17 gemelos kernel: [<000c1e69>] bdi_position_ratio+0x181/0x1dd
Jan 15 22:16:17 gemelos kernel: [<000c2fc5>] balance_dirty_pages_ratelimited+0x43f/0x739
Jan 15 22:16:17 gemelos kernel: [<00498fe8>] ? nft_target_init+0x6b/0x17b
Jan 15 22:16:17 gemelos kernel: [<00498fe8>] ? nft_target_init+0x6b/0x17b
Jan 15 22:16:17 gemelos kernel: [<001a784d>] ? __ext4_journal_stop+0x53/0x6c
Jan 15 22:16:17 gemelos kernel: [<00017ffe>] ? intel_pmu_hw_config+0xa7/0xca
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<000bb9ff>] generic_perform_write+0x172/0x1af
Jan 15 22:16:17 gemelos kernel: [<000bcbc0>] __generic_file_write_iter+0x444/0x4c5
Jan 15 22:16:17 gemelos kernel: [<00200246>] ? sha256_transform+0x19e0/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00180245>] ext4_file_write_iter+0x3b2/0x473
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<000ee35d>] new_sync_write+0x5c/0x83
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<000ee301>] ? do_sync_readv_writev+0x70/0x70
Jan 15 22:16:17 gemelos kernel: [<000eee57>] vfs_write+0xe8/0x1c8
Jan 15 22:16:17 gemelos kernel: [<000ef286>] SyS_write+0x3f/0x7f
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00510b09>] syscall_call+0x7/0x7
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00200293>] ? sha256_transform+0x1a2d/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00510b29>] ? restore_all_pax+0xc/0xc
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00200293>] ? sha256_transform+0x1a2d/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00010000>] ? print_cpu_msr+0x3b/0x55
Jan 15 22:16:17 gemelos kernel: [<00200033>] ? sha256_transform+0x17cd/0x24a2
Jan 15 22:16:17 gemelos kernel: [<00200293>] ? sha256_transform+0x1a2d/0x24a2
Jan 15 22:16:17 gemelos kernel: Code: 89 f9 83 ec 18 89 d7 8b 51 04 8b 01 85 d2 89 45 e8 89 d0 89 55 ec 75 2e 8b 4d e8 89 f3 89 fe 39 ce 73 04 31 f6 eb 10 89 f0 31 d2 <f7> f1 31 d2 89 c6 89 f8 f7 f1 89 d7 89 d8 89 fa 89 f3 f7 f1 89
Jan 15 22:16:17 gemelos kernel: EIP: [<00249241>] div64_u64+0x36/0x106 SS:ESP 0068:ebe7bbd8
Jan 15 22:16:17 gemelos kernel: ---[ end trace 16e28ee794763229 ]---
Jan 15 22:16:17 gemelos kernel: grsec: banning user with uid 60 until system restart for suspicious kernel crash
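
A note on reading the dumps above: in all three traces ECX is 00000000, and the byte marked <f7> in the Code: line is followed by f1. The snippet below is a minimal sketch, not output from any tool used in this report; the helper name decode_f7 and the small opcode table are illustrative assumptions. It decodes that opcode/ModRM pair and checks it against the register values copied from the first trace, which is consistent with the "divide error" being a division by a zero divisor inside div64_u64.

```python
# Illustrative sketch: decode the faulting opcode marked <f7> in the oops
# "Code:" line. Only register-operand forms of the F7 group are handled.

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# Opcode 0xF7 is an instruction group: bits 5..3 of the ModRM byte
# select the actual operation.
GROUP_F7 = {0: "test", 2: "not", 3: "neg", 4: "mul", 5: "imul", 6: "div", 7: "idiv"}


def decode_f7(modrm: int) -> str:
    """Return a mnemonic for opcode F7 followed by the given ModRM byte."""
    mod = modrm >> 6          # top two bits: addressing mode
    op = (modrm >> 3) & 7     # middle three bits: operation within the group
    rm = modrm & 7            # low three bits: register operand
    if mod != 3:
        raise ValueError("memory operand; not handled in this sketch")
    return f"{GROUP_F7.get(op, '(undefined)')} %{REG32[rm]}"


# Tail of the Code: line from the first trace in this comment.
code_tail = "89 f0 31 d2 <f7> f1 31 d2 89 c6"
tokens = code_tail.split()
marker = next(k for k, t in enumerate(tokens) if t.startswith("<"))
opcode = int(tokens[marker].strip("<>"), 16)
modrm = int(tokens[marker + 1], 16)

# Register values copied from the first trace's register dump.
registers = {"eax": 0x0000003A, "ecx": 0x00000000, "edx": 0x00000000}

insn = decode_f7(modrm)
divisor_is_zero = insn == "div %ecx" and registers["ecx"] == 0
print(insn, "with ecx =", registers["ecx"])
```

This only shows which instruction faulted; it says nothing about why the caller passed a zero divisor.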
Comment 4 William Waisse 2015-01-18 09:45:58 UTC
(In reply to Matthew Thode ( prometheanfire ) from comment #2)
> also, a disk testing tool like fio should reproduce it if you are willing to
> try.

I installed fio and I can try that. Any recommended options to run it with?
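
No options were recommended in this thread. As a starting point only, here is a hedged sketch of a buffered random-write job: the traces in comment 3 go through generic_perform_write and balance_dirty_pages_ratelimited, which is the page-cache write path, so the sketch keeps direct=0. The job name, target directory, sizes, and runtime are all assumptions chosen for illustration, and the helper fio_buffered_write_cmd is hypothetical.

```python
import shutil
import subprocess

# Flip to True to actually launch fio; left off so the sketch is safe to import.
RUN_FIO = False


def fio_buffered_write_cmd(directory: str, jobs: int = 4, size: str = "512m",
                           runtime_s: int = 300) -> list[str]:
    """Build a fio command line for buffered (page-cache) random writes."""
    return [
        "fio",
        "--name=dirty-page-stress",   # arbitrary job name
        f"--directory={directory}",   # where fio creates its test files
        "--rw=randwrite",             # random writes
        "--bs=4k",                    # 4 KiB blocks
        f"--size={size}",             # per-job file size
        f"--numjobs={jobs}",          # parallel writers
        "--ioengine=psync",           # plain pwrite() system calls
        "--direct=0",                 # buffered I/O, so dirty-page throttling is exercised
        "--time_based",
        f"--runtime={runtime_s}",     # seconds
        "--group_reporting",
    ]


# /var/tmp/fio-test is a placeholder; it must exist on the ext4 volume under test.
cmd = fio_buffered_write_cmd("/var/tmp/fio-test")
print(" ".join(cmd))

if RUN_FIO and shutil.which("fio"):
    subprocess.run(cmd, check=False)
```

Whether this load reproduces the oops is untested; it only aims at the same write path that appears in the traces.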
Comment 5 William Waisse 2015-01-18 13:53:17 UTC
(In reply to William Waisse from comment #3)

I cross-posted this second oops to the LKML: https://lkml.org/lkml/2015/1/18/60
Comment 6 Anthony Basile gentoo-dev 2015-01-18 15:58:47 UTC
(In reply to William Waisse from comment #5)
> 
> I cross posted this second oops on the lkml :
> https://lkml.org/lkml/2015/1/18/60

Sorry, this was assigned to the wrong alias and I'm just seeing it now.  Upstream vanilla kernel developers will not be interested in a grsec-patched (i.e. heavily patched) kernel, although from the oops it doesn't look like grsec/pax is causing it.  The problem has something to do with your Broadcom card (bnx2x driver), and I have had a report on IRC about a similar oops.

Maybe pipacs upstream might see something else there.  You can try two things to narrow it down:

1) try hardened-sources 3.18.2-r1 which is the very latest grsec/pax patch

2) try the vanilla equivalent and see if you hit the same oops.
Comment 7 William Waisse 2015-01-18 16:37:29 UTC
adding a few details

3.17.7-hardened

i686-pc-linux-gnu-4.8.3

02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
Comment 8 PaX Team 2015-01-18 19:26:33 UTC
it looks like a divide-by-0 bug somewhere in ext4/vfs, it's not clear what code path is triggering it. if you still have vmlinux around you could try to resolve the address reported for generic_perform_write via addr2line. also check 3.18 as we've moved off of 3.17 already.
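
For reference, the address that the oops in comment 3 reports for generic_perform_write is on the line `[<000bb9ff>] generic_perform_write+0x172/0x1af`. The sketch below is illustrative only (the regular expression and the function name reliable_frames are assumptions, not part of any kernel tooling). It pulls the frames that are not prefixed with "?" out of a trace and prints an addr2line command for each. Whether these addresses map directly onto vmlinux's symbol addresses on this hardened i686 build is not verified here.

```python
import re

# [<address>] optional "?" marker, then symbol+0xoffset/0xsize
FRAME = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\?\s+)?(\S+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def reliable_frames(oops_text: str) -> list[tuple[str, str, int]]:
    """Return (address, symbol, offset) for frames not marked with '?'."""
    frames = []
    for line in oops_text.splitlines():
        m = FRAME.search(line)
        if m and not m.group(2):   # skip frames the kernel flags as uncertain
            frames.append((m.group(1), m.group(3), int(m.group(4), 16)))
    return frames


# Three lines copied from the first trace in comment 3.
sample = """\
Jan 15 22:16:17 gemelos kernel: [<000c1e69>] bdi_position_ratio+0x181/0x1dd
Jan 15 22:16:17 gemelos kernel: [<00498fe8>] ? nft_target_init+0x6b/0x17b
Jan 15 22:16:17 gemelos kernel: [<000bb9ff>] generic_perform_write+0x172/0x1af
"""

frames = reliable_frames(sample)
commands = [f"addr2line -e vmlinux -fip {addr}" for addr, _sym, _off in frames]
for c in commands:
    print(c)
```

The vmlinux passed to addr2line has to be the one built for the kernel that produced the oops, otherwise the addresses will not line up.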
Comment 9 William Waisse 2015-01-18 23:38:34 UTC
is it : 

addr2line -e vmlinux -fip 001721af
ext3_fill_super at /usr/src/linux/fs/ext3/super.c:1961 (discriminator 1)

?
Comment 10 PaX Team 2015-01-19 00:35:36 UTC
(In reply to William Waisse from comment #9)
> addr2line -e vmlinux -fip 001721af
> ext3_fill_super at /usr/src/linux/fs/ext3/super.c:1961 (discriminator 1)

the theory is fine but i don't know where that address comes from, it's certainly not for generic_perform_write as that would then show up in the addr2line output as well ;). did you make sure that the address you took from the oops is for the same kernel whose vmlinux image you passed to addr2line? also enabling CONFIG_DEBUG_INFO/CONFIG_DEBUG_INFO_REDUCED will produce better output.
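
One rough way to check the "same kernel" question is to compare the release string printed in the oops with the "Linux version ..." banner embedded in the image. The sketch below assumes that banner is present as plain text in the uncompressed image; the function names are hypothetical, and the image bytes used here are synthetic stand-ins rather than data from this server.

```python
import re


def oops_release(oops_line: str):
    """Extract the release string that precedes the '#N' build counter."""
    m = re.search(r"(\S+) #\d+\s*$", oops_line)
    return m.group(1) if m else None


def banner_release(image_bytes: bytes):
    """Extract the release from the first 'Linux version X' string found."""
    m = re.search(rb"Linux version (\S+)", image_bytes)
    return m.group(1).decode() if m else None


# Line copied from the oops in comment 3.
oops_line = ("Jan 15 22:16:17 gemelos kernel: CPU: 2 PID: 18340 Comm: mysqld "
             "Not tainted 3.17.7-hardened-r1ww7_r10b #1")

# Synthetic image contents, made up for illustration only.
fake_vmlinux = b"\x00\x00Linux version 3.17.7-hardened-r1ww7_r10b_debug (builder@host) #3 SMP\x00"

same_release = oops_release(oops_line) == banner_release(fake_vmlinux)
print(oops_release(oops_line), banner_release(fake_vmlinux), same_release)
```

A matching release string is necessary but not sufficient: two builds with the same name can still differ in layout if the configuration or compiler changed.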
Comment 11 William Waisse 2015-01-19 09:50:14 UTC
(In reply to PaX Team from comment #10)
> (In reply to William Waisse from comment #9)
> > addr2line -e vmlinux -fip 001721af
> > ext3_fill_super at /usr/src/linux/fs/ext3/super.c:1961 (discriminator 1)
> 
> the theory is fine but i don't know where that address comes from, it's
> certainly not for generic_perform_write as that would then show up in the
> addr2line output as well ;). did you make sure that the address you took
> from the oops is for the same kernel whose vmlinux image you passed to
> addr2line? also enabling CONFIG_DEBUG_INFO/CONFIG_DEBUG_INFO_REDUCED will
> produce better output.

addr2line was run on the kernel I rebuilt with the same options and source code, but with CONFIG_DEBUG_INFO/CONFIG_DEBUG_INFO_REDUCED added

the vmlinux in /usr/src/linux was overwritten when I rebuilt the kernel with the debug options

addr2line won't work on the vmlinuz files I have in /boot after make install; it seems to work only on the vmlinux in the source tree.

gemelos boot # addr2line -e vmlinuz-3.17.7-hardened-r1ww7_r10b -fip 001721af
addr2line: vmlinuz-3.17.7-hardened-r1ww7_r10b: File format not recognized
gemelos boot # addr2line -e vmlinuz-3.17.7-hardened-r1ww7_r10b_debug -fip 001721af
addr2line: vmlinuz-3.17.7-hardened-r1ww7_r10b_debug: File format not recognized


I'm now waiting for another oops with the debug-enabled kernel
Comment 12 William Waisse 2015-02-04 17:17:05 UTC
I finally had another crash, the oops is a little bit different : 

Feb  3 09:02:28 gemelos kernel: divide error: 0000 [#1] SMP DEBUG_PAGEALLOC
Feb  3 09:02:28 gemelos kernel: CPU: 4 PID: 15864 Comm: mysqld Not tainted 3.17.7-hardened-r1ww7_r10b_debug #3
Feb  3 09:02:28 gemelos kernel: Hardware name: Dell Inc. PowerEdge R210 II/03X6X0, BIOS 2.7.0 11/15/2013
Feb  3 09:02:28 gemelos kernel: task: d3c8e190 ti: d3c8e4f4 task.ti: d3c8e4f4
Feb  3 09:02:28 gemelos kernel: EIP: 0060:[<002477e8>] EFLAGS: 00210246 CPU: 4
Feb  3 09:02:28 gemelos kernel: EAX: 00000356 EBX: 00000000 ECX: 00000356 EDX: 00000000
Feb  3 09:02:28 gemelos kernel: ESI: 00000000 EDI: 00000000 EBP: d2a7dd18 ESP: d2a7dcec
Feb  3 09:02:28 gemelos kernel: DS: 0068 ES: 0068 FS: 00d8 GS: 007b SS: 0068
Feb  3 09:02:28 gemelos kernel: CR0: 80050033 CR2: 1e0db000 CR3: 01a04100 CR4: 001407f0
Feb  3 09:02:28 gemelos kernel: Stack:
Feb  3 09:02:28 gemelos kernel: 00000000 ffffffff 00000000 00000000 ffffc08b 00000356 00000356 000c46fa
Feb  3 09:02:28 gemelos kernel: 00000000 c000001d 00000001 d2a7dd4c 000c48cd 00000000 00000000 00000000
Feb  3 09:02:28 gemelos kernel: 00010000 ffffffff 0000001d 00000357 00000000 00000001 0000002f ecf429c4
Feb  3 09:02:28 gemelos kernel: Call Trace:
Feb  3 09:02:28 gemelos kernel: [<000c46fa>] ? pos_ratio_polynom+0x25/0x77
Feb  3 09:02:28 gemelos kernel: [<000c48cd>] bdi_position_ratio+0x181/0x1dc
Feb  3 09:02:28 gemelos kernel: [<00010000>] ? identify_cpu+0x73/0x32d
Feb  3 09:02:28 gemelos kernel: [<000c5a90>] balance_dirty_pages_ratelimited+0x366/0x656
Feb  3 09:02:28 gemelos kernel: [<000be51c>] generic_perform_write+0x151/0x192
Feb  3 09:02:28 gemelos kernel: [<0011036c>] ? ftrace_raw_output_writeback_single_inode_template+0x79/0x98
Feb  3 09:02:28 gemelos kernel: [<000bfc1a>] __generic_file_write_iter+0x42f/0x478
Feb  3 09:02:28 gemelos kernel: [<0011047c>] ? perf_trace_bdi_dirty_ratelimit+0x2a/0xf2
Feb  3 09:02:28 gemelos kernel: [<0011036c>] ? ftrace_raw_output_writeback_single_inode_template+0x79/0x98
Feb  3 09:02:28 gemelos kernel: [<00180e42>] ext4_file_write_iter+0x2dd/0x46a
Feb  3 09:02:28 gemelos kernel: [<000f0c89>] new_sync_write+0x5c/0x83
Feb  3 09:02:28 gemelos kernel: [<0011036c>] ? ftrace_raw_output_writeback_single_inode_template+0x79/0x98
Feb  3 09:02:28 gemelos kernel: [<000f0c2d>] ? new_sync_read+0x80/0x80
Feb  3 09:02:28 gemelos kernel: [<000f1509>] vfs_write+0xe8/0x1c5
Feb  3 09:02:28 gemelos kernel: [<000f1bb0>] SyS_pwrite64+0x4e/0x75
Feb  3 09:02:28 gemelos kernel: [<0011036c>] ? ftrace_raw_output_writeback_single_inode_template+0x79/0x98
Feb  3 09:02:28 gemelos kernel: [<00507d89>] syscall_call+0x7/0x7
Feb  3 09:02:28 gemelos kernel: [<0011036c>] ? ftrace_raw_output_writeback_single_inode_template+0x79/0x98
Feb  3 09:02:28 gemelos kernel: [<00110033>] ? ftrace_raw_output_writeback_pages_written+0x37/0x3e
Feb  3 09:02:28 gemelos kernel: [<00200293>] ? sha512_transform+0x528/0x1476
Feb  3 09:02:28 gemelos kernel: [<00200033>] ? sha512_transform+0x2c8/0x1476
Feb  3 09:02:28 gemelos kernel: [<00200296>] ? sha512_transform+0x52b/0x1476
Feb  3 09:02:28 gemelos kernel: [<00200296>] ? sha512_transform+0x52b/0x1476
Feb  3 09:02:28 gemelos kernel: Code: e5 57 56 53 8d 7d 08 83 ec 20 8b 37 8b 7f 04 89 45 e4 89 55 e8 85 ff 75 35 8b 4d e8 39 f1 89 4d ec 73 04 31 c9 eb 12 89 c8 31 d2 <f7> f6 31 d2 89 c1 8b 45 ec f7 f6 89 55 ec 8b 5d e4 8b 55 ec 89
Feb  3 09:02:28 gemelos kernel: EIP: [<002477e8>] div64_u64+0x2d/0x126 SS:ESP 0068:d2a7dcec
Feb  3 09:02:28 gemelos kernel: ---[ end trace 449da52219d682ad ]---

I will now try with hardened-sources 3.18.5 . . .
Comment 13 William Waisse 2015-02-04 17:54:12 UTC
so with the latest crash the address reported for generic_perform_write is 

Feb  3 09:02:28 gemelos kernel: [<000be51c>] generic_perform_write+0x151/0x192

and i have the vmlinux for addr2line :

addr2line -e vmlinux -fip 00151192
reiserfs_write_dquot at /usr/src/linux/fs/reiserfs/super.c:2213

addr2line -e vmlinux -fip 000be51c
constant_test_bit at /usr/src/linux/./arch/x86/include/asm/bitops.h:311
 (inlined by) test_ti_thread_flag at /usr/src/linux/include/linux/thread_info.h:91
 (inlined by) test_tsk_thread_flag at /usr/src/linux/include/linux/sched.h:2828
 (inlined by) signal_pending at /usr/src/linux/include/linux/sched.h:2854
 (inlined by) fatal_signal_pending at /usr/src/linux/include/linux/sched.h:2864
 (inlined by) generic_perform_write at /usr/src/linux/mm/filemap.c:2526

is that what you need ? 
( there's no reiserfs at all on this server )
Comment 14 PaX Team 2015-02-04 19:56:21 UTC
can you resolve 2477e8?
Comment 15 William Waisse 2015-02-04 22:37:55 UTC
addr2line -e vmlinux -fip 002477e8

div_u64_rem at /usr/src/linux/./arch/x86/include/asm/div64.h:53
 (inlined by) div_u64 at /usr/src/linux/include/linux/math64.h:100
 (inlined by) div64_u64 at /usr/src/linux/lib/div64.c:139
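
For anyone following the "inlined by" chain: on a 32 bit build, div64_u64 takes a short path when the upper 32 bits of the divisor are zero and ends in a single hardware divide, which is where a zero divisor traps. The sketch below is a simplified userspace model of that behaviour, not the kernel source. The name model_div64_u64 and the status codes are made up for illustration, and the large-divisor path is replaced by a plain 64 bit division.

```c
#include <stdint.h>

/* Result codes for the model only. The real kernel function has no such
 * return value: it simply traps ("divide error") on a zero divisor. */
enum div_status { DIV_OK = 0, DIV_BY_ZERO = 1 };

/*
 * Hypothetical userspace model of div64_u64() on a 32 bit kernel.
 * When the divisor fits in 32 bits the division uses a 32 bit divisor,
 * which corresponds to the div_u64 / div_u64_rem path in the trace.
 */
static enum div_status model_div64_u64(uint64_t dividend, uint64_t divisor,
                                       uint64_t *quot)
{
    uint32_t high = (uint32_t)(divisor >> 32);

    if (high == 0) {
        uint32_t low = (uint32_t)divisor;

        if (low == 0)
            return DIV_BY_ZERO;   /* the kernel would oops at this point */
        *quot = dividend / low;   /* 64 bit by 32 bit division */
        return DIV_OK;
    }

    /* Large divisor: the kernel uses a shift-and-correct scheme here,
     * the model just divides directly. */
    *quot = dividend / divisor;
    return DIV_OK;
}
```

As far as the register dump in comment 12 shows, the faulting bytes "f7 f6" are a "div %esi" with ESI and EDI both 0, which would match a divisor whose two 32 bit halves are zero.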
Comment 16 PaX Team 2015-02-04 23:50:02 UTC
and this one: c48cd? what happens here is a division by 0 error, i have no idea how we would cause this though, we don't change this writeback code...
Comment 17 William Waisse 2015-02-05 00:29:01 UTC
addr2line -e vmlinux -fip 00c48cd
bdi_position_ratio at /usr/src/linux/mm/page-writeback.c:824
Comment 18 PaX Team 2015-02-05 00:59:28 UTC
so, the problem occurred in this code in mm/page-writeback.c:

820         span = (thresh - bdi_thresh + 8 * write_bw) * (u64)x >> 16;
821         x_intercept = bdi_setpoint + span;
822
823         if (bdi_dirty < x_intercept - span / 4) {
824                 pos_ratio = div64_u64(pos_ratio * (x_intercept - bdi_dirty),
825                                     x_intercept - bdi_setpoint + 1);

the divisor x_intercept-bdi_setpoint+1 was 0, which means that span+1 was 0 which is kinda impossible given the above code, so i wonder what could go so wrong. can you post your kernel config (in particular, i'm wondering if you have enabled the SIZE_OVERFLOW plugin)?
Comment 19 William Waisse 2015-02-05 01:14:15 UTC
Created attachment 395562 [details]
kernel config
Comment 20 PaX Team 2015-02-05 01:33:45 UTC
so we think it's an upstream bug https://lkml.org/lkml/2014/4/29/497 that was fixed only on 64 bit archs. on 32 bit archs the function in question uses a 32 bit type (unsigned long) instead of u64 and therefore the truncation issue mentioned in the thread can very well happen.
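
A minimal sketch of how the truncation can turn "span + 1" into 0, assuming span and x_intercept are among the unsigned long locals and modelling a 32 bit unsigned long with uint32_t. The function names and the sample value are invented for the demonstration; nothing here is taken from a real crash dump.

```c
#include <stdint.h>

/* Divisor "x_intercept - bdi_setpoint + 1" when the locals are 32 bits wide.
 * span_full stands for the 64 bit result of the "(...) * (u64)x >> 16"
 * expression before it is stored into the 32 bit span. */
static uint32_t divisor_with_32bit_locals(uint32_t bdi_setpoint,
                                          uint64_t span_full)
{
    uint32_t span = (uint32_t)span_full;         /* upper bits are lost */
    uint32_t x_intercept = bdi_setpoint + span;  /* may wrap around */

    return x_intercept - bdi_setpoint + 1;       /* span + 1 modulo 2^32 */
}

/* Same expression with 64 bit locals: nothing is lost. */
static uint64_t divisor_with_64bit_locals(uint64_t bdi_setpoint,
                                          uint64_t span_full)
{
    uint64_t x_intercept = bdi_setpoint + span_full;

    return x_intercept - bdi_setpoint + 1;
}
```

With a made-up span_full of 0x1FFFFFFFF (low 32 bits all ones) the 32 bit variant returns 0 while the 64 bit variant returns 0x200000000, so only the 32 bit build would hand a zero divisor to div64_u64.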

i'm wondering, could you run a vanilla kernel just for testing and reproduce this issue there as well?
Comment 21 William Waisse 2015-02-05 02:26:49 UTC
 well I try to never use a vanilla kernel but I will do that if asked by the pax team ! 

 for the last 10 years i've only built hardened sources with grsec/pax, should I use gentoo-sources or vanilla-sources to build this insecure kernel ?
Comment 22 PaX Team 2015-02-05 02:50:35 UTC
vanilla sources would be better as kernel developers don't like to deal with bugreports for patched trees.
Comment 23 William Waisse 2015-02-05 04:21:45 UTC
ok, now running vmlinuz-3.18.5ww7_vanilla1_debug , hoping i won't be rooted by one more vanilla 0day for not running a secure grsec/pax kernel ;(
Comment 24 William Waisse 2015-02-14 18:53:36 UTC
ok , it crashed again today with 3.18.5 vanilla , full oops on :

http://pastebin.com/raw.php?i=sfvXTAEZ
Comment 25 PaX Team 2015-02-14 19:18:39 UTC
thanks, so it's as i expected it, a vanilla bug not fixed on 32 bit archs. as the (un)lucky finder, you'll have the honours of reporting it to lkml and as soon as they have a fix, we'll take it into grsec (i could go ahead and blindly change all affected variables to u64 but i'd rather have the people familiar with this code produce a proper fix). also please CC me on the lkml submission.
Comment 26 William Waisse 2015-02-14 19:24:13 UTC
(In reply to William Waisse from comment #24)
> ok , it crashed again today with 3.18.5 vanilla , full oops on :
> 
> http://pastebin.com/raw.php?i=sfvXTAEZ

posted on https://lkml.org/lkml/2015/2/14/64
(In reply to PaX Team from comment #25)
> thanks, so it's as i expected it, a vanilla bug not fixed on 32 bit archs.
> as the (un)lucky finder, you'll have the honours of reporting it to lkml and
> as soon as they have a fix, we'll take it into grsec (i could go ahead and
> blindly change all affected variables to u64 but i'd rather have the people
> familiar with this code produce a proper fix). also please CC me on the lkml
> submission.

posted on https://lkml.org/lkml/2015/2/14/64
Comment 27 William Waisse 2015-02-17 22:00:39 UTC
 ok, while waiting for the lkml to answer, I just tried to patch mm/page-writeback.c myself to avoid this division by zero, kernel booting . . . let's pray
Comment 28 William Waisse 2015-03-01 14:59:30 UTC
now getting the same problem on another server , also 32 bits : 
http://pastebin.com/Rvid0BF8

that happened after i did full updates on this server, so the kernel bug seems to be triggered by new versions of userspace programs, probably related to writing on an ext4 filesystem while using a 32 bit kernel.

 no more crashes on the server i manually "dirty patched" to avoid the divide by zero, so I'll just do the same thing on this other crashing server

 just in case someone needs to avoid those crashes, here is my diff of mm/page-writeback.c ( note the argument order: the "<" lines are the patched file, the ">" lines are the saved original )

( yes I agree this is probably a shameful dirty workaround, but i'm not exactly a kernel developer and at least I have had no more crashes since I patched that )


diff page-writeback.c page-writeback.c.save
581,584d580
<       unsigned int divisor;
<
<       divisor = limit - setpoint;
<       if (divisor < 1 ) divisor=1;
587,588c583
<               divisor);
< //                limit - setpoint + 1);
---
>                   limit - setpoint + 1);
686d680
<       unsigned long divisor;
827a822
>
829,830d823
<               divisor=x_intercept - bdi_setpoint +1;
<               if ( divisor < 1 ) divisor=1;
832c825
<                                   divisor);
---
>                                   x_intercept - bdi_setpoint + 1);
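
The idea behind the workaround above can be sketched in plain C as follows. This is a userspace illustration under my own assumptions, not a proposed kernel patch: div64_clamped and widened_divisor are hypothetical helper names, and uint32_t stands in for a 32 bit unsigned long.

```c
#include <stdint.h>

/* Hypothetical helper: never hand a zero divisor to the division.
 * A zero divisor is silently replaced by 1, as in the workaround. */
static uint64_t div64_clamped(uint64_t dividend, uint64_t divisor)
{
    if (divisor == 0)
        divisor = 1;
    return dividend / divisor;
}

/* Alternative: build the divisor in 64 bits, so that the "+ 1" cannot
 * wrap to zero when the 32 bit difference is 0xFFFFFFFF. */
static uint64_t widened_divisor(uint32_t x_intercept, uint32_t bdi_setpoint)
{
    uint32_t diff = x_intercept - bdi_setpoint;  /* difference modulo 2^32 */

    return (uint64_t)diff + 1;                   /* always in 1 .. 2^32 */
}
```

Either variant only avoids the trap. Whether the resulting pos_ratio is still meaningful when the 32 bit values have already wrapped is a separate question that someone who knows the writeback code would have to answer.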
Comment 29 William Waisse 2015-03-20 22:54:58 UTC
@pax_team since the linux kernel developers don't seem to be very interested in this divide by zero oops . . . would you mind patching that ? for now it's ok for me with my basic dirty custom patch workaround checking for divisor=0 , no more crashes. but manually patching the kernel sources every time i have to upgrade is . . . meh ;(

any ideas why I just had no answers on https://lkml.org/lkml/2015/2/14/64
even after reproducing on a vanilla kernel and providing full debug stack ? 

any ideas why this was fixed on 64 bit kernel code but not for 32 bit ?

I have to say i'm a little bit stunned by the complete silence of the lkml about this kernel oops . . .
Comment 30 PaX Team 2015-03-24 00:33:54 UTC
(In reply to William Waisse from comment #29)
> @pax_team since the linux kernel developers don't seem to be very interested
> in this divide by zero oops . . . would you mind patching that ?

the problem is that i don't know this code and what the correct patch would look like...

> any ideas why I just had no answers on https://lkml.org/lkml/2015/2/14/64
> even after reproducing on a vanilla kernel and providing full debug stack ? 

good question, maybe try to resend it every now and then and perhaps CC more people.

> any ideas why this was fixed on 64 bit kernel code but not for 32 bit ?

i guess nobody working on the fix at the time thought of 32 bit archs.
Comment 31 Anthony Basile gentoo-dev 2015-10-24 16:27:34 UTC
we've dropped support for the 3.x series of hardened-sources.  is this divide by zero problem in the 4.x series?  otherwise, i'll close this bug obsolete.
Comment 32 PaX Team 2015-10-31 23:07:43 UTC
(In reply to Anthony Basile from comment #31)
> we've dropped support for the 3.x series of hardened-sources.  is this
> divide by zero problem in the 4.x series?  otherwise, i'll close this bug
> obsolete.

yes, afaik this code survives until today and no one seems to have fixed it properly...
Comment 33 William Waisse 2015-11-01 00:41:04 UTC
(In reply to Anthony Basile from comment #31)
> we've dropped support for the 3.x series of hardened-sources.  is this
> divide by zero problem in the 4.x series?  otherwise, i'll close this bug
> obsolete.

 most probably yes since the code didn't change and no one answered me on the LKML . . . it seems the 32 bit kernel is no longer supported.

 since I reported this bug i have just used a patched kernel with the small dirty hack i posted in a previous comment ( if (divisor <1) divisor=1 ); the server was no longer crashing, just pretty slow on disk accesses. I'll try a non patched 4.x version soon to tell you if the crash/oops comes back.
Comment 34 William Waisse 2015-11-21 23:34:35 UTC
 ok I've been running the 4.1.7-hardened kernel for a few weeks now, without my dirty patch , and i have had no division by zero oops.

 most probably the divide by zero was triggered somewhere else, in a hardware, filesystem or hardware raid driver, and the bug is ( or seems until now to be ) no longer here, so you can probably close the bug.