Bug 98516 - random mm oopses on amd64/VIA K8T800
Summary: random mm oopses on amd64/VIA K8T800
Status: RESOLVED UPSTREAM
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-07-09 16:23 UTC by Andrey Kislyuk (RETIRED)
Modified: 2005-07-21 04:31 UTC
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
kernel config (reactor.config,31.25 KB, text/plain)
2005-07-09 16:25 UTC, Andrey Kislyuk (RETIRED)

Description Andrey Kislyuk (RETIRED) gentoo-dev 2005-07-09 16:23:26 UTC
I've been getting oopses (and sometimes panics) in what look like the memory
manager's page-freeing functions. This has happened on 2.6.9, 2.6.11, and
2.6.12. I cannot completely rule out faulty hardware, but x86 kernels did not
have these issues. The machine is not overclocked and does not overheat. The
oopses appear every few days, with or without adm8211 and nvidia loaded, so I
don't think those modules are at fault.

Jul  9 03:12:46 reactor Bad page state at free_hot_cold_page (in process
'kswapd0', page ffff8100018e24d8)
Jul  9 03:12:46 reactor flags:0x20000000 mapping:0000000000000000
mapcount:-16777216 count:0
Jul  9 03:12:46 reactor Backtrace:
Jul  9 03:12:46 reactor
Jul  9 03:12:46 reactor Call Trace:<ffffffff80157432>{bad_page+114}
<ffffffff80157b87>{free_hot_cold_page+135}
Jul  9 03:12:46 reactor <ffffffff80157c25>{__pagevec_free+37}
<ffffffff8015cb95>{release_pages+277}
Jul  9 03:12:46 reactor <ffffffff8015ce59>{__pagevec_release+25}
<ffffffff8015d2ee>{invalidate_mapping_pages+174}
Jul  9 03:12:46 reactor <ffffffff80189249>{shrink_icache_memory+281}
<ffffffff8015d721>{shrink_slab+193}
Jul  9 03:12:46 reactor <ffffffff8015eb04>{balance_pgdat+628}
<ffffffff8015ed87>{kswapd+295}
Jul  9 03:12:46 reactor <ffffffff80145660>{autoremove_wake_function+0}
<ffffffff80145660>{autoremove_wake_function+0}
Jul  9 03:12:46 reactor <ffffffff8010f163>{child_rip+8} <ffffffff8015ec60>{kswapd+0}
Jul  9 03:12:46 reactor <ffffffff8010f15b>{child_rip+0}
Jul  9 03:12:46 reactor Trying to fix it up, but a reboot is needed
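
The `mapcount:-16777216` figure in the message above is easier to judge as a bit pattern. The sketch below prints it as a 32-bit word and counts how many bits separate it from the value expected for an unmapped page. Two things here are assumptions, not facts from this report: that the raw counter of an unmapped page rests at -1, and (for the second reading) that the kernel logs the raw counter plus one. The helper names are made up for illustration.

```python
def to_u32(value: int) -> int:
    """Return the 32-bit two's-complement bit pattern of a signed integer."""
    return value & 0xFFFFFFFF


def differing_bits(a: int, b: int) -> int:
    """Count the bit positions in which two 32-bit words differ."""
    return bin(to_u32(a) ^ to_u32(b)).count("1")


LOGGED_MAPCOUNT = -16777216  # taken from the "Bad page state" message
RESTING_RAW = -1             # ASSUMPTION: raw counter of an unmapped page

# Reading 1: the log shows the raw counter itself.
raw_direct = LOGGED_MAPCOUNT
# Reading 2 (ASSUMPTION): the log shows the raw counter plus one.
raw_minus_one = LOGGED_MAPCOUNT - 1

for label, raw in (("raw", raw_direct), ("raw + 1", raw_minus_one)):
    print(
        f"log shows {label}: word=0x{to_u32(raw):08x}, "
        f"bits differing from resting value: {differing_bits(raw, RESTING_RAW)}"
    )
```

Under the second reading the stored word would be one bit away from its resting value. That pattern is worth noting when weighing hardware against software causes, but on its own it does not settle the question.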

_____________________________________________________________


Jul  9 14:50:19 reactor Unable to handle kernel NULL pointer dereference at
0000000000000058 RIP:
Jul  9 14:50:19 reactor <ffffffff801740e6>{try_to_release_page+38}
Jul  9 14:50:19 reactor PGD 2ad53067 PUD 1433e067 PMD 0
Jul  9 14:50:19 reactor Oops: 0000 [1]
Jul  9 14:50:19 reactor CPU 0
Jul  9 14:50:19 reactor Modules linked in: arc4 adm8211 nvidia
Jul  9 14:50:19 reactor Pid: 212, comm: kswapd0 Tainted: P    B 2.6.12-gentoo
Jul  9 14:50:19 reactor RIP: 0010:[<ffffffff801740e6>]
<ffffffff801740e6>{try_to_release_page+38}
Jul  9 14:50:19 reactor RSP: 0018:ffff81003fdb9b70  EFLAGS: 00010286
Jul  9 14:50:19 reactor RAX: 0000000000000008 RBX: ffff81003e85cb48 RCX:
ffff810001d25ee0
Jul  9 14:50:19 reactor RDX: ffff81003e85cb48 RSI: 00000000000000d0 RDI:
ffff810001d25eb8
Jul  9 14:50:19 reactor RBP: ffff810001d25eb8 R08: 0000000000000002 R09:
ffff81003fdb9a16
Jul  9 14:50:19 reactor R10: ffff81003fdb9a90 R11: 0000000000000020 R12:
ffffffff8058d6c0
Jul  9 14:50:19 reactor R13: ffff81003fdb9e38 R14: 0000000000000000 R15:
ffff81003fdb9d18
Jul  9 14:50:19 reactor FS:  0000000001655ae0(0000) GS:ffffffff806e6900(0000)
knlGS:0000000000000000
Jul  9 14:50:19 reactor CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jul  9 14:50:19 reactor CR2: 0000000000000058 CR3: 0000000003a69000 CR4:
00000000000006e0
Jul  9 14:50:19 reactor Process kswapd0 (pid: 212, threadinfo ffff81003fdb8000,
task ffff81003fcf4750)
Jul  9 14:50:19 reactor Stack: ffffffff8015e32c ffffffff8058d840
ffffffff8058d840 ffffffff8058d850
Jul  9 14:50:19 reactor 0000000100000292 ffff81003e85cb48 0000000b00000256
ffffffff00000001
Jul  9 14:50:19 reactor 0000000000000017 0000000000000099
Jul  9 14:50:19 reactor Call Trace:<ffffffff8015e32c>{shrink_zone+2780}
<ffffffff80157c25>{__pagevec_free+37}
Jul  9 14:50:19 reactor <ffffffff8015a9b6>{kmem_freepages+198}
<ffffffff8015b304>{free_block+228}
Jul  9 14:50:19 reactor <ffffffff8015b491>{cache_flusharray+113}
<ffffffff8015eae7>{balance_pgdat+599}
Jul  9 14:50:19 reactor <ffffffff8015ed87>{kswapd+295}
<ffffffff80145660>{autoremove_wake_function+0}
Jul  9 14:50:19 reactor <ffffffff80145660>{autoremove_wake_function+0}
<ffffffff8010f163>{child_rip+8}
Jul  9 14:50:19 reactor <ffffffff8015ec60>{kswapd+0} <ffffffff8010f15b>{child_rip+0}
Jul  9 14:50:19 reactor
Jul  9 14:50:19 reactor
Jul  9 14:50:19 reactor Code: 48 8b 40 50 48 85 c0 74 06 49 89 c3 41 ff e3 e9 36
ff ff ff
Jul  9 14:50:19 reactor RIP <ffffffff801740e6>{try_to_release_page+38} RSP
<ffff81003fdb9b70>
Jul  9 14:50:19 reactor CR2: 0000000000000058
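
The `Tainted: P    B` field in the oops above is a fixed-width run of flag letters, one column per flag. The sketch below decodes it with a flag table recalled from kernels of the 2.6.12 era; the table, its wording, and the function names are assumptions for illustration, not an authoritative list. If the table is right, `B` records that a bad page had already been seen, which would be consistent with the 03:12 "Bad page state" message preceding this 14:50 oops.

```python
from typing import List

# ASSUMED flag table for 2.6.12-era kernels, in column order.
TAINT_FLAGS = [
    ("P", "proprietary module loaded"),
    ("F", "module force-loaded"),
    ("S", "SMP with CPUs not designed for SMP"),
    ("R", "module force-unloaded"),
    ("M", "machine check exception occurred"),
    ("B", "bad page state encountered"),
]


def taint_field(pid_line: str) -> str:
    """Extract the fixed-width flag columns following 'Tainted: '."""
    rest = pid_line.split("Tainted: ", 1)[1]
    return rest[: len(TAINT_FLAGS)]


def decode_taint(field: str) -> List[str]:
    """Return the meaning of every flag letter set in its expected column."""
    padded = field.ljust(len(TAINT_FLAGS))
    return [
        meaning
        for (letter, meaning), ch in zip(TAINT_FLAGS, padded)
        if ch == letter
    ]


line = "Pid: 212, comm: kswapd0 Tainted: P    B 2.6.12-gentoo"
for meaning in decode_taint(taint_field(line)):
    print(meaning)
```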


Reproducible: Sometimes
Steps to Reproduce:
Comment 1 Andrey Kislyuk (RETIRED) gentoo-dev 2005-07-09 16:25:01 UTC
Created attachment 63041
kernel config
Comment 2 Daniel Drake (RETIRED) gentoo-dev 2005-07-10 02:31:05 UTC
Have you tested your memory with memtest recently?
Comment 3 Andrey Kislyuk (RETIRED) gentoo-dev 2005-07-10 11:55:31 UTC
Just ran memtest for a few hours on that machine. No errors in the standard test
configuration.
Comment 4 Andrey Kislyuk (RETIRED) gentoo-dev 2005-07-10 12:07:14 UTC
Some more data:

- Single CPU; SMP is disabled in the kernel
- Reproducible with different video cards
- Reproducible with kernel preemption on or off
- Reproducible with the ondemand governor/Cool'n'Quiet on or off
- Also happened on Fedora Core 3 (2.6.9)
- 1 GB of RAM (2x512 MB), Socket 939 (two memory controllers)
- ASUS A8V Deluxe with default BIOS configuration
Comment 5 Daniel Drake (RETIRED) gentoo-dev 2005-07-16 15:52:41 UTC
Is this reproducible on vanilla-sources-2.6.13_rc3?
Comment 6 Andrey Kislyuk (RETIRED) gentoo-dev 2005-07-18 10:23:12 UTC
(In reply to comment #5)
> Is this reproducible on vanilla-sources-2.6.13_rc3?

I haven't been able to reproduce it yet after a day of running 2.6.13-rc3. I'll
report back if it recurs, but that might take a while, as this machine is being
moved.
Comment 7 Andrey Kislyuk (RETIRED) gentoo-dev 2005-07-19 16:24:02 UTC
The bug is reproducible on 2.6.13-rc3:

Jul 19 16:22:36 reactor Unable to handle kernel NULL pointer dereference at
0000000000000050 RIP:
Jul 19 16:22:36 reactor <ffffffff80176156>{try_to_release_page+38}
Jul 19 16:22:36 reactor PGD dc4067 PUD ce9067 PMD 0
Jul 19 16:22:36 reactor Oops: 0000 [1]
Jul 19 16:22:36 reactor CPU 0
Jul 19 16:22:36 reactor Modules linked in: nvidia
Jul 19 16:22:36 reactor Pid: 215, comm: kswapd0 Tainted: P      2.6.13-rc3
Jul 19 16:22:36 reactor RIP: 0010:[<ffffffff80176156>]
<ffffffff80176156>{try_to_release_page+38}
Jul 19 16:22:36 reactor RSP: 0000:ffff81003fdb9b70  EFLAGS: 00010286
Jul 19 16:22:36 reactor RAX: 0000000000000000 RBX: ffff81003e875248 RCX:
ffff810001d03ea0
Jul 19 16:22:36 reactor RDX: ffff81003e875248 RSI: 00000000000000d0 RDI:
ffff810001d03e78
Jul 19 16:22:36 reactor RBP: ffff810001d03e78 R08: 0000000000000000 R09:
ffff81003fdb9a16
Jul 19 16:22:36 reactor R10: ffff81003fdb9a90 R11: ffffffff801d6380 R12:
ffffffff80596868
Jul 19 16:22:36 reactor R13: ffff81003fdb9e38 R14: 0000000000000000 R15:
0000000000000001
Jul 19 16:22:36 reactor FS:  0000000000524ae0(0000) GS:ffffffff806fa800(0000)
knlGS:0000000000000000
Jul 19 16:22:36 reactor CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jul 19 16:22:36 reactor CR2: 0000000000000050 CR3: 0000000000d29000 CR4:
00000000000006e0
Jul 19 16:22:36 reactor Process kswapd0 (pid: 215, threadinfo ffff81003fdb8000,
task ffff81003fcf4760)
Jul 19 16:22:36 reactor Stack: ffffffff8015f9ac ffff8100012c9d28
ffffffff805969e8 ffffffff805969e8
Jul 19 16:22:36 reactor ffffffff805969f8 ffff81003e875248 0000000e00000256
ffffffff00000000
Jul 19 16:22:36 reactor 0000000000000012 000000000000002a
Jul 19 16:22:36 reactor Call Trace:<ffffffff8015f9ac>{shrink_zone+2796}
<ffffffff80158f15>{__pagevec_free+37}
Jul 19 16:22:36 reactor <ffffffff8015e20f>{release_pages+319}
<ffffffff8015c8e4>{free_block+228}
Jul 19 16:22:36 reactor <ffffffff8015ca71>{cache_flusharray+113}
<ffffffff8016017c>{balance_pgdat+588}
Jul 19 16:22:36 reactor <ffffffff80145f12>{prepare_to_wait+66}
<ffffffff80160417>{kswapd+295}
Jul 19 16:22:36 reactor <ffffffff80145fc0>{autoremove_wake_function+0}
<ffffffff80145fc0>{autoremove_wake_function+0}
Jul 19 16:22:36 reactor <ffffffff8010f323>{child_rip+8} <ffffffff801602f0>{kswapd+0}
Jul 19 16:22:36 reactor <ffffffff8010f31b>{child_rip+0}
Jul 19 16:22:36 reactor
Jul 19 16:22:36 reactor Code: 48 8b 40 50 48 85 c0 74 06 49 89 c3 41 ff e3 e9 36
ff ff ff
Jul 19 16:22:36 reactor RIP <ffffffff80176156>{try_to_release_page+38} RSP
<ffff81003fdb9b70>
Jul 19 16:22:36 reactor CR2: 0000000000000050
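
The 2.6.12 oops in the description and the 2.6.13-rc3 oops above carry identical `Code:` bytes. The first four, `48 8b 40 50`, decode as `mov 0x50(%rax),%rax` (REX.W prefix, opcode 0x8b, ModRM 0x40, 8-bit displacement 0x50), so the faulting address should equal RAX + 0x50. The sketch below checks that against the logged CR2 values. It hand-decodes only this one instruction form and assumes the code dump begins at the faulting instruction; the function name is illustrative.

```python
CODE = "48 8b 40 50 48 85 c0 74 06 49 89 c3 41 ff e3 e9 36 ff ff ff"


def fault_address(code_hex: str, rax: int) -> int:
    """Compute the address read by a leading 'mov disp8(%rax),%reg64'."""
    code = bytes.fromhex(code_hex)
    rex, opcode, modrm, disp8 = code[0], code[1], code[2], code[3]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a 64-bit MOV r64, r/m64")
    mod, rm = modrm >> 6, modrm & 0x7
    if mod != 0b01 or rm != 0b000:
        raise ValueError("expected [rax + disp8] addressing")
    # Sign-extend the 8-bit displacement.
    disp = disp8 - 0x100 if disp8 & 0x80 else disp8
    return (rax + disp) & 0xFFFFFFFFFFFFFFFF


# (kernel, RAX from the register dump, CR2 from the oops)
oopses = [
    ("2.6.12-gentoo", 0x0000000000000008, 0x0000000000000058),
    ("2.6.13-rc3", 0x0000000000000000, 0x0000000000000050),
]

for kernel, rax, cr2 in oopses:
    addr = fault_address(CODE, rax)
    print(f"{kernel}: RAX + 0x50 = {addr:#x}, CR2 = {cr2:#x}, match = {addr == cr2}")
```

In both cases RAX holds a small value (8 and 0) rather than a kernel pointer, which suggests that the pointer loaded just before this instruction was already NULL or corrupt when try_to_release_page used it.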
Comment 8 Daniel Drake (RETIRED) gentoo-dev 2005-07-21 04:31:04 UTC
Looks like an upstream bug then. Please report this at
http://bugzilla.kernel.org and post the new bug URL here.