Bug 197483 - [2.6.23 regression] Accessing invalid mmapped memory from gdb causes lockup
Summary: [2.6.23 regression] Accessing invalid mmapped memory from gdb causes lockup
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: AMD64 Linux
Importance: High major
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard: linux-2.6.23-regression
Keywords: InVCS
Depends on:
Blocks:
 
Reported: 2007-10-30 02:17 UTC by Duane Griffin
Modified: 2007-11-27 15:04 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
dmesg (dmesg,239.88 KB, text/plain)
2007-10-30 02:20 UTC, Duane Griffin
.config (config-2.6.23.1,41.00 KB, text/plain)
2007-10-30 02:21 UTC, Duane Griffin

Description Duane Griffin 2007-10-30 02:17:18 UTC
While investigating this bug:
http://bugs.gentoo.org/show_bug.cgi?id=197191

I happened to test the example given there under gdb. After the SIGBUS I went up a couple of frames into the main function, then did:

print ((char *) file)[8192]

This locked up my machine. More precisely, gdb appears to sit spinning in a *really* tight loop. Sending kill -9 to the test program being debugged, to gdb, or to both together does not kill it. SysRq-T shows no stack trace for gdb, although all other tasks are shown fine, including the traced test program. Both 2.6.23.1 and 2.6.24-rc1-g82798a17 show the same behaviour.
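
For readers without the linked report to hand, the kind of memory access involved can be sketched in C. This is not the test program from bug 197191. It is a minimal sketch, assuming the SIGBUS comes from touching a page of a shared file mapping that lies wholly beyond the end of the file. The helper names map_short_file and read_raises_sigbus, the temporary file path and the file length are all hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Jump target used to recover from SIGBUS inside read_raises_sigbus(). */
static sigjmp_buf fault_env;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1);
}

/*
 * Hypothetical helper: create a temporary file of file_len bytes and map
 * map_len bytes of it read-only. Returns MAP_FAILED on any error.
 */
char *map_short_file(size_t file_len, size_t map_len)
{
    char path[] = "/tmp/mmap-eof-XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return MAP_FAILED;
    unlink(path);                           /* file vanishes once unmapped */
    if (ftruncate(fd, (off_t)file_len) != 0) {
        close(fd);
        return MAP_FAILED;
    }
    char *p = mmap(NULL, map_len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                              /* the mapping keeps the file alive */
    return p;
}

/*
 * Hypothetical helper: read one byte at p[offset].
 * Returns 1 if the read raised SIGBUS, 0 if it completed normally.
 */
int read_raises_sigbus(const volatile char *p, size_t offset)
{
    struct sigaction sa, old;
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGBUS, &sa, &old);

    if (sigsetjmp(fault_env, 1) != 0) {
        /* We arrive here from the handler: the read faulted. */
        sigaction(SIGBUS, &old, NULL);
        return 1;
    }
    (void)p[offset];                        /* volatile forces the load */
    sigaction(SIGBUS, &old, NULL);
    return 0;
}
```

With 4096-byte pages, calling map_short_file(4096, 3 * 4096) and then reading offset 8192 touches the same offset as the gdb expression above. When the program performs that read itself, the expected result is a SIGBUS, not a hang; the lockup described here only appeared when gdb made the access on the program's behalf.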
Comment 1 Duane Griffin 2007-10-30 02:20:37 UTC
Created attachment 134669
dmesg

dmesg from latest nightly git snapshot showing the bug.

Note the three SysRq-T traces. The first was before the invalid memory access. The second after the access but before the tasks were killed. The last after sending kill -9 to both gdb and the test program.
Comment 2 Duane Griffin 2007-10-30 02:21:43 UTC
Created attachment 134671
.config

.config (for 2.6.23.1)
Comment 3 Duane Griffin 2007-10-30 02:34:47 UTC
Just tested on 2.6.22.1 where this does NOT occur (gdb happily prints '\0'). So it looks like a regression.

I'll start bisecting tomorrow.
Comment 4 Duane Griffin 2007-10-31 00:46:45 UTC
I've bisected it down to this commit:

54cb8821de07f2ffcd28c380ce9b93d5784b40d7
"mm: merge populate and nopage into fault (fixes nonlinear)"

I've reported it upstream to LKML and Nick Piggin, the author of the commit.
Comment 5 Duane Griffin 2007-11-03 00:52:00 UTC
A fix has been committed to Linus' tree with commit:
5307cc1aa53850f017c8053db034cf950b670ac9

The same patch has been queued for the 2.6.23-stable tree.
Comment 6 Daniel Drake (RETIRED) gentoo-dev 2007-11-03 11:33:38 UTC
Thanks a lot for digging into this and the other bug report; I will include that patch with the next revision. And yes, that bug report was very well worded, congratulations on the best-bug-report-of-all-time award :)
Comment 7 Duane Griffin 2007-11-03 12:26:59 UTC
Thanks! I shall treasure it always ;)
Comment 8 Daniel Drake (RETIRED) gentoo-dev 2007-11-27 15:04:23 UTC
This was fixed in gentoo-sources-2.6.23-r2.