Bug 416685 - Linux Page Allocator: __pte_alloc_kernel() does not honor gfp flags passed to vmalloc()
Summary: Linux Page Allocator: __pte_alloc_kernel() does not honor gfp flags passed to...
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: Normal critical
Assignee: Richard Yao (RETIRED)
URL: https://bugzilla.kernel.org/show_bug....
Whiteboard:
Keywords: Bug, PATCH
Depends on:
Blocks: 410461
Reported: 2012-05-20 05:40 UTC by Richard Yao (RETIRED)
Modified: 2012-09-18 23:34 UTC
CC List: 7 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Prasad Gajanan Joshi's 2011-03-14 patch from the RedHat bug tracker (propage-gfp-setting.patch,54.16 KB, patch)
2012-05-20 05:44 UTC, Richard Yao (RETIRED)
Updated patch for Linux 3.4 (linux-3.4-vm-gfp-pass.patch,49.58 KB, patch)
2012-06-07 01:36 UTC, Richard Yao (RETIRED)
Patch against Linux 3.5-rc1 (linux-3.5-vm-gfp-pass.patch,51.49 KB, patch)
2012-06-07 03:09 UTC, Richard Yao (RETIRED)
Updated patch against Linux 3.5.0 (linux-3.5.0-honor-gfp-flags.patch,51.60 KB, patch)
2012-07-30 00:32 UTC, Richard Yao (RETIRED)

Description Richard Yao (RETIRED) gentoo-dev 2012-05-20 05:40:24 UTC
The ZFS/Lustre developers discovered that __pte_alloc_kernel() does not honor the gfp flags passed to vmalloc()[0], but they opted to set PF_MEMALLOC in their code to work around it. That has caused problems[1], so we need a proper fix.

There has been a patch for it sitting in RedHat's bug tracker[2] for more than a year, but no one is doing anything to upstream it. Is there anything we can do?

0: http://marc.info/?l=linux-mm&m=128942194520631&w=4
1: https://github.com/zfsonlinux/spl/issues/116
2: https://bugzilla.kernel.org/show_bug.cgi?id=30702
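
For readers unfamiliar with the issue, here is a minimal userspace sketch of the failure mode described above. It is not kernel code: every MODEL_* constant and model_* function is a hypothetical stand-in with illustrative values, and the model only tracks whether an allocation step would be allowed to re-enter the filesystem during reclaim. It assumes the page-table step hard-codes a GFP_KERNEL-like mask, which is the behavior this bug reports.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for gfp bits; the values are illustrative only. */
#define MODEL_GFP_WAIT   0x1u  /* allocation may sleep and enter reclaim */
#define MODEL_GFP_IO     0x2u  /* reclaim may start block I/O */
#define MODEL_GFP_FS     0x4u  /* reclaim may call back into the filesystem */
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_WAIT | MODEL_GFP_IO)

/* Hypothetical stand-in for the PF_MEMALLOC task flag. */
#define MODEL_PF_MEMALLOC 0x1u

unsigned int model_task_flags;

/* True if an allocation with this mask could recurse into the filesystem.
 * A task marked MODEL_PF_MEMALLOC is modeled as never entering reclaim. */
bool model_may_enter_fs(unsigned int gfp)
{
    if (model_task_flags & MODEL_PF_MEMALLOC)
        return false;
    return (gfp & MODEL_GFP_FS) != 0;
}

/* Models the reported behavior: the caller's mask is dropped and a
 * GFP_KERNEL-like mask is used for the page-table allocation. */
bool model_pte_alloc_buggy(unsigned int caller_gfp)
{
    (void)caller_gfp;
    return model_may_enter_fs(MODEL_GFP_KERNEL);
}

/* Models the idea behind the attached patch: propagate the caller's mask. */
bool model_pte_alloc_fixed(unsigned int caller_gfp)
{
    return model_may_enter_fs(caller_gfp);
}

/* True if any step of the modeled vmalloc() could recurse into the filesystem. */
bool model_vmalloc_may_recurse(unsigned int gfp, bool fixed)
{
    bool data_pages = model_may_enter_fs(gfp);
    bool page_tables = fixed ? model_pte_alloc_fixed(gfp)
                             : model_pte_alloc_buggy(gfp);
    return data_pages || page_tables;
}

/* Walks through the three cases discussed in this bug. */
void model_check(void)
{
    /* Caller asks for NOFS, but the page-table step ignores it. */
    assert(model_vmalloc_may_recurse(MODEL_GFP_NOFS, false));

    /* With the mask propagated, NOFS is honored end to end. */
    assert(!model_vmalloc_may_recurse(MODEL_GFP_NOFS, true));

    /* PF_MEMALLOC-style workaround: hides the problem even on the buggy path. */
    model_task_flags |= MODEL_PF_MEMALLOC;
    assert(!model_vmalloc_may_recurse(MODEL_GFP_NOFS, false));
    model_task_flags &= ~MODEL_PF_MEMALLOC;
}
```

Calling model_check() from a small main() exercises all three cases. The model deliberately leaves out the side effects of the PF_MEMALLOC workaround, which are what [1] reports.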
Comment 1 Richard Yao (RETIRED) gentoo-dev 2012-05-20 05:44:15 UTC
Created attachment 312319
Prasad Gajanan Joshi's 2011-03-14 patch from the RedHat bug tracker
Comment 2 Mike Pagano gentoo-dev 2012-05-25 14:46:31 UTC
I will follow this upstream bug and backport any approved patch that makes it into Linus's tree.
Comment 3 Richard Yao (RETIRED) gentoo-dev 2012-05-26 10:14:45 UTC
In that case, I am going to reassign this bug to myself. If I am not mistaken, the Linux 3.5 merge window is open. I want to get this fixed before it closes.
Comment 4 Richard Yao (RETIRED) gentoo-dev 2012-06-07 01:36:53 UTC
Created attachment 314493
Updated patch for Linux 3.4

I have updated Prasad Gajanan Joshi's patch from the RedHat bug tracker for Linux 3.4. I plan to send it upstream for review by more experienced kernel developers.
Comment 5 Richard Yao (RETIRED) gentoo-dev 2012-06-07 03:09:42 UTC
Created attachment 314501
Patch against Linux 3.5-rc1

I am attaching a patch against Linux 3.5-rc1. This patch has a proper GIT commit message.
Comment 6 Giuseppe Vitillaro 2012-06-14 09:58:31 UTC
We met on the FreeNode #zfsonlinux channel (Orfheo) and you suggested that I test this patch under Gentoo with the vanilla 3.4.0 kernel.

I did and, besides a minor problem applying the patch, I was able to compile the vanilla 3.4.0 kernel with your patch 314493 on an old machine (P4 with around 2Gb), where you thought the problem was present.

It looks like the patch fixes the problem with both the vanilla and the gentoo-sources 3.4.0 kernels (it applies to both with the same minor problem): the machine held up against my test for hours, whereas before it would hang after a short while.

This was with updated zfs and spl (from your overlay) and a PREEMPT kernel.

Hope it helps.
Comment 7 Giuseppe Vitillaro 2012-07-05 10:56:42 UTC
Tested the same setup as in my previous comment 6 with a NO PREEMPT 3.4.0 kernel, with the same patch, on the same machine. No hangs, no errors in dmesg.
Comment 8 Jordi Marqués 2012-07-13 16:05:28 UTC
Patching sys-kernel/openvz-sources-2.6.32.53.5 with the patch from comment 1 of this bug results in http://pastebin.com/VN0gWWVN .
Comment 9 Richard Yao (RETIRED) gentoo-dev 2012-07-30 00:32:44 UTC
Created attachment 319656
Updated patch against Linux 3.5.0

I still need to find time to implement support for 6 architectures implemented in mainline, but I am posting an updated patch against Linux 3.5.0. This eliminates fuzz when applying against sys-kernel/vanilla-sources-3.5.0.
Comment 10 Richard Yao (RETIRED) gentoo-dev 2012-08-01 15:34:49 UTC
I have rewritten the parts of sys-fs/spl that caused ZFS to require this patch and opened an upstream pull request:

https://github.com/zfsonlinux/spl/pull/147

I will merge those patches into a revision bump of sys-fs/spl unless a serious regression is found in the next week.
Comment 11 Richard Yao (RETIRED) gentoo-dev 2012-08-06 18:36:51 UTC
(In reply to comment #10)
> I have rewritten the parts of sys-fs/spl that caused ZFS to require this
> patch and opened an upstream pull request:
> 
> https://github.com/zfsonlinux/spl/pull/147
> 
> I will merge those patches into a revision bump of sys-fs/spl unless a
> serious regression is found in the next week.

I am afraid that patch was found to introduce external memory fragmentation issues in practice, so I am forced to withdraw it.
Comment 12 Richard Yao (RETIRED) gentoo-dev 2012-08-31 07:04:10 UTC
A workaround that eliminates the need for a kernel patch has been committed to ZFSOnLinux GIT. It will be in the 0.6.0-rc11 release within the next few weeks.

Those who wish to use this sooner are welcome to use the sys-kernel/spl-9999, sys-fs/zfs-kmod-9999 and sys-fs/zfs-9999 ebuilds.
Comment 13 Richard Yao (RETIRED) gentoo-dev 2012-09-18 23:34:03 UTC
0.6.0-rc11 has been committed to portage. It eliminates the need to fix this kernel issue.