summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorAndrew Barry <abarry@cray.com>2011-05-24 17:12:52 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2011-05-25 08:39:36 -0700
commitcfa54a0fcfc1017c6f122b6f21aaba36daa07f71 (patch)
treec6bcc41b79475854254384b7b4912a2101364183
parenta539f3533b78e39a22723d6d3e1e11b6c14454d9 (diff)
downloadop-kernel-dev-cfa54a0fcfc1017c6f122b6f21aaba36daa07f71.zip
op-kernel-dev-cfa54a0fcfc1017c6f122b6f21aaba36daa07f71.tar.gz
mm/page_alloc.c: prevent unending loop in __alloc_pages_slowpath()
I believe I found a problem in __alloc_pages_slowpath, which allows a process to get stuck endlessly looping, even when lots of memory is available. Running an I/O and memory intensive stress-test I see a 0-order page allocation with __GFP_IO and __GFP_WAIT, running on a system with very little free memory. Right about the same time that the stress-test gets killed by the OOM-killer, the utility trying to allocate memory gets stuck in __alloc_pages_slowpath even though most of the systems memory was freed by the oom-kill of the stress-test. The utility ends up looping from the rebalance label down through the wait_iff_congested continiously. Because order=0, __alloc_pages_direct_compact skips the call to get_page_from_freelist. Because all of the reclaimable memory on the system has already been reclaimed, __alloc_pages_direct_reclaim skips the call to get_page_from_freelist. Since there is no __GFP_FS flag, the block with __alloc_pages_may_oom is skipped. The loop hits the wait_iff_congested, then jumps back to rebalance without ever trying to get_page_from_freelist. This loop repeats infinitely. The test case is pretty pathological. Running a mix of I/O stress-tests that do a lot of fork() and consume all of the system memory, I can pretty reliably hit this on 600 nodes, in about 12 hours. 32GB/node. Signed-off-by: Andrew Barry <abarry@cray.com> Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: Rik van Riel<riel@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r--mm/page_alloc.c2
1 files changed, 1 insertions, 1 deletions
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 10a8c6d..2a00f17 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2106,6 +2106,7 @@ restart:
first_zones_zonelist(zonelist, high_zoneidx, NULL,
&preferred_zone);
+rebalance:
/* This is the last chance, in general, before the goto nopage. */
page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS,
@@ -2113,7 +2114,6 @@ restart:
if (page)
goto got_pg;
-rebalance:
/* Allocate without watermarks if the context allows */
if (alloc_flags & ALLOC_NO_WATERMARKS) {
page = __alloc_pages_high_priority(gfp_mask, order,
OpenPOWER on IntegriCloud