| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are now a number of accounting oddities such as mapped file pages
being accounted for on the node while the total number of file pages are
accounted on the zone. This can be coped with to some extent but it's
confusing so this patch moves the relevant file-based accounted. Due to
throttling logic in the page allocator for reliable OOM detection, it is
still necessary to track dirty and writeback pages on a per-zone basis.
[mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting]
Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net
Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This moves the LRU lists from the zone to the node and related data such
as counters, tracing, congestion tracking and writeback tracking.
Unfortunately, due to reclaim and compaction retry logic, it is
necessary to account for the number of LRU pages on both zone and node
logic. Most reclaim logic is based on the node counters but the retry
logic uses the zone counters which do not distinguish inactive and
active sizes. It would be possible to leave the LRU counters on a
per-zone basis but it's a heavier calculation across multiple cache
lines that is much more frequent than the retry checks.
Other than the LRU counters, this is mostly a mechanical patch but note
that it introduces a number of anomalies. For example, the scans are
per-zone but using per-node counters. We also mark a node as congested
when a zone is congested. This causes weird problems that are fixed
later but is easier to review.
In the event that there is excessive overhead on 32-bit systems due to
the nodes being on LRU then there are two potential solutions
1. Long-term isolation of highmem pages when reclaim is lowmem
When pages are skipped, they are immediately added back onto the LRU
list. If lowmem reclaim persisted for long periods of time, the same
highmem pages get continually scanned. The idea would be that lowmem
keeps those pages on a separate list until a reclaim for highmem pages
arrives that splices the highmem pages back onto the LRU. It potentially
could be implemented similar to the UNEVICTABLE list.
That would reduce the skip rate with the potential corner case is that
highmem pages have to be scanned and reclaimed to free lowmem slab pages.
2. Linear scan lowmem pages if the initial LRU shrink fails
This will break LRU ordering but may be preferable and faster during
memory pressure than skipping LRU pages.
Link: http://lkml.kernel.org/r/1467970510-21195-4-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
| |
The NULL checking here doesn't make sense, so it causes a static checker
warning. It turns out that p->mm can't be NULL so the inconsistency is
harmless and we should just remove the check.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, lowmemorykiller (LMK) is using TIF_MEMDIE for two purposes.
One is to remember processes killed by LMK, and the other is to
accelerate termination of processes killed by LMK.
But since LMK is invoked as a memory shrinker function, there still
should be some memory available. It is very likely that memory
allocations by processes killed by LMK will succeed without using
ALLOC_NO_WATERMARKS via TIF_MEMDIE. Even if their allocations cannot
escape from memory allocation loop unless they use ALLOC_NO_WATERMARKS,
lowmem_deathpending_timeout can guarantee forward progress by choosing
next victim process.
On the other hand, mark_oom_victim() assumes that it must be called with
oom_lock held and it must not be called after oom_killer_disable() was
called. But LMK is calling it without holding oom_lock and checking
oom_killer_disabled. It is possible that LMK calls mark_oom_victim()
due to allocation requests by kernel threads after current thread
returned from oom_killer_disabled(). This will break synchronization
for PM/suspend.
This patch introduces per a task_struct flag for remembering processes
killed by LMK, and replaces TIF_MEMDIE with that flag. By applying this
patch, assumption by mark_oom_victim() becomes true.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Arve Hjonnevag <arve@android.com>
Cc: Riley Andrews <riandrews@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
| |
Specifically:
lowmemorykiller.c:53: CHECK: use a blank line after enum declarations
lowmemorykiller.c:60: CHECK: use a blank line after enum declarations
Signed-off-by: Sandeep Jain <sandeepjain.linux@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lowmemorykiller debug messages are inscrutable and mostly useful
for debugging the lowmemorykiller, not explaining why a process
was killed. Make the messages more useful by prefixing them
with "lowmemorykiller: " and explaining in more readable terms
what was killed, who it was killed for, and why it was killed.
The messages now look like:
[ 76.997631] lowmemorykiller: Killing 'droid.gallery3d' (2172), adj 1000,
[ 76.997635] to free 27436kB on behalf of 'kswapd0' (29) because
[ 76.997638] cache 122624kB is below limit 122880kB for oom_score_adj 1000
[ 76.997641] Free memory is -53356kB above reserved
A negative number for free memory above reserved means some of the
reserved memory has been used and is being regenerated by kswapd,
which is likely what called the shrinkers.
Cc: Android Kernel Team <kernel-team@android.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Colin Cross <ccross@android.com>
[jstultz: Minor checkpatch tweaks]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
| |
Fix alingment issues by properly indenting function arguments
in accordance with the kernel coding style.
Signed-off-by: Ioana Ciornei <ciorneiioana@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
| |
This patch makes use of the preferred kernel types such
as u16, u32.
Signed-off-by: Ioana Ciornei <ciorneiioana@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Kconfig currently controlling compilation of this code is:
drivers/staging/android/Kconfig:config ANDROID_LOW_MEMORY_KILLER
drivers/staging/android/Kconfig: bool "Android Low Memory Killer"
...meaning that it currently is not being built as a module by anyone.
Lets remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.
Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.
We replace module.h with init.h and moduleparam.h ; the latter since
this file was previously implicitly relying on getting that header.
We also delete the MODULE_LICENSE tag etc. since all that information
is already contained at the top of the file in the comments.
Cc: "Arve Hjønnevåg" <arve@android.com>
Cc: Riley Andrews <riandrews@android.com>
Cc: devel@driverdev.osuosl.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was observed that setting TIF_MEMDIE before sending SIGKILL at
oom_kill_process() allows memory reserves to be depleted by allocations
which are not needed for terminating the OOM victim.
This patch reverts commit 6bc2b856bb7c ("staging: android: lowmemorykiller:
set TIF_MEMDIE before send kill sig"), for oom_kill_process() was updated
to send SIGKILL before setting TIF_MEMDIE.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver updates from Greg KH:
"Here's the big, really big, staging tree patches for 4.2-rc1.
Loads of stuff in here, almost all just coding style fixes / churn,
and a few new drivers as well, one of which I just disabled from the
build a few minutes ago due to way too many build warnings.
Other than the one "disable this driver" patch, all of these have been
in linux-next for quite a while with no reported issues"
* tag 'staging-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (1163 commits)
staging: wilc1000: disable driver due to build warnings
Staging: rts5208: fix CHANGE_LINK_STATE value
Staging: sm750fb: ddk750_swi2c.c: Insert spaces before parenthesis
Staging: sm750fb: ddk750_swi2c.c: Place braces on correct lines
Staging: sm750fb: ddk750_swi2c.c: Insert spaces around operators
Staging: sm750fb: ddk750_swi2c.c: Replace spaces with tabs
Staging: sm750fb: ddk750_swi2c.h: Shorten lines to under 80 characters
Staging: sm750fb: ddk750_swi2c.h: Replace spaces with tabs
Staging: sm750fb: modedb.h: Shorten lines to under 80 characters
Staging: sm750fb: modedb.h: Replace spaces with tabs
staging: comedi: addi_apci_3120: rename 'this_board' variables
staging: comedi: addi_apci_1516: rename 'this_board' variables
staging: comedi: ni_atmio: cleanup ni_getboardtype()
staging: comedi: vmk80xx: sanity check context used to get the boardinfo
staging: comedi: vmk80xx: rename 'boardinfo' variables
staging: comedi: dt3000: rename 'this_board' variables
staging: comedi: adv_pci_dio: rename 'this_board' variables
staging: comedi: cb_pcidas64: rename 'thisboard' variables
staging: comedi: cb_pcidas: rename 'thisboard' variables
staging: comedi: me4000: rename 'thisboard' variables
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
TIF_MEMDIE should not be set on a process if it does not have a valid
->mm, and this is protected by task_lock().
If TIF_MEMDIE gets set after the mm has detached, and the process fails to
exit, then the oom killer will defer forever waiting for it to exit.
Make sure that the mm is still valid before setting TIF_MEMDIE by way of
mark_tsk_oom_victim().
Cc: "Arve Hjønnevåg" <arve@android.com>
Cc: Riley Andrews <riandrews@android.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rename unmark_oom_victim() to exit_oom_victim(). Marking and unmarking
are related in functionality, but the interface is not symmetrical at
all: one is an internal OOM killer function used during the killing, the
other is for an OOM victim to signal its own death on exit later on.
This has locking implications, see follow-up changes.
While at it, rename mark_tsk_oom_victim() to mark_oom_victim(), which
is easier on the eye.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patchset addresses a race which was described in the changelog for
5695be142e20 ("OOM, PM: OOM killed task shouldn't escape PM suspend"):
: PM freezer relies on having all tasks frozen by the time devices are
: getting frozen so that no task will touch them while they are getting
: frozen. But OOM killer is allowed to kill an already frozen task in order
: to handle OOM situtation. In order to protect from late wake ups OOM
: killer is disabled after all tasks are frozen. This, however, still keeps
: a window open when a killed task didn't manage to die by the time
: freeze_processes finishes.
The original patch hasn't closed the race window completely because that
would require a more complex solution as it can be seen by this patchset.
The primary motivation was to close the race condition between OOM killer
and PM freezer _completely_. As Tejun pointed out, even though the race
condition is unlikely the harder it would be to debug weird bugs deep in
the PM freezer when the debugging options are reduced considerably. I can
only speculate what might happen when a task is still runnable
unexpectedly.
On a plus side and as a side effect the oom enable/disable has a better
(full barrier) semantic without polluting hot paths.
I have tested the series in KVM with 100M RAM:
- many small tasks (20M anon mmap) which are triggering OOM continually
- s2ram which resumes automatically is triggered in a loop
echo processors > /sys/power/pm_test
while true
do
echo mem > /sys/power/state
sleep 1s
done
- simple module which allocates and frees 20M in 8K chunks. If it sees
freezing(current) then it tries another round of allocation before calling
try_to_freeze
- debugging messages of PM stages and OOM killer enable/disable/fail added
and unmark_oom_victim is delayed by 1s after it clears TIF_MEMDIE and before
it wakes up waiters.
- rebased on top of the current mmotm which means some necessary updates
in mm/oom_kill.c. mark_tsk_oom_victim is now called under task_lock but
I think this should be OK because __thaw_task shouldn't interfere with any
locking down wake_up_process. Oleg?
As expected there are no OOM killed tasks after oom is disabled and
allocations requested by the kernel thread are failing after all the tasks
are frozen and OOM disabled. I wasn't able to catch a race where
oom_killer_disable would really have to wait but I kinda expected the race
is really unlikely.
[ 242.609330] Killed process 2992 (mem_eater) total-vm:24412kB, anon-rss:2164kB, file-rss:4kB
[ 243.628071] Unmarking 2992 OOM victim. oom_victims: 1
[ 243.636072] (elapsed 2.837 seconds) done.
[ 243.641985] Trying to disable OOM killer
[ 243.643032] Waiting for concurent OOM victims
[ 243.644342] OOM killer disabled
[ 243.645447] Freezing remaining freezable tasks ... (elapsed 0.005 seconds) done.
[ 243.652983] Suspending console(s) (use no_console_suspend to debug)
[ 243.903299] kmem_eater: page allocation failure: order:1, mode:0x204010
[...]
[ 243.992600] PM: suspend of devices complete after 336.667 msecs
[ 243.993264] PM: late suspend of devices complete after 0.660 msecs
[ 243.994713] PM: noirq suspend of devices complete after 1.446 msecs
[ 243.994717] ACPI: Preparing to enter system sleep state S3
[ 243.994795] PM: Saving platform NVS memory
[ 243.994796] Disabling non-boot CPUs ...
The first 2 patches are simple cleanups for OOM. They should go in
regardless the rest IMO.
Patches 3 and 4 are trivial printk -> pr_info conversion and they should
go in ditto.
The main patch is the last one and I would appreciate acks from Tejun and
Rafael. I think the OOM part should be OK (except for __thaw_task vs.
task_lock where a look from Oleg would appreciated) but I am not so sure I
haven't screwed anything in the freezer code. I have found several
surprises there.
This patch (of 5):
This patch is just a preparatory and it doesn't introduce any functional
change.
Note:
I am utterly unhappy about lowmemory killer abusing TIF_MEMDIE just to
wait for the oom victim and to prevent from new killing. This is
just a side effect of the flag. The primary meaning is to give the oom
victim access to the memory reserves and that shouldn't be necessary
here.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
With ZRAM enabled it is observed that lowmemory killer
doesn't trigger properly. swap cached pages are
accounted in NR_FILE, and lowmemorykiller considers
this as reclaimable and adds to other_file. But these
pages can't be reclaimed unless lowmemorykiller triggers.
So subtract swap pages from other_file.
Signed-off-by: Vinayak Menon <vinayakm.list@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
| |
Set TIF_MEMDIE tsk_thread flag before send kill signal to the
selected thread. This is to fit a usual code sequence and avoid
potential race issue.
Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Convert the driver shrinkers to the new API. Most changes are compile
tested only because I either don't have the hardware or it's staging
stuff.
FWIW, the md and android code is pretty good, but the rest of it makes me
want to claw my eyes out. The amount of broken code I just encountered is
mind boggling. I've added comments explaining what is broken, but I fear
that some of the code would be best dealt with by being dragged behind the
bike shed, burying in mud up to it's neck and then run over repeatedly
with a blunt lawn mower.
Special mention goes to the zcache/zcache2 drivers. They can't co-exist
in the build at the same time, they are under different menu options in
menuconfig, they only show up when you've got the right set of mm
subsystem options configured and so even compile testing is an exercise in
pulling teeth. And that doesn't even take into account the horrible,
broken code...
[glommer@openvz.org: fixes for i915, android lowmem, zcache, bcache]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Glauber Costa <glommer@openvz.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
| |
Add "lowmemorykiller:" prefix to the debug print so it's easier to
analyse LMK's debug output:
dmesg | grep lowmemorykiller
Signed-off-by: Dmitry Voytik <dvv.kernel@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The select...to kill messages are not very useful when not debugging
the lowmemorykiller itself. After the change to check TIF_MEMDIE
instead of using a task notifer this message can also get very
noisy.
Cc: Android Kernel Team <kernel-team@android.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The amount of reserved memory varies between devices. Subtract it
here to reduce the amount of devices specific tuning needed for the
minfree values.
Cc: Android Kernel Team <kernel-team@android.com>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The maximum oom_score_adj is 1000 and the minimum oom_score_adj is -1000,
so this range can be represented by the signed short type with no
functional change. The extra space this frees up in struct signal_struct
will be used for per-thread oom kill flags in the next patch.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The task handoff notifier leaks task_struct since it never gets freed
after the callback returns NOTIFY_OK, which means it is responsible for
doing so.
It turns out the lowmemorykiller actually doesn't need this notifier at
all. It's used to prevent unnecessary killing by waiting for a thread
to exit as a result of lowmem_shrink(), however, it's possible to do
this in the same way the kernel oom killer works by setting TIF_MEMDIE
and avoid killing if we're still waiting for it to exit.
The kernel oom killer will already automatically set TIF_MEMDIE for
threads that are attempting to allocate memory that have a fatal signal.
The thread selected by lowmem_shrink() will have such a signal after the
lowmemorykiller sends it a SIGKILL, so this won't result in an
unnecessary use of memory reserves for the thread to exit.
This has the added benefit that we don't have to rely on
CONFIG_PROFILING to prevent needlessly killing tasks.
Reported-by: Werner Landgraf <w.landgraf@ru.ru>
Cc: stable@vger.kernel.org
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
| |
Fix compiler warning about the type of the module parameter.
Cc: San Mehat <san@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The lowmemorykiller registers an atomic notifier for notfication of when
the task is freed. From this atomic notifier callback, it removes the
atomic notifier via task_free_unregister(). This is incorrect because
atomic_notifier_chain_unregister() calls syncronize_rcu(), which can
sleep, which shouldn't be done from an atomic notifier.
Fix this by registering the notifier during init, and only unregister it
if the lowmemorykiller is unloaded.
Rebased to -next by Paul E. McKenney.
Rebased to -next again by Anton Vorontsov.
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Christian Bejram <christian.bejram@stericsson.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
| |
/proc/pid/oom_adj is deprecated and will be removed in August 2012
according to Documentation/feature-removal-schedule.txt. Convert its
usage in the lowmemorykiller to use the new interface, oom_score_adj,
instead.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This was done to resolve some merge issues with the following files that
had changed in both branches:
drivers/staging/rtl8712/rtl871x_sta_mgt.c
drivers/staging/tidspbridge/rmgr/drv_interface.c
drivers/staging/zcache/zcache-main.c
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
LMK should not try killing kernel threads.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
task->signal == NULL is not possible, so no need for these checks.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
LMK should not directly check for task->mm. The reason is that the
process' threads may exit or detach its mm via use_mm(), but other
threads may still have a valid mm. To catch this we use
find_lock_task_mm(), which walks up all threads and returns an
appropriate task (with lock held).
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Grabbing tasklist_lock has its disadvantages, i.e. it blocks
process creation and destruction. If there are lots of processes,
blocking doesn't sound as a great idea.
For LMK, it is sufficient to surround tasks list traverse with
rcu_read_{,un}lock().
>From now on using force_sig() is not safe, as it can race with an
already exiting task, so we use send_sig() now. As a downside, it
won't kill PID namespace init processes, but that's not what we
want anyway.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|/
|
|
|
|
|
|
|
|
|
|
| |
process to die
If a process forked and the child process was killed by the
lowmemorykiller, the lowmemory killer would be disabled until
the parent process reaped the child or it died itself.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
| |
This patch fixes some 80 chatacters limit warnings in the lowmemorykiller.c file
Signed-off-by: Marco Navarra <fromenglish@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
| |
The arguments to shrink functions have changed, update
lowmem_shrink to match.
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
| |
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
| |
Signed-off-by: Colin Cross <ccross@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
Now that we're murder-synchronous, this code path will never be
called (and if it does, it doesn't tell us anything useful other
than we killed a task that was already being killed by somebody
else but hadn't gotten its' signal yet)
Signed-off-by: San Mehat <san@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch optimizes lowmemkiller to not do any work when it has an outstanding
kill-request. This greatly reduces the pressure on the task_list lock
(improving interactivity), as well as improving the vmscan performance
when under heavy memory pressure (by up to 20x in tests).
Note: For this enhancement to work, you need CONFIG_PROFILING
Signed-off-by: San Mehat <san@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Under certain circumstances, a process can take awhile to
handle a sig-kill (especially if it's in a scheduler group with
a very low share ratio). When this occurs, lowmemkiller returns
to vmscan indicating the process memory has been freed - even
though the process is still waiting to die. Since the memory
hasn't actually freed, lowmemkiller is called again shortly after,
and picks the same process to die; regardless of the fact that
it has already been 'scheduled' to die and the memory has already
been reported to vmscan as having been freed.
Solution is to check fatal_signal_pending() on the selected
task, and if it's already pending destruction return; indicating
to vmscan that no resources were freed on this pass.
Signed-off-by: San Mehat <san@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
| |
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit b0a0ccfad85b3657fe999805df65f5cfe634ab8a.
Turns out I was wrong, we want these in the tree.
Note, I've disabled the drivers from the build at the moment, so other
patches can be applied to fix some build issues due to internal api
changes since the code was removed from the tree.
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Brian Swetland <swetland@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
| |
These drivers are no longer being developed and the original authors
seem to have abandonded them and hence, do not want them in the mainline
kernel tree.
So sad :(
Cc: Brian Swetland <swetland@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move module_params to near the end of the source file so that
their references are already known/defined. Fixes build errors:
drivers/staging/android/lowmemorykiller.c: In function '__check_cost':
drivers/staging/android/lowmemorykiller.c:60: error: 'lowmem_shrinker' undeclared (first use in this function)
drivers/staging/android/lowmemorykiller.c: At top level:
drivers/staging/android/lowmemorykiller.c:60: error: 'lowmem_shrinker' undeclared here (not in a function)
drivers/staging/android/lowmemorykiller.c:60: warning: type defaults to 'int' in declaration of 'type name'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
| |
Move the lowmemorykiller.txt into the actual source file which is really the
correct place for it, and delete lowmemorykiller.txt.
Signed-off-by: Daniel Walker <dwalker@fifo99.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Brian Swetland <swetland@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
I moved the struct shrinker down so that the predefine isn't needed.
Signed-off-by: Daniel Walker <dwalker@fifo99.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Brian Swetland <swetland@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
| |
from task_struct to mm_struct"
I'm about to merge "oom: move oom_adj value from task_struct to
mm_struct", and this fixup is needed to repair linux-next's
drivers/staging/android/lowmemorykiller.c.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This cleans up the last of the checkpatch warnings in the android
lowmemorykiller driver.
Cc: San Mehat <san@android.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NULL pointer
get_mm_rss() atomically dereferences the actual without checking for a
NULL pointer, which is possible since task_lock() is not held.
Cc: San Mehat <san@android.com>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the code in lowmem_shrink() for the Android low memory killer so
that it follows the kernel coding style.
It's unnecessary to check for p->oomkilladj >= min_adj if the selected
task's oomkilladj score is stored since get_mm_rss() will always be
greater than zero.
Cc: San Mehat <san@android.com>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the specified limit by itself
From: Arve Hjønnevåg <arve@android.com>
This allows processes to be killed when the kernel evict cache pages in
an attempt to get more contiguous free memory.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Cc: David Rientjes <rientjes@google.com>
Cc: San Mehat <san@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use NR_ACTIVE plus NR_INACTIVE as a size estimate for our fake cache
instead the sum of rss. Neither method is accurate.
Also skip the process scan, if the amount of memory available is above
the largest threshold set.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Cc: David Rientjes <rientjes@google.com>
Cc: San Mehat <san@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|