diff options
author | Glauber Costa <glommer@parallels.com> | 2012-05-29 15:07:11 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-05-29 16:22:28 -0700 |
commit | 3f134619393cb6c6dfab7890a617d0ceca6d05d7 (patch) | |
tree | 39e05b42c99189cd4496e61a3e16107e065b0f04 /include/net/sock.h | |
parent | 3afe36b1fe7d1e3f66752bb9548a763942f3a104 (diff) | |
download | op-kernel-dev-3f134619393cb6c6dfab7890a617d0ceca6d05d7.zip op-kernel-dev-3f134619393cb6c6dfab7890a617d0ceca6d05d7.tar.gz |
memcg: decrement static keys at real destroy time
We call the destroy function when a cgroup starts to be removed, such as
by a rmdir event.
However, because of our reference counters, some objects are still
inflight. Right now, we are decrementing the static_keys at destroy()
time, meaning that if we get rid of the last static_key reference, some
objects will still have charges, but the code to properly uncharge them
won't be run.
This becomes a problem specially if it is ever enabled again, because now
new charges will be added to the staled charges making keeping it pretty
much impossible.
We just need to be careful with the static branch activation: since there
is no particular preferred order of their activation, we need to make sure
that we only start using it after all call sites are active. This is
achieved by having a per-memcg flag that is only updated after
static_key_slow_inc() returns. At this time, we are sure all sites are
active.
This is made per-memcg, not global, for a reason: it also has the effect
of making socket accounting more consistent. The first memcg to be
limited will trigger static_key() activation, therefore, accounting. But
all the others will then be accounted no matter what. After this patch,
only limited memcgs will have its sockets accounted.
[akpm@linux-foundation.org: move enum sock_flag_bits into sock.h,
document enum sock_flag_bits,
convert memcg_proto_active() and memcg_proto_activated() to test_bit(),
redo tcp_update_limit() comment to 80 cols]
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'include/net/sock.h')
-rw-r--r-- | include/net/sock.h | 22 |
1 files changed, 22 insertions, 0 deletions
diff --git a/include/net/sock.h b/include/net/sock.h index d89f058..4a45216 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -46,6 +46,7 @@ #include <linux/list_nulls.h> #include <linux/timer.h> #include <linux/cache.h> +#include <linux/bitops.h> #include <linux/lockdep.h> #include <linux/netdevice.h> #include <linux/skbuff.h> /* struct sk_buff */ @@ -921,12 +922,23 @@ struct proto { #endif }; +/* + * Bits in struct cg_proto.flags + */ +enum cg_proto_flags { + /* Currently active and new sockets should be assigned to cgroups */ + MEMCG_SOCK_ACTIVE, + /* It was ever activated; we must disarm static keys on destruction */ + MEMCG_SOCK_ACTIVATED, +}; + struct cg_proto { void (*enter_memory_pressure)(struct sock *sk); struct res_counter *memory_allocated; /* Current allocated memory. */ struct percpu_counter *sockets_allocated; /* Current number of sockets. */ int *memory_pressure; long *sysctl_mem; + unsigned long flags; /* * memcg field is used to find which memcg we belong directly * Each memcg struct can hold more than one cg_proto, so container_of @@ -942,6 +954,16 @@ struct cg_proto { extern int proto_register(struct proto *prot, int alloc_slab); extern void proto_unregister(struct proto *prot); +static inline bool memcg_proto_active(struct cg_proto *cg_proto) +{ + return test_bit(MEMCG_SOCK_ACTIVE, &cg_proto->flags); +} + +static inline bool memcg_proto_activated(struct cg_proto *cg_proto) +{ + return test_bit(MEMCG_SOCK_ACTIVATED, &cg_proto->flags); +} + #ifdef SOCK_REFCNT_DEBUG static inline void sk_refcnt_debug_inc(struct sock *sk) { |