Revert "defer call to mem_cgroup_sk_alloc()"

This patch effectively reverts commit 9f1c2674b328 ("net: memcontrol: defer call to mem_cgroup_sk_alloc()").

Moving mem_cgroup_sk_alloc() to inet_csk_accept() completely breaks memcg socket memory accounting: packets received before the memcg pointer is initialized are not accounted, which causes a refcounting underflow on socket release.

The use-after-free problem was in fact already fixed for the cgroup pointer by commit c0576e397508 ("net: call cgroup_sk_alloc() earlier in sk_clone_lock()").

So, let's revert it and call mem_cgroup_sk_alloc() just before cgroup_sk_alloc(). This is safe, as we hold a reference to the socket we're cloning, and it holds a reference to the memcg.

Also, let's drop the BUG_ON(mem_cgroup_is_root()) check from mem_cgroup_sk_alloc(). I see no reason why bumping the root memcg counter should cause a panic, and there are no realistic ways to hit it.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
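
The accounting underflow described in the message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `toy_memcg`, `toy_sock`, and the `toy_*` functions are hypothetical stand-ins, not kernel structures or APIs, and the real charging paths are considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct toy_memcg { long charged_pages; };
struct toy_sock  { struct toy_memcg *memcg; long rmem_pages; };

/* Charge receive memory to the memcg only if the socket already has one. */
static void toy_receive(struct toy_sock *sk, long pages)
{
    assert(sk != NULL);
    sk->rmem_pages += pages;
    if (sk->memcg)
        sk->memcg->charged_pages += pages;
}

/* On release, uncharge everything the socket holds from its memcg. */
static void toy_release(struct toy_sock *sk)
{
    assert(sk != NULL);
    if (sk->memcg)
        sk->memcg->charged_pages -= sk->rmem_pages;
    sk->rmem_pages = 0;
}

/* Returns the memcg counter after one receive/accept/release cycle. */
static long toy_run(int assign_at_clone)
{
    struct toy_memcg cg = { 0 };
    struct toy_sock child = { NULL, 0 };

    if (assign_at_clone)
        child.memcg = &cg;      /* memcg known before any packet arrives */
    toy_receive(&child, 4);     /* data arrives before accept() */
    if (!assign_at_clone)
        child.memcg = &cg;      /* deferred assignment at accept() time */
    toy_release(&child);
    return cg.charged_pages;    /* 0 when balanced, negative on underflow */
}
```

In this model, `toy_run(0)` (deferred assignment) leaves the counter at -4 because the release uncharges pages that were never charged, while `toy_run(1)` (assignment at clone time) leaves it at 0.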
author: Roman Gushchin <guro@fb.com> 2018-02-02 15:26:57 +0000
committer: David S. Miller <davem@davemloft.net> 2018-02-02 19:49:31 -0500
commit: edbe69ef2c90fc86998a74b08319a01c508bd497 (patch)
tree: ae886133adf7f3518a47f81db0644fe74732b80b /mm
parent: 4db428a7c9ab07e08783e0fcdc4ca0f555da0567 (diff)
1 files changed, 14 insertions, 0 deletions
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 0ae2dc3..0937f2c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5747,6 +5747,20 @@ void mem_cgroup_sk_alloc(struct sock *sk)
 	if (!mem_cgroup_sockets_enabled)
 		return;
 
+	/*
+	 * Socket cloning can throw us here with sk_memcg already
+	 * filled. It won't however, necessarily happen from
+	 * process context. So the test for root memcg given
+	 * the current task's memcg won't help us in this case.
+	 *
+	 * Respecting the original socket's memcg is a better
+	 * decision in this case.
+	 */
+	if (sk->sk_memcg) {
+		css_get(&sk->sk_memcg->css);
+		return;
+	}
+
 	rcu_read_lock();
 	memcg = mem_cgroup_from_task(current);
 	if (memcg == root_mem_cgroup)
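
The early-return branch in the hunk above can be modeled in user space to show why the clone takes its own reference. This is a minimal sketch, assuming a plain integer in place of the css reference count; `toy_memcg`, `toy_sock`, and the `toy_*` functions are hypothetical names, and the root-memcg and online checks of the real function are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; the kernel's css refcount is more involved. */
struct toy_memcg { int refs; };
struct toy_sock  { struct toy_memcg *memcg; };

/* Models the allocation path: a clone arrives with memcg already set. */
static void toy_sk_alloc(struct toy_sock *sk, struct toy_memcg *task_memcg)
{
    assert(sk != NULL);
    if (sk->memcg) {            /* cloned socket: keep the parent's memcg */
        sk->memcg->refs++;      /* the clone needs its own reference */
        return;
    }
    if (task_memcg) {           /* fresh socket: use the caller's memcg */
        task_memcg->refs++;
        sk->memcg = task_memcg;
    }
}

/* Drops the reference that the socket holds, if any. */
static void toy_sk_free(struct toy_sock *sk)
{
    assert(sk != NULL);
    if (sk->memcg) {
        sk->memcg->refs--;
        sk->memcg = NULL;
    }
}

/* Allocate a listener, clone it, free both; returns the remaining refs. */
static int toy_clone_cycle(void)
{
    struct toy_memcg cg = { 0 };
    struct toy_sock listener = { NULL };
    struct toy_sock child;

    toy_sk_alloc(&listener, &cg);   /* refs == 1 */
    child = listener;               /* clone copies the pointer, no new ref yet */
    toy_sk_alloc(&child, NULL);     /* refs == 2; no useful task memcg here */
    toy_sk_free(&child);            /* refs == 1 */
    toy_sk_free(&listener);         /* refs == 0 */
    return cg.refs;
}
```

Without the increment in the clone branch, the second `toy_sk_free()` in this model would drive `refs` below zero, which is the kind of imbalance the added `css_get()` call avoids.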