net: allocate skbs on local node

Commit b30973f877 (node-aware skb allocation) spread a bad habit of
allocating net driver skbs on one fixed memory node: the one closest to
the NIC hardware. This is wrong because as soon as we try to scale the
network stack, we need many cpus to handle traffic, and we hit slub/slab
management costs on cross-node allocations/frees when these cpus have to
alloc/free skbs bound to a central node.

skbs allocated in the RX path are ephemeral; they have a very short
lifetime, so the extra cost of maintaining NUMA affinity is too high.
What appeared to be a nice idea four years ago is in fact a bad one.

In 2010, NIC hardware is multiqueue, or we use RPS to spread the load,
and two 10Gb NICs might deliver more than 28 million packets per second,
needing all the available cpus.

The cost of cross-node handling in the network and vm stacks outweighs
the small benefit the hardware had when doing its DMA transfer into its
'local' memory node at RX time. Even trying to differentiate the two
allocations done for one skb (the sk_buff on the local node, the data
part on the NIC hardware node) is not enough to bring good performance.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
author: Eric Dumazet <eric.dumazet@gmail.com> 2010-10-11 19:05:25 +0000
committer: David S. Miller <davem@davemloft.net> 2010-10-16 11:13:19 -0700
commit: 564824b0c52c34692d804bb6ea214451615b0b50 (patch)
tree: d836fa51848026df74e2bec2b634f1fcf3c6d02f /net/core
parent: 6f0333b8fde44b8c04a53b2461504f0e8f1cebe6 (diff)
1 files changed, 1 insertions, 12 deletions
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 752c197..4e8b82e 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -247,10 +247,9 @@ EXPORT_SYMBOL(__alloc_skb);
 struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 		unsigned int length, gfp_t gfp_mask)
 {
-	int node = dev->dev.parent ? dev_to_node(dev->dev.parent) : -1;
 	struct sk_buff *skb;
 
-	skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask, 0, node);
+	skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask, 0, NUMA_NO_NODE);
 	if (likely(skb)) {
 		skb_reserve(skb, NET_SKB_PAD);
 		skb->dev = dev;
@@ -259,16 +258,6 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 }
 EXPORT_SYMBOL(__netdev_alloc_skb);
 
-struct page *__netdev_alloc_page(struct net_device *dev, gfp_t gfp_mask)
-{
-	int node = dev->dev.parent ? dev_to_node(dev->dev.parent) : -1;
-	struct page *page;
-
-	page = alloc_pages_node(node, gfp_mask, 0);
-	return page;
-}
-EXPORT_SYMBOL(__netdev_alloc_page);
-
 void skb_add_rx_frag(struct sk_buff *skb, int i, struct page *page, int off,
 		int size)
 {
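
The effect of passing NUMA_NO_NODE instead of the NIC's node can be sketched in user-space C. This is a toy model, not kernel code: `resolve_node()` and `count_remote_allocs()` are hypothetical helpers invented for illustration, and the model assumes that a "no preference" request is served from the memory node of the CPU doing the allocation. It only counts how many CPUs would allocate from a node other than their own.

```c
#include <stddef.h>

/* Mirrors the kernel constant: -1 means "no node preference". */
#define NUMA_NO_NODE (-1)

/*
 * Hypothetical helper: resolve the node an allocation is served from.
 * Assumption of this sketch: with NUMA_NO_NODE the allocator uses the
 * node local to the allocating CPU; otherwise it honors the request.
 */
int resolve_node(int requested_node, int local_node)
{
	return requested_node == NUMA_NO_NODE ? local_node : requested_node;
}

/*
 * Hypothetical helper: count the CPUs whose allocation would land on a
 * remote node. cpu_node[i] is the memory node that CPU i belongs to.
 */
size_t count_remote_allocs(const int *cpu_node, size_t ncpus,
			   int requested_node)
{
	size_t remote = 0;

	for (size_t i = 0; i < ncpus; i++) {
		if (resolve_node(requested_node, cpu_node[i]) != cpu_node[i])
			remote++;
	}
	return remote;
}
```

With four CPUs split evenly across nodes 0 and 1 and a NIC attached to node 0, the old behavior corresponds to `requested_node = 0`, where the two CPUs on node 1 allocate remotely. The new behavior corresponds to `requested_node = NUMA_NO_NODE`, where none do.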