op-kernel-dev - Development kernel branch for OpenPOWER systems

author	Alexander Duyck <alexander.h.duyck@redhat.com>	2015-05-06 21:11:40 -0700
committer	David S. Miller <davem@davemloft.net>	2015-05-12 10:39:26 -0400
commit	9451980a6646ed487efce04a9df28f450935683e (patch)
tree	9a02485449df01c521dc7eb11879e212517b549f /mm
parent	b396cca6fafccf16206a5d041d59c9e6b65b6f5a (diff)

net: Use cached copy of pfmemalloc to avoid accessing page

While testing I found that the testing for pfmemalloc in build_skb was rather expensive. I found the issue to be two-fold. First we have to get from the virtual address to the head page and that comes at the cost of something like 11 cycles. Then there is the cost for reading pfmemalloc out of the head page which can be cache cold due to the fact that put_page_testzero is likely invalidating the cache-line on one or more CPUs as the fragments can be shared.

To avoid this extra expense I have added a pfmemalloc member to the netdev_alloc_cache. I then pushed pieces of __alloc_rx_skb into __napi_alloc_skb and __netdev_alloc_skb so that I could rewrite them to make use of the cached pfmemalloc value. The result is that my perf traces show a reduction from 9.28% overhead to 3.7% for the code covered by build_skb, __alloc_rx_skb, and __napi_alloc_skb when performing a test with the packet being dropped instead of being handed to napi_gro_receive.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
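
The idea described above can be illustrated with a small userspace sketch. This is not the kernel code from the patch: the types and functions below (sketch_page, sketch_frag_cache, sketch_skb, sketch_cache_refill, sketch_alloc_skb) are hypothetical stand-ins, and the 4096-byte page size is an assumption. It only shows the general pattern of copying the pfmemalloc flag into the fragment cache once per page refill, so that each per-packet allocation reads the cache rather than the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only what this sketch needs. */
struct sketch_page {
    bool pfmemalloc;            /* page came from emergency reserves */
    unsigned char data[4096];   /* assumed page size */
};

/* Hypothetical stand-in for the per-CPU fragment cache. */
struct sketch_frag_cache {
    struct sketch_page *page;   /* page currently carved into fragments */
    size_t offset;              /* next free byte in page->data */
    bool pfmemalloc;            /* cached copy, written once per refill */
};

/* Hypothetical stand-in for the part of an skb that matters here. */
struct sketch_skb {
    void *head;
    bool pfmemalloc;
};

/* Slow path: runs once per page, so reading the page here is cheap
 * relative to the number of fragments it will serve. */
static void sketch_cache_refill(struct sketch_frag_cache *nc,
                                struct sketch_page *page)
{
    nc->page = page;
    nc->offset = 0;
    nc->pfmemalloc = page->pfmemalloc;
}

/* Fast path: runs once per packet. It only computes an address inside
 * the page and reads fields of the cache; no page field is loaded. */
static bool sketch_alloc_skb(struct sketch_frag_cache *nc, size_t len,
                             struct sketch_skb *skb)
{
    assert(nc->page != NULL);

    if (len > sizeof(nc->page->data) - nc->offset)
        return false;           /* caller would refill the cache */

    skb->head = nc->page->data + nc->offset;
    nc->offset += len;
    skb->pfmemalloc = nc->pfmemalloc;   /* cached copy, not page->pfmemalloc */
    return true;
}
```

In this sketch the flag travels with the cache that the allocating CPU already owns, which mirrors the motivation given in the commit message: the page's own cache line may be cold because other CPUs touch its reference count.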

Diffstat (limited to 'mm')

0 files changed, 0 insertions, 0 deletions

