summaryrefslogtreecommitdiffstats
path: root/include/uapi
diff options
context:
space:
mode:
authorWillem de Bruijn <willemb@google.com>2013-03-19 10:18:11 +0000
committerDavid S. Miller <davem@davemloft.net>2013-03-19 17:15:04 -0400
commit77f65ebdca506870d99bfabe52bde222511022ec (patch)
tree8f5ba6c76d1b49b44128d08281cc0b6f3e62285c /include/uapi
parentb0aa73bf081da6810dacd750b9f8186640e172db (diff)
downloadop-kernel-dev-77f65ebdca506870d99bfabe52bde222511022ec.zip
op-kernel-dev-77f65ebdca506870d99bfabe52bde222511022ec.tar.gz
packet: packet fanout rollover during socket overload
Changes: v3->v2: rebase (no other changes) passes selftest v2->v1: read f->num_members only once fix bug: test rollover mode + flag Minimize packet drop in a fanout group. If one socket is full, roll over packets to another from the group. Maintain flow affinity during normal load using an rxhash fanout policy, while dispersing unexpected traffic storms that hit a single cpu, such as spoofed-source DoS flows. Rollover breaks affinity for flows arriving at saturated sockets during those conditions. The patch adds a fanout policy ROLLOVER that rotates between sockets, filling each socket before moving to the next. It also adds a fanout flag ROLLOVER. If passed along with any other fanout policy, the primary policy is applied until the chosen socket is full. Then, rollover selects another socket, to delay packet drop until the entire system is saturated. Probing sockets is not free. Selecting the last used socket, as rollover does, is a greedy approach that maximizes chance of success, at the cost of extreme load imbalance. In practice, with sufficiently long queues to absorb bursts, sockets are drained in parallel and load balance looks uniform in `top`. To avoid contention, scales counters with number of sockets and accesses them lockfree. Values are bounds checked to ensure correctness. Tested using an application with 9 threads pinned to CPUs, one socket per thread and sufficient busywork per packet operation to limits each thread to handling 32 Kpps. When sent 500 Kpps single UDP stream packets, a FANOUT_CPU setup processes 32 Kpps in total without this patch, 270 Kpps with the patch. Tested with read() and with a packet ring (V1). Also, passes psock_fanout.c unit test added to selftests. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'include/uapi')
-rw-r--r--include/uapi/linux/if_packet.h2
1 files changed, 2 insertions, 0 deletions
diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
index f9a6037..8136658 100644
--- a/include/uapi/linux/if_packet.h
+++ b/include/uapi/linux/if_packet.h
@@ -55,6 +55,8 @@ struct sockaddr_ll {
#define PACKET_FANOUT_HASH 0
#define PACKET_FANOUT_LB 1
#define PACKET_FANOUT_CPU 2
+#define PACKET_FANOUT_ROLLOVER 3
+#define PACKET_FANOUT_FLAG_ROLLOVER 0x1000
#define PACKET_FANOUT_FLAG_DEFRAG 0x8000
struct tpacket_stats {
OpenPOWER on IntegriCloud