summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorNeal Cardwell <ncardwell@google.com>2015-01-28 20:01:39 -0500
committerDavid S. Miller <davem@davemloft.net>2015-01-28 22:18:38 -0800
commitd6b1a8a92a1417f8859a6937d2e6ffe2dfab4e6d (patch)
tree59f39d492be4f8ee202ca1bc1cb50612ed033edc
parent9cd981dcf174d26805a032aefa791436da709bee (diff)
downloadop-kernel-dev-d6b1a8a92a1417f8859a6937d2e6ffe2dfab4e6d.zip
op-kernel-dev-d6b1a8a92a1417f8859a6937d2e6ffe2dfab4e6d.tar.gz
tcp: fix timing issue in CUBIC slope calculation
This patch fixes a bug in CUBIC that causes cwnd to increase slightly too slowly when multiple ACKs arrive in the same jiffy. If cwnd is supposed to increase at a rate of more than once per jiffy, then CUBIC was sometimes too slow. Because the bic_target is calculated for a future point in time, calculated with time in jiffies, the cwnd can increase over the course of the jiffy while the bic_target calculated as the proper CUBIC cwnd at time t=tcp_time_stamp+rtt does not increase, because tcp_time_stamp only increases on jiffy tick boundaries. So since the cnt is set to: ca->cnt = cwnd / (bic_target - cwnd); as cwnd increases but bic_target does not increase due to jiffy granularity, the cnt becomes too large, causing cwnd to increase too slowly. For example: - suppose at the beginning of a jiffy, cwnd=40, bic_target=44 - so CUBIC sets: ca->cnt = cwnd / (bic_target - cwnd) = 40 / (44 - 40) = 40/4 = 10 - suppose we get 10 acks, each for 1 segment, so tcp_cong_avoid_ai() increases cwnd to 41 - so CUBIC sets: ca->cnt = cwnd / (bic_target - cwnd) = 41 / (44 - 41) = 41 / 3 = 13 So now CUBIC will wait for 13 packets to be ACKed before increasing cwnd to 42, insted of 10 as it should. The fix is to avoid adjusting the slope (determined by ca->cnt) multiple times within a jiffy, and instead skip to compute the Reno cwnd, the "TCP friendliness" code path. Reported-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-rw-r--r--net/ipv4/tcp_cubic.c8
1 files changed, 8 insertions, 0 deletions
diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index ffc045d..4b276d1 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -213,6 +213,13 @@ static inline void bictcp_update(struct bictcp *ca, u32 cwnd, u32 acked)
(s32)(tcp_time_stamp - ca->last_time) <= HZ / 32)
return;
+ /* The CUBIC function can update ca->cnt at most once per jiffy.
+ * On all cwnd reduction events, ca->epoch_start is set to 0,
+ * which will force a recalculation of ca->cnt.
+ */
+ if (ca->epoch_start && tcp_time_stamp == ca->last_time)
+ goto tcp_friendliness;
+
ca->last_cwnd = cwnd;
ca->last_time = tcp_time_stamp;
@@ -280,6 +287,7 @@ static inline void bictcp_update(struct bictcp *ca, u32 cwnd, u32 acked)
if (ca->last_max_cwnd == 0 && ca->cnt > 20)
ca->cnt = 20; /* increase cwnd 5% per RTT */
+tcp_friendliness:
/* TCP Friendly */
if (tcp_friendliness) {
u32 scale = beta_scale;
OpenPOWER on IntegriCloud