diff options
author | hiren <hiren@FreeBSD.org> | 2015-12-08 21:21:48 +0000 |
---|---|---|
committer | hiren <hiren@FreeBSD.org> | 2015-12-08 21:21:48 +0000 |
commit | ce99ff457034b7469b60219795e7060d07487217 (patch) | |
tree | 9eb6bcbf3f79eac6e639762885f0b2b16cd27f4a /sys/netinet/tcp_sack.c | |
parent | 46f91637f062edde183dc8bdb88b69cc66024259 (diff) | |
download | FreeBSD-src-ce99ff457034b7469b60219795e7060d07487217.zip FreeBSD-src-ce99ff457034b7469b60219795e7060d07487217.tar.gz |
One of the ways to detect loss is to count duplicate acks coming back from the
other end till it reaches predetermined threshold which is 3 for us right now.
Once that happens, we trigger fast-retransmit to do loss recovery.
Main problem with the current implementation is that we don't honor SACK
information well to detect whether an incoming ack is a dupack or not. RFC6675
has latest recommendations for that. According to it, dupack is a segment that
arrives carrying a SACK block that identifies previously unknown information
between snd_una and snd_max even if it carries new data, changes the advertised
window, or moves the cumulative acknowledgment point.
With the prevalence of Selective ACK (SACK) these days, improper handling can
lead to delayed loss recovery.
With the fix, new behavior looks like following:
0) th_ack < snd_una --> ignore
Old acks are ignored.
1) th_ack == snd_una, !sack_changed --> ignore
Acks with SACK enabled but without any new SACK info in them are ignored.
2) th_ack == snd_una, window == old_window --> increment
Increment on a good dupack.
3) th_ack == snd_una, window != old_window, sack_changed --> increment
When SACK enabled, it's okay to have advertized window changed if the ack has
new SACK info.
4) th_ack > snd_una --> reset to 0
Reset to 0 when left edge moves.
5) th_ack > snd_una, sack_changed --> increment
Increment if left edge moves but there is new SACK info.
Here, sack_changed is the indicator that incoming ack has previously unknown
SACK info in it.
Note: This fix is not fully compliant to RFC6675. That may require a few
changes to current implementation in order to keep per-sackhole dupack counter
and change to the way we mark/handle sack holes.
PR: 203663
Reviewed by: jtl
MFC after: 3 weeks
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D4225
Diffstat (limited to 'sys/netinet/tcp_sack.c')
-rw-r--r-- | sys/netinet/tcp_sack.c | 18 |
1 files changed, 14 insertions, 4 deletions
diff --git a/sys/netinet/tcp_sack.c b/sys/netinet/tcp_sack.c index f9c71c8..82e2251 100644 --- a/sys/netinet/tcp_sack.c +++ b/sys/netinet/tcp_sack.c @@ -345,17 +345,22 @@ tcp_sackhole_remove(struct tcpcb *tp, struct sackhole *hole) * Process cumulative ACK and the TCP SACK option to update the scoreboard. * tp->snd_holes is an ordered list of holes (oldest to newest, in terms of * the sequence space). + * Returns 1 if incoming ACK has previously unknown SACK information, + * 0 otherwise. Note: We treat (snd_una, th_ack) as a sack block so any changes + * to that (i.e. left edge moving) would also be considered a change in SACK + * information which is slightly different than rfc6675. */ -void +int tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tcp_seq th_ack) { struct sackhole *cur, *temp; struct sackblk sack, sack_blocks[TCP_MAX_SACK + 1], *sblkp; - int i, j, num_sack_blks; + int i, j, num_sack_blks, sack_changed; INP_WLOCK_ASSERT(tp->t_inpcb); num_sack_blks = 0; + sack_changed = 0; /* * If SND.UNA will be advanced by SEG.ACK, and if SACK holes exist, * treat [SND.UNA, SEG.ACK) as if it is a SACK block. @@ -392,7 +397,7 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tcp_seq th_ack) * received. */ if (num_sack_blks == 0) - return; + return (sack_changed); /* * Sort the SACK blocks so we can update the scoreboard with just one @@ -443,6 +448,7 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tcp_seq th_ack) tp->snd_fack = sblkp->end; /* Go to the previous sack block. */ sblkp--; + sack_changed = 1; } else { /* * We failed to add a new hole based on the current @@ -459,9 +465,11 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tcp_seq th_ack) SEQ_LT(tp->snd_fack, sblkp->end)) tp->snd_fack = sblkp->end; } - } else if (SEQ_LT(tp->snd_fack, sblkp->end)) + } else if (SEQ_LT(tp->snd_fack, sblkp->end)) { /* fack is advanced. */ tp->snd_fack = sblkp->end; + sack_changed = 1; + } /* We must have at least one SACK hole in scoreboard. */ KASSERT(!TAILQ_EMPTY(&tp->snd_holes), ("SACK scoreboard must not be empty")); @@ -490,6 +498,7 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tcp_seq th_ack) tp->sackhint.sack_bytes_rexmit -= (cur->rxmit - cur->start); KASSERT(tp->sackhint.sack_bytes_rexmit >= 0, ("sackhint bytes rtx >= 0")); + sack_changed = 1; if (SEQ_LEQ(sblkp->start, cur->start)) { /* Data acks at least the beginning of hole. */ if (SEQ_GEQ(sblkp->end, cur->end)) { @@ -545,6 +554,7 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tcp_seq th_ack) else sblkp--; } + return (sack_changed); } /* |