path: root/sys/netinet/cc
Commit message | Author | Age | Files | Lines
* Merge r261590: Fixup for r261590 (vnet sysctl handlers cleanup) | glebius | 2014-03-04 | 1 | -7/+2
* Import an implementation of the CAIA Delay-Gradient (CDG) congestion control | lstewart | 2013-07-02 | 1 | -0/+695
  algorithm, which is based on the 2011 v0.1 patch release and described in the paper "Revisiting TCP Congestion Control using Delay Gradients" by David Hayes and Grenville Armitage. It is implemented as a kernel module compatible with the modular congestion control framework.

  CDG is a hybrid congestion control algorithm which reacts to both packet loss and inferred queuing delay. It attempts to operate as a delay-based algorithm where possible, but utilises heuristics to detect loss-based TCP cross traffic and will compete effectively as required. CDG is therefore incrementally deployable and suitable for use on shared networks.

  In collaboration with: David Hayes <david.hayes at ieee.org> and Grenville Armitage <garmitage at swin edu au>
  MFC after: 4 days
  Sponsored by: Cisco University Research Program and FreeBSD Foundation
* Staticize malloc types. | pluknet | 2011-04-13 | 4 | -8/+4
  Approved by: lstewart
  MFC after: 1 week
* Use the full and proper company name for Swinburne University of Technology | lstewart | 2011-04-12 | 9 | -59/+61
  throughout the source tree.

  Requested by: Grenville Armitage, Director of CAIA at Swinburne University of Technology
  MFC after: 3 days
* Algorithm modules can define their own private congestion signal types in the | lstewart | 2011-02-01 | 1 | -0/+4
  top 8 bits of the 32 bit signal bit field space for internal use. These private signals should not be leaked outside of a module. Given that many algorithm modules use the NewReno hook functions to simplify their implementation, the obvious place such a leak would show up is in the NewReno cong_signal hook function.

  - Show the full number of significant bits in the signal type definitions in <netinet/cc.h>.
  - Add a bitmask to simplify figuring out if a given signal is in the private or public bit range.
  - Add a sanity check in newreno_cong_signal() to ensure private signals are not being leaked into the hook function.

  Sponsored by: FreeBSD Foundation
  Discussed with: David Hayes <dahayes at swin edu au>
  MFC after: 1 week
  X-MFC with: r215166
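The private/public split described in this commit can be sketched in userspace C as follows. All `EX_` and `ex_` names and the specific bit values are illustrative assumptions; the real definitions live in <netinet/cc.h>, and the kernel's check in newreno_cong_signal() is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; consult <netinet/cc.h> for the real ones. */
#define EX_SIG_ECN      0x00000001u /* example public signal */
#define EX_SIG_RTO      0x00000002u /* example public signal */
#define EX_SIG_PRIVMASK 0xFF000000u /* top 8 of 32 bits: module-private */
#define EX_SIG_MYPRIV   0x01000000u /* hypothetical private signal */

/* True if any bit of the signal falls in the private (top 8 bit) range. */
static bool
ex_sig_is_private(uint32_t type)
{
	return ((type & EX_SIG_PRIVMASK) != 0);
}

/*
 * Stand-in for a shared hook function: it refuses private signals, which
 * mirrors the idea of the sanity check described above. Returns the
 * signal it accepted.
 */
static uint32_t
ex_shared_cong_signal(uint32_t type)
{
	assert(!ex_sig_is_private(type));
	return (type);
}
```

A module would handle `EX_SIG_MYPRIV` internally and pass only public signals on to the shared hook.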
* Fix typo in comment: "course" -> "coarse" | lstewart | 2011-02-01 | 1 | -1/+1
  Sponsored by: FreeBSD Foundation
  Submitted by: jmallett
  MFC after: 3 months
  X-MFC with: r218152
* Import an implementation of the CAIA-Hamilton-Delay (CHD) congestion control | lstewart | 2011-02-01 | 1 | -0/+497
  algorithm described in the paper "Improved coexistence and loss tolerance for delay based TCP congestion control" by Hayes and Armitage. It is implemented as a kernel module compatible with the recently committed modular congestion control framework.

  CHD enhances the approach taken by the Hamilton-Delay (HD) algorithm to provide tolerance to non-congestion related packet loss and improvements to coexistence with loss-based congestion control algorithms. A key idea in improving coexistence with loss-based congestion control algorithms is the use of a shadow window, which attempts to track how NewReno's congestion window (cwnd) would evolve. At the next packet loss congestion event, CHD uses the shadow window to correct cwnd in a way that reduces the amount of unfairness CHD experiences when competing with loss-based algorithms.

  In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au>
  Sponsored by: FreeBSD Foundation
  Reviewed by: bz and others along the way
  MFC after: 3 months
* Import a clean-room implementation of the Hamilton-Delay (HD) congestion control | lstewart | 2011-02-01 | 1 | -0/+254
  algorithm based on the paper "A strategy for fair coexistence of loss and delay-based congestion control algorithms" by Budzisz, Stanojevic, Shorten and Baker. It is implemented as a kernel module compatible with the recently committed modular congestion control framework.

  HD uses a probabilistic approach to reacting to delay-based congestion. The probability of reducing cwnd is zero when the queuing delay is very small, increasing to a maximum at a set threshold, then back down to zero again when the queuing delay is high. Normal operation keeps the queuing delay below the set threshold. However, since loss-based congestion control algorithms push the queuing delay high when probing for bandwidth, having the probability of reducing cwnd drop back to zero for high delays allows HD to compete with loss-based algorithms.

  In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au>
  Sponsored by: FreeBSD Foundation
  Reviewed by: bz and others along the way
  MFC after: 3 months
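The shape of the backoff probability described for HD can be illustrated with a small function. This is a sketch under stated assumptions, not the module's code: the piecewise-linear rise and fall, the floating-point arithmetic, and every name (`ex_hd_backoff_prob`, `qmin`, `qthresh`, `qmax`, `pmax`) are chosen here for illustration only.

```c
#include <assert.h>

/*
 * Probability of reducing cwnd for a queuing delay q (any consistent unit).
 * Zero at or below qmin, rising linearly to pmax at qthresh, then falling
 * linearly back to zero at qmax and staying zero beyond it.
 * The linear segments are an assumption made for this sketch.
 */
static double
ex_hd_backoff_prob(double q, double qmin, double qthresh, double qmax,
    double pmax)
{
	assert(qmin < qthresh && qthresh < qmax);

	if (q <= qmin || q >= qmax)
		return (0.0);
	if (q <= qthresh)
		return (pmax * (q - qmin) / (qthresh - qmin));
	return (pmax * (qmax - q) / (qmax - qthresh));
}
```

With `qmin = 4`, `qthresh = 20`, `qmax = 52` and `pmax = 0.5`, a delay of 12 or 36 gives a probability of 0.25, and a delay of 60 gives zero, which is the region where a loss-based competitor is assumed to be filling the queue.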
* Import a clean-room implementation of the VEGAS congestion control algorithm | lstewart | 2011-02-01 | 1 | -0/+308
  based on the paper "TCP Vegas: end to end congestion avoidance on a global internet" by Brakmo and Peterson. It is implemented as a kernel module compatible with the recently committed modular congestion control framework.

  VEGAS uses network delay as a congestion indicator and unlike regular loss-based algorithms, attempts to keep the network operating with stable queuing delays and no congestion losses. By keeping network buffers used along the path within a set range, queuing delays are kept low while maintaining high throughput.

  In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au>
  Sponsored by: FreeBSD Foundation
  Reviewed by: bz and others along the way
  MFC after: 3 months
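The idea of keeping buffered data "within a set range" can be sketched as a per-RTT decision. This follows the general Vegas approach of estimating how many segments are queued in the network from the base and current RTT; the function name, the `alpha`/`beta` parameter names, the units, and the return convention are assumptions for illustration and are not taken from the imported module.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decide how to adjust cwnd once per RTT.
 * Returns +1 (grow), -1 (shrink) or 0 (hold).
 * queued ~= cwnd * (rtt - base_rtt) / rtt estimates the segments sitting
 * in buffers along the path; alpha and beta bound the target range.
 */
static int
ex_vegas_adjust(uint32_t cwnd_segs, uint32_t base_rtt_us, uint32_t rtt_us,
    uint32_t alpha, uint32_t beta)
{
	uint64_t queued;

	assert(alpha <= beta);

	/* Without a usable RTT sample, leave cwnd alone. */
	if (rtt_us == 0 || rtt_us < base_rtt_us)
		return (0);

	queued = (uint64_t)cwnd_segs * (rtt_us - base_rtt_us) / rtt_us;
	if (queued < alpha)
		return (1);
	if (queued > beta)
		return (-1);
	return (0);
}
```

For example, with a 20 segment window, a 100 ms base RTT and a range of 1 to 3 segments, a measured RTT of 125 ms implies about 4 queued segments, so the sketch asks for a reduction.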
* An sbuf configured with SBUF_AUTOEXTEND will call malloc with M_WAITOK when a | lstewart | 2011-01-23 | 1 | -4/+21
  write to the buffer causes it to overflow. We therefore can't hold the CC list rwlock over a call to sbuf_printf() for an sbuf configured with SBUF_AUTOEXTEND. Switch to a fixed length sbuf which should be of sufficient size except in the very unlikely event that the sysctl is being processed as one or more new algorithms are loaded. If that happens, we accept the race and may fail the sysctl gracefully if there is insufficient room to print the names of all the algorithms.

  This should address a WITNESS warning and the potential panic that would occur if the sbuf call to malloc did sleep whilst holding the CC list rwlock.

  Sponsored by: FreeBSD Foundation
  Reported by: Nick Hibma
  Reviewed by: bz
  MFC after: 3 weeks
  X-MFC with: r215166
* Some correctness and robustness fixes related to CUBIC's mean RTT estimate: | lstewart | 2011-01-21 | 1 | -6/+17
  - The mean RTT is updated at the end of each congestion epoch, but if we switch to congestion avoidance within the first epoch (e.g. if ssthresh was primed from the hostcache), we'll trigger a divide by zero panic in cubic_ack_received(). Set the mean to the min in cubic_record_rtt() if the mean is less than the min to ensure we have a sane mean for use in this situation. This fixes the panic reported by Nick Hibma.
  - Adjust conditions under which we update the mean RTT in cubic_post_recovery() to ensure a low latency path won't yield an RTT of less than 1. This avoids another potential divide by zero panic when running CUBIC in networks with sub-millisecond latencies.
  - Remove the "safety" assignment of min into mean when we don't update the mean because of failed conditions. The above change to the conditions for updating the mean ensures the safety issue is addressed and I feel it is better to keep our previous mean estimate around if we can't update than to revert to the min.
  - Initialise the mean RTT to 1 on connection startup to act as a safety belt if a situation we haven't considered and addressed with the above changes were to crop up in the wild.

  Sponsored by: FreeBSD Foundation
  Reported and tested by: Nick Hibma
  Discussed with: David Hayes <dahayes at swin edu au>
  MFC after: 5 weeks
  X-MFC with: r216114
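The divide-by-zero safeguards listed above can be illustrated with a simplified, userspace sketch. It is not the code from cubic_record_rtt(): the struct, the `ex_` names, and the use of 0 as a "no sample yet" marker for the minimum are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-connection RTT state, in ticks. */
struct ex_rtt_state {
	int min_ticks;  /* smallest RTT seen; 0 means no sample yet */
	int mean_ticks; /* mean RTT estimate, later used as a divisor */
};

/* Start with mean = 1 so it is never zero, even before any sample. */
static void
ex_rtt_init(struct ex_rtt_state *s)
{
	assert(s != NULL);
	s->min_ticks = 0;
	s->mean_ticks = 1;
}

/* Record a sample, keeping both values at 1 or above and mean >= min. */
static void
ex_record_rtt(struct ex_rtt_state *s, int sample_ticks)
{
	int rtt;

	assert(s != NULL);

	/* A sub-tick RTT on a low latency path is rounded up to 1. */
	rtt = (sample_ticks < 1) ? 1 : sample_ticks;

	if (s->min_ticks == 0 || rtt < s->min_ticks)
		s->min_ticks = rtt;

	/* If the mean has not caught up yet, lift it to the min. */
	if (s->mean_ticks < s->min_ticks)
		s->mean_ticks = s->min_ticks;
}
```

After `ex_rtt_init()`, a first sample of 10 ticks leaves both values at 10; a later sample of 4 ticks lowers the minimum to 4 while the previous mean of 10 is retained.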
* sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly. | mdf | 2011-01-12 | 1 | -2/+2
  Commit the net* piece.
* Import a clean-room implementation of the experimental H-TCP congestion control | lstewart | 2010-12-02 | 1 | -0/+521
  algorithm based on the Internet-Draft "draft-leith-tcp-htcp-06.txt". It is implemented as a kernel module compatible with the recently committed modular congestion control framework.

  H-TCP was designed to provide increased throughput in fast and long-distance networks. It attempts to maintain fairness when competing with legacy NewReno TCP in lower speed scenarios where NewReno is able to operate adequately. The paper "H-TCP: A framework for congestion control in high-speed and long-distance networks" provides additional detail.

  In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au>
  Sponsored by: FreeBSD Foundation
  Reviewed by: rpaulo (older patch from a few weeks ago)
  MFC after: 3 months
* Import a clean-room implementation of the experimental CUBIC congestion control | lstewart | 2010-12-02 | 2 | -0/+625
  algorithm based on the Internet-Draft "draft-rhee-tcpm-cubic-02.txt". It is implemented as a kernel module compatible with the recently committed modular congestion control framework.

  CUBIC was designed to provide increased throughput in fast and long-distance networks. It attempts to maintain fairness when competing with legacy NewReno TCP in lower speed scenarios where NewReno is able to operate adequately. The paper "CUBIC: A New TCP-Friendly High-Speed TCP Variant" provides additional detail.

  In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au>
  Sponsored by: FreeBSD Foundation
  Reviewed by: rpaulo (older patch from a few weeks ago)
  MFC after: 3 months
* General cleanup of the NewReno CC module (no functional changes): | lstewart | 2010-12-02 | 1 | -52/+40
  - Remove superfluous includes and unhelpful comments.
  - Alphabetically order functions.
  - Make functions static.

  Sponsored by: FreeBSD Foundation
  MFC after: 9 weeks
  X-MFC with: r215166
* - Reinstantiate the after_idle hook call in tcp_output(), which got lost | lstewart | 2010-12-02 | 1 | -4/+17
    somewhere along the way due to mismerging r211464 in our development tree.
  - Capture the essence of r211464 in NewReno's after_idle() hook. We don't use V_ss_fltsz/V_ss_fltsz_local yet which needs to be revisited.

  Sponsored by: FreeBSD Foundation
  Submitted by: David Hayes <dahayes at swin edu au>
  MFC after: 9 weeks
  X-MFC with: r215166
* Make the CC framework more VIMAGE friendly by adding the machinery to allow | lstewart | 2010-11-16 | 1 | -30/+28
  vnets to select their own default CC algorithm independent of each other and the base system. If the base system or a vnet has set a default which gets unloaded, we reset that netstack's default to NewReno.

  Sponsored by: FreeBSD Foundation
  Tested by: Mikolaj Golub <to.my.trociny at gmail com>
  Reviewed by: bz (briefly)
  MFC after: 3 months
* - Querying the default CC algo is more common than setting it and the function | lstewart | 2010-11-16 | 1 | -3/+2
    is small, so there is no good reason not to declare the buffer at the top.
  - Fix a whitespace nit.

  Sponsored by: FreeBSD Foundation
  MFC after: 11 weeks
  X-MFC with: r215166
* Move protocol specific implementation detail out of the core CC framework. | lstewart | 2010-11-16 | 1 | -48/+6
  Sponsored by: FreeBSD Foundation
  Tested by: Mikolaj Golub <to.my.trociny at gmail com>
  MFC after: 11 weeks
  X-MFC with: r215166
* On CC algorithm module unload, we walk the list of active TCP control blocks. | lstewart | 2010-11-16 | 1 | -24/+35
  Any found to be using the algorithm that is about to go away are switched back to NewReno to avoid leaving dangling pointers which would trigger a panic. For VIMAGE kernels, there is a list per vnet to walk, yet the implementation was only examining one of the vnet lists.

  Fix the implementation of the above feature for VIMAGE kernels by looping through all active TCP control blocks across all vnets.

  Sponsored by: FreeBSD Foundation
  Tested by: Mikolaj Golub <to.my.trociny at gmail com>
  Reviewed by: bz (briefly)
  MFC after: 11 weeks
* cc_init() should only be run once on system boot, but with VIMAGE kernels it | lstewart | 2010-11-16 | 1 | -2/+4
  runs on boot and each time a vnet jail is created. Running cc_init() multiple times results in a panic when attempting to initialise the cc_list lock again, and so r215166 effectively broke the use of vnet jails.

  Switch to using a SYSINIT to run cc_init() on boot. CC algorithm modules loaded on boot register in the same SI_SUB_PROTO_IFATTACHDOMAIN category as is used in this patch, so cc_init() is run at SI_ORDER_FIRST to ensure the framework is initialised before module registration is attempted.

  Sponsored by: FreeBSD Foundation
  Reported and tested by: Mikolaj Golub <to.my.trociny at gmail com>
  MFC after: 11 weeks
  X-MFC with: r215166
* This commit marks the first formal contribution of the "Five New TCP Congestion | lstewart | 2010-11-12 | 3 | -0/+641
  Control Algorithms for FreeBSD" FreeBSD Foundation funded project. More details about the project are available at: http://caia.swin.edu.au/freebsd/5cc/

  - Add a KPI and supporting infrastructure to allow modular congestion control algorithms to be used in the net stack. Algorithms can maintain per-connection state if required, and connections maintain their own algorithm pointer, which allows different connections to concurrently use different algorithms. The TCP_CONGESTION socket option can be used with getsockopt()/setsockopt() to programmatically query or change the congestion control algorithm respectively from within an application at runtime.
  - Integrate the framework with the TCP stack in as least intrusive a manner as possible. Care was also taken to develop the framework in a way that should allow integration with other congestion aware transport protocols (e.g. SCTP) in the future. The hope is that we will one day be able to share a single set of congestion control algorithm modules between all congestion aware transport protocols.
  - Introduce a new congestion recovery (TF_CONGRECOVERY) state into the TCP stack and use it to decouple the meaning of recovery from a congestion event and recovery from packet loss (TF_FASTRECOVERY) a la RFC2581. ECN and delay based congestion control protocols don't generally need to recover from packet loss and need a different way to note a congestion recovery episode within the stack.
  - Remove the net.inet.tcp.newreno sysctl, which simplifies some portions of code and ensures the stack always uses the appropriate mechanisms for recovering from packet loss during a congestion recovery episode.
  - Extract the NewReno congestion control algorithm from the TCP stack and massage it into module form. NewReno is always built into the kernel and will remain the default algorithm for the foreseeable future. Implementations of additional different algorithms will become available in the near future.
  - Bump __FreeBSD_version to 900025 and note in UPDATING that rebuilding code that relies on the size of "struct tcpcb" is required.

  Many thanks go to the Cisco University Research Program Fund at Community Foundation Silicon Valley and the FreeBSD Foundation. Their support of our work at the Centre for Advanced Internet Architectures, Swinburne University of Technology is greatly appreciated.

  In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au>
  Sponsored by: Cisco URP, FreeBSD Foundation
  Reviewed by: rpaulo
  Tested by: David Hayes (and many others over the years)
  MFC after: 3 months
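The TCP_CONGESTION socket option mentioned in the first item above can be exercised from an application roughly as follows. This is a minimal sketch: the helper names `ex_set_cc` and `ex_get_cc` are hypothetical, and which algorithm names are accepted depends on the modules available on the running system.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

#include <assert.h>
#include <string.h>

/* Ask the stack to use the named algorithm on this TCP socket. */
static int
ex_set_cc(int fd, const char *name)
{
	assert(name != NULL);
	return (setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name,
	    (socklen_t)strlen(name)));
}

/*
 * Copy the socket's current algorithm name into buf (NUL-terminated).
 * Returns 0 on success, -1 on error as reported by getsockopt().
 */
static int
ex_get_cc(int fd, char *buf, socklen_t buflen)
{
	socklen_t len;

	assert(buf != NULL && buflen > 1);
	memset(buf, 0, buflen);
	len = buflen - 1; /* keep the final byte as a terminator */
	return (getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &len));
}
```

A caller would typically create a `SOCK_STREAM` socket, call `ex_get_cc()` with a small buffer (16 bytes is assumed here to be enough for an algorithm name) to see the default, and call `ex_set_cc()` before connecting to select a different loaded algorithm.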