path: root/sys/netinet/tcp_reass.c
Commit message [Author, Age, Files, Lines]
* Changes to support the addition of a new sysctl variable: net.inet.tcp.delack_enabled, which defaults to 1 and can be set to 0 to disable TCP delayed-ack processing (i.e. all acks are immediate).
  [dg, 1998-02-26, 1 file, -18/+16]
* Improved connection establishment performance by doing local port lookups via a hashed port list. In the new scheme, in_pcblookup() goes away and is replaced by a new routine, in_pcblookup_local(), for doing the local port check. Note that this implementation is space inefficient in that the PCB struct is now too large to fit into 128 bytes. I might deal with this in the future by using the new zone allocator, but I wanted these changes to be extensively tested in their current form first.
  Also:
  1) Fixed off-by-one errors in the port lookup loops in in_pcbbind().
  2) Got rid of some unneeded rehashing. Added a new routine, in_pcbinshash(), to do the initial hash insertion.
  3) Renamed in_pcblookuphash() to in_pcblookup_hash() for easier readability.
  4) Added a new routine, in_pcbremlists(), to remove the PCB from the various hash lists.
  5) Added/deleted comments where appropriate.
  6) Removed unnecessary splnet() locking. In general, the PCB functions should be called at splnet()...there are unfortunately a few exceptions, however.
  7) Reorganized a few structs for better cache line behavior.
  8) Killed my TCP_ACK_HACK kludge. It may come back in a different form in the future, however.
  These changes have been tested on wcarchive for more than a month. In tests done here, connection establishment overhead is reduced by more than 50 times, thus getting rid of one of the major networking scalability problems.
  Still to do: make tcp_fasttimo/tcp_slowtimo scale well for systems with a large number of connections. tcp_fasttimo is easy; tcp_slowtimo is difficult.
  WARNING: Anything that knows about inpcb and tcpcb structs will have to be recompiled; at the very least, this includes netstat(1).
  [dg, 1998-01-27, 1 file, -27/+11]
* A more complete fix for the "land" attack, removing the "quick fix" from rev 1.66. This fix contains both belt and suspenders.
  Belt: ignore packets where src == dst and srcport == dstport in TCPS_LISTEN. These packets can only legitimately occur when connecting a socket to itself, which doesn't go through TCPS_LISTEN (it goes CLOSED->SYN_SENT->SYN_RCVD->ESTABLISHED). This prevents the "standard" "land" attack, although it doesn't prevent the multi-homed variation.
  Suspenders: send a RST in response to a SYN/ACK in SYN_RECEIVED state. The only packets we should get in SYN_RECEIVED are
  1. A retransmitted SYN, or
  2. An ack of our SYN/ACK.
  The "land" attack depends on us accepting our own SYN/ACK as an ACK in SYN_RECEIVED state; this should prevent all "land" attacks.
  We also move up the sequence number check for the ACK in SYN_RECEIVED. This neither helps nor hurts with respect to the "land" attack, but puts more of the validation checking in one spot.
  PR: kern/5103
  [fenner, 1998-01-21, 1 file, -20/+25]
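The two checks described in the "land" attack fix can be sketched roughly as follows. This is an illustration only, not the code from the commit: `struct seg_addr`, `listen_should_ignore()` and `syn_received_should_reset()` are hypothetical names, and the real kernel works on its own header structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a segment's addressing fields. */
struct seg_addr {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/* TCP header flag bits as laid out in the TCP header. */
#define TH_SYN 0x02
#define TH_ACK 0x10

/*
 * "Belt": in LISTEN, a segment whose source and destination address and
 * port are identical cannot be a legitimate connection request.
 */
static bool
listen_should_ignore(const struct seg_addr *s)
{
    return s->src_ip == s->dst_ip && s->src_port == s->dst_port;
}

/*
 * "Suspenders": in SYN_RECEIVED only a retransmitted SYN or an ACK of our
 * SYN/ACK is expected, so a segment carrying both SYN and ACK gets a RST.
 */
static bool
syn_received_should_reset(uint8_t flags)
{
    return (flags & (TH_SYN | TH_ACK)) == (TH_SYN | TH_ACK);
}
```

A caller would drop the segment when the first check is true in LISTEN, and answer with a RST when the second is true in SYN_RECEIVED.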
* Don't use ANSI string concatenation to misformat a string.
  [bde, 1997-12-19, 1 file, -5/+5]
* Add Matt Dillon's quick fix hack for the self-connect DoS.
  PR: 5103
  [wollman, 1997-11-20, 1 file, -1/+14]
* Remove a bunch of variables which were unused both in GENERIC and LINT.
  Found by: -Wunused
  [phk, 1997-11-07, 1 file, -2/+1]
* Removed unused #includes.
  [bde, 1997-10-28, 1 file, -3/+1]
* Killed the SYN_RECEIVED addition from rev 1.52. It results in legitimate RSTs being ignored, keeping a connection around until it times out, and thus has the opposite effect of what was intended (which is to make the system more robust to DoS attacks).
  [dg, 1997-10-02, 1 file, -6/+1]
* Don't consider a SYN/ACK with CC but no CCECHO a proper T/TCP handshake.
  Reviewed by: Rich Stevens <rstevens@kohala.com>
  [fenner, 1997-09-30, 1 file, -9/+11]
* Make TCPDEBUG a new-style option.
  [joerg, 1997-09-16, 1 file, -1/+3]
* Fix all areas of the system (or at least all those in LINT) to avoid storing socket addresses in mbufs. (Socket buffers are the one exception.) A number of kernel APIs needed to get fixed in order to make this happen. Also, fix three protocol families which kept PCBs in mbufs to malloc them instead. Delete some old compatibility cruft while we're at it, and add some new routines in the in_cksum family.
  [wollman, 1997-08-16, 1 file, -8/+7]
* Fix a bug (apparently very old) that can cause a TCP connection to be dropped when it has an unusual traffic pattern. For full details as well as a test case that demonstrates the failure, see the referenced PR.
  Under certain circumstances involving the persist state, it is possible for the receive side's tp->rcv_nxt to advance beyond its tp->rcv_adv. This causes (tp->rcv_adv - tp->rcv_nxt) to become negative. However, in the code affected by this fix, that difference was interpreted as an unsigned number by max(). Since it was negative, it was taken as a huge unsigned number. The effect was to cause the receiver to believe that its receive window had negative size, thereby rejecting all received segments including ACKs. As the test case shows, this led to fruitless retransmissions and eventually to a dropped connection. Even connections using the loopback interface could be dropped.
  The fix substitutes the signed imax() for the unsigned max() function.
  PR: closes kern/3998
  Reviewed by: davidg, fenner, wollman
  [jdp, 1997-07-01, 1 file, -2/+2]
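The signed/unsigned problem described in this entry can be reproduced in isolation. The sketch below is an assumption-laden illustration rather than the kernel code: `u_max_sketch()` and `i_max_sketch()` stand in for the kernel's unsigned max() and signed imax(), and the two `recv_window_*` functions are hypothetical.

```c
#include <stdint.h>

/* Stand-in for an unsigned max(): both arguments are compared as unsigned. */
static uint32_t
u_max_sketch(uint32_t a, uint32_t b)
{
    return a > b ? a : b;
}

/* Stand-in for a signed imax(): both arguments are compared as signed. */
static int32_t
i_max_sketch(int32_t a, int32_t b)
{
    return a > b ? a : b;
}

/* Pre-fix behaviour: a negative (rcv_adv - rcv_nxt) is seen as a huge value. */
static uint32_t
recv_window_unsigned(int32_t space, uint32_t rcv_adv, uint32_t rcv_nxt)
{
    return u_max_sketch((uint32_t)space, rcv_adv - rcv_nxt);
}

/* Post-fix behaviour: the difference is compared as a signed quantity. */
static int32_t
recv_window_signed(int32_t space, uint32_t rcv_adv, uint32_t rcv_nxt)
{
    return i_max_sketch(space, (int32_t)(rcv_adv - rcv_nxt));
}
```

With rcv_nxt 100 bytes ahead of rcv_adv, the unsigned variant selects the wrapped difference instead of the available buffer space, while the signed variant falls back to the buffer space as intended.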
* The long-awaited mega-massive-network-code-cleanup. Part I.
  This commit includes the following changes:
  1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility glue for them is deleted, and the kernel will panic on boot if any are compiled in.
  2) Certain protocol entry points are modified to take a process structure, so that they can easily tell whether or not it is possible to sleep, and also to access credentials.
  3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt() call. Protocols should use the process pointer they are now passed.
  4) The PF_LOCAL and PF_ROUTE families have been updated to use the new style, as has the `raw' skeleton family.
  5) PF_LOCAL sockets now obey the process's umask when creating a socket in the filesystem.
  As a result, LINT is now broken. I'm hoping that some enterprising hacker with a bit more time will either make the broken bits work (should be easy for netipx) or dike them out.
  [wollman, 1997-04-27, 1 file, -2/+3]
* Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
  [peter, 1997-02-22, 1 file, -1/+1]
* Make the long-awaited change from $Id$ to $FreeBSD$.
  This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
  Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
  [jkh, 1997-01-14, 1 file, -1/+1]
* Re-enable the TCP SYN-attack protection code. I was the one who didn't understand the socket state flag.
  2.2 candidate.
  [fenner, 1996-11-10, 1 file, -3/+1]
* Fix two bugs I accidentally put into the SYN code at the last minute (yes, I had tested the hell out of this).
  I've also temporarily disabled the code so that it behaves as it previously did (tail-drops the SYNs) pending discussion with fenner about some socket state flags that I don't fully understand.
  Submitted by: fenner
  [pst, 1996-10-11, 1 file, -5/+9]
* Improved in_pcblookuphash() to support wildcarding, and changed relevant callers of it to take advantage of this. This reduces new connection request overhead in the face of a large number of PCBs in the system. Thanks to David Filo <filo@yahoo.com> for suggesting this and providing a sample implementation (which wasn't used, but showed that it could be done).
  Reviewed by: wollman
  [dg, 1996-10-07, 1 file, -12/+2]
* Increase robustness of FreeBSD against high-rate connection attempt denial of service attacks.
  Reviewed by: bde, wollman, olah
  Inspired by: vjs@sgi.com
  [pst, 1996-10-07, 1 file, -13/+23]
* I don't understand, I committed this fix (move a counter and fixed a typo) this evening. I think I'm going insane.
  [pst, 1996-09-21, 1 file, -4/+3]
* Syntax error: so_incom -> so_incomp
  [ache, 1996-09-21, 1 file, -2/+2]
* If the incomplete listen queue for a given socket is full, drop the oldest entry in the queue.
  There was a fair bit of discussion as to whether or not the proper action is to drop a random entry in the queue. It's my conclusion that a random drop is better than a head drop, however profiling this section of code (done by John Capo) shows that a head-drop results in a significant performance increase.
  There are scenarios where a random drop is more appropriate. If I find one in reality, I'll add the random drop code under a conditional.
  Obtained from: discussions and code done by Vernon Schryver (vjs@sgi.com).
  [pst, 1996-09-20, 1 file, -5/+18]
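The head-drop policy described in this entry can be illustrated with a small fixed-size queue. This is only a sketch under stated assumptions: `QLIMIT`, `struct incomplete_queue` and `enqueue_head_drop()` are hypothetical, and the kernel keeps its incomplete connections on socket queues rather than in an integer array.

```c
#include <stddef.h>

#define QLIMIT 4 /* hypothetical, deliberately small queue limit */

/* Ring buffer of pending connection ids; the oldest entry sits at head. */
struct incomplete_queue {
    int    conn[QLIMIT];
    size_t head;
    size_t len;
};

/*
 * Add a new pending connection. If the queue is already full, drop the
 * oldest entry to make room and return its id; otherwise return -1.
 */
static int
enqueue_head_drop(struct incomplete_queue *q, int id)
{
    int dropped = -1;

    if (q->len == QLIMIT) {
        dropped = q->conn[q->head];
        q->head = (q->head + 1) % QLIMIT;
        q->len--;
    }
    q->conn[(q->head + q->len) % QLIMIT] = id;
    q->len++;
    return dropped;
}
```

A random drop, the alternative discussed in the entry, would pick an arbitrary slot to evict instead of the one at head.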
* Make the misnamed tcp initial keepalive timer value (which is really the time, in seconds, that state for non-established TCP sessions stays about) a sysctl modifiable variable.
  [part 1 of two commits, I just realized I can't play with the indices as I was typing this commit message.]
  [pst, 1996-09-13, 1 file, -3/+3]
* Receipt of two SYNs is sufficient to set the t_timer[TCPT_KEEP] to "keepidle". This should not occur unless the connection has been established via the 3-way handshake, which requires an ACK.
  Submitted by: jmb
  Obtained from: problem discussed in Stevens vol. 3
  [pst, 1996-09-13, 1 file, -6/+12]
* Back out my stupid braino; I was thinking strlen and not sizeof.
  [fenner, 1996-05-02, 1 file, -2/+2]
* Size temp var correctly; buf[4*sizeof "123"] is not long enough to store "192.252.119.189\0".
  [fenner, 1996-05-02, 1 file, -2/+2]
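This entry and its back-out hinge on the difference between sizeof and strlen applied to a string literal. The sketch below shows why the original buffer size was already sufficient; `fits_in_buffer()` is a hypothetical helper introduced only for the illustration.

```c
#include <stddef.h>
#include <string.h>

/* sizeof on a string literal counts the terminating NUL: 4 * 4 = 16 bytes. */
#define BUF_BY_SIZEOF (4 * sizeof "123")

/* strlen does not count the NUL: 4 * 3 = 12 bytes, which would be too small. */
#define BUF_BY_STRLEN (4 * strlen("123"))

/* Returns nonzero if s, including its terminating NUL, fits in bufsize bytes. */
static int
fits_in_buffer(size_t bufsize, const char *s)
{
    return strlen(s) + 1 <= bufsize;
}
```

A dotted quad such as "192.252.119.189" needs 15 characters plus the NUL, which is exactly the 16 bytes that 4*sizeof "123" provides.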
* inet_ntoa buffer was evaluated twice in log_in_vain, fix it.
  Thanx to: jdp
  [ache, 1996-04-27, 1 file, -3/+7]
* Delete #ifdef notdef blocks containing old method of srtt calculation.
  Requested by: davidg
  [wollman, 1996-04-26, 1 file, -48/+1]
* Logging UDP and TCP connection attempts should not be enabled by default. It's trivial to create a denial of service attack on a box so enabled.
  These messages, if enabled at all, must be rate-limited. (!)
  [pst, 1996-04-09, 1 file, -2/+2]
* Log TCP syn packets for ports we don't listen on.
  Controlled by: sysctl net.inet.tcp.log_in_vain: 1
  Log UDP syn packets for ports we don't listen on.
  Controlled by: sysctl net.inet.udp.log_in_vain: 1
  Suggested by: Warren Toomey <wkt@cs.adfa.oz.au>
  [phk, 1996-04-04, 1 file, -2/+13]
* Slight modification of RTO floor calculation.
  [wollman, 1996-03-25, 1 file, -2/+2]
* A number of performance-reducing flaws fixed based on comments from Larry Peterson &co. at Arizona:
  - Header prediction for ACKs did not exclude Fast Retransmit/Recovery.
  - srtt calculation tended to get ``stuck'' and could never decrease when below 8. It still can't, but the scaling factors are adjusted so that this artifact does not cause as bad an effect on the RTO value as it used to.
  The paper also points out the incr/8 error that has been long since fixed, and the problems with ACKing frequency resulting from the use of options which I suspect to be fixed already as well (as part of the T/TCP work).
  Obtained from: Brakmo & Peterson, ``Performance Problems in BSD4.4 TCP''
  [wollman, 1996-03-22, 1 file, -3/+54]
* Move or add #include <queue.h> in preparation for upcoming struct socket changes.
  [dg, 1996-03-11, 1 file, -2/+2]
* Add a counter for the number of times the listen queue was overflowed to the tcpstat structure. (netstat -s)
  Reviewed by: wollman
  Obtained from: Stevens, TCP/IP Illustrated, vol. 3, page 189
  [guido, 1996-02-26, 1 file, -2/+4]
* Fixed bug in Path MTU Discovery that caused the system to have to rediscover the Path MTU for each connection if the connecting host didn't offer an initial MSS.
  Submitted by: davidg & olah
  [dg, 1996-02-22, 1 file, -24/+4]
* Fix a bug related to the interworking of T/TCP and window scaling: when a connection enters the ESTABLISHED state using T/TCP, window scaling wasn't properly handled. The fix is twofold.
  1) When the 3WHS completes, make sure that we update our window scaling state variables.
  2) When setting the `virtual advertised window', make sure that we do not try to offer a window that is larger than the maximum window without scaling (TCP_MAXWIN).
  Reviewed by: davidg
  Reported by: Jerry Chen <chen@Ipsilon.COM>
  [olah, 1996-01-31, 1 file, -6/+19]
* Another mega commit to staticize things.
  [phk, 1995-12-14, 1 file, -2/+2]
* New style sysctl & staticize a lot of stuff.
  [phk, 1995-11-14, 1 file, -6/+15]
* Start adding new style sysctl here too.
  [phk, 1995-11-09, 1 file, -1/+5]
* Cosmetic changes to processing of segments in the SYN_SENT state:
  - remove a redundant condition;
  - complete all validity checks on segment before calling soisconnected(so).
  Reviewed by: Richard Stevens, davidg, wollman
  [olah, 1995-11-03, 1 file, -10/+10]
* Routes can be asymmetric. Always offer to /accept/ an MSS of up to the capacity of the link, even if the route's MTU indicates that we cannot send that much in their direction. (This might actually make it possible to test Path MTU discovery in a useful variety of cases.)
  [wollman, 1995-10-13, 1 file, -7/+1]
* Finish 4.4-Lite-2 merge: randomize TCP initial sequence numbers to make ISS-guessing spoofing attacks harder.
  [wollman, 1995-10-03, 1 file, -6/+8]
* Remove a redundant `if' from tcp_reass().
  Correct a typo in a comment (SEND_SYN -> NEEDSYN).
  Reviewed by: David Greenman
  [olah, 1995-07-31, 1 file, -4/+2]
* tcp_input.c - keep track of how many times a route contained a cached rtt or ssthresh that we were able to use
  tcp_var.h - declare tcpstat entries for above; declare tcp_{send,recv}space
  in_rmx.c - fill in the MTU and pipe sizes with the defaults TCP would have used anyway in the absence of values here
  [wollman, 1995-07-10, 1 file, -7/+7]
* Keep track of the number of samples through the srtt filter so that we know better when to cache values in the route, rather than relying on a heuristic involving sequence numbers that broke when tcp_sendspace was increased to 16k.
  [wollman, 1995-06-29, 1 file, -1/+2]
* Remove trailing whitespace.
  [rgrimes, 1995-05-30, 1 file, -35/+35]
* #ifdef'd my Nagle/ACK hack with "TCP_ACK_HACK", disabled by default. I'm currently considering reducing the TCP fasttimo to 100ms to help improve things, but this would be done as a separate step at some point in the future. This was done because it was causing some sometimes serious performance problems with T/TCP.
  [dg, 1995-05-11, 1 file, -1/+24]
* Fix a misspelled constant in tcp_input.c.
  On Tue, 09 May 1995 04:35:27 PDT, Richard Stevens wrote:
  > In tcp_dooptions() under the case TCPOPT_CC there is an assignment
  >
  >     to->to_flag |= TCPOPT_CC;
  >
  > that should be
  >
  >     to->to_flag |= TOF_CC;
  >
  > I haven't thought through the ramifications of what's been happening ...
  >
  > Rich Stevens
  Submitted by: rstevens@noao.edu (Richard Stevens)
  [olah, 1995-05-09, 1 file, -2/+2]
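The effect of OR-ing an option kind into a flag word, as in the misspelled constant above, can be sketched as follows. The numeric values are assumptions made for illustration: 11 is the CC option kind from RFC 1644, while the TOF_* bit values shown here are believed to match the headers of that era but are not confirmed by this log. `set_flag_buggy()` and `set_flag_fixed()` are hypothetical helpers.

```c
#include <stdint.h>

#define TCPOPT_CC  11     /* option kind number (RFC 1644), not a bit mask */

/* Assumed flag bits for the parsed-options structure (illustrative values). */
#define TOF_TS     0x0001
#define TOF_CC     0x0002
#define TOF_CCNEW  0x0004
#define TOF_CCECHO 0x0008

/* The misspelling: OR-ing the option kind (binary 1011) sets several flags. */
static uint32_t
set_flag_buggy(uint32_t to_flag)
{
    return to_flag | TCPOPT_CC;
}

/* The intended assignment: only the CC flag bit is set. */
static uint32_t
set_flag_fixed(uint32_t to_flag)
{
    return to_flag | TOF_CC;
}
```

Under these assumed values, the buggy form would mark the timestamp and CCECHO options as present as well, even though only a CC option was parsed.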
* Changed in_pcblookuphash() to not automatically call in_pcblookup() if the lookup fails. Updated callers to deal with this. Call in_pcblookuphash() instead of in_pcblookup() in in_pcbconnect; this improves performance of UDP output by about 17% in the standard case.
  [dg, 1995-05-03, 1 file, -3/+14]
* Further satisfy my paranoia by making sure that the ACKNOW is only set when ti_len is non-zero.
  [dg, 1995-04-10, 1 file, -10/+5]