path: root/sys/netinet/tcp_subr.c
Commit message | Author | Age | Files | Lines
...
* Fix a second warning, introduced by my last "fix". I committed the wrongpeter2007-07-051-2/+2
| | | | | | | diff from the wrong machine. Pointy hat to: peter Approved by: re (rwatson - blanket, several days ago)
* Fix cast-qualifiers warning when INET6 is not presentpeter2007-07-051-1/+1
| | | | Approved by: re (rwatson)
* Commit the change from FAST_IPSEC to IPSEC. The FAST_IPSECgnn2007-07-031-4/+4
| | | | | | | | option is now deprecated, as well as the KAME IPsec code. What was FAST_IPSEC is now IPSEC. Approved by: re Sponsored by: Secure Computing
* Commit IPv6 support for FAST_IPSEC to the tree.gnn2007-07-011-11/+2
| | | | | | | | | This commit includes only the kernel files, the rest of the files will follow in a second commit. Reviewed by: bz Approved by: re Supported by: Secure Computing
* Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); inrwatson2007-06-121-4/+2
| | | | | | | | | | | | | | | some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project
* Don't assign sp to the value of s when we're about to assign it instead torwatson2007-05-271-1/+1
| | | | | | | s + strlen(s). Found with: Coverity Prevent(tm) CID: 2243
* In tcp_log_addrs():andre2007-05-231-5/+7
|
| o add the hex output of the th_flags field to the example log line
|   in comments
| o simplify the log line length calculation and make it less evil
| o correct the test for the length panic; the line isn't on the stack
|   but malloc'ed
* Add tcp_log_addrs() function to generate a standardized TCP log lineandre2007-05-181-0/+89
| for use throughout the tcp subsystem.
|
| It is IPv4 and IPv6 aware and creates a line in the following format:
|
|  "TCP: [1.2.3.4]:50332 to [1.2.3.4]:80 tcpflags <RST>"
|
| A "\n" is not included at the end. The caller is supposed to add
| further information after the standard tcp log header.
|
| The function returns a NUL terminated string which the caller has
| to free(s, M_TCPLOG) after use. All memory allocation is done
| with M_NOWAIT and the return value may be NULL in memory shortage
| situations.
|
| Either struct in_conninfo || (struct tcphdr && (struct ip || struct
| ip6_hdr)) have to be supplied.
|
| Due to ip[6].h header inclusion limitations and ordering issues the
| struct ip and struct ip6_hdr parameters have to be cast and passed
| as void * pointers.
|
|  tcp_log_addrs(struct in_conninfo *inc, struct tcphdr *th,
|      void *ip4hdr, void *ip6hdr)
|
| Usage example:
|
|  struct ip *ip;
|  char *tcplog;
|
|  if ((tcplog = tcp_log_addrs(NULL, th, (void *)ip, NULL)) != NULL) {
|      log(LOG_DEBUG, "%s; %s: Connection attempt to closed port\n",
|          tcplog, __func__);
|      free(tcplog, M_TCPLOG);
|  }
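The calling contract described above (allocate, format, return NULL on failure, caller frees) can be illustrated with a minimal userspace sketch. This is not the kernel function: `log_addrs_sketch()` is a hypothetical name, it handles IPv4 only, and it uses `malloc()`/`free()` where the kernel code uses `M_NOWAIT` allocation and `free(s, M_TCPLOG)`.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define LOG_FMT_SKETCH "TCP: [%s]:%u to [%s]:%u tcpflags <%s>"

/*
 * Hypothetical userspace analogue of tcp_log_addrs(), IPv4 only.
 * Ports are expected in network byte order. Returns a malloc'ed,
 * NUL-terminated line without a trailing "\n", or NULL on failure.
 * The caller owns the string and must free() it.
 */
char *
log_addrs_sketch(struct in_addr src, uint16_t sport,
    struct in_addr dst, uint16_t dport, const char *flags)
{
	char sbuf[INET_ADDRSTRLEN], dbuf[INET_ADDRSTRLEN];
	char *line;
	int len;

	if (inet_ntop(AF_INET, &src, sbuf, sizeof(sbuf)) == NULL ||
	    inet_ntop(AF_INET, &dst, dbuf, sizeof(dbuf)) == NULL)
		return (NULL);

	/* First pass only measures the line; nothing is written. */
	len = snprintf(NULL, 0, LOG_FMT_SKETCH, sbuf,
	    (unsigned)ntohs(sport), dbuf, (unsigned)ntohs(dport), flags);
	if (len < 0)
		return (NULL);

	line = malloc((size_t)len + 1);
	if (line == NULL)
		return (NULL);	/* memory shortage: caller must cope */

	/* Second pass writes into the exactly sized buffer. */
	snprintf(line, (size_t)len + 1, LOG_FMT_SKETCH, sbuf,
	    (unsigned)ntohs(sport), dbuf, (unsigned)ntohs(dport), flags);
	return (line);
}
```

As in the commit's usage example, a caller would test the result for NULL, append its own text after the header, and free the string afterwards.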
* Move TIME_WAIT related functions and timer handling from filesandre2007-05-161-1/+0
| other than repo copied tcp_subr.c into tcp_timewait.c#1.284:
|
|  tcp_input.c#1.350  tcp_timewait() -> tcp_twcheck()
|  tcp_timer.c#1.92   tcp_timer_2msl_reset() -> tcp_tw_2msl_reset()
|  tcp_timer.c#1.92   tcp_timer_2msl_stop() -> tcp_tw_2msl_stop()
|  tcp_timer.c#1.92   tcp_timer_2msl_tw() -> tcp_tw_2msl_scan()
|
| This is a mechanical move with appropriate renames and making
| them static if used only locally.
|
| The tcp_tw_2msl_scan() cleanup function is still run from the
| tcp_slowtimo() in tcp_timer.c.
* Complete the (mechanical) move of the TCP reassembly and timewaitandre2007-05-131-352/+2
| functions from their original place to their own files.
|
|  TCP Reassembly from tcp_input.c -> tcp_reass.c
|  TCP Timewait   from tcp_subr.c  -> tcp_timewait.c
* Make the TCP timer callout obtain Giant if the network stack is markedandre2007-05-111-2/+6
| | | | | | as non-mpsafe. This change is to be removed when all protocols are mp-safe.
* Add the timestamp offset to struct tcptw so we can generate properandre2007-05-111-3/+6
| | | | | ACKs in TIME_WAIT state that don't get dropped by the PAWS check on the receiver.
* Move universally to ANSI C function declarations, with relativelyrwatson2007-05-101-3/+3
| | | | consistent style(9)-ish layout.
* When setting up timewait state for a TCP connection, don't hold therwatson2007-05-071-1/+1
| | | | | socket lock over a crhold() of so_cred: so_cred is constant after socket creation, so doesn't require locking to read.
* Use existing TF_SACK_PERMIT flag in struct tcpcb t_flags field instead ofandre2007-05-061-2/+3
| a dedicated sack_enable int for this bool.
|
| Change all users accordingly.
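Folding a boolean into an existing flags word, as this change does, might look like the sketch below. The struct, helper names, and the flag's numeric value are illustrative assumptions, not copied from tcp_var.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit value; the real TF_SACK_PERMIT lives in tcp_var.h. */
#define TF_SACK_PERMIT_SKETCH	0x0200u

/* Hypothetical stand-in for the relevant part of struct tcpcb. */
struct tcpcb_sketch {
	uint32_t t_flags;
};

/* Set or clear the SACK bit without disturbing the other flags. */
static void
sack_set_sketch(struct tcpcb_sketch *tp, bool on)
{
	if (on)
		tp->t_flags |= TF_SACK_PERMIT_SKETCH;
	else
		tp->t_flags &= ~TF_SACK_PERMIT_SKETCH;
}

/* Replaces a test of the former dedicated sack_enable int. */
static bool
sack_enabled_sketch(const struct tcpcb_sketch *tp)
{
	return ((tp->t_flags & TF_SACK_PERMIT_SKETCH) != 0);
}
```

The gain is one less field per control block, at the cost of a mask operation on each test.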
* Rename some fields of struct inpcbinfo to have the ipi_ prefix,rwatson2007-04-301-7/+8
| | | | | | consistent with the naming of other structure field members, and reducing improper grep matches. Clean up and comment structure fields in structure definition.
* Make tcp_twrespond() use tcp_addoptions() instead of a home grown version.andre2007-04-181-11/+6
|
* Change the TCP timer system from using the callout system five timesandre2007-04-111-15/+14
| | | | | | | | | | | | | | | | directly to a merged model where only one callout, the next to fire, is registered. Instead of callout_reset(9) and callout_stop(9) the new function tcp_timer_activate() is used which then internally manages the callout. The single new callout is a mutex callout on inpcb simplifying the locking a bit. tcp_timer() is the called function which handles all race conditions in one place and then dispatches the individual timer functions. Reviewed by: rwatson (earlier version)
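The merged model can be sketched in userspace as a table of per-timer deadlines plus a single "armed" slot that always tracks the next timer to fire. Everything here is an assumption made for illustration: the `TT_*` names, the struct layout, and the convention that a delta of 0 deactivates a timer. The real tcp_timer_activate() manages a kernel callout rather than an index.

```c
#include <stdbool.h>

/* Five logical timers sharing one callout; names are illustrative. */
enum timer_id { TT_DELACK, TT_REXMT, TT_PERSIST, TT_KEEP, TT_2MSL, TT_COUNT };

struct timer_set {
	bool     active[TT_COUNT];
	unsigned deadline[TT_COUNT];	/* absolute tick at which to fire */
	int      armed;			/* timer the single callout is set for, or -1 */
};

/* Return the active timer with the earliest deadline, or -1 if none. */
static int
earliest_timer(const struct timer_set *ts)
{
	int best = -1;

	for (int i = 0; i < TT_COUNT; i++) {
		if (!ts->active[i])
			continue;
		/* Signed difference keeps the comparison correct across tick wrap. */
		if (best < 0 ||
		    (int)(ts->deadline[i] - ts->deadline[best]) < 0)
			best = i;
	}
	return (best);
}

/*
 * Activate timer 'which' to fire 'delta' ticks after 'now', or
 * deactivate it when delta is 0, then re-arm the single callout
 * for whichever timer is now next to fire.
 */
void
timer_activate_sketch(struct timer_set *ts, enum timer_id which,
    unsigned now, unsigned delta)
{
	ts->active[which] = (delta != 0);
	ts->deadline[which] = now + delta;
	ts->armed = earliest_timer(ts);
}
```

A dispatcher in the role of tcp_timer() would then run the handler for `armed`, mark that timer inactive, and re-arm for the next earliest deadline.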
* Retire unused TCP_SACK_DEBUG.andre2007-04-041-1/+0
|
* ANSIfy function declarations and remove register keywords for variables.andre2007-03-211-4/+4
| | | | Consistently apply style to all function declarations.
* Remove tcp_minmssoverload DoS detection logic. The problem it tried toandre2007-03-211-12/+0
| | | | | | protect us from wasn't really there and it only bloats the code. Should the problem surface in the future we can simply resurrect it from cvs history.
* Match up SYSCTL declaration style.andre2007-03-191-12/+14
|
* Remove unused and #if 0'd net.inet.tcp.tcp_rttdflt sysctl.rwatson2007-03-161-6/+0
|
* Reap FIN_WAIT_2 connections marked SOCANTRCVMORE faster. This mitigatesmohans2007-02-261-0/+1
| | | | | | | | potential issues where the peer does not close, potentially leaving thousands of connections in FIN_WAIT_2. This is controlled by a new sysctl fast_finwait2_recycle, which is disabled by default. Reviewed by: gnn, silby.
* Whitespace fix and remove an extra cast.jhb2006-12-301-1/+2
|
* Sweep kernel replacing suser(9) calls with priv(9) calls, assigningrwatson2006-11-061-2/+5
| | | | | | | | | | | | | specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
* Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.hrwatson2006-10-221-1/+2
| | | | | | | | | | | | | begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA
* o Convert w/spaces to tabs in the previous commit.maxim2006-09-291-3/+3
|
* Rather than autoscaling the number of TIME_WAIT sockets to maxsockets / 5,silby2006-09-291-8/+24
| | | | | | | | | | | | scale it to min(ephemeral port range / 2, maxsockets / 5) so that people with large gobs of memory and/or large maxsockets settings will not exhaust their entire ephemeral port range with sockets in the TIME_WAIT state during periods of heavy load. Those who wish to tweak the size of the TIME_WAIT zone can still do so with net.inet.tcp.maxtcptw. Reviewed by: glebius, ru
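The scaling rule quoted above, min(ephemeral port range / 2, maxsockets / 5), can be written out as a small function. The function and parameter names are hypothetical, and the handling of a reversed port range is an assumption of this sketch rather than something the commit message states.

```c
/* Smaller of two ints; stands in for the kernel's imin(). */
static int
imin_sketch(int a, int b)
{
	return (a < b ? a : b);
}

/*
 * Sketch of the TIME_WAIT zone auto-sizing rule:
 *   min(ephemeral port range / 2, maxsockets / 5)
 * portfirst and portlast bound the ephemeral range; taking the
 * absolute difference for a reversed range is an assumption here.
 */
int
tcptw_auto_size_sketch(int portfirst, int portlast, int maxsockets)
{
	int range;

	range = (portlast > portfirst) ?
	    portlast - portfirst : portfirst - portlast;
	return (imin_sketch(range / 2, maxsockets / 5));
}
```

With a range of 49152 to 65535 and a very large maxsockets, the port-range term (8191) becomes the limit, which is the case the commit is protecting against.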
* Add a sysctl net.inet.tcp.nolocaltimewait that allows to suppressglebius2006-09-081-4/+15
| creating a compressed TIME_WAIT state if both connection endpoints
| are local. Default is off.
* Back when we had T/TCP support, we used to apply differentru2006-09-071-3/+2
| timeouts for TCP and T/TCP connections in the TIME_WAIT state, and
| we had two separate timed wait queues for them. Now that it has
| gone, the timeout is always 2*MSL again, and there is no reason to
| keep two queues (the first was unused anyway!).
|
| Also, reimplement the remaining queue using a TAILQ (it was
| technically impossible before, with two queues).
* First step of TSO (TCP segmentation offload) support in our network stack.andre2006-09-061-5/+19
| o add IFCAP_TSO[46] for drivers to announce this capability for IPv4
|   and IPv6
| o add CSUM_TSO flag to mbuf pkthdr csum_flags field
| o add tso_segsz field to mbuf pkthdr
| o enhance ip_output() packet length check to allow for large TSO
|   packets
| o extend tcp_maxmtu[46]() with a flag pointer to pass interface
|   capabilities
| o adjust all callers of tcp_maxmtu[46]() accordingly
|
| Discussed on: -current, -net
| Sponsored by: TCP/IP Optimization Fundraise 2005
* o Backout rev. 1.125 of in_pcb.c. It appeared to behave extremelyglebius2006-09-061-4/+3
|   bad under high load. For example with 40k sockets and 25k tcptw
|   entries, connect() syscall can run for seconds. Debugging showed
|   that it iterates the cycle millions of times and purges thousands
|   of tcptw entries at a time.
|   Besides practical unusability this change is architecturally
|   wrong. First, in_pcblookup_local() is used in connect() and bind()
|   syscalls. No stale entry purging should be done here. Second,
|   it is a layering violation.
| o Return the tcptw purging cycle to tcp_timer_2msl_tw(), which
|   was removed in rev. 1.78 by rwatson. The commit log of this
|   revision tells nothing about the reason the cycle was removed. Now
|   we need this cycle, since the major cleaner of stale tcptw
|   structures is removed.
| o Disable probably necessary, but now unused
|   tcp_twrecycleable() function.
|
| Reviewed by: ru
* Finally fix rev. 1.256glebius2006-09-051-3/+4
| | | | Pointy hat to: glebius
* Remove extra parenthesis in last commit.glebius2006-09-051-2/+2
| | | | Nitpicked by: ru
* - Make net.inet.tcp.maxtcptw modifiable at run time.glebius2006-09-051-7/+28
| | | | | - If net.inet.tcp.maxtcptw was ever set explicitly, do not change it if kern.ipc.maxsockets is changed.
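The "explicit setting wins" behaviour described in the two points above can be sketched as follows. The struct and function names are hypothetical, and using 0 as the "never set explicitly" marker is an assumption of this sketch. The maxsockets / 5 default is the autoscaling rule in effect at the time, as mentioned in a later entry of this log.

```c
/* Hypothetical state for the TIME_WAIT zone limit. */
struct tw_limit_sketch {
	int maxtcptw;	/* 0 means "never set explicitly" (assumption) */
	int zone_limit;	/* limit currently applied to the tcptw zone */
};

/* Called when kern.ipc.maxsockets changes. */
void
maxsockets_changed_sketch(struct tw_limit_sketch *l, int maxsockets)
{
	/* Only autoscale if the administrator never chose a value. */
	if (l->maxtcptw == 0)
		l->zone_limit = maxsockets / 5;
}

/* Called when net.inet.tcp.maxtcptw is written at run time. */
int
set_maxtcptw_sketch(struct tw_limit_sketch *l, int value)
{
	if (value <= 0)
		return (-1);	/* reject nonsensical limits */
	l->maxtcptw = value;
	l->zone_limit = value;
	return (0);
}
```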
* Fix for a bug that causes the computation of "len" in tcp_output() tomohans2006-08-261-0/+4
| | | | | get messed up, resulting in an inconsistency between the TCP state and so_snd.
* Fixes an edge case bug in timewait handling where ticks rolling over causingmohans2006-08-111-1/+1
| | | | | the timewait expiry to be exactly 0 corrupts the timewait queues (and that entry). Reviewed by: silby
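The edge case arises because a tick counter wraps, so "now + timeout" can land exactly on 0. The sketch below shows one plausible guard, under the loudly stated assumption that 0 is reserved as a sentinel for "not queued". It is not the actual one-line change in this commit, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Compute a TIME_WAIT expiry tick. Unsigned addition wraps modulo
 * 2^32, so the sum can be exactly 0 when the counter rolls over.
 * Assumption: 0 is a sentinel value, so nudge the expiry to 1.
 */
uint32_t
tw_expiry_sketch(uint32_t ticks, uint32_t timeout)
{
	uint32_t expiry = ticks + timeout;

	return (expiry == 0 ? 1 : expiry);
}

/* Wrap-safe "a is earlier than b" for tick values less than 2^31 apart. */
int
tick_before_sketch(uint32_t a, uint32_t b)
{
	return ((int32_t)(a - b) < 0);
}
```

Comparing through the signed difference, rather than with a plain `<`, keeps queue ordering correct across the rollover itself.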
* Move soisdisconnected() in tcp_discardcb() to one of its calling contexts,rwatson2006-08-021-12/+7
| tcp_twstart(), but not to the other, tcp_detach(), as the socket is
| already being torn down and therefore there are no listeners. This
| avoids a panic if kqueue state is registered on the socket at close(),
| and eliminates two XXX comments. There is one case remaining in which
| tcp_discardcb() reaches up to the socket layer as part of the TCP host
| cache, which would be good to avoid.
|
| Reported by: Goran Gajic <ggajic at afrodita dot rcub dot bg dot ac dot yu>
* Change semantics of socket close and detach. Add a new protocol switchrwatson2006-07-211-37/+12
| function, pru_close, to notify protocols that the file descriptor or
| other consumer of a socket is closing the socket. pru_abort is now a
| notification of close also, and no longer detaches. pru_detach is no
| longer used to notify of close, and will be called during socket
| tear-down by sofree() when all references to a socket evaporate after
| an earlier call to abort or close the socket. This means detach is now
| an unconditional teardown of a socket, whereas previously sockets could
| persist after detach if the protocol retained a reference.
|
| This facilitates sharing mutexes between layers of the network stack as
| the mutex is required during the checking and removal of references at
| the head of sofree(). With this change, pru_detach can now assume that
| the mutex will no longer be required by the socket layer after
| completion, whereas before this was not necessarily true.
|
| Reviewed by: gnn
* Fix race conditions on enumerating pcb lists by moving the initializationups2006-07-181-2/+14
| (and where appropriate the destruction) of the pcb mutex to the
| init/finit functions of the pcb zones.
| This allows locking of the pcb entries and race condition free
| comparison of the generation count.
| Rearrange locking a bit to avoid extra locking operation to update
| the generation count in in_pcballoc(). (in_pcballoc now returns the
| pcb locked)
|
| I am planning to convert pcb list handling from a type safe to a
| reference count model soon. (As this allows really freeing the PCBs)
|
| Reviewed by: rwatson@, mohans@
| MFC after: 1 week
* Abstract inpcb drop logic, previously just setting of INP_DROPPED in TCP,rwatson2006-04-251-3/+2
| | | | | | | | | | into in_pcbdrop(). Expand logic to detach the inpcb from its bound address/port so that dropping a TCP connection releases the inpcb resource reservation, which since the introduction of socket/pcb reference count updates, has been persisting until the socket closed rather than being released implicitly due to prior freeing of the inpcb on TCP drop. MFC after: 3 months
* Replace isn_mtx direct use with ISN_*() lock macros so that lockingrwatson2006-04-231-5/+9
| | | | | | details/strategy can be changed without touching every use. MFC after: 3 months
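Wrapping a lock in macros so that callers never name the mutex directly might look like this in userspace. The sketch uses a pthread mutex as a stand-in for the kernel mutex, and the exact macro names, the `isn_offset` variable, and `isn_bump_sketch()` are assumptions for illustration.

```c
#include <pthread.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's isn_mtx. */
static pthread_mutex_t isn_mtx = PTHREAD_MUTEX_INITIALIZER;

/*
 * Callers use only these macros, so the locking strategy can be
 * changed here without touching every use.
 */
#define ISN_LOCK()	pthread_mutex_lock(&isn_mtx)
#define ISN_UNLOCK()	pthread_mutex_unlock(&isn_mtx)

/* Hypothetical piece of initial-sequence-number state. */
static uint32_t isn_offset;

/* Advance the protected state and return its new value. */
uint32_t
isn_bump_sketch(uint32_t step)
{
	uint32_t value;

	ISN_LOCK();
	isn_offset += step;
	value = isn_offset;
	ISN_UNLOCK();
	return (value);
}
```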
* Introduce a new TCP mutex, isn_mtx, which protects the initial sequencerwatson2006-04-221-3/+6
| | | | | | | | number state, rather than re-using pcbinfo. This introduces some additional mutex operations during isn query, but avoids hitting the TCP pcbinfo lock out of yet another frequently firing TCP timer. MFC after: 3 months
* Allow for nmbclusters and maxsockets to be increased via sysctl.ps2006-04-211-0/+11
| | | | | An eventhandler is used to update all the various zones that depend on these values.
* Add a tunable net.inet.tcp.maxtcptw, that allows to set a limitglebius2006-04-041-1/+8
| | | | on tcptw zone independently from setting a limit on socket zone.
* Before dereferencing intotw() when INP_TIMEWAIT, check for inp_ppcb beingrwatson2006-04-041-5/+15
| | | | | | | | | | NULL. We currently do allow this to happen, but may want to remove that possibility in the future. This case can occur when a socket is left open after TCP wraps up, and the timewait state is recycled. This will be cleaned up in the future. Found by: Kazuaki Oda <kaakun at highway dot ne dot jp> MFC after: 3 months
* In TCP notify routines, check inpcb for INP_TIMEWAIT and INP_DROPPED.rwatson2006-04-031-66/+81
| The INP_DROPPED check replaces the current NULL checks; the
| INP_TIMEWAIT checks appear to have always been required, but not been
| there, which is/was a bug. This avoids unconditional casting of
| inp_ppcb to a tcpcb, when it may be a tcptw, which may have resulted
| in obscure ICMP-related panics in earlier releases.
|
| MFC after: 3 months
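The hazard is that one untyped pointer can refer to either of two structures, and only a flag says which. A minimal sketch of checking the flags before casting follows. The struct layouts, flag values, and helper name are illustrative assumptions, not the kernel definitions.

```c
#include <stddef.h>

/* Illustrative flag values; the real ones are defined in in_pcb.h. */
#define INP_TIMEWAIT_SKETCH	0x01
#define INP_DROPPED_SKETCH	0x02

struct tcpcb_sketch { int t_state; };	/* full TCP control block */
struct tcptw_sketch { int tw_time; };	/* compressed TIME_WAIT state */

/* Hypothetical stand-in for struct inpcb. */
struct inpcb_sketch {
	int   flags;
	void *inp_ppcb;	/* tcpcb or tcptw, depending on flags */
};

/*
 * Return the tcpcb only when the flags say inp_ppcb really is one.
 * A notify routine would simply return when this yields NULL.
 */
struct tcpcb_sketch *
inp_to_tcpcb_checked(struct inpcb_sketch *inp)
{
	if (inp->flags & (INP_TIMEWAIT_SKETCH | INP_DROPPED_SKETCH))
		return (NULL);
	return ((struct tcpcb_sketch *)inp->inp_ppcb);
}
```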
* Change inp_ppcb from caddr_t to void *, fix/remove associated relatedrwatson2006-04-031-7/+10
| casts. Consistently use intotw() to cast inp_ppcb pointers to struct
| tcptw * pointers. Consistently use intotcpcb() to cast inp_ppcb
| pointers to struct tcpcb * pointers.
|
| Don't assign tp to the result of intotcpcb() during variable
| declaration at the top of functions, as that is before the asserts
| relating to locking have been performed. Do this later in the function
| after appropriate assertions have run to allow that operation to be
| considered safe.
|
| MFC after: 3 months
* Style tweaks: convert to ANSI from K&R function prototypes.rwatson2006-04-031-59/+26
| | | | MFC after: 3 months