path: root/sys/netinet/tcp_syncache.c
Commit message | Author | Age | Files | Lines
...
* Be more restrictive with segment validity checks in syncache_expand() and log check failures to syslog at LOG_DEBUG level. Always prefill the sc->sc_ts field to use it in the checks.
  Author: andre; Age: 2007-05-18; Files: 1; Lines: -3/+42
* o Add syslog logging under LOG_DEBUG to various failures caused by bogus segments
  o Add more KASSERT()s
  o Update comments
  Author: andre; Age: 2007-05-18; Files: 1; Lines: -5/+38
* Use existing TF_SACK_PERMIT flag in struct tcpcb t_flags field instead of a dedicated sack_enable int for this bool. Change all users accordingly.
  Author: andre; Age: 2007-05-06; Files: 1; Lines: -3/+1
* o Remove unused and redundant TCP option definitions
  o Replace usage of MAX_TCPOPTLEN with the correctly constructed and derived MAX_TCPOPTLEN
  Author: andre; Age: 2007-04-20; Files: 1; Lines: -1/+1
* Remove bogus check for accept queue length and associated failure handling from the incoming SYN handling section of tcp_input().
  Enforcement of the accept queue limits is done by sonewconn() after the 3WHS is completed. It is not necessary to have an earlier check before a connection request enters the SYN cache awaiting the full handshake. It rather limits the effectiveness of the syncache by preventing legit and illegit connections from entering it and having them shaken out before we hit the real limit which may have vanished by then.
  Change return value of syncache_add() to void. No status communication is required.
  Author: andre; Age: 2007-04-20; Files: 1; Lines: -2/+2
* Simplify syncache_expand() and clarify its semantics. Zero is returned when the ACK is invalid and doesn't belong to any registered connection, either in syncache or through SYN cookies. True but a NULL struct socket is returned when the 3WHS completed but the socket could not be created due to insufficient resources or limits reached.
  For both cases an RST is sent back in tcp_input().
  A logic error leading to a panic is fixed where syncache_expand() would free the mbuf on socket allocation failure but tcp_input() later supplies it to tcp_dropwithreset() to issue a RST to the peer.
  Reported by: kris (the panic)
  Author: andre; Age: 2007-04-20; Files: 1; Lines: -17/+4
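The three outcomes described in this commit message can be sketched as a small classifier. This is an illustration only: `struct socket` here is a stand-in, and `enum ack_action` and `classify_expand_result()` are hypothetical names, not kernel code.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct socket; only its address matters here. */
struct socket { int unused; };

/* Hypothetical outcome labels for the caller (tcp_input() in the kernel). */
enum ack_action {
	ACK_DELIVER,          /* handshake done, socket created */
	ACK_RESET_UNKNOWN,    /* ACK matches no syncache entry or SYN cookie */
	ACK_RESET_NO_SOCKET   /* handshake done, socket could not be created */
};

/*
 * rv is the (assumed) integer result of the expand step and so is the
 * socket it produced, if any. In both reset cases the caller still owns
 * the segment buffer and uses it to send the RST, so the expand step
 * must not have freed it.
 */
static enum ack_action
classify_expand_result(int rv, const struct socket *so)
{
	if (rv == 0)
		return ACK_RESET_UNKNOWN;
	if (so == NULL)
		return ACK_RESET_NO_SOCKET;
	return ACK_DELIVER;
}
```

Keeping buffer ownership with the caller in every failure path is what avoids the use-after-free style panic the message mentions.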
* Only update TCP timestamp on SYN duplication if it is present on current SYN in syncache_add(). Otherwise disable timestamps.
  Author: andre; Age: 2007-04-20; Files: 1; Lines: -1/+3
* o Plug memory leak in syncache_add() on MAC label allocation failure.
  o Simplify code flow with 'done' goto label.
  o Remove mbuf argument from syncache_respond(). It doesn't make use of it.
  Author: andre; Age: 2007-04-20; Files: 1; Lines: -18/+12
* When we run into the syncache entry limits syncache_add() tries to free the oldest entry in the current bucket row. The global entry limit may be smaller than the bucket rows and their limit combined however. Thus only try to free a syncache entry if we found one in this bucket row.
  Reported by: kris
  Author: andre; Age: 2007-04-17; Files: 1; Lines: -2/+2
* Change the TCP timer system from using the callout system five times directly to a merged model where only one callout, the next to fire, is registered.
  Instead of callout_reset(9) and callout_stop(9) the new function tcp_timer_activate() is used which then internally manages the callout.
  The single new callout is a mutex callout on inpcb simplifying the locking a bit.
  tcp_timer() is the called function which handles all race conditions in one place and then dispatches the individual timer functions.
  Reviewed by: rwatson (earlier version)
  Author: andre; Age: 2007-04-11; Files: 1; Lines: -1/+1
* Move last tcpcb initialization for the inbound connection case from tcp_input() to syncache_socket() where it belongs and the majority of it already happens. The "tp->snd_up = tp->snd_una" is removed as it is done with the tcp_sendseqinit() macro a few lines earlier.
  Author: andre; Age: 2007-04-04; Files: 1; Lines: -0/+3
* Unbreak IPv6 after consolidation of TCP options insertion.
  Submitted by: tegge
  Author: andre; Age: 2007-03-17; Files: 1; Lines: -3/+2
* Fix the most obvious of the bugs introduced by recent syncache changes
  - *ip is not initialized in the case of inet6 connection, but ip->ip_len is being changed anyway.
  Now the question is, why does it think an ipv4 connection is an ipv6 connection? xemacs still doesn't work over X11 forwarding, but the kernel no longer panics.
  Author: kmacy; Age: 2007-03-17; Files: 1; Lines: -0/+3
* Consolidate insertion of TCP options into a segment from within tcp_output() and syncache_respond() into its own generic function tcp_addoptions().
  tcp_addoptions() is alignment agnostic and does optimal packing in all cases.
  In struct tcpopt rename to_requested_s_scale to just to_wscale.
  Add a comment with quote from RFC1323: "The Window field in a SYN (i.e., a <SYN> or <SYN,ACK>) segment itself is never scaled."
  Reviewed by: silby, mohans, julian
  Sponsored by: TCP/IP Optimization Fundraise 2005
  Author: andre; Age: 2007-03-15; Files: 1; Lines: -75/+43
* Change the way the advertized TCP window scaling is computed. Instead of upper-bounding it to the size of the initial socket buffer lower-bound it to the smallest MSS we accept. Ideally we'd use the actual MSS information here but it is not available yet.
  For socket buffer auto sizing to be effective we need room to grow the receive window. The window scale shift is determined at connection setup and can't be changed afterwards. The previous, original, method effectively just did a power of two roundup of the socket buffer size at connection setup severely limiting the headroom for larger socket buffers.
  Tested by: many (as part of the socket buffer auto sizing patch)
  MFC after: 1 month
  Author: andre; Age: 2007-02-01; Files: 1; Lines: -2/+8
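The difference between the two methods can be sketched as follows. This is one reading of the commit message rather than the committed code: the function names are hypothetical, and the constants 14 and 65535 are the RFC 1323 limits for the shift count and the unscaled window.

```c
#include <stdint.h>

#define TCP_MAX_WINSHIFT 14      /* largest window scale shift (RFC 1323) */
#define TCP_MAXWIN       65535u  /* largest window without scaling */

/*
 * Previous method as described: pick the smallest shift whose maximum
 * scaled window covers the initial socket buffer size. A small initial
 * buffer yields a small shift and little room to grow later.
 */
static int
wscale_from_sockbuf(uint32_t sb_hiwat)
{
	int wscale = 0;

	while (wscale < TCP_MAX_WINSHIFT &&
	    ((uint32_t)TCP_MAXWIN << wscale) < sb_hiwat)
		wscale++;
	return wscale;
}

/*
 * New method as described: lower-bound the shift by the smallest MSS we
 * accept, so that one window unit (1 << wscale) is at least minmss.
 */
static int
wscale_from_minmss(uint32_t minmss)
{
	int wscale = 0;

	while (wscale < TCP_MAX_WINSHIFT && ((uint32_t)1 << wscale) < minmss)
		wscale++;
	return wscale;
}
```

With an assumed minimum MSS of 216 the second function returns a shift of 8, which allows windows up to 65535 << 8 (about 16 MB) regardless of the initial buffer size; a 64 KB initial buffer under the first function yields a shift of 1 at most.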
* Fix LOR between the syncache and inpcb locks when MAC is present in the kernel. This LOR snuck in with some of the recent syncache changes. To fix this, the inpcb handling was changed:
  - Hang a MAC label off the syncache object
  - When the syncache entry is initially created, the PCB lock is held because we extract information from it while initializing the syncache entry. While we do this, copy the MAC label associated with the PCB and use it for the syncache entry.
  - When the packet is transmitted, copy the label from the syncache entry to the mbuf so it can be processed by security policies which analyze mbuf labels.
  This change required that the MAC framework be extended to support the label copy operations from the PCB to the syncache entry, and then from the syncache entry to the mbuf.
  These functions really should be referencing the syncache structure instead of the label. However, due to some of the complexities associated with exposing this syncache structure we operate directly on its label pointer. This should be OK since we aren't making any access control decisions within this code directly, we are merely allocating and copying label storage so we can properly initialize mbuf labels for any packets the syncache code might create.
  This also has a nice side effect of caching. Prior to this change, the PCB would be looked up/locked for each packet transmitted. Now the label is cached at the time the syncache entry is initialized.
  Submitted by: andre [1]
  Discussed with: rwatson
  [1] andre submitted the tcp_syncache.c changes
  Author: csjp; Age: 2006-12-13; Files: 1; Lines: -43/+44
* Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead.
  This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd.
  Obtained from: TrustedBSD Project
  Sponsored by: SPARTA
  Author: rwatson; Age: 2006-10-22; Files: 1; Lines: -1/+2
* Add missing #ifdef INET6 (can't be compiled)
  Author: ache; Age: 2006-09-14; Files: 1; Lines: -0/+2
* Remove unnecessary includes and follow common ordering style.
  Author: andre; Age: 2006-09-13; Files: 1; Lines: -10/+2
* Rewrite of TCP syncookies to remove locking requirements and to enhance functionality:
  - Remove a rwlock acquisition/release per generated syncookie. Locking is now integrated with the bucket row locking of syncache itself and syncookies no longer add any additional lock overhead.
  - Syncookie secrets are different for and stored per syncache bucket row. Secrets expire after 16 seconds and are reseeded on-demand.
  - The computational overhead for syncookie generation and verification is one MD5 hash computation as before.
  - Syncache can be turned off and run with syncookies only by setting the sysctl net.inet.tcp.syncookies_only=1.
  This implementation extends the original idea and first implementation of FreeBSD by using not only the initial sequence number field to store information but also the timestamp field if present. This way we can keep track of the entire state we need to know to recreate the session in its original form. Almost all TCP speakers implement RFC1323 timestamps these days. For those that do not we still have to live with the known shortcomings of the ISN only SYN cookies. The use of the timestamp field causes the timestamps to be randomized if syncookies are enabled.
  The idea of SYN cookies is to encode and include all necessary information about the connection setup state within the SYN-ACK we send back and thus to get along without keeping any local state until the ACK to the SYN-ACK arrives (if ever). Everything we need to know should be available from the information we encoded in the SYN-ACK.
  A detailed description of the inner working of the syncookies mechanism is included in the comments in tcp_syncache.c.
  Reviewed by: silby (slightly earlier version)
  Sponsored by: TCP/IP Optimization Fundraise 2005
  Author: andre; Age: 2006-09-13; Files: 1; Lines: -191/+277
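The stateless idea described above can be illustrated with a toy cookie. Everything in this sketch is an assumption made for illustration: the struct and function names are hypothetical, the hash is FNV-1a rather than the MD5 the commit uses, and the bit layout (low three bits carrying an MSS table index) is not the layout documented in tcp_syncache.c. Only the 16 second secret lifetime comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define SECRET_LIFETIME 16	/* seconds, per the commit message */

/* Hypothetical per-bucket-row secret. */
struct bucket_secret {
	uint32_t key;
	uint32_t born;		/* time in seconds when key was seeded */
};

/* Hypothetical connection 4-tuple. */
struct flow {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* Reseed on demand: replace the key only once it has expired. */
static void
secret_refresh(struct bucket_secret *s, uint32_t now, uint32_t new_key)
{
	if (now - s->born >= SECRET_LIFETIME) {
		s->key = new_key;
		s->born = now;
	}
}

/* Toy keyed hash (FNV-1a over five 32-bit words); NOT cryptographic. */
static uint32_t
toy_hash(uint32_t key, const struct flow *f, uint32_t irs)
{
	uint32_t words[5] = { key, f->saddr, f->daddr,
	    ((uint32_t)f->sport << 16) | f->dport, irs };
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < 5; i++) {
		for (int b = 0; b < 4; b++) {
			h ^= (words[i] >> (8 * b)) & 0xffu;
			h *= 16777619u;
		}
	}
	return h;
}

/* Build the ISN for the SYN-ACK: hash bits plus a 3-bit MSS index. */
static uint32_t
cookie_make(const struct bucket_secret *s, const struct flow *f,
    uint32_t irs, unsigned mss_idx)
{
	return (toy_hash(s->key, f, irs) & ~0x7u) | (mss_idx & 0x7u);
}

/* On the final ACK: return the MSS index, or -1 if the cookie is bogus. */
static int
cookie_check(const struct bucket_secret *s, const struct flow *f,
    uint32_t irs, uint32_t cookie)
{
	if (((toy_hash(s->key, f, irs) ^ cookie) & ~0x7u) != 0)
		return -1;
	return (int)(cookie & 0x7u);
}
```

A cookie made just before a reseed no longer verifies afterwards in this toy, so a practical design would need to remember which secret generation produced each cookie.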
* In syncache_respond() do not reply with a MSS that is larger than what the peer announced to us but make it at least tcp_minmss in size.
  Sponsored by: TCP/IP Optimization Fundraise 2005
  Author: andre; Age: 2006-06-26; Files: 1; Lines: -0/+2
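The clamp described here amounts to a min followed by a max. A minimal sketch, with a hypothetical function name and the three values passed in as plain parameters:

```c
#include <stdint.h>

/*
 * Choose the MSS to advertise in the SYN-ACK: never larger than what
 * the peer announced, never smaller than the configured minimum.
 */
static uint16_t
synack_mss(uint16_t local_mss, uint16_t peer_mss, uint16_t minmss)
{
	uint16_t mss = local_mss < peer_mss ? local_mss : peer_mss;

	return mss < minmss ? minmss : mss;
}
```

For example, with a local MSS of 1460 and an assumed minimum of 216, a peer announcing 536 gets 536 back, while a peer announcing 100 gets 216.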
* Some cleanups and janitorial work to tcp_syncache:
  o don't assign remote/local host/port information manually between provided struct in_conninfo and struct syncache, bcopy() it instead
  o rename sc_tsrecent to sc_tsreflect in struct syncache to better capture the purpose of this field
  o rename sc_request_r_scale to sc_requested_r_scale for ditto reasons
  o fix IPSEC error case printf's to report correct function name
  o in syncache_socket() only transpose enhanced tcp options parameters to struct tcpcb when the inpcb doesn't have TF_NOOPT set
  o in syncache_respond() reorder stack variables
  o in syncache_respond() remove bogus KASSERT()
  No functional changes.
  Sponsored by: TCP/IP Optimization Fundraise 2005
  Author: andre; Age: 2006-06-26; Files: 1; Lines: -45/+33
* Reverse the source/destination parameters to in[6]_pcblookup_hash() in syncache_respond() for the #ifdef MAC case.
  Submitted by: Tai-hwa Liang <avatar-at-mmlab.cse.yzu.edu.tw>
  Author: andre; Age: 2006-06-26; Files: 1; Lines: -2/+2
* Decrement the global syncache counter in syncache_expand() when the entry is removed from the bucket. This fixes the syncache statistics.
  Author: andre; Age: 2006-06-25; Files: 1; Lines: -0/+1
* Move the syncookie MD5 context from globals to the stack to make it MP safe.
  Author: andre; Age: 2006-06-22; Files: 1; Lines: -2/+2
* Allocate a zero'ed syncache hashtable. mtx_init() tests the supplied memory location for already existing/initialized mutexes. With random data in the memory location this fails (ie. after a soft reboot).
  Reported by: brueffer, YAMAMOTO Shigeru
  Submitted by: YAMAMOTO Shigeru <shigeru-at-iij.ad.jp>
  Author: andre; Age: 2006-06-20; Files: 1; Lines: -1/+1
* Do not access syncache entry before it was allocated for the TF_NOOPT case in syncache_add().
  Found by: Coverity Prevent
  CID: 1473
  Author: andre; Age: 2006-06-18; Files: 1; Lines: -3/+4
* Move all syncache related structures to tcp_syncache.c. They are only used there.
  This unbreaks userland programs that include tcp_var.h.
  Discussed with: rwatson
  Author: andre; Age: 2006-06-18; Files: 1; Lines: -0/+35
* Remove double lock acquisition in syncookie_lookup() which came from last minute conversions to macros.
  Pointy hat to: andre
  Author: andre; Age: 2006-06-18; Files: 1; Lines: -1/+0
* Fix the !INET6 compile.
  Reported by: alc
  Author: andre; Age: 2006-06-17; Files: 1; Lines: -2/+4
* ANSIfy and tidy up comments.
  Sponsored by: TCP/IP Optimization Fundraise 2005
  Author: andre; Age: 2006-06-17; Files: 1; Lines: -52/+23
* Add locking to TCP syncache and drop the global tcpinfo lock as early as possible for the syncache_add() case. The syncache timer no longer acquires the tcpinfo lock and timeout/retransmit runs can happen in parallel with bucket granularity.
  On a P4 the additional locks cause a slight degradation of 0.7% in tcp connections per second. When IP and TCP input are deserialized and can run in parallel this little overhead can be neglected.
  The syncookie handling still leaves room for improvement and its random salts may be moved to the syncache bucket head structures to remove the second lock operation currently required for it. However this would be a more involved change from the way syncookies work at the moment.
  Reviewed by: rwatson
  Tested by: rwatson, ps (earlier version)
  Sponsored by: TCP/IP Optimization Fundraise 2005
  Author: andre; Age: 2006-06-17; Files: 1; Lines: -254/+285
* Change soabort() from returning int to returning void, since all consumers ignore the return value, soabort() is required to succeed, and protocols produce errors here to report multiple freeing of the pcb, which we hope to eliminate.
  Author: rwatson; Age: 2006-03-16; Files: 1; Lines: -1/+1
* Rework TCP window scaling (RFC1323) to properly scale the send window right from the beginning and partly clean up the differences in handling between SYN_SENT and SYN_RCVD (syncache).
  Further changes to this code to come. This is a first incremental step to a general overhaul and streamlining of the TCP code.
  PR: kern/15095
  PR: kern/92690 (partly)
  Reviewed by: qingli (and tested with ANVL)
  Sponsored by: TCP/IP Optimization Fundraise 2005
  Author: andre; Age: 2006-02-28; Files: 1; Lines: -1/+1
* Set the M_ZERO flag when calling uma_zalloc() to allocate a syncache entry.
  Reviewed by: andre, glebius
  MFC after: 3 days
  Author: qingli; Age: 2006-02-09; Files: 1; Lines: -5/+4
* Redo the previous fix by setting the UMA_ZONE_ZINIT bit in the syncache zone, eliminating the need to call bzero() after each syncache entry allocation.
  Suggested by: glebius
  Reviewed by: andre
  MFC after: 3 days
  Author: qingli; Age: 2006-02-08; Files: 1; Lines: -3/+2
* Fix a crash caused by the memory of the newly allocated syncache entry in syncache_lookup() not being cleared, which may lead to an arbitrary and bogus rtentry pointer which later gets free'd.
  Reviewed by: andre
  MFC after: 3 days
  Author: qingli; Age: 2006-02-07; Files: 1; Lines: -0/+1
* In syncache_expand() insert a proper syncache_free() to fix a case that currently can't be triggered. But better be safe than sorry later on. Additionally it properly silences Coverity Prevent for future tests.
  Found by: Coverity Prevent(tm)
  Coverity ID: CID802
  Sponsored by: TCP/IP Optimization Fundraise 2005
  MFC after: 3 days
  Author: andre; Age: 2006-01-18; Files: 1; Lines: -1/+4
* UMA can return NULL not only in case when our zone is full, but also in case of generic memory shortage. In the latter case we may not find an old entry.
  Found with: Coverity Prevent(tm)
  Author: glebius; Age: 2006-01-14; Files: 1; Lines: -1/+7
* Consolidate all IP Options handling functions into ip_options.[ch] and include ip_options.h into all files making use of IP Options functions.
  From ip_input.c rev 1.306:
    ip_dooptions(struct mbuf *m, int pass)
    save_rte(m, option, dst)
    ip_srcroute(m0)
    ip_stripoptions(m, mopt)
  From ip_output.c rev 1.249:
    ip_insertoptions(m, opt, phlen)
    ip_optcopy(ip, jp)
    ip_pcbopts(struct inpcb *inp, int optname, struct mbuf *m)
  No functional changes in this commit.
  Discussed with: rwatson
  Sponsored by: TCP/IP Optimization Fundraise 2005
  Author: andre; Age: 2005-11-18; Files: 1; Lines: -0/+1
* Retire MT_HEADER mbuf type and change its users to use MT_DATA.
  Having an additional MT_HEADER mbuf type is superfluous and redundant as nothing depends on it. It only adds a layer of confusion. The distinction between header mbuf's and data mbuf's is solely done through the m->m_flags M_PKTHDR flag.
  Non-native code is not changed in this commit. For compatibility MT_HEADER is mapped to MT_DATA.
  Sponsored by: TCP/IP Optimization Fundraise 2005
  Author: andre; Age: 2005-11-02; Files: 1; Lines: -1/+1
* Do not ignore all other TCP options (eg. timestamp, window scaling) when responding to TCP SYN packets with TCP_MD5 enabled and set.
  PR: kern/82963
  Submitted by: <demizu at dd.iij4u.or.jp>
  MFC after: 3 days
  Author: andre; Age: 2005-09-14; Files: 1; Lines: -1/+1
* - Refuse hashsize of 0, since it is invalid.
  - Use defined constant instead of 512.
  Author: glebius; Age: 2005-08-25; Files: 1; Lines: -2/+2
* Remove no-op spl's and most comment references to spls, as TCP locking is believed to be basically done (modulo any remaining bugs).
  MFC after: 3 days
  Author: rwatson; Age: 2005-07-19; Files: 1; Lines: -1/+0
* Remove some code that snuck in by accident.
  Submitted by: Mohan Srinivasan
  Author: ps; Age: 2005-04-21; Files: 1; Lines: -5/+0
* Fix for interaction problems between TCP SACK and TCP Signature. If TCP Signatures are enabled, the maximum allowed sack blocks aren't going to fit. The fix is to compute how many sack blocks fit and tack these on last. Also on SYNs, defer padding until after the SACK PERMITTED option has been added.
  Found by: Mohan Srinivasan.
  Submitted by: Mohan Srinivasan, Noritoshi Demizu.
  Reviewed by: Raja Mukerji.
  Author: ps; Age: 2005-04-21; Files: 1; Lines: -10/+22
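The "compute how many sack blocks fit" step can be sketched with simple option-space arithmetic. The function name is hypothetical; the sizes are the standard ones (40 bytes of TCP option space, a 2 byte SACK option header, 8 bytes per SACK block), and the caller is assumed to pass the bytes already consumed by other options, padding included.

```c
#include <assert.h>

#define TCP_MAXOLEN      40	/* total TCP option space in bytes */
#define TCPOLEN_SACKHDR   2	/* SACK option kind + length */
#define TCPOLEN_SACK      8	/* one block: two 4-byte sequence numbers */

/*
 * Return how many SACK blocks can be appended after optlen_used bytes
 * of other options, capped at the number the sender wants to report.
 */
static int
sack_blocks_that_fit(int optlen_used, int blocks_wanted)
{
	int room, fit;

	assert(optlen_used >= 0 && optlen_used <= TCP_MAXOLEN);
	room = TCP_MAXOLEN - optlen_used - TCPOLEN_SACKHDR;
	fit = room > 0 ? room / TCPOLEN_SACK : 0;
	return blocks_wanted < fit ? blocks_wanted : fit;
}
```

With no other options four blocks fit; once an 18 byte signature option is present only two fit, and adding further options reduces the count again, which is why the blocks are sized and appended last.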
* Undo rev 1.71 as it is the wrong change.
  Author: ps; Age: 2005-04-21; Files: 1; Lines: -10/+7
* Fix for 2 bugs related to TCP Signatures:
  - If the peer sends the Signature option in the SYN, use of Timestamps and Window Scaling were disabled (even if the peer supports them).
  - The sender must not disable signatures if the option is absent in the received SYN. (See comment in syncache_add()).
  Found, Submitted by: Noritoshi Demizu <demizu at dd dot ij4u dot or dot jp>.
  Reviewed by: Mohan Srinivasan <mohans at yahoo-inc dot com>.
  Author: ps; Age: 2005-04-21; Files: 1; Lines: -7/+10
* Use NET_CALLOUT_MPSAFE macro.
  Author: glebius; Age: 2005-03-01; Files: 1; Lines: -2/+1
* Remove clause three from tcp_syncache.c license per permission of McAfee. Update copyright to McAfee from NETA.
  Author: rwatson; Age: 2005-01-30; Files: 1; Lines: -6/+3