path: root/sys/netinet
Commit message | Author | Age | Files | Lines
* Fix race conditions on enumerating pcb lists by moving the initialization | ups | 2006-07-18 | 8 | -24/+90
  (and where appropriate the destruction) of the pcb mutex to the init/finit
  functions of the pcb zones. This allows locking of the pcb entries and race
  condition free comparison of the generation count.

  Rearrange locking a bit to avoid extra locking operation to update the
  generation count in in_pcballoc(). (in_pcballoc now returns the pcb locked)

  I am planning to convert pcb list handling from a type safe to a reference
  count model soon. (As this allows really freeing the PCBs)

  Reviewed by: rwatson@, mohans@
  MFC after: 1 week
* Revise network interface cloning to take an optional opaque | sam | 2006-07-09 | 1 | -2/+2
  parameter that can specify configuration parameters:
  o rev cloner api's to add optional parameter block
  o add SIOCCREATE2 that accepts parameter data
  o rev vlan support to use new api (maintain old code)
  Reviewed by: arch@
* Make in-kernel multicast protocols for pfsync and carp work after enabling | mlaier | 2006-07-08 | 1 | -0/+5
  dynamic resizing of multicast membership array.
  Reported and testing by: Maxim Konovalov, Scott Ullrich
  Reminded by: thompsa
  MFC after: 2 weeks
* Remove unneeded mac.h include. | rwatson | 2006-07-06 | 1 | -1/+0
  MFC after: 3 days
* Complete timebase (time_second -> time_uptime) conversion. | oleg | 2006-07-05 | 1 | -4/+4
  PR: kern/94249
  Reviewed by: andre (few months ago)
  Approved by: glebius (mentor)
* o Kill BUGS section as it is not valid since rev. 1.4 alias_pptp.c. | maxim | 2006-07-04 | 1 | -6/+1
  Spotted by: ru.unix.bsd activists
  MFC after: 1 week
* There is a consensus that ifaddr.ifa_addr should never be NULL, | yar | 2006-06-29 | 3 | -7/+1
  except in places dealing with ifaddr creation or destruction; and
  in such special places incomplete ifaddrs should never be linked
  to system-wide data structures. Therefore we can eliminate all the
  superfluous checks for "ifa->ifa_addr != NULL" and get ready for
  the system crashing honestly instead of masking possible bugs.
  Suggested by: glebius, jhb, ru
* Use TAILQ_FOREACH consistently. | yar | 2006-06-29 | 1 | -2/+1
* Fix URL to Bellovin's paper. | glebius | 2006-06-29 | 1 | -1/+1
  Submitted by: Anton Yuzhaninov <citrin rambler-co.ru>
* Eliminate the offset argument from send_reject. It's not been | bz | 2006-06-29 | 1 | -9/+7
  used since FreeBSD-SA-06:04.ipfw.
  Adopt send_reject6 to what had been done for legacy IP: no longer
  send or permit sending rejects for any but the first fragment.
  Discussed with: oleg, csjp (some weeks ago)
* Use INPLOOKUP_WILDCARD instead of just 1 more consistently. | bz | 2006-06-29 | 4 | -8/+14
  OKed by: rwatson (some weeks ago)
* - Use suser_cred(9) instead of directly checking cr_uid. | pjd | 2006-06-27 | 1 | -2/+2
  - Change the order of conditions to first verify that we actually need
    to check for privileges and then eventually check them.
  Reviewed by: rwatson
* In syncache_respond() do not reply with a MSS that is larger than what | andre | 2006-06-26 | 1 | -0/+2
  the peer announced to us but make it at least tcp_minmss in size.
  Sponsored by: TCP/IP Optimization Fundraise 2005
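The clamping rule in the entry above can be sketched as follows. This is a minimal illustration, not the kernel code: the function name `clamp_reply_mss` and its parameters are hypothetical, and the floor is passed in explicitly rather than read from the tcp_minmss tunable.

```c
#include <assert.h>

/*
 * Hypothetical helper (not the actual syncache_respond() code):
 * choose the MSS to put in a SYN-ACK.
 */
static unsigned int
clamp_reply_mss(unsigned int our_mss, unsigned int peer_mss,
    unsigned int minmss)
{
	unsigned int mss = our_mss;

	if (peer_mss < mss)
		mss = peer_mss;		/* never exceed what the peer announced */
	if (mss < minmss)
		mss = minmss;		/* but never go below the configured floor */
	return (mss);
}
```

With a floor of, say, 216, a peer announcing 536 gets 536 back, while a peer announcing 100 gets 216.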
* Some cleanups and janitorial work to tcp_syncache: | andre | 2006-06-26 | 3 | -45/+35
  o don't assign remote/local host/port information manually between
    provided struct in_conninfo and struct syncache, bcopy() it instead
  o rename sc_tsrecent to sc_tsreflect in struct syncache to better
    capture the purpose of this field
  o rename sc_request_r_scale to sc_requested_r_scale for ditto reasons
  o fix IPSEC error case printf's to report correct function name
  o in syncache_socket() only transpose enhanced tcp options parameters
    to struct tcpcb when the inpcb doesn't have TF_NOOPT set
  o in syncache_respond() reorder stack variables
  o in syncache_respond() remove bogus KASSERT()
  No functional changes.
  Sponsored by: TCP/IP Optimization Fundraise 2005
* Some cleanups and janitorial work to tcp_dooptions(): | andre | 2006-06-26 | 3 | -41/+58
  o redefine the parameter 'is_syn' to 'flags', add TO_SYN flag and
    adjust its usage accordingly
  o update the comments to the tcp_dooptions() invocation in
    tcp_input():after_listen to reflect reality
  o move the logic checking the echoed timestamp out of tcp_dooptions()
    to the only place that uses it next to the invocation described in
    the previous item
  o adjust parsing of TCPOPT_SACK_PERMITTED to use the same style as
    the others
  o add comments in to struct tcpopt.to_flags #defines
  No functional changes.
  Sponsored by: TCP/IP Optimization Fundraise 2005
* Reverse the source/destination parameters to in[6]_pcblookup_hash() in | andre | 2006-06-26 | 1 | -2/+2
  syncache_respond() for the #ifdef MAC case.
  Submitted by: Tai-hwa Liang <avatar-at-mmlab.cse.yzu.edu.tw>
* In tcp6_usr_attach(), return immediately if SS_ISDISCONNECTED, to | rwatson | 2006-06-26 | 1 | -4/+2
  avoid dereferencing an uninitialized inp variable.
  Submitted by: Michiel Boland <michiel at boland dot org>
  MFC after: 1 month
* Decrement the global syncache counter in syncache_expand() when the entry | andre | 2006-06-25 | 1 | -0/+1
  is removed from the bucket. This fixes the syncache statistics.
* Move the syncookie MD5 context from globals to the stack to make it MP safe. | andre | 2006-06-22 | 1 | -2/+2
* - Pullup even when the extension header is unknown, to prevent | ume | 2006-06-22 | 1 | -1/+13
    infinite loop with net.inet6.ip6.fw.deny_unknown_exthdrs=0.
  - Teach ipv6 and ipencap as they appear in an IPv4/IPv6 over IPv6
    tunnel.
  - Test the next extension header even when the routing header type
    is unknown with net.inet6.ip6.fw.deny_unknown_exthdrs=0.
  Found by: xcast-fan-club
  MFC after: 1 week
* Allocate a zero'ed syncache hashtable. mtx_init() tests the supplied | andre | 2006-06-20 | 1 | -1/+1
  memory location for already existing/initialized mutexes. With random
  data in the memory location this fails (i.e. after a soft reboot).
  Reported by: brueffer, YAMAMOTO Shigeru
  Submitted by: YAMAMOTO Shigeru <shigeru-at-iij.ad.jp>
* When we receive an out-of-window SYN for an "ESTABLISHED" connection, | dwmalone | 2006-06-19 | 2 | -0/+4
  ACK the SYN as required by RFC793, rather than ignoring it. NetBSD
  have had a similar change since 1999.
  PR: 93236
  Submitted by: Grant Edwards <grante@visi.com>
  MFC after: 1 month
* Remove T/TCP RFC1644 Connection Count comparison macros. They are no longer | andre | 2006-06-18 | 1 | -13/+0
  used and needed.
  Sponsored by: TCP/IP Optimization Fundraise 2005
* Do not access syncache entry before it was allocated for the TF_NOOPT case | andre | 2006-06-18 | 1 | -3/+4
  in syncache_add().
  Found by: Coverity Prevent
  CID: 1473
* Move all syncache related structures to tcp_syncache.c. They are only used | andre | 2006-06-18 | 2 | -39/+39
  there. This unbreaks userland programs that include tcp_var.h.
  Discussed with: rwatson
* Remove double lock acquisition in syncookie_lookup() which came from last | andre | 2006-06-18 | 1 | -1/+0
  minute conversions to macros.
  Pointy hat to: andre
* Fix the !INET6 compile. | andre | 2006-06-17 | 1 | -2/+4
  Reported by: alc
* Rearrange fields in struct syncache and syncache_head to make them more | andre | 2006-06-17 | 1 | -5/+6
  cache line friendly.
  Sponsored by: TCP/IP Optimization Fundraise 2005
* ANSIfy and tidy up comments. | andre | 2006-06-17 | 1 | -52/+23
  Sponsored by: TCP/IP Optimization Fundraise 2005
* Add locking to TCP syncache and drop the global tcpinfo lock as early | andre | 2006-06-17 | 4 | -272/+312
  as possible for the syncache_add() case. The syncache timer no longer
  acquires the tcpinfo lock and timeout/retransmit runs can happen in
  parallel with bucket granularity.

  On a P4 the additional locks cause a slight degradation of 0.7% in tcp
  connections per second. When IP and TCP input are deserialized and
  can run in parallel this little overhead can be neglected.

  The syncookie handling still leaves room for improvement and its
  random salts may be moved to the syncache bucket head structures to
  remove the second lock operation currently required for it. However
  this would be a more involved change from the way syncookies work at
  the moment.

  Reviewed by: rwatson
  Tested by: rwatson, ps (earlier version)
  Sponsored by: TCP/IP Optimization Fundraise 2005
* Add support of 'tablearg' feature for: | oleg | 2006-06-15 | 1 | -15/+30
  - 'tag' & 'untag' action parameters.
  - 'tagged' & 'limit' rule options.
  Rule examples:
    pipe 1 tag tablearg ip from table(1) to any
    allow ip from any to table(2) tagged tablearg
    allow tcp from table(3) to any 25 setup limit src-addr tablearg
  sbin/ipfw/ipfw2.c:
  1) new macros
     GET_UINT_ARG - support of 'tablearg' keyword, argument range checking.
     PRINT_UINT_ARG - support of 'tablearg' keyword.
  2) strtoport(): do not silently truncate/accept invalid port list
     expressions like: '1,2-abc' or '1,2-3-4' or '1,2-3x4'.
  style(9) cleanup.
  Approved by: glebius (mentor)
  MFC after: 1 month
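The strict port-list parsing described in item 2 above can be illustrated with a small validator. This is a sketch under assumptions, not the ipfw2.c strtoport() code: the names `parse_port` and `port_list_is_valid` are hypothetical, and only numeric ports are handled (no service names).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Parse one decimal port at *sp and advance *sp past it; 0 on success. */
static int
parse_port(const char **sp, unsigned long *out)
{
	char *end;

	if (!isdigit((unsigned char)**sp))
		return (-1);
	*out = strtoul(*sp, &end, 10);
	if (*out > 65535)
		return (-1);
	*sp = end;
	return (0);
}

/*
 * Return 1 if s is a comma-separated list of ports or lo-hi ranges
 * with nothing left over, 0 otherwise.
 */
static int
port_list_is_valid(const char *s)
{
	unsigned long lo, hi;

	for (;;) {
		if (parse_port(&s, &lo) != 0)
			return (0);
		if (*s == '-') {		/* optional upper bound */
			s++;
			if (parse_port(&s, &hi) != 0)
				return (0);
		}
		if (*s == '\0')
			return (1);
		if (*s != ',')			/* trailing junk such as "x4" or "-4" */
			return (0);
		s++;
	}
}
```

The point of the design is that every character must be consumed by the grammar; anything left after a port or range that is not a comma or the end of the string rejects the whole expression instead of being silently dropped.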
* install_state(): style(9) cleanup | oleg | 2006-06-15 | 1 | -33/+36
  Approved by: glebius (mentor)
  MFC after: 1 month
* Enable proxy ARP answers on any of the bridged interfaces if proxy record | thompsa | 2006-06-09 | 1 | -3/+6
  belongs to another interface within the bridge group.
  PR: kern/94408
  Submitted by: Eygene A. Ryabinkin
  MFC after: 1 month
* install_state() should properly initialize 'addr_type' field of newly created | oleg | 2006-06-08 | 1 | -0/+1
  flows for O_LIMIT rules. Otherwise 'ipfw -d show' is unable to display
  PARENT rules properly.
  (This bug was exposed by ipfw2.c rev.1.90)
  Approved by: glebius (mentor)
  MFC after: 2 weeks
* Fix following rules: pipe X (tag|altq) Y ... | oleg | 2006-06-08 | 1 | -0/+4
  Approved by: glebius (mentor)
  MFC after: 2 weeks
* Push acquisition of pcbinfo lock out of tcp_usr_attach() into | rwatson | 2006-06-04 | 1 | -6/+8
  tcp_attach() after the call to soreserve(), as it doesn't require
  the global lock. Rearrange inpcb locking here also.
  MFC after: 1 month
* When entering a timer on a tcpcb, don't continue processing if it has been | rwatson | 2006-06-03 | 1 | -9/+14
  dropped. This prevents a bug introduced during the socket/pcb
  refcounting work from occurring, in which occasionally the retransmit
  timer may fire after a connection has been reset, resulting in the
  resulting R|A TCP packet having a source port of 0, as the port
  reservation has been released.
  While here, fixing up some RUNLOCK->WUNLOCK bugs.
  MFC after: 1 month
* Acquire udbinfo lock after call to soreserve() rather than before, as it | rwatson | 2006-06-03 | 1 | -4/+2
  is not required. This simplifies error-handling, and reduces the time
  that this lock is held.
  MFC after: 1 month
* Fix the following bpf(4) race condition which can result in a panic: | csjp | 2006-06-02 | 2 | -3/+3
  (1) bpf peer attaches to interface netif0
  (2) Packet is received by netif0
  (3) ifp->if_bpf pointer is checked and handed off to bpf
  (4) bpf peer detaches from netif0 resulting in ifp->if_bpf being
      initialized to NULL.
  (5) ifp->if_bpf is dereferenced by bpf machinery
  (6) Kaboom

  This race condition likely explains the various different kernel panics
  reported around sending SIGINT to tcpdump or dhclient processes. But really
  this race can result in kernel panics anywhere you have frequent bpf attach
  and detach operations with high packet per second load.

  Summary of changes:
  - Remove the bpf interface's "driverp" member
  - When we attach bpf interfaces, we now set the ifp->if_bpf member to the
    bpf interface structure. Once this is done, ifp->if_bpf should never be
    NULL. [1]
  - Introduce bpf_peers_present function, an inline operation which will do
    a lockless read of the bpf peer list associated with the interface. It
    should be noted that the bpf code will pickup the bpf_interface lock
    before adding or removing bpf peers. This should serialize the access to
    the bpf descriptor list, removing the race.
  - Expose the bpf_if structure in bpf.h so that the bpf_peers_present
    function can use it. This also removes the struct bpf_if; hack that was
    there.
  - Adjust all consumers of the raw if_bpf structure to use
    bpf_peers_present

  Now what happens is:
  (1) Packet is received by netif0
  (2) Check to see if bpf descriptor list is empty
  (3) Pickup the bpf interface lock
  (4) Hand packet off to process

  From the attach/detach side:
  (1) Pickup the bpf interface lock
  (2) Add/remove from bpf descriptor list

  Now that we are storing the bpf interface structure with the ifnet, there
  is no need to walk the bpf interface list to locate the correct bpf
  interface. We now simply look up the interface, and initialize the pointer.
  This has a nice side effect of changing a bpf interface attach operation
  from O(N) (where N is the number of bpf interfaces), to O(1).

  [1] From now on, we can no longer check ifp->if_bpf to tell us whether or
      not we have any bpf peers that might be interested in receiving packets.

  In collaboration with: sam@
  MFC after: 1 month
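The check-then-lock scheme described in this entry can be modelled in userland roughly as below. Everything here is a hypothetical stand-in (`struct tap_if`, `tap_peers_present`, a pthread mutex instead of the kernel's bpf interface lock); it only illustrates the shape of the pattern and is not the bpf(4) code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct peer {
	struct peer *next;
	unsigned long seen;		/* packets handed to this peer */
};

struct tap_if {				/* stand-in for per-interface tap state */
	pthread_mutex_t lock;
	struct peer *peers;		/* descriptor list, NULL when empty */
};

/*
 * Cheap lockless hint: is anyone listening?  A real concurrent
 * version would need a load that is safe against concurrent writers.
 */
static int
tap_peers_present(const struct tap_if *t)
{
	return (t->peers != NULL);
}

/* Attach/detach side: always modify the list under the lock. */
static void
tap_attach(struct tap_if *t, struct peer *p)
{
	pthread_mutex_lock(&t->lock);
	p->next = t->peers;
	t->peers = p;
	pthread_mutex_unlock(&t->lock);
}

static void
tap_detach(struct tap_if *t, struct peer *p)
{
	struct peer **pp;

	pthread_mutex_lock(&t->lock);
	for (pp = &t->peers; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == p) {
			*pp = p->next;
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);
}

/* Receive side: returns the number of peers the packet was handed to. */
static int
tap_deliver(struct tap_if *t)
{
	struct peer *p;
	int n = 0;

	if (!tap_peers_present(t))	/* fast path: nobody attached */
		return (0);
	pthread_mutex_lock(&t->lock);	/* list may have changed; walk it locked */
	for (p = t->peers; p != NULL; p = p->next) {
		p->seen++;
		n++;
	}
	pthread_mutex_unlock(&t->lock);
	return (n);
}
```

Note that the per-interface structure itself is never set back to NULL in this model; only the peer list empties out, which is why the receive path can dereference it without holding the lock.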
* Minor restyling and cleanup around ipport_tick(). | rwatson | 2006-06-02 | 1 | -11/+9
  MFC after: 1 month
* Implement internal (i.e. inside kernel) packet tagging using mbuf_tags(9). | oleg | 2006-05-24 | 2 | -1/+63
  Since tags are kept while packet resides in kernelspace, it's possible to
  use other kernel facilities (like netgraph nodes) for altering those tags.
  Submitted by: Andrey Elsukov <bu7cher at yandex dot ru>
  Submitted by: Vadim Goncharov <vadimnuclight at tpu dot ru>
  Approved by: glebius (mentor)
  Idea from: OpenBSD PF
  MFC after: 1 month
* o In udp|rip_disconnect() acquire a socket lock before the socket | maxim | 2006-05-21 | 2 | -2/+6
  state modification. To prevent races do that while holding inpcb lock.
  Reviewed by: rwatson
* o Add missed error check: in ip_ctloutput() sooptcopyin() returns a | maxim | 2006-05-21 | 1 | -0/+4
  result but we never examine it.
  Reviewed by: rwatson
  MFC after: 2 weeks
* Initialize the new members of struct ip_moptions as | bms | 2006-05-18 | 1 | -0/+4
  a defensive programming measure.
  Note that whilst these members are not used by the ip_output()
  path, we are passing an instance of struct ip_moptions here
  which is declared on the stack (which could be considered a
  bad thing).
  ip_output() does not consume struct ip_moptions, but in case it
  does in future, declare an in_multi vector on the stack too to
  behave more like ip_findmoptions() does.
* Since m_pullup() can return a new mbuf, change gre_input2() to | glebius | 2006-05-16 | 1 | -23/+23
  return mbuf back to gre_input(). If the former returns mbuf back
  to the latter, then pass it to raw_input().
  Coverity ID: 829
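The hazard behind this change is that a helper may hand back a different buffer than the one it was given, so the caller must continue with the returned pointer. A userland sketch of that calling convention follows; `struct pkt`, `pkt_pullup`, `inner_input` and `outer_input` are hypothetical stand-ins for the mbuf, m_pullup(), gre_input2() and gre_input(), and the stand-in always reallocates to make the stale-pointer problem visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct pkt {
	size_t len;
	unsigned char data[64];
};

/*
 * Stand-in for a pullup: returns a (possibly different) packet with
 * at least 'need' bytes available, or NULL after freeing the input.
 */
static struct pkt *
pkt_pullup(struct pkt *p, size_t need)
{
	struct pkt *n;

	if (need > p->len || (n = malloc(sizeof(*n))) == NULL) {
		free(p);
		return (NULL);
	}
	memcpy(n, p, sizeof(*n));
	free(p);			/* the caller's old pointer is now stale */
	return (n);
}

/* Inner handler: returns the packet if it was not consumed, else NULL. */
static struct pkt *
inner_input(struct pkt *p, size_t hdrlen)
{
	p = pkt_pullup(p, hdrlen);
	if (p == NULL)
		return (NULL);
	if (p->data[0] == 0x01) {	/* "ours": consume it here */
		free(p);
		return (NULL);
	}
	return (p);			/* hand the current pointer back */
}

/* Outer handler: returns 1 if the packet went to the fallback path. */
static int
outer_input(struct pkt *p, size_t hdrlen)
{
	p = inner_input(p, hdrlen);	/* never reuse the pre-call pointer */
	if (p == NULL)
		return (0);
	free(p);			/* fallback consumer, e.g. raw input */
	return (1);
}
```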
* - Backout one line from 1.78. The tp can be freed by tcp_drop(). | glebius | 2006-05-16 | 1 | -3/+2
  - Style next line.
  Coverity ID: 912
* o In rip_disconnect() do not call rip_abort(), just mark a socket | maxim | 2006-05-15 | 1 | -1/+11
  as not connected. In soclose() case rip_detach() will kill inpcb for
  us later. This keeps the rawconnect regression test from panicking
  the system.
  Reviewed by: rwatson
  X-MFC after: with all 1st April inpcb changes
* Use only lower 64bit of src/dest (and src/dest port) for hashing of IPv6 | mlaier | 2006-05-14 | 1 | -4/+4
  connections and get rid of the flow_id as it is not guaranteed to be
  stable; some (most?) current implementations seem to just zero it out.
  PR: kern/88664
  Reported by: jylefort
  Submitted by: Joost Bekkers (w/ changes)
  Tested by: "regisr" <regisrApoboxDcom>
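A hash of the kind this entry describes might look like the sketch below. The structure `struct flow6` and the function `flow6_hash` are hypothetical, and folding the values together with XOR is an assumption made for illustration; the entry itself only states which inputs are used.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct flow6 {				/* hypothetical flow key */
	uint8_t src[16];
	uint8_t dst[16];
	uint16_t sport;
	uint16_t dport;
};

/* Hash only the low 64 bits of each address plus both ports. */
static uint32_t
flow6_hash(const struct flow6 *f)
{
	uint32_t s[2], d[2];

	memcpy(s, f->src + 8, sizeof(s));	/* bytes 8..15 of the source */
	memcpy(d, f->dst + 8, sizeof(d));	/* bytes 8..15 of the destination */
	return (s[0] ^ s[1] ^ d[0] ^ d[1] ^ f->sport ^ f->dport);
}
```

Because the flow label is not an input, two packets of the same connection hash identically even if one of them carries a zeroed flow label.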
* Fix a long-standing limitation in IPv4 multicast group membership. | bms | 2006-05-14 | 3 | -4/+40
  By making the imo_membership array a dynamically allocated vector,
  this minimizes disruption to existing IPv4 multicast code. This
  change breaks the ABI for the kernel module ip_mroute.ko, and may
  cause a small amount of churn for folks working on the IGMPv3 merge.

  Previously, sockets were subject to a compile-time limitation on
  the number of IPv4 group memberships, which was hard-coded to 20.
  The imo_membership relationship, however, is 1:1 with regards to
  a tuple of multicast group address and interface address. Users who
  ran routing protocols such as OSPF ran into this limitation on machines
  with a large system interface tree.
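Replacing a fixed-size array with a growable vector can be sketched as follows. The names (`struct memberships`, `memberships_add`) and the doubling growth policy are assumptions for illustration; only the former hard-coded limit of 20 comes from the entry above.

```c
#include <assert.h>
#include <stdlib.h>

struct memberships {
	int *groups;		/* stand-in for the membership entries */
	size_t count;		/* entries in use */
	size_t cap;		/* entries allocated */
};

/* Append a membership, growing the vector when it is full. */
static int
memberships_add(struct memberships *m, int group)
{
	if (m->count == m->cap) {
		/* start at the old fixed limit, then double (assumed policy) */
		size_t ncap = (m->cap != 0) ? m->cap * 2 : 20;
		int *n = realloc(m->groups, ncap * sizeof(*n));

		if (n == NULL)
			return (-1);	/* old vector is still valid */
		m->groups = n;
		m->cap = ncap;
	}
	m->groups[m->count++] = group;
	return (0);
}
```

Code that previously indexed the array keeps working unchanged as long as it goes through the pointer, which is the "minimal disruption" the entry refers to; anything that embedded the old structure layout, such as a separately built module, has to be rebuilt.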
* Remove ip6fw. Since ipfw has full functional IPv6 support now and - in | mlaier | 2006-05-12 | 1 | -1/+0
  contrast to ip6fw - is properly locked, it is time to retire ip6fw.