summaryrefslogtreecommitdiffstats
path: root/sys/netinet/udp_usrreq.c
Commit message (Collapse)AuthorAgeFilesLines
* Fix IP_SENDSRCADDR semantics.bms2007-03-081-4/+11
| | | | | | | | | | | | | | | | * To use this option with a UDP socket, it must be bound to a local port, and INADDR_ANY, to disallow possible collisions with existing udp inpcbs bound to the same port on other interfaces at send time. * If the socket is bound to INADDR_ANY, specifying IP_SENDSRCADDR with INADDR_ANY will be rejected as it is ambiguous. * If the socket is bound to an address other than INADDR_ANY, specifying IP_SENDSRCADDR with INADDR_ANY will be disallowed by in_pcbbind_setup(). Reviewed by: silence on -net Tested with: src/tools/regression/netinet/ipbroadcast MFC after: 4 days
* Rename two identically named log_in_vain variables: tcp_input.c's staticrwatson2007-02-201-3/+3
| | | | | | | log_in_vain to tcp_log_in_vain, and udp_usrreq's global log_in_vain to udp_log_in_vain. MFC after: 1 week
* Gratuitous UDP restyling toward style(9) in 7.x.rwatson2007-02-201-142/+135
|
* o One more typo in the comment.maxim2007-01-061-1/+1
| | | | | PR: kern/107609 Submitted by: Dr. Markus Waldeck
* Fix typo in comment.imp2007-01-011-1/+1
| | | | Submitted by: remko
* Add comment about udp checksums being off in BSD 4.2 compatibility mode.imp2006-12-311-1/+8
| | | | | Submitted by: Dr. Markus Waldeck PR: kern/106657
* Some whitespace nits and remove a few casts.jhb2006-12-291-1/+2
|
* Sweep kernel replacing suser(9) calls with priv(9) calls, assigningrwatson2006-11-061-1/+3
| | | | | | | | | | | | | specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
* Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.hrwatson2006-10-221-1/+2
| | | | | | | | | | | | | begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA
* Check inp_flags instead of inp_vflag for INP_ONESBCAST flag.andre2006-09-061-2/+2
| | | | | | | PR: kern/99558 Tested by: Andrey V. Elsukov <bu7cher-at-yandex.ru> Sponsored by: TCP/IP Optimization Fundraise 2005 MFC after: 3 days
* Fix typo in comment.thomas2006-09-041-1/+1
|
* Change semantics of socket close and detach. Add a new protocol switchrwatson2006-07-211-4/+28
| | | | | | | | | | | | | | | | | | | function, pru_close, to notify protocols that the file descriptor or other consumer of a socket is closing the socket. pru_abort is now a notification of close also, and no longer detaches. pru_detach is no longer used to notify of close, and will be called during socket tear-down by sofree() when all references to a socket evaporate after an earlier call to abort or close the socket. This means detach is now an unconditional teardown of a socket, whereas previously sockets could persist after detach of the protocol retained a reference. This faciliates sharing mutexes between layers of the network stack as the mutex is required during the checking and removal of references at the head of sofree(). With this change, pru_detach can now assume that the mutex will no longer be required by the socket layer after completion, whereas before this was not necessarily true. Reviewed by: gnn
* Fix race conditions on enumerating pcb lists by moving the initializationups2006-07-181-4/+14
| | | | | | | | | | | | | | | ( and where appropriate the destruction) of the pcb mutex to the init/finit functions of the pcb zones. This allows locking of the pcb entries and race condition free comparison of the generation count. Rearrange locking a bit to avoid extra locking operation to update the generation count in in_pcballoc(). (in_pcballoc now returns the pcb locked) I am planning to convert pcb list handling from a type safe to a reference count model soon. ( As this allows really freeing the PCBs) Reviewed by: rwatson@, mohans@ MFC after: 1 week
* Acquire udbinfo lock after call to soreserve() rather than before, as itrwatson2006-06-031-4/+2
| | | | | | | is not required. This simplifies error-handling, and reduces the time that this lock is held. MFC after: 1 month
* o In udp|rip_disconnect() acquire a socket lock before the socketmaxim2006-05-211-1/+3
| | | | | | | state modification. To prevent races do that while holding inpcb lock. Reviewed by: rwatson
* Modify UDP to use sosend_dgram() instead of sosend(). This allowsrwatson2006-05-061-0/+1
| | | | | | | | | | | for signicantly optimized UDP socket I/O when using a single UDP socket from many threads or processes that share it, by avoiding significant locking and other overhead in the general sosend() path that isn't necessary for simple datagram sockets. Specifically, this change results in a significant performance improvement for threaded name service in BIND9 under load. Suggested by: Jinmei_Tatsuya at isc dot org
* Rename 'last' to 'inp' in udp_append(): the name 'last' is due torwatson2006-04-251-15/+15
| | | | | | | | the fact that the loop through inpcb's in udp_input() tracks the last inpcb while looping. We keep that name in the calling loop but not in the delivery routine itself. MFC after: 3 months
* Allow for nmbclusters and maxsockets to be increased via sysctl.ps2006-04-211-0/+10
| | | | | An eventhandler is used to update all the various zones that depend on these values.
* Update in_pcb-derived basic socket types following changes torwatson2006-04-011-35/+16
| | | | | | | | | | | | | | | | | | | | | pru_abort(), pru_detach(), and in_pcbdetach(): - Universally support and enforce the invariant that so_pcb is never NULL, converting dozens of unnecessary NULL checks into assertions, and eliminating dozens of unnecessary error handling cases in protocol code. - In some cases, eliminate unnecessary pcbinfo locking, as it is no longer required to ensure so_pcb != NULL. For example, in protocol shutdown methods, and in raw IP send. - Abort and detach protocol switch methods no longer return failures, nor attempt to free sockets, as the socket layer does this. - Invoke in_pcbfree() after in_pcbdetach() in order to free the detached in_pcb structure for a socket. MFC after: 3 months
* Chance protocol switch method pru_detach() so that it returns voidrwatson2006-04-011-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | rather than an error. Detaches do not "fail", they other occur or the protocol flags SS_PROTOREF to take ownership of the socket. soclose() no longer looks at so_pcb to see if it's NULL, relying entirely on the protocol to decide whether it's time to free the socket or not using SS_PROTOREF. so_pcb is now entirely owned and managed by the protocol code. Likewise, no longer test so_pcb in other socket functions, such as soreceive(), which have no business digging into protocol internals. Protocol detach routines no longer try to free the socket on detach, this is performed in the socket code if the protocol permits it. In rts_detach(), no longer test for rp != NULL in detach, and likewise in other protocols that don't permit a NULL so_pcb, reduce the incidence of testing for it during detach. netinet and netinet6 are not fully updated to this change, which will be in an upcoming commit. In their current state they may leak memory or panic. MFC after: 3 months
* Change protocol switch pru_abort() API so that it returns void ratherrwatson2006-04-011-3/+2
| | | | | | | | | | | | | | than an int, as an error here is not meaningful. Modify soabort() to unconditionally free the socket on the return of pru_abort(), and modify most protocols to no longer conditionally free the socket, since the caller will do this. This commit likely leaves parts of netinet and netinet6 in a situation where they may panic or leak memory, as they have not are not fully updated by this commit. This will be corrected shortly in followup commits to these components. MFC after: 3 months
* Implement 'ipfw fwd laddr,port' feature for UDP. According to ipfw(8)glebius2006-01-241-0/+20
| | | | | | it should work, however it never did. People expect it to work. PR: kern/90834
* Remove dead code: 'opts' is not used in udp_append(), only in udp_input(),rwatson2006-01-141-3/+0
| | | | | | | so no need to assign it to NULL or conditionally free it. Found with: Coverity Prevent(tm) MFC after: 3 days
* Fix a bunch of SYSCTL_INT() that should have been SYSCTL_ULONG() tomux2005-12-141-2/+2
| | | | | | | match the type of the variable they are exporting. Spotted by: Thomas Hurst <tom@hur.st> MFC after: 3 days
* Consolidate all IP Options handling functions into ip_options.[ch] andandre2005-11-181-0/+1
| | | | | | | | | | | | | | | | | | | | include ip_options.h into all files making use of IP Options functions. From ip_input.c rev 1.306: ip_dooptions(struct mbuf *m, int pass) save_rte(m, option, dst) ip_srcroute(m0) ip_stripoptions(m, mopt) From ip_output.c rev 1.249: ip_insertoptions(m, opt, phlen) ip_optcopy(ip, jp) ip_pcbopts(struct inpcb *inp, int optname, struct mbuf *m) No functional changes in this commit. Discussed with: rwatson Sponsored by: TCP/IP Optimization Fundraise 2005
* o INP_ONESBCAST is inpcb.inp_vflag flag not inp_flags. The confusionmaxim2005-10-121-2/+2
| | | | | | | | | with IP_PORTRANGE_HIGH leads to the incorrect checksum calculation. PR: kern/87306 Submitted by: Rickard Lind Reviewed by: bms MFC after: 2 weeks
* Implement IP_DONTFRAG IP socket option enabling the Don't Fragmentandre2005-09-261-0/+9
| | | | | | | | | | | | flag on IP packets. Currently this option is only repected on udp and raw ip sockets. On tcp sockets the DF flag is controlled by the path MTU discovery option. Sending a packet larger than the MTU size of the egress interface returns an EMSGSIZE error. Discussed with: rwatson Sponsored by: TCP/IP Optimization Fundraise 2005
* Add socketoption IP_MINTTL. May be used to set the minimum acceptableandre2005-08-221-0/+3
| | | | | | | | | | | | | | | | | TTL a packet must have when received on a socket. All packets with a lower TTL are silently dropped. Works on already connected/connecting and listening sockets for RAW/UDP/TCP. This option is only really useful when set to 255 preventing packets from outside the directly connected networks reaching local listeners on sockets. Allows userland implementation of 'The Generalized TTL Security Mechanism (GTSM)' according to RFC3682. Examples of such use include the Cisco IOS BGP implementation command "neighbor ttl-security". MFC after: 2 weeks Sponsored by: TCP/IP Optimization Fundraise 2005
* De-spl UDP.rwatson2005-06-011-31/+5
| | | | MFC after: 3 days
* If we are going tocperciva2005-05-061-0/+1
| | | | | | | | | | 1. Copy a NULL-terminated string into a fixed-length buffer, and 2. copyout that buffer to userland, we really ought to 0. Zero the entire buffer first. Security: FreeBSD-SA-05:08.kmem
* eliminate extraneous null ptr checkssam2005-03-291-2/+2
| | | | Noticed by: Coverity Prevent analysis tool
* In in_pcbconnect_setup() jailed sockets are treated specially: if localglebius2005-02-221-0/+5
| | | | | | | | | | | | | | | address is not supplied, then jail IP is choosed and in_pcbbind() is called. Since udp_output() does not save local addr after call to in_pcbconnect_setup(), in_pcbbind() is called for each packet, and this is incorrect. So, we shall treat jailed sockets specially in udp_output(), we will save their local address. This fixes a long standing bug with broken sendto() system call in jails. PR: kern/26506 Reviewed by: rwatson MFC after: 2 weeks
* /* -> /*- for license, minor formatting changesimp2005-01-071-1/+1
|
* Initialize struct pr_userreqs in new/sparse style and fill in commonphk2004-11-081-5/+12
| | | | | | default elements in net_init_domain(). This makes it possible to grep these structures and see any bogosities.
* Hide udp_in6 behind #ifdef INET6phk2004-11-041-0/+2
|
* Until this change, the UDP input code used global variables udp_in,rwatson2004-11-041-57/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | udp_in6, and udp_ip6 to pass socket address state between udp_input(), udp_append(), and soappendaddr_locked(). While file in the default configuration, when running with multiple netisrs or direct ithread dispatch, this can result in races wherein user processes using recvmsg() get back the wrong source IP/port. To correct this and related races: - Eliminate udp_ip6, which is believed to be generated but then never used. Eliminate ip_2_ip6_hdr() as it is now unneeded. - Eliminate setting, testing, and existence of 'init' status fields for the IPv6 structures. While with multiple UDP delivery this could lead to amortization of IPv4 -> IPv6 conversion when delivering an IPv4 UDP packet to an IPv6 socket, it added substantial complexity and side effects. - Move global structures into the stack, declaring udp_in in udp_input(), and udp_in6 in udp_append() to be used if a conversion is required. Pass &udp_in into udp_append(). - Re-annotate comments to reflect updates. With this change, UDP appears to operate correctly in the presence of substantial inbound processing parallelism. This solution avoids introducing additional synchronization, but does increase the potential stack depth. Discovered by: kris (Bug Magnet) MFC after: 3 weeks
* Don't release the udbinfo lock until after the last use of UDP inpcbrwatson2004-10-121-3/+3
| | | | | | | | in udp_input(), since the udbinfo lock is used to prevent removal of the inpcb while in use (i.e., as a form of reference count) in the in-bound path. RELENG_5 candidate.
* fix up socket/ip layer violation... don't assume/know thatjmg2004-09-051-1/+5
| | | | SO_DONTROUTE == IP_ROUTETOIF and SO_BROADCAST == IP_ALLOWBROADCAST...
* When sliding the m_data pointer forward, update m_pktrhdr.len as wellrwatson2004-08-221-1/+3
| | | | | | | | | as m_len, or the pkthdr length will be inconsistent with the actual length of data in the mbuf chain. The symptom of this occuring was "out of data" warnings from in_cksum_skip() on large UDP packets sent via the loopback interface. Foot shot: green
* When prepending space onto outgoing UDP datagram payloads to hold therwatson2004-08-211-4/+7
| | | | | | | | | UDP/IP header, make sure that space is also allocated for the link layer header. If an mbuf must be allocated to hold the UDP/IP header (very likely), then this will avoid an additional mbuf allocation at the link layer. This trick is also used by TCP and other protocols to avoid extra calls to the mbuf allocator in the ethernet (and related) output routines.
* Push down pcbinfo and inpcb locking from udp_send() into udp_output().rwatson2004-08-191-25/+35
| | | | | | | This provides greater context for the locking and allows us to avoid locking the pcbinfo structure if not binding operations will take place (i.e., already bound, connected, and no expliti sendto() address).
* White space cleanup for netinet before branch:rwatson2004-08-161-12/+12
| | | | | | | | | | | - Trailing tab/space cleanup - Remove spurious spaces between or before tabs This change avoids touching files that Andre likely has in his working set for PFIL hooks changes for IPFW/DUMMYNET. Approved by: re (scottl) Submitted by: Xin LI <delphij@frontfree.net>
* When udp_send() fails, make sure to free the control mbufs as well asrwatson2004-08-121-0/+2
| | | | | the data mbuf. This was done in most error cases, but not the case where the inpcb pointer is surprisingly NULL.
* Backout removal of UMA_ZONE_NOFREE flag for all zones which are establishedandre2004-08-111-1/+1
| | | | | | | | | for structures with timers in them. It might be that a timer might fire even when the associated structure has already been free'd. Having type- stable storage in this case is beneficial for graceful failure handling and debugging. Discussed with: bosko, tegge, rwatson
* Remove the UMA_ZONE_NOFREE flag to all uma_zcreate() calls in the IP andandre2004-08-111-1/+1
| | | | | TCP code. This flag would have prevented giving back excessive free slabs to the global pool after a transient peak usage.
* When iterating the UDP inpcb list processing an inbound broadcastrwatson2004-08-061-10/+9
| | | | | | | | | | | or multicast packet, we don't need to acquire the inpcb mutex unless we are actually using inpcb fields other than the bound port and address. Since we hold the pcbinfo lock already, these can't change. Defer acquiring the inpcb mutex until we have a high chance of a match. This avoids about 120 mutex operations per UDP broadcast packet received on one of my work systems. Reviewed by: sam
* Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This iscperciva2004-07-261-1/+1
| | | | | | | | | | | somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb
* Reduce the number of unnecessary unlock-relocks on socket buffer mutexesrwatson2004-06-261-2/+7
| | | | | | | | | | | | | | | | | | | | associated with performing a wakeup on the socket buffer: - When performing an sbappend*() followed by a so[rw]wakeup(), explicitly acquire the socket buffer lock and use the _locked() variants of both calls. Note that the _locked() sowakeup() versions unlock the mutex on return. This is done in uipc_send(), divert_packet(), mroute socket_send(), raw_append(), tcp_reass(), tcp_input(), and udp_append(). - When the socket buffer lock is dropped before a sowakeup(), remove the explicit unlock and use the _locked() sowakeup() variant. This is done in soisdisconnecting(), soisdisconnected() when setting the can't send/ receive flags and dropping data, and in uipc_rcvd() which adjusting back-pressure on the sockets. For UNIX domain sockets running mpsafe with a contention-intensive SMP mysql benchmark, this results in a 1.6% query rate improvement due to reduce mutex costs.
* Reverse a patch which has no effect on -CURRENT and should probably bebms2004-06-161-7/+1
| | | | | | | applied directly to -STABLE. Noticed by: iedowse Pointy hat to: bms
* Disconnect a temporarily-connected UDP socket in out-of-mbufs case. Thisbms2004-06-161-1/+7
| | | | | | | | | | | | | | | | | | | | | | fixes the problem of UDP sockets getting wedged in a connected state (and bound to their destination) under heavy load. Temporary bind/connect should probably be deleted in future as an optimization, as described in "A Faster UDP" [Partridge/Pink 1993]. Notes: - INP_LOCK() is already held in udp_output(). The connection is in effect happening at a layer lower than the socket layer, therefore in theory socket locking should not be needed. - Inlining the in_pcbdisconnect() operation buys us nothing (in the case of the current state of the code), as laddr is not part of the inpcb hash or the udbinfo hash. Therefore there should be no need to rehash after restoring laddr in the error case (this was a concern of the original author of the patch). PR: kern/41765 Requested by: gnn Submitted by: Jinmei Tatuya (with cleanups) Tested by: spray(8)
OpenPOWER on IntegriCloud