path: root/sys/netinet/in_pcb.c
Commit message | Author | Age | Files | Lines
* Add a new privilege, PRIV_NETINET_REUSEPORT, which will replace superuser | rwatson | 2007-04-10 | 1 | -1/+2
  checks to see whether bind() can reuse a port/address combination while it's already in use (for some definition of use).
* #ifdef INET6 printing of inpcb IPv6 addresses in DDB. Patch committed | rwatson | 2007-02-18 | 1 | -0/+4
  with minor adjustments.
  Submitted by: Florian C. Smeets <flo at kasimir dot com>
* Add "show inpcb", "show tcpcb" DDB commands, which should come in handy | rwatson | 2007-02-17 | 1 | -1/+251
  for debugging sblock and other network panics.
* Some whitespace nits and remove a few casts. | jhb | 2006-12-29 | 1 | -1/+1
* Consistently use #ifdef INET6 rather than mixing and matching with | rwatson | 2006-11-30 | 1 | -21/+19
  #if defined(INET6). Don't comment the end of short #ifdef blocks.
  Comment cleanup. Line wrap.
* Sweep kernel replacing suser(9) calls with priv(9) calls, assigning | rwatson | 2006-11-06 | 1 | -2/+6
  specific privilege names to a broad range of privileges. These may require some future tweaking.
  Sponsored by: nCircle Network Security, Inc.
  Obtained from: TrustedBSD Project
  Discussed on: arch@
  Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
* Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h | rwatson | 2006-10-22 | 1 | -1/+2
  begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead.
  This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd.
  Obtained from: TrustedBSD Project
  Sponsored by: SPARTA
* o Backout rev. 1.125 of in_pcb.c. It appeared to behave extremely | glebius | 2006-09-06 | 1 | -16/+0
    badly under high load. For example, with 40k sockets and 25k tcptw entries, the connect() syscall can run for seconds. Debugging showed that it iterates the cycle millions of times and purges thousands of tcptw entries at a time.
    Besides practical unusability, this change is architecturally wrong. First, in_pcblookup_local() is used in the connect() and bind() syscalls. No purging of stale entries should be done here. Second, it is a layering violation.
  o Return the tcptw purging cycle to tcp_timer_2msl_tw(); it was removed in rev. 1.78 by rwatson. The commit log of that revision says nothing about why the cycle was removed. Now we need this cycle, since the major cleaner of stale tcptw structures is removed.
  o Disable the probably necessary, but now unused, tcp_twrecycleable() function.
  Reviewed by: ru
* Fix race conditions on enumerating pcb lists by moving the initialization | ups | 2006-07-18 | 1 | -5/+9
  (and where appropriate the destruction) of the pcb mutex to the init/finit functions of the pcb zones. This allows locking of the pcb entries and race condition free comparison of the generation count.
  Rearrange locking a bit to avoid extra locking operation to update the generation count in in_pcballoc(). (in_pcballoc now returns the pcb locked)
  I am planning to convert pcb list handling from a type safe to a reference count model soon. (As this allows really freeing the PCBs)
  Reviewed by: rwatson@, mohans@
  MFC after: 1 week
* Use INPLOOKUP_WILDCARD instead of just 1 more consistently. | bz | 2006-06-29 | 1 | -1/+1
  OKed by: rwatson (some weeks ago)
* - Use suser_cred(9) instead of directly checking cr_uid. | pjd | 2006-06-27 | 1 | -2/+2
  - Change the order of conditions to first verify that we actually need to check for privileges and then eventually check them.
  Reviewed by: rwatson
* Minor restyling and cleanup around ipport_tick(). | rwatson | 2006-06-02 | 1 | -11/+9
  MFC after: 1 month
* In in_pcbdrop(), fix !INVARIANTS build. | marcel | 2006-04-25 | 1 | -2/+1
* Abstract inpcb drop logic, previously just setting of INP_DROPPED in TCP, | rwatson | 2006-04-25 | 1 | -0/+28
  into in_pcbdrop(). Expand logic to detach the inpcb from its bound address/port so that dropping a TCP connection releases the inpcb resource reservation, which since the introduction of socket/pcb reference count updates, has been persisting until the socket closed rather than being released implicitly due to prior freeing of the inpcb on TCP drop.
  MFC after: 3 months
* Assert the inpcb lock when rehashing an inpcb. | rwatson | 2006-04-22 | 1 | -0/+5
  Improve consistency of style around some current assertions.
  MFC after: 3 months
* Remove pcbinfo locking from in_setsockaddr() and in_setpeeraddr(); | rwatson | 2006-04-22 | 1 | -6/+4
  holding the inpcb lock is sufficient to prevent races in reading the address and port, as both the inpcb lock and pcbinfo lock are required to change the address/port.
  Improve consistency of spelling in assertions about inp != NULL.
  MFC after: 3 months
* Before dereferencing intotw() when INP_TIMEWAIT, check for inp_ppcb being | rwatson | 2006-04-04 | 1 | -4/+14
  NULL. We currently do allow this to happen, but may want to remove that possibility in the future. This case can occur when a socket is left open after TCP wraps up, and the timewait state is recycled. This will be cleaned up in the future.
  Found by: Kazuaki Oda <kaakun at highway dot ne dot jp>
  MFC after: 3 months
* Change inp_ppcb from caddr_t to void *, fix/remove associated related | rwatson | 2006-04-03 | 1 | -2/+4
  casts.
  Consistently use intotw() to cast inp_ppcb pointers to struct tcptw * pointers.
  Consistently use intotcpcb() to cast inp_ppcb pointers to struct tcpcb * pointers.
  Don't assign tp to the results of intotcpcb() during variable declaration at the top of functions, as that is before the asserts relating to locking have been performed. Do this later in the function after appropriate assertions have run to allow that operation to be considered safe.
  MFC after: 3 months
* Break out in_pcbdetach() into two functions: | rwatson | 2006-04-01 | 1 | -18/+21
  - in_pcbdetach(), which removes the link between an inpcb and its socket.
  - in_pcbfree(), which frees a detached pcb.
  Unlike the previous in_pcbdetach(), neither of these functions will attempt to conditionally free the socket, as they are responsible only for managing in_pcb memory.
  Mirror these changes into in6_pcbdetach() by breaking it into in6_pcbdetach() and in6_pcbfree().
  While here, eliminate undesired checks for NULL inpcb pointers in sockets, as we will now have as an invariant that sockets will always have valid so_pcb pointers.
  MFC after: 3 months
* In in_pcbconnect_setup() reduce code duplication and use ip_rtaddr() | andre | 2006-02-16 | 1 | -16/+10
  to find the outgoing interface for this connection.
  Sponsored by: TCP/IP Optimization Fundraise 2005
  MFC after: 2 weeks
* Never select the PCB that has INP_IPV6 flag and is bound to :: if | ume | 2006-02-04 | 1 | -1/+23
  we have another PCB which is bound to 0.0.0.0. If a PCB has the INP_IPV6 flag, then we set its cost higher than IPv4 only PCBs.
  Submitted by: Keiichi SHIMA <keiichi__at__iijlab.net>
  Obtained from: KAME
  MFC after: 1 week
* Convert remaining functions to ANSI C function declarations; remove | rwatson | 2006-01-22 | 1 | -77/+33
  'register' where present.
  MFC after: 1 week
* Remove no-op spl references in in_pcb.c, since in_pcb locking has been | rwatson | 2005-07-19 | 1 | -15/+3
  basically complete for several years now. Update one spl comment to reference the locking strategy.
  MFC after: 3 days
* Commit correct version of previous commit (in_pcb.c:1.164). Use the | rwatson | 2005-06-01 | 1 | -2/+2
  local variables as currently named.
  MFC after: 7 days
* Assert pcbinfo lock in in_pcbdisconnect() and in_pcbdetach(), as the | rwatson | 2005-06-01 | 1 | -0/+3
  global pcb lists are modified.
  MFC after: 7 days
* o Tweak the comment a bit. | maxim | 2005-04-08 | 1 | -1/+1
* o Disable random port allocation when ip.portrange.first == | maxim | 2005-04-08 | 1 | -0/+6
    ip.portrange.last and there is only one port available, because: a) it is not wise; b) it leads to a panic in the random ip port allocation code. In general we need to disable ip port allocation randomization if the last - first delta is ridiculously small.
  PR: kern/79342
  Spotted by: Anjali Kulkarni
  Glanced at by: silby
  MFC after: 2 weeks
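The guard described in the entry above can be illustrated with a small user-space sketch. This is not the kernel's code: it assumes the random pick reduces a random value modulo the range width (last - first), in which case a one-port range would make the modulus zero. The name `pick_start_port` and the use of `rand()` are hypothetical stand-ins.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from in_pcb.c): choose the port at which a
 * search for a free ephemeral port would begin.
 */
static uint16_t
pick_start_port(uint16_t first, uint16_t last, int randomize)
{
	uint16_t tmp;

	/* Normalize so that first <= last. */
	if (first > last) {
		tmp = first;
		first = last;
		last = tmp;
	}

	/*
	 * Assumption: the random offset is taken modulo (last - first).
	 * When first == last that width is 0 and the modulo would divide
	 * by zero, so randomization is skipped for a one-port range.
	 */
	if (randomize && first != last)
		return ((uint16_t)(first +
		    (uint32_t)rand() % (uint32_t)(last - first)));

	/* Sequential fallback: start at the bottom of the range. */
	return (first);
}
```

With `first == last` the function returns that single port whether or not randomization is requested, which is the behaviour the commit message asks for.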
* o Document net.inet.ip.portrange.random* sysctls. | maxim | 2005-03-23 | 1 | -7/+10
  o Correct a comment about random port allocation threshold implementation.
  Reviewed by: silby, ru
  MFC after: 3 days
* We can make code simpler after last change. | glebius | 2005-02-22 | 1 | -2/+2
  Noticed by: Andrew Thompson
* In in_pcbconnect_setup() remove a check that route points at | glebius | 2005-02-22 | 1 | -4/+2
  loopback interface. Nobody has explained the sense of this check to me. It breaks the connect() system call to a destination address which is loopback routed (e.g. blackholed).
  Reviewed by: silence on net@
  MFC after: 2 weeks
* /* -> /*- for license, minor formatting changes | imp | 2005-01-07 | 1 | -1/+1
* Port randomization leads to extremely fast port reuse at high | silby | 2005-01-02 | 1 | -4/+52
  connection rates, which is causing problems for some users.
  To retain the security advantage of random ports and ensure correct operation for high connection rate users, disable port randomization during periods of high connection rates.
  Whenever the connection rate exceeds randomcps (10 by default), randomization will be disabled for randomtime (45 by default) seconds. These thresholds may be tuned via sysctl.
  Many thanks to Igor Sysoev, who proved the necessity of this change and tested many preliminary versions of the patch.
  MFC After: 20 seconds
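The throttling rule in the entry above can be sketched as a once-per-second check. This is a minimal user-space illustration under stated assumptions, not the code from the commit: the struct and function names are hypothetical, and only the two tunables `randomcps` and `randomtime` and their defaults (10 and 45) come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state; only randomcps/randomtime are named in the log. */
struct port_rand_state {
	int randomcps;   /* allocations per second that trigger the cutoff */
	int randomtime;  /* seconds to keep randomization disabled */
	int tick_count;  /* ephemeral ports allocated in the current second */
	int stop_random; /* seconds left until randomization resumes */
};

/* Call on every ephemeral port allocation; true means "pick randomly". */
static bool
port_rand_should_randomize(struct port_rand_state *s)
{
	assert(s != NULL);
	s->tick_count++;
	return (s->stop_random == 0);
}

/* Call once per second to re-evaluate the connection rate. */
static void
port_rand_tick(struct port_rand_state *s)
{
	assert(s != NULL);
	if (s->tick_count > s->randomcps)
		s->stop_random = s->randomtime; /* rate too high: (re)arm */
	else if (s->stop_random > 0)
		s->stop_random--;               /* quiet second: count down */
	s->tick_count = 0;
}
```

In this sketch a burst re-arms the full `randomtime` window each second it persists, so randomization only resumes after `randomtime` consecutive seconds at or below the threshold.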
* Push acquisition of the accept mutex out of sofree() into the caller | rwatson | 2004-10-18 | 1 | -0/+1
  (sorele()/sotryfree()):
  - This permits the caller to acquire the accept mutex before the socket mutex, avoiding sofree() having to drop the socket mutex and re-order, which could lead to races permitting more than one thread to enter sofree() after a socket is ready to be free'd.
  - This also covers clearing of the so_pcb weak socket reference from the protocol to the socket, preventing races in clearing and evaluation of the reference such that sofree() might be called more than once on the same socket.
  This appears to close a race I was able to easily trigger by repeatedly opening and resetting TCP connections to a host, in which the tcp_close() code called as a result of the RST raced with the close() of the accepted socket in the user process resulting in simultaneous attempts to de-allocate the same socket.
  The new locking increases the overhead for operations that may potentially free the socket, so we will want to revise the synchronization strategy here as we normalize the reference counting model for sockets. The use of the accept mutex in freeing of sockets that are not listen sockets is primarily motivated by the potential need to remove the socket from the incomplete connection queue on its parent (listen) socket, so cleaning up the reference model here may allow us to substantially weaken the synchronization requirements.
  RELENG_5_3 candidate.
  MFC after: 3 days
  Reviewed by: dwhite
  Discussed with: gnn, dwhite, green
  Reported by: Marc UBM Bocklet <ubm at u-boot-man dot de>
  Reported by: Vlad <marchenko at gmail dot com>
* Assign so_pcb to NULL rather than 0 as it's a pointer. | rwatson | 2004-09-29 | 1 | -1/+1
  Spotted by: dwhite
* In in_pcbrehash(), do assert the inpcb lock as well as the pcbinfo lock. | rwatson | 2004-08-19 | 1 | -1/+1
* Assert the locks of inpcbinfo's and inpcb's passed into in_pcbconnect() | rwatson | 2004-08-11 | 1 | -0/+6
  and in_pcbconnect_setup(), since these functions frob the port and address state of inpcbs.
* Disallow a particular kind of port theft described by the following scenario: | yar | 2004-07-28 | 1 | -10/+1
  Alice is too lazy to write a server application in a PF-independent manner. Therefore she knocks up the server using PF_INET6 only and allows the IPv6 socket to accept mapped IPv4 as well. An evil hacker known on IRC as cheshire_cat has an account in the same system. He starts a process listening on the same port as used by Alice's server, but in PF_INET. As a consequence, cheshire_cat will distract all IPv4 traffic supposed to go to Alice's server.
  This sort of port theft was initially enabled by copying the code that implemented the RFC 2553 semantics on IPv4/6 sockets (see inet6(4)) for the implied case of the same owner for both connections. After this change, the above scenario will be impossible. In the same setting, the user who attempts to start his server last will get EADDRINUSE.
  Of course, using IPv4 mapped to IPv6 leads to security complications in the first place, but there is no reason to make it even more unsafe.
  This change doesn't apply to KAME since it affects a FreeBSD-specific part of the code. It doesn't modify the out-of-box behaviour of the TCP/IP stack either as long as mapping IPv4 to IPv6 is off by default.
  MFC after: 1 month
* Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is | cperciva | 2004-07-26 | 1 | -2/+2
  somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags.
  The old name is still defined, but will be removed in a few days (unless I hear any complaints...)
  Discussed with: rwatson, scottl
  Requested by: jhb
* o connect(2): if there is no route to the destination | maxim | 2004-06-16 | 1 | -3/+1
    do not pick up the first local ip address for the source ip address, return ENETUNREACH instead.
  Submitted by: Gleb Smirnoff
  Reviewed by: -current (silence)
* Socket MAC labels so_label and so_peerlabel are now protected by | rwatson | 2004-06-13 | 1 | -1/+4
  SOCK_LOCK(so):
  - Hold socket lock over calls to MAC entry points reading or manipulating socket labels.
  - Assert socket lock in MAC entry point implementations.
  - When externalizing the socket label, first make a thread-local copy while holding the socket lock, then release the socket lock to externalize to userspace.
* Extend coverage of SOCK_LOCK(so) to include so_count, the socket | rwatson | 2004-06-12 | 1 | -0/+1
  reference count:
  - Assert SOCK_LOCK(so) macros that directly manipulate so_count: soref(), sorele().
  - Assert SOCK_LOCK(so) in macros/functions that rely on the state of so_count: sofree(), sotryfree().
  - Acquire SOCK_LOCK(so) before calling these functions or macros in various contexts in the stack, both at the socket and protocol layers.
  - In some cases, perform soisdisconnected() before sotryfree(), as this could result in frobbing of a non-present socket if sotryfree() actually frees the socket.
  - Note that sofree()/sotryfree() will release the socket lock even if they don't free the socket.
  Submitted by: sam
  Sponsored by: FreeBSD Foundation
  Obtained from: BSD/OS
* When checking for possible port theft, skip over a TCP inpcb | yar | 2004-05-20 | 1 | -7/+3
  unless it's in the closed or listening state (remote address == INADDR_ANY).
  If a TCP inpcb is in any other state, it's impossible to steal its local port or use it for port theft. And if there are both closed/listening and connected TCP inpcbs on the same localIP:port couple, the call to in_pcblookup_local() will find the former due to the design of that function.
  No objections raised in: -net, -arch
  MFC after: 1 month
* Wrap two long lines in the previous commit. | silby | 2004-04-23 | 1 | -2/+4
* Take out an unneeded variable I forgot to remove in the last commit, | silby | 2004-04-22 | 1 | -2/+3
  and make two small whitespace fixes so that diffs vs rev 1.142 are minimal.
* Simplify random port allocation, and add net.inet.ip.portrange.randomized, | silby | 2004-04-22 | 1 | -27/+13
  which can be used to turn off randomized port allocation if so desired.
  Requested by: alfred
* Switch from using sequential to random ephemeral port allocation, | silby | 2004-04-20 | 1 | -6/+28
  implementation taken directly from OpenBSD.
  I've resisted committing this for quite some time because of concern over TIME_WAIT recycling breakage (sequential allocation ensures that there is a long time before ports are recycled), but recent testing has shown me that my fears were unwarranted.
* Remove advertising clause from University of California Regent's | imp | 2004-04-07 | 1 | -4/+0
  license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson.
  Approved by: core, peter, alc, rwatson
* Fixed misspelling of IPPORT_MAX as USHRT_MAX. Don't include <sys/limits.h> | bde | 2004-04-06 | 1 | -9/+9
  to implement this mistake.
  Fixed some nearby style bugs (initialization in declaration, misformatting of this initialization, missing blank line after the declaration, and comparison of the non-boolean result of the initialization with 0 using "!". In KNF, "!" is not even used to compare booleans with 0).
* Reduce 'td' argument to 'cred' (struct ucred) argument in those functions: | pjd | 2004-03-27 | 1 | -25/+24
  - in_pcbbind(),
  - in_pcbbind_setup(),
  - in_pcbconnect(),
  - in_pcbconnect_setup(),
  - in6_pcbbind(),
  - in6_pcbconnect(),
  - in6_pcbsetport().
  "It should simplify/clarify things a great deal." --rwatson
  Requested by: rwatson
  Reviewed by: rwatson, ume
* Remove unused argument. | pjd | 2004-03-27 | 1 | -2/+1
  Reviewed by: ume