summaryrefslogtreecommitdiffstats
path: root/sys/kern/uipc_socket.c
Commit message (Collapse)AuthorAgeFilesLines
* Remove any possibility of hiwat-related race conditions by changinggreen2000-08-291-2/+2
| | | | | | | the chgsbsize() call to use a "subject" pointer (&sb.sb_hiwat) and a u_long target to set it to. The whole thing is splnet(). This fixes a problem that jdp has been able to provoke.
* Make the kqueue socket read filter honor the SO_RCVLOWAT value.jlemon2000-08-071-1/+1
| | | | Spotted by: "Steve M." <stevem@redlinenetworks.com>
* only allow accept filter modifications on listening socketsalfred2000-07-201-0/+8
| | | | Submitted by: ps
* fix races in the uidinfo subsystem, several problems existed:alfred2000-06-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | 1) while allocating a uidinfo struct malloc is called with M_WAITOK, it's possible that while asleep another process by the same user could have woken up earlier and inserted an entry into the uid hash table. Having redundant entries causes inconsistancies that we can't handle. fix: do a non-waiting malloc, and if that fails then do a blocking malloc, after waking up check that no one else has inserted an entry for us already. 2) Because many checks for sbsize were done as "test then set" in a non atomic manner it was possible to exceed the limits put up via races. fix: instead of querying the count then setting, we just attempt to set the count and leave it up to the function to return success or failure. 3) The uidinfo code was inlining and repeating, lookups and insertions and deletions needed to be in their own functions for clarity. Reviewed by: green
* return of the accept filter part IIalfred2000-06-201-0/+101
| | | | | | | | | | | accept filters are now loadable as well as able to be compiled into the kernel. two accept filters are provided, one that returns sockets when data arrives the other when an http request is completed (doesn't work with 0.9 requests) Reviewed by: jmg
* backout accept optimizations.alfred2000-06-181-4/+0
| | | | Requested by: jmg, dcs, jdp, nate
* add socketoptions DELAYACCEPT and HTTPACCEPT which will not allow an accept()alfred2000-06-151-0/+4
| | | | | | | | | | | | until the incoming connection has either data waiting or what looks like a HTTP request header already in the socketbuffer. This ought to reduce the context switch time and overhead for processing requests. The initial idea and code for HTTPACCEPT came from Yahoo engineers and has been cleaned up and a more lightweight DELAYACCEPT for non-http servers has been added Reviewed by: silence on hackers.
* Fix panic by moving the prp == 0 check up the order of sanity checks.asmodai2000-06-131-2/+3
| | | | | Submitted by: Bart Thate <freebsd@1st.dudi.org> on -current Approved by: rwatson
* o Modify jail to limit creation of sockets to UNIX domain sockets,rwatson2000-06-041-0/+9
| | | | | | | | | | | | | | | | | TCP/IP (v4) sockets, and routing sockets. Previously, interaction with IPv6 was not well-defined, and might be inappropriate for some environments. Similarly, sysctl MIB entries providing interface information also give out only addresses from those protocol domains. For the time being, this functionality is enabled by default, and toggleable using the sysctl variable jail.socket_unixiproute_only. In the future, protocol domains will be able to determine whether or not they are ``jail aware''. o Further limitations on process use of getpriority() and setpriority() by jailed processes. Addresses problem described in kern/17878. Reviewed by: phk, jmg
* Back out the previous change to the queue(3) interface.jake2000-05-261-2/+2
| | | | | | It was not discussed and should probably not happen. Requested by: msmith and others
* Change the way that the queue(3) structures are declared; don't assume thatjake2000-05-231-2/+2
| | | | | | | | the type argument to *_HEAD and *_ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd
* Introduce kqueue() and kevent(), a kernel event notification facility.jlemon2000-04-161-0/+109
|
* Make sure to free the socket in soabort() if the protocol couldn'tfenner2000-03-181-1/+7
| | | | | free it (this could happen if the protocol already freed its part and we just kept the socket around to make sure accept(2) didn't block)
* Add aio_waitcomplete(). Make aio work correctly for socket descriptors.jasone2000-01-141-0/+1
| | | | | | | | Make gratuitous style(9) fixes (me, not the submitter) to make the aio code more readable. PR: kern/12053 Submitted by: Chris Sedore <cmsedore@maxwell.syr.edu>
* Correct an uninitialized variable use, which, unlike most times, isgreen1999-12-271-4/+2
| | | | | | | actually a bug this time. Submitted by: bde Reviewed by: bde
* This is Bosko Milekic's mbuf allocation waiting code. Basically, thisgreen1999-12-121-0/+12
| | | | | | | | means that running out of mbuf space isn't a panic anymore, and code which runs out of network memory will sleep to wait for it. Submitted by: Bosko Milekic <bmilekic@dsuper.net> Reviewed by: green, wollman
* KAME netinet6 basic part(no IPsec,no V6 Multicast Forwarding, no UDP/TCPshin1999-11-221-1/+112
| | | | | | | | | | for IPv6 yet) With this patch, you can assigne IPv6 addr automatically, and can reply to IPv6 ping. Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project
* This is a partial commit of the patch from PR 14914:phk1999-11-161-5/+6
| | | | | | | | | | | | | Alot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures. This batch of changes compile to the same object files. Reviewed by: phk Submitted by: Jake Burkholder <jake@checker.org> PR: 14914
* Implement RLIMIT_SBSIZE in the kernel. This is a per-uid sockbuf totalgreen1999-10-091-4/+10
| | | | usage limit.
* Change so_cred's type to a ucred, not a pcred. THis makes more sense, actually.green1999-09-191-8/+4
| | | | | | Make a sonewconn3() which takes an extra argument (proc) so new sockets created with sonewconn() from a user's system call get the correct credentials, not just the parent's credentials.
* $Id$ -> $FreeBSD$peter1999-08-281-1/+1
|
* Reviewed by: the cast of thousandsgreen1999-06-171-4/+11
| | | | | | | | | This is the change to struct sockets that gets rid of so_uid and replaces it with a much more useful struct pcred *so_cred. This is here to be able to do socket-level credential checks (i.e. IPFW uid/gid support, to be added to HEAD soon). Along with this comes an update to pidentd which greatly simplifies the code necessary to get a uid from a socket. Soon to come: a sysctl() interface to finding individual sockets' credentials.
* Plug a mbuf leak in tcp_usr_send(). pru_send() routines are expectedpeter1999-06-041-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | to either enqueue or free their mbuf chains, but tcp_usr_send() was dropping them on the floor if the tcpcb/inpcb has been torn down in the middle of a send/write attempt. This has been responsible for a wide variety of mbuf leak patterns, ranging from slow gradual leakage to rather rapid exhaustion. This has been a problem since before 2.2 was branched and appears to have been fixed in rev 1.16 and lost in 1.23/1.28. Thanks to Jayanth Vijayaraghavan <jayanth@yahoo-inc.com> for checking (extensively) into this on a live production 2.2.x system and that it was the actual cause of the leak and looks like it fixes it. The machine in question was loosing (from memory) about 150 mbufs per hour under load and a change similar to this stopped it. (Don't blame Jayanth for this patch though) An alternative approach to this would be to recheck SS_CANTSENDMORE etc inside the splnet() right before calling pru_send() after all the potential sleeps, interrupts and delays have happened. However, this would mean exposing knowledge of the tcp stack's reset handling and removal of the pcb to the generic code. There are other things that call pru_send() directly though. Problem originally noted by: John Plevyak <jplevyak@inktomi.com>
* Realy fix overflow on SO_*TIMEOache1999-05-211-4/+12
| | | | Submitted by: bde
* Add sysctl descriptions to many SYSCTL_XXXsbillf1999-05-031-3/+3
| | | | | | | PR: kern/11197 Submitted by: Adrian Chadd <adrian@FreeBSD.org> Reviewed by: billf(spelling/style/minor nits) Looked at by: bde(style)
* Lite2 bugfixes merge:ache1999-04-241-3/+3
| | | | | | | so_linger is in seconds, not in 1/HZ range checking in SO_*TIMEO was wrong PR: 11252
* * Change sysctl from using linker_set to construct its tree using SLISTs.dfr1999-02-161-1/+3
| | | | | | | | | | This makes it possible to change the sysctl tree at runtime. * Change KLD to find and register any sysctl nodes contained in the loaded file and to unregister them when the file is unloaded. Reviewed by: Archie Cobbs <archie@whistle.com>, Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)
* Fix the port of the NetBSD 19990120-accept fix. I misread a piece offenner1999-02-021-2/+7
| | | | | | code when examining their fix, which caused my code (in rev 1.52) to: - panic("soaccept: !NOFDREF") - fatal trap 12, with tracebacks going thru soclose and soaccept
* Fix warnings in preparation for adding -Wall -Wcast-qual to thedillon1999-01-271-2/+2
| | | | kernel compile
* Port NetBSD's 19990120-accept bug fix. This works around the race conditionfenner1999-01-251-3/+15
| | | | | | | | | | where select(2) can return that a listening socket has a connected socket queued, the connection is broken, and the user calls accept(2), which then blocks because there are no connections queued. Reviewed by: wollman Obtained from: NetBSD (ftp://ftp.NetBSD.ORG/pub/NetBSD/misc/security/patches/19990120-accept)
* Also consider the space left in the socket buffer when deciding whetherfenner1999-01-201-2/+2
| | | | to set PRUS_MORETOCOME.
* Add a flag, passed to pru_send routines, PRUS_MORETOCOME. Thisfenner1999-01-201-2/+4
| | | | | | | | | flag means that there is more data to be put into the socket buffer. Use it in TCP to reduce the interaction between mbuf sizes and the Nagle algorithm. Based on: "Justin C. Walker" <justin@apple.com>'s description of Apple's fix for this problem.
* KNFize, by bde.eivind1999-01-101-2/+2
|
* Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT aseivind1999-01-081-13/+6
| | | | | | | | | discussed on -hackers. Introduce 'KASSERT(assertion, ("panic message", args))' for simple check + panic. Reviewed by: msmith
* The "easy" fixes for compiling the kernel -Wunused: remove unreferenced staticarchie1998-12-071-3/+1
| | | | and local variables, goto labels, and functions declared but not defined.
* Installed the second patch attached to kern/7899 with some changes suggestedtruckman1998-11-111-5/+4
| | | | | | | | | | | | | | | | by bde, a few other tweaks to get the patch to apply cleanly again and some improvements to the comments. This change closes some fairly minor security holes associated with F_SETOWN, fixes a few bugs, and removes some limitations that F_SETOWN had on tty devices. For more details, see the description on the PR. Because this patch increases the size of the proc and pgrp structures, it is necessary to re-install the includes and recompile libkvm, the vinum lkm, fstat, gcore, gdb, ipfilter, ps, top, and w. PR: kern/7899 Reviewed by: bde, elvind
* Bow to tradition and correctly implement the bogus-but-hallowed semanticswollman1998-08-311-5/+6
| | | | | of getsockopt never telling how much it might have copied if only the buffer were big enough.
* Correctly set the return length regardless of the relative size of thewollman1998-08-311-6/+3
| | | | | user's buffer. Simplify the logic a bit. (Can we have a version of min() for size_t?)
* Yow! Completely change the way socket options are handled, eliminatingwollman1998-08-231-97/+148
| | | | | | another specialized mbuf type in the process. Also clean up some of the cruft surrounding IPFW, multicast routing, RSVP, and other ill-explored corners.
* Undo rev 1.41 until we get more details about why it makes some systemsfenner1998-07-181-2/+1
| | | | fail.
* Introduce (fairly hacky) workaround for odd TCP behavior with applicationfenner1998-07-061-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | writes of size (100,208]+N*MCLBYTES. The bug: sosend() hands each mbuf off to the protocol output routine as soon as it has copied it, in the hopes of increasing parallelism (see http://www.kohala.com/~rstevens/vanj.88jul20.txt ). This works well for TCP as long as the first mbuf handed off is at least the MSS. However, when doing small writes (between MHLEN and MINCLSIZE), the transaction is split into 2 small MBUF's and each is individually handed off to TCP. TCP assumes that the first small mbuf is the whole transaction, so sends a small packet. When the second small mbuf arrives, Nagle prevents TCP from sending it so it must wait for a (potentially delayed) ACK. This sends throughput down the toilet. The workaround: Set the "atomic" flag when we're doing small writes. The "atomic" flag has two meanings: 1. Copy all of the data into a chain of mbufs before handing off to the protocol. 2. Leave room for a datagram header in said mbuf chain. TCP wants the first but doesn't want the second. However, the second simply results in some memory wastage (but is why the workaround is a hack and not a fix). The real fix: The real fix for this problem is to introduce something like a "requested transfer size" variable in the socket->protocol interface. sosend() would then accumulate an mbuf chain until it exceeded the "requested transfer size". TCP could set it to the TCP MSS (note that the current interface causes strange TCP behaviors when the MSS > MCLBYTES; nobody notices because MCLBYTES > ethernet's MTU).
* Convert socket structures to be type-stable and add a version number.wollman1998-05-151-8/+46
| | | | | | | | | | | | | | | | | | | Define a parameter which indicates the maximum number of sockets in a system, and use this to size the zone allocators used for sockets and for certain PCBs. Convert PF_LOCAL PCB structures to be type-stable and add a version number. Define an external format for infomation about socket structures and use it in several places. Define a mechanism to get all PF_LOCAL and PF_INET PCB lists through sysctl(3) without blocking network interrupts for an unreasonable length of time. This probably still has some bugs and/or race conditions, but it seems to work well enough on my machines. It is now possible for `netstat' to get almost all of its information via the sysctl(3) interface rather than reading kmem (changes to follow).
* Moved some #includes from <sys/param.h> nearer to where they are actuallybde1998-03-281-1/+2
| | | | used.
* Make sure that you can only bind a more specific address when it isguido1998-03-011-1/+2
| | | | | done by the same uid. Obtained from: OpenBSD
* Revert sosend() to its behavior from 4.3-Tahoe and before: iffenner1998-02-191-3/+7
| | | | | | | | | | | | | so_error is set, clear it before returning it. The behavior introduced in 4.3-Reno (to not clear so_error) causes potentially transient errors (e.g. ECONNREFUSED if the other end hasn't opened its socket yet) to be permanent on connected datagram sockets that are only used for writing. (soreceive() clears so_error before returning it, as does getsockopt(...,SO_ERROR,...).) Submitted by: Van Jacobson <van@ee.lbl.gov>, via a comment in the vat sources.
* Back out DIAGNOSTIC changes.eivind1998-02-061-3/+1
|
* Turn DIAGNOSTIC into a new-style option.eivind1998-02-041-1/+3
|
* MF22: MSG_EOR bug fix.jkh1997-11-091-3/+9
| | | | Submitted by: wollman
* Last major round (Unless Bruce thinks of somthing :-) of malloc changes.phk1997-10-121-1/+5
| | | | | | | | Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde
* While booting diskless we have no proc pointer.phk1997-10-041-2/+3
|
OpenPOWER on IntegriCloud