path: root/sys/kern/uipc_socket.c
Commit message | Author | Age | Files | Lines
...
* Add sosend_dgram(), a greatly reduced and simplified version of sosend() | rwatson | 2006-01-13 | 1 | -2/+155
  intended for use solely with atomic datagram socket types, and relies on the previous break-out of sosend_copyin(). Changes to allow UDP to optionally use this instead of sosend() will be committed as a follow-up.
* Fix snderr() to not leak the socket buffer lock if an error occurs in | jhb | 2005-11-29 | 1 | -1/+1
  sosend(). In his previous change to sosend(), Robert accidentally changed the snderr() macro to jump to the out label, which assumes the lock is already released, rather than the release label, which drops the lock. This should fix the recent panics about returning from write(2) with the socket lock held and the most recent LOR on current@.
* Move zero copy statistics structure before sosend_copyin(). | rwatson | 2005-11-28 | 1 | -15/+15
  MFC after: 1 month
  Reported by: tinderbox, sam
* Break out functionality in sosend() responsible for building mbuf | rwatson | 2005-11-28 | 1 | -141/+170
  chains and copying in mbufs from the body of the send logic, creating a new function sosend_copyin(). This change makes sosend() almost readable, and will allow the same logic to be used by tailored socket send routines.
  MFC after: 1 month
  Reviewed by: andre, glebius
* Retire MT_HEADER mbuf type and change its users to use MT_DATA. | andre | 2005-11-02 | 1 | -1/+1
  Having an additional MT_HEADER mbuf type is superfluous and redundant as nothing depends on it. It only adds a layer of confusion. The distinction between header mbuf's and data mbuf's is solely done through the m->m_flags M_PKTHDR flag.
  Non-native code is not changed in this commit. For compatibility MT_HEADER is mapped to MT_DATA.
  Sponsored by: TCP/IP Optimization Fundraise 2005
* Push the assignment of a new or updated so_qlimit from solisten() | rwatson | 2005-10-30 | 1 | -15/+6
  following the protocol pru_listen() call to solisten_proto(), so that it occurs under the socket lock acquisition that also sets SO_ACCEPTCONN. This requires passing the new backlog parameter to the protocol, which also allows the protocol to be aware of changes in queue limit should it wish to do something about the new queue limit. This continues a move towards the socket layer acting as a library for the protocol.
  Bump __FreeBSD_version due to a change in the in-kernel protocol interface. This change has been tested with IPv4 and UNIX domain sockets, but not other protocols.
* Allow 32-bit get/setsockopt with SO_SNDTIMEO or SO_RCVTIMEO to work. | ps | 2005-10-27 | 1 | -3/+29
* Add three new read-only socket options, which allow regression tests | rwatson | 2005-09-18 | 1 | -0/+17
  and other applications to query the state of the stack regarding the accept queue on a listen socket:
    SO_LISTENQLIMIT     Return the value of so_qlimit (socket backlog)
    SO_LISTENQLEN       Return the value of so_qlen (complete sockets)
    SO_LISTENINCQLEN    Return the value of so_incqlen (incomplete sockets)
  Minor white space tweaks to existing socket options to make them consistent.
  Discussed with: andre
  MFC after: 1 week
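  A minimal sketch of how an application might query one of these options, assuming the value is returned as an int. The helper name listen_qlimit() is hypothetical and not part of the commit; on systems whose headers do not define SO_LISTENQLIMIT the helper simply reports the option as unsupported.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of the commit): fetch the backlog limit of
 * a listening socket. Returns 0 and stores the value in *qlimit on success,
 * or -1 with errno set on failure.
 */
int
listen_qlimit(int fd, int *qlimit)
{
#ifdef SO_LISTENQLIMIT
    socklen_t len = sizeof(*qlimit);

    /* Read-only option; assumed here to be copied out as an int. */
    return (getsockopt(fd, SOL_SOCKET, SO_LISTENQLIMIT, qlimit, &len));
#else
    /* Headers without the option: report it as unsupported. */
    (void)fd;
    (void)qlimit;
    errno = ENOPROTOOPT;
    return (-1);
#endif
}
```

  SO_LISTENQLEN and SO_LISTENINCQLEN could be queried the same way by swapping the option name.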
* Fix spelling in a comment. | rwatson | 2005-09-18 | 1 | -1/+1
  MFC after: 3 days
* Back out rev. 1.246; it breaks code that uses shutdown(2) on non-connected | maxim | 2005-09-15 | 1 | -2/+0
  sockets.
  Pointed out by: rwatson
* o Return ENOTCONN when shutdown(2) is called on a non-connected socket. | maxim | 2005-09-15 | 1 | -0/+2
  PR: kern/84761
  Submitted by: James Juran
  R-test: tools/regression/sockets/shutdown
  MFC after: 1 month
* In soreceive(), when the first mbuf is removed from the socket buffer, use | glebius | 2005-09-06 | 1 | -8/+1
  sockbuf_pushsync(). Previous manipulation could lead to an inconsistent mbuf.
  Reviewed by: rwatson
* Make getsockopt(..., SOL_SOCKET, SO_ACCEPTCONN, ...) work per IEEE Std | kbyanc | 2005-08-01 | 1 | -0/+1
  1003.1 (POSIX).
* Fix for PR 83885. | gnn | 2005-07-28 | 1 | -1/+4
  Make sure that there actually is a next packet before setting nextrecord to that field.
  PR: 83885
  Submitted by: hirose@comm.yamaha.co.jp
  Obtained from: Patch suggested in the PR
  MFC after: 1 week
* Fix the recent panics/LORs/hangs created by my kqueue commit by: | ssouhlal | 2005-07-01 | 1 | -2/+4
  - Introducing the possibility of using locks different than mutexes for the knlist locking. In order to do this, we add three arguments to knlist_init() to specify the functions to use to lock, unlock and check if the lock is owned. If these arguments are NULL, we assume mtx_lock, mtx_unlock and mtx_owned, respectively.
  - Using the vnode lock for the knlist locking, when doing kqueue operations on a vnode. This way, we don't have to lock the vnode while holding a mutex, in filt_vfsread.
  Reviewed by: jmg
  Approved by: re (scottl), scottl (mentor override)
  Pointyhat to: ssouhlal
  Will be happy: everyone
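  The pluggable-lock idea can be modelled in user space roughly as follows. This is only an illustrative sketch: every toy_* name is hypothetical, the toy lock is a plain flag rather than a real mutex, and the argument order is not claimed to match the real knlist_init().

```c
#include <stddef.h>

/* Toy lock standing in for a kernel mutex; illustration only. */
struct toy_lock {
    int held;
};

static void toy_lock_acquire(void *arg) { ((struct toy_lock *)arg)->held = 1; }
static void toy_lock_release(void *arg) { ((struct toy_lock *)arg)->held = 0; }
static int toy_lock_owned(void *arg) { return (((struct toy_lock *)arg)->held); }

/* Hypothetical model of a note list that carries its own locking callbacks. */
struct toy_knlist {
    void (*kl_lock)(void *);
    void (*kl_unlock)(void *);
    int (*kl_locked)(void *);
    void *kl_lockarg;
};

/*
 * Store the three callbacks; a NULL callback selects the default toy lock
 * operation, mirroring the "NULL means mutex functions" rule in the message.
 */
void
toy_knlist_init(struct toy_knlist *kl, void *lockarg,
    void (*lock)(void *), void (*unlock)(void *), int (*locked)(void *))
{
    kl->kl_lockarg = lockarg;
    kl->kl_lock = (lock != NULL) ? lock : toy_lock_acquire;
    kl->kl_unlock = (unlock != NULL) ? unlock : toy_lock_release;
    kl->kl_locked = (locked != NULL) ? locked : toy_lock_owned;
}
```

  A caller that wants a different lock type (the vnode lock in the commit) would pass its own three functions instead of NULL.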
* Stop embedding struct ifnet at the top of driver softcs. Instead the | brooks | 2005-06-10 | 1 | -3/+3
  struct ifnet or the layer 2 common structure it was embedded in have been replaced with a struct ifnet pointer to be filled by a call to the new function, if_alloc(). The layer 2 common structure is also allocated via if_alloc() based on the interface type. It is hung off the new struct ifnet member, if_l2com.
  This change removes the size of these structures from the kernel ABI and will allow us to better manage them as interfaces come and go.
  Other changes of note:
  - Struct arpcom is no longer referenced in normal interface code. Instead the Ethernet address is accessed via the IFP2ENADDR() macro. To enforce this ac_enaddr has been renamed to _ac_enaddr.
  - The second argument to ether_ifattach is now always the mac address from driver private storage rather than sometimes being ac_enaddr.
  Reviewed by: sobomax, sam
* Drat! Committed from the wrong branch. Restore HEAD to its previous goodness. | scottl | 2005-06-09 | 1 | -504/+966
* Back out 1.68.2.26. It was a mis-guided change that was already backed out | scottl | 2005-06-09 | 1 | -966/+504
  of HEAD and should not have been MFC'd. This will restore UDP socket functionality, which will correct the recent NFS problems.
  Submitted by: rwatson
* Allow sends sent from non page-aligned userspace addresses to be | gallatin | 2005-06-05 | 1 | -7/+4
  considered for zero-copy sends.
  Reviewed by: alc
  Submitted by: Romer Gil at Rice University
* Move the logic implementing retrieval of the SO_ACCEPTFILTER socket option | rwatson | 2005-03-12 | 1 | -18/+1
  from uipc_socket.c to uipc_accf.c in do_getopt_accept_filter(), so that it now matches do_setopt_accept_filter(). Slightly reformulate the logic to match the optimistic allocation of storage for the argument in advance, and slightly expand the coverage of the socket lock.
* Remove an additional commented out reference to a possible future sx | rwatson | 2005-03-11 | 1 | -1/+0
  lock.
* When setting up a socket in socreate(), there's no need to lock the | rwatson | 2005-03-11 | 1 | -3/+1
  socket lock around knlist_init(), so don't. Hard code the setting of the socket reference count to 1 rather than using soref() to avoid asserting the socket lock, since we've not yet exposed the socket to other threads.
  This removes two mutex operations from each socket allocation.
* Remove suggestive sx_init() comment in soalloc(). We will have something | rwatson | 2005-03-11 | 1 | -1/+0
  like this at some point, but for now it clutters the source.
* In the current world order, solisten() implements the state transition of | rwatson | 2005-02-21 | 1 | -14/+42
  a socket from a regular socket to a listening socket able to accept new connections. As part of this state transition, solisten() calls into the protocol to update protocol-layer state. There were several bugs in this implementation that could result in a race wherein a TCP SYN received in the interval between the protocol state transition and the shortly following socket layer transition would result in a panic in the TCP code, as the socket would be in the TCPS_LISTEN state, but the socket would not have the SO_ACCEPTCONN flag set.
  This change does the following:
  - Pushes the socket state transition from the socket layer solisten() to socket "library" routines called from the protocol. This permits the socket routines to be called while holding the protocol mutexes, preventing a race exposing the incomplete socket state transition to TCP after the TCP state transition has completed. The check for a socket layer state transition is performed by solisten_proto_check(), and the actual transition is performed by solisten_proto().
  - Holds the socket lock for the duration of the socket state test and set, and over the protocol layer state transition, which is now possible as the socket lock is acquired by the protocol layer, rather than vice versa. This prevents additional state related races in the socket layer.
  This permits the dual transition of socket layer and protocol layer state to occur while holding locks for both layers, making the two changes atomic with respect to one another. Similar changes are likely required elsewhere in the socket/protocol code.
  Reported by: Peter Holm <peter@holm.cc>
  Review and fixes from: emax, Antoine Brodin <antoine.brodin@laposte.net>
  Philosophical head nod: gnn
* In soreceive(), when considering delivery to a socket in SS_ISCONFIRMING, | rwatson | 2005-02-20 | 1 | -1/+2
  only call the protocol's pru_rcvd() if the protocol has the flag PR_WANTRCVD set. This brings that instance of pru_rcvd() into line with the rest, which do check the flag.
  MFC after: 3 days
* Correct a typo in the comment describing soreceive_rcvoob(). | rwatson | 2005-02-18 | 1 | -1/+1
  MFC after: 3 days
* In soconnect(), when resetting so->so_error, the socket lock is not | rwatson | 2005-02-18 | 1 | -2/+0
  required due to a straight integer write in which minor races are not a problem.
* Move do_setopt_accept_filter() from uipc_socket.c to uipc_accf.c, where | rwatson | 2005-02-18 | 1 | -126/+0
  the rest of the accept filter code currently lives.
  MFC after: 3 days
* Re-order checks in socheckuid() so that we check all deny cases before | rwatson | 2005-02-18 | 1 | -3/+3
  returning accept.
  MFC after: 3 days
* In solisten(), unconditionally set the SO_ACCEPTCONN option in | rwatson | 2005-02-18 | 1 | -6/+4
  so->so_options when solisten() will succeed, rather than setting it conditionally based on there not being queued sockets in the completed socket queue. Otherwise, if the protocol exposes new sockets via the completed queue before solisten() completes, the listen() system call will succeed, but the socket and protocol state will be out of sync. For TCP, this didn't happen in practice, as the TCP code will panic if a new connection comes in after the tcpcb has been transitioned to a listening state but the socket doesn't have SO_ACCEPTCONN set.
  This is historical behavior resulting from bitrot since 4.3BSD, in which that line of code was associated with the conditional NULL'ing of the connection queue pointers (one-time initialization to be performed during the transition to a listening socket), which are now initialized separately.
  Discussed with: fenner, gnn
  MFC after: 3 days
* - Convert so_qlen, so_incqlen, so_qlimit fields of struct socket from | glebius | 2005-01-24 | 1 | -2/+23
  short to unsigned short.
  - Add SYSCTL_PROC() around somaxconn, not accepting values < 1 or > USHRT_MAX.
  Before this change, setting somaxconn to something above 32767 and calling listen(fd, -1) led to a socket that doesn't accept connections at all.
  Reviewed by: rwatson
  Reported by: Igor Sysoev
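  The range check and the backlog clamping can be sketched as below. Both function names are hypothetical and this is not the kernel code; it only illustrates why a limit validated against the unsigned short range keeps the stored queue limit representable.

```c
#include <limits.h>

/*
 * Hypothetical stand-in for the range check described above: accept only
 * limits that are at least 1 and fit in an unsigned short.
 */
int
somaxconn_valid(int val)
{
    return (val >= 1 && val <= USHRT_MAX);
}

/*
 * Sketch of backlog clamping: negative or oversized requests fall back to
 * the system-wide limit. With a valid limit the result always fits in an
 * unsigned short, so storing it no longer wraps to a negative value.
 */
unsigned short
clamp_backlog(int backlog, int somaxconn)
{
    if (backlog < 0 || backlog > somaxconn)
        backlog = somaxconn;
    return ((unsigned short)backlog);
}
```

  With a signed short field, a limit such as 40000 would not be representable, which is consistent with the failure the message describes for listen(fd, -1).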
* When re-connecting an already connected datagram socket, ensure we clean | sobomax | 2005-01-12 | 1 | -2/+11
  up its pending error state, which may be set in some rare conditions, resulting in the connect() syscall returning that bogus error and making the application believe that the attempt to change association has failed, while in fact it has not.
  There is a sockets/reconnect regression test which exercises this bug.
  MFC after: 2 weeks
* /* -> /*- for copyright notices, minor format tweaks as necessary | imp | 2005-01-06 | 1 | -1/+1
* Remove an XXXRW indicating atomic operations might be used as a | rwatson | 2004-12-23 | 1 | -12/+4
  substitute for a global mutex protecting the socket count and generation number.
  The observation that soreceive_rcvoob() can't return an mbuf chain is a property, not a bug, so remove the XXXRW.
  In sorflush, s/existing/previous/ for code when describing prior behavior.
  For SO_LINGER socket option retrieval, remove an XXXRW about why we hold the mutex: this is correct and not dubious.
  MFC after: 2 weeks
* In soalloc(), simplify the mac_init_socket() handling to remove | rwatson | 2004-12-23 | 1 | -14/+3
  unnecessary use of a global variable and simplify the return case. While here, use ()'s around return values.
  In sodealloc(), remove a comment about why we bump the gencnt and decrement the socket count separately. It doesn't add substantially to the reading, and clutters the function.
  MFC after: 2 weeks
* Remove unneeded code from the zero-copy receive path. | alc | 2004-12-10 | 1 | -12/+0
  Discussed with: gallatin@
  Tested by: ken@
* Tidy up the zero-copy receive path: Remove an unneeded argument to | alc | 2004-12-08 | 1 | -3/+2
  uiomoveco() and userspaceco().
* If soreceive() is called from a socket callback, there's no reason | ps | 2004-11-29 | 1 | -1/+7
  to do a window update to the peer (through an ACK) from soreceive() itself. TCP will do that upon return from the socket callback. Sending a window update from soreceive() results in a lock reversal.
  Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
  Reviewed by: rwatson
* Make soreceive(MSG_DONTWAIT) nonblocking. If MSG_DONTWAIT is passed into | ps | 2004-11-29 | 1 | -3/+21
  soreceive(), then pass in M_DONTWAIT to m_copym(). Also fix up error handling for the case where m_copym() returns failure.
  Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
  Reviewed by: rwatson
* Since sb_timeo type was increased to int, use INT_MAX instead of SHRT_MAX. | glebius | 2004-11-09 | 1 | -3/+3
  This also gives us the ability to close the PR.
  PR: kern/42352
  Approved by: julian (mentor)
  MFC after: 1 week
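  A rough sketch of the kind of bound the message refers to, assuming a socket timeout supplied as a struct timeval is converted to scheduler ticks and must fit in an int. The function name timeout_to_ticks() and the hz parameter are hypothetical; the actual kernel conversion may differ.

```c
#include <limits.h>
#include <sys/time.h>

/*
 * Hypothetical conversion of a timeout to ticks at hz ticks per second.
 * Returns the tick count, or -1 if the timeval is malformed or the result
 * would not fit in an int.
 */
int
timeout_to_ticks(const struct timeval *tv, int hz)
{
    long long val;

    if (hz <= 0 || tv->tv_sec < 0 || tv->tv_usec < 0 ||
        tv->tv_usec >= 1000000)
        return (-1);
    /* Reject seconds that would overflow an int once scaled by hz. */
    if (tv->tv_sec > INT_MAX / hz)
        return (-1);
    val = (long long)tv->tv_sec * hz +
        ((long long)tv->tv_usec * hz) / 1000000;
    if (val > INT_MAX)
        return (-1);
    return ((int)val);
}
```

  With SHRT_MAX as the bound, the same check would reject any timeout longer than 32767 ticks, which is the limitation an int-sized field removes.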
* Acquire the accept mutex in soabort() before calling sotryfree(), as | rwatson | 2004-11-02 | 1 | -0/+1
  that is now required.
  RELENG_5_3 candidate.
  Foot provided by: Dikshie <dikshie at ppk dot itb dot ac dot id>
* socreate() does an early abort if either the protocol cannot be found, | andre | 2004-10-23 | 1 | -1/+2
  or pru_attach is NULL. With loadable protocols the SPACER dummy protocols have valid function pointers for all methods to functions returning just EOPNOTSUPP. Thus the early abort check would not detect immediately that attach is not supported for this protocol. Instead it would correctly get the EOPNOTSUPP error later on when it calls the protocol specific attach function.
  Add testing against the pru_attach_notsupp() function pointer to the early abort check as well.
* Push acquisition of the accept mutex out of sofree() into the caller | rwatson | 2004-10-18 | 1 | -3/+4
  (sorele()/sotryfree()):
  - This permits the caller to acquire the accept mutex before the socket mutex, avoiding sofree() having to drop the socket mutex and re-order, which could lead to races permitting more than one thread to enter sofree() after a socket is ready to be free'd.
  - This also covers clearing of the so_pcb weak socket reference from the protocol to the socket, preventing races in clearing and evaluation of the reference such that sofree() might be called more than once on the same socket.
  This appears to close a race I was able to easily trigger by repeatedly opening and resetting TCP connections to a host, in which the tcp_close() code called as a result of the RST raced with the close() of the accepted socket in the user process resulting in simultaneous attempts to de-allocate the same socket. The new locking increases the overhead for operations that may potentially free the socket, so we will want to revise the synchronization strategy here as we normalize the reference counting model for sockets. The use of the accept mutex in freeing of sockets that are not listen sockets is primarily motivated by the potential need to remove the socket from the incomplete connection queue on its parent (listen) socket, so cleaning up the reference model here may allow us to substantially weaken the synchronization requirements.
  RELENG_5_3 candidate.
  MFC after: 3 days
  Reviewed by: dwhite
  Discussed with: gnn, dwhite, green
  Reported by: Marc UBM Bocklet <ubm at u-boot-man dot de>
  Reported by: Vlad <marchenko at gmail dot com>
* Rework sofree() logic to take into account a possible race with accept(). | rwatson | 2004-10-11 | 1 | -5/+19
  Sockets in the listen queues have reference counts of 0, so if the protocol decides to disconnect the pcb and try to free the socket, this triggered a race with accept() wherein accept() would bump the reference count before sofree() had removed the socket from the listen queues, resulting in a panic in sofree() when it discovered it was freeing a referenced socket. This might happen if a RST came in prior to accept() on a TCP connection.
  The fix is two-fold: to expand the coverage of the accept mutex earlier in sofree() to prevent accept() from grabbing the socket after the "is it really safe to free" tests, and to expand the logic of the "is it really safe to free" tests to check that the refcount is still 0 (i.e., we didn't race).
  RELENG_5 candidate.
  Much discussion with and work by: green
  Reported by: Marc UBM Bocklet <ubm at u-boot-man dot de>
  Reported by: Vlad <marchenko at gmail dot com>
* Expand the scope of the socket buffer locks in sopoll() to include the | rwatson | 2004-09-05 | 1 | -4/+4
  state test as well as set, or we risk a race between a socket wakeup and registering for select() or poll() on the socket. This does increase the cost of the poll operation, but can probably be optimized some in the future.
  This appears to correct poll() "wedges" experienced with X11 on SMP systems with highly interactive applications, and might affect a plethora of other select() driven applications.
  RELENG_5 candidate.
  Problem reported by: Maxim Maximov <mcsi at mcsi dot pp dot ru>
  Debugged with help of: dwhite
* Conditional acquisition of socket buffer mutexes when testing socket | rwatson | 2004-08-24 | 1 | -35/+16
  buffers with kqueue filters is no longer required: the kqueue framework will guarantee that the mutex is held on entering the filter, either due to a call from the socket code already holding the mutex, or by explicitly acquiring it. This removes the last of the conditional socket locking.
* Back out uipc_socket.c:1.208, as it incorrectly assumes that all | rwatson | 2004-08-20 | 1 | -3/+1
  sockets are connection-oriented for the purposes of kqueue registration. Since UDP sockets aren't connection-oriented, this appeared to break a great many things, such as RPC-based applications and services (i.e., NFS). Since jmg isn't around I'm backing this out before too many more feet are shot, but intend to investigate the right solution with him once he's available.
  Apologies to: jmg
  Discussed with: imp, scottl
* Make sure that the socket is either accepting connections or is connected | jmg | 2004-08-20 | 1 | -1/+3
  when attaching a knote to it; otherwise return EINVAL.
  Pointed out by: benno
* Add locking to the kqueue subsystem. This also makes the kqueue subsystem | jmg | 2004-08-15 | 1 | -6/+10
  a more complete subsystem, and removes the knowledge of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using its filter ops.
  Currently, it uses the MTX_DUPOK even though it is not always safe to acquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases).
  Reviewed by: green, rwatson (both earlier versions)
* Replace a reference to splnet() with a reference to locking in a comment. | rwatson | 2004-08-11 | 1 | -1/+1