path: root/sys/kern/uipc_syscalls.c
Commit message [Author, Age, Files, Lines]
* Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys. [glebius, 2012-12-05, 1 file, -3/+3]
| Exceptions:
| - sys/contrib not touched
| - sys/mbuf.h edited manually
* Remove the support for using non-mpsafe filesystem modules. [kib, 2012-10-22, 1 file, -9/+1]
| In particular, do not lock Giant conditionally when calling into the filesystem module, and remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems.
| The VFS_VERSION is bumped to indicate the interface change, which does not change the interface signatures.
| Conducted and reviewed by: attilio
| Tested by: pho
* After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. [kib, 2012-08-05, 1 file, -0/+1]
| The other big dependency of vm_page.h on vm_param.h is the PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages.
| Stop including vm_param.h into vm_page.h. Include vm_param.h explicitly for the kernel code which needs it.
| Suggested and reviewed by: alc
| MFC after: 2 weeks
* Style fixes and simplifications. [pjd, 2012-06-11, 1 file, -8/+3]
| MFC after: 1 month
* Plug socket refcount leak on error in sys_sctp_peeloff. [mjg, 2012-06-08, 1 file, -2/+2]
| Reviewed by: tuexen
| Approved by: trasz (mentor)
| MFC after: 3 days
* style(9) for r236563. [glebius, 2012-06-05, 1 file, -2/+2]
* Microoptimisation of code from r236560, also coming from Nginx Inc. [glebius, 2012-06-04, 1 file, -6/+4]
| Submitted by: ru
* Optimise kern_sendfile(): skip cycling through the entire mbuf chain in m_cat(), storing a pointer to the last mbuf in the chain in a local variable and attaching the new mbuf to the end of the chain. [glebius, 2012-06-04, 1 file, -4/+10]
| Submitter reports that CPU load dropped by more than 10% on a web server serving large files with this optimisation.
| Submitted by: Sergey Budnevitch <sb nginx.com>
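The tail-pointer idea described in the kern_sendfile() commit above can be illustrated with a minimal userland sketch. This is not the kernel code: `struct buf`, `append_walk()`, `append_tail()` and `chain_total()` are hypothetical names standing in for mbufs and the m_cat() traversal, and only show why remembering the last element avoids re-walking the chain on every append.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an mbuf; the real struct mbuf is far richer. */
struct buf {
	struct buf *next;
	int len;
};

/* Walk-to-end append: every call traverses the whole chain. */
static void
append_walk(struct buf **head, struct buf *n)
{
	struct buf *p;

	assert(head != NULL && n != NULL);
	n->next = NULL;
	if (*head == NULL) {
		*head = n;
		return;
	}
	for (p = *head; p->next != NULL; p = p->next)
		;
	p->next = n;
}

/* Tail-pointer append: constant time, the caller keeps 'tail' around. */
static void
append_tail(struct buf **head, struct buf **tail, struct buf *n)
{
	assert(head != NULL && tail != NULL && n != NULL);
	n->next = NULL;
	if (*head == NULL)
		*head = n;
	else
		(*tail)->next = n;
	*tail = n;
}

/* Sum of the 'len' fields, used only to check that both chains match. */
static int
chain_total(const struct buf *head)
{
	int total = 0;

	for (; head != NULL; head = head->next)
		total += head->len;
	return (total);
}
```

Both helpers build the same chain; the difference is only the cost per append, which is what matters when a long chain is assembled one buffer at a time.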
* Fix bugs which can result in a panic when a non-SCTP socket is used with an sctp_ system call which expects an SCTP socket. [tuexen, 2012-03-15, 1 file, -0/+16]
| MFC after: 3 days.
* Fix found places where uio_resid is truncated to int. [kib, 2012-02-21, 1 file, -13/+16]
| Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows SSIZE_MAX-sized I/O requests from usermode.
| Discussed with: bde, das (previous versions)
| MFC after: 1 month
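As a rough illustration of the clamp described in the uio_resid commit above, the sketch below checks a request size against a limit that depends on a clamp switch. It is an assumption-laden userland model, not the kernel implementation: `toy_iosize_max()` and `io_size_ok()` are hypothetical helpers, and `int64_t`/`INT64_MAX` stand in for `ssize_t`/`SSIZE_MAX` on a 64-bit platform.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the limit: with the clamp on, requests are
 * capped at INT_MAX; with it off, the full signed 64-bit range is
 * allowed (standing in for SSIZE_MAX).
 */
static int64_t
toy_iosize_max(int clamp)
{
	return (clamp ? (int64_t)INT_MAX : INT64_MAX);
}

/* Accept a residual count only if it is non-negative and within the limit. */
static bool
io_size_ok(int64_t resid, int clamp)
{
	assert(clamp == 0 || clamp == 1);
	return (resid >= 0 && resid <= toy_iosize_max(clamp));
}
```

The point of keeping the count in a wide type all the way through is that a value such as INT_MAX + 1 stays distinguishable from a small or negative one, which is lost once it is stored in an int.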
* In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. [kmacy, 2011-09-16, 1 file, -21/+21]
| It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal to kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
| Reviewed by: rwatson
| Approved by: re (bz)
* Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0: [rwatson, 2011-08-11, 1 file, -35/+61]
| Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op.
| Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions.
| In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit.
| Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent.
| Approved by: re (bz)
| Submitted by: jonathan
| Sponsored by: Google Inc
* Add some checks to ensure that Capsicum is behaving correctly, and add some more explicit comments about what's going on and what future maintainers need to do when e.g. adding a new operation to a sys_machdep.c. [jonathan, 2011-06-30, 1 file, -0/+7]
| Approved by: mentor(rwatson), re(bz)
* Log the socket address passed as the destination to sendto() and sendmsg() via ktrace. [jhb, 2011-06-07, 1 file, -0/+4]
| MFC after: 1 week
* After r219999 is merged to stable/8, rename fallocf(9) to falloc(9) and remove the falloc() version that lacks the flag argument. This is done to reduce the KPI bloat. [kib, 2011-04-01, 1 file, -5/+5]
| Requested by: jhb
| X-MFC-note: do not
* Mfp4 CH=177274,177280,177284-177285,177297,177324-177325 [bz, 2011-02-16, 1 file, -14/+1]
| VNET socket push back: try to minimize the number of places where we have to switch vnets and narrow down the time we stay switched.
| Add assertions to the socket code to catch possibly unset vnets as seen in r204147.
| While this reduces the number of vnet recursion in some places like NFS, POSIX local sockets and some netgraph, .. recursions are impossible to fix.
| The current expectations are documented at the beginning of uipc_socket.c along with the other information there.
| Sponsored by: The FreeBSD Foundation
| Sponsored by: CK Software GmbH
| Reviewed by: jhb
| Tested by: zec
| Tested by: Mikolaj Golub (to.my.trociny gmail.com)
| MFC after: 2 weeks
* Eliminate unnecessary page hold_count checks. These checks predate r90944, which introduced a general mechanism for handling the freeing of held pages. [alc, 2011-02-03, 1 file, -2/+1]
| Reviewed by: kib@
* If more than one thread allocated sf buffers for sendfile(2), and each of the threads needs more while the current pool of buffers is exhausted, then neither thread can make progress. Switch to nowait allocations once the first buffer has been obtained. [kib, 2011-01-28, 1 file, -5/+12]
| Reported by: az
| Reviewed by: alc (previous version)
| Tested by: pho
| MFC after: 1 week
* Just pass M_ZERO to malloc(9) instead of clearing allocated memory separately. [pjd, 2010-12-14, 1 file, -2/+1]
* Implement correct handling of address parameter and sendinfo for SCTP send calls. [tuexen, 2010-09-05, 1 file, -4/+2]
| MFC after: 4 weeks.
* Send SIGPIPE to the thread that issued the offending system call rather than to the entire process. [jhb, 2010-06-29, 1 file, -3/+3]
| Reported by: Anit Chakraborty
| Reviewed by: kib, deischen (concept)
| MFC after: 1 week
* * Do not dereference a NULL pointer when calling an SCTP send syscall not providing a destination address and using ktrace. [tuexen, 2010-06-26, 1 file, -2/+3]
| * Do not copy out kernel memory when providing sinfo for sctp_recvmsg().
| Both bugs were reported by Valentin Nechayev. The first bug results in a kernel panic.
| MFC after: 3 days.
* Use ISO C99 integer types in sys/kern where possible. [ed, 2010-06-21, 1 file, -1/+1]
| There are only about 100 occurrences of the BSD-specific u_int*_t datatypes in sys/kern. The ISO C99 integer types are used here more often.
* Remove page queues locking from all sf_buf_mext()-like functions. The page lock now suffices. [alc, 2010-05-06, 1 file, -6/+1]
| Fix a couple nearby style violations.
* Eliminate a small bit of unneeded code from kern_sendfile(): While kern_sendfile() is running, the file's vm object can't be destroyed because kern_sendfile() increments the vm object's reference count. [alc, 2010-05-06, 1 file, -7/+2]
| (Once kern_sendfile() decrements the reference count and returns, the vm object can, however, be destroyed. So, sf_buf_mext() must handle the case where the vm object is destroyed.)
| Reviewed by: kib
* This is the first step in transitioning responsibility for synchronizing access to the page's wire_count from the page queues lock to the page lock. [alc, 2010-05-03, 1 file, -0/+4]
| Submitted by: kmacy
* Lock the page around hold_count access. [kib, 2010-05-02, 1 file, -0/+2]
| Reviewed by: alc
* Properly handle compat32 calls to sctp generic sendmsg/recvmsg functions that take iov. [kib, 2010-03-19, 1 file, -4/+19]
| Reviewed by: tuexen
| MFC after: 2 weeks
* Remove dead statement. [kib, 2010-03-19, 1 file, -1/+0]
| Reviewed by: tuexen
| MFC after: 2 weeks
* Fix two style issues. [kib, 2010-03-19, 1 file, -2/+2]
| MFC after: 2 weeks
* Use NULL instead of 0 when setting up pointer. [pjd, 2010-02-18, 1 file, -2/+2]
* Fix argument order in a call to mtx_init. [mjacob, 2009-12-17, 1 file, -1/+1]
| MFC after: 1 week
* If socket buffer space appears to be lower than the sum of the count of already prepared bytes and the next portion of the transfer, the inner loop of kern_sendfile() aborts, not preparing the next mbuf for the socket buffer, and not modifying any outer loop invariants. The thread loops in the outer loop forever. [kib, 2009-11-03, 1 file, -9/+1]
| Instead of breaking from the inner loop, prepare only bytes that fit into the socket buffer space.
| In collaboration with: pho
| Reviewed by: bz
| PR: kern/138999
| MFC after: 2 weeks
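The fix described in the commit above (clamp the next portion to the available space instead of bailing out without progress) can be sketched in isolation. This is a simplified userland model with hypothetical names (`next_portion()`, `rounds_to_send()`), and it assumes a constant amount of free space per round, which the real socket buffer does not guarantee.

```c
#include <assert.h>

/*
 * Size of the next portion: at most 'chunk', at most what remains,
 * and clamped to the free 'space' rather than abandoned when it
 * does not fit.
 */
static long
next_portion(long remaining, long chunk, long space)
{
	long n = remaining < chunk ? remaining : chunk;

	if (n > space)
		n = space;
	return (n);
}

/*
 * Count loop iterations needed to send 'total' bytes. Because every
 * iteration prepares at least one byte, the loop always terminates.
 */
static int
rounds_to_send(long total, long chunk, long space)
{
	long sent = 0;
	int rounds = 0;

	assert(total >= 0 && chunk > 0 && space > 0);
	while (sent < total) {
		sent += next_portion(total - sent, chunk, space);
		rounds++;
	}
	return (rounds);
}
```

With a break in place of the clamp, a chunk larger than the free space would leave `sent` unchanged and the outer loop would spin; clamping guarantees forward progress on every pass.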
* Fix style issue. [kib, 2009-10-29, 1 file, -1/+1]
* Do not dereference vp->v_mount without holding vnode lock and checking that the vnode is not reclaimed. [kib, 2009-10-01, 1 file, -2/+5]
| Noted by: Igor Sysoev <is rambler-co ru>
| MFC after: 1 week
* Get SCTP working in combination with VIMAGE. Contains code from bz. [tuexen, 2009-09-19, 1 file, -0/+9]
| Approved by: rrs (mentor)
| MFC after: 1 month.
* Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h; we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. [rwatson, 2009-08-01, 1 file, -1/+2]
| Minor cleanups are done in the process, and comments updated to reflect these changes.
| Reviewed by: bz
| Approved by: re (vimage blanket)
* Audit file descriptor numbers for various socket-related system calls. [rwatson, 2009-07-01, 1 file, -0/+17]
| Approved by: re (audit argument blanket)
| MFC after: 3 days
* Define missing audit argument macro AUDIT_ARG_SOCKET(), and capture the domain, type, and protocol arguments to socket(2) and socketpair(2). [rwatson, 2009-07-01, 1 file, -0/+3]
| Approved by: re (audit argument blanket)
| MFC after: 3 days
* SCTP needs either IPv4 or IPv6 as lower layer[1]. [bz, 2009-06-10, 1 file, -4/+8]
| So properly hide the already #ifdef SCTP code with #if defined(INET) || defined(INET6) as well to get us closer to a non-INET/INET6 kernel.
| Discussed with: tuexen [1]
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. [rwatson, 2009-06-05, 1 file, -1/+0]
| Discussed with: pjd
* Add internal 'mac_policy_count' counter to the MAC Framework, which is a count of the number of registered policies. [rwatson, 2009-06-02, 1 file, -36/+12]
| Rather than unconditionally locking sockets before passing them into MAC, lock them in the MAC entry points only if mac_policy_count is non-zero. This avoids locking overhead for a number of socket system calls when no policies are registered, eliminating measurable overhead for the MAC Framework for the socket subsystem when there are no active policies.
| Possibly socket locks should be acquired by policies if they are required for socket labels, which would further avoid locking overhead when there are policies but they don't require labeling of sockets, or possibly don't even implement socket controls.
| Obtained from: TrustedBSD Project
* Split native socketpair() syscall into kern_socketpair(), which should be used by kernel consumers, and socketpair() itself. [dchagin, 2009-05-31, 1 file, -24/+29]
| Approved by: kib (mentor)
| MFC after: 1 month
* - Implement a lockless file descriptor lookup algorithm in fget_unlocked(). [jeff, 2009-05-14, 1 file, -16/+9]
| - Save old file descriptor tables created on expansion until the entire descriptor table is freed so that pointers may be followed without regard for expanders.
| - Mark the file zone as NOFREE so we may attempt to reference potentially freed files.
| - Convert several fget_locked() users to fget_unlocked(). This requires us to manage reference counts explicitly but reduces locking overhead in the common case.
* A NOP change: style / whitespace cleanup of the noise that slipped into r191816. [zec, 2009-05-08, 1 file, -1/+1]
| Spotted by: bz
| Approved by: julian (mentor) (an earlier version of the diff)
* Change the curvnet variable from a global const struct vnet *, previously always pointing to the default vnet context, to a dynamically changing thread-local one. [zec, 2009-05-05, 1 file, -0/+17]
| The curvnet context should be set on entry to networking code via CURVNET_SET() macros, and reverted to previous state via CURVNET_RESTORE(). Recursions on curvnet are permitted, though strongly discouraged.
| This change should have no functional impact on nooptions VIMAGE kernel builds, where CURVNET_* macros expand to whitespace.
| The curthread->td_vnet (aka curvnet) variable's purpose is to be an indicator of the vnet context in which the current network-related operation takes place, in case we cannot deduce the current vnet context from any other source, such as by looking at mbuf's m->m_pkthdr.rcvif->if_vnet, socket's so->so_vnet etc. Moreover, so far curvnet has turned out to be an invaluable consistency checking aid: it helps to catch cases when sockets, ifnets or any other vnet-aware structures may have leaked from one vnet to another.
| The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros was a result of an empirical iterative process, with an aim to reduce recursions on CURVNET_SET() to a minimum, while still reducing the scope of CURVNET_SET() to networking only operations - the alternative would be calling CURVNET_SET() on each system call entry. In general, curvnet has to be set in three typical cases: when processing socket-related requests from userspace or from within the kernel; when processing inbound traffic flowing from device drivers to upper layers of the networking stack, and when executing timer-driven networking functions.
| This change also introduces a DDB subcommand to show the list of all vnet instances.
| Approved by: julian (mentor)
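The set-on-entry, restore-on-exit discipline described in the curvnet commit above can be modelled in userland with a thread-local pointer and a pair of macros. This is only a sketch of the pattern: `TOY_VNET_SET()`, `TOY_VNET_RESTORE()`, `cur_vnet` and the helper functions are hypothetical names, not the kernel's CURVNET_* macros, and the C11 `_Thread_local` storage class stands in for curthread->td_vnet.

```c
#include <assert.h>
#include <stddef.h>

/* Toy vnet: just an identifier. */
struct vnet {
	int id;
};

/* Thread-local "current vnet" pointer, NULL outside networking code. */
static _Thread_local struct vnet *cur_vnet;

/* Save the previous context in a local and switch to the new one. */
#define	TOY_VNET_SET(v)							\
	struct vnet *saved_vnet_ = cur_vnet;				\
	cur_vnet = (v)

/* Revert to whatever context was active before TOY_VNET_SET(). */
#define	TOY_VNET_RESTORE()	cur_vnet = saved_vnet_

/* Inner operation: runs in its own vnet context, then restores. */
static int
inner_id(struct vnet *v)
{
	TOY_VNET_SET(v);
	assert(cur_vnet != NULL);
	int id = cur_vnet->id;
	TOY_VNET_RESTORE();
	return (id);
}

/*
 * Outer operation that recurses into another context. Recursion
 * works because each level saves and restores its own predecessor.
 */
static int
outer_op(struct vnet *outer, struct vnet *inner)
{
	TOY_VNET_SET(outer);
	int seen = inner_id(inner);
	int back = cur_vnet->id;	/* outer context is active again */
	TOY_VNET_RESTORE();
	return (seen * 10 + back);
}
```

Keeping the saved pointer in a local variable is what makes nested use safe, at the cost of requiring every SET to be paired with a RESTORE in the same scope.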
* sendfile doesn't modify the vnode - acquire vnode lock shared [kmacy, 2009-04-12, 1 file, -1/+1]
| Reviewed by: ups, jeffr
* Retire the MALLOC and FREE macros. They are an abomination unto style(9). [des, 2008-10-23, 1 file, -5/+5]
| MFC after: 3 months
* When sendto(2) is called with an explicit destination address argument, call mac_socket_check_connect() on that address before proceeding with the send. [rwatson, 2008-05-22, 1 file, -1/+5]
| Otherwise policies instrumenting the connect entry point for the purposes of checking destination addresses will not have the opportunity to check implicit connect requests.
| MFC after: 3 weeks
| Sponsored by: nCircle Network Security, Inc.
* When writing trailers in sendfile(2), don't call kern_writev() while holding the socket buffer lock. [rwatson, 2008-04-27, 1 file, -3/+4]
| This leads to an immediate panic due to recursing the socket buffer lock. This bug was introduced in uipc_syscalls.c:1.240, but masked by another bug until that was fixed in uipc_syscalls.c:1.269.
| Note that the current fix isn't perfect, but better than panicking: normally we guarantee that simultaneous invocations of a system call to write on a stream socket won't be interlaced, which is ensured by use of the socket buffer sleep lock. This is guaranteed for the sendfile headers, but not trailers. In practice, this is likely not a problem, but should be fixed.
| MFC after: 3 days
| Pointy hat to: andre (1.240), cperciva (1.269)