path: root/sys/kern/sys_pipe.c
Commit message | Author | Age | Files | Lines
* Don't dec/inc the amountpipes counter every time we resize a pipe -- | rwatson | 2004-02-03 | 1 | -2/+3
  instead, just dec/inc in the ctor/dtor. For now, increment/decrement in
  twos, since we're now performing the operation once per pair, not once
  per pipe. Not really any measurable performance change in my
  micro-benchmarks, but doing less work is good, especially when it comes
  to atomic operations.

  Suggested by: alc
* Catch instances of (pipe == NULL) that were obsoleted with recent | rwatson | 2004-02-03 | 1 | -6/+6
  changes to jointly allocated pipe pairs. Replace these checks with
  pipe_present checks. This avoids a NULL pointer dereference when a pipe
  is half-closed.

  Submitted by: Peter Edwards <peter.edwards@openet-telecom.com>
* Coalesce pipe allocations and frees. Previously, the pipe code | rwatson | 2004-02-01 | 1 | -91/+150
  would allocate two 'struct pipe's from the pipe zone, and malloc a mutex.

  - Create a new "struct pipepair" object holding the two 'struct pipe'
    instances, struct mutex, and struct label reference. Pipe structures
    now have a back-pointer to the pipe pair, and a 'pipe_present' flag to
    indicate whether the half has been closed.
  - Perform mutex init/destroy in zone init/destroy, avoiding reallocating
    the mutex for each pipe. Perform most pipe structure setup in zone
    constructor.
  - VM memory mappings for pageable buffers are still done outside of the
    UMA zone.
  - Change MAC API to speak 'struct pipepair' instead of 'struct pipe',
    update many policies. MAC labels are also handled outside of the UMA
    zone for now. Label-only policy modules don't have to be recompiled,
    but if a module is recompiled, its pipe entry points will need to be
    updated. If a module actually reached into the pipe structures
    (unlikely), that would also need to be modified.

  These changes substantially simplify failure handling in the pipe code as
  there are many fewer possible failure modes.

  On half-close, pipes no longer free the 'struct pipe' for the closed half
  until a full-close takes place. However, VM mapped buffers are still
  released on half-close.

  Some code refactoring is now possible to clean up some of the back
  references, etc; this patch attempts not to change the structure of most
  of the pipe implementation, only allocation/free code paths, so as to
  avoid introducing bugs (hopefully).

  This cuts about 8%-9% off the cost of sequential pipe allocation and free
  in system call tests on UP and SMP in my micro-benchmarks. May or may not
  make a difference in macro-benchmarks, but doing less work is good.

  Reviewed by: juli, tjr
  Testing help: dwhite, fenestro, scottl, et al
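The jointly allocated pair described in the entry above can be pictured with a small userspace model. This is only a sketch under assumptions: the names `pipe_half`, `pair_alloc`, `half_close`, and `live_pairs` are hypothetical, `malloc`/`free` stand in for the UMA zone, and the mutex, MAC label, and VM buffers are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model; not the kernel's actual definitions. */
struct pipepair;

struct pipe_half {
	struct pipepair *pair;	/* back-pointer to the enclosing pair */
	bool present;		/* cleared once this half has been closed */
};

struct pipepair {
	struct pipe_half rpipe;
	struct pipe_half wpipe;
};

static int live_pairs;		/* number of pairs currently allocated */

/* Allocate both halves in one allocation and mark them present. */
static struct pipepair *
pair_alloc(void)
{
	struct pipepair *pp = malloc(sizeof(*pp));

	if (pp == NULL)
		return (NULL);
	pp->rpipe.pair = pp;
	pp->rpipe.present = true;
	pp->wpipe.pair = pp;
	pp->wpipe.present = true;
	live_pairs++;
	return (pp);
}

/*
 * Close one half. The memory is released only when both halves are
 * closed; returns true if this call freed the whole pair.
 */
static bool
half_close(struct pipe_half *h)
{
	struct pipepair *pp = h->pair;

	h->present = false;
	if (!pp->rpipe.present && !pp->wpipe.present) {
		free(pp);
		live_pairs--;
		return (true);
	}
	return (false);
}
```

In this model a peer can test `present` instead of comparing a pointer against NULL, because the closed half stays allocated until the full close.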
* Fix an error in a KASSERT string: it's pipe_free_kmem(), not | rwatson | 2004-01-31 | 1 | -1/+1
  pipespace(), that contains this KASSERT.
* New file descriptor allocation code, derived from similar code introduced | des | 2004-01-15 | 1 | -0/+1
  in OpenBSD by Niels Provos. The patch introduces a bitmap of allocated
  file descriptors which is used to locate available descriptors when a new
  one is needed. It also moves the task of growing the file descriptor
  table out of fdalloc(), reducing complexity in both fdalloc() and
  do_dup().

  Debts of gratitude are owed to tjr@ (who provided the original patch on
  which this work is based), grog@ (for the gdb(4) man page) and rwatson@
  (for assistance with pxeboot(8)).
* Back out 1.160, which was committed by mistake. | des | 2004-01-11 | 1 | -1/+0
* Mechanical whitespace cleanup. | des | 2004-01-11 | 1 | -27/+27
* Mechanical whitespace cleanup + minor style nits. | des | 2004-01-11 | 1 | -1/+3
* Fix the maxpipekva warning message so that it points to the correct | silby | 2003-12-28 | 1 | -1/+1
  sysctl, and shorten the message.

  Noticed by: bde
* - Implement selwakeuppri() which allows raising the priority of a | tanimura | 2003-11-09 | 1 | -1/+1
    thread being woken up. The thread woken up can run at a priority as
    high as after tsleep().
  - Replace selwakeup()s with selwakeuppri()s and pass appropriate
    priorities.
  - Add cv_broadcastpri() which raises the priority of the broadcast
    threads. Used by selwakeuppri() if collision occurs.

  Not objected in: -arch, -current
* - Delay the allocation of memory for the pipe mutex until we need it. | alc | 2003-11-06 | 1 | -5/+1
    This avoids the need to free said memory in various error cases along
    the way.
* - Simplify pipespace() by eliminating the explicit creation of vm objects. | alc | 2003-11-06 | 1 | -10/+2
    Instead, let the vm objects be lazily instantiated at fault time. This
    results in the allocation of fewer vm objects and vm map entries due to
    aggregation in the vm system.
* Unlock pipe mutex when failing MAC pipe ioctl access control check. | rwatson | 2003-11-03 | 1 | -1/+3

  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, Network Associates Laboratories
* Change all SYSCTLS which are readonly and have a related TUNABLE | silby | 2003-10-21 | 1 | -1/+1
  from CTLFLAG_RD to CTLFLAG_RDTUN so that sysctl(8) can provide more
  useful error messages.
* falloc allocates a file structure and adds it to the file descriptor | dwmalone | 2003-10-19 | 1 | -1/+3
  table, acquiring the necessary locks as it works. It usually returns two
  references to the new descriptor: one in the descriptor table and one via
  a pointer argument.

  As falloc releases the FILEDESC lock before returning, there is a
  potential for a process to close the reference in the file descriptor
  table before falloc's caller gets to use the file. I don't think this can
  happen in practice at the moment, because Giant indirectly protects
  closes.

  To stop the file being completely closed in this situation, this change
  makes falloc set the refcount to two when both references are returned.
  This makes life easier for several of falloc's callers, because the first
  thing they previously did was grab an extra reference on the file.

  Reviewed by: iedowse
  Idea run past: jhb
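The reference-count change in the entry above can be illustrated with a toy model. This is a sketch under assumptions, not the kernel code: `file_model`, `falloc_model`, and `fdrop_model` are hypothetical names, and locking is omitted so that only the counting is visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; only the count is modeled. */
struct file_model {
	int refcount;
};

/*
 * Model of the allocation step: one reference always belongs to the
 * descriptor table; a second is taken up front when the caller also
 * receives a pointer to the file.
 */
static void
falloc_model(struct file_model *fp, bool caller_wants_pointer)
{
	fp->refcount = caller_wants_pointer ? 2 : 1;
}

/* Drop one reference; returns true when the last reference is gone. */
static bool
fdrop_model(struct file_model *fp)
{
	fp->refcount--;
	return (fp->refcount == 0);
}
```

With the count starting at two, a close that races in through the descriptor table drops only the table's reference, so the caller's pointer stays valid until the caller drops its own.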
* Fix a problem referencing free'd memory. This is only a problem for | jmg | 2003-10-12 | 1 | -2/+7
  kqueue write events on a socket if you regularly create tons of pipes,
  which overwrites the structure, causing a panic when removing the knote
  from the list.

  If the peer has gone away (and it's a write knote), then don't bother
  trying to remove the knote from the list.

  Submitted by: Brian Buchanan and myself
  Obtained from: nCircle
* pipe_build_write_buffer() only requires read access of the page that it | alc | 2003-09-12 | 1 | -1/+2
  obtains from pmap_extract_and_hold().
* Use pmap_extract_and_hold() in pipe_build_write_buffer(). Consequently, | alc | 2003-09-08 | 1 | -35/+11
  pipe_build_write_buffer() no longer requires Giant on entry.

  Reviewed by: tegge
* Giant is no longer required by pipe_destroy_write_buffer(). Reduce | alc | 2003-09-06 | 1 | -9/+7
  unnecessary white space from pipe_destroy_write_buffer().
* If we got this far, we definitely don't have an EBADF. Return a more | jmg | 2003-08-15 | 1 | -1/+1
  sane result of EPIPE.

  Reported by: nCircle dev team
  MFC after: 3 days
* - The vm_object pointer in pipe_buffer is unused. Remove it. | alc | 2003-08-13 | 1 | -5/+2
  - Check for successful initialization of pipe_zone in pipeinit() rather
    than every call to pipe(2).
* Pipespace() no longer requires Giant. | alc | 2003-08-11 | 1 | -2/+2
* More pipe changes: | silby | 2003-08-11 | 1 | -38/+26
  From alc:
  Move pageable pipe memory to a separate kernel submap to avoid awkward
  vm map interlocking issues. (Bad explanation provided by me.)

  From me:
  Rework pipespace accounting code to handle this new layout, and adjust
  our default values to account for the fact that we now have a solid limit
  on allocations. Also, remove the "maxpipes" limit, as it no longer has a
  purpose. (The limit on kva usage solves the problem of having too many
  pipes.)
* Use vm_page_hold() instead of vm_page_wire(). Otherwise, a multithreaded | alc | 2003-08-11 | 1 | -3/+3
  application could cause a wired page to be freed. In general,
  vm_page_hold() should be preferred for ephemeral kernel mappings of pages
  borrowed from a user-level address space. (vm_page_wire() should really
  be reserved for indefinite duration pinning by the "owner" of the page.)

  Discussed with: silby
  Submitted by: tegge
* - Remove GIANT_REQUIRED from pipespace(). | alc | 2003-08-08 | 1 | -4/+0
  - Remove a duplicate initialization from pipe_create().
* - Remove GIANT_REQUIRED from pipe_free_kmem(). | alc | 2003-08-07 | 1 | -3/+0
  - Remove the acquisition and release of Giant around pipe_kmem_free()
    and uma_zfree() in pipeclose().
* Remove test in pipe_write() which causes write(2) to return EAGAIN | pb | 2003-07-30 | 1 | -1/+1
  on a non-blocking pipe in cases where select(2) returns the file
  descriptor as ready for write. This in turn causes libc_r, for one, to
  busy wait in such cases.

  Note: it is a quick performance fix, a more complex fix might be required
  in case this turns out to have unexpected side effects.

  Reviewed by: silby
  MFC after: 3 days
* The introduction of vm object locking has caused witness to reveal | alc | 2003-07-30 | 1 | -1/+1
  a long-standing mistake in the way a portion of a pipe's KVA is
  allocated. Specifically, kmem_alloc_pageable() is inappropriate for use
  in the "direct" case because it allows a preceding vm map entry and vm
  object to be extended to support the new KVA allocation. However, the
  direct case KVA allocation should not have a backing vm object. This is
  corrected by using kmem_alloc_nofault().

  Submitted by: tegge (with the above explanation by me)
* A few minor changes: | silby | 2003-07-09 | 1 | -6/+7
  - Use atomic ops to update the bigpipe count
  - Make the bigpipe count sysctl readable
  - Remove a duplicate comparison in an if statement
  - Comment two SYSCTLs.
* Put some concrete limits on pipe memory consumption: | silby | 2003-07-08 | 1 | -17/+68
  - Limit the total number of pipes so that we do not exhaust all vm
    objects in the kernel map. When this limit is reached, a ratelimited
    message will be printed to the console.
  - Put a soft limit on the amount of memory consumable by pipes. Once the
    limit has been reached, all new pipes will be limited to 4K in size,
    rather than the default of 16K.
  - Put a limit on the number of pages that may be used for high speed page
    flipping in order to reduce the amount of wired memory. Pipe writes
    that occur while this limit is exceeded will fall back to non-page
    flipping mode.

  The above values are auto-tuned in subr_param.c and are scaled to take
  into account both the size of physical memory and the size of the kernel
  map.

  These limits help to reduce the "kernel resources exhausted" panics that
  could be caused by opening a large number of pipes. (Pipes alone are no
  longer able to exhaust all resources, but other kernel memory hogs in
  league with pipes may still be able to do so.)

  PR: 53627
  Ideas / comments from: hsu, tjr, dillon@apollo.backplane.com
  MFC after: 1 week
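The soft limit and the page-flipping limit described in the entry above amount to two threshold checks. The following is a sketch under assumptions: the function and macro names are hypothetical, the 4K and 16K sizes come from the commit message, and the exact comparison operators used in the kernel are not stated there, so the ones below are guesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SMALL_PIPE_SIZE	(4 * 1024)	/* used once the soft limit is reached */
#define MODEL_DEFAULT_PIPE_SIZE	(16 * 1024)	/* normal buffer size */

/* Pick the buffer size for a new pipe from the current pipe memory usage. */
static size_t
choose_pipe_size(size_t bytes_in_use, size_t soft_limit)
{
	if (bytes_in_use >= soft_limit)
		return (MODEL_SMALL_PIPE_SIZE);
	return (MODEL_DEFAULT_PIPE_SIZE);
}

/*
 * Decide whether a write may use page flipping. While the wired-page
 * limit is exceeded, the write falls back to the ordinary copy path.
 */
static bool
may_use_page_flipping(size_t wired_pages, size_t wired_limit)
{
	return (wired_pages <= wired_limit);
}
```

Neither check fails the request: a new pipe still gets a (smaller) buffer and a write still completes, only by the slower path.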
* Initialize struct fileops with C99 sparse initialization. | phk | 2003-06-18 | 1 | -2/+8
* Use __FBSDID(). | obrien | 2003-06-11 | 1 | -2/+3
* style(9). | mux | 2003-06-09 | 1 | -12/+20
* Need to hold the same SMP lock for (knote) list traversal as for | hsu | 2003-04-02 | 1 | -1/+1
  list manipulation. This lock also protects read-modify-write operations
  on the pipe_state field.
* - Add vm_paddr_t, a physical address type. This is required for systems | jake | 2003-03-25 | 1 | -1/+2
    where physical addresses are larger than virtual addresses, such as
    i386s with PAE.
  - Use this to represent physical addresses in the MI vm system and in the
    i386 pmap code. This also changes the paddr parameter to d_mmap_t.
  - Fix printf formats to handle physical addresses >4G in the i386 memory
    detection code, and due to kvtop returning vm_paddr_t instead of
    u_long.

  Note that this is a name change only; vm_paddr_t is still the same as
  vm_offset_t on all currently supported platforms.

  Sponsored by: DARPA, Network Associates Laboratories
  Discussed with: re, phk (cdevsw change)
* Back out M_* changes, per decision of the TRB. | imp | 2003-02-19 | 1 | -2/+2

  Approved by: trb
* Do not allow kqueues to be passed via unix domain sockets. | alfred | 2003-02-15 | 1 | -1/+1
* Use atomic ops to update amountpipekva. Amountpipekva represents the | alc | 2003-02-13 | 1 | -5/+8
  total kernel virtual address space used by all pipes. It is, thus,
  outside the scope of any individual pipe lock.
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. | alfred | 2003-01-21 | 1 | -2/+2
  Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
* Bow to the whining masses and change a union back into void *. Retain | dillon | 2003-01-13 | 1 | -12/+12
  removal of unnecessary casts and throw in some minor cleanups to see if
  anyone complains, just for the hell of it.
* Change struct file f_data to un_data, a union of the correct struct | dillon | 2003-01-12 | 1 | -12/+12
  pointer types, and remove a huge number of casts from code using it.

  Change struct xfile xf_data to xun_data (ABI is still compatible).

  If we need to add a #define for f_data and xf_data we can, but I don't
  think it will be necessary. There are no operational changes in this
  commit.
* White-space changes. | phk | 2002-12-24 | 1 | -7/+7
* Detediousficate declaration of fileops array members by introducing | phk | 2002-12-23 | 1 | -12/+7
  typedefs for them.
* Remove a KASSERT I added in 1.73 to catch uninitialized pipes. | alfred | 2002-10-14 | 1 | -2/+0
  It must be removed because it is done without the pipe being locked via
  pipelock() and therefore is vulnerable to races with pipespace()
  erroneously triggering it by temporarily zero'ing out the structure
  backing the pipe.

  It looks as if this assertion is not needed because all manipulation of
  the data changed by pipespace() _is_ protected by pipelock().

  Reported by: kris, mckusick
* whitespace fixes. | alfred | 2002-10-12 | 1 | -2/+2
* Change iov_base's type from `char *' to the standard `void *'. All | mike | 2002-10-11 | 1 | -1/+1
  uses of iov_base which assume its type is `char *' (in order to do
  pointer arithmetic) have been updated to cast iov_base to `char *'.
* In an SMP environment post-Giant it is no longer safe to blindly | truckman | 2002-10-03 | 1 | -2/+2
  dereference the struct sigio pointer without any locking. Change
  fgetown() to take a reference to the pointer instead of a copy of the
  pointer and call SIGIO_LOCK() before copying the pointer and
  dereferencing it.

  Reviewed by: rwatson
* Improve locking of pipe mutexes in the context of MAC: | rwatson | 2002-10-01 | 1 | -4/+12
  (1) Where previously the pipe mutex was selectively grabbed during
      pipe_ioctl(), now always grab it and then release it if not needed.
      This protects the call to mac_check_pipe_ioctl() to make sure the
      label remains consistent. (Note: it looks like sigio locking may be
      incorrect for fgetown() since we call it not-by-reference and sigio
      locking assumes call by reference).
  (2) In pipe_stat(), lock the pipe if MAC is compiled in so that the call
      to mac_check_pipe_stat() gets a locked pipe to protect label
      consistency. We still release the lock before returning actual
      stat() data, risking inconsistency, but apparently our pipe locking
      model accepts that risk.
  (3) In various pipe MAC authorization checks, assert that the pipe lock
      is held.
  (4) Grab the lock when performing a pipe relabel operation, and assert it
      a little deeper in the stack.

  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, Network Associates Laboratories
* Be consistent about "static" functions: if the function is marked | phk | 2002-09-28 | 1 | -2/+2
  static in its prototype, mark it static at the definition too.

  Inspired by: FlexeLint warning #512
* Don't use "NULL" when "0" is really meant. | archie | 2002-08-21 | 1 | -2/+2