path: root/sys/kern
Commit message | Author | Age | Files | Lines
* - Move the function prototypes for kern_setrlimit() and kern_wait() to | jhb | 2005-01-05 | 1 | -0/+1
    sys/syscallsubr.h where all the other kern_foo() prototypes live.
    - Resort kern_execve() while I'm there.
* Rework the optimization for spinlocks on UP to be slightly less drastic and | jhb | 2005-01-05 | 1 | -8/+2
    turn it back on. Specifically, the actual changes are now less intrusive
    in that the _get_spin_lock() and _rel_spin_lock() macros now have their
    contents changed for UP vs SMP kernels which centralizes the changes.
    Also, UP kernels do not use _mtx_lock_spin() and no longer include it.
    The UP versions of the spin lock functions do not use any atomic
    operations, but simple compares and stores which allow mtx_owned() to
    still work for spin locks while removing the overhead of atomic operations.
    Tested on: i386, alpha
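    A rough sketch of the UP idea described here, assuming a critical_enter()/
    critical_exit() style primitive; these are not the actual sys/mutex.h
    macros, only an illustration of how the owner can be recorded with plain
    stores on a single CPU so that mtx_owned() keeps working:

        /* Illustrative only: a UP spin lock needs no atomic operations. */
        #define _get_spin_lock_up(m, tid) do {                          \
                critical_enter();       /* no other CPU can race us */  \
                (m)->mtx_lock = (uintptr_t)(tid);                       \
        } while (0)

        #define _rel_spin_lock_up(m) do {                               \
                (m)->mtx_lock = MTX_UNOWNED;                            \
                critical_exit();                                        \
        } while (0)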
* Since we do not support forceful unmount of DEVFS we can do away with | phk | 2005-01-04 | 1 | -45/+3
    the partially implemented vnode-readoption code in vgonechrl().
* Regen. | marcel | 2005-01-03 | 2 | -3/+3
* uuidgen(2) is MP safe. | marcel | 2005-01-03 | 1 | -1/+1
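    For reference, a minimal userland use of the uuidgen(2) syscall mentioned
    above; this program is an illustration only (header locations assumed),
    not part of the commit:

        #include <sys/types.h>
        #include <sys/uuid.h>
        #include <stdio.h>

        int
        main(void)
        {
                struct uuid id;

                /* Ask the kernel for one freshly generated UUID. */
                if (uuidgen(&id, 1) != 0) {
                        perror("uuidgen");
                        return (1);
                }
                printf("%08x-%04x-%04x\n", (unsigned)id.time_low,
                    (unsigned)id.time_mid, (unsigned)id.time_hi_and_version);
                return (0);
        }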
* Implement device_quiesce. This method means 'you are about to be | imp | 2004-12-31 | 2 | -0/+132
    unloaded, clean up, or return EBUSY if that's inconvenient.' The default
    module handler for newbus will now call this when we get a MOD_QUIESCE
    event, but in the future may call this at other times. This shouldn't
    change any actual behavior until drivers start to use it.
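    A sketch of how a driver might provide the new method; the foo_* names and
    softc fields are hypothetical, only the device_quiesce method itself comes
    from this commit:

        static int
        foo_quiesce(device_t dev)
        {
                struct foo_softc *sc = device_get_softc(dev);

                if (sc->sc_busy)
                        return (EBUSY); /* unloading now would be inconvenient */
                sc->sc_quiesced = 1;    /* stop handing out new work */
                return (0);
        }

        static device_method_t foo_methods[] = {
                DEVMETHOD(device_probe,         foo_probe),
                DEVMETHOD(device_attach,        foo_attach),
                DEVMETHOD(device_detach,        foo_detach),
                DEVMETHOD(device_quiesce,       foo_quiesce),
                { 0, 0 }
        };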
* Be consistent and always use form 'return (value);' instead of 'return value;'. | pjd | 2004-12-31 | 1 | -15/+15
    We had (before this change) 84 lines where it was style(9)-clean and
    15 lines where it was not.
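    The style(9) form in question, for illustration:

        return error;           /* old form, not style(9) */
        return (error);         /* form used consistently after this change */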
* Fix a typo and two whitespace nits. | jhb | 2004-12-30 | 1 | -3/+3
* Rework the interface between priority propagation (lending) and the | jhb | 2004-12-30 | 3 | -105/+299
    schedulers a bit to ensure more correct handling of priorities and fewer
    priority inversions:
    - Add two functions to the sched(9) API to handle priority lending:
      sched_lend_prio() and sched_unlend_prio(). The turnstile code uses these
      functions to ask the scheduler to lend a thread a set priority and to
      tell the scheduler when it thinks it is ok for a thread to stop borrowing
      priority. The unlend case is slightly complex in that the turnstile code
      tells the scheduler what the minimum priority of the thread needs to be
      to satisfy the requirements of any other threads blocked on locks owned
      by the thread in question. The scheduler then decides whether the thread
      can go back to normal mode (if its normal priority is high enough to
      satisfy the pending lock requests) or if it should continue to use the
      priority specified to the sched_unlend_prio() call. This involves adding
      a new per-thread flag TDF_BORROWING that replaces the ULE-only kse flag
      for priority elevation.
    - Schedulers now refuse to lower the priority of a thread that is
      currently borrowing another thread's priority.
    - If a scheduler changes the priority of a thread that is currently
      sitting on a turnstile, it will call a new function turnstile_adjust()
      to inform the turnstile code of the change. This function resorts the
      thread on the priority list of the turnstile if needed, and if the
      thread ends up at the head of the list (due to having the highest
      priority) and its priority was raised, then it will propagate that new
      priority to the owner of the lock it is blocked on.
    Some additional fixes specific to the 4BSD scheduler include:
    - Common code for updating the priority of a thread when the user priority
      of its associated kse group changes has been consolidated in a new
      static function resetpriority_thread(). One change to this function is
      that it will now only adjust the priority of a thread if it already has
      a time sharing priority, thus preserving any boosts from a tsleep()
      until the thread returns to userland. Also, resetpriority() no longer
      calls maybe_resched() on each thread in the group. Instead, the code
      calling resetpriority() is responsible for calling
      resetpriority_thread() on any threads that need to be updated.
    - schedcpu() now uses resetpriority_thread() instead of just calling
      sched_prio() directly after it updates a kse group's user priority.
    - sched_clock() now uses resetpriority_thread() rather than writing
      directly to td_priority.
    - sched_nice() now updates all the priorities of the threads after the
      group priority has been adjusted.
    Discussed with: bde
    Reviewed by: ups, jeffr
    Tested on: 4bsd, ule
    Tested on: i386, alpha, sparc64
* Whitespace fix. | jhb | 2004-12-30 | 1 | -0/+1
* Stop explicitly touching td_base_pri outside of the scheduler and simply | jhb | 2004-12-30 | 3 | -7/+3
    set a thread's priority via sched_prio() when that is the desired action.
    The schedulers will start managing td_base_pri internally shortly.
* Call tty_close() at the very end of ttyclose() since otherwise NULL | jhb | 2004-12-30 | 1 | -1/+1
    dereferences can occur since tty_close() may end up freeing the tty
    structure if it drops the last reference to it.
    Glanced at by: phk
* Make the sysctls kern.ipc.msgmnb and kern.ipc.msgtql into tunables as | rwatson | 2004-12-30 | 1 | -2/+4
    is the case for most other sysctls in the System V IPC message queue
    implementation.
    PR: 75541
    Submitted by: Sergiy Vyshnevetskiy <serg at vostok dot net>
    MFC after: 2 weeks
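    A sketch of the usual loader-tunable-plus-sysctl pattern this refers to;
    the default value and description string below are placeholders, not the
    committed ones:

        static int msgmnb = 2048;               /* placeholder default */
        TUNABLE_INT("kern.ipc.msgmnb", &msgmnb);
        SYSCTL_INT(_kern_ipc, OID_AUTO, msgmnb, CTLFLAG_RW, &msgmnb, 0,
            "Maximum number of bytes in a message queue");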
* Make umtx_wait and umtx_wake more like Linux futexes; the interface is | davidxu | 2004-12-30 | 1 | -41/+9
    more general than before. It also lets me implement a cancellation point
    in the thread library. In theory, umtx_lock and umtx_unlock can now be
    implemented using umtx_wait and umtx_wake, with all atomic operations done
    in userland without the kernel's casuptr() function.
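    A conceptual sketch of the futex-style pattern described above: atomics
    stay in userland and the kernel is entered only to sleep or to wake
    sleepers. The umtx_wait()/umtx_wake() wrappers and their arguments are
    assumptions for illustration, not the committed interface (needs
    <sys/types.h>, <machine/atomic.h> and <sys/umtx.h> for UMTX_UNOWNED):

        /* Hypothetical wrappers around the primitives described above. */
        void    umtx_wait(volatile u_long *p, u_long expect);  /* sleep while *p == expect */
        void    umtx_wake(volatile u_long *p, int nwake);      /* wake up to nwake sleepers */

        void
        my_umtx_lock(volatile u_long *lock, u_long tid)
        {
                u_long owner;

                while (atomic_cmpset_acq_long(lock, UMTX_UNOWNED, tid) == 0) {
                        owner = *lock;
                        if (owner != UMTX_UNOWNED)
                                umtx_wait(lock, owner); /* kernel rechecks the value */
                }
        }

        void
        my_umtx_unlock(volatile u_long *lock, u_long tid)
        {
                atomic_cmpset_rel_long(lock, tid, UMTX_UNOWNED);
                umtx_wake(lock, 1);                     /* wake at most one waiter */
        }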
* Eliminate (now) unnecessary acquisition and release of the global page | alc | 2004-12-29 | 1 | -4/+0
    queues lock.
* - Up the WITNESS_COUNT macro from 200 to 1024 to support the growing number | jhb | 2004-12-28 | 1 | -2/+1
      of lock types in the kernel. This results in an increase of witness data
      usage from ~145k to ~280k on i386 for kernels with 'options WITNESS'.
    - Remove the unused witness malloc bucket.
    Submitted by: Michal Mertl mime at traveller dot cz (1)
* Attempt to slightly refine the print out from "show alllocks" -- list | rwatson | 2004-12-27 | 1 | -2/+2
    the process and thread numbers/names on the same line rather than on
    separate lines, and print the thread pointer not just the tid.
* Do not vput(9) an unlocked vnode and do not VREF it with the sole purpose | kan | 2004-12-27 | 1 | -2/+0
    of vputting it back immediately.
    Complained by: DEBUG_VFS_LOCKS
* - Unintentionally checked in a debugging panic. Remove that. | jeff | 2004-12-26 | 1 | -4/+0
* - Remove a 4BSD specific hack since this will work on ULE too. | jeff | 2004-12-26 | 1 | -4/+0
* - Fix a long standing problem where an ithread would not honor sched_pin(). | jeff | 2004-12-26 | 1 | -127/+140
    - Remove the sched_add wrapper that used sched_add_internal() as a
      backend. Its only purpose was to interpret one flag and turn it into an
      int. Do the right thing and interpret the flag in sched_add() instead.
    - Pass the flag argument to sched_add() to kseq_runq_add() so that we can
      get the SRQ_PREEMPT optimization too.
    - Add a KEF_INTERNAL flag. If KEF_INTERNAL is set we don't adjust the SLOT
      counts, otherwise the slot counts are adjusted as soon as we enter
      sched_add() or sched_rem() rather than when the thread is actually
      placed on the run queue. This greatly simplifies the handling of slots.
    - Remove the explicit prevention of migration for ithreads on non-x86
      platforms. This was never shown to have any real benefit.
    - Remove the unused class argument to KSE_CAN_MIGRATE().
    - Add ktr points for thread migration events.
    - Fix a long standing bug on platforms which don't initialize the cpu
      topology. The ksg_maxid variable was never correctly set on these
      platforms which caused the long term load balancer to never inspect
      more than the first group or processor.
    - Fix another bug which prevented the long term load balancer from working
      properly. If stathz != hz we can't expect sched_clock() to be called on
      the exact tick count that we're anticipating.
    - Rearrange sched_switch() a bit to reduce indentation levels.
* Add "show alllocks" command to DDB, which dumps a list of processesrwatson2004-12-261-0/+42
| | | | | | | | | | | | and threads currently holding sleep mutexes (and spin mutexes for curthread). This can be quite useful in looking for a lock condition summary for a system, as it avoids manually iterating through threads and processes to find all the interesting locks. NB: "alllocks" is up there with "lockedvnods" for a bad argument for show. MFC after: 2 weeks
* - Run sched_userret() after thread_userret(). Before, sched_userret() would | jeff | 2004-12-26 | 1 | -5/+4
      lower the priority of the returning thread to a user priority before
      calling into thread_userret() which would call wakeup() which in turn
      would cause the returning thread to eventually context switch rather
      than completing its slice. Allowing this thread to complete its slice
      first yields a 15% performance improvement in super-smack on my dual
      opteron with 4BSD.
* - Wrap the thread count adjustment in sched_load_add() and sched_load_rem() | jeff | 2004-12-26 | 1 | -6/+30
      so that we may place some ktr entries nearby.
    - Define other KTR_SCHED tracepoints so that we may graph the operation of
      the scheduler.
* - Remove earlier KTR_ULE tracepoints. | jeff | 2004-12-26 | 1 | -32/+14
    - Define new KTR_SCHED points so that we can graph the operation of the
      scheduler.
* - Define KTR points for KTR_SCHED. | jeff | 2004-12-26 | 3 | -0/+22
* Make _umtx_op() a more general interface; the final parameter needn't be | davidxu | 2004-12-25 | 4 | -7/+7
    a timespec pointer, and every parameter is interpreted according to its
    opcode.
* 1. Introduce umtx_owner to get the owner of a umtx. | davidxu | 2004-12-25 | 1 | -3/+1
    2. Add const qualifier to umtx_timedlock and umtx_timedwait.
    3. Add missing brackets in umtx do_unlock_and_wait.
* Add umtxq_lock/unlock around umtx_signal, fix debug kernel compilation, | davidxu | 2004-12-24 | 1 | -5/+9
    and let umtx_lock return EINTR when it would return ERESTART; this gives
    userland a chance to back off in its mutex locking code when needed.
* 1. Fix a race condition between umtx lock and unlock; heavy testing | davidxu | 2004-12-24 | 1 | -133/+104
    on SMP can expose the bug.
    2. Let umtx_wake return the number of threads that have been woken.
* Assert the sem lock in sem_ref() and sem_rel(), as it is required to | rwatson | 2004-12-23 | 1 | -0/+2
    safely manipulate the reference count.
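    The pattern being added, roughly (structure and field names approximate,
    not the committed diff):

        static void
        sem_ref(struct ksem *ks)
        {

                mtx_assert(&sem_lock, MA_OWNED); /* caller must hold the sem lock */
                ks->ks_ref++;
        }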
* Remove temporary debugging printf that was used to detect the presence | rwatson | 2004-12-23 | 1 | -4/+0
    of a race that had previously caused a panic in order to determine if the
    fix was for the right problem. It was.
    MFC after: 2 weeks
* In sonewconn(), the s/if/while/ change to wait for room at the tail of | rwatson | 2004-12-23 | 2 | -10/+10
    the accept queue is a feature, not a bug/issue, so remove the XXXRW from
    the comment.
* Remove an XXXRW indicating atomic operations might be used as a | rwatson | 2004-12-23 | 1 | -12/+4
    substitute for a global mutex protecting the socket count and generation
    number.
    The observation that soreceive_rcvoob() can't return an mbuf chain is a
    property, not a bug, so remove the XXXRW.
    In sorflush, s/existing/previous/ for code when describing prior behavior.
    For SO_LINGER socket option retrieval, remove an XXXRW about why we hold
    the mutex: this is correct and not dubious.
    MFC after: 2 weeks
* In soalloc(), simplify the mac_init_socket() handling to remove | rwatson | 2004-12-23 | 1 | -14/+3
    unnecessary use of a global variable and simplify the return case. While
    here, use ()'s around return values.
    In sodealloc(), remove a comment about why we bump the gencnt and
    decrement the socket count separately. It doesn't add substantially to
    the reading, and clutters the function.
    MFC after: 2 weeks
* Add send buffer locking to uipc_send(). Without this locking a race can | alc | 2004-12-22 | 1 | -0/+3
    occur between a reader and a writer that results in a panic upon close,
    e.g., "panic: sbflush_locked: cc 4 || mb 0xffffff0052afa400 || mbcnt 0"
    Reviewed by: rwatson@
    MFC after: 2 weeks
* Include uio.h | phk | 2004-12-22 | 1 | -3/+3
    Check O_NONBLOCK instead of IO_NDELAY.
    Don't include vnode.h.
* Hide/remove various printfs, now that root mounting doesn't seem to explode | phk | 2004-12-20 | 1 | -9/+2
    on people.
* fix a misleading sleep identifier. | phk | 2004-12-20 | 1 | -1/+1
* We can only ever get to vgonechrl() from a devfs vnode, so we do not | phk | 2004-12-20 | 1 | -2/+0
    need to reassign vp->v_op to devfs_specops; we know that is the value
    already.
    Make devfs_specops private to devfs.
* 1. msleep returns EWOULDBLOCK not ETIMEDOUT; use EWOULDBLOCK instead. | davidxu | 2004-12-18 | 1 | -8/+6
    2. Eliminate a possible lock leak in the timed wait loop.
* 1. Make umtx sharable between processes: two or more processes | davidxu | 2004-12-18 | 4 | -170/+548
    call mmap() to create a shared space and then initialize a umtx in it;
    after that, threads in different processes can use the umtx just as
    threads in the same process do.
    2. Introduce a new syscall, _umtx_op, to support timed lock and condition
    variable semantics. The original umtx_lock and umtx_unlock inline
    functions are now reimplemented using _umtx_op, and _umtx_op can use an
    arbitrary id, not just a thread id.
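    A userland sketch of the sharing model described in point 1: the lock word
    lives in MAP_SHARED memory, so after fork() both processes contend on the
    same umtx. The my_umtx_lock()/my_umtx_unlock() helpers are the
    hypothetical ones sketched earlier in this log, and using the pid as the
    lock id is a simplification, not the committed interface:

        #include <sys/types.h>
        #include <sys/mman.h>
        #include <unistd.h>

        int
        main(void)
        {
                volatile u_long *lock;

                /* Anonymous shared memory survives fork() and stays shared. */
                lock = mmap(NULL, sizeof(*lock), PROT_READ | PROT_WRITE,
                    MAP_ANON | MAP_SHARED, -1, 0);
                if (lock == MAP_FAILED)
                        return (1);
                *lock = UMTX_UNOWNED;           /* start out unlocked */

                if (fork() == 0) {
                        my_umtx_lock(lock, (u_long)getpid());
                        /* ... touch data shared with the parent ... */
                        my_umtx_unlock(lock, (u_long)getpid());
                        _exit(0);
                }
                my_umtx_lock(lock, (u_long)getpid());
                my_umtx_unlock(lock, (u_long)getpid());
                return (0);
        }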
* fix m_append() for the case where additional mbufs are required | sam | 2004-12-15 | 1 | -2/+2
* Fix a deadlock I introduced this morning. | phk | 2004-12-14 | 1 | -6/+7
    Mostly from: tegge
* - Garbage collect several unused members of struct kse and struct ksegrp. | jeff | 2004-12-14 | 4 | -26/+0
      As best as I can tell, some of these were never used.
* - In kseq_choose(), don't recalculate slice values for processes with a | jeff | 2004-12-14 | 1 | -11/+25
      nice of 0. Doing so can cause an infinite loop because they should be
      running, but a nice -20 process could prevent them from doing so.
    - Add a new flag KEF_PRIOELEV to flag a thread that has had its priority
      elevated due to priority propagation. If a thread has had its priority
      elevated, we assume that it must go on the current queue and it must
      get a slice.
    - In sched_userret() if our priority was elevated and we shouldn't have a
      timeslice, yield here until we should.
    Found/Tested by: glebius
* Add a new kind of reference count (fd_holdcnt) to struct filedesc | phk | 2004-12-14 | 1 | -16/+45
    which holds on to just the data structure and the mutex. (The existing
    refcount (fd_refcnt) holds onto the open files in the descriptor.)
    The fd_holdcnt is protected by fdesc_mtx, fd_refcnt by FILEDESC_LOCK.
    Add fdhold(struct proc *) which gets a hold on the filedescriptors of the
    specified proc.
    Add fddrop(struct filedesc *) which drops the fd_holdcnt and if zero
    destroys the mutex and frees the memory.
    Initialize the fd_holdcnt to one in fdinit(). Normal operations on the
    filedesc structure will not change it.
    In fdfree() use fddrop() to dispose of the mutex and structure. Hold the
    FILEDESC_LOCK() until we have cleaned out the contents and carefully set
    the fields to null values during cleanup.
    Use fdhold()/fddrop() in mountcheckdirs() and sysctl_kern_file().
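    A sketch following the description above (field and macro names
    approximate, not the committed code):

        static void
        fddrop(struct filedesc *fdp)
        {
                int n;

                mtx_lock(&fdesc_mtx);
                n = --fdp->fd_holdcnt;  /* fd_holdcnt is protected by fdesc_mtx */
                mtx_unlock(&fdesc_mtx);
                if (n > 0)
                        return;

                /* Last hold gone: the mutex and the structure itself may go. */
                mtx_destroy(&fdp->fd_mtx);
                free(fdp, M_FILEDESC);
        }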
* Make fdesc_mtx private to kern_descrip.c now that the flock has come home. | phk | 2004-12-14 | 1 | -1/+1
* Move the checkdirs() function from vfs_mount.c to kern_descrip.c and | phk | 2004-12-14 | 2 | -52/+51
    call it mountcheckdirs().
* Add new function fdunshare() which encapsulates the necessary light magic | phk | 2004-12-14 | 3 | -22/+22
    for ensuring that a process' filedesc is not shared with anybody.
    Use it in the two places which previously had private implementations.
    This collects all fd_refcnt handling in kern_descrip.c.