path: root/sys/kern/subr_turnstile.c
Commit log (per entry: commit message, author, date, files changed, lines -removed/+added)
* Improve check coverage about idle threads.
  Idle threads are not allowed to acquire any lock but spinlocks. Deny any attempt to do so by panicking at the locking operation when INVARIANTS is on. Then, remove the check on blocking on a turnstile. The check in sleepqueues is left because they are not allowed to use tsleep() either, which could still happen.
  Reviewed by: bde, jhb, kib
  MFC after: 1 week
  [attilio, 2012-09-12, 1 file, -1/+0]
* Mark the idle threads as non-sleepable and also assert that an idle thread never blocks on a turnstile.
  [jhb, 2012-08-22, 1 file, -0/+1]
* Implement the DTrace sched provider.
  This implementation aims to be compatible with the sched provider implemented by Solaris and its open-source derivatives. Full documentation of the sched provider can be found on Oracle's DTrace wiki pages.
  Note that for compatibility with scripts originally written for Solaris, several probes are defined that will never fire. These probes are defined to fire when Solaris-specific features perform certain actions. As these features are not present in FreeBSD, the probes can never fire.
  Also, I have added two probes that are not defined in Solaris, lend-pri and load-change. These probes have been added to make it possible to collect schedgraph data with DTrace.
  Finally, a few probes are defined in Solaris to take a cpuinfo_t * argument. As it was not immediately clear to me how to translate that to FreeBSD, currently those probes are passed NULL in place of a cpuinfo_t *.
  Sponsored by: Sandvine Incorporated
  MFC after: 2 weeks
  [rstone, 2012-05-15, 1 file, -0/+10]
* Fix a typo.
  Approved by: gnn (mentor)
  MFC after: 2 days
  [davide, 2012-04-14, 1 file, -1/+1]
* Fix !DDB build after r234190.
  [marius, 2012-04-14, 1 file, -1/+1]
* - Extend the KDB interface to add a per-debugger callback to print a backtrace for an arbitrary thread (rather than the calling thread). A kdb_backtrace_thread() wrapper function uses the configured debugger if possible, otherwise it falls back to using stack(9) if that is available.
  - Replace a direct call to db_trace_thread() in propagate_priority() with a call to kdb_backtrace_thread() instead.
  MFC after: 1 week
  [jhb, 2012-04-12, 1 file, -3/+1]
* Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
  The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
  [ed, 2011-11-07, 1 file, -2/+3]
* Always assert that the turnstile chain lock is held in turnstile_wait() and remove a duplicate hash lookup.
  MFC after: 1 week
  [jhb, 2011-02-04, 1 file, -2/+1]
* Introduce the new kernel thread called "deadlock resolver".
  While the name is pretentious, a good explanation of its targets is reported in this 17-month-old presentation e-mail:
  http://lists.freebsd.org/pipermail/freebsd-arch/2008-August/008452.html
  In order to implement it, the sq_type in sleepqueues is mandatory and not only compiled along with the INVARIANTS option. Additionally, a new sleepqueue function, sleepq_type(), is added, returning the type of the sleepqueue linked to a wchan.
  Three new sysctls are added in order to configure the thread:
    debug.deadlkres.slptime_threshold
    debug.deadlkres.blktime_threshold
    debug.deadlkres.sleepfreq
  representing the thresholds for sleep and block time that will lead to a deadlock matching (when exceeded), while sleepfreq represents the number of seconds between 2 consecutive thread runnings.
  In order to enable the deadlock resolver thread, recompile your kernel with the option DEADLKRES.
  Reviewed by: jeff
  Tested by: pho, Giovanni Trematerra
  Sponsored by: Nokia Incorporated, Sandvine Incorporated
  MFC after: 2 weeks
  [attilio, 2010-01-09, 1 file, -0/+2]
* Fix indentation.
  [ed, 2009-12-20, 1 file, -1/+1]
* Make ddb command registration dynamic so modules can extend the command set (only so long as the module is present):
  o add db_command_register and db_command_unregister to add and remove commands, respectively
  o replace linker sets with SYSINIT's (and SYSUINIT's) that register commands
  o expose 3 list heads: db_cmd_table, db_show_table, and db_show_all_table for registering top-level commands, show operands, and show all operands, respectively
  While here also:
  o sort command lists
  o add DB_ALIAS, DB_SHOW_ALIAS, and DB_SHOW_ALL_ALIAS to add aliases for existing commands
  o add "show all trace" as an alias for "show alltrace"
  o add "show all locks" as an alias for "show alllocks"
  Submitted by: Guillaume Ballet <gballet@gmail.com> (original version)
  Reviewed by: jhb
  MFC after: 1 month
  [sam, 2008-09-15, 1 file, -1/+2]
* - Reduce scope of #ifdef's in uma_zcreate() call in init_turnstile0().
  - Set UMA_ZONE_NOFREE so that the per-turnstile spin locks are type stable to avoid a race where one thread might dereference a lock in a free'd turnstile that was previously used by another thread.
  Theorized by: tegge (2)
  MFC after: 1 week
  [jhb, 2008-09-08, 1 file, -3/+4]
* - Make SCHED_STATS more generic by adding a wrapper to create the variables and sysctl nodes.
  - In reset walk the children of kern_sched_stats and reset the counters via the oid_arg1 pointer. This allows us to add arbitrary counters to the tree and still reset them properly.
  - Define a set of switch types to be passed with flags to mi_switch(). These types are named SWT_*. These types correspond to SCHED_STATS counters and are automatically handled in this way.
  - Make the new SWT_ types more specific than the older switch stats. There are now stats for idle switches, remote idle wakeups, remote preemption ithreads idling, etc.
  - Add switch statistics for ULE's pickcpu algorithm. These stats include how much migration there is, how often affinity was successful, how often threads were migrated to the local cpu on wakeup, etc.
  Sponsored by: Nokia
  [jeff, 2008-04-17, 1 file, -2/+1]
* - Add THREAD_LOCKPTR_ASSERT() to assert that the thread's lock points at the provided lock or &blocked_lock. The thread may be temporarily assigned to the blocked_lock by the scheduler so a direct comparison can not always be made.
  - Use THREAD_LOCKPTR_ASSERT() in the primary consumers of the scheduling interfaces. The schedulers themselves still use more explicit asserts.
  Sponsored by: Nokia
  [jeff, 2008-02-07, 1 file, -7/+7]
* Adaptive spinning in write path with readers and writer starvation avoidance.
  - Move recursion checking into rwlock inlines to free a bit for use with adaptive spinners.
  - Clear the RW_LOCK_WRITE_SPINNERS flag whenever the lock state changes causing write spinners to restart their loop.
  - Write spinners are limited by a count while readers hold the lock as there is no way to know for certain whether readers are running still.
  - In the read path block if there are write waiters or spinners to avoid starving writers. Use a new per-thread count, td_rw_rlocks, to skip starvation avoidance if it might cause a deadlock.
  - Remove or change invalid assertions in turnstiles.
  Reviewed by: attilio (developed parts of the patch as well)
  Sponsored by: Nokia
  [jeff, 2008-02-06, 1 file, -8/+3]
* Generally we are interested in what thread did something as opposed to what process. Since threads by default have the name of the process unless over-written with more useful information, just print the thread name instead.
  [julian, 2007-11-14, 1 file, -4/+4]
* - Include opt_sched.h for SCHED_STATS.
  [jeff, 2007-06-12, 1 file, -0/+1]
* Commit 3/14 of sched_lock decomposition.
  - Add a per-turnstile spinlock to solve potential priority propagation deadlocks that are possible with thread_lock().
  - The turnstile lock order is defined as the exact opposite of the lock order used with the sleep locks they represent. This allows us to walk in reverse order in priority_propagate and this is the only place we wish to multiply acquire turnstile locks.
  - Use the turnstile_chain lock to protect assigning mutexes to turnstiles.
  - Change the turnstile interface to pass back turnstile pointers to the consumers. This allows us to reduce some locking and makes it easier to cancel turnstile assignment while the turnstile chain lock is held.
  Tested by: kris, current@
  Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
  Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
  [jeff, 2007-06-04, 1 file, -135/+149]
* - Convert turnstiles and sleepqueues to use UMA. This provides a modest speedup and will be more useful after each gains a spinlock in the impending thread_lock() commit.
  - Move initialization and asserts into init/fini routines. fini routines are only needed in the INVARIANTS case for now.
  Submitted by: Attilio Rao <attilio@FreeBSD.org>
  Tested by: kris, jeff
  [jeff, 2007-05-18, 1 file, -21/+54]
* - Remove setrunqueue and replace it with direct calls to sched_add(). setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons.
  - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each.
  - Remove the long ifdef'd out remrunqueue code.
  - Remove the now redundant ts_state. Inspect the thread state directly.
  - Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler.
  - Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched.
    Discussed with: julian
  - Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers.
    Suggested by: jhb
  Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.
  [jeff, 2007-01-23, 1 file, -1/+1]
* Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.
  [delphij, 2007-01-17, 1 file, -1/+1]
* Wrap propagate_priority() in a critical section to prevent unwanted preemptions when adjusting the priority of a thread that is on a run queue. This was only observed when FULL_PREEMPTION was enabled.
  Reported by: kris
  Diagnosed by: ups
  MFC after: 1 week
  [jhb, 2007-01-11, 1 file, -0/+4]
* Add a new 'show sleepchain' ddb command similar to 'show lockchain' except that it operates on lockmgr and sx locks. This can be useful for tracking down vnode deadlocks in VFS for example. Note that this command is a bit more fragile than 'show lockchain' as we have to poke around at the wait channel of a thread to see if it points to either a struct lock or a condition variable inside of a struct sx. If td_wchan points to something unmapped, then this command will terminate early due to a fault, but no harm will be done.
  [jhb, 2006-08-15, 1 file, -0/+67]
* Rename 'show lockchain' to 'show locktree' and 'show threadchain' to 'show lockchain'. The churn is because I'm about to add a new 'show sleepchain' similar to 'show lockchain' for sleep locks (lockmgr and sx) and 'show threadchain' was a bit ambiguous as both commands show a chain of thread dependencies. 'lockchain' is for non-sleepable locks (mtx and rw) and 'sleepchain' is for sleepable locks.
  [jhb, 2006-08-15, 1 file, -5/+9]
* Honor db_pager_quit in 'show threadchain', 'show allchains', and 'show lockchain'. This is especially helpful for the first 2 as a threadchain could get stuck in an infinite loop during a mutex deadlock.
  [jhb, 2006-07-12, 1 file, -1/+7]
* Add some new commands to hopefully make it easier to diagnose lock-related problems in ddb:
  - "show threadchain [thread]" will start with the specified thread (or the current kdb thread by default) and show its state. If it is blocked on a lock, it will find the owner of the lock and show its state, etc.
  - "show allchains" will find all of the threads that are blocked on a lock (but do not have any threads blocked on a lock they hold) and show the resulting thread chain.
  - "show lockchain <lock>" takes a pointer to a lock_object (such as a mutex or rwlock). If there is a turnstile for that lock, then it will display all the threads blocked on the lock. In addition, for each thread blocked on the lock, it will display any contested locks they hold, and recurse on those locks to show any threads blocked on those locks, etc.
  [jhb, 2006-04-25, 1 file, -0/+138]
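Editor's illustration: the chain walk that "show threadchain" describes (show a thread, find the owner of the lock it is blocked on, repeat) can be sketched in userland C roughly as follows. This is a simplified model, not the ddb code: the chain_thread and chain_lock types, chain_show(), and the CHAIN_MAX_HOPS guard are all hypothetical names introduced here. The hop limit is only an assumed stand-in for a way to stop when a deadlock makes the chain circular.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct chain_lock;

/* Hypothetical model of a thread: it is blocked on at most one lock. */
struct chain_thread {
	const char *name;
	struct chain_lock *blocked_on;	/* NULL if the thread is not blocked */
};

/* Hypothetical model of a lock: it has at most one owner. */
struct chain_lock {
	const char *name;
	struct chain_thread *owner;	/* NULL if the lock is unowned */
};

#define CHAIN_MAX_HOPS 64	/* assumed bound so a cycle cannot loop forever */

/*
 * Print the chain starting at 'start'. Returns the number of threads
 * shown, or -1 if the hop limit was reached (a cycle is suspected).
 */
static int
chain_show(const struct chain_thread *start)
{
	const struct chain_thread *td = start;
	int hops = 0;

	assert(start != NULL);
	while (td != NULL) {
		if (hops == CHAIN_MAX_HOPS)
			return (-1);
		hops++;
		if (td->blocked_on == NULL) {
			printf("thread %s: not blocked\n", td->name);
			break;
		}
		printf("thread %s: blocked on lock %s\n", td->name,
		    td->blocked_on->name);
		/* Follow the dependency to whoever holds that lock. */
		td = td->blocked_on->owner;
	}
	return (hops);
}
```

With thread a blocked on a lock owned by b, and b blocked on a lock owned by a runnable thread c, the walk shows three threads. If c is then made to block on a lock owned by a, the walk hits the hop limit instead of spinning.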
* Print td_name instead of p_comm if td_name is non-empty for 'show turnstile' and 'show sleepq'.
  [jhb, 2006-04-21, 1 file, -1/+2]
* - Bring back turnstile_empty() which can check to see if an individual queue on a turnstile is empty.
  - Add a turnstile_disown() function that allows a thread to give up ownership of a turnstile w/o waking up any waiters.
  [jhb, 2006-04-18, 1 file, -0/+67]
* Always explicitly panic in propagate_priority() if we try to propagate a lock's priority to a sleeping thread. When we panic, dump a stack trace of the thread that is asleep if DDB is compiled into the kernel just before calling panic(). This is much more informative and useful for debugging than the current behavior of getting a page fault and not having an easy way of determining which thread caused the original problem.
  MFC after: 1 week
  [jhb, 2006-03-29, 1 file, -8/+14]
* - Add support for having both a shared and exclusive queue of threads in each turnstile. Also, allow for the owner thread pointer of a turnstile to be NULL. This is needed for the upcoming reader/writer lock implementation.
  - Add a new ddb command 'show turnstile' that will look up the turnstile associated with the given lock argument and display useful information like the list of threads blocked on each queue, etc. If there isn't an active turnstile for a lock at the specified address, then the function will see if there is an active turnstile at the specified address and display info about it if so.
  - Adjust the mutex code to handle the turnstile API changes.
  Tested on: i386 (all), alpha, amd64, sparc64 (1 and 3)
  [jhb, 2006-01-27, 1 file, -47/+174]
* Initialize thread0.td_contested in init_turnstiles() rather than mutex_init() as it is used by the turnstile code and is not mutex-specific.
  [jhb, 2006-01-17, 1 file, -0/+1]
* Garbage collect turnstile_empty() since it is unused.
  [jhb, 2006-01-17, 1 file, -16/+0]
* Trim a couple of unneeded includes.
  [jhb, 2005-09-29, 1 file, -1/+0]
* Make a bunch of malloc types static.
  Found by: src/tools/tools/kernxref
  [phk, 2005-02-10, 1 file, -1/+1]
* Rework the interface between priority propagation (lending) and the schedulers a bit to ensure more correct handling of priorities and fewer priority inversions:
  - Add two functions to the sched(9) API to handle priority lending: sched_lend_prio() and sched_unlend_prio(). The turnstile code uses these functions to ask the scheduler to lend a thread a set priority and to tell the scheduler when it thinks it is ok for a thread to stop borrowing priority. The unlend case is slightly complex in that the turnstile code tells the scheduler what the minimum priority of the thread needs to be to satisfy the requirements of any other threads blocked on locks owned by the thread in question. The scheduler then decides whether the thread can go back to normal mode (if its normal priority is high enough to satisfy the pending lock requests) or if it should continue to use the priority specified to the sched_unlend_prio() call. This involves adding a new per-thread flag TDF_BORROWING that replaces the ULE-only kse flag for priority elevation.
  - Schedulers now refuse to lower the priority of a thread that is currently borrowing another thread's priority.
  - If a scheduler changes the priority of a thread that is currently sitting on a turnstile, it will call a new function turnstile_adjust() to inform the turnstile code of the change. This function resorts the thread on the priority list of the turnstile if needed, and if the thread ends up at the head of the list (due to having the highest priority) and its priority was raised, then it will propagate that new priority to the owner of the lock it is blocked on.
  Some additional fixes specific to the 4BSD scheduler include:
  - Common code for updating the priority of a thread when the user priority of its associated kse group has been consolidated in a new static function resetpriority_thread(). One change to this function is that it will now only adjust the priority of a thread if it already has a time sharing priority, thus preserving any boosts from a tsleep() until the thread returns to userland. Also, resetpriority() no longer calls maybe_resched() on each thread in the group. Instead, the code calling resetpriority() is responsible for calling resetpriority_thread() on any threads that need to be updated.
  - schedcpu() now uses resetpriority_thread() instead of just calling sched_prio() directly after it updates a kse group's user priority.
  - sched_clock() now uses resetpriority_thread() rather than writing directly to td_priority.
  - sched_nice() now updates all the priorities of the threads after the group priority has been adjusted.
  Discussed with: bde
  Reviewed by: ups, jeffr
  Tested on: 4bsd, ule
  Tested on: i386, alpha, sparc64
  [jhb, 2004-12-30, 1 file, -71/+120]
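Editor's illustration: the lend/unlend rules described in this commit message can be modelled in a few lines of userland C. This is a sketch under stated assumptions, not the sched(9) implementation: struct model_thread and the model_* functions are hypothetical names, the borrowing field merely stands in for TDF_BORROWING, and the model assumes the FreeBSD convention that a numerically lower value means a higher priority.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-thread priority state. */
struct model_thread {
	int base_pri;	/* priority the thread would have on its own */
	int cur_pri;	/* priority the thread currently runs at */
	bool borrowing;	/* stands in for the TDF_BORROWING flag */
};

/* Lend 'pri' to the thread. In this model lending only ever raises. */
static void
model_lend_prio(struct model_thread *td, int pri)
{
	td->borrowing = true;
	if (pri < td->cur_pri)
		td->cur_pri = pri;
}

/*
 * Stop lending. 'min_pri' is the priority still needed to satisfy the
 * other threads blocked on locks this thread owns. If the thread's own
 * priority is already good enough it returns to normal mode, otherwise
 * it keeps borrowing at 'min_pri'.
 */
static void
model_unlend_prio(struct model_thread *td, int min_pri)
{
	if (td->base_pri <= min_pri) {
		td->borrowing = false;
		td->cur_pri = td->base_pri;
	} else {
		td->cur_pri = min_pri;
	}
	/* Either way the remaining waiters are still satisfied. */
	assert(td->cur_pri <= min_pri);
}

/*
 * Scheduler-initiated priority change. The new base priority is always
 * recorded, but a borrowing thread's current priority is never lowered.
 */
static void
model_set_prio(struct model_thread *td, int pri)
{
	td->base_pri = pri;
	if (td->borrowing && pri > td->cur_pri)
		return;
	td->cur_pri = pri;
}
```

For example, a thread at priority 120 that is lent 80 keeps running at 80 even if the scheduler later sets its base priority to 130. Unlending with a minimum of 100 leaves it borrowing at 100, and unlending with a minimum of 255 returns it to its base priority of 130.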
* Refine the turnstile and sleep queue interfaces just a bit:
  - Add a new _lock() call to each API that locks the associated chain lock for a lock_object pointer or wait channel. The _lookup() functions now require that the chain lock be locked via _lock() when they are called.
  - Change sleepq_add(), turnstile_wait() and turnstile_claim() to lookup the associated queue structure internally via _lookup() rather than accepting a pointer from the caller. For turnstiles, this means that the actual lookup of the turnstile in the hash table is only done when the thread actually blocks rather than being done on each loop iteration in _mtx_lock_sleep(). For sleep queues, this means that sleepq_lookup() is no longer used outside of the sleep queue code except to implement an assertion in cv_destroy().
  - Change sleepq_broadcast() and sleepq_signal() to require that the chain lock is already held. For condition variables, this lets the cv_broadcast() and cv_signal() functions lock the sleep queue chain lock while testing the waiters count. This means that the waiters count internal to condition variables is no longer protected by the interlock mutex and cv_broadcast() and cv_signal() now no longer require that the interlock be held when they are called. This lets consumers of condition variables drop the lock before waking other threads which can result in fewer context switches.
  MFC after: 1 month
  [jhb, 2004-10-12, 1 file, -12/+34]
* Add a critical section in turnstile_unpend() from before dropping the turnstile chain lock until after making all the awakened threads runnable. First, this fixes a priority inversion race. Second, this attempts to finish waking up all of the threads waiting on a turnstile before doing a preemption.
  Reviewed by: Stephan Uphoff (who found the priority inversion race)
  [jhb, 2004-10-05, 1 file, -0/+2]
* Give setrunqueue() and sched_add() more of a clue as to where they are coming from and what is expected from them.
  MFC after: 2 days
  [julian, 2004-09-01, 1 file, -1/+1]
* Revert modification of subr_turnstile.c accidentally included in the last commit; this assertion was provided by jhb for local debugging and not intended for broader consumption.
  [rwatson, 2004-07-25, 1 file, -1/+0]
* In uipc_connect(), assert that the passed thread is curthread, and pass td into unp_connect() instead of reading curthread.
  [rwatson, 2004-07-25, 1 file, -0/+1]
* - Change mi_switch() and sched_switch() to accept an optional thread to switch to. If a non-NULL thread pointer is passed in, then the CPU will switch to that thread directly rather than calling choosethread() to pick a thread to switch to.
  - Make sched_switch() aware of idle threads and know to do TD_SET_CAN_RUN() instead of sticking them on the run queue rather than requiring all callers of mi_switch() to know to do this if they can be called from an idlethread.
  - Move constants for arguments to mi_switch() and thread_single() out of the middle of the function prototypes and up above into their own section.
  [jhb, 2004-07-02, 1 file, -1/+1]
* Oops, this didn't make it into my submit before I committed: Defer creation of the sysctl tree for the turnstile profiling stats until a SI_SUB_LOCK sysinit. Doing it in init_turnstiles() is too early as it is called before mi_startup().
  [jhb, 2004-06-29, 1 file, -7/+19]
* Add two new kernel options to allow rudimentary profiling of the internal hash tables used in the sleep queue and turnstile code. Each option adds a sysctl tree under debug containing the maximum depth of any bucket in the hash table as well as a separate node for each bucket (or chain) containing the current depth and maximum depth for that bucket.
  [jhb, 2004-06-29, 1 file, -3/+48]
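Editor's illustration: the bookkeeping this commit message describes (a current and maximum depth per chain, plus the maximum depth seen on any chain) might look roughly like the userland sketch below. The names prof_chains, prof_max_depth, prof_insert(), and prof_remove(), as well as the table size, are assumptions made for this example; the kernel exports its counters through sysctl nodes, which the sketch does not model.

```c
#include <assert.h>

#define PROF_CHAINS 8	/* hypothetical number of hash chains */

/* Per-chain counters, one entry per hash bucket. */
struct chain_prof {
	unsigned depth;		/* entries currently on this chain */
	unsigned max_depth;	/* deepest this chain has ever been */
};

static struct chain_prof prof_chains[PROF_CHAINS];
static unsigned prof_max_depth;	/* deepest any chain has ever been */

/* Account for an entry being added to chain 'chain'. */
static void
prof_insert(unsigned chain)
{
	struct chain_prof *cp;

	assert(chain < PROF_CHAINS);
	cp = &prof_chains[chain];
	cp->depth++;
	if (cp->depth > cp->max_depth) {
		cp->max_depth = cp->depth;
		if (cp->depth > prof_max_depth)
			prof_max_depth = cp->depth;
	}
}

/* Account for an entry being removed from chain 'chain'. */
static void
prof_remove(unsigned chain)
{
	assert(chain < PROF_CHAINS);
	assert(prof_chains[chain].depth > 0);
	prof_chains[chain].depth--;
}
```

Removing an entry lowers only the current depth, so the maxima keep recording the worst case seen since startup, which is the figure that shows whether the hash function spreads entries well.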
* Rename turnstile_wakeup() to turnstile_broadcast() to make the naming more consistent with other APIs. sleepq and cv's use signal/broadcast, and msleep uses wakeup_one/wakeup. Prior to this turnstiles were using a signal/wakeup mixture.
  [jhb, 2004-04-06, 1 file, -2/+2]
* Fixup a comment.
  [jhb, 2004-03-12, 1 file, -1/+1]
* Add an implementation of a generic sleep queue abstraction that is used to queue threads sleeping on a wait channel similar to how turnstiles are used to queue threads waiting for a lock. This subsystem will be used as the backend for sleep/wakeup and condition variables initially. Eventually it will also be used to replace the ithread-specific iwait thread inhibitor.
  Sleep queues are also not locked by sched_lock, so this splits sched_lock up a bit further increasing concurrency within the scheduler. Sleep queues also natively support timeouts on sleeps and interruptible sleeps allowing for the reduction of a lot of duplicated code between the sleep/wakeup and condition variable implementations. For more details on the sleep queue implementation, check the comments in sys/sleepqueue.h and kern/subr_sleepqueue.c.
  [jhb, 2004-02-27, 1 file, -5/+0]
* Clarify and tweak some comments.
  [jhb, 2004-02-27, 1 file, -3/+3]
* - Add a flags parameter to mi_switch. The value of flags may be SW_VOL or SW_INVOL. Assert that one of these is set in mi_switch() and properly adjust the rusage statistics. This is to simplify the large number of users of this interface which were previously all required to adjust the proper counter prior to calling mi_switch(). This also facilitates more switch and locking optimizations.
  - Change all callers of mi_switch() to pass the appropriate parameter and remove direct references to the process statistics.
  [jeff, 2004-01-25, 1 file, -2/+1]
* Adjust an assertion for the TDF_TSNOBLOCK race handling in turnstile_unpend(). A racing thread that does not have TDI_LOCK set may either be running on another CPU or it may be sitting on a run queue if it was preempted during the very small window in turnstile_wait() between unlocking the turnstile chain lock and locking sched_lock.
  [jhb, 2003-12-09, 1 file, -2/+3]
* Assert that we never give a thread a NULL turnstile when waking it up.
  [jhb, 2003-12-09, 1 file, -0/+2]