path: root/sys/kern/kern_switch.c
Commit log, newest first. Each entry: summary (author, date, files changed, lines -deleted/+added).
* Add a comment on why inlining critical_enter() may not be a good idea
  for the general case. (attilio, 2012-12-09, 1 file, -0/+6)
  Reviewed by: bde
  MFC after: 1 week
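  For context, a schematic sketch of critical_enter() (simplified from
  the real kernel source) showing why inlining buys little: the body is
  a single increment, and the out-of-line call also acts as a compiler
  barrier that naively inlined code would lose.

    void
    critical_enter(void)
    {
            struct thread *td = curthread;

            /* Preemption is deferred while td_critnest > 0. */
            td->td_critnest++;
    }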
* critical_exit: ignore td_owepreempt if kdb_active is set. (avg,
  2011-12-04, 1 file, -1/+1)
  Calling mi_switch() in such a context results in a recursion via
  kdb_switch().
  Suggested by: jhb
  Reviewed by: jhb
  MFC after: 5 weeks
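  A sketch of the resulting guard (simplified; the real critical_exit()
  logic around the switch is more involved):

    void
    critical_exit(void)
    {
            struct thread *td = curthread;

            td->td_critnest--;
            /*
             * Skip the deferred preemption while the kernel debugger is
             * active: mi_switch() here would recurse via kdb_switch().
             */
            if (td->td_critnest == 0 && td->td_owepreempt && !kdb_active)
                    mi_switch(SW_INVOL | SWT_OWEPREEMPT, NULL);
    }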
* Update several places that iterate over CPUs to use CPU_FOREACH().
  (jhb, 2010-06-11, 1 file, -3/+1)
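  The pattern being replaced, as a before/after sketch (the counter
  'total' and array 'stat' are illustrative, not from the commit):

    int i, total = 0;

    /* Before: open-coded loop that must skip absent CPU ids. */
    for (i = 0; i <= mp_maxid; i++) {
            if (CPU_ABSENT(i))
                    continue;
            total += stat[i];
    }

    /* After: CPU_FOREACH() hides the mp_maxid/CPU_ABSENT() details. */
    CPU_FOREACH(i)
            total += stat[i];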
* Use DPCPU for SCHED_STATS. (jeff, 2009-06-25, 1 file, -18/+37)
  This is somewhat awkward because the offset of the stat is not known
  until link time, so we must emit a function to call SYSCTL_ADD_PROC
  rather than using SYSCTL_PROC directly.
  - Eliminate the atomic from SCHED_STAT_INC now that it's using
    per-CPU variables. Sched stats are always incremented while we're
    holding a spinlock, so no further protection is required.
  Reviewed by: sam
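  A hedged sketch of the per-CPU counter shape this describes (the stat
  name is illustrative; the real SCHED_STAT macros also generate the
  SYSCTL_ADD_PROC registration function mentioned above):

    DPCPU_DEFINE(unsigned long, sched_stat_switch);

    /*
     * Plain increment, no atomic op: callers hold a spinlock, so the
     * thread cannot migrate or be preempted mid-update.
     */
    #define SCHED_STAT_INC(var)     DPCPU_GET(var)++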
* Fix typo in runz_fuzz. (julian, 2008-05-12, 1 file, -1/+1)
  Noticed by: Elijah Buck
* Make SCHED_STATS more generic by adding a wrapper to create the
  variables and sysctl nodes. (jeff, 2008-04-17, 1 file, -25/+41)
  - In reset, walk the children of kern_sched_stats and reset the
    counters via the oid_arg1 pointer. This allows us to add arbitrary
    counters to the tree and still reset them properly.
  - Define a set of switch types to be passed with flags to mi_switch().
    These types are named SWT_*. They correspond to SCHED_STATS counters
    and are automatically handled in this way.
  - Make the new SWT_ types more specific than the older switch stats.
    There are now stats for idle switches, remote idle wakeups, remote
    preemptions, ithreads idling, etc.
  - Add switch statistics for ULE's pickcpu algorithm. These stats
    include how much migration there is, how often affinity was
    successful, how often threads were migrated to the local cpu on
    wakeup, etc.
  Sponsored by: Nokia
* Restore runq to manipulating threads directly by putting runq links
  and rqindex back in struct thread. (jeff, 2008-03-20, 1 file, -64/+49)
  - Compile kern_switch.c independently again and stop #include'ing it
    from the schedulers.
  - Remove the ts_thread backpointers and convert most code to go from
    struct thread to struct td_sched.
  - Clean up the ts_flags #define garbage that was causing us to
    sometimes do things that expanded to
    td->td_sched->ts_thread->td_flags in 4BSD.
  - Export the kern.sched sysctl node in sysctl.h
* Remove the unused and redundant sched_newproc() function. (jeff,
  2008-03-20, 1 file, -34/+0)
  - Remove the unused and redundant sched_newthread(), which peeks into
    scheduler-private structures.
* Move maybe_preempt() from kern_switch.c to sched_4bsd.c. This
  function is only used by 4BSD. (jeff, 2008-03-20, 1 file, -113/+26)
  - Create a new runq_choose_fuzz() function rather than polluting
    runq_choose() with 4BSD-specific code.
  - Move the fuzz sysctl into sched_4bsd.c.
  - Remove some dead code from kern_switch.c.
* In keeping with style(9)'s recommendations on macros, use a ';'
  after each SYSINIT() macro invocation. (rwatson, 2008-03-16, 1 file,
  -1/+1)
  This makes a number of lightweight C parsers much happier with the
  FreeBSD kernel source, including cflow's prcc and lxr.
  MFC after: 1 month
  Discussed with: imp, rink
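  A representative invocation after the change (the function and
  subsystem/order choices here are illustrative):

    static void
    example_init(const void *unused)
    {
    }
    /* Note the trailing ';' that style(9) asks for. */
    SYSINIT(example, SI_SUB_LAST, SI_ORDER_MIDDLE, example_init, NULL);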
* Remove kernel support for M:N threading. (jeff, 2008-03-12, 1 file,
  -1/+1)
  While the KSE project was quite successful in bringing threading to
  FreeBSD, the M:N approach taken by the kse library was never developed
  to its full potential. Backwards compatibility will be provided via
  libmap.conf for dynamically linked binaries, and static binaries will
  be broken.
* Generally we are interested in what thread did something, as opposed
  to what process. (julian, 2007-11-14, 1 file, -3/+3)
  Since threads by default have the name of the process unless
  overwritten with more useful information, just print the thread name
  instead.
* Fix ULE in kernels without PREEMPTION compiled in by always enabling
  the critical_exit() owepreempt check. (jeff, 2007-10-08, 1 file,
  -12/+1)
  ULE will always use owepreempt to preempt the idle thread. This change
  does not affect 4BSD since it will never set owepreempt without
  PREEMPTION enabled.
  - Remove some unused code from choosethread().
  Discussed with: jhb
  Approved by: re
* Fix some entries in the locks static table of witness. (attilio,
  2007-09-20, 1 file, -1/+0)
  In particular:
  - smp_tlb_mtx is no longer used, so it is axed.
  - The smp rendezvous lock isn't really a leaf spin-mutex. Its bad
    placement in the table, however, has been the source of a false
    positive LOR report involving the dt_lock. In any case, under the
    older locking the smp rendezvous lock would have had sched_lock
    after it, so it was not a leaf lock even then.
  - allpmaps is only used on the ia32 architecture, so it is inserted in
    the appropriate stub.
  Additionally:
  - kse_zombie_lock is no longer present, so its definition is axed out.
  - zombie_lock doesn't need to have an exported symbol, so just let it
    be declared as static.
  Tested by: kris
  Approved by: jeff (mentor)
  Approved by: re
* Move all of the PS_ flags into either p_flag or td_flags. (jeff,
  2007-09-17, 1 file, -2/+2)
  - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK
    or, previously, the sched_lock. These bugs have existed for some
    time.
  - Allow swapout to try each thread in a process individually and then
    swapin the whole process if any of these fail. This allows us to
    move most scheduler-related swap flags into td_flags.
  - Keep ki_sflag for backwards compat but change all in-source tools to
    use the new and more correct location of P_INMEM.
  Reported by: pho
  Reviewed by: attilio, kib
  Approved by: re (kensmith)
* Improve runq_findbit_from(), which is used by ULE's circular queue.
  (jeff, 2007-08-20, 1 file, -32/+22)
  Mask off the bits we want to ignore on the first pass rather than
  doing a linear scan. This puts us within a few instructions of the
  cost of runq_findbit() and removes this function from the top of
  profiling output for context-switch-heavy workloads.
  Approved by: re
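  A standalone sketch of the masked scan (userland C for illustration;
  the kernel version works on the run queue's status words and handles
  the circular wrap-around in its caller):

    #include <stdint.h>
    #include <strings.h>    /* ffsll() */

    static int
    findbit_from(const uint64_t *words, int nwords, int pri)
    {
            /* Mask off bits below 'pri' in the first word only. */
            uint64_t mask = ~(uint64_t)0 << (pri % 64);
            int i;

            for (i = pri / 64; i < nwords; i++, mask = ~(uint64_t)0)
                    if ((words[i] & mask) != 0)
                            return (i * 64 + ffsll(words[i] & mask) - 1);
            return (-1);    /* caller restarts from 0 to wrap around */
    }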
* Set SW_PREEMPT when we preempt in critical_exit(). (jeff, 2007-08-03,
  1 file, -1/+1)
  Approved by: re
* Remove explicit references to sched_lock. A simpler assert will do.
  (jeff, 2007-07-19, 1 file, -2/+1)
  Approved by: re
* Garbage collect unused concurrency functions. (jeff, 2007-06-12,
  1 file, -23/+0)
* Commit 1/14 of sched_lock decomposition. (jeff, 2007-06-04, 1 file,
  -21/+71)
  - Move all scheduler locking into the schedulers, utilizing a
    technique similar to Solaris's container locking.
  - A per-process spinlock is now used to protect the queue of threads,
    thread count, suspension count, p_sflags, and other process-related
    scheduling fields.
  - The new thread lock is actually a pointer to a spinlock for the
    container that the thread is currently owned by. The container may
    be a turnstile, sleepqueue, or run queue.
  - thread_lock() is now used to protect access to thread-related
    scheduling fields. thread_unlock() unlocks the lock, and
    thread_set_lock() implements the transition from one lock to
    another.
  - A new "blocked_lock" is used in cases where it is not safe to hold
    the actual thread's lock yet we must prevent access to the thread.
  - sched_throw() and sched_fork_exit() are introduced to allow the
    schedulers to fix up locking at these points.
  - Add some minor infrastructure for optionally exporting scheduler
    statistics that were invaluable in solving performance problems with
    this patch. Generally these statistics allow you to differentiate
    between different causes of context switches.
  Tested by: kris, current@
  Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
  Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts
  each)
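  The idiom this introduces, sketched with illustrative variables (td
  and pri assumed in scope; the td_lock redirection between containers
  is what replaces the global sched_lock):

    thread_lock(td);                /* lock whatever container owns td */
    td->td_priority = pri;          /* example: a scheduling field */
    thread_unlock(td);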
* Change types for recent runq additions to u_char rather than int.
  (jeff, 2007-02-08, 1 file, -5/+5)
  - Fix these types in ULE as well. This fixes bugs in priority index
    calculations in certain edge cases: (int)-1 % 64 != (uint)-1 % 64.
  Reported by: kkenn using pho's stress2
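  A tiny standalone program demonstrating the edge case cited above:

    #include <stdio.h>

    int
    main(void)
    {
            int s = -1;
            unsigned char u = (unsigned char)-1;    /* 255 */

            /* C truncates toward zero: -1 % 64 is -1, not 63. */
            printf("%d %d\n", s % 64, u % 64);      /* prints "-1 63" */
            return (0);
    }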
* Remove setrunqueue and replace it with direct calls to sched_add().
  (jeff, 2007-01-23, 1 file, -82/+3)
  setrunqueue() was mostly empty. The few asserts and thread state
  setting were moved to the individual schedulers. sched_add() was
  chosen to displace it for naming consistency reasons.
  - Remove adjustrunqueue; it was 4 lines of code that was ifdef'd to be
    different on all three schedulers where it was only called in one
    place each.
  - Remove the long ifdef'd out remrunqueue code.
  - Remove the now redundant ts_state. Inspect the thread state
    directly.
  - Don't set TSF_* flags from kern_switch.c; we were only doing this to
    support a feature in one scheduler.
  - Change sched_choose() to return a thread rather than a td_sched.
    Also, rely on the schedulers to return the idlethread. This
    simplifies the logic in choosethread(). Aside from the run queue
    links, kern_switch.c mostly does not care about the contents of
    td_sched.
    Discussed with: julian
  - Move the idle thread loop into the per-scheduler area. ULE wants to
    do something different from the other schedulers.
    Suggested by: jhb
  Tested on: x86/amd64 sched_{4BSD, ULE, CORE}
* Don't pass a pointer into runq_choose_from(). The caller can adjust
  the index if it chooses to. (jeff, 2007-01-04, 1 file, -3/+2)
* Add three new functions to support circular run queues. (jeff,
  2007-01-04, 1 file, -4/+93)
  - runq_add_pri() allows the caller to position the thread at any
    rqindex regardless of priority.
  - runq_choose_from() chooses the lowest-priority thread starting from
    a given index. The index is updated with the rqindex of the chosen
    thread. This routine is used to pick the lowest priority relative to
    a given index.
  - runq_remove_idx() updates the index if the run queue that held the
    removed thread is now empty.
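  A conceptual sketch of how these compose into a circular scan
  (signatures and return type simplified relative to the runq API of
  this era):

    static struct thread *
    circular_choose(struct runq *rq, u_char idx)
    {
            struct thread *td;

            td = runq_choose_from(rq, idx); /* first thread at/after idx */
            if (td == NULL)
                    td = runq_choose_from(rq, 0);   /* wrap around to 0 */
            return (td);
    }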
* Prefer a more traditional spelling of "inhibited" in comments and
  panic messages. (rwatson, 2006-12-31, 1 file, -2/+2)
* Threading cleanup, part 2 of several. (julian, 2006-12-06, 1 file,
  -573/+62)
  Make part of John Birrell's KSE patch permanent. Specifically, remove:
  - Any reference to the ksegrp structure. This feature was never fully
    utilised and made things overly complicated.
  - All code in the scheduler that tried to make threaded programs fair
    to unthreaded programs. Libpthread processes will already do this to
    some extent, and libthr processes already disable it.
  Also: since this makes such a big change to the scheduler(s), take the
  opportunity to rename some structures and elements that had to be
  moved anyhow. This makes the code a lot more readable.
  The ULE scheduler compiles again but I have no idea if it works. The
  4BSD scheduler still requires a little cleaning, and some functions
  that now do ALMOST nothing will go away, but I thought I'd do that as
  a separate commit.
  Tested by: David Xu and Dan Eischen, using libthr and libpthread
* Make KSE a kernel option, turned on by default in all GENERIC kernel
  configs except sun4v (which doesn't process signals properly with
  KSE). (jb, 2006-10-26, 1 file, -0/+68)
  Reviewed by: davidxu@
* Add scheduler CORE, work I did half a year ago and recently picked up
  again. (davidxu, 2006-06-13, 1 file, -1/+6)
  The scheduler is forked from ULE, but the algorithm to detect an
  interactive process is almost completely different from ULE's; it
  comes from the Linux paper "Understanding the Linux 2.6.8.1 CPU
  Scheduler", although I still use the same word "score" as a priority
  boost, as in the ULE scheduler. Briefly, the scheduler has the
  following characteristics:
  1. A timesharing process's nice value is seriously respected; the
     timeslice and the interaction-detection algorithm are based on the
     nice value.
  2. Per-CPU scheduling queues and load balancing.
  3. O(1) scheduling.
  4. Some CPU-affinity code in the wakeup path.
  5. Support for POSIX SCHED_FIFO and SCHED_RR.
  Unlike the 4BSD and ULE schedulers, which use a fuzzy RQ_PPQ, this
  scheduler uses 256 priority queues. Unlike ULE, which uses both pull
  and push, this scheduler uses only the pull method; the main reason is
  to let a relatively idle cpu do the work, but currently the whole
  scheduler is protected by the big sched_lock, so the benefit is not
  visible. It can really be worse than nothing, because all other cpus
  are locked out while we are doing balancing work, a problem the 4BSD
  scheduler does not have. The scheduler does not support hyperthreading
  very well; in fact, it does not distinguish between physical and
  logical CPUs, which should be improved in the future. The scheduler
  has a priority inversion problem on MP machines, so it is not good for
  realtime scheduling; it can cause realtime process starvation. As a
  result, it seems MySQL super-smack runs better on my Pentium-D machine
  when using libthr, whether on a UP or SMP kernel.
* sched_rem() already sets ke->ke_state to KES_THREAD, so there's no
  need to redo it. (cognet, 2006-06-01, 1 file, -2/+0)
* Trim trailing whitespace. (kan, 2005-12-28, 1 file, -46/+45)
* Restore KTR_CRITICAL but conditionally compile it in as KTR_SCHED.
  (njl, 2005-12-18, 1 file, -2/+9)
  Requested by: scottl, jhb
* Clean up unused or poorly utilized KTR values. (njl, 2005-12-17,
  1 file, -2/+2)
  Remove KTR_FS, KTR_KGDB, and KTR_IO as they were never used. Remove
  KTR_CLK since it was only used for hardclock firing and use KTR_INTR
  there instead. Remove KTR_CRITICAL since it was only used for crit
  enter/exit and use KTR_CONTENTION instead.
* In adjustrunqueue(), add code to handle the thread-migrating case for
  the ULE scheduler. (davidxu, 2005-08-03, 1 file, -1/+6)
  In the original code, the local run queue of a threaded ksegrp is
  corrupted if adjustrunqueue() is called while a thread is migrating.
* Restore preemption of idle threads. (ups, 2005-06-10, 1 file, -3/+1)
  Submitted by: jhb
* Lots of whitespace cleanup. (ups, 2005-06-09, 1 file, -5/+6)
  Fix for broken if condition.
  Submitted by: nate@
* Fix some race conditions for pinned threads that may cause them to
  run on the wrong CPU. (ups, 2005-06-09, 1 file, -18/+30)
  Add IPI support for preempting a thread on another CPU.
  MFC after: 3 weeks
* Use low-level constructs borrowed from interrupt threads to wait for
  work in proc0. (ups, 2005-05-23, 1 file, -11/+4)
  Remove the TDP_WAKEPROC0 workaround.
* Fix a bug that caused preemption to happen for a thread in the same
  ksegrp with the same priority as the currently running thread. (ups,
  2005-05-19, 1 file, -2/+2)
  This can cause propagate_priority() to panic.
  Pointy hat to: ups
* Sprinkle some volatile magic and rearrange things a bit to avoid race
  conditions in critical_exit now that it no longer blocks interrupts.
  (ups, 2005-04-08, 1 file, -5/+11)
  Reviewed by: jhb
* Divorce critical sections from spinlocks. (jhb, 2005-04-04, 1 file,
  -4/+0)
  Critical sections as denoted by critical_enter() and critical_exit()
  are now solely a mechanism for deferring kernel preemptions. They no
  longer have any effect on interrupts. This means that standalone
  critical sections are now very cheap, as they are simply unlocked
  integer increments and decrements for the common case.
  Spin mutexes now use a separate KPI implemented in MD code:
  spinlock_enter() and spinlock_exit(). This KPI is responsible for
  providing whatever MD guarantees are needed to ensure that a thread
  holding a spin lock won't be preempted by any other code that will try
  to lock the same lock. For now all archs continue to block interrupts
  in a "spinlock section" as they did formerly in all critical sections.
  Note that I've also taken this opportunity to push a few things into
  MD code rather than MI. For example, critical_fork_exit() no longer
  exists. Instead, MD code ensures that new threads have the correct
  state when they are created. Also, we no longer try to fix up the
  idlethreads for APs in MI code. Instead, each arch sets the initial
  curthread and adjusts the state of the idle thread it borrows in order
  to perform the initial context switch.
  This change is largely a big NOP, but the cleaner separation it
  provides will allow for more efficient alternative locking schemes in
  other parts of the kernel (bare critical sections rather than per-CPU
  spin mutexes for per-CPU data, for example).
  Reviewed by: grehan, cognet, arch@, others
  Tested on: i386, alpha, sparc64, powerpc, arm, possibly more
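  A schematic sketch of the MD side of the split (loosely modeled on
  the x86 shape; field names and details vary per architecture):

    void
    spinlock_enter(void)
    {
            struct thread *td = curthread;

            if (td->td_md.md_spinlock_count == 0) {
                    /* Outermost spin lock: block interrupts (MD). */
                    td->td_md.md_saved_flags = intr_disable();
                    critical_enter();       /* and defer preemption */
            }
            td->td_md.md_spinlock_count++;
    }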
* Add a read-only kern.sched.preemption sysctl so that user space can
  tell if "options PREEMPTION" is compiled into the kernel. (rwatson,
  2005-03-20, 1 file, -0/+13)
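  A sketch of the addition (close to what the commit message describes,
  though simplified):

    #ifdef PREEMPTION
    static int kern_sched_preemption = 1;
    #else
    static int kern_sched_preemption = 0;
    #endif
    SYSCTL_INT(_kern_sched, OID_AUTO, preemption, CTLFLAG_RD,
        &kern_sched_preemption, 0, "Kernel preemption enabled");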
* A further step on the journey of making panics and debugging more
  reliable. (rwatson, 2005-03-17, 1 file, -2/+3)
  In the window between the beginning of panic() and entering the
  debugger, it's possible to receive interrupts. If we receive an
  interrupt, don't preempt if panicstr != NULL, as the system is in the
  process of failing, and the preempting thread is likely to stumble
  over the failure. The typical scenario is during the printf() in
  panic() prior to entering the debugger, but when running with a slower
  console type such as serial console.
  It could be that the panic string should be passed to the debugger to
  print, so that it can run from the debugger's environment rather than
  a regular kernel printf.
  Glanced at by: jhb
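  The guard reads roughly like this (its exact placement inside
  maybe_preempt() is an assumption on my part):

    /* The system is failing; do not preempt into the wreckage. */
    if (panicstr != NULL)
            return (0);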
* /* -> /*- for copyright notices; minor format tweaks as necessary.
  (imp, 2005-01-06, 1 file, -1/+1)
* Define KTR points for KTR_SCHED. (jeff, 2004-12-26, 1 file, -0/+3)
* Garbage collect several unused members of struct kse and struct
  ksegrp. (jeff, 2004-12-14, 1 file, -6/+0)
  As best as I can tell, some of these were never used.
* Remove local definitions of RANGEOF() and use __rangeof() instead.
  (das, 2004-11-20, 1 file, -2/+0)
  Also remove a few bogus casts.
* Add basic critical section tracing to KTR using event type
  KTR_CRITICAL. (rwatson, 2004-11-07, 1 file, -0/+4)
  This generates a KTR event for each critical section entered and
  exited. It would be desirable to also log the filename and line number
  of the source entering or exiting the critical section, but this
  requires hacking up the critical section API, so I've not done that
  yet.
* If a process needs to be swapped in, wake up the swapper from within
  critical_exit as the process is getting scheduled to run. (scottl,
  2004-10-16, 1 file, -0/+4)
  This is suboptimal, but for now it avoids the LOR between the
  scheduler and the sleepq systems. This is a 5.3 candidate.
  Submitted by: davidxu
  MFC after: 3 days
* Fix maybe_preempt_in_ksegrp for !SMP. (ups, 2004-10-13, 1 file,
  -7/+33)
  Tested by: tegge
  Reviewed by: julian
  Approved by: sam (mentor)
  MFC after: 3 days
* Make !SMP kernels compile and, as far as I can tell, work again.
  (phk, 2004-10-12, 1 file, -1/+2)