path: root/sys/kern/sched_4bsd.c
Commit history (each entry ends with: author, date, files changed, lines -removed/+added)
* - Fix a bug in sched_4bsd where the timestamp for the sleeping
    operation is reset to the current time on wakeup instead of being
    cleared. This is mostly harmless, because td_slptick (and
    ki_slptime from userland) should only be analyzed under the
    assumption that the thread is actually sleeping (i.e. while
    td_slptick is correctly set), but without this invariant the number
    is no longer consistent.
  - Move td_slptick from u_int to int in order to follow the signedness
    of 'ticks' and wrap around accordingly (sketched below). [0]
  [0] Submitted by: emaste
  Sponsored by: Sandvine Incorporated
  MFC after: 1 week
  (attilio, 2010-01-08, 1 file, -1/+1)
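  A minimal user-space sketch of the wrap-safe tick arithmetic the
  second item relies on (names and values are illustrative; the
  kernel's 'ticks' is a global counter that wraps in normal operation):

    #include <limits.h>
    #include <stdio.h>

    /* Wrap-safe elapsed time: compute the delta in unsigned arithmetic
     * to avoid signed-overflow undefined behavior, then convert back. */
    static int
    elapsed(int now, int then)
    {
        return ((int)((unsigned)now - (unsigned)then));
    }

    int
    main(void)
    {
        int slptick = INT_MAX - 5; /* slept just before the wrap */
        int ticks = INT_MIN + 10;  /* 'ticks' has since wrapped */

        printf("slept for %d ticks\n", elapsed(ticks, slptick)); /* 16 */
        return (0);
    }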
* Allow swapout of the kernel stack for threads with priority
  numerically greater than or equal to PSOCK, not less than or equal.
  Higher priority has a lesser numerical value, so the existing test
  refused to swap out threads waiting for an advisory lock, for an
  exiting child, or sleeping on a timeout, while high-priority waiters
  on VFS/VM events could be swapped out.
  Tested by: pho
  Reviewed by: jhb
  MFC after: 1 week
  (kib, 2009-12-31, 1 file, -1/+1)
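  The inverted comparison is easy to see in a two-line sketch (PSOCK's
  numeric value here is a placeholder; the real one lives in the
  kernel's priority definitions):

    #define PSOCK 108   /* placeholder value for illustration only */

    /* Smaller number = higher priority, so "priority at or below
     * PSOCK" is a numeric >= test. */
    static int
    swappable(int pri)
    {
        return (pri >= PSOCK);  /* fixed; the old test was pri <= PSOCK */
    }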
* Split P_NOLOAD into a per-thread flag (TDF_NOLOAD).
  This improvement aims to avoid further cache misses in
  scheduler-specific functions that need to keep track of average
  thread running time, and further locking in the places that set the
  flag.
  Reported by: jeff (originally), kris (currently)
  Reviewed by: jhb
  Tested by: Giuseppe Cocomazzi <sbudella at email dot it>
  (attilio, 2009-11-03, 1 file, -8/+8)
* Use __XSTRING where I want the define to be expanded. This resulted
  in sizeof("MAXCPU") being used to calculate a string length rather
  than something more reasonable such as sizeof("32"). This shouldn't
  have caused any ill effect until we run on machines with 1000000 or
  more CPUs.
  (jeff, 2009-01-25, 1 file, -1/+1)
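  A minimal sketch of the two-level stringification trick that
  __XSTRING() performs (macro names modeled on FreeBSD's <sys/cdefs.h>):

    #include <stdio.h>

    #define MAXCPU 32

    #define STRING(x)  #x          /* stringifies the token itself */
    #define XSTRING(x) STRING(x)   /* expands x first, then stringifies */

    int
    main(void)
    {
        printf("%s\n", STRING(MAXCPU));  /* "MAXCPU": sizeof == 7 */
        printf("%s\n", XSTRING(MAXCPU)); /* "32":     sizeof == 3 */
        return (0);
    }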
* - Implement generic macros for producing KTR records that are
    compatible with src/tools/sched/schedgraph.py. This allows
    developers to quickly create a graphical view of ktr data for any
    resource in the system.
  - Add sched_tdname() and the pcpu field 'name' for quickly and
    uniformly identifying records associated with a thread or cpu.
  - Reimplement the KTR_SCHED traces using the new generic facility.
  Obtained from: attilio
  Discussed with: jhb
  Sponsored by: Nokia
  (jeff, 2009-01-17, 1 file, -18/+54)
* When choosing a CPU for a thread in a cpuset, prefer the last CPU
  that the thread ran on if there are no other CPUs in the set with a
  shorter per-CPU runqueue.
  (jhb, 2008-07-28, 1 file, -1/+4)
* Implement support for cpusets in the 4BSD scheduler.
  - When a cpuset is applied to a thread, walk the cpuset to see if it
    is a "full" cpuset (includes all available CPUs). If not, set a new
    TDS_AFFINITY flag to indicate that this thread can't run on all
    CPUs. When inheriting a cpuset from another thread during thread
    creation, the new thread also inherits this flag. It is in a new
    ts_flags field in td_sched rather than using one of the TDF_SCHEDx
    flags because fork() clears td_flags after invoking sched_fork().
  - When placing a thread on a runqueue via sched_add(), if the thread
    is not pinned or bound but has the TDS_AFFINITY flag set, then
    invoke a new routine (sched_pickcpu()) to pick a CPU for the thread
    to run on next. sched_pickcpu() walks the cpuset and picks the CPU
    with the shortest per-CPU runqueue length (sketched below). Note
    that the reason for the TDS_AFFINITY flag is to avoid having to
    walk the cpuset and examine runq lengths in the common case.
  - To avoid walking the per-CPU runqueues in sched_pickcpu(), add an
    array of counters to hold the length of the per-CPU runqueues and
    update them when adding and removing threads to per-CPU runqueues.
  MFC after: 2 weeks
  (jhb, 2008-07-28, 1 file, -0/+116)
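  A user-space sketch of the CPU-selection idea, folding in the
  follow-up refinement of preferring the thread's last CPU on a tie
  (all names here are illustrative, not the kernel's):

    #include <stdint.h>

    #define NCPU 8

    static int runq_len[NCPU];  /* per-CPU runqueue length counters */

    static int
    pick_cpu(uint32_t cpuset_mask, int last_cpu)
    {
        int best = -1;

        for (int cpu = 0; cpu < NCPU; cpu++) {
            if ((cpuset_mask & (1u << cpu)) == 0)
                continue;   /* CPU not in this thread's cpuset */
            if (best == -1 || runq_len[cpu] < runq_len[best] ||
                (runq_len[cpu] == runq_len[best] && cpu == last_cpu))
                best = cpu; /* shorter queue, or tie with last CPU */
        }
        return (best);
    }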
* Various and sundry style and whitespace fixes.
  (jhb, 2008-07-28, 1 file, -74/+74)
* Add the vtime (virtual time) hooks for DTrace.
  (jb, 2008-05-25, 1 file, -0/+17)
* - Add an integer argument to idle to indicate how likely we are to
    wake from idle over the next tick.
  - Add a new MD routine, cpu_wake_idle(), to wake up idle threads that
    are suspended in CPU-specific states. This function can fail,
    causing the scheduler to fall back to another mechanism (IPI).
  - Implement support for mwait in cpu_idle() on i386/amd64 machines
    that support it. mwait is a higher-performance way to synchronize
    CPUs than hlt & IPIs.
  - Allow selecting the idle routine by name via the sysctl
    machdep.idle. This replaces machdep.cpu_idle_hlt. Only idle
    routines supported by the current machine are permitted.
  Sponsored by: Nokia
  (jeff, 2008-04-25, 1 file, -1/+1)
* - Make SCHED_STATS more generic by adding a wrapper to create the
    variables and sysctl nodes.
  - In reset, walk the children of kern_sched_stats and reset the
    counters via the oid_arg1 pointer. This allows us to add arbitrary
    counters to the tree and still reset them properly.
  - Define a set of switch types to be passed with flags to
    mi_switch(). These types are named SWT_* and correspond to
    SCHED_STATS counters, which are automatically handled in this way.
  - Make the new SWT_ types more specific than the older switch stats.
    There are now stats for idle switches, remote idle wakeups, remote
    preemptions, ithreads idling, etc.
  - Add switch statistics for ULE's pickcpu algorithm. These stats
    include how much migration there is, how often affinity was
    successful, how often threads were migrated to the local cpu on
    wakeup, etc.
  Sponsored by: Nokia
  (jeff, 2008-04-17, 1 file, -6/+4)
* - Restore runq to manipulating threads directly by putting runq links
    and rqindex back in struct thread.
  - Compile kern_switch.c independently again and stop #include'ing it
    from the schedulers.
  - Remove the ts_thread backpointers and convert most code to go from
    struct thread to struct td_sched.
  - Clean up the ts_flags #define garbage that was causing us to
    sometimes do things that expanded to
    td->td_sched->ts_thread->td_flags in 4BSD.
  - Export the kern.sched sysctl node in sysctl.h.
  (jeff, 2008-03-20, 1 file, -47/+30)
* ULE and 4BSD share only one line of code from sched_newthread(), so
  implement the required pieces in sched_fork_thread(). The td_sched
  pointer is already set up by thread_init() anyway.
  (jeff, 2008-03-20, 1 file, -1/+5)
* - Move maybe_preempt() from kern_switch.c to sched_4bsd.c. This
    function is only used by 4BSD.
  - Create a new runq_choose_fuzz() function rather than polluting
    runq_choose() with 4BSD-specific code (see the sketch below).
  - Move the fuzz sysctl into sched_4bsd.c.
  - Remove some dead code from kern_switch.c.
  (jeff, 2008-03-20, 1 file, -1/+89)
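  A user-space model of the "fuzz" choice: rather than always taking
  the head of the highest-priority queue, examine up to 'fuzz'
  candidates and prefer one that last ran on the current CPU, trading
  strict FIFO order for cache affinity (structures are illustrative):

    struct fthread {
        struct fthread *next;   /* next thread on the run queue */
        int last_cpu;           /* CPU this thread last ran on */
    };

    static struct fthread *
    choose_fuzz(struct fthread *head, int fuzz, int curcpu)
    {
        struct fthread *td = head;

        for (int i = 0; td != NULL && i < fuzz; i++, td = td->next)
            if (td->last_cpu == curcpu)
                return (td);    /* affinity hit within the fuzz window */
        return (head);          /* fall back to strict head-of-queue */
    }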
* Directly include opt_sched.h in sched_4bsd.
  (jeff, 2008-03-20, 1 file, -0/+1)
* Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice
  from requiring the per-process spinlock to only requiring the process
  lock. Reflect these changes in the proc.h documentation and consumers
  throughout the kernel. This is a substantial reduction in locking
  cost for these fields and was made possible by recent changes to
  threading support.
  (jeff, 2008-03-19, 1 file, -4/+3)
* In keeping with style(9)'s recommendations on macros, use a ';' after
  each SYSINIT() macro invocation. This makes a number of lightweight C
  parsers much happier with the FreeBSD kernel source, including
  cflow's prcc and lxr.
  MFC after: 1 month
  Discussed with: imp, rink
  (rwatson, 2008-03-16, 1 file, -2/+3)
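  A kernel-source fragment (not a standalone program) showing the style
  change; foo_init is hypothetical, while SYSINIT() and the SI_*
  constants come from <sys/kernel.h>:

    #include <sys/param.h>
    #include <sys/kernel.h>

    static void
    foo_init(void *arg __unused)
    {
        /* one-time initialization run at boot */
    }

    /* The trailing ';' used to be omitted; with it, lightweight C
     * parsers see something shaped like an ordinary declaration. */
    SYSINIT(foo, SI_SUB_DRIVERS, SI_ORDER_MIDDLE, foo_init, NULL);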
* Remove kernel support for M:N threading.
  While the KSE project was quite successful in bringing threading to
  FreeBSD, the M:N approach taken by the kse library was never
  developed to its full potential. Backwards compatibility will be
  provided via libmap.conf for dynamically linked binaries; static
  binaries will be broken.
  (jeff, 2008-03-12, 1 file, -2/+0)
* - Pass the priority argument from *sleep() into sleepq and down into
    sched_sleep(). This removes an extra thread_lock() acquisition and
    allows the scheduler to decide what to do with the static boost.
  - Change the priority arguments to cv_* to match sleepq/msleep/etc.,
    where 0 means no priority change (see the fragment below). Catch -1
    in cv_broadcastpri() and convert it to 0 for now.
  - Set a flag when sleeping in a way that is compatible with swapping,
    since direct priority comparisons are meaningless now.
  - Add a sysctl to ULE, kern.sched.static_boost, defaulting to on,
    which controls the boost behavior. Turning it off gives better
    performance in some workloads but needs more investigation.
  - While we're modifying sleepq, change signal and broadcast to both
    return with the lock held, as the lock was held on enter.
  Reviewed by: jhb, peter
  (jeff, 2008-03-12, 1 file, -1/+6)
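  A hedged kernel-source fragment of the convention the second item
  normalizes: the priority argument to msleep(9), where 0 means "leave
  the thread's priority alone" ('struct softc' and its fields are
  hypothetical):

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>

    struct softc {
        struct mtx sc_mtx;
        int        sc_ready;
    };

    static int
    wait_ready(struct softc *sc)
    {
        int error = 0;

        mtx_lock(&sc->sc_mtx);
        while (!sc->sc_ready && error == 0) {
            /* PSOCK | PCATCH: boost to PSOCK on wakeup and allow
             * signals; passing 0 would request no priority change. */
            error = msleep(&sc->sc_ready, &sc->sc_mtx,
                PSOCK | PCATCH, "rdywt", 0);
        }
        mtx_unlock(&sc->sc_mtx);
        return (error);
    }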
* - Add a sched_preempt() routine to be called by MD code after
    IPI_PREEMPT is delivered.
  - Add a simple implementation to 4BSD.
  (jeff, 2008-03-10, 1 file, -0/+11)
* Unbreak after cpuset: initialize td_cpuset in sched_fork_thread().
  (marcel, 2008-03-02, 1 file, -0/+2)
* - Add a new sched_affinity() API to be used in the upcoming cpuset
    implementation.
  - Add empty implementations of sched_affinity() to 4BSD and ULE.
  Sponsored by: Nokia
  (jeff, 2008-03-02, 1 file, -0/+5)
* - Re-implement lock profiling in such a way that it no longer breaks
    the ABI when enabled. There is no longer an embedded
    lock_profile_object in each lock. Instead a list of
    lock_profile_objects is kept per-thread for each lock it may own.
    The cnt_hold statistic is now always 0 to facilitate this.
  - Support shared locking by tracking individual lock instances and
    statistics in the per-thread per-instance lock_profile_object.
  - Make the lock profiling hash table a per-cpu singly linked list
    with a per-cpu static lock_prof allocator. This removes the need
    for an array of spinlocks and reduces cache contention between
    cores.
  - Use a separate hash for spinlocks and other locks so that only a
    critical_enter() is required and not a spinlock_enter() to modify
    the per-cpu tables.
  - Count time spent spinning in the lock statistics.
  - Remove the LOCK_PROFILE_SHARED option as it is always supported
    now.
  - Specifically drop and release the scheduler locks in both
    schedulers since we track owners now.
  In collaboration with: Kip Macy
  Sponsored by: Nokia
  (jeff, 2007-12-15, 1 file, -1/+6)
* Fix a LOR between the thread lock and umtx's priority propagation
  mutex due to the reworking of the scheduler lock.
  MFC after: 3 days
  (davidxu, 2007-12-11, 1 file, -8/+5)
* Generally we are interested in what thread did something, as opposed
  to what process. Since threads by default have the name of the
  process unless overwritten with more useful information, just print
  the thread name instead.
  (julian, 2007-11-14, 1 file, -10/+10)
* Remove unused variable td from sched_idletd().
  MFC after: 3 days
  Found with: Coverity Prevent(tm)
  CID: 3561
  (rwatson, 2007-11-05, 1 file, -2/+0)
* Change the roundrobin implementation in the 4BSD scheduler to trigger
  a userland preemption directly from hardclock() via sched_clock()
  when a thread uses up a full quantum, instead of using a periodic
  timeout to cause a userland preemption every so often. This fixes a
  potential deadlock when IPI_PREEMPTION isn't enabled, where softclock
  blocks on a lock held by a thread pinned or bound to another CPU: the
  current thread on that CPU would never be preempted while softclock
  was blocked. Note that ULE already drives its round-robin userland
  preemption from sched_clock() as well and always enables IPI_PREEMPT.
  MFC after: 1 week
  (jhb, 2007-10-27, 1 file, -29/+8)
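  A user-space model of driving round-robin from the clock tick itself:
  each thread carries a slice counter that the per-tick hook
  decrements, requesting a reschedule once the quantum is spent (names
  echo the kernel's but the code is a sketch):

    #include <stdbool.h>

    #define SLICE_TICKS 10          /* illustrative quantum length */

    struct mthread {
        int  ts_slice;              /* ticks left in the quantum */
        bool needresched;           /* models TDF_NEEDRESCHED */
    };

    static void
    sched_clock_tick(struct mthread *td)
    {
        if (--td->ts_slice <= 0) {
            td->ts_slice = SLICE_TICKS; /* recharge the quantum */
            td->needresched = true;     /* preempt at next boundary */
        }
    }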
* Introduce a way to make pure kernel threads.
  kthread_add() takes the same parameters as the old kthread_create()
  plus a pointer to a process structure, and adds a kernel thread to
  that process.
  kproc_kthread_add() takes the parameters for kthread_add(), plus a
  process name and a pointer to a pointer to a process instead of just
  a pointer; if the proc * is NULL, it creates the process to the
  specifications required before adding the thread to it.
  All other old kthread_xxx() calls remain, but act on
  (struct thread *) instead of (struct proc *). One reason to change
  the name is so that any old kernel modules that are lying around and
  expect kthread_create() to make a process will not just accidentally
  link.
  Also:
  - fix top to show kernel threads by their thread name in -SH mode;
  - add a tdnam formatting option to ps to show thread names;
  - make all idle threads actual kthreads and put them into their own
    idled process;
  - make all interrupt threads kthreads and put them in an interd
    process (mainly for aesthetic and accounting reasons);
  - rename proc 0 to 'kernel'; its swapper thread is now 'swapper'.
  Man page fixes to follow.
  (julian, 2007-10-26, 1 file, -2/+0)
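  A hedged kernel-source fragment of the renamed API: creating a kernel
  thread with kthread_add(9). worker_main is hypothetical; a NULL proc
  pointer attaches the thread to the kernel process:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/proc.h>
    #include <sys/kthread.h>

    static void
    worker_main(void *arg)
    {
        /* ... do the work ... */
        kthread_exit();
    }

    static int
    start_worker(void)
    {
        struct thread *td;

        return (kthread_add(worker_main, NULL, NULL, &td, 0, 0,
            "worker"));
    }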
* Restore historical sched_yield() behavior by changing
  sched_relinquish() to simply switch rather than lowering priority and
  switching. This allows threads of equal priority to run, but not
  those of lesser priority.
  Discussed with: davidxu
  Reported by: NIIMI Satoshi <sa2c@sa2c.net>
  Approved by: re
  (jeff, 2007-10-08, 1 file, -2/+0)
* - Redefine p_swtime and td_slptime as p_swtick and td_slptick. This
    changes the units from seconds to the value of 'ticks' when swapped
    in/out. ULE does not have a periodic timer that scans all threads
    in the system, so maintaining a per-second counter is difficult.
  - Change computations requiring the unit in seconds to subtract ticks
    and divide by hz (modeled below). This does make the wraparound
    condition hz times more frequent, but this is still in the range of
    several months to years and the adverse effects are minimal.
  Approved by: re
  (jeff, 2007-09-21, 1 file, -17/+20)
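  A user-space model of the unit change: store the raw tick value at
  the event and convert to seconds on demand (hz is a kernel global;
  the value here is illustrative):

    static const int hz = 1000;  /* illustrative; tunable in the kernel */

    static int
    seconds_since(int event_tick, int now_ticks)
    {
        /* unsigned subtraction keeps the delta correct across wrap */
        return ((int)((unsigned)now_ticks - (unsigned)event_tick) / hz);
    }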
* - Move all of the PS_ flags into either p_flag or td_flags.
  - p_sflag was mostly protected by PROC_LOCK rather than the
    PROC_SLOCK or previously the sched_lock. These bugs have existed
    for some time.
  - Allow swapout to try each thread in a process individually and then
    swapin the whole process if any of these fail. This allows us to
    move most scheduler-related swap flags into td_flags.
  - Keep ki_sflag for backwards compat but change all in-source tools
    to use the new and more correct location of P_INMEM.
  Reported by: pho
  Reviewed by: attilio, kib
  Approved by: re (kensmith)
  (jeff, 2007-09-17, 1 file, -8/+8)
* - Remove the global definition of sched_lock in mutex.h to break new
    code and third-party modules that try to depend on it.
  - Initialize sched_lock in sched_4bsd.c.
  - Declare sched_lock in sparc64 pmap.c and assert that we're
    compiling with SCHED_4BSD to prevent accidental crashes from
    running ULE. This is the sole remaining file outside of the
    scheduler that uses the global sched_lock.
  Approved by: re
  (jeff, 2007-07-18, 1 file, -0/+2)
* Move some common code out of sched_fork_exit() and back into
  fork_exit().
  (jeff, 2007-06-12, 1 file, -15/+4)
* Placing the 'volatile' on the right side of the * in the td_lock
  declaration removes the need for __DEVOLATILE().
  Pointed out by: tegge
  (jeff, 2007-06-06, 1 file, -1/+1)
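  The declaration change in miniature: where the 'volatile' qualifier
  binds decides whether the pointee or the pointer itself is volatile:

    struct mtx;

    volatile struct mtx *p1;    /* pointer to a volatile struct mtx */
    struct mtx * volatile p2;   /* volatile pointer: the pointer itself
                                   may change underfoot, which is what
                                   td_lock needs */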
* Better fix for the previous error; use __DEVOLATILE() on the td_lock
  pointer. It can actually sometimes be something other than
  sched_lock, even on schedulers which rely on a global scheduler lock.
  Tested by: kan
  (jeff, 2007-06-05, 1 file, -1/+1)
* Pass &sched_lock as the third argument to cpu_switch(), as this will
  always be the correct lock and we don't get volatile warnings this
  way.
  Pointed out by: kan
  (jeff, 2007-06-05, 1 file, -1/+1)
* Commit 1/14 of sched_lock decomposition.
  - Move all scheduler locking into the schedulers, utilizing a
    technique similar to Solaris's container locking.
  - A per-process spinlock is now used to protect the queue of threads,
    thread count, suspension count, p_sflags, and other process-related
    scheduling fields.
  - The new thread lock is actually a pointer to a spinlock for the
    container that the thread is currently owned by. The container may
    be a turnstile, sleepqueue, or run queue.
  - thread_lock() is now used to protect access to thread-related
    scheduling fields. thread_unlock() unlocks the lock and
    thread_set_lock() implements the transition from one lock to
    another (see the sketch below).
  - A new "blocked_lock" is used in cases where it is not safe to hold
    the actual thread's lock yet we must prevent access to the thread.
  - sched_throw() and sched_fork_exit() are introduced to allow the
    schedulers to fix up locking at these points.
  - Add some minor infrastructure for optionally exporting scheduler
    statistics that were invaluable in solving performance problems
    with this patch. Generally these statistics allow you to
    differentiate between different causes of context switches.
  Tested by: kris, current@
  Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
  Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts
  each)
  (jeff, 2007-06-04, 1 file, -45/+115)
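  A pthreads sketch of the container-locking idea described above: the
  thread's lock is a pointer that can be retargeted to whichever
  container currently owns the thread, so a locker must re-check the
  pointer after acquiring (the real kernel uses spinlocks and memory
  barriers; this model elides the atomics):

    #include <pthread.h>

    struct cthread {
        pthread_mutex_t *td_lock;   /* lock of the owning container */
    };

    static pthread_mutex_t *
    thread_lock(struct cthread *td)
    {
        for (;;) {
            pthread_mutex_t *m = td->td_lock;

            pthread_mutex_lock(m);
            if (m == td->td_lock)
                return (m);          /* still the owning container */
            pthread_mutex_unlock(m); /* lock moved underfoot; retry */
        }
    }

    static void
    thread_set_lock(struct cthread *td, pthread_mutex_t *new_lock)
    {
        /* Caller holds the current lock; hand the thread to a new
         * container and release the old one. */
        pthread_mutex_t *old = td->td_lock;

        td->td_lock = new_lock;
        pthread_mutex_unlock(old);
    }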
* Use pause() rather than tsleep() on stack variables and function
  pointers.
  (jhb, 2007-02-27, 1 file, -2/+1)
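  A hedged kernel-source fragment of the idiom change: pause(9) sleeps
  for a fixed number of ticks with no wait channel, whereas tsleep(9)
  on a stack variable's address invents a channel that nothing will
  ever wakeup() on:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/kernel.h>

    static void
    backoff_one_second(void)
    {
        /* Old style: int dummy; tsleep(&dummy, PWAIT, "wait", hz); */
        pause("wait", hz);  /* sleep ~1 second; no wakeup channel */
    }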
* Move the setting of the idle_mask bits to a place where they can't be
  wrong. Also use the IDLETD bit in the thread mask to test whether it
  is an idle thread rather than doing a PCPU access.
  (julian, 2007-02-02, 1 file, -17/+25)
* - Remove setrunqueue() and replace it with direct calls to
    sched_add(). setrunqueue() was mostly empty; the few asserts and
    thread state setting were moved to the individual schedulers.
    sched_add() was chosen to displace it for naming consistency
    reasons.
  - Remove adjustrunqueue(); it was 4 lines of code that was ifdef'd to
    be different on all three schedulers where it was only called in
    one place each.
  - Remove the long ifdef'd-out remrunqueue code.
  - Remove the now-redundant ts_state. Inspect the thread state
    directly.
  - Don't set TSF_* flags from kern_switch.c; we were only doing this
    to support a feature in one scheduler.
  - Change sched_choose() to return a thread rather than a td_sched.
    Also, rely on the schedulers to return the idlethread. This
    simplifies the logic in choosethread(). Aside from the run queue
    links, kern_switch.c mostly does not care about the contents of
    td_sched.
    Discussed with: julian
  - Move the idle thread loop into the per-scheduler area. ULE wants to
    do something different from the other schedulers.
    Suggested by: jhb
  Tested on: x86/amd64, sched_{4BSD, ULE, CORE}
  (jeff, 2007-01-23, 1 file, -34/+65)
* Prefer a more traditional spelling of inhibited in comments and panic
  messages.
  (rwatson, 2006-12-31, 1 file, -1/+1)
* Fix typo: p_slptime should be td_slptime.
  (davidxu, 2006-12-24, 1 file, -1/+1)
* Threading cleanup... part 2 of several.
  Make part of John Birrell's KSE patch permanent. Specifically,
  remove:
  - any reference to the ksegrp structure. This feature was never fully
    utilised and made things overly complicated.
  - all code in the scheduler that tried to make threaded programs fair
    to unthreaded programs. Libpthread processes will already do this
    to some extent and libthr processes already disable it.
  Also, since this makes such a big change to the scheduler(s), take
  the opportunity to rename some structures and elements that had to be
  moved anyhow. This makes the code a lot more readable.
  The ULE scheduler compiles again but I have no idea if it works.
  The 4bsd scheduler still requires a little cleaning, and some
  functions that now do ALMOST nothing will go away, but I thought I'd
  do that as a separate commit.
  Tested by: David Xu, and Dan Eischen using libthr and libpthread
  (julian, 2006-12-06, 1 file, -629/+135)
* Whitespace fix only.
  (julian, 2006-11-20, 1 file, -6/+6)
* Fix a copy-paste bug in NON-KSE case.
  (davidxu, 2006-11-14, 1 file, -11/+11)
* Unbreak userland priority inheritance in the NO_KSE case.
  (davidxu, 2006-11-11, 1 file, -1/+2)
* Make KSE a kernel option, turned on by default in all GENERIC kernel
  configs except sun4v (which doesn't process signals properly with
  KSE).
  Reviewed by: davidxu@
  (jb, 2006-10-26, 1 file, -0/+358)
* Add user priority loaning code to support priority propagation for
  1:1 threading's POSIX priority mutexes. The code is a no-op unless
  the priority-aware umtx code is committed.
  (davidxu, 2006-08-25, 1 file, -1/+56)
* o Fix grammar in the comment, indent macros. No functional changes.
  (maxim, 2006-07-02, 1 file, -7/+7)
* o Remove rev. 1.57 leftover, not-reached code.
  (maxim, 2006-07-02, 1 file, -2/+0)