path: root/sys/kern/sched_ule.c
Commit log, newest first. Each entry shows author, date (files changed, lines -/+), then the commit message.
* peter, 2003-12-07 (1 file, -1/+1):
  rqb_bits[] may be an int64_t (e.g. on alpha, and recently on amd64).
  Be sure to shift (long)1 << 33 and higher, not (int)1. Otherwise bad
  things happen(TM). This is why beast.freebsd.org panicked with ULE.
  Reviewed by: jeff
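  A minimal sketch of the bug class (illustrative values, not the actual
  kernel code): building a 64-bit run-queue mask with an int constant
  shifts out of range once the bit index passes 31.

      #include <stdio.h>

      int
      main(void)
      {
              int bit = 33;           /* run-queue bit above 31 */
              long bad, good;

              bad = 1 << bit;         /* (int)1 shifted >= 32: undefined */
              good = (long)1 << bit;  /* shift performed at 64-bit width */
              printf("bad=%#lx good=%#lx\n", bad, good);
              return (0);
      }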
* jhb, 2003-12-03 (1 file, -1/+1):
  Fix all users of mp_maxid to use the same semantics, namely:
  1) mp_maxid is a valid FreeBSD CPU ID in the range 0 .. MAXCPU - 1.
  2) For all active CPUs in the system, PCPU_GET(cpuid) <= mp_maxid.
  Approved by: re (scottl)
  Tested on: i386, amd64, alpha
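  A sketch of the iteration idiom these semantics imply; the standard
  CPU_ABSENT() companion check skips holes in the ID space:

      int cpu;

      for (cpu = 0; cpu <= mp_maxid; cpu++) {  /* bound is inclusive */
              if (CPU_ABSENT(cpu))
                      continue;
              /* ... per-CPU work ... */
      }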
* jeff, 2003-11-17 (1 file, -3/+3):
  - Mark ksq_assigned as volatile so that when this code is used without
    sched_lock we can be sure that we'll pick up the new value.
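  A sketch of the effect (the declaration is illustrative): volatile
  forces a fresh load of ksq_assigned on every access instead of letting
  the compiler cache it in a register, which matters for the lockless
  peek.

      struct kseq {
              /* ... */
              struct kse * volatile ksq_assigned; /* lockless hand-off list */
      };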
* jeff, 2003-11-17 (1 file, -52/+4):
  - Remove long dead code. rslices hasn't been used in some time and
    neither has sched_pickcpu().
* jeff, 2003-11-15 (1 file, -61/+83):
  - Introduce kseq_runq_{add,rem}() which are used to insert and remove
    kses from the run queues. Also, on SMP, we track the transferable
    count here. Threads are transferable only as long as they are on the
    run queue. (A sketch of the wrappers follows this entry.)
  - Previously, we adjusted our load balancing based on the transferable
    count minus the number of actual cpus. This was done to account for
    the threads which were likely to be running. All of this logic is
    simpler now that transferable accounts for only those threads which
    can actually be taken. Updated various places in sched_add() and
    kseq_balance() to account for this.
  - Rename kseq_{add,rem} to kseq_load_{add,rem} to reflect what they're
    really doing. The load is accounted for separately from the runq
    because the load is accounted for even while the thread is running.
  - Fix a bug in sched_class() where we weren't properly using the
    PRI_BASE() version of kg_pri_class.
  - Add a large comment that describes the impact of a seemingly simple
    conditional in sched_add().
  - Also in sched_add(), check the transferable count and
    KSE_CAN_MIGRATE() prior to checking kseq_idle. This reduces the
    frequency of access to kseq_idle, which is a shared resource.
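  A hedged sketch of the wrapper shape (field and helper names are
  illustrative, not the committed code):

      static __inline void
      kseq_runq_add(struct kseq *kseq, struct kse *ke)
      {
      #ifdef SMP
              if (KSE_CAN_MIGRATE(ke, PRI_BASE(ke->ke_ksegrp->kg_pri_class)))
                      kseq->ksq_transferable++;   /* on a runq: stealable */
      #endif
              runq_add(ke->ke_runq, ke);
      }

      static __inline void
      kseq_runq_rem(struct kseq *kseq, struct kse *ke)
      {
      #ifdef SMP
              if (KSE_CAN_MIGRATE(ke, PRI_BASE(ke->ke_ksegrp->kg_pri_class)))
                      kseq->ksq_transferable--;   /* no longer stealable */
      #endif
              runq_remove(ke->ke_runq, ke);
      }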
* jeff, 2003-11-06 (1 file, -1/+1):
  - Somehow I botched my last commit. Add an extra ( to fix things up.
    I'm still not sure how this happened.
  Reported by: ps
* jeff, 2003-11-06 (1 file, -17/+3):
  - Remove the local definition of sched_pin and unpin; they are provided
    in sched.h now.
  - Respect the td pin count.
* jeff, 2003-11-05 (1 file, -3/+4):
  - It's OK if sched_runnable() has races in it; we don't need the
    sched_lock here unless we have something on the assigned queue.
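  A hedged sketch of the idea: the unlocked read of ksq_assigned may
  race, but a stale answer is harmless, and sched_lock is only taken
  when the assigned queue actually needs draining (accessor name is
  illustrative):

      int
      sched_runnable(void)
      {
              struct kseq *kseq = KSEQ_SELF();

              if (kseq->ksq_assigned != NULL) {   /* racy peek is fine */
                      mtx_lock_spin(&sched_lock);
                      kseq_assign(kseq);          /* drain hand-off list */
                      mtx_unlock_spin(&sched_lock);
              }
              return (kseq->ksq_load > 0);
      }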
* jeff, 2003-11-04 (1 file, -2/+53):
  - Add initial support for pinning and binding.
* jeff, 2003-11-03 (1 file, -66/+17):
  - Remove kseq_find(); we no longer scan other cpus' run queues when we
    go idle. They figure out that we're idle fast enough that the cache
    pollution introduced by scanning their run queues is more expensive
    than waiting a little longer.
  - Add kseq_setidle() to mark us as being idle. Use this in place of
    kseq_find().
  - Remove kseq_load_highest(); kseq_find() was the only consumer of this
    interface. kseq_balance() has its own customized version that finds
    the lowest and highest loads simultaneously.
  Continuously told that this would be faster by: terry
* jeff, 2003-11-02 (1 file, -33/+50):
  - Remove the ksq_loads[] array. We are only interested in three counts:
    the total load, the timeshare load, and the number of threads that
    can be migrated to another cpu. Account for these separately.
  - Introduce a KSE_CAN_MIGRATE() macro which determines whether or not a
    KSE can be migrated to another CPU. Currently, this only checks to
    see if we're an interrupt handler. Eventually this will also be used
    to support CPU binding.
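  A sketch of the macro as described; the two-argument shape is an
  assumption, and the committed form may differ:

      #define KSE_CAN_MIGRATE(ke, class)  ((class) != PRI_ITHD)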
* jeff, 2003-11-02 (1 file, -1/+2):
  - In sched_prio(), only force us onto the current queue if our priority
    is being elevated (numerically smaller).
* jeff, 2003-11-02 (1 file, -10/+11):
  - Rename SCHED_PRI_NTHRESH to SCHED_SLICE_NTHRESH since it is only used
    in slice assignment. Add a comment describing what it does.
  - Remove a stale XXX comment; nice should not impact interactivity,
    nice adjustments only affect non-interactive tasks in ULE.
  - Don't allow nice -20 tasks to totally starve nice 0 tasks. Give them
    at least SCHED_SLICE_MIN ticks. We still allow nice 0 tasks to starve
    nice +20 tasks, as intended.
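  A sketch of the floor described in the last item; the scaling step is
  elided and SCHED_SLICE_NICE() is a hypothetical stand-in for it:

      slice = SCHED_SLICE_MAX - SCHED_SLICE_NICE(nice);
      if (slice < SCHED_SLICE_MIN)
              slice = SCHED_SLICE_MIN;    /* never starved entirely */
      ke->ke_slice = slice;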
* jeff, 2003-11-02 (1 file, -5/+5):
  - Remove uses of PRIO_TOTAL and replace them with SCHED_PRI_NRESV.
  - SCHED_PRI_NRESV does not have the off-by-one error in PRIO_TOTAL, so
    we do not have to account for it in the few places that we use it.
  Requested by: bde
* jeff, 2003-11-02 (1 file, -27/+56):
  - Change sched_interact_update() to only accept slp+runtime values
    between 0 and SCHED_SLP_RUN_MAX * 2. This allows us to simplify the
    algorithm quite a bit. Before, it dealt with arbitrary values which
    required us to do nasty integer division tricks that didn't quite
    work out correctly. (A sketch follows this entry.)
  - Change sched_wakeup() to detect conditions where the slp+runtime
    could exceed SCHED_SLP_RUN_MAX * 2. This can happen if we go to sleep
    for longer than 6 seconds. In this case, we'll just clear the runtime
    and set the sleep time to the max.
  - Define a new function, sched_interact_fork(), which updates the
    slp+runtime of a newly forked thread. We want to limit the amount of
    history retained from the parent so that we learn the child's
    behavior quickly. We don't, however, want to decay it to nothing.
    Previously, we would simply divide each parameter by 100 whenever we
    forked. After a few forks the values would reach 0 and tasks would
    not be considered interactive.
  - Add another KTR entry, clean up some existing entries.
  - Remove a useless sched_interact_update() from sched_priority(). This
    is already done by the callers that require it.
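  A hedged sketch of the simplified update: with the sum bounded by
  2 * SCHED_SLP_RUN_MAX on entry, halving both terms brings it back
  under the limit in one step (field names per the ksegrp of the era):

      static void
      sched_interact_update(struct ksegrp *kg)
      {
              int sum;

              sum = kg->kg_runtime + kg->kg_slptime;
              if (sum <= SCHED_SLP_RUN_MAX)
                      return;
              /* Caller guarantees sum <= 2 * SCHED_SLP_RUN_MAX here. */
              kg->kg_runtime /= 2;
              kg->kg_slptime /= 2;
      }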
* jeff, 2003-10-31 (1 file, -78/+222):
  - Add static to local functions and data where it was missing.
  - Add an IPI-based mechanism for migrating kses. This mechanism is
    broken down into several components. This is intended to reduce cache
    thrashing by eliminating most cases where one cpu touches another's
    run queues.
  - kseq_notify() appends a kse to a lockless singly linked list and
    conditionally sends an IPI to the target processor. Right now this is
    protected by sched_lock, but at some point I'd like to get rid of the
    global lock. This is why I used something more complicated than a
    standard queue. (A sketch follows this entry.)
  - kseq_assign() processes our list of kses that have been assigned to
    us by other processors. This simply calls sched_add() for each item
    on the list after clearing the new KEF_ASSIGNED flag. This flag is
    used to indicate that we have been appended to the assigned queue but
    not added to the run queue yet.
  - In sched_add(), instead of adding a KSE to another processor's queue,
    we use kseq_notify() so that we don't touch their queue. Also in
    sched_add(), if KEF_ASSIGNED is already set, return immediately. This
    can happen if a thread is removed and readded so that the priority is
    recorded properly.
  - In sched_rem(), return immediately if KEF_ASSIGNED is set. All
    callers immediately readd simply to adjust priorities etc.
  - In sched_choose(), if we're running an IDLE task or the per-cpu idle
    thread, set our cpumask bit in kseq_idle so that other processors may
    know that we are idle. Before this, make a single pass through the
    run queues of other processors so that we may find work more
    immediately if it is available.
  - In sched_runnable(), don't scan each processor's run queue; they will
    IPI us if they have work for us to do.
  - In sched_add(), if we're adding a thread that can be migrated and we
    have plenty of work to do, try to migrate the thread to an idle kseq.
  - Simplify the logic in sched_prio() and take the KEF_ASSIGNED flag
    into consideration.
  - No longer use kseq_choose() to steal threads; it can lose its last
    argument.
  - Create a new function runq_steal() which operates like runq_choose()
    but skips threads based on some criteria. Currently it will not steal
    PRI_ITHD threads. In the future this will be used for CPU binding.
  - Create a kseq_steal() that checks each run queue with runq_steal();
    use kseq_steal() in the places where we used kseq_choose() to steal
    before.
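  A hedged sketch of the lockless hand-off; helper names and signatures
  approximate the era's SMP API: push onto the target cpu's singly
  linked list with compare-and-swap, then poke it with an IPI.

      static void
      kseq_notify(struct kse *ke, int cpu)
      {
              struct kseq *kseq = KSEQ_CPU(cpu);  /* illustrative accessor */

              ke->ke_flags |= KEF_ASSIGNED;
              do {
                      ke->ke_assign = kseq->ksq_assigned;
              } while (atomic_cmpset_ptr(
                  (volatile uintptr_t *)&kseq->ksq_assigned,
                  (uintptr_t)ke->ke_assign, (uintptr_t)ke) == 0);
              ipi_selected(1 << cpu, IPI_AST);    /* wake the target cpu */
      }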
* bde, 2003-10-29 (1 file, -3/+0):
  Removed sched_nest variable in sched_switch(). Context switches always
  begin with sched_lock held but not recursed, so this variable was
  always 0.
  Removed fixup of sched_lock.mtx_recurse after context switches in
  sched_switch(). Context switches always end with this variable in the
  same state that it began in, so there is no need to fix it up. Only
  sched_lock.mtx_lock really needs a fixup.
  Replaced fixup of sched_lock.mtx_recurse in fork_exit() by an assertion
  that sched_lock is owned and not recursed after it is fixed up. This
  assertion must match the one in mi_switch(), and if sched_lock were
  recursed then a non-null fixup of sched_lock.mtx_recurse would probably
  be needed again, unlike in sched_switch(), since fork_exit() doesn't
  return to its caller in the normal way.
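  The assertion described, as it would be spelled with mtx_assert(9) (a
  sketch, not the verbatim committed line):

      mtx_assert(&sched_lock, MA_OWNED | MA_NOTRECURSED);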
* jeff, 2003-10-28 (1 file, -10/+2):
  - Only change the run queue in sched_prio() if the kse is non-null;
    threads can be in the TD_ON_RUNQ state and not have an associated
    kse.
  - Remove the PRI_IDLE special case from sched_clock(); it was not
    actually necessary.
* jeff, 2003-10-27 (1 file, -56/+50):
  - Use a better algorithm in sched_pctcpu_update(). (A sketch follows
    this entry.)
    Contributed by: Thomaswuerfl@gmx.de
  - In sched_prio(), adjust the run queue for threads which may need to
    move to the current queue due to priority propagation.
  - In sched_switch(), fix a style bug introduced when the KSE support
    went in. Columns are 80 chars wide, not 90.
  - In sched_switch(), fix the comparison in the idle case and explicitly
    re-initialize the runq in the not-propagated case.
  - Remove dead code in sched_clock().
  - In sched_clock(), if we're an IDLE class td, set NEEDRESCHED so that
    threads that have become runnable will get a chance to run.
  - In sched_runnable(), if we're not the IDLETD, we should not consider
    curthread when examining the load. This mimics the 4BSD behavior of
    returning 0 when the only runnable thread is running.
  - In sched_userret(), remove the code for setting NEEDRESCHED entirely.
    This is not necessary and is not implemented in 4BSD.
  - Use the correct comparison in sched_add() when checking to see if an
    idle prio task has had its priority temporarily elevated.
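  A hedged sketch of a sliding-window %cpu update of the kind described
  (field names are illustrative): ticks accumulated over roughly
  SCHED_CPU_TICKS of history are rescaled so old history decays instead
  of being dropped all at once.

      static void
      sched_pctcpu_update(struct kse *ke)
      {
              if (ke->ke_ltick > ticks - SCHED_CPU_TICKS) {
                      /* Rescale the accumulated ticks to the window. */
                      ke->ke_ticks = (ke->ke_ticks /
                          (ticks - ke->ke_ftick)) * SCHED_CPU_TICKS;
              } else
                      ke->ke_ticks = 0;   /* window fully expired */
              ke->ke_ltick = ticks;
              ke->ke_ftick = ke->ke_ltick - SCHED_CPU_TICKS;
      }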
* jeff, 2003-10-20 (1 file, -0/+2):
  - If a thread is not bound to a kse return 0 from sched_pctcpu().
  Reported by: pawel.worach@nordea.com
* jeff, 2003-10-16 (1 file, -8/+10):
  - Only kse_reassign() in the !running case.
  Reported by: kris
* jeff, 2003-10-16 (1 file, -1/+1):
  - Call sched_add() with the correct argument on SMP.
  Reported by: Valentin Chopov <valentin@valcho.net>
* jeff, 2003-10-16 (1 file, -3/+1):
  - Fix a minor problem with my last commit: we don't want to return from
    sched_switch if the thread is running; we want to fall through and
    pick a new thread because we have been preempted.
* jeff, 2003-10-16 (1 file, -8/+9):
  - Collapse sched_switchin() and sched_switchout() into sched_switch().
    Now mi_switch() calls sched_switch() which calls cpu_switch(). This
    is actually one less function call than it had been.
* jeff, 2003-10-16 (1 file, -7/+14):
  - Update the sched api. sched_{add,rem,clock,pctcpu} now all accept a
    td argument rather than a kse.
* jeff, 2003-10-16 (1 file, -8/+6):
  - The non-iterative algorithm for interact_update was broken due to
    rounding errors. This was the source of the majority of the
    interactivity problems. Reintroduce the old algorithm and its XXX.
  - Up the interactivity threshold to 30. It really could stand to be
    even a tiny bit higher.
  - Let the sleep and run time accumulate up to 5 seconds of history
    rather than two. This helps stop XFree86 from becoming
    non-interactive during bursts of activity.
* jeff, 2003-10-15 (1 file, -3/+10):
  - If our user_pri doesn't match our actual priority, our priority has
    been elevated either due to priority propagation or because we're in
    the kernel; in either case, put us on the current queue so that we
    don't stop others from using important resources. At some point the
    priority elevations from sleeping in the kernel should go away.
  - Remove an optimization in sched_userret(). Before, we would only set
    NEEDRESCHED if there was something of a higher priority available.
    This is a trivial optimization and it breaks priority propagation
    because it doesn't take threads which we may be blocking into
    account. Notice that the thread which is blocking others gets up to
    one tick of cpu time before we honor this NEEDRESCHED in
    sched_clock().
* jeff, 2003-10-12 (1 file, -8/+7):
  - In SCHED_CURR(), add holding Giant to the list of criteria that will
    keep you on the current queue. (A sketch follows this entry.) In the
    future, it would be nice if priority propagation could
    deterministically pluck a thread off of the next queue and put it on
    the current queue. Until then, this hack stops us from holding up our
    entire current queue, including interrupt handlers, while a thread on
    the next queue is blocked while holding Giant.
  - Inherit our pctcpu information from our parent.
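  A sketch of the intent, not the committed macro; SCHED_INTERACTIVE()
  is from the same file, while thread_owns_giant() is a hypothetical
  stand-in for however Giant ownership was actually tested:

      #define SCHED_CURR(kg, ke)                                      \
              (SCHED_INTERACTIVE(kg) ||                               \
               (ke)->ke_thread->td_priority != (kg)->kg_user_pri ||   \
               thread_owns_giant((ke)->ke_thread))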
* jeff, 2003-10-04 (1 file, -4/+6):
  - Change a lame iterative algorithm to a constant time algorithm.
    Remove the XXX that complains about it as well.
  Submitted by: ThomasWuerfl@gmx.de
* jeff, 2003-09-20 (1 file, -10/+11):
  - Somewhere along the line I stupidly removed critical logic from
    sched_pctcpu_update(). This caused erroneous cpu times in TOP for
    processes that were asleep. Replace the code that was removed.
* davidxu, 2003-08-26 (1 file, -18/+17):
  Let SA processes work under the ULE scheduler; originally it would
  panic the kernel.
  Reviewed by: jeff
* sam, 2003-08-19 (1 file, -1/+1):
  Change instances of callout_init that specify MPSAFE behaviour to use
  CALLOUT_MPSAFE instead of "1" for the second parameter. This does not
  change the behaviour; it just makes the intent more clear.
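  The change in miniature (the callout variable name is illustrative):

      callout_init(&kseq_lb_callout, 1);              /* before: opaque */
      callout_init(&kseq_lb_callout, CALLOUT_MPSAFE); /* after: clear intent */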
* jeff, 2003-07-08 (1 file, -7/+13):
  - When stealing a kse in kseq_move(), ignore the current kseq's min
    nice value. We want to steal any thread, even one that is not given a
    slice on its current queue.
* jeff, 2003-07-07 (1 file, -0/+2):
  - Clean up an unused variable.
  Submitted by: Steve Kargl <sgk@troutmask.apl.washington.edu>
* jeff, 2003-07-04 (1 file, -13/+53):
  - Parse the cpu topology map in sched_setup().
  - Associate logical CPUs on the same physical core with the same kseq.
  - Adjust code that assumed there would only be one running thread in
    any kseq.
  - Wrap the HTT code with a ULE_HTT_EXPERIMENTAL ifdef. This is a start
    towards HyperThreading support, but it isn't quite there yet.
* jeff, 2003-06-28 (1 file, -4/+4):
  - Don't migrate to stopped cpus.
* jeff, 2003-06-28 (1 file, -0/+3):
  - If smp is not started yet, don't try to load balance or we'll put
    threads on cpus that aren't running yet.
* jeff, 2003-06-28 (1 file, -4/+4):
  - Throttle the inherited sleep and run time in sched_fork_ksegrp().
    This allows us to learn the behavior of a thread much more quickly
    after it starts up.
* jeff, 2003-06-28 (1 file, -2/+2):
  - Adjust the default maximum slice value to ~140ms. This has improved
    the nice distribution without significantly impacting interactive
    response. As a side effect it should also allow batch processes to
    run for a slightly longer period, which will positively impact their
    performance.
* jeff, 2003-06-21 (1 file, -2/+0):
  - lticks was erroneously being updated in sched_pctcpu(). This was
    causing us to skip the pctcpu_update() call, which led to inaccurate
    cpu usage statistics for processes that didn't run often.
* jeff, 2003-06-21 (1 file, -8/+7):
  - Don't allow nice to have such a large effect on priority. This was
    causing poor interactive performance while unnice processes were
    running. The new scheme still allows nice to have an effect on
    priority, but it is not as dramatic as the effect of the
    interactivity score.
* jeff, 2003-06-17 (1 file, -2/+1):
  - Use a more robust mechanism for determining whether or not a kse is
    on a kseq.
* jeff, 2003-06-17 (1 file, -1/+2):
  - Temporarily patch a problem where the interact score could be
    negative because the run time exceeds the largest value a signed int
    can hold. The real solution involves calculating how far we are over
    the limit. To quickly solve this problem, we loop removing 1/5th of
    the current value until it falls below the limit. The common case
    requires no passes.
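  A sketch of the described stopgap (field names are illustrative):

      while (kg->kg_runtime > SCHED_SLP_RUN_MAX)
              kg->kg_runtime = (kg->kg_runtime / 5) * 4; /* shave 1/5th */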
* jeff, 2003-06-17 (1 file, -23/+20):
  - Add a new function "sched_interact_update()" that scales back the
    sleep and run time.
  - Scale the sleep and run time back via sched_interact_update() in more
    places. This is to keep the statistic more accurate.
  - Charge a parent one tick for forking a child.
  - Add only the run time and not the sleep time to the parent's kg when
    a thread exits. This allows us to give a penalty for having an
    expensive thread exit but does not give a bonus for having an
    interactive thread exit.
  - Change the SLP_RUN_THROTTLE to limit us to 4/5th and not 1/2.
  - Change the SLP_RUN_MAX to two seconds. This keeps bursty interactive
    applications like mozilla and openoffice in the interactive range
    even through expensive tasks.
  - Recalculate the slice after every sleep. This ensures that once a
    task has been marked interactive it only has a slice of 1, at the
    risk of giving tasks that sleep for a very brief period a longer time
    slice.
* jeff, 2003-06-15 (1 file, -2/+2):
  - Increase the ksegrp's cpu time history buffer to 250ms.
  - Decrease the history buffer divisor to 2 so that we remember more of
    the old behavior.
* jeff, 2003-06-15 (1 file, -0/+4):
  - Cap the growth of sleep and run time in sched_exit_kse().
* jeff, 2003-06-15 (1 file, -38/+54):
  - Fix the maximum slice value. I accidentally checked in a value of '2'
    which meant no process would run for longer than 20ms.
  - Slightly redo the interactivity scorer. It follows the same algorithm
    but in a slightly more correct way. Previously, values above half
    were incorrect.
  - Lower the interactivity threshold to 20. It seems that in testing
    non-interactive tasks are hardly ever near there and expensive
    interactive tasks can sometimes surpass it. This area needs more
    testing.
  - Remove an unnecessary KTR.
  - Fix a case where an idle thread that had an elevated priority due to
    priority propagation would be placed back on the idle queue.
  - Delay setting NEEDRESCHED until userret() for threads that had their
    priority elevated while in kernel. This gives us the same context
    switch optimization as SCHED_4BSD.
  - Limit the child's slice to 1 in sched_fork_kse() so we detect its
    behavior more quickly.
  - Inherit some of the run/slp time from the child in
    sched_exit_ksegrp().
  - Redo some of the priority comparisons so they are more clear.
  - Throttle the frequency of sched_pctcpu_update() so that rounding
    errors do not make it invalid.
* davidxu, 2003-06-15 (1 file, -1/+1):
  Rename P_THREADED to P_SA. P_SA means a process is using scheduler
  activations.
* obrien, 2003-06-11 (1 file, -2/+3):
  Use __FBSDID().
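  The idiom, for reference; it embeds the version-control ID string in
  the compiled object file:

      #include <sys/cdefs.h>
      __FBSDID("$FreeBSD$");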
* jeff, 2003-06-09 (1 file, -7/+95):
  - Add a simple CPU load balancing algorithm. This works by executing
    once a second and equalizing the load between the two most imbalanced
    CPUs. This is intended to clear up long-term load imbalances that
    would not be handled by the 'pull' method in sched_choose(). (A
    sketch follows this entry.)
  - Pull out some bits of sched_choose() into a kseq_move() function that
    moves an arbitrary thread from one kseq to another.
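  A hedged sketch of the once-a-second balancer; helper names
  (KSEQ_CPU(), KSEQ_ID(), the callout variable) are illustrative:

      static void
      kseq_balance(void *arg)
      {
              struct kseq *kseq, *high, *low;
              int i;

              high = low = NULL;
              for (i = 0; i <= mp_maxid; i++) {
                      if (CPU_ABSENT(i))
                              continue;
                      kseq = KSEQ_CPU(i);
                      if (high == NULL || kseq->ksq_load > high->ksq_load)
                              high = kseq;
                      if (low == NULL || kseq->ksq_load < low->ksq_load)
                              low = kseq;
              }
              /* Move one thread if the imbalance is worth correcting. */
              if (high != NULL && low != NULL &&
                  high->ksq_load - low->ksq_load >= 2)
                      kseq_move(high, KSEQ_ID(low));
              callout_reset(&kseq_lb_callout, hz, kseq_balance, NULL);
      }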