path: root/sys/kern/kern_synch.c
Commit message  (author, date; files changed, lines -deleted/+added)
* - Catch up to proc flag changes.  (jhb, 2001-01-24; 1 file, -25/+29)
  - Add in some locking ops that might fix SIGXCPU, but don't enable them
    yet.
  - Assert that sched_lock is not recursed when mi_switch() is called.
* Do not do the commenting out the way that saves bytes and looks cleaner  (mjacob, 2001-01-23; 1 file, -1/+4)
  to you. Do it the way Vox Populi wants it.
* Move (now) unused variable declaration inside the block (now commented out).  (mjacob, 2001-01-22; 1 file, -2/+1)
* Temporarily disable the printf() for microuptime() going backwards, the  (jhb, 2001-01-20; 1 file, -0/+5)
  SIGXCPU signal, and killing of processes that exceed their allowed run
  time until they can play nice with sched_lock. Right now they are just
  potential panics waiting to happen. The printf() has bitten several
  people.
* Implement condition variables.  (jasone, 2001-01-16; 1 file, -6/+8)
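
  The condvar(9) API introduced here pairs a condition variable with a
  mutex in the usual monitor style. A minimal consumer/producer sketch
  (the queue, lock, and field names are illustrative, and the mutex-call
  spelling follows the mutex API of this era):

      struct mtx qlock;
      struct cv qcv;

      /* consumer: sleep until the producer queues work */
      mtx_enter(&qlock, MTX_DEF);
      while (TAILQ_EMPTY(&workq))
              cv_wait(&qcv, &qlock);          /* drops qlock while asleep */
      w = TAILQ_FIRST(&workq);
      TAILQ_REMOVE(&workq, w, w_link);
      mtx_exit(&qlock, MTX_DEF);

      /* producer: queue work and wake one sleeper */
      mtx_enter(&qlock, MTX_DEF);
      TAILQ_INSERT_TAIL(&workq, w, w_link);
      cv_signal(&qcv);
      mtx_exit(&qlock, MTX_DEF);
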
* Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables  (jake, 2001-01-10; 1 file, -11/+12)
  other than curproc.
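
  The idiom this converts to, sketched (switchtime is just an example of
  a per-cpu variable; the field name is not taken from the diff):

      switchtime = PCPU_GET(switchtime);      /* read a per-cpu field */
      PCPU_SET(switchtime, new_switchtime);   /* write a per-cpu field */
      tvp = PCPU_PTR(switchtime);             /* take its address */
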
* - Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead  (jake, 2000-12-13; 1 file, -2/+2)
    of explicit calls to lockmgr. Also provides macros for the flags
    passed to specify shared, exclusive or release which map to the
    lockmgr flags. This is so that the use of lockmgr can be easily
    replaced with optimized reader-writer locks (a caller-side sketch
    follows this entry).
  - Add some locking that I missed the first time.
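
  A sketch of the intended caller-side idiom; the AP_* flag spellings
  are an assumption based on the description above, not taken from the
  diff:

      ALLPROC_LOCK(AP_SHARED);        /* shared hold for traversal */
      LIST_FOREACH(p, &allproc, p_list)
              nfound++;
      ALLPROC_LOCK(AP_RELEASE);       /* drop the lock */
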
* Add in #include of <sys/lock.h> since it was axed from <sys/proc.h>.  (jhb, 2000-12-06; 1 file, -0/+1)
  Noticed by: Wesley Morgan <morganw@chemikals.org>
  Pointy hat to: me
* Remove thr_sleep and thr_wakeup. Remove fields p_nthread and p_wakeup  (jake, 2000-12-02; 1 file, -0/+25)
  from struct proc, which are now unused (p_nthread already was). Remove
  process flag P_KTHREADP which was untested and only set in vfs_aio.c
  (it should use kthread_create). Move the yield system call to
  kern_synch.c as kern_threads.c has been removed completely.
  moral support from: alfred, jhb
* Use an mp-safe callout for endtsleep.  (jake, 2000-12-01; 1 file, -0/+2)
* Fix up priority propagation:  (jhb, 2000-11-30; 1 file, -1/+0)
  - Use a better test for determining when a process is running.
  - Convert some checks to assertions.
  - Remove unnecessary tests.
  - Save the priority before acquiring a mutex rather than in msleep(9).
* Don't drop Giant and the passed in mutex incorrectly in the  (jhb, 2000-11-29; 1 file, -10/+12)
  cold || panicstr case. Do drop the passed in mutex in that case if
  PDROP is specified.
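
  The corrected early return in msleep() presumably has roughly this
  shape (a sketch, not the committed hunk):

      if (cold || panicstr) {
              /*
               * During autoconfiguration or after a panic, don't block;
               * just honor the caller's PDROP request and return.
               */
              if (mtx != NULL && (priority & PDROP) != 0)
                      mtx_exit(mtx, MTX_DEF);
              return (0);
      }
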
* Use callout_reset instead of timeout(9). Most callouts are statically  (jake, 2000-11-27; 1 file, -8/+13)
  allocated, 2 have been added to struct proc for setitimer and sleep.
  Reviewed by: jhb, jlemon
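
  The shape of the conversion for the sleep timeout, sketched (taking
  p_slpcallout as the name of the callout added to struct proc for
  sleep; treat the field name as an assumption):

      /* before: timeout(9) hands back a handle from a shared pool */
      handle = timeout(endtsleep, (void *)p, timo);
      /* ... sleep ... */
      untimeout(endtsleep, (void *)p, handle);

      /* after: the callout is preallocated in struct proc */
      callout_reset(&p->p_slpcallout, timo, endtsleep, (void *)p);
      /* ... sleep ... */
      callout_stop(&p->p_slpcallout);
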
* Protect the following with a lockmgr lock:  (jake, 2000-11-22; 1 file, -0/+2)
      allproc
      zombproc
      pidhashtbl
      proc.p_list
      proc.p_hash
      nextpid
  Reviewed by: jhb
  Obtained from: BSD/OS and NetBSD
* - Split the run queue and sleep queue linkage, so that a process  (jake, 2000-11-17; 1 file, -9/+9)
    may block on a mutex while on the sleep queue without corrupting it.
  - Move dropping of Giant to after the acquire of sched_lock.
  Tested by: John Hay <jhay@icomtek.csir.co.za>, jhb
* Don't release and acquire Giant in mi_switch(). Instead, release and  (jhb, 2000-11-16; 1 file, -16/+5)
  acquire Giant as needed in functions that call mi_switch(). The
  releases need to be done outside of the sched_lock to avoid potential
  deadlocks from trying to acquire Giant while interrupts are disabled.
  Submitted by: witness
* Argh, add in a missing release of the sched_lock.  (jhb, 2000-11-16; 1 file, -0/+1)
* CURSIG() calls functions that acquire sleep mutexes, so it is not a good  (jhb, 2000-11-16; 1 file, -3/+12)
  idea to be holding the sched_lock while we are calling it. As such,
  release sched_lock before calling CURSIG() in msleep() and mawait()
  and reacquire it after CURSIG() returns.
  Submitted by: witness
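
  The resulting pattern in msleep(), sketched with the spin-mutex call
  spelling of this era:

      /*
       * CURSIG() may acquire sleep mutexes, which is illegal while
       * holding a spin lock, so drop sched_lock around the call.
       */
      mtx_exit(&sched_lock, MTX_SPIN);
      sig = CURSIG(p);
      mtx_enter(&sched_lock, MTX_SPIN);
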
* - Rename await() to mawait(). mawait() is to await() as msleep() is to  (jhb, 2000-11-15; 1 file, -14/+27)
    tsleep(). Namely, mawait() takes an extra argument which is a mutex
    to drop when going to sleep. Just as with msleep(), if the priority
    argument includes the PDROP flag, then the mutex will be dropped and
    will not be reacquired when the process wakes up.
  - Add in a backwards compatible macro await() that passes in NULL as
    the mutex argument to mawait().
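
  The compatibility macro described in the second item amounts to this
  (assuming the mutex is mawait()'s first parameter; argument names are
  illustrative):

      #define await(priority, timo)   mawait(NULL, (priority), (timo))
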
* - Replace a KASSERT() that knew too much about mutex internals with a  (jhb, 2000-11-15; 1 file, -2/+1)
    mtx_assert() that ensures the mutex we release during msleep() is
    both not recursed and owned by the current process.
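
  Using mtx_assert(9)'s flag names, the check reads approximately:

      /* the mutex we are about to release must be ours, unrecursed */
      mtx_assert(mtx, MA_OWNED | MA_NOTRECURSED);
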
* - Convert references from tsleep() -> msleep()  (jhb, 2000-11-15; 1 file, -9/+9)
  - Fix a buglet in a comment above await()
* - GC some #if 0'd code regarding the non-existent safepri variable.  (jhb, 2000-10-20; 1 file, -17/+6)
  - Don't dink with the witness state of Giant unless we actually own it
    during mi_switch().
* - Change fast interrupts on x86 to push a full interrupt frame and to  (jhb, 2000-10-06; 1 file, -6/+8)
    return through doreti to handle ast's. This is necessary for the
    clock interrupts to work properly.
  - Change the clock interrupts on the x86 to be fast instead of
    threaded. This is needed because both hardclock() and statclock()
    need to run in the context of the current process, not in a separate
    thread context.
  - Kill the prevproc hack as it is no longer needed.
  - We really need Giant when we call psignal(), but we don't want to
    block during the clock interrupt. Instead, use two p_flag's in the
    proc struct to mark the current process as having a pending SIGVTALRM
    or a SIGPROF and let them be delivered during ast() when hardclock()
    has finished running. (A sketch of this dance follows the entry.)
  - Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways. It
    was broken on the x86 if it was turned on since cpl is gone. Its only
    use was to bogusly run softclock() directly during hardclock() rather
    than scheduling an SWI.
  - Remove the COM_LOCK simplelock and replace it with a clock_lock spin
    mutex. Since the spin mutex already handles disabling/restoring
    interrupts appropriately, this also lets us axe all the *_intr() fu.
  - Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use
    temporary fast interrupts for the APIC trial.
  - Add two new process flags P_ALRMPEND and P_PROFPEND to mark the
    pending signals in hardclock() that are to be delivered in ast().
  Submitted by: jakeb (making statclock safe in a fast interrupt)
  Submitted by: cp (concept of delaying signals until ast())
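
  A sketch of the deferred-signal dance (the itimer_fired and
  profile_fired expiry tests are illustrative placeholders; the flags
  and functions are named in the entry above):

      /*
       * hardclock(), fast interrupt context: no Giant here, so only
       * mark the signal pending and request an AST.
       */
      if (itimer_fired)
              p->p_flag |= P_ALRMPEND;
      if (profile_fired)
              p->p_flag |= P_PROFPEND;
      aston();

      /* ast(), back in the process's own context, Giant available */
      if (p->p_flag & P_ALRMPEND) {
              p->p_flag &= ~P_ALRMPEND;
              psignal(p, SIGVTALRM);
      }
      if (p->p_flag & P_PROFPEND) {
              p->p_flag &= ~P_PROFPEND;
              psignal(p, SIGPROF);
      }
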
* Add a KASSERT() to catch instances where the mutex that we pass in to  (jhb, 2000-09-24; 1 file, -0/+2)
  msleep() is recursed.
  Suggested by: cp
* Remove the mtx_t, witness_t, and witness_blessed_t types. Instead, just  (jhb, 2000-09-14; 1 file, -1/+1)
  use struct mtx, struct witness, and struct witness_blessed.
  Requested by: bde
* Rename tsleep to msleep and add a mutex argument, which is  (jake, 2000-09-11; 1 file, -4/+19)
  released before sleeping and re-acquired before msleep returns. A
  compatibility cpp macro has been provided for tsleep to avoid changing
  all occurrences of it in the kernel. Remove an assertion that the Giant
  mutex be held before calling tsleep or asleep.

  This is intended to serve the same purpose as condition variables, but
  does not preclude their addition in the future.

  Approved by: jasone
  Obtained from: BSD/OS
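
  The resulting interlock idiom, sketched with illustrative driver
  names:

      mtx_enter(&sc->sc_mtx, MTX_DEF);
      while (sc->sc_count == 0) {
              /*
               * msleep drops sc_mtx while asleep and re-acquires it
               * before returning, closing the lost-wakeup window.
               */
              msleep(&sc->sc_count, &sc->sc_mtx, PZERO, "scwait", 0);
      }
      mtx_exit(&sc->sc_mtx, MTX_DEF);

  and the compatibility macro is essentially:

      #define tsleep(chan, pri, wmesg, timo) \
              msleep((chan), NULL, (pri), (wmesg), (timo))
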
* Fix printf warnings in CTRx calls.  (dfr, 2000-09-10; 1 file, -18/+18)
* Major update to the way synchronization is done in the kernel.  (jasone, 2000-09-07; 1 file, -14/+101)
  Highlights include:
  * Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The
    alpha port is still in transition and currently uses both.)
  * Per-CPU idle processes.
  * Interrupts are run in their own separate kernel threads and can be
    preempted (i386 only).
  Partially contributed by: BSDi (BSD/OS)
  Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
* Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.  (phk, 2000-07-04; 1 file, -1/+1)
  Pointed out by: bde
* Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:  (phk, 2000-07-03; 1 file, -1/+1)
  Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grok our
  sources:
      -sysctl_vm_zone SYSCTL_HANDLER_ARGS
      +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
* Back out the previous change to the queue(3) interface.  (jake, 2000-05-26; 1 file, -1/+1)
  It was not discussed and should probably not happen.
  Requested by: msmith and others
* Change the way that the queue(3) structures are declared; don't assume  (jake, 2000-05-23; 1 file, -1/+1)
  that the type argument to *_HEAD and *_ENTRY is a struct.
  Suggested by: phk
  Reviewed by: phk
  Approved by: mdodd
* Correct a couple of typos.  (grog, 2000-05-07; 1 file, -2/+2)
* Change the scheduler to actually respect the PUSER barrier. It's been  (green, 2000-04-30; 1 file, -1/+1)
  wrong for many years that negative niceness would lower the priority of
  a process below PUSER, and once below PUSER, there were conditionals in
  the code that are required to test for whether a process was in the
  kernel which would break. The breakage could (and did) cause lock-ups,
  basically nothing else but the least nice program being able to run in
  some conditions.

  The algorithm which adjusts the priority now subtracts PRIO_MIN to do
  things properly (see the sketch after this entry), and the ESTCPULIM()
  algorithm was updated to use PRIO_TOTAL (PRIO_MAX - PRIO_MIN) to
  calculate the estcpu. NICE_WEIGHT is now 1 to accommodate the full
  range of priorities better (a -20 process with full CPU time has the
  priority of a +0 process with no CPU time). There are now 20 queues
  (exactly; 80 priorities) for use in user processes' scheduling, and
  PUSER has been lowered to 48 to accomplish this.

  This means, to the user, that things will be scheduled more correctly
  (noticeable); there is no lock-up anymore WRT a niced -20 process never
  releasing the CPU time for other processes. In this fair system,
  tsleep()ed < PUSER processes now will get the proper higher priority
  than priority >= PUSER user processes.

  The detective work of this was done by me, along with part of the
  solution. Luoqi Chen has provided most of the solution, and really
  helped me understand what was happening better, to boot :)

  Submitted by: luoqi
  Concept reviewed by: bde
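
  With NICE_WEIGHT at 1 and PRIO_MIN subtracted, the user-priority
  calculation takes roughly this shape (a reconstruction from the
  description above, not the literal diff):

      /*
       * p_nice is in [PRIO_MIN, PRIO_MAX] = [-20, 20]; subtracting
       * PRIO_MIN keeps the computed priority at or above PUSER.
       */
      newpriority = PUSER + p->p_estcpu / INVERSE_ESTCPU_WEIGHT +
          NICE_WEIGHT * (p->p_nice - PRIO_MIN);
      newpriority = min(newpriority, MAXPRI);
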
* The SMP cleanup commit broke UP compiles. Make UP compiles work again.  (dillon, 2000-03-28; 1 file, -2/+0)
* Commit major SMP cleanups and move the BGL (big giant lock) in the  (dillon, 2000-03-28; 1 file, -0/+1)
  syscall path inward. A system call may select whether it needs the MP
  lock or not (the default being that it does need it).

  A great deal of conditional SMP code for various dead-ended experiments
  has been removed. 'cil' and 'cml' have been removed entirely, and the
  locking around the cpl has been removed. The conditional
  separately-locked fast-interrupt code has been removed, meaning that
  interrupts must hold the CPL now (but they pretty much had to anyway).
  Another reason for doing this is that the original separate-lock for
  interrupts just doesn't apply to the interrupt thread mechanism being
  contemplated.

  Modifications to the cpl may now ONLY occur while holding the MP lock.
  For example, if an otherwise MP safe syscall needs to mess with the
  cpl, it must hold the MP lock for the duration and must (as usual)
  save/restore the cpl in a nested fashion.

  This is precursor work for the real meat coming later: avoiding having
  to hold the MP lock for common syscalls and I/O's and interrupt
  threads. It is expected that the spl mechanisms and new interrupt
  threading mechanisms will be able to run in tandem, allowing a slow
  piecemeal transition to occur.

  This patch should result in a moderate performance improvement due to
  the considerable amount of code that has been removed from the critical
  path, especially the simplification of the spl*() calls. The real
  performance gains will come later.

  Approved by: jkh
  Reviewed by: current, bde (exception.s)
  Some work taken from: luoqi's patch
* I applied the wrong patch set. Back out anything associated  (dufault, 2000-03-02; 1 file, -6/+1)
  with the known bogus currtpriority. This undoes the previous changes to
  sys/i386/i386/trap.c, sys/alpha/alpha/trap.c, and sys/sys/systm.h. Now
  we have the patch set approved by bde.
  Approved by: bde
* Patches that eliminate extra context switches in the FIFO case.  (dufault, 2000-03-02; 1 file, -11/+47)
  Fixes the p1003_1b regression test in the simple case of no RR and FIFO
  processes competing.
  Reviewed by: jkh, bde
* Don't make the ktrace hook in tsleep() deref a null curproc after a panic.  (peter, 1999-11-30; 1 file, -1/+1)
  PR: 15169
  Submitted by: David Gilbert <dgilbert@velocet.ca>
* Add a bit of sanity checking and problem avoidance in case the  (phk, 1999-11-29; 1 file, -2/+9)
  timecounter hardware is bogus. This will produce a new warning
  "microuptime() went backwards" and try to not screw up the process
  resource accounting.
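
  The check presumably amounts to: if the new switch timestamp precedes
  the previous one, warn and clamp so the accounting delta cannot go
  negative. A sketch (the variable names are an assumption):

      microuptime(&new_switchtime);
      if (timevalcmp(&new_switchtime, &switchtime, <)) {
              printf("microuptime() went backwards (%ld.%06ld -> %ld.%06ld)\n",
                  switchtime.tv_sec, switchtime.tv_usec,
                  new_switchtime.tv_sec, new_switchtime.tv_usec);
              new_switchtime = switchtime;    /* charge zero time */
      }
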
* Scheduler fixes equivalent to the ones logged in the following NetBSD  (bde, 1999-11-28; 1 file, -12/+10)
  commit to kern_synch.c:
  ----------------------------
    revision 1.55
    date: 1999/02/23 02:56:03; author: ross; state: Exp; lines: +39 -10

    Scheduler bug fixes and reorganization
    * fix the ancient nice(1) bug, where nice +20 processes incorrectly
      steal 10-20% of the CPU (or even more depending on load average)
    * provide a new schedclk() mechanism at a new clock at schedhz, so
      high platform hz values don't cause nice +0 processes to look like
      they are niced
    * change the algorithm slightly, and reorganize the code a lot
    * fix percent-CPU calculation bugs, and eliminate some no-op code

    === nice bug ===
    Correctly divide the scheduler queues between niced and compute-bound
    processes. The current nice weight of two (sort of, see `algorithm
    change' below) neatly divides the USRPRI queues in half; this should
    have been used to clip p_estcpu, instead of UCHAR_MAX. Besides being
    the wrong amount, clipping an unsigned char to UCHAR_MAX is a no-op,
    and it was done after decay_cpu() which can only _reduce_ the value.
    It has to be kept <= NICE_WEIGHT * PRIO_MAX - PPQ or processes can
    scheduler-penalize themselves onto the same queue as nice +20
    processes. (Or even a higher one.)

    === New schedclk() mechanism ===
    Some platforms should be cutting down stathz before hitting the
    scheduler, since the scheduler algorithm only works right in the
    vicinity of 64 Hz. Rather than prescale hz, then scale back and forth
    by 4 every time p_estcpu is touched (each occurrence an abstraction
    violation), use p_estcpu without scaling and require schedhz to be
    generated directly at the right frequency. Use a default stathz
    (well, actually, profhz) / 4, so nothing changes unless a platform
    defines schedhz and a new clock. Define these for alpha, where
    hz==1024, and nice was totally broke.

    === Algorithm change ===
    The nice value used to be added to the exponentially-decayed
    scheduler history value p_estcpu, in _addition_ to being incorporated
    directly (with greater weight) into the priority calculation. At
    first glance, it appears to be a pointless increase of 1/8 the nice
    effect (pri = p_estcpu/4 + nice*2), but it's actually at least 3x
    that because it will ramp up linearly but be decayed only
    exponentially, thus converging to an additional .75 nice for a
    loadaverage of one. I killed this; it makes the behavior hard to
    control, almost impossible to analyze, and the effect (nothing at all
    for the first second, then somewhat increased niceness after three
    seconds or more, depending on load average) pointless.

    === Other bugs ===
    hz -> profhz in the p_pctcpu = f(p_cpticks) calculation. Collect
    scheduler functionality. Try to put each abstraction in just one
    place.
  ----------------------------

  The details are a little different in FreeBSD:

  === nice bug ===
  Fixing this is the main point of this commit. We use essentially the
  same clipping rule as NetBSD (our limit on p_estcpu differs by a scale
  factor; sketched after this entry). However, clipping at all is
  fundamentally bad. It gives free CPU to the hoggiest hogs once they
  reach the limit, and reaching the limit is normal for long-running
  hogs. This will be fixed later.

  === New schedclk() mechanism ===
  We don't use the NetBSD schedclk() (now schedclock()) mechanism. We
  require (real)stathz to be about 128 and scale by an extra factor of 2
  compared with NetBSD's statclock(). We scale p_estcpu instead of
  scaling the clock. This is more accurate and flexible.

  === Algorithm change ===
  Same change.

  === Other bugs ===
  The p_pctcpu bug was fixed long ago. We don't try as hard to abstract
  functionality yet.

  Related changes: the new limit on p_estcpu must be exported to
  kern_exit.c for clipping in wait1().

  Agreed with by: dufault
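
  The clipping rule from the "=== nice bug ===" sections, as code
  (this is the NetBSD-shaped sketch; FreeBSD's actual limit differs by
  a scale factor, per the note above):

      /*
       * Clip p_estcpu so a long-running hog cannot scheduler-penalize
       * itself onto the same queue as nice +20 processes.
       */
      p->p_estcpu = min(p->p_estcpu + 1, NICE_WEIGHT * PRIO_MAX - PPQ);
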
* Updated comments for the move in the previous commit.  (bde, 1999-11-27; 1 file, -5/+5)
* Moved scheduling-related code to kern_synch.c so that it is easier to fix  (bde, 1999-11-27; 1 file, -0/+26)
  and extend. The new function containing the code is named schedclock()
  as in NetBSD, but it has slightly different semantics (it already
  handles incrementation of p->p_cpticks, and it should handle any
  calling frequency).
  Agreed with in principle by: dufault
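
  Given that description, schedclock() is approximately this sketch
  (the resetpriority() interaction is inferred from the scheduler code
  of this era, not quoted from the commit):

      /*
       * Charge the process for one scheduler tick and decay its
       * priority as p_estcpu accumulates.
       */
      void
      schedclock(p)
              struct proc *p;
      {
              p->p_cpticks++;
              p->p_estcpu = ESTCPULIM(p->p_estcpu + 1);
              if ((p->p_estcpu % INVERSE_ESTCPU_WEIGHT) == 0) {
                      resetpriority(p);
                      if (p->p_priority >= PUSER)
                              p->p_priority = p->p_usrpri;
              }
      }
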
* This is a partial commit of the patch from PR 14914:  (phk, 1999-11-16; 1 file, -3/+3)
  A lot of the code in sys/kern directly accesses the *Q_HEAD and
  *Q_ENTRY structures for list operations. This patch makes all list
  operations in sys/kern use the queue(3) macros, rather than directly
  accessing the *Q_{HEAD,ENTRY} structures. This batch of changes
  compiles to the same object files.
  Reviewed by: phk
  Submitted by: Jake Burkholder <jake@checker.org>
  PR: 14914
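
  The substitution pattern, sketched with an illustrative traversal
  (not a literal hunk from the patch):

      /* before: direct access to the *Q_HEAD/*Q_ENTRY internals */
      for (p = allproc.lh_first; p != 0; p = p->p_list.le_next)
              nfound++;

      /* after: the same traversal through the queue(3) macros */
      for (p = LIST_FIRST(&allproc); p != NULL; p = LIST_NEXT(p, p_list))
              nfound++;
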
* sigset_t change (part 2 of 5)  (marcel, 1999-09-29; 1 file, -2/+2)

  The core of the signalling code has been rewritten to operate on the
  new sigset_t. No methodological changes have been made. Most references
  to a sigset_t object are through macros (see signalvar.h) to create a
  level of abstraction and to provide a basis for further improvements.

  The NSIG constant has not been changed to reflect the maximum number of
  signals possible. The reason is that it breaks programs (especially
  shells) which assume that all signals have a non-null name in
  sys_signame. See src/bin/sh/trap.c for an example. Instead _SIG_MAXSIG
  has been introduced to hold the maximum signal possible with the new
  sigset_t.

  struct sigprop has been moved from signalvar.h to kern_sig.c because
  a) it is only used there, and b) access must be done through function
  sigprop(). The latter because the table doesn't hold properties for all
  signals, but only for the first NSIG signals.

  signal.h has been reorganized to make reading easier and to add the new
  and/or modified structures. The "old" structures are moved to
  signalvar.h to prevent namespace pollution. Especially the coda
  filesystem suffers from the change, because it contained lines like
  (p->p_sigmask == SIGIO), which is easy to do for integral types, but
  not for compound types (see the sketch after this entry).

  NOTE: kdump (and port linux_kdump) must be recompiled.

  Thanks to Garrett Wollman and Daniel Eischen for pressing the
  importance of changing sigreturn as well.
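
  The coda example shows why the abstraction macros matter: a compound
  sigset_t has no == or bitwise operators. Membership tests go through
  macros such as SIGISMEMBER() from signalvar.h; loosely (handle_io()
  is a placeholder):

      /* old world: sigset_t is integral, operators work directly */
      if (p->p_sigmask == SIGIO)
              handle_io();

      /* new world: ask the abstraction macro instead */
      if (SIGISMEMBER(p->p_sigmask, SIGIO))
              handle_io();
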
* $Id$ -> $FreeBSD$  (peter, 1999-08-28; 1 file, -1/+1)
* Don't initialize run queues here, do it all in one place.  (peter, 1999-08-19; 1 file, -22/+2)
* The magic "no-cpu" cpu number is 0xff. Don't misrepresent cpu  (bde, 1999-03-05; 1 file, -2/+2)
  numbers as chars or use bogus casts in an attempt to unmisrepresent
  them. In top, don't assume that 0xff is the only negative cpu number
  when cpu numbers are (mis)represented.
* The tunable parameter for the scheduler quantum was inverted:  (julian, 1999-03-03; 1 file, -30/+23)
  higher numbers led to smaller quanta. In discussion with BDE, change
  this parameter to be in microseconds to make it machine independent,
  and limit it to non-zero multiples of 'tick' (rounding down). Also make
  the variable globally available so that the present function that
  returns its value (used for POSIX scheduling, I believe) can go away.
  Submitted by: Bruce Evans <bde@freebsd.org>
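
  A sketch of the corrected tunable: the sysctl speaks microseconds,
  while the scheduler keeps the quantum in ticks internally (close to,
  but not necessarily identical to, the committed handler):

      static int
      sysctl_kern_quantum(SYSCTL_HANDLER_ARGS)
      {
              int error, new_val;

              new_val = sched_quantum * tick;         /* export in usec */
              error = sysctl_handle_int(oidp, &new_val, 0, req);
              if (error != 0 || req->newptr == NULL)
                      return (error);
              if (new_val < tick)                     /* >= one tick */
                      return (EINVAL);
              sched_quantum = new_val / tick;         /* rounds down */
              return (0);
      }
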
* Removed all traces of `p_switchtime'. The relevant timestamp is per-cpu,  (bde, 1999-02-28; 1 file, -8/+8)
  not per-process. Keep it in `switchtime' consistently.

  It is now clear that the timestamp is always valid in fork_trampoline()
  except when the child is running on a previously idle cpu, which can
  only happen if there are multiple cpus, so don't check or set the
  timestamp in fork_trampoline except in the (i386) SMP case. Just remove
  the alpha code for setting it unconditionally, since there is no SMP
  case for alpha and the code had rotted.

  Parts reviewed by: dfr, phk