path: root/sys/kern/kern_resource.c
Commit message / Author / Age / Files / Lines
* Replace system call thr_getscheduler, thr_setscheduler, thr_setschedparamdavidxu2006-09-211-0/+95
| | | | | with rtprio_thread, while rtprio system call is for process only, the new system call rtprio_thread is responsible for LWP.
* Commit the results of the typo hunt by Darren Pilgrim.yar2006-08-041-1/+1
| | | | | | | | | | This change affects documentation and comments only, no real code involved. PR: misc/101245 Submitted by: Darren Pilgrim <darren pilgrim bitfreak org> Tested by: md5(1) MFC after: 1 week
* Go over calcru and friends once more.phk2006-03-111-47/+48
| | | | | Reintroduce the monotonicity for the normal case and make the two special cases behave in what is believed to be the most sensible fashion.
* Add slop to "backwards" cpu accounting messages, 3 usec or 1% whicheverphk2006-03-091-1/+5
| | | | | | | | | | | | | triggers. This should eliminate all the trivial messages which result from minor increases in cpu_tick frequency. Machines which don't do cpu clock fiddling shouldn't issue "backwards" messages now. Laptops and other machines where the initial estimate of cputicks may be waaaay off will still issue warnings.
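The slop rule described above can be sketched in userland C. This is an illustration under assumptions, not the committed kernel code: the helper name `backwards_is_reportable` is hypothetical, and it reads "3 usec or 1% whichever triggers" as meaning that a backwards step is ignored when it falls within either tolerance.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a backwards step in accumulated
 * CPU time (in microseconds) is large enough to be worth reporting.
 * Assumes both values are far below UINT64_MAX / 101, so the
 * multiplications cannot overflow.
 */
static bool
backwards_is_reportable(uint64_t prev_usec, uint64_t new_usec)
{
	if (new_usec >= prev_usec)
		return (false);		/* time did not go backwards */
	if (new_usec + 3 > prev_usec)
		return (false);		/* less than 3 usec: numeric slop */
	if (101 * new_usec > 100 * prev_usec)
		return (false);		/* within roughly 1%: tick rate drift */
	return (true);
}
```

With these tolerances, a step from 1000 usec back to 995 usec is ignored, while a step from 1000 usec back to 900 usec would be reported.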
* Various style and comment fixes.jhb2006-02-221-8/+7
| | | | Submitted by: bde
* Split calcru() back into a calcru1() function shared with calccru() andjhb2006-02-211-10/+33
| | | | | | | | | | a calcru() wrapper that passes a local rusage_ext on the stack that is a snapshot to do the calculations on. Now we can pass p->p_crux to calcru1() in calccru() again which fixes the issues with runtime going backwards messages when dead processes are harvested by init. Reviewed by: phk Tested by: Stefan Ehmann shoesoft at gmx dot net
* CPU time accounting speedup (step 2)phk2006-02-111-68/+45
| | | | | | | | | | | | | | | | | | | Keep accounting time (in per-cpu) cputicks and the statistics counts in the thread and summarize into struct proc when at context switch. Don't reach across CPUs in calcru(). Add code to calibrate the top speed of cpu_tickrate() for variable cpu_tick hardware (like TSC on power managed machines). Don't enforce monotonicity (at least for now) in calcru. While the calibrated cpu_tickrate ramps up it may not be true. Use 27MHz counter on i386/Geode. Use TSC on amd64 & i386 if present. Use tick counter on sparc64
* Modify the way we account for CPU time spent (step 1)phk2006-02-071-9/+12
| | | | | | | | | | | | | | | | Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers. For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.) On slower machines, the avoided multiplications to normalize timestamps at every context switch come out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.
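The idea of accumulating raw ticks and scaling only on inspection can be sketched as follows. This is a minimal userland illustration, not the kernel's conversion routine: the name `cputicks_to_usec` and its signature are assumptions, and the caller is assumed to pass a nonzero tick rate below about 1.8e13 Hz so the intermediate product fits in 64 bits.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert an accumulated tick count to microseconds
 * given the tick rate in Hz. Whole seconds and the sub-second remainder
 * are scaled separately so that large tick counts do not overflow.
 */
static uint64_t
cputicks_to_usec(uint64_t ticks, uint64_t tick_rate)
{
	uint64_t secs = ticks / tick_rate;
	uint64_t rem = ticks % tick_rate;

	return (secs * 1000000 + rem * 1000000 / tick_rate);
}
```

The context-switch path then only needs an addition on the raw counter; the division is paid once per inspection rather than once per switch.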
* Back out changes made in rev. 1.151.ups2006-01-251-1/+1
| | | | | | They were bogus. Cluebat applied by: jhb@
* Hopefully fix the "calcru: runtime went backwards from ..." problem byups2006-01-231-1/+1
| | | | | | | keeping the resource values locked (where needed) while we use them for calculations. MFC after: 3 days
* Calling setrlimit from 32bit apps could potentially increase certainps2005-11-021-0/+7
| | | | | | | limits beyond what should be capable in a 32bit process, so we must fixup the limits. Reviewed by: jhb
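A limit fixup of this kind can be sketched as a simple clamp. The code below is an illustration only: `struct limit_pair`, `clamp_to_32bit`, `fixup_limit_pair`, and the choice of `INT32_MAX` as the ceiling are all assumptions made for the example, and the commit does not say which limits are adjusted or to what values.

```c
#include <stdint.h>

/* Assumed ceiling for the example; the real per-limit ceilings may differ. */
#define LIMIT32_CEILING	((int64_t)INT32_MAX)

/* Hypothetical stand-in for a soft/hard resource limit pair. */
struct limit_pair {
	int64_t cur;	/* soft limit */
	int64_t max;	/* hard limit */
};

/* Clamp a single value to the assumed 32-bit ceiling. */
static int64_t
clamp_to_32bit(int64_t value)
{
	return (value > LIMIT32_CEILING ? LIMIT32_CEILING : value);
}

/* Apply the clamp to both the soft and the hard limit. */
static void
fixup_limit_pair(struct limit_pair *lp)
{
	lp->cur = clamp_to_32bit(lp->cur);
	lp->max = clamp_to_32bit(lp->max);
}
```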
* Use the reference count API to manage the reference counts for processjhb2005-09-271-11/+4
| | | | | | | limit structures rather than using pool mutexes to protect the reference counts. Tested on: i386, alpha, sparc64
* Giant is no longer required in kern_setrlimit(); remove its acquisition andalc2005-06-011-2/+0
| | | | | | release. Reviewed by: jhb
* Stop explicitly touching td_base_pri outside of the scheduler and simplyjhb2004-12-301-1/+0
| | | | | set a thread's priority via sched_prio() when that is the desired action. The schedulers will start managing td_base_pri internally shortly.
* Rework how we store process times in the kernel such that we always storejhb2004-10-051-79/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the raw values including for child process statistics and only compute the system and user timevals on demand. - Fix the various kern_wait() syscall wrappers to only pass in a rusage pointer if they are going to use the result. - Add a kern_getrusage() function for the ABI syscalls to use so that they don't have to play stackgap games to call getrusage(). - Fix the svr4_sys_times() syscall to just call calcru() to calculate the times it needs rather than calling getrusage() twice with associated stackgap, etc. - Add a new rusage_ext structure to store raw time stats such as tick counts for user, system, and interrupt time as well as a bintime of the total runtime. A new p_rux field in struct proc replaces the same inline fields from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux field in struct proc contains the "raw" child time usage statistics. ruadd() has been changed to handle adding the associated rusage_ext structures as well as the values in rusage. Effectively, the values in rusage_ext replace the ru_utime and ru_stime values in struct rusage. These two fields in struct rusage are no longer used in the kernel. - calcru() has been split into a static worker function calcru1() that calculates appropriate timevals for user and system time as well as updating the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a copy of the process' p_rux structure to compute the timevals after updating the runtime appropriately if any of the threads in that process are currently executing. It also now only locks sched_lock internally while doing the rux_runtime fixup. calcru() now only requires the caller to hold the proc lock and calcru1() only requires the proc lock internally. calcru() also no longer allows callers to ask for an interrupt timeval since none of them actually did. 
- calcru() now correctly handles threads executing on other CPUs. - A new calccru() function computes the child system and user timevals by calling calcru1() on p_crux. Note that this means that any code that wants child times must now call this function rather than reading from p_cru directly. This function also requires the proc lock. - This finishes the locking for rusage and friends so some of the Giant locks in exit1() and kern_wait() are now gone. - The locking in ttyinfo() has been tweaked so that a shared lock of the proctree lock is used to protect the process group rather than the process group lock. By holding this lock until the end of the function we now ensure that the process/thread that we pick to dump info about will no longer vanish while we are trying to output its info to the console. Submitted by: bde (mostly) MFC after: 1 month
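Computing user and system times on demand from raw statistics can be sketched as a proportional split of the total runtime by tick counts. This is a simplified illustration, not the body of calcru1(): the helper name `runtime_share` is hypothetical, the total tick count is assumed to be the sum of user, system, and interrupt ticks, and the zero-tick case is handled here by returning 0, which may differ from what the kernel does.

```c
#include <stdint.h>

/*
 * Hypothetical helper: attribute a share of the total runtime (in
 * microseconds) to one class of ticks, in proportion to how many of
 * the sampled ticks fell into that class. Assumes the product
 * runtime_usec * ticks fits in 64 bits.
 */
static uint64_t
runtime_share(uint64_t runtime_usec, uint64_t ticks, uint64_t total_ticks)
{
	if (total_ticks == 0)
		return (0);	/* nothing sampled yet (assumed behavior) */
	return (runtime_usec * ticks / total_ticks);
}
```

For example, a process with 1 second of runtime and 3 of its 4 sampled ticks in user mode would be charged 750000 usec of user time.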
* A modest collection of various and sundry style, spelling, and whitespacejhb2004-09-241-38/+33
| | | | | | fixes. Submitted by: bde (mostly)
* Various small style fixes.jhb2004-09-221-2/+4
|
* Push UIDINFO_UNLOCK() slightly earlier in chgsbsize(), as it's notrwatson2004-08-061-2/+2
| | | | | needed if we print the local variable version of the limit rather than the shared version.
* Remove spl's from kern_resource.c.rwatson2004-08-041-4/+0
|
* Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This iscperciva2004-07-261-1/+1
| | | | | | | | | | | somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb
* Turned off the "calcru: negative time" warning for certain SMP casesbde2004-06-211-12/+34
| | | | | | | | | | | where it is known to detect a problem but the problem is not very easy to fix. The warning became very common recently after a call to calcru() was added to fill_kinfo_thread(). Another (much older) cause of "negative times" (actually non-monotonic times) was fixed in rev.1.237 of kern_exit.c. Print separate messages for non-monotonic and negative times.
* Nice, is a property of a process as a whole..julian2004-06-161-34/+10
| | | | | I mistakenly moved it to the ksegroup when breaking up the process structure. Put it back in the proc structure.
* Deorbit COMPAT_SUNOS.phk2004-06-111-2/+2
| | | | | We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither a sparc32 port nor a SunOS4.x compatibility desire these days.
* Fix rtprio() to do sensible things when called from threaded processes.julian2004-05-081-4/+45
| | | | | | | | | | | It's not quite correct from a POSIX point of view, but it is a lot better than what was there before. This will be revisited later when we decide what form our priority extensions will take. POSIX doesn't specify how a system scope thread can change its priority so you need to add non-standard extensions to be able to do it. For now make this slightly non standard to allow it to be done. Submitted by: Dan Eischen originally, changed by myself.
* Remove a comment that complains about the lack of %qd, to justifymux2004-04-101-3/+2
| | | | | | truncating a rlim_t to a long. We have %qd since some time now. However, the correct format to use here is %jd and a cast to intmax_t, so do this.
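The `%jd` plus `intmax_t` idiom mentioned above prints a wide signed integer without truncating it to `long`. A minimal sketch, assuming a 64-bit signed limit value; the helper name `format_limit` is hypothetical:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a 64-bit limit value into buf. The cast to
 * intmax_t matches the %jd conversion regardless of how the limit type
 * is defined on a given platform. Returns the snprintf() result, i.e.
 * the number of characters the full value needs.
 */
static int
format_limit(char *buf, size_t len, int64_t limit)
{
	return (snprintf(buf, len, "%jd", (intmax_t)limit));
}
```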
* Remove advertising clause from University of California Regent's license,imp2004-04-051-4/+0
| | | | | | per letter dated July 22, 1999. Approved by: core
* Argh! Fix a bogon. lim_cur() was returning the hard (max) limit ratherjhb2004-02-111-1/+1
| | | | | | than the soft (cur) limit. Submitted by: bde
* - Convert the plimit lock to a pool mutex lock.jhb2004-02-061-3/+3
| | | | | | - Hide struct plimit from userland. Submitted by: bde (2)
* - Correct the translation of old rlimit values to properly handle the oldjhb2004-02-061-21/+28
| | | | | | | | | | | | | RLIM_INFINITY case for ogetrlimit(). - Use %jd and intmax_t to output negative time in usec in calcru(). - Rework getrusage() to make a copy of the rusage struct into a local variable while holding Giant and then do the copyout from the local variable to avoid having to have the original process rusage struct locked while doing the copyout (which would not be safe). This also includes a few style fixes from Bruce to getrusage(). Submitted by: bde (1, parts of 3) Suggested by: bde (2)
* A few more style fixes from Bruce including a few I missed last time.jhb2004-02-061-18/+12
| | | | Submitted by: bde
* - A lot of style and whitespace fixes.jhb2004-02-051-60/+53
| | | | | | - Update a few comments regarding locking notes. Submitted by: bde (1, mostly)
* Locking for the per-process resource limits structure.jhb2004-02-041-64/+159
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
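The accessor pattern described in this commit (one function returning the whole limit pair, plus thin wrappers for the soft and hard values) can be sketched in userland C. The names `limit_set`, `limit_get`, `limit_cur`, and `limit_max` are hypothetical stand-ins, not the kernel's lim_rlimit(), lim_cur(), and lim_max(), and the locking the commit describes is omitted.

```c
#include <stdint.h>

#define NLIMITS	4	/* hypothetical number of resources */

/* Hypothetical stand-in for a soft/hard limit pair. */
struct limit_pair {
	int64_t cur;	/* soft (current) limit */
	int64_t max;	/* hard (maximum) limit */
};

/* Hypothetical stand-in for the per-process limits structure. */
struct limit_set {
	struct limit_pair lim[NLIMITS];
};

/* Return a copy of the whole pair; the caller must pass a valid index. */
static struct limit_pair
limit_get(const struct limit_set *ls, int which)
{
	return (ls->lim[which]);
}

/* Soft limit only. */
static int64_t
limit_cur(const struct limit_set *ls, int which)
{
	return (limit_get(ls, which).cur);
}

/* Hard limit only. */
static int64_t
limit_max(const struct limit_set *ls, int which)
{
	return (limit_get(ls, which).max);
}
```

Routing every read through accessors like these keeps the field selection in one place, so callers cannot accidentally read the hard limit where the soft one was intended.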
* - Don't set td_priority directly here, use sched_prio().jeff2003-10-271-1/+1
|
* Extend the mutex pool implementation to permit the creation and use oftruckman2003-07-131-1/+1
| | | | | | | | | | | | | | | | multiple mutex pools with different options and sizes. Mutex pools can be created with either the default sleep mutexes or with spin mutexes. A dynamically created mutex pool can now be destroyed if it is no longer needed. Create two pools by default, one that matches the existing pool that uses the MTX_NOWITNESS option that should be used for building higher level locks, and a new pool with witness checking enabled. Modify the users of the existing mutex pool to use the appropriate pool in the new implementation. Reviewed by: jhb
* Use __FBSDID().obrien2003-06-111-1/+3
|
* Remove Giant from [gs]etpriority().jhb2003-04-231-6/+0
|
* Lock both the proc lock and sched_lock when calling sched_nice sincejhb2003-04-221-0/+2
| | | | | | kg_nice is now protected by both. Being protected by both means that other places in the kernel that want to read kg_nice only need one of the two locks.
* Add a couple of sched_lock asserts.jhb2003-04-181-0/+2
|
* - Adjust sched hooks for fork and exec to take processes as arguments insteadjeff2003-04-111-1/+1
| | | | | | | | | | of ksegs since they primarily operate on processes. - KSEs take ticks so pass the kse through sched_clock(). - Add a sched_class() routine that adjusts a ksegrp pri class. - Define a sched_fork_{kse,thread,ksegrp} and sched_exit_{kse,thread,ksegrp} that will be used to tell the scheduler about new instances of these structures within the same process. These will be used by THR and KSE. - Change sched_4bsd to reflect this API update.
* Back out previous. The locking here needs a rethink.tjr2003-03-131-15/+3
|
* Acquire sched_lock around use of FOREACH_KSEGRP_IN_PROC, accessestjr2003-03-121-3/+15
| | | | | to kg_nice and calls to sched_nice() in getpriority() and setpriority() (really donice()).
* Remove the PL_SHAREMOD flag from struct plimit, which could have beentjr2003-02-201-3/+1
| | | | | | used to share resource limits between rfork threads, but never was. Removing it makes resource limit locking much simpler -- only the current process can change the contents of the structure that p_limit points to.
* Back out M_* changes, per decision of the TRB.imp2003-02-191-2/+2
| | | | Approved by: trb
* - Move ke_sticks, ke_iticks, ke_uticks, ke_uu, ke_su, and ke_iu back intojeff2003-02-171-74/+60
| | | | | | | the proc. These counters are only examined through calcru. Submitted by: davidxu Tested on: x86, alpha, UP/SMP
* Add an XXX comment noting that getrusage() accesses p_stats->p_rutjr2003-02-131-0/+1
| | | | and p_stats->p_cru without holding the appropriate locks.
* Reversion of commit by Davidxu plus fixes since applied.julian2003-02-011-60/+74
| | | | | | | | I'm not convinced there is anything major wrong with the patch but them's the rules.. I am using my "David's mentor" hat to revert this as he's offline for a while.
* Move UPCALL related data structure out of kse, introduce a newdavidxu2003-01-261-74/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | data structure called kse_upcall to manage UPCALL. All KSE binding and loaning code is gone. A thread that owns an upcall can collect all completed syscall contexts in its ksegrp, turn itself into UPCALL mode, and take those contexts back to userland. Any thread without an upcall structure has to export its contexts and exit at the user boundary. Any thread running in user mode owns an upcall structure. When it enters the kernel, if the kse mailbox's current thread pointer is not NULL, then when the thread is blocked in the kernel, a new UPCALL thread is created and the upcall structure is transferred to the new UPCALL thread. If the kse mailbox's current thread pointer is NULL, then when a thread is blocked in the kernel, no UPCALL thread will be created. Each upcall always has an owner thread. Userland can remove an upcall by calling kse_exit; when all upcalls in a ksegrp are removed, the group is automatically shut down. An upcall owner thread also exits when the process is in the exiting state. When an owner thread exits, the upcall it owns is also removed. KSE is a pure scheduler entity. It represents a virtual cpu. When a thread is running, it always has a KSE associated with it. The scheduler is free to assign a KSE to a thread according to thread priority; if thread priority is changed, the KSE can be moved from one thread to another. When a ksegrp is created, there are always N KSEs created in the group, where N is the number of physical cpus in the current system. This makes it possible that even if a userland UTS is only single CPU safe, threads in the kernel can still execute on different cpus in parallel. Userland calls kse_create to add more upcall structures into a ksegrp to increase concurrency in userland itself; the kernel is not restricted by the number of upcalls userland provides. The code hasn't been tested under SMP by the author due to lack of hardware. Reviewed by: julian
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.alfred2003-01-211-2/+2
| | | | Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
* - Create a new scheduler api that is defined in sys/sched.hjeff2002-10-121-2/+2
| | | | | | | | | | - Begin moving scheduler specific functionality into sched_4bsd.c - Replace direct manipulation of scheduler data with hooks provided by the new api. - Remove KSE specific state modifications and single runq assumptions from kern_switch.c Reviewed by: -arch
* - Move p_cpulimit to struct proc from struct plimit and protect it withjhb2002-10-091-4/+3
| | | | | | | | | | | | | | | sched_lock. This means that we no longer access p_limit in mi_switch() and the p_limit pointer can be protected by the proc lock. - Remove PRS_ZOMBIE check from CPU limit test in mi_switch(). PRS_ZOMBIE processes don't call mi_switch(), and even if they did there is no longer the danger of p_limit being NULL (which is what the original zombie check was added for). - When we bump the current process's soft CPU limit in ast(), just bump the private p_cpulimit instead of the shared rlimit. This fixes an XXX for some value of fix. There is still a (probably benign) bug in that this code doesn't check that the new soft limit exceeds the hard limit. Inspired by: bde (2)