path: root/sys/kern/kern_resource.c
Commit message | Author | Age | Files | Lines
* o Move suser() calls in kern/ to using suser_xxx() with an explicit | rwatson | 2001-11-01 | 1 | -2/+3
  credential selection, rather than reference via a thread or process pointer. This is part of a gradual migration to suser() accepting a struct ucred instead of a struct proc, simplifying the reference and locking semantics of suser().
  Obtained from: TrustedBSD Project
* Adjust printfs to be time_t agnostic. | dillon | 2001-10-28 | 1 | -3/+3
* Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader | ps | 2001-10-10 | 1 | -9/+8
  tunable.
  Reviewed by: peter
  MFC after: 2 weeks
* KSE Milestone 2 | julian | 2001-09-12 | 1 | -99/+124
  Note ALL MODULES MUST BE RECOMPILED
  Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
  Sorry john! (your next MFC will be a doosie!)
  Reviewed by: peter@freebsd.org, dillon@freebsd.org
  X-MFC after: ha ha ha ha
* Use sched_lock to protect rtp_to_pri() and pri_to_rtp() when needed. | jhb | 2001-09-02 | 1 | -0/+4
* Giant Pushdown. Saved the worst P4 tree breakage for last. | dillon | 2001-09-01 | 1 | -21/+81
  reboot() getpriority() setpriority() rtprio() osetrlimit() ogetrlimit() setrlimit() getrlimit() getrusage() getpid() getppid() getpgrp() getpgid() getsid() getgid() getegid() getgroups() setsid() setpgid() setuid() seteuid() setgid() setegid() setgroups() setreuid() setregid() setresuid() setresgid() getresuid() getresgid() __setugid() getlogin() setlogin() modnext() modfnext() modstat() modfind() kldload() kldunload() kldfind() kldnext() kldstat() kldfirstmod() kldsym() getdtablesize() dup2() dup() fcntl() close() ofstat() fstat() nfsstat() fpathconf() flock()
* add prototype for dosetrlimit | assar | 2001-07-22 | 1 | -2/+0
* o Replace calls to p_can(..., P_CAN_xxx) with calls to p_canxxx(). | rwatson | 2001-07-05 | 1 | -9/+9
    The p_can(...) construct was a premature (and, it turns out, awkward) abstraction. The individual calls to p_canxxx() better reflect differences between the inter-process authorization checks, such as differing checks based on the type of signal. This has a side effect of improving code readability.
  o Replace direct credential authorization checks in ktrace() with invocation of p_candebug(), while maintaining the special case check of KTR_ROOT. This allows ktrace() to "play more nicely" with new mandatory access control schemes, as well as making its authorization checks consistent with other "debugging class" checks.
  o Eliminate "privused" construct for p_can*() calls which allowed the caller to determine if privilege was required for successful evaluation of the access control check. This primitive is currently unused, and as such, serves only to complicate the API.
  Approved by: ({procfs,linprocfs} changes) des
  Obtained from: TrustedBSD Project
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approach | dillon | 2001-07-04 | 1 | -2/+2
  (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
* Introduce a global lock for the vm subsystem (vm_mtx). | alfred | 2001-05-19 | 1 | -0/+2
  vm_mtx does not recurse and is required for most low level vm operations.
  faults can not be taken without holding Giant.
  Memory subsystems can now call the base page allocators safely.
  Almost all atomic ops were removed as they are covered under the vm mutex.
  Alpha and ia64 now need to catch up to i386's trap handlers.
  FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
  Reviewed (partially) by: jake, jhb
* Make rtprio work again. | jake | 2001-04-29 | 1 | -19/+14
  - add a missing break which caused RTP_SET to always return EINVAL
  - break instead of returning if p_can fails so proc_lock is always dropped correctly
  - only copyin data that is actually needed
  - use break instead of goto
  - make rtp_to_pri return EINVAL instead of -1 if the values are out of range so we don't have to translate
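The first two items above describe a common shape for lock-protected switch statements: each case leaves with break so that one unlock at the bottom always runs. The following is a minimal userland sketch of that shape, not the kernel code itself. The lock counter, the request codes, and handle_request() are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a process lock; counts outstanding holds. */
static int lock_depth;
static void proc_lock_sketch(void)   { lock_depth++; }
static void proc_unlock_sketch(void) { lock_depth--; }

/* Hypothetical request codes, loosely modelled on a lookup/set pair. */
enum { OP_LOOKUP, OP_SET };

/*
 * Every case leaves the switch with `break`, so the single unlock
 * below runs on success and on failure alike.
 */
static int
handle_request(int op, int permitted)
{
	int error = 0;

	proc_lock_sketch();
	switch (op) {
	case OP_LOOKUP:
		break;
	case OP_SET:
		if (!permitted) {
			error = EPERM;
			break;	/* not `return`: the lock must still be dropped */
		}
		break;		/* without this break, control falls into default */
	default:
		error = EINVAL;
		break;
	}
	proc_unlock_sketch();
	return (error);
}
```

Calling handle_request(OP_SET, 0) returns EPERM and leaves lock_depth at zero, which is the property the "break instead of returning" item is after.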
* Change the pfind() and zpfind() functions to lock the process that they | jhb | 2001-04-24 | 1 | -24/+36
  find before releasing the allproc lock and returning.
  Reviewed by: -smp, dfr, jake
* o Limit process information leakage by introducing a p_can(...P_CAN_SEE...) | rwatson | 2001-04-12 | 1 | -0/+2
  in rtprio()'s RTP_LOOKUP implementation.
  Obtained from: TrustedBSD Project
* Convert the allproc and proctree locks from lockmgr locks to sx locks. | jhb | 2001-03-28 | 1 | -4/+5
* Catch up to header include changes: | jhb | 2001-03-28 | 1 | -2/+2
  - <sys/mutex.h> now requires <sys/systm.h>
  - <sys/mutex.h> and <sys/sx.h> now require <sys/lock.h>
* Don't call malloc with M_WAITOK while holding a mutex. | alfred | 2001-03-09 | 1 | -22/+21
* Backout previous commit. sched_lock is held, thus interrupts are prevented | tegge | 2001-02-22 | 1 | -14/+6
  here.
  Submitted by: jhb
* Protect update of the per processor switchtime variable against | tegge | 2001-02-22 | 1 | -6/+14
  interrupts. Protect usage of the per processor switchtime variable against interrupts in calcru(). This seems to eliminate the "microuptime() went backwards" warnings.
* Ensure that RLIMIT_NPROC limits are at least 1 to avoid bad interaction | tegge | 2001-02-20 | 1 | -0/+4
  with chgproccnt. MFC candidate.
  Reviewed by: alfred
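A floor of this kind can be sketched as a small clamp applied before a new limit is stored. This is only an illustration under assumptions: struct limit_pair and enforce_min_nproc() are hypothetical names, and the sketch guesses that both the current and the maximum value are raised, which the commit message does not spell out.

```c
#include <assert.h>

/* Hypothetical pair of soft (cur) and hard (max) limit values. */
struct limit_pair {
	long long cur;
	long long max;
};

/* Raise either value to 1 if a caller asked for something smaller. */
static void
enforce_min_nproc(struct limit_pair *lim)
{
	if (lim->cur < 1)
		lim->cur = 1;
	if (lim->max < 1)
		lim->max = 1;
}
```

Values that are already 1 or larger pass through unchanged, so the clamp only affects requests that would have set the process-count limit to zero or below.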
* Implement a unified run queue and adjust priority levels accordingly. | jake | 2001-02-12 | 1 | -15/+49
  - All processes go into the same array of queues, with different scheduling classes using different portions of the array. This allows user processes to have their priorities propagated up into interrupt thread range if need be.
  - I chose 64 run queues as an arbitrary number that is greater than 32. We used to have 4 separate arrays of 32 queues each, so this may not be optimal. The new run queue code was written with this in mind; changing the number of run queues only requires changing constants in runq.h and adjusting the priority levels.
  - The new run queue code takes the run queue as a parameter. This is intended to be used to create per-cpu run queues. Implement wrappers for compatibility with the old interface which pass in the global run queue structure.
  - Group the priority level, user priority, native priority (before propagation) and the scheduling class into a struct priority.
  - Change any hard coded priority levels that I found to use symbolic constants (TTIPRI and TTOPRI).
  - Remove the curpriority global variable and use that of curproc. This was used to detect when a process' priority had lowered and it should yield. We now effectively yield on every interrupt.
  - Activate propogate_priority(). It should now have the desired effect without needing to also propagate the scheduling class.
  - Temporarily comment out the call to vm_page_zero_idle() in the idle loop. It interfered with propogate_priority() because the idle process needed to do a non-blocking acquire of Giant and then other processes would try to propagate their priority onto it. The idle process should not do anything except idle. vm_page_zero_idle() will return in the form of an idle priority kernel thread which is woken up at appropriate times by the vm system.
  - Update struct kinfo_proc to the new priority interface. Deliberately change its size by adjusting the spare fields. It remained the same size, but the layout has changed, so userland processes that use it would parse the data incorrectly. The size constraint should really be changed to an arbitrary version number. Also add a debug.sizeof sysctl node for struct kinfo_proc.
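The idea of one array of queues that is passed in as a parameter can be sketched in userland as follows. Only the figure of 64 queues comes from the message above. The 256 priority levels, the bitmap, the rule that a lower index is served first, and every name ending in _sketch are assumptions made for this illustration, not the contents of runq.h.

```c
#include <assert.h>
#include <stdint.h>

#define NQUEUES_SKETCH     64	/* from the commit message */
#define NPRIORITIES_SKETCH 256	/* assumption for this sketch */
#define PRI_PER_QUEUE      (NPRIORITIES_SKETCH / NQUEUES_SKETCH)

struct runq_sketch {
	uint64_t status;		/* bit i set: queue i is non-empty */
	int count[NQUEUES_SKETCH];	/* entries per queue (stand-in for lists) */
};

/* Map a priority level onto its queue index. */
static int
pri_to_queue(int pri)
{
	assert(pri >= 0 && pri < NPRIORITIES_SKETCH);
	return (pri / PRI_PER_QUEUE);
}

/* The queue is a parameter, so per-CPU instances would use the same code. */
static void
runq_sketch_add(struct runq_sketch *rq, int pri)
{
	int q = pri_to_queue(pri);

	rq->count[q]++;
	rq->status |= (uint64_t)1 << q;
}

/* Return the lowest-numbered non-empty queue, or -1 if all are empty. */
static int
runq_sketch_best(const struct runq_sketch *rq)
{
	for (int q = 0; q < NQUEUES_SKETCH; q++)
		if (rq->status & ((uint64_t)1 << q))
			return (q);
	return (-1);
}
```

Changing the number of queues in this sketch only means changing the two constants, which mirrors the point the second bullet makes about runq.h.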
* Change and clean the mutex lock interface. | bmilekic | 2001-02-09 | 1 | -20/+20
  mtx_enter(lock, type) becomes:
    mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
    mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
  similarly, for releasing a lock, we now have:
    mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
  We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument.
  The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind.
  Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH
  The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers:
    mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively.
  Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case.
  Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
  Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those.
  Finally, caught up to the interface changes in all sys code.
  Contributors: jake, jhb, jasone (in no particular order)
* - Add a mtx_assert() for sched_lock in calcru(). | jhb | 2001-01-24 | 1 | -0/+3
  - Protect calcru() with sched_lock later on in the file when it is called.
* Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables | jake | 2001-01-10 | 1 | -4/+5
  other than curproc.
* - Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead | jake | 2000-12-13 | 1 | -4/+4
    of explicit calls to lockmgr. Also provides macros for the flags passed to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks.
  - Add some locking that I missed the first time.
* Translate alfred to english. | alfred | 2000-12-01 | 1 | -33/+27
  Submitted by: bde
* use an opportunistic locking strategy with the uidinfo structures to avoid | alfred | 2000-11-30 | 1 | -4/+55
  locking the global hash on each uifree()
  make struct uidinfo only visible to the kernel
  make uihold() a function rather than a macro to reduce bloat
  swap the order of a spl/mutex to maintain consistency
* Make uidinfo subsystem mpsafe | alfred | 2000-11-26 | 1 | -22/+45
  use a mutex lock when looking up/deleting entries on the hashlist
  use a mutex lock on each uidinfo when updating fields
  make uifree() a void function rather than 'int' since no one cares
  allocate uidinfo structs with the M_ZERO flag and don't explicitly initialize them
  Assisted by: eivind, jhb, jakeb
* Protect the following with a lockmgr lock: | jake | 2000-11-22 | 1 | -0/+4
    allproc
    zombproc
    pidhashtbl
    proc.p_list
    proc.p_hash
    nextpid
  Reviewed by: jhb
  Obtained from: BSD/OS and netbsd
* Add new line character to debugging printf's. | ps | 2000-09-18 | 1 | -4/+4
* Major update to the way synchronization is done in the kernel. Highlights | jasone | 2000-09-07 | 1 | -1/+1
  include:
    * Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.)
    * Per-CPU idle processes.
    * Interrupts are run in their own separate kernel threads and can be preempted (i386 only).
  Partially contributed by: BSDi (BSD/OS)
  Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
* Change the calls to panic() in uifree(), chgproccnt(), and chgsbsize() | truckman | 2000-09-06 | 1 | -4/+4
  to printf(). Any errors detected are not likely to be fatal, so it should be safe to let things keep running.
* Remove uidinfo hash table lookup and maintenance out of chgproccnt() and | truckman | 2000-09-05 | 1 | -0/+142
  chgsbsize(), which are called rather frequently and may be called from an interrupt context in the case of chgsbsize(). Instead, do the hash table lookup and maintenance when credentials are changed, which is a lot less frequent. Add pointers to the uidinfo structures to the ucred and pcred structures for fast access. Pass a pointer to the credential to chgproccnt() and chgsbsize() instead of passing the uid. Add a reference count to the uidinfo structure and use it to decide when to free the structure rather than freeing the structure when the resource consumption drops to zero. Move the resource tracking code from kern_proc.c to kern_resource.c. Move some duplicate code sequences in kern_prot.c to separate helper functions. Change KASSERTs in this code to unconditional tests and calls to panic().
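The lifetime rule described here, where a structure is freed when its last reference goes away rather than when its usage counter reaches zero, can be sketched as below. This is a userland illustration only. The struct layout and the ui_alloc(), ui_hold(), and ui_release() helpers are hypothetical names and are not the kernel's uidinfo API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-user accounting record. */
struct uidinfo_sketch {
	unsigned uid;
	long proccnt;	/* resource consumption; may drop to zero and rise again */
	unsigned ref;	/* number of credentials pointing at this record */
};

/* Allocate a zeroed record that starts with one reference. */
static struct uidinfo_sketch *
ui_alloc(unsigned uid)
{
	struct uidinfo_sketch *ui = calloc(1, sizeof(*ui));

	if (ui != NULL) {
		ui->uid = uid;
		ui->ref = 1;
	}
	return (ui);
}

/* Take an additional reference. */
static void
ui_hold(struct uidinfo_sketch *ui)
{
	ui->ref++;
}

/* Drop a reference; free on the last one. Returns 1 if the record was freed. */
static int
ui_release(struct uidinfo_sketch *ui)
{
	assert(ui->ref > 0);
	if (--ui->ref == 0) {
		free(ui);
		return (1);
	}
	return (0);
}
```

In this sketch a record with proccnt equal to zero survives as long as something still holds a reference, so a frequently called accounting routine can follow a pointer instead of repeating a hash lookup.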
* o Centralize inter-process access control, introducing: | rwatson | 2000-08-30 | 1 | -10/+11
      int p_can(p1, p2, operation, privused)
    which allows specification of subject process, object process, inter-process operation, and an optional call-by-reference privused flag, allowing the caller to determine if privilege was required for the call to succeed. This allows jail, kern.ps_showallprocs and regular credential-based interaction checks to occur in one block of code. Possible operations are P_CAN_SEE, P_CAN_SCHED, P_CAN_KILL, and P_CAN_DEBUG. p_can currently breaks out as a wrapper to a series of static function checks in kern_prot, which should not be invoked directly.
  o Commented out capabilities entries are included for some checks.
  o Update most inter-process authorization to make use of p_can() instead of manual checks, PRISON_CHECK(), P_TRESPASS(), and kern.ps_showallprocs.
  o Modify suser{,_xxx} to use const arguments, as it no longer modifies process flags due to the disabling of ASU.
  o Modify some checks/errors in procfs so that ENOENT is returned instead of ESRCH, further improving concealment of processes that should not be visible to other processes. Also introduce new access checks to improve hiding of processes for procfs_lookup(), procfs_getattr(), procfs_readdir(). Correct a bug reported by bp concerning not handling the CREATE case in procfs_lookup(). Remove volatile flag in procfs that caused apparently spurious qualifier warnings (approved by bde).
  o Add comment noting that ktrace() has not been updated, as its access control checks are different from ptrace(), whereas they should probably be the same. Further discussion should happen on this topic.
  Reviewed by: bde, green, phk, freebsd-security, others
  Approved by: bde
  Obtained from: TrustedBSD Project
* Revert the suser -> suser_xxx change made previously. It was right | green | 2000-08-24 | 1 | -1/+1
  before.
* Fix a couple cases where p_trespass wasn't transitioned into place. | green | 2000-08-16 | 1 | -9/+3
  Make RTP_SET (rtprio) only accessible to real root, not root in jails.
* fix a typo | phk | 2000-06-10 | 1 | -1/+1
* o Modify jail to limit creation of sockets to UNIX domain sockets, | rwatson | 2000-06-04 | 1 | -5/+13
    TCP/IP (v4) sockets, and routing sockets. Previously, interaction with IPv6 was not well-defined, and might be inappropriate for some environments. Similarly, sysctl MIB entries providing interface information also give out only addresses from those protocol domains.
    For the time being, this functionality is enabled by default, and toggleable using the sysctl variable jail.socket_unixiproute_only. In the future, protocol domains will be able to determine whether or not they are ``jail aware''.
  o Further limitations on process use of getpriority() and setpriority() by jailed processes. Addresses problem described in kern/17878.
  Reviewed by: phk, jmg
* Don't try to account for the partial quantum unless the process is | phk | 2000-02-15 | 1 | -4/+0
  curproc. This only makes any difference on SMP, where we used a (potentially very bogus) switchtime from our own CPU to calculate resource usage on another CPU.
  This should remove some if not all calcru() related warnings on SMP.
  Approved by: jkh
* Fix a bug that could crash the system if you press ^T while a slower | green | 2000-01-28 | 1 | -17/+16
  system is slowed down and in the right spot (a race condition in fork()).
  The "previous time" fields have moved from pstat to proc. Anything which uses KVM needs to be recompiled with a new libkvm/headers.
  A couple wacky u_quad_t's in struct proc are now u_int64_t (the same, but according to lack of 'quad's in proc.h and usage in kern_resource.c). This will have no effect on code.
  This has been make-world-and-installed-new-kernel-which-works-fine-tested.
  Reviewed by: bde (previous version)
* Add a bit of sanity checking and problem avoidance in case the | phk | 1999-11-29 | 1 | -2/+8
  timecounter hardware is bogus. This will produce a new warning "microuptime() went backwards" and try to not screw up the process resource accounting.
* This is a partial commit of the patch from PR 14914: | phk | 1999-11-16 | 1 | -6/+4
  A lot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures.
  This batch of changes compiles to the same object files.
  Reviewed by: phk
  Submitted by: Jake Burkholder <jake@checker.org>
  PR: 14914
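As an illustration of the style this patch moves toward, here is a small userland sketch that walks a list through the queue(3) macros instead of touching the head and entry fields by hand. The struct item type and sum_items() are hypothetical names made up for the example. LIST_HEAD, LIST_ENTRY, LIST_INIT, LIST_INSERT_HEAD, and LIST_FOREACH are the macros from <sys/queue.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical list element carrying one integer. */
struct item {
	int value;
	LIST_ENTRY(item) link;	/* linkage managed only through the macros */
};

LIST_HEAD(item_list, item);

/*
 * Walk the list with LIST_FOREACH rather than following the
 * lh_first and le_next pointers directly.
 */
static int
sum_items(struct item_list *head)
{
	struct item *it;
	int sum = 0;

	LIST_FOREACH(it, head, link)
		sum += it->value;
	return (sum);
}
```

A caller would set up the list with LIST_INIT(&head) and add elements with LIST_INSERT_HEAD(&head, &elem, link). Because the macros expand to the same pointer operations, a conversion like this plausibly leaves the generated object code unchanged, which matches the note above.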
* useracc() the prequel: | phk | 1999-10-29 | 1 | -1/+0
  Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
  This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
* Trim unused options (or #ifdef for undoc options). | peter | 1999-10-11 | 1 | -1/+0
  Submitted by: phk
* $Id$ -> $FreeBSD$ | peter | 1999-08-28 | 1 | -1/+1
* This Implements the mumbled about "Jail" feature. | phk | 1999-04-28 | 1 | -2/+2
  This is a seriously beefed up chroot kind of thing. The process is jailed along the same lines as a chroot does it, but with additional tough restrictions imposed on what the superuser can do.
  For all I know, it is safe to hand over the root bit inside a prison to the customer living in that prison, this is what it was developed for in fact: "real virtual servers".
  Each prison has an ip number associated with it, which all IP communications will be coerced to use and each prison has its own hostname.
  Needless to say, you need more RAM this way, but the advantage is that each customer can run their own particular version of apache and not stomp on the toes of their neighbors.
  It generally does what one would expect, but setting up a jail still takes a little knowledge.
  A few notes:
    I have no scripts for setting up a jail, don't ask me for them.
    The IP number should be an alias on one of the interfaces.
    mount a /proc in each jail, it will make ps more useable.
    /proc/<pid>/status tells the hostname of the prison for jailed processes.
    Quotas are only sensible if you have a mountpoint per prison.
    There are no provisions for stopping resource-hogging.
    Some "#ifdef INET" and similar may be missing (send patches!)
  If somebody wants to take it from here and develop it into more of a "virtual machine" they should be most welcome!
  Tools, comments, patches & documentation most welcome.
  Have fun...
  Sponsored by: http://www.rndassociates.com/
  Run for almost a year by: http://www.servetheweb.com/
* Change suser_xxx() to suser() where it applies. | phk | 1999-04-27 | 1 | -3/+3
* Suser() simplification: | phk | 1999-04-27 | 1 | -4/+4
  1: s/suser/suser_xxx/
  2: Add new function: suser(struct proc *), prototyped in <sys/proc.h>.
  3: s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/
  The remaining suser_xxx() calls will be scrutinized and dealt with later.
  There may be some unneeded #include <sys/cred.h>, but they are left as an exercise for Bruce.
  More changes to the suser() API will come along with the "jail" code.
* Enforce monotonicity of apparent process user, system and interrupt times. | bde | 1999-03-13 | 1 | -22/+51
  PR: 975, 10402
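One simple way to keep reported times from moving backwards is to remember the last value handed out and never report less than it. The sketch below shows only that clamping idea, under the assumption that the previous values are stored per process. It is not the calcru() algorithm, and struct reported_times, keep_monotonic(), and report_times() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of the values last reported, in microseconds. */
struct reported_times {
	uint64_t user;
	uint64_t sys;
	uint64_t intr;
};

/* Return max(*prev, computed) and remember it for the next call. */
static uint64_t
keep_monotonic(uint64_t *prev, uint64_t computed)
{
	if (computed < *prev)
		computed = *prev;
	*prev = computed;
	return (computed);
}

/* Clamp all three freshly computed times against the previous report. */
static void
report_times(struct reported_times *prev, uint64_t user, uint64_t sys,
    uint64_t intr, struct reported_times *out)
{
	out->user = keep_monotonic(&prev->user, user);
	out->sys = keep_monotonic(&prev->sys, sys);
	out->intr = keep_monotonic(&prev->intr, intr);
}
```

If a later estimate splits the same runtime differently and the user share comes out smaller than before, this sketch keeps reporting the earlier, larger user value.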
* Fixed runtime accounting. The time since the previous context switch | bde | 1999-03-11 | 1 | -12/+1
  was discarded on every call to calcru(). Hacking on the `switchtime' global for a related fix in rev.1.38 of kern_resource.c was too fragile and broke when p_switchtime went away.
  PR: 10402
* The magic "no-cpu" cpu number is 0xff. Don't misrepresent cpu | bde | 1999-03-05 | 1 | -2/+2
  numbers as chars or use bogus casts in an attempt to unmisrepresent them. In top, don't assume that 0xff is the only negative cpu number when cpu numbers are (mis)represented.
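The hazard with storing such a marker in a signed character type can be shown in a few lines. This is a sketch only: NOCPU_SKETCH and both helper functions are names invented here, and the second helper deliberately reproduces the failing comparison.

```c
#include <assert.h>

#define NOCPU_SKETCH 0xff	/* the "no-cpu" value named in the commit message */

/* With an unsigned 8-bit type, the marker compares equal to 0xff. */
static int
is_nocpu_unsigned(unsigned char cpu)
{
	int widened = cpu;	/* 0..255 */

	return (widened == NOCPU_SKETCH);
}

/*
 * With a signed 8-bit type, the bit pattern 0xff is the value -1,
 * so after widening to int it no longer matches 255.
 */
static int
is_nocpu_signed(signed char cpu)
{
	int widened = cpu;	/* -128..127 */

	return (widened == NOCPU_SKETCH);
}
```

Passing 0xff to the unsigned version yields 1, while passing -1 (the same bit pattern on two's-complement machines) to the signed version yields 0.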