summaryrefslogtreecommitdiffstats
path: root/sys/kern/kern_fork.c
Commit message (Collapse)AuthorAgeFilesLines
* Rip some well duplicated code out of cpu_wait() and cpu_exit() and movepeter2001-09-101-4/+4
| | | | | | | | | | | | it to the MI area. KSE touched cpu_wait() which had the same change replicated five ways for each platform. Now it can just do it once. The only MD parts seemed to be dealing with fpu state cleanup and things like vm86 cleanup on x86. The rest was identical. XXX: ia64 and powerpc did not have cpu_throw(), so I've put a functional stub in place. Reviewed by: jake, tmm, dillon
* Pushdown Giant for acct(), kqueue(), kevent(), execve(), fork(),dillon2001-09-011-4/+15
| | | | vfork(), rfork(), jail().
* Get rid of useless bcopy (the next statement was equivalent)guido2001-07-091-2/+0
|
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approachdillon2001-07-041-3/+2
| | | | | | | | | (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
* Remove the p_spinlocks spin lock count that was obsoleted by thejhb2001-06-301-1/+0
| | | | per-CPU spinlocks list.
* Rename nextpid to lastpid and externalize it.des2001-06-111-7/+7
|
* o Merge contents of struct pcred into struct ucred. Specifically, add therwatson2001-05-251-9/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing original macro that pointed. p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restruction many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparision. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account. Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit
* Introduce a global lock for the vm subsystem (vm_mtx).alfred2001-05-191-0/+2
| | | | | | | | | | | | | | | | | | | vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
* Properly copy the P_ALTSTACK flag in struct proc::p_flag to the childknu2001-05-071-1/+1
| | | | | | | | | | | | | | | | | | process on fork(2). It is the supposed behavior stated in the manpage of sigaction(2), and Solaris, NetBSD and FreeBSD 3-STABLE correctly do so. The previous fix against libc_r/uthread/uthread_fork.c fixed the problem only for the programs linked with libc_r, so back it out and fix fork(2) itself to help those not linked with libc_r as well. PR: kern/26705 Submitted by: KUROSAWA Takahiro <fwkg7679@mb.infoweb.ne.jp> Tested by: knu, GOTOU Yuuzou <gotoyuzo@notwork.org>, and some other people Not objected by: hackers MFC in: 3 days
* Convert the allproc and proctree locks from lockmgr locks to sx locks.jhb2001-03-281-4/+4
|
* Rework the witness code to work with sx locks as well as mutexes.jhb2001-03-281-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Introduce lock classes and lock objects. Each lock class specifies a name and set of flags (or properties) shared by all locks of a given type. Currently there are three lock classes: spin mutexes, sleep mutexes, and sx locks. A lock object specifies properties of an additional lock along with a lock name and all of the extra stuff needed to make witness work with a given lock. This abstract lock stuff is defined in sys/lock.h. The lockmgr constants, types, and prototypes have been moved to sys/lockmgr.h. For temporary backwards compatability, sys/lock.h includes sys/lockmgr.h. - Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin locks held. By making this per-cpu, we do not have to jump through magic hoops to deal with sched_lock changing ownership during context switches. - Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with proc->p_sleeplocks, which is a list of held sleep locks including sleep mutexes and sx locks. - Add helper macros for logging lock events via the KTR_LOCK KTR logging level so that the log messages are consistent. - Add some new flags that can be passed to mtx_init(): - MTX_NOWITNESS - specifies that this lock should be ignored by witness. This is used for the mutex that blocks a sx lock for example. - MTX_QUIET - this is not new, but you can pass this to mtx_init() now and no events will be logged for this lock, so that one doesn't have to change all the individual mtx_lock/unlock() operations. - All lock objects maintain an initialized flag. Use this flag to export a mtx_initialized() macro that can be safely called from drivers. Also, we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness performs the corresponding checks using the initialized flag. - The lock order reversal messages have been improved to output slightly more accurate file and line numbers.
* Don't explicitly zero p_intr_nesting_level and p_aioinfo in fork.jhb2001-03-281-2/+0
|
* Use mtx_intr_enable() on sched_lock to ensure child processes always startjhb2001-03-281-2/+2
| | | | | | with interrupts enabled rather than calling the no-longer MI function enable_intr(). This is bogus anyways and in theory shouldn't even be needed.
* Fix mtx_legal2block. The only time that it is bad to block on a mutex isjhb2001-03-091-0/+4
| | | | | | | | | | | | | | | | if we hold a spin mutex, since we can trivially get into deadlocks if we start switching out of processes that hold spinlocks. Checking to see if interrupts were disabled was a sort of cheap way of doing this since most of the time interrupts were only disabled when holding a spin lock. At least on the i386. To fix this properly, use a per-process counter p_spinlocks that counts the number of spin locks currently held, and instead of checking to see if interrupts are disabled in the witness code, check to see if we hold any spin locks. Since child processes always start up with the sched lock magically held in fork_exit(), we initialize p_spinlocks to 1 for child processes. Note that proc0 doesn't go through fork_exit(), so it starts with no spin locks held. Consulting from: cp
* - Don't hold the proc lock across VREF and the fd* functions to avoid lockjhb2001-03-071-4/+21
| | | | | | order reversals. - Add some preliminary locking in the !RF_PROC case. - Protect p_estcpu with sched_lock.
* - Lock the forklist with an sx lock.jhb2001-03-071-14/+57
| | | | | | | | | | - Add proc locking to fork1(). Always lock the child procoess (new process) first when both processes need to be locked at the same time. - Remove unneeded spl()'s as the data they protected is now locked. - Ensure that the proctree is exclusively locked and the new process is locked when setting up the parent process pointer. - Lock the check for P_KTHREAD in p_flag in fork_exit().
* Sigh. Try to get priorities sorted out. Don't bother trying tojake2001-02-281-1/+0
| | | | | | | | | | | update native priority, it is diffcult to get right and likely to end up horribly wrong. Use an honestly wrong fixed value that seems to work; PUSER for user threads, and the interrupt priority for ithreads. Set it once when the process is created and forget about it. Suggested by: bde Pointy hat: me
* Initialize native priority to PRI_MAX. It was usually 0 which made ajake2001-02-261-0/+1
| | | | | | | | process's priority go through the roof when it released a (contested) mutex. Only set the native priority in mtx_lock if hasn't already been set. Reviewed by: jhb
* Quiet a warning with a uintptr_t cast.jhb2001-02-221-1/+1
| | | | Noticed by: bde
* o Move per-process jail pointer (p->pr_prison) to inside of the subjectrwatson2001-02-211-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | credential structure, ucred (cr->cr_prison). o Allow jail inheritence to be a function of credential inheritence. o Abstract prison structure reference counting behind pr_hold() and pr_free(), invoked by the similarly named credential reference management functions, removing this code from per-ABI fork/exit code. o Modify various jail() functions to use struct ucred arguments instead of struct proc arguments. o Introduce jailed() function to determine if a credential is jailed, rather than directly checking pointers all over the place. o Convert PRISON_CHECK() macro to prison_check() function. o Move jail() function prototypes to jail.h. o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the flag in the process flags field itself. o Eliminate that "const" qualifier from suser/p_can/etc to reflect mutex use. Notes: o Some further cleanup of the linux/jail code is still required. o It's now possible to consider resolving some of the process vs credential based permission checking confusion in the socket code. o Mutex protection of struct prison is still not present, and is required to protect the reference count plus some fields in the structure. Reviewed by: freebsd-arch Obtained from: TrustedBSD Project
* - Don't call clear_resched() in userret(), instead, clear the resched flagjhb2001-02-201-0/+6
| | | | | | | | | | | | in mi_switch() just before calling cpu_switch() so that the first switch after a resched request will satisfy the request. - While I'm at it, move a few things into mi_switch() and out of cpu_switch(), specifically set the p_oncpu and p_lastcpu members of proc in mi_switch(), and handle the sched_lock state change across a context switch in mi_switch(). - Since cpu_switch() no longer handles the sched_lock state change, we have to setup an initial state for sched_lock in fork_exit() before we release it.
* o Export the nextpid variable via SYSCTL as kern.lastpid, decreasing byrwatson2001-02-121-0/+2
| | | | | | | | one the number of variables needed for top and other setgid kmem utilities that could only be accessed via /dev/kmem previously. Submitted by: Thomas Moestl <tmoestl@gmx.net> Reviewed by: freebsd-audit
* Change and clean the mutex lock interface.bmilekic2001-02-091-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)
* Fix fork_exit() to take a pointer to a function that returns void as itsjhb2001-01-261-2/+2
| | | | | | first argument rather than a function that returns a void *. Noticed by: jake
* - Change fork_exit() to take a pointer to a trapframe as its 3rd argumentjhb2001-01-241-2/+2
| | | | | | | instead of a trapframe directly. (Requested by bde.) - Convert the alpha switch_trampoline to call fork_exit() and use the MI fork_return() instead of child_return(). - Axe child_return().
* - Catch up to proc flag changes.jhb2001-01-241-4/+79
| | | | - Add new fork_exit() and fork_return() MI C functions.
* Add mibs to hold the number of forks since boot. New mibs are:ume2001-01-231-0/+15
| | | | | | | | | | | | | | vm.stats.vm.v_forks vm.stats.vm.v_vforks vm.stats.vm.v_rforks vm.stats.vm.v_kthreads vm.stats.vm.v_forkpages vm.stats.vm.v_vforkpages vm.stats.vm.v_rforkpages vm.stats.vm.v_kthreadpages Submitted by: Paul Herman <pherman@frenchfries.net> Reviewed by: alfred
* Make intr_nesting_level per-process, rather than per-cpu. Setupjake2001-01-211-0/+1
| | | | | | | | interrupt threads to run with it always >= 1, so that malloc can detect M_WAITOK from "interrupt" context. This is also necessary in order to context switch from sched_ithd() directly. Reviewed By: peter
* Protect proc.p_pptr and proc.p_children/p_sibling with thejake2000-12-231-0/+2
| | | | | | | | proctree_lock. linprocfs not locked pending response from informal maintainer. Reviewed by: jhb, -smp@
* - Change the allproc_lock to use a macro, ALLPROC_LOCK(how), insteadjake2000-12-131-2/+2
| | | | | | | | of explicit calls to lockmgr. Also provides macros for the flags pased to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks. - Add some locking that I missed the first time.
* Whitespace. Fix indentation, align comments.jake2000-12-041-17/+17
|
* - Add a mutex to the proc structure p_mtx that will be used to lock accessesjhb2000-12-031-0/+1
| | | | | to each individual proc. - Initialize the lock during fork1(), and destroy it in wait1().
* Remove thr_sleep and thr_wakeup. Remove fields p_nthread and p_wakeupjake2000-12-021-2/+0
| | | | | | | | | | from struct proc, which are now unused (p_nthread already was). Remove process flag P_KTHREADP which was untested and only set in vfs_aio.c (it should use kthread_create). Move the yield system call to kern_synch.c as kern_threads.c has been removed completely. moral support from: alfred, jhb
* Use an mp-safe callout for endtsleep.jake2000-12-011-1/+1
|
* Use callout_reset instead of timeout(9). Most callouts are staticallyjake2000-11-271-0/+3
| | | | | | allocated, 2 have been added to struct proc for setitimer and sleep. Reviewed by: jhb, jlemon
* Protect the following with a lockmgr lock:jake2000-11-221-6/+8
| | | | | | | | | | | | allproc zombproc pidhashtbl proc.p_list proc.p_hash nextpid Reviewed by: jhb Obtained from: BSD/OS and netbsd
* Catch up to moving headers:jhb2000-10-201-2/+1
| | | | | - machine/ipl.h -> sys/ipl.h - machine/mutex.h -> sys/mutex.h
* Enforce process limit policy in one place to keep proccnt from divergingtruckman2000-09-141-2/+2
| | | | from reality.
* Major update to the way synchronization is done in the kernel. Highlightsjasone2000-09-071-25/+55
| | | | | | | | | | | | | | | include: * Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.) * Per-CPU idle processes. * Interrupts are run in their own separate kernel threads and can be preempted (i386 only). Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
* Remove uidinfo hash table lookup and maintenance out of chgproccnt() andtruckman2000-09-051-1/+3
| | | | | | | | | | | | | | chgsbsize(), which are called rather frequently and may be called from an interrupt context in the case of chgsbsize(). Instead, do the hash table lookup and maintenance when credentials are changed, which is a lot less frequent. Add pointers to the uidinfo structures to the ucred and pcred structures for fast access. Pass a pointer to the credential to chgproccnt() and chgsbsize() instead of passing the uid. Add a reference count to the uidinfo structure and use it to decide when to free the structure rather than freeing the structure when the resource consumption drops to zero. Move the resource tracking code from kern_proc.c to kern_resource.c. Move some duplicate code sequences in kern_prot.c to separate helper functions. Change KASSERTs in this code to unconditional tests and calls to panic().
* Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.phk2000-07-041-1/+1
| | | | Pointed out by: bde
* Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:phk2000-07-031-1/+1
| | | | | | | | Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our sources: -sysctl_vm_zone SYSCTL_HANDLER_ARGS +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
* Add sysctl descriptions to a few sysctls. Simply "documentation".nbm2000-06-261-1/+2
| | | | | PR: kern/8015 Submitted by: Stefan Eggers <seggers@semyam.dinoco.de>
* fix races in the uidinfo subsystem, several problems existed:alfred2000-06-221-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | 1) while allocating a uidinfo struct malloc is called with M_WAITOK, it's possible that while asleep another process by the same user could have woken up earlier and inserted an entry into the uid hash table. Having redundant entries causes inconsistancies that we can't handle. fix: do a non-waiting malloc, and if that fails then do a blocking malloc, after waking up check that no one else has inserted an entry for us already. 2) Because many checks for sbsize were done as "test then set" in a non atomic manner it was possible to exceed the limits put up via races. fix: instead of querying the count then setting, we just attempt to set the count and leave it up to the function to return success or failure. 3) The uidinfo code was inlining and repeating, lookups and insertions and deletions needed to be in their own functions for clarity. Reviewed by: green
* Back out the previous change to the queue(3) interface.jake2000-05-261-2/+2
| | | | | | It was not discussed and should probably not happen. Requested by: msmith and others
* Change the way that the queue(3) structures are declared; don't assume thatjake2000-05-231-2/+2
| | | | | | | | the type argument to *_HEAD and *_ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd
* Introduce kqueue() and kevent(), a kernel event notification facility.jlemon2000-04-161-0/+5
|
* Put on my asbestos underwear and commit the patch that I posted to -archpeter1999-12-061-3/+36
| | | | | | | | | some time ago that changes kern.randompid from a boolean to a randomness range for the next pid assigment. Too high causes a lot of extra work to scan for free pids, and too low merely wastes randomness entropy. It's still possible to select a completely random range by using PID_MAX (100k) or -1 as a shortcut to mean "the whole range". Also, don't waste randomness when doing a wraparound.
* User ldt sharing.luoqi1999-12-061-10/+1
|
* Introduce OpenBSD-like Random PIDs. Controlled by a sysctl knobdan1999-11-281-2/+5
| | | | | | | | (kern.randompid), which is currently defaulted off. Use ARC4 (RC4) for our random number generation, which will not get me executed for violating crypto laws; a Good Thing(tm). Reviewed and Approved by: bde, imp
OpenPOWER on IntegriCloud