path: root/sys/kern/kern_fork.c
Commit message | Author | Age | Files | Lines
* Always set a process' state to normal when it is fully constructed in fork1() rather than only doing it for the RFSTOPPED case and then having to fix it up in other places later on.
  [jhb, 2004-02-05, 1 file, -5/+9]
* Locking for the per-process resource limits structure.

  - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock.
  - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it.
  - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything.
  - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that returns either an rlimit, or the current or max individual limit of the specified resource from a process.
  - dosetrlimit() was renamed to kern_setrlimit() to match the existing style of other similar syscall helper functions.
  - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead.
  - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead.
  - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant.
  - The p_rlimit macro no longer exists.

  Submitted by: mtm (mostly, I only did a few cleanups and catchups)
  Tested on: i386
  Compiled on: alpha, amd64
  [jhb, 2004-02-04, 1 file, -5/+4]
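The copy-on-write scheme described in the resource-limits entry above can be illustrated in userspace. This is only a sketch under stated assumptions, not the kernel code: the `*_sketch` names, the `NLIMITS` constant, and the use of a pthread mutex in place of a kernel mutex are all hypothetical stand-ins. It shows why holding a reference is enough to read: a shared structure is never modified in place, and a writer publishes a private copy instead.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NLIMITS 4                 /* hypothetical; stands in for the real limit count */

struct rlim_sketch { long cur, max; };

/* Userspace stand-in for struct plimit: the limits plus a locked refcount. */
struct plimit_sketch {
    struct rlim_sketch pl_rlimit[NLIMITS];
    int                pl_refcnt;
    pthread_mutex_t    pl_mtx;    /* protects pl_refcnt only */
};

struct plimit_sketch *lim_alloc_sketch(void)
{
    struct plimit_sketch *l = calloc(1, sizeof(*l));
    if (l == NULL)
        abort();
    l->pl_refcnt = 1;
    pthread_mutex_init(&l->pl_mtx, NULL);
    return l;
}

/* Take an extra reference, e.g. when a child shares its parent's limits. */
struct plimit_sketch *lim_hold_sketch(struct plimit_sketch *l)
{
    pthread_mutex_lock(&l->pl_mtx);
    l->pl_refcnt++;
    pthread_mutex_unlock(&l->pl_mtx);
    return l;
}

/* Drop a reference and free the structure when the last one goes away. */
void lim_free_sketch(struct plimit_sketch *l)
{
    pthread_mutex_lock(&l->pl_mtx);
    int last = (--l->pl_refcnt == 0);
    pthread_mutex_unlock(&l->pl_mtx);
    if (last) {
        pthread_mutex_destroy(&l->pl_mtx);
        free(l);
    }
}

/* Copy on write: build a private copy with the change, then drop the old one. */
struct plimit_sketch *lim_set_sketch(struct plimit_sketch *old, int which,
    struct rlim_sketch val)
{
    assert(which >= 0 && which < NLIMITS);
    struct plimit_sketch *copy = lim_alloc_sketch();
    memcpy(copy->pl_rlimit, old->pl_rlimit, sizeof(copy->pl_rlimit));
    copy->pl_rlimit[which] = val;
    lim_free_sketch(old);
    return copy;
}

/* Readers need no lock on the contents; a held reference keeps them stable. */
long lim_cur_sketch(const struct plimit_sketch *l, int which)
{
    assert(which >= 0 && which < NLIMITS);
    return l->pl_rlimit[which].cur;
}
```

In this sketch the caller would replace its own pointer with the value returned by lim_set_sketch(); in the kernel, per the entry above, that pointer swap is what the proc lock protects.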
* When aborting fork() due to a failure, if using MAC, make sure to clean up the p_label field.

  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, McAfee Research
  [rwatson, 2004-01-25, 1 file, -0/+3]
* Reduce gratuitous includes: don't include jail.h if it's not needed. Presumably, at some point, you had to include jail.h if you included proc.h, but that is no longer required.

  Result of: self injury involving adding something to struct prison
  [rwatson, 2004-01-21, 1 file, -1/+0]
* Prevent a race condition between fork1() and whatever changes the pgrp by setting the new process' p_pgrp again before inserting it in the p_pglist. Without it we can get the new process to be inserted in a different p_pglist than the one p2->p_pgrp points to, and this is not something we want to happen. This is not a fix, merely a bandaid, but it will work until someone finds a better way to do it.

  Discussed with: jhb (a long time ago)
  [cognet, 2004-01-09, 1 file, -0/+1]
* Make sigaltstack per-thread, because per-process sigaltstack state is useless for threaded programs: multiple threads cannot share the same stack. The alternate signal stack is private to each thread, so no lock is needed. The original P_ALTSTACK is now moved into td_pflags and renamed to TDP_ALTSTACK. For single-threaded or Linux clone()-based threaded programs there is no semantic change, because those programs have only one kernel thread in every process.

  Reviewed by: deischen, dfr
  [davidxu, 2004-01-03, 1 file, -1/+4]
* Removed mostly-dead code for setting switchtime after the idle loop clobbers this variable. Long ago, when the idle loop wasn't in a process, it set switchtime.tv_sec to zero to indicate that the time needs to be read after the idle loop finishes. The special case for this isn't needed now that there is an idle process (for each CPU). The time is read in the normal way when the idle process is switched away from. The seconds component of the time is only zero for the first second after the uptime is set, and the mostly-dead code was only executed during this time. (This was slightly broken by using uptimes instead of times relative to the Epoch -- in the original version the seconds component of the time was only 0 for the first second after the Epoch.)

  In mi_switch(), moved the setting of switchticks to just after the first (and now only) setting of switchtime. This setting used to be delayed since a late setting was needed for the idle case and an early setting was not needed. Now the early setting is needed so that fork_exit() doesn't need to set either switchtime or switchticks.

  Removed now-completely-rotted comment attached to this. Most of the code described by the comment had already moved to sched_switch().
  [bde, 2003-10-29, 1 file, -3/+0]
* Removed sched_nest variable in sched_switch(). Context switches always begin with sched_lock held but not recursed, so this variable was always 0.

  Removed fixup of sched_lock.mtx_recurse after context switches in sched_switch(). Context switches always end with this variable in the same state that it began in, so there is no need to fix it up. Only sched_lock.mtx_lock really needs a fixup.

  Replaced fixup of sched_lock.mtx_recurse in fork_exit() by an assertion that sched_lock is owned and not recursed after it is fixed up. This assertion must match the one in mi_switch(), and if sched_lock were recursed then a non-null fixup of sched_lock.mtx_recurse would probably be needed again, unlike in sched_switch(), since fork_exit() doesn't return to its caller in the normal way.
  [bde, 2003-10-29, 1 file, -1/+1]
* Change instances of callout_init that specify MPSAFE behaviour to use CALLOUT_MPSAFE instead of "1" for the second parameter. This does not change the behaviour; it just makes the intent more clear.
  [sam, 2003-08-19, 1 file, -1/+1]
* - Various style fixes in both code and comments.
  - Update some stale comments.
  - Sort a couple of includes.
  - Only set 'newcpu' in updatepri() if we use it.
  - No functional changes.

  Obtained from: bde (via an old diff I got a long time ago)
  [jhb, 2003-08-15, 1 file, -22/+28]
* Adjust a comment to remove staleness and take a slightly less implementation-specific perspective.
  [jhb, 2003-08-04, 1 file, -6/+2]
* Add a ratelimited message of the form "maxproc limit exceeded by uid %i, please see tuning(7) and login.conf(5).", which will be triggered whenever a user hits his/her maxproc limit or the systemwide maxproc limit is reached.

  MFC after: 1 week
  [silby, 2003-06-19, 1 file, -1/+5]
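The idea of a rate-limited diagnostic, as in the maxproc entry above, can be sketched in a few lines. The entry does not say which interval or which kernel facility is used, so everything here is an assumption for illustration: the `ratelimit_sketch` type, the `ratelimit_ok()` helper, and the caller-supplied interval are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* State for one rate-limited message (hypothetical helper type). */
struct ratelimit_sketch {
    time_t last;   /* time of the last permitted event */
    bool   armed;  /* false until the first event has been permitted */
};

/*
 * Return true if the caller may emit the message now, i.e. if at least
 * `interval` seconds have passed since the last permitted message.
 * The first call always succeeds.
 */
bool ratelimit_ok(struct ratelimit_sketch *rl, time_t now, time_t interval)
{
    assert(interval >= 0);
    if (rl->armed && now - rl->last < interval)
        return false;
    rl->last = now;
    rl->armed = true;
    return true;
}
```

A caller would guard the log call with this check, for example printing the maxproc warning only when ratelimit_ok(&rl, time(NULL), 1) returns true, so that a fork bomb cannot flood the console.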
* Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.
  [davidxu, 2003-06-15, 1 file, -3/+3]
* Move the *_new_altkstack() and *_dispose_altkstack() functions out of the various pmap implementations into the machine-independent vm. They were all identical.
  [alc, 2003-06-14, 1 file, -1/+1]
* Use __FBSDID().
  [obrien, 2003-06-11, 1 file, -1/+3]
* Add tracking of process leaders sharing a file descriptor table and allow a file descriptor table to be shared between multiple process leaders.

  PR: 50923
  [tegge, 2003-06-02, 1 file, -11/+31]
* - Merge struct procsig with struct sigacts.
  - Move struct sigacts out of the u-area and malloc() it using the M_SUBPROC malloc bucket.
  - Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(), sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
  - Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
  - Add a mutex to struct sigacts that protects all the members of the struct.
  - Add sigacts locking.
  - Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now that sigacts is locked.
  - Several in-kernel functions such as psignal(), tdsignal(), trapsignal(), and thread_stopped() are now MP safe.

  Reviewed by: arch@
  Approved by: re (rwatson)
  [jhb, 2003-05-13, 1 file, -31/+7]
* Initialize and destroy the struct proc mutex in the proc zone's init and fini routines instead of in fork() and wait(). This has the nice side benefit that the proc lock of any process on the allproc list is always valid and sched_lock doesn't have to be used to test against PRS_NEW anymore.
  [jhb, 2003-05-01, 1 file, -4/+3]
* Instead of recording the Unix time in a process when it starts, record the uptime. Where necessary, convert it back to Unix time by adding boottime to it. This fixes a potential problem in the accounting code, which would compute the elapsed time incorrectly if the Unix time was stepped during the lifetime of the process.
  [des, 2003-05-01, 1 file, -1/+1]
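The arithmetic behind the uptime change above is simple enough to sketch. The names below (`tv_sketch`, `start_walltime_sketch()`, `elapsed_sketch()`) are hypothetical, and inputs are assumed to be normalized (microseconds in the range 0 to 999999). The point is that elapsed time is computed from uptimes alone, so stepping the wall clock cannot distort it, while the wall-clock start time is recovered on demand as boottime plus the recorded uptime.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal timeval stand-in (hypothetical type for this sketch). */
struct tv_sketch { int64_t sec; int32_t usec; };

/* a + b, carrying microseconds into seconds. */
struct tv_sketch tv_add_sketch(struct tv_sketch a, struct tv_sketch b)
{
    struct tv_sketch r = { a.sec + b.sec, a.usec + b.usec };
    if (r.usec >= 1000000) {
        r.sec++;
        r.usec -= 1000000;
    }
    assert(r.usec >= 0 && r.usec < 1000000);
    return r;
}

/* a - b, borrowing from seconds when microseconds go negative. */
struct tv_sketch tv_sub_sketch(struct tv_sketch a, struct tv_sketch b)
{
    struct tv_sketch r = { a.sec - b.sec, a.usec - b.usec };
    if (r.usec < 0) {
        r.sec--;
        r.usec += 1000000;
    }
    assert(r.usec >= 0 && r.usec < 1000000);
    return r;
}

/* Unix start time of a process = boot time + uptime recorded at start. */
struct tv_sketch start_walltime_sketch(struct tv_sketch boottime,
    struct tv_sketch start_uptime)
{
    return tv_add_sketch(boottime, start_uptime);
}

/* Elapsed run time uses uptimes only, so clock steps do not affect it. */
struct tv_sketch elapsed_sketch(struct tv_sketch now_uptime,
    struct tv_sketch start_uptime)
{
    return tv_sub_sketch(now_uptime, start_uptime);
}
```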
* Axe a stale comment.
  [jhb, 2003-04-30, 1 file, -2/+0]
* Fix some easy, global, lint warnings. In most cases, this means making some local variables static. In a couple of cases, this means removing an unused variable.
  [markm, 2003-04-30, 1 file, -1/+1]
* - Move PS_PROFIL and its new cousin PS_STOPPROF back over to p_flag and rename them appropriately. Protect both flags with both the proc lock and the sched_lock.
  - Protect p_profthreads with the proc lock.
  - Remove Giant from profil(2).
  [jhb, 2003-04-22, 1 file, -3/+3]
* - Push Giant down into the fork1() function a small bit.
  - Set p_acflag earlier while already holding the proc lock in fork1().
  - Mark the realitexpire() callout MPSAFE for new processes. It was already marked safe for proc0 a long while ago.
  [jhb, 2003-04-17, 1 file, -11/+10]
* - Adjust sched hooks for fork and exec to take processes as arguments instead of ksegs since they primarily operate on processes.
  - KSEs take ticks so pass the kse through sched_clock().
  - Add a sched_class() routine that adjusts a ksegrp pri class.
  - Define a sched_fork_{kse,thread,ksegrp} and sched_exit_{kse,thread,ksegrp} that will be used to tell the scheduler about new instances of these structures within the same process. These will be used by THR and KSE.
  - Change sched_4bsd to reflect this API update.
  [jeff, 2003-04-11, 1 file, -1/+1]
* Move the _oncpu entry from the KSE to the thread. The entry in the KSE still exists but its purpose will change a bit when we add the ability to lock a KSE to a cpu.
  [julian, 2003-04-10, 1 file, -1/+1]
* - Borrow the KSE single threading code for exec and exit. We use the check if (p->p_numthreads > 1) and not a flag because action is only necessary if there are other threads. The rest of the system has no need to identify thr threaded processes.
  - In kern_thread.c use thr_exit1() instead of thread_exit() if P_THREADED is not set.
  [jeff, 2003-04-01, 1 file, -0/+5]
* Replace the at_fork, at_exec, and at_exit functions with the slightly more flexible process_fork, process_exec, and process_exit eventhandlers. This reduces code duplication and also means that I don't have to go duplicate the eventhandler locking three more times for each of at_fork, at_exec, and at_exit.

  Reviewed by: phk, jake, almost complete silence on arch@
  [jhb, 2003-03-24, 1 file, -86/+2]
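The reduction in duplicated code mentioned in the eventhandler entry above comes from having one generic handler-list mechanism instead of three hand-written ones. The following is a userspace sketch of that general idea only; it is not the kernel's eventhandler API, and every name in it (`handler_list`, `handler_register()`, `handler_deregister()`, `handler_invoke()`, `count_event()`) is hypothetical. Locking is deliberately left out, which is the part the commit says it no longer has to write three times.

```c
#include <assert.h>
#include <stdlib.h>

/* Callback type shared by all event lists in this sketch. */
typedef void (*event_fn)(void *arg, int pid);

struct handler {
    event_fn        fn;
    void           *arg;
    struct handler *next;
};

/* One list type serves fork, exec, and exit events alike. */
struct handler_list { struct handler *head; };

/* Append a handler and return a tag that can later be used to remove it. */
struct handler *handler_register(struct handler_list *l, event_fn fn, void *arg)
{
    struct handler *h = malloc(sizeof(*h));
    struct handler **pp = &l->head;

    assert(fn != NULL);
    if (h == NULL)
        abort();
    h->fn = fn;
    h->arg = arg;
    h->next = NULL;
    while (*pp != NULL)          /* walk to the tail to keep call order */
        pp = &(*pp)->next;
    *pp = h;
    return h;
}

/* Unlink and free the handler identified by tag, if it is on the list. */
void handler_deregister(struct handler_list *l, struct handler *tag)
{
    struct handler **pp = &l->head;

    while (*pp != NULL && *pp != tag)
        pp = &(*pp)->next;
    if (tag != NULL && *pp == tag) {
        *pp = tag->next;
        free(tag);
    }
}

/* Call every registered handler in registration order. */
void handler_invoke(const struct handler_list *l, int pid)
{
    for (const struct handler *h = l->head; h != NULL; h = h->next)
        h->fn(h->arg, pid);
}

/* Example handler: counts events through the int that arg points to. */
void count_event(void *arg, int pid)
{
    (void)pid;
    ++*(int *)arg;
}
```

With this shape, a fork list, an exec list, and an exit list are just three instances of struct handler_list, and the register, deregister, and invoke code is written once.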
* - Cache a reference to the credential of the thread that starts a ktrace in struct proc as p_tracecred alongside the current cache of the vnode in p_tracep. This credential is then used for all later ktrace operations on this file rather than using the credential of the current thread at the time of each ktrace event.
  - Now that we have multiple ktrace-related items in struct proc that are pointers, rename p_tracep to p_tracevp to make it less ambiguous.

  Requested by: rwatson (1)
  [jhb, 2003-03-13, 1 file, -3/+7]
* Change the process flags P_KSES to be P_THREADED. This is just a cosmetic change but I've been meaning to do it for about a year.
  [julian, 2003-02-27, 1 file, -3/+3]
* Remove the PL_SHAREMOD flag from struct plimit, which could have been used to share resource limits between rfork threads, but never was. Removing it makes resource limit locking much simpler -- only the current process can change the contents of the structure that p_limit points to.
  [tjr, 2003-02-20, 1 file, -10/+3]
* Back out M_* changes, per decision of the TRB.

  Approved by: trb
  [imp, 2003-02-19, 1 file, -3/+3]
* - Split the struct kse into struct upcall and struct kse. struct kse will soon be visible only to schedulers. This greatly simplifies much of the KSE code.

  Submitted by: davidxu
  [jeff, 2003-02-17, 1 file, -2/+0]
* Avoid file lock leakage when linuxthreads port or rfork is used:
  - Mark the process leader as having an advisory lock
  - Check if process leader is marked as having advisory lock when closing file
  - Check that file is still open after lock has been obtained
  - Don't allow file descriptor table sharing between processes with different leaders

  PR: 10265
  Reviewed by: alfred
  [tegge, 2003-02-15, 1 file, -0/+7]
* Reversion of commit by Davidxu plus fixes since applied. I'm not convinced there is anything major wrong with the patch but them's the rules.. I am using my "David's mentor" hat to revert this as he's offline for a while.
  [julian, 2003-02-01, 1 file, -0/+2]
* Move UPCALL related data structure out of kse, introduce a new data structure called kse_upcall to manage UPCALL. All KSE binding and loaning code are gone.

  A thread that owns an upcall can collect all completed syscall contexts in its ksegrp, turn itself into UPCALL mode, and take those contexts back to userland. Any thread without an upcall structure has to export its context and exit at the user boundary.

  Any thread running in user mode owns an upcall structure. When it enters the kernel, if the kse mailbox's current thread pointer is not NULL, then when the thread is blocked in the kernel, a new UPCALL thread is created and the upcall structure is transferred to the new UPCALL thread. If the kse mailbox's current thread pointer is NULL, then when a thread is blocked in the kernel, no UPCALL thread will be created.

  Each upcall always has an owner thread. Userland can remove an upcall by calling kse_exit; when all upcalls in a ksegrp are removed, the group is automatically shut down. An upcall owner thread also exits when the process is in the exiting state. When an owner thread exits, the upcall it owns is also removed.

  KSE is a pure scheduler entity. It represents a virtual cpu. When a thread is running, it always has a KSE associated with it. The scheduler is free to assign a KSE to a thread according to thread priority; if thread priority is changed, the KSE can be moved from one thread to another.

  When a ksegrp is created, there are always N KSEs created in the group, where N is the number of physical cpus in the current system. This makes it possible that even if a userland UTS is only single-CPU safe, threads in the kernel can still execute on different cpus in parallel. Userland calls kse_create to add more upcall structures into the ksegrp to increase concurrency in userland itself; the kernel is not restricted by the number of upcalls userland provides.

  The code hasn't been tested under SMP by the author due to lack of hardware.

  Reviewed by: julian
  [davidxu, 2003-01-26, 1 file, -2/+0]
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
  [alfred, 2003-01-21, 1 file, -3/+3]
* fdcopy() only needs a filedesc pointer.
  [alfred, 2003-01-01, 1 file, -2/+2]
* Since fdshare() and fdinit() only operate on filedescs, make them take pointers to filedesc structures instead of threads. This makes it more clear that they do not do any voodoo with the thread/proc or anything other than the filedesc passed in or returned.

  Remove some XXX KSE's as this resolves the issue.
  [alfred, 2003-01-01, 1 file, -4/+4]
* Add code to ddb to allow backtracing an arbitrary thread. (show thread {address})

  Remove the IDLE kse state and replace it with a change in the way threads share KSEs. Every KSE now has a thread, which is considered its "owner"; however, a KSE may also be lent to other threads in the same group to allow completion of in-kernel work. In this case the owner remains the same and the KSE will revert to the owner when the other work has been completed.

  All creation of upcalls etc. is now done from kse_reassign(), which in turn is called from mi_switch or thread_exit(). This means that special code can be removed from msleep() and cv_wait().

  kse_release() does not leave a KSE with no thread any more but converts the existing thread into the KSE's owner, and sets it up for doing an upcall. It is just inhibited from being scheduled until there is some reason to do an upcall.

  Remove all trace of the kse_idle queue since it is no longer needed. "Idle" KSEs are now on the loanable queue.
  [julian, 2002-12-28, 1 file, -0/+1]
* Unbreak the KSE code. Keep track of zombie threads using the Per-CPU storage during the context switch. Rearrange thread cleanups to avoid problems with Giant. Clean threads when freed or when recycled.

  Approved by: re (jhb)
  [julian, 2002-12-10, 1 file, -2/+8]
* Introduce p_label, extensible security label storage for the MAC framework in struct proc. While the process label is actually stored in the struct ucred pointed to by p_ucred, there is a need for transient storage that may be used when asynchronous (deferred) updates need to be performed on the "real" label for locking reasons. Unlike other label storage, this label has no locking semantics, relying on policies to provide their own protection for the label contents, meaning that a policy leaf mutex may be used, avoiding lock order issues. This permits policies that act based on historical process behavior (such as audit policies, the MAC Framework port of LOMAC, etc) to update process properties even when many existing locks are held without violating the lock order. No currently committed policies implement use of this label storage.

  Approved by: re
  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, Network Associates Laboratories
  [rwatson, 2002-11-20, 1 file, -0/+5]
* We leaked a process lock reference in the event an RFTHREAD process leader wasn't exiting during a fork; instead, do remember to release the lock, avoiding lock order reversals and recursion panic.

  Reported by: "Joel M. Baldwin" <qumqats@outel.org>
  [rwatson, 2002-11-18, 1 file, -1/+2]
* Do not lock the process when calling fdfree() (this would have recursed on a non-recursive lock, the proc lock, before) since we don't need it to change p_fd.
  [jhb, 2002-10-18, 1 file, -4/+0]
* - Add a new global mutex 'ppeers_lock' to protect the p_peers list of processes forked with RFTHREAD.
  - Use a goto to a label for common code when exiting from fork1() in case of an error.
  - Move the RFTHREAD linkage setup code later in fork since the ppeers_lock cannot be locked while holding a proc lock. Handle the race of a task leader exiting and killing its peers while a peer is forking a new child. In that case, go ahead and let the peer process proceed normally as the parent is about to kill it. However, the task leader may have already gone to sleep to wait for the peers to die, so the new child process may not receive a SIGKILL from the task leader. Rather than try to destruct the new child process, just go ahead and send it a SIGKILL directly and add it to the p_peers list. This ensures that the task leader will wait until both the peer process doing the fork() and the new child process have received their KILL signals and exited.

  Discussed with: truckman (earlier versions)
  [jhb, 2002-10-15, 1 file, -38/+50]
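The "goto to a label for common code" item in the entry above refers to a common C idiom for error exits. A minimal userspace sketch follows; it does not reproduce fork1(), and the `child_sketch` type, the `fork_sketch()` function, and its parameters are hypothetical. Every failure path jumps to a single label that undoes whatever has been set up so far, so cleanup is written once.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the state a fork has to build up. */
struct child_sketch {
    int *fdtable;
};

/*
 * Build a child, or fail with an errno value. All error paths share the
 * cleanup code at the `fail` label instead of repeating it inline.
 */
int fork_sketch(int nprocs, int maxproc, struct child_sketch **out)
{
    struct child_sketch *c;
    int error;

    assert(out != NULL);

    c = calloc(1, sizeof(*c));
    if (c == NULL) {
        error = ENOMEM;
        goto fail;
    }
    if (nprocs >= maxproc) {      /* process limit reached */
        error = EAGAIN;
        goto fail;
    }
    c->fdtable = calloc(16, sizeof(*c->fdtable));
    if (c->fdtable == NULL) {
        error = ENOMEM;
        goto fail;
    }

    *out = c;
    return 0;

fail:
    /* Common error exit: release anything allocated before the failure. */
    if (c != NULL) {
        free(c->fdtable);         /* free(NULL) is a no-op */
        free(c);
    }
    *out = NULL;
    return error;
}
```

The benefit shows up as a function grows: a new resource needs one extra line at the label rather than an extra line in every earlier error branch.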
* - Create a new scheduler api that is defined in sys/sched.h
  - Begin moving scheduler specific functionality into sched_4bsd.c
  - Replace direct manipulation of scheduler data with hooks provided by the new api.
  - Remove KSE specific state modifications and single runq assumptions from kern_switch.c

  Reviewed by: -arch
  [jeff, 2002-10-12, 1 file, -6/+7]
* Round out the facility for a 'bound' thread to loan out its KSE in specific situations. The owner thread must be blocked, and the borrower can not proceed back to user space with the borrowed KSE. The borrower will return the KSE on the next context switch where the owner wants it back. This removes a lot of possible race conditions and deadlocks. It is conceivable that the borrower should inherit the priority of the owner too. That's another discussion and would be simple to do.

  Also, as part of this, the "preallocated spare thread" is attached to the thread doing a syscall rather than the KSE. This removes the need to lock the scheduler when we want to access it, as it's now "at hand".

  DDB now shows a lot more info for threaded processes though it may need some optimisation to squeeze it all back into 80 chars again. (possible JKH project)

  Upcalls are now "bound" threads, but "KSE Lending" now means that other completing syscalls can be completed using that KSE before the upcall finally makes it back to the UTS. (getting threads OUT OF THE KERNEL is one of the highest priorities in the KSE system.) The upcall when it happens will present all the completed syscalls to the KSE for selection.
  [julian, 2002-10-09, 1 file, -2/+0]
* Some kernel threads try to do significant work, and the default KSTACK_PAGES doesn't give them enough stack to do much before blowing away the pcb. This adds MI and MD code to allow the allocation of an alternate kstack whose size can be specified when calling kthread_create. Passing the value 0 prevents the alternate kstack from being created. Note that the ia64 MD code is missing for now, and PowerPC was only partially written due to the pmap.c being incomplete there. Though this patch does not modify anything to make use of the alternate kstack, acpi and usb are good candidates.

  Reviewed by: jake, peter, jhb
  [scottl, 2002-10-02, 1 file, -4/+9]
* Back out kernel support for reliable signal queues.

  Requested by: rwatson, phk, and many others
  [jmallett, 2002-10-01, 1 file, -1/+0]
* First half of implementation of ksiginfo, signal queues, and such. This gets signals operating based on a TailQ, and is good enough to run X11, GNOME, and do job control. There are some intricate parts which could be more refined to match the sigset_t versions, but those require further evaluation of directions in which our signal system can expand and contract to fit our needs.

  After this has been in the tree for a while, I will make in kernel API changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo more robustly, such that we can actually pass information with our (queued) signals to the userland. That will also result in using a struct ksiginfo pointer, rather than a signal number, in a lot of kern_sig.c, to refer to an individual pending signal queue member, but right now there is no defined behaviour for such.

  CODAFS is unfinished in this regard because the logic is unclear in some places.

  Sponsored by: New Gold Technology
  Reviewed by: bde, tjr, jake [an older version, logic similar]
  [jmallett, 2002-09-30, 1 file, -0/+1]
* Add kernel support needed for the KSE-aware libpthread:
  - Use ucontext_t's to store KSE thread state.
  - Synthesize state for the UTS upon each upcall, rather than saving and copying a trapframe.
  - Deliver signals to KSE-aware processes via upcall.
  - Rename kse mailbox structure fields to be more BSD-like.
  - Store the UTS's stack in struct proc in a stack_t.

  Reviewed by: bde, deischen, julian
  Approved by: -arch
  [mini, 2002-09-16, 1 file, -2/+0]