path: root/sys/kern/kern_fork.c
* Do not lock the process when calling fdfree() (this would have recursed
  on a non-recursive lock, the proc lock, before) since we don't need it
  to change p_fd.  [jhb, 2002-10-18, 1 file, -4/+0]
* Add a new global mutex 'ppeers_lock' to protect the p_peers list of
  processes forked with RFTHREAD.  [jhb, 2002-10-15, 1 file, -38/+50]
  - Use a goto to a label for common code when exiting from fork1() in
    case of an error.
  - Move the RFTHREAD linkage setup code later in fork since the
    ppeers_lock cannot be locked while holding a proc lock.  Handle the
    race of a task leader exiting and killing its peers while a peer is
    forking a new child: in that case, go ahead and let the peer process
    proceed normally, as the parent is about to kill it.  However, the
    task leader may have already gone to sleep to wait for the peers to
    die, so the new child process may not receive a SIGKILL from the
    task leader.  Rather than try to destruct the new child process,
    just go ahead and send it a SIGKILL directly and add it to the
    p_peers list.  This ensures that the task leader will wait until
    both the peer process doing the fork() and the new child process
    have received their KILL signals and exited.
  Discussed with: truckman (earlier versions)
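
  A minimal C sketch of the linkage-plus-race handling described above.
  The p_peers/p_leader fields and ppeers_lock come from the commit text;
  the P_WEXIT test and the psignal() delivery are assumptions about how
  the "leader already exiting" case is detected and handled:

      mtx_lock(&ppeers_lock);
      p2->p_peers = p1->p_peers;
      p1->p_peers = p2;
      p2->p_leader = p1->p_leader;
      if (p2->p_leader->p_flag & P_WEXIT) {
              /*
               * The task leader is already exiting and may have gone
               * to sleep before seeing this child, so deliver the
               * KILL it can no longer send itself.
               */
              PROC_LOCK(p2);
              psignal(p2, SIGKILL);
              PROC_UNLOCK(p2);
      }
      mtx_unlock(&ppeers_lock);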
* Create a new scheduler API, defined in sys/sched.h.
  [jeff, 2002-10-12, 1 file, -6/+7]
  - Begin moving scheduler-specific functionality into sched_4bsd.c.
  - Replace direct manipulation of scheduler data with hooks provided
    by the new API.
  - Remove KSE-specific state modifications and single-runq assumptions
    from kern_switch.c.
  Reviewed by: -arch
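
  For example, where fork1() previously copied scheduler fields by hand,
  it now calls a hook.  A minimal sketch, assuming the sched_fork()
  entry point the new header provides (its exact signature is an
  assumption, and p_estcpu stands in for whatever fields were touched
  directly):

      /* Before: fork1() reached into scheduler state itself. */
      p2->p_estcpu = p1->p_estcpu;

      /* After: sched_4bsd.c owns that knowledge behind a hook. */
      sched_fork(p1, p2);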
* Round out the facility for a 'bound' thread to loan out its KSE in
  specific situations.  [julian, 2002-10-09, 1 file, -2/+0]
  The owner thread must be blocked, and the borrower cannot proceed
  back to user space with the borrowed KSE.  The borrower will return
  the KSE on the next context switch where the owner wants it back.
  This removes a lot of possible race conditions and deadlocks.  It is
  conceivable that the borrower should inherit the priority of the
  owner too; that's another discussion and would be simple to do.

  Also, as part of this, the "preallocated spare thread" is attached to
  the thread doing a syscall rather than to the KSE.  This removes the
  need to lock the scheduler when we want to access it, as it's now "at
  hand".

  DDB now shows a lot more info for threaded processes, though it may
  need some optimisation to squeeze it all back into 80 chars again.
  (possible JKH project)

  Upcalls are now "bound" threads, but "KSE lending" now means that
  other completing syscalls can be completed using that KSE before the
  upcall finally makes it back to the UTS.  (Getting threads OUT OF THE
  KERNEL is one of the highest priorities in the KSE system.)  The
  upcall, when it happens, will present all the completed syscalls to
  the KSE for selection.
* Some kernel threads try to do significant work, and the default
  KSTACK_PAGES doesn't give them enough stack to do much before blowing
  away the pcb.  [scottl, 2002-10-02, 1 file, -4/+9]
  This adds MI and MD code to allow the allocation of an alternate
  kstack whose size can be specified when calling kthread_create().
  Passing the value 0 prevents the alternate kstack from being created.
  Note that the ia64 MD code is missing for now, and PowerPC was only
  partially written due to the pmap.c being incomplete there.  Though
  this patch does not modify anything to make use of the alternate
  kstack, acpi and usb are good candidates.
  Reviewed by: jake, peter, jhb
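
  A usage sketch, assuming the stack-page count became an extra
  kthread_create() argument as the commit implies; the thread function,
  softc, and thread name here are hypothetical:

      #include <sys/kthread.h>

      static struct proc *taskproc;
      int error;

      /* Ask for a 4-page alternate kstack; passing 0 instead keeps
       * the default KSTACK_PAGES stack. */
      error = kthread_create(task_thread, sc, &taskproc,
          RFHIGHPID, 4, "taskd");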
* Back out our kernel support for reliable signal queues.
  [jmallett, 2002-10-01, 1 file, -1/+0]
  Requested by: rwatson, phk, and many others
* First half of implementation of ksiginfo, signal queues, and such.
  [jmallett, 2002-09-30, 1 file, -0/+1]
  This gets signals operating based on a TailQ, and is good enough to
  run X11, GNOME, and do job control.  There are some intricate parts
  which could be more refined to match the sigset_t versions, but those
  require further evaluation of directions in which our signal system
  can expand and contract to fit our needs.

  After this has been in the tree for a while, I will make in-kernel
  API changes, most notably to trapsignal(9) and sendsig(9), to use
  ksiginfo more robustly, such that we can actually pass information
  with our (queued) signals to the userland.  That will also result in
  using a struct ksiginfo pointer, rather than a signal number, in a
  lot of kern_sig.c, to refer to an individual pending signal queue
  member, but right now there is no defined behaviour for such.

  CODAFS is unfinished in this regard because the logic is unclear in
  some places.
  Sponsored by: New Gold Technology
  Reviewed by: bde, tjr, jake [an older version, logic similar]
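
  A hedged sketch of the TailQ-based pending-signal representation
  described above; the struct and field names are illustrative
  assumptions, not the committed layout:

      #include <sys/queue.h>

      /* Illustrative only: one queued signal instance. */
      struct ksiginfo_entry {
              TAILQ_ENTRY(ksiginfo_entry) ke_link;
              int ke_signo;
              /* siginfo payload would live here */
      };
      TAILQ_HEAD(sigqueue, ksiginfo_entry);

      /*
       * Queuing appends an entry instead of just setting a bit in a
       * sigset_t, so multiple instances of a signal survive.
       */
      TAILQ_INSERT_TAIL(&p_sigq, ke, ke_link);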
* Add kernel support needed for the KSE-aware libpthread:
  [mini, 2002-09-16, 1 file, -2/+0]
  - Use ucontext_t's to store KSE thread state.
  - Synthesize state for the UTS upon each upcall, rather than saving
    and copying a trapframe.
  - Deliver signals to KSE-aware processes via upcall.
  - Rename kse mailbox structure fields to be more BSD-like.
  - Store the UTS's stack in struct proc in a stack_t.
  Reviewed by: bde, deischen, julian
  Approved by: -arch
* Allocate KSEs and KSEGRPs separately and remove them from the proc
  structure.  [julian, 2002-09-15, 1 file, -5/+3]
  The next step is to allow more than one to be allocated per process.
  This would give multi-processor threads (when the rest of the
  infrastructure is in place).

  While doing this I noticed that libkvm and
  sys/kern/kern_proc.c:fill_kinfo_proc are diverging more than they
  should; corrective action is needed soon.
* Completely redo thread states.  [julian, 2002-09-11, 1 file, -0/+1]
  Reviewed by: davidxu@freebsd.org
* Use UMA as a complex object allocator.
  [julian, 2002-09-06, 1 file, -33/+3]
  The process allocator now caches and hands out complete process
  structures *including substructures*; i.e., it gets the process
  structure with the first thread (and soon KSE) already allocated and
  attached, all in one hit.

  For the average non-threaded program (non-KSE, that is), the
  allocated thread and its stack remain attached to the process, even
  when the process is unused and in the process cache.  This saves
  having to allocate and attach it later, effectively bringing us
  (hopefully) close to the efficiency of pre-KSE systems where these
  were a single structure.
  Reviewed by: davidxu@freebsd.org, peter@freebsd.org
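
  A minimal sketch of how such a caching zone is set up.  The
  uma_zcreate() ctor/dtor/init/fini hooks are the real API; proc_ctor()
  and friends are assumptions about the shape of the change:

      #include <vm/uma.h>

      static uma_zone_t proc_zone;

      /*
       * init/fini run when an item is created in or released from the
       * zone; ctor/dtor run on every allocation.  Attaching the first
       * thread in init (and detaching it only in fini) is what lets a
       * cached, "free" process keep its thread and stack across reuse.
       */
      proc_zone = uma_zcreate("PROC", sizeof(struct proc),
          proc_ctor, proc_dtor, proc_init, proc_fini,
          UMA_ALIGN_PTR, 0);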
* s/SGNL/SIG/, s/SNGL/SINGLE/, s/SNGLE/SINGLE/.
  [davidxu, 2002-09-05, 1 file, -1/+1]
  Fix the abbreviations for the P_STOPPED_* etc. flags; in the original
  code they were inconsistent and difficult to tell apart.
  Approved by: julian (mentor)
* Slight cleanup of single-threading code for KSE processes.
  [julian, 2002-08-22, 1 file, -0/+9]
* Move code block added in 1.157 to a safer part of fork1().
  [mdodd, 2002-08-07, 1 file, -9/+9]
  Submitted by: jake
* Kernel modifications necessary to allow following fork()ed children.
  [mdodd, 2002-08-04, 1 file, -0/+10]
  PR: bin/25587 (in part)
  MFC after: 3 weeks
* Update docs to reflect the change in the count of procs reserved for
  root from 1 to 10.  [silby, 2002-07-30, 1 file, -1/+1]
  PR: kern/40515
  Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU>
  MFC after: 1 day
* Wire the sysctl output buffer before grabbing any locks to prevent
  SYSCTL_OUT() from blocking while locks are held.
  [truckman, 2002-07-28, 1 file, -0/+1]
  This should only be done when it would be inconvenient to make a
  temporary copy of the data and defer calling SYSCTL_OUT() until after
  the locks are released.
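
  The usual shape of that pattern, sketched with
  sysctl_wire_old_buffer(); the handler body is illustrative:

      static int
      sysctl_kern_example(SYSCTL_HANDLER_ARGS)
      {
              int data = 0, error;

              /* Fault in and wire the user buffer first, so that
               * SYSCTL_OUT() cannot sleep on a page fault while the
               * lock is held. */
              error = sysctl_wire_old_buffer(req, 0);
              if (error != 0)
                      return (error);
              sx_slock(&allproc_lock);
              /* ... compute data under the lock ... */
              error = SYSCTL_OUT(req, &data, sizeof(data));
              sx_sunlock(&allproc_lock);
              return (error);
      }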
* Part of a greater patch set.  [julian, 2002-07-14, 1 file, -1/+1]
  1/ We don't need to set td_state to TDS_RUNNING in fork_return();
     it's already set in choosethread().
  2/ Set a child process's state to "normal", as opposed to "new",
     when we allow it to be put on the run queue.  This allows the
     child to receive signals from the parent if the parent runs first
     and tries to immediately signal the child.
  Submitted by: (part 2) Thomas Moestl <tmoestl@gmx.net>
* Thinking about it, I came to the conclusion that the KSE states were
  incorrectly formulated.  [julian, 2002-07-14, 1 file, -3/+1]
  The correct states should be:
  - IDLE:   on the idle KSE list for that KSEG.
  - RUNQ:   linked onto the system run queue.
  - THREAD: attached to a thread and slaved to whatever state the
            thread is in.
  This means that most places where we were adjusting KSE state can go
  away, as the state is just moving around because the thread is.  The
  only places we need to adjust the KSE state are in transitions to and
  from the idle and run queues.
  Reviewed by: jhb@freebsd.org
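
  The three states, restated as a sketch (identifiers illustrative,
  not the committed names):

      enum kse_state {
              KES_IDLE,       /* on the KSEG's idle KSE list */
              KES_ONRUNQ,     /* linked onto the system run queue */
              KES_THREAD      /* attached to, and slaved to, a thread */
      };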
* Revert removal of cred_free_thread(): it is used to ensure that a
  thread's credentials are not improperly borrowed when the thread is
  not current in the kernel.  [mini, 2002-07-11, 1 file, -0/+3]
  Requested by: jhb, alfred
* Part 1 of KSE-III.  [julian, 2002-06-29, 1 file, -21/+54]
  The ability to schedule multiple threads per process (on one CPU) by
  making ALL system calls optionally asynchronous.
  To come: ia64 and power-pc patches, patches for gdb, test program (in
  tools).
  Reviewed by: almost everyone who counts (at various times: peter,
  jhb, matt, alfred, mini, bernd, and a cast of thousands)
  NOTE: this is still beta code and contains lots of debugging stuff;
  expect slight instability in signals.
* Remove unused diagnostic function cred_free_thread().
  [mini, 2002-06-24, 1 file, -3/+0]
  Approved by: alfred
* Proper locking for p_tracep and p_traceflag; catch up to the new
  ktrace API.  [jhb, 2002-06-07, 1 file, -7/+7]
* Protect randompid and nprocs with the allproc_lock.
  [jhb, 2002-05-02, 1 file, -101/+122]
  - Reorder fork1() to do malloc() and other blocking operations prior
    to acquiring the needed process locks.
  - The new process inherits the credentials of curthread, not the
    credentials of the old process.
  - Document a really weird race that will come up when KSE allows
    multiple kernel threads per process.
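
  The reordering pattern in miniature; the allocation call is
  illustrative, while crhold() and the lock names are the real
  primitives:

      /* 1. Blocking work first: nothing is locked yet. */
      newproc = uma_zalloc(proc_zone, M_WAITOK);

      /* 2. The child takes its credentials from the forking thread,
       *    not from the (possibly stale) process ucred. */
      crhold(td->td_ucred);
      newproc->p_ucred = td->td_ucred;

      /* 3. Only now take the locks that publish the new process. */
      sx_xlock(&allproc_lock);
      nprocs++;               /* protected by allproc_lock */
      sx_xunlock(&allproc_lock);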
* Lock proctree_lock instead of pgrpsess_lock.
  [jhb, 2002-04-16, 1 file, -2/+2]
* Whitespace changes to wrap long lines.
  [jhb, 2002-04-09, 1 file, -4/+8]
* Change callers of mtx_init() to pass in an appropriate lock type
  name.  [jhb, 2002-04-04, 1 file, -1/+1]
  In most cases NULL is passed, but in some cases, such as network
  driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone
  locks, a name is used.
  Tested on: i386, alpha, sparc64
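
  The two call shapes after this change, assuming the four-argument
  mtx_init() of this era (name, then type, then options); sc and dev
  are hypothetical driver state:

      /* Ordinary lock: no shared type name, so pass NULL. */
      mtx_init(&p->p_mtx, "process lock", NULL, MTX_DEF);

      /* Network driver lock: many drivers share one witness type
       * via the MTX_NETWORK_LOCK macro. */
      mtx_init(&sc->sc_mtx, device_get_nameunit(dev),
          MTX_NETWORK_LOCK, MTX_DEF);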
* Fix leakage of the p_pgrp lock.  [tanimura, 2002-04-02, 1 file, -0/+4]
* Stage-2 commit of the critical*() code.
  [dillon, 2002-04-01, 1 file, -0/+1]
  This re-inlines cpu_critical_enter() and cpu_critical_exit() and
  moves the associated critical prototypes into their own header file,
  <arch>/<arch>/critical.h, which is only included by the three MI
  source files that need it.

  Back out and re-apply improperly committed syntactical cleanups made
  to files that were still under active development.  Back out
  improperly committed program structure changes that moved localized
  declarations to the top of two procedures.  Partially re-apply one of
  the program structure changes to move 'mask' into an intermediate
  block rather than into three separate sub-blocks, to make the code
  more readable.  Re-integrate bug fixes that Jake made to the sparc64
  code.

  Note: in general, developers should not gratuitously move
  declarations out of sub-blocks.  They are where they are for reasons
  of structure, grouping, readability, compiler-localizability, and to
  avoid developer-introduced bugs similar to several found in recent
  years in the VFS and VM code.
  Reviewed by: jake
* Make the reference counting of 'struct pargs' SMP safe.
  [alfred, 2002-03-27, 1 file, -2/+1]
  There are still some locations where the PROC lock should be held in
  order to prevent inconsistent views from outside (like the proc->p_fd
  fix for kern/vfs_syscalls.c:checkdirs()); these can be fixed later.
  Submitted by: Jonathan Mini <mini@haikugeek.com>
* Add a new mtx_init() option, MTX_DUPOK, which allows duplicate
  acquires of locks with this flag.  [jeff, 2002-03-27, 1 file, -1/+1]
  Remove the dup_list and dup_ok code from subr_witness; now we just
  check for the flag instead of doing string compares.

  Also, switch the process lock, process group lock, and UMA per-cpu
  locks over to this interface.  The original mechanism did not work
  well for UMA because per-cpu lock names are unique to each zone.
  Approved by: jhb
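
  What that looks like at a call site: every process lock shares one
  witness type, and fork1() must hold the parent's and child's locks
  at once, so the type is marked as allowing duplicates (a sketch):

      mtx_init(&p->p_mtx, "process lock", NULL, MTX_DEF | MTX_DUPOK);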
* Compromise for the critical*()/cpu_critical*() recommit.
  [dillon, 2002-03-27, 1 file, -3/+7]
  Clean up the interrupt disablement assumptions in kern_fork.c by
  adding another API call, cpu_critical_fork_exit().  Clean up the
  td_savecrit field by moving it from MI to MD.  Temporarily move
  cpu_critical*() from <arch>/include/cpufunc.h to
  <arch>/<arch>/critical.c (stage-2 will clean this up).

  Implement interrupt deferral for i386 that allows interrupts to
  remain enabled inside critical sections.  This also fixes an IPI
  interlock bug, and requires uses of icu_lock to be enclosed in a true
  interrupt disablement.

  This is the stage-1 commit.  Stage-2 will occur after stage-1 has
  stabilized, and will move cpu_critical*() into its own header file(s)
  plus other things.  This commit may break non-i386 architectures in
  trivial ways; this should be temporary.
  Reviewed by: core
  Approved by: core
* Add a change mirroring that made to kern/subr_trap.c and others.
  [benno, 2002-03-21, 1 file, -9/+3]
  This makes kernel builds with DIAGNOSTIC work again.
  Apparently forgotten by: jhb
  Might want to be checked by: jhb
* Remove references to vm_zone.h and switch over to the new UMA API.
  [jeff, 2002-03-20, 1 file, -2/+2]
  Also, remove maxsockets.  If you look carefully you'll notice that
  the old zone allocator never honored this anyway.
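
  The mechanical shape of such a conversion (zone name illustrative):

      /* Before (vm_zone): */
      p = zalloc(proc_zone);
      zfree(proc_zone, p);

      /* After (UMA): */
      p = uma_zalloc(proc_zone, M_WAITOK);
      uma_zfree(proc_zone, p);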
* Revert the last commit temporarily due to whining on the lists.
  [dillon, 2002-02-26, 1 file, -8/+1]
* STAGE-1 of 3 commit - allow (but do not require) interrupts to remain
  enabled in critical sections and streamline critical_enter() and
  critical_exit().  [dillon, 2002-02-26, 1 file, -1/+8]

  This commit allows an architecture to leave interrupts enabled inside
  critical sections if it so wishes.  Architectures that do not wish to
  do this are not affected by this change.

  This commit implements the feature for the i386 architecture and
  provides a sysctl, debug.critical_mode, which defaults to 1 (use the
  feature).  For now you can turn the sysctl on and off at any time in
  order to test the architectural changes or track down bugs.

  This commit is just the first stage.  Some areas of the code,
  specifically the MACHINE_CRITICAL_ENTER #ifdef'd code, are strictly
  temporary and will be cleaned up in the STAGE-2 commit when the
  critical_*() functions are moved entirely into MD files.

  The following changes have been made:

  * critical_enter() and critical_exit() for i386 now simply increment
    and decrement curthread->td_critnest.  They no longer disable hard
    interrupts.  When critical_exit() decrements the counter to 0, it
    effectively calls a routine to deal with whatever interrupts were
    deferred while the code was operating in a critical section.  Other
    architectures are unaffected.

  * fork_exit() has been conditionalized to remove MD assumptions for
    the new code.  Old code will still use the old MD assumptions in
    regard to hard interrupt disablement.  In STAGE-2 this will be
    turned into a subroutine call into MD code rather than hardcoded in
    MI code.  The new code places the burden of entering the critical
    section in the trampoline code where it belongs.

  * i386: interrupts are now enabled while we are in a critical
    section.  The interrupt vector code has been adjusted to deal with
    this fact.  If it detects that we are in a critical section, it
    currently defers the interrupt by adding the appropriate bit to an
    interrupt mask.

  * In order to accomplish the deferral, icu_lock is required.  This is
    i386-specific, thus icu_lock can only be obtained by mainline i386
    code while interrupts are hard disabled.  This change has been
    made.

  * Because interrupts may or may not be hard disabled during a context
    switch, cpu_switch() can no longer simply assume that PSL_I will be
    in a consistent state.  Therefore, it now saves and restores
    eflags.

  * FAST INTERRUPT PROVISION.  Fast interrupts are currently deferred.
    The intention is to eventually allow them to operate either while
    we are in a critical section or, if we are able to restrict the use
    of sched_lock, while we are not holding the sched_lock.

  * ICU and APIC vector assembly for i386 cleaned up.  The ICU code has
    been cleaned up to match the APIC code in regards to format and
    macro availability.  Additionally, the code has been adjusted to
    deal with deferred interrupts.

  * Deferred interrupts use a per-cpu boolean int_pending, and masks
    ipending, spending, and fpending.  Being per-cpu variables, it is
    not currently necessary to use locked bus cycles when modifying
    them.  Note that the same mechanism will enable preemption to be
    incorporated as a true software interrupt without having to further
    hack up the critical nesting code.

  * Note: the old critical_enter() code in kern/kern_switch.c is
    currently #ifdef'd to be compatible with both the old and new
    methodology.  In STAGE-2 it will be moved entirely to MD code.

  Performance issues: one of the purposes of this commit is to enhance
  critical section performance, specifically to greatly reduce bus
  overhead so that the critical section code can be used to protect
  per-cpu caches.  These caches, such as Jeff's slab allocator work,
  can potentially operate very quickly, making the effective savings
  of the new critical section code's performance very significant.

  The second purpose of this commit is to allow architectures to enable
  certain interrupts while in a critical section.  Specifically, the
  intention is to eventually allow certain FAST interrupts to operate
  rather than defer.

  The third purpose of this commit is to begin to clean up the
  critical_enter()/critical_exit()/cpu_critical_enter()/
  cpu_critical_exit() API, which currently has serious cross pollution
  in MI code (in fork_exit() and ast(), for example).

  The fourth purpose of this commit is to provide a framework that
  allows kernel-preempting software interrupts to be implemented
  cleanly.  This is currently used for two forward interrupts in i386.
  Other architectures will have the choice of using this infrastructure
  or building the functionality directly into critical_enter()/
  critical_exit().

  Finally, this commit is designed to greatly improve the flexibility
  of various architectures to manage critical section handling,
  software interrupts, preemption, and other highly integrated
  architecture-specific details.
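
  A minimal sketch of the i386 fast path described above; td_critnest
  is from the commit text, while unpend() is an assumed name for the
  deferred-interrupt handler:

      void
      critical_enter(void)
      {
              curthread->td_critnest++;       /* no cli needed */
      }

      void
      critical_exit(void)
      {
              struct thread *td = curthread;

              if (td->td_critnest == 1) {
                      /* Leaving the outermost section: run whatever
                       * interrupts were deferred while inside. */
                      unpend();
              }
              td->td_critnest--;
      }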
* Lock struct pgrp, session, and sigio.
  [tanimura, 2002-02-23, 1 file, -1/+7]
  New locks are:
  - pgrpsess_lock, which locks the whole pgrps and sessions,
  - pg_mtx, which protects the pgrp members, and
  - s_mtx, which protects the session members.
  Please refer to sys/proc.h for the coverage of these locks.

  Changes to the pgrp/session interface:
  - pgfind() needs the pgrpsess_lock held.
  - The caller of enterpgrp() is responsible for allocating a new pgrp
    and session.
  - Call enterthispgrp() in order to enter an existing pgrp.
  - pgsignal() requires a pgrp lock held.
  Reviewed by: jhb, alfred
  Tested on: cvsup.jp.FreeBSD.org (which is a quad-CPU machine running
  -current)
* Add some DIAGNOSTIC code.  [julian, 2002-02-22, 1 file, -6/+9]
  While in userland, keep the thread's ucred reference in a shadow
  field so that the usual place to store it is NULL.  If DIAGNOSTIC is
  not set, the thread ucred is kept valid until the next kernel entry,
  at which time it is checked against the process cred and possibly
  corrected.  Produces a BIG speedup in kernels with INVARIANTS set.
  (A previous commit corrected it for the non-INVARIANTS case already.)
  Reviewed by: dillon@freebsd.org
* Convert p->p_runtime and PCPU(switchtime) to bintime format.
  [phk, 2002-02-22, 1 file, -2/+2]
* A few misc forkbomb defenses:  [silby, 2002-02-19, 1 file, -2/+5]
  - Leave 10 processes for root-only use; the previous value of 1 was
    insufficient to run "ps ax | more".
  - Remove the printing of "proc: table full".  When the table really
    is full, this would flood the screen/logs, making the problem
    tougher to deal with.
  - Force any process trying to fork beyond its user's maximum number
    of processes to sleep for .5 seconds before returning failure.
    This turns 2000 rampaging fork monsters into 2000 harmlessly
    snoozing fork monsters.
  Reviewed by: dillon, peter
  MFC after: 1 week
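
  The throttling in the last item is essentially a one-liner in
  fork1(); a sketch assuming a dummy wait channel that is never woken:

      static int forksleep;   /* dummy wait channel */

      /* Over the per-uid process limit: delay the failure so a fork
       * bomb burns wall-clock time instead of CPU. */
      tsleep(&forksleep, PUSER, "fork", hz / 2);
      return (EAGAIN);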
* If the credential on an incoming thread is correct, don't bother
  reacquiring it.  [julian, 2002-02-17, 1 file, -0/+2]
  In the same vein, don't bother dropping the thread cred when going to
  userland: we are guaranteed to need it when we come back (which we
  are guaranteed to do).
  Reviewed by: jhb@freebsd.org, bde@freebsd.org (slightly different
  version)
* Fix a couple of style bugs introduced (or touched) by the previous
  commit.  [peter, 2002-02-07, 1 file, -1/+0]
* Pre-KSE/M3 commit.  [julian, 2002-02-07, 1 file, -30/+33]
  This is a low-functionality change that changes the kernel to access
  the main thread of a process via the linked list of threads rather
  than assuming that it is embedded in the process.  It IS still
  embedded there, but remove all the code that assumes that, in
  preparation for the next commit, which will actually move it out.
  Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice
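
  The access-pattern change in miniature; the embedded field name is
  illustrative, while FIRST_THREAD_IN_PROC() is the sys/proc.h
  convenience macro that wraps the list walk:

      /* Before: assume the thread lives inside struct proc. */
      td = &p->p_embedded_thread;       /* field name illustrative */

      /* After: go through the thread list, even while the storage
       * is in fact still embedded. */
      td = TAILQ_FIRST(&p->p_threads);  /* FIRST_THREAD_IN_PROC(p) */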
* SMP: lock struct file, filedesc, and the global file list.
  [alfred, 2002-01-13, 1 file, -3/+9]
  Seigo Tanimura (tanimura) posted the initial delta.  I've polished it
  quite a bit, reducing the need for locking and adapting it for KSE.

  Locks:
  - One mutex in each filedesc protects all the fields.  It also
    protects "struct file" initialization: while a struct file is being
    changed from &badfileops to &pipeops or something, the filedesc
    should be locked.
  - One mutex in each struct file protects the refcount fields and
    doesn't protect anything else.  The flags used for garbage
    collection have been moved to f_gcflag, which was the FILLER short;
    this doesn't need locking because the garbage collection is a
    single-threaded container.  It could likely be made to use a pool
    mutex.
  - One sx lock for the global filelist.

  New/changed interfaces:
  - struct file *fhold(struct file *fp);
    increments the reference count on a file.
  - struct file *fhold_locked(struct file *fp);
    like fhold(), but expects the file to be locked.
  - struct file *ffind_hold(struct thread *, int fd);
    finds the struct file in the thread, adds one reference, and
    returns it unlocked.
  - struct file *ffind_lock(struct thread *, int fd);
    ffind_hold(), but returns the file locked.

  I still have to SMP-safe the fget cruft; I'll get to that ASAP.
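
  A usage sketch built only from the interfaces listed above (the
  error handling is illustrative):

      struct file *fp;

      /* Look up fd in the current thread's filedesc, taking a
       * reference so the file cannot be freed under us. */
      fp = ffind_hold(td, fd);
      if (fp == NULL)
              return (EBADF);
      /* ... use fp: the refcount, not a lock, keeps it alive ... */
      fdrop(fp, td);          /* release our reference */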
* GC fast_vfork; it's not actually referenced anywhere.
  [silby, 2002-01-09, 1 file, -4/+0]
  MFC after: 3 weeks
* Return EINVAL if kernel-only flags are passed to the rfork syscall,
  rather than silently masking them.  [jhb, 2001-12-19, 1 file, -2/+4]
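
  The shape of that check; RFKERNELONLY here is an assumed name for
  the mask of flags user code may not set:

      /* Reject, rather than silently strip, kernel-only flags. */
      if ((uap->flags & RFKERNELONLY) != 0)
              return (EINVAL);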
* Modify the critical section API as follows:
  [jhb, 2001-12-18, 1 file, -7/+2]
  - The MD functions critical_enter/exit are renamed to start with a
    cpu_ prefix.
  - MI wrapper functions critical_enter/exit maintain a per-thread
    nesting count and a per-thread critical section saved state, set
    when entering a critical section while at nesting level 0 and
    restored when exiting to nesting level 0.  This moves the saved
    state out of spin mutexes so that interlocking spin mutexes works
    properly.
  - Most low-level MD code that used critical_enter/exit now uses
    cpu_critical_enter/exit.  MI code such as device drivers and spin
    mutexes uses the MI wrappers.  Note that since the MI wrappers
    store the state in the current thread, they do not have any return
    values or arguments.
  - mtx_intr_enable() is replaced with a constant CRITICAL_FORK which
    is assigned to curthread->td_savecrit during fork_exit().
  Tested on: i386, alpha
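
  A sketch of the MI wrappers this describes; the exact
  cpu_critical_enter() return type is an assumption drawn from the
  text:

      void
      critical_enter(void)
      {
              struct thread *td = curthread;

              if (td->td_critnest == 0)
                      td->td_savecrit = cpu_critical_enter();
              td->td_critnest++;
      }

      void
      critical_exit(void)
      {
              struct thread *td = curthread;

              if (td->td_critnest == 1)
                      cpu_critical_exit(td->td_savecrit);
              td->td_critnest--;
      }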
* Fix some nits in fork_exit() so it more properly duplicates the
  backend of mi_switch():  [jhb, 2001-12-14, 1 file, -4/+4]
  - Set the oncpu value for the current thread.
  - Always set switchticks, not just in the SMP case.
  - Add a KTR entry for fork_exit that is the same as the "new proc"
    entry in mi_switch().
  - Release sched_lock a bit later, like we do with mi_switch().
* Add a per-thread ucred reference for syscalls and synchronous traps
  from userland.  [jhb, 2001-10-26, 1 file, -0/+5]
  The per-thread ucred reference is immutable and thus needs no locks
  to be read.  However, until all the proc locking associated with
  writes to p_ucred is completed, it is still not safe to use the
  per-thread reference.
  Tested on: x86 (SMP), alpha
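
  The payoff once that proc locking lands, sketched:

      /* Today: p_ucred can change, so lock around the read. */
      PROC_LOCK(p);
      crhold(p->p_ucred);
      cr = p->p_ucred;
      PROC_UNLOCK(p);

      /* Then: td_ucred is frozen for the life of the syscall, so a
       * bare read is safe. */
      cr = td->td_ucred;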
* Fix ktrace enablement/disablement races that can result in a vnode
  ref count panic.  [dillon, 2001-10-24, 1 file, -3/+4]
  Bug noticed by: ps
  Reviewed by: ps
  MFC after: 1 day