path: root/sys/kern/kern_exec.c
Commit message | Author | Age | Files | Lines
* Modify the MAC Framework so that instead of embedding a (struct label) in various kernel objects to represent security data, we embed a (struct label *) pointer, which now references labels allocated using a UMA zone (mac_label.c). This allows the size and shape of struct label to be varied without changing the size and shape of these kernel objects, which become part of the frozen ABI with 5-STABLE. This opens the door for boot-time selection of the number of label slots, and hence changes to the bound on the number of simultaneous labeled policies at boot-time instead of compile-time. This also makes it easier to embed label references in new objects as required for locking/caching with fine-grained network stack locking, such as inpcb structures.

  This change also moves us further in the direction of hiding the structure of kernel objects from MAC policy modules, not to mention dramatically reducing the number of '&' symbols appearing in both the MAC Framework and MAC policy modules, and improving readability.

  While this results in minimal performance change with MAC enabled, it will observably shrink the size of a number of critical kernel data structures for the !MAC case, and should have a small (but measurable) performance benefit (i.e., struct vnode, struct socket) due to memory conservation and reduced cost of zeroing memory.

  NOTE: Users of MAC must recompile their kernel and all MAC modules as a result of this change. Because this is an API change, third party MAC modules will also need to be updated to make less use of the '&' symbol.

  Suggestions from: bmilekic
  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, Network Associates Laboratories
  [rwatson, 2003-11-12, 1 file changed, -13/+11]
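The difference between embedding a label and embedding a pointer to one, as described in the entry above, can be pictured with a small C sketch. All names here (`demo_label`, `demo_obj_embedded`, `demo_obj_pointer`, `DEMO_MAX_SLOTS`) are hypothetical stand-ins rather than the kernel's definitions, and `calloc()` stands in for the UMA zone allocation the commit mentions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical slot count; the real struct label differs. */
#define DEMO_MAX_SLOTS 4

struct demo_label {
	int	l_flags;
	long	l_slots[DEMO_MAX_SLOTS];	/* per-policy storage */
};

/* Before: the label is embedded, so its size is baked into the object. */
struct demo_obj_embedded {
	int			o_id;
	struct demo_label	o_label;
};

/* After: only a pointer is embedded; the label lives elsewhere. */
struct demo_obj_pointer {
	int			 o_id;
	struct demo_label	*o_label;
};

/* Allocate a zeroed label (assumption: calloc replaces the UMA zone). */
static struct demo_label *
demo_label_alloc(void)
{
	return (calloc(1, sizeof(struct demo_label)));
}

/* Release a label obtained from demo_label_alloc(). */
static void
demo_label_free(struct demo_label *l)
{
	free(l);
}
```

With this layout, changing `DEMO_MAX_SLOTS` changes `sizeof(struct demo_label)` but leaves `sizeof(struct demo_obj_pointer)` untouched, which is the ABI property the commit message is after.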
* Remove md_bspstore from the MD fields of struct thread. Now that the backing store is at a fixed address, there's no need for a per-thread variable.
  [marcel, 2003-10-21, 1 file changed, -1/+0]
* Put the RSE backing store at a fixed address. This change is triggered by libguile that needs to know the base of the RSE backing store. We currently do not export the fixed address to userland by means of a sysctl so user code needs to hardcode it for now. This will be revisited later.

  The RSE backing store is now at the bottom of region 4. The memory stack is at the top of region 4. This means that the whole region is usable for the stacks, giving a 61-bit stack space.

  Port: lang/guile (dependency of x11/gnome2)
  [marcel, 2003-10-20, 1 file changed, -1/+1]
* Eliminate some unnecessary uses of the vm page queues lock around the vm page's valid field. This field is being synchronized using the containing vm object's lock.
  [alc, 2003-10-04, 1 file changed, -9/+6]
* Remove the regstkpages sysctl variable. We have a growable register stack now.
  [marcel, 2003-09-27, 1 file changed, -6/+0]
* Part 2 of implementing rstacks: add the ability to create rstacks and use the ability on ia64 to map the register stack. The orientation of the stack (i.e. its grow direction) is passed to vm_map_stack() in the overloaded cow argument. Since the grow direction is represented by bits, it is possible and allowed to create bi-directional stacks. This is not an advertised feature, more of a side-effect.

  Fix a bug in vm_map_growstack() that's specific to rstacks and which we could only find by having the ability to create rstacks: when the mapped stack ends at the faulting address, we have not actually mapped the faulting address. We need to include or cover the faulting address.

  Note that at this time mmap(2) has not been extended to allow the creation of rstacks by processes. If such a need arises, this can be done.

  Tested on: alpha, i386, ia64, sparc64
  [marcel, 2003-09-27, 1 file changed, -15/+9]
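Two points in the entry above lend themselves to a short C sketch: grow directions encoded as independent bits (so both can be set at once), and the off-by-one when an upward-growing stack ends exactly at the faulting address. The flag names and values, and both helper functions, are illustrative assumptions and not the code in vm_map.c.

```c
#include <stdbool.h>

/* Illustrative flag bits; the kernel's actual names and values may differ. */
#define DEMO_STACK_GROWS_DOWN	0x1000
#define DEMO_STACK_GROWS_UP	0x2000

/* True if the flags word requests a downward-growing stack. */
static bool
demo_grows_down(int cow)
{
	return ((cow & DEMO_STACK_GROWS_DOWN) != 0);
}

/* True if the flags word requests an upward-growing stack. */
static bool
demo_grows_up(int cow)
{
	return ((cow & DEMO_STACK_GROWS_UP) != 0);
}

/*
 * For an upward-growing stack whose mapping ends at `end` (exclusive),
 * return how many bytes to add so that `fault_addr` is covered, rounded
 * up to a multiple of `pagesz`.  A fault exactly at `end` is NOT yet
 * mapped, so it must still grow the stack by one page.
 */
static unsigned long
demo_grow_amount_up(unsigned long end, unsigned long fault_addr,
    unsigned long pagesz)
{
	unsigned long need;

	if (fault_addr < end)
		return (0);		/* already inside the mapping */
	need = fault_addr - end + 1;	/* +1 covers the faulting byte */
	return (((need + pagesz - 1) / pagesz) * pagesz);
}
```

Because the two directions are separate bits, `DEMO_STACK_GROWS_DOWN | DEMO_STACK_GROWS_UP` is a representable value, which is the bi-directional side effect the message notes.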
* Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit systems where the data/stack/etc limits are too big for a 32 bit process.

  Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.

  Supply an ia32_fixlimits function. Export the clip/default values to sysctl under the compat.ia32 hierarchy.

  Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max value rather than the sysctl tweakable variable. This allows mmap to place mappings at sensible locations when limits have been reduced.

  Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same method as mmap(0, ...) now does.

  Note that we cannot remove all references to the sysctl tweakable maxdsiz etc variables because /etc/login.conf specifies a datasize of 'unlimited'. And that causes exec etc to fail since it can no longer find space to mmap things.
  [peter, 2003-09-25, 1 file changed, -0/+9]
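The idea of a per-ABI hook that clips oversized limits, as in the entry above, might look roughly like the following C sketch. The struct layouts, the hook signature, and the `DEMO_IA32_MAXDSIZ` cap are simplified assumptions for illustration; the real `sv_fixlimits` hook and `ia32_fixlimits` function take different arguments.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap for a 32-bit process's data segment (512 MB). */
#define DEMO_IA32_MAXDSIZ	((uint64_t)512 * 1024 * 1024)

/* Simplified soft/hard resource limit pair. */
struct demo_rlimit {
	uint64_t rlim_cur;
	uint64_t rlim_max;
};

/* Simplified ABI descriptor: a NULL hook means "no adjustment needed". */
struct demo_sysentvec {
	void (*sv_fixlimits)(struct demo_rlimit *data);
};

/* Clip a limit pair so that neither value exceeds `cap`. */
static void
demo_clip_limit(struct demo_rlimit *rl, uint64_t cap)
{
	if (rl->rlim_max > cap)
		rl->rlim_max = cap;
	if (rl->rlim_cur > rl->rlim_max)
		rl->rlim_cur = rl->rlim_max;
}

/* Hook for the hypothetical 32-bit ABI. */
static void
demo_ia32_fixlimits(struct demo_rlimit *data)
{
	demo_clip_limit(data, DEMO_IA32_MAXDSIZ);
}

/* Called at exec time: apply the ABI's hook if it supplies one. */
static void
demo_exec_fixlimits(const struct demo_sysentvec *sv, struct demo_rlimit *data)
{
	if (sv->sv_fixlimits != NULL)
		sv->sv_fixlimits(data);
}
```

Keeping the hook optional means a native ABI pays nothing, while a 32-bit ABI on a 64-bit kernel gets its limits reduced before the image is mapped.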
* Add a "int fd" argument to VOP_OPEN() which in the future will contain the filedescriptor number on opens from userland.

  The index is used rather than a "struct file *" since it conveys a bit more information, which may be useful to fdescfs and /dev/fd/* in particular.

  For now pass -1 all over the place.
  [phk, 2003-07-26, 1 file changed, -1/+1]
* Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.
  [davidxu, 2003-06-15, 1 file changed, -2/+2]
* Add vm object locking to various pagers' "get pages" methods, i386 stack management functions, and a u area management function.
  [alc, 2003-06-13, 1 file changed, -2/+0]
* Use __FBSDID().
  [obrien, 2003-06-11, 1 file changed, -2/+3]
* Update the vm object and page locking in exec_map_first_page(). Mark the one still anticipated change with XXX. Otherwise, this function is done.
  [alc, 2003-06-09, 1 file changed, -9/+15]
* Lock the vm object when performing vm_page_grab().
  [alc, 2003-06-08, 1 file changed, -2/+2]
* - Merge struct procsig with struct sigacts.
  - Move struct sigacts out of the u-area and malloc() it using the M_SUBPROC malloc bucket.
  - Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(), sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
  - Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
  - Add a mutex to struct sigacts that protects all the members of the struct.
  - Add sigacts locking.
  - Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now that sigacts is locked.
  - Several in-kernel functions such as psignal(), tdsignal(), trapsignal(), and thread_stopped() are now MP safe.

  Reviewed by: arch@
  Approved by: re (rwatson)
  [jhb, 2003-05-13, 1 file changed, -15/+10]
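An alloc/free/copy/share/shared API of the kind listed above is commonly built on a reference count. The C sketch below shows one plausible shape under that assumption; the `demo_` names and the `ps_refcnt`/`ps_flags` fields are hypothetical, and the mutex that the commit adds is deliberately left out, so this version is only safe single-threaded.

```c
#include <stdlib.h>

/* Hypothetical, heavily reduced signal-actions structure. */
struct demo_sigacts {
	int ps_refcnt;	/* number of processes sharing this structure */
	int ps_flags;	/* stand-in for the real per-signal state */
};

/* Allocate a zeroed structure holding a single reference. */
static struct demo_sigacts *
demo_sigacts_alloc(void)
{
	struct demo_sigacts *ps;

	ps = calloc(1, sizeof(*ps));
	if (ps != NULL)
		ps->ps_refcnt = 1;
	return (ps);
}

/* Drop one reference; free the structure when the last one goes. */
static void
demo_sigacts_free(struct demo_sigacts *ps)
{
	if (--ps->ps_refcnt == 0)
		free(ps);
}

/* Take an additional reference and return the same structure. */
static struct demo_sigacts *
demo_sigacts_share(struct demo_sigacts *ps)
{
	ps->ps_refcnt++;
	return (ps);
}

/* Nonzero if more than one holder references the structure. */
static int
demo_sigacts_shared(const struct demo_sigacts *ps)
{
	return (ps->ps_refcnt > 1);
}

/* Copy the contents of src into dest, preserving dest's own refcount. */
static void
demo_sigacts_copy(struct demo_sigacts *dest, const struct demo_sigacts *src)
{
	int refcnt = dest->ps_refcnt;

	*dest = *src;
	dest->ps_refcnt = refcnt;
}
```

A caller that must modify shared state would typically test `demo_sigacts_shared()` first and, if it returns nonzero, allocate a private copy before making changes.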
* - Borrow the KSE single threading code for exec and exit. We use the check if (p->p_numthreads > 1) and not a flag because action is only necessary if there are other threads. The rest of the system has no need to identify thr threaded processes.
  - In kern_thread.c use thr_exit1() instead of thread_exit() if P_THREADED is not set.
  [jeff, 2003-04-01, 1 file changed, -1/+1]
* Replace the at_fork, at_exec, and at_exit functions with the slightly more flexible process_fork, process_exec, and process_exit eventhandlers. This reduces code duplication and also means that I don't have to go duplicate the eventhandler locking three more times for each of at_fork, at_exec, and at_exit.

  Reviewed by: phk, jake, almost complete silence on arch@
  [jhb, 2003-03-24, 1 file changed, -59/+2]
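The code-duplication argument in the entry above can be illustrated with a generic handler list: one list type and one pair of register/invoke routines serve fork, exec, and exit alike. The following C sketch is a hypothetical, fixed-capacity registry without locking; it is not the kernel's EVENTHANDLER machinery, and every `demo_` name is invented for the example.

```c
#include <stddef.h>

#define DEMO_MAX_HANDLERS 8

/* Callback type: receives its registered argument and a process id. */
typedef void (*demo_handler_fn)(void *arg, int pid);

struct demo_handler {
	demo_handler_fn	 fn;
	void		*arg;
};

/* One list per event (fork, exec, exit); all share the same code. */
struct demo_event_list {
	struct demo_handler	h[DEMO_MAX_HANDLERS];
	size_t			n;
};

/* Register a handler; returns 0 on success, -1 if the list is full. */
static int
demo_register(struct demo_event_list *el, demo_handler_fn fn, void *arg)
{
	if (el->n >= DEMO_MAX_HANDLERS)
		return (-1);
	el->h[el->n].fn = fn;
	el->h[el->n].arg = arg;
	el->n++;
	return (0);
}

/* Invoke every registered handler in registration order. */
static void
demo_invoke(const struct demo_event_list *el, int pid)
{
	size_t i;

	for (i = 0; i < el->n; i++)
		el->h[i].fn(el->h[i].arg, pid);
}

/* Example handler: adds the pid to an int accumulator passed as arg. */
static void
demo_add_pid(void *arg, int pid)
{
	*(int *)arg += pid;
}
```

Unlike a bare function-pointer list, each registration carries its own `arg`, which is one way such a scheme can be "slightly more flexible" than a callback that takes no private data.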
* - Cache a reference to the credential of the thread that starts a ktrace in struct proc as p_tracecred alongside the current cache of the vnode in p_tracep. This credential is then used for all later ktrace operations on this file rather than using the credential of the current thread at the time of each ktrace event.
  - Now that we have multiple ktrace-related items in struct proc that are pointers, rename p_tracep to p_tracevp to make it less ambiguous.

  Requested by: rwatson (1)
  [jhb, 2003-03-13, 1 file changed, -3/+8]
* Change the process flags P_KSES to be P_THREADED. This is just a cosmetic change but I've been meaning to do it for about a year.
  [julian, 2003-02-27, 1 file changed, -2/+2]
* Back out M_* changes, per decision of the TRB.

  Approved by: trb
  [imp, 2003-02-19, 1 file changed, -3/+3]
* - Split the struct kse into struct upcall and struct kse. struct kse will soon be visible only to schedulers. This greatly simplifies much of the KSE code.

  Submitted by: davidxu
  [jeff, 2003-02-17, 1 file changed, -3/+0]
* Reversion of commit by Davidxu plus fixes since applied.

  I'm not convinced there is anything major wrong with the patch but them's the rules..

  I am using my "David's mentor" hat to revert this as he's offline for a while.
  [julian, 2003-02-01, 1 file changed, -0/+3]
* Move UPCALL related data structure out of kse, introduce a new data structure called kse_upcall to manage UPCALL. All KSE binding and loaning code are gone.

  A thread that owns an upcall can collect all completed syscall contexts in its ksegrp, turn itself into UPCALL mode, and take those contexts back to userland. Any thread without an upcall structure has to export its context and exit at the user boundary.

  Any thread running in user mode owns an upcall structure. When it enters the kernel, if the kse mailbox's current thread pointer is not NULL, then when the thread is blocked in the kernel, a new UPCALL thread is created and the upcall structure is transferred to the new UPCALL thread. If the kse mailbox's current thread pointer is NULL, then when a thread is blocked in the kernel, no UPCALL thread will be created.

  Each upcall always has an owner thread. Userland can remove an upcall by calling kse_exit; when all upcalls in a ksegrp are removed, the group is automatically shut down. An upcall owner thread also exits when the process is in the exiting state. When an owner thread exits, the upcall it owns is also removed.

  KSE is a pure scheduler entity. It represents a virtual CPU. When a thread is running, it always has a KSE associated with it. The scheduler is free to assign a KSE to a thread according to thread priority; if the thread priority is changed, the KSE can be moved from one thread to another.

  When a ksegrp is created, N KSEs are always created in the group, where N is the number of physical CPUs in the current system. This makes it possible that, even if a userland UTS is only single-CPU safe, threads in the kernel can still execute on different CPUs in parallel. Userland calls kse_create to add more upcall structures to a ksegrp to increase concurrency in userland itself; the kernel is not restricted by the number of upcalls userland provides.

  The code hasn't been tested under SMP by the author due to lack of hardware.

  Reviewed by: julian
  [davidxu, 2003-01-26, 1 file changed, -3/+0]
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
  [alfred, 2003-01-21, 1 file changed, -3/+3]
* Perform VOP_GETATTR() before mac_check_vnode_exec() so that the cached attributes are available to MAC modules.

  Submitted by: mike halderman <mrh@nosc.mil>
  Obtained from: TrustedBSD Project
  [rwatson, 2003-01-21, 1 file changed, -5/+5]
* It is possible for an active aio to prevent shared memory from being dereferenced when a process exits due to the vmspace ref-count being bumped. Change shmexit() and shmexit_myhook() to take a vmspace instead of a process and call it in vmspace_dofree(). This way if it is missed in exit1()'s early-resource-free it will still be caught when the zombie is reaped.

  Also fix a potential race in shmexit_myhook() by NULLing out vmspace->vm_shm prior to calling shm_delete_mapping() and free().

  MFC after: 7 days
  [dillon, 2003-01-13, 1 file changed, -2/+1]
* Clear some KSE fields after kse mode was turned off.
  [davidxu, 2003-01-07, 1 file changed, -0/+3]
* Add a sysctl to get the vm protections for the stack of the current process. On architectures with a non-executable stack, e.g. sparc64, this is used by libgcc to determine at runtime if it's necessary to enable execute permissions on a region of the stack which will be used to execute code, allowing the call to mprotect to be avoided if the kernel is configured to map the stack executable.
  [jake, 2003-01-04, 1 file changed, -0/+14]
* fdcopy() only needs a filedesc pointer.
  [alfred, 2003-01-01, 1 file changed, -1/+1]
* Hold the page queues lock when performing vm_page_busy().
  [alc, 2002-12-18, 1 file changed, -0/+2]
* Remove syscallarg().

  Suggested by: peter
  [alfred, 2002-12-14, 1 file changed, -7/+7]
* To avoid sleeping with all sorts of resources acquired (the reported problem was a locked directory vnode), do not give the process a chance to sleep in state "stopevent" (depends on the S_EXEC bit being set in p_stops) until most resources have been released again.

  Approved by: re
  [robert, 2002-11-26, 1 file changed, -3/+7]
* Acquire and release the page queues lock around pmap_remove_pages() because it updates several of vm_page's fields.
  [alc, 2002-11-25, 1 file changed, -0/+2]
* - Release the imgp vnode prior to freeing exec_map resources to avoid deadlock.
  [jeff, 2002-11-17, 1 file changed, -4/+4]
* Now that pmap_remove_all() is exported by our pmap implementations use it directly.
  [alc, 2002-11-16, 1 file changed, -1/+1]
* When prot is VM_PROT_NONE, call pmap_page_protect() directly rather than indirectly through vm_page_protect(). The one remaining page flag that is updated by vm_page_protect() is already being updated by our various pmap implementations.

  Note: A later commit will similarly change the VM_PROT_READ case and eliminate vm_page_protect().
  [alc, 2002-11-10, 1 file changed, -1/+1]
* Correct merge-o: disable the right execve() variation if !MAC
  [rwatson, 2002-11-05, 1 file changed, -4/+4]
* Bring in two sets of changes:

  (1) Permit userland applications to request a change of label atomic with an execve() via mac_execve(). This is required for the SEBSD port of SELinux/FLASK. Attempts to invoke this without MAC compiled in result in ENOSYS, as with all other MAC system calls. Complexity, if desired, is present in policy modules, rather than the framework.

  (2) Permit policies to have access to both the label of the vnode being executed as well as the interpreter if it's a shell script or related UNIX nonsense. Because we can't hold both vnode locks at the same time, cache the interpreter label. SEBSD relies on this because it supports secure transitioning via shell script executables. Other policies might want to take both labels into account during an integrity or confidentiality decision at execve()-time.

  Approved by: re
  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, Network Associates Laboratories
  [rwatson, 2002-11-05, 1 file changed, -8/+72]
* Hook up the mac_will_execve_transition() and mac_execve_transition() entrypoints, #ifdef MAC. The supporting logic already existed in kern_mac.c, so no change there. This permits MAC policies to cause a process label change as the result of executing a binary -- typically, as a result of executing a specially labeled binary.

  For example, the SEBSD port of SELinux/FLASK uses this functionality to implement TE type transitions on processes using transitioning binaries, in a manner similar to setuid. Policies not implementing a notion of transition (all the ones in the tree right now) require no changes, since the old label data is copied to the new label via mac_create_cred() even if a transition does occur.

  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, Network Associates Laboratories
  [rwatson, 2002-11-05, 1 file changed, -0/+15]
* Remove reference to struct execve_args from struct imgact, which describes an image activation instance. Instead, make use of the existing fname structure entry, and introduce two new entries, userspace_argv, and userspace_envv. With the addition of mac_execve(), this divorces the image structure from the specifics of the execve() system call, removes a redundant pointer, etc. No semantic change from current behavior, but it means that the structure doesn't depend on syscalls.master-generated includes.

  There seems to be some redundant initialization of imgact entries, which I have maintained, but which could probably use some cleaning up at some point.

  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, Network Associates Laboratories
  [rwatson, 2002-11-05, 1 file changed, -17/+39]
* - Move the 'done1' label down below the unlock of the proc lock and move the locking of the proc lock after the goto to done1 to avoid locking the lock in an error case just so we can turn around and unlock it.
  - Move the exec_setregs() stuff out from under the proc lock and after the p_args stuff. This allows exec_setregs() to be able to sleep or write things out to userland, etc. which ia64 does.

  Tested by: peter
  [jhb, 2002-10-11, 1 file changed, -10/+9]
* Use the fields in the sysentvec and in the vm map header in place of the constants VM_MIN_ADDRESS, VM_MAXUSER_ADDRESS, USRSTACK and PS_STRINGS. This is mainly so that they can be variable even for the native abi, based on different machine types. Get stack protections from the sysentvec too. This makes it trivial to map the stack non-executable for certain abis, on machines that support it.
  [jake, 2002-09-21, 1 file changed, -20/+46]
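Replacing compile-time constants with per-ABI fields, as the entry above describes, can be sketched in C as follows. The field names follow the constants named in the message (minimum and maximum user address, stack top, ps_strings, stack protection), but the struct is a reduced stand-in, and every address and `DEMO_PROT_*` value is an arbitrary assumption chosen for the example.

```c
#include <stdbool.h>

/* Illustrative protection bits. */
#define DEMO_PROT_READ		0x1
#define DEMO_PROT_WRITE		0x2
#define DEMO_PROT_EXECUTE	0x4

/* Reduced stand-in for a per-ABI system entry vector. */
struct demo_sysentvec {
	unsigned long	sv_minuser;	/* replaces VM_MIN_ADDRESS */
	unsigned long	sv_maxuser;	/* replaces VM_MAXUSER_ADDRESS */
	unsigned long	sv_usrstack;	/* replaces USRSTACK */
	unsigned long	sv_psstrings;	/* replaces PS_STRINGS */
	int		sv_stackprot;	/* protections for the stack mapping */
};

/* An ABI that maps its stack read/write/execute (values are made up). */
static const struct demo_sysentvec demo_abi_exec_stack = {
	.sv_minuser = 0x1000UL,
	.sv_maxuser = 0xc0000000UL,
	.sv_usrstack = 0xbfc00000UL,
	.sv_psstrings = 0xbfbfffe0UL,
	.sv_stackprot = DEMO_PROT_READ | DEMO_PROT_WRITE | DEMO_PROT_EXECUTE,
};

/* The same layout, but with a non-executable stack. */
static const struct demo_sysentvec demo_abi_noexec_stack = {
	.sv_minuser = 0x1000UL,
	.sv_maxuser = 0xc0000000UL,
	.sv_usrstack = 0xbfc00000UL,
	.sv_psstrings = 0xbfbfffe0UL,
	.sv_stackprot = DEMO_PROT_READ | DEMO_PROT_WRITE,
};

/* True if this ABI asks for an executable stack. */
static bool
demo_stack_executable(const struct demo_sysentvec *sv)
{
	return ((sv->sv_stackprot & DEMO_PROT_EXECUTE) != 0);
}

/* True if addr lies in this ABI's user address range [min, max). */
static bool
demo_addr_in_user_range(const struct demo_sysentvec *sv, unsigned long addr)
{
	return (addr >= sv->sv_minuser && addr < sv->sv_maxuser);
}
```

Since exec code consults the vector rather than a constant, switching an ABI to a non-executable stack becomes a one-field change in its descriptor.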
* Move setugidsafety() call outside of process lock. This prevents a lock recursion when closef() calls pfind() which also wants the proc lock. This case only occurred when setugidsafety() needed to close unsafe files.

  Reviewed by: truckman
  [njl, 2002-09-14, 1 file changed, -3/+5]
* Drop the proc lock while calling fdcheckstd() which may block to allocate memory.

  Reviewed by: jhb
  [truckman, 2002-09-13, 1 file changed, -1/+8]
* s/SGNL/SIG/
  s/SNGL/SINGLE/
  s/SNGLE/SINGLE/

  Fix abbreviation for P_STOPPED_* etc flags, in original code they were inconsistent and difficult to distinguish between them.

  Approved by: julian (mentor)
  [davidxu, 2002-09-05, 1 file changed, -1/+1]
* Added fields for VM_MIN_ADDRESS, PS_STRINGS and stack protections to sysentvec. Initialized all fields of all sysentvecs, which will allow them to be used instead of constants in more places. Provided stack fixup routines for emulations that previously used the default.
  [jake, 2002-09-01, 1 file changed, -2/+0]
* Renamed poorly named setregs to exec_setregs. Moved its prototype to imgact.h with the other exec support functions.
  [jake, 2002-08-29, 1 file changed, -2/+2]
* Don't require that sysentvec.sv_szsigcode be non-NULL.
  [jake, 2002-08-29, 1 file changed, -3/+7]
* Fixed most indentation bugs.
  [jake, 2002-08-25, 1 file changed, -7/+6]
* Fixed placement of operators. Wrapped long lines.
  [jake, 2002-08-25, 1 file changed, -11/+15]
* Fixed white space around operators, casts and reserved words.

  Reviewed by: md5
  [jake, 2002-08-24, 1 file changed, -9/+8]