summaryrefslogtreecommitdiffstats
path: root/sys/kern/kern_exec.c
Commit message (Collapse)AuthorAgeFilesLines
* Grrr, this committer needs to have a sleep. Remove lines from the previoussobomax2005-01-291-3/+0
| | | | | | delta not intended for public consumption. MFC after: 2 weeks
* Fix small non-conformance introduced in the previous commit: execve() issobomax2005-01-291-4/+4
| | | | | | | expected to return ENAMETOOLONG, not E2BIG if first argument doesn't fit into {PATH_MAX} bytes. MFC after: 2 weeks
* o Split out kernel part of execve(2) syscall into two parts: one thatsobomax2005-01-291-96/+112
| | | | | | | | | | | copies arguments into the kernel space and one that operates completely in the kernel space; o use kernel-only version of execve(2) to kill another stackgap in linuxlator/i386. Obtained from: DragonFlyBSD (partially) MFC after: 2 weeks
* Don't use VOP_GETVOBJECT, use vp->v_object directly.phk2005-01-251-2/+3
|
* /* -> /*- for copyright notices, minor format tweaks as necessaryimp2005-01-061-1/+1
|
* Add new function fdunshare() which encapsulates the necessary light magicphk2004-12-141-10/+1
| | | | | | | | for ensuring that a process' filedesc is not shared with anybody. Use it in the two places which previously had private implmentations. This collects all fd_refcnt handling in kern_descrip.c
* Don't include sys/user.h merely for its side-effect of recursivelydas2004-11-271-1/+1
| | | | including other headers.
* Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST.phk2004-11-131-3/+3
| | | | | | | | Use this in all the places where sleeping with the lock held is not an issue. The distinction will become significant once we finalize the exact lock-type to use for this kind of case.
* Use more intuitive pointer for fdinit() and fdcopy().phk2004-11-081-1/+1
| | | | Change fdcopy() to take unlocked filedesc.
* Put on my peril sensitive sunglasses and add a flags field to the internalpeter2004-10-111-4/+4
| | | | | | | | | | | | | | | | sysctl routines and state. Add some code to use it for signalling the need to downconvert a data structure to 32 bits on a 64 bit OS when requested by a 32 bit app. I tried to do this in a generic abi wrapper that intercepted the sysctl oid's, or looked up the format string etc, but it was a real can of worms that turned into a fragile mess before I even got it partially working. With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have it not abort. Things like netstat, ps, etc have a long way to go. This also fixes a bug in the kern.ps_strings and kern.usrstack hacks. These do matter very much because they are used by libc_r and other things.
* Add an execve command for kse_thr_interrupt to allow libpthread todavidxu2004-10-071-3/+2
| | | | | | restore signal mask correctly, this is required by POSIX. Reviewed by: deischen
* In original kern_execve() code, at the start of the function, it forcesdavidxu2004-10-061-12/+46
| | | | | | | | | | | | | | | | | all other threads to suicide, problem is execve() could be failed, and a failed execve() would change threaded process to unthreaded, this side effect is unexpected. The new code introduces a new single threading mode SINGLE_BOUNDARY, in the mode, all threads should suspend themself at user boundary except the singler. we can not use SINGLE_NO_EXIT because we want to start from a clean state if execve() is successful, suspending other threads at unknown point and later resuming them from there and forcing them to exit at user boundary may cause the process to start from a dirty state. If execve() is successful, current thread upgrades to SINGLE_EXIT mode and forces other threads to suicide at user boundary, otherwise, other threads will be resumed and their interrupted syscall will be restarted. Reviewed by: julian
* - Don't try to unlock Giant if single threading fails since we don't havejhb2004-09-231-1/+1
| | | | | it locked. - Unlock Giant before calling exit1() since exit1() does not require Giant.
* Revert the last change..julian2004-09-221-17/+11
| | | | | | | Better to kill all other threads than to panic the system if 2 threads call execve() at the same time. A better fix will be committed later. Note that this only affects the case where the execve fails.
* In a threaded process, don't kill off all the other threads until we have ajulian2004-09-211-11/+17
| | | | | | | reasonable chance that the eceve() is going to succeeed. I.e. wait until we've done the permission checks etc. MFC after: 1 week
* Refactor a bunch of scheduler code to give basically the same behaviourjulian2004-09-051-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | but with slightly cleaned up interfaces. The KSE structure has become the same as the "per thread scheduler private data" structure. In order to not make the diffs too great one is #defined as the other at this time. The KSE (or td_sched) structure is now allocated per thread and has no allocation code of its own. Concurrency for a KSEGRP is now kept track of via a simple pair of counters rather than using KSE structures as tokens. Since the KSE structure is different in each scheduler, kern_switch.c is now included at the end of each scheduler. Nothing outside the scheduler knows the contents of the KSE (aka td_sched) structure. The fields in the ksegrp structure that are to do with the scheduler's queueing mechanisms are now moved to the kg_sched structure. (per ksegrp scheduler private data structure). In other words how the scheduler queues and keeps track of threads is no-one's business except the scheduler's. This should allow people to write experimental schedulers with completely different internal structuring. A scheduler call sched_set_concurrency(kg, N) has been added that notifies teh scheduler that no more than N threads from that ksegrp should be allowed to be on concurrently scheduled. This is also used to enforce 'fainess' at this time so that a ksegrp with 10000 threads can not swamp a the run queue and force out a process with 1 thread, since the current code will not set the concurrency above NCPU, and both schedulers will not allow more than that many onto the system run queue at a time. Each scheduler should eventualy develop their own methods to do this now that they are effectively separated. Rejig libthr's kernel interface to follow the same code paths as linkse for scope system threads. This has slightly hurt libthr's performance but I will work to recover as much of it as I can. Thread exit code has been cleaned up greatly. exit and exec code now transitions a process back to 'standard non-threaded mode' before taking the next step. Reviewed by: scottl, peter MFC after: 1 week
* Add locking to the kqueue subsystem. This also makes the kqueue subsystemjmg2004-08-151-1/+1
| | | | | | | | | | | | | a more complete subsystem, and removes the knowlege of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using it's filter ops. Currently, it uses the MTX_DUPOK even though it is not always safe to aquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases). Reviewed by: green, rwatson (both earlier versions)
* Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This iscperciva2004-07-261-1/+1
| | | | | | | | | | | somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb
* White space fix..julian2004-07-241-3/+3
| | | | diff reduction for upcoming commit.
* Push down the acquisition and release of the page queues lock intoalc2004-07-131-2/+0
| | | | | | | | pmap_remove_pages(). (The implementation of pmap_remove_pages() is optional. If pmap_remove_pages() is unimplemented, the acquisition and release of the page queues lock is unnecessary.) Remove spl calls from the alpha, arm, and ia64 pmap_remove_pages().
* Move TDF_SA from td_flags to td_pflags (and rename it accordingly)tjr2004-06-021-3/+1
| | | | | | | so that it is no longer necessary to hold sched_lock while manipulating it. Reviewed by: davidxu
* Clear KSE thread flags after KSE thread mode is ended. The side effectdavidxu2004-05-211-0/+3
| | | | | | | | of not clearing the flags for execv() syscall will result that a new program runs in KSE thread mode without enabling it. Submitted by: tjr Modified by: davidxu
* Utilize sf_buf_alloc() rather than pmap_qenter() (and sometimesalc2004-04-231-10/+12
| | | | | kmem_alloc_wait()) for mapping the image header. On all machines with a direct virtual-to-physical mapping and SMP/HTT i386s, this is a clear win.
* Use vm_page_hold() rather than vm_page_wire() for short-duration pagealc2004-04-111-2/+2
| | | | wiring. The reason being that vm_page_hold() is cheaper.
* Remove sysctl kern.ps_argsopen, it is not very useful, one should usepjd2004-04-011-3/+0
| | | | | | security.bsd.see_other_uids instead. Discussed with: phk, rwatson
* Make the process_exit eventhandler run without Giant. Add Giant hookspeter2004-03-141-0/+1
| | | | | | in the two consumers that need it.. processes using AIO and netncp. Update docs. Say that process_exec is called with Giant, but not to depend on it. All our consumers can handle it without Giant.
* Push Giant down a little further:peter2004-03-131-2/+0
| | | | | | | | | | | | | | | - no longer serialize on Giant for thread_single*() and family in fork, exit and exec - thread_wait() is mpsafe, assert no Giant - reduce scope of Giant in exit to not cover thread_wait and just do vm_waitproc(). - assert that thread_single() family are not called with Giant - remove the DROP/PICKUP_GIANT macros from thread_single() family - assert that thread_suspend_check() s not called with Giant - remove manual drop_giant hack in thread_suspend_check since we know it isn't held. - remove the DROP/PICKUP_GIANT macros from thread_suspend_check() family - mark kse_create() mpsafe
* Do what the execve(2) manpage says and enforce what a Strictlyru2004-03-121-1/+2
| | | | | | | | | Conforming POSIX application should do by disallowing the argv argument to be NULL. PR: kern/33738 Submitted by: Marc Olzheim, Serge van den Boom OK'ed by: nectar
* Lock Giant around the single threading code in exec() to satisfy anjhb2004-03-051-0/+3
| | | | assertion in the single threading code.
* Checkpoint a hack to enable running i386 libc_r binaries on a 64 bitpeter2004-02-181-4/+22
| | | | | | kernel. I'm not happy with it yet - refinements are to come. This hack allows the kern.ps_strings and kern.usrstack sysctls to respond to a 32 bit request, such as those coming from emulated i386 binaries.
* Fixed some style bugs (mainly, try to always use explicit comparisons withbde2003-12-281-8/+7
| | | | NULL when checking for null pointers).
* Fixed some disordering in revs.1.194 and 1,196. Moved the exceve() syscallbde2003-12-281-55/+55
| | | | | | | function back to near the beginning of the file. Rev.1.194 moved it into the middle of auxiliary functions following kern_execve(). Moved the __mac_execve() syscall function up together with execve(). It was new in rev1.1.196 and perfectly misplaced after execve().
* Remove GIANT_REQUIRED from exec_unmap_first_page().alc2003-12-271-1/+0
|
* Modify the MAC Framework so that instead of embedding a (struct label)rwatson2003-11-121-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in various kernel objects to represent security data, we embed a (struct label *) pointer, which now references labels allocated using a UMA zone (mac_label.c). This allows the size and shape of struct label to be varied without changing the size and shape of these kernel objects, which become part of the frozen ABI with 5-STABLE. This opens the door for boot-time selection of the number of label slots, and hence changes to the bound on the number of simultaneous labeled policies at boot-time instead of compile-time. This also makes it easier to embed label references in new objects as required for locking/caching with fine-grained network stack locking, such as inpcb structures. This change also moves us further in the direction of hiding the structure of kernel objects from MAC policy modules, not to mention dramatically reducing the number of '&' symbols appearing in both the MAC Framework and MAC policy modules, and improving readability. While this results in minimal performance change with MAC enabled, it will observably shrink the size of a number of critical kernel data structures for the !MAC case, and should have a small (but measurable) performance benefit (i.e., struct vnode, struct socket) do to memory conservation and reduced cost of zeroing memory. NOTE: Users of MAC must recompile their kernel and all MAC modules as a result of this change. Because this is an API change, third party MAC modules will also need to be updated to make less use of the '&' symbol. Suggestions from: bmilekic Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
* Remove md_bspstore from the MD fields of struct thread. Now thatmarcel2003-10-211-1/+0
| | | | | the backing store is at a fixed address, there's no need for a per-thread variable.
* Put the RSE backing store at a fixed address. This change is triggeredmarcel2003-10-201-1/+1
| | | | | | | | | | | | | by libguile that needs to know the base of the RSE backing store. We currently do not export the fixed address to userland by means of a sysctl so user code needs to hardcode it for now. This will be revisited later. The RSE backing store is now at the bottom of region 4. The memory stack is at the top of region 4. This means that the whole region is usable for the stacks, giving a 61-bit stack space. Port: lang/guile (depended of x11/gnome2)
* Eliminate some unnecessary uses of the vm page queues lock around thealc2003-10-041-9/+6
| | | | | vm page's valid field. This field is being synchronized using the containing vm object's lock.
* Remove the regstkpages sysctl variable. We have a growable registermarcel2003-09-271-6/+0
| | | | stack now.
* Part 2 of implementing rstacks: add the ability to create rstacks andmarcel2003-09-271-15/+9
| | | | | | | | | | | | | | | | | | | | use the ability on ia64 to map the register stack. The orientation of the stack (i.e. its grow direction) is passed to vm_map_stack() in the overloaded cow argument. Since the grow direction is represented by bits, it is possible and allowed to create bi-directional stacks. This is not an advertised feature, more of a side-effect. Fix a bug in vm_map_growstack() that's specific to rstacks and which we could only find by having the ability to create rstacks: when the mapped stack ends at the faulting address, we have not actually mapped the faulting address. we need to include or cover the faulting address. Note that at this time mmap(2) has not been extended to allow the creation of rstacks by processes. If such a need arises, this can be done. Tested on: alpha, i386, ia64, sparc64
* Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bitpeter2003-09-251-0/+9
| | | | | | | | | | | | | | | | | | | | | systems where the data/stack/etc limits are too big for a 32 bit process. Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c. Supply an ia32_fixlimits function. Export the clip/default values to sysctl under the compat.ia32 heirarchy. Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max value rather than the sysctl tweakable variable. This allows mmap to place mappings at sensible locations when limits have been reduced. Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same method as mmap(0, ...) now does. Note that we cannot remove all references to the sysctl tweakable maxdsiz etc variables because /etc/login.conf specifies a datasize of 'unlimited'. And that causes exec etc to fail since it can no longer find space to mmap things.
* Add a "int fd" argument to VOP_OPEN() which in the future willphk2003-07-261-1/+1
| | | | | | | | | contain the filedescriptor number on opens from userland. The index is used rather than a "struct file *" since it conveys a bit more information, which may be useful to in particular fdescfs and /dev/fd/* For now pass -1 all over the place.
* Rename P_THREADED to P_SA. P_SA means a process is using schedulerdavidxu2003-06-151-2/+2
| | | | activations.
* Add vm object locking to various pagers' "get pages" methods, i386 stackalc2003-06-131-2/+0
| | | | management functions, and a u area management function.
* Use __FBSDID().obrien2003-06-111-2/+3
|
* Update the vm object and page locking in exec_map_first_page(). Mark thealc2003-06-091-9/+15
| | | | one still anticipated change with XXX. Otherwise, this function is done.
* Lock the vm object when performing vm_page_grab().alc2003-06-081-2/+2
|
* - Merge struct procsig with struct sigacts.jhb2003-05-131-15/+10
| | | | | | | | | | | | | | | | | - Move struct sigacts out of the u-area and malloc() it using the M_SUBPROC malloc bucket. - Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(), sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared(). - Remove the p_sigignore, p_sigacts, and p_sigcatch macros. - Add a mutex to struct sigacts that protects all the members of the struct. - Add sigacts locking. - Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now that sigacts is locked. - Several in-kernel functions such as psignal(), tdsignal(), trapsignal(), and thread_stopped() are now MP safe. Reviewed by: arch@ Approved by: re (rwatson)
* - Borrow the KSE single threading code for exec and exit. We use the checkjeff2003-04-011-1/+1
| | | | | | | | if (p->p_numthreads > 1) and not a flag because action is only necessary if there are other threads. The rest of the system has no need to identify thr threaded processes. - In kern_thread.c use thr_exit1() instead of thread_exit() if P_THREADED is not set.
* Replace the at_fork, at_exec, and at_exit functions with the slightly morejhb2003-03-241-59/+2
| | | | | | | | | flexible process_fork, process_exec, and process_exit eventhandlers. This reduces code duplication and also means that I don't have to go duplicate the eventhandler locking three more times for each of at_fork, at_exec, and at_exit. Reviewed by: phk, jake, almost complete silence on arch@
* - Cache a reference to the credential of the thread that starts a ktrace injhb2003-03-131-3/+8
| | | | | | | | | | | struct proc as p_tracecred alongside the current cache of the vnode in p_tracep. This credential is then used for all later ktrace operations on this file rather than using the credential of the current thread at the time of each ktrace event. - Now that we have multiple ktrace-related items in struct proc that are pointers, rename p_tracep to p_tracevp to make it less ambiguous. Requested by: rwatson (1)
OpenPOWER on IntegriCloud