path: root/sys/kern/kern_exec.c
Commit message / Author / Age / Files / Lines
...
* Move execve's access time update functionality into a newdds2005-10-121-12/+2
| | | | | | | | vfs_mark_atime() function, and use the new function for performing efficient atime updates in mmap(). Reviewed by: bde MFC after: 2 weeks
* Add missing word to comment.truckman2005-10-041-1/+1
|
* If sufficiently bad things happen during a call to kern_execve(), it iscperciva2005-10-031-0/+8
| possible for do_execve() to call exit1() rather than returning. As a result, the sequence "allocate memory; call kern_execve; free memory" can end up leaking memory.
| This commit documents this astonishing behaviour and adds a call to exec_free_args() before the exit1() call in do_execve(). Since all the users of kern_execve() in the tree use exec_free_args() to free the command-line arguments after kern_execve() returns, this should be safe, and it fixes the memory leak which can otherwise occur.
| Submitted by: Peter Holm
| MFC after: 3 days
| Security: Local denial of service
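
The leak described in this commit can be sketched in userland C. This is only a model under stated assumptions: every `model_*` name is hypothetical, `longjmp()` stands in for a call that never returns to its caller (as exit1() does), and a counter stands in for the kernel's argument buffers. It is not the kernel code.

```c
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical userland model; none of these names exist in the kernel. */
static jmp_buf exit_point;   /* where control lands when the "process" exits */
static int live_allocs;      /* argument buffers allocated but not yet freed */

struct model_args { char *buf; };

static int model_alloc_args(struct model_args *a)
{
    a->buf = malloc(64);
    if (a->buf == NULL)
        return -1;
    live_allocs++;
    return 0;
}

static void model_free_args(struct model_args *a)
{
    if (a->buf != NULL) {
        free(a->buf);
        a->buf = NULL;
        live_allocs--;
    }
}

/* Stand-in for exit1(): control never returns to the caller. */
static void model_exit1(void)
{
    longjmp(exit_point, 1);
}

/* Stand-in for do_execve(): on the fatal path it exits instead of returning. */
static int model_do_execve(struct model_args *a, int fatal, int free_before_exit)
{
    if (fatal) {
        if (free_before_exit)
            model_free_args(a);  /* the fix: release the arguments first */
        model_exit1();
        return -1;               /* not reached */
    }
    return 0;
}

/* "allocate memory; call exec; free memory". Returns buffers still live. */
int model_caller(int fatal, int free_before_exit)
{
    struct model_args a;

    live_allocs = 0;
    if (model_alloc_args(&a) != 0)
        return -1;
    if (setjmp(exit_point) == 0) {
        model_do_execve(&a, fatal, free_before_exit);
        model_free_args(&a);     /* only reached when the call returns */
    }
    return live_allocs;
}
```

In this model, `model_caller(1, 0)` leaves one buffer live because the caller's free is skipped, while `model_caller(1, 1)` leaves none because the callee frees before exiting.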
* Copy new process argument list in do_execve() before grabbing PROC_LOCKtruckman2005-10-011-10/+10
| | | | | | | | | | to avoid touching pageable memory while holding a mutex. Simplify argument list replacement logic. PR: kern/84935 Submitted by: "Antoine Pelisse" apelisse AT gmail.com (in a different form) MFC after: 3 days
* MFP4:jkoshy2005-06-301-2/+7
| - pmcstat(8) gprof output mode fixes:
|   lib/libpmc/pmclog.{c,h}, sys/sys/pmclog.h:
|   + Add a 'is_usermode' field to the PMCLOG_PCSAMPLE event
|   + Add an 'entryaddr' field to the PMCLOG_PROCEXEC event, so that pmcstat(8) can determine where the runtime loader /libexec/ld-elf.so.1 is getting loaded.
|   sys/kern/kern_exec.c:
|   + Use a local struct to group the entry address of the image being exec()'ed and the process credential changed flag to the exec handling hook inside hwpmc(4).
|   usr.sbin/pmcstat/*:
|   + Support "-k kernelpath", "-D sampledir".
|   + Implement the ELF bits of 'gmon.out' profile generation in a new file "pmcstat_log.c". Move all log related functions to this file.
|   + Move local definitions and prototypes to "pmcstat.h"
| - Other bug fixes:
|   + lib/libpmc/pmclog.c: correctly handle EOF in pmclog_read().
|   + sys/dev/hwpmc_mod.c: unconditionally log a PROCEXIT event to all attached PMCs when a process exits.
|   + sys/sys/pmc.h: correct a function prototype.
|   + Improve usage checks in pmcstat(8).
| Approved by: re (blanket hwpmc)
* Move HWPMC_HOOKS into its own opt_hwpmc_hooks.h file. It doesn't meritpeter2005-06-241-0/+1
| | | | | | | being in opt_global.h and forcing a global recompile when only a few files reference it. Approved by: re
* MFP4:jkoshy2005-06-091-2/+3
| - Implement sampling modes and logging support in hwpmc(4).
| - Separate MI and MD parts of hwpmc(4) and allow sharing of PMC implementations across different architectures. Add support for P4 (EMT64) style PMCs to the amd64 code.
| - New pmcstat(8) options: -E (exit time counts), -W (counts every context switch), -R (print log file).
| - pmc(3) API changes, improve our ability to keep ABI compatibility in the future. Add more 'alias' names for commonly used events.
| - bug fixes & documentation.
* This patch addresses a standards violation issue. The standards say akensmith2005-05-311-1/+13
| file's access time should be updated when it gets executed. A while ago the mechanism used to exec was changed to use a more mmap based mechanism and this behavior was broken as a side-effect of that.
| A new vnode flag is added that gets set when the file gets executed, and the VOP_SETATTR() vnode operation gets called. The underlying filesystem is expected to handle it based on its own semantics, some filesystems don't support access time at all. Those that do should handle it in a way that does not block, does not generate I/O if possible, etc. In particular vn_start_write() has not been called. The UFS code handles it the same way as it would normally handle the access time if a file was read - the IN_ACCESS flag gets set in the inode but no other action happens at this point. The actual time update will happen later during a sync (which handles all the necessary locking).
| Got me into this: cperciva
| Discussed with: a lot with bde, a little with kan
| Showed patches to: phk, jeffr, standards@, arch@
| Minor discussion on: arch@
* - Initialize vfslocked correctly early enough for MAC to compile.jeff2005-05-031-5/+4
| | | | | | | | - Fix one place where we explicitly drop Giant! Pointy hat to: me Submitted by: Max Laier Warned by: Tinderbox
* - Use namei to acquire Giant for VFS if it is necessary. Drop the explicitjeff2005-05-031-9/+7
| | | | | | Giant acquisition. - Remove GIANT_REQUIRED in the few remaining cases; the vm and vfs have both been locked.
* - Return EACCES if we're trying to exec on a vp with no object.jeff2005-05-011-0/+2
| | | | Errno supplied by: cperciva
* - Pass the ISOPEN flag to namei so filesystems will know we're about tojeff2005-04-271-1/+1
| | | | open them or otherwise access the data.
* Bring a working snapshot of hwpmc(4), its associated libraries, userlandjkoshy2005-04-191-0/+22
| | | | | | | | | | utilities and documentation into -CURRENT. Bump FreeBSD_version. Reviewed by: alc, jhb (kernel changes)
* Welcome to the 21st century: increase MAXSHELLCMDLEN from 128 bytes tosobomax2005-02-251-5/+9
| PAGE_SIZE. Contrary to what the originator of the PR suggests, retain the MAXSHELLCMDLEN definition (he proposed replacing it with PAGE_SIZE everywhere): this not only reduces the diff significantly, but also prevents code obfuscation and makes it easy to increase or decrease this parameter if needed.
| PR: kern/64196
| Submitted by: Magnus Bäckström <b@etek.chalmers.se>
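
To illustrate why the limit matters, here is a minimal userland sketch of a length-bounded "#!" line scan. The function name `model_shebang_len`, its parameters, and its return convention are assumptions made for illustration; the real image activator is structured differently.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: return the length of the interpreter line that
 * follows "#!" (newline excluded), or -1 if the header is not a script
 * or the line does not end within maxlen bytes.
 */
long model_shebang_len(const char *hdr, size_t hdrlen, size_t maxlen)
{
    const char *nl;
    size_t limit;

    if (hdrlen < 2 || hdr[0] != '#' || hdr[1] != '!')
        return -1;
    /* Never scan past the header or past the configured limit. */
    limit = hdrlen < maxlen ? hdrlen : maxlen;
    nl = memchr(hdr, '\n', limit);
    if (nl == NULL)
        return -1;
    return (long)(nl - hdr) - 2;
}
```

With a limit of 128 a 200-byte interpreter line is rejected in this model, while a page-sized limit accepts it; keeping the limit as a single named parameter is what lets it be raised or lowered in one place.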
* Grrr, this committer needs to have a sleep. Remove lines from the previoussobomax2005-01-291-3/+0
| | | | | | delta not intended for public consumption. MFC after: 2 weeks
* Fix small non-conformance introduced in the previous commit: execve() issobomax2005-01-291-4/+4
| | | | | | | expected to return ENAMETOOLONG, not E2BIG if first argument doesn't fit into {PATH_MAX} bytes. MFC after: 2 weeks
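
The distinction between the two errors can be sketched as follows. This is a simplified model, assuming hypothetical limits `MODEL_PATH_MAX` and `MODEL_ARG_MAX` and a hypothetical function `model_copy_exec_strings`; it only shows which error each kind of overflow maps to, not how the kernel copies strings.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, deliberately tiny limits so the example is easy to follow. */
#define MODEL_PATH_MAX 16
#define MODEL_ARG_MAX  32

/*
 * Return 0 if everything fits, ENAMETOOLONG if the path (the first
 * argument to execve()) does not fit, or E2BIG if the argument strings
 * together exceed the argument space.
 */
int model_copy_exec_strings(const char *path, const char *const argv[])
{
    size_t used = 0;
    size_t i;

    /* The path is checked on its own, against its own limit. */
    if (strlen(path) + 1 > MODEL_PATH_MAX)
        return ENAMETOOLONG;

    /* Argument strings share one budget, terminators included. */
    for (i = 0; argv[i] != NULL; i++) {
        used += strlen(argv[i]) + 1;
        if (used > MODEL_ARG_MAX)
            return E2BIG;
    }
    return 0;
}
```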
* o Split out kernel part of execve(2) syscall into two parts: one thatsobomax2005-01-291-96/+112
| | | | | | | | | | | copies arguments into the kernel space and one that operates completely in the kernel space; o use kernel-only version of execve(2) to kill another stackgap in linuxlator/i386. Obtained from: DragonFlyBSD (partially) MFC after: 2 weeks
* Don't use VOP_GETVOBJECT, use vp->v_object directly.phk2005-01-251-2/+3
|
* /* -> /*- for copyright notices, minor format tweaks as necessaryimp2005-01-061-1/+1
|
* Add new function fdunshare() which encapsulates the necessary light magicphk2004-12-141-10/+1
| for ensuring that a process' filedesc is not shared with anybody. Use it in the two places which previously had private implementations. This collects all fd_refcnt handling in kern_descrip.c
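
The idea of "unsharing" a reference-counted descriptor table can be sketched in userland C. The structures and the function `model_fdunshare` below are hypothetical stand-ins, with no locking, and are not the kernel's fdunshare(): if the table is referenced by more than one process, the caller gets a private copy and drops its reference to the shared one.

```c
#include <stdlib.h>
#include <string.h>

#define MODEL_NFILES 4

/* Hypothetical model of a reference-counted descriptor table. */
struct model_filedesc {
    int refcnt;
    int files[MODEL_NFILES];
};

struct model_proc {
    struct model_filedesc *fd;
};

/* Give p a private table. Returns 0 on success, -1 if allocation fails. */
int model_fdunshare(struct model_proc *p)
{
    struct model_filedesc *old = p->fd;
    struct model_filedesc *copy;

    if (old->refcnt == 1)
        return 0;                /* already private, nothing to do */

    copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return -1;
    memcpy(copy->files, old->files, sizeof(copy->files));
    copy->refcnt = 1;

    old->refcnt--;               /* drop our reference to the shared table */
    p->fd = copy;
    return 0;
}
```

Keeping the reference-count arithmetic inside one function mirrors the stated goal of collecting that handling in a single file.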
* Don't include sys/user.h merely for its side-effect of recursivelydas2004-11-271-1/+1
| | | | including other headers.
* Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST.phk2004-11-131-3/+3
| | | | | | | | Use this in all the places where sleeping with the lock held is not an issue. The distinction will become significant once we finalize the exact lock-type to use for this kind of case.
* Use more intuitive pointer for fdinit() and fdcopy().phk2004-11-081-1/+1
| | | | Change fdcopy() to take unlocked filedesc.
* Put on my peril sensitive sunglasses and add a flags field to the internalpeter2004-10-111-4/+4
| | | | | | | | | | | | | | | | sysctl routines and state. Add some code to use it for signalling the need to downconvert a data structure to 32 bits on a 64 bit OS when requested by a 32 bit app. I tried to do this in a generic abi wrapper that intercepted the sysctl oid's, or looked up the format string etc, but it was a real can of worms that turned into a fragile mess before I even got it partially working. With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have it not abort. Things like netstat, ps, etc have a long way to go. This also fixes a bug in the kern.ps_strings and kern.usrstack hacks. These do matter very much because they are used by libc_r and other things.
* Add an execve command for kse_thr_interrupt to allow libpthread todavidxu2004-10-071-3/+2
| | | | | | restore signal mask correctly, this is required by POSIX. Reviewed by: deischen
* In original kern_execve() code, at the start of the function, it forcesdavidxu2004-10-061-12/+46
| all other threads to suicide. The problem is that execve() can fail, and a failed execve() would change a threaded process to unthreaded; this side effect is unexpected. The new code introduces a new single threading mode, SINGLE_BOUNDARY; in this mode, all threads except the singler should suspend themselves at the user boundary. We cannot use SINGLE_NO_EXIT because we want to start from a clean state if execve() is successful: suspending other threads at an unknown point, later resuming them from there, and forcing them to exit at the user boundary may cause the process to start from a dirty state. If execve() is successful, the current thread upgrades to SINGLE_EXIT mode and forces the other threads to suicide at the user boundary; otherwise, the other threads are resumed and their interrupted syscalls are restarted.
| Reviewed by: julian
* - Don't try to unlock Giant if single threading fails since we don't havejhb2004-09-231-1/+1
| | | | | it locked. - Unlock Giant before calling exit1() since exit1() does not require Giant.
* Revert the last change..julian2004-09-221-17/+11
| | | | | | | Better to kill all other threads than to panic the system if 2 threads call execve() at the same time. A better fix will be committed later. Note that this only affects the case where the execve fails.
* In a threaded process, don't kill off all the other threads until we have ajulian2004-09-211-11/+17
| reasonable chance that the execve() is going to succeed, i.e. wait until we've done the permission checks etc.
| MFC after: 1 week
* Refactor a bunch of scheduler code to give basically the same behaviourjulian2004-09-051-6/+2
| but with slightly cleaned up interfaces.
| The KSE structure has become the same as the "per thread scheduler private data" structure. In order to not make the diffs too great one is #defined as the other at this time.
| The KSE (or td_sched) structure is now allocated per thread and has no allocation code of its own.
| Concurrency for a KSEGRP is now kept track of via a simple pair of counters rather than using KSE structures as tokens.
| Since the KSE structure is different in each scheduler, kern_switch.c is now included at the end of each scheduler. Nothing outside the scheduler knows the contents of the KSE (aka td_sched) structure.
| The fields in the ksegrp structure that are to do with the scheduler's queueing mechanisms are now moved to the kg_sched structure (per ksegrp scheduler private data structure). In other words how the scheduler queues and keeps track of threads is no-one's business except the scheduler's. This should allow people to write experimental schedulers with completely different internal structuring.
| A scheduler call sched_set_concurrency(kg, N) has been added that notifies the scheduler that no more than N threads from that ksegrp should be allowed to be concurrently scheduled. This is also used to enforce 'fairness' at this time so that a ksegrp with 10000 threads can not swamp the run queue and force out a process with 1 thread, since the current code will not set the concurrency above NCPU, and both schedulers will not allow more than that many onto the system run queue at a time. Each scheduler should eventually develop their own methods to do this now that they are effectively separated.
| Rejig libthr's kernel interface to follow the same code paths as libkse for scope system threads. This has slightly hurt libthr's performance but I will work to recover as much of it as I can.
| Thread exit code has been cleaned up greatly. Exit and exec code now transitions a process back to 'standard non-threaded mode' before taking the next step.
| Reviewed by: scottl, peter
| MFC after: 1 week
* Add locking to the kqueue subsystem. This also makes the kqueue subsystemjmg2004-08-151-1/+1
| a more complete subsystem, and removes the knowledge of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using its filter ops.
| Currently, it uses the MTX_DUPOK even though it is not always safe to acquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases).
| Reviewed by: green, rwatson (both earlier versions)
* Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This iscperciva2004-07-261-1/+1
| | | | | | | | | | | somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb
* White space fix..julian2004-07-241-3/+3
| | | | diff reduction for upcoming commit.
* Push down the acquisition and release of the page queues lock intoalc2004-07-131-2/+0
| | | | | | | | pmap_remove_pages(). (The implementation of pmap_remove_pages() is optional. If pmap_remove_pages() is unimplemented, the acquisition and release of the page queues lock is unnecessary.) Remove spl calls from the alpha, arm, and ia64 pmap_remove_pages().
* Move TDF_SA from td_flags to td_pflags (and rename it accordingly)tjr2004-06-021-3/+1
| | | | | | | so that it is no longer necessary to hold sched_lock while manipulating it. Reviewed by: davidxu
* Clear KSE thread flags after KSE thread mode is ended. The side effectdavidxu2004-05-211-0/+3
| of not clearing the flags for the execv() syscall is that a new program runs in KSE thread mode without having enabled it.
| Submitted by: tjr
| Modified by: davidxu
* Utilize sf_buf_alloc() rather than pmap_qenter() (and sometimesalc2004-04-231-10/+12
| | | | | kmem_alloc_wait()) for mapping the image header. On all machines with a direct virtual-to-physical mapping and SMP/HTT i386s, this is a clear win.
* Use vm_page_hold() rather than vm_page_wire() for short-duration pagealc2004-04-111-2/+2
| | | | wiring. The reason being that vm_page_hold() is cheaper.
* Remove sysctl kern.ps_argsopen, it is not very useful, one should usepjd2004-04-011-3/+0
| | | | | | security.bsd.see_other_uids instead. Discussed with: phk, rwatson
* Make the process_exit eventhandler run without Giant. Add Giant hookspeter2004-03-141-0/+1
| | | | | | in the two consumers that need it.. processes using AIO and netncp. Update docs. Say that process_exec is called with Giant, but not to depend on it. All our consumers can handle it without Giant.
* Push Giant down a little further:peter2004-03-131-2/+0
| - no longer serialize on Giant for thread_single*() and family in fork, exit and exec
| - thread_wait() is mpsafe, assert no Giant
| - reduce scope of Giant in exit to not cover thread_wait and just do vm_waitproc().
| - assert that thread_single() family are not called with Giant
| - remove the DROP/PICKUP_GIANT macros from thread_single() family
| - assert that thread_suspend_check() is not called with Giant
| - remove manual drop_giant hack in thread_suspend_check since we know it isn't held.
| - remove the DROP/PICKUP_GIANT macros from thread_suspend_check() family
| - mark kse_create() mpsafe
* Do what the execve(2) manpage says and enforce what a Strictlyru2004-03-121-1/+2
| | | | | | | | | Conforming POSIX application should do by disallowing the argv argument to be NULL. PR: kern/33738 Submitted by: Marc Olzheim, Serge van den Boom OK'ed by: nectar
* Lock Giant around the single threading code in exec() to satisfy anjhb2004-03-051-0/+3
| | | | assertion in the single threading code.
* Checkpoint a hack to enable running i386 libc_r binaries on a 64 bitpeter2004-02-181-4/+22
| | | | | | kernel. I'm not happy with it yet - refinements are to come. This hack allows the kern.ps_strings and kern.usrstack sysctls to respond to a 32 bit request, such as those coming from emulated i386 binaries.
* Fixed some style bugs (mainly, try to always use explicit comparisons withbde2003-12-281-8/+7
| | | | NULL when checking for null pointers).
* Fixed some disordering in revs.1.194 and 1.196. Moved the execve() syscallbde2003-12-281-55/+55
| function back to near the beginning of the file. Rev.1.194 moved it into the middle of auxiliary functions following kern_execve(). Moved the __mac_execve() syscall function up together with execve(). It was new in rev.1.196 and perfectly misplaced after execve().
* Remove GIANT_REQUIRED from exec_unmap_first_page().alc2003-12-271-1/+0
|
* Modify the MAC Framework so that instead of embedding a (struct label)rwatson2003-11-121-13/+11
| in various kernel objects to represent security data, we embed a (struct label *) pointer, which now references labels allocated using a UMA zone (mac_label.c). This allows the size and shape of struct label to be varied without changing the size and shape of these kernel objects, which become part of the frozen ABI with 5-STABLE. This opens the door for boot-time selection of the number of label slots, and hence changes to the bound on the number of simultaneous labeled policies at boot-time instead of compile-time. This also makes it easier to embed label references in new objects as required for locking/caching with fine-grained network stack locking, such as inpcb structures.
| This change also moves us further in the direction of hiding the structure of kernel objects from MAC policy modules, not to mention dramatically reducing the number of '&' symbols appearing in both the MAC Framework and MAC policy modules, and improving readability.
| While this results in minimal performance change with MAC enabled, it will observably shrink the size of a number of critical kernel data structures for the !MAC case, and should have a small (but measurable) performance benefit (i.e., struct vnode, struct socket) due to memory conservation and reduced cost of zeroing memory.
| NOTE: Users of MAC must recompile their kernel and all MAC modules as a result of this change. Because this is an API change, third party MAC modules will also need to be updated to make less use of the '&' symbol.
| Suggestions from: bmilekic
| Obtained from: TrustedBSD Project
| Sponsored by: DARPA, Network Associates Laboratories
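
The ABI argument in this commit can be illustrated with a small sketch. All structure names and slot counts below are hypothetical, not the real MAC Framework types: when a label is embedded, changing the number of slots changes the size of the enclosing object; when only a pointer is embedded, the enclosing object keeps its size.

```c
#include <stddef.h>

/* Hypothetical label layouts with different numbers of policy slots. */
#define MODEL_SLOTS_SMALL 4
#define MODEL_SLOTS_LARGE 16

struct model_label_small { void *slots[MODEL_SLOTS_SMALL]; };
struct model_label_large { void *slots[MODEL_SLOTS_LARGE]; };

/* Old style: the label lives inside the kernel object. */
struct model_obj_embed_small { int type; struct model_label_small label; };
struct model_obj_embed_large { int type; struct model_label_large label; };

/* New style: the object only holds a pointer to a separately allocated label. */
struct model_obj_ptr_small { int type; struct model_label_small *label; };
struct model_obj_ptr_large { int type; struct model_label_large *label; };

/* 1 if growing the label changes the size of an object that embeds it. */
int model_embedded_size_changes(void)
{
    return sizeof(struct model_obj_embed_small) !=
        sizeof(struct model_obj_embed_large);
}

/* 1 if growing the label changes the size of an object that points to it. */
int model_pointer_size_changes(void)
{
    return sizeof(struct model_obj_ptr_small) !=
        sizeof(struct model_obj_ptr_large);
}
```

The pointer variant also explains the remark about '&' symbols: code that previously passed `&obj.label` can pass `obj.label` directly.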
* Remove md_bspstore from the MD fields of struct thread. Now thatmarcel2003-10-211-1/+0
| | | | | the backing store is at a fixed address, there's no need for a per-thread variable.
* Put the RSE backing store at a fixed address. This change is triggeredmarcel2003-10-201-1/+1
| by libguile that needs to know the base of the RSE backing store. We currently do not export the fixed address to userland by means of a sysctl so user code needs to hardcode it for now. This will be revisited later.
| The RSE backing store is now at the bottom of region 4. The memory stack is at the top of region 4. This means that the whole region is usable for the stacks, giving a 61-bit stack space.
| Port: lang/guile (dependency of x11/gnome2)