summaryrefslogtreecommitdiffstats
path: root/sys/kern/imgact_aout.c
Commit message (Collapse)AuthorAgeFilesLines
* Reorganize syscall entry and leave handling.kib2010-05-231-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend struct sysvec with three new elements: sv_fetch_syscall_args - the method to fetch syscall arguments from usermode into struct syscall_args. The structure is machine-depended (this might be reconsidered after all architectures are converted). sv_set_syscall_retval - the method to set a return value for usermode from the syscall. It is a generalization of cpu_set_syscall_retval(9) to allow ABIs to override the way to set a return value. sv_syscallnames - the table of syscall names. Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding the call to cpu_set_syscall_retval(). The new functions syscallenter(9) and syscallret(9) are provided that use sv_*syscall* pointers and contain the common repeated code from the syscall() implementations for the architecture-specific syscall trap handlers. Syscallenter() fetches arguments, calls syscall implementation from ABI sysent table, and set up return frame. The end of syscall bookkeeping is done by syscallret(). Take advantage of single place for MI syscall handling code and implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the thread is stopped at syscall entry or return point respectively. The EXEC flag augments SCX and notifies debugger that the process address space was changed by one of exec(2)-family syscalls. The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are changed to use syscallenter()/syscallret(). MIPS and arm are not converted and use the mostly unchanged syscall() implementation. Reviewed by: jhb, marcel, marius, nwhitehorn, stas Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc), stas (mips) MFC after: 1 month
* Add sv_flags field to struct sysentvec with intention to provide descriptionkib2008-11-221-1/+7
| | | | | | | | of the ABI of the currently executing image. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures to determine ABI features. Discussed with: dchagin, imp, jhb, peter
* Change the static struct sysentvec and struct Elf_Brandinfo initializerskib2008-09-241-26/+27
| | | | | | | | | | | to the C99 style. At least, it is easier to read sysent definitions that way, and search for the actual instances of sigcode etc. Explicitely initialize sysentvec.sv_maxssiz that was missed in most sysvecs. No objection from: jhb MFC after: 1 month
* VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used inattilio2008-01-131-1/+1
| | | | | | | | | | | conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
* vn_lock() is currently only used with the 'curthread' passed as argument.attilio2008-01-101-3/+2
| | | | | | | | | | | | | | | | Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
* Fix for the panic("vm_thread_new: kstack allocation failed") andkib2007-11-051-1/+3
| | | | | | | | | | | | | | | | | | | | silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when the kmem_alloc_nofault() failed to allocate address space. Both functions now return error instead of panicing or dereferencing NULL. As consequence, vmspace_exec() and vmspace_unshare() returns the errno int. struct vmspace arg was added to vm_forkproc() to avoid dealing with failed allocation when most of the fork1() job is already done. The kernel stack for the thread is now set up in the thread_alloc(), that itself may return NULL. Also, allocation of the first process thread is performed in the fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup() called from fork1(), and proc_linkup0(), that is used to set up the kernel process (was known as swapper). In collaboration with: Peter Holm Reviewed by: jhb
* Correct two vm object reference leaks in error cases.alc2006-03-161-0/+2
| | | | Submitted by: davidxu
* Maintain the lock on the vnode for most of exec_elfN_imgact().alc2005-12-241-2/+13
| | | | | | | | | | | | | Specifically, it is required for the I/O that may be performed by elfN_load_section(). Avoid an obscure deadlock in the a.out, elf, and gzip image activators. Add a comment describing why the deadlock does not occur in the common case and how it might occur in less usual circumstances. Eliminate an unused variable from exec_aout_imgact(). In collaboration with: tegge
* - Neither of our image formats require Giant now that the vm and vfs havejeff2005-05-031-2/+0
| | | | been locked.
* o Split out kernel part of execve(2) syscall into two parts: one thatsobomax2005-01-291-6/+1
| | | | | | | | | | | copies arguments into the kernel space and one that operates completely in the kernel space; o use kernel-only version of execve(2) to kill another stackgap in linuxlator/i386. Obtained from: DragonFlyBSD (partially) MFC after: 2 weeks
* /* -> /*- for copyright notices, minor format tweaks as necessaryimp2005-01-061-1/+1
|
* Axe a.out core dump support. Neither older gdb binaries nor currentdas2004-11-271-54/+1
| | | | bfd sources understand the present format.
* Maintain the broken state of backwards compatibilty for a.out (anddas2004-11-201-2/+3
| | | | | | | PECOFF!) core dumps. None of the old versions of gdb I tried were able to read a.out core dumps before or after this change. Reviewed by: arch@
* Change the types of vn_rdwr_inchunks()'s len and aresid arguments totjr2004-06-051-1/+1
| | | | | | size_t and size_t *, respectively. Update callers for the new interface. This is a better fix for overflows that occurred when dumping segments larger than 2GB to core files.
* Locking for the per-process resource limits structure.jhb2004-02-041-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that is is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't used the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
* Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bitpeter2003-09-251-1/+2
| | | | | | | | | | | | | | | | | | | | | systems where the data/stack/etc limits are too big for a 32 bit process. Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c. Supply an ia32_fixlimits function. Export the clip/default values to sysctl under the compat.ia32 heirarchy. Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max value rather than the sysctl tweakable variable. This allows mmap to place mappings at sensible locations when limits have been reduced. Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same method as mmap(0, ...) now does. Note that we cannot remove all references to the sysctl tweakable maxdsiz etc variables because /etc/login.conf specifies a datasize of 'unlimited'. And that causes exec etc to fail since it can no longer find space to mmap things.
* Use __FBSDID().obrien2003-06-111-2/+3
|
* Back out M_* changes, per decision of the TRB.imp2003-02-191-1/+1
| | | | Approved by: trb
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.alfred2003-01-211-1/+1
| | | | Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
* Use the fields in the sysentvec and in the vm map header in place of thejake2002-09-211-4/+4
| | | | | | | | constants VM_MIN_ADDRESS, VM_MAXUSER_ADDRESS, USRSTACK and PS_STRINGS. This is mainly so that they can be variable even for the native abi, based on different machine types. Get stack protections from the sysentvec too. This makes it trivial to map the stack non-executable for certain abis, on machines that support it.
* Include <sys/malloc.h> instead of depending on namespace pollution 2bde2002-09-101-8/+5
| | | | | | layers deep in <sys/proc.h> or <sys/vnode.h>. Removed unused includes. Sorted includes.
* Tidy up some loose ends that bde pointed out. caddr_t bad, ok?peter2002-09-071-7/+7
| | | | | | | Move fill_kinfo_proc to before we copy the results instead of after the copy and too late. There is still more to do here.
* The true value of how the kernel was configured for KSTACK_PAGES was notpeter2002-09-071-8/+6
| | | | | | available at module compile time. Do not #include the bogus opt_kstack_pages.h at this point and instead refer to the variables that are also exported via sysctl.
* Collect the a.out coredump code into the calling functions.peter2002-09-071-1/+16
| | | | XXX why does pecoff dump in a.out format?
* Added fields for VM_MIN_ADDRESS, PS_STRINGS and stack protections tojake2002-09-011-6/+24
| | | | | | sysentvec. Initialized all fields of all sysentvecs, which will allow them to be used instead of constants in more places. Provided stack fixup routines for emulations that previously used the default.
* In order to better support flexible and extensible access control,rwatson2002-08-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | make a series of modifications to the credential arguments relating to file read and write operations to cliarfy which credential is used for what: - Change fo_read() and fo_write() to accept "active_cred" instead of "cred", and change the semantics of consumers of fo_read() and fo_write() to pass the active credential of the thread requesting an operation rather than the cached file cred. The cached file cred is still available in fo_read() and fo_write() consumers via fp->f_cred. These changes largely in sys_generic.c. For each implementation of fo_read() and fo_write(), update cred usage to reflect this change and maintain current semantics: - badfo_readwrite() unchanged - kqueue_read/write() unchanged pipe_read/write() now authorize MAC using active_cred rather than td->td_ucred - soo_read/write() unchanged - vn_read/write() now authorize MAC using active_cred but VOP_READ/WRITE() with fp->f_cred Modify vn_rdwr() to accept two credential arguments instead of a single credential: active_cred and file_cred. Use active_cred for MAC authorization, and select a credential for use in VOP_READ/WRITE() based on whether file_cred is NULL or not. If file_cred is provided, authorize the VOP using that cred, otherwise the active credential, matching current semantics. Modify current vn_rdwr() consumers to pass a file_cred if used in the context of a struct file, and to always pass active_cred. When vn_rdwr() is used without a file_cred, pass NOCRED. These changes should maintain current semantics for read/write, but avoid a redundant passing of fp->f_cred, as well as making it more clear what the origin of each credential is in file descriptor read/write operations. Follow-up commits will make similar changes to other file descriptor operations, and modify the MAC framework to pass both credentials to MAC policy modules so they can implement either semantic for revocation. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
* - Hold the vnode lock throughout execve.jeff2002-08-131-4/+0
| | | | | - Set VV_TEXT in the top level execve code. - Fixup the image activators to deal with the newly locked vnode.
* - Replace v_flag with v_iflag and v_vflagjeff2002-08-041-1/+2
| | | | | | | | | | | | | | | - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS
* Infrastructure tweaks to allow having both an Elf32 and an Elf64 executablepeter2002-07-201-1/+1
| | | | | | | | | | | | | | | handler in the kernel at the same time. Also, allow for the exec_new_vmspace() code to build a different sized vmspace depending on the executable environment. This is a big help for execing i386 binaries on ia64. The ELF exec code grows the ability to map partial pages when there is a page size difference, eg: emulating 4K pages on 8K or 16K hardware pages. Flesh out the i386 emulation support for ia64. At this point, the only binary that I know of that fails is cvsup, because the cvsup runtime tries to execute code in pages not marked executable. Obtained from: dfr (mostly, many tweaks from me).
* Clean up execve locking:jeff2002-07-061-1/+1
| | | | | | - Grab the vnode object early in exec when we still have the vnode lock. - Cache the object in the image_params. - Make use of the cached object in imgact_*.c
* - Change fill_kinfo_proc() to require that the process is locked when itjhb2002-04-091-0/+2
| | | | | | | | | | | | | | is called. - Change sysctl_out_proc() to require that the process is locked when it is called and to drop the lock before it returns. If this proves too complex we can change sysctl_out_proc() to simply acquire the lock at the very end and have the calling code drop the lock right after it returns. - Lock the process we are going to export before the p_cansee() in the loop in sysctl_kern_proc() and hold the lock until we call sysctl_out_proc(). - Don't call p_cansee() on the process about to be exported twice in the aforementioned loop.
* Remove __P.alfred2002-03-191-1/+1
|
* Simple p_ucred -> td_ucred changes to start using the per-thread ucredjhb2002-02-271-1/+1
| | | | reference.
* Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loaderps2001-10-101-1/+1
| | | | | | | tunable. Reviewed by: peter MFC after: 2 weeks
* Make uio_yield() a global. Call uio_yield() between chunksdillon2001-09-261-2/+2
| | | | | | | | | | | | | | in vn_rdwr_inchunks(), allowing other processes to gain an exclusive lock on the vnode. Specifically: directory scanning, to avoid a race to the root directory, and multiple child processes coring simultaniously so they can figure out that some other core'ing child has an exclusive adv lock and just exit instead. This completely fixes performance problems when large programs core. You can have hundreds of copies (forked children) of the same binary core all at once and not notice. MFC after: 3 days
* KSE Milestone 2julian2001-09-121-11/+15
| | | | | | | | | | | | | | Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
* This brings in a Yahoo coredump patch from Paul, with additional mods bydillon2001-09-081-4/+4
| | | | | | | | | | | | | | | | | | | me (addition of vn_rdwr_inchunks). The problem Yahoo is solving is that if you have large process images core dumping, or you have a large number of forked processes all core dumping at the same time, the original coredump code would leave the vnode locked throughout. This can cause the directory vnode to get locked up, which can cause the parent directory vnode to get locked up, and so on all the way to the root node, locking the entire machine up for extremely long periods of time. This patch solves the problem in two ways. First it uses an advisory non-blocking lock to abort multiple processes trying to core to the same file. Second (my contribution) it chunks up the writes and uses bwillwrite() to avoid holding the vnode locked while blocking in the buffer cache. Submitted by: ps Reviewed by: dillon MFC after: 2 weeks
* Optionize UPAGES for the i386. As part of this I split some of the lowpeter2001-08-251-0/+2
| | | | | | | | | | level implementation stuff out of machine/globaldata.h to avoid exposing UPAGES to lots more places. The end result is that we can double the kernel stack size with 'options UPAGES=4' etc. This is mainly being done for the benefit of a MFC to RELENG_4 at some point. -current doesn't really need this so much since each interrupt runs on its own kstack.
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approachdillon2001-07-041-8/+2
| | | | | | | | | (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
* Introduce a global lock for the vm subsystem (vm_mtx).alfred2001-05-191-0/+8
| | | | | | | | | | | | | | | | | | | vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
* Undo part of the tangle of having sys/lock.h and sys/mutex.h included inmarkm2001-05-011-3/+5
| | | | | | | | | | | other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
* Back out proc locking to protect p_ucred for obtaining additionaljhb2001-01-271-6/+1
| | | | references along with the actual obtaining of additional references.
* Proc locking.jhb2001-01-241-1/+7
|
* Change the proc information returned from the kernel so that itmckusick2000-12-121-2/+1
| | | | | | | | | | | | no longer contains kernel specific data structures, but rather only scalar values and structures that are already part of the kernel/user interface, specifically rusage and rtprio. It no longer contains proc, session, pcred, ucred, procsig, vmspace, pstats, mtx, sigiolst, klist, callout, pasleep, or mdproc. If any of these changed in size, ps, w, fstat, gcore, systat, and top would all stop working. The new structure has over 200 bytes of unassigned space for future values to be added, yet is nearly 100 bytes smaller per entry than the structure that it replaced.
* Make MINSIGSTKSZ machine dependent, and have the sigaltstackmarcel2000-11-091-1/+3
| | | | | | | | | | | | | | | | | | | | | | syscall compare against a variable sv_minsigstksz in struct sysentvec as to properly take the size of the machine- and ABI dependent struct sigframe into account. The SVR4 and iBCS2 modules continue to have a minsigstksz of 8192 to preserve behavior. The real values (if different) are not known at this time. Other ABI modules use the real values. The native MINSIGSTKSZ is now defined as follows: Arch MINSIGSTKSZ ---- ----------- alpha 4096 i386 2048 ia64 12288 Reviewed by: mjacob Suggested by: bde
* Add three new VOPs: VOP_CREATEVOBJECT, VOP_DESTROYVOBJECT and VOP_GETVOBJECT.bp2000-09-121-1/+1
| | | | | | | They will be used by nullfs and other stacked filesystems to support full cache coherency. Reviewed in general by: mckusick, dillon
* Move the include of <sys/systm.h> so that KTR gets a declaration fordfr2000-09-101-1/+1
| | | | snprintf().
* Remove ~25 unneeded #include <sys/conf.h>phk2000-04-191-1/+0
| | | | Remove ~60 unneeded #include <sys/malloc.h>
* s/p_cred->pc_ucred/p_ucred/gphk1999-11-211-1/+1
|
* useracc() the prequel:phk1999-10-291-1/+0
| | | | | | | | | | | Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
OpenPOWER on IntegriCloud