path: root/sys/kern/imgact_elf.c
Commit message | Author | Age | Files | Lines
* Locking for the per-process resource limits structure. | jhb | 2004-02-04 | 1 | -5/+6
  - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock.
  - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it.
  - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything.
  - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process.
  - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions.
  - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead.
  - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead.
  - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant.
  - The p_rlimit macro no longer exists.

  Submitted by: mtm (mostly, I only did a few cleanups and catchups)
  Tested on: i386
  Compiled on: alpha, amd64
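The copy-on-write scheme described in the entry above can be sketched in user space. This is only an illustration, not the kernel code: every `demo_` name is hypothetical, the reference count is a plain integer where the real structure guards it with a mutex, and only the idea of accessor functions in the spirit of lim_cur() and lim_max() is taken from the commit message.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical resource indices; the kernel uses the RLIMIT_* constants. */
enum { DEMO_RLIMIT_DATA, DEMO_RLIMIT_STACK, DEMO_RLIM_NLIMITS };

struct demo_rlimit {
	uint64_t cur;	/* soft limit */
	uint64_t max;	/* hard limit */
};

/* Stand-in for struct plimit: never modified while it is shared. */
struct demo_plimit {
	unsigned refcnt;	/* assumption: unlocked here, mutex-protected in the kernel */
	struct demo_rlimit lim[DEMO_RLIM_NLIMITS];
};

/* Allocate a zeroed limit set holding one reference. */
struct demo_plimit *demo_lim_alloc(void)
{
	struct demo_plimit *l = calloc(1, sizeof(*l));

	if (l != NULL)
		l->refcnt = 1;
	return (l);
}

/* Take another reference; holding one is enough to read safely. */
struct demo_plimit *demo_lim_hold(struct demo_plimit *l)
{
	l->refcnt++;
	return (l);
}

/* Drop a reference and free the structure when the last one goes away. */
void demo_lim_free(struct demo_plimit *l)
{
	if (--l->refcnt == 0)
		free(l);
}

/* Copy on write: return a private copy if anyone else holds a reference. */
struct demo_plimit *demo_lim_unshare(struct demo_plimit *l)
{
	struct demo_plimit *n;

	if (l->refcnt == 1)
		return (l);
	n = malloc(sizeof(*n));
	if (n == NULL)
		return (NULL);
	memcpy(n, l, sizeof(*n));
	n->refcnt = 1;
	l->refcnt--;		/* the caller's reference moves to the copy */
	return (n);
}

/* Read-only accessors, so callers never touch the array directly. */
uint64_t demo_lim_cur(const struct demo_plimit *l, int which)
{
	return (l->lim[which].cur);
}

uint64_t demo_lim_max(const struct demo_plimit *l, int which)
{
	return (l->lim[which].max);
}
```

A writer calls demo_lim_unshare() before changing a limit, so readers that still hold the old structure keep seeing consistent values.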
* Add an additional field to the elf brandinfo structure to support | peter | 2003-12-23 | 1 | -11/+16
  quicker exec-time replacement of the elf interpreter on an emulation environment where an entire /compat/* tree isn't really warranted.
* Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit | peter | 2003-09-25 | 1 | -1/+8
  systems where the data/stack/etc limits are too big for a 32 bit process.

  Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.

  Supply an ia32_fixlimits function. Export the clip/default values to sysctl under the compat.ia32 hierarchy.

  Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max value rather than the sysctl tweakable variable. This allows mmap to place mappings at sensible locations when limits have been reduced.

  Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same method as mmap(0, ...) now does.

  Note that we cannot remove all references to the sysctl tweakable maxdsiz etc variables because /etc/login.conf specifies a datasize of 'unlimited'. And that causes exec etc to fail since it can no longer find space to mmap things.
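The idea of a per-ABI hook that clips oversized limits can be sketched as follows. This is a simplified illustration under stated assumptions: the structures, the `demo_` names, and the 512 MB clip value are all hypothetical and are not taken from the kernel sources; only the concept of an optional fixlimits hook in the ABI description comes from the entry above.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_rlimit {
	uint64_t cur;	/* soft limit */
	uint64_t max;	/* hard limit */
};

/* Hypothetical ABI description with an optional limit-fixing hook. */
struct demo_sysent {
	void (*fixlimits)(struct demo_rlimit *data);	/* NULL: nothing to fix */
};

/* Assumed clip value for a 32-bit data segment; illustrative only. */
#define DEMO_IA32_MAXDSIZ	(512ULL * 1024 * 1024)

/* Clip a data limit that is too large for a 32-bit process. */
static void demo_ia32_fixlimits(struct demo_rlimit *data)
{
	if (data->max > DEMO_IA32_MAXDSIZ)
		data->max = DEMO_IA32_MAXDSIZ;
	if (data->cur > data->max)
		data->cur = data->max;
}

const struct demo_sysent demo_ia32_sysent = { demo_ia32_fixlimits };
const struct demo_sysent demo_native_sysent = { NULL };

/* Exec-time step: run the hook only if the target ABI supplies one. */
void demo_exec_fixlimits(const struct demo_sysent *sv, struct demo_rlimit *data)
{
	if (sv->fixlimits != NULL)
		sv->fixlimits(data);
}
```

With this shape, a native 64-bit process keeps an 'unlimited' data size, while the same limits handed to the 32-bit ABI are clipped before the image is mapped.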
* Use __FBSDID(). | obrien | 2003-06-11 | 1 | -2/+3
* Fix ia32 compat on ia64. Recent ia64 MD changes caused the garbage on | marcel | 2003-05-31 | 1 | -5/+4
  the stack to be changed in a way incompatible with elf32_map_insert(), where we used data_buf without initializing it when the partial mapping results in a misaligned image (typical when the page size implied by the image is not the same as the page size in use by the kernel). Since data_buf is passed by reference to vm_map_find(), the compiler cannot warn about it.

  While here, move all local variables to the top of the function.
* Back out M_* changes, per decision of the TRB. | imp | 2003-02-19 | 1 | -5/+5
  Approved by: trb
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. | alfred | 2003-01-21 | 1 | -5/+5
  Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
* - Provide backwards compatibility for kern.fallback_elf_brand. | jake | 2003-01-05 | 1 | -8/+5
  - Use the generic elf type macros in imgact_elf.h instead of ifdefing the entire contents of the header.
* Improve the way that an elf image activator for an alternate word size is | jake | 2003-01-04 | 1 | -27/+23
  included in the kernel. Include imgact_elf.c in conf/files, instead of both imgact_elf32.c and imgact_elf64.c, which will use the default word size for an architecture as defined in machine/elf.h. Architectures that wish to build an additional image activator for an alternate word size can include either imgact_elf32.c or imgact_elf64.c in files.${ARCH}, which allows it to be dependent on MD options instead of solely on architecture.

  Glanced at by: peter
* Fix multiple registration of the elf_legacy_coredump sysctl variable. | marcel | 2002-12-21 | 1 | -3/+5
  The duplication is caused by the fact that imgact_elf.c is included by both imgact_elf32.c and imgact_elf64.c and both are compiled by default on ia64. Consequently, we have two separate copies of the elf_legacy_coredump variable due to them being declared static, and two entries for the same sysctl in the linker set, both referencing the unique copy of the elf_legacy_coredump variable. Since the second sysctl cannot be registered, one of the elf_legacy_coredump variables cannot be tuned (if ordering still holds, it's the ELF64 related one).

  The only solution is to create two different sysctl variables, just like the elf<32|64>_trace sysctl variables. This unfortunately is a (user) interface change, but unavoidable. Thus, on ELF32 platforms the sysctl variable is called elf32_legacy_coredump and on ELF64 platforms it is called elf64_legacy_coredump. Platforms that have both ELF formats have both sysctl variables.

  These variables should probably be retired sooner rather than later.
* Change the way ELF coredumps are handled. Instead of unconditionally | dillon | 2002-12-16 | 1 | -11/+31
  skipping read-only pages, which can result in valuable non-text-related data not getting dumped, the ELF loader and the dynamic loader now mark read-only text pages NOCORE and the coredump code only checks (primarily) for complete inaccessibility of the page or NOCORE being set.

  Certain applications which map large amounts of read-only data will produce much larger cores. A new sysctl has been added, debug.elf_legacy_coredump, which will revert to the old behavior.

  This commit represents collaborative work by all parties involved. The PR contains a program demonstrating the problem.

  PR: kern/45994
  Submitted by: "Peter Edwards" <pmedwards@eircom.net>, Archie Cobbs <archie@dellroad.org>
  Reviewed by: jdp, dillon
  MFC after: 7 days
* Assign value of NULL to imgp->execlabel when imgp is initialized | rwatson | 2002-11-08 | 1 | -0/+1
  in the ELF code. Missed in earlier merge from the MAC tree.

  Approved by: re
  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, Network Associates Laboratories
* Remove reference to struct execve_args from struct imgact, which | rwatson | 2002-11-05 | 1 | -1/+2
  describes an image activation instance. Instead, make use of the existing fname structure entry, and introduce two new entries, userspace_argv, and userspace_envv. With the addition of mac_execve(), this divorces the image structure from the specifics of the execve() system call, removes a redundant pointer, etc. No semantic change from current behavior, but it means that the structure doesn't depend on syscalls.master-generated includes.

  There seems to be some redundant initialization of imgact entries, which I have maintained, but which could probably use some cleaning up at some point.

  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, Network Associates Laboratories
* Handle binaries with an arbitrary number of PT_LOAD sections, not only | kan | 2002-10-23 | 1 | -14/+19
  ones with one text and one data section.

  The text and data rlimit checks still need to be fixed to properly account for additional sections.

  Reviewed by: peter (slightly different patch version)
* Use strlcpy() instead of strncpy() to copy NUL terminated strings | robert | 2002-10-17 | 1 | -2/+2
  for safety and consistency.
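The safety difference behind this change is that strncpy() leaves the destination without a terminating NUL when the source is at least as long as the buffer, while strlcpy() always terminates a non-empty buffer and returns the full source length so truncation can be detected. Because strlcpy() is not available in every C library, the sketch below uses a local reimplementation under the hypothetical name `demo_strlcpy`; it illustrates the semantics and is not the libc or kernel source.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, writing at most size bytes including the NUL.
 * Returns strlen(src); a return value >= size means the copy was truncated.
 */
size_t demo_strlcpy(char *dst, const char *src, size_t size)
{
	size_t srclen = strlen(src);

	if (size != 0) {
		size_t n = (srclen < size - 1) ? srclen : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';	/* always terminated, unlike strncpy() */
	}
	return (srclen);
}
```

A caller can compare the return value against the buffer size to find out whether the string fit.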
* Use the fields in the sysentvec and in the vm map header in place of the | jake | 2002-09-21 | 1 | -2/+1
  constants VM_MIN_ADDRESS, VM_MAXUSER_ADDRESS, USRSTACK and PS_STRINGS. This is mainly so that they can be variable even for the native abi, based on different machine types. Get stack protections from the sysentvec too. This makes it trivial to map the stack non-executable for certain abis, on machines that support it.
* Do not blow up when we walk off the end of the brands list. | peter | 2002-09-08 | 1 | -1/+3
  Found by: kris, jake
* Alright, fix the problems with the elf loader for the Alpha. It turns | dillon | 2002-09-04 | 1 | -8/+18
  out that there is no easy way to discern the difference between a text segment and a data segment through the read-only OR execute attribute in the elf segment header, so revert the algorithm to what it was before. Neither can we account for multiple data load segments in the vmspace structure (at least not without more work), due to assumptions obreak() makes in regards to the data start and data size fields. Retain RLIMIT_VMEM checking by using a local variable to track the total bytes of data being loaded.

  Reviewed by: peter
  X-MFC after: ASAP
* Make the text segment locating heuristics from rev 1.121 more reliable | peter | 2002-09-03 | 1 | -15/+10
  so that it works on the Alpha. This defines the segment that the entry point exists in as 'text' and any others (usually one) as data.

  Submitted by: tmm
  Tested on: i386, alpha
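The heuristic described in this entry, treating the segment that contains the entry point as text, can be sketched as below. This is a minimal illustration with a simplified, hypothetical program-header structure (`demo_phdr`), not the ELF types or the loader code from the kernel.

```c
#include <stdint.h>

/* Simplified stand-in for a loadable ELF program header. */
struct demo_phdr {
	uint64_t vaddr;	/* start address of the segment */
	uint64_t memsz;	/* size of the segment in memory */
};

/*
 * Return the index of the segment containing the entry point, which the
 * heuristic treats as text; every other segment counts as data.
 * Returns -1 if no segment contains the entry point.
 */
int demo_find_text(const struct demo_phdr *ph, int n, uint64_t entry)
{
	int i;

	for (i = 0; i < n; i++) {
		if (entry >= ph[i].vaddr && entry - ph[i].vaddr < ph[i].memsz)
			return (i);
	}
	return (-1);
}
```

Writing the range check as a subtraction avoids overflow when a segment sits near the top of the address space.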
* Grammar cleanup | dillon | 2002-09-02 | 1 | -2/+2
* Moved elf brand identification into a function. Fully identify the | jake | 2002-09-02 | 1 | -105/+75
  brand early in the process of loading an elf file, so that we can identify the sysentvec, and so that we do not continue if we do not have a brand (and thus a sysentvec). Use the values in the sysentvec for the page size and vm ranges unconditionally, since they are all filled in now.
* Fixed more indentation bugs. | jake | 2002-09-02 | 1 | -3/+3
* Implement data, text, and vmem limit checking in the elf loader and svr4 | dillon | 2002-08-30 | 1 | -10/+33
  compat code. Clean up accounting for multiple segments. Part 1/2.

  Submitted by: Andrey Alekseyev <uitm@zenon.net> (with some modifications)
  MFC after: 3 days
* Fixed most indentation bugs. | jake | 2002-08-25 | 1 | -46/+34
* Fixed placement of operators. Wrapped long lines. | jake | 2002-08-25 | 1 | -4/+8
* Fixed white space around operators, casts and reserved words. | jake | 2002-08-24 | 1 | -23/+22
  Reviewed by: md5
* return x; -> return (x); | jake | 2002-08-24 | 1 | -32/+32
  return(x); -> return (x);

  Reviewed by: md5
* In order to better support flexible and extensible access control, | rwatson | 2002-08-15 | 1 | -2/+4
  make a series of modifications to the credential arguments relating to file read and write operations to clarify which credential is used for what:

  - Change fo_read() and fo_write() to accept "active_cred" instead of "cred", and change the semantics of consumers of fo_read() and fo_write() to pass the active credential of the thread requesting an operation rather than the cached file cred. The cached file cred is still available in fo_read() and fo_write() consumers via fp->f_cred. These changes are largely in sys_generic.c.

  For each implementation of fo_read() and fo_write(), update cred usage to reflect this change and maintain current semantics:

  - badfo_readwrite() unchanged
  - kqueue_read/write() unchanged
  - pipe_read/write() now authorize MAC using active_cred rather than td->td_ucred
  - soo_read/write() unchanged
  - vn_read/write() now authorize MAC using active_cred but VOP_READ/WRITE() with fp->f_cred

  Modify vn_rdwr() to accept two credential arguments instead of a single credential: active_cred and file_cred. Use active_cred for MAC authorization, and select a credential for use in VOP_READ/WRITE() based on whether file_cred is NULL or not. If file_cred is provided, authorize the VOP using that cred, otherwise the active credential, matching current semantics.

  Modify current vn_rdwr() consumers to pass a file_cred if used in the context of a struct file, and to always pass active_cred. When vn_rdwr() is used without a file_cred, pass NOCRED.

  These changes should maintain current semantics for read/write, but avoid a redundant passing of fp->f_cred, as well as making it more clear what the origin of each credential is in file descriptor read/write operations.

  Follow-up commits will make similar changes to other file descriptor operations, and modify the MAC framework to pass both credentials to MAC policy modules so they can implement either semantic for revocation.

  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, NAI Labs
* - Hold the vnode lock throughout execve. | jeff | 2002-08-13 | 1 | -10/+3
  - Set VV_TEXT in the top level execve code.
  - Fixup the image activators to deal with the newly locked vnode.
* - Replace v_flag with v_iflag and v_vflag | jeff | 2002-08-04 | 1 | -5/+7
  - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed.
  - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc.
  - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's.
  - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear.
  - Many functions in vfs_subr.c were restructured to provide for stronger locking.

  Idea stolen from: BSD/OS
* Infrastructure tweaks to allow having both an Elf32 and an Elf64 executable | peter | 2002-07-20 | 1 | -142/+313
  handler in the kernel at the same time. Also, allow for the exec_new_vmspace() code to build a different sized vmspace depending on the executable environment. This is a big help for execing i386 binaries on ia64.

  The ELF exec code grows the ability to map partial pages when there is a page size difference, eg: emulating 4K pages on 8K or 16K hardware pages.

  Flesh out the i386 emulation support for ia64. At this point, the only binary that I know of that fails is cvsup, because the cvsup runtime tries to execute code in pages not marked executable.

  Obtained from: dfr (mostly, many tweaks from me).
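To illustrate why a page size difference forces partial-page handling, the sketch below splits an address range laid out for a smaller page size into the part that covers whole kernel pages and the partial head and tail around it. It is an assumption-laden illustration, not the loader's actual mapping code: the `demo_` names are hypothetical and the kernel page size is assumed to be a power of two.

```c
#include <stdint.h>

/*
 * The whole-page middle of a range; [start, mid_start) and [mid_end, end)
 * are partial pages that cannot simply be mapped and must be handled
 * separately (for example by copying the data in).
 */
struct demo_split {
	uint64_t mid_start;
	uint64_t mid_end;
};

/* Split [start, end) against a kernel page size that is a power of two. */
struct demo_split demo_split_range(uint64_t start, uint64_t end,
    uint64_t kpagesz)
{
	uint64_t mask = kpagesz - 1;
	struct demo_split s;

	s.mid_start = (start + mask) & ~mask;	/* round start up */
	s.mid_end = end & ~mask;		/* round end down */
	if (s.mid_start > s.mid_end) {
		/* The range lies inside one kernel page: all of it is partial. */
		s.mid_start = end;
		s.mid_end = end;
	}
	return (s);
}
```

For a segment at 0x1000..0x7000 built for 4K pages and loaded on 8K pages, the middle is 0x2000..0x6000, leaving a 4K partial page at each end.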
* Clean up execve locking: | jeff | 2002-07-06 | 1 | -4/+11
  - Grab the vnode object early in exec when we still have the vnode lock.
  - Cache the object in the image_params.
  - Make use of the cached object in imgact_*.c
* Fix typo in the BSD copyright: s/withough/without/ | schweikh | 2002-06-02 | 1 | -1/+1
  Spotted and suggested by: des
  MFC after: 3 weeks
* Remove __P. | alfred | 2002-03-19 | 1 | -19/+18
* Simple p_ucred -> td_ucred changes to start using the per-thread ucred | jhb | 2002-02-27 | 1 | -1/+1
  reference.
* Remove whitespace at end of line. | mp | 2001-12-16 | 1 | -1/+1
* Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader | ps | 2001-10-10 | 1 | -2/+0
  tunable.

  Reviewed by: peter
  MFC after: 2 weeks
* Make uio_yield() a global. Call uio_yield() between chunks | dillon | 2001-09-26 | 1 | -2/+2
  in vn_rdwr_inchunks(), allowing other processes to gain an exclusive lock on the vnode. Specifically: directory scanning, to avoid a race to the root directory, and multiple child processes coring simultaneously so they can figure out that some other core'ing child has an exclusive adv lock and just exit instead.

  This completely fixes performance problems when large programs core. You can have hundreds of copies (forked children) of the same binary core all at once and not notice.

  MFC after: 3 days
* KSE Milestone 2 | julian | 2001-09-12 | 1 | -13/+19
  Note ALL MODULES MUST BE RECOMPILED

  Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.

  Sorry john! (your next MFC will be a doosie!)

  Reviewed by: peter@freebsd.org, dillon@freebsd.org
  X-MFC after: ha ha ha ha
* This brings in a Yahoo coredump patch from Paul, with additional mods by | dillon | 2001-09-08 | 1 | -4/+5
  me (addition of vn_rdwr_inchunks). The problem Yahoo is solving is that if you have large process images core dumping, or you have a large number of forked processes all core dumping at the same time, the original coredump code would leave the vnode locked throughout. This can cause the directory vnode to get locked up, which can cause the parent directory vnode to get locked up, and so on all the way to the root node, locking the entire machine up for extremely long periods of time.

  This patch solves the problem in two ways. First it uses an advisory non-blocking lock to abort multiple processes trying to core to the same file. Second (my contribution) it chunks up the writes and uses bwillwrite() to avoid holding the vnode locked while blocking in the buffer cache.

  Submitted by: ps
  Reviewed by: dillon
  MFC after: 2 weeks
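The chunking idea can be sketched generically: rather than issuing one large write while holding a lock, the data is written in bounded pieces so that any lock is taken and released per piece. The sketch below is a user-space illustration only; `demo_write_inchunks`, the callback type, and the counting sink are hypothetical names and do not reproduce the signature or behavior of vn_rdwr_inchunks().

```c
#include <stddef.h>

/* Callback that writes one chunk; returns 0 on success, nonzero on error. */
typedef int (*demo_write_fn)(void *ctx, const char *buf, size_t len);

/*
 * Write len bytes in pieces of at most chunk bytes. In a real system the
 * callback would lock, write and unlock for each piece, so no lock is held
 * across the whole transfer. Returns 0, or the first error encountered.
 */
int demo_write_inchunks(demo_write_fn fn, void *ctx, const char *buf,
    size_t len, size_t chunk)
{
	if (chunk == 0)
		return (-1);	/* a zero chunk size would never make progress */
	while (len > 0) {
		size_t n = (len < chunk) ? len : chunk;
		int error = fn(ctx, buf, n);

		if (error != 0)
			return (error);
		buf += n;
		len -= n;
	}
	return (0);
}

/* Example sink that only counts calls and bytes. */
struct demo_sink {
	size_t calls;
	size_t bytes;
};

int demo_sink_write(void *ctx, const char *buf, size_t len)
{
	struct demo_sink *s = ctx;

	(void)buf;	/* contents are ignored by this sink */
	s->calls++;
	s->bytes += len;
	return (0);
}
```

Writing 10 bytes with a chunk size of 4 reaches the sink as three calls of 4, 4 and 2 bytes.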
* For ia64, set the default elf brand to be FreeBSD. This is temporarily | peter | 2001-09-02 | 1 | -0/+4
  necessary only for as long as we're using a linux toolchain.
* OR M_WAITOK with M_ZERO in malloc()s args for clarity. | brian | 2001-08-28 | 1 | -1/+1
* Unbreak linux compatibility by providing the correct length of the buffer. | mp | 2001-08-18 | 1 | -1/+1
  Reported by: "Pierre Y. Dampure" <pierre.dampure@westmarsh.com>, "Niels Chr. Bank-Pedersen" <ncbp@bank-pedersen.dk>
  Pointy hat to: mp
* Don't explicitly null-terminate. The buffer we are copying into is | peter | 2001-08-16 | 1 | -1/+0
  already zeroed, and we explicitly leave the last byte untouched.

  Submitted by: bde
* Reduce stack allocation (stack-fast?). | mp | 2001-08-16 | 1 | -40/+65
  elf_load_file() => 352 to 52 bytes
  exec_elf_imgact() => 1072 to 48 bytes
  elf_corehdr() => 396 to 8 bytes

  Reviewed by: julian
* Use explicit sizes for the prpsinfo command length string so that | peter | 2001-08-16 | 1 | -1/+2
  we don't have any more unexpected changes in core dumps. This gets us back to the original core dump layout from a few days ago.
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approach | dillon | 2001-07-04 | 1 | -13/+4
  (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
* Lock the VM while twiddling the vmspace. | jhb | 2001-05-23 | 1 | -1/+2
* Introduce a global lock for the vm subsystem (vm_mtx). | alfred | 2001-05-19 | 1 | -3/+18
  vm_mtx does not recurse and is required for most low level vm operations.

  Faults cannot be taken without holding Giant.

  Memory subsystems can now call the base page allocators safely.

  Almost all atomic ops were removed as they are covered under the vm mutex.

  Alpha and ia64 now need to catch up to i386's trap handlers.

  FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).

  Reviewed (partially) by: jake, jhb
* Convert the allproc and proctree locks from lockmgr locks to sx locks. | jhb | 2001-03-28 | 1 | -3/+3