summaryrefslogtreecommitdiffstats
path: root/sys/compat/freebsd32/freebsd32_misc.c
Commit message (Collapse)AuthorAgeFilesLines
* MFC r263349:kib2014-03-261-22/+27
| | | | Make the array pointed to by AT_PAGESIZES auxv properly aligned.
* MFC r261080:kib2014-02-061-4/+7
| | | | | | | | | The posix_fallocate(2) syscall should return error number on error, without modifying errno. MFC r261290: The posix_madvise(3) and posix_fadvise(2) should return error on failure, same as posix_fallocate(2).
* MFC: r258718: fix emulated jail_v0 byte orderpeter2013-12-041-1/+1
| | | | Approved by: re (gjb)
* Extend the support for exempting processes from being killed when swap isjhb2013-09-191-0/+21
| | | | | | | | | | | | | | | | | | | | | | exhausted. - Add a new protect(1) command that can be used to set or revoke protection from arbitrary processes. Similar to ktrace it can apply a change to all existing descendants of a process as well as future descendants. - Add a new procctl(2) system call that provides a generic interface for control operations on processes (as opposed to the debugger-specific operations provided by ptrace(2)). procctl(2) uses a combination of idtype_t and an id to identify the set of processes on which to operate similar to wait6(). - Add a PROC_SPROTECT control operation to manage the protection status of a set of processes. MADV_PROTECT still works for backwards compatability. - Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc) the first bit of which is used to track if P_PROTECT should be inherited by new child processes. Reviewed by: kib, jilles (earlier version) Approved by: re (delphij) MFC after: 1 month
* Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping usejhb2013-09-091-6/+4
| | | | | | | | | | | | | an address in the first 2GB of the process's address space. This flag should have the same semantics as the same flag on Linux. To facilitate this, add a new parameter to vm_map_find() that specifies an optional maximum virtual address. While here, fix several callers of vm_map_find() to use a VMFS_* constant for the findspace argument instead of TRUE and FALSE. Reviewed by: alc Approved by: re (kib)
* Change the cap_rights_t type from uint64_t to a structure that we can extendpjd2013-09-051-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in the future in a backward compatible (API and ABI) way. The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough. The structure definition looks like this: struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; }; The initial CAP_RIGHTS_VERSION is 0. The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements. The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future. To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg. #define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL) We still support aliases that combine few rights, but the rights have to belong to the same array element, eg: #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL) #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP) There is new API to manage the new cap_rights_t structure: cap_rights_t *cap_rights_init(cap_rights_t *rights, ...); void cap_rights_set(cap_rights_t *rights, ...); void cap_rights_clear(cap_rights_t *rights, ...); bool cap_rights_is_set(const cap_rights_t *rights, ...); bool cap_rights_is_valid(const cap_rights_t *rights); void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src); void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src); bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little); Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg: cap_rights_t rights; cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT); There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg: #define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...); Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1: cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL); Providing several rights that belongs to the same array's element this way is correct, but is not advised. It should only be used for aliases definition. This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x. Sponsored by: The FreeBSD Foundation
* Move the PAIR32TO64() macro and the RETVAL_HI/RETVAL_LO defines to apjd2013-08-181-10/+1
| | | | | | header file for use by other .c files. Sponsored by: The FreeBSD Foundation
* Make sendfile() a method in the struct fileops. Currently onlyglebius2013-08-151-10/+16
| | | | | | | | vnode backed file descriptors have this method implemented. Reviewed by: kib Sponsored by: Nginx, Inc. Sponsored by: Netflix
* Implement compat32 wrappers for the ktimer_* syscalls.kib2013-07-211-0/+64
| | | | | | Reported, reviewed and tested by: Petr Salinger <Petr.Salinger@seznam.cz> Sponsored by: The FreeBSD Foundation MFC after: 1 week
* Move the convert_sigevent32() utility function into freebsd32_misc.ckib2013-07-211-0/+26
| | | | | | | | | | | | for consumption outside the vfs_aio.c. For SIGEV_THREAD_ID and SIGEV_SIGNAL notification delivery methods, also copy in the sigev_value, since librt event pumping loop compares note generation number with the value passed through sigev_value. Tested by: Petr Salinger <Petr.Salinger@seznam.cz> Sponsored by: The FreeBSD Foundation MFC after: 1 week
* Cosmetic change, use the same union name on the left and right sideskib2013-07-211-1/+1
| | | | | | | | of the conversion. Tested by: Petr Salinger <Petr.Salinger@seznam.cz> Sponsored by: The FreeBSD Foundation MFC after: 1 week
* id_t is 64bit, provide the compat32 wrapper for clock_getcpuclockid2(2).kib2013-07-201-0/+14
| | | | | | Reported and tested by: Petr Salinger <Petr.Salinger@seznam.cz> PR: threads/180652 Sponsored by: The FreeBSD Foundation
* Add a "kern.features" MIB for 32bit support under a 64bit kernel.obrien2013-05-311-0/+2
|
* Fix the wait6(2) on 32bit architectures and for the compat32, by usingkib2013-05-211-2/+2
| | | | | | | | | | the right type for the argument in syscalls.master. Also fix the posix_fallocate(2) and posix_fadvise(2) compat32 syscalls on the architectures which require padding of the 64bit argument. Noted and reviewed by: jhb Pointy hat to: kib MFC after: 1 week
* Style fixes for r242958.kib2012-11-161-2/+0
| | | | | Reported and reviewed by: bde MFC after: 28 days
* Add the wait6(2) system call. It takes POSIX waitid()-like processkib2012-11-131-0/+38
| | | | | | | | | | | | | | | | | | | | | designator to select a process which is waited for. The system call optionally returns siginfo_t which would be otherwise provided to SIGCHLD handler, as well as extended structure accounting for child and cumulative grandchild resource usage. Allow to get the current rusage information for non-exited processes as well, similar to Solaris. The explicit WEXITED flag is required to wait for exited processes, allowing for more fine-grained control of the events the waiter is interested in. Fix the handling of siginfo for WNOWAIT option for all wait*(2) family, by not removing the queued signal state. PR: standards/170346 Submitted by: "Jukka A. Ukkonen" <jau@iki.fi> MFC after: 1 month
* Add kern_fhstat(), adjust sys_fhstat() to use it.gleb2012-05-241-1/+2
| | | | | | | Extend kern_getdirentries() to accept uio segflag and optionally return buffer residue. Sponsored by: Google Summer of Code 2011
* On MIPS, _ALIGN always aligns to 8 bytes, even for 32-bit binaries. This mightjmallett2012-03-031-0/+4
| | | | | | | not be ideal, but is the ABI we've shipped so far. Fix macros which reflect the results of _ALIGN on 32-bit MIPS to use the right alignment. This fixes sendmsg under COMPAT_FREEBSD32 on n64 MIPS kernels.
* o) Add COMPAT_FREEBSD32 support for MIPS kernels using the n64 ABI with ↵jmallett2012-03-031-0/+6
| | | | | | | | | | | | | | | | | | | | | userlands using the o32 ABI. This mostly follows nwhitehorn's lead in implementing COMPAT_FREEBSD32 on powerpc64. o) Add a new type to the freebsd32 compat layer, time32_t, which is time_t in the 32-bit ABI being used. Since the MIPS port is relatively-new, even the 32-bit ABIs use a 64-bit time_t. o) Because time{spec,val}32 has the same size and layout as time{spec,val} on MIPS with 32-bit compatibility, then, disable some code which assumes otherwise wrongly when built for MIPS. A more general macro to check in this case would seem like a good idea eventually. If someone adds support for using n32 userland with n64 kernels on MIPS, then they will have to add a variety of flags related to each piece of the ABI that can vary. That's probably the right time to generalize further. o) Add MIPS to the list of architectures which use PAD64_REQUIRED in the freebsd32 compat code. Probably this should be generalized at some point. Reviewed by: gonzo
* Make sure all intermediate variables holding mount flags (mnt_flag)mckusick2012-01-171-3/+11
| | | | | | | and that all internal kernel calls passing mount flags are declared as uint64_t so that flags in the top 32-bits are not lost. MFC after: 2 weeks
* - Split out a kern_posix_fadvise() from the posix_fadvise() system call sojhb2011-11-141-11/+4
| | | | | | | it can be used by in-kernel consumers. - Make kern_posix_fallocate() public. - Use kern_posix_fadvise() and kern_posix_fallocate() to implement the freebsd32 wrappers for the two system calls.
* Add the posix_fadvise(2) system call. It is somewhat similar tojhb2011-11-041-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | madvise(2) except that it operates on a file descriptor instead of a memory region. It is currently only supported on regular files. Just as with madvise(2), the advice given to posix_fadvise(2) can be divided into two types. The first type provide hints about data access patterns and are used in the file read and write routines to modify the I/O flags passed down to VOP_READ() and VOP_WRITE(). These modes are thus filesystem independent. Note that to ease implementation (and since this API is only advisory anyway), only a single non-normal range is allowed per file descriptor. The second type of hints are used to hint to the OS that data will or will not be used. These hints are implemented via a new VOP_ADVISE(). A default implementation is provided which does nothing for the WILLNEED request and attempts to move any clean pages to the cache page queue for the DONTNEED request. This latter case required two other changes. First, a new V_CLEANONLY flag was added to vinvalbuf(). This requests vinvalbuf() to only flush clean buffers for the vnode from the buffer cache and to not remove any backing pages from the vnode. This is used to ensure clean pages are not wired into the buffer cache before attempting to move them to the cache page queue. The second change adds a new vm_object_page_cache() method. This method is somewhat similar to vm_object_page_remove() except that instead of freeing each page in the specified range, it attempts to move clean pages to the cache queue if possible. To preserve the ABI of struct file, the f_cdevpriv pointer is now reused in a union to point to the currently active advice region if one is present for regular files. Reviewed by: jilles, kib, arch@ Approved by: re (kib) MFC after: 1 month
* Control the execution permission of the readable segments forkib2011-10-151-2/+2
| | | | | | | i386 binaries on the amd64 and ia64 with the sysctl, instead of unconditionally enabling it. Reviewed by: marcel
* Use PAIR32TO64() for the offset and length parameters tojhb2011-10-141-2/+2
| | | | | | | freebsd32_posix_fallocate() to properly handle big-endian platforms. Reviewed by: mdf MFC after: 1 week
* Use PTRIN().marcel2011-10-131-1/+1
|
* Wrap mprotect(2) so that we can add execute permissions when readmarcel2011-10-131-0/+15
| | | | | permissions are requested. This is needed on amd64 and ia64 for JDK 1.4.x
* In freebsd32_mmap() and when compiling for amd64 or ia64, alsomarcel2011-10-131-0/+5
| | | | | ask for execute permissions when read permissions are wanted. This is needed for JDK 1.4.x on i386.
* In order to maximize the re-usability of kernel code in user space thiskmacy2011-09-161-19/+19
| | | | | | | | | | | | | patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)
* Implement compat32 for old lseek, for the a.out binaries on amd64.kib2011-06-161-0/+13
|
* Add the posix_fallocate(2) syscall. The default implementation inmdf2011-04-181-0/+12
| | | | | | | | | | | | | | vop_stdallocate() is filesystem agnostic and will run as slow as a read/write loop in userspace; however, it serves to correctly implement the functionality for filesystems that do not implement a VOP_ALLOCATE. Note that __FreeBSD_version was already bumped today to 900036 for any ports which would like to use this function. Also reserve space in the syscall table for posix_fadvise(2). Reviewed by: -arch (previous version)
* Add support for executing the FreeBSD 1/i386 a.out binaries on amd64.kib2011-04-011-3/+97
| | | | | | | | | | | | | | | In particular: - implement compat shims for old stat(2) variants and ogetdirentries(2); - implement delivery of signals with ancient stack frame layout and corresponding sigreturn(2); - implement old getpagesize(2); - provide a user-mode trampoline and LDT call gate for lcall $7,$0; - port a.out image activator and connect it to the build as a module on amd64. The changes are hidden under COMPAT_43. MFC after: 1 month
* Provide compat32 shims for kldstat(2).kib2011-03-301-0/+27
| | | | | Requested and tested by: jpaetzel MFC after: 1 week
* Create shared (readonly) page. Each ABI may specify the use of page bykib2011-01-081-2/+5
| | | | | | | | | | | | | setting SV_SHP flag and providing pointer to the vm object and mapping address. Provide simple allocator to carve space in the page, tailored to put the code with alignment restrictions. Enable shared page use for amd64, both native and 32bit FreeBSD binaries. Page is private mapped at the top of the user address space, moving a start of the stack one page down. Move signal trampoline code from the top of the stack to the shared page. Reviewed by: alc
* Update MNT_ROOTFS comments after changes in the root mount logic.pluknet2010-11-231-1/+2
| | | | | | Reported by: arundel Suggested by: marcel (phrasing) Approved by: kib (mentor)
* Supply some useful information to the started image using ELF aux vectors.kib2010-08-171-3/+26
| | | | | | | | In particular, provide pagesize and pagesizes array, the canary value for SSP use, number of host CPUs and osreldate. Tested by: marius (sparc64) MFC after: 1 month
* Prefer struct sysentvec sv_psstrings to hardcoding FREEBSD32_PS_STRINGSkib2010-08-071-1/+2
| | | | | | in the compat32 code. Use sv_usrstack instead of FREEBSD32_USRSTACK as well. MFC after: 1 week
* Copy inode birthtime to the struct stat32.kib2010-08-041-0/+1
| | | | MFC after: 1 week
* Fix style.kib2010-08-041-1/+2
| | | | MFC after: 1 week
* When compat32 recvmsg(2) does not need to copy out control messages, setkib2010-08-031-0/+2
| | | | | | | | msg_controllen to 0. PR: kern/149227 Submitted by: Stef Walter <stef memberwebs com> MFC after: 1 weeks
* Introduce exec_alloc_args(). The objective being to encapsulate thealc2010-07-271-9/+7
| | | | | | | | | | details of the string buffer allocation in one place. Eliminate the portion of the string buffer that was dedicated to storing the interpreter name. The pointer to the interpreter name can simply be made to point to the appropriate argument string. Reviewed by: kib
* Change the order in which the file name, arguments, environment, andalc2010-07-251-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | shell command are stored in exec*()'s demand-paged string buffer. For a "buildworld" on an 8GB amd64 multiprocessor, the new order reduces the number of global TLB shootdowns by 31%. It also eliminates about 330k page faults on the kernel address space. Change exec_shell_imgact() to use "args->begin_argv" consistently as the start of the argument and environment strings. Previously, it would sometimes use "args->buf", which is the start of the overall buffer, but no longer the start of the argument and environment strings. While I'm here, eliminate unnecessary passing of "&length" to copystr(), where we don't actually care about the length of the copied string. Clean up the initialization of the exec map. In particular, use the correct size for an entry, and express that size in the same way that is used when an entry is allocated. The old size was one page too large. (This discrepancy originated in 2004 when I rewrote exec_map_first_page() to use sf_buf_alloc() instead of the exec map for mapping the first page of the executable.) Reviewed by: kib
* Remove the linux_exec_copyin_args(), freebsd32_exec_copyin_args() maykib2010-07-231-1/+1
| | | | | | | server as well. COMPAT_FREEBSD32 is a prerequisite for COMPAT_LINUX32. Reviewed by: alc MFC after: 3 weeks
* Eliminate a little bit of duplicated code.alc2010-07-231-3/+1
|
* Constify source argument for siginfo_to_siginfo32().kib2010-07-041-1/+1
| | | | MFC after: 1 week
* Extract the code to copy-out struct rusage32 from struct rusagekib2010-04-211-32/+24
| | | | | | | into the new function. Reviewed by: jhb MFC after: 1 week
* Rename st_*timespec fields to st_*tim for POSIX 2008 compliance.ed2010-03-281-3/+3
| | | | | | | | | | | | | | | A nice thing about POSIX 2008 is that it finally standardizes a way to obtain file access/modification/change times in sub-second precision, namely using struct timespec, which we already have for a very long time. Unfortunately POSIX uses different names. This commit adds compatibility macros, so existing code should still build properly. Also change all source code in the kernel to work without any of the compatibility macros. This makes it all a less ambiguous. I am also renaming st_birthtime to st_birthtim, even though it was a local extension anyway. It seems Cygwin also has a st_birthtim.
* Remove empty line.kib2010-03-191-1/+0
| | | | MFC after: 2 weeks
* Move SysV IPC freebsd32 compat shims from freebsd32_misc.c to correspondingkib2010-03-191-530/+0
| | | | | | | | | | | | | | sysv_{msg,sem,shm}.c files. Mark SysV IPC freebsd32 syscalls as NOSTD and add required SYSCALL_INIT_HELPER/SYSCALL32_INIT_HELPERs to provide auto register/unregister on module load. This makes COMPAT_FREEBSD32 functional with SysV IPC compiled and loaded as modules. Reviewed by: jhb MFC after: 2 weeks
* Move SysV IPC freebsd32 compat shims helpers from freebsd32_misc.c tokib2010-03-191-54/+0
| | | | | | | sysv_ipc.c. Reviewed by: jhb MFC after: 2 weeks
* Introduce SYSCALL_INIT_HELPER and SYSCALL32_INIT_HELPER macros andkib2010-03-191-0/+30
| | | | | | | | | neccessary support functions to allow registering dynamically loaded syscalls from the MOD_LOAD handlers. Helpers handle registration failures semi-automatically. Reviewed by: jhb MFC after: 2 weeks
OpenPOWER on IntegriCloud