summaryrefslogtreecommitdiffstats
path: root/sys/compat/linux
Commit message (Collapse)AuthorAgeFilesLines
* Rename st_*timespec fields to st_*tim for POSIX 2008 compliance.ed2010-03-281-9/+18
| | | | | | | | | | | | | | | A nice thing about POSIX 2008 is that it finally standardizes a way to obtain file access/modification/change times in sub-second precision, namely using struct timespec, which we already have for a very long time. Unfortunately POSIX uses different names. This commit adds compatibility macros, so existing code should still build properly. Also change all source code in the kernel to work without any of the compatibility macros. This makes it all a less ambiguous. I am also renaming st_birthtime to st_birthtim, even though it was a local extension anyway. It seems Cygwin also has a st_birthtim.
* Fix some problems which may lead to a panic:netchild2010-03-261-1/+3
| | | | | | | - right order of src and dst in memcpy - NULL out the clips after freeing to prevent an accident Noticed by: hselasky
* Actually make O_DIRECTORY work.ed2010-03-211-6/+2
| | | | | | According to POSIX open() must return ENOTDIR when the path name does not refer to a path name. Change vn_open() to respect this flag. This also simplifies the Linuxolator a bit.
* The NetBSD Foundation has granted permission to remove clause 3 and 4 fromjoel2010-03-012-14/+0
| | | | | | their software. Obtained from: NetBSD
* No need to include security/mac/mac_framework.h here.pjd2010-02-181-2/+0
|
* - Return EAFNOSUPPORT instead of EINVAL for unsupported address family,delphij2010-02-091-2/+7
| | | | | | | | | | | | this matches the Linux behavior. - Check if we have sufficient space allocated for socket structure, which fixes a buffer overflow when wrong length is being passed into the emulation layer. [1] PR: kern/138860 Submitted by: Mateusz Guzik <mjguzik gmail com> Reported by: Alexander Best [1] MFC after: 2 weeks
* Let us to use our libusb(3) in Linuxolator.wkoszek2010-01-182-0/+28
| | | | | | | | | | | With this change, Linux binaries can work with our libusb(3) when it's compiled against our header files on GNU/Linux system -- this solves the problem with differences between /dev layouts. With ported libusb(3), I am able to use my USB JTAG cable with Linux binaries that support it. Reviewed by: thompsa
* Whitespace change to be able to provide the correct commit log for r202364:netchild2010-01-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ---snip--- Add video clipping support but with the caveats below. Background info: Video clipping allows the user to provide either a series of clip rectangles or a clip bitmap to the driver and have the driver mask the video according to the clipping specs provided. Adding support for clipping to the FreeBSD Linux emulator is problematic because it seems that this feature is not supported by many drivers and therefore it is ignored by many applications. Unfortunately, when not using it, rather than passing in a null clipping list, some apps leave the clipping fields uninitialized, casuing random values to be passed in. In the case where the driver does not use the clipping info, this is not a problem (although it is bad form). But the Linux emulator does not know which drivers will use this and which won't, so the Linux emulator must try to handle this clip list, and deal gracefully with cases where the values seem to be uninitialized. Video clipping info is passed in using the VIDIOCSWIN ioctl in two fields in the video_window structure: the integer clipcount and the pointer clips. How the linuxulator handles this from this commit on: * if (clipcount == VIDEO_CLIP_BITMAP) The clips variable is a void * pointer to a 128*625 byte (1024*625 bit) memory area containing a bitmap of the clipping area. The pointer in the video_window structure is copied, but no video_clip structures are copied. * if (clipcount > 0 && clipcount <= 16384) The clips variable is pointer to a list of video_clip structures. Up to clipcount structures are copied and passed to the driver. The upper limit of 16384 was imposed here so that user code that does not properly initialize clipcount falls through below and no attempt is made to copy an uninitialized list. This value was found by examining Linux drivers that support the clip list. * else The clipcount is either negative (but not VIDEO_CLIP_BITMAP), zero or positive (> 16384). All these cases are treated as invalid data. Both the clipcount field and clips pointer are forced to zero/NULL and passed to the driver. It should be noted that, at the time of developing this V4L emulator code, the pwc(4) V4L driver does not support clipping. Submitted by: J.R. Oldroyd <fbsd@opal.com> MFC after: 1 month ---snip---
* This is v4l support for the linuxulator. This allows to access FreeBSDnetchild2010-01-151-33/+80
| | | | | | | | | | | | | | | native devices which support the v4l API from processes running within the linuxulator, e.g. skype or flash can access the multimedia/pwcbsd driver. Not tested is firmware upload, framebuffer stuff and video tuner stuff due to lack of hardware. The clipping part (VIDIOCSWIN) needs a little bit of further work (partly in progress, but can not be tested due to lack of a suitable device). The submitter tested this sucessfully with Skype and flash apps on amd64 and i386 with the multimedia/pwcbsd driver. Submitted by: J.R. Oldroyd <fbsd@opal.com>
* Since all other comparisons involving ngroups_max usebrooks2010-01-152-2/+2
| | | | | "ngroups_max + 1", use ">= ngroups_max+1" instead of the equivalent "> ngroups_max" to reduce confusion.
* Replace the static NGROUPS=NGROUPS_MAX+1=1024 with a dynamicbrooks2010-01-122-2/+2
| | | | | | | | kern.ngroups+1. kern.ngroups can range from NGROUPS_MAX=1023 to INT_MAX-1. Given that the Windows group limit is 1024, this range should be sufficient for most applications. MFC after: 1 month
* Background:mckusick2010-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When renaming a directory it passes through several intermediate states. First its new name will be created causing it to have two names (from possibly different parents). Next, if it has different parents, its value of ".." will be changed from pointing to the old parent to pointing to the new parent. Concurrently, its old name will be removed bringing it back into a consistent state. When fsck encounters an extra name for a directory, it offers to remove the "extraneous hard link"; when it finds that the names have been changed but the update to ".." has not happened, it offers to rewrite ".." to point at the correct parent. Both of these changes were considered unexpected so would cause fsck in preen mode or fsck in background mode to fail with the need to run fsck manually to fix these problems. Fsck running in preen mode or background mode now corrects these expected inconsistencies that arise during directory rename. The functionality added with this update is used by fsck running in background mode to make these fixes. Solution: This update adds three new fsck sysctl commands to support background fsck in correcting expected inconsistencies that arise from incomplete directory rename operations. They are: setcwd(dirinode) - set the current directory to dirinode in the filesystem associated with the snapshot. setdotdot(oldvalue, newvalue) - Verify that the inode number for ".." in the current directory is oldvalue then change it to newvalue. unlink(nameptr, oldvalue) - Verify that the inode number associated with nameptr in the current directory is oldvalue then unlink it. As with all other fsck sysctls, these new ones may only be used by processes with appropriate priviledge. Reported by: jeff Security issues: rwatson
* Remove extraneous semicolons, no functional changes.mbr2010-01-071-1/+1
| | | | | Submitted by: Marc Balmer <marc@msys.ch> MFC after: 1 week
* Signal 0 is used to check the permission for current process to signalkib2009-12-181-1/+1
| | | | | | | | | target one. Since r184058, linux_do_tkill() calls tdsignal() instead of kill(), without checking for validity of supplied signal number. Prevent panic when supplied signal is 0 by finishing work after checks. Found and tested by: scf MFC after: 3 days
* This is v4l support for the linuxulator. This allows to access FreeBSDnetchild2009-12-044-39/+425
| | | | | | | | | | | | | | | native devices which support the v4l API from processes running within the linuxulator, e.g. skype or flash can access the multimedia/pwcbsd driver. Not tested is firmware upload, framebuffer stuff and video tuner stuff due to lack of hardware. The clipping part (VIDIOCSWIN) needs a little bit of further work (partly in progress, but can not be tested due to lack of a suitable device). The submitter tested this sucessfully with Skype and flash apps on amd64 and i386 with the multimedia/pwcbsd driver. Submitted by: J.R. Oldroyd <fbsd@opal.com>
* Import the unchanged v4l videodev.h from the vendor branch.netchild2009-12-041-0/+372
|
* Fix typo in kernel message. The fix is based upon the patch in the PR.netchild2009-11-051-1/+1
| | | | | | PR: kern/140279 Submitted by: Alexander Best <alexbestms@math.uni-muenster.de> MFC after: 1 week
* Unconditionally call the setsockopt for IPV6_V6ONLY for v6 linux socketsbz2009-10-251-12/+5
| | | | | | | | | | | | | no matter whether we are compiled as module or if our default of the net.inet6.ip6.v6only sysctl already matches what we would set. This avoids unnecessary complications with modules, VIMAGES, INET6 and the sysctl value, especially considering that most users will use linux compat as a module. Discussed with: kib, rwatson (weeks ago) Reviewed by: rwatson MFC after: 6 weeks
* Lock the ifnet list while iterating over it.zec2009-09-131-0/+2
| | | | | Submitted by: julian MFC after: 3 days
* kern_select(9) copies fd_set in and out of userspace in quantities ofkib2009-09-091-1/+1
| | | | | | | | | | | | longs. Since 32bit processes longs are 4 bytes, 64bit kernel may copy in or out 4 bytes more then the process expected. Calculate the amount of bytes to copy taking into account size of fd_set for the current process ABI. Diagnosed and tested by: Peter Jeremy <peterjeremy acm org> Reviewed by: jhb MFC after: 1 week
* Fix a few panics in linuxulator + VIMAGE due to curvnet not being set.zec2009-08-281-1/+8
| | | | | | | This change affects only options VIMAGE builds. Reviewed by: julian MFC after: 3 days
* Rework global locks for interface list and index management, correctingrwatson2009-08-231-6/+4
| | | | | | | | | | | | | | several critical bugs, including race conditions and lock order issues: Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an sxlock. Either can be held to stablize the lists and indexes, but both are required to write. This allows the list to be held stable in both network interrupt contexts and sleepable user threads across sleeping memory allocations or device driver interactions. As before, writes to the interface list must occur from sleepable contexts. Reviewed by: bz, julian MFC after: 3 days
* Merge the remainder of kern_vimage.c and vimage.h into vnet.c andrwatson2009-08-012-2/+0
| | | | | | | | | | vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)
* Some jail parameters (in particular, "ip4" and "ip6" for IP addressjamie2009-07-251-35/+43
| | | | | | | | | restrictions) were found to be inadequately described by a boolean. Define a new parameter type with three values (disable, new, inherit) to handle these and future cases. Approved by: re (kib), bz (mentor) Discussed with: rwatson
* Build on Jeff Roberson's linker-set based dynamic per-CPU allocatorrwatson2009-07-142-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
* Replace AUDIT_ARG() with variable argument macros with a set more morerwatson2009-06-271-3/+3
| | | | | | | | | | | | | | specific macros for each audit argument type. This makes it easier to follow call-graphs, especially for automated analysis tools (such as fxr). In MFC, we should leave the existing AUDIT_ARG() macros as they may be used by third-party kernel modules. Suggested by: brooks Approved by: re (kib) Obtained from: TrustedBSD Project MFC after: 1 week
* Change the ABI of some of the structures used by the SYSV IPC API:jhb2009-06-241-6/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned short. - The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned short. - The mode member of struct ipc_perm is now mode_t instead of unsigned short (this is merely a style bug). - The rather dubious padding fields for ABI compat with SV/I386 have been removed from struct msqid_ds and struct semid_ds. - The shm_segsz member of struct shmid_ds is now a size_t instead of an int. This removes the need for the shm_bsegsz member in struct shmid_kernel and should allow for complete support of SYSV SHM regions >= 2GB. - The shm_nattch member of struct shmid_ds is now an int instead of a short. - The shm_internal member of struct shmid_ds is now gone. The internal VM object pointer for SHM regions has been moved into struct shmid_kernel. - The existing __semctl(), msgctl(), and shmctl() system call entries are now marked COMPAT7 and new versions of those system calls which support the new ABI are now present. - The new system calls are assigned to the FBSD-1.1 version in libc. The FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls. - A simplistic framework for tagging system calls with compatibility symbol versions has been added to libc. Version tags are added to system calls by adding an appropriate __sym_compat() entry to src/lib/libc/incldue/compat.h. [1] PR: kern/16195 kern/113218 bin/129855 Reviewed by: arch@, rwatson Discussed with: kan, kib [1]
* After cleaning up rt_tables from vnet.h and cleaning up opt_route.hbz2009-06-231-1/+0
| | | | | a lot of files no longer need route.h either. Garbage collect them. While here remove now unneeded vnet.h #includes as well.
* Rework the credential code to support larger values of NGROUPS andbrooks2009-06-192-15/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.) The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc. Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care for the details of allocating groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for insuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search. Because we can not change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used or NGRPS as appropriate for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error. Minor changes: - Reduce the number of hand rolled versions of groupmember(). - Do not assign to both cr_gid and cr_groups[0]. - Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity. Submitted by: Isilon Systems (initial implementation) X-MFC after: never PR: bin/113398 kern/133867
* Add explicit includes for jail.h to the files that need them andbz2009-06-171-0/+1
| | | | remove the "hidden" one from vimage.h.
* Get vnets from creds instead of threads where they're available, and fromjamie2009-06-151-3/+3
| | | | | | | passed threads instead of curthread. Reviewed by: zec, julian Approved by: bz (mentor)
* Unlock process lock when return error from getrobustlist call.dchagin2009-06-141-1/+3
| | | | | | Tested by: Alexander Best <alexbestms at math uni-muenster de> Approved by: kib (mentor) MFC after: 3 days
* Add counterparts to getcredhostname:jamie2009-06-131-6/+1
| | | | | | | getcreddomainname, getcredhostuuid, getcredhostid Suggested by: rmacklem Approved by: bz
* After r193232 rt_tables in vnet.h are no longer indirectly dependent onbz2009-06-081-1/+0
| | | | | | | | | the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERICrwatson2009-06-054-4/+0
| | | | | | | | and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd
* Add forgotten in previous commit flags argument.dchagin2009-06-011-2/+2
| | | | | Approved by: kib (mentor) MFC after: 1 month
* Implement accept4 syscall.dchagin2009-06-011-1/+19
| | | | | Approved by: kib (mentor) MFC after: 1 month
* Implement a variation of the accept_common() which takesdchagin2009-06-011-14/+21
| | | | | | | | | | a flags argument. Do not preserve td_retval before kern_fcntl(F_SETFL) as it does not changed. Approved by: kib (mentor) MFC after: 1 month
* Split linux_accept() syscall onto linux_accept_common() which shoulddchagin2009-06-011-13/+22
| | | | | | | be used by linuxulator and linux_accept() itself. Approved by: kib (mentor) MFC after: 1 month
* Implement a variation of the socketpair() syscall which takes a flagsdchagin2009-05-311-2/+28
| | | | | | | in addition to the type argument. Approved by: kib (mentor) MFC after: 1 month
* Move new socket flags handling into a separate function as Linuxdchagin2009-05-311-15/+23
| | | | | | | introduced more syscalls which uses these flags. Approved by: kib (mentor) MFC after: 1 month
* Remove empty lines.dchagin2009-05-311-2/+0
| | | | | Approved by: kib (mentor) MFC after: 1 month
* Place hostnames and similar information fully under the prison system.jamie2009-05-291-5/+5
| | | | | | | | | | | | | | | | | The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)
* linux_ioctl_cdrom: reduce stack usageavg2009-05-271-11/+16
| | | | | | | | | | | ... by moving two ~2KB structures from stack to heap allocation. I experienced stack overflow in linux emulation on i386 (8K stack) when LINUX_DVD_READ_STRUCT ioctl was performed on atapicam cd device and there was an error that resulted in additional quite heavy stack use in cam layer. Reviewed by: dchagin Approved by: jhb (mentor)
* Add hierarchical jails. A jail may further virtualize its environmentjamie2009-05-271-140/+92
| | | | | | | | | | | | | | | | | | | | | | by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)
* Validate user-supplied arguments values.dchagin2009-05-191-1/+28
| | | | | | | | | Args argument is a pointer to the structure located in user space in which the socketcall arguments are packed. The structure must be copied to the kernel instead of direct dereferencing. Approved by: kib (mentor) MFC after: 1 week
* Implement MSG_CMSG_CLOEXEC flag for linux_recvmsg().dchagin2009-05-182-9/+25
| | | | | Approved by: kib (mentor) MFC after: 1 month
* Somewhere between 2.6.23 and 2.6.27, Linux added SOCK_CLOEXEC anddchagin2009-05-162-2/+30
| | | | | | | | | | SOCK_NONBLOCK flags, that allow to save fcntl() calls. Implement a variation of the socket() syscall which takes a flags in addition to the type argument. Approved by: kib (mentor) MFC after: 1 month
* Return EINVAL in case when the incorrect or unsupporteddchagin2009-05-162-0/+12
| | | | | | | | | type argument is specified. Do not map type argument value as its Linux values are identical to FreeBSD values. Approved by: kib (mentor)
* Use the protocol family constants for the domain argument validation.dchagin2009-05-161-3/+5
| | | | | | | Return immediately when the socket() failed. Approved by: kib (mentor) MFC after: 1 month
OpenPOWER on IntegriCloud