summaryrefslogtreecommitdiffstats
path: root/lib
Commit message (Collapse)AuthorAgeFilesLines
* Merge projects/bhyve_npt_pmap into head.neel2013-10-052-2/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the amd64/pmap code aware of nested page table mappings used by bhyve guests. This allows bhyve to associate each guest with its own vmspace and deal with nested page faults in the context of that vmspace. This also enables features like accessed/dirty bit tracking, swapping to disk and transparent superpage promotions of guest memory. Guest vmspace: Each bhyve guest has a unique vmspace to represent the physical memory allocated to the guest. Each memory segment allocated by the guest is mapped into the guest's address space via the 'vmspace->vm_map' and is backed by an object of type OBJT_DEFAULT. pmap types: The amd64/pmap now understands two types of pmaps: PT_X86 and PT_EPT. The PT_X86 pmap type is used by the vmspace associated with the host kernel as well as user processes executing on the host. The PT_EPT pmap is used by the vmspace associated with a bhyve guest. Page Table Entries: The EPT page table entries as mostly similar in functionality to regular page table entries although there are some differences in terms of what bits are used to express that functionality. For e.g. the dirty bit is represented by bit 9 in the nested PTE as opposed to bit 6 in the regular x86 PTE. Therefore the bitmask representing the dirty bit is now computed at runtime based on the type of the pmap. Thus PG_M that was previously a macro now becomes a local variable that is initialized at runtime using 'pmap_modified_bit(pmap)'. An additional wrinkle associated with EPT mappings is that older Intel processors don't have hardware support for tracking accessed/dirty bits in the PTE. This means that the amd64/pmap code needs to emulate these bits to provide proper accounting to the VM subsystem. This is achieved by using the following mapping for EPT entries that need emulation of A/D bits: Bit Position Interpreted By PG_V 52 software (accessed bit emulation handler) PG_RW 53 software (dirty bit emulation handler) PG_A 0 hardware (aka EPT_PG_RD) PG_M 1 hardware (aka EPT_PG_WR) The idea to use the mapping listed above for A/D bit emulation came from Alan Cox (alc@). The final difference with respect to x86 PTEs is that some EPT implementations do not support superpage mappings. This is recorded in the 'pm_flags' field of the pmap. TLB invalidation: The amd64/pmap code has a number of ways to do invalidation of mappings that may be cached in the TLB: single page, multiple pages in a range or the entire TLB. All of these funnel into a single EPT invalidation routine called 'pmap_invalidate_ept()'. This routine bumps up the EPT generation number and sends an IPI to the host cpus that are executing the guest's vcpus. On a subsequent entry into the guest it will detect that the EPT has changed and invalidate the mappings from the TLB. Guest memory access: Since the guest memory is no longer wired we need to hold the host physical page that backs the guest physical page before we can access it. The helper functions 'vm_gpa_hold()/vm_gpa_release()' are available for this purpose. PCI passthru: Guest's with PCI passthru devices will wire the entire guest physical address space. The MMIO BAR associated with the passthru device is backed by a vm_object of type OBJT_SG. An IOMMU domain is created only for guest's that have one or more PCI passthru devices attached to them. Limitations: There isn't a way to map a guest physical page without execute permissions. This is because the amd64/pmap code interprets the guest physical mappings as user mappings since they are numerically below VM_MAXUSER_ADDRESS. Since PG_U shares the same bit position as EPT_PG_EXECUTE all guest mappings become automatically executable. Thanks to Alan Cox and Konstantin Belousov for their rigorous code reviews as well as their support and encouragement. Thanks for John Baldwin for reviewing the use of OBJT_SG as the backing object for pci passthru mmio regions. Special thanks to Peter Holm for testing the patch on short notice. Approved by: re Discussed with: grehan Reviewed by: alc, kib Tested by: pho
* accept(2): Update portability note for accept4().jilles2013-10-011-3/+10
| | | | | | | | | | | The accept(2) man page warns that O_NONBLOCK and other properties on the new socket may vary across implementations. However, this issue only applies to accept() and not to accept4(). On the other hand, accept4() is not commonly available yet. Reported by: pluknet Reviewed by: bjk Approved by: re (kib)
* Remove BIND.des2013-09-3018-6851/+0
| | | | Approved by: re (gjb)
* Temporarily disable iconv for non-shared library builds. The dynamicdelphij2013-09-261-1/+3
| | | | | | loading of conversation table is not yet compatible with static builds. Approved by: re (gjb)
* Import NetBSD readline.c,v 1.104: do not crash with add_history(NULL).delphij2013-09-261-0/+3
| | | | | MFC after: 3 days Approved by: re (gjb)
* Add an elf note on ARM to store the MACHINE_ARCH an executable was builtandrew2013-09-262-0/+15
| | | | | | | | | | for. This is useful for software needing to know which architecture a binary is built for as arm and armv6 have slight differences meaning only some binaries build for one will work as expected on the other. It is expected pkgng will be able to make use of this to simplify the logic to determine which package ABI to use. Approved by: re (kib)
* Add LLDB bmake infrastructureemaste2013-09-2038-9/+963
| | | | | | | | | | | This connects LLDB to the build, but it is disabled by default. Add WITH_LLDB= to src.conf to build it. Note that LLDB requires a C++11 compiler so is disabled on platforms using GCC. Approved by: re (gjb) Sponsored by: DARPA, AFRL
* Minor mdoc improvements.joel2013-09-191-4/+4
| | | | Approved by: re (blanket)
* Extend the support for exempting processes from being killed when swap isjhb2013-09-193-0/+146
| | | | | | | | | | | | | | | | | | | | | | exhausted. - Add a new protect(1) command that can be used to set or revoke protection from arbitrary processes. Similar to ktrace it can apply a change to all existing descendants of a process as well as future descendants. - Add a new procctl(2) system call that provides a generic interface for control operations on processes (as opposed to the debugger-specific operations provided by ptrace(2)). procctl(2) uses a combination of idtype_t and an id to identify the set of processes on which to operate similar to wait6(). - Add a PROC_SPROTECT control operation to manage the protection status of a set of processes. MADV_PROTECT still works for backwards compatability. - Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc) the first bit of which is used to track if P_PROTECT should be inherited by new child processes. Reviewed by: kib, jilles (earlier version) Approved by: re (delphij) MFC after: 1 month
* Remove an unused variable and fix a memory leak in sctp_connectx().tuexen2013-09-191-3/+4
| | | | | Approved by: re (gjb) MFC after: 3 days
* Move libldns to the correct (ordered) library list.des2013-09-151-1/+1
| | | | Approved by: re (blanket)
* Build and install the Unbound caching DNS resolver daemon.des2013-09-152-0/+39
| | | | Approved by: re (blanket)
* After r255294, building lib/msun's symbol map (using clang as thedim2013-09-121-1/+1
| | | | | | | | | | | | | | preprocessor) gives the following error: --- Version.map --- <stdin>:287:4: error: invalid preprocessing directive # Implemented as weak aliases for imprecise versions ^ 1 error generated. Change the comment to a C-style one, to prevent this error. Approved by: re (hrs)
* Consistently reference file descriptors as "fd". 55 other manpagesbdrewery2013-09-126-37/+37
| | | | | | | | used "fd", while these used "d" and "filedes". MFC after: 1 week Approved by: gjb Approved by: re (delphij)
* Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping usejhb2013-09-091-1/+14
| | | | | | | | | | | | | an address in the first 2GB of the process's address space. This flag should have the same semantics as the same flag on Linux. To facilitate this, add a new parameter to vm_map_find() that specifies an optional maximum virtual address. While here, fix several callers of vm_map_find() to use a VMFS_* constant for the findspace argument instead of TRUE and FALSE. Reviewed by: alc Approved by: re (kib)
* LDNS needs OpenSSL. This wasn't a problem as long as it was only builddes2013-09-081-0/+3
| | | | | | statically, since any program using it would have to link with it anyway. Approved by: re (blanket)
* Make libldns and libssh private.des2013-09-082-1/+2
| | | | Approved by: re (blanket)
* Update to OpenPAM Nummularia.des2013-09-071-1/+6
|\
| * Vendor import of OpenPAM Nummularia..des2013-09-0786-424/+1837
| |
* | MFV (r255364): move the code around in preparation for Nummularia.des2013-09-071-1/+1
|\ \ | |/
| * Prepare for OpenPAM Nummularia by reorganizing to match its new directorydes2013-09-0776-0/+0
| | | | | | | | structure.
| * Merge upstream r634:646: correctly parse mixed quoted / unquoted text.des2013-03-042-10/+3
| | | | | | | | See http://www.openpam.org/wiki/Errata#Configurationparsing for details.
| * OpenPAM Micrampelis was re-rolled due to a showstopper bug.des2012-05-261-1/+3
| |
* | On ARM EABI double precision floating point values are stored in theandrew2013-09-074-4/+4
| | | | | | | | | | | | endian the CPU is in, i.e. little-endian on most ARM cores. This allows ARMv4 and ARMv5 boards to boot with the ARM EABI.
* | wait(2): Add some possible caveats to standards section.jilles2013-09-071-4/+18
| |
* | libc: Make resolver sockets close-on-exec (SOCK_CLOEXEC).jilles2013-09-061-2/+3
| | | | | | | | | | Although the resolver's sockets are exposed to applications via res_state, I do not expect them to pass the sockets across execve().
* | libc: Use SOCK_CLOEXEC for various internal file descriptors.jilles2013-09-064-9/+12
| | | | | | | | | | | | | | This change avoids undesirably passing some internal file descriptors to a process created (fork+exec) by another thread. Kernel support for SOCK_CLOEXEC was added in r248534, March 19, 2013.
* | libc/stdio: Allow fopen/freopen modes in any order (except initial r/w/a).jilles2013-09-061-27/+28
| | | | | | | | | | | | | | | | | | | | Austin Group issue #411 requires 'e' to be accepted before and after 'x', and encourages accepting the characters in any order, except the initial 'r', 'w' or 'a'. Given that glibc accepts the characters after r/w/a in any order and that diagnosing this problem may be hard, change our libc to behave that way as well.
* | Use Makefile.inc instead of .export.theraven2013-09-062-3/+3
| |
* | Fix the namespace pollution caused by iconv.h including stdbool.htheraven2013-09-062-0/+4
| | | | | | | | | | This broke any C89 ports that defined bool themselves, including things like gcc, gtk, and so on.
* | Update some signal man pages for multithreading.jilles2013-09-064-25/+41
| |
* | Add stub implementations of the missing C++11 math functions.theraven2013-09-063-0/+79
| | | | | | | | | | | | | | | | | | | | These are weak and so can be replaced by other versions in applications that choose to do so, and will give a linker warning when used so that applications that rely on the extra precision can avoid them. Note that since the C/C++ specs only guarantee that long double has precision equal to double, code that actually relies on these functions having greater precision is unportable at best and broken at worst.
* | Correct two comments.hselasky2013-09-051-2/+2
| |
* | Change the cap_rights_t type from uint64_t to a structure that we can extendpjd2013-09-057-7/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in the future in a backward compatible (API and ABI) way. The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough. The structure definition looks like this: struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; }; The initial CAP_RIGHTS_VERSION is 0. The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements. The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future. To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg. #define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL) We still support aliases that combine few rights, but the rights have to belong to the same array element, eg: #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL) #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP) There is new API to manage the new cap_rights_t structure: cap_rights_t *cap_rights_init(cap_rights_t *rights, ...); void cap_rights_set(cap_rights_t *rights, ...); void cap_rights_clear(cap_rights_t *rights, ...); bool cap_rights_is_set(const cap_rights_t *rights, ...); bool cap_rights_is_valid(const cap_rights_t *rights); void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src); void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src); bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little); Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg: cap_rights_t rights; cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT); There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg: #define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...); Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1: cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL); Providing several rights that belongs to the same array's element this way is correct, but is not advised. It should only be used for aliases definition. This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x. Sponsored by: The FreeBSD Foundation
* | Add a c++/v1/tr1 include directory containing symlinks to all of the standardtheraven2013-09-041-0/+1
| | | | | | | | | | | | | | | | | | headrs. Lots of third-party code expects to find C++03 headers under tr1 because that's where GNU decided to hide them. This should fix ports that expect them there. MFC after: 1 week
* | Connect libexecinfo to the buildemaste2013-09-031-0/+1
| | | | | | | | Sponsored by: DARPA, AFRL
* | Don't install private libexecinfo headersemaste2013-09-031-1/+1
| |
* | Document SIGLIBRT in signal(3); take a stab at the signal description asrwatson2013-09-031-0/+1
| | | | | | | | | | | | the original committer didn't provide one. MFC after: 3 days
* | libexecinfo compatibility with devel/libexecinfo portemaste2013-09-021-0/+4
| | | | | | | | | | | | | | 1. Match shlib number 2. Add libelf dependency Suggested by: bapt[1]
* | system(): Restore behaviour for SIGINT and SIGQUIT.jilles2013-09-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | As mentioned in r16117 and the book "Advanced Programming in the Unix Environment" by W. Richard Stevens, we should ignore SIGINT and SIGQUIT before forking, since it is not guaranteed that the parent process starts running soon enough. To avoid calling sigaction() in the vforked child, instead block SIGINT and SIGQUIT before vfork() and keep the sigaction() to ignore after vfork(). The FreeBSD kernel discards ignored signals, even if they are blocked; therefore, it is not necessary to unblock SIGINT and SIGQUIT earlier.
* | libc: Always use our own copy of sys_errlist and sys_nerr (.so only).jilles2013-08-314-4/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This ensures strerror() and friends continue to work correctly even if a (non-PIE) executable linked against an older libc imports sys_errlist (which causes sys_errlist to refer to the executable's copy with a size fixed when that executable was linked). The executable's use of sys_errlist remains broken because it uses the current value of sys_nerr and may access past the bounds of the array. Different from the message "Using sys_errlist from executables is not ABI-stable" on freebsd-arch, this change does not affect the static library. There seems no reason to prevent overriding the error messages in the static library.
* | Add support to the ARM platform specific section types.andrew2013-08-311-1/+9
| |
* | Unconditionally compile the __sync_* atomics support functions into compiler-rttheraven2013-08-311-1/+2
| | | | | | | | | | | | | | | | | | for ARM. This is quite ugly, because it has to work around a clang bug that does not allow built-in functions to be defined, even when they're ones that are expected to be built as part of a library. Reviewed by: ed
* | The round of expand_number() cleanups.pluknet2013-08-301-29/+10
| | | | | | | | | | | | | | | | | | o Fix range error checking to detect overflow when uint64_t < uintmax_t. o Remove a non-functional check for no valid digits as pointed out by Bruce. o Remove a rather pointless comment describing what the function does. o Clean up a bunch of style bugs. Brucified by: bde
* | libutil: Use O_CLOEXEC for internal file descriptors from open().jilles2013-08-285-9/+12
| |
* | Xref capsicum(4) and procdesc(4) from pdfork(2).rwatson2013-08-281-4/+18
| | | | | | | | | | Suggested by: sbruno MFC after: 3 days
* | * Whitespace.kargl2013-08-281-1/+1
| |
* | wordexp(): Avoid leaking the pipe file descriptors to a parallel fork/exec.jilles2013-08-271-4/+4
| | | | | | | | This uses the new pipe2() system call added on May 1 (r250159).
* | * s_erf.c:kargl2013-08-272-99/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | . Use integer literal constants instead of double literal constants. * s_erff.c: . Use integer literal constants instead of casting double literal constants to float. . Update the threshold values from those carried over from erf() to values appropriate for float. . New sets of polynomial coefficients for the rational approximations. These coefficients have little, but positive, effect on the maximum error in ULP in the four intervals, but do improve the overall speed of execution. . Remove redundant GET_FLOAT_WORD(ix,x) as hx already contained the contents that is packed into ix. . Update the mask that is used to zero-out lower-order bits in x in the intervals [1.25, 2.857143] and [2.857143, 12]. In tests on amd64, this change improves the maximum error in ULP from 6.27739 and 63.8095 to 3.16774 and 2.92095 on these intervals for erffc(). Reviewed by: bde
* | Make the PAM password strength checking module WARNS=2 safe.will2013-08-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lib/libpam/modules/pam_passwdqc/Makefile: Bump WARNS to 2. contrib/pam_modules/pam_passwdqc/pam_passwdqc.c: Bump _XOPEN_SOURCE and _XOPEN_VERSION from 500 to 600 so that vsnprint() is declared. Use the two new union types (pam_conv_item_t and pam_text_item_t) to resolve strict aliasing violations caused by casts to comply with the pam_get_item() API taking a "const void **" for all item types. Warnings are generated for casts that create "type puns" (pointers of conflicting sized types that are set to access the same memory location) since these pointers may be used in ways that violate C's strict aliasing rules. Casts to a new type must be performed through a union in order to be compliant, and access must be performed through only one of the union's data types during the lifetime of the union instance. Handle strict-aliasing warnings through pointer assignments, which drastically simplifies this change. Correct a CLANG "printf-like function with more arguments than format" error. Submitted by: gibbs Sponsored by: Spectra Logic
OpenPOWER on IntegriCloud