path: root/sys/kern
Commit message  Author  Age  Files  Lines
* MFH: r282315-r282534  gjb  2015-05-06  4  -118/+97
|\
| |   Sponsored by: The FreeBSD Foundation
| * Implement a mechanism for making changes in the kernel<->driver PPS  ian  2015-05-04  1  -4/+35
| |   interface without breaking ABI or API compatibility with existing drivers.
| |
| |   The existing data structures used to communicate between the kernel and driver portions of PPS processing contain no spare/padding fields and no flags field or other straightforward mechanism for communicating changes in the structures or behaviors of the code. This makes it difficult to MFC new features added to the PPS facility. ABI compatibility is important; out-of-tree drivers in module form are known to exist. (Note that the existing api_version field in the pps_params structure must contain the value mandated by RFC 2783 and any RFCs that come along after.)
| |
| |   These changes introduce a pair of abi-version fields which are filled in by the driver and the kernel respectively to indicate the interface version. The driver sets its version field before calling the new pps_init_abi() function. That lets the kernel know how much of the pps_state structure is understood by the driver and it can avoid using newer fields at the end of the structure that it knows about if the driver is a lower version. The kernel fills in its version field during the init call, letting the driver know what features and data the kernel supports.
| |
| |   To implement the new version information in a way that is backwards compatible with code from before these changes, the high bit of the lightly-used 'kcmode' field is repurposed as a flag bit that indicates the driver is aware of the abi versioning scheme. Basically if this bit is clear that indicates a "version 0" driver and if it is set the driver_abi field indicates the version.
| |
| |   These changes also move the recently-added 'mtx' field of pps_state from the middle to the end of the structure, and make the kernel code that uses this field conditional on the driver being abi version 1 or higher. It changes the only driver currently supplying the mtx field, usb_serial, to use pps_init_abi().
| |
| |   Reviewed by: hselasky@
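
The version handshake described in this commit message can be sketched as follows. This is only an illustration under stated assumptions: the structure layout, the flag value and every name ending in `_sketch` are hypothetical and are not the real sys/timepps.h definitions. It shows a clear flag bit being read as "version 0" and the newer `mtx` field being used only for version 1 or higher.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and values; not the real sys/timepps.h definitions. */
#define KCMODE_ABIFLAG_SKETCH 0x80000000u /* high bit of kcmode: driver is ABI-aware */
#define KERNEL_ABI_SKETCH     1           /* version implemented by the "kernel" side */

struct pps_state_sketch {
	uint32_t kcmode;      /* lightly used field; high bit repurposed as a flag */
	uint16_t driver_abi;  /* set by the driver before the init call */
	uint16_t kernel_abi;  /* set by the kernel during the init call */
	void    *mtx;         /* newer field, kept at the end of the structure */
};

/* Version the driver claims; a clear flag bit means a "version 0" driver. */
int
driver_abi_of(const struct pps_state_sketch *pps)
{
	if ((pps->kcmode & KCMODE_ABIFLAG_SKETCH) == 0)
		return 0;
	return pps->driver_abi;
}

/* New-style init: mark the driver as ABI-aware and report the kernel version. */
void
pps_init_abi_sketch(struct pps_state_sketch *pps)
{
	pps->kcmode |= KCMODE_ABIFLAG_SKETCH;
	pps->kernel_abi = KERNEL_ABI_SKETCH;
}

/* The kernel touches mtx only when the driver is version 1 or higher. */
int
kernel_may_use_mtx(const struct pps_state_sketch *pps)
{
	return driver_abi_of(pps) >= 1;
}
```

A legacy driver that never calls the new init function leaves the flag bit clear, so the kernel side never looks at fields the driver does not know about.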
| * nv_malloc can fail in userland.  oshogbo  2015-05-02  1  -0/+2
| |   Add check to prevent a NULL pointer dereference.
| |
| |   Pointed out by: mjg
| |   Approved by: pjd (mentor)
| * Remove duplicated code using macro template for the nvlist_add_.* functions.  oshogbo  2015-05-02  1  -91/+27
| |   Approved by: pjd (mentor)
| * Introduce the NV_FLAG_NO_UNIQUE flag. When set, it allows storing  oshogbo  2015-05-02  2  -11/+16
| |   multiple values using the same key in an nvlist.
| |
| |   Approved by: pjd (mentor)
| |   Obtained from: WHEEL Systems (http://www.wheelsystems.com)
| |
| |   Update man page.
| |
| |   Reviewed by: AllanJude
| |   Approved by: pjd (mentor)
| * Approved, except for the use of RESTORE_ERRNO() to set errno.  oshogbo  2015-05-02  1  -7/+14
| |   Change the nvlist_recv() function to take an additional argument that specifies the flags expected on the received nvlist. Receiving an nvlist with a different set of flags than the ones we expect might lead to undefined behaviour, which might be potentially dangerous.
| |
| |   Update consumers of this and related functions and update the tests.
| |
| |   Approved by: pjd (mentor)
| |
| |   Update man page for nvlist_unpack, nvlist_recv, nvlist_xfer, cap_recv_nvlist and cap_xfer_nvlist.
| |
| |   Reviewed by: AllanJude
| |   Approved by: pjd (mentor)
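
The idea of rejecting a received list whose flags differ from what the caller expects might look roughly like this. It is a standalone sketch, not the libnv implementation: the header layout, the flag values, the function name and the choice of EINVAL as the error are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values and header layout, for illustration only. */
#define FLAG_IGNORE_CASE_SKETCH 0x01
#define FLAG_NO_UNIQUE_SKETCH   0x02

struct packed_header_sketch {
	uint8_t flags; /* flags the sender created the list with */
};

/*
 * Accept a received header only when its flags match what the caller
 * expects. Returns 0 on success, or -1 with errno set to EINVAL
 * (an arbitrary choice for this sketch) on a mismatch.
 */
int
check_received_flags(const struct packed_header_sketch *hdr, int expected_flags)
{
	assert(hdr != NULL);
	if (hdr->flags != (uint8_t)expected_flags) {
		errno = EINVAL;
		return -1;
	}
	return 0;
}
```

Checking before any further unpacking means a consumer that expects unique keys never walks a list that was built with duplicates allowed.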
| * Fix an off-by-one bug in string/array handling which led to memory overwrite  bz  2015-05-02  1  -1/+1
| |   and follow-up assertion errors on at least ARM after r282257, with nvp_magic being 0x6e7600:
| |
| |   Assertion failed: ((nvp)->nvp_magic == 0x6e7670), function nvpair_name, file .../subr_nvpair.c, line 713.
| |
| |   Sponsored by: DARPA/AFRL
| * Remove a stale reference to the stop_scheduler_on_panic tunable, which  markj  2015-05-02  1  -4/+2
| |   itself was removed in r243515.
| |
| |   MFC after: 1 week
* | MFH: r281855-r282312  gjb  2015-05-01  27  -1423/+722
|\ \
| |/
| |   Sponsored by: The FreeBSD Foundation
| * Add nvlist_flags() function, which returns nvlist's public flags.  oshogbo  2015-05-01  1  -0/+11
| |   Approved by: pjd (mentor)
| * Mark local function as static as a result of removing recursion.  oshogbo  2015-04-30  1  -2/+2
| |   Approved by: pjd (mentor)
| * Rename macros to use prefix ERRNO. Add macro ERRNO_SET. Now  oshogbo  2015-04-30  2  -86/+83
| |   ERRNO_{RESTORE/SAVE} must be used together; an additional variable is not needed. Always use ERRNO_{SAVE/RESTORE/SET} macros.
| |
| |   Approved by: pjd (mentor)
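
One way such paired macros can avoid a caller-supplied variable is for the save macro to open a block and declare the saved copy, and for the restore macro to write it back and close the block. The definitions below are a sketch under that assumption and may differ from the ones in the tree; `noisy_cleanup` and `fail_with_cleanup` are hypothetical helpers.

```c
#include <assert.h>
#include <errno.h>

/*
 * Sketch of paired macros: ERRNO_SAVE opens a block and declares the
 * saved copy, ERRNO_RESTORE writes it back and closes the block, so
 * the two only compile when used together.
 */
#define ERRNO_SET(var)  do { errno = (var); } while (0)
#define ERRNO_SAVE()    do { int _serrno = errno
#define ERRNO_RESTORE() errno = _serrno; } while (0)

/* Hypothetical cleanup step that clobbers errno as a side effect. */
void
noisy_cleanup(void)
{
	errno = EBADF;
}

/* Fail with ENOMEM and keep that errno across the cleanup call. */
int
fail_with_cleanup(void)
{
	ERRNO_SET(ENOMEM);
	ERRNO_SAVE();
	noisy_cleanup();
	ERRNO_RESTORE();
	return -1;
}
```

Because the block is opened by one macro and closed by the other, forgetting either half is a compile error rather than a silent bug.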
| * Remove support for Xen PV domU kernels. Support for HVM domU kernels  jhb  2015-04-30  4  -27/+0
| |   remains. Xen is planning to phase out support for PV upstream since it is harder to maintain and has more overhead. Modern x86 CPUs include virtualization extensions that support HVM guests instead of PV guests. In addition, the PV code was i386 only and not as well maintained recently as the HVM code.
| |
| |   - Remove the i386-only NATIVE option that was used to disable certain components for PV kernels. These components are now standard as they are on amd64.
| |   - Remove !XENHVM bits from PV drivers.
| |   - Remove various shims required for XEN (e.g. PT_UPDATES_FLUSH, LOAD_CR3, etc.)
| |   - Remove duplicate copy of <xen/features.h>.
| |   - Remove unused, i386-only xenstored.h.
| |
| |   Differential Revision: https://reviews.freebsd.org/D2362
| |   Reviewed by: royger
| |   Tested by: royger (i386/amd64 HVM domU and amd64 PVH dom0)
| |   Relnotes: yes
| * Save errno from close override.  oshogbo  2015-04-29  2  -1/+7
| |   Approved by: pjd (mentor)
| * Remove the nvlist_.*[fv] functions.  oshogbo  2015-04-29  3  -1002/+59
| |   Those functions are problematic, because there is no way to report memory allocation problems without complicating the API, so we can either abort or potentially return invalid results, neither of which is acceptable.
| |
| |   In most cases the caller knows the size of the name, so it can allocate a buffer on the stack and use snprintf(3) to prepare the name.
| |
| |   After some discussion the conclusion is to remove those functions, which also simplifies the API.
| |
| |   Discussed with: pjd, rstone
| |   Approved by: pjd (mentor)
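
The replacement pattern suggested in this message, building the name in a stack buffer with snprintf(3) and then calling a plain (non-format) function, can be sketched as follows. `add_number_sketch` is a hypothetical stand-in for an nvlist_add_*() call, and the "disk%u" naming scheme is invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an nvlist_add_*() call: records the last key. */
char last_key[64];

void
add_number_sketch(const char *name, unsigned long value)
{
	(void)value;
	strncpy(last_key, name, sizeof(last_key) - 1);
	last_key[sizeof(last_key) - 1] = '\0';
}

/*
 * Build "disk<N>" in a stack buffer, then call the plain adder.
 * Returns 0 on success, -1 if formatting failed or the name did not fit.
 */
int
add_disk_size(unsigned int index, unsigned long size)
{
	char name[32];
	int n;

	n = snprintf(name, sizeof(name), "disk%u", index);
	if (n < 0 || (size_t)n >= sizeof(name))
		return -1; /* the caller sees the failure; nothing aborts */
	add_number_sketch(name, size);
	return 0;
}
```

The caller owns the buffer and the error handling, so the library no longer needs to allocate memory just to format a key.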
| * Remove recursion from descriptor-related functions.  oshogbo  2015-04-29  1  -38/+37
| |   Approved by: pjd (mentor)
| * Always use the nv_malloc macro instead of malloc(3).  oshogbo  2015-04-29  1  -2/+2
| |   Approved by: pjd (mentor)
| * Style fixes.  oshogbo  2015-04-29  1  -10/+20
| |   Approved by: pjd (mentor)
| * Add kern.racct.enable tunable and RACCT_DISABLED config option.  trasz  2015-04-29  12  -85/+294
| |   The point of this is to be able to add RACCT (with RACCT_DISABLED) to GENERIC, to avoid having to rebuild the kernel to use rctl(8).
| |
| |   Differential Revision: https://reviews.freebsd.org/D2369
| |   Reviewed by: kib@
| |   MFC after: 1 month
| |   Relnotes: yes
| |   Sponsored by: The FreeBSD Foundation
| * Make setproctitle(3) work in Capsicum capability mode. This lets  trasz  2015-04-27  1  -1/+1
| |   ctld(8) child processes indicate the initiator address and name in their titles, similar to what iscsid(8) child processes do.
| |
| |   PR: 181352
| |   Differential Revision: https://reviews.freebsd.org/D2363
| |   Reviewed by: rwatson@, mjg@
| |   MFC after: 1 month
| |   Sponsored by: The FreeBSD Foundation
| * Partially revert r255986: do not call VOP_FSYNC() when helping  kib  2015-04-27  1  -29/+64
| |   bufdaemon in getnewbuf(), do use buf_flush(). The difference is that bufdaemon uses TRYLOCK to get buffer locks, which allows calls to getnewbuf() while another buffer is locked.
| |
| |   Reported and tested by: pho
| |   Sponsored by: The FreeBSD Foundation
| |   MFC after: 1 week
| * Fix locking for oshmctl() and shmsys().  kib  2015-04-27  1  -26/+24
| |   Reported and tested by: pho
| |   Sponsored by: The FreeBSD Foundation
| |   MFC after: 1 week
| * fd: plug an always overwritten initialization in fdalloc  mjg  2015-04-26  1  -1/+1
| |
| * Consistently use p instead of td->td_proc in create_thread  mjg  2015-04-26  1  -5/+5
| |   No functional changes.
| * MAXBSIZE defines both the largest UFS block size and the  rmacklem  2015-04-25  1  -6/+8
| |   largest size for a buffer in the buffer cache. This patch defines a new constant MAXBCACHEBUF, which is the largest size for a buffer in the buffer cache. Having a separate constant allows MAXBCACHEBUF to be set larger than MAXBSIZE on a per-architecture basis, so that NFS can do larger read/writes for these architectures. It modifies sys/param.h so that BKVASIZE can also be set on a per-architecture basis.
| |
| |   A couple of cases where NFS used MAXBSIZE instead of NFS_MAXBSIZE are fixed as well.
| |
| |   Differential Revision: https://reviews.freebsd.org/D2330
| |   Reviewed by: mav, kib
| |   MFC after: 2 weeks
| * Use correct length for sparse uiomove(). It must be clipped to  kib  2015-04-24  1  -1/+1
| |   the page size; len is the total transfer length, which may be larger than zero_region.
| |
| |   Reported and tested by: clusteradm (gjb)
| |   Sponsored by: The FreeBSD Foundation
| |   X-MFC-With: r281442
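
The bug class described here, copying from a fixed-size zero buffer using the total transfer length instead of the clipped chunk length, can be illustrated in isolation. This is a userland sketch, not the kernel code: the buffer size and all names ending in `_sketch` are assumptions, and memcpy stands in for uiomove().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ZERO_REGION_SIZE_SKETCH 4096 /* hypothetical size of the shared zero buffer */

static const unsigned char zero_region_sketch[ZERO_REGION_SIZE_SKETCH];

/*
 * Fill dst with len zero bytes by copying from the zero buffer.
 * Each copy uses the clipped chunk length, never the full len, so the
 * source buffer is never read past its end.
 * Returns the number of copies performed.
 */
size_t
copy_zeros(unsigned char *dst, size_t len)
{
	size_t copies = 0;

	while (len > 0) {
		size_t chunk = len < ZERO_REGION_SIZE_SKETCH ?
		    len : ZERO_REGION_SIZE_SKETCH;

		memcpy(dst, zero_region_sketch, chunk); /* chunk, not len */
		dst += chunk;
		len -= chunk;
		copies++;
	}
	return copies;
}
```

Passing `len` to the copy would work for small reads and overrun the zero buffer only for transfers larger than it, which is why this kind of mistake can survive light testing.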
| * Make vpanic() externally visible so that it can be called as part of the  markj  2015-04-24  1  -2/+1
| |   DTrace panic() action.
| |
| |   Differential Revision: https://reviews.freebsd.org/D2349
| |   Reviewed by: avg
| |   MFC after: 2 weeks
| |   Sponsored by: EMC / Isilon Storage Division
| * Handle incorrect ELF images specifying size for PT_GNU_STACK not being  kib  2015-04-23  1  -1/+1
| |   a multiple of page size.
| |
| |   Sponsored by: The FreeBSD Foundation
| |   MFC after: 3 days
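
The log does not say how the unaligned size is handled, so the following only illustrates the arithmetic involved in bringing a byte count to a page boundary. The page size and all names are hypothetical, and whether the kernel rounds down or up in this commit is not stated here.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096u /* hypothetical; the real value is per-architecture */
#define PAGE_MASK_SKETCH (PAGE_SIZE_SKETCH - 1)

/* Round a byte count down to a whole number of pages. */
uint64_t
trunc_page_sketch(uint64_t size)
{
	return size & ~(uint64_t)PAGE_MASK_SKETCH;
}

/* Round a byte count up to a whole number of pages. */
uint64_t
round_page_sketch(uint64_t size)
{
	return (size + PAGE_MASK_SKETCH) & ~(uint64_t)PAGE_MASK_SKETCH;
}

/* True when a requested stack size is already page-aligned. */
int
is_page_aligned(uint64_t size)
{
	return (size & PAGE_MASK_SKETCH) == 0;
}
```

Either direction yields a size that page-granular mapping code can accept, which is the property a malformed header would otherwise violate.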
| * Make AIO not allocate pbufs for unmapped I/O, like r281825.  mav  2015-04-22  1  -101/+105
| |   While there, make a few more performance optimizations.
| |
| |   On a 40-core system doing many 512-byte AIO reads from an array of raw SSDs this change removes lock congestion inside the pbuf allocator and devfs, and a bottleneck on the single AIO completion taskqueue thread. It improves peak AIO performance from ~600K to ~1.3M IOPS.
| |
| |   MFC after: 2 weeks
| * Move zlib.c from net to libkern.  rodrigc  2015-04-22  3  -4/+3
| |   It is not network-specific code and would be better as part of libkern instead.
| |
| |   Move zlib.h and zutil.h from net/ to sys/
| |   Update includes to use sys/zlib.h and sys/zutil.h instead of net/
| |
| |   Submitted by: Steve Kiernan <stevek@juniper.net>
| |   Obtained from: Juniper Networks, Inc.
| |   GitHub Pull Request: https://github.com/freebsd/freebsd/pull/28
| |   Relnotes: yes
* | MFH: r280643-r281852  gjb  2015-04-22  35  -472/+511
|\ \
| |/
| |   Sponsored by: The FreeBSD Foundation
| * Support file verification in MAC.  rodrigc  2015-04-22  1  -0/+6
| |   * Add VCREAT flag to indicate when a new file is being created
| |   * Add VVERIFY to indicate verification is required
| |   * Both VCREAT and VVERIFY are only passed on the MAC method vnode_check_open and are removed from the accmode after
| |   * Add O_VERIFY flag to rtld open of objects
| |   * Add 'v' flag to __sflags to set O_VERIFY flag.
| |
| |   Submitted by: Steve Kiernan <stevek@juniper.net>
| |   Obtained from: Juniper Networks, Inc.
| |   GitHub Pull Request: https://github.com/freebsd/freebsd/pull/27
| |   Relnotes: yes
| * Modify kern___getcwd() to take max pathlen limit as an additional  trasz  2015-04-21  1  -4/+6
| |   argument. This will be used for the Linux emulation layer - for Linux, PATH_MAX is 4096 and not 1024.
| |
| |   Differential Revision: https://reviews.freebsd.org/D2335
| |   Reviewed by: kib@
| |   MFC after: 1 month
| |   Sponsored by: The FreeBSD Foundation
| * Rewrite physio() to not allocate pbufs for unmapped I/O.  mav  2015-04-21  1  -61/+93
| |   pbufs are a limited resource, and their allocator is not SMP-scalable. So instead of always allocating a pbuf only to immediately convert it to a bio, allocate the bio directly. If the buffer needs a kernel mapping, then a pbuf is still allocated, but used only as a source of KVA and storage for a list of held pages.
| |
| |   On a 40-core system doing many 512-byte reads from user level to an array of raw SSDs this change removes huge lock congestion inside the pbuf allocator. It improves peak performance from ~300K to ~1.2M IOPS. On my previous 24-core system this problem also existed, but was less serious.
| |
| |   Reviewed by: kib
| |   MFC after: 2 weeks
| * Always send log(9) messages to the message buffer.  vangyzen  2015-04-20  1  -1/+1
| |   It is truer to the semantics of logging for messages to *always* go to the message buffer, where they can eventually be collected and, in fact, be put into a log file.
| |
| |   This restores the behavior prior to r70239, which seems to have changed it inadvertently.
| |
| |   Submitted by: Eric Badger <eric@badgerio.us>
| |   Reviewed by: jhb
| |   Approved by: kib (mentor)
| |   Obtained from: Dell Inc.
| |   MFC after: 1 week
| * Regen.  kib  2015-04-18  3  -230/+20
| |
| * The lseek(2), mmap(2), truncate(2), ftruncate(2), pread(2), and  kib  2015-04-18  3  -6/+14
| |   pwrite(2) syscalls are wrapped to provide compatibility with pre-7.x kernels which required padding before the off_t parameter. The fcntl(2) contains compatibility code to handle kernels before the struct flock was changed during the 8.x CURRENT development. The shims were reasonable to allow easier revert to the older kernel at that time.
| |
| |   Now, two or three major releases later, the shims do not serve any purpose. Such old kernels cannot handle current libc, so revert the compatibility code.
| |
| |   Make padded syscalls support conditional under the COMPAT6 config option. For COMPAT32, the syscalls were under COMPAT6 already.
| |
| |   Remove the WITHOUT_SYSCALL_COMPAT build option, whose only purpose was to (partially) disable the removed shims.
| |
| |   Reviewed by: jhb, imp (previous versions)
| |   Discussed with: peter
| |   Sponsored by: The FreeBSD Foundation
| |   MFC after: 1 week
| * Remove unimplemented sched provider probes.  markj  2015-04-18  1  -11/+0
| |   They were added for compatibility with the sched provider in Solaris and illumos, but our sched provider is already incompatible since it uses native types, so there isn't much point in keeping them around.
| |
| |   Differential Revision: https://reviews.freebsd.org/D2167
| |   Reviewed by: rpaulo
| * Initialize td_sel in thread_init(). Struct thread is not zeroed  kib  2015-04-18  1  -0/+1
| |   on the initial allocation, but seltdinit() assumes that td_sel is NULL or a valid pointer. Note that thread_fini()/seltdfini() also rely on this, but correctly reset td_sel to NULL.
| |
| |   Submitted by: luke.tw@gmail.com
| |   PR: 199518
| |   MFC after: 1 week
| * More accurately collect name-cache statistics in sysctl functions  mckusick  2015-04-18  1  -20/+16
| |   sysctl_debug_hashstat_nchash() and sysctl_debug_hashstat_rawnchash(). These changes are in preparation for allowing changes in the size of the vnode hash tables driven by increases and decreases in the maximum number of vnodes in the system.
| |
| |   Reviewed by: kib@
| |   Phabric: D2265
| * Add "GELI Passphrase:" prompt to boot loader.  dteske  2015-04-16  1  -0/+3
| |   A new loader.conf(5) option of geom_eli_passphrase_prompt="YES" will now allow you to enter your geli(8) root-mount credentials prior to invoking the kernel.
| |
| |   See check-password.4th(8) for details.
| |
| |   Differential Revision: https://reviews.freebsd.org/D2105
| |   Reviewed by: imp, kmoore
| |   Discussed on: -current
| |   MFC after: 3 days
| |   X-MFC-to: stable/10
| |   Relnotes: yes
| * File systems that do not use the buffer cache (such as ZFS) must  rmacklem  2015-04-15  1  -0/+1
| |   use VOP_FSYNC() to perform the NFS server's Commit operation. This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which is set by file systems that use the buffer cache. If this flag is not set, the NFS server always does a VOP_FSYNC(). This should be ok for old file system modules that do not set MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although it might not be optimal for file systems that use the buffer cache.
| |
| |   Reviewed by: kib
| |   MFC after: 2 weeks
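
The decision this flag drives can be sketched as a simple opt-in check. The structure, the bit value and the function below are hypothetical; only the flag's meaning (set by file systems that use the buffer cache, with VOP_FSYNC() as the safe default when it is clear) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define MNTK_USES_BCACHE_SKETCH 0x00000001u /* hypothetical bit value */

struct mount_sketch {
	unsigned int mnt_kern_flag; /* per-mount kernel flags */
};

enum commit_path {
	COMMIT_VIA_FSYNC,        /* always correct: ask the file system to sync */
	COMMIT_VIA_BUFFER_CACHE  /* optimisation for buffer-cache file systems */
};

/* Pick the buffer-cache path only when the file system opted in. */
enum commit_path
choose_commit_path(const struct mount_sketch *mp)
{
	assert(mp != NULL);
	if ((mp->mnt_kern_flag & MNTK_USES_BCACHE_SKETCH) != 0)
		return COMMIT_VIA_BUFFER_CACHE;
	return COMMIT_VIA_FSYNC;
}
```

Making the optimised path opt-in means an old module that never sets the flag falls back to the slower but correct behaviour.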
| * Fix handling of BUS_PROBE_NOWILDCARD in 'device_probe_child()'.  neel  2015-04-15  1  -8/+10
| |   Device probe value of BUS_PROBE_NOWILDCARD should be treated specially only if the device has a fixed devclass. Otherwise it should be interpreted just as if the driver doesn't want to claim the device.
| |
| |   Prior to this change a device that was not claimed explicitly by its driver would remain "attached" to the driver that returned BUS_PROBE_NOWILDCARD. This would bump up the reference on 'driver->refs' and its 'dev->ops' would point to the 'driver->ops'. When the driver is subsequently unloaded the 'dev->ops->cls' is left pointing to freed memory.
| |
| |   This fixes an easily reproducible #GP fault caused by loading and unloading vmm.ko multiple times.
| |
| |   Differential Revision: https://reviews.freebsd.org/D2294
| |   Reviewed by: imp, jhb
| |   Discussed with: rstone
| |   Reported by: Leon Dang (ldang@nahannisys.com)
| |   MFC after: 2 weeks
| * Rewrite linprocfs_domtab() as a wrapper around kern_getfsstat(). This  trasz  2015-04-15  1  -7/+14
| |   adds missing jail and MAC checks.
| |
| |   Differential Revision: https://reviews.freebsd.org/D2193
| |   Reviewed by: kib@
| |   MFC after: 1 month
| |   Sponsored by: The FreeBSD Foundation
| * Implement support for a binary to request a specific stack size for the  kib  2015-04-15  3  -3/+22
| |   initial thread. It is read by the ELF image activator as the virtual size of the PT_GNU_STACK program header entry, and can be specified by the linker option -z stack-size in newer binutils.
| |
| |   The soft RLIMIT_STACK is auto-increased if possible, to satisfy the binary's request.
| |
| |   Sponsored by: The FreeBSD Foundation
| |   MFC after: 1 week
| * When a kernel has DEVICE_POLLING turned on but no drivers have  gnn  2015-04-14  1  -0/+6
| |   the capability, do not try to take the mutex at all.
| |
| |   Replaces misbegotten attempt from reverted commit 281276.
| |
| |   Pointed out by: glebius
| |   Sponsored by: Rubicon Communications (Netgate)
| |   Differential Revision: https://reviews.freebsd.org/D2262
| * Fix my stupid restoration of old code; it must be c_iflags now.  rrs  2015-04-14  1  -1/+1
| |   Thanks jhb for catching my stupidity...
| |
| |   MFC after: 3 days
| * Restore the two lines accidentally deleted that allow CALLOUT_DIRECT to be  rrs  2015-04-13  1  -0/+2
| |   specified in the flags.
| |
| |   Thanks Mark Johnston for noticing this ;-o
| |
| |   MFC after: 3 days
| * uiomove_object_page(): Avoid instantiating pages in sparse regions on reads.  will  2015-04-11  1  -0/+11
| |   Check whether the page being requested is either resident or on swap. If not, read from the zero_region instead of instantiating an unnecessary page.
| |
| |   This avoids consuming memory for sparse files on tmpfs, when they are read by applications that do not use SEEK_HOLE/SEEK_DATA (which is most of them).
| |
| |   Reviewed by: kib
| |   MFC after: 1 week
| |   Sponsored by: Spectra Logic
| * Replace struct filedesc argument in getsock_cap with struct thread  mjg  2015-04-11  1  -27/+25
| |   This is a step towards removal of spurious arguments.