path: root/sys/kern
Commit message [Author, Age, Files, Lines]
...
* MFC r300489: [hselasky, 2016-06-03, 1 file, -1/+1]
| | | | | | | Use DELAY() instead of _sleep() when SCHEDULER_STOPPED() is set inside pause_sbt(). This allows pause() to continue working during a panic() which is not invoking KDB. This is useful when debugging graphics drivers using the LinuxKPI.
* MFC r300142: [kib, 2016-06-01, 1 file, -0/+2]
| | | | | Ensure that ftruncate(2) is performed synchronously when file is opened in O_SYNC mode, at least for UFS.
* Merge r301053: [glebius, 2016-05-31, 1 file, -0/+1]
| | | | | | | Fix kernel stack disclosures in the Linux and 4.3BSD compat layers. Security: SA-16:20 Security: SA-16:21
* MFC r299916: vfs_read_dirent: increment ncookies after adding a cookie [avg, 2016-05-23, 1 file, -0/+1]
|
* MFC r299412: [kib, 2016-05-18, 1 file, -0/+30]
| | | | | Add vfs_hash_ref(9) function, which finds a vnode by the hash value and returns it referenced.
* MFC r299408: [kib, 2016-05-18, 1 file, -2/+4]
| | | | Style: wrap long lines.
* Validate that user supplied control message length is not negative. [glebius, 2016-05-17, 1 file, -0/+3]
| | | | | | Submitted by: C Turt <cturt hardenedbsd.org> Security: SA-16:19 Security: CVE-2016-1887
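To illustrate the kind of check this commit describes, here is a minimal sketch, not the committed kernel code. The function name `check_control_len` and the `max_len` bound are hypothetical; the only point taken from the commit message is that a negative user-supplied length has to be rejected before it is used.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: validate a user-supplied control message length.
 * Returns 0 if the length is usable, or EINVAL if it is negative or
 * larger than the (assumed) upper bound max_len.
 */
static int
check_control_len(int user_len, size_t max_len)
{
	/* Reject negative values before any conversion to an unsigned type. */
	if (user_len < 0)
		return (EINVAL);
	if ((size_t)user_len > max_len)
		return (EINVAL);
	return (0);
}
```

The ordering matters: casting a negative `int` to `size_t` first would turn it into a very large positive value, so the sign test comes before the bound test.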
* MFC r298982: [kib, 2016-05-16, 2 files, -0/+43]
| | | | | | | Add EVFILT_VNODE open, read and close notifications. MFC r298984: Correct wording.
* MFC r287831 (by cem): [kib, 2016-05-16, 2 files, -3/+13]
| | | | Note DOOMED vnodes with NOTE_REVOKE.
* MFC r298922: [kib, 2016-05-16, 1 file, -0/+1]
| | | | | | Issue NOTE_EXTEND when a directory entry is added to or removed from the monitored directory as the result of rename(2) operation. The renames staying in the directory are not reported.
* MFC r298921: [kib, 2016-05-16, 1 file, -2/+18]
| | | | | Fix reporting of NOTE_LINK when directory link count changes due to rename removing or adding subdirectory entry.
* MFC r298809, r298817 [pfg, 2016-05-13, 2 files, -13/+13]
| | | | Minor spelling fixes.
* MFC r298677: [ngie, 2016-05-13, 1 file, -4/+1]
| | | | | | | | | | | | | r298677 (by cem): subr_mbpool: Don't free bogus pointer in error paths An mbpool is allocated with a contiguous array of mbpages. Freeing an individual mbpage has never been valid. Don't do it. This bug has been present since this code was introduced in r117624 (2003). CID: 1009687
* MFC r298678: [ngie, 2016-05-13, 1 file, -3/+3]
| | | | | | | | | | | | r298678 (by cem): posix4_mib: Don't overrun facility_initialized array The facility_initialized and facility arrays are the same size and were intended to be indexed the same. I believe this mismatch was just a typo/braino in r208731. CID: 1017430
* MFC r298584: [jamie, 2016-04-30, 3 files, -115/+983]
| | | | | | | | | | | | | | | | | | | | | | | | Note the existence of module-specific jail parameters, starting with the linux.* parameters when linux emulation is loaded. MFC r298585: Encapsulate SYSV IPC objects in jails. Define per-module parameters sysvmsg, sysvsem, and sysvshm, with the following behavior: inherit: allow full access to the IPC primitives. This is the same as the current setup with allow.sysvipc on. Jails and the base system can see (and modify) each other's objects, which is generally considered a bad thing (though may be useful in some circumstances). disable: allow no access, same as the current setup with allow.sysvipc off. new: A jail may see and use the IPC objects that it has created. It also gets its own IPC key namespace, so different jails may have their own objects using the same key value. The parent jail (or base system) can see the jail's IPC objects, but not its keys. PR: 48471
* MFC r297367: [jamie, 2016-04-30, 1 file, -111/+152]
| | | | | | | | | Move the various per-type arrays of OSD data into a single structure array. MFC r297422: Add osd_reserve() and osd_set_reserved(), which allow M_WAITOK allocation of an OSD array.
* MFC r298565: [jamie, 2016-04-30, 1 file, -72/+138]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new jail OSD method, PR_METHOD_REMOVE. It's called when a jail is removed from the user perspective, i.e. when the last pr_uref goes away, even though the jail may still exist in the dying state. It will also be called if either PR_METHOD_CREATE or PR_METHOD_SET fails. MFC r298683: Delay removing the last jail reference in prison_proc_free, and instead put it off into the pr_task. This is similar to prison_free, and in fact uses the same task even though they do something slightly different. MFC r298566: Pass the current/new jail to PR_METHOD_CHECK, which pushes the call until after the jail is found or created. This requires unlocking the jail for the call and re-locking it afterward, but that works because nothing in the jail has been changed yet, and other processes won't change the important fields as long as allprison_lock remains held. Keep better track of name vs namelc in kern_jail_set. Name should always be the hierarchical name (relative to the caller), and namelc the last component. MFC r298668: Use crcopysafe in jail_attach. PR: 48471
* MFC r298564: [jamie, 2016-04-30, 1 file, -3/+1]
| | | | | | | Remove the PR_REMOVE flag, which was meant as a temporary marker for a jail that might be seen mid-removal. It hasn't been doing the right thing since at least the ability to resurrect dying jails, and such resurrection also makes it unnecessary.
* MFC r298173: [markj, 2016-04-25, 1 file, -6/+3]
| | | | Use a loop instead of a goto in sysctl_kern_proc_kstack().
* MFC r295012 [vangyzen, 2016-04-14, 1 file, -10/+51]
| | | | | | | | | | | | | kqueue EVFILT_PROC: avoid collision between NOTE_CHILD and NOTE_EXIT NOTE_CHILD and NOTE_EXIT return something in kevent.data: the parent pid (ppid) for NOTE_CHILD and the exit status for NOTE_EXIT. Do not let the two events be combined, since one would overwrite the other's data. PR: 180385 Submitted by: David A. Bright <david_a_bright@dell.com> Sponsored by: Dell Inc.
* MFC r281086: utimensat: Correct Capsicum required capability rights. [jilles, 2016-04-09, 1 file, -2/+4]
|
* MFC r295385: semget(): Check for [EEXIST] error first. [jilles, 2016-04-09, 1 file, -5/+5]
| | | | | | | | | | Although POSIX literally permits failing with [EINVAL] if IPC_CREAT and IPC_EXCL were both passed, the semaphore set already exists and has fewer semaphores than nsems, this does not allow an application to retry safely: if the [EINVAL] is actually because of the semmsl limit, an infinite loop would result. PR: 206927
* MFC r297488 [sbruno, 2016-04-05, 1 file, -8/+7]
| | | | | | | | | Repair an overflow condition where a user could submit a string that was not getting a proper bounds check. PR: 206761 Submitted by: sson Reviewed by: cturt@hardenedbsd.org
* MFC r297298: [np, 2016-04-01, 2 files, -2/+4]
| | | | | | | | | | | | | | | | | | | | Plug leak in m_unshare. m_unshare passes on the source mbuf's flags as-is to m_getcl and this results in a leak if the flags include M_NOFREE. The fix is to clear the bits not listed in M_COPYALL before calling m_getcl. M_RDONLY should probably be filtered out too but that's outside the scope of this fix. Add assertions in the zone_mbuf and zone_pack ctors to catch similar bugs. Update netmap_get_mbuf to not pass M_NOFREE to m_getcl. It's not clear what the original code was trying to do but it's likely incorrect. Updated code is no different functionally but it avoids the newly added assertions. Sponsored by: Chelsio Communications
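The fix described above amounts to masking the source flags with a whitelist before allocating the copy. The following is a rough sketch of that idea only. All `SK_` names and bit values are hypothetical stand-ins and are not the definitions from sys/mbuf.h, and `flags_for_copy` is not a kernel function.

```c
/* Hypothetical flag bits for illustration; not the values from sys/mbuf.h. */
#define SK_PKTHDR	0x01u
#define SK_RDONLY	0x02u
#define SK_NOFREE	0x04u	/* stand-in for a "never free this buffer" flag */

/* Assumed whitelist of flags that are safe to carry over to a copy. */
#define SK_COPY_MASK	(SK_PKTHDR | SK_RDONLY)

/*
 * Return the flags that may be handed to the allocator for the copy.
 * Anything outside the whitelist (such as SK_NOFREE) is dropped, so the
 * freshly allocated buffer cannot inherit a flag that would leak it.
 */
static unsigned int
flags_for_copy(unsigned int src_flags)
{
	return (src_flags & SK_COPY_MASK);
}
```

A whitelist is used rather than clearing the one offending bit, so that flags added later are excluded by default.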
* MFC r297037: [pfg, 2016-03-25, 1 file, -1/+2]
| | | | | | | | | | | aio_qphysio(): Avoid uninitialized pointer read on error. For the !unmap case it may happen that pbuf gets called unreferenced when vm_fault_quick_hold_pages() fails. Initialize it so it doesn't cause trouble. CID: 1352776 Reviewed by: jhb
* MFC r296467: [kib, 2016-03-21, 1 file, -41/+92]
| | | | | Convert all panics from the link_elf_obj kernel linker for object files format into printfs and errors to caller.
* MFC r256613, r256862: MFprojects/camlock r254763: [mav, 2016-03-20, 1 file, -4/+21]
| | | | | | Move tq_enqueue() call out of the queue lock for known handlers (actually I have found no others in the base system). This reduces queue lock hold time and congestion spinning under active multithreaded enqueuing.
* MFC r256612: MFprojects/camlock r254685: [mav, 2016-03-20, 1 file, -6/+1]
| | | | Remove TQ_FLAGS_PENDING flag, softly duplicating queue emptiness status.
* MFC r277759 (by jhb@) [np, 2016-03-20, 1 file, -0/+3]
| | | | | | | | | Fix a couple of panics when detaching from a cxgbe/cxl interface that was never brought up: - Allow NULL to be passed to sglist_free(). - Don't try to stop an interface that was never fully initialized. PR: 208136
* MFC r296320: [kib, 2016-03-15, 2 files, -6/+7]
| | | | | | | Adjust _callout_stop_safe() return value for the subr_sleepqueue.c needs when migrating callout was blocked, but running one was not. PR: 200992
* MFC r295391: [kib, 2016-03-12, 1 file, -12/+7]
| | | | Remove the assert which outlived its usefulness.
* MFC r295489: [kib, 2016-03-12, 2 files, -54/+27]
| | | | | Remove useless checks for NULL before calling free(9), in the kernel elf linkers.
* MFC r295488: [kib, 2016-03-12, 1 file, -6/+3]
| | | | | Finish r173600. There is no need to test a condition if both cases result in the same value.
* MFC r296419 (by kib): [dim, 2016-03-07, 1 file, -1/+28]
| | | | | | | | | | | | | | | | | | | | | | | | In the link_elf_obj.c, handle sections of type SHT_AMD64_UNWIND same as SHT_PROGBITS. This is needed after the clang 3.8 import, which generates that type for .eh_frame section, which had SHT_PROGBITS type before. Reported by: Nikolai Lifanov <lifanov@mail.lifanov.com> PR: 207729 Tested by: dim (previous version) Sponsored by: The FreeBSD Foundation MFC r296428: Since kernel modules can now contain sections of type SHT_AMD64_UNWIND, the boot loader should not skip over these anymore while loading images. Otherwise the kernel can still panic when it doesn't find the .eh_frame section belonging to the .rela.eh_frame section. Unfortunately this will require installing boot loaders from sys/boot before attempting to boot with a new kernel. Reviewed by: kib
* MFH: 285685 [araujo, 2016-02-24, 1 file, -0/+16]
| | | | | | | | | | | Add support to the jail framework to be able to mount linsysfs(5) and linprocfs(5). PR: 207179 Requested by: thomas@gibfest.dk Reviewed by: jamie, bapt Approved by: re (gjb) Sponsored by: gandi.net Differential Revision: https://reviews.freebsd.org/D5390
* In preparation for 10.3-RELEASE, temporarily revert the MFC of r291244 [marius, 2016-02-23, 1 file, -242/+80]
| | | | | | | | | done as part of r292895 on stable/10, as that change causes hangs with ZFS and the cause, at least on amd64, is so far not understood. Discussed with: kib For further information see: https://lists.freebsd.org/pipermail/freebsd-stable/2016-February/084045.html PR: 207281 Approved by: re (gjb)
* MFC: r264565 [marius, 2016-02-21, 1 file, -1/+76]
| | | | | | | | | | | | | | | | | | | Do not set M_BESTFIT if a strategy has already been provided. This fixes problems when using M_FIRSTFIT. MFC: r280805 Add four new DDB commands to display vmem(9) statistics. In particular, such DDB commands were added: show vmem <addr> show all vmem show vmemdump <addr> show all vmemdump As possible usage, that allows to see KVA usage and fragmentation. Approved by: re (gjb)
* MFC 295418,295419: [jhb, 2016-02-16, 2 files, -2/+2]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix hangs or panics when misbehaved kernel threads return from their main function. 295418: Mark proc0 as a kernel process via the P_KTHREAD flag. All other kernel processes have this flag set and all threads in proc0 (including thread0) have the similar TDP_KTHREAD flag set. 295419: Call kthread_exit() rather than kproc_exit() for a premature kthread exit. Kernel threads (and processes) are supposed to call kthread_exit() (or kproc_exit()) to terminate. However, the kernel includes a fallback in fork_exit() to force a kthread exit if a kernel thread's "main" routine returns. This fallback was added back when the kernel only had processes and was not updated to call kthread_exit() instead of kproc_exit() when threads were added to the kernel. This mistake was particularly exciting when the errant thread belonged to proc0. Due to the missing P_KTHREAD flag the fallback did not kick in and instead tried to return to userland via whatever garbage was in the trapframe. With P_KTHREAD set it tried to terminate proc0 resulting in other amusements. PR: 204999 Approved by: re (glebius)
* MFC r294598: [kib, 2016-02-14, 1 file, -5/+10]
| | | | | | In tty_dealloc(), clear the queues. Approved by: re (marius)
* MFC r294596: [kib, 2016-02-14, 1 file, -2/+3]
| | | | | | | Limit the accesses to file' f_advice member to VREG vnodes only. Recheck that f_advice is not NULL after lock is taken. Approved by: re (marius)
* MFC 287442,287537,288944: [jhb, 2016-02-10, 4 files, -21/+129]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix corruption of coredumps due to procstat notes changing size during coredump generation. The changes in r287442 required some reworking since the 'fo_fill_kinfo' file op does not exist in stable/10. 287442: Detect badly behaved coredump note helpers Coredump notes depend on being able to invoke dump routines twice; once in a dry-run mode to get the size of the note, and another to actually emit the note to the corefile. When a note helper emits a different length section the second time around than the length it requested the first time, the kernel produces a corrupt coredump. NT_PROCSTAT_FILES output length, when packing kinfo structs, is tied to the length of filenames corresponding to vnodes in the process' fd table via vn_fullpath. As vnodes may move around during dump, this is racy. So: - Detect badly behaved notes in putnote() and pad underfilled notes. - Add a fail point, debug.fail_point.fill_kinfo_vnode__random_path to exercise the NT_PROCSTAT_FILES corruption. It simply picks random lengths to expand or truncate paths to in fo_fill_kinfo_vnode(). - Add a sysctl, kern.coredump_pack_fileinfo, to allow users to disable kinfo packing for PROCSTAT_FILES notes. This should avoid both FILES note corruption and truncation, even if filenames change, at the cost of about 1 kiB in padding bloat per open fd. Document the new sysctl in core.5. - Fix note_procstat_files to self-limit in the 2nd pass. Since sometimes this will result in a short write, pad up to our advertised size. This addresses note corruption, at the risk of sometimes truncating the last several fd info entries. - Fix NT_PROCSTAT_FILES consumers libutil and libprocstat to grok the zero padding. 287537: Follow-up to r287442: Move sysctl to compiled-once file Avoid duplicate sysctl nodes. 
288944: Fix core corruption caused by race in note_procstat_vmmap This fix is spiritually similar to r287442 and was discovered thanks to the KASSERT added in that revision. NT_PROCSTAT_VMMAP output length, when packing kinfo structs, is tied to the length of filenames corresponding to vnodes in the process' vm map via vn_fullpath. As vnodes may move during coredump, this is racy. We do not remove the race, only prevent it from causing coredump corruption. - Add a sysctl, kern.coredump_pack_vmmapinfo, to allow users to disable kinfo packing for PROCSTAT_VMMAP notes. This avoids VMMAP corruption and truncation, even if names change, at the cost of up to PATH_MAX bytes per mapped object. The new sysctl is documented in core.5. - Fix note_procstat_vmmap to self-limit in the second pass. This addresses corruption, at the cost of sometimes producing a truncated result. - Fix PROCSTAT_VMMAP consumers libutil (and libprocstat, via copy-paste) to grok the new zero padding. Approved by: re (gjb)
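The "self-limit in the second pass and pad up to the advertised size" behaviour described in r287442 and r288944 can be pictured with a small sketch. This is a simplified userland model under stated assumptions, not the kernel's putnote() or note_procstat_files(): the function `emit_note` and its parameters are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical second-pass writer.
 *   advertised:  size reported by the dry-run (first) pass
 *   payload_len: size the helper actually produced this time
 * The output region is always exactly 'advertised' bytes long, so the
 * layout computed from the first pass stays valid.
 * Returns the number of payload bytes kept; anything beyond is truncated.
 */
static size_t
emit_note(unsigned char *dst, size_t advertised,
    const unsigned char *payload, size_t payload_len)
{
	size_t kept = payload_len < advertised ? payload_len : advertised;

	/* Self-limit: never write more than was advertised. */
	memcpy(dst, payload, kept);
	/* Short write: fill the remainder with zeros. */
	memset(dst + kept, 0, advertised - kept);
	return (kept);
}
```

The trade-off matches the one the commit messages state: the race itself is not removed, a grown payload is truncated and a shrunken one is zero-padded, and consumers have to tolerate the padding.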
* MFC r294732: [kib, 2016-02-08, 1 file, -5/+7]
| | | | | | Minor fixes for ddb tty-related commands. Approved by: re (gjb)
* MFC r294735: [kib, 2016-02-08, 1 file, -3/+6]
| | | | | | | Don't allow opening the callout device when the callin device is already open (in disguise as the console device). Approved by: re (gjb)
* MFC r295277: [kib, 2016-02-07, 1 file, -1/+20]
| | | | | | | When matching brand to the ELF binary by notes, try to find a brand with interpreter name exactly matching one wanted by the binary. Approved by: re (delphij)
* MFC 278320,278336,278830,285621: [jhb, 2016-02-01, 2 files, -0/+258]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add devctl(8): a utility for manipulating new-bus devices. Note that this version does not include the 'suspend' and 'resume' commands present in HEAD as those depend on larger changes to the suspend and resume code in the kernel. 278320: Add a new device control utility for new-bus devices called devctl. This allows the user to request administrative changes to individual devices such as attaching or detaching drivers or disabling and re-enabling devices. - Add a new /dev/devctl2 character device which uses ioctls for device requests. The ioctls use a common 'struct devreq' which is somewhat similar to 'struct ifreq'. - The ioctls identify the device to operate on via a string. This string can either be the device's name, or it can be a bus-specific address. (For unattached devices, a bus address is the only way to locate a device.) Bus drivers register an eventhandler to claim unrecognized device names that the driver recognizes as a valid address. Two buses currently support addresses: ACPI recognizes any device in the ACPI namespace via its full path starting with "\" and the PCI bus driver recognizes an address specification of 'pci[<domain>:]<bus>:<slot>:<func>' (identical to the PCI selector strings supported by pciconf). - To make it easier to cut and paste, change the PnP location string in the PCI bus driver to output a full PCI selector string rather than 'slot=<slot> function=<func>'. - Add a devctl(3) interface in libdevctl which provides a wrapper around the ioctls and is the preferred interface for other userland code. - Add a devctl(8) program which is a simple wrapper around the requests supported by devctl(3). - Add a resource_unset_value() function that can be used to remove a hint from the kernel environment. This is used to clear a hint.<driver>.<unit>.disabled hint when re-enabling a boot-time disabled device.
278336: Unbreak the build (memchr is explicitly required by devctl(9) after r278320) 278830: install the man page... 285621: Fix formatting. Approved by: re (marius)
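The PCI address form quoted in r278320, 'pci[<domain>:]<bus>:<slot>:<func>', can be parsed in a few lines. The sketch below is an illustrative userland parser, not the PCI bus driver's code. The names `struct pci_sel` and `parse_pci_selector` are hypothetical, and the assumption that an omitted domain means domain 0 is mine, not the commit message's.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the parsed selector fields. */
struct pci_sel {
	unsigned int domain, bus, slot, func;
};

/*
 * Parse "pci[<domain>:]<bus>:<slot>:<func>".
 * Returns 0 on success and -1 on a malformed string.
 * Assumption: a selector without a domain refers to domain 0.
 * Note: sscanf's %u is lenient (it skips leading whitespace), so this
 * sketch accepts slightly more than a strict parser would.
 */
static int
parse_pci_selector(const char *s, struct pci_sel *out)
{
	unsigned int a, b, c, d;
	int end = 0;

	if (strncmp(s, "pci", 3) != 0)
		return (-1);
	s += 3;

	/* Four fields: domain:bus:slot:func, with nothing trailing. */
	if (sscanf(s, "%u:%u:%u:%u%n", &a, &b, &c, &d, &end) == 4 &&
	    s[end] == '\0') {
		out->domain = a; out->bus = b; out->slot = c; out->func = d;
		return (0);
	}

	/* Three fields: bus:slot:func, with nothing trailing. */
	end = 0;
	if (sscanf(s, "%u:%u:%u%n", &a, &b, &c, &end) == 3 &&
	    s[end] == '\0') {
		out->domain = 0; out->bus = a; out->slot = b; out->func = c;
		return (0);
	}
	return (-1);
}
```

For example, "pci0:2:0" yields bus 0, slot 2, function 0, while a device name such as "acpi0" is rejected and would have to be resolved by some other path.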
* MFC r293349: [kib, 2016-01-28, 1 file, -52/+47]
| | | | Convert tty common code to use make_dev_s().
* MFC r293346: [kib, 2016-01-28, 1 file, -23/+75]
| | | | Provide yet another KPI for cdev creation, make_dev_s(9).
* MFC: r294362, r294414, r294753 [marius, 2016-01-27, 1 file, -35/+67]
| | | | | | | | | | | | - Fix tty_drain() and, thus, TIOCDRAIN of the current tty(4) incarnation to actually wait until the TX FIFOs of UARTs have been drained before returning. This is done by bringing the equivalent of the TS_BUSY flag found in the previous implementation back in an ABI-preserving way. Reported and tested by: Patrick Powell - Make the code consistent with itself style-wise and bring it closer to style(9). - Mark unused arguments as such. - Make the ttystates table const.
* MFC r293458: [markj, 2016-01-26, 1 file, -7/+26]
| | | | Prevent cv_waiters wraparound.
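The commit message does not say how the wraparound is prevented. One common technique for a counter that must not wrap is to saturate it at a bound, and the sketch below shows that technique as an assumption. It does not claim to reproduce r293458, and the `sk_` names and the choice of INT_MAX as the bound are hypothetical.

```c
#include <limits.h>

/* Assumed saturation point for the waiter count. */
#define SK_WAITERS_BOUND	INT_MAX

/* Hypothetical stand-in for a condition variable's bookkeeping. */
struct sk_cv {
	int waiters;
};

/*
 * Increment the waiter count without ever overflowing it.
 * Once the bound is reached the count stays there, which means it is no
 * longer exact and must be read as "at least this many waiters".
 */
static void
sk_waiters_inc(struct sk_cv *cv)
{
	if (cv->waiters < SK_WAITERS_BOUND)
		cv->waiters++;
}
```

With a saturating counter, the wakeup side can no longer rely on decrementing to reach zero once the bound has been hit, so it would need another way to learn that the sleep queue is empty.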
* MFC r293045, r293046: [ian, 2016-01-24, 1 file, -3/+35]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the 'env' directive described in config(5) work on all architectures, providing compiled-in static environment data that is used instead of any data passed in from a boot loader. Previously 'env' worked only on i386 and arm xscale systems, because it required the MD startup code to examine the global envmode variable and decide whether to use static_env or an environment obtained from the boot loader, and set the global kern_envp accordingly. Most startup code wasn't doing so. Making things even more complex, some mips startup code uses an alternate scheme that involves calling init_static_kenv() to pass an empty buffer and its size, then uses a series of kern_setenv() calls to populate that buffer. Now all MD startup code calls init_static_kenv(), and that routine provides a single point where envmode is checked and the decision is made whether to use the compiled-in static_kenv or the values provided by the MD code. The routine also continues to serve its original purpose for mips; if a non-zero buffer size is passed the routine installs the empty buffer ready to accept kern_setenv() values. Now if the size is zero, the provided buffer full of existing env data is installed. A NULL pointer can be passed if the boot loader provides no env data; this allows the static env to be installed if envmode is set to do so. Most of the work here is a near-mechanical change to call the init function instead of directly setting kern_envp. A notable exception is in xen/pv.c; that code was originally installing a buffer full of preformatted env data along with its non-zero size (like mips code does), which would have allowed kern_setenv() calls to wipe out the preformatted data. Now it passes a zero for the size so that the buffer of data it installs is treated as non-writeable. Also, revert accidental change that snuck into r293045.
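The decision that the commit message places inside init_static_kenv() can be modelled as a small pure function. The sketch below is a simplified model, not the kernel routine: `sk_choose_env`, `struct sk_env_choice`, and the boolean `use_static` (standing in for the envmode check) are hypothetical, and the environment is treated as an opaque buffer.

```c
#include <stddef.h>

/* Hypothetical result of the decision: which buffer, and how writable. */
struct sk_env_choice {
	const char *buf;	/* environment data to use; may be NULL */
	size_t      len;	/* writable capacity; 0 means preformatted, read-only data */
};

/*
 * Model of the single decision point described above.
 *   use_static: stands in for "envmode says use the compiled-in env"
 *   static_env: compiled-in environment data
 *   buf, len:   what the machine-dependent startup code passes in
 *               (len != 0: empty buffer to be filled by setenv-style calls;
 *                len == 0: buffer already full of env data, or NULL if the
 *                boot loader supplied nothing)
 */
static struct sk_env_choice
sk_choose_env(int use_static, const char *static_env, char *buf, size_t len)
{
	struct sk_env_choice c;

	if (use_static) {
		/* Compiled-in data wins over anything from the boot loader. */
		c.buf = static_env;
		c.len = 0;
	} else {
		c.buf = buf;
		c.len = len;
	}
	return (c);
}
```

Keeping the check in one function is the design point the commit message stresses: startup code for each architecture only reports what the loader gave it and no longer inspects envmode itself.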