path: root/sys/kern
Commit message (Author, Age, Files, Lines)
* MFC r322405, r322406: (markj, 2018-02-21, 2 files, -10/+10)
    Modify vm_page_grab_pages() to handle VM_ALLOC_NOWAIT, use it in sendfile_swapin().
    (cherry picked from commit 00ffd58e267b0466241a684db7dbfd7f2fecbf80)
* MFC r322296 (alc, 2018-02-21, 1 file, -9/+7)
    Introduce vm_page_grab_pages(), which is intended to replace loops calling vm_page_grab() on consecutive page indices. Besides simplifying the code in the caller, vm_page_grab_pages() allows for batching optimizations. For example, the current implementation replaces calls to vm_page_lookup() on consecutive page indices by cheaper calls to vm_page_next().
    (cherry picked from commit 9d710dfe3f1905122f3d9e3c84da8e4dc03363ee)
* MFC r319873: (kib, 2018-02-19, 6 files, -15/+15)
    Move the struct syscall_args container for syscall arguments into struct thread.
    (cherry picked from commit 985b26c6741218c134a15526fd32b736bd73fa8a)
* Merge remote-tracking branch 'origin/releng/11.1' into RELENG_2_4 (Renato Botelho, 2017-11-16, 2 files, -7/+9)
|\
| * Properly bzero kldstat structure to prevent information leak. [SA-17:10] (gordon, 2017-11-15, 1 file, -5/+7)
|     Approved by: so
|     Security: FreeBSD-SA-17:10.kldstat
|     Security: CVE-2017-1088
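The class of leak fixed above can be illustrated with a minimal userland sketch. It is not the kernel's kldstat code: `struct mod_stat` and `fill_mod_stat()` are hypothetical names, and the point is only that zeroing the whole record first keeps stale bytes out of padding and unused tail bytes before the record is handed to a less-privileged reader.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: the 5-byte name followed by a long leaves padding. */
struct mod_stat {
    char name[5];
    long refs;
};

/* Fill a caller-visible record without exposing stale memory. */
void fill_mod_stat(struct mod_stat *out, const char *name, long refs)
{
    assert(out != NULL && name != NULL);
    memset(out, 0, sizeof(*out));   /* clears fields and padding alike */
    /* Copy at most size-1 bytes; the final byte stays NUL from the memset. */
    strncpy(out->name, name, sizeof(out->name) - 1);
    out->refs = refs;
}
```

Without the `memset()`, the bytes between `name` and `refs` would keep whatever the memory held before, which is exactly the kind of data an information-leak advisory is about.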
| * Fix kernel data leak via ptrace(PT_LWPINFO). [SA-17:08] (gordon, 2017-11-15, 1 file, -2/+2)
|     Approved by: so
|     Security: FreeBSD-SA-17:08.ptrace
|     Security: CVE-2017-1086
* | Add config_intrhook_oneshot(): schedule an intrhook function and unregister it automatically after it runs. (ian, 2017-10-17, 1 file, -0/+38)
| |   The config_intrhook mechanism allows a driver to stall the boot process until device(s) required for booting are available, by not allowing system inits to proceed until all intrhook functions have been unregistered. Virtually all existing code simply unregisters from within the hook function when it gets called.
| |   This new function makes that common usage more convenient. Instead of allocating and filling in a struct, passing it to a function that might (in theory) fail, and checking the return code, now a driver can simply call this cannot-fail routine, passing just the intrhook function and its arg.
| |   Differential Revision: https://reviews.freebsd.org/D11963
| |   (cherry picked from commit 3dabf0d77785be405b3aa27de0590c5addd533dc)
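The one-shot pattern described above can be sketched in userland as a wrapper that runs the caller's function and then unregisters itself. This is an analogy under stated assumptions, not the kernel implementation: `struct hook`, `hook_establish()`, `hook_disestablish()`, `hook_oneshot()`, `run_hook()`, `pending_hooks`, and `mark_ready()` are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's config_intrhook types. */
typedef void (*hook_fn)(void *arg);

struct hook {
    hook_fn fn;
    void   *arg;
};

int pending_hooks;  /* "boot" may proceed only when this reaches 0 */

void hook_establish(struct hook *h)    { (void)h; pending_hooks++; }
void hook_disestablish(struct hook *h) { (void)h; pending_hooks--; }

/* One-shot state: the generic hook plus the caller's function and argument. */
struct oneshot {
    struct hook h;
    hook_fn     fn;
    void       *arg;
};

/* Run the caller's function, then unregister and free automatically. */
static void oneshot_wrapper(void *arg)
{
    struct oneshot *os = arg;

    os->fn(os->arg);
    hook_disestablish(&os->h);
    free(os);
}

/* Register fn(arg) as a one-shot hook; callers have no error to check. */
struct hook *hook_oneshot(hook_fn fn, void *arg)
{
    struct oneshot *os = malloc(sizeof(*os));

    if (os == NULL)
        abort();  /* the sketch treats allocation failure as fatal */
    os->fn = fn;
    os->arg = arg;
    os->h.fn = oneshot_wrapper;
    os->h.arg = os;
    hook_establish(&os->h);
    return &os->h;
}

/* What the "boot" code would do when it is time to run a hook. */
void run_hook(struct hook *h) { h->fn(h->arg); }

/* Example hook body: record that the device became ready. */
void mark_ready(void *arg) { *(int *)arg = 1; }
```

A driver-like caller would write `hook_oneshot(mark_ready, &ready)` and nothing else; once `run_hook()` fires, the pending count drops back without the hook body having to unregister itself.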
* | Fix a panic during boot caused by inadequate locking of some vt(4) driver data structures. (jtl, 2017-10-11, 1 file, -3/+5)
| |   vt_change_font() calls vtbuf_grow() to change some vt driver data structures. It uses TF_MUTE to prevent the console from trying to use those data structures while it changes them.
| |   During the early stage of the boot process, the vt driver's tc_done routine uses those data structures; however, it is currently called outside the TF_MUTE check.
| |   Move the tc_done routine inside the locked TF_MUTE check.
| |   PR: 217282
| |   Reviewed by: ed, ray
| |   Sponsored by: Netflix
| |   Differential Revision: https://reviews.freebsd.org/D9709
| |   (cherry picked from commit 0f9beefcb0a50b15c03a0c62dc0f82cbaa001850)
|/
* Fix OpenSSH Denial of Service vulnerability. [SA-17:06] (delphij, 2017-08-10, 1 file, -0/+2) [releng/11.1]
    Fix VNET kernel panic with asynchronous I/O. [EN-17:07]
    Fix pf(4) housekeeping thread causes kernel panic. [EN-17:08]
    Approved by: so
* MFC r320619 MFS r320863: (kib, 2017-07-10, 1 file, -9/+8)
    Resolve confusion between different error code spaces.
    Approved by: re (delphij)
* MFC r320422: (kib, 2017-07-04, 1 file, -1/+1)
    Do not ignore an error from vm_mmap_object().
    Approved by: re (delphij)
* MFC r315518 (alc, 2017-06-28, 1 file, -4/+6)
    Avoid unnecessary calls to vm_map_protect() in elf_load_section().
    Typically, when elf_load_section() unconditionally passed VM_PROT_ALL to elf_map_insert(), it was needlessly enabling execute access on the mapping, and it would later have to call vm_map_protect() to correct the mapping's access rights. Now, instead, elf_load_section() always passes its parameter "prot" to elf_map_insert(). So, elf_load_section() must only call vm_map_protect() if it needs to remove the write access that was temporarily granted to perform a copyout().
    Approved by: re (kib)
* MFC r320108: (kib, 2017-06-26, 1 file, -1/+3)
    Allow negative aio_offset only for the read and write LIO ops on device nodes.
    Approved by: re (marius)
* MFC r320038: (kib, 2017-06-23, 1 file, -7/+8)
    Style.
    Approved by: re (gjb)
* MFC r320124: (markj, 2017-06-22, 3 files, -9/+13)
    Fix the !TD_IS_IDLETHREAD(curthread) locking assertions.
    Approved by: re (kib)
* MFC r319916: (kib, 2017-06-20, 1 file, -1/+0)
    Remove stray return.
    Approved by: re (marius)
* MFC r319540 (alc, 2017-06-15, 1 file, -2/+2)
    The data type returned by vmoff() is too narrow in its range. This could break the transmission of files longer than 4 GB on 32-bit architectures.
    Approved by: re (gjb)
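Why a narrow return type breaks offsets past 4 GB can be shown with a small sketch. It does not reproduce vmoff(); `offset_narrow()`, `offset_wide()`, and `SKETCH_PAGE_SIZE` are hypothetical, and a 4096-byte page is assumed purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for illustration */

/* Too narrow: the byte offset is truncated to 32 bits on return. */
uint32_t offset_narrow(uint64_t page_index)
{
    return (uint32_t)(page_index * SKETCH_PAGE_SIZE);
}

/* Wide enough: byte offsets beyond 4 GB survive intact. */
uint64_t offset_wide(uint64_t page_index)
{
    assert(page_index <= UINT64_MAX / SKETCH_PAGE_SIZE);  /* no 64-bit wrap */
    return page_index * SKETCH_PAGE_SIZE;
}
```

For the page just past the 4 GB mark (index 1048577), the wide version yields 4294971392 while the narrow one wraps around to 4096, so a transfer would silently resume near the start of the file.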
* MFC r318995 (alc, 2017-06-15, 1 file, -18/+11)
    In r118390, the swap pager's approach to striping swap allocation over multiple devices was changed. However, swapoff_one() was not fully and correctly converted. In particular, with r118390's introduction of a per-device blist, the maximum swap block size, "dmmax", became irrelevant to swapoff_one()'s operation. Moreover, swapoff_one() was performing out-of-range operations on the per-device blist that were silently ignored by blist_fill().
    This change corrects both of these problems with swapoff_one(), which will allow us to potentially increase MAX_PAGEOUT_CLUSTER. Previously, swapoff_one() would panic inside of blist_fill() if you increased MAX_PAGEOUT_CLUSTER.

    MFC r319001
    After r118390, the variable "dmmax" was neither the correct strip size nor the correct maximum block size. Moreover, after r318995, it serves no purpose except to provide information to user space through a read-only sysctl.
    This change eliminates the variable "dmmax" but retains the sysctl. It also corrects the value returned by the sysctl.

    MFC r319604
    Halve the memory being internally allocated by the blist allocator. In short, half of the memory that is allocated to implement the radix tree is wasted because we did not change "u_daddr_t" to be a 64-bit unsigned int when we changed "daddr_t" to be a 64-bit (signed) int. (See r96849 and r96851.)

    MFC r319612
    When the function blist_fill() was added to the kernel in r107913, the swap pager used a different scheme for striping the allocation of swap space across multiple devices. And, although blist_fill() was intended to support fill operations with large counts, the old striping scheme never performed a fill larger than the stripe size. Consequently, the misplacement of a sanity check in blst_meta_fill() went undetected.
    Now, moving forward in time to r118390, a new scheme for striping was introduced that maintained a blist allocator per device, but as noted in r318995, swapoff_one() was not fully and correctly converted to the new scheme. This change completes what was started in r318995 by fixing the underlying bug in blst_meta_fill() that stops swapoff_one() from simply performing a single blist_fill() operation.

    MFC r319627
    Starting in r118390, swaponsomething() began to reserve the blocks at the beginning of a swap area for a disk label. However, neither r118390 nor r118544, which increased the reservation from one to two blocks, correctly accounted for these blocks when updating the variable "swap_pager_avail". This change corrects that error.

    MFC r319655
    Originally, this file could be compiled as a user-space application for testing purposes. However, over the years, various changes to the kernel have broken this feature. This revision applies some fixes to get user-space compilation working again. There are no changes in this revision to code that is used by the kernel.

    Approved by: re (kib)
* MFC r318765: (allanjude, 2017-06-11, 3 files, -6/+25)
    Allow cpuset_{get,set}affinity in capabilities mode
    Approved by: re (marius)
* MFC r319518: (kib, 2017-06-10, 1 file, -0/+2)
    Ensure that cached struct thread does not keep spurious td_su reference on an UFS mount point.

    MFC r319519:
    Clean possible td_su reference on the struct mount being unmounted as the last step of ffs_unmount().

    Approved by: re (gjb)
* MFC r319167: (mjg, 2017-06-01, 1 file, -4/+4)
    mtx: fix whitespace damage in _mtx_trylock_flags_
* MFC r318476, r318478: (markj, 2017-05-25, 1 file, -1/+1)
    Fix up some kern_yield() usages.
* MFC r318191: (markj, 2017-05-25, 1 file, -2/+2)
    Let ptracestop() suspend threads sleeping in an SBDRY section.
* move p_sigqueue to the end of struct proc (badger, 2017-05-23, 1 file, -6/+6)
    In order to preserve KBI in stable branches, replace the existing p_sigqueue slot with padding and move the expanded (as of r315949) p_sigqueue to the end of the struct.
    This is a repeat of r317529 (which concerned td_sigqueue in struct thread) for p_sigqueue in struct proc. Virtualbox modules (and possibly others) are affected without this fix.
    Reviewed by: kib
    Differential Revision: https://reviews.freebsd.org/D10843
* MFC r308474, r308691, r309203, r309365, r309703, r309898, r310720, r308489, r308706: (markj, 2017-05-23, 2 files, -7/+4)
    Add PQ_LAUNDRY and remove PG_CACHED pages.
* MFC r318243: (kib, 2017-05-19, 1 file, -1/+3)
    Do not wake up sleeping thread in reschedule_signals() if the signal is blocked. The spurious wakeup might result in spurious EINTR.
    PR: 219228
* MFC r317348: (trasz, 2017-05-18, 1 file, -0/+2)
    Make it possible to terminate "show lockedbufs" by pressing "q".
* MFC r316941: (trasz, 2017-05-18, 1 file, -4/+14)
    Don't try to write out bufs that have already failed with ENXIO. This fixes some panics after disconnecting mounted disks.
* MFC r317784: (mjg, 2017-05-16, 1 file, -5/+2)
    cache: stop holding the ncneg_hot lock across purging
    Only non-hot entries are purged so the lock is not needed in the first place. This saves one lock/unlock pair.
* MFC r315685: tighten buffer bounds in imgact_binmisc_populate_interp (emaste, 2017-05-15, 1 file, -1/+1)
    We must ensure there's space for the terminating null in the temporary buffer in imgact_binmisc_populate_interp().
    Note that there's no buffer overflow here because xbe->xbe_interpreter's length and null termination is checked in imgact_binmisc_add_entry() before imgact_binmisc_populate_interp() is called. However, the latter should correctly enforce its own bounds.
    Sponsored by: The FreeBSD Foundation
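The bound being tightened is the classic "leave room for the terminating null" check. The sketch below is a generic userland illustration, not the imgact_binmisc code; `copy_interp()` and its choice of `ENAMETOOLONG` are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst only if the string and its terminating NUL both fit. */
int copy_interp(char *dst, size_t dstsize, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (len >= dstsize)          /* '>=' reserves one byte for the NUL */
        return ENAMETOOLONG;
    memcpy(dst, src, len + 1);   /* +1 copies the NUL as well */
    return 0;
}
```

Using `>` instead of `>=` would accept a string that exactly fills the buffer and then write the terminator one byte past its end, which is the off-by-one this kind of check exists to prevent.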
* MFC: r317982 (marius, 2017-05-14, 1 file, -10/+2)
    - Also outside of the KOBJOPLOOKUP macro - which in turn is used by the code auto-generated for *.m - kobj_lookup_method(9) is useful; for example in back-ends or base class device drivers in order to determine whether a default method has been overridden. Thus, allow for the kobj_method_t pointer argument - used by KOBJOPLOOKUP in order to update the cache entry - of kobj_lookup_method(9), to be NULL. Actually, that pointer is redundant as it's just set to the same kobj_method_t that the kobj_lookup_method(9) function returns in the first place, but probably it serves to reduce the number of instructions generated for KOBJOPLOOKUP.
    - For the same reason, move updating kobj_lookup_{hits,misses} (if KOBJ_STATS is defined) from kobj_lookup_method(9) to KOBJOPLOOKUP. As a side-effect, this gets rid of the convoluted approach of always incrementing kobj_lookup_hits in KOBJOPLOOKUP and then in case of a cache miss, decrementing it in kobj_lookup_method(9) again.
* MFC r316874: restore ability to shutdown(2) datagram sockets. (sobomax, 2017-05-14, 1 file, -5/+20)
* MFC r317845-r317846 (brooks, 2017-05-12, 1 file, -8/+18)
    r317845:
    Provide a freebsd32 implementation of sigqueue()
    The previous misuse of sys_sigqueue() was sending random register or stack garbage to 64-bit targets. The freebsd32 implementation preserves the sival_int member of value when signaling a 64-bit process.
    Document the mixed ABI implementation of union sigval and the incompatibility of sival_ptr with pointer integrity schemes.
    Reviewed by: kib, wblock
    Sponsored by: DARPA, AFRL
    Differential Revision: https://reviews.freebsd.org/D10605

    r317846:
    Regen post r317845.
    MFC with: r317845
    Sponsored by: DARPA, AFRL
* MFC 313407,313449: Copy ELF machine/flags from binaries to core dumps. (jhb, 2017-05-11, 2 files, -6/+6)
    313407:
    Copy the e_machine and e_flags fields from the binary into an ELF core dump.
    In the kernel, cache the machine and flags fields from ELF header to use in the ELF header of a core dump. For gcore, copy these fields over from the ELF header in the binary.
    This matters for platforms which encode ABI information in the flags field (such as o32 vs n32 on MIPS).

    313449:
    Trim trailing whitespace (mostly introduced in r313407).

    Sponsored by: DARPA / AFRL
* MFC r317523: (kib, 2017-05-11, 1 file, -0/+51)
    Add asserts to verify stability of struct proc and struct thread layouts.
* MFC 313999: Consolidate statements to initialize files. (jhb, 2017-05-11, 1 file, -30/+26)
    Previously, the first lines of various generated files from system call tables were generated in two sections. Some of the initialization was done in BEGIN, and the rest was done when the first line was encountered. The main reason for this split before r313564 was that most of the initialization done in the second section depended on the $FreeBSD$ tag extracted from the system call table. Now that the $FreeBSD$ tag is no longer used, consolidate all of the file initialization in the BEGIN section.
    This change was tested by confirming that the content of generated files did not change.
* MFC 315323: Use UMA_ALIGN_PTR instead of sizeof(void *) for zone alignment. (jhb, 2017-05-11, 1 file, -1/+1)
    uma_zcreate()'s alignment argument is supposed to be sizeof(foo) - 1, and uma.h provides a set of helper macros for common types. Passing sizeof(void *) results in all of the members being misaligned triggering unaligned access faults on certain architectures (notably MIPS).
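The difference between an alignment and an alignment mask can be sketched with ordinary mask arithmetic. This is a generic illustration and not UMA's internal code; `round_up_mask()` and `is_valid_mask()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* The sketch assumes pointer size is a power of two, as on common ABIs. */
static_assert((sizeof(void *) & (sizeof(void *) - 1)) == 0,
    "pointer size must be a power of two");

/* Round addr up using an alignment mask, i.e. the alignment minus one. */
uintptr_t round_up_mask(uintptr_t addr, uintptr_t mask)
{
    return (addr + mask) & ~mask;
}

/* A usable mask has the form 2^n - 1: all low bits set, nothing above. */
int is_valid_mask(uintptr_t mask)
{
    return (mask & (mask + 1)) == 0;
}
```

With the proper mask `sizeof(void *) - 1`, address 1 rounds up to `sizeof(void *)`. Passing `sizeof(void *)` itself as the mask clears only a single bit, so address 1 comes back as 1 and stays misaligned, which matches the failure mode the commit describes.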
* MFC 313564: (jhb, 2017-05-10, 3 files, -12/+6)
    Drop the "created from" line from files generated by makesyscalls.sh.
    This information is less useful when the generated files are included in source control along with the source. If needed it can be reconstructed from the $FreeBSD$ tag in the generated file. Removing this information from the generated output permits committing the generated files along with the change to the system call master list without having inconsistent metadata in the generated files.
    Regenerate the affected files along with the MFC.
* MFC r315526 (vangyzen, 2017-05-01, 5 files, -33/+147)
    Add clock_nanosleep()
    Add a clock_nanosleep() syscall, as specified by POSIX. Make nanosleep() a wrapper around it.
    Attach the clock_nanosleep test from NetBSD. Adjust it for the FreeBSD behavior of updating rmtp only when interrupted by a signal. I believe this to be POSIX-compliant, since POSIX mentions the rmtp parameter only in the paragraph about EINTR. This is also what Linux does. (NetBSD updates rmtp unconditionally.)
    Copy the whole nanosleep.2 man page from NetBSD because it is complete and closely resembles the POSIX description. Edit, polish, and reword it a bit, being sure to keep any relevant text from the FreeBSD page.
    Regenerate syscall files.
    Relnotes: yes
    Sponsored by: Dell EMC
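From user space, the rmtp behavior discussed above matters mostly for relative sleeps that may be interrupted. A minimal sketch using the POSIX interface follows; `sleep_monotonic_ms()` is a hypothetical helper, and it reads `rem` only after EINTR, which is the one case where the commit says the kernel updates it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Sleep for a relative interval on CLOCK_MONOTONIC, resuming after signals. */
int sleep_monotonic_ms(long ms)
{
    struct timespec req, rem;
    int error;

    assert(ms >= 0);
    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;

    /* clock_nanosleep() returns the error number directly, not via errno. */
    while ((error = clock_nanosleep(CLOCK_MONOTONIC, 0, &req, &rem)) == EINTR)
        req = rem;  /* rem holds the unslept time after an interruption */
    return error;
}
```

Restarting with the remaining time lets the total sleep drift slightly when signals are frequent; passing `TIMER_ABSTIME` with a precomputed deadline avoids that and makes the remaining-time argument unnecessary.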
* MFC r316770: (ae, 2017-04-21, 1 file, -1/+13)
    Clear h/w csum flags on mbuf handled by UDP.
    When the checksums of the received IP and UDP headers have already been checked, UDP uses sbappendaddr_locked() to pass the received data to the socket. sbappendaddr_locked() uses the given mbuf as is, and if the NIC supports checksum offloading, the mbuf contains csum_data and csum_flags that were calculated for the already stripped headers. Some NICs support only limited checksum offloading and do not use the CSUM_PSEUDO_HDR flag, and csum_data contains some value that UDP/TCP should use for pseudo header checksum calculation.
    When L2TP is used for tunneling with mpd5, ng_ksocket receives an mbuf with filled csum_flags and csum_data that were calculated for the outer headers. When the L2TP header is stripped, the packet that was tunneled goes to the IP layer, and due to the presence of csum_flags (without CSUM_PSEUDO_HDR) and csum_data, the UDP/TCP checksum check fails for this packet.
    Reported by: Irina Liakh <spell at itl ua>
    Tested by: Irina Liakh <spell at itl ua>

    MFC r316822,316823:
    Rework r316770 to make it protocol independent and general, like we do for streaming sockets.
    And do more cleanup in sbappendaddr_locked_internal() to prevent leaking information from the existing mbuf to the one that may be created later by netgraph.
    Suggested by: glebius
    Tested by: Irina Liakh <spell at itl ua>
* MFC r303863: (smh, 2017-04-14, 1 file, -738/+8)
    Move IPv4 & IPv6 specific jail functions to netinet and netinet6 files.
    Sponsored by: Multiplay
* MFC r315960: dtrace sched:::preempt should fire only when there is preemption (avg, 2017-04-14, 1 file, -1/+5)
* MFC r315851: move thread switch tracing from mi_switch to sched_switch (avg, 2017-04-14, 3 files, -19/+28)
* MFC r316528: (kib, 2017-04-12, 1 file, -8/+10)
    Add V_VMIO flag for vinvalbuf(9).
* MFC r315289: (markj, 2017-04-11, 1 file, -2/+6)
    When draining a callout, don't clear CALLOUT_ACTIVE while it is running.
* Improvements for the brand detection and prioritization. (kib, 2017-04-06, 1 file, -7/+20)
    MFC r315701 (by ed):
    Set the interpreter path to /nonexistent.

    MFC r315749:
    Adjust r314851 to not require every brand to specify interpreter path.

    MFC r315753:
    Add a flag BI_BRAND_ONLY_STATIC to specify that the brand only matches static binaries.

    MFC r315754:
    Update r315753 with the proper flag name.

    MFC r316211:
    A followup to r315749, two more places where brand->interp_path was accessed unconditionally.
* MFC r315860: (ed, 2017-04-06, 1 file, -1/+2)
    Don't require the presence of the compat_3_brand.
    The existing ELF image activator requires the brandinfo to provide such a string unconditionally, even if the executable format in question doesn't use this type of branding. Skip matching when it's a null pointer.
* MFC r316497: (brooks, 2017-04-05, 1 file, -2/+1)
    Correct a kernel stack leak in 32-bit compat when vfc_name is short.
    Don't zero unused pointer members again.
    Per discussion with secteam we are not issuing an advisory for this issue as we have no current evidence it leaks exploitable information.
    Reviewed by: rwatson, glebius, delphij
    Sponsored by: DARPA, AFRL
* Merge r315910: (glebius, 2017-04-03, 1 file, -4/+3)
    Make sendfile(2) more robust against file change. This fixes a possible crash when the file shrinks. This also fixes sendfile(2) not sending more data in a case when the file grows, and the request is open-ended or specifies a size that is greater than old file size.
    PR: 217789
    Reviewed by: gallatin
* MFC r315699: (ngie, 2017-03-29, 1 file, -1/+2)
    Print out name of non-dynamic sysctl in sysctl_remove_oid_locked
    This will provide a slightly better smoking gun than just stating "can't remove non-dynamic nodes!" when calling sysctl_ctx_free(9) and sysctl_remove_{name,oid}(9) with a non-dynamic (likely static) sysctl.