path: root/sys/vm
Commit message | Author | Age | Files | Lines
* EN-16:17: virtual memory issues. | glebius | 2016-10-25 | 4 | -50/+58
  Due to increased parallelism and optimizations in several parts of the system, previously latent bugs in the VM have become much easier to trigger, affecting a significant number of FreeBSD users. The exact technical details of the issues are provided in the commit messages of the merged revisions, which are listed below with short summaries.
  r301184 prevent parallel object collapses, fixes object lifecycle
  r301436 do not leak the vm object lock, fixes overcommit disable
  r302243 avoid the active object marking for vm.vmtotal sysctl, fixes "vodead" hangs
  r302513 vm_fault() race with the vm_object_collapse(), fixes spurious SIGSEGV
  r303291 postpone BO_DEAD, fixes panic on fast vnode reclaim
  Approved by: so
* MFC 290728: | jhb | 2016-01-18 | 1 | -0/+3
  Export various helper variables describing the layout and size of certain kernel structures for use by debuggers. This mostly aids in examining cores from a kernel without debug symbols as a debugger can infer these values if debug symbols are available.
  One set of variables describes the layout of 'struct linker_file' to walk the list of loaded kernel modules.
  A second set of variables describes the layout of 'struct proc' and 'struct thread' to walk the list of processes in the kernel and the threads in each process.
  The 'pcb_size' variable is used to index into the stoppcbs[] array.
  The 'vm_maxuser_address' is used to distinguish kernel virtual addresses from user addresses. This doesn't have to be perfect, and 'vm_maxuser_address' is a cheap and simple way to differentiate kernel pointers from simple values like TIDs and PIDs.
  While here, annotate the fields in struct pcb used by kgdb on amd64 and i386 to note that their ABI should be preserved. Annotations for other platforms will be added in the future.
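  The idea of exporting layout variables can be sketched in userland C as follows. The structure `toy_proc` and the names `toy_off_pid`, `toy_off_comm`, `toy_proc_size` and `toy_read_pid()` are hypothetical stand-ins, not the kernel's actual symbols. The sketch only illustrates that a global constant holding an `offsetof()` value can be read by a tool that has no type information for the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a kernel structure (not the real struct proc). */
struct toy_proc {
	struct toy_proc *p_next;
	int32_t p_pid;
	char p_comm[20];
};

/*
 * Exported layout constants.  They are deliberately non-static so that they
 * get global symbols that a debugger can look up by name.
 */
const int toy_off_pid = offsetof(struct toy_proc, p_pid);
const int toy_off_comm = offsetof(struct toy_proc, p_comm);
const int toy_proc_size = sizeof(struct toy_proc);

/*
 * What a symbol-less debugger would do: fetch a field from raw memory using
 * only the exported offset, without knowing the structure definition.
 */
int32_t
toy_read_pid(const void *raw)
{
	int32_t pid;

	assert(raw != NULL);
	memcpy(&pid, (const char *)raw + toy_off_pid, sizeof(pid));
	return (pid);
}
```

  A caller would fill in a `struct toy_proc`, pass its address as an untyped pointer, and get the PID back through the offset alone.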
* MFC r291576: | kib | 2015-12-15 | 1 | -97/+84
  Handle invalid pages found during the sleepable collapse scan, do not free the shadow page swap space. Combine the sleep code.
* MFC r291408: | kib | 2015-12-11 | 1 | -5/+6
  In vm_pageout_grow_cache(), do not re-try the inactive queue when the active queue scan initiated a write, to avoid an infinite loop.
* MFC r290920: | kib | 2015-12-07 | 2 | -14/+30
  Raise OOM when pagedaemon is unable to produce a free page in several back-to-back passes.
* MFC r290917: | kib | 2015-12-07 | 1 | -3/+63
  Provide the OOM-specific vm_pageout_oom_pagecount() function which estimates the amount of reclaimable memory which could be stolen if the process is killed.
* MFC r290915: | kib | 2015-12-07 | 1 | -1/+2
  Do not skip a process which has an inhibited thread due to the swap-out, in the OOM selection loop.
* MFC r291446: | kib | 2015-12-06 | 1 | -18/+10
  Minor cleanup. Systematically use ANSI C function definitions. Correct type of the flags argument to the dev_pager_putpages() function.
  vm_pager_free_nonreq() does not exist in stable/10, this part is not merged.
* MFC r289895: | kib | 2015-11-29 | 1 | -86/+42
  Reduce the number of calls to VOP_BMAP() made from the local vnode pager.
  MFC r291157, r291158:
  Include the pages before/after the requested page, that fit into the reqblock, into the calculation of the size of run of pages.
  Tested by: pho
* MFC r287235: | markj | 2015-11-13 | 1 | -54/+25
  Remove weighted page handling from vm_page_advise().
* MFC r289496: | kib | 2015-11-02 | 1 | -5/+17
  Modify the 'unchanged' calculation by dereferencing the marker tailq pointers, which is known to belong to the queue.
* MFC r288281 | alc | 2015-10-03 | 1 | -28/+13
  The conversion of kmem_alloc_attr() from operating on a vm map to a vmem arena in r254025 introduced a bug in the case when an allocation is only partially successful. Specifically, the vm object lock was not being acquired before freeing the allocated pages. To address this bug, replace the existing code by a call to kmem_unback().
  Change the type of a variable in kmem_alloc_attr() so that an allocation of two or more gigabytes won't fail.
  Replace the error handling code in kmem_back() by a call to kmem_unback().
* MFC r283924 | vangyzen | 2015-10-02 | 2 | -0/+13
  Provide vnode in memory map info for files on tmpfs
  When providing memory map information to userland, populate the vnode pointer for tmpfs files. Set the memory mapping to appear as a vnode type, to match FreeBSD 9 behavior.
  This fixes the use of tmpfs files with the dtrace pid provider, procstat -v, procfs, linprocfs, pmc (pmcstat), and ptrace (PT_VM_ENTRY).
  Submitted by: Eric Badger <eric@badgerio.us> (initial revision)
  Obtained from: Dell Inc.
  PR: 198431
* MFC 283624,283630: | jhb | 2015-10-01 | 1 | -0/+137
  Export a list of VM objects in the system via a sysctl. The list can be examined via 'vmstat -o'. It can be used to determine which files are using physical pages of memory and how much each is using.
* MFC r266588 | alc | 2015-09-27 | 1 | -1/+1
  There is no reason to perform the pmap_remove() on the kernel pmap while the kmem object lock is held. Do the pmap_remove() before acquiring the kmem object lock.
* MFC r283795 | alc | 2015-09-27 | 1 | -0/+1
  Document vm_page_alloc_contig()'s support for the VM_ALLOC_NODUMP option.
* MFC r288025 | alc | 2015-09-27 | 1 | -6/+11
  Correct a non-fatal error in vm_pageout_worker(). vm_pageout_worker() should not assume that vm_pages_needed will remain set while it sleeps. Other threads can clear vm_pages_needed by performing a sufficient number of vm_page_free() calls, e.g., process termination. The effect of this error was that vm_pageout_worker() would free and/or launder pages when, in fact, there was no shortage of free pages.
  Rewrite a nearby comment to describe all of the possible cases and not just the most common case. The problem was that the comment made the most common case seem like the only case.
* MFC r285282 | alc | 2015-09-27 | 2 | -15/+20
  The intention of r254304 was to scan the active queue continuously. However, I've observed the active queue scan stopping when there are frequent free page shortages and the inactive queue is steadily refilled by other mechanisms, such as the sequential access heuristic in vm_fault() or madvise(2).
  To remedy this problem, record the time of the last active queue scan, and always scan a number of pages proportional to the time since the last scan, regardless of whether that last scan was a timeout-triggered ("pass == 0") or free-page-shortage-triggered ("pass > 0") scan.
  Also, on a timeout-triggered scan, allow a full scan of the active queue when the system is short of inactive pages.
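  The "pages proportional to the time since the last scan" rule can be sketched as below. This is a minimal illustration, not the committed code: the function name `active_scan_target()`, its parameters, and the cap at the queue length are assumptions. It assumes a tunable period (in seconds) over which the whole queue should be visited once, with a period of zero disabling the proportional scan.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many active-queue pages to scan now, given the
 * queue length and the number of ticks elapsed since the previous scan.
 * Scanning queue_len pages per update_period_sec seconds works out to
 * queue_len * elapsed / (hz * period) pages for this interval.
 */
uint64_t
active_scan_target(uint64_t queue_len, uint64_t elapsed_ticks, uint64_t hz,
    uint64_t update_period_sec)
{
	uint64_t target;

	if (update_period_sec == 0)
		return (0);	/* proportional scanning disabled */
	assert(hz > 0);
	target = queue_len * elapsed_ticks / (hz * update_period_sec);
	/* Assumption: never ask for more pages than the queue holds. */
	return (target > queue_len ? queue_len : target);
}
```

  With a 1000-page queue, hz = 1000 and a one-second period, half a second since the last scan yields a target of 500 pages. Because the target depends only on elapsed time, it is the same whether the scan was triggered by a timeout or by a page shortage.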
* MFC r287121 | alc | 2015-09-27 | 1 | -6/+10
  Testing whether a page is dirty does not require the page lock.
* MFC r284654 | alc | 2015-09-27 | 1 | -3/+5
  Avoid pmap_is_modified() on pages that can't be mapped.
* MFC r287944 | alc | 2015-09-27 | 1 | -1/+2
  Eliminate (many) unnecessary calls to pmap_remove_all().
* MFC r280957 | rstone | 2015-09-17 | 3 | -14/+16
  Fix integer truncation bug in malloc(9)
  A couple of internal functions used by malloc(9) and uma truncated a size_t down to an int. This could cause any number of issues (e.g. indefinite sleeps, memory corruption) if any kernel subsystem tried to allocate 2GB or more through malloc. zfs would attempt such an allocation when run on a system with 2TB or more of RAM.
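  The class of bug described above can be reproduced in a few lines of userland C. This is an illustrative sketch only: `TOY_PAGE_SIZE`, `pages_for()` and `pages_for_truncated()` are hypothetical names, not the kernel functions that were fixed. Narrowing an out-of-range `size_t` to `int` is implementation-defined; on the usual two's-complement compilers a 2 GB request becomes a negative number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	TOY_PAGE_SIZE	4096	/* hypothetical page size for the sketch */

/* Correct: keep the size in a size_t all the way through. */
size_t
pages_for(size_t size)
{
	/* Guard the round-up against wrapping at the top of the range. */
	assert(size <= SIZE_MAX - (TOY_PAGE_SIZE - 1));
	return ((size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE);
}

/*
 * Buggy pattern: the size passes through an int.  For sizes of 2 GB or
 * more the narrowed value is typically negative, so the page count is
 * nonsense.
 */
int
pages_for_truncated(size_t size)
{
	int s = (int)size;	/* truncation happens here */

	return ((s + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE);
}
```

  For a request of exactly 2 GB, `pages_for()` returns 524288 while `pages_for_truncated()` typically returns a negative count, which is the kind of value that can lead to the sleeps or corruption mentioned in the commit message.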
* MFC r286970: | rstone | 2015-09-17 | 1 | -3/+4
  Prevent ticks rollover from preventing vm_lowmem event
  Currently vm_pageout_scan() uses a ticks-based scheme to rate-limit the number of times that the vm_lowmem event will happen. However if no events happen for long enough for ticks to roll over, this leaves us in a long window in which vm_lowmem events will not happen. Replace the use of ticks with time_t to prevent rollover from ever being an issue.
  Reviewed by: ian
  MFC after: 3 weeks
  Sponsored by: EMC / Isilon Storage Division
  Differential Revision: https://reviews.freebsd.org/D3439
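  The rollover window can be demonstrated with a small model. This is a sketch under stated assumptions, not the kernel code: it assumes a 32-bit tick counter whose difference is taken as a signed value, and the function names, `hz` and the period are hypothetical. The narrowing to `int32_t` is implementation-defined and wraps on common two's-complement platforms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Tick-based rate limit.  Once more than 2^31 ticks have passed since the
 * last event, the signed difference goes negative and the check keeps
 * failing until the counter comes back around.
 */
bool
lowmem_due_ticks(uint32_t now_ticks, uint32_t last_ticks, int hz,
    int period_sec)
{
	int32_t elapsed;

	assert(hz > 0);
	elapsed = (int32_t)(now_ticks - last_ticks);
	return (elapsed / hz >= period_sec);
}

/*
 * Seconds-based rate limit, modelled here with a 64-bit count of seconds.
 * The counter does not wrap in any realistic uptime, so a long idle period
 * cannot suppress the event.
 */
bool
lowmem_due_seconds(int64_t now_sec, int64_t last_sec, int period_sec)
{
	return (now_sec - last_sec >= period_sec);
}
```

  With hz = 1000 and a 10 second period, an idle gap of 20000 ticks passes the tick-based check, but a gap of 2^31 + 5000 ticks (about 24.8 days) does not, even though far more than 10 seconds have elapsed. The seconds-based check passes in both cases.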
* MFC 281310, 287567: | imp | 2015-09-16 | 1 | -13/+42
  r287567 | imp | 2015-09-08 11:47:56 -0600 (Tue, 08 Sep 2015) | 16 lines
  Mark the swap pager as direct dispatch compatible.
  r281310 | mav | 2015-04-09 07:09:05 -0600 (Thu, 09 Apr 2015) | 4 lines
  Remove sleeps from geom_up thread on device destruction.
* MFC r287591: | kib | 2015-09-16 | 1 | -10/+4
  There is no reason in the current kernel to disallow write access to the COW wired entry if the entry permissions allow it. Remove the check.
* MFC r286086: | kib | 2015-08-06 | 3 | -26/+21
  Do not pretend that vm_fault(9) supports unwiring the address.
* MFC r285046: | kib | 2015-08-05 | 1 | -1/+2
  Account for the main process stack being one page below the highest user address when the ABI uses a shared page.
* MFC r285878: | kib | 2015-08-01 | 2 | -24/+5
  Revert r173708's modifications to vm_object_page_remove().
  This fixes inconsistencies encountered by vm_object_unwire() or by the buffer cache when the file is truncated.
* MFC r284207 (by alc): | kib | 2015-07-16 | 1 | -2/+1
  Correct a type error in kmem_unback().
  Requested by: alc
  Approved by: re (gjb)
* MFC r276439 (by alc): | kib | 2015-07-16 | 2 | -47/+159
  Make the creation of the free lists dynamic, i.e., it is based on the available physical memory at boot time. For amd64 systems with 64 GB or more of physical memory, create free lists for managing pages with physical addresses below 4 GB.
  PR: 185727
  Requested by: alc
  Approved by: re (gjb)
* MFC r282213: | trasz | 2015-06-21 | 5 | -102/+137
  Add kern.racct.enable tunable and RACCT_DISABLED config option. The point of this is to be able to add RACCT (with RACCT_DISABLED) to GENERIC, to avoid having to rebuild the kernel to use rctl(8).
  MFC r282901:
  Build GENERIC with RACCT/RCTL support by default. Note that it still needs to be enabled by adding "kern.racct.enable=1" to /boot/loader.conf.
  Note those two are MFC-ed together, because the latter one changes the name of RACCT_DISABLED option to RACCT_DEFAULT_TO_DISABLED. Should have committed the renaming separately...
  Relnotes: yes
  Sponsored by: The FreeBSD Foundation
* MFC 261811,282660,282706: | jhb | 2015-06-06 | 8 | -27/+36
  Place VM objects on the object list when created and never remove them.
  261811: Fix function name in KASSERT().
  282660: Place VM objects on the object list when created and never remove them. This is ok since objects come from a NOFREE zone and allows objects to be locked while traversing the object list without triggering a LOR. Ensure that objects on the list are marked DEAD while free or stillborn, and that they have a refcount of zero. This required updating most of the pagers to explicitly mark an object as dead when deallocating it. (Only the vnode pager did this previously.)
  282706: Satisfy vm_object uma zone destructor requirements after r282660 when vnode object creation raced.
* MFC 281887: | jhb | 2015-06-02 | 1 | -1/+1
  Reassign copyright statements on several files from Advanced Computing Technologies LLC to Hudson River Trading LLC.
* MFC r283162: | kib | 2015-05-27 | 1 | -2/+3
  Set VPO_UNMANAGED on the freed page when insertion of the page into the object queue failed, to satisfy the assertion.
  MFC r283163:
  Do grammar fix in the comment to record the right commit message for r283162.
* MFC r282690: | kib | 2015-05-23 | 3 | -5/+47
  Call uma_reclaim() from the additional pagedaemon thread to reclaim kmem arena address space.
* MFC r282128: | kib | 2015-05-05 | 1 | -0/+4
  Do not sleep waiting for the MAP_ENTRY_IN_TRANSITION state ending with the vnode locked.
* Revert r281543. | scottl | 2015-04-24 | 1 | -1/+1
  It causes a panic/hang early in boot for a number of users, myself included. The original code is likely papering over a larger bug that needs to be explored, but for now get things back to a working state.
  Obtained from: Netflix, Inc.
* MFC r279400 | alc | 2015-04-20 | 1 | -2/+0
  Eliminate a variable that became unused when VFS_LOCK_GIANT() was eliminated.
* MFC r281162, r281451: | dchagin | 2015-04-15 | 1 | -1/+1
  Use flexible array for per cpu uma_cache to avoid allocating an extra struct uma_cache.
  PR: 199169
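  Why a flexible array member saves one element can be shown with a toy structure. The types and functions below (`toy_cache`, `toy_zone_old`, `toy_zone`, `toy_zone_alloc()` and the size helpers) are hypothetical and only mirror the general pattern, not the real uma structures. With a trailing one-element array, `sizeof` already counts one element, so adding room for one element per CPU on top of it over-allocates by exactly one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-CPU cache record. */
struct toy_cache {
	long allocs;
	long frees;
};

/* Old style: a trailing one-element array used as a variable-length tail. */
struct toy_zone_old {
	int ncpu;
	struct toy_cache cpu[1];
};

/* C99 flexible array member: contributes nothing to sizeof. */
struct toy_zone {
	int ncpu;
	struct toy_cache cpu[];
};

/* Bytes requested when sizing the old layout as "header + ncpu elements". */
size_t
toy_zone_old_size(int ncpu)
{
	return (sizeof(struct toy_zone_old) +
	    (size_t)ncpu * sizeof(struct toy_cache));
}

/* Bytes requested for the flexible-array layout: exactly ncpu elements. */
size_t
toy_zone_size(int ncpu)
{
	return (sizeof(struct toy_zone) +
	    (size_t)ncpu * sizeof(struct toy_cache));
}

/* Allocate a zeroed zone with room for ncpu caches; NULL on failure. */
struct toy_zone *
toy_zone_alloc(int ncpu)
{
	struct toy_zone *z;

	assert(ncpu > 0);
	z = calloc(1, toy_zone_size(ncpu));
	if (z != NULL)
		z->ncpu = ncpu;
	return (z);
}
```

  For any CPU count, the old sizing exceeds the flexible-array sizing by one `struct toy_cache`, which is the extra element the commit message refers to.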
* MFC r281113: | dchagin | 2015-04-12 | 1 | -2/+2
  Fix wrong kassert msg in uma.
  PR: 199172
* MFC r280702: Make swapper release orphaned (lost) GEOM provider. | mav | 2015-04-09 | 1 | -14/+50
  The swap device is still reported as enabled, and the system may still crash later if some swapped-out kernel pages were lost with the device, but at least GEOM and CAM can now release the lost disk, allowing it to be reconnected.
* MFC r279720 | alc | 2015-04-03 | 1 | -1/+1
  Correct a typo in vm_object_backing_scan() that originated in r254141. Specifically, change a lock acquire into a lock release.
* MFC r280238 | alc | 2015-04-02 | 1 | -0/+9
  Fix the root cause of the "vm_reserv_populate: reserv <address> is already promoted" panics.
  PR: 198163
* MFC r278888: | ngie | 2015-03-22 | 1 | -4/+2
  Some minor style(9) fixes (whitespace + comment)
* Merge r263233 from HEAD to stable/10: | rwatson | 2015-03-19 | 1 | -1/+1
  Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h.
  Sponsored by: Google, Inc.
* MFC r279764: | kib | 2015-03-15 | 1 | -1/+1
  Fix function name in the panic message.
* MFC r277649: | rstone | 2015-03-01 | 1 | -0/+3
  vmspace_release() may sleep if the last reference is being released, so add a WITNESS_WARN() to catch cases where it is called with a non-sleepable lock held.
  MFC after: 1 month
  Sponsored by: Sandvine Inc.
* MFC r277828: | kib | 2015-02-11 | 3 | -3/+11
  Update mtime for tmpfs files modified through memory mapping.
  MFC r277969:
  Update both ctime and mtime for writes to tmpfs files.
  MFC r277972:
  Remove single-use boolean.
  MFC r278151:
  Remove duplicated assignment.
* MFC r277646: | kib | 2015-01-31 | 1 | -10/+16
  Avoid calling vmspace_free() while owning the process lock.
* MFC r277055: | kib | 2015-01-19 | 1 | -4/+0
  Revert r263475: TDP_DEVMEMIO no longer needed.