path: root/sys/vm/vm_page.h
Commit message | Author | Age | Files | Lines
* PG_SLAB no longer serves a useful purpose, since m->object is no
| longer abused to store pointer to slab. Remove it.
| Reviewed by: alc
| Sponsored by: The FreeBSD Foundation
| Approved by: re (hrs)
| [kib, 2013-09-17; 1 file, -1/+0]
* Remove zero-copy sockets code. It only worked for anonymous memory,
| and the equivalent functionality is now provided by sendfile(2) over a
| POSIX shared memory file descriptor. Remove the cow member of struct
| vm_page, and rearrange the remaining members. While there, make
| hold_count unsigned.
| Requested and reviewed by: alc
| Tested by: pho
| Sponsored by: The FreeBSD Foundation
| Approved by: re (delphij)
| [kib, 2013-09-16; 1 file, -11/+6]
* Remove the deprecated VM_ALLOC_RETRY flag for vm_page_grab(9).
| The flag has been mandatory since r209792, where vm_page_grab(9) was
| changed to support only the alloc retry semantic.
| Suggested and reviewed by: alc
| Sponsored by: The FreeBSD Foundation
| [kib, 2013-08-22; 1 file, -1/+0]
* Improve pageout flow control to wake up more frequently and do less
| work while maintaining better LRU of active pages.
| - Change v_free_target to include the quantity previously represented
|   by v_cache_min so we don't need to add them together everywhere we
|   use them.
| - Add a pageout_wakeup_thresh that sets the free page count trigger
|   for waking the page daemon. Set this 10% above v_free_min so we
|   wake up before any phase transitions in vm users.
| - Adjust down v_free_target now that we're willing to accept more
|   pagedaemon wakeups. This means we process fewer pages in one
|   iteration as well, leading to shorter lock hold times and less
|   overall disruption.
| - Eliminate vm_pageout_page_stats(). This was a minor variation on
|   the PQ_ACTIVE segment of the normal pageout daemon. Instead we now
|   process 1 / vm_pageout_update_period of the pages every second,
|   which causes us to visit the whole active list every 60 seconds.
|   Previously we would only maintain the active LRU when we were short
|   on pages, which meant it could be woefully out of date.
| Reviewed by: alc (slight variant of this)
| Discussed with: alc, kib, jhb
| Sponsored by: EMC / Isilon Storage Division
| [jeff, 2013-08-13; 1 file, -1/+0]
* Different consumers of struct vm_page abuse the pageq member to keep
| additional information when the page is guaranteed not to belong to a
| paging queue. Usually this results in a lot of type casts, which make
| reasoning about the correctness of the code harder. Sometimes
| m->object is used instead of pageq, which could cause real and
| confusing bugs if a non-NULL m->object is leaked. See r141955 and
| r253140 for examples.
| Change the pageq member into a union containing explicitly-typed
| members. Use them instead of type-punning or abusing m->object in the
| x86 pmaps, uma and vm_page_alloc_contig().
| Requested and reviewed by: alc
| Sponsored by: The FreeBSD Foundation
| [kib, 2013-08-10; 1 file, -5/+15]
* Revert the addition of VPO_BUSY and instead update vm_page_replace()
| to properly unbusy the page.
| Submitted by: alc
| [jhb, 2013-08-09; 1 file, -1/+0]
* Add missing 'VPO_BUSY' from r254141 to fix kernel build break.
| [obrien, 2013-08-09; 1 file, -0/+1]
* On all architectures, avoid preallocating the physical memory for
| nodes used in vm_radix. On architectures supporting direct mapping,
| also avoid preallocating the KVA for such nodes.
| In order to do so, make the operations derived from vm_radix_insert()
| able to fail, and handle all the resulting failures. On the vm_radix
| side, introduce a new function called vm_radix_replace(), which can
| replace an already-present leaf node with a new one, and take into
| account the possibility that operations on the radix trie can recurse
| during vm_radix_insert() allocation. This means that if operations in
| vm_radix_insert() recursed, vm_radix_insert() will start from scratch
| again.
| Sponsored by: EMC / Isilon storage division
| Reviewed by: alc (older version)
| Reviewed by: jeff
| Tested by: pho, scottl
| [attilio, 2013-08-09; 1 file, -2/+4]
* The soft and hard busy mechanisms rely on the vm object lock to work.
| Unify the two concepts into a real, minimal sxlock, where the shared
| acquisition represents the soft busy and the exclusive acquisition
| represents the hard busy.
| The old VPO_WANTED mechanism becomes the hard-path for this new lock,
| and it becomes per-page rather than per-object. The vm_object lock
| becomes an interlock for this functionality: it can be held in either
| read or write mode. However, if the vm_object lock is held in read
| mode while acquiring or releasing the busy state, the thread owner
| cannot make any assumption on the busy state unless it is also
| busying it.
| Also:
| - Add a new flag to directly share busy pages while vm_page_alloc and
|   vm_page_grab are being executed. This will be very helpful once
|   these functions happen under a read object lock.
| - Move the swapping sleep into its own per-object flag.
| The KPI is heavily changed, which is why the version is bumped. It is
| very likely that some VM port users will need to change their own
| code.
| Sponsored by: EMC / Isilon storage division
| Discussed with: alc
| Reviewed by: jeff, kib
| Tested by: gavin, bapt (older version)
| Tested by: pho, scottl
| [attilio, 2013-08-09; 1 file, -34/+83]
* Split the pagequeues per NUMA domain, and split the pagedaemon
| process into threads, each processing the queue in a single domain.
| The structure of the pagedaemons and queues is kept intact; most of
| the changes come from the need for code to find the owning page queue
| for a given page, calculated from the segment containing the page.
| The tie between NUMA domain and pagedaemon thread/pagequeue split is
| rather arbitrary; the multithreaded daemon could be allowed for
| single-domain machines, or one domain might be split into several
| page domains to further increase concurrency.
| Right now, each pagedaemon thread tries to reach the global target,
| precalculated at the start of the pass. This is not optimal, since it
| could cause excessive page deactivation and freeing. The code should
| be changed to re-check the global page deficit state in the loop
| after some number of iterations.
| The pagedaemons reach quorum before starting the OOM, since one
| thread's inability to meet the target is normal for split queues.
| Only when all pagedaemons fail to produce enough reusable pages is
| the OOM started by a single selected thread.
| Launder is modified to take into account the segments layout with
| regard to the region for which cleaning is performed.
| Based on the preliminary patch by jeff, sponsored by EMC / Isilon
| Storage Division.
| Reviewed by: alc
| Tested by: pho
| Sponsored by: The FreeBSD Foundation
| [kib, 2013-08-07; 1 file, -5/+32]
* Revise the interface between vm_object_madvise() and
| vm_page_dontneed() so that pointless calls to pmap_is_modified() can
| be easily avoided when performing madvise(..., MADV_FREE).
| Sponsored by: EMC / Isilon Storage Division
| [alc, 2013-06-10; 1 file, -1/+1]
* Update a comment.
| [alc, 2013-06-04; 1 file, -2/+2]
* Require that the page lock is held, instead of the object lock, when
| clearing the page's PGA_REFERENCED flag.
| Since we are typically manipulating the page's act_count field when
| we are clearing its PGA_REFERENCED flag, the page lock is already
| held everywhere that we clear the PGA_REFERENCED flag. So, in fact,
| this revision only changes some comments and an assertion.
| Nonetheless, it will enable later changes to object locking in the
| pageout code.
| Introduce vm_page_assert_locked(), which completely hides the
| implementation details of the page lock from the caller, and use it
| in vm_page_aflag_clear(). (The existing vm_page_lock_assert() could
| not be used in vm_page_aflag_clear().) Over the coming weeks, I
| expect that we'll either eliminate or replace the various uses of
| vm_page_lock_assert() with vm_page_assert_locked().
| Reviewed by: attilio
| Sponsored by: EMC / Isilon Storage Division
| [alc, 2013-06-03; 1 file, -7/+9]
* Simplify the definition of vm_page_lock_assert(). There is no
| compelling reason to inline the implementation of
| vm_page_lock_assert() in the !KLD_MODULES case. Use the same
| implementation for both KLD_MODULES and !KLD_MODULES.
| Reviewed by: kib
| [alc, 2013-05-31; 1 file, -7/+6]
* The per-page act_count can very easily be protected by the per-page
| lock rather than the vm_object lock, without any further overhead.
| Make the formal switch.
| Sponsored by: EMC / Isilon storage division
| Reviewed by: alc
| Tested by: pho
| [attilio, 2013-04-08; 1 file, -1/+1]
* Now that vm_page_cache_free() and vm_page_cache_transfer() are
| reimplemented as ranged operations, sync the vm_page_is_cached()
| semantic with HEAD.
| [attilio, 2013-02-06; 1 file, -1/+1]
* Reduce diffs against HEAD:
| Reimplement vm_page_cache_free() as a range operation.
| [attilio, 2013-02-06; 1 file, -1/+1]
* Reduce diffs against HEAD:
| - Reimplement vm_page_cache_transfer() properly
| - Remove vm_page_cache_rename() as a subsequent change
| [attilio, 2013-02-05; 1 file, -1/+1]
* Merge from vmcontention [attilio, 2013-02-04; 1 file, -59/+154]
|\
| * MFC [attilio, 2012-12-11; 1 file, -50/+76]
| |\
| | * Update a comment to reflect the elimination of the hold queue in
| | | r242300.
| | | [alc, 2012-11-17; 1 file, -5/+1]
| | * Move the declaration of vm_phys_paddr_to_vm_page() from
| | | vm/vm_page.h to vm/vm_phys.h, where it belongs.
| | | Requested and reviewed by: alc
| | | MFC after: 2 weeks
| | | [kib, 2012-11-16; 1 file, -2/+0]
| | * Explicitly state that M_USE_RESERVE requires M_NOWAIT, using an
| | | assertion.
| | | Reviewed by: alc
| | | MFC after: 2 weeks
| | | [kib, 2012-11-16; 1 file, -0/+3]
| | * Flip the semantic of M_NOWAIT to only require that the allocation
| | | does not sleep, and perform the page allocations with the
| | | VM_ALLOC_SYSTEM class. Previously, the allocation was also
| | | allowed to completely drain the reserve of free pages, being
| | | translated to the VM_ALLOC_INTERRUPT request class for
| | | vm_page_alloc() and similar functions.
| | | Allow the caller of malloc* to request the 'deep drain' semantic
| | | by providing the M_USE_RESERVE flag, now translated to the
| | | VM_ALLOC_INTERRUPT class. Previously, it resulted in the less
| | | aggressive VM_ALLOC_SYSTEM allocation class.
| | | Centralize the translation of the M_* malloc(9) flags in the
| | | single inline function malloc2vm_flags().
| | | Discussion started by: "Sears, Steven" <Steven.Sears@netapp.com>
| | | Reviewed by: alc, mdf (previous version)
| | | Tested by: pho (previous version)
| | | MFC after: 2 weeks
| | | [kib, 2012-11-14; 1 file, -0/+16]
| | * Replace the single, global page queues lock with per-queue locks
| | | on the active and inactive paging queues.
| | | Reviewed by: kib
| | | [alc, 2012-11-13; 1 file, -16/+37]
| | * Rework the known mutexes to benefit from staying on their own
| | | cache line, using struct mtx_padalign in order to avoid manual
| | | frobbing. The sole exceptions are the nvme and sfxge drivers,
| | | where the author redefined CACHE_LINE_SIZE manually, so they
| | | need to be analyzed and dealt with separately.
| | | Reviewed by: jimharris, alc
| | | [attilio, 2012-10-31; 1 file, -12/+4]
| | * Replace the page hold queue, PQ_HOLD, by a new page flag,
| | | PG_UNHOLDFREE, because the queue itself serves no purpose.
| | | When a held page is freed, inserting the page into the hold queue
| | | has the side effect of setting the page's "queue" field to
| | | PQ_HOLD. Later, when the page is unheld, it will be freed because
| | | the "queue" field is PQ_HOLD. In other words, PQ_HOLD is used as
| | | a flag, not a queue. So, this change replaces it with a flag.
| | | To accommodate the new page flag, make the page's "flags" field
| | | wider and "oflags" field narrower.
| | | Reviewed by: kib
| | | [alc, 2012-10-29; 1 file, -17/+17]
| * | MFC [attilio, 2012-10-22; 1 file, -1/+0]
| |\ \
| | |/
| | * Move vm_page_requeue() to the only file that uses it.
| | | MFC after: 3 weeks
| | | [alc, 2012-10-13; 1 file, -1/+0]
| * | MFC [attilio, 2012-08-27; 1 file, -2/+1]
| |\ \
| | |/
| | * Do not leave invalid pages in the object after a short read for
| | | network file systems (not only NFS proper). Short reads cause
| | | pages other than the requested one, which were not filled by the
| | | read response, to stay invalid.
| | | Change the vm_page_readahead_finish() interface to not take the
| | | error code, but instead to make the decision to free or to
| | | (de)activate the page only by its validity. As a result,
| | | non-requested invalid pages are freed even if the read RPC
| | | indicated success.
| | | Noted and reviewed by: alc
| | | MFC after: 1 week
| | | [kib, 2012-08-14; 1 file, -1/+1]
| | * After the PHYS_TO_VM_PAGE() function was de-inlined, the main
| | | reason to pull in vm_param.h was removed. The other big
| | | dependency of vm_page.h on vm_param.h is the PA_LOCK*
| | | definitions, which are only needed for in-kernel code, because
| | | modules use KBI-safe functions to lock the pages.
| | | Stop including vm_param.h in vm_page.h. Include vm_param.h
| | | explicitly in the kernel code which needs it.
| | | Suggested and reviewed by: alc
| | | MFC after: 2 weeks
| | | [kib, 2012-08-05; 1 file, -2/+0]
| | * Reduce code duplication and exposure of direct access to struct
| | | vm_page oflags by providing the helper function
| | | vm_page_readahead_finish(), which handles completed reads for
| | | pages with indexes other than the requested one, for
| | | VOP_GETPAGES().
| | | Reviewed by: alc
| | | MFC after: 1 week
| | | [kib, 2012-08-04; 1 file, -0/+1]
| * | MFC [attilio, 2012-08-03; 1 file, -8/+79]
| |\ \
| | |/
| | * Inline vm_page_aflags_clear() and vm_page_aflags_set().
| | | Add comments stating that neither these functions nor the flags
| | | that they are used to manipulate are part of the KBI.
| | | [alc, 2012-08-03; 1 file, -7/+79]
| | * Eliminate an unneeded declaration. (I should have removed this
| | | as part of r227568.)
| | | [alc, 2012-07-30; 1 file, -1/+0]
* | | Merge from vmcontention [attilio, 2012-07-08; 1 file, -7/+31]
|\ \ \
| |/ /
| * | MFC [attilio, 2012-06-23; 1 file, -7/+31]
| |\ \
| | |/
| | * Selectively inline vm_page_dirty().
| | | [alc, 2012-06-20; 1 file, -1/+23]
| | * The page flag PGA_WRITEABLE is set and cleared exclusively by
| | | the pmap layer, but it is read directly by the MI VM layer. This
| | | change introduces pmap_page_is_write_mapped() in order to
| | | completely encapsulate all direct access to PGA_WRITEABLE in the
| | | pmap layer.
| | | Aesthetics aside, I am making this change because amd64 will
| | | likely begin using an alternative method to track write
| | | mappings, and having pmap_page_is_write_mapped() in place allows
| | | me to make such a change without further modification to the MI
| | | VM layer. As an added bonus, tidy up some nearby comments
| | | concerning page flags.
| | | Reviewed by: kib
| | | MFC after: 6 weeks
| | | [alc, 2012-06-16; 1 file, -6/+8]
* | | - Split the cached and resident pages tree into 2 distinct ones.
| | | This makes the RED/BLACK support go away and simplifies a lot of
| | | the vm_radix functions used here. This happens because with
| | | patricia trie support the trie will be small enough that keeping
| | | 2 different tries will be efficient too.
| | | - Reduce differences with head, in places like backing scan where
| | | the optimizations used shuffled the code a little bit around.
| | | Tested by: flo, Andrea Barberio
| | | [attilio, 2012-07-08; 1 file, -0/+1]
* | | Revert r231027 and fix the prototype for vm_radix_remove().
| | | The target of this is getting to the point where the recovery
| | | path is completely removed, as we can count on pre-allocation
| | | once the path-compressed trie is implemented.
| | | [attilio, 2012-06-08; 1 file, -1/+1]
|/ /
* | MFC [attilio, 2012-06-01; 1 file, -14/+3]
|\ \
| |/
| * Add a facility to register a range of physical addresses to be used
| | for allocation of fictitious pages, for which PHYS_TO_VM_PAGE()
| | returns a proper fictitious vm_page_t. The range should be
| | de-registered after the consumer has stopped using it.
| | De-inline PHYS_TO_VM_PAGE(), since it now carries code to iterate
| | over the registered ranges.
| | A hash container might be developed instead of the range
| | registration interface, and fake pages could be put automatically
| | into the hash, where PHYS_TO_VM_PAGE() could look them up later.
| | This should be considered before the MFC of the commit is done.
| | Sponsored by: The FreeBSD Foundation
| | Reviewed by: alc
| | MFC after: 1 month
| | [kib, 2012-05-12; 1 file, -13/+1]
| * Split the code from vm_page_getfake() that initializes the fake
| | page struct vm_page into a new interface, vm_page_initfake().
| | Handle the case of fake page re-initialization with changed
| | memattr.
| | Sponsored by: The FreeBSD Foundation
| | Reviewed by: alc
| | MFC after: 1 month
| | [kib, 2012-05-12; 1 file, -0/+1]
| * Commit the change forgotten in r235356.
| | Sponsored by: The FreeBSD Foundation
| | Reviewed by: alc
| | MFC after: 1 month
| | [kib, 2012-05-12; 1 file, -1/+1]
| * Fix mincore(2) so that it reports PG_CACHED pages as resident.
| | MFC after: 2 weeks
| | [alc, 2012-04-08; 1 file, -0/+1]
* | MFC [attilio, 2012-04-08; 1 file, -1/+0]
|\ \
| |/
| * Staticize vm_page_cache_remove().
| | Reviewed by: alc
| | [attilio, 2012-04-06; 1 file, -1/+0]
| * Reduce the frequency with which the PowerPC/AIM pmaps invalidate
| | instruction caches, by invalidating kernel icaches only when needed
| | and not flushing user caches for shared pages.
| | Suggested by: kib
| | MFC after: 2 weeks
| | [nwhitehorn, 2012-04-06; 1 file, -0/+4]