path: root/sys/vm/vm_page.h
Each entry below gives the commit message, followed by [author, date; files changed, lines -removed/+added].
* Rename vm_pageq_requeue() to vm_page_requeue() on account of its recent
  migration to vm/vm_page.c.
  [alc, 2008-03-19; 1 file, -1/+1]
* Almost seven years ago, vm/vm_page.c was split into three parts:
  vm/vm_contig.c, vm/vm_page.c, and vm/vm_pageq.c. Today, vm/vm_pageq.c has
  withered to the point that it contains only four short functions, two of
  which are only used by vm/vm_page.c. Since I can't foresee any reason for
  vm/vm_pageq.c to grow, it is time to fold the remaining contents of
  vm/vm_pageq.c back into vm/vm_page.c.

  Add some comments. Rename one of the functions, vm_pageq_enqueue(), that
  is now static within vm/vm_page.c, to vm_page_enqueue(). Eliminate
  PQ_MAXCOUNT as it no longer serves any purpose.
  [alc, 2008-03-18; 1 file, -4/+1]
* Correct an error of omission in the reimplementation of the page cache:
  vm_object_page_remove() should convert any cached pages that fall within
  the specified range to free pages. Otherwise, there could be a problem if
  a file is first truncated and then regrown. Specifically, some old data
  from prior to the truncation might reappear.

  Generalize vm_page_cache_free() to support the conversion of either a
  subset or the entirety of an object's cached pages.

  Reported by: tegge
  Reviewed by: tegge
  Approved by: re (kensmith)
  [alc, 2007-09-27; 1 file, -1/+1]
* Change the management of cached pages (PQ_CACHE) in two fundamental
  ways:

  (1) Cached pages are no longer kept in the object's resident page splay
  tree and memq. Instead, they are kept in a separate per-object splay tree
  of cached pages. However, access to this new per-object splay tree is
  synchronized by the _free_ page queues lock, not to be confused with the
  heavily contended page queues lock. Consequently, a cached page can be
  reclaimed by vm_page_alloc(9) without acquiring the object's lock or the
  page queues lock.

  This solves a problem independently reported by tegge@ and Isilon.
  Specifically, they observed the page daemon consuming a great deal of CPU
  time because of pages bouncing back and forth between the cache queue
  (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of this
  problem turned out to be a deadlock avoidance strategy employed when
  selecting a cached page to reclaim in vm_page_select_cache(). However,
  the root cause was really that reclaiming a cached page required the
  acquisition of an object lock while the page queues lock was already
  held. Thus, this change addresses the problem at its root, by eliminating
  the need to acquire the object's lock.

  Moreover, keeping cached pages in the object's primary splay tree and
  memq was, in effect, optimizing for the uncommon case. Cached pages are
  reclaimed far, far more often than they are reactivated. Instead, this
  change makes reclamation cheaper, especially in terms of synchronization
  overhead, and reactivation more expensive, because reactivated pages will
  have to be reentered into the object's primary splay tree and memq.

  (2) Cached pages are now stored alongside free pages in the physical
  memory allocator's buddy queues, increasing the likelihood that large
  allocations of contiguous physical memory (i.e., superpages) will
  succeed.

  Finally, as a result of this change, long-standing restrictions on when
  and where a cached page can be reclaimed and returned by vm_page_alloc(9)
  are eliminated. Specifically, calls to vm_page_alloc(9) specifying
  VM_ALLOC_INTERRUPT can now reclaim and return a formerly cached page.
  Consequently, a call to malloc(9) specifying M_NOWAIT is less likely to
  fail.

  Discussed with: many over the course of the summer, including jeff@,
                  Justin Husted @ Isilon, peter@, tegge@
  Tested by: an earlier version by kris@
  Approved by: re (kensmith)
  [alc, 2007-09-25; 1 file, -15/+11]
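The data-structure split described in point (1) can be pictured with a small
sketch. The following is illustrative only; the field names are simplified
stand-ins, not the actual FreeBSD definitions.

    #include <sys/queue.h>

    struct vm_page;                          /* forward declarations only */
    struct mtx;

    struct vm_object_sketch {
            struct mtx            *lock;     /* per-object lock */
            struct vm_page        *root;     /* splay root: resident pages */
            TAILQ_HEAD(, vm_page)  memq;     /* resident pages, index order */
            struct vm_page        *cache;    /* splay root: cached pages;
                                                guarded by the free page
                                                queues lock, not by *lock */
    };

Because the cached-page tree hangs off the object but is guarded by the free
page queues lock, vm_page_alloc(9) can steal a cached page while holding
neither the object lock nor the page queues lock, which is exactly the
property the commit relies on.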
* Update a comment describing the page queues.
  Approved by: re (hrs)
  [alc, 2007-07-13; 1 file, -6/+7]
* Enable the new physical memory allocator.

  This allocator uses a binary buddy system with a twist. First and
  foremost, this allocator is required to support the implementation of
  superpages. As a side effect, it enables a more robust implementation of
  contigmalloc(9). Moreover, this reimplementation of contigmalloc(9)
  eliminates the acquisition of Giant by contigmalloc(..., M_NOWAIT, ...).

  The twist is that this allocator tries to reduce the number of TLB misses
  incurred by accesses through a direct map to small, UMA-managed objects
  and page table pages. Roughly speaking, the physical pages that are
  allocated for such purposes are clustered together in the physical
  address space. The performance benefits vary. In the most extreme case, a
  uniprocessor kernel running on an Opteron, I measured an 18% reduction in
  system time during a buildworld.

  This allocator does not implement page coloring. The reason is that
  superpages have much the same effect. The contiguous physical memory
  allocation necessary for a superpage is inherently colored.

  Finally, the one caveat is that this allocator does not effectively
  support prezeroed pages. I hope this is temporary. On i386, this is a
  slight pessimization. However, on amd64, the beneficial effects of the
  direct-map optimization outweigh the ill effects. I speculate that this
  is true in general of machines with a direct map.

  Approved by: re
  [alc, 2007-06-16; 1 file, -45/+21]
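The message describes the allocator only at a high level. For readers
unfamiliar with buddy allocators, here is a minimal sketch of the core
address arithmetic; it is the generic textbook form, not the FreeBSD
implementation, and the PAGE_SHIFT value is an assumption.

    #include <stdint.h>

    #define PAGE_SHIFT      12      /* assumed 4 KB base pages */

    /*
     * A free block of order k spans 2^k contiguous pages.  Its buddy is
     * the equal-sized block whose address differs only in bit
     * (k + PAGE_SHIFT); freeing coalesces a block with its buddy for as
     * long as the buddy is also free, rebuilding the large physically
     * contiguous runs that superpages and contigmalloc(9) need.
     */
    static inline uint64_t
    buddy_of(uint64_t pa, int order)
    {
            return (pa ^ ((uint64_t)1 << (order + PAGE_SHIFT)));
    }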
* Define every architecture as either VM_PHYSSEG_DENSE or VM_PHYSSEG_SPARSE
  depending on whether the physical address space is densely or sparsely
  populated with memory. The effect of this definition is to determine
  which of two implementations of vm_page_array and PHYS_TO_VM_PAGE() is
  used. The legacy implementation is obtained by defining VM_PHYSSEG_DENSE,
  and a new implementation that trades off time for space is obtained by
  defining VM_PHYSSEG_SPARSE.

  For now, all architectures except for ia64 and sparc64 define
  VM_PHYSSEG_DENSE. Defining VM_PHYSSEG_SPARSE on ia64 allows the entirety
  of my Itanium 2's memory to be used. Previously, only the first 1 GB
  could be used. Defining VM_PHYSSEG_SPARSE on sparc64 allows USIIIi-based
  systems to boot without crashing.

  This change is a combination of Nathan Whitehorn's patch and my own work
  in perforce.

  Discussed with: kmacy, marius, Nathan Whitehorn
  PR: 112194
  [alc, 2007-05-05; 1 file, -2/+20]
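A hedged sketch of the two PHYS_TO_VM_PAGE() strategies this option selects
between; the types and names here are illustrative stand-ins, not the actual
FreeBSD code.

    #include <stdint.h>

    #define PAGE_SHIFT      12               /* assumed base page size */
    struct vm_page { int placeholder; };     /* stand-in for the real struct */

    /* VM_PHYSSEG_DENSE: one vm_page for every frame starting at first_phys. */
    extern struct vm_page *vm_page_array;
    extern uint64_t first_phys;

    static inline struct vm_page *
    phys_to_vm_page_dense(uint64_t pa)
    {
            return (&vm_page_array[(pa - first_phys) >> PAGE_SHIFT]);
    }

    /* VM_PHYSSEG_SPARSE: trade lookup time for space by keeping a vm_page
       array per populated physical segment and searching for the segment. */
    struct physseg_sketch {
            uint64_t         start, end;
            struct vm_page  *first_page;
    };
    extern struct physseg_sketch phys_segs[];
    extern int phys_nsegs;

    static inline struct vm_page *
    phys_to_vm_page_sparse(uint64_t pa)
    {
            for (int i = 0; i < phys_nsegs; i++)
                    if (pa >= phys_segs[i].start && pa < phys_segs[i].end)
                            return (&phys_segs[i].first_page[
                                (pa - phys_segs[i].start) >> PAGE_SHIFT]);
            return (NULL);
    }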
* Change the way that unmanaged pages are created. Specifically,
  immediately flag any page that is allocated to an OBJT_PHYS object as
  unmanaged in vm_page_alloc() rather than waiting for a later call to
  vm_page_unmanage(). This allows for the elimination of some uses of the
  page queues lock.

  Change the type of the kernel and kmem objects from OBJT_DEFAULT to
  OBJT_PHYS. This allows us to take advantage of the above change to
  simplify the allocation of unmanaged pages in kmem_alloc() and
  kmem_malloc().

  Remove vm_page_unmanage(). It is no longer used.
  [alc, 2007-02-25; 1 file, -1/+0]
* Change the page's CLEANCHK flag from being a page queue mutex
  synchronized flag to a vm object mutex synchronized flag.
  [alc, 2007-02-22; 1 file, -1/+1]
* Replace PG_BUSY with VPO_BUSY. In other words, changes to the page's busy
  flag, i.e., VPO_BUSY, are now synchronized by the per-vm object lock
  instead of the global page queues lock.
  [alc, 2006-10-22; 1 file, -3/+3]
* Make vm_page_release_contig() static.
  [alc, 2006-09-03; 1 file, -1/+0]
* Refactor vm_page_sleep_if_busy() so that the test for a busy page is
  inlined and a procedure call is made in the rare case, i.e., when it is
  necessary to sleep. In this case, inlining the test actually makes the
  kernel smaller.
  [alc, 2006-08-27; 1 file, -1/+22]
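The fast-path/slow-path split being described follows a common kernel
pattern; a self-contained sketch (simplified names and placeholder flag
values, not the actual vm_page.h code) looks like this:

    #define PG_BUSY         0x0001           /* placeholder bit value */

    struct vm_page_sketch {
            int     flags;                   /* includes PG_BUSY */
            int     busy;                    /* busy count */
    };

    /* Out-of-line slow path; the real version drops locks and sleeps. */
    static int
    vm_page_sleep_if_busy_slow(struct vm_page_sketch *m, int also_m_busy,
        const char *msg)
    {
            (void)m; (void)also_m_busy; (void)msg;
            return (1);                      /* report that we "slept" */
    }

    /* Inlined fast path: in the common (not busy) case this is just a
       couple of tests and no function call at all. */
    static inline int
    vm_page_sleep_if_busy_sketch(struct vm_page_sketch *m, int also_m_busy,
        const char *msg)
    {
            if ((m->flags & PG_BUSY) != 0 || (also_m_busy && m->busy != 0))
                    return (vm_page_sleep_if_busy_slow(m, also_m_busy, msg));
            return (0);
    }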
* The return value from vm_pageq_add_new_page() is not used. Eliminate it.
  [alc, 2006-08-25; 1 file, -1/+1]
* Reimplement the page's NOSYNC flag as an object-synchronized instead of a
  page queues-synchronized flag. Reduce the scope of the page queues lock
  in vm_fault() accordingly. Move vm_fault()'s call to
  vm_object_set_writeable_dirty() outside of the scope of the page queues
  lock.
  Reviewed by: tegge

  Additionally, eliminate an unnecessary dereference in computing the
  argument that is passed to vm_object_set_writeable_dirty().
  [alc, 2006-08-13; 1 file, -1/+1]
* Introduce a field to struct vm_page for storing flags that are
  synchronized by the lock on the object containing the page. Transition
  PG_WANTED and PG_SWAPINPROG to use the new field, eliminating the need
  for holding the page queues lock when setting or clearing these flags.
  Rename PG_WANTED and PG_SWAPINPROG to VPO_WANTED and VPO_SWAPINPROG,
  respectively.

  Eliminate the assertion that the page queues lock is held in
  vm_page_io_finish(). Eliminate the acquisition and release of the page
  queues lock around calls to vm_page_io_finish() in kern_sendfile() and
  vfs_unbusy_pages().
  [alc, 2006-08-09; 1 file, -2/+10]
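A small sketch of the idea (illustrative field names, widths, and
placeholder bit values, not the literal struct vm_page):

    #define VPO_WANTED      0x0002   /* placeholder: someone waits for page */
    #define VPO_SWAPINPROG  0x0200   /* placeholder: swap I/O in progress */

    struct vm_page_sketch {
            /* ... */
            unsigned short  oflags;  /* synchronized by the owning object's
                                        lock; the VPO_* flags live here */
            unsigned short  flags;   /* synchronized by the global page
                                        queues lock, as before */
            /* ... */
    };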
* With the recent changes to the implementation of page coloring, the
  option PQ_NOOPT is used exclusively by vm_pageq.c. Thus, the include of
  opt_vmpage.h can be removed from vm_page.h.
  [alc, 2006-01-24; 1 file, -4/+0]
* MI changes:
  - Provide an interface (macros) to the page coloring part of the VM
    system; this allows trying different coloring algorithms without the
    need to touch every file. [1]
  - Make the page queue tuning values readable: sysctl vm.stats.pagequeue.
  - Autotune the page coloring values based upon the cache size instead of
    options in the kernel config (disabling page coloring as a kernel
    option is still possible).

  MD changes:
  - Detection of the cache size: only IA32 and AMD64 (untested) contain
    cache size detection code; every other arch just comes with a dummy
    function (this results in the use of default values, as was the case
    without the autotuning of the page coloring).
  - Print some more info on Intel CPUs (like we do on AMD and Transmeta
    CPUs).

  Note to AMD owners (IA32 and AMD64): please run "sysctl
  vm.stats.pagequeue" and report if the cache* values are zero (= bug in
  the cache detection code) or not.

  Based upon work by: Chad David <davidc@acns.ab.ca> [1]
  Reviewed by: alc, arch (in 2004)
  Discussed with: alc, Chad David, arch (in 2004)
  [netchild, 2005-12-31; 1 file, -67/+48]
* Don't perform a nested include of opt_vmpage.h if LIBMEMSTAT is defined,
  as opt_vmpage.h will not be available to user space library builds. A
  similar existing check is present for KLD_MODULE for similar reasons.
  MFC after: 3 days
  [rwatson, 2005-08-04; 1 file, -1/+1]
* /* -> /*- for license, minor formatting changes
  [imp, 2005-01-07; 1 file, -1/+1]
* Note that access to the page's busy count is synchronized by the
  containing object's lock.
  [alc, 2004-12-27; 1 file, -1/+1]
* Introduce VM_ALLOC_NOBUSY, an option to vm_page_alloc() and
  vm_page_grab() that indicates that the caller does not want a page with
  its busy flag set. In many places, the global page queues lock is
  acquired and released just to clear the busy flag on a just allocated
  page. Both the allocation of the page and the clearing of the busy flag
  occur while the containing vm object is locked. So, the busy flag might
  as well never be set.
  [alc, 2004-10-24; 1 file, -0/+1]
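A usage sketch in kernel context (illustrative, not taken from the tree; it
assumes the usual VM headers and locking macros of that era): grab a page
without its busy flag set, so no later trip through the page queues lock is
needed just to unbusy it.

    #include <sys/param.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>
    #include <vm/vm.h>
    #include <vm/vm_object.h>
    #include <vm/vm_page.h>

    static vm_page_t
    grab_unbusied_page(vm_object_t object, vm_pindex_t pindex)
    {
            vm_page_t m;

            VM_OBJECT_LOCK(object);
            /* The page is returned without its busy flag set. */
            m = vm_page_grab(object, pindex,
                VM_ALLOC_NORMAL | VM_ALLOC_RETRY | VM_ALLOC_NOBUSY);
            VM_OBJECT_UNLOCK(object);
            return (m);
    }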
* Move the cow field between wire_count and hold_count. This is the
  position that is 64-bit aligned and makes sure that the valid and dirty
  fields are also 64-bit aligned. This means that if PAGE_SIZE is 32K, the
  size of the vm_page structure is only increased by 8 bytes instead of 16
  bytes. More importantly, the vm_page structure is either 120 or 128 bytes
  on ia64. These are "interesting" sizes.
  [marcel, 2004-08-22; 1 file, -1/+1]
* Reimplement contigmalloc(9) with an algorithm which stands a
  greatly-improved chance of working despite pressure from running
  programs. Instead of trying to throw a bunch of pages out to swap and
  hope for the best, only a range that can potentially fulfill
  contigmalloc(9)'s request will have its contents paged out (potentially,
  not forcibly) at a time.

  The new contigmalloc operation still operates in three passes, but it
  could potentially be tuned to more or less. The first pass only looks at
  pages in the cache and free pages, so they would be thrown out without
  having to block. If this is not enough, the subsequent passes page out
  any unwired memory. To combat memory pressure refragmenting the section
  of memory being laundered, each page is removed from the system's free
  memory queue once it has been freed so that blocking later doesn't cause
  the memory laundered so far to get reallocated.

  The page-out operations are now blocking, as it would make little sense
  to try to push out a page, then get its status immediately afterward to
  remove it from the available free pages queue, if it's unlikely to have
  been freed. Another change is that if KVA allocation fails, the allocated
  memory segment will be freed and not leaked.

  There is a sysctl/tunable, defaulting to on, which causes the old
  contigmalloc() algorithm to be used. Nonetheless, I have been using
  vm.old_contigmalloc=0 for over a month. It is safe to switch at run-time
  to see the difference it makes.

  A new interface has been used which does not require mapping the
  allocated pages into KVA: vm_page.h functions vm_page_alloc_contig() and
  vm_page_release_contig(). These are what vm.old_contigmalloc=0 uses
  internally, so the sysctl/tunable does not affect their operation.

  When using the contigmalloc(9) and contigfree(9) interfaces, memory is
  now tracked with malloc(9) stats. Several functions have been exported
  from kern_malloc.c to allow other subsystems to use these statistics, as
  well. This invalidates the BUGS section of the contigmalloc(9) manpage.
  [green, 2004-07-19; 1 file, -0/+3]
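For context, a usage sketch of the contigmalloc(9)/contigfree(9) interface
this work reimplements (kernel context; the malloc type and sizes are made
up for the example):

    #include <sys/param.h>
    #include <sys/kernel.h>
    #include <sys/malloc.h>

    MALLOC_DEFINE(M_EXAMPLE, "example", "contigmalloc(9) example buffers");

    static void *
    alloc_dma_buffer(void)
    {
            /* 64 KB, physically contiguous, below 4 GB, page aligned,
               no boundary-crossing restriction.  Accounted for in
               malloc(9) statistics, per the commit above. */
            return (contigmalloc(64 * 1024, M_EXAMPLE, M_WAITOK,
                0, 0xffffffffUL, PAGE_SIZE, 0));
    }

    static void
    free_dma_buffer(void *buf)
    {
            contigfree(buf, 64 * 1024, M_EXAMPLE);
    }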
* Update stale comments regarding page coloring.
  [alc, 2004-06-05; 1 file, -10/+10]
* Move the definitions of SWAPBLK_NONE and SWAPBLK_MASK from vm_page.h to
  blist.h, enabling the removal of numerous #includes from subr_blist.c.
  (subr_blist.c and swap_pager.c are the only users of these definitions.)
  [alc, 2004-06-04; 1 file, -8/+0]
* Remove a stale comment: PG_DIRTY and PG_FILLED were removed in revisions
  1.17 and 1.12 respectively.
  [alc, 2004-05-30; 1 file, -2/+0]
* Remove advertising clause from University of California Regent's
  license, per letter dated July 22, 1999.
  Approved by: core
  [imp, 2004-04-06; 1 file, -4/+0]
* Eliminate unused arguments from vm_page_startup().
  [alc, 2004-04-04; 1 file, -1/+1]
* Remove some long unused definitions.
  [alc, 2004-03-04; 1 file, -2/+0]
* - Align a comment within struct vm_page.
  - Annotate the vm_page's valid field as synchronized by the containing
    vm object's lock.
  [alc, 2003-10-25; 1 file, -5/+5]
* - Retire vm_pageout_page_free(). Instead, use vm_page_select_cache() from
    vm_pageout_scan(). Rationale: I don't like leaving a busy page in the
    cache queue with neither the vm object nor the vm page queues lock
    held.
  - Assert that the page is active in vm_pageout_page_stats().
  [alc, 2003-10-22; 1 file, -0/+1]
* - Remove some long unused code.
  [alc, 2003-10-20; 1 file, -1/+0]
* Retire vm_page_copy(). Its reason for being ended when peter@ modified
  pmap_copy_page() et al. to accept a vm_page_t rather than a physical
  address. Also, this change will facilitate locking access to the vm
  page's valid field.
  [alc, 2003-10-08; 1 file, -1/+0]
* Assert that u_long is at least 64 bits if PAGE_SIZE is 32K.
  Suggested by: phk
  [marcel, 2003-08-25; 1 file, -0/+7]
* Also define VM_PAGE_BITS_ALL for 16K and 32K pages. Make the constant
  unsigned for all page sizes and unsigned long for 32K pages.
  [marcel, 2003-08-23; 1 file, -5/+7]
* Add support for 16K and 32K page sizes. The valid and dirty maps in
  struct vm_page are defined as u_int for 16K pages and u_long for 32K
  pages, with the implied assumption that long will at least be 64 bits
  wide on platforms where we support 32K pages.
  [marcel, 2003-08-23; 1 file, -0/+6]
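Taken together, the three page-size commits above amount to something like
the following sketch (illustrative, not the literal header text; CTASSERT is
the kernel's compile-time assertion). Each valid/dirty bit covers one
DEV_BSIZE (512-byte) chunk of the page, so a 16K page needs 32 bits and a
32K page needs 64.

    #if PAGE_SIZE == 32768
    CTASSERT(sizeof(u_long) >= 8);          /* need 64 valid/dirty bits */
    typedef u_long  vm_page_bits_sketch_t;
    #define VM_PAGE_BITS_ALL        0xfffffffffffffffful
    #elif PAGE_SIZE == 16384
    typedef u_int   vm_page_bits_sketch_t;
    #define VM_PAGE_BITS_ALL        0xffffffffu
    #else                                   /* e.g. 4K pages: 8 chunks */
    typedef u_char  vm_page_bits_sketch_t;
    #define VM_PAGE_BITS_ALL        0xffu
    #endif

    struct vm_page_sketch {
            /* ... */
            vm_page_bits_sketch_t   valid;  /* one bit per DEV_BSIZE chunk */
            vm_page_bits_sketch_t   dirty;
            /* ... */
    };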
* - Add vm_paddr_t, a physical address type. This is required for systems
    where physical addresses are larger than virtual addresses, such as
    i386s with PAE.
  - Use this to represent physical addresses in the MI vm system and in
    the i386 pmap code. This also changes the paddr parameter to d_mmap_t.
  - Fix printf formats to handle physical addresses >4G in the i386 memory
    detection code, and due to kvtop returning vm_paddr_t instead of
    u_long.

  Note that this is a name change only; vm_paddr_t is still the same as
  vm_offset_t on all currently supported platforms.

  Sponsored by: DARPA, Network Associates Laboratories
  Discussed with: re, phk (cdevsw change)
  [jake, 2003-03-25; 1 file, -2/+2]
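What the new type buys is easiest to see on i386 with PAE, where physical
addresses are wider than virtual ones. A hedged sketch of the idea (widths
illustrative, not copied from the machine headers):

    #include <stdint.h>

    typedef uint32_t vm_offset_t;    /* virtual addresses stay 32-bit */

    #ifdef PAE
    typedef uint64_t vm_paddr_t;     /* physical addresses may exceed 4 GB */
    #else
    typedef uint32_t vm_paddr_t;     /* same width as vm_offset_t */
    #endif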
* - Remove vm_page_sleep_busy(). The transition to vm_page_sleep_if_busy(),
    which incorporates page queue and field locking, is complete.
  - Assert that the page queue lock rather than Giant is held in
    vm_page_flag_set().
  [alc, 2002-12-19; 1 file, -1/+0]
* Remove vm_page_protect(). Instead, use pmap_page_protect() directly.
  [alc, 2002-11-18; 1 file, -1/+0]
* Export the function vm_page_splay().
  [alc, 2002-11-04; 1 file, -0/+1]
* - Add a new flag to vm_page_alloc, VM_ALLOC_NOOBJ. This tells
    vm_page_alloc not to insert this page into an object. The pindex is
    still used for colorization.
  - Rework vm_page_select_* to accept a color instead of an object and
    pindex to work with VM_PAGE_NOOBJ.
  - Document other VM_ALLOC_ flags.
  Reviewed by: peter, jake
  [jeff, 2002-11-01; 1 file, -3/+4]
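A usage sketch (kernel context, illustrative): allocate a page that belongs
to no object, passing NULL for the object and using the pindex argument
purely to steer page coloring.

    #include <sys/param.h>
    #include <vm/vm.h>
    #include <vm/vm_page.h>

    static vm_page_t
    alloc_anonymous_page(vm_pindex_t color)
    {
            /* No object: the page is not entered into any object's page
               tree; "color" only influences which page color is chosen. */
            return (vm_page_alloc(NULL, color,
                VM_ALLOC_NOOBJ | VM_ALLOC_SYSTEM | VM_ALLOC_WIRED));
    }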
* o Reinline vm_page_undirty(), reducing the kernel size. (This reverts a
    part of vm_page.h revision 1.87 and vm_page.c revision 1.167.)
  [alc, 2002-10-20; 1 file, -1/+11]
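The helper being reinlined is a one-line store, which is the point of
putting it back in the header; its shape is essentially the following (a
close sketch, not a verbatim quote of vm_page.h):

    static __inline void
    vm_page_undirty(vm_page_t m)
    {
            /* Declare the whole page clean; pmap-level modified bits are
               deliberately left untouched. */
            m->dirty = 0;
    }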
* Replace the vm_page hash table with a per-vmobject splay tree. There
  should be no major change in performance from this change at this time,
  but this will allow other work to progress: Giant lock removal around
  the VM system in favor of per-object mutexes, ranged fsyncs, more
  optimal COMMIT rpc's for NFS, partial filesystem syncs by the syncer,
  more optimal object flushing, etc. Note that the buffer cache is already
  using a similar splay tree mechanism.

  Note that a good chunk of the old hash table code is still in the tree.
  Alan or I will remove it prior to the release if the new code does not
  introduce unsolvable bugs, else we can revert more easily.

  Submitted by: alc (this is Alan's code)
  Approved by: re
  [dillon, 2002-10-18; 1 file, -1/+2]
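A sketch of how a lookup runs against the per-object splay tree (simplified
from the code this change introduces; the object lock assertion and other
details are omitted). vm_page_splay() splays the tree rooted at its second
argument around pindex and returns the new root, so the wanted page, if
present, ends up at the root.

    #include <sys/param.h>
    #include <vm/vm.h>
    #include <vm/vm_object.h>
    #include <vm/vm_page.h>

    static vm_page_t
    vm_page_lookup_sketch(vm_object_t object, vm_pindex_t pindex)
    {
            vm_page_t m;

            if ((m = object->root) != NULL && m->pindex != pindex) {
                    m = vm_page_splay(pindex, m);
                    if ((object->root = m)->pindex != pindex)
                            m = NULL;
            }
            return (m);
    }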
* - Split UMA_ZFLAG_OFFPAGE into UMA_ZFLAG_OFFPAGE and UMA_ZFLAG_HASH.
  - Remove all instances of the mallochash.
  - Stash the slab pointer in the vm page's object pointer when allocating
    from the kmem_obj.
  - Use the overloaded object pointer to find slabs for malloced memory.
  [jeff, 2002-09-18; 1 file, -0/+1]
* o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever since
    pmap_zero_page() and pmap_zero_page_area() were modified to accept a
    struct vm_page * instead of a physical address, vm_page_zero_fill()
    and vm_page_zero_fill_area() have served no purpose.
  [alc, 2002-08-25; 1 file, -2/+0]
* o Remove the setting and clearing of the PG_MAPPED flag from the alpha
    and ia64 pmap.
  o Remove the PG_MAPPED flag's declaration.
  [alc, 2002-08-10; 1 file, -1/+0]
* o Introduce vm_page_sleep_if_busy() as an eventual replacement for
    vm_page_sleep_busy(). vm_page_sleep_if_busy() uses the page queues
    lock.
  [alc, 2002-07-29; 1 file, -0/+1]
* o Modify vm_page_grab() to accept VM_ALLOC_WIRED.
  [alc, 2002-07-28; 1 file, -1/+1]
* o Remove dead and/or unused code.
  [alc, 2002-07-20; 1 file, -2/+0]
* o Introduce an argument, VM_ALLOC_WIRED, that requests vm_page_alloc()
    to return a wired page.
  o Use VM_ALLOC_WIRED within Alpha's pmap_growkernel(). Also, because
    Alpha's pmap_growkernel() calls vm_page_alloc() from within a critical
    section, specify VM_ALLOC_INTERRUPT instead of VM_ALLOC_SYSTEM. (Only
    VM_ALLOC_INTERRUPT is implemented entirely with a spin mutex.)
  o Assert that the page queues mutex is held in vm_page_wire() on Alpha,
    just like the other platforms.
  [alc, 2002-07-18; 1 file, -1/+5]
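A usage sketch of the new flag (kernel context; "kptobj" and the surrounding
function are hypothetical): from a context that must not sleep, ask
vm_page_alloc() for a page that is already wired on return.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <vm/vm.h>
    #include <vm/vm_page.h>

    static vm_page_t
    grow_kernel_page(vm_object_t kptobj, vm_pindex_t pindex)
    {
            vm_page_t m;

            /* VM_ALLOC_INTERRUPT: usable where sleeping is not allowed;
               VM_ALLOC_WIRED: the page comes back wired, so no separate
               vm_page_wire() call (and page queues lock trip) is needed. */
            m = vm_page_alloc(kptobj, pindex,
                VM_ALLOC_INTERRUPT | VM_ALLOC_WIRED);
            if (m == NULL)
                    panic("grow_kernel_page: out of memory");
            return (m);
    }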