path: root/sys/vm
...
* Add support to the virtual memory system for configuring machine-dependent memory attributes  (alc, 2009-07-12, 10 files, -43/+125)

  Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the fact that there are machine-dependent memory attributes that have nothing to do with controlling the cache's behavior.

  Introduce vm_object_set_memattr() for setting the default memory attributes that will be given to an object's pages.

  Introduce and use pmap_page_{get,set}_memattr() for getting and setting a page's machine-dependent memory attributes. Add full support for these functions on amd64 and i386 and stubs for them on the other architectures. The function pmap_page_set_memattr() is also responsible for any other machine-dependent aspects of changing a page's memory attributes, such as flushing the cache or updating the direct map.

  The uses include kmem_alloc_contig(), vm_page_alloc(), and the device pager: kmem_alloc_contig() can now be used to allocate kernel memory with non-default memory attributes on amd64 and i386; vm_page_alloc() and the device pager will set the memory attributes for the real or fictitious page according to the object's default memory attributes.

  Update the various pmap functions on amd64 and i386 that map pages to incorporate each page's memory attributes in the mapping.

  Notes: (1) Inherent to this design are safety features that prevent the specification of inconsistent memory attributes by different mappings on amd64 and i386. In addition, the device pager provides a warning when a device driver creates a fictitious page with memory attributes that are inconsistent with the real page that the fictitious page is an alias for. (2) Storing the machine-dependent memory attributes for amd64 and i386 as a dedicated "int" in "struct md_page" represents a compromise between space efficiency and the ease of MFCing these changes to RELENG_7.

  In collaboration with: jhb
  Approved by: re (kib)
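  A minimal sketch of using the renamed interface, assuming this era's kmem_alloc_contig() signature and the x86 VM_MEMATTR_WRITE_COMBINING constant (on architectures with only the stubs, VM_MEMATTR_DEFAULT remains the sole attribute):

      #include <sys/param.h>
      #include <sys/malloc.h>
      #include <vm/vm.h>
      #include <vm/vm_kern.h>
      #include <vm/vm_extern.h>

      /* Allocate a physically contiguous, write-combined buffer below
       * 4GB, e.g. for a framebuffer-style driver. */
      static vm_offset_t
      alloc_wc_buffer(vm_size_t size)
      {

              return (kmem_alloc_contig(kernel_map, size, M_WAITOK | M_ZERO,
                  0, 0xffffffff, PAGE_SIZE, 0, VM_MEMATTR_WRITE_COMBINING));
      }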
* When VM_MAP_WIRE_HOLESOK is not specified and vm_map_wire(9) encounters a non-readable and non-executable map entry  (kib, 2009-07-12, 1 file, -1/+1)

  In that case the entry is skipped from wiring and the loop is aborted. But, since MAP_ENTRY_WIRE_SKIPPED was not set for the map entry, its wired_count was later erroneously decremented, and vm_map_delete(9) for such a map entry got stuck in "vmmaps". Properly set MAP_ENTRY_WIRE_SKIPPED when aborting the loop.

  Reported by: John Marshall <john.marshall riverwillow com au>
  Approved by: re (kensmith)
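  A hedged paraphrase of the fix in context; MAP_ENTRY_WIRE_SKIPPED and the abort behavior are from the commit message, the surrounding lines are assumed:

      /* In vm_map_wire(), on an entry that cannot be wired: */
      if ((entry->protection & (VM_PROT_READ | VM_PROT_EXECUTE)) == 0) {
              entry->eflags |= MAP_ENTRY_WIRE_SKIPPED;  /* the missing line */
              end = entry->end;
              rv = KERN_INVALID_ADDRESS;
              goto done;  /* abort; wired_count is no longer decremented later */
      }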
* When forking a vm space that has wired map entries, do not forget to charge the objects created by vm_fault_copy_entry()  (kib, 2009-07-03, 3 files, -12/+16)

  The object charge was set, but the reserve was not incremented.

  Reported by: Greg Rivers <gcr+freebsd-current tharned org>
  Reviewed by: alc (previous version)
  Approved by: re (kensmith)
* Eliminate code duplication by calling vm_object_destroy() from vm_object_collapse()  (kib, 2009-06-28, 1 file, -18/+4)

  Requested and reviewed by: alc
  Approved by: re (kensmith)
* This change is the next step in implementing the cache control functionality required by video card drivers  (alc, 2009-06-26, 5 files, -6/+16)

  Specifically, this change introduces vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all architectures. In addition, this change adds a vm_cache_mode_t parameter to kmem_alloc_contig() and vm_phys_alloc_contig(). These will be the interfaces for allocating mapped kernel memory and physical memory, respectively, with non-default cache modes.

  In collaboration with: jhb
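  The extended interfaces, as described above (parameter order is assumed, not copied from the tree; the type is renamed to vm_memattr_t by the 2009-07-12 entry above):

      vm_offset_t kmem_alloc_contig(vm_map_t map, vm_size_t size, int flags,
          vm_paddr_t low, vm_paddr_t high, unsigned long alignment,
          unsigned long boundary, vm_cache_mode_t mode);

      vm_page_t vm_phys_alloc_contig(unsigned long npages, vm_paddr_t low,
          vm_paddr_t high, unsigned long alignment, unsigned long boundary,
          vm_cache_mode_t mode);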
* Change the type of the uio_resid member of struct uio from int to ssize_t  (kib, 2009-06-25, 1 file, -1/+1)

  Note that this does not actually enable full-range i/o requests for 64-bit architectures; it is done now to update the KBI only.

  Tested by: pho
  Reviewed by: jhb, bde (as part of the review of the bigger patch)
* Initialize uip to silence a gcc warning that sneaks in under some build environments  (kib, 2009-06-24, 1 file, -0/+1)

  Reported by: alc, bf1783 at googlemail com
* The bits set in a page's dirty mask are a subset of the bits set in its valid mask  (alc, 2009-06-24, 2 files, -10/+8)

  Consequently, there is no need to perform a bit-wise and of the page's dirty and valid masks in order to determine which parts of a page are dirty and valid. Eliminate an unnecessary #include.
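  In code terms (an illustrative fragment, not taken from the diff):

      /* Before: masked the dirty bits with the valid bits. */
      pagebits = m->dirty & m->valid;

      /* After: dirty is already a subset of valid, so this suffices. */
      pagebits = m->dirty;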
* Implement global and per-uid accounting of the anonymous memory  (kib, 2009-06-23, 16 files, -61/+575)

  Add the rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.

  The accounting information (charge) is associated with either the map entry, or the vm object backing the entry, assuming the object is the first one in the shadow chain and the entry does not require COW. The charge is moved from the entry to the object on allocation of the object, e.g. during mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to the COW setup.

  The per-entry granularity of accounting makes the charge process fair for processes that change uid during their lifetime, and decrements the charge for the proper uid when a region is unmapped.

  The interface of vm_pager_allocate(9) is extended by adding a struct ucred *, which is used to charge the appropriate uid when the allocation is performed by the kernel, e.g. md(4).

  Several syscalls, among them fork(2), may now return ENOMEM when global or per-uid limits are enforced.

  In collaboration with: pho
  Reviewed by: alc
  Approved by: re (kensmith)
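  A minimal userland sketch of the new resource limit; RLIMIT_SWAP is from the commit, while the 64 MB cap is an arbitrary example value:

      #include <sys/types.h>
      #include <sys/resource.h>
      #include <err.h>
      #include <stdint.h>
      #include <stdio.h>

      int
      main(void)
      {
              struct rlimit rl;

              if (getrlimit(RLIMIT_SWAP, &rl) == -1)
                      err(1, "getrlimit");
              rl.rlim_cur = 64UL * 1024 * 1024;  /* reserve at most 64 MB of swap */
              if (setrlimit(RLIMIT_SWAP, &rl) == -1)
                      err(1, "setrlimit");
              printf("swap reservation limit: %ju bytes\n",
                  (uintmax_t)rl.rlim_cur);
              return (0);
      }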
* Validate the page in one place, dev_pager_getpages(), rather than doing it in two places, dev_pager_getfake() and dev_pager_updatefake()  (alc, 2009-06-22, 1 file, -7/+6)

  Compare a pointer to "NULL" rather than "0".
* Implement a mechanism within vm_phys_alloc_contig() to defer all necessary calls to vdrop() until after the free page queues lock is released  (alc, 2009-06-21, 1 file, -9/+20)

  This eliminates repeatedly releasing and reacquiring the free page queues lock each time the last cached page is reclaimed from a vnode-backed object.
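  The deferral pattern, sketched under two assumptions: that the free page queues lock of this era is vm_page_queue_free_mtx, and that a small fixed array (VPC_RUNS here is hypothetical) bounds the deferred vnodes:

      struct vnode *deferred[VPC_RUNS];
      int i, n;

      n = 0;
      mtx_lock(&vm_page_queue_free_mtx);
      /* ... while reclaiming cached pages, record each backing vnode
       * instead of calling vdrop() with the lock held ... */
      deferred[n++] = vp;
      mtx_unlock(&vm_page_queue_free_mtx);
      for (i = 0; i < n; i++)
              vdrop(deferred[i]);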
* Strive for greater consistency among the places that implement real, fictitious, and contiguous page allocation  (alc, 2009-06-21, 3 files, -13/+18)

  Eliminate unnecessary reinitialization of a page's fields.
* Track the kernel mapping of a physical page by a new entry in the vm_page structure  (thompsa, 2009-06-18, 1 file, -2/+1)

  When the page is shared, the kernel mapping becomes a special type of managed page to force the cache off the page mappings. This is needed to avoid stale entries on all ARM VIVT caches, and on VIPT caches with cache color issues.

  Submitted by: Mark Tinguely
  Reviewed by: alc
  Tested by: Grzegorz Bernacki, thompsa
* Add support for UMA_SLAB_KERNEL to page_free()  (alc, 2009-06-18, 1 file, -2/+4)

  (While I'm here, remove an unnecessary newline character from the end of two panic messages.)
* Eliminate unnecessary forward declarations.  (alc, 2009-06-17, 1 file, -3/+0)
* Refactor contigmalloc() into two functions: a simple front-end that deals with the malloc tag and calls a new back-end, kmem_alloc_contig(), that allocates the pages and maps them  (alc, 2009-06-17, 2 files, -8/+22)

  The motivations for this change are two-fold: (1) A cache mode parameter will be added to kmem_alloc_contig(). In other words, kmem_alloc_contig() will be extended to support the allocation of memory with caller-specified caching. (2) The UMA allocation function that is used by the two jumbo frames zones can use kmem_alloc_contig() in place of contigmalloc() and thereby avoid having free jumbo frames held by the zone counted as live malloc()ed memory.
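  A sketch of the resulting split; malloc_type_allocated() is the existing malloc(9) accounting hook, and the back-end's exact parameter list is assumed (the cache mode parameter promised in motivation (1) arrives in the 2009-06-26 entry above):

      #include <sys/param.h>
      #include <sys/malloc.h>
      #include <vm/vm.h>
      #include <vm/vm_kern.h>
      #include <vm/vm_extern.h>

      void *
      contigmalloc(unsigned long size, struct malloc_type *type, int flags,
          vm_paddr_t low, vm_paddr_t high, unsigned long alignment,
          unsigned long boundary)
      {
              void *ret;

              /* Back-end: allocate the pages and map them. */
              ret = (void *)kmem_alloc_contig(kernel_map, size, flags, low,
                  high, alignment, boundary);
              /* Front-end: only deal with the malloc tag. */
              if (ret != NULL)
                      malloc_type_allocated(type, round_page(size));
              return (ret);
      }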
* Pass the size of the mapping to contigmapping() as a "vm_size_t" rather than a "vm_pindex_t"  (alc, 2009-06-17, 1 file, -16/+13)

  A "vm_size_t" is more convenient for it to use.
* Make the maintenance of a page's valid bits by contigmalloc() more like kmem_alloc() and kmem_malloc()  (alc, 2009-06-17, 2 files, -3/+6)

  Specifically, defer the setting of the page's valid bits until contigmapping() when the mapping is known to be successful.
* Long, long ago in r27464 special case code for mapping device-backed memory with 4MB pages was added to pmap_object_init_pt()  (alc, 2009-06-14, 2 files, -0/+50)

  This code assumes that the pages of an OBJT_DEVICE object are always physically contiguous. Unfortunately, this is not always the case. For example, jhb@ informs me that the recently introduced /dev/ksyms driver creates an OBJT_DEVICE object that violates this assumption. Thus, this revision modifies pmap_object_init_pt() to abort the mapping if the OBJT_DEVICE object's pages are not physically contiguous.

  This revision also changes some inconsistent if not buggy behavior. For example, the i386 version aborts if the first 4MB virtual page that would be mapped is already valid. However, it incorrectly replaces any subsequent 4MB virtual page mappings that it encounters, potentially leaking a page table page. The amd64 version has a bug of my own creation. It potentially busies the wrong page, and always an insufficient number of pages, if it blocks allocating a page table page.

  To my knowledge, there have been no reports of these bugs, hence their persistence. I suspect that the existing restrictions that pmap_object_init_pt() placed on the OBJT_DEVICE objects that it would choose to map, for example, that the first page must be aligned on a 2 or 4MB physical boundary and that the size of the mapping must be a multiple of the large page size, were enough to avoid triggering the bug for drivers like ksyms.

  However, one side effect of testing the OBJT_DEVICE object's pages for physical contiguity is that a dubious difference between pmap_object_init_pt() and the standard path for mapping device pages, i.e., vm_fault(), has been eliminated. Previously, pmap_object_init_pt() would only instantiate the first PG_FICTITIOUS page being mapped because it never examined the rest. Now, however, pmap_object_init_pt() uses the new function vm_object_populate() to instantiate them all (in order to support testing their physical contiguity). These pages need to be instantiated for the mechanism that I have prototyped for automatically maintaining the consistency of the PAT settings across multiple mappings, particularly amd64's direct mapping, to work. (Translation: This change is also being made to support jhb@'s work on the Nvidia feature requests.)

  Discussed with: jhb@
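  A hedged sketch of the contiguity test (the committed code lives in pmap_object_init_pt() and the new vm_object_populate(); this paraphrase assumes the object lock is held and the object is fully populated):

      #include <vm/vm.h>
      #include <vm/vm_object.h>
      #include <vm/vm_page.h>

      static boolean_t
      object_is_phys_contig(vm_object_t object)
      {
              vm_page_t p;
              vm_paddr_t pa;

              p = TAILQ_FIRST(&object->memq);
              if (p == NULL)
                      return (FALSE);
              pa = VM_PAGE_TO_PHYS(p);
              /* memq is ordered by pindex, so successive resident pages
               * must also be successive physical pages. */
              TAILQ_FOREACH(p, &object->memq, listq) {
                      if (VM_PAGE_TO_PHYS(p) != pa)
                              return (FALSE);  /* abort the 4MB mapping */
                      pa += PAGE_SIZE;
              }
              return (TRUE);
      }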
* Eliminate an unnecessary clearing of a page's dirty bits in phys_pager_getpages().  (alc, 2009-06-13, 1 file, -1/+2)
* Eliminate an unnecessary restriction on the vm object type from vm_map_pmap_enter()  (alc, 2009-06-09, 1 file, -4/+2)

  The immediate effect of this change is that automatic prefaulting by mmap() for small mappings is performed on POSIX shared memory objects just the same as it is on ordinary files.
* Eliminate unnecessary obfuscation when testing a page's valid bits.  (alc, 2009-06-07, 3 files, -6/+5)
* Eliminate an unneeded forward declaration  (alc, 2009-06-06, 1 file, -2/+0)

  (This should have been removed in revision 1.42.)
* If vm_pager_get_pages() returns VM_PAGER_OK, then there is no need to check the page's valid bits  (alc, 2009-06-06, 1 file, -1/+1)

  The page is guaranteed to be fully valid. (For the record, this is documented in vm/vm_pager.h's comments.)
* vm_thread_swapin() needn't validate any pages  (alc, 2009-06-05, 1 file, -1/+0)

  The pages are already validated by vm_pager_get_pages().
* Simplify contigfree().  (alc, 2009-06-05, 1 file, -3/+1)
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERICrwatson2009-06-052-2/+0
| | | | | | | | and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd
* Correct a boundary case error in the management of a page's dirty bits by shm_dotruncate() and vnode_pager_setsize()  (alc, 2009-06-02, 1 file, -10/+16)

  Specifically, if the length of a shared memory object or a file is truncated such that the length modulo the page size is between 1 and 511, then all of the page's dirty bits were cleared. Now, a dirty bit is cleared only if the corresponding block is truncated in its entirety.
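  The corrected clearing, sketched as a fragment; vm_page_clear_dirty() and roundup2() are existing helpers, while the variable names are assumed:

      /* "m" straddles the new end of file; "base" is the new length
       * modulo the page size, here in (0, PAGE_SIZE). */
      int base = (int)(newlength & PAGE_MASK);

      if (base != 0) {
              /* A partially truncated DEV_BSIZE block keeps its dirty
               * bit; only blocks truncated in their entirety are cleared. */
              base = roundup2(base, DEV_BSIZE);
              vm_page_clear_dirty(m, base, PAGE_SIZE - base);
      }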
* Add an extension to the character device interface that allows character device drivers to use arbitrary VM objects to satisfy individual mmap() requests  (jhb, 2009-06-01, 1 file, -58/+47)

  - A new d_mmap_single(cdev, &foff, objsize, &object, prot) callback is added to cdevsw. This function is called for each mmap() request. If it returns ENODEV, then the mmap() request will fall back to using the device's device pager object and d_mmap(). Otherwise, the method can return a VM object to satisfy this entire mmap() request via *object. It can also modify the starting offset into this object via *foff. This allows device drivers to use the file offset as a cookie to identify specific VM objects. (A driver-side sketch follows this entry.)

  - vm_mmap_vnode() has been changed to call vm_mmap_cdev() directly when mapping VCHR vnodes. This avoids duplicating all the cdev mmap handling code and simplifies some of vm_mmap_vnode().

  - D_VERSION has been bumped to D_VERSION_02. Older device drivers using D_VERSION_01 are still supported.

  MFC after: 1 month
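  A minimal driver-side sketch of the hook; the d_mmap_single() signature follows the commit message, while mydev_softc and its fields are hypothetical:

      #include <sys/param.h>
      #include <sys/conf.h>
      #include <sys/errno.h>
      #include <vm/vm.h>
      #include <vm/vm_object.h>

      struct mydev_softc {
              struct vm_object *obj;     /* driver-managed VM object */
              vm_ooffset_t      obj_size;
      };

      static int
      mydev_mmap_single(struct cdev *cdev, vm_ooffset_t *foff,
          vm_size_t objsize, struct vm_object **object, int nprot)
      {
              struct mydev_softc *sc = cdev->si_drv1;

              if (*foff + objsize > sc->obj_size)
                      return (EINVAL);  /* ENODEV would fall back to d_mmap() */
              vm_object_reference(sc->obj);  /* the mapping keeps a reference */
              *object = sc->obj;  /* *foff may also be rewritten as a cookie */
              return (0);
      }

      static struct cdevsw mydev_cdevsw = {
              .d_version = D_VERSION,  /* now D_VERSION_02 */
              .d_name = "mydev",
              .d_mmap_single = mydev_mmap_single,
      };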
* Eliminate a stale comment and the two remaining uses of the "register" keyword in this file.  (alc, 2009-05-30, 1 file, -6/+2)
* Add assertions in two places where a page's valid or dirty bits are changed.  (alc, 2009-05-30, 1 file, -0/+10)
* Change vm_object_page_remove() such that it clears the page's dirty bits when it invalidates the page  (alc, 2009-05-28, 1 file, -1/+3)

  Suggested by: tegge
* Revise vm_pageout_scan()'s handling of partially dirty pages  (alc, 2009-05-28, 1 file, -8/+9)

  Specifically, rather than unconditionally making partially dirty pages fully dirty, only make partially dirty pages fully dirty if the pmap says that the page has been modified. (This change is also a small optimization. It eliminates an unnecessary call to pmap_is_modified() on pages that are mapped read only.)

  Suggested by: tegge
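  The revised check, roughly (pmap_is_modified(), vm_page_dirty(), and VM_PAGE_BITS_ALL are real; the surrounding scan code is paraphrased):

      /* A partially dirty page becomes fully dirty only when the pmap
       * reports a modification; a page mapped read only cannot have
       * been modified through its mappings. */
      if (m->dirty != 0 && m->dirty != VM_PAGE_BITS_ALL &&
          pmap_is_modified(m))
              vm_page_dirty(m);  /* sets m->dirty to VM_PAGE_BITS_ALL */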
* Back out the direct map hack; it is no longer needed.  (kmacy, 2009-05-19, 1 file, -6/+1)
* Eliminate a pointless call to pmap_clear_reference() from vm_pageout_scan()  (alc, 2009-05-17, 1 file, -1/+2)

  If the page belongs to an object with a reference count of zero, then it can't have any managed mappings on which to clear a reference bit.
* Apply a band-aid to x86_64 systems with more physical memory than kmem by allocating from the direct map.  (kmacy, 2009-05-16, 1 file, -1/+6)
* Eliminate unnecessary clearing of the page's dirty mask from various getpages functions  (alc, 2009-05-15, 1 file, -5/+6)

  Eliminate a stale comment.
* Eliminate page queues locking from bufdone_finish() through the following changes  (alc, 2009-05-13, 2 files, -0/+46)

  Rename vfs_page_set_valid() to vfs_page_set_validclean() to reflect what this function actually does. Suggested by: tegge

  Introduce a new version of vfs_page_set_valid() that does no more than what the function's name implies. Specifically, it does not update the page's dirty mask, and thus it does not require the page queues lock to be held.

  Update two of the three callers to the old vfs_page_set_valid() to call vfs_page_set_validclean() instead because they actually require the page's dirty mask to be cleared.

  Introduce vm_page_set_valid().

  Reviewed by: tegge
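  A sketch of the new helper's contract; vm_page_bits() is the in-file bitmask helper, and the body is paraphrased:

      /*
       * vm_page_set_valid() does no more than its name implies, so the
       * page queues lock is not required; vfs_page_set_validclean()
       * additionally clears the corresponding dirty bits and keeps the
       * locking requirement.
       */
      void
      vm_page_set_valid(vm_page_t m, int base, int size)
      {

              m->valid |= vm_page_bits(base, size);
      }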
* Eliminate gratuitous clearing of the page's dirty mask.  (alc, 2009-05-12, 1 file, -1/+2)
* Fix a race involving vnode_pager_input_smlfs()  (alc, 2009-05-09, 1 file, -23/+10)

  Specifically, in the case that vnode_pager_input_smlfs() zeroes the page, it should not mark the page as valid until after the page is zeroed. Otherwise, the page could be mapped for read access (e.g., by vm_map_pmap_enter()) before the page is zeroed. Reviewed by: tegge

  Eliminate gratuitous clearing of the page's dirty mask by vnode_pager_input_smlfs(). Instead, assert that the page is clean. Reviewed by: tegge

  Eliminate some blank lines.

  Eliminate pointless calls to pmap_clear_modify() and vm_page_undirty() from vnode_pager_input_old(). The page is not mapped. Therefore, it cannot have any page table entries that are modified.

  Eliminate an incorrect comment from vnode_pager_generic_getpages().
* Eliminate an incorrect comment.  (alc, 2009-05-07, 1 file, -2/+0)
* Eliminate vnode_pager_input_smlfs()'s pointless call to pmap_clear_modify()  (alc, 2009-05-04, 1 file, -3/+0)

  The page can't possibly have any modified page table entries because it isn't even mapped.
* Use the acquired reference to the vmspace instead of directly dereferencing p->p_vmspace in a place where it was missed in r191277  (kib, 2009-04-28, 1 file, -1/+1)

  Noted by: pluknet gmail com
* Fix typo.  (kib, 2009-04-28, 1 file, -2/+2)
* Eliminate an errant comment  (alc, 2009-04-26, 1 file, -2/+1)

  Discussed with: tegge
* Eliminate an archaic band-aid  (alc, 2009-04-26, 1 file, -5/+3)

  The immediately preceding comment already explains why the band-aid is unnecessary.

  Suggested by: tegge
* Eliminate unnecessary calls to pmap_clear_modify()  (alc, 2009-04-25, 2 files, -10/+14)

  Specifically, calling pmap_clear_modify() on a page is pointless if that page is not mapped or it is only mapped for read access. Instead, assert that the page is not mapped or not mapped for write access, as appropriate.

  Eliminate unnecessary clearing of a page's dirty mask. Instead, assert that the page's dirty mask is clear.
* Do not call vm_page_lookup() from the ddb routine, namely from the "show vmopag" implementation  (kib, 2009-04-23, 1 file, -19/+13)

  The vm_page_lookup() code modifies the splay tree of the object's pages, and asserts that the object lock is taken. The first issue could cause kernel data corruption, and the second one instantly panics an INVARIANTS-enabled kernel.

  Take advantage of the fact that object->memq is ordered by page index, and iterate over memq to calculate the runs.

  While there, make the code slightly more style-compliant by moving variable declarations to the right place.

  Discussed with: jhb, alc
  Reviewed by: alc
  MFC after: 2 weeks
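  The shape of the new iteration (object->memq, listq, and pindex are real fields; the run bookkeeping and output format are paraphrased):

      vm_page_t m;
      vm_pindex_t start;
      int run;

      run = 0;
      start = 0;
      TAILQ_FOREACH(m, &object->memq, listq) {
              if (run > 0 && m->pindex != start + run) {
                      /* The run broke; report it and start a new one. */
                      db_printf(" index(%ju)run(%d)\n",
                          (uintmax_t)start, run);
                      run = 0;
              }
              if (run == 0)
                      start = m->pindex;
              run++;
      }
      if (run > 0)
              db_printf(" index(%ju)run(%d)\n", (uintmax_t)start, run);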
* In both the pageout OOM handler and vm_daemon, acquire a reference to the vmspace of the examined process instead of directly accessing its vmspace, which may change  (kib, 2009-04-19, 1 file, -8/+21)

  Also, as an optimization, check for the P_INEXEC flag before examining the process.

  Reported and tested by: pho (previous version)
  Reviewed by: alc
  MFC after: 3 weeks
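  The pattern, using the real vmspace_acquire_ref()/vmspace_free() pair (the surrounding loop over the process list is paraphrased):

      struct vmspace *vm;

      if ((p->p_flag & P_INEXEC) != 0)
              continue;  /* optimization: skip processes in exec */
      vm = vmspace_acquire_ref(p);
      if (vm == NULL)
              continue;  /* the vmspace is being torn down */
      /* ... examine vm->vm_map and resident/wired sizes safely ... */
      vmspace_free(vm);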
* Calling pmap_clear_modify() after calling pmap_remove_write() is pointless  (alc, 2009-04-19, 1 file, -1/+0)

  The latter function already clears the modified status from each of the page's mappings.