summaryrefslogtreecommitdiffstats
path: root/sys/vm/device_pager.c
Commit message (Collapse)AuthorAgeFilesLines
* MFC r291446:kib2015-12-061-18/+10
| | | | | | | | | | Minor cleanup. Systematically use ANSI C functions definitions. Correct type of the flags argument to the dev_pager_putpages() function. vm_pager_free_nonreq() does not exist in stable/10, this part is not merged.
* MFC 261811,282660,282706:jhb2015-06-061-0/+2
| | | | | | | | | | | | | | | | | | | | | Place VM objects on the object list when created and never remove them. 261811: Fix function name in KASSERT(). 282660: Place VM objects on the object list when created and never remove them. This is ok since objects come from a NOFREE zone and allows objects to be locked while traversing the object list without triggering a LOR. Ensure that objects on the list are marked DEAD while free or stillborn, and that they have a refcount of zero. This required updating most of the pagers to explicitly mark an object as dead when deallocating it. (Only the vnode pager did this previously.) 282706: Satisfy vm_object uma zone destructor requirements after r282660 when vnode object creation raced.
* MFC r263095:kib2014-03-191-0/+1
| | | | Initialize paddr to handle the case of zero size.
* Different consumers of the struct vm_page abuse pageq member to keepkib2013-08-101-2/+2
| | | | | | | | | | | | | | | | | additional information, when the page is guaranteed to not belong to a paging queue. Usually, this results in a lot of type casts which make reasoning about the code correctness harder. Sometimes m->object is used instead of pageq, which could cause real and confusing bugs if non-NULL m->object is leaked. See r141955 and r253140 for examples. Change the pageq member into a union containing explicitly-typed members. Use them instead of type-punning or abusing m->object in x86 pmaps, uma and vm_page_alloc_contig(). Requested and reviewed by: alc Sponsored by: The FreeBSD Foundation
* On all the architectures, avoid to preallocate the physical memoryattilio2013-08-091-1/+2
| | | | | | | | | | | | | | | | | | | | | for nodes used in vm_radix. On architectures supporting direct mapping, also avoid to pre-allocate the KVA for such nodes. In order to do so make the operations derived from vm_radix_insert() to fail and handle all the deriving failure of those. vm_radix-wise introduce a new function called vm_radix_replace(), which can replace a leaf node, already present, with a new one, and take into account the possibility, during vm_radix_insert() allocation, that the operations on the radix trie can recurse. This means that if operations in vm_radix_insert() recursed vm_radix_insert() will start from scratch again. Sponsored by: EMC / Isilon storage division Reviewed by: alc (older version) Reviewed by: jeff Tested by: pho, scottl
* Hide the details for the assertion for VM_OBJECT_LOCK operations.attilio2013-02-211-4/+4
| | | | | | | | Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into VM_OBJECT_ASSERT_WLOCKED(foo) Sponsored by: EMC / Isilon storage division Requested by: alc
* Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() toattilio2013-02-201-7/+7
| | | | | | their "write" versions. Sponsored by: EMC / Isilon storage division
* Switch vm_object lock to be a rwlock.attilio2013-02-201-4/+5
| | | | | | | | * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h
* Fix a bug in the device pager code that can trigger an assertionken2013-01-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in devfs if a particular race condition is hit in the device pager code. This was a side effect of change 227530 which changed the device pager interface to call a new destructor routine for the cdev. That destructor routine, old_dev_pager_dtor(), takes a VM object handle. The object handle is cast to a struct cdev *, and passed into dev_rel(). That works in most cases, except the case in cdev_pager_allocate() where there is a race condition between two threads allocating an object backed by the same device. The loser of the race deallocates its object at the end of the function. The problem is that before inserting the object into the dev_pager_object_list, the object's handle is changed from the struct cdev pointer to the object's own address. This is to avoid conflicts with the winner of the race, which already inserted an object in the list with a handle that is a pointer to the same cdev structure. The object is then passed to vm_object_deallocate(), and eventually makes its way down to old_dev_pager_dtor(). That function passes the handle pointer (which is actually a VM object, not a struct cdev as usual) into dev_rel(). dev_rel() decrements the reference count in the assumed struct cdev (which happens to be 0), and that triggers the assertion in dev_rel() that the reference count is greater than or equal to 0. The fix is to add a cdev pointer to the VM object, and use that pointer when calling the cdev_pg_dtor() routine. vm_object.h: Add a struct cdev pointer to the VM object structure. device_pager.c: In cdev_pager_allocate(), populate the new cdev pointer. In dev_pager_dealloc(), use the new cdev pointer when calling the object's cdev_pg_dtor() routine. Reviewed by: kib Sponsored by: Spectra Logic Corporation MFC after: 1 week
* Move the declaration of vm_phys_paddr_to_vm_page() from vm/vm_page.hkib2012-11-161-0/+1
| | | | | | | to vm/vm_phys.h, where it belongs. Requested and reviewed by: alc MFC after: 2 weeks
* After the PHYS_TO_VM_PAGE() function was de-inlined, the main reasonkib2012-08-051-0/+1
| | | | | | | | | | | | | to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitely for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks
* Do not double-reference the found vm object in cdev_pager_lookup().kib2012-05-181-1/+0
| | | | | | | | | | | vm_pager_object_lookup() already referenced the object. Note that there is no in-tree consumers of cdev_pager_lookup(). The only known user of the function is i915 gem driver, which is not yet imported. This should make the KPI change minor. Submitted by: avg MFC after: 1 week
* Add new pager type, OBJT_MGTDEVICE. It provides the device pagerkib2012-05-121-8/+46
| | | | | | | | | | | | | | | | | which carries fictitous managed pages. In particular, the consumers of the new object type can remove all mappings of the device page with pmap_remove_all(). The range of physical addresses used for fake page allocation shall be registered with vm_phys_fictitious_reg_range() interface to allow the PHYS_TO_VM_PAGE() to work in pmap. Most likely, only i386 and amd64 pmaps can handle fictitious managed pages right now. Sponsored by: The FreeBSD Foundation Reviewed by: alc MFC after: 1 month
* Update the device pager interface, while keeping the compatibilitykib2011-11-151-75/+160
| | | | | | | | | | | | | | | | | | | | | | | | layer for old KPI and KBI. New interface should be used together with d_mmap_single cdevsw method. Device pager can be allocated with the cdev_pager_allocate(9) function, which takes struct cdev_pager_ops, containing constructor/destructor and page fault handler methods supplied by driver. Constructor and destructor, called at the pager allocation and deallocation time, allow the driver to handle per-object private data. The pager handler is called to handle page fault on the vm map entry backed by the driver pager. Driver shall return either the vm_page_t which should be mapped, or error code (which does not cause kernel panic anymore). The page handler interface has a placeholder to specify the access mode causing the fault, but currently PROT_READ is always passed there. Sponsored by: The FreeBSD Foundation Reviewed by: alc MFC after: 1 month
* Fix a race in the device pager allocation. If another thread won andkib2011-07-301-2/+9
| | | | | | | | | | | | allocated the device pager for the given handle, then the object fictitious pages list and the object membership in the global object list still need to be initialized. Otherwise, dev_pager_dealloc() will traverse uninitialized pointers. Reported and tested by: pho Reviewed by: jhb Approved by: re (kensmith) MFC after: 1 week
* Handle a race between device_pager and devsw in a more graceful manner:attilio2011-07-061-2/+4
| | | | | | | | | return an error code rather than panic the kernel. Sponsored by: Sandvine Incorporated Reviewed by: kib Tested by: pho MFC after: 2 weeks
* Eliminate duplication of the fake page code and zone by the device and sgalc2011-03-111-61/+3
| | | | | | pagers. Reviewed by: jhb
* Explicitly initialize the page's queue field to PQ_NONE instead of relyingalc2011-01-171-0/+1
| | | | | | | | | on PQ_NONE being zero. Redefine PQ_NONE and PQ_COUNT so that a page queue isn't allocated for PQ_NONE. Reviewed by: kib@
* Add new make_dev_p(9) flag MAKEDEV_ETERNAL to inform devfs that createdkib2010-08-061-6/+7
| | | | | | | | | cdev will never be destroyed. Propagate the flag to devfs vnodes as VV_ETERNVALDEV. Use the flags to avoid acquiring devmtx and taking a thread reference on such nodes. In collaboration with: pho MFC after: 1 month
* Eliminate page queues locking around most calls to vm_page_free().alc2010-05-061-4/+0
|
* On Alan's advice, rather than do a wholesale conversion on a singlekmacy2010-04-301-6/+13
| | | | | | | | | | | | architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps. Supported by: Bitgravity Inc. Discussed with: alc, jeffr, and kib
* Update d_mmap() to accept vm_ooffset_t and vm_memattr_t.rnoland2009-12-291-14/+3
| | | | | | | | | | | | | This replaces d_mmap() with the d_mmap2() implementation and also changes the type of offset to vm_ooffset_t. Purge d_mmap2(). All driver modules will need to be rebuilt since D_VERSION is also bumped. Reviewed by: jhb@ MFC after: Not in this lifetime...
* Extend the device pager to support different memory attributes on differentjhb2009-08-281-5/+15
| | | | | | | | | | | | | | | pages in an object. - Add a new variant of d_mmap() currently called d_mmap2() which accepts an additional in/out parameter that is the memory attribute to use for the requested page. - A driver either uses d_mmap() or d_mmap2() for all requests but not both. The current implementation uses a flag in the cdevsw (D_MMAP2) to indicate that the driver provides a d_mmap2() handler instead of d_mmap(). This is done to make the change ABI compatible with existing drivers and MFC'able to 7 and 8. Submitted by: alc MFC after: 1 month
* Change the handling of fictitious pages by pmap_page_set_memattr() onalc2009-07-191-12/+4
| | | | | | | | | | | | | | | | | amd64 and i386. Essentially, fictitious pages provide a mechanism for creating aliases for either normal or device-backed pages. Therefore, pmap_page_set_memattr() on a fictitious page needn't update the direct map or flush the cache. Such actions are the responsibility of the "primary" instance of the page or the device driver that "owns" the physical address. For example, these actions are already performed by pmap_mapdev(). The device pager needn't restore the memory attributes on a fictitious page before releasing it. It's now pointless. Add pmap_page_set_memattr() to the Xen pmap. Approved by: re (kib)
* Add support to the virtual memory system for configuring machine-alc2009-07-121-26/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dependent memory attributes: Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the fact that there are machine-dependent memory attributes that have nothing to do with controlling the cache's behavior. Introduce vm_object_set_memattr() for setting the default memory attributes that will be given to an object's pages. Introduce and use pmap_page_{get,set}_memattr() for getting and setting a page's machine-dependent memory attributes. Add full support for these functions on amd64 and i386 and stubs for them on the other architectures. The function pmap_page_set_memattr() is also responsible for any other machine-dependent aspects of changing a page's memory attributes, such as flushing the cache or updating the direct map. The uses include kmem_alloc_contig(), vm_page_alloc(), and the device pager: kmem_alloc_contig() can now be used to allocate kernel memory with non-default memory attributes on amd64 and i386. vm_page_alloc() and the device pager will set the memory attributes for the real or fictitious page according to the object's default memory attributes. Update the various pmap functions on amd64 and i386 that map pages to incorporate each page's memory attributes in the mapping. Notes: (1) Inherent to this design are safety features that prevent the specification of inconsistent memory attributes by different mappings on amd64 and i386. In addition, the device pager provides a warning when a device driver creates a fictitious page with memory attributes that are inconsistent with the real page that the fictitious page is an alias for. (2) Storing the machine-dependent memory attributes for amd64 and i386 as a dedicated "int" in "struct md_page" represents a compromise between space efficiency and the ease of MFCing these changes to RELENG_7. In collaboration with: jhb Approved by: re (kib)
* Implement global and per-uid accounting of the anonymous memory. Addkib2009-06-231-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid. The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup. The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped. The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation if performed by kernel, e.g. md(4). Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced. In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
* Validate the page in one place, dev_pager_getpages(), rather than doing italc2009-06-221-7/+6
| | | | | | in two places, dev_pager_getfake() and dev_pager_updatefake(). Compare a pointer to "NULL" rather than "0".
* Strive for greater consistency among the places that implement real,alc2009-06-211-0/+1
| | | | | fictious, and contiguous page allocation. Eliminate unnecessary reinitialization of a page's fields.
* Save previous content of the td_fpop before storing the currentkib2008-09-261-0/+6
| | | | | | | | | | | | | | | filedescriptor into it. Make sure that td_fpop is NULL when calling d_mmap from dev_pager_getpages(). Change guards against td_fpop field being non-NULL with private state for another device, and against sudden clearing the td_fpop. This could occur when either a driver method calls another driver through the filedescriptor operation, or a page fault happen while driver is writing to a memory backed by another driver. Noted by: rwatson Tested by: rnoland MFC after: 3 days
* Preset a device object's alignment ("pg_color") based upon thealc2008-05-171-1/+5
| | | | | | physical address of the device's memory. This enables pmap_align_superpage() to propose a virtual address for mapping the device memory that permits the use of superpage mappings.
* Remove comment that is no longer quite true.kib2007-08-181-3/+0
| | | | | Noted by: alc Approved by: re (kensmith)
* Protect the creation of the device pager with the dev_pager_mtx. Lookupkib2007-08-071-12/+24
| | | | | | | | | | | of device pager in the pagers list by handle is now synchronized with its removal from the list, and dev_pager_mtx is put before vm object lock in lock order. Dispose the dev_pager_sx lock, since dev_pager_mtx now covers the same block. Noted by: kensmith Reviewed by: alc Approved by: re (kensmith)
* Consider a scenario in which one processor, call it Pt, is performingalc2007-08-051-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vm_object_terminate() on a device-backed object at the same time that another processor, call it Pa, is performing dev_pager_alloc() on the same device. The problem is that vm_pager_object_lookup() should not be allowed to return a doomed object, i.e., an object with OBJ_DEAD set, but it does. In detail, the unfortunate sequence of events is: Pt in vm_object_terminate() holds the doomed object's lock and sets OBJ_DEAD on the object. Pa in dev_pager_alloc() holds dev_pager_sx and calls vm_pager_object_lookup(), which returns the doomed object. Next, Pa calls vm_object_reference(), which requires the doomed object's lock, so Pa waits for Pt to release the doomed object's lock. Pt proceeds to the point in vm_object_terminate() where it releases the doomed object's lock. Pa is now able to complete vm_object_reference() because it can now complete the acquisition of the doomed object's lock. So, now the doomed object has a reference count of one! Pa releases dev_pager_sx and returns the doomed object from dev_pager_alloc(). Pt now acquires dev_pager_mtx, removes the doomed object from dev_pager_object_list, releases dev_pager_mtx, and finally calls uma_zfree with the doomed object. However, the doomed object is still in use by Pa. Repeating my key point, vm_pager_object_lookup() must not return a doomed object. Moreover, the test for the object's state, i.e., doomed or not, and the increment of the object's reference count should be carried out atomically. Reviewed by: kib Approved by: re (kensmith) MFC after: 3 weeks
* Do not acquire Giant unconditionally around the calls to the cdevswkib2007-08-051-5/+0
| | | | | | | | | d_mmap methods. prep_cdevsw() already installs the shims that acquire/drop Giant for the methods of a driver that specified the D_NEEDGIANT flag. Reviewed by: alc Approved by: re (kensmith)
* Replace PG_BUSY with VPO_BUSY. In other words, changes to the page'salc2006-10-221-2/+2
| | | | | busy flag, i.e., VPO_BUSY, are now synchronized by the per-vm object lock instead of the global page queues lock.
* Ensure that the page's new field for object-synchronized flags is alwaysalc2006-08-111-0/+1
| | | | | | | initialized to zero. Call vm_page_sleep_if_busy() instead of duplicating its implementation in vm_page_grab().
* Add a comment to the effect that fictitious pages do not require thealc2005-06-101-0/+4
| | | | initialization of their machine-dependent fields.
* /* -> /*- for license, minor formatting changesimp2005-01-071-1/+1
|
* Use dev_re[fl]thread() to maintain a ref on the device driver whilephk2004-09-241-14/+13
| | | | we call the ->d_mmap function.
* In dev_pager_updatefake, m->valid is typically 0 on entry. Itdfr2004-08-041-1/+2
| | | | | | | | should be set to VM_PAGE_BITS_ALL before returning, to ensure that neither vm_pager_get_pages nor vm_fault calls vm_page_zero_invalid after dev_pager_getpages has returned. Submitted by: tegge
* Fix a memory leak in the device pager which is exposed by the NVIDIAdfr2004-07-301-13/+41
| | | | | | OpenGL driver. Submitted by: nvidia (possibly also tegge)
* Do the dreaded s/dev_t/struct cdev */phk2004-06-161-2/+2
| | | | Bump __FreeBSD_version accordingly.
* Push down Giant into vm_pager_get_pages(). The only get pages methods thatalc2004-04-231-0/+3
| | | | require Giant are in the device and vnode pagers.
* Remove advertising clause from University of California Regent's license,imp2004-04-061-4/+0
| | | | | | per letter dated July 22, 1999. Approved by: core
* Simplify the various pager allocation routines by computing the desiredalc2004-01-041-4/+5
| | | | object size once and assigning that value to a local variable.
* The addition of a locking assertion to vm_page_zero_invalid() has revealedalc2003-10-051-0/+1
| | | | | | | | | | a long-time bug: vm_pager_get_pages() assumes that m[reqpage] contains a valid page upon return from pgo_getpages(). In the case of the device pager this page has been freed and replaced by a fake page. The fake page is properly inserted into the vm object but m[reqpage] is left pointing to a freed page. For now, update m[reqpage] to point to the fake page. Submitted by: tegge
* - Use the UMA_ZONE_VM flag on the fakepg and object zones to preventjeff2003-10-041-1/+2
| | | | | vm recursion and LORs. This may be necessary for other zones created in the vm but this needs to be verified.
* Use sparse struct initializations for struct pagerops.phk2003-08-051-7/+6
| | | | This makes grepping for which pagers implement which methods easier.
* Assert that the vm object is locked on entry to dev_pager_getpages().alc2003-06-241-0/+1
|
* Add vm object locking to various pagers' "get pages" methods, i386 stackalc2003-06-131-1/+2
| | | | management functions, and a u area management function.
OpenPOWER on IntegriCloud