summaryrefslogtreecommitdiffstats
path: root/sys/vm/vm_object.c
Commit message (Collapse)AuthorAgeFilesLines
* MFC r279720alc2015-04-031-1/+1
| | | | | Correct a typo in vm_object_backing_scan() that originated in r254141. Specifically, change a lock acquire into a lock release.
* MFC r277828:kib2015-02-111-1/+6
| | | | | | | | | | | | | Update mtime for tmpfs files modified through memory mapping. MFC r277969: Update both ctime and mtime for writes to tmpfs files. MFC r277972: Remove single-use boolean. MFC r278151: Remove duplicated assignment.
* MFC r275513:kib2014-12-121-1/+6
| | | | | When the last reference on the vnode' vm object is dropped, read the vp->v_vflag without taking vnode lock and without bypass.
* Fix a leak of the wired pages when unwiring of the PROT_NONE-mappedkib2014-09-011-0/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | wired region. Rework the handling of unwire to do the it in batch, both at pmap and object level. All commits below are by alc. MFC r268327: Introduce pmap_unwire(). MFC r268591: Implement pmap_unwire() for powerpc. MFC r268776: Implement pmap_unwire() for arm. MFC r268806: pmap_unwire(9) man page. MFC r269134: When unwiring a region of an address space, do not assume that the underlying physical pages are mapped by the pmap. This fixes a leak of the wired pages on the unwiring of the region mapped with no access allowed. MFC r269339: In the implementation of the new function pmap_unwire(), the call to MOEA64_PVO_TO_PTE() must be performed before any changes are made to the PVO. Otherwise, MOEA64_PVO_TO_PTE() will panic. MFC r269365: Correct a long-standing problem in moea{,64}_pvo_enter() that was revealed by the combination of r268591 and r269134: When we attempt to add the wired attribute to an existing mapping, moea{,64}_pvo_enter() do nothing. (They only set the wired attribute on newly created mappings.) MFC r269433: Handle wiring failures in vm_map_wire() with the new functions pmap_unwire() and vm_object_unwire(). Retire vm_fault_{un,}wire(), since they are no longer used. MFC r269438: Rewrite a loop in vm_map_wire() so that gcc doesn't think that the variable "rv" is uninitialized. MFC r269485: Retire pmap_change_wiring(). Reviewed by: alc
* MFC r268615:kib2014-07-281-4/+6
| | | | | | | | | | | Add OBJ_TMPFS_NODE flag. MFC r268616: Set the OBJ_TMPFS_NODE flag for vm_object of VREG tmpfs node. MFC r269053: Correct assertion. tmpfs vm object is always at the bottom of the shadow chain.
* MFC r263092:kib2014-03-191-1/+2
| | | | | Do not vdrop() the tmpfs vnode until it is unlocked. The hold reference might be the last, and then vdrop() would free the vnode.
* MFC r261867:attilio2014-02-211-2/+4
| | | | Use the right index to free swapspace after vm_page_rename().
* MFC r257680:kib2013-11-121-2/+3
| | | | | | Do not coalesce if the swap object belongs to tmpfs vnode. Approved by: re (glebius)
* Drain for the xbusy state for two places which potentially dokib2013-09-081-0/+6
| | | | | | | | | | | | pmap_remove_all(). Not doing the drain allows the pmap_enter() to proceed in parallel, making the pmap_remove_all() effects void. The race results in an invalidated page mapped wired by usermode. Reported and tested by: pho Reviewed by: alc Sponsored by: The FreeBSD Foundation Approved by: re (glebius)
* Remove the deprecated VM_ALLOC_RETRY flag for the vm_page_grab(9).kib2013-08-221-2/+1
| | | | | | | | The flag was mandatory since r209792, where vm_page_grab(9) was changed to only support the alloc retry semantic. Suggested and reviewed by: alc Sponsored by: The FreeBSD Foundation
* On all the architectures, avoid to preallocate the physical memoryattilio2013-08-091-27/+49
| | | | | | | | | | | | | | | | | | | | | for nodes used in vm_radix. On architectures supporting direct mapping, also avoid to pre-allocate the KVA for such nodes. In order to do so make the operations derived from vm_radix_insert() to fail and handle all the deriving failure of those. vm_radix-wise introduce a new function called vm_radix_replace(), which can replace a leaf node, already present, with a new one, and take into account the possibility, during vm_radix_insert() allocation, that the operations on the radix trie can recurse. This means that if operations in vm_radix_insert() recursed vm_radix_insert() will start from scratch again. Sponsored by: EMC / Isilon storage division Reviewed by: alc (older version) Reviewed by: jeff Tested by: pho, scottl
* The soft and hard busy mechanism rely on the vm object lock to work.attilio2013-08-091-23/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unify the 2 concept into a real, minimal, sxlock where the shared acquisition represent the soft busy and the exclusive acquisition represent the hard busy. The old VPO_WANTED mechanism becames the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becames an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag The KPI is heavilly changed this is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl
* Replace kernel virtual address space allocation with vmem. This providesjeff2013-08-071-1/+1
| | | | | | | | | | | | | transparent layering and better fragmentation. - Normalize functions that allocate memory to use kmem_* - Those that allocate address space are named kva_* - Those that operate on maps are named kmap_* - Implement recursive allocation handling for kmem_arena in vmem. Reviewed by: alc Tested by: pho Sponsored by: EMC / Isilon Storage Division
* Never remove user-wired pages from an object when doingkib2013-07-111-9/+10
| | | | | | | | | | | | | | | msync(MS_INVALIDATE). The vm_fault_copy_entry() requires that object range which corresponds to the user-wired vm_map_entry, is always fully populated. Add OBJPR_NOTWIRED flag for vm_object_page_remove() to request the preserving behaviour, use it when calling vm_object_page_remove() from vm_object_sync(). Reported and tested by: pho Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
* - Add a general purpose resource allocator, vmem, from NetBSD. It wasjeff2013-06-281-6/+0
| | | | | | | | | | | | | | originally inspired by the Solaris vmem detailed in the proceedings of usenix 2001. The NetBSD version was heavily refactored for bugs and simplicity. - Use this resource allocator to allocate the buffer and transient maps. Buffer cache defrags are reduced by 25% when used by filesystems with mixed block sizes. Ultimately this may permit dynamic buffer cache sizing on low KVA machines. Discussed with: alc, kib, attilio Tested by: pho Sponsored by: EMC / Isilon Storage Division
* Revise the interface between vm_object_madvise() and vm_page_dontneed() soalc2013-06-101-22/+2
| | | | | | | that pointless calls to pmap_is_modified() can be easily avoided when performing madvise(..., MADV_FREE). Sponsored by: EMC / Isilon Storage Division
* In vm_object_split(), busy and consequently unbusy the pages only whenattilio2013-06-041-3/+4
| | | | | | | | swap_pager_copy() is invoked, otherwise there is no reason to do so. This will eliminate the necessity to busy pages most of the times. Sponsored by: EMC / Isilon storage division Reviewed by: alc
* After the object lock was dropped, the object' reference count couldkib2013-05-301-5/+5
| | | | | | | | | | | change. Retest the ref_count and return from the function to not execute the further code which assumes that ref_count == 1 if it is not. Also, do not leak vnode lock if other thread cleared OBJ_TMPFS flag meantime. Reported by: bdrewery Tested by: bdrewery, pho Sponsored by: The FreeBSD Foundation
* Remove the capitalization in the assertion message. Print the addresskib2013-05-301-1/+1
| | | | of the object to get useful information from optimizated kernels dump.
* Rework the handling of the tmpfs node backing swap object and tmpfskib2013-04-281-1/+23
| | | | | | | | | | | | | | | | | | vnode v_object to avoid double-buffering. Use the same object both as the backing store for tmpfs node and as the v_object. Besides reducing memory use up to 2x times for situation of mapping files from tmpfs, it also makes tmpfs read and write operations copy twice bytes less. VM subsystem was already slightly adapted to tolerate OBJT_SWAP object as v_object. Now the vm_object_deallocate() is modified to not reinstantiate OBJ_ONEMAPPING flag and help the VFS to correctly handle VV_TEXT flag on the last dereference of the tmpfs backing object. Reviewed by: alc Tested by: pho, bf MFC after: 1 month
* Make vm_object_page_clean() and vm_mmap_vnode() tolerate the vnode'kib2013-04-281-1/+6
| | | | | | | | | | | | | | | | | v_object of non OBJT_VNODE type. For vm_object_page_clean(), simply do not assert that object type must be OBJT_VNODE, and add a comment explaining how the check for OBJ_MIGHTBEDIRTY prevents the rest of function from operating on such objects. For vm_mmap_vnode(), if the object type is not OBJT_VNODE, require it to be for swap pager (or default), handle the bypass filesystems, and correctly acquire the object reference in this case. Reviewed by: alc Tested by: pho, bf MFC after: 1 week
* Introduce vm_radix_is_empty(), and use it in place ofalc2013-03-101-1/+1
| | | | | | | vm_object_cache_is_empty() where the caller is aware of the page cache's implementation as a radix trie. Sponsored by: EMC / Isilon Storage Division
* Merge from vmcontention.attilio2013-03-091-103/+104
|\
| * MFCattilio2013-03-091-102/+105
| |\
| | * Switch the vm_object mutex to be a rwlock. This will enable in theattilio2013-03-091-103/+104
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes. The change is mostly mechanical but few notes are reported: * The KPI changes as follow: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. At this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs. The KPI results heavilly broken by this commit. Thirdy part ports must be updated accordingly (I can think off-hand of VirtualBox, for example). Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
| | | * MFCattilio2013-03-081-5/+5
| | | |\
| | | * | Fix compiling.attilio2013-02-261-2/+2
| | | | |
| | | * | MFCattilio2013-02-261-4/+4
| | | | |
| | | * | MFCattilio2013-02-261-2/+2
| | | | |
| | | * | MFCattilio2013-02-261-1/+1
| | | | |
| | | * | As VM_OBJECT_SLEEP() is a vm_object_t specific function, makeattilio2013-02-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the passed object as the first argument of the function for consistency. Sponsored by: EMC / Isilon storage revision
| | | * | Hide the details for the assertion for VM_OBJECT_LOCK operations.attilio2013-02-211-21/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into VM_OBJECT_ASSERT_WLOCKED(foo) Sponsored by: EMC / Isilon storage division Requested by: alc
| | | * | Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() toattilio2013-02-201-78/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | their "write" versions. Sponsored by: EMC / Isilon storage division
| | | * | Switch vm_object lock to be a rwlock.attilio2013-02-201-33/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h
| | * | | Merge from vmc-playground:attilio2013-03-091-5/+6
| | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new KPI that verifies if the page cache is empty for a specified vm_object. This KPI does not make assumptions about the locking in order to be used also for building assertions at init and destroy time. It is mostly used to hide implementation details of the page cache. Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: alc (vm_radix based version) Tested by: flo, pho, jhb, davide
| * | | MFCattilio2013-03-041-3/+5
| |\ \ \ | | |/ /
| | * | Merge from vmcontention:attilio2013-03-041-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As vm objects are type-stable there is no need to initialize the resident splay tree pointer and the cache splay tree pointer in _vm_object_allocate() but this could be done in the init UMA zone handler. The destructor UMA zone handler, will further check if the condition is retained at every destruction and catch for bugs. Sponsored by: EMC / Isilon storage division Submitted by: alc
* | | | Evaluations on the likelyhood of empty object cache cannot be made inattilio2013-03-041-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | general way but must be evaluated case by case. Embedd the decision in the caller themselves rather than in a general purpose KPI. Sponsored by: EMC / Isilon storage division Reported by: alc Reviewed by: alc
* | | | Remove the boot-time cache support and rely on UMA boot-time slab cacheattilio2013-03-041-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for allocating the nodes before to have the possibility to carve directly from the UMA subsystem. Sponsored by: EMC / Isilon storage division Reviewed by: alc
* | | | We don't need to reinitialize the root of the page cache trie on everyalc2013-03-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vm object allocation. We can, instead, rely on the type stability of the vm object zone. (Note that we already assert that the page cache trie is empty in the vm object zone destructor.) Sponsored by: EMC / Isilon Storage Division
* | | | Merge from vmcontentionattilio2013-03-031-1/+0
|\ \ \ \ | |/ / /
| * | | MFCattilio2013-03-031-1/+0
| |\ \ \ | | |/ /
| | * | The value held by the vm object's field pg_color is only consideredalc2013-03-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | valid if the flag OBJ_COLORED is set. Since _vm_object_allocate() doesn't set this flag, it needn't initialize pg_color. Sponsored by: EMC / Isilon Storage Division
| | * | Merge from vmc-playground branch:attilio2013-02-261-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the sub-optimal uma_zone_set_obj() primitive with more modern uma_zone_reserve_kva(). The new primitive reserves before hand the necessary KVA space to cater the zone allocations and allocates pages with ALLOC_NOOBJ. More specifically: - uma_zone_reserve_kva() does not need an object to cater the backend allocator. - uma_zone_reserve_kva() can cater M_WAITOK requests, in order to serve zones which need to do uma_prealloc() too. - When possible, uma_zone_reserve_kva() uses directly the direct-mapping by uma_small_alloc() rather than relying on the KVA / offset combination. The removal of the object attribute allows 2 further changes: 1) _vm_object_allocate() becomes static within vm_object.c 2) VM_OBJECT_LOCK_INIT() is removed. This function is replaced by direct calls to mtx_init() as there is no need to export it anymore and the calls aren't either homogeneous anymore: there are now small differences between arguments passed to mtx_init(). Sponsored by: EMC / Isilon storage division Reviewed by: alc (which also offered almost all the comments) Tested by: pho, jhb, davide
| | * | Remove white spaces.attilio2013-02-261-2/+2
| | | | | | | | | | | | | | | | Sponsored by: EMC / Isilon storage division
| * | | MFCattilio2013-02-261-4/+4
| | | |
| * | | MFCattilio2013-02-261-1/+1
| | | |
* | | | Revert white space change in the previous commit.alc2013-03-021-1/+0
| | | | | | | | | | | | | | | | Requested by: attilio
* | | | Assert that the trie is empty when a vm object is destroyed.alc2013-03-021-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since vm objects are allocated from type-stable memory, we don't need to initialize the trie's root in _vm_object_allocate() on every vm object allocation. We can instead do it once in vm_object_zinit(). We don't need to call vm_radix_reclaim_allnodes() in vm_object_terminate() unless the resident page count is non-zero. Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division
* | | | Merge from vmcontentionattilio2013-02-261-1/+1
| | | |
OpenPOWER on IntegriCloud