path: root/sys/vm/vm_object.h
Commit log (newest first); each entry shows author, date, files/lines changed, and the commit message.
* attilio, 2013-08-09 (1 file, -1/+1): On all architectures, avoid
  preallocating the physical memory for nodes used in vm_radix. On
  architectures supporting direct mapping, also avoid pre-allocating the KVA
  for such nodes.
  To do so, allow the operations derived from vm_radix_insert() to fail and
  handle all the resulting failures. On the vm_radix side, introduce a new
  function, vm_radix_replace(), which can replace an already present leaf node
  with a new one, and take into account the possibility that operations on the
  radix trie can recurse during vm_radix_insert() allocation. This means that
  if operations in vm_radix_insert() recursed, vm_radix_insert() will start
  from scratch again.
  Sponsored by: EMC / Isilon storage division
  Reviewed by: alc (older version)
  Reviewed by: jeff
  Tested by: pho, scottl
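  A minimal sketch of the new failure handling at a vm_radix_insert() call
  site, assuming the post-change convention that the function returns non-zero
  when node allocation fails; the cleanup and error value shown are
  illustrative:

      /* Insertion into the object's resident-page trie may now fail. */
      if (vm_radix_insert(&object->rtree, m) != 0) {
              /* Undo any partially built state and report the failure. */
              return (ENOMEM);
      }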
* kib, 2013-07-11 (1 file, -0/+1): Never remove user-wired pages from an
  object when doing msync(MS_INVALIDATE). vm_fault_copy_entry() requires that
  the object range corresponding to a user-wired vm_map_entry is always fully
  populated.
  Add the OBJPR_NOTWIRED flag for vm_object_page_remove() to request the
  preserving behaviour, and use it when calling vm_object_page_remove() from
  vm_object_sync().
  Reported and tested by: pho
  Reviewed by: alc
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
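  A sketch of the changed call in vm_object_sync(), assuming the flag is
  simply passed in the options argument of vm_object_page_remove(); the pindex
  bounds pistart and piend are placeholder names:

      /* Invalidate the range, but leave user-wired pages in place. */
      vm_object_page_remove(object, pistart, piend, OBJPR_NOTWIRED);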
* attilio, 2013-05-21 (1 file, -0/+2):
  o Relax locking assertions for vm_page_find_least().
  o Relax locking assertions for pmap_enter_object() and add them also to
    architectures that currently don't have any.
  o Introduce VM_OBJECT_LOCK_DOWNGRADE(), which is basically a downgrade
    operation on the per-object rwlock.
  o Use all the mechanisms above to make vm_map_pmap_enter() work most of the
    time with only read locks.
  Sponsored by: EMC / Isilon storage division
  Reviewed by: alc
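  A minimal usage sketch of the new downgrade primitive, using the
  rwlock-based VM_OBJECT_* macros; the work performed under each lock mode is
  illustrative:

      VM_OBJECT_WLOCK(object);
      /* ... modifications that need exclusive access ... */
      VM_OBJECT_LOCK_DOWNGRADE(object);  /* keep the lock, drop to read mode */
      /* ... read-only walk of the resident pages ... */
      VM_OBJECT_RUNLOCK(object);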
* kib, 2013-04-28 (1 file, -0/+11): Rework the handling of the tmpfs node
  backing swap object and the tmpfs vnode v_object to avoid double-buffering.
  Use the same object both as the backing store for the tmpfs node and as the
  v_object.
  Besides reducing memory use by up to 2x when mapping files from tmpfs, it
  also halves the number of bytes copied by tmpfs read and write operations.
  The VM subsystem was already slightly adapted to tolerate an OBJT_SWAP
  object as v_object. Now vm_object_deallocate() is modified to not
  reinstantiate the OBJ_ONEMAPPING flag and to help the VFS correctly handle
  the VV_TEXT flag on the last dereference of the tmpfs backing object.
  Reviewed by: alc
  Tested by: pho, bf
  MFC after: 1 month
* alc, 2013-03-10 (1 file, -1/+1): Introduce vm_radix_is_empty(), and use it
  in place of vm_object_cache_is_empty() where the caller is aware of the page
  cache's implementation as a radix trie.
  Sponsored by: EMC / Isilon Storage Division
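  A sketch of the kind of call site this targets, assuming the cached-page
  trie is the object's "cache" field; the surrounding check is illustrative:

      /* The caller already knows the page cache is a radix trie. */
      if (!vm_radix_is_empty(&object->cache)) {
              /* ... free or transfer the remaining cached pages ... */
      }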
* attilio, 2013-03-09 (1 file, -12/+25): Merge from vmcontention.
  * attilio, 2013-03-09 (1 file, -12/+32): MFC
    * attilio, 2013-03-09 (1 file, -12/+25): Switch the vm_object mutex to be
      a rwlock. This will enable, in the future, further optimizations where
      the vm_object lock will be held in read mode most of the time the
      resident pool of pages in the page cache is accessed for reading
      purposes.
      The change is mostly mechanical, but a few notes are worth reporting:
      * The KPI changes as follows:
        - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
        - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
        - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
        - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
          (in order to avoid visibility of implementation details)
        - The read-mode operations are added: VM_OBJECT_RLOCK(),
          VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
          VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
      * The vm/vm_pager.h namespace-pollution avoidance (which forced
        consumers to include sys/mutex.h directly to cater for its inline
        functions using VM_OBJECT_LOCK()) means that all vm/vm_pager.h
        consumers must now also include sys/rwlock.h.
      * zfs requires a quite convoluted fix to include FreeBSD rwlocks into
        the compat layer, because the name clash between the FreeBSD and
        Solaris versions must be avoided. For this purpose zfs redefines the
        vm_object locking functions directly, isolating the FreeBSD components
        in specific compat stubs.
      The KPI is heavily broken by this commit. Third-party ports must be
      updated accordingly (I can think off-hand of VirtualBox, for example).
      Sponsored by: EMC / Isilon storage division
      Reviewed by: jeff
      Reviewed by: pjd (ZFS specific review)
      Discussed with: alc
      Tested by: pho
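      A before/after sketch of a typical call site under the renamed KPI; the
      lookup shown is illustrative:

          vm_page_t m;

          /* Before: mutex-based KPI. */
          VM_OBJECT_LOCK(object);
          m = vm_page_lookup(object, pindex);
          VM_OBJECT_UNLOCK(object);

          /* After: rwlock-based KPI; read-mode variants also exist. */
          VM_OBJECT_WLOCK(object);
          m = vm_page_lookup(object, pindex);
          VM_OBJECT_WUNLOCK(object);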
      * attilio, 2013-02-26 (1 file, -3/+0): MFC
      * attilio, 2013-02-26 (1 file, -1/+1): As VM_OBJECT_SLEEP() is a
        vm_object_t specific function, make the passed object the first
        argument of the function, for consistency.
        Sponsored by: EMC / Isilon storage division
      * attilio, 2013-02-21 (1 file, -0/+4): Complete the asserts by defining
        assertions also for the RA_RLOCKED and RA_LOCKED cases.
        Sponsored by: EMC / Isilon storage division
        Requested by: alc
      * attilio, 2013-02-21 (1 file, -2/+2): Hide the details of the
        assertions for VM_OBJECT_LOCK operations.
        Rename the current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into
        VM_OBJECT_ASSERT_WLOCKED(foo).
        Sponsored by: EMC / Isilon storage division
        Requested by: alc
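        The rename in practice, together with the read-mode and either-mode
        assertions added by the entries above; the macro names are taken from
        these commit messages:

            /* Before */
            VM_OBJECT_LOCK_ASSERT(object, RA_WLOCKED);

            /* After: the assertion's implementation details are hidden. */
            VM_OBJECT_ASSERT_WLOCKED(object);
            VM_OBJECT_ASSERT_RLOCKED(object);  /* read mode held */
            VM_OBJECT_ASSERT_LOCKED(object);   /* held in either mode */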
      * attilio, 2013-02-20 (1 file, -0/+6): Add read-mode operations to the
        VM_OBJECT_LOCK* class of functions.
        Sponsored by: EMC / Isilon storage division
      * attilio, 2013-02-20 (1 file, -4/+4): Rename VM_OBJECT_LOCK(),
        VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to their "write" versions.
        Sponsored by: EMC / Isilon storage division
      * attilio, 2013-02-20 (1 file, -2/+0): There is no need to use
        VM_OBJECT_LOCKED(), as the assertion won't make the check available in
        any case if INVARIANTS is switched off. Remove VM_OBJECT_LOCKED().
      * attilio, 2013-02-20 (1 file, -2/+0): Remove unused
        VM_OBJECT_LOCKPTR().
        Sponsored by: EMC / Isilon storage division
      * attilio, 2013-02-20 (1 file, -14/+21): Switch the vm_object lock to be
        a rwlock.
        * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations.
        * VM_OBJECT_SLEEP() is introduced as a general purpose primitive to
          get a sleep operation using a VM_OBJECT_LOCK() as protection.
        * The approach must bear with the vm_pager.h namespace pollution, so
          many files must now include rwlock.h directly.
    * attilio, 2013-03-09 (1 file, -0/+7): Merge from vmc-playground:
      Introduce a new KPI that verifies whether the page cache is empty for a
      specified vm_object. This KPI makes no assumptions about locking, so
      that it can also be used to build assertions at init and destroy time.
      It is mostly used to hide implementation details of the page cache.
      Sponsored by: EMC / Isilon storage division
      Reviewed by: jeff
      Reviewed by: alc (vm_radix based version)
      Tested by: flo, pho, jhb, davide
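      A sketch of the assertion-style use the commit mentions; the KASSERT
      message and its placement are illustrative:

          KASSERT(vm_object_cache_is_empty(object),
              ("vm_object %p still has cached pages at destroy time", object));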
    * attilio, 2013-02-27 (1 file, -1/+0): Merge from vmobj-rwlock:
      The VM_OBJECT_LOCKED() macro is only used to implement a custom version
      of lock assertions right now (which likely spread out thanks to copy and
      paste). Remove it and implement actual assertions.
      Sponsored by: EMC / Isilon storage division
      Reviewed by: alc
      Tested by: pho
    * attilio, 2013-02-26 (1 file, -4/+0): Merge from vmc-playground branch:
      Replace the sub-optimal uma_zone_set_obj() primitive with the more
      modern uma_zone_reserve_kva(). The new primitive reserves beforehand the
      necessary KVA space to satisfy the zone allocations and allocates pages
      with ALLOC_NOOBJ. More specifically:
      - uma_zone_reserve_kva() does not need an object to back the allocator.
      - uma_zone_reserve_kva() can satisfy M_WAITOK requests, in order to
        serve zones which need to do uma_prealloc() too.
      - When possible, uma_zone_reserve_kva() uses the direct mapping via
        uma_small_alloc() rather than relying on the KVA / offset combination.
      The removal of the object attribute allows 2 further changes:
      1) _vm_object_allocate() becomes static within vm_object.c.
      2) VM_OBJECT_LOCK_INIT() is removed. This function is replaced by direct
         calls to mtx_init(), as there is no need to export it anymore and the
         calls are no longer homogeneous: there are now small differences
         between the arguments passed to mtx_init().
      Sponsored by: EMC / Isilon storage division
      Reviewed by: alc (who also offered almost all the comments)
      Tested by: pho, jhb, davide
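      A sketch of a zone being converted, assuming uma_zone_reserve_kva(zone,
      nitems) reports failure with a zero return; the error handling shown is
      illustrative:

          /* Before: a dedicated VM object backed the zone's pages. */
          /* uma_zone_set_obj(zone, &zone_object, nitems); */

          /* After: reserve the KVA up front; no backing object is needed. */
          if (uma_zone_reserve_kva(zone, nitems) == 0)
                  panic("could not reserve KVA for the zone");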
  * attilio, 2013-02-27 (1 file, -1/+0): MFC
  * attilio, 2013-02-26 (1 file, -4/+0): MFC
* attilio, 2013-03-04 (1 file, -1/+1): Evaluations of the likelihood of an
  empty object cache cannot be made in a general way but must be made case by
  case. Embed the decision in the callers themselves rather than in a general
  purpose KPI.
  Sponsored by: EMC / Isilon storage division
  Reported by: alc
  Reviewed by: alc
* attilio, 2013-02-27 (1 file, -1/+0): Merge from vmcontention
* attilio, 2013-02-26 (1 file, -1/+3): Merge from vmcontention
  * attilio, 2013-02-26 (1 file, -1/+3): MFC
    * attilio, 2013-02-26 (1 file, -1/+3): Wrap the sleeps synchronized by the
      vm_object lock into the specific macro VM_OBJECT_SLEEP(). This hides
      some implementation details, like the use of the msleep() primitive and
      the need to access the lock address directly. For this reason the
      VM_OBJECT_MTX() macro is now retired.
      Sponsored by: EMC / Isilon storage division
      Reviewed by: alc
      Tested by: pho
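      A before/after sketch of one such sleep; the wait channel chan and the
      "vmowait" wmesg are placeholders, and the object-first argument order
      follows the consistency change recorded earlier in this log:

          /* Before: open-coded, exposing msleep() and the lock address. */
          msleep(chan, VM_OBJECT_MTX(object), PVM, "vmowait", 0);

          /* After: the primitive and the lock are hidden behind the macro. */
          VM_OBJECT_SLEEP(object, chan, PVM, "vmowait", 0);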
* attilio, 2013-02-24 (1 file, -4/+0): Retire the old UMA primitive
  uma_zone_set_obj() and replace it with the more modern
  uma_zone_reserve_kva().
  The difference is that it no longer relies on an object to allocate pages,
  and the slab allocator no longer uses any specific locking but atomic
  operations to complete the operation. Where possible, uma_small_alloc() is
  used instead and the uk_kva member becomes unused.
  The subsequent cleanups also bring along the removal of the
  VM_OBJECT_LOCK_INIT() macro, which is not used anymore as the code can
  easily be cleaned up to perform a single mtx_init(), private to vm_object.c.
  For the same reason, _vm_object_allocate() becomes private as well.
  Sponsored by: EMC / Isilon storage division
  Reviewed by: alc
* attilio, 2013-02-07 (1 file, -1/+1): Reduce differences with HEAD.
* attilio, 2013-02-06 (1 file, -2/+2): Reformat comments to follow original
  version and re-add correct locking flags.
* attilio, 2013-02-06 (1 file, -2/+0): Do not assume the lock to be held so
  that this can be used also in safe cases as a short-cut.
* attilio, 2013-02-06 (1 file, -2/+2): Tweak comment to remove splay tree
  references.
* attilio, 2013-02-06 (1 file, -1/+9): Make vm_object_cache_is_empty() inline.
* attilio, 2013-02-06 (1 file, -1/+1): Avoid namespace pollution in
  vm_object.h by defining the structure for the vm_radix implementation
  separately.
* attilio, 2013-02-06 (1 file, -1/+1):
  - Move the vm_object_cache_is_empty() prototype to be sorted alphabetically.
  - Change the return type to be boolean_t in order to match what
    vm_page_is_cached() does.
* attilio, 2013-02-04 (1 file, -0/+3): Merge from vmcontention
  * attilio, 2013-02-03 (1 file, -0/+1): MFC
    * ken, 2013-01-09 (1 file, -0/+1): Fix a bug in the device pager code that
      can trigger an assertion in devfs if a particular race condition is hit
      in the device pager code.
      This was a side effect of change 227530, which changed the device pager
      interface to call a new destructor routine for the cdev. That destructor
      routine, old_dev_pager_dtor(), takes a VM object handle. The object
      handle is cast to a struct cdev *, and passed into dev_rel().
      That works in most cases, except the case in cdev_pager_allocate() where
      there is a race condition between two threads allocating an object
      backed by the same device. The loser of the race deallocates its object
      at the end of the function.
      The problem is that before inserting the object into the
      dev_pager_object_list, the object's handle is changed from the struct
      cdev pointer to the object's own address. This is to avoid conflicts
      with the winner of the race, which already inserted an object in the
      list with a handle that is a pointer to the same cdev structure.
      The object is then passed to vm_object_deallocate(), and eventually
      makes its way down to old_dev_pager_dtor(). That function passes the
      handle pointer (which is actually a VM object, not a struct cdev as
      usual) into dev_rel(). dev_rel() decrements the reference count in the
      assumed struct cdev (which happens to be 0), and that triggers the
      assertion in dev_rel() that the reference count is greater than or equal
      to 0.
      The fix is to add a cdev pointer to the VM object, and use that pointer
      when calling the cdev_pg_dtor() routine.
      vm_object.h:    Add a struct cdev pointer to the VM object structure.
      device_pager.c: In cdev_pager_allocate(), populate the new cdev pointer.
                      In dev_pager_dealloc(), use the new cdev pointer when
                      calling the object's cdev_pg_dtor() routine.
      Reviewed by: kib
      Sponsored by: Spectra Logic Corporation
      MFC after: 1 week
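      A sketch of the corrected destructor call in dev_pager_dealloc(),
      assuming the new cdev pointer is stored alongside the device pager state
      in the object (the union/field names shown are illustrative):

          /* Pass the saved cdev, not the possibly repurposed object handle. */
          object->un_pager.devp.ops->cdev_pg_dtor(object->un_pager.devp.dev);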
  * attilio, 2012-12-11 (1 file, -0/+2): MFC
    * alc, 2012-12-09 (1 file, -0/+2): In the past four years, we've added two
      new vm object types. Each time, similar changes had to be made in
      various places throughout the machine-independent virtual memory layer
      to support the new vm object type. However, in most of these places,
      it's actually not the type of the vm object that matters to us but
      instead certain attributes of its pages. For example, OBJT_DEVICE,
      OBJT_MGTDEVICE, and OBJT_SG objects contain fictitious pages. In other
      words, in most of these places, we were testing the vm object's type to
      determine if it contained fictitious (or unmanaged) pages.
      To both simplify the code in these places and make the addition of
      future vm object types easier, this change introduces two new vm object
      flags that describe attributes of the vm object's pages, specifically,
      whether they are fictitious or unmanaged.
      Reviewed and tested by: kib
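      A sketch of the kind of test this enables, assuming the new flags carry
      the names OBJ_FICTITIOUS and OBJ_UNMANAGED used in later vm_object.h
      revisions:

          /* Before: enumerate every type whose pages are fictitious. */
          if (object->type == OBJT_DEVICE || object->type == OBJT_MGTDEVICE ||
              object->type == OBJT_SG)
                  return (TRUE);

          /* After: test the page attribute directly. */
          if ((object->flags & OBJ_FICTITIOUS) != 0)
                  return (TRUE);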
* attilio, 2012-07-08 (1 file, -0/+13): Merge from vmcontention
  * attilio, 2012-06-23 (1 file, -0/+13): MFC
    * attilio, 2012-06-22 (1 file, -1/+14):
      - Add a comment explaining the locking of the cached pages pool held by
        vm_objects.
      - Add flags for the per-object lock and the free pages queue mutex lock.
        Use the newly added flags to mark the cache root within the vm_object
        structure.
      Please note that other vm_object members should be marked with correct
      locking but they are left for other commits.
      In collaboration with: alc
      MFC after: 3 days
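      An illustrative excerpt of the annotation style this adds; the exact key
      letters and comment wording in vm_object.h may differ:

          /*
           * Lock key for the annotations below:
           *      (o)     per-object lock
           *      (f)     free pages queue mutex
           */
          vm_page_t cache;        /* (o + f) root of the cached pages pool */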
* attilio, 2012-07-08 (1 file, -1/+1):
  - Split the cached and resident pages tree into 2 distinct ones.
    This makes the RED/BLACK support go away and greatly simplifies the
    vm_radix functions used here. This works because, with PATRICIA trie
    support, the trie will be small enough that keeping 2 different ones will
    be efficient too.
  - Reduce differences with head, in places like the backing scan where the
    optimizations used shuffled the code around a little bit.
  Tested by: flo, Andrea Barberio
* attilio, 2012-06-09 (1 file, -0/+1): Introduce a new tree for dealing with
  cached pages separately and remove the RED/BLACK concept.
  This is based on the assumption that path-compressed tries will be small and
  fast enough that a separate trie for cached pages will make sense and will
  leave the trie code simple enough (along with removing a lot of differences
  in the user-end code).
* attilio, 2012-03-19 (1 file, -3/+3): MFC
  * jhb, 2012-03-19 (1 file, -1/+1): Fix madvise(MADV_WILLNEED) to properly
    handle individual mappings larger than 4GB. Specifically, the inlined
    version of 'ptoa' applied to the 'int' count of pages overflowed on 64-bit
    platforms. While here, change vm_object_madvise() to accept two
    vm_pindex_t parameters (start and end) rather than a (start, count) tuple,
    to match other VM APIs as suggested by alc@.
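    An illustration of the overflow class being fixed (not the literal kernel
    code): shifting an int page count performs the arithmetic in 32 bits
    before the result is widened:

        int npages = 0x200000;  /* 8 GB worth of 4 KB pages */

        vm_offset_t bad  = npages << PAGE_SHIFT;               /* wraps: 32-bit shift */
        vm_offset_t good = (vm_offset_t)npages << PAGE_SHIFT;  /* widen, then shift */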
  * kib, 2012-03-17 (1 file, -2/+2): In vm_object_page_clean(), do not clear
    the OBJ_MIGHTBEDIRTY object flag if the filesystem performed a short write
    and we are skipping the page due to this.
    Propagate the write error from the pager back to the callers of
    vm_pageout_flush(). Report the failure to write a page from the requested
    range as a FALSE return value from vm_object_page_clean(), and propagate
    it back to msync(2) to return EIO to usermode.
    While there, convert the clearobjflags variable in vm_object_page_clean()
    and the arguments of the helper functions to boolean.
    PR: kern/165927
    Reviewed by: alc
    MFC after: 2 weeks
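    A sketch of the caller side after the change (variable names
    illustrative); the boolean result carries the short-write failure up
    toward msync(2):

        if (!vm_object_page_clean(object, offset, offset + size, flags))
                failed = TRUE;  /* eventually surfaces as EIO from msync(2) */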
* attilio, 2012-02-25 (1 file, -0/+1): MFC
  * kib, 2012-02-23 (1 file, -0/+1): Account the writeable shared mappings
    backed by a file in the vnode v_writecount. Keep the amount of the virtual
    address space used by the mappings in the new vm_object
    un_pager.vnp.writemappings counter. The vnode v_writecount is incremented
    when writemappings becomes non-zero, and decremented when writemappings
    returns to zero.
    Writeable shared vnode-backed mappings are accounted for in vm_mmap(), and
    vm_map_insert() is instructed to set the MAP_ENTRY_VN_WRITECNT flag on the
    created map entry. During deferred map entry deallocation,
    vm_map_process_deferred() checks for MAP_ENTRY_VN_WRITECNT and decrements
    writemappings for the vm object.
    Now, a writeable mount cannot be demoted to read-only while writeable
    shared mappings of vnodes from the mount point exist. Also, execve(2)
    fails for such files with ETXTBSY, as it should.
    Noted by: tegge
    Reviewed by: tegge (long time ago, early version), alc
    Tested by: pho
    MFC after: 3 weeks
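    A conceptual sketch of the transition-based accounting described above.
    The helper name writemappings_adjust() is hypothetical; the
    un_pager.vnp.writemappings and v_writecount fields are the ones named in
    the commit, and locking is omitted:

        /* Hypothetical helper: keep v_writecount in step with writemappings. */
        static void
        writemappings_adjust(vm_object_t object, struct vnode *vp,
            vm_ooffset_t delta)
        {
                vm_ooffset_t old;

                old = object->un_pager.vnp.writemappings;
                object->un_pager.vnp.writemappings += delta;
                if (old == 0 && object->un_pager.vnp.writemappings != 0)
                        vp->v_writecount++;     /* first writeable mapping */
                else if (old != 0 && object->un_pager.vnp.writemappings == 0)
                        vp->v_writecount--;     /* last writeable mapping gone */
        }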