path: root/sys/vm/vm_object.h
Commit message | Author | Age | Files | Lines
* VI_OBJDIRTY vnode flag mirrors the state of OBJ_MIGHTBEDIRTY vm object flag. Besides providing redundant information, the need to update both the vnode and object flags causes more acquisitions of the vnode interlock. OBJ_MIGHTBEDIRTY is only checked for vnode-backed vm objects. Remove VI_OBJDIRTY and make sure that OBJ_MIGHTBEDIRTY is set only for vnode-backed vm objects.
  Suggested and reviewed by: alc
  Tested by: pho
  MFC after: 3 weeks
  Author: kib | Age: 2009-12-21 | Files: 1 | Lines: -1/+1
* Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver.
  Reviewed by: alc
  Approved by: re (kensmith)
  MFC after: 2 weeks
  Author: jhb | Age: 2009-07-24 | Files: 1 | Lines: -0/+9
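The offset lookup that sets OBJT_SG apart from OBJT_DEVICE can be pictured with a small user-space sketch. It assumes only that a scatter/gather list is an ordered array of (physical address, length) segments. The names `sg_segment` and `sg_offset_to_paddr()` are hypothetical and are not the sglist(9) API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment type: one physically contiguous range. */
struct sg_segment {
	uint64_t paddr;	/* starting physical address */
	uint64_t len;	/* length in bytes */
};

/*
 * Translate a byte offset into the object into a physical address by
 * walking the segments in order.  Returns 0 on success and -1 if the
 * offset lies beyond the end of the list.
 */
static int
sg_offset_to_paddr(const struct sg_segment *segs, size_t nsegs,
    uint64_t offset, uint64_t *paddr)
{
	size_t i;

	assert(segs != NULL && paddr != NULL);
	for (i = 0; i < nsegs; i++) {
		if (offset < segs[i].len) {
			*paddr = segs[i].paddr + offset;
			return (0);
		}
		/* Skip this segment and continue with the remainder. */
		offset -= segs[i].len;
	}
	return (-1);
}
```

A device pager would obtain the same answer by calling into the driver; here the answer comes from data the object already holds.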
* Add support to the virtual memory system for configuring machine-dependent memory attributes:
  Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the fact that there are machine-dependent memory attributes that have nothing to do with controlling the cache's behavior.
  Introduce vm_object_set_memattr() for setting the default memory attributes that will be given to an object's pages.
  Introduce and use pmap_page_{get,set}_memattr() for getting and setting a page's machine-dependent memory attributes. Add full support for these functions on amd64 and i386 and stubs for them on the other architectures. The function pmap_page_set_memattr() is also responsible for any other machine-dependent aspects of changing a page's memory attributes, such as flushing the cache or updating the direct map. The uses include kmem_alloc_contig(), vm_page_alloc(), and the device pager:
  kmem_alloc_contig() can now be used to allocate kernel memory with non-default memory attributes on amd64 and i386.
  vm_page_alloc() and the device pager will set the memory attributes for the real or fictitious page according to the object's default memory attributes.
  Update the various pmap functions on amd64 and i386 that map pages to incorporate each page's memory attributes in the mapping.
  Notes:
  (1) Inherent to this design are safety features that prevent the specification of inconsistent memory attributes by different mappings on amd64 and i386. In addition, the device pager provides a warning when a device driver creates a fictitious page with memory attributes that are inconsistent with the real page that the fictitious page is an alias for.
  (2) Storing the machine-dependent memory attributes for amd64 and i386 as a dedicated "int" in "struct md_page" represents a compromise between space efficiency and the ease of MFCing these changes to RELENG_7.
  In collaboration with: jhb
  Approved by: re (kib)
  Author: alc | Age: 2009-07-12 | Files: 1 | Lines: -0/+2
* Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.
  The accounting information (charge) is associated with either the map entry or the vm object backing the entry, assuming the object is the first one in the shadow chain and the entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup.
  The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for the proper uid when a region is unmapped.
  The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge the appropriate uid when allocation is performed by the kernel, e.g. md(4).
  Several syscalls, among them fork(2), may now return ENOMEM when global or per-uid limits are enforced.
  In collaboration with: pho
  Reviewed by: alc
  Approved by: re (kensmith)
  Author: kib | Age: 2009-06-23 | Files: 1 | Lines: -1/+4
* Long, long ago in r27464 special case code for mapping device-backed memory with 4MB pages was added to pmap_object_init_pt(). This code assumes that the pages of an OBJT_DEVICE object are always physically contiguous. Unfortunately, this is not always the case. For example, jhb@ informs me that the recently introduced /dev/ksyms driver creates an OBJT_DEVICE object that violates this assumption. Thus, this revision modifies pmap_object_init_pt() to abort the mapping if the OBJT_DEVICE object's pages are not physically contiguous.
  This revision also changes some inconsistent if not buggy behavior. For example, the i386 version aborts if the first 4MB virtual page that would be mapped is already valid. However, it incorrectly replaces any subsequent 4MB virtual page mappings that it encounters, potentially leaking a page table page. The amd64 version has a bug of my own creation. It potentially busies the wrong page and always an insufficient number of pages if it blocks allocating a page table page.
  To my knowledge, there have been no reports of these bugs, hence, their persistence. I suspect that the existing restrictions that pmap_object_init_pt() placed on the OBJT_DEVICE objects that it would choose to map, for example, that the first page must be aligned on a 2 or 4MB physical boundary and that the size of the mapping must be a multiple of the large page size, were enough to avoid triggering the bug for drivers like ksyms. However, one side effect of testing the OBJT_DEVICE object's pages for physical contiguity is that a dubious difference between pmap_object_init_pt() and the standard path for mapping device pages, i.e., vm_fault(), has been eliminated. Previously, pmap_object_init_pt() would only instantiate the first PG_FICTITIOUS page being mapped because it never examined the rest. Now, however, pmap_object_init_pt() uses the new function vm_object_populate() to instantiate them all (in order to support testing their physical contiguity).
  These pages need to be instantiated for the mechanism that I have prototyped for automatically maintaining the consistency of the PAT settings across multiple mappings, particularly, amd64's direct mapping, to work. (Translation: This change is also being made to support jhb@'s work on the Nvidia feature requests.)
  Discussed with: jhb@
  Author: alc | Age: 2009-06-14 | Files: 1 | Lines: -0/+1
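The physical-contiguity test that this commit adds before using a large-page mapping can be illustrated with a user-space sketch. It assumes the pages have already been instantiated and their physical addresses collected into an array. `pages_are_contiguous()` and `SKETCH_PAGE_SIZE` are hypothetical names, and the 4096-byte page size is an assumption for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for illustration */

/* Return true if every page directly follows its predecessor in physical memory. */
static bool
pages_are_contiguous(const uint64_t *paddrs, size_t npages)
{
	size_t i;

	assert(paddrs != NULL || npages == 0);
	for (i = 1; i < npages; i++) {
		if (paddrs[i] != paddrs[i - 1] + SKETCH_PAGE_SIZE)
			return (false);
	}
	return (true);
}
```

A caller that gets `false` would fall back to the ordinary per-page path, which corresponds to aborting the large-page mapping in the commit's description.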
* Eliminate OBJ_NEEDGIANT. After r188331, OBJ_NEEDGIANT's only use is by a redundant assertion in vm_fault().
  Reviewed by: kib
  Author: alc | Age: 2009-02-08 | Files: 1 | Lines: -1/+0
* Allow VM object creation in ufs_lookup. (If vfs.vmiodirenable is set) Directory IO without a VM object will store data in 'malloced' buffers, severely limiting caching of the data. Without this change VM objects for directories are only created on an open() of the directory.
  TODO: Inline test if VM object already exists to avoid locking/function call overhead.
  Tested by: kris@
  Reviewed by: jeff@
  Reported by: David Filo
  Author: ups | Age: 2008-05-20 | Files: 1 | Lines: -0/+1
* Add a list of reservations to the vm object structure. Recycle the vm object's "pg_color" field to represent the color of the first virtual page address at which the object is mapped instead of the color of the object's first physical page. Since an object may not be mapped, introduce a flag "OBJ_COLORED" that indicates whether "pg_color" is valid.
  Author: alc | Age: 2007-12-27 | Files: 1 | Lines: -0/+2
* Change the management of cached pages (PQ_CACHE) in two fundamental ways:
  (1) Cached pages are no longer kept in the object's resident page splay tree and memq. Instead, they are kept in a separate per-object splay tree of cached pages. However, access to this new per-object splay tree is synchronized by the _free_ page queues lock, not to be confused with the heavily contended page queues lock. Consequently, a cached page can be reclaimed by vm_page_alloc(9) without acquiring the object's lock or the page queues lock.
  This solves a problem independently reported by tegge@ and Isilon. Specifically, they observed the page daemon consuming a great deal of CPU time because of pages bouncing back and forth between the cache queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of this problem turned out to be a deadlock avoidance strategy employed when selecting a cached page to reclaim in vm_page_select_cache(). However, the root cause was really that reclaiming a cached page required the acquisition of an object lock while the page queues lock was already held. Thus, this change addresses the problem at its root, by eliminating the need to acquire the object's lock.
  Moreover, keeping cached pages in the object's primary splay tree and memq was, in effect, optimizing for the uncommon case. Cached pages are reclaimed far, far more often than they are reactivated. Instead, this change makes reclamation cheaper, especially in terms of synchronization overhead, and reactivation more expensive, because reactivated pages will have to be reentered into the object's primary splay tree and memq.
  (2) Cached pages are now stored alongside free pages in the physical memory allocator's buddy queues, increasing the likelihood that large allocations of contiguous physical memory (i.e., superpages) will succeed.
  Finally, as a result of this change long-standing restrictions on when and where a cached page can be reclaimed and returned by vm_page_alloc(9) are eliminated. Specifically, calls to vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and return a formerly cached page. Consequently, a call to malloc(9) specifying M_NOWAIT is less likely to fail.
  Discussed with: many over the course of the summer, including jeff@, Justin Husted @ Isilon, peter@, tegge@
  Tested by: an earlier version by kris@
  Approved by: re (kensmith)
  Author: alc | Age: 2007-09-25 | Files: 1 | Lines: -0/+1
* Eliminate OBJ_WRITEABLE. It hasn't been used in a long time.
  Author: alc | Age: 2006-07-21 | Files: 1 | Lines: -1/+0
* Make vm_object_vndeallocate() static. The external calls to it were eliminated in ufs/ffs/ffs_vnops.c's revision 1.125.
  Author: alc | Age: 2006-01-22 | Files: 1 | Lines: -1/+0
* - Add a new object flag "OBJ_NEEDSGIANT". We set this flag if the underlying vnode requires Giant.
  - In vm_fault only acquire Giant if the underlying object has NEEDSGIANT set.
  - In vm_object_shadow inherit the NEEDSGIANT flag from the backing object.
  Author: jeff | Age: 2005-05-03 | Files: 1 | Lines: -0/+1
* - Change the vm_mmap() function to accept an objtype_t parameter specifying the type of object represented by the handle argument.
  - Allow vm_mmap() to map device memory via cdev objects in addition to vnodes and anonymous memory. Note that mmapping a cdev directly does not currently perform any MAC checks like mapping a vnode does.
  - Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the cdev the ioctl is acting on rather than trying to find a suitable vnode to map from.
  Reviewed by: alc, arch@
  Author: jhb | Age: 2005-04-01 | Files: 1 | Lines: -4/+0
* /* -> /*- for license, minor formatting changes
  Author: imp | Age: 2005-01-07 | Files: 1 | Lines: -1/+1
* With the removal of kern/uipc_jumbo.c and sys/jumbo.h, vm_object_allocate_wait() is not used. Remove it.
  Author: alc | Age: 2004-12-08 | Files: 1 | Lines: -1/+0
* Move a call to wakeup() from vm_object_terminate() to vnode_pager_dealloc() because this call is only needed to wake threads that slept when they discovered a dead object connected to a vnode. To eliminate unnecessary calls to wakeup() by vnode_pager_dealloc(), introduce a new flag, OBJ_DISCONNECTWNT.
  Reviewed by: tegge@
  Author: alc | Age: 2004-11-06 | Files: 1 | Lines: -0/+1
* Make the code and comments for vm_object_coalesce() consistent.
  Author: alc | Age: 2004-07-25 | Files: 1 | Lines: -1/+1
* - Change uma_zone_set_obj() to call kmem_alloc_nofault() instead of kmem_alloc_pageable(). The difference between these is that an errant memory access to the zone will be detected sooner with kmem_alloc_nofault().
  The following changes serve to eliminate the following lock-order reversal reported by witness:
    1st 0xc1a3c084 vm object (vm object) @ vm/swap_pager.c:1311
    2nd 0xc07acb00 swap_pager swhash (swap_pager swhash) @ vm/swap_pager.c:1797
    3rd 0xc1804bdc vm object (vm object) @ vm/uma_core.c:931
  There is no potential deadlock in this case. However, witness is unable to recognize this because vm objects used by UMA have the same type as ordinary vm objects. To remedy this, we make the following changes:
  - Add a mutex type argument to VM_OBJECT_LOCK_INIT().
  - Use the mutex type argument to assign distinct types to special vm objects such as the kernel object, kmem object, and UMA objects.
  - Define a static swap zone object for use by UMA. (Only static objects are assigned a special mutex type.)
  Author: alc | Age: 2004-07-22 | Files: 1 | Lines: -2/+3
* Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999.
  Approved by: core
  Author: imp | Age: 2004-04-06 | Files: 1 | Lines: -4/+0
* - Rename vm_map_clean() to vm_map_sync(). This better reflects the fact that msync(2) is its only caller.
  - Migrate the parts of the old vm_map_clean() that examined the internals of a vm object to a new function vm_object_sync() that is implemented in vm_object.c. At the same time, introduce the necessary vm object locking so that vm_map_sync() and vm_object_sync() can be called without Giant.
  Reviewed by: tegge
  Author: alc | Age: 2003-11-09 | Files: 1 | Lines: -0/+2
* - Introduce and use vm_object_reference_locked(). Unlike vm_object_reference(), this function must not be used to reanimate dead vm objects. This restriction simplifies locking.
  Reviewed by: tegge
  Author: alc | Age: 2003-11-02 | Files: 1 | Lines: -0/+1
* - Revert a part of revision 1.73: Make vm_object_set_flag() an inline function. This function is so trivial that inlining reduces the size of the kernel.
  Author: alc | Age: 2003-10-31 | Files: 1 | Lines: -1/+10
* Reduce the size of the vm object on 64-bit architectures by moving a field within the structure.
  Author: alc | Age: 2003-08-12 | Files: 1 | Lines: -1/+1
* - Add VM_OBJECT_TRYLOCK().
  Author: alc | Age: 2003-06-04 | Files: 1 | Lines: -0/+1
* - Add vm object locking to vm_object_deallocate(). (Still more changes are required.)
  - Remove special-case macros for kmem object locking. They are no longer used.
  Author: alc | Age: 2003-06-04 | Files: 1 | Lines: -5/+0
* Change kernel_object and kmem_object to (&kernel_object_store) and (&kmem_object_store), respectively. This allows the address of these objects to be resolved at link-time rather than run-time.
  Author: alc | Age: 2003-06-01 | Files: 1 | Lines: -2/+5
* Reduce the size of a vm object by converting its shadow list from a TAILQ to a LIST.
  Approved by: re (rwatson)
  Author: alc | Age: 2003-05-18 | Files: 1 | Lines: -2/+2
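Why a LIST is smaller than a TAILQ can be seen from the shape of the two head structures. The sketch below mirrors the head layouts used by the queue(3) macros with hypothetical `sk_` names rather than including <sys/queue.h>. It assumes that all object pointers have the same size, as on common platforms.

```c
#include <assert.h>
#include <stddef.h>

struct sk_obj;	/* hypothetical element type, left incomplete on purpose */

/* Shape of a LIST head: a single pointer to the first element. */
struct sk_list_head {
	struct sk_obj *lh_first;
};

/* Shape of a TAILQ head: the first element plus the address of the last link. */
struct sk_tailq_head {
	struct sk_obj *tqh_first;
	struct sk_obj **tqh_last;
};

/* Bytes saved in each containing structure by switching head types. */
static size_t
sk_head_saving(void)
{
	assert(sizeof(struct sk_tailq_head) >= sizeof(struct sk_list_head));
	return (sizeof(struct sk_tailq_head) - sizeof(struct sk_list_head));
}
```

The price of the smaller head is that a LIST cannot append at the tail in constant time, which is acceptable when the list is only traversed or edited at known elements.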
* - Define VM_OBJECT_LOCK_INIT().
  - Avoid repeatedly mtx_init()ing and mtx_destroy()ing the vm_object's lock using UMA's uminit callback, in this case, vm_object_zinit().
  Author: alc | Age: 2003-04-28 | Files: 1 | Lines: -0/+2
* - Convert vm_object_pip_wait() from using tsleep() to msleep().
  - Make vm_object_pip_sleep() static.
  - Lock the vm_object when performing vm_object_pip_wait().
  Author: alc | Age: 2003-04-26 | Files: 1 | Lines: -1/+1
* Add VM_OBJECT_LOCKED().
  Author: alc | Age: 2003-04-22 | Files: 1 | Lines: -0/+1
* - Lock the vm_object when performing vm_object_pip_wakeupn().
  - Assert that the vm_object lock is held in vm_object_pip_wakeupn().
  - Add a new macro VM_OBJECT_LOCK_ASSERT().
  Author: alc | Age: 2003-04-19 | Files: 1 | Lines: -0/+2
* Add new macros for locking and unlocking a vm object.
  Author: alc | Age: 2003-04-13 | Files: 1 | Lines: -0/+3
* Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
  Discussed on: arch@
  Author: alc | Age: 2003-03-06 | Files: 1 | Lines: -4/+0
* Fuse two #ifdefs with identical conditions.
  Author: alc | Age: 2003-02-25 | Files: 1 | Lines: -3/+0
* - Remove vm_object_init2(). It is unused.
  - Add a mtx_destroy() to vm_object_collapse(). (This allows a bzero() to migrate from _vm_object_allocate() to vm_object_zinit(), where it will be performed less often.)
  Author: alc | Age: 2002-12-29 | Files: 1 | Lines: -1/+0
* Add a mutex to struct vm_object. Initialize and destroy that mutex at appropriate times. For the moment, the mutex is only used on the kmem_object.
  Author: alc | Age: 2002-12-20 | Files: 1 | Lines: -2/+7
* Remove the hash_rand field from struct vm_object. As of revision 1.215 of vm/vm_page.c, it is unused.
  Author: alc | Age: 2002-12-19 | Files: 1 | Lines: -1/+0
* Remove dead code that hasn't been needed since the demise of share maps in various revisions of vm/vm_map.c between 1.148 and 1.153.
  Author: alc | Age: 2002-11-13 | Files: 1 | Lines: -2/+0
* Replace the vm_page hash table with a per-vmobject splay tree. There should be no major change in performance from this change at this time but this will allow other work to progress: Giant lock removal around VM system in favor of per-object mutexes, ranged fsyncs, more optimal COMMIT rpc's for NFS, partial filesystem syncs by the syncer, more optimal object flushing, etc. Note that the buffer cache is already using a similar splay tree mechanism.
  Note that a good chunk of the old hash table code is still in the tree. Alan or I will remove it prior to the release if the new code does not introduce unsolvable bugs, else we can revert more easily.
  Submitted by: alc (this is Alan's code)
  Approved by: re
  Author: dillon | Age: 2002-10-18 | Files: 1 | Lines: -0/+1
* Reduce namespace pollution.
  Submitted by: bde
  Author: alc | Age: 2002-09-21 | Files: 1 | Lines: -3/+0
* o Resurrect vm_object_lock() and vm_object_unlock() from revision 1.19. (For now, they simply acquire and release Giant.)
  Author: alc | Age: 2002-08-24 | Files: 1 | Lines: -0/+6
* At long last, commit the zero copy sockets code.
  MAKEDEV: Add MAKEDEV glue for the ti(4) device nodes.
  ti.4: Update the ti(4) man page to include information on the TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options, and also include information about the new character device interface and the associated ioctls.
  man9/Makefile: Add jumbo.9 and zero_copy.9 man pages and associated links.
  jumbo.9: New man page describing the jumbo buffer allocator interface and operation.
  zero_copy.9: New man page describing the general characteristics of the zero copy send and receive code, and what an application author should do to take advantage of the zero copy functionality.
  NOTES: Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS, TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT.
  conf/files: Add uipc_jumbo.c and uipc_cow.c.
  conf/options: Add the 5 options mentioned above.
  kern_subr.c: Receive side zero copy implementation. This takes "disposable" pages attached to an mbuf, gives them to a user process, and then recycles the user's page. This is only active when ZERO_COPY_SOCKETS is turned on and the kern.ipc.zero_copy.receive sysctl variable is set to 1.
  uipc_cow.c: Send side zero copy functions. Takes a page written by the user and maps it copy on write and assigns it kernel virtual address space. Removes copy on write mapping once the buffer has been freed by the network stack.
  uipc_jumbo.c: Jumbo disposable page allocator code. This allocates (optionally) disposable pages for network drivers that want to give the user the option of doing zero copy receive.
  uipc_socket.c: Add kern.ipc.zero_copy.{send,receive} sysctls that are enabled if ZERO_COPY_SOCKETS is turned on. Add zero copy send support to sosend() -- pages get mapped into the kernel instead of getting copied if they meet size and alignment restrictions.
  uipc_syscalls.c: Un-staticize some of the sf* functions so that they can be used elsewhere. (uipc_cow.c)
  if_media.c: In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid calling malloc() with M_WAITOK. Return an error if the M_NOWAIT malloc fails. The ti(4) driver and the wi(4) driver, at least, call this with a mutex held. This causes witness warnings for 'ifconfig -a' with a wi(4) or ti(4) board in the system. (I've only verified for ti(4)).
  ip_output.c: Fragment large datagrams so that each segment contains a multiple of PAGE_SIZE amount of data plus headers. This allows the receiver to potentially do page flipping on receives.
  if_ti.c: Add zero copy receive support to the ti(4) driver. If TI_PRIVATE_JUMBOS is not defined, it now uses the jumbo(9) buffer allocator for jumbo receive buffers. Add a new character device interface for the ti(4) driver for the new debugging interface. This allows (a patched version of) gdb to talk to the Tigon board and debug the firmware. There are also a few additional debugging ioctls available through this interface. Add header splitting support to the ti(4) driver. Tweak some of the default interrupt coalescing parameters to more useful defaults. Add hooks for supporting transmit flow control, but leave it turned off with a comment describing why it is turned off.
  if_tireg.h: Change the firmware rev to 12.4.11, since we're really at 12.4.11 plus fixes from 12.4.13. Add defines needed for debugging. Remove the ti_stats structure, it is now defined in sys/tiio.h.
  ti_fw.h: 12.4.11 firmware.
  ti_fw2.h: 12.4.11 firmware, plus selected fixes from 12.4.13, and my header splitting patches. Revision 12.4.13 doesn't handle 10/100 negotiation properly. (This firmware is the same as what was in the tree previously, with the addition of header splitting support.)
  sys/jumbo.h: Jumbo buffer allocator interface.
  sys/mbuf.h: Add a new external mbuf type, EXT_DISPOSABLE, to indicate that the payload buffer can be thrown away / flipped to a userland process.
  socketvar.h: Add prototype for socow_setup.
  tiio.h: ioctl interface to the character portion of the ti(4) driver, plus associated structure/type definitions.
  uio.h: Change prototype for uiomoveco() so that we'll know whether the source page is disposable.
  ufs_readwrite.c: Update for new prototype of uiomoveco().
  vm_fault.c: In vm_fault(), check to see whether we need to do a page based copy on write fault.
  vm_object.c: Add a new function, vm_object_allocate_wait(). This does the same thing that vm_object_allocate() does, except that it gives the caller the opportunity to specify whether it should wait on the uma_zalloc() of the object structure. This allows vm objects to be allocated while holding a mutex. (Without generating WITNESS warnings.) vm_object_allocate() is implemented as a call to vm_object_allocate_wait() with the malloc flag set to M_WAITOK.
  vm_object.h: Add prototype for vm_object_allocate_wait().
  vm_page.c: Add page-based copy on write setup, clear and fault routines.
  vm_page.h: Add page based COW function prototypes and variable in the vm_page structure.
  Many thanks to Drew Gallatin, who wrote the zero copy send and receive code, and to all the other folks who have tested and reviewed this code over the years.
  Author: ken | Age: 2002-06-26 | Files: 1 | Lines: -0/+1
* Complete the initial set of VM changes required to support full 64-bit file sizes. This step simply addresses the remaining overflows, and does attempt to optimise performance. The details are:
  o Use a 64-bit type for the vm_object `size' and the size argument to vm_object_allocate().
  o Use the correct type for index variables in dev_pager_getpages(), vm_object_page_clean() and vm_object_page_remove().
  o Avoid an overflow in the i386 pmap_object_init_pt().
  Author: iedowse | Age: 2002-06-25 | Files: 1 | Lines: -3/+3
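The kind of overflow this commit removes can be shown with a page-index-to-byte-offset conversion. The sketch below is illustrative only: `SK_PAGE_SHIFT` and both function names are hypothetical, and a 4 KB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12	/* assumed 4 KB pages */

/* Truncating conversion: the result wraps once the offset reaches 4 GB. */
static uint32_t
index_to_offset32(uint32_t pindex)
{
	return (pindex << SK_PAGE_SHIFT);
}

/* Widen before shifting so that large files keep their full offset. */
static uint64_t
index_to_offset64(uint64_t pindex)
{
	/* The shift itself must not discard high bits either. */
	assert(pindex <= (UINT64_MAX >> SK_PAGE_SHIFT));
	return (pindex << SK_PAGE_SHIFT);
}
```

With a page index of 0x100000 (the first page past 4 GB), the 32-bit version yields 0 while the 64-bit version yields 0x100000000, which is why index and size variables need the wider type.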
* o Migrate vm_map_split() from vm_map.c to vm_object.c, renaming it to vm_object_split(). Its interface should still be changed to resemble vm_object_shadow().
  Author: alc | Age: 2002-06-02 | Files: 1 | Lines: -0/+1
* o Move vm_freeze_copyopts() from vm_map.{c,h} to vm_object.{c,h}. It's plainly an operation on a vm_object and belongs in the latter place.
  Author: alc | Age: 2002-05-06 | Files: 1 | Lines: -0/+1
* o Make _vm_object_allocate() and vm_object_allocate() callable without holding Giant.
  o Begin documenting the trivial cases of the locking protocol on vm_object.
  Author: alc | Age: 2002-05-04 | Files: 1 | Lines: -2/+5
* Reintroduce locking on accesses to vm_object_list.
  Author: alc | Age: 2002-04-20 | Files: 1 | Lines: -0/+1
* Implement kern.maxvnodes. Adjusting kern.maxvnodes now actually has a real effect.
  Optimize vfs_msync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. Improves looping case by 500%.
  Optimize ffs_sync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. This makes a couple of assumptions, which I believe are ok, in regards to vnode stability when the mount list mutex is held. Improves looping case by 500%.
  (more optimization work is needed on top of these fixes)
  MFC after: 1 week
  Author: dillon | Age: 2001-10-26 | Files: 1 | Lines: -0/+1
* Oops. Last commit to vm_object.c should have got these files too. Remove the use of atomic ops to manipulate vm_object and vm_page flags. Giant is required here, so they are superfluous.
  Discussed with: dillon
  Author: jake | Age: 2001-07-31 | Files: 1 | Lines: -1/+0
* Change inlines back into mainline code in preparation for mutexing. Also, most of these inlines had been bloated in -current far beyond their original intent. Normalize prototypes and function declarations to be ANSI only (half already were). And do some general cleanup. (kernel size also reduced by 50-100K, but that isn't the prime intent)
  Author: dillon | Age: 2001-07-04 | Files: 1 | Lines: -95/+26