path: root/sys/vm/vm.h
Commit log; each entry gives the commit message, author, date, files changed, and lines removed/added.
* zont, 2012-12-18 (1 file, -0/+2)
  - Fix locked memory accounting for maps with MAP_WIREFUTURE flag.
  - Add sysctl vm.old_mlock which may turn such accounting off.
  Reviewed by: avg, trasz
  Approved by: kib (mentor)
  MFC after: 1 week
* kib, 2012-05-12 (1 file, -1/+1)
  Add new pager type, OBJT_MGTDEVICE. It provides the device pager which carries fictitious managed pages. In particular, the consumers of the new object type can remove all mappings of the device page with pmap_remove_all().
  The range of physical addresses used for fake page allocation shall be registered with vm_phys_fictitious_reg_range() interface to allow the PHYS_TO_VM_PAGE() to work in pmap.
  Most likely, only i386 and amd64 pmaps can handle fictitious managed pages right now.
  Sponsored by: The FreeBSD Foundation
  Reviewed by: alc
  MFC after: 1 month
* trasz, 2010-12-02 (1 file, -3/+3)
  Replace pointer to "struct uidinfo" with pointer to "struct ucred" in "struct vm_object". This is required to make it possible to account for per-jail swap usage.
  Reviewed by: kib@
  Tested by: pho@
  Sponsored by: FreeBSD Foundation
* alc, 2009-11-26 (1 file, -1/+1)
  Replace VM_PROT_OVERRIDE_WRITE by VM_PROT_COPY. VM_PROT_OVERRIDE_WRITE has represented a write access that is allowed to override write protection. Until now, VM_PROT_OVERRIDE_WRITE has been used to write breakpoints into text pages. Text pages are not just write protected but they are also copy-on-write. VM_PROT_OVERRIDE_WRITE overrides the write protection on the text page and triggers the replication of the page so that the breakpoint will be written to a private copy.
  However, here is where things become confused. It is the debugger, not the process being debugged that requires write access to the copied page. Nonetheless, the copied page is being mapped into the process with write access enabled. In other words, once the debugger sets a breakpoint within a text page, the program can write to its private copy of that text page. Whereas prior to setting the breakpoint, a SIGSEGV would have occurred upon a write access.
  VM_PROT_COPY addresses this problem. The combination of VM_PROT_READ and VM_PROT_COPY forces the replication of a copy-on-write page even though the access is only for read. Moreover, the replicated page is only mapped into the process with read access, and not write access.
  Reviewed by: kib
  MFC after: 4 weeks
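The VM_PROT_COPY behavior described in this commit message can be modelled roughly as follows. This is only a sketch: the flag values, the sk_fault_result structure, and the sk_resolve_cow_fault() helper are assumptions made for illustration and are not taken from vm.h or the fault handler.

```c
#include <stdbool.h>

/* Illustrative flag values; assumptions, not the header's definitions. */
typedef unsigned char sk_prot_t;
#define SK_VM_PROT_READ  ((sk_prot_t)0x01)
#define SK_VM_PROT_WRITE ((sk_prot_t)0x02)
#define SK_VM_PROT_COPY  ((sk_prot_t)0x08)

/* Hypothetical outcome of a fault on a copy-on-write page. */
struct sk_fault_result {
	bool      copied;      /* was a private copy of the page made? */
	sk_prot_t mapped_prot; /* protection of the resulting mapping */
};

static struct sk_fault_result
sk_resolve_cow_fault(sk_prot_t fault_type)
{
	struct sk_fault_result r;

	/* A write, or an explicit copy request, replicates the page. */
	r.copied = (fault_type & (SK_VM_PROT_WRITE | SK_VM_PROT_COPY)) != 0;
	/* The copy request is not an access right, so it is stripped. */
	r.mapped_prot = fault_type & (SK_VM_PROT_READ | SK_VM_PROT_WRITE);
	return (r);
}
```

In this model a debugger-style request of read plus copy yields a private page that the process still cannot write, which is the property the commit is after.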
* jhb, 2009-08-28 (1 file, -6/+6)
  Extend the device pager to support different memory attributes on different pages in an object.
  - Add a new variant of d_mmap() currently called d_mmap2() which accepts an additional in/out parameter that is the memory attribute to use for the requested page.
  - A driver either uses d_mmap() or d_mmap2() for all requests but not both. The current implementation uses a flag in the cdevsw (D_MMAP2) to indicate that the driver provides a d_mmap2() handler instead of d_mmap(). This is done to make the change ABI compatible with existing drivers and MFC'able to 7 and 8.
  Submitted by: alc
  MFC after: 1 month
* jhb, 2009-07-24 (1 file, -1/+1)
  Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver.
  Reviewed by: alc
  Approved by: re (kensmith)
  MFC after: 2 weeks
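The offset-to-physical-address lookup that this message attributes to OBJT_SG could look roughly like the sketch below. The sk_seg structure and sk_sg_offset_to_paddr() are hypothetical stand-ins; they do not reproduce the actual sglist(9) data layout or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scatter/gather segment; not the sglist(9) layout. */
struct sk_seg {
	uint64_t paddr; /* physical start address of the segment */
	uint64_t len;   /* segment length in bytes */
};

/*
 * Translate an offset into the object to a physical address by walking
 * the segments in order. Returns 0 on success, or -1 if the offset lies
 * beyond the end of the list.
 */
static int
sk_sg_offset_to_paddr(const struct sk_seg *segs, size_t nsegs,
    uint64_t offset, uint64_t *paddr)
{
	assert(paddr != NULL);
	for (size_t i = 0; i < nsegs; i++) {
		if (offset < segs[i].len) {
			*paddr = segs[i].paddr + offset;
			return (0);
		}
		/* Skip this segment and keep the remaining offset. */
		offset -= segs[i].len;
	}
	return (-1);
}
```

The point of the sketch is that no driver callback is involved: the address comes purely from the segment list.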
* alc, 2009-07-12 (1 file, -3/+3)
  Add support to the virtual memory system for configuring machine-dependent memory attributes:
  Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the fact that there are machine-dependent memory attributes that have nothing to do with controlling the cache's behavior.
  Introduce vm_object_set_memattr() for setting the default memory attributes that will be given to an object's pages.
  Introduce and use pmap_page_{get,set}_memattr() for getting and setting a page's machine-dependent memory attributes. Add full support for these functions on amd64 and i386 and stubs for them on the other architectures. The function pmap_page_set_memattr() is also responsible for any other machine-dependent aspects of changing a page's memory attributes, such as flushing the cache or updating the direct map. The uses include kmem_alloc_contig(), vm_page_alloc(), and the device pager:
  kmem_alloc_contig() can now be used to allocate kernel memory with non-default memory attributes on amd64 and i386.
  vm_page_alloc() and the device pager will set the memory attributes for the real or fictitious page according to the object's default memory attributes.
  Update the various pmap functions on amd64 and i386 that map pages to incorporate each page's memory attributes in the mapping.
  Notes:
  (1) Inherent to this design are safety features that prevent the specification of inconsistent memory attributes by different mappings on amd64 and i386. In addition, the device pager provides a warning when a device driver creates a fictitious page with memory attributes that are inconsistent with the real page that the fictitious page is an alias for.
  (2) Storing the machine-dependent memory attributes for amd64 and i386 as a dedicated "int" in "struct md_page" represents a compromise between space efficiency and the ease of MFCing these changes to RELENG_7.
  In collaboration with: jhb
  Approved by: re (kib)
* alc, 2009-06-26 (1 file, -0/+8)
  This change is the next step in implementing the cache control functionality required by video card drivers. Specifically, this change introduces vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all architectures. In addition, this change adds a vm_cache_mode_t parameter to kmem_alloc_contig() and vm_phys_alloc_contig(). These will be the interfaces for allocating mapped kernel memory and physical memory, respectively, with non-default cache modes.
  In collaboration with: jhb
* kib, 2009-06-23 (1 file, -0/+7)
  Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.
  The accounting information (charge) is associated with either the map entry or the vm object backing the entry, assuming the object is the first one in the shadow chain and the entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup.
  The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped.
  The interface of vm_pager_allocate(9) is extended by adding struct ucred *, which is used to charge the appropriate uid when allocation is performed by the kernel, e.g. md(4).
  Several syscalls, among them fork(2), may now return ENOMEM when global or per-uid limits are enforced.
  In collaboration with: pho
  Reviewed by: alc
  Approved by: re (kensmith)
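The per-uid reservation check described in this message can be illustrated with a small model. Everything here is an assumption for illustration: the sk_uid_swap structure and both helpers are hypothetical, and the real kernel accounting tracks considerably more state (global limits, credentials, map entries and objects).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-uid accounting record; names are illustrative. */
struct sk_uid_swap {
	uint64_t reserved; /* bytes currently charged to this uid */
	uint64_t limit;    /* stand-in for the RLIMIT_SWAP value */
};

/* Try to charge incr bytes; refuse if the limit would be exceeded. */
static bool
sk_swap_reserve(struct sk_uid_swap *u, uint64_t incr)
{
	/* Written to avoid overflow in reserved + incr. */
	if (u->reserved > u->limit || incr > u->limit - u->reserved)
		return (false); /* a caller would report ENOMEM here */
	u->reserved += incr;
	return (true);
}

/* Give back decr bytes, e.g. when a region is unmapped. */
static void
sk_swap_release(struct sk_uid_swap *u, uint64_t decr)
{
	u->reserved -= (decr > u->reserved) ? u->reserved : decr;
}
```

A failed reservation in this model corresponds to the ENOMEM return that the message says syscalls such as fork(2) may now produce.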
* alc, 2007-12-27 (1 file, -0/+3)
  Add the superpage reservation type.
* alc, 2006-07-21 (1 file, -13/+0)
  Retire debug.mpsafevm. None of the architectures supported in CVS require it any longer.
* jhb, 2005-04-01 (1 file, -0/+4)
  - Change the vm_mmap() function to accept an objtype_t parameter specifying the type of object represented by the handle argument.
  - Allow vm_mmap() to map device memory via cdev objects in addition to vnodes and anonymous memory. Note that mmaping a cdev directly does not currently perform any MAC checks like mapping a vnode does.
  - Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the cdev the ioctl is acting on rather than trying to find a suitable vnode to map from.
  Reviewed by: alc, arch@
* imp, 2005-01-07 (1 file, -1/+1)
  /* -> /*- for license, minor formatting changes
* alc, 2004-08-16 (1 file, -0/+13)
  - Introduce and use a new tunable "debug.mpsafevm". At present, setting "debug.mpsafevm" results in (almost) Giant-free execution of zero-fill page faults. (Giant is held only briefly, just long enough to determine if there is a vnode backing the faulting address.) Also, condition the acquisition and release of Giant around calls to pmap_remove() on "debug.mpsafevm". The effect on performance is significant. On my dual Opteron, I see a 3.6% reduction in "buildworld" time.
  - Use atomic operations to update several counters in vm_fault().
* imp, 2004-04-06 (1 file, -4/+0)
  Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999.
  Approved by: core
* dillon, 2002-12-16 (1 file, -0/+1)
  Change the way ELF coredumps are handled. Instead of unconditionally skipping read-only pages, which can result in valuable non-text-related data not getting dumped, the ELF loader and the dynamic loader now mark read-only text pages NOCORE and the coredump code only checks (primarily) for complete inaccessibility of the page or NOCORE being set.
  Certain applications which map large amounts of read-only data will produce much larger cores. A new sysctl has been added, debug.elf_legacy_coredump, which will revert to the old behavior.
  This commit represents collaborative work by all parties involved. The PR contains a program demonstrating the problem.
  PR: kern/45994
  Submitted by: "Peter Edwards" <pmedwards@eircom.net>, Archie Cobbs <archie@dellroad.org>
  Reviewed by: jdp, dillon
  MFC after: 7 days
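The dump decision described in this message can be summarized as a small predicate. This is a sketch under stated assumptions: the protection bit values, the sk_map_entry structure, and sk_should_dump() are hypothetical, and the legacy_coredump argument merely stands in for the debug.elf_legacy_coredump sysctl.

```c
#include <stdbool.h>

/* Illustrative protection bits; assumptions, not vm.h definitions. */
#define SK_PROT_READ    0x1u
#define SK_PROT_WRITE   0x2u
#define SK_PROT_EXECUTE 0x4u

/* Hypothetical view of one mapped region. */
struct sk_map_entry {
	unsigned prot;   /* current protection of the region */
	bool     nocore; /* set by the loaders on read-only text */
};

static bool
sk_should_dump(const struct sk_map_entry *e, bool legacy_coredump)
{
	if (legacy_coredump) {
		/* Old behavior: read-only regions are always skipped. */
		return ((e->prot & SK_PROT_WRITE) != 0);
	}
	if (e->nocore)
		return (false);
	/* New behavior: skip only completely inaccessible regions. */
	return ((e->prot &
	    (SK_PROT_READ | SK_PROT_WRITE | SK_PROT_EXECUTE)) != 0);
}
```

Under this model a read-only data mapping is dumped by default but skipped in legacy mode, which matches the message's warning that some applications will produce much larger cores.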
* eivind, 2002-03-10 (1 file, -1/+1)
  - Remove a number of extra newlines that do not belong here according to style(9)
  - Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc.
  - Add /* SYMBOL */ after a few #endifs.
  Reviewed by: alc
* dwmalone, 2002-01-25 (1 file, -1/+1)
  Remove a parameter name from a prototype.
* dillon, 2001-08-22 (1 file, -0/+17)
  Move most of the kernel submap initialization code, including the timeout callwheel and buffer cache, out of the platform specific areas and into the machine independent area. i386 and alpha adjusted here. Other cpus can be fixed piecemeal.
  Reviewed by: freebsd-smp, jake
* dillon, 2001-07-04 (1 file, -4/+0)
  With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
* alfred, 2001-05-19 (1 file, -0/+4)
  Introduce a global lock for the vm subsystem (vm_mtx).
  vm_mtx does not recurse and is required for most low level vm operations.
  faults can not be taken without holding Giant.
  Memory subsystems can now call the base page allocators safely.
  Almost all atomic ops were removed as they are covered under the vm mutex.
  Alpha and ia64 now need to catch up to i386's trap handlers.
  FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
  Reviewed (partially) by: jake, jhb
* peter, 1999-12-29 (1 file, -1/+1)
  Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistent with the other BSD's who made this change quite some time ago. More commits to come.
* phk, 1999-10-29 (1 file, -1/+44)
  useracc() the prequel:
  Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
  This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
* peter, 1999-08-28 (1 file, -1/+1)
  $Id$ -> $FreeBSD$
* peter, 1997-02-22 (1 file, -1/+1)
  Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
* bde, 1997-02-18 (1 file, -11/+0)
  Removed vestiges of Mach lock types.
  vm_map.h: Removed #include of <sys/proc.h>. curproc is only used in some macros and users of the macros already include <sys/proc.h>.
* dyson, 1997-02-10 (1 file, -0/+11)
  This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
  The system boots and can mount UFS filesystems.
  Untested: ext2fs, msdosfs, NFS
  Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
  Reviewed by: various people
  Submitted by: Jeffery Hsu <hsu@freebsd.org>
* jkh, 1997-01-14 (1 file, -1/+1)
  Make the long-awaited change from $Id$ to $FreeBSD$
  This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
  Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
* bde, 1995-12-10 (1 file, -1/+10)
  Moved the declaration of boolean_t from <vm/vm_param.h> to <sys/types.h> (if KERNEL is defined). This allows removing bogus dependencies on vm stuff in several places (e.g., ddb) and stops <vm_param.h> from depending on <vm_param.h>
  Added declaration of boolean_t to <vm/vm.h> (if KERNEL is not defined). It never belonged in <vm/vm_param.h>. Unfortunately, it is required for some vm headers that are included by applications.
  Deleted declarations of TRUE and FALSE from <vm/vm_param.h>. They are defined in <sys/param.h> if KERNEL is defined and we'll soon find out if any applications depend on them being defined in a vm header.
* dg, 1995-12-07 (1 file, -35/+2)
  Untangled the vm.h include file spaghetti.
* bde, 1995-12-05 (1 file, -1/+7)
  Moved the declaration of vm_object_t from <vm/vm.h> to <sys/types.h> (if KERNEL is defined). This allows removing the #includes of vm stuff in vnode_if.h, which will speed up the compilation of LINT by about 5%.
* bde, 1995-10-05 (1 file, -2/+1)
  Fix pollution of application namespace by declarations of kernel functions. The application header <sys/user.h> includes <vm/vm.h> which includes <vm/lock.h>...
  vm.h: Don't include <machine/cpufunc.h>. It is already included by <sys/systm.h> in the kernel and isn't designed to be included by applications (the 2.1 version causes a syntax error in C++ and the current version has initializers that are invalid in strict C++).
  lock.h: Only declare kernel functions if KERNEL is defined.
* dg, 1995-07-13 (1 file, -4/+1)
  NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!!
  Much needed overhaul of the VM system. Included in this first round of changes:
  1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers".
  2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items.
  3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed.
  4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug.
  5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance.
  6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain.
  7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance.
  8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed.
  9) Some almost useless debugging code removed.
  10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology.
  11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended.
  12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although will need some additional decision code to do this, of course).
  13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
  14) (dyson) Sharing maps have been removed. Its marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintainability. (As were most all of these changes)
  TODO:
  1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size.
  2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness.
  3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind.
  4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems.
  5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
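Item 2 of the change list in this message says that objects have types and that the type indexes the appropriate pager. A minimal sketch of that dispatch pattern follows; the sk_ names, the two object types, and the single haspage operation are assumptions chosen for illustration and do not reproduce the kernel's actual pager table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object types; the real type list is not reproduced. */
enum sk_objtype { SK_OBJT_DEFAULT, SK_OBJT_VNODE, SK_OBJT_COUNT };

struct sk_object {
	enum sk_objtype type; /* selects the pager operations */
	size_t          size; /* object size in pages */
};

/* One operations vector per pager; only haspage is modelled. */
struct sk_pagerops {
	bool (*haspage)(const struct sk_object *obj, size_t pindex);
};

static bool
sk_default_haspage(const struct sk_object *obj, size_t pindex)
{
	(void)obj;
	(void)pindex;
	return (false); /* nothing has been paged out yet */
}

static bool
sk_vnode_haspage(const struct sk_object *obj, size_t pindex)
{
	return (pindex < obj->size); /* backed up to the object's size */
}

static const struct sk_pagerops sk_pagertab[SK_OBJT_COUNT] = {
	[SK_OBJT_DEFAULT] = { sk_default_haspage },
	[SK_OBJT_VNODE]   = { sk_vnode_haspage },
};

/* Routines take the object itself; its type indexes the table. */
static bool
sk_pager_has_page(const struct sk_object *obj, size_t pindex)
{
	assert(obj->type < SK_OBJT_COUNT);
	return (sk_pagertab[obj->type].haspage(obj, pindex));
}
```

With this shape there is no separate "pager" structure to allocate: callers hold only the object, and the table lookup replaces the extra layer the message describes removing.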
* dg, 1995-01-09 (1 file, -11/+12)
  These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D.
  The majority of the merged VM/cache work is by John Dyson.
  The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme.
  vfs_bio.c:
    Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering.
  vfs_cluster.c, vfs_subr.c
    Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
  vm_object.c:
    Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption.
  vm_pageout.c:
    Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up.
  vm_fault.c, vm_page.c, pmap.c, vm_object.c
    Support for small-block filesystems with merged VM/buffer cache scheme.
  pmap.c vm_map.c
    Dynamic kernel VM size, now we don't have to pre-allocate excessive numbers of kernel PTs.
  vm_glue.c
    Much simpler and more effective swapping code. No more gratuitous swapping.
  proc.h
    Fixed the problem that the p_lock flag was not being cleared on a fork.
  swap_pager.c, vnode_pager.c
    Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore.
  machdep.c
    Changes to better support the parameter values for the merged VM/buffer cache scheme.
  machdep.c, kern_exec.c, vm_glue.c
    Implemented a separate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
  ffs_inode.c, ufs_inode.c, ufs_readwrite.c
    Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers.
  Submitted by: John Dyson and David Greenman
* dg, 1994-08-02 (1 file, -0/+1)
  Added $Id$
* rgrimes, 1994-05-25 (1 file, -1/+3)
  The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
  Reviewed by: Rodney W. Grimes
  Submitted by: John Dyson and David Greenman
* rgrimes, 1994-05-24 (1 file, -0/+91)
  BSD 4.4 Lite Kernel Sources