summaryrefslogtreecommitdiffstats
path: root/sys/vm/vm_fault.c
Commit message (Collapse)AuthorAgeFilesLines
* o Remove GIANT_REQUIRED from vm_fault_user_wire().alc2002-06-161-5/+1
| | | | | | o Move pmap_pageable() outside of Giant in vm_fault_unwire(). (pmap_pageable() is a no-op on all supported architectures.) o Remove the acquisition and release of Giant from mlock().
* o Acquire and release Giant around pmap operations in vm_fault_unwire()alc2002-05-261-1/+2
| | | | | | | and vm_map_delete(). Assert GIANT_REQUIRED in vm_map_delete() only if operating on the kernel_object or the kmem_object. o Remove GIANT_REQUIRED from vm_map_remove(). o Remove the acquisition and release of Giant from munmap().
* o Condition the compilation and use of vm_freeze_copyopts()alc2002-05-061-1/+2
| | | | on ENABLE_VFS_IOOPT.
* o Revert vm_fault1() to its original name vm_fault(), eliminating the wrapperalc2002-04-301-16/+11
| | | | that took its place for the purposes of acquiring and releasing Giant.
* Document three synchronization issues in vm_fault().alc2002-04-291-0/+8
|
* o Introduce and use vm_map_trylock() to replace several direct usesalc2002-04-281-3/+1
| | | | | | of lockmgr(). o Add missing synchronization to vmspace_swap_count(): Obtain a read lock on the vm_map before traversing it.
* o Move the acquisition of Giant from vm_fault() to the pointalc2002-04-191-12/+8
| | | | | after initialization in vm_fault1(). o Fix some style problems in vm_fault1().
* Add a comment documenting a race condition in vm_fault(): Specifically, aalc2002-04-181-0/+3
| | | | modification is made to the vm_map while only a read lock is held.
* o Call vm_map_growstack() from vm_fault() if vm_map_lookup() has failedalc2002-04-181-1/+10
| | | | | | | | | | | due to conditions that suggest the possible need for stack growth. This has two beneficial effects: (1) we can now remove calls to vm_map_growstack() from the MD trap handlers and (2) simple page faults are faster because we no longer unnecessarily perform vm_map_growstack() on every page fault. o Remove vm_map_growstack() from the i386's trap_pfault(). o Remove the acquisition and release of Giant from i386's trap_pfault(). (vm_fault() still acquires it.)
* Remove an unused option, VM_FAULT_HOLD, to vm_fault().alc2002-04-171-2/+0
|
* Remove __P.alfred2002-03-191-3/+2
|
* Back out the modification of vm_map locks from lockmgr to sx locks. Thegreen2002-03-181-16/+10
| | | | | | | | | | best path forward now is likely to change the lockmgr locks to simple sleep mutexes, then see if any extra contention it generates is greater than removed overhead of managing local locking state information, cost of extra calls into lockmgr, etc. Additionally, making the vm_map lock a mutex and respecting it properly will put us much closer to not needing Giant magic in vm.
* Document faultstate.lookup_still_valid more than none.green2002-03-141-10/+14
| | | | Requested by: alfred
* Rename SI_SUB_MUTEX to SI_SUB_MTX_POOL to make the name at all accurate.green2002-03-131-5/+7
| | | | | | | | | | | | | | | | While doing this, move it earlier in the sysinit boot process so that the VM system can use it. After that, the system is now able to use sx locks instead of lockmgr locks in the VM system. To accomplish this, some of the more questionable uses of the locks (such as testing whether they are owned or not, as well as allowing shared+exclusive recursion) are removed, and simpler logic throughout is used so locks should also be easier to understand. This has been tested on my laptop for months, and has not shown any problems on SMP systems, either, so appears quite safe. One more user of lockmgr down, many more to go :)
* - Remove a number of extra newlines that do not belong here according toeivind2002-03-101-44/+9
| | | | | | | | | style(9) - Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc. - Add /* SYMBOL */ after a few #endifs. Reviewed by: alc
* Changes to make the OOM killer much more effective:silby2002-02-191-2/+2
| | | | | | | | | | | | - Allow the OOM killer to target processes currently locked in memory. These very often are the ones doing the memory hogging. - Drop the wakeup priority of processes currently sleeping while waiting for their page fault to complete. In order for the OOM killer to work well, the killed process and other system processes waiting on memory must be allowed to wakeup first. Reviewed by: dillon MFC after: 1 week
* Fix deadlock introduced in 1.73 (Jan 1998). The paging-in-progress countdillon2001-11-091-1/+5
| | | | | | | | | on a vnode-backed object must be incremented *after* obtaining the vnode lock. If it is bumped before obtaining the vnode lock we can deadlock against vtruncbuf(). Submitted by: peter, ps MFC after: 3 days
* Implement kern.maxvnodes. adjusting kern.maxvnodes now actually has adillon2001-10-261-2/+1
| | | | | | | | | | | | | | | | real effect. Optimize vfs_msync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. Improves looping case by 500%. Optimize ffs_sync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. This makes a couple of assumptions, which I believe are ok, in regards to vnode stability when the mount list mutex is held. Improves looping case by 500%. (more optimization work is needed on top of these fixes) MFC after: 1 week
* KSE Milestone 2julian2001-09-121-1/+1
| | | | | | | | | | | | | | Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
* Reorg vm_page.c into vm_page.c, vm_pageq.c, and vm_contig.c (for contigmalloc).dillon2001-07-041-1/+1
| | | | | | | | | | | Also removed some spl's and added some VM mutexes, but they are not actually used yet, so this commit does not really make any operational changes to the system. vm_page.c relates to vm_page_t manipulation, including high level deactivation, activation, etc... vm_pageq.c relates to finding free pages and aquiring exclusive access to a page queue (exclusivity part not yet implemented). And the world still builds... :-)
* whitespace / register cleanupdillon2001-07-041-6/+6
|
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approachdillon2001-07-041-38/+11
| | | | | | | | | (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
* Take a more conservative approach and still lock Giant around VM faultsjhb2001-05-231-8/+6
| | | | for now.
* Sort includes.jhb2001-05-221-3/+3
|
* Introduce a global lock for the vm subsystem (vm_mtx).alfred2001-05-191-4/+61
| | | | | | | | | | | | | | | | | | | vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
* Undo part of the tangle of having sys/lock.h and sys/mutex.h included inmarkm2001-05-011-1/+2
| | | | | | | | | | | other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
* Fix a lock reversal problem in the VM subsystem related to threadeddillon2001-03-141-0/+6
| | | | | | | | | | | | | programs. There is a case during a fork() which can cause a deadlock. From Tor - The workaround that consists of setting a flag in the vm map that indicates that a fork is in progress and using that mark in the page fault handling to force a revalidation failure. That change will only affect (pessimize) page fault handling during fork for threaded (linuxthreads style) applications and applications using aio_*(). Submited by: tegge
* If we intend to make the page writable without requiring another fault,dillon2001-02-281-6/+6
| | | | | | make sure that PG_NOSYNC is properly set. Previously we only set it for a write-fault, but this can occur on a read-fault too. (will be MFCd prior to 4.3 freeze)
* Change and clean the mutex lock interface.bmilekic2001-02-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)
* - Catch up to proc flag changes.jhb2001-01-241-1/+3
|
* Add the splvm()'s suggested in PR 20609 to protect vm_pager_page_unswapped().dillon2000-11-181-0/+3
| | | | | | The remainder of the PR is still open. PR: kern/20609 (partial fix)
* This is a cleanup patch to Peter's new OBJT_PHYS VM object typedillon2000-05-291-1/+1
| | | | | | | | | | | | | | | | | and sysv shared memory support for it. It implements a new PG_UNMANAGED flag that has slightly different characteristics from PG_FICTICIOUS. A new sysctl, kern.ipc.shm_use_phys has been added to enable the use of physically-backed sysv shared memory rather then swap-backed. Physically backed shm segments are not tracked with PV entries, allowing programs which use a large shm segment as a rendezvous point to operate without eating an insane amount of KVM in the PV entry management. Read: Oracle. Peter's OBJT_PHYS object will also allow us to eventually implement page-table sharing and/or 4MB physical page support for such segments. We're half way there.
* Implement an optimization of the VM<->pmap API. Pass vm_page_t's directlypeter2000-05-211-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to various pmap_*() functions instead of looking up the physical address and passing that. In many cases, the first thing the pmap code was doing was going to a lot of trouble to get back the original vm_page_t, or it's shadow pv_table entry. Inspired by: John Dyson's 1998 patches. Also: Eliminate pv_table as a seperate thing and build it into a machine dependent part of vm_page_t. This eliminates having a seperate set of structions that shadow each other in a 1:1 fashion that we often went to a lot of trouble to translate from one to the other. (see above) This happens to save 4 bytes of physical memory for each page in the system. (8 bytes on the Alpha). Eliminate the use of the phys_avail[] array to determine if a page is managed (ie: it has pv_entries etc). Store this information in a flag. Things like device_pager set it because they create vm_page_t's on the fly that do not have pv_entries. This makes it easier to "unmanage" a page of physical memory (this will be taken advantage of in subsequent commits). Add a function to add a new page to the freelist. This could be used for reclaiming the previously wasted pages left over from preloaded loader(8) files. Reviewed by: dillon
* Revert spelling mistake I made in the previous commitcharnier2000-03-271-2/+2
| | | | Requested by: Alan and Bruce
* Spellingcharnier2000-03-261-4/+4
|
* Add MAP_NOSYNC feature to mmap(), and MADV_NOSYNC and MADV_AUTOSYNC todillon1999-12-121-2/+16
| | | | | | | | | | | | | | | | | madvise(). This feature prevents the update daemon from gratuitously flushing dirty pages associated with a mapped file-backed region of memory. The system pager will still page the memory as necessary and the VM system will still be fully coherent with the filesystem. Modifications made by other means to the same area of memory, for example by write(), are unaffected. The feature works on a page-granularity basis. MAP_NOSYNC allows one to use mmap() to share memory between processes without incuring any significant filesystem overhead, putting it in the same performance category as SysV Shared memory and anonymous memory. Reviewed by: julian, alc, dg
* useracc() the prequel:phk1999-10-291-1/+0
| | | | | | | | | | | Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
* Final commit to remove vnode->v_lastr. vm_fault now handles readdillon1999-09-211-36/+43
| | | | | | | | | | | | | | | clustering issues (replacing code that used to be in ufs/ufs/ufs_readwrite.c). vm_fault also now uses the new VM page counter inlines. This completes the changeover from vnode->v_lastr to vm_entry_t->v_lastr for VM, and fp->f_nextread and fp->f_seqcount (which have been in the tree for a while). Determination of the I/O strategy (sequential, random, and so forth) is now handled on a descriptor-by-descriptor basis for base I/O calls, and on a memory-region-by-memory-region and process-by-process basis for VM faults. Reviewed by: David Greenman <dg@root.com>, Alan Cox <alc@cs.rice.edu>
* $Id$ -> $FreeBSD$peter1999-08-281-1/+1
|
* Move the memory access behavior information provided by madvisealc1999-08-011-3/+4
| | | | | | from the vm_object to the vm_map. Submitted by: dillon
* Convert a "page not busy" warning to an assertion.alc1999-07-201-7/+3
| | | | Submitted by: dillon@backplane.com
* The VFS/BIO subsystem contained a number of hacks in order to optimizealc1999-05-021-5/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | piecemeal, middle-of-file writes for NFS. These hacks have caused no end of trouble, especially when combined with mmap(). I've removed them. Instead, NFS will issue a read-before-write to fully instantiate the struct buf containing the write. NFS does, however, optimize piecemeal appends to files. For most common file operations, you will not notice the difference. The sole remaining fragment in the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache coherency issues with read-merge-write style operations. NFS also optimizes the write-covers-entire-buffer case by avoiding the read-before-write. There is quite a bit of room for further optimization in these areas. The VM system marks pages fully-valid (AKA vm_page_t->valid = VM_PAGE_BITS_ALL) in several places, most noteably in vm_fault. This is not correct operation. The vm_pager_get_pages() code is now responsible for marking VM pages all-valid. A number of VM helper routines have been added to aid in zeroing-out the invalid portions of a VM page prior to the page being marked all-valid. This operation is necessary to properly support mmap(). The zeroing occurs most often when dealing with file-EOF situations. Several bugs have been fixed in the NFS subsystem, including bits handling file and directory EOF situations and buf->b_flags consistancy issues relating to clearing B_ERROR & B_INVAL, and handling B_DONE. getblk() and allocbuf() have been rewritten. B_CACHE operation is now formally defined in comments and more straightforward in implementation. B_CACHE for VMIO buffers is based on the validity of the backing store. B_CACHE for non-VMIO buffers is based simply on whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear, and vise-versa). biodone() is now responsible for setting B_CACHE when a successful read completes. B_CACHE is also set when a bdwrite() is initiated and when a bwrite() is initiated. VFS VOP_BWRITE routines (there are only two - nfs_bwrite() and bwrite()) are now expected to set B_CACHE. This means that bowrite() and bawrite() also set B_CACHE indirectly. There are a number of places in the code which were previously using buf->b_bufsize (which is DEV_BSIZE aligned) when they should have been using buf->b_bcount. These have been fixed. getblk() now clears B_DONE on return because the rest of the system is so bad about dealing with B_DONE. Major fixes to NFS/TCP have been made. A server-side bug could cause requests to be lost by the server due to nfs_realign() overwriting other rpc's in the same TCP mbuf chain. The server's kernel must be recompiled to get the benefit of the fixes. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
* Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>alc1999-02-251-2/+3
| | | | | | Corrected the computation of cnt.v_ozfod in vm_fault: vm_fault was counting the number of unoptimized rather than optimized zero-fill faults.
* Submitted by: Luoqi Chen <luoqi@watermarkgroup.com>dillon1999-02-171-1/+12
| | | | | | | Unlock vnode before messing with map to avoid deadlock between map and vnode ( e.g. with exec_map and underlying program binary vnode ). Solves a deadlock that most often occurs during a large -j# buildworld reported by three people.
* Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used todillon1999-02-071-4/+3
| | | | | | attempt to optimize forks but were essentially given-up on due to problems and replaced with an explicit dup of the vm_map_entry structure. Prior to the removal, they were entirely unused.
* Change all manual settings of vm_page_t->dirty = VM_PAGE_BITS_ALLdillon1999-01-241-2/+2
| | | | | | to use the vm_page_dirty() inline. The inline can thus do sanity checks ( or not ) over all cases.
* Get rid of unused old_m in vm_fault. Add INVARIANTS to test whetherdillon1999-01-241-4/+13
| | | | | page is still busy after all the hell vm_fault goes through.. it is supposed to be, and printf() if it isn't. don't panic, though.
* Reenable John Dyson's low-memory VM_WAIT code for page reactivations outdillon1999-01-231-7/+13
| | | | | of PQ_CACHE. Add comments explaining what it accomplishes and its limitations.
* Mainly cleanup. Removed some inappropriate low-memory handling codedillon1999-01-211-1/+1
| | | | | and added lots of comments. Add tie-in to vm_pager ( and thus the new swapper ) to deallocate backing swap for dirtied pages on the fly.
* This is a rather large commit that encompasses the new swapper,dillon1999-01-211-37/+84
| | | | | | | | | | changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
OpenPOWER on IntegriCloud