path: root/sys/vm/vnode_pager.c
Commit message | Author | Age | Files | Lines
* Fix a few bugs with NFS and mmap caused by NFS' use of b_validoff | dfr | 1997-05-19 | 1 | -1/+6
  and b_validend. The changes to vfs_bio.c are a bit ugly but hopefully can be tidied up later by a slight redesign.
  PR: kern/2573, kern/2754, kern/3046 (possibly)
  Reviewed by: dyson
* When removing IN_RECURSE support during the Lite/2 merge, read/write | dyson | 1997-03-08 | 1 | -2/+2
  to/from mmaped regions was broken. This commit fixes the breakage, and uses the new Lite/2 locking mechanisms.
* Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not | peter | 1997-02-22 | 1 | -1/+1
  ready for it yet.
* This is the kernel Lite/2 commit. There are some requisite userland | dyson | 1997-02-10 | 1 | -7/+11
  changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
  The system boots and can mount UFS filesystems.
  Untested: ext2fs, msdosfs, NFS
  Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
  Reviewed by: various people
  Submitted by: Jeffery Hsu <hsu@freebsd.org>
* Added a check/panic for v_usecount being 0 (no vnode reference) in | dg | 1997-01-24 | 1 | -0/+2
  vnode_pager_alloc().
* Make the long-awaited change from $Id$ to $FreeBSD$ | jkh | 1997-01-14 | 1 | -1/+1
  This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
  Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
* Clean up the rundown of the object backing a vnode. This should fix | dyson | 1996-10-17 | 1 | -1/+7
  NFS problems associated with forcible dismounts.
* The whole issue of not supporting VOP_LOCK for VBLK devices should be | dyson | 1996-09-10 | 1 | -3/+10
  rethought. This fixes YET another problem with unmounting filesystems. The root cause is not fixed here, but at least the problem has gone away.
* Even though this looks like it, this is not a complex code change. | dyson | 1996-08-21 | 1 | -2/+2
  The interface into the "VMIO" system has changed to be more consistent and robust. Essentially, it is now no longer necessary to call vn_open to get merged VM/Buffer cache operation, and exceptional conditions such as merged operation of VBLK devices are simpler and more correct.
  This code corrects a potentially large set of problems including the problems with ktrace output and loaded systems, file create/deletes, etc.
  Most of the changes to NFS are cosmetic and name changes, eliminating a layer of subroutine calls. The direct calls to vput/vrele have been re-instituted for better cross platform compatibility.
  Reviewed by: davidg
* Backed out the recent changes/enhancements to the VM code. The | dyson | 1996-07-30 | 1 | -4/+4
  problem with the 'shell scripts' was found, but there was a 'strange' problem found with a 486 laptop that we could not find. This commit backs the code back to 25-jul, and will be re-entered after the snapshot in smaller (more easily tested) chunks.
* This commit is meant to solve a couple of VM system problems or | dyson | 1996-07-27 | 1 | -4/+4
  performance issues.
  1) The pmap module has had too many inlines, and so the object file is simply bigger than it needs to be. Some common code is also merged into subroutines.
  2) Removal of some *evil* PHYS_TO_VM_PAGE macro calls. Unfortunately, a few have needed to be added also. The removal caused the need for more vm_page_lookups. I added lookup hints to minimize the need for the page table lookup operations.
  3) Removal of some bogus performance improvements, that mostly made the code more complex (tracking individual page table page updates unnecessarily). Those improvements actually hurt 386 processors perf (not that people who worry about perf use 386 processors anymore :-)).
  4) Changed pv queue manipulations/structures to be TAILQ's.
  5) The pv queue code has had some performance problems since day one. Some significant scalability issues are resolved by threading the pv entries from the pmap AND the physical address instead of just the physical address. This makes certain pmap operations run much faster. This does not affect most micro-benchmarks, but should help loaded system performance *significantly*. DG helped and came up with most of the solution for this one.
  6) Most if not all pmap bit operations follow the pattern:
       pmap_test_bit();
       pmap_clear_bit();
     That made for twice the necessary pv list traversal. The pmap interface now supports only pmap_tc_bit type operations: pmap_[test/clear]_modified, pmap_[test/clear]_referenced. Additionally, the modified routine now takes a vm_page_t arg instead of a phys address. This eliminates a PHYS_TO_VM_PAGE operation.
  7) Several rewrites of routines that contain redundant code to use common routines, so that there is a greater likelihood of keeping the cache footprint smaller.
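  Item 6 of the 1996-07-27 entry describes folding a separate "test" pass and "clear" pass over the pv list into a single traversal. The following is only a minimal sketch of that idea, not the kernel's code: `struct pv_entry`, `PTE_MODIFIED`, and `pv_test_and_clear()` are hypothetical names invented for illustration, and the real pmap structures differ.

```c
#include <stddef.h>

/* Hypothetical "modified" bit in a page-table entry; illustration only. */
#define PTE_MODIFIED 0x40u

/* Hypothetical pv list node: one entry per mapping of a physical page. */
struct pv_entry {
    unsigned pte;            /* stand-in for the mapping's PTE bits */
    struct pv_entry *next;   /* next mapping of the same page */
};

/*
 * Walk the list once: report whether any mapping had `bit` set and
 * clear it in every mapping along the way.  A separate test pass
 * followed by a clear pass would walk the same list twice.
 */
static int
pv_test_and_clear(struct pv_entry *head, unsigned bit)
{
    int was_set = 0;

    for (struct pv_entry *pv = head; pv != NULL; pv = pv->next) {
        if (pv->pte & bit) {
            was_set = 1;
            pv->pte &= ~bit;
        }
    }
    return was_set;
}
```

  A caller that previously issued a test followed by a clear would make one call here and use the return value as the test result.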
* Another sweep over the pmap/vm macros, this time with more focus on | phk | 1996-05-03 | 1 | -7/+7
  the usage. I'm not satisfied with the naming, but now at least there is less bogus stuff around.
* Fix the problem that unmounting filesystems that are backed by a VMIO | dyson | 1996-03-19 | 1 | -2/+5
  device has reference count problems. We mark the underlying object non-persistent, and account for the reference count that the VM system maintains for the special device close. This should fix the removable device problem.
* Eliminated many redundant vm_map_lookup operations for vm_mmap. | dyson | 1996-01-19 | 1 | -2/+3
  Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache.
  Efficiency improvement for vfs_cluster. It used to do a lot of redundant calls to cluster_rbuild.
  Correct the ordering for vrele of .text and release of credentials.
  Use the selective tlb update for 486/586/P6.
  Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers.
  Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
  Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash.
  Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines.
  Eliminate even more unnecessary vm_page_protect operations.
  Significantly speed up process forks.
  Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30 seconds.
  Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async.
  Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
* Fix paging from ext2fs (and other fs w/block size < PAGE_SIZE). This | dyson | 1995-12-17 | 1 | -15/+31
  should fix kern/900.
* Another mega commit to staticize things. | phk | 1995-12-14 | 1 | -14/+18
* Changes to support 1Tb filesizes. Pages are now named by an | dyson | 1995-12-11 | 1 | -44/+57
  (object,index) pair instead of an (object,offset) pair.
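  The 1995-12-11 entry switches page naming from a byte offset to a page index within the object. A minimal sketch of the conversion follows, under stated assumptions: a 4096-byte page, a 32-bit index type, and a 64-bit file offset. The type and function names are hypothetical and are not taken from the commit.

```c
#include <stdint.h>

/* Assumed page geometry for the sketch: 4096-byte pages. */
#define PAGE_SHIFT 12

/* Hypothetical types: a 32-bit page index and a 64-bit byte offset. */
typedef uint32_t page_index_t;
typedef int64_t  file_offset_t;

/* Name a page by its index: drop the within-page bits of the offset. */
static page_index_t
offset_to_index(file_offset_t off)
{
    return (page_index_t)((uint64_t)off >> PAGE_SHIFT);
}

/* Recover the byte offset of the start of the page with index idx. */
static file_offset_t
index_to_offset(page_index_t idx)
{
    return (file_offset_t)idx << PAGE_SHIFT;
}
```

  Under these assumptions a 32-bit index still names pages at offsets far beyond what a 32-bit byte offset can express, because the low 12 bits of the offset are no longer stored.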
* Untangled the vm.h include file spaghetti. | dg | 1995-12-07 | 1 | -1/+6
* Remove unused vars & funcs, make things static, protoize a little bit. | phk | 1995-11-20 | 1 | -5/+7
* Don't pass an extra trailing arg to some functions. | bde | 1995-10-30 | 1 | -5/+14
  Added the prototypes that found this bug.
* Finalize GETPAGES layering scheme. Move the device GETPAGES | dyson | 1995-10-23 | 1 | -6/+18
  interface into specfs code. No need at this point to modify the PUTPAGES stuff except in the layered-type (NULL/UNION) filesystems.
* Fix initialization of "bsize" in vnode_pager_haspage(). It must happen | dg | 1995-10-19 | 1 | -5/+3
  after the check for the mount point still existing or else the system will panic if someone forcibly unmounted the filesystem.
* Fix really bogus casting of a block number to a long. Also change the | dyson | 1995-09-12 | 1 | -2/+2
  comparison from a "< 0" to "== -1" like it should be.
* Fix an error that can cause attempted reading beyond the end of file. | dyson | 1995-09-11 | 1 | -3/+11
* Minor performance improvements, additional prototype for additional | dyson | 1995-09-06 | 1 | -2/+4
  exported symbol.
* Allow the fault code to use additional clustering info from both | dyson | 1995-09-04 | 1 | -41/+60
  bmap and the swap pager. Improved fault clustering performance.
* Added VOP_GETPAGES/VOP_PUTPAGES and also the "backwards" block count | dyson | 1995-09-04 | 1 | -6/+6
  for VOP_BMAP. Updated affected filesystems...
* NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct | dg | 1995-07-13 | 1 | -306/+164
  proc or any VM system structure will have to be rebuilt!!!

  Much needed overhaul of the VM system. Included in this first round of changes:

  1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers".
  2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items.
  3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed.
  4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug.
  5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance.
  6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain.
  7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance.
  8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed.
  9) Some almost useless debugging code removed.
  10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology.
  11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended.
  12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although it will need some additional decision code to do this, of course).
  13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
  14) (dyson) Sharing maps have been removed. Its marginal usefulness in a threads design can be worked around in other ways. Both #13 and #14 were done to simplify the code and improve readability and maintainability. (As were most all of these changes)

  TODO:

  1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size.
  2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness.
  3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind.
  4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems.
  5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
* Moved call to VOP_GETATTR() out of vnode_pager_alloc() and into the places | dg | 1995-07-09 | 1 | -25/+7
  that call vnode_pager_alloc() so that a failure return can be dealt with. This fixes a panic seen on NFS clients when a file being opened is deleted on the server before the open completes.
* Fixed an object allocation race condition that was causing an "object | dg | 1995-07-06 | 1 | -13/+34
  deallocated too many times" panic when using NFS.
  Reviewed by: John Dyson
* 1) Converted v_vmdata to v_object. | dg | 1995-06-28 | 1 | -11/+11
  2) Removed unnecessary vm_object_lookup()/pager_cache(object, TRUE) pairs after vnode_pager_alloc() calls - the object is already guaranteed to be persistent.
  3) Removed some gratuitous casts.
* Remove trailing whitespace. | rgrimes | 1995-05-30 | 1 | -3/+3
* Accessing pages beyond the end of a mapped file results in internal | dg | 1995-05-18 | 1 | -15/+17
  inconsistencies in the VM system that eventually lead to a panic. These changes fix the behavior to conform to the behavior in SunOS, which is to deny faults to pages beyond the EOF (returning SIGBUS). Internally, this is implemented by requiring faults to be within the object size boundaries. These changes exposed another bug, namely that passing in an offset to mmap when trying to map an unnamed anonymous region also results in internal inconsistencies. In this case, the offset is forced to zero.
  Reviewed by: John Dyson and others
* Changed "handle" from type caddr_t to void *; "handle" is several different | dg | 1995-05-10 | 1 | -2/+2
  types of pointers, and "char *" is a bad choice for the type.
* Changes from John Dyson and myself: | dg | 1995-04-09 | 1 | -350/+78
  Fixed remaining known bugs in the buffer IO and VM system.
  vfs_bio.c: Fixed some race conditions and locking bugs. Improved performance by removing some (now) unnecessary code and fixing some broken logic. Fixed process accounting of # of FS outputs. Properly handle NFS interrupts (B_EINTR).
  (various): Replaced calls to clrbuf() with calls to an optimized routine called vfs_bio_clrbuf().
  (various FS sync): Sync out modified vnode_pager backed pages.
  ffs_vnops.c: Do two passes: Sync out file data first, then indirect blocks.
  vm_fault.c: Fixed deadly embrace caused by acquiring locks in the wrong order.
  vnode_pager.c: Changed to use buffer I/O system for writing out modified pages. This should fix the problem with the modification date previously not getting updated. Also dramatically simplifies the code. Note that this is going to change in the future and be implemented via VOP_PUTPAGES().
  vm_object.c: Fixed a pile of bugs related to cleaning (vnode) objects. The performance of vm_object_page_clean() is terrible when dealing with huge objects, but this will change when we implement a binary tree to keep the object pages sorted.
  vm_pageout.c: Fixed broken clustering of pageouts. Fixed race conditions and other lockup style bugs in the scanning of pages. Improved performance.
* Removed unused variable declaration missed in previous commit. | dg | 1995-03-21 | 1 | -2/+1
* Removed do-nothing VOP_UPDATE() call. | dg | 1995-03-21 | 1 | -3/+1
* Added a new boolean argument to vm_object_page_clean that causes it to | dg | 1995-03-21 | 1 | -2/+2
  only toss out clean pages if TRUE.
* Don't gain/lose an object reference in vnode_pager_setsize(). It will | dg | 1995-03-20 | 1 | -13/+1
  cause vnode locking problems in vm_object_terminate(). Implement proper vnode locking in vm_object_terminate().
* Do proper vnode locking when doing paging I/O. Removed the asynchronous | dg | 1995-03-19 | 1 | -47/+26
  paging capability to facilitate this (we saw little or no measurable improvement with this anyway).
  Submitted by: John Dyson
* Incorporated 4.4-lite vnode_pager_uncache() and vnode_pager_umount() | dg | 1995-03-19 | 1 | -20/+22
  routines (and merged local changes). The changed vnode_pager_uncache gets rid of the bogosity that you can call the routine without having the vnode locked. The changed vnode_pager_umount properly locks the vnode before calling vnode_pager_uncache.
* Add and move declarations to fix all of the warnings from `gcc -Wimplicit' | bde | 1995-03-16 | 1 | -2/+1
  (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
* Explicitly set object->flags = OBJ_CANPERSIST. | dg | 1995-03-12 | 1 | -3/+2
* Set VAGE flag when pager is destroyed. This usually happens when an | dg | 1995-03-07 | 1 | -1/+2
  object has fallen off the end of the cached list - this is likely the last reference to the vnode and it should be reused before non-file vnodes that are already on the free list (VDIR mostly).
* Various changes from John and myself that do the following: | dg | 1995-03-01 | 1 | -7/+2
  New functions created: vm_object_pip_wakeup and pagedaemon_wakeup, which are used to reduce the actual number of wakeups.
  New function vm_page_protect, which is used in conjunction with some new page flags to reduce the number of calls to pmap_page_protect.
  Minor changes to reduce unnecessary spl nesting.
  Rewrote vm_page_alloc() to improve readability.
  Various other mostly cosmetic changes.
* Removed redundant HOLDRELE()'s. | dg | 1995-02-23 | 1 | -5/+1
* Changed return value from vnode_pager_addr to be in DEV_BSIZE units so | dg | 1995-02-22 | 1 | -7/+7
  that 9 bits aren't lost in the conversion. Changed all callers to expect this. This allows paging on large (>2GB) filesystems.
  Submitted by: John Dyson
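  The 1995-02-22 entry's "9 bits" remark can be illustrated with a short arithmetic sketch. It assumes DEV_BSIZE is 512 bytes (2^9) and that the address is carried in a signed 32-bit value; the helper names are hypothetical, and the actual return type of vnode_pager_addr is not stated in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed device block size for the sketch: 512 bytes = 2^9. */
#define DEV_BSHIFT 9
#define DEV_BSIZE  (1 << DEV_BSHIFT)

/* Can this byte offset be held as a byte address in a signed 32-bit value? */
static int
fits_as_bytes(int64_t byte_offset)
{
    return byte_offset >= 0 && byte_offset <= INT32_MAX;
}

/* Can it be held in the same type when counted in DEV_BSIZE units? */
static int
fits_as_blocks(int64_t byte_offset)
{
    return byte_offset >= 0 && (byte_offset >> DEV_BSHIFT) <= INT32_MAX;
}

/* Convert a byte offset to a DEV_BSIZE block number (hypothetical helper). */
static int32_t
bytes_to_blocks(int64_t byte_offset)
{
    assert(fits_as_blocks(byte_offset));
    return (int32_t)(byte_offset >> DEV_BSHIFT);
}

/* Convert a block number back to a 64-bit byte offset. */
static int64_t
blocks_to_bytes(int32_t blkno)
{
    return (int64_t)blkno * DEV_BSIZE;
}
```

  Under these assumptions an offset of 3 GB does not fit as a byte address but fits comfortably as a block number, which is the effect the commit message describes.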
* Only do object paging_in_progress wakeups if someone is waiting on this | dg | 1995-02-22 | 1 | -2/+6
  condition.
  Submitted by: John Dyson
* Deprecated remaining use of vm_deallocate. Deprecated | dg | 1995-02-21 | 1 | -2/+3
  vm_allocate_with_pager(). Almost completely rewrote vm_mmap(); when John gets done with the bottom half, it will be a complete rewrite. Deprecated most use of vm_object_setpager(). Removed side effect of setting object persist in vm_object_enter and moved this into the pager(s). A few other cosmetic changes.
* Fixed bmap run-length brokenness. | dg | 1995-02-03 | 1 | -62/+53
  Use bmap run-length extension when doing clustered paging.
  Submitted by: John Dyson