summaryrefslogtreecommitdiffstats
path: root/sys/vm
Commit message (Collapse)AuthorAgeFilesLines
* Eradicate the variable "time" from the kernel, using various measures.phk1998-03-302-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "time" wasn't a atomic variable, so splfoo() protection were needed around any access to it, unless you just wanted the seconds part. Most uses of time.tv_sec now uses the new variable time_second instead. gettime() changed to getmicrotime(0. Remove a couple of unneeded splfoo() protections, the new getmicrotime() is atomic, (until Bruce sets a breakpoint in it). A couple of places needed random data, so use read_random() instead of mucking about with time which isn't random. Add a new nfs_curusec() function. Mark a couple of bogosities involving the now disappeard time variable. Update ffs_update() to avoid the weird "== &time" checks, by fixing the one remaining call that passwd &time as args. Change profiling in ncr.c to use ticks instead of time. Resolution is the same. Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call hzto() which subtracts time" sequences. Reviewed by: bde
* Moved some #includes from <sys/param.h> nearer to where they are actuallybde1998-03-281-1/+2
| | | | used.
* Some VM improvements, including elimination of alot of Sig-11dyson1998-03-165-33/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | problems. Tor Egge and others have helped with various VM bugs lately, but don't blame him -- blame me!!! pmap.c: 1) Create an object for kernel page table allocations. This fixes a bogus allocation method previously used for such, by grabbing pages from the kernel object, using bogus pindexes. (This was a code cleanup, and perhaps a minor system stability issue.) pmap.c: 2) Pre-set the modify and accessed bits when prudent. This will decrease bus traffic under certain circumstances. vfs_bio.c, vfs_cluster.c: 3) Rather than calculating the beginning virtual byte offset multiple times, stick the offset into the buffer header, so that the calculated offset can be reused. (Long long multiplies are often expensive, and this is a probably unmeasurable performance improvement, and code cleanup.) vfs_bio.c: 4) Handle write recursion more intelligently (but not perfectly) so that it is less likely to cause a system panic, and is also much more robust. vfs_bio.c: 5) getblk incorrectly wrote out blocks that are incorrectly sized. The problem is fixed, and writes blocks out ONLY when B_DELWRI is true. vfs_bio.c: 6) Check that already constituted buffers have fully valid pages. If not, then make sure that the B_CACHE bit is not set. (This was a major source of Sig-11 type problems.) vfs_bio.c: 7) Fix a potential system deadlock due to an incorrectly specified sleep priority while waiting for a buffer write operation. The change that I made opens the system up to serious problems, and we need to examine the issue of process sleep priorities. vfs_cluster.c, vfs_bio.c: 8) Make clustered reads work more correctly (and more completely) when buffers are already constituted, but not fully valid. (This was another system reliability issue.) vfs_subr.c, ffs_inode.c: 9) Create a vtruncbuf function, which is used by filesystems that can truncate files. The vinvalbuf forced a file sync type operation, while vtruncbuf only invalidates the buffers past the new end of file, and also invalidates the appropriate pages. (This was a system reliabiliy and performance issue.) 10) Modify FFS to use vtruncbuf. vm_object.c: 11) Make the object rundown mechanism for OBJT_VNODE type objects work more correctly. Included in that fix, create pager entries for the OBJT_DEAD pager type, so that paging requests that might slip in during race conditions are properly handled. (This was a system reliability issue.) vm_page.c: 12) Make some of the page validation routines be a little less picky about arguments passed to them. Also, support page invalidation change the object generation count so that we handle generation counts a little more robustly. vm_pageout.c: 13) Further reduce pageout daemon activity when the system doesn't need help from it. There should be no additional performance decrease even when the pageout daemon is running. (This was a significant performance issue.) vnode_pager.c: 14) Teach the vnode pager to handle race conditions during vnode deallocations.
* Fix for mmap of char devices bug as described in OpenBSD advisory ofguido1998-03-121-7/+36
| | | | | | 1998/02/20 Reviewed by: John Dyson Submitted by: "Cy Schubert" <cschuber@uumail.gov.bc.ca>
* Complement diagnostic messages about missing per-FS VOP page operations,msmith1998-03-091-5/+14
| | | | | but don't make their absence fatal. Submitted by: terry
* Quell unneeded pageout daemon activity.dyson1998-03-081-2/+2
|
* Remove a very ill advised vm_page_protect. This was being calleddyson1998-03-081-2/+1
| | | | for a non-managed page. That is a big no-no.
* Some cruft left over from my megacommit. A page rotation optimizationdyson1998-03-081-7/+2
| | | | | was a good idea, but can cause instability. That optimization is now removed.
* Several minor fixes:dyson1998-03-081-4/+11
| | | | | | | | | | | | 1) When freeing pages, it is a good idea to protect them off. (This is probably gratuitious, but good form.) 2) Allow collapsing pages in the backing object that are PQ_CACHE. This will improve memory utilization. 3) Correct the collapse code so that pages that were on the cache queue are moved to the inactive queue. This is done when pages are marked dirty (so that those pages will be properly paged out instead of freed), so that cached pages will not be paradoxically marked dirty.
* This mega-commit is meant to fix numerous interrelated problems. Theredyson1998-03-0711-110/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifest themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistancy, and as a side-effect broke the vfs.ioopt code. This code might have been committed seperately, but almost everything is interrelated. 1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid. 2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them. 3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state. 4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances. 5) Remove a gratuitious buffer wakeup in vfs_vmio_release. 6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version. 7) The page busy count wakeups assocated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation. 8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed. 9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals. 10) Correct and clean-up spec_getpages. 11) Implement a fully functional nfs_getpages, nfs_putpages. 12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.) 13) Properly support MS_INVALIDATE on NFS. 14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean. 15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads. 16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.) 17) Correct the design and usage of vm_page_sleep. It didn't handle consistancy problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify it's status (always.) 18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up. 19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush. 20) Fix vm_pager_put_pages and it's descendents to support an int flag instead of a boolean, so that we can pass down the invalidate bit.
* Make vm_fault much cleaner by removing the evil macro inlines, anddyson1998-03-071-220/+208
| | | | | | | | | | put alot of it's context into a data structure. This allows significant shortening of its codepath, and will significantly decrease it's cache footprint. Also, add some stats to vmmeter. Note that you'll have to rebuild/recompile vmstat, systat, etc... Otherwise, you'll get "very interesting" paging stats.
* Reviewed by: msmith, bde long agodufault1998-03-041-2/+2
| | | | | POSIX.4 headers and sysctl variables. Nothing should change unless POSIX4 is defined or _POSIX_VERSION is set to 199309.
* 1) Use a more consistent page wait methodology.dyson1998-03-018-120/+105
| | | | | | | | | | | | | 2) Do not unnecessarily force page blocking when paging pages out. 3) Further improve swap pager performance and correctness, including fixing the paging in progress deadlock (except in severe I/O error conditions.) 4) Enable vfs_ioopt=1 as a default. 5) Fix and enable the page prezeroing in SMP mode. All in all, SMP systems especially should show a significant improvement in "snappyness."
* In the author's words:msmith1998-02-262-28/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | These diffs implement the first stage of a VOP_{GET|PUT}PAGES pushdown for local media FS's. See ffs_putpages in /sys/ufs/ufs/ufs_readwrite.c for implementation details for generic *_{get|put}pages for local media FS's. Support is trivial to add for any FS that formerly relied on the default behaviour of the vnode_pager in in EOPNOTSUPP cases (just copy the ffs_getpages() code for the FS in question's *_{get|put}pages). Obviously, it would be better if each local media FS implemented a more optimal method, instead of calling an exported interface from the /sys/vm/vnode_pager.c, but this is a necessary first step in getting the FS's to a point where they can be supplied with better implementations on a case-by-case basis. Obviously, the cd9660_putpages() can be rather trivial (since it is a read-only FS type 8-)). A slight (temporary) modification is made to print a diagnostic message in the case where the underlying filesystem attempts to engage in the previous behaviour. Failure is likely to be ungraceful. Submitted by: terry@freebsd.org (Terry Lambert)
* Fix page prezeroing for SMP, and fix some potential paging-in-progressdyson1998-02-256-41/+48
| | | | | | hangs. The paging-in-progress diagnosis was a result of Tor Egge's excellent detective work. Submitted by: Partially from Tor Egge.
* Correct some severe VM tuning problems for small systems (<=16MB), anddyson1998-02-241-7/+10
| | | | | | | improve tuning on larger systems. (A couple of the VM tuning params for small systems were so badly chosen that the system could hang under load.) The broken tuning was originaly my fault.
* Significantly improve the efficiency of the swap pager, which appears todyson1998-02-238-314/+306
| | | | | | | | have declined due to code-rot over time. The swap pager rundown code has been clean-up, and unneeded wakeups removed. Lots of splbio's are changed to splvm's. Also, set the dynamic tunables for the pageout daemon to be more sane for larger systems (thereby decreasing the daemon overheadla.)
* Try to dynamically size the VM_KMEM_SIZE (but is still able to be overriddendyson1998-02-231-2/+14
| | | | | | | | | | | | in a way identically as before.) I had problems with the system properly handling the number of vnodes when there is alot of system memory, and the default VM_KMEM_SIZE. Two new options "VM_KMEM_SIZE_SCALE" and "VM_KMEM_SIZE_MAX" have been added to support better auto-sizing for systems with greater than 128MB. Add some accouting for vm_zone memory allocations, and provide properly for vm_zone allocations out of the kmem_map. Also move the vm_zone allocation stats to the VM OID tree from the KERN OID tree.
* Removed unused #includes.bde1998-02-201-2/+1
|
* Move the 'sw' device off block major #1, which is now occupied by 'wfd'.msmith1998-02-191-2/+2
|
* Staticize.eivind1998-02-098-34/+35
|
* Fix an argument to vn_lock. It appears that alot of the vn_lock usagedyson1998-02-081-2/+2
| | | | is a bit undisciplined, and should be checked carefully.
* Back out DIAGNOSTIC changes.eivind1998-02-0615-43/+15
|
* 1) Start using a cleaner and more consistant page allocator insteaddyson1998-02-059-122/+155
| | | | | | | | | | | | | | | | | | | | | | | | of the various ad-hoc schemes. 2) When bringing in UPAGES, the pmap code needs to do another vm_page_lookup. 3) When appropriate, set the PG_A or PG_M bits a-priori to both avoid some processor errata, and to minimize redundant processor updating of page tables. 4) Modify pmap_protect so that it can only remove permissions (as it originally supported.) The additional capability is not needed. 5) Streamline read-only to read-write page mappings. 6) For pmap_copy_page, don't enable write mapping for source page. 7) Correct and clean-up pmap_incore. 8) Cluster initial kern_exec pagin. 9) Removal of some minor lint from kern_malloc. 10) Correct some ioopt code. 11) Remove some dead code from the MI swapout routine. 12) Correct vm_object_deallocate (to remove backing_object ref.) 13) Fix dead object handling, that had problems under heavy memory load. 14) Add minor vm_page_lookup improvements. 15) Some pages are not in objects, and make sure that the vm_page.c can properly support such pages. 16) Add some more page deficit handling. 17) Some minor code readability improvements.
* Turn DIAGNOSTIC into a new-style option.eivind1998-02-0415-15/+43
|
* Added #include of <sys/queue.h> so that this file is more "self"-sufficent.bde1998-02-031-1/+3
|
* This fix should help the panic problems in -current. Theredyson1998-02-031-36/+65
| | | | | | were some errors in "interval" management. Due to the clustering mechanism, the code is necessarily complex and error prone.
* Forward declare more structs that are used in prototypes here - don'tbde1998-02-011-1/+4
| | | | depend on <sys/types.h> forward declaring common ones.
* Fix a performance problem caused by an earlier commit.dyson1998-02-011-2/+3
|
* contigalloc doesn't place the allocated page(s) into an object, anddyson1998-01-311-3/+5
| | | | | | now this breaks vm_page_wire (due to wired page accounting per object.) This should fix a problem as described by Donald Maddox.
* Change the busy page mgmt, so that when pages are freed, theydyson1998-01-3110-103/+164
| | | | | | | | | | | | | | MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtile "holes." Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistant scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.
* Turn NSWAPDEV into a new-style option.eivind1998-01-251-1/+2
|
* Make all file-system (MFS, FFS, NFS, LFS, DEVFS) related option new-style.eivind1998-01-241-1/+3
| | | | | | | | This introduce an xxxFS_BOOT for each of the rootable filesystems. (Presently not required, but encouraged to allow a smooth move of option *FS to opt_dontuse.h later.) LFS is temporarily disabled, and will be re-enabled tomorrow.
* Add better support for larger I/O clusters, including larger physicaldyson1998-01-241-4/+2
| | | | | I/O. The support is not mature yet, and some of the underlying implementation needs help. However, support does exist for IDE devices now.
* VM level code cleanups.dyson1998-01-2214-410/+284
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous. 2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts. 3) Remove the "wired" vm_map nonsense. 4) No need to keep a cache of kernel stack kva's. 5) Get rid of strange looking ++var, and change to var++. 6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory. 7) Keep ioopt disabled for now. 8) Remove the now bogus "single use" map concept. 9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur. 10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.) 11) Fix some vnode locking problems. (From Tor, I think.) 12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.) 13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneded collpase operations, and clean up the cluster code. 14) Make vm_zone more suitable for TSM. This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.) This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)
* Allow gdb to work again.dyson1998-01-211-4/+6
|
* Tie up some loose ends in vnode/object management. Remove an unneededdyson1998-01-179-239/+277
| | | | | | | config option in pmap. Fix a problem with faulting in pages. Clean-up some loose ends in swap pager memory management. The system should be much more stable, but all subtile bugs aren't fixed yet.
* Fix some vnode management problems, and better mgmt of vnode free list.dyson1998-01-126-55/+170
| | | | | | | | | | | | | | | Fix the UIO optimization code. Fix an assumption in vm_map_insert regarding allocation of swap pagers. Fix an spl problem in the collapse handling in vm_object_deallocate. When pages are freed from vnode objects, and the criteria for putting the associated vnode onto the free list is reached, either put the vnode onto the list, or put it onto an interrupt safe version of the list, for further transfer onto the actual free list. Some minor syntax changes changing pre-decs, pre-incs to post versions. Remove a bogus timeout (that I added for debugging) from vn_lock. PHK will likely still have problems with the vnode list management, and so do I, but it is better than it was.
* Turn off the VTEXT flag when an object is no longer referenced, sodyson1998-01-071-2/+6
| | | | | that an executable that is no longer running can be written to. Also, clear the OBJ_OPT flag more often, when appropriate.
* Make our v_usecount vnode reference count work identically to thedyson1998-01-067-190/+245
| | | | | | | | | | | | | | | | | | | | | | | | | | | original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitiously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does. When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex. When vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore. A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes. Please be careful with your system while running this code, but I would greatly appreciate feedback as soon a reasonably possible.
* caddr_t --> void *alex1997-12-312-16/+16
|
* Fix the decl of vfs_ioopt, allow LFS to compile again, fix a minor problemdyson1997-12-291-2/+1
| | | | with the object cache removal.
* Lots of improvements, including restructring the caching and managementdyson1997-12-297-221/+79
| | | | | | | | | | | | | | of vnodes and objects. There are some metadata performance improvements that come along with this. There are also a few prototypes added when the need is noticed. Changes include: 1) Cleaning up vref, vget. 2) Removal of the object cache. 3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore. 4) Correct some missing LK_RETRY's in vn_lock. 5) Correct the page range in the code for msync. Be gentle, and please give me feedback asap.
* The ioopt code is still buggy, but wasn't fully disabled.dyson1997-12-251-2/+3
|
* Support running with inadequate swap space. Additionally, the codedyson1997-12-242-8/+16
| | | | will complain with a suggestion of increasing it.
* Improve my copyright.dyson1997-12-221-9/+2
|
* Change bogus usage of btoc to atop. The incorrect usage of btoc wasdyson1997-12-191-13/+12
| | | | pointed out by bde.
* Some performance improvements, and code cleanups (including changing ourdyson1997-12-196-24/+359
| | | | expensive OFF_TO_IDX to btoc whenever possible.)
* Make COMPAT_43 and COMPAT_SUNOS new-style options.eivind1997-12-161-1/+2
|
* Fix a recursive kernel_map lock problem in vm_zone allocator.dyson1997-12-151-19/+23
| | | | PR: 5298
OpenPOWER on IntegriCloud