summaryrefslogtreecommitdiffstats
path: root/sys/vm/vnode_pager.c
Commit message (Collapse)AuthorAgeFilesLines
* Be consistent about "static" functions: if the function is markedphk2002-09-281-1/+1
| | | | | | static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512
* - Add a ASSERT_VOP_LOCKED in vnode_pager_alloc.jeff2002-09-251-2/+7
| | | | - Lock access to v_iflags.
* o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever sincealc2002-08-251-1/+1
| | | | | | pmap_zero_page() and pmap_zero_page_area() were modified to accept a struct vm_page * instead of a physical address, vm_page_zero_fill() and vm_page_zero_fill_area() have served no purpose.
* - Replace v_flag with v_iflag and v_vflagjeff2002-08-041-9/+17
| | | | | | | | | | | | | | | - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS
* o Lock page queue accesses by vm_page_free().alc2002-07-281-14/+19
| | | | o Apply some style fixes.
* o Lock page queue accesses by vm_page_activate() and vm_page_deactivate().alc2002-07-271-0/+2
|
* - Use (OFF_TO_IDX(off) - pi) instead of (OFF_TO_IDX(off - IDX_TO_OFF(pi))).robert2002-07-011-5/+7
| | | | - Reformat a comment.
* o Replace GIANT_REQUIRED in vnode_pager_alloc() by the acquisition andalc2002-06-221-6/+5
| | | | | | | release of Giant. (Annotate as MPSAFE.) o Also, in vnode_pager_alloc(), remove an unnecessary re-initialization of struct vm_object::flags and move a statement that is duplicated in both branches of an if-else.
* More s/file system/filesystem/gtrhodes2002-05-161-1/+1
|
* Make daddr_t and u_daddr_t 64bits wide.phk2002-05-141-2/+2
| | | | | | Retire daddr64_t and use daddr_t instead. Sponsored by: DARPA & NAI Labs.
* o Condition the compilation and use of vm_freeze_copyopts()alc2002-05-061-0/+2
| | | | on ENABLE_VFS_IOOPT.
* We do not necessarily need to map/unmap pages to zero parts of them.peter2002-04-281-4/+1
| | | | | On systems where physical memory is also direct mapped (alpha, sparc, ia64 etc) this is slightly harmful.
* Remove __P.alfred2002-03-191-10/+10
|
* Introduce the new 64-bit size disk block, daddr64_t. Changemckusick2002-03-151-2/+2
| | | | | | | | | | | | the bio and buffer structures to have daddr64_t bio_pblkno, b_blkno, and b_lblkno fields which allows access to disks larger than a Terabyte in size. This change also requires that the VOP_BMAP vnode operation accept and return daddr64_t blocks. This delta should not affect system operation in any way. It merely sets up the necessary interfaces to allow the development of disk drivers that work with these larger disk block addresses. It also allows for the development of UFS2 which will use 64-bit block addresses.
* - Remove a number of extra newlines that do not belong here according toeivind2002-03-101-6/+4
| | | | | | | | | style(9) - Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc. - Add /* SYMBOL */ after a few #endifs. Reviewed by: alc
* Simple p_ucred -> td_ucred changes to start using the per-thread ucredjhb2002-02-271-6/+6
| | | | reference.
* This fixes a large number of bugs in our NFS client side code. A recentdillon2001-12-141-2/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit by Kirk also fixed a softupdates bug that could easily be triggered by server side NFS. * An edge case with shared R+W mmap()'s and truncate whereby the system would inappropriately clear the dirty bits on still-dirty data. (applicable to all filesystems) THIS FIX TEMPORARILY DISABLED PENDING FURTHER TESTING. see vm/vm_page.c line 1641 * The straddle case for VM pages and buffer cache buffers when truncating. (applicable to NFS client side) * Possible SMP database corruption due to vm_pager_unmap_page() not clearing the TLB for the other cpu's. (applicable to NFS client side but could effect all filesystems). Note: not considered serious since the corruption occurs beyond the file EOF. * When flusing a dirty buffer due to B_CACHE getting cleared, we were accidently setting B_CACHE again (that is, bwrite() sets B_CACHE), when we really want it to stay clear after the write is complete. This resulted in a corrupt buffer. (applicable to all filesystems but probably only triggered by NFS) * We have to call vtruncbuf() when ftruncate()ing to remove any buffer cache buffers. This is still tentitive, I may be able to remove it due to the second bug fix. (applicable to NFS client side) * vnode_pager_setsize() race against nfs_vinvalbuf()... we have to set n_size before calling nfs_vinvalbuf or the NFS code may recursively vnode_pager_setsize() to the original value before the truncate. This is what was causing the user mmap bus faults in the nfs tester program. (applicable to NFS client side) * Fix to softupdates (see ufs/ffs/ffs_inode.c 1.73, commit made by Kirk). Testing program written by: Avadis Tevanian, Jr. Testing program supplied by: jkh / Apple (see Dec2001 posting to freebsd-hackers with Subject 'NFS: How to make FreeBS fall on its face in one easy step') MFC after: 1 week
* Adjust vnode_pager_input_smlfs() to not attempt to BMAP blocks beyond thedillon2001-11-051-2/+7
| | | | | | | | | | | file EOF. This works around a bug in the ISOFS (CDRom) BMAP code which returns bogus values for requests beyond the file EOF rather then returning an error, resulting in either corrupt data being mmap()'d beyond the file EOF or resulting in a seg-fault on the last page of a mmap()'d file (mmap()s of CDRom files). Reported by: peter / Yahoo MFC after: 3 days
* Finally fix the VM bug where a file whos EOF occurs in the middle of a pagedillon2001-10-121-3/+21
| | | | | | | | | | | | | | | | | would sometimes prevent a dirty page from being cleaned, even when synced, resulting in the dirty page being re-flushed to disk every 30-60 seconds or so, forever. The problem is that when the filesystem flushes a page to its backing file it typically does not clear dirty bits representing areas of the page that are beyond the file EOF. If the file is also mmap()'d and a fault is taken, vm_fault (properly, is required to) set the vm_page_t->dirty bits to VM_PAGE_BITS_ALL. This combination could leave us with an uncleanable, unfreeable page. The solution is to have the vnode_pager detect the edge case and manually clear the dirty bits representing areas beyond the file EOF. The filesystem does the rest and the page comes up clean after the write completes. MFC after: 3 days
* Change the kernel's ucred API as follows:jhb2001-10-111-10/+8
| | | | | | | | - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.
* KSE Milestone 2julian2001-09-121-8/+8
| | | | | | | | | | | | | | Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
* Whitespace fixes.jhb2001-08-041-1/+1
|
* whitespace / register cleanupdillon2001-07-041-2/+2
|
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approachdillon2001-07-041-50/+17
| | | | | | | | | (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
* Fix a XXX comment by moving the initialization of the number of pbuf'sjhb2001-07-031-9/+9
| | | | | for the vnode pager to a new vnode pager init method instead of making it a hack in getpages().
* Don't hold the VM lock across VOP's and other things that can sleep.jhb2001-05-291-2/+19
|
* - Assert Giant is held in the vnode pager methods.jhb2001-05-231-14/+18
| | | | | - Lock the VM while walking down a vm_object's backing_object list in vnode_pager_lock().
* Introduce a global lock for the vm subsystem (vm_mtx).alfred2001-05-191-15/+38
| | | | | | | | | | | | | | | | | | | vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
* Revert consequences of changes to mount.h, part 2.grog2001-04-291-2/+0
| | | | Requested by: bde
* Correct #includes to work with fixed sys/mount.h.grog2001-04-231-0/+2
|
* vnode_pager_freepage() is really vm_page_free() in disguise,alfred2001-04-191-14/+7
| | | | nuke vnode_pager_freepage() and replace all calls to it with vm_page_free()
* This implements a better launder limiting solution. There was a solutiondillon2000-12-261-3/+36
| | | | | | | | | | | | | | | | | | | in 4.2-REL which I ripped out in -stable and -current when implementing the low-memory handling solution. However, maxlaunder turns out to be the saving grace in certain very heavily loaded systems (e.g. newsreader box). The new algorithm limits the number of pages laundered in the first pageout daemon pass. If that is not sufficient then suceessive will be run without any limit. Write I/O is now pipelined using two sysctls, vfs.lorunningspace and vfs.hirunningspace. This prevents excessive buffered writes in the disk queues which cause long (multi-second) delays for reads. It leads to more stable (less jerky) and generally faster I/O streaming to disk by allowing required read ops (e.g. for indirect blocks and such) to occur without interrupting the write stream, amoung other things. NOTE: eventually, filesystem write I/O pipelining needs to be done on a per-device basis. At the moment it is globalized.
* Add snapshots to the fast filesystem. Most of the changes supportmckusick2000-07-111-0/+5
| | | | | | | | | | | | | | | | | | | | the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness is not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
* Implement an optimization of the VM<->pmap API. Pass vm_page_t's directlypeter2000-05-211-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to various pmap_*() functions instead of looking up the physical address and passing that. In many cases, the first thing the pmap code was doing was going to a lot of trouble to get back the original vm_page_t, or it's shadow pv_table entry. Inspired by: John Dyson's 1998 patches. Also: Eliminate pv_table as a seperate thing and build it into a machine dependent part of vm_page_t. This eliminates having a seperate set of structions that shadow each other in a 1:1 fashion that we often went to a lot of trouble to translate from one to the other. (see above) This happens to save 4 bytes of physical memory for each page in the system. (8 bytes on the Alpha). Eliminate the use of the phys_avail[] array to determine if a page is managed (ie: it has pv_entries etc). Store this information in a flag. Things like device_pager set it because they create vm_page_t's on the fly that do not have pv_entries. This makes it easier to "unmanage" a page of physical memory (this will be taken advantage of in subsequent commits). Add a function to add a new page to the freelist. This could be used for reclaiming the previously wasted pages left over from preloaded loader(8) files. Reviewed by: dillon
* Separate the struct bio related stuff out of <sys/buf.h> intophk2000-05-051-0/+1
| | | | | | | | | | | | | | | <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter
* Move B_ERROR flag to b_ioflags and call it BIO_ERROR.phk2000-04-021-2/+2
| | | | | | | | | | | | | (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.
* Revert spelling mistake I made in the previous commitcharnier2000-03-271-1/+1
| | | | Requested by: Alan and Bruce
* Spellingcharnier2000-03-261-2/+2
|
* Rename the existing BUF_STRATEGY() to DEV_STRATEGY()phk2000-03-201-2/+2
| | | | | | | | substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo) substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo) This patch is machine generated except for the ccd.c and buf.h parts.
* Remove B_READ, B_WRITE and B_FREEBUF and replace them with a newphk2000-03-201-2/+2
| | | | | | | | | | | | | | | | | | | | | field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch were machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.
* useracc() the prequel:phk1999-10-291-1/+0
| | | | | | | | | | | Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
* The vnode pager (used when you do file-backed mmaps) must use thedillon1999-09-171-3/+11
| | | | | | | | | | | | underlying physical sector size when aligning I/O transfer sizes. It cannot assume 512 bytes. We assume the underlying sector size is a power of 2. If it isn't, mmap() will break badly anyway (in the same way mmap broke with NFS when NFS tried to cache piecemeal write ranges in buffers, before we enforced read-buffer-before-write-piecemeal for NFS). Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>
* $Id$ -> $FreeBSD$peter1999-08-281-1/+1
|
* Add the (inline) function vm_page_undirty for clearing the dirty bitmaskalc1999-08-171-3/+3
| | | | | | | | of a vm_page. Use it. Submitted by: dillon
* Fix some int/long printf problems for the Alphapeter1999-07-011-3/+3
|
* Convert buffer locking from using the B_BUSY and B_WANTED flags to usingmckusick1999-06-261-3/+3
| | | | | | | lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.
* Fix confusion of size of transfer with size of the pager.dt1999-05-151-3/+4
| | | | | PR: 11658 Broken in: 1.89 (1998/03/07)
* remove b_proc from struct buf, it's (now) unused.phk1999-05-061-5/+3
| | | | Reviewed by: dillon, bde
* The VFS/BIO subsystem contained a number of hacks in order to optimizealc1999-05-021-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | piecemeal, middle-of-file writes for NFS. These hacks have caused no end of trouble, especially when combined with mmap(). I've removed them. Instead, NFS will issue a read-before-write to fully instantiate the struct buf containing the write. NFS does, however, optimize piecemeal appends to files. For most common file operations, you will not notice the difference. The sole remaining fragment in the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache coherency issues with read-merge-write style operations. NFS also optimizes the write-covers-entire-buffer case by avoiding the read-before-write. There is quite a bit of room for further optimization in these areas. The VM system marks pages fully-valid (AKA vm_page_t->valid = VM_PAGE_BITS_ALL) in several places, most noteably in vm_fault. This is not correct operation. The vm_pager_get_pages() code is now responsible for marking VM pages all-valid. A number of VM helper routines have been added to aid in zeroing-out the invalid portions of a VM page prior to the page being marked all-valid. This operation is necessary to properly support mmap(). The zeroing occurs most often when dealing with file-EOF situations. Several bugs have been fixed in the NFS subsystem, including bits handling file and directory EOF situations and buf->b_flags consistancy issues relating to clearing B_ERROR & B_INVAL, and handling B_DONE. getblk() and allocbuf() have been rewritten. B_CACHE operation is now formally defined in comments and more straightforward in implementation. B_CACHE for VMIO buffers is based on the validity of the backing store. B_CACHE for non-VMIO buffers is based simply on whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear, and vise-versa). biodone() is now responsible for setting B_CACHE when a successful read completes. B_CACHE is also set when a bdwrite() is initiated and when a bwrite() is initiated. VFS VOP_BWRITE routines (there are only two - nfs_bwrite() and bwrite()) are now expected to set B_CACHE. This means that bowrite() and bawrite() also set B_CACHE indirectly. There are a number of places in the code which were previously using buf->b_bufsize (which is DEV_BSIZE aligned) when they should have been using buf->b_bcount. These have been fixed. getblk() now clears B_DONE on return because the rest of the system is so bad about dealing with B_DONE. Major fixes to NFS/TCP have been made. A server-side bug could cause requests to be lost by the server due to nfs_realign() overwriting other rpc's in the same TCP mbuf chain. The server's kernel must be recompiled to get the benefit of the fixes. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
* Convert usage of vm_page_bits() to the new convention ("Inputs are requireddt1999-04-101-2/+2
| | | | to range within a page").
OpenPOWER on IntegriCloud