path: root/sys/kern/vfs_bio.c
Commit message | Author | Age | Files | Lines
* Finish the vm_object locking for this file, including holding the vm_object lock when accessing the vm_object's flags or calling vm_page_lookup().
  (alc, 2003-04-28, 1 file, -3/+8)
* - Lock the vm_object when performing vm_page_alloc() in allocbuf().
  (alc, 2003-04-26, 1 file, -0/+4)
* Lock the vm_object in vfs_busy_pages().
  (alc, 2003-04-20, 1 file, -0/+4)
* - Lock the vm_object when performing vm_object_pip_subtract().
  - Assert that the vm_object lock is held in vm_object_pip_subtract().
  (alc, 2003-04-19, 1 file, -1/+2)
* - Lock the vm_object when performing vm_object_pip_wakeupn().
  - Assert that the vm_object lock is held in vm_object_pip_wakeupn().
  - Add a new macro VM_OBJECT_LOCK_ASSERT().
  (alc, 2003-04-19, 1 file, -1/+6)
* Update locking on the kernel_object to use the new macros.
  (alc, 2003-04-14, 1 file, -4/+4)
* Remove an unnecessary trunc_page() from vmapbuf().
  Reviewed by: tegge
  (alc, 2003-04-06, 1 file, -1/+1)
* o Check the b_bufsize passed to vmapbuf(), returning an error if it is invalid.
  o Remove a debugging printf() from vmapbuf().
  Suggested by: tegge
  (alc, 2003-04-04, 1 file, -2/+2)
* Preparation commit before I start on the bioqueue lockdown:
  Collect all the bits of bioqueue handling in subr_disk.c; vfs_bio.c is big enough as it is and disksort already lives in subr_disk.c.
  (phk, 2003-03-30, 1 file, -25/+0)
* Add support for reading directly from a file to a userland buffer when the O_DIRECT descriptor status flag is set and both the offset and the length are multiples of the physical media sector size.
  (tegge, 2003-03-26, 1 file, -0/+12)
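The eligibility condition in the O_DIRECT entry above can be sketched as a small predicate. This is a userland illustration only: the function name `direct_read_eligible` and its parameters are hypothetical and are not taken from vfs_bio.c, and the sketch says nothing about how the kernel actually performs the direct read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's code: decide whether a read may
 * go straight to the user buffer.  direct_flag_set stands in for "the
 * O_DIRECT status flag is set on the descriptor"; sector_size is the
 * physical media sector size and must be nonzero.
 */
static bool
direct_read_eligible(bool direct_flag_set, uint64_t offset, uint64_t length,
    uint64_t sector_size)
{
	assert(sector_size > 0);
	if (!direct_flag_set)
		return (false);
	/* Both the offset and the length must be sector multiples. */
	return (offset % sector_size == 0 && length % sector_size == 0);
}
```

With a 512-byte sector, a read at offset 1024 of length 4096 qualifies, while a read at offset 100 does not and would take the ordinary buffered path.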
* - Add vm_paddr_t, a physical address type. This is required for systems where physical addresses are larger than virtual addresses, such as i386s with PAE.
  - Use this to represent physical addresses in the MI vm system and in the i386 pmap code. This also changes the paddr parameter to d_mmap_t.
  - Fix printf formats to handle physical addresses >4G in the i386 memory detection code, and due to kvtop returning vm_paddr_t instead of u_long.
  Note that this is a name change only; vm_paddr_t is still the same as vm_offset_t on all currently supported platforms.
  Sponsored by: DARPA, Network Associates Laboratories
  Discussed with: re, phk (cdevsw change)
  (jake, 2003-03-25, 1 file, -1/+1)
* Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a nested include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.
  (phk, 2003-03-18, 1 file, -1/+0)
* - Add a lock for protecting against msleep(bp, ...) wakeup(bp) races.
  - Create a new function bdone() which sets B_DONE and calls wakeup(bp). This is suitable for use as b_iodone for buf consumers who are not going through the buf cache.
  - Create a new function bwait() which waits for the buf to be done at a set priority and with a specific wmesg.
  - Replace several cases where the above functionality was implemented without locking with the new functions.
  (jeff, 2003-03-13, 1 file, -10/+37)
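The race that the bdone()/bwait() entry above closes is the classic lost wakeup: a waiter tests the done flag, completion runs and issues its wakeup, and only then does the waiter go to sleep, with nobody left to wake it. A minimal userland sketch of the locked pattern follows. It is an analogy, not the kernel code: POSIX threads stand in for msleep()/wakeup(), and `struct toy_buf`, `toy_bdone()`, `toy_bwait()`, `done_lock`, and `done_cond` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userland stand-in for a buf; only the completion state is modelled. */
struct toy_buf {
	bool done;		/* plays the role of B_DONE */
};

/* One lock and one condition variable guard every toy buf's done flag. */
static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cond = PTHREAD_COND_INITIALIZER;

/* Mark the buf done and wake any waiters, all with the lock held. */
static void
toy_bdone(struct toy_buf *bp)
{
	int error;

	error = pthread_mutex_lock(&done_lock);
	assert(error == 0);
	bp->done = true;
	pthread_cond_broadcast(&done_cond);
	pthread_mutex_unlock(&done_lock);
}

/*
 * Block until the buf is done.  The flag is tested with the lock held,
 * and pthread_cond_wait() releases the lock atomically while sleeping,
 * so a completion cannot slip in between the test and the sleep.
 */
static void
toy_bwait(struct toy_buf *bp)
{
	int error;

	error = pthread_mutex_lock(&done_lock);
	assert(error == 0);
	while (!bp->done)
		pthread_cond_wait(&done_cond, &done_lock);
	pthread_mutex_unlock(&done_lock);
}
```

If completion has already happened, `toy_bwait()` returns at once without sleeping; otherwise it sleeps until `toy_bdone()` runs in another thread.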
* - Remove a race between fsync-like functions and flushbufqueues() by requiring locked bufs in vfs_bio_awrite(). Previously the buf could have been written out by fsync before we acquired the buf lock if it weren't for Giant. The cluster_wbuild() handles this race properly but the single write at the end of vfs_bio_awrite() would not.
  - Modify flushbufqueues() so there is only one copy of the loop. Pass a parameter in that says whether or not we should sync bufs with deps.
  - Call flushbufqueues() a second time and then break if we couldn't find any bufs without deps.
  (jeff, 2003-03-13, 1 file, -44/+32)
* - Add a new 'flags' parameter to getblk().
  - Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT flag to the initial BUF_LOCK(). This will eventually be used in cases where we want to use a buffer only if it is not currently in use.
  - Convert all consumers of the getblk() API to use this extra parameter.
  Reviewed by: arch
  Not objected to by: mckusick
  (jeff, 2003-03-04, 1 file, -5/+10)
* - Hold the vnode interlock across calls to bgetvp instead of acquiring it internally. This is required to stop multiple bufs from being associated with a single lblkno.
  (jeff, 2003-03-02, 1 file, -1/+4)
* - gc USE_BUFHASH. The smp locking of the buf cache renders this useless.
  (jeff, 2003-03-01, 1 file, -104/+0)
* When doing cleanup of excessive buffers in bdwrite (see kern/vfs_bio.c delta 1.371) we must ensure that we do not get ourselves into a recursive trap endlessly trying to clean up after ourselves.
  Reported by: Attila Nagy <bra@fsn.hu>
  Sponsored by: DARPA & NAI Labs.
  (mckusick, 2003-02-25, 1 file, -2/+8)
* - Add the missing NULL interlock argument to a recently added BUF_LOCK.
  (jeff, 2003-02-25, 1 file, -1/+1)
* Prevent large files from monopolizing the system buffers. Keep track of the number of dirty buffers held by a vnode. When a bdwrite is done on a buffer, check the existing number of dirty buffers associated with its vnode. If the number rises above vfs.dirtybufthresh (currently 90% of vfs.hidirtybuffers), one of the other (hopefully older) dirty buffers associated with the vnode is written (using bawrite). In the event that this approach fails to curb the growth in the vnode's number of dirty buffers (due to soft updates rollback dependencies), the more drastic approach of doing a VOP_FSYNC on the vnode is used. This code primarily affects very large and actively written files such as snapshots. This change should eliminate hanging when taking snapshots or doing background fsck on very large filesystems.
  Hopefully, one day it will be possible to cache filesystem metadata in the VM cache as is done with file data. As it stands, only the buffer cache can be used which limits total metadata storage to about 20Mb no matter how much memory is available on the system. This rather small memory gets badly thrashed causing a lot of extra I/O. For example, taking a snapshot of a 1Tb filesystem minimally requires about 35,000 write operations, but because of the cache thrashing (we only have about 350 buffers at our disposal) ends up doing about 237,540 I/O's thus taking twenty-five minutes instead of four if it could run entirely in the cache.
  Reported by: Attila Nagy <bra@fsn.hu>
  Sponsored by: DARPA & NAI Labs.
  (mckusick, 2003-02-25, 1 file, -3/+56)
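The two-stage policy in the entry above (write one other dirty buffer once the vnode crosses the threshold, fall back to a full fsync if the count keeps growing) can be sketched as a pure decision function. This is a simplified illustration, not the committed code: `choose_flush_action()`, the `flush_action` enum, and the `slack` margin that separates the two stages are assumptions made for the example. The commit message does not say how "fails to curb the growth" is detected.

```c
#include <assert.h>

enum flush_action {
	FLUSH_NONE,		/* at or below the threshold: nothing to do */
	FLUSH_ONE_BUFFER,	/* write one other dirty buffer (bawrite) */
	FLUSH_WHOLE_VNODE	/* drastic fallback: VOP_FSYNC on the vnode */
};

/* 90% of the high-water mark, as stated in the commit message. */
static int
dirtybufthresh_from(int hidirtybuffers)
{
	return (hidirtybuffers * 9 / 10);
}

/*
 * Hypothetical decision helper.  "slack" is an assumed margin above the
 * threshold beyond which writing single buffers is judged not to be
 * keeping up; it is an illustration parameter, not a documented tunable.
 */
static enum flush_action
choose_flush_action(int vnode_dirty, int dirtybufthresh, int slack)
{
	assert(slack >= 0);
	if (vnode_dirty > dirtybufthresh + slack)
		return (FLUSH_WHOLE_VNODE);
	if (vnode_dirty > dirtybufthresh)
		return (FLUSH_ONE_BUFFER);
	return (FLUSH_NONE);
}
```

A caller on the delayed-write path would evaluate this once per bdwrite with the vnode's current dirty-buffer count. Keeping the decision separate from the I/O makes the thresholds easy to reason about in isolation.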
* - Add an interlock argument to BUF_LOCK and BUF_TIMELOCK.
  - Remove the buftimelock mutex and acquire the buf's interlock to protect these fields instead.
  - Hold the vnode interlock while locking bufs on the clean/dirty queues. This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another BUF_LOCK with a LK_TIMEFAIL to a single lock.
  Reviewed by: arch, mckusick
  (jeff, 2003-02-25, 1 file, -16/+20)
* Back out M_* changes, per decision of the TRB.
  Approved by: trb
  (imp, 2003-02-19, 1 file, -1/+1)
* - Introduce a new function bremfreel() that does a bremfree with the buf queue lock already held.
  - In getblk() and flushbufqueues() use bremfreel() while we still have the buf queue lock held to keep the lists consistent.
  - Add LK_NOWAIT to two cases where we're essentially asserting that the bufs are not locked while acquiring the locks. This will make sure that we get the appropriate panic() and not another one for sleeping with a lock held.
  (jeff, 2003-02-16, 1 file, -8/+14)
* - Add a comment about a race that will happen without Giant.
  (jeff, 2003-02-10, 1 file, -0/+1)
* - Unlock the nblock after the loop in bwillwrite().
  (jeff, 2003-02-10, 1 file, -1/+1)
* - In getnewbuf() unlock the bq lock prior to sleeping when we're out of buffers.
  Submitted by: tegge
  (jeff, 2003-02-10, 1 file, -0/+1)
* - Correct another atomic op.
  Spotted by: alc
  (jeff, 2003-02-09, 1 file, -1/+2)
* - Move some code out from #ifdef INVARIANTS.
  (jeff, 2003-02-09, 1 file, -2/+0)
* - Clean up unlocked accesses to buf flags by introducing a new b_vflags member that is protected by the vnode lock.
  - Move B_SCANNED into b_vflags and call it BV_SCANNED.
  - Create a vop_stdfsync() modeled after spec's sync.
  - Replace spec_fsync, msdos_fsync, and hpfs_fsync with the stdfsync and some fs specific processing. This gives all of these filesystems proper behavior wrt MNT_WAIT/NOWAIT and the use of the B_SCANNED flag.
  - Annotate the locking in buf.h.
  (jeff, 2003-02-09, 1 file, -2/+4)
* - spell add 'add' and not 'subtract' in an atomic op.
  Spotted by: alc
  Pointy hat to: jeff
  (jeff, 2003-02-09, 1 file, -1/+1)
* - Lock down the buffer cache's infrastructure code. This includes locks on buf lists, synchronization variables, and atomic ops for the counters. This change does not remove Giant from any code although some pushdown may be possible.
  - In vfs_bio_awrite() don't access buf fields without the buf lock.
  (jeff, 2003-02-09, 1 file, -61/+154)
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
  Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
  (alfred, 2003-01-21, 1 file, -1/+1)
* Close the remaining user address mapping races for physical I/O, CAM, and AIO. Still TODO: streamline useracc() checks.
  Reviewed by: alc, tegge
  MFC after: 7 days
  (dillon, 2003-01-20, 1 file, -6/+26)
* - Hold the page queues lock around vm_page_hold().
  - Assert that the page queues lock rather than Giant is held in vm_page_hold().
  (alc, 2003-01-20, 1 file, -0/+2)
* Fix two long-standing, but likely harmless, errors in the use of vm_pageout_deficit:
  1. Update vm_pageout_deficit before VM_WAIT. There is no sense in delaying the update; the sooner the pageout daemon receives this information the better.
     Reviewed by: tegge
  2. Update vm_pageout_deficit according to the number of pages still needed to complete the allocation, not the original size of the allocation.
     Submitted by: tegge
  (These errors have existed since the introduction of vm_pageout_deficit in revision 1.144.)
  (alc, 2003-01-16, 1 file, -2/+2)
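The second fix in the entry above comes down to which quantity is added to the deficit. A small userland sketch, under stated assumptions: `toy_pageout_deficit` and `toy_note_shortfall()` are hypothetical stand-ins, and C11 atomics are used here only to mirror the 2003-01-14 entry in this log, which says the counter is updated with atomic operations. The caller is assumed to invoke this before it blocks waiting for memory, matching the first fix.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userland stand-in for vm_pageout_deficit. */
static atomic_int toy_pageout_deficit;

/*
 * Record how many pages an allocation is still missing.  The amount
 * added is the remaining shortfall (wanted minus already obtained),
 * not the original size of the request.  Returns the amount added.
 */
static int
toy_note_shortfall(int pages_wanted, int pages_obtained)
{
	int still_needed;

	assert(pages_obtained >= 0 && pages_obtained <= pages_wanted);
	still_needed = pages_wanted - pages_obtained;
	atomic_fetch_add(&toy_pageout_deficit, still_needed);
	return (still_needed);
}
```

For an 8-page request that has already obtained 5 pages, the deficit grows by 3 rather than by 8, so the pageout daemon is not asked to free pages the allocation no longer needs.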
* Merge all the various copies of vmapbuf() and vunmapbuf() into a single portable copy. Note that pmap_extract() must be used instead of pmap_kextract().
  This is precursor work to a reorganization of vmapbuf() to close remaining user/kernel races (which can lead to a panic).
  (dillon, 2003-01-15, 1 file, -0/+76)
* - Update vm_pageout_deficit using atomic operations. It's a simple counter outside the scope of existing locks.
  - Eliminate a redundant clearing of vm_pageout_deficit.
  (alc, 2003-01-14, 1 file, -2/+4)
* vm_hold_load_pages() needn't clear PG_ZERO because it didn't pass VM_ALLOC_ZERO to vm_page_alloc(). (PG_ZERO is clear by default.)
  (alc, 2003-01-12, 1 file, -1/+0)
* Make bogus_offset local to bufinit().
  (alc, 2003-01-07, 1 file, -6/+1)
* Fix cut&paste bug which would result in a panic because a buffer was being biodone'ed multiple times.
  (phk, 2003-01-05, 1 file, -2/+2)
* Allocate bogus_page with VM_ALLOC_WIRED. (Previously, bogus_page's allocation incremented the global count of wired pages, but not the page's own wire count. This inconsistency was introduced in revision 1.230.)
  (alc, 2003-01-05, 1 file, -2/+1)
* Temporarily introduce a new VOP_SPECSTRATEGY operation while I try to sort out disk-io from file-io in the vm/buffer/filesystem space.
  The intent is to sort VOP_STRATEGY calls into those which operate on "real" vnodes and those which operate on VCHR vnodes. For the latter kind, the call will be changed to VOP_SPECSTRATEGY, possibly conditionally for those places where dual-use happens.
  Add a default VOP_SPECSTRATEGY method which will call the normal VOP_STRATEGY. The first time it is called it will print debugging information. This will only happen if a normal vnode is passed to VOP_SPECSTRATEGY by mistake.
  Add a real VOP_SPECSTRATEGY in specfs, which does what VOP_STRATEGY does on a VCHR vnode today.
  Add a new VOP_STRATEGY method in specfs to catch instances where the conversion to VOP_SPECSTRATEGY has not yet happened. Handle the request just like we always did, but the first time it is called print debugging information.
  Apart from up to two instances of console messages per boot, this amounts to a glorified no-op commit.
  If you get any of the messages on your console I would very much like a copy of them mailed to phk@freebsd.org
  (phk, 2003-01-04, 1 file, -3/+12)
* Don't call VOP_BMAP on VCHR vnodes when the logical and physical block numbers are identical: it cannot even hope to accomplish anything.
  (phk, 2003-01-04, 1 file, -1/+1)
* Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since all BUF_STRATEGY did in the first place was call VOP_STRATEGY.
  (phk, 2003-01-03, 1 file, -1/+1)
* Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup, especially in troff files.
  (schweikh, 2003-01-01, 1 file, -1/+1)
* Hold the page queues lock when calling vm_page_flag_clear().
  (alc, 2002-12-27, 1 file, -0/+6)
* - Hold the kernel_object's lock around vm_page_alloc(kernel_object,...).
  - Hold the page queues lock around vm_page_wakeup().
  (alc, 2002-12-23, 1 file, -0/+6)
* The buffer daemon cannot skip over buffers owned by locked inodes as they may be the only viable ones to flush. Thus it will now wait for an inode lock if the other alternatives will result in rollbacks (and immediate redirtying of the buffer). If only buffers with rollbacks are available, one will be flushed, but then the buffer daemon will wait briefly before proceeding. Failing to wait briefly effectively deadlocks a uniprocessor, since every other process writing to that filesystem will wait for the buffer daemon to clean up, which takes close enough to forever to feel like a deadlock.
  Reported by: Archie Cobbs <archie@dellroad.org>
  Sponsored by: DARPA & NAI Labs.
  Approved by: re
  (mckusick, 2002-12-14, 1 file, -46/+64)
* Hold the page queues/flags lock when calling vm_page_set_validclean().
  Approved by: re
  (alc, 2002-11-23, 1 file, -1/+5)
* Now that pmap_remove_all() is exported by our pmap implementations, use it directly.
  (alc, 2002-11-16, 1 file, -2/+2)