path: root/sys/kern/vfs_cluster.c
Commit message    [Author, Date, Files, Lines changed]
* Remove a stale comment. The very same revision (r85511) that introduced  [alc, 2009-06-30, 1 file, -3/+0]
    this comment also implemented the proposed change to the code.
    Approved by: re (kib)
* Correct a long-standing performance bug in cluster_rbuild(). Specifically,  [alc, 2009-06-27, 1 file, -4/+15]
    in the case of a file system with a block size that is less than the
    page size, cluster_rbuild() looks at too many of the page's valid bits.
    Consequently, it may terminate prematurely, resulting in poor performance.
    Reported by: bde
    Reviewed by: tegge
    Approved by: re (kib)
* Eliminate unnecessary obfuscation when testing a page's valid bits.  [alc, 2009-06-07, 1 file, -4/+2]
* - Complete part of the unfinished bufobj work by consistently using  [jeff, 2008-03-22, 1 file, -13/+18]
    BO_LOCK/UNLOCK/MTX when manipulating the bufobj.
    - Create a new lock in the bufobj to lock bufobj fields independently.
      This leaves the vnode interlock as an 'identity' lock while the bufobj
      is an io lock. The bufobj lock is ordered before the vnode interlock
      and also before the mnt ilock.
    - Exploit this new lock order to simplify softdep_check_suspend().
    - A few sync related functions are marked with a new XXX to note that we
      may not properly interlock against a non-zero bv_cnt when attempting
      to sync all vnodes on a mountlist. I do not believe this race is
      important. If I'm wrong this will make these locations easier to find.
    Reviewed by: kib (earlier diff)
    Tested by: kris, pho (earlier diff)
* - Move rusage from being per-process in struct pstats to per-thread in  [jeff, 2007-06-01, 1 file, -2/+2]
    td_ru. This removes the requirement for per-process synchronization in
    statclock() and mi_switch(). This was previously supported by sched_lock
    which is going away. All modifications to rusage are now done in the
    context of the owning thread. Reads proceed without locks.
    - Aggregate exiting threads' rusage in thread_exit() such that the
      exiting thread's rusage is not lost.
    - Provide a new routine, rufetch(), to fetch an aggregate of all rusage
      structures from all threads in a process. This routine must be used in
      any place requiring a rusage from a process prior to its exit. The
      exited process's rusage is still available via p_ru.
    - Aggregate tick statistics only on demand via rufetch() or when a thread
      exits. Tick statistics are kept in the thread and protected by
      sched_lock until it exits.
    Initial patch by: attilio
    Reviewed by: attilio, bde (some objections), arch (mostly silent)
* Change these descriptions of memory types used in malloc(9), as their  [wkoszek, 2007-03-05, 1 file, -1/+1]
    current, rather long strings make output from vmstat -m look unpleasant.
    Approved by: cognet (mentor)
* Replace PG_BUSY with VPO_BUSY. In other words, changes to the page's  [alc, 2006-10-22, 1 file, -1/+1]
    busy flag, i.e., VPO_BUSY, are now synchronized by the per-vm object
    lock instead of the global page queues lock.
* Add mnt_noasync counter to better handle interleaved calls to nmount(),  [tegge, 2006-09-26, 1 file, -1/+1]
    sync() and sync_fsync() without losing MNT_ASYNC. Add MNTK_ASYNC flag
    which is set only when MNT_ASYNC is set and mnt_noasync is zero, and
    check that flag instead of MNT_ASYNC before initiating async io.
* Remove unused leaked debug function prototype.  [tegge, 2006-03-21, 1 file, -1/+0]
* Let snapshots make a copy of old contents for all buffers taking part in a  [tegge, 2006-03-19, 1 file, -5/+1]
    cluster instead of just the first buffer. Delay buf_start() calls until
    snapshots have a copy of old content.
    PR: kern/93942
* Changes imported from XFS for FreeBSD project:  [rodrigc, 2005-12-07, 1 file, -0/+15]
    - add fields to struct buf (needed by XFS)
      - 3 private fields: b_fsprivate1, b_fsprivate2, b_fsprivate3
      - b_pin_count, count of pinned buffer
    - add new B_MANAGED flag
    - add breada() function to initiate asynchronous I/O on read-ahead blocks.
    - add bufdone_finish(), bpin(), bunpin_wait() functions
    Patches provided by: kan
    Reviewed by: phk
    Silence on: arch@
* Normalize a significant number of kernel malloc type names:  [rwatson, 2005-10-31, 1 file, -1/+1]
    - Prefer '_' to ' ', as it results in more easily parsed results in
      memory monitoring tools such as vmstat.
    - Remove punctuation that is incompatible with using memory type names
      as file names, such as '/' characters.
    - Disambiguate some collisions by adding subsystem prefixes to some
      memory types.
    - Generally prefer lower case to upper case.
    - If the same type is defined in multiple architecture directories,
      attempt to use the same name in additional cases.
    Not all instances were caught in this change, so more work is required
    to finish this conversion. Similar changes are required for UMA zone
    names.
* Only set B_RAM (read-ahead mark) on an incore buffer if we can lock it.  [ups, 2005-10-24, 1 file, -3/+8]
    This fixes a race condition caused by the unlocked write access to the
    b_flags field.
    MFC after: 3 days
* Do not use vm_pager_init() to initialize the vnode_pbuf_freecnt variable.  [kan, 2005-08-13, 1 file, -6/+0]
    vm_pager_init() is run before the required nswbuf variable has been set
    to its correct value. This caused the system to run with a single pbuf
    available for vnode_pager. Handle both the cluster_pbuf_freecnt and
    vnode_pbuf_freecnt variables in the same way.
    Reported by: ade
    Obtained from: alc
    MFC after: 2 days
* Revert revision 1.164: pmap_qremove() does not require protection by  [alc, 2005-05-14, 1 file, -2/+0]
    VM_LOCK_GIANT.
    Discussed with: jeff
* - Remove spls and comments relating to them.  [jeff, 2005-05-01, 1 file, -26/+2]
* - Call VM_LOCK_GIANT in cluster_callback() to protect some pmap calls. VFS  [jeff, 2005-04-30, 1 file, -0/+2]
    will not be acquiring Giant before calling this function anymore.
    Sponsored by: Isilon Systems, Inc.
* make cluster_callback() static  [phk, 2005-02-10, 1 file, -1/+2]
* - Remove GIANT_REQUIRED where giant is no longer required.  [jeff, 2005-01-24, 1 file, -6/+0]
    Sponsored By: Isilon Systems, Inc.
* Eliminate (now) unnecessary acquisition and release of the global page  [alc, 2004-12-29, 1 file, -4/+0]
    queues lock.
* Don't manually set b_bufobj, pbgetvp() does this for us.  [phk, 2004-11-15, 1 file, -1/+0]
* Explicitly call pbrelvp()  [phk, 2004-11-15, 1 file, -0/+1]
* Retire b_magic now, we have the bufobj containing the same hint.  [phk, 2004-11-04, 1 file, -1/+0]
* Lock bp->b_bufobj->b_object instead of bp->b_object  [phk, 2004-10-28, 1 file, -8/+8]
* Avoid using bp->b_vp when we already have the vnode by other means.  [phk, 2004-10-27, 1 file, -6/+5]
* Synchronize access to the vm page's PG_BUSY flag using the containing vm  [alc, 2004-10-27, 1 file, -4/+4]
    object's lock. In the same place, eliminate unnecessary checks for a
    NULL vm object pointer.
* Move the buffer method vector (buf->b_op) to the bufobj.  [phk, 2004-10-24, 1 file, -5/+2]
    Extend it with a strategy method.
    Add bufstrategy() which does the usual VOP_SPECSTRATEGY/VOP_STRATEGY
    song and dance.
    Rename ibwrite to bufwrite().
    Move the two NFS buf_ops to more sensible places, add bufstrategy to
    them.
    Add inlines for bwrite() and bstrategy() which call through
    buf->b_bufobj->b_ops->b_{write,strategy}().
    Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with
    bstrategy().
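    A minimal sketch of the indirection described above, assuming member
    names as they appear in the message (b_bufobj, b_ops, b_write,
    b_strategy); the struct layouts and signatures are illustrative, not
    the actual sys/buf.h definitions:

        /*
         * Illustrative only: buffer I/O dispatched through the bufobj's
         * method vector instead of VOP_STRATEGY()/VOP_SPECSTRATEGY() on
         * the vnode.
         */
        struct buf;

        struct buf_ops {
                int     (*b_write)(struct buf *);
                void    (*b_strategy)(struct buf *);
        };

        struct bufobj {
                struct buf_ops *b_ops;          /* per-object I/O methods */
        };

        struct buf {
                struct bufobj *b_bufobj;        /* backing object */
        };

        static __inline int
        bwrite(struct buf *bp)
        {
                return (bp->b_bufobj->b_ops->b_write(bp));
        }

        static __inline void
        bstrategy(struct buf *bp)
        {
                bp->b_bufobj->b_ops->b_strategy(bp);
        }

    A filesystem (or NFS) would then supply its own buf_ops table, and
    callers of bwrite()/bstrategy() are routed through it without needing
    to know the backing vnode.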
* Add b_bufobj to struct buf which eventually will eliminate the need for b_vp.  [phk, 2004-10-22, 1 file, -4/+5]
    Initialize b_bufobj for all buffers.
    Make incore() and gbincore() take a bufobj instead of a vnode.
    Make inmem() local to vfs_bio.c.
    Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj),
    and VI_MTX() to BO_MTX().
    Make buf_vlist_add() take a bufobj instead of a vnode.
    Eliminate other uses of bp->b_vp where bp->b_bufobj will do.
    Various minor polishing: remove "register", turn panic into KASSERT,
    use new function declarations, TAILQ_FOREACH_SAFE() etc.
* Move the VI_BWAIT flag into the bo_flag element of bufobj and call it BO_WWAIT.  [phk, 2004-10-21, 1 file, -3/+1]
    Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the
    write count on a bufobj. Bufobj_wdrop() replaces vwakeup().
    Use these functions in all relevant places except in ffs_softdep.c,
    where the use of interlocked_sleep() makes this impossible.
    Rename b_vnbufs to b_bobufs now that we touch all the relevant files
    anyway.
* Give cluster_write() an explicit vnode argument.  [phk, 2004-09-27, 1 file, -6/+1]
    In the future a struct buf will not automatically point out a vnode
    for us.
* Eliminate unused second argument to reassignbuf() and simplify it  [phk, 2004-07-25, 1 file, -1/+1]
    accordingly.
* Remove advertising clause from University of California Regent's license,  [imp, 2004-04-05, 1 file, -4/+0]
    per letter dated July 22, 1999.
    Approved by: core
* Update the statfs structure with 64-bit fields to allow  [mckusick, 2003-11-12, 1 file, -2/+2]
    accurate reporting of multi-terabyte filesystem sizes. You should build
    and boot a new kernel BEFORE doing a `make world' as the new kernel
    will know about binaries using the old statfs structure, but an old
    kernel will not know about the new system calls that support the new
    statfs structure. Running an old kernel after a `make world' will cause
    programs such as `df' that do a statfs system call to fail with a bad
    system call.
    Reviewed by: Bruce Evans <bde@zeta.org.au>
    Reviewed by: Tim Robbins <tjr@freebsd.org>
    Reviewed by: Julian Elischer <julian@elischer.org>
    Reviewed by: the hoards of <arch@freebsd.org>
    Sponsored by: DARPA & NAI Labs.
* Initialize the buf's b_object in pbgetvp(). Clear it in pbrelvp(). (This  [alc, 2003-10-20, 1 file, -1/+0]
    facilitates synchronization of the vm page's valid field using the vm
    object's lock.)
    Suggested by: tegge
* - Synchronize access to a vm page's valid field using the containing  [alc, 2003-10-20, 1 file, -4/+10]
    vm object's lock.
* DuH!  [phk, 2003-10-18, 1 file, -2/+2]
    bp->b_iooffset (the spot on the disk), not bp->b_offset (the offset in
    the file).
* Initialize bp->b_offset before calling VOP_STRATEGY()  [phk, 2003-10-18, 1 file, -0/+2]
* - Move BX_BKGRDWAIT and BX_BKGRDINPROG to BV_ and the b_vflags field.  [jeff, 2003-08-28, 1 file, -7/+11]
    - Surround all accesses of the BKGRD{WAIT,INPROG} flags with the vnode
      interlock.
    - Don't use the B_LOCKED flag and QUEUE_LOCKED for background write
      buffers. Check for the BKGRDINPROG flag before recycling or throwing
      away a buffer. We do this instead because it is not safe for us to
      move the original buffer to a new queue from the callback on the
      background write buffer.
    - Remove the B_LOCKED flag and the locked buffer queue. They are no
      longer used.
    - The vnode interlock is used around checks for BKGRDINPROG where it
      may not be strictly necessary. If we hold the buf lock, a background
      write will not be started without our knowledge; one may only be
      completed while we're not looking. Rather than remove the code,
      document two of the places where this extra locking is done. A pass
      should be done to verify and minimize the locking later.
* Revert stuff which accidentally ended up in the previous commit.  [phk, 2003-07-22, 1 file, -2/+2]
* Don't attempt to inline large functions mb_alloc() and mb_free(),  [phk, 2003-07-22, 1 file, -2/+2]
    it more than doubles the text size of this file. GCC has wisely ignored
    us on this previously.
* Use __FBSDID().  [obrien, 2003-06-11, 1 file, -1/+3]
* The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent  [phk, 2003-05-31, 1 file, -1/+1]
    deadlocks with vnode backed md(4) devices because md now uses a kthread
    to run the bio requests instead of doing it directly from the bio down
    path.
* In cluster_wbuild(), initialise b_iocmd to BIO_WRITE before calling  [iedowse, 2003-05-28, 1 file, -1/+3]
    buf_start() to avoid triggering a panic in softdep_disk_io_initiation()
    if b_iocmd happened to be BIO_READ. The later initialisation of b_iocmd
    in cluster_wbuild() could probably be moved to before the buf_start()
    call, but this patch keeps the change as simple as possible.
    This is reported to fix occasional "softdep_disk_io_initiation: read"
    panics, especially on NFS servers.
    Reported by: Nick Hilliard <nick@netability.ie>
    Tested by: Nick Hilliard <nick@netability.ie>
    Approved by: re (rwatson)
* - Lock the vm_object when performing vm_object_pip_add().  [alc, 2003-04-20, 1 file, -0/+8]
* - We are not guaranteed that read ahead blocks are not in memory already.  [jeff, 2003-03-30, 1 file, -1/+9]
    Check for B_DELWRI as well as B_CACHE before issuing io on a buffer.
    This is especially important since we are changing the b_iocmd.
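    A hedged sketch of the check described above; the helper name and
    surrounding structure are assumptions, not the revision's actual code.
    If the read-ahead buffer is already valid or dirty, it is released
    untouched rather than having its b_iocmd changed and I/O issued:

        #include <sys/param.h>
        #include <sys/systm.h>
        #include <sys/bio.h>
        #include <sys/buf.h>

        /* Illustrative only: skip read-ahead I/O on an already-usable buf. */
        static void
        maybe_start_readahead(struct buf *rbp)  /* hypothetical helper */
        {
                if ((rbp->b_flags & (B_CACHE | B_DELWRI)) != 0) {
                        bqrelse(rbp);   /* already in memory; leave it alone */
                        return;
                }
                rbp->b_iocmd = BIO_READ;
                /* ...the caller would now issue the I/O... */
        }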
* Including <sys/stdint.h> is (almost?) universally only to be able to use  [phk, 2003-03-18, 1 file, -1/+0]
    %j in printfs, so put a nested include in <sys/systm.h> where the
    printf prototype lives and save everybody else the trouble.
* - Unlock the target bp and not the pager buf bp in a failure case in  [jeff, 2003-03-17, 1 file, -1/+1]
    cluster_wbuild(). This was causing strange panics that were widely
    reported on current@.
    Big Pointy Hat to: jeff
* - Tune down read_max. For single disks we get no gain out of reading more  [jeff, 2003-03-13, 1 file, -1/+1]
    than a MAXPHYS size block ahead. Having this set too high just leaves
    other processes starved for IO and screws up interactive response. Let
    the users with RAID set it higher when they need it.
* - Regularize variable usage in cluster_read().  [jeff, 2003-03-11, 1 file, -92/+62]
    - Issue the io that we will later block on prior to doing cluster read
      ahead so that it is more likely to be ready when we block.
    - Loop issuing clustered reads until we've exhausted the seq count
      supplied by the file system.
    - Use a sysctl tunable "vfs.read_max" to determine the maximum number
      of blocks that we'll read ahead.
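    A minimal sketch of how such a tunable is typically declared with
    SYSCTL_INT(9); the default value and description string here are
    assumptions, not the revision's actual code:

        #include <sys/param.h>
        #include <sys/kernel.h>
        #include <sys/sysctl.h>

        /* Illustrative only: a read/write knob under the vfs sysctl tree. */
        static int read_max = 8;        /* assumed default, in blocks */
        SYSCTL_INT(_vfs, OID_AUTO, read_max, CTLFLAG_RW, &read_max, 0,
            "Cluster read-ahead maximum block count");

    At run time the limit is then adjustable with sysctl, e.g.
    "sysctl vfs.read_max=32".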
* - Hold the buf lock while manipulating and inspecting its fields.  [jeff, 2003-03-04, 1 file, -56/+70]
    - Use gbincore() and not incore() so that we can drop the vnode
      interlock as we acquire the buflock.
    - Use GB_LOCK_NOWAIT when getting bufs for read ahead clusters so that
      we don't block on locked bufs.
    - Convert a while loop to a howmany() that will most likely be faster
      on modern processors. There is another while loop divide that was
      left nearby because it is operating on a 64bit int and is most likely
      faster.
    - Cleanup the cluster_read() code a little to get rid of a goto and
      make the logic clearer.
    Tested on: x86, alpha
    Tested by: Steve Kargl <sgk@troutmask.apl.washington.edu>
    Reviewed by: arch
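    For reference, howmany() is the rounding-up division macro from
    <sys/param.h>, so a loop that walks a byte count one block at a time
    collapses into a single divide; the helper name and the use of
    DEV_BSIZE below are illustrative assumptions:

        #include <sys/param.h>  /* howmany(x, y): (((x) + ((y) - 1)) / (y)) */

        /* Illustrative only: number of DEV_BSIZE blocks covering nbytes. */
        static int
        blocks_covering(long nbytes)
        {
                return (howmany(nbytes, DEV_BSIZE));
        }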