path: root/sys/kern/vfs_cluster.c
Columns: commit message, author, age, files changed, lines removed/added
* - Remove GIANT_REQUIRED where giant is no longer required.
| Sponsored By: Isilon Systems, Inc.
| [jeff, 2005-01-24; 1 file, -6/+0]
* Eliminate (now) unnecessary acquisition and release of the global page
| queues lock.
| [alc, 2004-12-29; 1 file, -4/+0]
* Don't manually set b_bufobj, pbgetvp() does this for us.
| [phk, 2004-11-15; 1 file, -1/+0]
|
* Explicitly call pbrelvp()
| [phk, 2004-11-15; 1 file, -0/+1]
|
* Retire b_magic now, we have the bufobj containing the same hint.
| [phk, 2004-11-04; 1 file, -1/+0]
|
* Lock bp->b_bufobj->b_object instead of bp->b_object
| [phk, 2004-10-28; 1 file, -8/+8]
|
* Avoid using bp->b_vp when we already have the vnode by other means.
| [phk, 2004-10-27; 1 file, -6/+5]
|
* Synchronize access to the vm page's PG_BUSY flag using the containing vm
| object's lock. In the same place, eliminate unnecessary checks for a
| NULL vm object pointer.
| [alc, 2004-10-27; 1 file, -4/+4]
* Move the buffer method vector (buf->b_op) to the bufobj.
| Extend it with a strategy method.
| Add bufstrategy(), which does the usual VOP_SPECSTRATEGY/VOP_STRATEGY
| song and dance.
| Rename ibwrite to bufwrite().
| Move the two NFS buf_ops to more sensible places, add bufstrategy to
| them.
| Add inlines for bwrite() and bstrategy() which call through
| buf->b_bufobj->b_ops->b_{write,strategy}().
| Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with
| bstrategy().
| [phk, 2004-10-24; 1 file, -5/+2]
* Add b_bufobj to struct buf which eventually will eliminate the need for
| b_vp.
| Initialize b_bufobj for all buffers.
| Make incore() and gbincore() take a bufobj instead of a vnode.
| Make inmem() local to vfs_bio.c.
| Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj),
| also VI_MTX() to BO_MTX().
| Make buf_vlist_add() take a bufobj instead of a vnode.
| Eliminate other uses of bp->b_vp where bp->b_bufobj will do.
| Various minor polishing: remove "register", turn panic into KASSERT,
| use new function declarations, TAILQ_FOREACH_SAFE() etc.
| [phk, 2004-10-22; 1 file, -4/+5]
* Move the VI_BWAIT flag into a bo_flag element of bufobj and call it
| BO_WWAIT.
| Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the
| write count on a bufobj. bufobj_wdrop() replaces vwakeup().
| Use these functions in all relevant places except in ffs_softdep.c,
| where the use of interlocked_sleep() makes this impossible.
| Rename b_vnbufs to b_bobufs now that we touch all the relevant files
| anyway.
| [phk, 2004-10-21; 1 file, -3/+1]
* Give cluster_write() an explicit vnode argument.
| In the future a struct buf will not automatically point out a vnode for
| us.
| [phk, 2004-09-27; 1 file, -6/+1]
* Eliminate unused second argument to reassignbuf() and simplify it
| accordingly.
| [phk, 2004-07-25; 1 file, -1/+1]
* Remove advertising clause from University of California Regent's license,
| per letter dated July 22, 1999.
| Approved by: core
| [imp, 2004-04-05; 1 file, -4/+0]
* Update the statfs structure with 64-bit fields to allow
| accurate reporting of multi-terabyte filesystem sizes.
|
| You should build and boot a new kernel BEFORE doing a `make world' as
| the new kernel will know about binaries using the old statfs structure,
| but an old kernel will not know about the new system calls that support
| the new statfs structure. Running an old kernel after a `make world'
| will cause programs such as `df' that do a statfs system call to fail
| with a bad system call.
|
| Reviewed by: Bruce Evans <bde@zeta.org.au>
| Reviewed by: Tim Robbins <tjr@freebsd.org>
| Reviewed by: Julian Elischer <julian@elischer.org>
| Reviewed by: the hoards of <arch@freebsd.org>
| Sponsored by: DARPA & NAI Labs.
| [mckusick, 2003-11-12; 1 file, -2/+2]
* Initialize the buf's b_object in pbgetvp(). Clear it in pbrelvp(). (This
| facilitates synchronization of the vm page's valid field using the vm
| object's lock.)
| Suggested by: tegge
| [alc, 2003-10-20; 1 file, -1/+0]
* - Synchronize access to a vm page's valid field using the containing
| vm object's lock.
| [alc, 2003-10-20; 1 file, -4/+10]
* DuH!
| bp->b_iooffset (the spot on the disk), not bp->b_offset (the offset in
| the file)
| [phk, 2003-10-18; 1 file, -2/+2]
* Initialize bp->b_offset before calling VOP_STRATEGY()
| [phk, 2003-10-18; 1 file, -0/+2]
|
* - Move BX_BKGRDWAIT and BX_BKGRDINPROG to BV_ and the b_vflags field.
| - Surround all accesses of the BKGRD{WAIT,INPROG} flags with the vnode
|   interlock.
| - Don't use the B_LOCKED flag and QUEUE_LOCKED for background write
|   buffers. Check for the BKGRDINPROG flag before recycling or throwing
|   away a buffer. We do this instead because it is not safe for us to
|   move the original buffer to a new queue from the callback on the
|   background write buffer.
| - Remove the B_LOCKED flag and the locked buffer queue. They are no
|   longer used.
| - The vnode interlock is used around checks for BKGRDINPROG where it may
|   not be strictly necessary. If we hold the buf lock then a background
|   write will not be started without our knowledge; one may only be
|   completed while we're not looking. Rather than remove the code,
|   document two of the places where this extra locking is done. A pass
|   should be done to verify and minimize the locking later.
| [jeff, 2003-08-28; 1 file, -7/+11]
* Revert stuff which accidentally ended up in the previous commit.
| [phk, 2003-07-22; 1 file, -2/+2]
|
* Don't attempt to inline large functions mb_alloc() and mb_free(),
| it more than doubles the text size of this file.
| GCC has wisely ignored us on this previously.
| [phk, 2003-07-22; 1 file, -2/+2]
* Use __FBSDID().
| [obrien, 2003-06-11; 1 file, -1/+3]
|
* The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent
| deadlocks with vnode backed md(4) devices because md now uses a
| kthread to run the bio requests instead of doing it directly from the
| bio down path.
| [phk, 2003-05-31; 1 file, -1/+1]
* In cluster_wbuild(), initialise b_iocmd to BIO_WRITE before calling
| buf_start() to avoid triggering a panic in
| softdep_disk_io_initiation() if b_iocmd happened to be BIO_READ. The
| later initialisation of b_iocmd in cluster_wbuild() could probably be
| moved to before the buf_start() call, but this patch keeps the change
| as simple as possible.
|
| This is reported to fix occasional "softdep_disk_io_initiation: read"
| panics, especially on NFS servers.
|
| Reported by: Nick Hilliard <nick@netability.ie>
| Tested by: Nick Hilliard <nick@netability.ie>
| Approved by: re (rwatson)
| [iedowse, 2003-05-28; 1 file, -1/+3]
* - Lock the vm_object when performing vm_object_pip_add().
| [alc, 2003-04-20; 1 file, -0/+8]
|
* - We are not guaranteed that read ahead blocks are not in memory already.
|   Check for B_DELWRI as well as B_CACHED before issuing io on a buffer.
|   This is especially important since we are changing the b_iocmd.
| [jeff, 2003-03-30; 1 file, -1/+9]
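The check described in this entry can be sketched outside the kernel roughly as follows. This is only an illustration: the flag values and the `sketch_buf` struct are hypothetical stand-ins rather than the definitions in <sys/buf.h>, and the two flags are assumed to mean "contents already valid" and "dirty, awaiting a delayed write".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real values live in <sys/buf.h>. */
#define SK_B_CACHE	0x0001u	/* buffer already holds valid data */
#define SK_B_DELWRI	0x0002u	/* buffer is dirty, delayed write pending */

/* Hypothetical, minimal stand-in for struct buf. */
struct sketch_buf {
	unsigned int b_flags;
};

/*
 * Return true only when a read-ahead buffer still has to be filled from
 * disk. A cached or delayed-write buffer is already in memory, so issuing
 * a read on it would be wasted work (or would clobber dirty data).
 */
static bool
readahead_needs_io(const struct sketch_buf *bp)
{
	assert(bp != NULL);
	return ((bp->b_flags & (SK_B_CACHE | SK_B_DELWRI)) == 0);
}
```

A caller would test `readahead_needs_io(bp)` before setting up the read command on the buffer and skip the buffer otherwise.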
* Including <sys/stdint.h> is (almost?) universally only to be able to use
| %j in printfs, so put a nested include in <sys/systm.h> where the
| printf prototype lives and save everybody else the trouble.
| [phk, 2003-03-18; 1 file, -1/+0]
* - Unlock the target bp and not the pager buf bp in a failure case in
|   cluster_wbuild(). This was causing strange panics that were widely
|   reported on current@.
| Big Pointy Hat to: jeff
| [jeff, 2003-03-17; 1 file, -1/+1]
* - Tune down read_max. For single disks we get no gain out of reading more
|   than a MAXPHYS size block ahead. Having this set too high just leaves
|   other processes starved for IO and screws up interactive response.
|   Let the users with RAID set it higher when they need it.
| [jeff, 2003-03-13; 1 file, -1/+1]
* - Regularize variable usage in cluster_read().
| - Issue the io that we will later block on prior to doing cluster read
|   ahead so that it is more likely to be ready when we block.
| - Loop issuing clustered reads until we've exhausted the seq count
|   supplied by the file system.
| - Use a sysctl tunable "vfs.read_max" to determine the maximum number of
|   blocks that we'll read ahead.
| [jeff, 2003-03-11; 1 file, -92/+62]
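How the sequential count and the "vfs.read_max" tunable might bound the read-ahead window can be sketched as below. This is an assumption-laden illustration, not the code in cluster_read(): `sketch_read_max` stands in for the tunable, and clipping to the blocks left in the file is an added assumption.

```c
#include <assert.h>

/* Hypothetical stand-in for the "vfs.read_max" sysctl tunable. */
static int sketch_read_max = 8;

static int
min_int(int a, int b)
{
	return (a < b ? a : b);
}

/*
 * Number of blocks to read ahead: never more than the sequential count
 * supplied by the file system, never more than the tunable, and never
 * past the end of the file (blocks_left).
 */
static int
readahead_window(int seqcount, int blocks_left)
{
	int maxra;

	assert(seqcount >= 0 && blocks_left >= 0);
	maxra = min_int(seqcount, sketch_read_max);
	return (min_int(maxra, blocks_left));
}
```

With the tunable at 8, a sequential count of 50 yields a window of 8 blocks, while a sequential count of 3 yields 3.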
* - Hold the buf lock while manipulating and inspecting its fields.
| - Use gbincore() and not incore() so that we can drop the vnode
|   interlock as we acquire the buflock.
| - Use GB_LOCK_NOWAIT when getting bufs for read ahead clusters so that
|   we don't block on locked bufs.
| - Convert a while loop to a howmany() that will most likely be faster on
|   modern processors. There is another while loop divide that was left
|   nearby because it is operating on a 64bit int and is most likely
|   faster.
| - Clean up the cluster_read() code a little to get rid of a goto and
|   make the logic clearer.
|
| Tested on: x86, alpha
| Tested by: Steve Kargl <sgk@troutmask.apl.washington.edu>
| Reviewed by: arch
| [jeff, 2003-03-04; 1 file, -56/+70]
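The "while loop to howmany()" conversion mentioned above can be illustrated with a small standalone sketch. The macro body matches the usual definition in <sys/param.h>; the two helper functions and their names are hypothetical and are not taken from vfs_cluster.c.

```c
#include <assert.h>
#include <stddef.h>

/* Usual <sys/param.h> definition, repeated so the sketch builds anywhere. */
#ifndef howmany
#define howmany(x, y)	(((x) + ((y) - 1)) / (y))
#endif

/* Loop form: count the blocks of blksize bytes needed to cover nbytes. */
static size_t
blocks_by_loop(size_t nbytes, size_t blksize)
{
	size_t n = 0;

	assert(blksize > 0);
	while (nbytes > 0) {
		nbytes -= (nbytes < blksize) ? nbytes : blksize;
		n++;
	}
	return (n);
}

/* Closed form: one add and one divide replace the loop. */
static size_t
blocks_by_howmany(size_t nbytes, size_t blksize)
{
	assert(blksize > 0);
	return (howmany(nbytes, blksize));
}
```

Both helpers return the same count for any input that does not overflow the addition inside the macro; the closed form simply trades an iteration per block for a single division.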
* - Add a new 'flags' parameter to getblk().
| - Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the
|   LK_NOWAIT flag to the initial BUF_LOCK(). This will eventually be used
|   in cases where we want to use a buffer only if it is not currently in
|   use.
| - Convert all consumers of the getblk() API to use this extra parameter.
|
| Reviewed by: arch
| Not objected to by: mckusick
| [jeff, 2003-03-04; 1 file, -4/+4]
* - Add an interlock argument to BUF_LOCK and BUF_TIMELOCK.
| - Remove the buftimelock mutex and acquire the buf's interlock to
|   protect these fields instead.
| - Hold the vnode interlock while locking bufs on the clean/dirty queues.
|   This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
|   BUF_LOCK with a LK_TIMEFAIL to a single lock.
|
| Reviewed by: arch, mckusick
| [jeff, 2003-02-25; 1 file, -3/+5]
* Remove duplicate includes.
| Submitted by: Cyril Nguyen-Huu <cyril@ci0.org>
| [cognet, 2003-02-20; 1 file, -1/+0]
* Back out M_* changes, per decision of the TRB.
| Approved by: trb
| [imp, 2003-02-19; 1 file, -1/+1]
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
| Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
| [alfred, 2003-01-21; 1 file, -1/+1]
* - Use %j to print intmax_t values.
| - Cast more daddr_t values to intmax_t when printing to quiet warnings.
| [jhb, 2002-11-07; 1 file, -3/+4]
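The cast-and-%j idiom from this entry looks roughly like the sketch below in portable C99. The `sk_daddr_t` typedef and the `format_blkno()` helper are hypothetical; in the kernel the value would be a daddr_t passed to the kernel printf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for daddr_t; its width varies between systems. */
typedef int64_t sk_daddr_t;

/*
 * Format a block number without caring how wide sk_daddr_t is: cast to
 * intmax_t and print with the 'j' length modifier, so the format string
 * stays correct (and warning-free) if the typedef changes size.
 */
static int
format_blkno(char *buf, size_t len, sk_daddr_t blkno)
{
	assert(buf != NULL);
	return (snprintf(buf, len, "blkno %jd", (intmax_t)blkno));
}
```

The same format string works whether the block-number type is 32 or 64 bits wide, which is why the cast quiets the compiler warnings mentioned in the entry.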
* - Use incore() where no other interlock locking is necessary.
| - Lock access to numoutput.
| [jeff, 2002-09-25; 1 file, -2/+6]
* Replace various spellings with FALLTHROUGH, which is lint()able
| [charnier, 2002-08-25; 1 file, -3/+3]
|
* o Lock page accesses by vm_page_io_start() with the page queues lock.
| o Assert that the page queues lock is held in vm_page_io_start().
| [alc, 2002-07-31; 1 file, -1/+4]
* Replace the global buffer hash table with per-vnode splay trees using a
| methodology similar to the vm_map_entry splay and the VM splay that Alan
| Cox is working on. Extensive testing has appeared to have shown no
| increase in overhead.
|
| Disadvantages
| - Dirties more cache lines during lookups.
| - Not as fast as a hash table lookup (but still N log N and optimal
|   when there is locality of reference).
|
| Advantages
| - vnode->v_dirtyblkhd is now perfectly sorted, making fsync/sync/
|   filesystem syncer operate more efficiently.
| - I get to rip out all the old hacks (some of which were mine) that
|   tried to keep the v_dirtyblkhd tailq sorted.
| - The per-vnode splay tree should be easier to lock / SMPng pushdown on
|   vnodes will be easier.
| - This commit along with another that Alan is working on for the VM
|   page global hash table will allow me to implement ranged fsync(),
|   optimize server-side nfs commit rpcs, and implement partial syncs by
|   the filesystem syncer (aka filesystem syncer would detect that someone
|   is trying to get the vnode lock, remembers its place, and skip to the
|   next vnode).
|
| Note that the buffer cache splay is somewhat more complex than other
| splays due to special handling of background bitmap writes (multiple
| buffers with the same lblkno in the same vnode), and B_INVAL
| discontinuities between the old hash table and the existence of the
| buffer on the v_cleanblkhd list.
|
| Suggested by: alc
| [dillon, 2002-07-10; 1 file, -4/+8]
* This commit adds basic support for the UFS2 filesystem. The UFS2
| filesystem expands the inode to 256 bytes to make space for 64-bit
| block pointers. It also adds a file-creation time field, an ability to
| use jumbo blocks per inode to allow extent-like pointer density, and
| space for extended attributes (up to twice the filesystem block size
| worth of attributes, e.g., on a 16K filesystem, there is space for 32K
| of attributes).
|
| UFS2 fully supports and runs existing UFS1 filesystems. New filesystems
| built using newfs can be built in either UFS1 or UFS2 format using the
| -O option. In this commit UFS1 is the default format, so if you want to
| build UFS2 format filesystems, you must specify -O 2. This default will
| be changed to UFS2 when UFS2 proves itself to be stable.
|
| In this commit the boot code for reading UFS2 filesystems is not
| compiled (see /sys/boot/common/ufsread.c) as there is insufficient
| space in the boot block. Once the size of the boot block is increased,
| this code can be defined.
|
| Things to note: the definition of SBSIZE has changed to SBLOCKSIZE. The
| header file <ufs/ufs/dinode.h> must be included before <ufs/ffs/fs.h>
| so as to get the definitions of ufs2_daddr_t and ufs_lbn_t.
|
| Still TODO:
| - Verify that the first level bootstraps work for all the
|   architectures.
| - Convert the utility ffsinfo to understand UFS2 and test growfs.
| - Add support for the extended attribute storage.
| - Update soft updates to ensure integrity of extended attribute storage.
| - Switch the current extended attribute interfaces to use the extended
|   attribute storage.
| - Add the extent-like functionality (framework is there, but is
|   currently never used).
|
| Sponsored by: DARPA & NAI Labs.
| Reviewed by: Poul-Henning Kamp <phk@freebsd.org>
| [mckusick, 2002-06-21; 1 file, -8/+7]
* Make daddr_t and u_daddr_t 64bits wide.
| Retire daddr64_t and use daddr_t instead.
| Sponsored by: DARPA & NAI Labs.
| [phk, 2002-05-14; 1 file, -3/+3]
* Remove __P.
| [alfred, 2002-03-19; 1 file, -3/+3]
|
* Introduce the new 64-bit size disk block, daddr64_t. Change
| the bio and buffer structures to have daddr64_t bio_pblkno, b_blkno,
| and b_lblkno fields which allows access to disks larger than a
| Terabyte in size. This change also requires that the VOP_BMAP vnode
| operation accept and return daddr64_t blocks.
|
| This delta should not affect system operation in any way. It merely
| sets up the necessary interfaces to allow the development of disk
| drivers that work with these larger disk block addresses. It also
| allows for the development of UFS2 which will use 64-bit block
| addresses.
| [mckusick, 2002-03-15; 1 file, -3/+3]
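The one-terabyte figure in this entry follows from simple arithmetic, sketched below under two assumptions that the entry does not state: that the old block number was a signed 32-bit integer and that a disk block is 512 bytes. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_SECTOR_SIZE	512u	/* assumed sector size in bytes */

/* First byte offset that a signed 32-bit block number cannot reach. */
static uint64_t
limit_with_32bit_blkno(void)
{
	/* 2^31 blocks * 2^9 bytes per block = 2^40 bytes = 1 TiB. */
	return (((uint64_t)INT32_MAX + 1) * SK_SECTOR_SIZE);
}

/* True when the block holding byte_offset fits a signed 32-bit number. */
static bool
fits_in_32bit_blkno(uint64_t byte_offset)
{
	return (byte_offset / SK_SECTOR_SIZE <= (uint64_t)INT32_MAX);
}

/* With a 64-bit block number the same offset converts without loss. */
static int64_t
blkno64_for_offset(uint64_t byte_offset)
{
	assert(byte_offset / SK_SECTOR_SIZE <= (uint64_t)INT64_MAX);
	return ((int64_t)(byte_offset / SK_SECTOR_SIZE));
}
```

Under those assumptions the last addressable sector starts 512 bytes below the 1 TiB mark, and the first sector beyond it already needs a block number of 2^31, which only the wider type can hold.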
* Document all functions, global and static variables, and sysctls.
| Includes some minor whitespace changes, and re-ordering to be able to
| document properly (e.g, grouping of variables and the SYSCTL macro
| calls for them, where the documentation has been added.)
|
| Reviewed by: phk (but all errors are mine)
| [eivind, 2002-03-05; 1 file, -3/+11]
* Implement IO_NOWDRAIN and B_NOWDRAIN - prevents the buffer cache from
| blocking in wdrain during a write. This flag needs to be used in
| devices whose strategy routines turn around and issue another high
| level I/O, such as when MD turns around and issues a VOP_WRITE to vnode
| backing store, in order to avoid deadlocking the dirty buffer draining
| code.
|
| Remove a vprintf() warning from MD when the backing vnode is found to
| be in-use. The syncer or buf_daemon could be flushing the backing vnode
| at the time of an MD operation so the warning is not correct.
|
| MFC after: 1 week
| [dillon, 2001-11-05; 1 file, -1/+1]
* In cluster_rbuild(), 'size' had better match buf->b_bcount and
| buf->b_bufsize or the cluster will not be properly merged. Dup the code
| from cluster_wbuild() and add some printf()s to see if bad cases are
| present.
|
| MFC after: 2 weeks
| [dillon, 2001-10-25; 1 file, -2/+14]
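The size consistency check described in this entry might look like the sketch below. It is a guess at the shape of the test, not the code in cluster_rbuild(): `sketch_cbuf` is a hypothetical stand-in for struct buf that carries only the two fields the entry names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buf with only the fields discussed. */
struct sketch_cbuf {
	long b_bcount;	/* valid bytes in the buffer */
	long b_bufsize;	/* allocated buffer size */
};

/*
 * A buffer can join a cluster built from blocks of 'size' bytes only if
 * both its byte count and its allocated size equal 'size'; anything else
 * would leave a hole or an overlap in the merged cluster.
 */
static bool
can_join_cluster(const struct sketch_cbuf *bp, long size)
{
	assert(bp != NULL);
	return (bp->b_bcount == size && bp->b_bufsize == size);
}
```

A cluster builder would stop extending the cluster at the first buffer for which this returns false.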
* Syntax cleanup and documentation, no operational changes.
| MFC after: 1 day
| [dillon, 2001-10-21; 1 file, -10/+55]