summaryrefslogtreecommitdiffstats
path: root/sys/kern/vfs_cluster.c
Commit message (Collapse)AuthorAgeFilesLines
* The hardware has caught up; improvements are now observed even at 128,ivoras2011-03-161-1/+1
| | | | | but stay conservative and bump read_max to "only" 64 (it will probably be a good idea to increase this to 128 after the next major release).
* Bumping the read-ahead count once more, to value equivalent to 512 KiB onivoras2010-08-091-1/+1
| | | | | | | | | | | | | | | | most system, based on benchmark results on a low-end fibre channel SAN under VMWare: vfs.read_max read performance 8 (historical default) 83 MB/s 16 (recent bump) 131 MB/s 32 (this version) 152 MB/s 64 157 MB/s (results are +/- 3 MB/s) As read-ahead is heuristic, based on past IO requests, it shouldn't be problematic. The new default is still smaller then in other OSes.
* To help with sequential read UFS performance on modern systems, increaseivoras2010-08-071-1/+1
| | | | | | | | | | | | | | the vfs.read_max default. For most systems this means going from 128 KiB to 256 KiB, which is still very conservative and lower than what most other operating systems use, but as a sane default should not interfere much with existing systems. For systems with RAID volumes and/or virtualization envirnments, where read performance is very important, increasing this sysctl tunable to 32 or even more will demonstratively yield additional performance benefits. If MAXPHYS ever gets bumped up, it will probably be a good idea to slave read_max to it.
* Remove a stale comment. The very same revision (r85511) that introducedalc2009-06-301-3/+0
| | | | | | this comment also implemented the proposed change to the code. Approved by: re (kib)
* Correct a long-standing performance bug in cluster_rbuild(). Specifically,alc2009-06-271-4/+15
| | | | | | | | | | in the case of a file system with a block size that is less than the page size, cluster_rbuild() looks at too many of the page's valid bits. Consequently, it may terminate prematurely, resulting in poor performance. Reported by: bde Reviewed by: tegge Approved by: re (kib)
* Eliminate unnecessary obfuscation when testing a page's valid bits.alc2009-06-071-4/+2
|
* - Complete part of the unfinished bufobj work by consistently usingjeff2008-03-221-13/+18
| | | | | | | | | | | | | | | | | BO_LOCK/UNLOCK/MTX when manipulating the bufobj. - Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock. - Exploit this new lock order to simplify softdep_check_suspend(). - A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find. Reviewed by: kib (earlier diff) Tested by: kris, pho (earlier diff)
* - Move rusage from being per-process in struct pstats to per-thread injeff2007-06-011-2/+2
| | | | | | | | | | | | | | | | | | | td_ru. This removes the requirement for per-process synchronization in statclock() and mi_switch(). This was previously supported by sched_lock which is going away. All modifications to rusage are now done in the context of the owning thread. reads proceed without locks. - Aggregate exiting threads rusage in thread_exit() such that the exiting thread's rusage is not lost. - Provide a new routine, rufetch() to fetch an aggregate of all rusage structures from all threads in a process. This routine must be used in any place requiring a rusage from a process prior to it's exit. The exited process's rusage is still available via p_ru. - Aggregate tick statistics only on demand via rufetch() or when a thread exits. Tick statistics are kept in the thread and protected by sched_lock until it exits. Initial patch by: attilio Reviewed by: attilio, bde (some objections), arch (mostly silent)
* Change these descriptions of memory types used in malloc(9), as theirwkoszek2007-03-051-1/+1
| | | | | | current, rather long strings make output from vmstat -m look unpleasant. Approved by: cognet (mentor)
* Replace PG_BUSY with VPO_BUSY. In other words, changes to the page'salc2006-10-221-1/+1
| | | | | busy flag, i.e., VPO_BUSY, are now synchronized by the per-vm object lock instead of the global page queues lock.
* Add mnt_noasync counter to better handle interleaved calls to nmount(),tegge2006-09-261-1/+1
| | | | | | sync() and sync_fsync() without losing MNT_ASYNC. Add MNTK_ASYNC flag which is set only when MNT_ASYNC is set and mnt_noasync is zero, and check that flag instead of MNT_ASYNC before initiating async io.
* Remove unused leaked debug function prototype.tegge2006-03-211-1/+0
|
* Let snapshots make a copy of old contents for all buffers taking part in ategge2006-03-191-5/+1
| | | | | | | | cluster instead of just the first buffer. Delay buf_start() calls until snapshots have a copy of old content. PR: kern/93942
* Changes imported from XFS for FreeBSD project:rodrigc2005-12-071-0/+15
| | | | | | | | | | | | | | - add fields to struct buf (needed by XFS) - 3 private fields: b_fsprivate1, b_fsprivate2, b_fsprivate3 - b_pin_count, count of pinned buffer - add new B_MANAGED flag - add breada() function to initiate asynchronous I/O on read-ahead blocks. - add bufdone_finish(), bpin(), bunpin_wait() functions Patches provided by: kan Reviewed by: phk Silence on: arch@
* Normalize a significant number of kernel malloc type names:rwatson2005-10-311-1/+1
| | | | | | | | | | | | | | | | | | | - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
* Only set B_RAM (Read ahead mark) on an incore buffers if we can lock it.ups2005-10-241-3/+8
| | | | | | | This fixes a race condition caused by the unlocked write access to the b_flags field. MFC after: 3 days
* Do not use vm_pager_init() to initialize vnode_pbuf_freecnt variable.kan2005-08-131-6/+0
| | | | | | | | | | | vm_pager_init() is run before required nswbuf variable has been set to correct value. This caused system to run with single pbuf available for vnode_pager. Handle both cluster_pbuf_freecnt and vnode_pbuf_freecnt variable in the same way. Reported by: ade Obtained from: alc MFC after: 2 days
* Revert revision 1.164: pmap_qremove() does not require protection byalc2005-05-141-2/+0
| | | | | | VM_LOCK_GIANT. Discussed with: jeff
* - Remove spls and comments relating to them.jeff2005-05-011-26/+2
|
* - Call VM_LOCK_GIANT in cluster_callback() to protect some pmap calls. VFSjeff2005-04-301-0/+2
| | | | | | will not be acquiring Giant before calling this function anymore. Sponsored by: Isilon Systems, Inc.
* make cluster_callback() staticphk2005-02-101-1/+2
|
* - Remove GIANT_REQUIRED where giant is no longer required.jeff2005-01-241-6/+0
| | | | Sponsored By: Isilon Systems, Inc.
* Eliminate (now) unnecessary acquisition and release of the global pagealc2004-12-291-4/+0
| | | | queues lock.
* Don't manually set b_bufobj, pbgetvp() does this for us.phk2004-11-151-1/+0
|
* Explicitly call pbrelvp()phk2004-11-151-0/+1
|
* Retire b_magic now, we have the bufobj containing the same hint.phk2004-11-041-1/+0
|
* Lock bp->b_bufobj->b_object instead of bp->b_objectphk2004-10-281-8/+8
|
* Avoid using bp->b_vp when we already have the vnode by other means.phk2004-10-271-6/+5
|
* Synchronize access to the vm page's PG_BUSY flag using the containing vmalc2004-10-271-4/+4
| | | | | object's lock. In the same place, eliminate unnecessary checks for a NULL vm object pointer.
* Move the buffer method vector (buf->b_op) to the bufobj.phk2004-10-241-5/+2
| | | | | | | | | | | | | | | | | Extend it with a strategy method. Add bufstrategy() which do the usual VOP_SPECSTRATEGY/VOP_STRATEGY song and dance. Rename ibwrite to bufwrite(). Move the two NFS buf_ops to more sensible places, add bufstrategy to them. Add inlines for bwrite() and bstrategy() which calls through buf->b_bufobj->b_ops->b_{write,strategy}(). Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
* Add b_bufobj to struct buf which eventually will eliminate the need for b_vp.phk2004-10-221-4/+5
| | | | | | | | | | | | | | | | | | Initialize b_bufobj for all buffers. Make incore() and gbincore() take a bufobj instead of a vnode. Make inmem() local to vfs_bio.c Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj) also VI_MTX() to BO_MTX(), Make buf_vlist_add() take a bufobj instead of a vnode. Eliminate other uses of bp->b_vp where bp->b_bufobj will do. Various minor polishing: remove "register", turn panic into KASSERT, use new function declarations, TAILQ_FOREACH_SAFE() etc.
* Move the VI_BWAIT flag into no bo_flag element of bufobj and call it BO_WWAITphk2004-10-211-3/+1
| | | | | | | | | | Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write count on a bufobj. Bufobj_wdrop() replaces vwakeup(). Use these functions all relevant places except in ffs_softdep.c where the use if interlocked_sleep() makes this impossible. Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.
* Give cluster_write() an explicit vnode argument.phk2004-09-271-6/+1
| | | | In the future a struct buf will not automatically point out a vnode for us.
* Eliminate unused second argument to reassignbuf() and simplify itphk2004-07-251-1/+1
| | | | accordingly.
* Remove advertising clause from University of California Regent's license,imp2004-04-051-4/+0
| | | | | | per letter dated July 22, 1999. Approved by: core
* Update the statfs structure with 64-bit fields to allowmckusick2003-11-121-2/+2
| | | | | | | | | | | | | | | | | accurate reporting of multi-terabyte filesystem sizes. You should build and boot a new kernel BEFORE doing a `make world' as the new kernel will know about binaries using the old statfs structure, but an old kernel will not know about the new system calls that support the new statfs structure. Running an old kernel after a `make world' will cause programs such as `df' that do a statfs system call to fail with a bad system call. Reviewed by: Bruce Evans <bde@zeta.org.au> Reviewed by: Tim Robbins <tjr@freebsd.org> Reviewed by: Julian Elischer <julian@elischer.org> Reviewed by: the hoards of <arch@freebsd.org> Sponsored by: DARPA & NAI Labs.
* Initialize the buf's b_object in pbgetvp(). Clear it in pbrelvp(). (Thisalc2003-10-201-1/+0
| | | | | | | facilitates synchronization of the vm page's valid field using the vm object's lock.) Suggested by: tegge
* - Synchronize access to a vm page's valid field using the containingalc2003-10-201-4/+10
| | | | vm object's lock.
* DuH!phk2003-10-181-2/+2
| | | | | bp->b_iooffset (the spot on the disk), not bp->b_offset (the offset in the file)
* Initialize bp->b_offset before calling VOP_STRATEGY()phk2003-10-181-0/+2
|
* - Move BX_BKGRDWAIT and BX_BKGRDINPROG to BV_ and the b_vflags field.jeff2003-08-281-7/+11
| | | | | | | | | | | | | | | | | | - Surround all accesses of the BKGRD{WAIT,INPROG} flags with the vnode interlock. - Don't use the B_LOCKED flag and QUEUE_LOCKED for background write buffers. Check for the BKGRDINPROG flag before recycling or throwing away a buffer. We do this instead because it is not safe for us to move the original buffer to a new queue from the callback on the background write buffer. - Remove the B_LOCKED flag and the locked buffer queue. They are no longer used. - The vnode interlock is used around checks for BKGRDINPROG where it may not be strictly necessary. If we hold the buf lock the a back-ground write will not be started without our knowledge, one may only be completed while we're not looking. Rather than remove the code, Document two of the places where this extra locking is done. A pass should be done to verify and minimize the locking later.
* Revert stuff which accidentally ended up in the previous commit.phk2003-07-221-2/+2
|
* Don't attempt to inline large functions mb_alloc() and mb_free(),phk2003-07-221-2/+2
| | | | | | it more than doubles the text size of this file. GCC has wisely ignored us on this previously
* Use __FBSDID().obrien2003-06-111-1/+3
|
* The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to preventphk2003-05-311-1/+1
| | | | | | deadlocks with vnode backed md(4) devices because md now uses a kthread to run the bio requests instead of doing it directly from the bio down path.
* In cluster_wbuild(), initialise b_iocmd to BIO_WRITE before callingiedowse2003-05-281-1/+3
| | | | | | | | | | | | | | | buf_start() to avoid triggering a panic in softdep_disk_io_initiation() if b_iocmd happened to be BIO_READ. The later initialisation of b_iocmd in cluster_wbuild() could probably be moved to before the buf_start() call, but this patch keeps the change as simple as possible. This is reported to fix occasional "softdep_disk_io_initiation: read" panics, especially on NFS servers. Reported by: Nick Hilliard <nick@netability.ie> Tested by: Nick Hilliard <nick@netability.ie> Approved by: re (rwatson)
* - Lock the vm_object when performing vm_object_pip_add().alc2003-04-201-0/+8
|
* - We are not guaranteed that read ahead blocks are not in memory already.jeff2003-03-301-1/+9
| | | | | Check for B_DELWRI as well as B_CACHED before issuing io on a buffer. This is especially important since we are changing the b_iocmd.
* Including <sys/stdint.h> is (almost?) universally only to be able to usephk2003-03-181-1/+0
| | | | | %j in printfs, so put a newsted include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.
* - Unlock the target bp and not the pager buf bp in a failure case injeff2003-03-171-1/+1
| | | | | | | cluster_wbuild(). This was causing strange panics that were widely reported on current@. Big Pointy Hat to: jeff
OpenPOWER on IntegriCloud