summaryrefslogtreecommitdiffstats
path: root/sys/nfsclient/nfs_bio.c
Commit message (Collapse)AuthorAgeFilesLines
* Don't brelse(bp) if bp is null. Also, eliminate some redundancydas2005-03-181-18/+5
| | | | | | and dead code. Found by: Coverity Prevent analysis tool
* - The VI_DOOMED flag now signals the end of a vnode's relationship withjeff2005-03-131-1/+1
| | | | | | the filesystem. Check that rather than VI_XLOCK. Sponsored by: Isilon Systems, Inc.
* Remove unused cred arg from nfs_vinvalbuf() and many bogus argumentsphk2005-01-241-6/+5
| | | | passed for it.
* Eliminate unused and unnecessary "cred" argument from vinvalbuf()phk2005-01-141-2/+2
|
* /* -> /*- for license, minor formatting changesimp2005-01-071-1/+1
|
* First cut of NFS direct IO support.ps2004-12-151-0/+174
| | | | | | | | | | | | | | - NFS direct IO completely bypasses the buffer and page caches. If a file is open for direct IO all caching is disabled. - Direct IO for Directories will be addressed later. - 2 new NFS directio related sysctls are added. One is a knob to disable NFS direct IO completely (direct IO is enabled by default). The other is to disallow mmaped IO on a file that has at least one O_DIRECT open (see the comment in nfs_vnops.c for more details). The default is to allow mmaps on a file that has O_DIRECT opens. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Obtained from: Yahoo!
* Store a hint in the nfsnode to detect sequential access of the file.ps2004-12-101-1/+4
| | | | | | | | Kick off a readahead only when sequential access is detected. This eliminates wasteful readaheads in random file access. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Obtained from: Yahoo!
* Rewrite of the NFS client's reply handling. We now have NFS socketps2004-12-061-2/+8
| | | | | | | | upcalls which do RPC header parsing and match up the reply with the request. NFS calls now sleep on the nfsreq structure. This enables us to eliminate the NFS recvlock. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
* 2 fixes that improve on the consistency of the NFS client cache.ps2004-12-061-4/+4
| | | | | | | | | | | - Change the cached mtime to a 'struct timespec' from a time_t. Improving the precision of the cached mtime tightens up NFS' "close-to-open" consistency considerably. - Always force an over-the-wire consistency check from nfs_open() (unless the file is marked modified). This further improves NFS' "close-to-open" consistency. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
* Serialize NFS vinvalbuf operations by acquiring/upgrading to theps2004-12-061-26/+23
| | | | | | | | | | | | | vnode EXCLUSIVE lock. This prevents threads from adding pages to the vnode while an invalidation is in progress, closing potential races. In the bioread() path, callers acquire the SHARED vnode lock - so while an invalidate was in progress, it was possible to fault in new pages onto the vnode causing the invalidation to take a while or fail. We saw these races at Yahoo! with very large files+heavy concurrent access. Forcing an upgrade to EXCLUSIVE lock before doing the invalidation closes all these races. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
* - Eliminate the acquisition and release of the bqlock in bremfree() byjeff2004-11-181-0/+2
| | | | | | | | | | | | | | | | | | | | setting the B_REMFREE flag in the buf. This is done to prevent lock order reversals with code that must call bremfree() with a local lock held. This also reduces overhead by removing two lock operations per buf for fsync() and similar. - Check for the B_REMFREE flag in brelse() and bqrelse() after the bqlock has been acquired so that we may remove ourself from the free-list. - Provide a bremfreef() function to immediately remove a buf from a free-list for use only by NFS. This is done because the nfsclient code overloads the b_freelist queue for its own async. io queue. - Simplify the numfreebuffers accounting by removing a switch statement that executed the same code in every possible case. - getnewbuf() can encounter locked bufs on free-lists once Giant is removed. Remove a panic associated with this condition and delay asserts that inspect the buf until after it is locked. Reviewed by: phk Sponsored by: Isilon Systems, Inc.
* Retire b_magic now, we have the bufobj containing the same hint.phk2004-11-041-1/+0
|
* Move the buffer method vector (buf->b_op) to the bufobj.phk2004-10-241-35/+2
| | | | | | | | | | | | | | | | | Extend it with a strategy method. Add bufstrategy() which do the usual VOP_SPECSTRATEGY/VOP_STRATEGY song and dance. Rename ibwrite to bufwrite(). Move the two NFS buf_ops to more sensible places, add bufstrategy to them. Add inlines for bwrite() and bstrategy() which calls through buf->b_bufobj->b_ops->b_{write,strategy}(). Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
* Add b_bufobj to struct buf which eventually will eliminate the need for b_vp.phk2004-10-221-2/+2
| | | | | | | | | | | | | | | | | | Initialize b_bufobj for all buffers. Make incore() and gbincore() take a bufobj instead of a vnode. Make inmem() local to vfs_bio.c Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj) also VI_MTX() to BO_MTX(), Make buf_vlist_add() take a bufobj instead of a vnode. Eliminate other uses of bp->b_vp where bp->b_bufobj will do. Various minor polishing: remove "register", turn panic into KASSERT, use new function declarations, TAILQ_FOREACH_SAFE() etc.
* nfsclient/nfs_bio.c has a PHOLD() without a PRELE(). Neither shoulddas2004-10-011-3/+1
| | | | be necessary here. Also, use killproc() instead of psignal().
* Remove unused B_WRITEINPROG flagphk2004-09-151-5/+0
|
* Explicitly pass vnode to nfs_doio() and mountpoint to nfs_asyncio().phk2004-09-071-14/+9
|
* NFS mobility PHASE I, II & III (phase VI, and V pending):alfred2004-07-061-19/+37
| | | | | | | | | | | | | | | Rebind the client socket when we experience a timeout. This fixes the case where our IP changes for some reason. Signal a VFS event when NFS transitions from up to down and vice versa. Add a placeholder vfs_sysctl where we will put status reporting shortly. Also: Make down NFS mounts return EIO instead of EINTR when there is a soft timeout or force unmount in progress.
* Remove bad cookie vp kernel printf; while it does notify about anrwatson2004-06-171-1/+0
| | | | | interesting event, there's little or nothing the user can do about it.
* Make vm_page's PG_ZERO flag immutable between the time of the page'salc2004-05-061-2/+0
| | | | | | | | | | allocation and deallocation. This flag's principal use is shortly after allocation. For such cases, clearing the flag is pointless. The only unusual use of PG_ZERO is in vfs_bio_clrbuf(). However, allocbuf() never requests a prezeroed page. So, vfs_bio_clrbuf() never sees a prezeroed page. Reviewed by: tegge@
* Let the NFS client notice a file's size changing as a modification.peadar2004-04-141-1/+3
| | | | | | | | | This avoids presenting invalid data to the client's applications when the file is modified, and then extended within the window of the resolution of the modifcation timestamp. Reviewed By: iedowse PR: kern/64091
* Remove advertising clause from University of California Regent'simp2004-04-071-4/+0
| | | | | | | license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson
* Properly vector all bwrite() and BUF_WRITE() calls through the same pathphk2004-03-111-2/+2
| | | | and s/BUF_WRITE()/bwrite()/ since it now does the same as bwrite().
* Locking for the per-process resource limits structure.jhb2004-02-041-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that is is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't used the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
* Use function pointers to remove the depenancy cross dependancy on nfs4alfred2003-11-221-19/+10
| | | | | | | | and the nfs3 client. Also fix some bugs that happen to be causing crashes in both v3 and v4 introduced by the v4 import. Submitted by: Jim Rees <rees@umich.edu> Approved by: re
* University of Michigan's Citi NFSv4 kernel client code.alfred2003-11-141-10/+43
| | | | Submitted by: Jim Rees <rees@umich.edu>
* Initialize bp->b_offset before calling VOP_STRATEGY().phk2003-10-181-5/+0
| | | | Remove KASSERTS and panics with B_PHYS checks which no longer apply.
* We do not get B_PHYS buffers here anymore. /dev/drum is long gone.phk2003-10-181-24/+2
|
* - Remove the backtrace() call from the *_vinvalbuf() functions. Thanks to ajeff2003-10-041-5/+6
| | | | | | | | stack trace supplied by phk, I now understand what's going on here. The check for VI_XLOCK stops us from calling vinvalbuf once the vnode has been partially torn down in vclean(). It is not clear that this would cause a problem. Document this in nfs_bio.c, which is where the other two filesystems copied this code from.
* - Remove interlock protection around VI_XLOCK. The interlock is notjeff2003-09-191-4/+3
| | | | | | | | | | | | | sufficient to guarantee that this race is not hit. The XLOCK will likely have to be redesigned due to the way reference counting and mutexes work in FreeBSD. We currently can not be guaranteed that xlock was not set and cleared while we were blocked on the interlock while waiting to check for XLOCK. This would lead us to reference a vnode which was not the vnode we requested. - Add a backtrace() call inside of INVARIANTS in the hopes of finding out if this condition is ever hit. It should not, since we should be retaining a reference to the vnode in these cases. The reference would be sufficient to block recycling.
* Lock the vm object when freeing a page.alc2003-06-171-1/+9
|
* The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to preventphk2003-05-311-6/+0
| | | | | | deadlocks with vnode backed md(4) devices because md now uses a kthread to run the bio requests instead of doing it directly from the bio down path.
* This change grabs the vnode lock for NFS client vnodes when callingrwatson2003-05-151-1/+3
| | | | | | | | | | VOP_SETATTR() or VOP_GETATTR(); without these locks (a) VFS_DEBUG_LOCKS will panic, and (b) it may be possible to corrupt entries in the cached vnode attributes in the nfsnode, since nfsnode attribute cache data is also protected by the vnode lock. Approved by: re (jhb) Pointed out by: VFS_DEBUG_LOCKS
* - Add a new 'flags' parameter to getblk().jeff2003-03-041-3/+3
| | | | | | | | | | - Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT flag to the initial BUF_LOCK(). This will eventually be used in cases were we want to use a buffer only if it is not currently in use. - Convert all consumers of the getblk() api to use this extra parameter. Reviwed by: arch Not objected to by: mckusick
* More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9).des2003-03-021-4/+4
|
* Abstract-out the constants for the sequential heuristic.dillon2002-12-281-1/+1
| | | | | | No operational changes. MFC after: 1 day
* - Lock access to the buf lists.jeff2002-09-251-1/+1
| | | | | - Use vrefcnt() where appropriate. - Add some locking asserts.
* - Replace v_flag with v_iflag and v_vflagjeff2002-08-041-2/+7
| | | | | | | | | | | | | | | - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS
* o Lock page queue accesses in nfs_getpages().alc2002-07-211-1/+7
|
* Fix a bug nfs_write() related to ^C'ing during a file write on andillon2002-07-161-2/+4
| | | | | | | | interruptable mount. We were returning from inside the loop without releasing the rslock. Submitted by: Mike Junk <junk@isilon.com> MFC after: 3 days
* Convert old style (type foo *)0 casts to NULLsdillon2002-07-111-6/+6
| | | | | PR: kern/40360 Requested by: Hiten PAndya via direct email
* Replace the global buffer hash table with per-vnode splay trees using adillon2002-07-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | methodology similar to the vm_map_entry splay and the VM splay that Alan Cox is working on. Extensive testing has appeared to have shown no increase in overhead. Disadvantages Dirties more cache lines during lookups. Not as fast as a hash table lookup (but still N log N and optimal when there is locality of reference). Advantages vnode->v_dirtyblkhd is now perfectly sorted, making fsync/sync/filesystem syncer operate more efficiently. I get to rip out all the old hacks (some of which were mine) that tried to keep the v_dirtyblkhd tailq sorted. The per-vnode splay tree should be easier to lock / SMPng pushdown on vnodes will be easier. This commit along with another that Alan is working on for the VM page global hash table will allow me to implement ranged fsync(), optimize server-side nfs commit rpcs, and implement partial syncs by the filesystem syncer (aka filesystem syncer would detect that someone is trying to get the vnode lock, remembers its place, and skip to the next vnode). Note that the buffer cache splay is somewhat more complex then other splays due to special handling of background bitmap writes (multiple buffers with the same lblkno in the same vnode), and B_INVAL discontinuities between the old hash table and the existence of the buffer on the v_cleanblkhd list. Suggested by: alc
* In namei(), we use a NULL thread for uio_td when doing a VOP_READLINK().jhb2002-06-281-4/+4
| | | | | | | | | | | | nfs_readlink() calls nfs_bioread() which passes in uio_td as the thread argument to nfs_getcacheblk(). In nfs_getcacheblk() we dereference the thread pointer to get a process pointer to pass to nfs_sigintr(). This obviously results in a panic. :) Rather than change nfs_getcacheblk() to check if the thread pointer is NULL when calling nfs_sigintr() like other callers do, change nfs_sigintr() to take a thread as the last argument instead of a process so none of the callers have to care if the thread is NULL or not.
* Simple p_ucred -> td_ucred changes to start using the per-thread ucredjhb2002-02-271-2/+2
| | | | reference.
* Revise the nfsiod auto tuning code. Now both the upper and lower limitspeter2002-01-151-12/+14
| | | | | | are specifyable by sysctl and are respected. Submitted by: Maxime Henrion <mux@sneakerz.org>
* Implement vfs.nfs.iodmin (minimum number of nfsiod's) andpeter2002-01-141-23/+25
| | | | | | | | | vfs.nfs.iodmaxidle (idle time before nfsiod's exit). Make it adaptive so that we create nfsiod's on demand and they go away after not being used for a while. The upper limit is NFS_MAXASYNCDAEMON (currently 20). More will be done here, but this is a useful checkpoint. Submitted by: Maxime Henrion <mux@qualys.com>
* This fixes a large number of bugs in our NFS client side code. A recentdillon2001-12-141-4/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit by Kirk also fixed a softupdates bug that could easily be triggered by server side NFS. * An edge case with shared R+W mmap()'s and truncate whereby the system would inappropriately clear the dirty bits on still-dirty data. (applicable to all filesystems) THIS FIX TEMPORARILY DISABLED PENDING FURTHER TESTING. see vm/vm_page.c line 1641 * The straddle case for VM pages and buffer cache buffers when truncating. (applicable to NFS client side) * Possible SMP database corruption due to vm_pager_unmap_page() not clearing the TLB for the other cpu's. (applicable to NFS client side but could effect all filesystems). Note: not considered serious since the corruption occurs beyond the file EOF. * When flusing a dirty buffer due to B_CACHE getting cleared, we were accidently setting B_CACHE again (that is, bwrite() sets B_CACHE), when we really want it to stay clear after the write is complete. This resulted in a corrupt buffer. (applicable to all filesystems but probably only triggered by NFS) * We have to call vtruncbuf() when ftruncate()ing to remove any buffer cache buffers. This is still tentitive, I may be able to remove it due to the second bug fix. (applicable to NFS client side) * vnode_pager_setsize() race against nfs_vinvalbuf()... we have to set n_size before calling nfs_vinvalbuf or the NFS code may recursively vnode_pager_setsize() to the original value before the truncate. This is what was causing the user mmap bus faults in the nfs tester program. (applicable to NFS client side) * Fix to softupdates (see ufs/ffs/ffs_inode.c 1.73, commit made by Kirk). Testing program written by: Avadis Tevanian, Jr. Testing program supplied by: jkh / Apple (see Dec2001 posting to freebsd-hackers with Subject 'NFS: How to make FreeBS fall on its face in one easy step') MFC after: 1 week
* Implement IO_NOWDRAIN and B_NOWDRAIN - prevents the buffer cache from blockingdillon2001-11-051-0/+6
| | | | | | | | | | | | | in wdrain during a write. This flag needs to be used in devices whos strategy routines turn-around and issue another high level I/O, such as when MD turns around and issues a VOP_WRITE to vnode backing store, in order to avoid deadlocking the dirty buffer draining code. Remove a vprintf() warning from MD when the backing vnode is found to be in-use. The syncer of buf_daemon could be flushing the backing vnode at the time of an MD operation so the warning is not correct. MFC after: 1 week
* Change the kernel's ucred API as follows:jhb2001-10-111-12/+6
| | | | | | | | - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.
* Sigh, Last minute pre-merge typo. (missing quotes)peter2001-09-181-1/+1
|
OpenPOWER on IntegriCloud