summaryrefslogtreecommitdiffstats
path: root/sys/nfsclient/nfs_node.c
Commit message (Collapse)AuthorAgeFilesLines
* Serialize NFS vinvalbuf operations by acquiring/upgrading to theps2004-12-061-1/+1
| | | | | | | | | | | | | vnode EXCLUSIVE lock. This prevents threads from adding pages to the vnode while an invalidation is in progress, closing potential races. In the bioread() path, callers acquire the SHARED vnode lock - so while an invalidate was in progress, it was possible to fault in new pages onto the vnode causing the invalidation to take a while or fail. We saw these races at Yahoo! with very large files+heavy concurrent access. Forcing an upgrade to EXCLUSIVE lock before doing the invalidation closes all these races. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
* Back when VOP_* was introduced, we did not have new-style structphk2004-12-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | initializations but we did have lofty goals and big ideals. Adjust to more contemporary circumstances and gain type checking. Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to belive this would ever be feasible in the the first place. Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc. Give coda correct prototypes and function definitions for all vop_()s. Generate a bit more data from the vnode_if.src file: a struct vop_vector and protype typedefs for all vop methods. Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector. Remove a lot of vfs_init since vop_vector is ready to use from the compiler. Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc. Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting but since the code is generated by a script comparatively safe. The alternative for nullfs etc. would be much worse. Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts)
* Move the buffer method vector (buf->b_op) to the bufobj.phk2004-10-241-0/+4
| | | | | | | | | | | | | | | | | Extend it with a strategy method. Add bufstrategy() which do the usual VOP_SPECSTRATEGY/VOP_STRATEGY song and dance. Rename ibwrite to bufwrite(). Move the two NFS buf_ops to more sensible places, add bufstrategy to them. Add inlines for bwrite() and bstrategy() which calls through buf->b_bufobj->b_ops->b_{write,strategy}(). Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
* Clean up properly when unloading NFS client module.peadar2004-04-111-0/+10
| | | | | | | | | This includes a modified form of some code from Thomas Moestl (tmm@) to properly clean up the UMA zone and the "nfsnodehashtbl" hash table. Reviewed By: iedowse PR: 16299
* Remove advertising clause from University of California Regent'simp2004-04-071-4/+0
| | | | | | | license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson
* University of Michigan's Citi NFSv4 kernel client code.alfred2003-11-141-3/+8
| | | | Submitted by: Jim Rees <rees@umich.edu>
* Since the addition of the VI_DOINGINACT flag some time ago,iedowse2003-10-051-15/+1
| | | | | | | VOP_INACTIVE routines need not worry about their vnode getting recycled if they block. Remove the code from nfs_inactive() that used vget() to get an extra vnode reference that was held during the nfs_vinvalbuf() call.
* - We don't need to cache_purge() in nfs_reclaim(), vclean() does it for us.jeff2003-10-051-2/+0
|
* Name the vnode method vectors consistently with the rest of the filesystems.phk2003-09-121-1/+1
| | | | This improves the output of src/tools/tools/vop_table
* Back out M_* changes, per decision of the TRB.imp2003-02-191-2/+2
| | | | Approved by: trb
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.alfred2003-01-211-2/+2
| | | | Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
* Remove extern declarations of stuff which is static in nfs_node.cphk2002-10-201-0/+3
| | | | | | Move related macro to nfs_node.c Spotted by: FlexeLint
* Regularize the vop_stdlock'ing protocol across all the filesystemsmckusick2002-10-141-1/+0
| | | | | | | | | | | | | | | | | | | | that use it. Specifically, vop_stdlock uses the lock pointed to by vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to reference vp->v_lock. Filesystems that wish to use the default do not need to allocate a lock at the front of their node structure (as some still did) or do a lockinit. They can simply start using vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks, but still use the vop_stdlock functions (such as nullfs) can simply replace vp->v_vnlock with a pointer to the lock that they wish to have used for the vnode. Such filesystems are responsible for setting the vp->v_vnlock back to the default in their vop_reclaim routine (e.g., vp->v_vnlock = &vp->v_lock). In theory, this set of changes cleans up the existing filesystem lock interface and should have no function change to the existing locking scheme. Sponsored by: DARPA & NAI Labs.
* - Lock access to the buf lists.jeff2002-09-251-3/+3
| | | | | - Use vrefcnt() where appropriate. - Add some locking asserts.
* Remove all use of vnode->v_tag, replacing with appropriate substitutes.njl2002-09-141-1/+1
| | | | | | | | | | | | v_tag is now const char * and should only be used for debugging. Additionally: 1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK 2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP. Suggested by: phk Reviewed by: bde, rwatson (earlier version)
* Convert old style (type foo *)0 casts to NULLsdillon2002-07-111-3/+3
| | | | | PR: kern/40360 Requested by: Hiten PAndya via direct email
* Remove the nfs_{lock,unlock,islocked} functions and the associatediedowse2002-04-271-86/+0
| | | | | definitions; they have been unused and #if 0'd out since the Lite/2 merge and we are unlikely to want them in the future.
* Remove references to vm_zone.h and switch over to the new uma API.jeff2002-03-201-7/+8
|
* nfs_nget() does no locking whatsoever when looking up a vnode. If thedillon2001-12-271-1/+4
| | | | | | | vget() sleeps we have to retry the operation to avoid racing against a deletion. MFC maybe: submitted to re's
* Cleanup and split of nfs client and server code.peter2001-09-181-60/+33
| | | | This builds on the top of several repo-copies.
* KSE Milestone 2julian2001-09-121-10/+10
| | | | | | | | | | | | | | Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
* Undo part of the tangle of having sys/lock.h and sys/mutex.h included inmarkm2001-05-011-3/+6
| | | | | | | | | | | other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
* Revert consequences of changes to mount.h, part 2.grog2001-04-291-2/+0
| | | | Requested by: bde
* Correct #includes to work with fixed sys/mount.h.grog2001-04-231-0/+2
|
* Create debug.hashstat.[raw]nchash and debug.hashstat.[raw]nfsnode topeter2001-04-111-0/+79
| | | | | | | | | | | enable easy access to the hash chain stats. The raw prefixed versions dump an integer array to userland with the chain lengths. This cheats and calls it an array of 'struct int' rather than 'int' or sysctl -a faithfully dumps out the 128K array on an average machine. The non-raw versions return 4 integers: count, number of chains used, maximum chain length, and percentage utilization (fixed point, multiplied by 100). The raw forms are more useful for analyzing the hash distribution, while the other form can be read easily by humans and stats loggers.
* Use the same API as the example code.peter2001-03-201-1/+1
| | | | | | | | | | Allow the initial hash value to be passed in, as the examples do. Incrementally hash in the dvp->v_id (using the official api) rather than add it. This seems to help power-of-two predictable filename trees where the filenames repeat on a power-of-two cycle and the directory trees have power-of-two components in it. The simple add then mask was causing things like 12000+ entry collision chains while most other entries have between 0 and 3 entries each. This way seems to improve things.
* Use a generic implementation of the Fowler/Noll/Vo hash (FNV hash).peter2001-03-171-28/+2
| | | | | | | | | | | | | | | | | Make the name cache hash as well as the nfsnode hash use it. As a special tweak, create an unsigned version of register_t. This allows us to use a special tweak for the 64 bit versions that significantly speeds up the i386 version (ie: int64 XOR int64 is slower than int64 XOR int32). The code layout is a little strange for the string function, but I was able to get between 5 to 10% improvement over the original version I started with. The layout affects gcc code generation choices and this way was fastest on x86 and alpha. Note that 'CPUTYPE=p3' etc makes a fair difference to this. It is around 45% faster with -march=pentiumpro on a p6 cpu.
* Dramatically improve the **lame** nfs_hash(). This is based on thepeter2001-03-171-8/+16
| | | | | | | | | | | | | Fowler / Noll / Vo Hash (http://www.isthe.com/chongo/tech/comp/fnv/). This improves hash coverage a *massive* amount. We were seeing one set of machines that were using 0.84% of their 131072 entry nfsnode hash buckets with maximum chain lengths of up to ~500 entries. The machine was spending nearly 100% of its time in 'system'. A test with this has pushed the coverage from a few perCent up to 91% utilization with a max chain length of 11. Submitted by: David Filo
* In preparation for deprecating CIRCLEQ macros in favor of TAILQmckusick2000-11-141-2/+2
| | | | | macros which provide the same functionality and are a bit more efficient, convert use of CIRCLEQ's in NFS to TAILQ's.
* Remove unneeded #include <sys/proc.h> lines.phk2000-10-291-1/+0
|
* Add missed vop_stdunlock() for fifo's vnops (this affects only v2 mounts).bp2000-10-151-0/+1
| | | | Give nfs's node lock its own name.
* Convert lockmgr locks from using simple locks to using mutexes.jasone2000-10-041-0/+2
| | | | | | Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.
* Back out the previous change to the queue(3) interface.jake2000-05-261-1/+1
| | | | | | It was not discussed and should probably not happen. Requested by: msmith and others
* Change the way that the queue(3) structures are declared; don't assume thatjake2000-05-231-1/+1
| | | | | | | | the type argument to *_HEAD and *_ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd
* Enhance reassignbuf(). When a buffer cannot be time-optimally inserteddillon2000-01-051-1/+13
| | | | | | | | | | | | | | | | | | | into vnode dirtyblkhd we append it to the list instead of prepend it to the list in order to maintain a 'forward' locality of reference, which is arguably better then 'reverse'. The original algorithm did things this way to but at a huge time cost. Enhance the append interlock for NFS writes to handle intr/soft mounts better. Fix the hysteresis for NFS async daemon I/O requests to reduce the number of unnecessary context switches. Modify handling of NFS mount options. Any given user option that is too high now defaults to the kernel maximum for that option rather then the kernel default for that option. Reviewed by: Alfred Perlstein <bright@wintelcom.net>
* Introduce NDFREE (and remove VOP_ABORTOP)eivind1999-12-151-17/+0
|
* Fix two problems: First, fix the append seek position race that candillon1999-12-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | occur due to np->n_size potentially changing if nfs_getcacheblk() blocks in nfs_write(). Second, under -current we must supply the proper bufsize when obtaining buffers that straddle the EOF, but due to the fact that np->n_size can change out from under us it is possible that we may specify the wrong buffer size and wind up truncating dirty data written by another process. Both problems are solved by implementing nfs_rslock(), which allows us to lock around sensitive buffer cache operations such as those that occur when appending to a file. It is believed that this race is responsible for causing dirtyoff/dirtyend and (in stable) validoff/validend to exceed the buffer size. Therefore we have now added a warning printf for the dirtyoff/end case in current. However, we have introduced a new problem which we need to fix at some point, and that is that soft or intr NFS mounts may become uninterruptable from the point of view of process A which is stuck waiting on rslock while process B is stuck doing the rpc. To unstick process A, process B would have to be interrupted first. Reviewed by: Alfred Perlstein <bright@wintelcom.net>
* Lock reporting and assertion changes.eivind1999-12-111-0/+1
| | | | | | | | | | | | | | | * lockstatus() and VOP_ISLOCKED() gets a new process argument and a new return value: LK_EXCLOTHER, when the lock is held exclusively by another process. * The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them * Extend the vnode_if.src format to allow more exact specification than locked/unlocked. This commit should not do any semantic changes unless you are using DEBUG_VFS_LOCKS. Discussed with: grog, mch, peter, phk Reviewed by: peter
* $Id$ -> $FreeBSD$peter1999-08-281-1/+1
|
* Thanks to Bruce for noticing this.... compare against the *new* nfsnode'smjacob1999-06-191-2/+2
| | | | | | | | mount point for seeing whether or not the new nfsnode is already in the hash queue. We're pretty much guaranteed that the old nfsnode is already in the hash queue. Wank! Infinite Loop! Looks like just a minor typo.... (ah the influence of fortran ... np && np2... why not nfsnode_the_first && nfsnode_the_second???)...
* If we retry this operation from the top of this routine, we need tomjacob1999-06-151-1/+5
| | | | | | make sure we've freed any allocated resources (to avoid a memory leak) and and do the right thing with respect to the nfs node hash lock we'd acquired.
* Fix a malloc racepeter1999-06-051-3/+12
| | | | Obtained from: OpenBSD (csapuntz)
* Do not need (or want) to take a reference on an NFS file thatmckusick1998-09-291-6/+12
| | | | | | | | | is being deleted due to an forcible unmount. The problem is that vgone calls vclean() which then calls calls nfs_inactive() with VXLOCK set on the vnode. Nfs_inactive() was calling vget() to get a reference on the vnode, which in turn hung on VXLOCK. Nfs_inactive() now checks v_usecount to make sure that the vnode is not coming from vclean() before it does a vget().
* Convert a couple of large allocations to use zones rather than mallocpeter1998-05-241-14/+9
| | | | | | for better packing. This means that we can choose better values for the various hash entries without having to try and get it all to fit within an artificial power of two limit for malloc's sake.
* Add missing arg to vget().. Serves me right for committing a 2.2 patch topeter1998-05-131-2/+2
| | | | | | -current without testing it there.. :-( Submitted by: Michael Hancock <michaelh@cet.co.jp>
* Hold a reference to the vnode during the sillyrename cleanup. If we blockpeter1998-05-131-1/+9
| | | | | | | | | | | | | | | in nfs_vinvalbuf() or the nfs_removeit(), we can have the nfsnode reallocated from underneath us (eg: replaced by a ufs 'struct inode') which can cause disk corruption ('freeing free block' when di_db[5] gets trashed). This is not a cheap fix, but it'll do until the nfsnodes get reference counting and/or locking. Apparently NetBSD have a similar fix (apparently from BSDI). I wish all PR's had this much useful detail. :-) PR: 6611 Submitted by: Stephen Clawson <sclawson@marker.cs.utah.edu>
* Staticize.eivind1998-02-091-4/+4
|
* Unspammed nested include of <vm/vm_zone.h>.bde1997-12-271-1/+3
|
* Don't #include <nfs/nfs.h> in <nfs/nfs_node.h> if KERNEL is defined.bde1997-10-281-1/+2
| | | | Fixed everything that depended on the nested include.
* Last major round (Unless Bruce thinks of somthing :-) of malloc changes.phk1997-10-121-2/+2
| | | | | | | | Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde
OpenPOWER on IntegriCloud