path: root/sys/nfsclient/nfs_subs.c
Commit message | Author | Age | Files | Lines
* Changes to make the NFS client MP safe. | mohans | 2006-05-19 | 1 | -24/+108
  Thanks to Kris Kennaway for testing and sending lots of bugs my way.
* fix a problem with XID re-use when a server returns NFSERR_JUKEBOX. | rees | 2005-11-21 | 1 | -3/+4
  Submitted by: cel@citi.umich.edu
  Fixed by: rick@snowhite.cis.uoguelph.ca
  Approved by: alfred
  MFC after: 3 weeks
* Make nfs_timer() MPSAFE. | ps | 2005-07-19 | 1 | -1/+1
  With this change, the bottom half of the NFS client (the interface with the protocol stack and callouts) is Giant-free.
  Submitted by: Mohan Srinivasan.
* - The VI_DOOMED flag now signals the end of a vnode's relationship with the filesystem. | jeff | 2005-03-13 | 1 | -1/+1
  Check that rather than VI_XLOCK.
  Sponsored by: Isilon Systems, Inc.
* /* -> /*- for license, minor formatting changes | imp | 2005-01-07 | 1 | -1/+1
* Rewrite of the NFS client's reply handling. | ps | 2004-12-06 | 1 | -0/+4
  We now have NFS socket upcalls which do RPC header parsing and match up the reply with the request. NFS calls now sleep on the nfsreq structure. This enables us to eliminate the NFS recvlock.
  Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
* 2 fixes that improve on the consistency of the NFS client cache. | ps | 2004-12-06 | 1 | -4/+4
  - Change the cached mtime to a 'struct timespec' from a time_t. Improving the precision of the cached mtime tightens up NFS' "close-to-open" consistency considerably.
  - Always force an over-the-wire consistency check from nfs_open() (unless the file is marked modified). This further improves NFS' "close-to-open" consistency.
  Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
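The effect of the first fix can be sketched in userspace C. This is only an illustration of why a nanosecond-precision cached mtime catches more changes than a whole-second one; `struct cached_attr` and both comparison helpers are hypothetical names, not the fields or functions used in nfs_subs.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical cache entry; the real nfsnode layout is not shown here. */
struct cached_attr {
    struct timespec mtime;      /* cached modification time */
};

/* Coarse check: compares whole seconds only, like a time_t cache. */
bool
mtime_changed_seconds(const struct cached_attr *c, const struct timespec *srv)
{
    assert(c != NULL && srv != NULL);
    return c->mtime.tv_sec != srv->tv_sec;
}

/* Fine check: compares seconds and nanoseconds. */
bool
mtime_changed_timespec(const struct cached_attr *c, const struct timespec *srv)
{
    assert(c != NULL && srv != NULL);
    return c->mtime.tv_sec != srv->tv_sec ||
        c->mtime.tv_nsec != srv->tv_nsec;
}
```

With a cached mtime of 100 s + 1000 ns and a server mtime of 100 s + 2000 ns, the coarse check reports no change while the fine check reports one.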
* Add non-blocking versions of nfsm_dissect() and friends, for use from socket callbacks or similar callers, from both the NFS client and the server. | ps | 2004-12-06 | 1 | -1/+1
  Instituted nfsm_dissect_nonblock() and nfsm_dissect_xx_nonblock(). nfsm_disct() now takes an extra M_TRYWAIT/M_DONTWAIT argument.
  Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
* For reasons unknown, the nfs locking code used a fifo to send requests to userland and a dedicated system call to get replies. | phk | 2004-12-06 | 1 | -10/+0
  The vnode-bypass of fifos broke this into a panic.
  Ditch all the magic and create a device /dev/nfslock instead, and use that for both directions.
  Apart from the shorter path, this is also faster because the device driver runs Giant free using the vnode bypass.
  Noticed by: marcel
* Back when VOP_* was introduced, we did not have new-style struct initializations but we did have lofty goals and big ideals. | phk | 2004-12-01 | 1 | -1/+1
  Adjust to more contemporary circumstances and gain type checking.
  Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to believe this would ever be feasible in the first place.
  Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.
  Give coda correct prototypes and function definitions for all vop_()s.
  Generate a bit more data from the vnode_if.src file: a struct vop_vector and prototype typedefs for all vop methods.
  Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector.
  Remove a lot of vfs_init since vop_vector is ready to use from the compiler.
  Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc.
  Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting, but since the code is generated by a script it is comparatively safe. The alternative for nullfs etc. would be much worse.
  Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts.)
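The offsetof()-based dispatch described above can be sketched as follows. This is a simplified userspace model under stated assumptions, not the generated kernel code: the two-method `struct vop_vector`, the `vop_fn_t` type, `vop_lookup()`, and the example vectors are all hypothetical, and walking the `vop_default` chain on a NULL entry is an assumption about how the fallback pointer is meant to be used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vnode;                               /* opaque in this sketch */
typedef int vop_fn_t(struct vnode *);       /* simplified method type */

/* Simplified stand-in for the generated struct vop_vector. */
struct vop_vector {
    const struct vop_vector *vop_default;   /* fallback vector, or NULL */
    vop_fn_t *vop_open;
    vop_fn_t *vop_close;
};

/*
 * Fetch the method stored at byte offset `off`, falling back along the
 * vop_default chain until a non-NULL function pointer is found.
 */
vop_fn_t *
vop_lookup(const struct vop_vector *vop, size_t off)
{
    vop_fn_t *fn = NULL;

    assert(off + sizeof(fn) <= sizeof(*vop));
    for (; vop != NULL; vop = vop->vop_default) {
        /* memcpy reads the pointer without type-punned dereferences. */
        memcpy(&fn, (const char *)vop + off, sizeof(fn));
        if (fn != NULL)
            return fn;
    }
    return NULL;
}

int default_open(struct vnode *vp)  { (void)vp; return 1; }
int default_close(struct vnode *vp) { (void)vp; return 2; }
int nfs_like_open(struct vnode *vp) { (void)vp; return 3; }

const struct vop_vector default_vops = {
    .vop_default = NULL,
    .vop_open = default_open,
    .vop_close = default_close,
};

/* Overrides open only; close is left NULL and comes from the default. */
const struct vop_vector nfs_like_vops = {
    .vop_default = &default_vops,
    .vop_open = nfs_like_open,
};
```

Calling `vop_lookup(&nfs_like_vops, offsetof(struct vop_vector, vop_close))` yields `default_close` in this model, while the `vop_open` offset yields `nfs_like_open`. Because the vectors are ordinary typed initializers, the compiler checks every method's signature, which is the type-safety gain the message refers to.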
* Add b_bufobj to struct buf which eventually will eliminate the need for b_vp. | phk | 2004-10-22 | 1 | -2/+1
  Initialize b_bufobj for all buffers.
  Make incore() and gbincore() take a bufobj instead of a vnode.
  Make inmem() local to vfs_bio.c.
  Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj), and VI_MTX() to BO_MTX().
  Make buf_vlist_add() take a bufobj instead of a vnode.
  Eliminate other uses of bp->b_vp where bp->b_bufobj will do.
  Various minor polishing: remove "register", turn panic into KASSERT, use new function declarations, TAILQ_FOREACH_SAFE() etc.
* Move the VI_BWAIT flag into the bo_flag element of bufobj and call it BO_WWAIT | phk | 2004-10-21 | 1 | -1/+1
  Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write count on a bufobj. Bufobj_wdrop() replaces vwakeup().
  Use these functions in all relevant places except in ffs_softdep.c, where the use of interlocked_sleep() makes this impossible.
  Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.
* Remove support for using NFS device nodes. | phk | 2004-09-28 | 1 | -7/+0
* When we traverse the vnodes on a mountpoint we need to look out for our cached 'next vnode' being removed from this mountpoint. | phk | 2004-07-04 | 1 | -5/+1
  If we find that it was recycled, we restart our traversal from the start of the list.

  Code to do that is in all local disk filesystems (and a few other places) and looks roughly like this:

      MNT_ILOCK(mp);
    loop:
      for (vp = TAILQ_FIRST(&mp...);
          (vp = nvp) != NULL;
          nvp = TAILQ_NEXT(vp,...)) {
              if (vp->v_mount != mp)
                      goto loop;
              MNT_IUNLOCK(mp);
              ...
              MNT_ILOCK(mp);
      }
      MNT_IUNLOCK(mp);

  The code which takes vnodes off a mountpoint looks like this:

      MNT_ILOCK(vp->v_mount);
      ...
      TAILQ_REMOVE(&vp->v_mount->mnt_nvnodelist, vp, v_nmntvnodes);
      ...
      MNT_IUNLOCK(vp->v_mount);
      ...
      vp->v_mount = something;

  (Take a moment and try to spot the locking error before you read on.)

  On an SMP system, one CPU could have removed nvp from our mountlist but not yet gotten to assign a new value to vp->v_mount while another CPU simultaneously gets to the top of the traversal loop, where it finds that (vp->v_mount != mp) is not true despite the fact that the vnode has indeed been removed from our mountpoint.

  Fix:

  Introduce the macro MNT_VNODE_FOREACH() to traverse the list of vnodes on a mountpoint while taking into account that vnodes may be removed from the list as we go. This saves approx 65 lines of duplicated code.

  Split the insmntque() which potentially moves a vnode from one mount point to another into delmntque() and insmntque() which do just what the names say.

  Fix delmntque() to set vp->v_mount to NULL while holding the mountpoint lock.
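The fix to delmntque() can be modelled in userspace with a pthread mutex and the queue(3) macros. This is a hedged sketch, not the kernel implementation: every `_model` name, the `v_visited` counter, and the `remove_victim()` demo callback are hypothetical, and the loop below imitates the restart logic from the message rather than reproducing MNT_VNODE_FOREACH().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

struct mount;

struct vnode {
    TAILQ_ENTRY(vnode) v_link;      /* linkage on the mount's vnode list */
    struct mount *v_mount;          /* owning mount, NULL once removed */
    int v_visited;                  /* demo counter, not a kernel field */
};

struct mount {
    pthread_mutex_t mnt_lock;       /* models the per-mountpoint mutex */
    TAILQ_HEAD(, vnode) mnt_vnodes;
};

void
mount_init_model(struct mount *mp)
{
    pthread_mutex_init(&mp->mnt_lock, NULL);
    TAILQ_INIT(&mp->mnt_vnodes);
}

void
insmntque_model(struct vnode *vp, struct mount *mp)
{
    pthread_mutex_lock(&mp->mnt_lock);
    vp->v_mount = mp;
    TAILQ_INSERT_TAIL(&mp->mnt_vnodes, vp, v_link);
    pthread_mutex_unlock(&mp->mnt_lock);
}

void
delmntque_model(struct vnode *vp)
{
    struct mount *mp = vp->v_mount;

    assert(mp != NULL);
    pthread_mutex_lock(&mp->mnt_lock);
    TAILQ_REMOVE(&mp->mnt_vnodes, vp, v_link);
    vp->v_mount = NULL;             /* cleared before the lock is dropped */
    pthread_mutex_unlock(&mp->mnt_lock);
}

/* Visit each vnode with the lock dropped during work; returns visit count. */
int
mount_foreach_model(struct mount *mp,
    void (*work)(struct vnode *, void *), void *arg)
{
    struct vnode *vp, *nvp;
    int visits = 0;

    pthread_mutex_lock(&mp->mnt_lock);
restart:
    for (vp = TAILQ_FIRST(&mp->mnt_vnodes); vp != NULL; vp = nvp) {
        if (vp->v_mount != mp)      /* cached next vnode left the list */
            goto restart;
        nvp = TAILQ_NEXT(vp, v_link);
        pthread_mutex_unlock(&mp->mnt_lock);
        vp->v_visited++;
        visits++;
        work(vp, arg);
        pthread_mutex_lock(&mp->mnt_lock);
    }
    pthread_mutex_unlock(&mp->mnt_lock);
    return visits;
}

/* Demo callback: removes the vnode passed in arg if it is still mounted. */
void
remove_victim(struct vnode *vp, void *arg)
{
    struct vnode *victim = arg;

    (void)vp;
    if (victim->v_mount != NULL)
        delmntque_model(victim);
}
```

Because `v_mount` is cleared inside the locked region, a traversal that reacquires the lock can rely on the `vp->v_mount != mp` test to notice that its cached next vnode has gone. Restarting means earlier vnodes may be visited again, so the work done per vnode has to tolerate repeats.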
* Second half of the dev_t cleanup. | phk | 2004-06-17 | 1 | -1/+1
  The big lines are:
      NODEV -> NULL
      NOUDEV -> NODEV
      udev_t -> dev_t
      udev2dev() -> findcdev()
  Various minor adjustments including handling of userland access to kernel space struct cdev etc.
* Let the NFS client notice a file's size changing as a modification. | peadar | 2004-04-14 | 1 | -2/+9
  This avoids presenting invalid data to the client's applications when the file is modified, and then extended within the window of the resolution of the modification timestamp.
  Reviewed By: iedowse
  PR: kern/64091
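A minimal sketch of the idea, assuming a cached snapshot that holds a whole-second mtime and a size; `struct attr_snapshot` and `file_modified()` are hypothetical names and not the ones used in the NFS client.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical attribute snapshot; not the real nfsnode or vattr layout. */
struct attr_snapshot {
    time_t   mtime;     /* modification time, whole seconds */
    uint64_t size;      /* file size in bytes */
};

/*
 * Treat the file as modified if either the timestamp or the size differs.
 * The size test catches a second change that lands within the same
 * timestamp tick as the first.
 */
bool
file_modified(const struct attr_snapshot *cached,
    const struct attr_snapshot *fresh)
{
    assert(cached != NULL && fresh != NULL);
    return cached->mtime != fresh->mtime || cached->size != fresh->size;
}
```

If a file is written and then extended within one second, the two snapshots share an mtime but differ in size, so this check still reports a modification.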
* Unbreak build: s/TAILQ_ISEMPTY/TAILQ_EMPTY/g | marcel | 2004-04-11 | 1 | -1/+1
* Clean up properly when unloading NFS client module. | peadar | 2004-04-11 | 1 | -0/+19
  This includes a modified form of some code from Thomas Moestl (tmm@) to properly clean up the UMA zone and the "nfsnodehashtbl" hash table.
  Reviewed By: iedowse
  PR: 16299
* Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. | imp | 2004-04-07 | 1 | -4/+0
  Approved by: core, peter, alc, rwatson
* only do nfs rpc callouts if there is work to do. | rees | 2004-03-25 | 1 | -3/+2
  Submitted by: kan
  Approved by: alfred
* Remove unused second arg to vfinddev(). | phk | 2004-03-11 | 1 | -3/+4
  Don't call addaliasu() on VBLK nodes.
* Use function pointers to remove the cross dependency on nfs4 and the nfs3 client. | alfred | 2003-11-22 | 1 | -4/+0
  Also fix some bugs that happen to be causing crashes in both v3 and v4 introduced by the v4 import.
  Submitted by: Jim Rees <rees@umich.edu>
  Approved by: re
* University of Michigan's Citi NFSv4 kernel client code. | alfred | 2003-11-14 | 1 | -0/+6
  Submitted by: Jim Rees <rees@umich.edu>
* Remove mntvnode_mtx and replace it with per-mountpoint mutex. | kan | 2003-11-05 | 1 | -4/+4
  Introduce two new macros MNT_ILOCK(mp)/MNT_IUNLOCK(mp) to operate on this mutex transparently.
  Eventually the new mutex will protect more fields in struct mount, not only the vnode list.
  Discussed with: jeff
* - Check the XLOCK before we inspect the vnode. | jeff | 2003-10-05 | 1 | -0/+4
* Name the vnode method vectors consistently with the rest of the filesystems. | phk | 2003-09-12 | 1 | -2/+2
  This improves the output of src/tools/tools/vop_table
* Back out M_* changes, per decision of the TRB. | imp | 2003-02-19 | 1 | -11/+11
  Approved by: trb
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. | alfred | 2003-01-21 | 1 | -11/+11
  Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
* Change iov_base's type from `char *' to the standard `void *'. | mike | 2002-10-11 | 1 | -1/+2
  All uses of iov_base which assume its type is `char *' (in order to do pointer arithmetic) have been updated to cast iov_base to `char *'.
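The kind of cast this change requires can be shown with a small helper. `iov_advance()` is a hypothetical function written for illustration; only `struct iovec` and its `iov_base`/`iov_len` members come from the standard <sys/uio.h> interface.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Consume n bytes from the front of an iovec (hypothetical helper). */
void
iov_advance(struct iovec *iov, size_t n)
{
    assert(iov != NULL && n <= iov->iov_len);
    /*
     * Standard C does not define arithmetic on void *, so cast to
     * char * to step forward in bytes.
     */
    iov->iov_base = (char *)iov->iov_base + n;
    iov->iov_len -= n;
}
```

For an 8-byte buffer, advancing by 3 leaves `iov_base` pointing at the fourth byte and `iov_len` equal to 5.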
* - Lock access to the buf lists. | jeff | 2002-09-25 | 1 | -0/+4
  - Use vrefcnt() where appropriate.
  - Add some locking asserts.
* Convert old style (type foo *)0 casts to NULLs | dillon | 2002-07-11 | 1 | -9/+9
  PR: kern/40360
  Requested by: Hiten Pandya via direct email
* Remove references to vm_zone.h and switch over to the new uma API. | jeff | 2002-03-20 | 1 | -2/+3
* Avoid passing the variable `tl' to functions that just use it for temporary storage. | iedowse | 2001-12-18 | 1 | -79/+82
  In the old NFS code it wasn't at all clear if the value of `tl' was used across or after macro calls, but I'm fairly confident that the convention was to keep its use local. Each ex-macro function now uses a local version of this variable, so all of the double-indirection goes away.
  The only exception to the `local use' rule for `tl' is nfsm_clget(), which is left unchanged by this commit.
  Reviewed by: peter
* Change the vnode list under the mount point from a LIST to a TAILQ in preparation for an implementation of limiting code for kern.maxvnodes. | dillon | 2001-10-23 | 1 | -2/+2
  MFC after: 3 days
* Make nfsm_dissect() have an obvious return value. | peter | 2001-09-27 | 1 | -26/+28
* Tidy up nfsm_build usage. This is only partially finished. | peter | 2001-09-27 | 1 | -20/+20
* Add a missing dereference level. | iedowse | 2001-09-25 | 1 | -1/+1
  This caused nfsm_postop_attr_xx() to try and extract node attributes from an RPC reply even if none were present.
  Reviewed by: peter
* Oops. Fix a missing indirection level. | peter | 2001-09-20 | 1 | -1/+1
  gcc didn't complain about it on x86, but did complain about it on alpha (since int and pointer are different sizes)
* Cleanup and split of nfs client and server code. | peter | 2001-09-18 | 1 | -1499/+310
  This builds on the top of several repo-copies.
* KSE Milestone 2 | julian | 2001-09-12 | 1 | -17/+17
  Note: ALL MODULES MUST BE RECOMPILED.
  Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
  Sorry john! (your next MFC will be a doozy!)
  Reviewed by: peter@freebsd.org, dillon@freebsd.org
  X-MFC after: ha ha ha ha
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). | dillon | 2001-07-04 | 1 | -2/+2
  Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
* - Protect the mnt_vnode list with the mntvnode lock. | jhb | 2001-06-28 | 1 | -2/+4
  - Use queue(9) macros.
* Introduce a global lock for the vm subsystem (vm_mtx). | alfred | 2001-05-19 | 1 | -0/+2
  vm_mtx does not recurse and is required for most low level vm operations.
  Faults can not be taken without holding Giant.
  Memory subsystems can now call the base page allocators safely.
  Almost all atomic ops were removed as they are covered under the vm mutex.
  Alpha and ia64 now need to catch up to i386's trap handlers.
  FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
  Reviewed (partially) by: jake, jhb
* Revert consequences of changes to mount.h, part 2. | grog | 2001-04-29 | 1 | -2/+0
  Requested by: bde
* Correct #includes to work with fixed sys/mount.h. | grog | 2001-04-23 | 1 | -0/+2
* * Rename M_WAIT mbuf subsystem flag to M_TRYWAIT. | bmilekic | 2000-12-21 | 1 | -14/+14
    This is because calls with M_WAIT (now M_TRYWAIT) may not wait forever when nothing is available for allocation, and may end up returning NULL. Hopefully we now communicate more of the right thing to developers and make it very clear that it's necessary to check whether calls with M_(TRY)WAIT also resulted in a failed allocation.
    M_TRYWAIT basically means "try harder, block if necessary, but don't necessarily wait forever." The time spent blocking is tunable with the kern.ipc.mbuf_wait sysctl.
    M_WAIT is now deprecated but still defined for the next little while.
  * Fix a typo in a comment in mbuf.h
  * Fix some code that was actually passing the mbuf subsystem's M_WAIT to malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the value of the M_WAIT flag, this could have become a big problem.
* In preparation for deprecating CIRCLEQ macros in favor of TAILQ macros which provide the same functionality and are a bit more efficient, convert use of CIRCLEQ's in NFS to TAILQ's. | mckusick | 2000-11-14 | 1 | -1/+1
* Patch to avoid processes getting stuck in "vmopar". | dwmalone | 2000-10-24 | 1 | -3/+12
  From Ian's mail:

    The problem seems to originate with NFS's postop_attr information that is returned with a read or write RPC. Within a vm_fault context, the code cannot deal with vnode_pager_setsize() shrinking a vnode.

    The workaround in the patch below stops the nfsm_postop_attr() macro from ever shrinking a vnode. If the new size in the postop_attr information is smaller, then it just sets the nfsnode n_attrstamp to 0 to stop the wrong size getting used in the future. This change only affects postop_attr attributes; the nfsm_loadattr() macro works as normal.

    The change is implemented by adding a new argument to nfs_loadattrcache() called 'dontshrink'. When this is non-zero, nfs_loadattrcache() will never reduce the vnode/nfsnode size; instead it zeros n_attrstamp.

  There remain other ways processes can get stuck in vmopar.
  Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
  Reviewed by: dillon
  Tested by: Vadim Belman <voland@lflat.org>
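The 'dontshrink' behaviour described in the mail can be modelled roughly as below. This is a sketch under assumptions, not nfs_loadattrcache() itself: `struct node_model`, `loadattr_model()`, and the `now` parameter are hypothetical, and only the rule "never reduce the size, zero n_attrstamp instead" is taken from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the size and timestamp kept per nfsnode. */
struct node_model {
    uint64_t n_size;        /* cached file size */
    time_t   n_attrstamp;   /* when attributes were cached; 0 = invalid */
};

/*
 * Load a new size into the cache. With dontshrink set, a smaller size is
 * not applied; the cache is invalidated so the stale size is not trusted.
 */
void
loadattr_model(struct node_model *np, uint64_t newsize, time_t now,
    int dontshrink)
{
    assert(np != NULL);
    if (dontshrink && newsize < np->n_size) {
        np->n_attrstamp = 0;    /* force a fresh attribute fetch later */
        return;
    }
    np->n_size = newsize;
    np->n_attrstamp = now;
}
```

In this model a post-op attribute reporting a smaller size leaves `n_size` untouched and clears `n_attrstamp`, while growth, or any update with `dontshrink` equal to 0, is applied directly.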
* This patch corrects the first round of panics and hangs reported with the new snapshot code. | mckusick | 2000-07-24 | 1 | -1/+2
  Update addaliasu to correctly implement the semantics of the old checkalias function. When a device vnode first comes into existence, check to see if an anonymous vnode for the same device was created at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than creating a new vnode for the device. This corrects a problem which caused the kernel to panic when taking a snapshot of the root filesystem.

  Change the calling convention of vn_write_suspend_wait() to be the same as vn_start_write().

  Split out softdep_flushworklist() from softdep_flushfiles() so that it can be used to clear the work queue when suspending filesystem operations.

  Access to buffers becomes recursive so that snapshots can recursively traverse their indirect blocks using ffs_copyonwrite() when checking for the need for copy on write when flushing one of their own indirect blocks. This eliminates a deadlock between the syncer daemon and a process taking a snapshot.

  Ensure that softdep_process_worklist() can never block because of a snapshot being taken. This eliminates a problem with buffer starvation.

  Cleanup change in ffs_sync() which did not synchronously wait when MNT_WAIT was specified. The result was an unclean filesystem panic when doing forcible unmount with heavy filesystem I/O in progress.

  Return a zero'ed block when reading a block that was not in use at the time that a snapshot was taken. Normally, these blocks should never be read. However, the readahead code will occasionally read them, which can cause unexpected behavior.

  Clean up the debugging code that ensures that no blocks be written on a filesystem while it is suspended. Snapshots must explicitly label the blocks that they are writing during the suspension so that they do not cause a `write on suspended filesystem' panic.

  Reorganize ffs_copyonwrite() to eliminate a deadlock and also to prevent a race condition that would permit the same block to be copied twice. This change eliminates an unexpected soft updates inconsistency in fsck caused by the double allocation.

  Use bqrelse rather than brelse for buffers that will be needed soon again by the snapshot code. This improves snapshot performance.
* Back out the previous change to the queue(3) interface. | jake | 2000-05-26 | 1 | -1/+1
  It was not discussed and should probably not happen.
  Requested by: msmith and others