path: root/sys/nfsclient
Commit message [Author, Age, Files, Lines]
* Fix a timeout deadlock that can occur when the process holding the receive lock hasn't yet managed to send its own request.

  PR: kern/15055
  Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
  [dillon, 1999-12-13, 1 file, -4/+24]
* Fix a number of server-side issues related to aborting badly formed NFS packets, mainly by initializing to NULL the structure pointers that are conditionally freed prior to return.

  PR: kern/15249
  Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
  [dillon, 1999-12-12, 1 file, -0/+3]
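The pattern behind this fix can be sketched in user-space C. This is a minimal illustration, not the kernel code: `struct request`, `parse_request()` and the packet layout are hypothetical. The point it shows is that pointers freed on a shared error path must start out as NULL, so that an early abort never frees an uninitialized value.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed request; not a kernel structure. */
struct request {
    char *name;
    char *payload;
};

/*
 * Parse a packet laid out as 4 bytes of name followed by a payload.
 * Returns 0 on success, -1 on a malformed packet or allocation failure.
 */
int parse_request(const char *pkt, size_t len, struct request *out)
{
    /* NULL from the start, so the cleanup path knows what was allocated. */
    char *name = NULL;
    char *payload = NULL;

    if (len < 4)
        goto bad;                       /* nothing allocated yet */
    name = malloc(5);
    if (name == NULL)
        goto bad;
    memcpy(name, pkt, 4);
    name[4] = '\0';

    if (len == 4)
        goto bad;                       /* name allocated, payload is not */
    payload = malloc(len - 4 + 1);
    if (payload == NULL)
        goto bad;
    memcpy(payload, pkt + 4, len - 4);
    payload[len - 4] = '\0';

    out->name = name;
    out->payload = payload;
    return 0;

bad:
    /* Conditional frees are only safe because of the NULL initializers. */
    if (name != NULL)
        free(name);
    if (payload != NULL)
        free(payload);
    return -1;
}
```

Without the two initializers, the `len < 4` abort would pass indeterminate pointers to `free()`.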
* Synopsis of problem being fixed: Dan Nelson originally reported that blocks of zeros could wind up in a file written to over NFS by a client. The problem only occurs a few times per several gigabytes of data. This problem turned out to be bug #3 below.

  bug #1: B_CLUSTEROK must be cleared when an NFS buffer is reverted from stage 2 (ready for commit rpc) to stage 1 (ready for write). Reversions can occur when a dirty NFS buffer is redirtied with new data. Otherwise the VFS/BIO system may end up thinking that a stage 1 NFS buffer is clusterable. Stage 1 NFS buffers are not clusterable.

  bug #2: B_CLUSTEROK was inappropriately set for a 'short' NFS buffer (short buffers only occur near the EOF of the file). Change to only set when the buffer is a full biosize (usually 8K). This bug has no effect but should be fixed in -current anyway. It need not be backported.

  bug #3: B_NEEDCOMMIT was inappropriately set in nfs_flush() (which is typically only called by the update daemon). nfs_flush() does a multi-pass loop but due to the lack of vnode locking it is possible for new buffers to be added to the dirtyblkhd list while a flush operation is going on. This may result in nfs_flush() setting B_NEEDCOMMIT on a buffer which has *NOT* yet gone through its stage 1 write, causing only the commit rpc to be made and thus causing the contents of the buffer to be thrown away (never sent to the server).

  The patch also contains some cleanup, which only applies to the commit into -current.

  Reviewed by: dg, julian
  Originally Reported by: Dan Nelson <dnelson@emsphone.com>
  [dillon, 1999-12-12, 3 files, -42/+42]
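The two-stage buffer life cycle described above can be sketched as a tiny state model. Everything here is an assumption made for illustration: the flag values, `struct buf_sketch` and the three helper functions are hypothetical and do not mirror the kernel's definitions. The sketch only encodes the three rules stated in the commit message.

```c
#include <stdbool.h>

/* Hypothetical flag values; the kernel's real values differ. */
#define SK_NEEDCOMMIT 0x1u  /* stage 2: written, waiting for the commit rpc */
#define SK_CLUSTEROK  0x2u  /* buffer may be clustered by VFS/BIO */

struct buf_sketch {
    unsigned flags;
    unsigned bcount;        /* bytes of data in the buffer */
};

/* Stage 1 write finished: the buffer moves to stage 2.
 * Rule from bug #2: only a full biosize buffer is marked clusterable. */
void stage1_write_done(struct buf_sketch *bp, unsigned biosize)
{
    bp->flags |= SK_NEEDCOMMIT;
    if (bp->bcount == biosize)
        bp->flags |= SK_CLUSTEROK;
}

/* New data arrived: the buffer reverts to stage 1.
 * Rule from bug #1: the clusterable mark must go away too. */
void redirty(struct buf_sketch *bp)
{
    bp->flags &= ~(SK_NEEDCOMMIT | SK_CLUSTEROK);
}

/* Rule from bug #3: a flush pass may commit only buffers that already
 * went through their stage 1 write; it must never set the flag itself. */
bool may_commit(const struct buf_sketch *bp)
{
    return (bp->flags & SK_NEEDCOMMIT) != 0;
}
```

In this model a buffer added to the dirty list during a flush has no SK_NEEDCOMMIT bit, so `may_commit()` rejects it and its data still has to be written before any commit.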
* Lock reporting and assertion changes.

  * lockstatus() and VOP_ISLOCKED() get a new process argument and a new return value: LK_EXCLOTHER, when the lock is held exclusively by another process.
  * The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them.
  * Extend the vnode_if.src format to allow more exact specification than locked/unlocked.

  This commit should not make any semantic changes unless you are using DEBUG_VFS_LOCKS.

  Discussed with: grog, mch, peter, phk
  Reviewed by: peter
  [eivind, 1999-12-11, 2 files, -1/+2]
* The symlink implementation could improperly return a NULL vp along with a 0 error code. The problem occurred with NFSv2 mounts and also with any NFSv3 mount returning an EEXIST error (which is translated to 0 prior to return).

  The reply to the rpc only contains the file handle for the no-error case under NFSv3. The error case under NFSv3 and all cases under NFSv2 do *not* return the file handle.

  The fix is to do a secondary lookup to obtain the file handle and thus be able to generate a return vnode for the situations where the rpc reply does not contain the required information.

  The bug was originally introduced when VOP_SYMLINK semantics were changed for -CURRENT. The NFS symlink implementation was not properly modified to go along with the change despite the fact that three people reviewed the code. It took four attempts to get the current fix correct with five people. Is NFS obfuscated? Ha!

  Reviewed by: Alfred Perlstein <bright@wintelcom.net>
  Testing and Discussion: "Viren R.Shah" <viren@rstcorp.com>, Eivind Eklund <eivind@FreeBSD.ORG>, Ian Dowse <iedowse@maths.tcd.ie>
  [dillon, 1999-11-30, 1 file, -3/+33]
* Remap the error EEXIST => 0 *before* using error to determine if we should return a vp.

  [eivind, 1999-11-27, 1 file, -7/+9]
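The ordering issue is small enough to show directly. The helper below is a hypothetical user-space sketch (the name `should_return_vp()` is invented): the remap happens first, so the decision about handing back a vnode is made on the final error value.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Remap EEXIST to success, then decide whether a vnode should be
 * returned. Doing the test before the remap would wrongly skip the
 * vnode for the EEXIST case even though the caller sees error 0.
 */
bool should_return_vp(int *error)
{
    if (*error == EEXIST)
        *error = 0;
    return *error == 0;
}
```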
* nm_srtt and nm_sdrtt are arrays[4]. Remove explicit initialization of element [4] in both, which goes beyond the end of the array, leaving [0], [1], [2], and [3]. This bug did not cause any problems since the overrun fields are initialized after the bogus array init but needs to be fixed anyway.

  Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
  [dillon, 1999-11-22, 1 file, -3/+3]
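A four-element array has valid indices 0 through 3 only. The sketch below shows one way to make that bound hard to get wrong by looping over a named length instead of writing each index by hand. `struct mount_sketch`, `SK_NTIMERS` and the `after` field are hypothetical stand-ins, not the real nfsmount layout.

```c
#include <stddef.h>

#define SK_NTIMERS 4        /* hypothetical name for the array length */

struct mount_sketch {
    int srtt[SK_NTIMERS];
    int sdrtt[SK_NTIMERS];
    int after;              /* stands in for whatever follows the arrays */
};

/* Initialize every valid element, [0] through [SK_NTIMERS - 1]. */
void init_timers(struct mount_sketch *nmp, int srtt, int sdrtt)
{
    for (size_t i = 0; i < SK_NTIMERS; i++) {
        nmp->srtt[i] = srtt;
        nmp->sdrtt[i] = sdrtt;
    }
}
```

Writing to index [4] of either array would land in whatever storage follows it, which is why the original bug was harmless only by luck of initialization order.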
* Fix VOP_MKNOD for loss of WILLRELE. I don't know how I could have missed this in the first place :-(

  Noticed by: bde
  [eivind, 1999-11-20, 1 file, -7/+1]
* Remove WILLRELE from VOP_SYMLINK

  Note: Previous commit to these files (except coda_vnops and devfs_vnops) that claimed to remove WILLRELE from VOP_RENAME actually removed it from VOP_MKNOD.

  [eivind, 1999-11-13, 1 file, -1/+3]
* Remove special case socket sharing code in order to allow nfsd to bind IP addresses to udp/cltp sockets separately.

  PR: kern/13049
  Reviewed by: David Malone <dwmalone@maths.tcd.ie>, freebsd-current
  [dillon, 1999-11-11, 1 file, -12/+12]
* Fix nfssvc_addsock() to not attempt to free a NULL socket structure when returning an error. Bug fix was extracted from the PR. The PR is not yet entirely resolved by this commit.

  PR: kern/13049
  Reviewed by: Matt Dillon <dillon@freebsd.org>
  Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
  [dillon, 1999-11-08, 1 file, -3/+6]
* Call bootpc_init before we try to mount an NFS root, if we're configured to use BOOTP for NFS root discovery.

  The entire interface setup inside nfs_mountroot is evil, and should die.

  [msmith, 1999-11-01, 1 file, -0/+6]
* useracc() the prequel:

  Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.

  This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.

  [phk, 1999-10-29, 1 file, -1/+0]
* Move NFS access cache hits/misses into nfsstats structure so /usr/bin/nfsstat can get to it easily.

  [dillon, 1999-10-25, 4 files, -6/+13]
* Before we start to mess with the VFS name-cache clean things up a little bit: Isolate the namecache in its own file, and give it a dedicated malloc type.

  [phk, 1999-10-03, 2 files, -6/+0]
* Careless use of struct proc *p caused major problems. 'p' is allowed to be NULL in this function (nfs_sigintr). Reorder the statements and guard them all with a single if (p != NULL).

  reported, reviewed and tested by: jdp
  [marcel, 1999-09-29, 1 file, -4/+8]
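The shape of the guard can be sketched as follows. This is a user-space illustration under stated assumptions: `struct proc_sketch` and its two bitmask fields are hypothetical stand-ins for the kernel's process structure, and `sigintr_sketch()` is not the real nfs_sigintr(). It shows every dereference of `p` sitting inside one NULL check.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct proc. */
struct proc_sketch {
    unsigned pending;       /* bitmask of pending signals */
    unsigned masked;        /* bitmask of blocked signals */
};

/*
 * Return EINTR if an unblocked signal is pending, 0 otherwise.
 * p may legitimately be NULL, so all uses of p live under one guard.
 */
int sigintr_sketch(const struct proc_sketch *p)
{
    if (p != NULL) {
        unsigned deliverable = p->pending & ~p->masked;

        if (deliverable != 0)
            return EINTR;
    }
    return 0;
}
```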
* sigset_t change (part 2 of 5)
  -----------------------------

  The core of the signalling code has been rewritten to operate on the new sigset_t. No methodological changes have been made. Most references to a sigset_t object are through macros (see signalvar.h) to create a level of abstraction and to provide a basis for further improvements.

  The NSIG constant has not been changed to reflect the maximum number of signals possible. The reason is that it breaks programs (especially shells) which assume that all signals have a non-null name in sys_signame. See src/bin/sh/trap.c for an example. Instead _SIG_MAXSIG has been introduced to hold the maximum signal possible with the new sigset_t.

  struct sigprop has been moved from signalvar.h to kern_sig.c because a) it is only used there, and b) access must be done through function sigprop(). The latter because the table doesn't hold properties for all signals, but only for the first NSIG signals.

  signal.h has been reorganized to make reading easier and to add the new and/or modified structures. The "old" structures are moved to signalvar.h to prevent namespace pollution.

  Especially the coda filesystem suffers from the change, because it contained lines like (p->p_sigmask == SIGIO), which is easy to do for integral types, but not for compound types.

  NOTE: kdump (and port linux_kdump) must be recompiled.

  Thanks to Garrett Wollman and Daniel Eischen for pressing the importance of changing sigreturn as well.

  [marcel, 1999-09-29, 4 files, -9/+18]
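The point about compound types can be illustrated with a small self-contained model. The type and macro names below (`sk_sigset_t`, `SK_ADDSET`, `SK_ISMEMBER`, `sk_seteq`) are invented for this sketch and are not the macros from signalvar.h; the four-word layout is likewise an assumption. What the sketch shows is that a struct-based signal set cannot be compared with `==`, so membership and equality go through accessors.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_WORDS  4                     /* assumed number of 32-bit words */
#define SK_MAXSIG (SK_WORDS * 32)       /* highest signal number representable */

typedef struct {
    uint32_t bits[SK_WORDS];
} sk_sigset_t;

/* Signals are numbered from 1, so signal n lives at bit index n - 1. */
#define SK_WORD(sig) (((sig) - 1) >> 5)
#define SK_BIT(sig)  (UINT32_C(1) << (((sig) - 1) & 31))

#define SK_ADDSET(set, sig)   ((set).bits[SK_WORD(sig)] |= SK_BIT(sig))
#define SK_ISMEMBER(set, sig) (((set).bits[SK_WORD(sig)] & SK_BIT(sig)) != 0)

/* Whole-set equality, word by word; "a == b" does not compile for structs. */
static inline bool sk_seteq(const sk_sigset_t *a, const sk_sigset_t *b)
{
    for (int i = 0; i < SK_WORDS; i++) {
        if (a->bits[i] != b->bits[i])
            return false;
    }
    return true;
}
```

Code that used to test a mask against a single signal number would, in this model, call `SK_ISMEMBER(mask, sig)` instead.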
* Add comment to clarify a commit rpc optimization already being performed.

  [dillon, 1999-09-20, 1 file, -0/+8]
* Asynchronized client-side nfs_commit. NFS commit operations were previously issued synchronously even if async daemons (nfsiod's) were available. The commit has been moved from the strategy code to the doio code in order to asynchronize it.

  Removed use of lastr in preparation for removal of vnode->v_lastr. It has been replaced with seqcount, which is already supported by the system and, in fact, gives us a better heuristic for sequential detection than lastr ever did.

  Made major performance improvements to the server side commit. The server previously fsync'd the entire file for each commit rpc. The server now bawrite()s only those buffers related to the offset/size specified in the commit rpc.

  Note that we do not commit the meta-data yet. This work still needs to be done.

  Note that a further optimization can be done (and has not yet been done) on the client: we can merge multiple potential commit rpc's into a single rpc with a greater file offset/size range and greatly reduce rpc traffic.

  Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>
  [dillon, 1999-09-17, 5 files, -14/+53]
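The client-side merge mentioned at the end (explicitly not implemented by this commit) could look roughly like the sketch below. It is only an illustration of the idea: `struct commit_range` and `merge_commit()` are hypothetical, and the sketch assumes that committing a slightly larger range than strictly needed is acceptable, since any gap between the two inputs is covered as well.

```c
#include <stdint.h>

/* Hypothetical description of one pending commit rpc. */
struct commit_range {
    uint64_t off;           /* starting file offset */
    uint64_t len;           /* number of bytes to commit */
};

/* Smallest single range that covers both a and b. */
struct commit_range merge_commit(struct commit_range a, struct commit_range b)
{
    uint64_t start = a.off < b.off ? a.off : b.off;
    uint64_t a_end = a.off + a.len;
    uint64_t b_end = b.off + b.len;
    uint64_t end = a_end > b_end ? a_end : b_end;
    struct commit_range merged = { start, end - start };

    return merged;
}
```

Two adjacent 8K commits at offsets 0 and 8192 collapse into one commit of 16384 bytes at offset 0, which is one rpc instead of two.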
* Separate the export check in VFS_FHTOVP, exports are now checked via VFS_CHECKEXP.

  Add fh(open|stat|statfs) syscalls to allow userland to query filesystems based on (network) filehandle.

  Obtained from: NetBSD
  [alfred, 1999-09-11, 3 files, -42/+5]
* All unimplemented VFS ops now have entries in kern/vfs_default.c that return reasonable defaults.

  This avoids confusing and ugly casting to eopnotsupp or making dummy functions. Bogus casting of filesystem sysctls to eopnotsupp() has been removed.

  This should make *_vfsops.c more readable and reduce bloat.

  Reviewed by: msmith, eivind
  Approved by: phk
  Tested by: Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>
  [alfred, 1999-09-07, 1 file, -89/+5]
* remove unused variables.

  [phk, 1999-08-28, 1 file, -1/+0]
* $Id$ -> $FreeBSD$

  [peter, 1999-08-28, 17 files, -17/+17]
* Simplify the handling of VCHR and VBLK vnodes using the new dev_t:

  Make the alias list a SLIST.

  Drop the "fast recycling" optimization of vnodes (including the returning of a preexisting but stale vnode from checkalias). It doesn't buy us anything now that we don't hardlimit vnodes anymore.

  Rename checkalias2() and checkalias() to addalias() and addaliasu(), which take a dev_t and a udev_t argument respectively.

  Make the revoke syscalls use vcount() instead of VALIASED.

  Remove VALIASED flag, we don't need it now and it is faster to traverse the much shorter lists than to maintain the flag.

  vfs_mountedon() can check the dev_t directly, all the vnodes point to the same one.

  Print the devicename in specfs/vprint().

  Remove a couple of stale LFS vnode flags.

  Remove unimplemented/unused LK_DRAINED.

  [phk, 1999-08-26, 1 file, -21/+2]
* Convert all the nfs macros to do { blah } while (0) to ensure it works correctly in if/else etc. egcs had probably picked up most of the problems here before with "ambiguous braces" etc, but this should increase the robustness a bit. Based on an idea from Eivind Eklund.

  [peter, 1999-08-19, 1 file, -111/+168]
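The reason for the do { } while (0) wrapper is easy to see with a two-statement macro. The macros and function below are made up for illustration and are not taken from the NFS headers. A bare two-statement macro leaves its second statement outside an unbraced `if`, and a brace-only macro followed by `;` breaks a following `else`; the wrapped form behaves like a single statement in both cases.

```c
/* Unsafe: expands to two statements, so an unbraced "if" guards only
 * the first one (and a following "else" will not even compile). */
#define RESET_PAIR_BAD(a, b)  (a) = 0; (b) = 0

/* Safe: the whole body is one statement that still expects a ';'. */
#define RESET_PAIR(a, b) do { (a) = 0; (b) = 0; } while (0)

/* Uses the macro in exactly the position that trips up the unsafe form. */
int reset_if(int cond, int a, int b)
{
    if (cond)
        RESET_PAIR(a, b);
    else
        a = -1;
    return a + b;
}
```

Replacing `RESET_PAIR` with `RESET_PAIR_BAD` in `reset_if()` produces a compile error at the `else`, which is the kind of problem the commit message says the compiler had already been flagging.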
* Add the (inline) function vm_page_undirty for clearing the dirty bitmask of a vm_page. Use it.

  Submitted by: dillon
  [alc, 1999-08-17, 1 file, -3/+3]
* nfs_getcacheblk() can return 0 if the mount is interruptible. It needs to be checked by the caller.

  Broken in: rev. 1.70 (1999/05/02)
  [dt, 1999-08-12, 1 file, -1/+5]
* Decommission miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>, a few lines into <sys/vnode.h>.

  Add a few fields to struct specinfo, paving the way for the fun part.

  [phk, 1999-08-08, 2 files, -5/+4]
* Don't over-allocate and over-copy shorter NFSv2 filehandles and then correct the pointers afterwards.

  It's kinda bogus that we generate a 24 (?) byte filehandle (2 x int32 fsid and 16 byte VFS fhandle) and pad it out to 64 bytes for NFSv3 with garbage. The whole point of NFSv3's variable filehandle length was to allow for shorter handles, both in memory and over the wire. I plan on taking a shot at fixing this shortly.

  [peter, 1999-08-04, 1 file, -10/+11]
* As described by the submitter:

  I did some tcpdumping the other day and noticed that GETATTR calls were frequently followed by an ACCESS call to the same file. The attached patch changes nfs_getattr to fill the access cache as a side effect. This is accomplished by calling ACCESS rather than GETATTR. This implies a modest overhead of 4 bytes in the request and 8 bytes in the response compared to doing a vanilla GETATTR.

  ...

  [The patch comprises two parts] The first is the "real" patch, the second counts misses and hits rather than fills and hits. The difference is subtle but important because both nfs_getattr and nfs_access now fill the cache. It also changes the default value of nfsaccess_cache_timeout to better match the attribute cache. IMHO, file timestamps change much more frequently than protection bits.

  Submitted by: Bjoern Groenvall <bg@sics.se>
  Reviewed by: dillon (partially)
  [msmith, 1999-07-31, 1 file, -34/+55]
* Close PR #12651: the hash calculation routine has changed in other parts of the kernel but was not updated in nfs_readdirplusrpc().

  [wpaul, 1999-07-30, 1 file, -2/+2]
* Fix two bugs in nfs_readdirplus(). The first is that in some cases, vnodes are locked and never unlocked, which leads to processes starting to wedge up after doing a mount -o nfsv3,tcp,rdirplus foo:/fs /fs; ls /fs. The second is that sometimes cnp is accessed without having been properly initialized: cnp->cn_nameptr points to an earlier name while "len" contains the length of a current name of different size. This leads to an attempt to dereference *(cnp->cn_nameptr + len) which will sometimes cause a page fault and a panic.

  With these two fixes, client side readdirplus works correctly with FreeBSD, IRIX 6.5.4 and Solaris 2.5.1 and 2.6 servers.

  Submitted by: Matthew Dillon <dillon@backplane.com>
  [wpaul, 1999-07-30, 1 file, -3/+6]
* I have not one single time remembered the name of this function correctly so obviously I gave it the wrong name. s/umakedev/makeudev/g

  [phk, 1999-07-17, 1 file, -2/+2]
* Fix warning. va_fsid is udev_t, which is int32_t. No need to use %lx.

  [peter, 1999-07-01, 1 file, -2/+2]
* Submitted by: Conrad Minshall <conrad@apple.com>
  Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>

  The following ugly hack to the exit path of nfs_readlinkrpc() circumvents an Auspex bug: for symlinks longer than 112 (0x70) they return a 1024 byte xdr string - the correct data with many nulls appended. Without this fix namei returns ENAMETOOLONG, at least it does on our source base and on FreeBSD 3.0. Note we do not (and should not) rely upon their null padding.

  [julian, 1999-06-30, 1 file, -1/+6]
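The commit message does not spell out how the length is corrected, only that the null padding is not relied upon. One way to do that, shown here purely as an assumption, is to trust a link size already known from the file attributes whenever the reply comes back at the maximum length. `SK_MAXPATHLEN` and `readlink_len()` are hypothetical names for this sketch.

```c
#include <stddef.h>

#define SK_MAXPATHLEN 1024  /* hypothetical stand-in for the path limit */

/*
 * Pick the symlink length to hand on to the caller.
 * xdr_len:   length of the string in the rpc reply
 * attr_size: link size from cached attributes, 0 if unknown
 */
size_t readlink_len(size_t xdr_len, size_t attr_size)
{
    /* A maximum-length reply is suspicious; prefer a known, shorter size. */
    if (xdr_len == SK_MAXPATHLEN && attr_size != 0 &&
        attr_size < SK_MAXPATHLEN)
        return attr_size;
    return xdr_len;
}
```

Because the decision uses only the two lengths, it never inspects the padding bytes themselves.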
* Fix a KASSERT() that was negated and led to:
  nfs_strategy: buffer 0xxxxx not locked
  when you attempted to write and had INVARIANTS turned on.

  [peter, 1999-06-28, 1 file, -2/+2]
* Minor tweaks to make sure (new) prerequisites for <sys/buf.h> (mostly splbio()/splx()) are #included in time.

  [peter, 1999-06-27, 1 file, -3/+3]
* Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.

  [mckusick, 1999-06-26, 3 files, -25/+33]
* Matt's NFS fixes.

  Submitted by: Matt Dillon
  Reviewed by: David Cross, Julian Elischer, Mike Smith, Drew Gallatin

  3.2 version to follow when tested

  [julian, 1999-06-23, 3 files, -42/+96]
* Thanks to Bruce for noticing this.... compare against the *new* nfsnode's mount point for seeing whether or not the new nfsnode is already in the hash queue. We're pretty much guaranteed that the old nfsnode is already in the hash queue. Wank! Infinite Loop! Looks like just a minor typo.... (ah the influence of fortran ... np && np2... why not nfsnode_the_first && nfsnode_the_second???)...

  [mjacob, 1999-06-19, 1 file, -2/+2]
* Add a vnode argument to VOP_BWRITE to get rid of the last vnode operator special case. Delete special case code from vnode_if.sh, vnode_if.src, umap_vnops.c, and null_vnops.c.

  [mckusick, 1999-06-16, 2 files, -5/+5]
* If we retry this operation from the top of this routine, we need to make sure we've freed any allocated resources (to avoid a memory leak) and do the right thing with respect to the nfs node hash lock we'd acquired.

  [mjacob, 1999-06-15, 1 file, -1/+5]
* Various changes lifted from the OpenBSD cvs tree:

  txdr_hyper and fxdr_hyper tweaks to avoid excessive CPU order knowledge.

  nfs_serv.c: don't call nfsm_adj() with negative values, windows clients could crash servers when doing a readdir of a large directory.

  nfs_socket.c: Use IP_PORTRANGE to get a privileged port without a spin loop trying to bind(). Don't clobber a mbuf pointer or we get panics on a NFS3ERR_JUKEBOX error from a server when reusing a freed mbuf.

  nfs_subs.c: Don't lose st_blocks on NFSv2 mounts when > 2GB.

  Obtained from: OpenBSD
  [peter, 1999-06-05, 5 files, -34/+54]
* Fix a malloc race

  Obtained from: OpenBSD (csapuntz)
  [peter, 1999-06-05, 1 file, -3/+12]
* Don't mistake a non-async block that needs to be committed for an interrupted write.

  Obtained from: fvdl@NetBSD.org via OpenBSD.
  [peter, 1999-06-05, 1 file, -2/+2]
* Divorce "dev_t" from the "major|minor" bitmap, which is now called udev_t in the kernel but still called dev_t in userland.

  Provide functions to manipulate both types:
        major()         umajor()
        minor()         uminor()
        makedev()       umakedev()
        dev2udev()      udev2dev()

  For now they're functions, they will become in-line functions after one of the next two steps in this process.

  Return major/minor/makedev to macro-hood for userland.

  Register a name in cdevsw[] for the "filedescriptor" driver.

  In the kernel the udev_t appears in places where we have the major/minor number combination, (ie: a potential device: we may not have the driver nor the device), like in inodes, vattr, cdevsw registration and so on, whereas the dev_t appears where we carry around a reference to an actual device.

  In the future the cdevsw and the aliased-from vnode will be hung directly from the dev_t, along with up to two softc pointers for the device driver and a few housekeeping bits. This will essentially replace the current "alias" check code (same buck, bigger bang).

  A little stunt has been provided to try to catch places where the wrong type is being used (dev_t vs udev_t), if you see something not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if it makes a difference. If it does, please try to track it down (many hands make light work) or at least try to reproduce it as simply as possible, and describe how to do that.

  Without DEVT_FASCIST I believe this patch is a no-op.

  Stylistic/posixoid comments about the userland view of the <sys/*.h> files welcome now, from userland they now contain the end result.

  Next planned step: make all dev_t's refer to the same devsw[] which means convert BLK's to CHR's at the perimeter of the vnodes and other places where they enter the game (bootdev, mknod, sysctl).

  [phk, 1999-05-11, 2 files, -9/+9]
* remove b_proc from struct buf, it's (now) unused.

  Reviewed by: dillon, bde
  [phk, 1999-05-06, 5 files, -23/+22]
* Add sufficient braces to keep egcs happy about potentially ambiguous if/else nesting.

  [peter, 1999-05-06, 1 file, -2/+3]
* All directory accesses must be made with NFS_DIRBLKSIZE chunks to avoid confusing the directory read cookie cache. The nfs_access implementation for v2 mounts attempts to read from the directory if root is the user so that root can't access cached files when the server remaps root to some other user.

  Submitted by: Doug Rabson <dfr@nlsystems.com>
  Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>
  [alc, 1999-05-03, 1 file, -3/+3]
* The VFS/BIO subsystem contained a number of hacks in order to optimize piecemeal, middle-of-file writes for NFS. These hacks have caused no end of trouble, especially when combined with mmap(). I've removed them. Instead, NFS will issue a read-before-write to fully instantiate the struct buf containing the write. NFS does, however, optimize piecemeal appends to files. For most common file operations, you will not notice the difference. The sole remaining fragment in the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache coherency issues with read-merge-write style operations. NFS also optimizes the write-covers-entire-buffer case by avoiding the read-before-write. There is quite a bit of room for further optimization in these areas.

  The VM system marks pages fully-valid (AKA vm_page_t->valid = VM_PAGE_BITS_ALL) in several places, most notably in vm_fault. This is not correct operation. The vm_pager_get_pages() code is now responsible for marking VM pages all-valid. A number of VM helper routines have been added to aid in zeroing-out the invalid portions of a VM page prior to the page being marked all-valid. This operation is necessary to properly support mmap(). The zeroing occurs most often when dealing with file-EOF situations. Several bugs have been fixed in the NFS subsystem, including bits handling file and directory EOF situations and buf->b_flags consistency issues relating to clearing B_ERROR & B_INVAL, and handling B_DONE.

  getblk() and allocbuf() have been rewritten. B_CACHE operation is now formally defined in comments and more straightforward in implementation. B_CACHE for VMIO buffers is based on the validity of the backing store. B_CACHE for non-VMIO buffers is based simply on whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear, and vice versa). biodone() is now responsible for setting B_CACHE when a successful read completes. B_CACHE is also set when a bdwrite() is initiated and when a bwrite() is initiated. VFS VOP_BWRITE routines (there are only two - nfs_bwrite() and bwrite()) are now expected to set B_CACHE. This means that bowrite() and bawrite() also set B_CACHE indirectly.

  There are a number of places in the code which were previously using buf->b_bufsize (which is DEV_BSIZE aligned) when they should have been using buf->b_bcount. These have been fixed. getblk() now clears B_DONE on return because the rest of the system is so bad about dealing with B_DONE.

  Major fixes to NFS/TCP have been made. A server-side bug could cause requests to be lost by the server due to nfs_realign() overwriting other rpc's in the same TCP mbuf chain. The server's kernel must be recompiled to get the benefit of the fixes.

  Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
  [alc, 1999-05-02, 6 files, -282/+320]