path: root/sys/nfsserver/nfs_serv.c
Commit message  [Author, Age, Files, Lines]
* When grabbing vnodes to service NFS requests, make sure to call  [phk, 2003-10-24, 1 file, -81/+11]
| vn_start_write() early to avoid snapshot deadlocks.
| By: mckusick
* Fix a bug in nfsrv_read() that caused the replies to certain NFSv3  [iedowse, 2003-06-24, 1 file, -1/+1]
| short read operations at the end of a file to not have the "eof" flag set as they should. The problem is that the requested read count was compared against the rounded-up reply data length instead of the actual reply data length.
| This bug appears to have been introduced in revision 1.78 (June 1999). It causes first-time reads of certain file sizes (e.g. 4094 bytes) to fail with EIO on a RedHat 9.0 NFSv3 client.
| MFC after: 1 week
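The comparison error described in this commit message can be illustrated with a minimal sketch. The names `xdr_roundup`, `eof_buggy`, and `eof_fixed` are hypothetical and are not taken from nfs_serv.c; the sketch only assumes that reply data is padded to a 4-byte boundary.

```c
#include <stdbool.h>
#include <stddef.h>

/* Round a byte count up to the next multiple of 4 (assumed padding rule). */
static size_t
xdr_roundup(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Buggy variant: compares the request against the padded reply length. */
static bool
eof_buggy(size_t requested, size_t actual)
{
	return xdr_roundup(actual) < requested;
}

/* Fixed variant: compares the request against the bytes actually read. */
static bool
eof_fixed(size_t requested, size_t actual)
{
	return actual < requested;
}
```

For a 4094-byte file read with a 4096-byte request, the padded length equals the request, so the buggy check never reports eof while the fixed check does.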
* Increase the size of the NFS server hash table to improve performance  [mckusick, 2003-06-21, 1 file, -4/+4]
| when serving up more than about 32 active files. For details see section 6.3 (pg 111) of Daniel Ellard and Margo Seltzer, ``NFS Tricks and Benchmarking Traps'' in the Proceedings of the Usenix 2003 Freenix Track, June 9-14, 2003 pg 101-114.
| Obtained from: Daniel Ellard <ellard@eecs.harvard.edu>
| Sponsored by: DARPA & NAI Labs.
* Beat vnode locking in the NFS server code into submission. This change  [truckman, 2003-05-25, 1 file, -112/+186]
| is not pretty, but it fixes the code so that it no longer violates the vnode locking rules in the VFS API and doesn't trip any of the locking assertions enabled by the DEBUG_VFS_LOCKS kernel configuration option. There is one report that this patch fixed a "locking against myself" panic on an NFS server that was tripped by a diskless client.
| Approved by: re (scottl)
* - Acquire the vm_object's lock when performing vm_object_page_clean().  [alc, 2003-04-24, 1 file, -0/+4]
| - Add a parameter to vm_pageout_flush() that tells vm_pageout_flush() whether its caller has locked the vm_object. (This is a temporary measure to bootstrap vm_object locking.)
* - Lock bufs before inspecting their flags.  [jeff, 2003-03-13, 1 file, -6/+9]
|
* - Add an interlock argument to BUF_LOCK and BUF_TIMELOCK.  [jeff, 2003-02-25, 1 file, -3/+7]
| - Remove the buftimelock mutex and acquire the buf's interlock to protect these fields instead.
| - Hold the vnode interlock while locking bufs on the clean/dirty queues. This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another BUF_LOCK with a LK_TIMEFAIL to a single lock.
| Reviewed by: arch, mckusick
* Back out M_* changes, per decision of the TRB.  [imp, 2003-02-19, 1 file, -10/+10]
| Approved by: trb
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.  [alfred, 2003-01-21, 1 file, -10/+10]
| Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
* Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,  [schweikh, 2003-01-01, 1 file, -2/+2]
| especially in troff files.
* Abstract-out the constants for the sequential heuristic.  [dillon, 2002-12-28, 1 file, -3/+3]
| No operational changes.
| MFC after: 1 day
* In the NFSv3 `fsinfo' procedure reply, don't claim that we support  [iedowse, 2002-12-05, 1 file, -2/+2]
| 32k read and write operations on datagram sockets when in fact we reject requests larger than 16k. It must be the case that virtually all clients use data sizes of 16k or less for UDP transport (FreeBSD's client defaults to 8k and never exceeds 16k), as this bug has been present ever since NFSv3 support was added.
| Reported by: Senthil <lihtnes78@netscape.net>
| Reviewed by: dillon
| Approved by: re
| MFC-after: 1 week
* - Introduce a new macro, since that's what nfs loves, called  [jeff, 2002-10-31, 1 file, -2/+2]
| nfsm_srvpathsiz. This macro plucks a length out of an rpc request and verifies that its size does not exceed NFS_MAXPATHLEN. If it does it generates an ENAMETOOLONG response.
| - Use this macro, and the existing nfsm_srvnamsiz macro in two places where we deal with paths passed in by the client. This fixes a Linux interoperability bug. Linux was sending oversized path components which would cause us to ignore the request altogether. This causes Linux to hang indefinitely while it waits for a response. This could still happen in other cases where we error out with EBADRPC.
| Sponsored by: Isilon Systems, Inc.
| Reviewed by: alfred, fabbri@isilon.com, neal@isilon.com
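The length check this commit message describes might look roughly like the sketch below. `check_path_len` and `SKETCH_MAXPATHLEN` are hypothetical names with an assumed limit; this is not the nfsm_srvpathsiz macro itself, and only the upper bound is checked here.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed limit for illustration; the real value lives in the NFS headers. */
#define SKETCH_MAXPATHLEN 1024

/*
 * Return 0 if a client-supplied path length is within the limit, or
 * ENAMETOOLONG so the caller can build an error reply rather than
 * silently ignoring the request.
 */
static int
check_path_len(size_t len)
{
	if (len > SKETCH_MAXPATHLEN)
		return ENAMETOOLONG;
	return 0;
}
```

Returning a specific error, instead of dropping the request, is what lets a client stop waiting for a response.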
* Correct a problem wherein NFS servers running NFSv2 would not return  [rwatson, 2002-10-03, 1 file, -3/+2]
| certain classes of failure responses to the client during a failed remove operation.
| Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
* - Use incore() instead of gbincore() so we don't have to acquire the  [jeff, 2002-09-25, 1 file, -1/+1]
| vnode interlock.
* - Replace v_flag with v_iflag and v_vflag  [jeff, 2002-08-04, 1 file, -3/+3]
| - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed.
| - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc.
| - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's.
| - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear.
| - Many functions in vfs_subr.c were restructured to provide for stronger locking.
| Idea stolen from: BSD/OS
* Convert old style (type foo *)0 casts to NULLs  [dillon, 2002-07-11, 1 file, -19/+19]
| PR: kern/40360
| Requested by: Hiten Pandya via direct email
* Replace the global buffer hash table with per-vnode splay trees using a  [dillon, 2002-07-10, 1 file, -1/+7]
| methodology similar to the vm_map_entry splay and the VM splay that Alan Cox is working on. Extensive testing has appeared to have shown no increase in overhead.
| Disadvantages
|   Dirties more cache lines during lookups.
|   Not as fast as a hash table lookup (but still N log N and optimal when there is locality of reference).
| Advantages
|   vnode->v_dirtyblkhd is now perfectly sorted, making fsync/sync/filesystem syncer operate more efficiently.
|   I get to rip out all the old hacks (some of which were mine) that tried to keep the v_dirtyblkhd tailq sorted.
|   The per-vnode splay tree should be easier to lock / SMPng pushdown on vnodes will be easier.
|   This commit along with another that Alan is working on for the VM page global hash table will allow me to implement ranged fsync(), optimize server-side nfs commit rpcs, and implement partial syncs by the filesystem syncer (aka filesystem syncer would detect that someone is trying to get the vnode lock, remembers its place, and skip to the next vnode).
| Note that the buffer cache splay is somewhat more complex than other splays due to special handling of background bitmap writes (multiple buffers with the same lblkno in the same vnode), and B_INVAL discontinuities between the old hash table and the existence of the buffer on the v_cleanblkhd list.
| Suggested by: alc
* More s/file system/filesystem/g  [trhodes, 2002-05-16, 1 file, -4/+4]
|
* Limit to the maximum allowed reply size the amount of data that  [iedowse, 2002-04-21, 1 file, -0/+4]
| nfsrv_readdir and nfsrv_readdirplus can return. A client request containing an over-large `count' field could trigger the "Bad nfs svc reply" panic in nfs_syscalls.c.
| Spotted while trying to reproduce kern/37304, which turned out to be fixed in FreeBSD a long time ago.
| MFC after: 1 week
* Change the suser() API to take advantage of td_ucred as well as do a  [jhb, 2002-04-01, 1 file, -2/+2]
| general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag.
| Discussed on: smp@
* Add a flags parameter to VFS_VGET to pass through the desired  [mckusick, 2002-03-17, 1 file, -2/+4]
| locking flags when acquiring a vnode. The immediate purpose is to allow polling lock requests (LK_NOWAIT) needed by soft updates to avoid deadlock when enlisting other processes to help with the background cleanup. For the future it will allow the use of shared locks for read access to vnodes. This change touches a lot of files as it affects most filesystems within the system. It has been well tested on FFS, loopback, and CD-ROM filesystems, but only lightly on the others, so if you find a problem there, please let me (mckusick@mckusick.com) know.
* Simple p_ucred -> td_ucred changes to start using the per-thread ucred  [jhb, 2002-02-27, 1 file, -1/+1]
| reference.
* The vnode was not being vput()'d in the EEXIST mknod case on the nfs  [dillon, 2002-01-14, 1 file, -0/+2]
| server side. This can lead to a system deadlock.
| Reviewed by: iedowse
| Tested by: Alexey G Misurenko <mag@caravan.ru>, iedowse
| Bug found with help by: Alexey G Misurenko <mag@caravan.ru>
| MFC at: earliest convenience
* It is required by VOP_CREATE, VOP_MKNOD, VOP_SYMLINK and VOP_MKDIR  [iedowse, 2002-01-13, 1 file, -3/+9]
| that va_mode of the supplied attributes is filled in with a valid file mode (i.e. not VNOVAL, and only ALLPERM bits set). However, some NFS server op functions didn't guarantee this for all possible request messages: If a V3 client chose not to include a mode specification, we could end up creating an ffs inode with mode 0177777, requiring a manual fsck on the next reboot. Fix this by setting va_mode to 0 before calling the VOP if a mode hasn't been supplied by the client.
| In nfsrv_symlink(), S_IFMT bits supplied by a V2 client could end up in the va_mode passed to VOP_SYMLINK with similar effects. We now use the macro nfstov_mode() to correctly mask the bits.
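The rule stated in this commit message (no "no value" marker, only permission bits) can be sketched as follows. `sanitize_mode`, `SKETCH_VNOVAL`, and `SKETCH_ALLPERM` are hypothetical names with assumed values; this is not the nfstov_mode() macro or the server code itself.

```c
#include <stdint.h>

#define SKETCH_VNOVAL  (-1)   /* assumed "attribute not supplied" marker */
#define SKETCH_ALLPERM 07777  /* assumed mask: rwx bits plus setuid/setgid/sticky */

/*
 * Produce a mode that is safe to hand to a create-style operation:
 * an unsupplied mode becomes 0, and file-type bits are masked off.
 */
static uint16_t
sanitize_mode(int32_t client_mode)
{
	if (client_mode == SKETCH_VNOVAL)
		return 0;
	return (uint16_t)(client_mode & SKETCH_ALLPERM);
}
```

Without the first check, truncating an all-ones marker to 16 bits would give 0177777, the bad inode mode mentioned above.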
* Fix a few NFSv2 issues that slipped in during the big cleanup. The  [iedowse, 2002-01-12, 1 file, -33/+28]
| semantics of the nfsm_reply() macro were changed so that the caller has to explicitly handle the V2 error case, whereas before, nfsm_reply() did a `goto nfsmout' then. A few server ops (setattr, readlink, create, mkdir) weren't updated to match, so errors in the V2 case could cause protocol hangs and leaked mbufs. Correct some comments that describe the old nfsm_reply behaviour.
| [older, harmless nit] Remove the unnecessary `nfsmreply0' label in nfsrv_create(), since for its users, the main `ereply' label does the same thing.
* Rename some variables that end up shadowing their namesakes in the NFS client  [msmith, 2002-01-08, 1 file, -24/+24]
| code.
| Reviewed by: peter
* Avoid passing the variable `tl' to functions that just use it for  [iedowse, 2001-12-18, 1 file, -10/+0]
| temporary storage. In the old NFS code it wasn't at all clear if the value of `tl' was used across or after macro calls, but I'm fairly confident that the convention was to keep its use local. Each ex-macro function now uses a local version of this variable, so all of the double-indirection goes away.
| The only exception to the `local use' rule for `tl' is nfsm_clget(), which is left unchanged by this commit.
| Reviewed by: peter
* When VOP_SYMLINK fails, the value of *vpp is junk, so we must NULL  [iedowse, 2001-12-04, 1 file, -3/+2]
| out nd.ni_vp to prevent the resource cleanup code at the end of nfsrv_symlink from trying to vrele it. This fixes a "vrele: negative ref cnt" panic that can occur when a symlink is attempted on an NFS filesystem with no free space.
| Found locally, but the symptoms correspond to those in the PR referenced below.
| PR: kern/26878
| MFC after: 3 days
* Now that nfsm_reply() does not usually set 'error' to 0, we need  [iedowse, 2001-10-25, 1 file, -0/+1]
| to do it explicitly in nfsrv_noop so that the reply gets sent back to the client. This fixes the generation of a selection of RPC error replies (RPC_PROGMISMATCH, RPC_PROGUNAVAIL, RPC_PROCUNAVAIL etc.) that are used by some clients to detect support for optional protocols and features.
| Reviewed by: peter
| Reported by: Thomas Quinot <quinot@inf.enst.fr>
| PR: kern/31479
* Unwind some more macros. NFSMADV() was kinda silly since it was right  [peter, 2001-09-28, 1 file, -2/+2]
| next to equivalent m_len adjustments. Move the nfsm_subs.h macros into groups depending on which phase they are used in, since that affects the error recovery requirements. Collect some of the common error checking into a single macro as preparation for unwinding some more. Have nfs_rephead return a value instead of secretly modifying args. Remove some unused function arguments that were being passed around. Clarify nfsm_reply()'s error handling (I hope).
* Make nfsm_dissect() have an obvious return value.  [peter, 2001-09-27, 1 file, -21/+21]
|
* Tidy up nfsm_build usage. This is only partially finished.  [peter, 2001-09-27, 1 file, -23/+25]
|
* Cleanup and split of nfs client and server code.  [peter, 2001-09-18, 1 file, -513/+340]
| This builds on the top of several repo-copies.
* KSE Milestone 2  [julian, 2001-09-12, 1 file, -146/+146]
| Note ALL MODULES MUST BE RECOMPILED
| Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
| Sorry john! (your next MFC will be a doosie!)
| Reviewed by: peter@freebsd.org, dillon@freebsd.org
| X-MFC after: ha ha ha ha
* Revert consequences of changes to mount.h, part 2.  [grog, 2001-04-29, 1 file, -2/+0]
| Requested by: bde
* Correct #includes to work with fixed sys/mount.h.  [grog, 2001-04-23, 1 file, -0/+2]
|
* Preceed/preceeding are not English words. Use precede and preceding.  [asmodai, 2001-02-18, 1 file, -2/+2]
|
* Fix some problems that were introduced in revision 1.97. Instead  [iedowse, 2001-02-09, 1 file, -46/+101]
| of returning an error code to the caller, NFS server op routines must themselves build an error reply and return 0 to the caller. This is achieved by replacing the erroneous return statements with code that jumps forward to the op function's reply code. We need to be careful to ensure that the 'struct mount' pointer is NULL though, so that the final vn_finished_write() call becomes a no-op.
| Reviewed by: mckusick, dillon
* * Rename M_WAIT mbuf subsystem flag to M_TRYWAIT.  [bmilekic, 2000-12-21, 1 file, -4/+4]
|   This is because calls with M_WAIT (now M_TRYWAIT) may not wait forever when nothing is available for allocation, and may end up returning NULL. Hopefully we now communicate more of the right thing to developers and make it very clear that it's necessary to check whether calls with M_(TRY)WAIT also resulted in a failed allocation. M_TRYWAIT basically means "try harder, block if necessary, but don't necessarily wait forever." The time spent blocking is tunable with the kern.ipc.mbuf_wait sysctl. M_WAIT is now deprecated but still defined for the next little while.
| * Fix a typo in a comment in mbuf.h
| * Fix some code that was actually passing the mbuf subsystem's M_WAIT to malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the value of the M_WAIT flag, this could have become a big problem.
* Add snapshots to the fast filesystem. Most of the changes support  [mckusick, 2000-07-11, 1 file, -1/+113]
| the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed.
| Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so, for brevity and timeliness, they are not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
* Separate the struct bio related stuff out of <sys/buf.h> into  [phk, 2000-05-05, 1 file, -0/+1]
| <sys/bio.h>.
| <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes.
| Disk drivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data.
| Still a few bogus uses of struct buf to track down.
| Repocopy by: peter
* Remove unneeded #include <vm/vm_zone.h>  [phk, 2000-04-30, 1 file, -1/+0]
| Generated by: src/tools/tools/kerninclude
* Rename the existing BUF_STRATEGY() to DEV_STRATEGY()  [phk, 2000-03-20, 1 file, -1/+1]
| substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)
| substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)
| This patch is machine generated except for the ccd.c and buf.h parts.
* Fix compilation warning on alpha when converting pointer to integer  [dillon, 1999-12-18, 1 file, -1/+1]
| to generate hash index.
| Reviewed by: Andrew Gallatin <gallatin@cs.duke.edu>
* Have NFS use a snapshot of boottime instead of boottime itself to  [dillon, 1999-12-16, 1 file, -6/+14]
| generate the NFSv3 Version id. boottime itself may change, sometimes once every tick if you are running xntpd, which really throws off clients. Clients will tend to throw away what they believe to be stale data too often, and can get into long loops rewriting the same data over and over again because they believe the server has rebooted over and over again due to the changing version id.
| Approved by: jkh
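The snapshot idea in this commit message can be sketched in a few lines. All names here (`sketch_time`, `live_boottime`, `boottime_snap`, `server_init`, `version_id`) are hypothetical, and the way the id is packed into 64 bits is an assumption made for illustration, not the server's actual encoding.

```c
#include <stdint.h>

/* Hypothetical stand-in for a seconds/microseconds timestamp. */
struct sketch_time {
	int64_t sec;
	int64_t usec;
};

static struct sketch_time live_boottime;  /* may be adjusted by a time daemon */
static struct sketch_time boottime_snap;  /* copied once when the server starts */

/* Take the snapshot; later adjustments to live_boottime are ignored. */
static void
server_init(void)
{
	boottime_snap = live_boottime;
}

/* Build the id that clients compare to detect a server reboot. */
static uint64_t
version_id(void)
{
	return ((uint64_t)boottime_snap.sec << 32) |
	    (uint32_t)boottime_snap.usec;
}
```

Because the id is derived only from the snapshot, clock adjustments after startup do not make clients believe the server has rebooted.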
* Introduce NDFREE (and remove VOP_ABORTOP)  [eivind, 1999-12-15, 1 file, -121/+25]
|
* Add a readahead heuristic to the NFS server side code. While the server  [dillon, 1999-12-13, 1 file, -1/+77]
| cannot unilaterally pass data to a client it can reduce the physical disk transaction overhead by reading larger blocks. This results in better pipelining of requests/responses over the network and an almost 100% increase in cpu efficiency on the server. On a 100BaseTX network NFS read performance increases from 8.5 MBytes/sec to 10 MB/sec (maxed out), and cpu efficiency increases from 72% idle to 80% idle on the server.
| Reviewed by: Alfred Perlstein <bright@wintelcom.net>
* Fix a number of server-side issues related to aborting badly formed  [dillon, 1999-12-12, 1 file, -4/+4]
| NFS packets, mainly initializing structure pointers to NULL which are conditionally freed prior to return.
| PR: kern/15249
| Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
* Remove WILLRELE from VOP_SYMLINK  [eivind, 1999-11-13, 1 file, -15/+6]
| Note: Previous commit to these files (except coda_vnops and devfs_vnops) that claimed to remove WILLRELE from VOP_RENAME actually removed it from VOP_MKNOD.