summaryrefslogtreecommitdiffstats
path: root/sys/nfsserver/nfs_serv.c
Commit message (Collapse)AuthorAgeFilesLines
* Honor NFSv3 commit call (RFC 1813, Section 3.3.21) where when count is 0,delphij2011-12-151-1/+6
| | | | | | | | | | | | the full length from offset is being flushed. Note that for now VOP_FSYNC does not support offset and length parameters so we still do the same full VOP_FSYNC. This issue was reported at FreeNAS support site as FreeNAS ticket #1096. Submitted by: "ceckerle" <ce.freenas eckerle net> Prodded by: gcooper Reviewed by: rmacklem MFC after: 2 weeks
* Enhance the sequential access heuristic used to perform readahead in thejhb2011-12-011-60/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | NFS server and reuse it for writes as well to allow writes to the backing store to be clustered. - Use a prime number for the size of the heuristic table (1017 is not prime). - Move the logic to locate a heuristic entry from the table and compute the sequential count out of VOP_READ() and into a separate routine. - Use the logic from sequential_heuristic() in vfs_vnops.c to update the seqcount when a sequential access is performed rather than just increasing seqcount by 1. This lets the clustering count ramp up faster. - Allow for some reordering of RPCs and if it is detected leave the current seqcount as-is rather than dropping back to a seqcount of 1. Also, when out of order access is encountered, cut seqcount in half rather than dropping it all the way back to 1 to further aid with reordering. - Fix the new NFS server to properly update the next offset after a successful VOP_READ() so that the readahead actually works. Some of these changes came from an earlier patch by Bjorn Gronwall that was forwarded to me by bde@. Discussed with: bde, rmacklem, fs@ Submitted by: Bjorn Gronwall (1, 4) MFC after: 2 weeks
* Fix the NFS servers so that they can do a Lookup of "..",rmacklem2011-09-031-0/+1
| | | | | | | which requires that ni_strictrelative be set to 0, post-r224810. Tested by: swills (earlier version), geo dot liaskos at gmail.com Approved by: re (kib)
* Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/netchild2011-02-251-0/+2
| | | | | | | | | | | | | PMC/SYSV/...). No FreeBSD version bump, the userland application to query the features will be committed last and can serve as an indication of the availablility if needed. Sponsored by: Google Summer of Code 2010 Submitted by: kibab Reviewed by: arch@ (parts by rwatson, trasz, jhb) X-MFC after: to be determined in last commit with code from this project
* Unless "cnt" exceeds MAX_COMMIT_COUNT, nfsrv_commit() and nfsvno_fsync() arealc2011-02-051-1/+2
| | | | | | | | | | | incorrectly calling vm_object_page_clean(). They are passing the length of the range rather than the ending offset of the range. Perform the OFF_TO_IDX() conversion in vm_object_page_clean() rather than the callers. Reviewed by: kib MFC after: 3 weeks
* ZFS might not return monotonically increasing directory offset cookies,pjd2010-12-281-3/+10
| | | | | | | | | | | | | so turn off UFS-specific hack that assumes so in ZFS case. Before the change we can miss returning some directory entries to a NFS client. I believe that the hack should be moved to ufs_readdir(), but until we find somebody who will do it, turn it off for ZFS in NFS server code. Submitted by: rmacklem Discussed with: rmacklem, mckusick MFC after: 3 days
* Use newly added NFSRV_FLAG_BUSY flag for nfsrv_fhtovp() to keep mount pointpjd2010-12-211-10/+15
| | | | | | | | | busy. This fixes a race where we can pass invalid mount point to VFS_VGET() via vp->v_mount when exported file system was forcibly unmounted between nfsrv_fhtovp() and VFS_VGET(). Reviewed by: kib MFC after: 5 days
* - Move pubflag and lockflag handling from nfsrv_fhtovp() to nfs_namei() -pjd2010-12-211-25/+15
| | | | | | | | | | | | this is the only place that is different from all the other nfsrv_fhtovp() consumers. This simplifies nfsrv_fhtovp() a bit and also eliminates one vn_lock/VOP_UNLOCK() cycle in case of NFSv3. - Implement NFSRV_FLAG_BUSY flag for nfsrv_fhtovp() that tells it to leave mount point busy. Reviewed by: kib MFC after: 5 days
* Reduce lock scope a little.pjd2010-12-191-2/+2
|
* VOP_ISLOCKED() should not be used to determine if the vnode is locked.kib2010-12-151-2/+11
| | | | | | | | Explicitely track the locked status of the vnode. Reviewed by: pjd Tested by: avg MFC after: 1 week
* Fix a bug in r214049. The nvp == vp case shall be handled speciallykib2010-11-051-1/+1
| | | | | | | | | only for !usevget case. If VFS_VGET is working, the vnode shared lock is obtained recursively and vput() shall be done, not vunref(). Submitted by: rmacklem Tested by: Josh Carroll <josh.carroll gmail com> MFC after: 3 days
* When readdirplus() is handled on the exported filesystem that doeskib2010-10-191-12/+13
| | | | | | | | | not support VFS_VGET, like msdosfs, do not call VOP_LOOKUP() for dotdot on the root directory. Our filesystems expect that VFS handles dotdot lookups on root on its own. Reported and tested by: kevlo MFC after: 2 weeks
* - When VFS_VGET() is not supported, switch to VOP_LOOKUP().pjd2010-08-261-40/+48
| | | | | | | | | - We are fine by only share-locking the vnode. - Remove assertion that doesn't hold for ZFS where we cross mount points boundaries by going into .zfs/snapshot/<name>/. Reviewed by: rmacklem MFC after: 1 month
* Properly return an error reply if an NFS remove or link operation fails.jhb2009-12-031-4/+3
| | | | | | | | | | | | Previously the failing operation would allocate an mbuf and construct an error reply, but because the function did not return 0, the NFS server assumed it had failed to generate a reply and would leak the reply mbuf as well as not sending the reply to the NFS client. PR: kern/140853 Submitted by: Ted Faber faber at isi edu (remove) Reviewed by: rmacklem (remove) MFC after: 1 week
* Ensure that tv_sec is between INT32_MIN and INT32_MAX, so ZFS won't object.pjd2009-09-261-1/+1
| | | | | | | | | | This completes the fix from r185586. PR: kern/139059 Reported by: Daniel Braniss <danny@cs.huji.ac.il> Submitted by: Jaakko Heinonen <jh@saunalahti.fi> Tested by: Daniel Braniss <danny@cs.huji.ac.il> MFC after: 3 days
* Correct typo after manual patching.pjd2009-09-091-1/+1
| | | | Noticed by: b. f.
* Fix usecount leak in mknod(2) on file system exported over NFS.pjd2009-09-091-2/+2
| | | | | | | While I'm here, correct typo in comment. Reviewed by: kan, kib MFC after: 3 days
* Remove the old kernel RPC implementation and the NFS_LEGACYRPC option.dfr2009-06-301-424/+0
| | | | Approved by: re
* Remove the thread argument from the FSD (File-System Dependent) parts ofattilio2009-05-111-2/+2
| | | | | | | | | | | | | | | | | the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavilly changed by this commit so thirdy parts modules needs to be recompiled. Bump __FreeBSD_version in order to signal such situation.
* Use shared vnode locks when invoking VOP_READDIR().jhb2009-02-131-2/+2
| | | | MFC after: 1 month
* Handle VFS_VGET() failing with an error other than EOPNOTSUPP in additionkensmith2008-12-161-3/+6
| | | | | | | | | to failing with that error. PR: 125149 Submitted by: Jaakko Heinonen (jh <at> saunalahti <dot> fi) Reviewed by: mohans, kan MFC after: 3 days
* Change nfsserver slightly so that it does not trip over the timestampkan2008-12-031-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | validation code on ZFS. Problem: when opening file with O_CREAT|O_EXCL NFS has to jump through extra hoops to ensure O_EXCL semantics. Namely, client supplies of 8 bytes (NFSX_V3CREATEVERF) bytes of verification data to uniquely identify this create request. Server then creates a new file with access mode 0, copies received 8 bytes into va_atime member of struct vattr and attempt to set the atime on file using VOP_SETATTR. If that succeeds, it fetches file attributes with VOP_GETATTR and verifies that atime timestamps match. If timestamps do not match, NFS server concludes it has probbaly lost the race to another process creating the file with the same name and bails with EEXIST. This scheme works OK when exported FS is FFS, but if underlying filesystem is ZFS _and_ server is running 64bit kernel, it breaks down due to sanity checking in zfs_setattr function, which refuses to accept any timestamps which have tv_sec that cannot be represented as 32bit int. Since struct timespec fields are 64 bit integers on 64bit platforms and server just copies NFSX_V3CREATEVERF bytes info va_atime, all eight bytes supplied by client end up in va_atime.tv_sec, forcing it out of valid 32bit range. The solution this change implements is simple: it treats NFSX_V3CREATEVERF as two 32bit integers and unpacks them separately into va_atime.tv_sec and va_atime.tv_nsec respectively, thus guaranteeing that tv_sec remains in 32 bit range and ZFS remains happy. Reviewed by: kib
* Implement support for RPCSEC_GSS authentication to both the NFS clientdfr2008-11-031-28/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month
* Document a few sysctls in the NFS client and server code.trhodes2008-11-021-2/+4
| | | | | | Minor style(9) where applicable. Approved by: alfred (slightly older version)
* Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessarytrasz2008-10-281-5/+6
| | | | | | | to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)
* Retire the MALLOC and FREE macros. They are an abomination unto style(9).des2008-10-231-19/+19
| | | | MFC after: 3 months
* Turn XXX's for unlocked writes of NFS server statistics to simple notes,rwatson2008-10-121-2/+2
| | | | | | as we consider it a feature to exchange performance for consistency. MFC after: 3 days
* Remove the suser(9) interface from the kernel. It has been replaced fromattilio2008-09-171-2/+4
| | | | | | | | | | | | | | | | | years by the priv_check(9) interface and just very few places are left. Note that compatibility stub with older FreeBSD version (all above the 8 limit though) are left in order to reduce diffs against old versions. It is responsibility of the maintainers for any module, if they think it is the case, to axe out such cases. This patch breaks KPI so __FreeBSD_version will be bumped into a later commit. This patch needs to be credited 50-50 with rwatson@ as he found time to explain me how the priv_check() works in detail and to review patches. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com> Reviewed by: rwatson
* Decontext-alize the nfsserver module.attilio2008-09-161-58/+55
| | | | | | | Now, only some few places still require thread passing (mostly the ones which access to VOP_* functions) and will be fixed once the primitive also will be. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed threadattilio2008-08-281-49/+47
| | | | | | was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* Change the fix in the rev. 1.179 to use nfsrv_lockedpair_nd().kib2008-05-281-6/+2
| | | | | Tested by: pho MFC after: 3 days
* Initialize vfslocked prior to calling nfsm_srvmtofh where it was forgotten.kib2008-05-281-0/+3
| | | | | | Reported by: Andrew Edwards <aedwards sandvine com> Tested by: pho MFC after: 3 days
* Replaced the misleading uses of a historical artefact M_TRYWAIT with M_WAIT.ru2008-03-251-4/+4
| | | | | | | | | | Removed dead code that assumed that M_TRYWAIT can return NULL; it's not true since the advent of MBUMA. Reviewed by: arch There are ongoing disputes as to whether we want to switch to directly using UMA flags M_WAITOK/M_NOWAIT for mbuf(9) allocation.
* - Complete part of the unfinished bufobj work by consistently usingjeff2008-03-221-8/+7
| | | | | | | | | | | | | | | | | BO_LOCK/UNLOCK/MTX when manipulating the bufobj. - Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock. - Exploit this new lock order to simplify softdep_check_suspend(). - A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find. Reviewed by: kib (earlier diff) Tested by: kris, pho (earlier diff)
* Fix the Giant leak in the nfsrv_remove().kib2008-03-041-2/+6
| | | | | Reported by: pluknet <pluknet gmail com> MFC after: 1 week
* VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used inattilio2008-01-131-16/+16
| | | | | | | | | | | conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
* vn_lock() is currently only used with the 'curthread' passed as argument.attilio2008-01-101-13/+13
| | | | | | | | | | | | | | | | Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
* Add a -z flag to nfsstat which zeros the NFS statistics after displayingjhb2007-10-181-1/+1
| | | | | | | | them. MFC after: 1 week Requested by: ps Submitted by: ps (6 years ago)
* Include priv.h to pick up suser(9) definitions, missed in an earlierrwatson2007-06-131-0/+1
| | | | | | commit. Warnings spotted by: kris
* Init timespec to zero fo quiesce warnings.mjacob2007-06-101-1/+1
|
* Initialize vfslocked to 0 before nfsm_srvmtofh() so that the variable isjhb2007-03-261-0/+1
| | | | | | | | not used uninitialized in 'nfsmout' if nfsm_srvmtofh() gets an internal error. CID: 1766 Found by: Coverity Prevent (tm)
* - Turn all explicit giant acquires into conditional VFS_LOCK_GIANTs.jeff2007-03-171-554/+187
| | | | | | | | | | | | | | | Only ops which used namei still remained. - Implement a scheme for reducing the overhead of tracking which vops require giant by constantly reducing the number of recursive giant acquires to one, leaving us with only one vfslocked variable. - Remove all NFSD lock acquisition and release from the individual nfs ops. Careful examination has shown that they are not required. This greatly simplifies the code. Sponsored by: Isilon Systems, Inc. Discussed with: rwatson Tested by: kkenn Approved by: re
* Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method.pjd2007-02-151-6/+6
| | | | | | | | | | | | | | | | This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs
* Get the vfs giant lock before calling nfs_access.mpp2007-02-131-3/+9
| | | | Reviewed by: mohan
* Push Giant a bit further off the NFS server in a number of straightrwatson2006-11-241-213/+263
| | | | | | | | | | | | | | | | | | | | | | | | forward cases by converting from unconditional acquisition of Giant around vnode operations to conditional acquisition: - Remove nfsrv_access_withgiant(), and cause nfsrv_access() to now assert that Giant will be held if it is required for the vnode. - Add nfsrv_fhtovp_locked(), which will drop the NFS server lock if required, and modify nfsrv_fhtovp() to conditionally acquire Giant if required. - In the VOP's not dealing with more than one vnode at a time (i.e., not involving a lookup), conditionally acquire Giant. This removes Giant use for MPSAFE file systems for a number of quite important RPCs, including getattr, read, write. It leaves unconditional Giant acquisitions in vnode operations that interact with the name space or more than one vnode at a time as these require further work. Tested by: kris Reviewed by: kib
* Protect nfsm_srvpathsiz() call with the nfsd_mtx lock.pjd2006-11-201-5/+6
| | | | Reviewed by: mohans
* Fix leak in NAMEI zone caused by nfs server when VOP_RENAME fails.kib2006-10-261-2/+2
| | | | | | | Submitted by: Padma Bhooma <pbhooma at panasas com> Reviewed by: bde Approved by: pjd (mentor) MFC after: 1 week
* Temporary workaround to prevent leak of Giant from nfsd when callingkib2006-06-051-0/+16
| | | | | | | | | lookup(). Reviewed by: tegge Tested by: "Arno J. Klaassen" <arno at heho snv jussieu fr>, "Rong-en Fan" <grafan at gmail com>, Dmitriy Kirhlarov <dimma at higis ru>, Dmitry Pryanishnikov <dmitry at atlantis dp ua> MFC after: 1 week Approved by: kan, pjd (mentors)
* - Release the references acquired by VOP_GETWRITEMOUNT and vfs_getvfs().jeff2006-03-311-0/+11
| | | | | | Discussed with: tegge Tested by: kris Sponsored by: Isilon Systems, Inc.
* - Reorder vrele calls after vput calls to prevent lock order reversalsjeff2006-03-121-26/+17
| | | | | | | between leaf and directory locks. Found by: kris Sponsored by: Isilon Systems, Inc.
OpenPOWER on IntegriCloud