summaryrefslogtreecommitdiffstats
path: root/sys/nfsserver
Commit message (Collapse)AuthorAgeFilesLines
...
* Rework the credential code to support larger values of NGROUPS andbrooks2009-06-192-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.) The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc. Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care for the details of allocating groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for insuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search. Because we can not change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used or NGRPS as appropriate for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error. Minor changes: - Reduce the number of hand rolled versions of groupmember(). - Do not assign to both cr_gid and cr_groups[0]. - Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity. Submitted by: Isilon Systems (initial implementation) X-MFC after: never PR: bin/113398 kern/133867
* Since svc_[dg|vc|tli|tp]_create() did not hold a reference count on thermacklem2009-06-171-0/+1
| | | | | | | | | | | | | SVCXPTR structure returned by them, it was possible for the structure to be free'd before svc_reg() had been completed using the structure. This patch acquires a reference count on the newly created structure that is returned by svc_[dg|vc|tli|tp]_create(). It also adds the appropriate SVC_RELEASE() calls to the callers, except the experimental nfs subsystem. The latter will be committed separately. Submitted by: dfr Tested by: pho Approved by: kib (mentor)
* Add a #include <sys/jail.h> so that it builds whenrmacklem2009-06-121-0/+1
| | | | | | options KGSSAPI is specified in the kernel configuration. Approved by: kib (mentor)
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERICrwatson2009-06-052-2/+2
| | | | | | | | and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd
* Rework socket upcalls to close some races with setup/teardown of upcalls.jhb2009-06-013-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Each socket upcall is now invoked with the appropriate socket buffer locked. It is not permissible to call soisconnected() with this lock held; however, so socket upcalls now return an integer value. The two possible values are SU_OK and SU_ISCONNECTED. If an upcall returns SU_ISCONNECTED, then the soisconnected() will be invoked on the socket after the socket buffer lock is dropped. - A new API is provided for setting and clearing socket upcalls. The API consists of soupcall_set() and soupcall_clear(). - To simplify locking, each socket buffer now has a separate upcall. - When a socket upcall returns SU_ISCONNECTED, the upcall is cleared from the receive socket buffer automatically. Note that a SO_SND upcall should never return SU_ISCONNECTED. - All this means that accept filters should now return SU_ISCONNECTED instead of calling soisconnected() directly. They also no longer need to explicitly clear the upcall on the new socket. - The HTTP accept filter still uses soupcall_set() to manage its internal state machine, but other accept filters no longer have any explicit knowlege of socket upcall internals aside from their return value. - The various RPC client upcalls currently drop the socket buffer lock while invoking soreceive() as a temporary band-aid. The plan for the future is to add a new flag to allow soreceive() to be called with the socket buffer locked. - The AIO callback for socket I/O is now also invoked with the socket buffer locked. Previously sowakeup() would drop the socket buffer lock only to call aio_swake() which immediately re-acquired the socket buffer lock for the duration of the function call. Discussed with: rwatson, rmacklem
* Place hostnames and similar information fully under the prison system.jamie2009-05-291-2/+3
| | | | | | | | | | | | | | | | | The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)
* Add hierarchical jails. A jail may further virtualize its environmentjamie2009-05-271-0/+3
| | | | | | | | | | | | | | | | | | | | | | by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)
* Fix build of KGSSAPI bits post-vimage.dfr2009-05-241-1/+2
|
* Remove the thread argument from the FSD (File-System Dependent) parts ofattilio2009-05-111-2/+2
| | | | | | | | | | | | | | | | | the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavilly changed by this commit so thirdy parts modules needs to be recompiled. Bump __FreeBSD_version in order to signal such situation.
* Do not embed struct ucred into larger netcred parent structures.kan2009-05-091-0/+3
| | | | | | | | | | | | | Credential might need to hang around longer than its parent and be used outside of mnt_explock scope controlling netcred lifetime. Use separate reference-counted ucred allocated separately instead. While there, extend mnt_explock coverage in vfs_stdexpcheck and clean-up some unused declarations in new NFS code. Reported by: John Hickey PR: kern/133439 Reviewed by: dfr, kib
* Change nfsserver so that it uses the nfssvc() system call providedrmacklem2009-04-124-49/+22
| | | | | | | | | | | in sys/nfs/nfs_nfssvc.c by registering with it using the nfsd_call_nfsserver function pointer. Also, add the build glue for nfs_nfssvc.c optionally based on "nfsserver" and also as a loadable module. Submitted by: rmacklem Reviewed by: kib Approved by: kib (mentor)
* Fix an mbuf leak in the error path.dfr2009-03-191-0/+2
| | | | Submitted by: Rick Macklem <rick at snowhite dot cis dot uoguelph dot ca>
* Include audit.h so that the system call path protected by NFS_LEGACYRPCrwatson2009-02-231-0/+2
| | | | | | | | can audit its arguments. Submitted by: Jaakko Heinonen <jh at saunalahti.fi> MFC after: 1 week X-MFC-note: MFC with r188311
* Use shared vnode locks when invoking VOP_READDIR().jhb2009-02-131-2/+2
| | | | MFC after: 1 month
* Audit the flag argument to the nfssvc(2) system call.rwatson2009-02-081-0/+2
| | | | | Obtained from: TrustedBSD Project Sponsored by: Apple, Inc.
* Last step of splitting up minor and unit numbers: remove minor().ed2009-01-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Inside the kernel, the minor() function was responsible for obtaining the device minor number of a character device. Because we made device numbers dynamically allocated and independent of the unit number passed to make_dev() a long time ago, it was actually a misnomer. If you really want to obtain the device number, you should use dev2udev(). We already converted all the drivers to use dev2unit() to obtain the device unit number, which is still used by a lot of drivers. I've noticed not a single driver passes NULL to dev2unit(). Even if they would, its behaviour would make little sense. This is why I've removed the NULL check. Ths commit removes minor(), minor2unit() and unit2minor() from the kernel. Because there was a naming collision with uminor(), we can rename umajor() and uminor() back to major() and minor(). This means that the makedev(3) manual page also applies to kernel space code now. I suspect umajor() and uminor() isn't used that often in external code, but to make it easier for other parties to port their code, I've increased __FreeBSD_version to 800062.
* Handle VFS_VGET() failing with an error other than EOPNOTSUPP in additionkensmith2008-12-161-3/+6
| | | | | | | | | to failing with that error. PR: 125149 Submitted by: Jaakko Heinonen (jh <at> saunalahti <dot> fi) Reviewed by: mohans, kan MFC after: 3 days
* We need to pass a structure with enough space for an NFSv2 filehandle todfr2008-12-101-3/+3
| | | | | nfs_srvmtofh_xx otherwise bad things happen when an NFSv2 client tries to make a request.
* Change nfsserver slightly so that it does not trip over the timestampkan2008-12-031-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | validation code on ZFS. Problem: when opening file with O_CREAT|O_EXCL NFS has to jump through extra hoops to ensure O_EXCL semantics. Namely, client supplies of 8 bytes (NFSX_V3CREATEVERF) bytes of verification data to uniquely identify this create request. Server then creates a new file with access mode 0, copies received 8 bytes into va_atime member of struct vattr and attempt to set the atime on file using VOP_SETATTR. If that succeeds, it fetches file attributes with VOP_GETATTR and verifies that atime timestamps match. If timestamps do not match, NFS server concludes it has probbaly lost the race to another process creating the file with the same name and bails with EEXIST. This scheme works OK when exported FS is FFS, but if underlying filesystem is ZFS _and_ server is running 64bit kernel, it breaks down due to sanity checking in zfs_setattr function, which refuses to accept any timestamps which have tv_sec that cannot be represented as 32bit int. Since struct timespec fields are 64 bit integers on 64bit platforms and server just copies NFSX_V3CREATEVERF bytes info va_atime, all eight bytes supplied by client end up in va_atime.tv_sec, forcing it out of valid 32bit range. The solution this change implements is simple: it treats NFSX_V3CREATEVERF as two 32bit integers and unpacks them separately into va_atime.tv_sec and va_atime.tv_nsec respectively, thus guaranteeing that tv_sec remains in 32 bit range and ZFS remains happy. Reviewed by: kib
* In the nfsrv_fhtovp(), after the vfs_getvfs() function found the pointerkib2008-11-291-3/+6
| | | | | | | | | | | | | | | | | | | to the fs, but before a vnode on the fs is locked, unmount may free fs structures, causing access to destroyed data and freed memory. Introduce a vfs_busymp() function that looks up and busies found fs while mountlist_mtx is held. Use it in nfsrv_fhtovp() and in the implementation of the handle syscalls. Two other uses of the vfs_getvfs() in the vfs_subr.c, namely in sysctl_vfs_ctl and vfs_getnewfsid seems to be ok. In particular, sysctl_vfs_ctl is protected by Giant by being a non-sleeping sysctl handler, that prevents Giant-locked unmount code to interfere with it. Noted by: tegge Reviewed by: dfr Tested by: pho MFC after: 1 month
* Switch the default rpc implementation for NFS back to the new code. I believedfr2008-11-141-4/+0
| | | | | | I have fixed the reported problems - if you still have trouble with it, please contact me with as much detail as possible so that I can track down any other issues as quickly as possible.
* Use the remote address for access control, not the local address. This fixesdfr2008-11-131-2/+44
| | | | | | | | the nfsd problems that some people have with the new code. Add support for the vfs.nfsrv.nfs_privport sysctl which denies access unless the client is using a port number less than 1024. Not really sure if this is particularly useful since it doesn't add any real security.
* Temporarily switch NFS back to the old RPC code while I try to diagnose anddfr2008-11-131-0/+4
| | | | | | fix the problems a few people have noticed with the new code. People who want to continue testing the new code or who need RPCSEC_GSS support should use the new option NFS_NEWRPC to select it.
* Turn (NFSERR_AUTHERR|code) status values into svcerr_auth(rqst, code) repliesdfr2008-11-121-2/+7
| | | | instead of returning a success with a bogus NFS error code.
* Allow v3 GETATTR requests even when weakly authenticated. Change the errordfr2008-11-121-2/+3
| | | | | return for for weakly authenticated requests from REJECTEDCRED to WEAKAUTH for consistency with Solaris.
* Range-check NFSv2 procedure numbers before converting to NFSv3.dfr2008-11-071-2/+7
| | | | Submitted by: csjp
* Don't depend on krpc.ko in the NFS_LEGACYRPC case.dfr2008-11-061-0/+2
|
* Unbreak NFS.des2008-11-061-0/+1
| | | | Pointy hat to: dfr
* If mountd doesn't specify a secflavor list for the mount, assume that -sec=sysdfr2008-11-051-0/+10
| | | | is what was wanted.
* Include <sys/eventhandler.h>.dfr2008-11-041-0/+1
|
* Implement support for RPCSEC_GSS authentication to both the NFS clientdfr2008-11-0311-65/+1373
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month
* Document a few sysctls in the NFS client and server code.trhodes2008-11-022-5/+10
| | | | | | Minor style(9) where applicable. Approved by: alfred (slightly older version)
* Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessarytrasz2008-10-281-5/+6
| | | | | | | to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)
* Rename three MAC entry points from _proc_ to _cred_ to reflect the factrwatson2008-10-281-1/+1
| | | | | | | that they operate directly on credentials: mac_proc_create_swapper(), mac_proc_create_init(), and mac_proc_associate_nfsd(). Update policies. Obtained from: TrustedBSD Project
* Retire the MALLOC and FREE macros. They are an abomination unto style(9).des2008-10-234-31/+31
| | | | MFC after: 3 months
* Turn XXX's for unlocked writes of NFS server statistics to simple notes,rwatson2008-10-121-2/+2
| | | | | | as we consider it a feature to exchange performance for consistency. MFC after: 3 days
* Remove the suser(9) interface from the kernel. It has been replaced fromattilio2008-09-171-2/+4
| | | | | | | | | | | | | | | | | years by the priv_check(9) interface and just very few places are left. Note that compatibility stub with older FreeBSD version (all above the 8 limit though) are left in order to reduce diffs against old versions. It is responsibility of the maintainers for any module, if they think it is the case, to axe out such cases. This patch breaks KPI so __FreeBSD_version will be bumped into a later commit. This patch needs to be credited 50-50 with rwatson@ as he found time to explain me how the priv_check() works in detail and to review patches. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com> Reviewed by: rwatson
* Decontext-alize the nfsserver module.attilio2008-09-165-101/+91
| | | | | | | Now, only some few places still require thread passing (mostly the ones which access to VOP_* functions) and will be fixed once the primitive also will be. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed threadattilio2008-08-282-50/+48
| | | | | | was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* Remove spls from NFS server setup call; expand receive socket bufferrwatson2008-06-301-8/+3
| | | | | | | locking to cover full setup of socket upcalls; remove XXX about locking. MFC after: 3 weeks
* Change the fix in the rev. 1.179 to use nfsrv_lockedpair_nd().kib2008-05-281-6/+2
| | | | | Tested by: pho MFC after: 3 days
* Initialize vfslocked prior to calling nfsm_srvmtofh where it was forgotten.kib2008-05-281-0/+3
| | | | | | Reported by: Andrew Edwards <aedwards sandvine com> Tested by: pho MFC after: 3 days
* Replaced the misleading uses of a historical artefact M_TRYWAIT with M_WAIT.ru2008-03-255-15/+15
| | | | | | | | | | Removed dead code that assumed that M_TRYWAIT can return NULL; it's not true since the advent of MBUMA. Reviewed by: arch There are ongoing disputes as to whether we want to switch to directly using UMA flags M_WAITOK/M_NOWAIT for mbuf(9) allocation.
* - Complete part of the unfinished bufobj work by consistently usingjeff2008-03-221-8/+7
| | | | | | | | | | | | | | | | | BO_LOCK/UNLOCK/MTX when manipulating the bufobj. - Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock. - Exploit this new lock order to simplify softdep_check_suspend(). - A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find. Reviewed by: kib (earlier diff) Tested by: kris, pho (earlier diff)
* Fix a regression from the last revision - don't edit the ns_rec list whiledfr2008-03-191-1/+3
| | | | not holding the lock.
* Don't call nfs_realign while holding locks.dfr2008-03-181-4/+5
| | | | Reviewed by: kib
* Fix the Giant leak in the nfsrv_remove().kib2008-03-041-2/+6
| | | | | Reported by: pluknet <pluknet gmail com> MFC after: 1 week
* Use nfsrv_destroycache() only once, else it crashes the server.remko2008-01-181-1/+0
| | | | | | | PR: kern/118152 Submitted by: Bjoern Groenvall <bg at sics dot se> Approved by: imp (mentor, a while ago already), jhb MFC After: 3 days
* VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used inattilio2008-01-132-21/+20
| | | | | | | | | | | conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
* vn_lock() is currently only used with the 'curthread' passed as argument.attilio2008-01-102-14/+14
| | | | | | | | | | | | | | | | Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
OpenPOWER on IntegriCloud