summaryrefslogtreecommitdiffstats
path: root/sys/nfsclient
Commit message (Collapse)AuthorAgeFilesLines
* Merge the remainder of kern_vimage.c and vimage.h into vnet.c andrwatson2009-08-013-3/+0
| | | | | | | | | | vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)
* Patch the regular nfs client in a manner analagous tormacklem2009-07-171-1/+2
| | | | | | | | | r195704 for the experimental client. The patch avoids calling vn_lock() for the case where nfs_nget() has acquired the same vnode as dvp, since nfs_nget() has already locked the vnode. Reviewed by: kib, jhb Approved by: re (kensmith), kib (mentor)
* Use PBDRY flag for msleep(9) in NFS and NLM when sleeping thread ownskib2009-07-143-7/+9
| | | | | | | | | | kernel resources that block other threads, like vnode locks. The SIGSTOP sent to such thread (process, rather) shall not stop it until thread releases the resources. Tested by: pho Reviewed by: jhb Approved by: re (kensmith)
* Build on Jeff Roberson's linker-set based dynamic per-CPU allocatorrwatson2009-07-143-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
* In vn_vget_ino() and their inline equivalents, mnt_ref() the mount pointkib2009-07-021-0/+2
| | | | | | | | | | | around the sequence that drop vnode lock and then busies the mount point. Not having vlocked node or direct reference to the mp allows for the forced unmount to proceed, making mp unmounted or reused. Tested by: pho Reviewed by: jeff Approved by: re (kensmith) MFC after: 2 weeks
* Adjust the internal NFS KPI to avoid the last traces of NFS_LEGACYRPC.dfr2009-06-305-26/+21
| | | | Approved by: re
* Remove the old kernel RPC implementation and the NFS_LEGACYRPC option.dfr2009-06-3015-2224/+16
| | | | Approved by: re
* Fix build with NFS_LEGACYRPC enabled after the socket upcall lockingjhb2009-06-301-3/+3
| | | | | | changes. Approved by: re (kensmith)
* Add a new global rwlock, in_ifaddr_lock, which will synchronize use of therwatson2009-06-251-0/+4
| | | | | | | | | | | | | | | | | | | in_ifaddrhead and INADDR_HASH address lists. Previously, these lists were used unsynchronized as they were effectively never changed in steady state, but we've seen increasing reports of writer-writer races on very busy VPN servers as core count has gone up (and similar configurations where address lists change frequently and concurrently). For the time being, use rwlocks rather than rmlocks in order to take advantage of their better lock debugging support. As a result, we don't enable ip_input()'s read-locking of INADDR_HASH until an rmlock conversion is complete and a performance analysis has been done. This means that one class of reader-writer races still exists. MFC after: 6 weeks Reviewed by: bz
* After cleaning up rt_tables from vnet.h and cleaning up opt_route.hbz2009-06-231-1/+0
| | | | | a lot of files no longer need route.h either. Garbage collect them. While here remove now unneeded vnet.h #includes as well.
* Fix some of the style errors in *getpages().alc2009-06-181-18/+13
|
* For dotdot lookup in nfs_lookup, inline the vn_vget_ino() to preventkib2009-06-171-10/+39
| | | | | | | | | | | operating on the unmounted mount point and freed mount data in case of forced unmount performed while dvp is unlocked to nget the target vnode. Add missed calls to m_freem(mrep) there on error exits [1]. Submitted by: rmacklem [1] Tested by: pho MFC after: 2 weeks
* Rename the host-related prison fields to be the same as the host.*jamie2009-06-132-2/+3
| | | | | | | parameters they represent, and the variables they replaced, instead of abbreviated versions of them. Approved by: bz (mentor)
* Add a test for VI_DOOMED just after nfs_upgrade_vnlock() inrmacklem2009-06-101-8/+15
| | | | | | | | | | | | | nfs_bioread_check_cons(). This is required since it is possible for the vnode to be vgonel()'d while in nfs_upgrade_vnlock() when a forced dismount is in progress. Also, move the check for VI_DOOMED in nfs_vinvalbuf() down to after nfs_upgrade_vnlock() and replace the out of date comment for it. Submitted by: jhb Tested by: pho Approved by: kib (mentor) MFC after: 1 month
* After r193232 rt_tables in vnet.h are no longer indirectly dependent onbz2009-06-082-2/+0
| | | | | | | | | the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.
* Rework socket upcalls to close some races with setup/teardown of upcalls.jhb2009-06-011-16/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Each socket upcall is now invoked with the appropriate socket buffer locked. It is not permissible to call soisconnected() with this lock held; however, so socket upcalls now return an integer value. The two possible values are SU_OK and SU_ISCONNECTED. If an upcall returns SU_ISCONNECTED, then the soisconnected() will be invoked on the socket after the socket buffer lock is dropped. - A new API is provided for setting and clearing socket upcalls. The API consists of soupcall_set() and soupcall_clear(). - To simplify locking, each socket buffer now has a separate upcall. - When a socket upcall returns SU_ISCONNECTED, the upcall is cleared from the receive socket buffer automatically. Note that a SO_SND upcall should never return SU_ISCONNECTED. - All this means that accept filters should now return SU_ISCONNECTED instead of calling soisconnected() directly. They also no longer need to explicitly clear the upcall on the new socket. - The HTTP accept filter still uses soupcall_set() to manage its internal state machine, but other accept filters no longer have any explicit knowlege of socket upcall internals aside from their return value. - The various RPC client upcalls currently drop the socket buffer lock while invoking soreceive() as a temporary band-aid. The plan for the future is to add a new flag to allow soreceive() to be called with the socket buffer locked. - The AIO callback for socket I/O is now also invoked with the socket buffer locked. Previously sowakeup() would drop the socket buffer lock only to call aio_swake() which immediately re-acquired the socket buffer lock for the duration of the function call. Discussed with: rwatson, rmacklem
* Convert the two dimensional array to be malloced and introducebz2009-06-011-3/+7
| | | | | | | | | | | | | | | | an accessor function to get the correct rnh pointer back. Update netstat to get the correct pointer using kvm_read() as well. This not only fixes the ABI problem depending on the kernel option but also permits the tunable to overwrite the kernel option at boot time up to MAXFIBS, enlarging the number of FIBs without having to recompile. So people could just use GENERIC now. Reviewed by: julian, rwatson, zec X-MFC: not possible
* nfs_write() can use the recently introduced vfs_bio_set_valid() instead ofalc2009-05-311-1/+1
| | | | | | vfs_bio_set_validclean(), thereby avoiding the page queues lock. Garbage collect vfs_bio_set_validclean(). Nothing uses it any longer.
* Place hostnames and similar information fully under the prison system.jamie2009-05-292-14/+10
| | | | | | | | | | | | | | | | | The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)
* Make *getpages()s' assertion on the state of each page's dirty bitsalc2009-05-281-1/+1
| | | | stricter.
* Make sure we feed 32bit align memory to nfsm_dissect otherwise we will faultdfr2009-05-241-1/+73
| | | | | | | | | | on platforms with strict alignment requirements. In particular, this fixes the problems with the new RPC transport on the arm platform. Note: this adds yet another copy of nfs_realign(). I will attempt to refactor after NFS_LEGACYRPC is removed. Submitted by: sam
* While r192615 fixed the former problems, make this file VIMAGEbz2009-05-231-0/+3
| | | | compliant now as well initializing local context variables.
* It seems this file was ignored by MRT, rnh locking changes and new-arpv2.bz2009-05-231-17/+3
| | | | | | | | So let the V_irtualization people finally make the disabled debugging code compile again. MFC after: 2 weeks X-MFC: MRT and adapt rnh locking
* Remove the unmaintained University of Michigan NFSv4 client from 8.xrwatson2009-05-2216-84/+10
| | | | | | | prior to 8.0-RELEASE. Rick Macklem's new and more feature-rich NFSv234 client and server are replacing it. Discussed with: rmacklem
* Eliminate unnecessary clearing of the page's dirty mask from variousalc2009-05-151-1/+3
| | | | | | getpages functions. Eliminate a stale comment.
* Eliminate gratuitous clearing of the page's dirty mask.alc2009-05-121-1/+2
|
* Remove the thread argument from the FSD (File-System Dependent) parts ofattilio2009-05-112-11/+18
| | | | | | | | | | | | | | | | | the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavilly changed by this commit so thirdy parts modules needs to be recompiled. Bump __FreeBSD_version in order to signal such situation.
* Eliminate stale comments.alc2009-05-101-6/+1
| | | | Eliminate a case of unnecessary page queues locking.
* Change the curvnet variable from a global const struct vnet *,zec2009-05-051-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | previously always pointing to the default vnet context, to a dynamically changing thread-local one. The currvnet context should be set on entry to networking code via CURVNET_SET() macros, and reverted to previous state via CURVNET_RESTORE(). Recursions on curvnet are permitted, though strongly discuouraged. This change should have no functional impact on nooptions VIMAGE kernel builds, where CURVNET_* macros expand to whitespace. The curthread->td_vnet (aka curvnet) variable's purpose is to be an indicator of the vnet context in which the current network-related operation takes place, in case we cannot deduce the current vnet context from any other source, such as by looking at mbuf's m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc. Moreover, so far curvnet has turned out to be an invaluable consistency checking aid: it helps to catch cases when sockets, ifnets or any other vnet-aware structures may have leaked from one vnet to another. The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros was a result of an empirical iterative process, whith an aim to reduce recursions on CURVNET_SET() to a minimum, while still reducing the scope of CURVNET_SET() to networking only operations - the alternative would be calling CURVNET_SET() on each system call entry. In general, curvnet has to be set in three typicall cases: when processing socket-related requests from userspace or from within the kernel; when processing inbound traffic flowing from device drivers to upper layers of the networking stack, and when executing timer-driven networking functions. This change also introduces a DDB subcommand to show the list of all vnet instances. Approved by: julian (mentor)
* Remove redundant NFSMNT_NFSV3 check in DTrace hooks for NFS RPC.rwatson2009-05-041-2/+1
| | | | MFC after: 1 month
* Fix typo in comment.rwatson2009-05-041-1/+1
| | | | MFC after: 1 month
* Remove trailing spaceskib2009-04-131-2/+2
|
* Remove VOP_LEASE and supporting functions. This hasn't been used sincerwatson2009-04-101-1/+0
| | | | | | | | | | | | | | the removal of NQNFS, but was left in in case it was required for NFSv4. Since our new NFSv4 client and server can't use it for their requirements, GC the old mechanism, as well as other unused lease- related code and interfaces. Due to its impact on kernel programming and binary interfaces, this change should not be MFC'd. Proposed by: jeff Reviewed by: jeff Discussed with: rmacklem, zach loafman @ isilon
* Cache_lookup() for DOTDOT drops dvp vnode lock, allowing dvp to be reclaimed.kib2009-04-101-0/+2
| | | | | | | | | Check the condition and return ENOENT then. In nfs_lookup(), respect ENOENT return from cache_lookup() when it is caused by dvp reclaim. Reported and tested by: pho
* When a stale file handle is encountered, purge all cached information aboutjhb2009-04-064-2/+26
| | | | | | | | | | | an NFS node including the access and attribute caches. Previously the NFS client only purged any name cache entries associated with the file. PR: kern/123755 Submitted by: Jaakko Heinonen jh of saunalahti fi Reported by: Timo Sirainen tss of iki fi Reviewed by: rwatson, rmacklem MFC after: 1 month
* Change the default timeout for caching attributes of a directory in the NFSjhb2009-04-061-1/+1
| | | | | | | | | | | client from 30 seconds to 3 seconds. After the recent changes to add caching of negative name cache lookups, a negative name cache hit will persist until the client notices the parent directory has changed. The higher the attribute cache timeout on directories, the longer that can take, so lower the default timeout for directories to match that of regular files. Suggested by: bde, mohans MFC after: 1 month
* Move dtnfsclient.c in the cddl tree to nfs_kdtrace.c in the nfsclientrwatson2009-03-251-0/+545
| | | | | | | | directory, since it's under a BSD license, and this keeps NFS internals- aware tracing parts close to NFS. MFC after: 1 month Suggested by: jhb
* Fix two bugs in DTrace tracing of accesscache and attrcache load events:rwatson2009-03-243-7/+15
| | | | | | | | | | | - Trace non-error loads into the access cache once, not zero times or twice. - Sometimes attr cache loads fail due to a race, in which case they are aborted leading to an invalidation; in this case, trace only the flush, not a load. MFC after: 1 month Sponsored by: Google, Inc.
* Add DTrace probes to the NFS access and attribute caches. Access cacherwatson2009-03-244-16/+235
| | | | | | | | | | | | | | | | | | | | | | | | | events are: nfsclient:accesscache:flush:done nfsclient:accesscache:get:hit nfsclient:accesscache:get:miss nfsclient:accesscache:load:done They pass the vnode, uid, and requested or loaded access mode (if any); the load event may also report a load error if the RPC fails. The attribute cache events are: nfsclient:attrcache:flush:done nfsclient:attrcache:get:hit nfsclient:attrcache:get:miss nfsclient:attrcache:load:done They pass the vnode, optionally the vattr if one is present (hit or load), and in the case of a load event, also a possible RPC error. MFC after: 1 month Sponsored by: Google, Inc.
* Add dtnfsclient, a first cut at an NFSv2/v3 client reuest DTracerwatson2009-03-221-0/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | provider. The NFS client exposes 'start' and 'done' probes for NFSv2 and NFSv3 RPCs when using the new RPC implementation, passing in the vnode, mbuf chain, credential, and NFSv2 or NFSv3 procedure number. For 'done' probes, the error number is also available. Probes are named in the following way: ... nfsclient:nfs2:write:start nfsclient:nfs2:write:done ... nfsclient:nfs3:access:start nfsclient:nfs3:access:done ... Access to the unmarshalled arguments is not easily available at this point in the stack, but the passed probe arguments are sufficient to to a lot of interesting things in practice. Technically, these probes may cover multiple RPC retransmits, and even transactions if the transaction ID change as a result of authentication failure or a jukebox error from the server, but usefully capture the intent of a single NFS request, such as access, getattr, write, etc. Typical use might involve profiling RPC latency by system call, number of RPCs, how often a getattr leads to a call to access, when failed access control checks occur, etc. More detailed RPC information might best be provided by adding a krpc provider. It would also be useful to add NFS client probes for events such as the access cache or attribute cache satisfying requests without an RPC. Sponsored by: Google, Inc. MFC after: 1 month
* In nfs_request(), always exit using the nfsmout label once we'rerwatson2009-03-211-8/+3
| | | | | | | | definitely doing an NFSv2 or NFSv3 RPC, rather than sometimes doing so and sometimes not. This makes it easier to add a DTrace return probe at a single point in the function. MFC after: 1 week
* Expand the per-node access cache to cache permissions for multiple users.jhb2009-03-203-24/+55
| | | | | | | | | | | | The number of entries in the cache defaults to 8 but is easily changed in nfsclient/nfs.h. When the cache is filled, the oldest cache entry is evicted when space is needed. I mirrored the changes to the NFSv[23] client in the NFSv4 client to fix compile breakage. However, the NFSv4 client doesn't actually use the access cache currently. Submitted by: rmacklem
* - Remove code to set SAVENAME for CREATE or RENAME requests that get a -vejhb2009-03-101-4/+4
| | | | | | | | | | | | hit in the name cache. cache_lookup() doesn't actually return ENOENT for such requests to force the filesystem to do an explicit lookup, so this was effectively dead code. - Grab the nfsnode mutex while writing to n_dmtime. We don't grab the lock when comparing the time against the cached directory mod time (just as we don't when comparing ctime's for +ve name cache hits) since the attribute caching is already racy for NFS clients as it is. Discussed with: bde
* For all files including net/vnet.h directly include opt_route.h andbz2009-02-272-0/+3
| | | | | | | | | | | | | | net/route.h. Remove the hidden include of opt_route.h and net/route.h from net/vnet.h. We need to make sure that both opt_route.h and net/route.h are included before net/vnet.h because of the way MRT figures out the number of FIBs from the kernel option. If we do not, we end up with the default number of 1 when including net/vnet.h and array sizes are wrong. This does not change the list of files which depend on opt_route.h but we can identify them now more easily.
* Bring back the code to prime the ACCESS cache when fetching attributes forjhb2009-02-241-0/+11
| | | | | | | | an NFS file. Now the priming is conditional on a new vfs.nfs.prime_access_cache sysctl. For now I've left the default setting to disabling the priming. Requested by: scottl
* Enable caching of negative pathname lookups in the NFS client. To avoidjhb2009-02-192-8/+54
| | | | | | | | | | | | | | stale entries, we save a copy of the directory's modification time when the first negative cache entry was added in the directory's NFS node. When a negative cache entry is hit during a pathname lookup, the parent directory's modification time is checked. If it has changed, all of the negative cache entries for that parent are purged and the lookup falls back to using the RPC. This required adding a new cache_purge_negative() method to the name cache to purge only negative cache entries for a given directory. Submitted by: mohans, Rick Macklem, Ricardo Labiaga @ NetApp Reviewed by: mohans
* When fetching attributes for a file for NFSv3 mounts, do not perform anjhb2009-02-191-6/+0
| | | | | | | | | opportunistic ACCESS RPC to populate both the access and attribute caches of the file and instead always use a GETATTR RPC. On many modern NFS servers, an ACCESS RPC is much more expensive to service than a GETATTR RPC. Submitted by: mohans
* Don't clear the attribute cache of a file when it is closed. A subsequentjhb2009-02-191-7/+0
| | | | | | | | | | | open() of the same file will load fresh attributes, so they do not need to be explicitly flushed in close() to guarantee close to open consistency. However, other file desciptors may still reference this file and clearing the attributes in close() forces those other file descriptors to fetch fresh attributes the next time they need them. Reviewed by: mohans MFC after: 1 week
* Reindent a small bit of code that was not 8-space indented like the restjhb2009-02-181-6/+6
| | | | of the nfs_lookup() function.
* Last step of splitting up minor and unit numbers: remove minor().ed2009-01-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Inside the kernel, the minor() function was responsible for obtaining the device minor number of a character device. Because we made device numbers dynamically allocated and independent of the unit number passed to make_dev() a long time ago, it was actually a misnomer. If you really want to obtain the device number, you should use dev2udev(). We already converted all the drivers to use dev2unit() to obtain the device unit number, which is still used by a lot of drivers. I've noticed not a single driver passes NULL to dev2unit(). Even if they would, its behaviour would make little sense. This is why I've removed the NULL check. Ths commit removes minor(), minor2unit() and unit2minor() from the kernel. Because there was a naming collision with uminor(), we can rename umajor() and uminor() back to major() and minor(). This means that the makedev(3) manual page also applies to kernel space code now. I suspect umajor() and uminor() isn't used that often in external code, but to make it easier for other parties to port their code, I've increased __FreeBSD_version to 800062.
OpenPOWER on IntegriCloud