FreeBSD-src - Raptor Engineering's fork of pfsense FreeBSD src with pfSense changes

	Commit message	Author	Age	Files	Lines
*	Add sysctl vfs.nfs.nfs_keep_dirty_on_error to switch the nfs client	kib	2012-03-17	2	-3/+10
\| \| \| \| \| \| \| \| \|	behaviour on error from write RPC back to behaviour of old nfs client. When set to a non-zero value, the pages for which the write failed are kept dirty. PR: kern/165927 Reviewed by: alc MFC after: 2 weeks
*	Post r230394, the Lookup RPC counts for both NFS clients increased	rmacklem	2012-03-03	2	-22/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	significantly. Upon investigation this was caused by name cache misses for lookups of "..". For name cache entries for non-".." directories, the cache entry serves double duty. It maps both the named directory plus ".." for the parent of the directory. As such, two ctime values (one for each of the directory and its parent) need to be saved in the name cache entry. This patch adds an entry for ctime of the parent directory to the name cache. It also adds an additional uma zone for large entries with this time value, in order to minimize memory wastage. As well, it fixes a couple of cases where the mtime of the parent directory was being saved instead of ctime for positive name cache entries. With this patch, Lookup RPC counts return to values similar to pre-r230394 kernels. Reported by: bde Discussed with: kib Reviewed by: jhb MFC after: 2 weeks
*	Fix the NFS clients so that they use copyin() instead of bcopy(),	rmacklem	2012-03-01	1	-1/+16
\| \| \| \| \| \| \| \|	when doing direct I/O. This direct I/O code is not enabled by default. Submitted by: kib (earlier version) Reviewed by: kib MFC after: 1 week
*	Fix found places where uio_resid is truncated to int.	kib	2012-02-21	2	-9/+10
\| \| \| \| \| \| \| \| \|	Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows SSIZE_MAX-sized i/o requests from usermode. Discussed with: bde, das (previous versions) MFC after: 1 month
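The clamp described in this commit can be illustrated with a small user-space sketch. This is not the kernel code: `iosize_max()`, `check_iosize()` and the plain `int` standing in for the sysctl are hypothetical. The sketch assumes the clamp limits a request to `INT_MAX` bytes, that the unclamped limit equals `SIZE_MAX / 2` (the value of `SSIZE_MAX` on common platforms), and that it runs on an LP64 platform.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the debug.iosize_max_clamp sysctl (1 = enabled). */
static int iosize_max_clamp = 1;

/* Largest accepted request: INT_MAX while clamped, otherwise SIZE_MAX / 2. */
static size_t
iosize_max(void)
{
	return (iosize_max_clamp ? (size_t)INT_MAX : (size_t)(SIZE_MAX / 2));
}

/* Returns 0 if the request length is acceptable, EINVAL otherwise. */
static int
check_iosize(size_t len)
{
	return (len > iosize_max() ? EINVAL : 0);
}

/* Usage example: a request one byte over INT_MAX under both settings. */
static void
iosize_selfcheck(void)
{
	size_t big = (size_t)INT_MAX + 1;

	iosize_max_clamp = 1;
	assert(check_iosize(big) == EINVAL);	/* rejected while clamped */
	iosize_max_clamp = 0;
	assert(check_iosize(big) == 0);		/* accepted once unclamped */
	iosize_max_clamp = 1;			/* restore the default */
}
```

Rejecting an oversized length up front, instead of storing it in an `int`, is what avoids the silent truncation this commit targets.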
*	Merge multi-FIB IPv6 support from projects/multi-fibv6/head/:	bz	2012-02-17	2	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \|	Extend the so far IPv4-only support for multiple routing tables (FIBs) introduced in r178888 to IPv6 providing feature parity. This includes an extended rtalloc(9) KPI for IPv6, the necessary adjustments to the network stack, and user land support as in netstat. Sponsored by: Cisco Systems, Inc. Reviewed by: melifaro (basically) MFC after: 10 days
*	r228827 fixed a problem where copying of NFSv4 open credentials into	rmacklem	2012-02-07	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a credential structure would corrupt it. This happened when the p argument was != NULL. However, I now realize that the copying of open credentials should only happen for p == NULL, since that indicates that it is a read-ahead or write-behind. This patch fixes this. After this commit, r228827 could be reverted, but I think the code is clearer and safer with the patch, so I am going to leave it in. Without this patch, it was possible that a NFSv4 VOP_SETATTR() could have changed the credentials of the caller. This would have happened if the process doing the VOP_SETATTR() did not have the file open, but some other process running as a different uid had the file open for writing at the same time. MFC after: 5 days
*	Rename cache_lookup_times() to cache_lookup() and retire the old API and	jhb	2012-02-06	1	-1/+1
\| \| \| \|	ABI stub for cache_lookup().
*	Current implementations of sync(2) and syncer vnode fsync() VOP use	kib	2012-02-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mnt_noasync counter to temporarily remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity. Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC. Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons. Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks
*	When a "mount -u" switches an NFS mount point from TCP to UDP,	rmacklem	2012-01-31	1	-0/+13
\| \| \| \| \| \| \| \| \| \|	any thread doing an I/O RPC with a transfer size greater than NFS_UDPMAXDATA will be hung indefinitely, retrying the RPC. After a discussion on freebsd-fs@, I decided to add a warning message for this case, as suggested by Jeremy Chadwick. Suggested by: freebsd at jdc.parodius.com (Jeremy Chadwick) MFC after: 2 weeks
*	A problem with respect to data read through the buffer cache for both	rmacklem	2012-01-27	3	-9/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NFS clients was reported to freebsd-fs@ under the subject "NFS corruption in recent HEAD" on Nov. 26, 2011. This problem occurred when a TCP mounted root fs was changed to using UDP. I believe that this problem was caused by the change in mnt_stat.f_iosize that occurred because rsize was decreased to the maximum supported by UDP. This patch fixes the problem by using v_bufobj.bo_bsize instead of f_iosize, since the former is set to f_iosize when the vnode is allocated, but does not change for a given vnode when f_iosize changes. Reported by: pjd Reviewed by: kib MFC after: 2 weeks
*	Revert r230516, since it doesn't really fix the problem.	rmacklem	2012-01-26	1	-17/+0
*	Fix remaining calls to cache_enter() in both NFS clients to provide	kib	2012-01-25	1	-17/+18
\| \| \| \| \| \| \| \|	appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for. Reviewed by: jhb (previous version) MFC after: 2 weeks
*	Add a timeout on positive name cache entries in the NFS client. That is,	jhb	2012-01-25	3	-11/+29
\| \| \| \| \| \| \| \| \| \| \|	we will only trust a positive name cache entry for a specified amount of time before falling back to a LOOKUP RPC, even if the ctime for the file handle matches the cached copy in the name cache entry. The timeout is configured via a new 'nametimeo' mount option and defaults to 60 seconds. It may be set to zero to disable positive name caching entirely. Reviewed by: rmacklem MFC after: 1 week
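The validation rule described in this commit can be sketched as follows. The structure and function names are hypothetical and heavily simplified, and the exact comparison used by the kernel (for example whether the boundary is inclusive) is an assumption. Only the `nametimeo` option, its 60 second default and the meaning of zero come from the commit text.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical, simplified positive name cache entry. */
struct nc_entry {
	time_t cached_ctime;	/* ctime recorded when the entry was added */
	time_t added_at;	/* time the entry was added to the cache */
};

/*
 * Trust the entry only if the ctime still matches and the entry is
 * younger than nametimeo seconds. A nametimeo of 0 disables positive
 * name caching entirely.
 */
static bool
nc_entry_trusted(const struct nc_entry *e, time_t cur_ctime, time_t now,
    int nametimeo)
{
	if (nametimeo == 0)
		return (false);
	if (now - e->added_at >= nametimeo)
		return (false);		/* too old: fall back to a LOOKUP RPC */
	return (e->cached_ctime == cur_ctime);
}

/* Usage example with the default timeout of 60 seconds. */
static void
nc_selfcheck(void)
{
	struct nc_entry e = { .cached_ctime = 100, .added_at = 1000 };

	assert(nc_entry_trusted(&e, 100, 1030, 60));	/* fresh, ctime matches */
	assert(!nc_entry_trusted(&e, 100, 1060, 60));	/* expired */
	assert(!nc_entry_trusted(&e, 101, 1030, 60));	/* ctime changed */
}
```

The point of the timeout is that a matching ctime alone is no longer sufficient: an entry that has aged out is revalidated even when nothing appears to have changed.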
*	If a mount -u is done to either NFS client that switches it	rmacklem	2012-01-25	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	from TCP to UDP and the rsize/wsize/readdirsize is greater than NFS_MAXDGRAMDATA, it is possible for a thread doing an I/O RPC to get stuck repeatedly doing retries. This happens because the RPC will use an rsize/wsize/readdirsize that won't work for UDP and, as such, it will keep failing indefinitely. This patch returns an error for this case, to avoid the problem. A discussion on freebsd-fs@ seemed to indicate that returning an error was preferable to silently ignoring the "udp"/"mntudp" option. This problem was discovered while investigating a problem reported by pjd@ via email. MFC after: 2 weeks
*	Close a race in NFS lookup processing that could result in stale name cache	jhb	2012-01-20	3	-54/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	entries on one client when a directory was renamed on another client. The root cause for the stale entry being trusted is that each per-vnode nfsnode structure has a single 'n_ctime' timestamp used to validate positive name cache entries. However, if there are multiple entries for a single vnode, they all share a single timestamp. To fix this, extend the name cache to allow filesystems to optionally store a timestamp value in each name cache entry. The NFS clients now fetch the timestamp associated with each name cache entry and use that to validate cache hits instead of the timestamps previously stored in the nfsnode. Another part of the fix is that the NFS clients now use timestamps from the post-op attributes of RPCs when adding name cache entries rather than pulling the timestamps out of the file's attribute cache. The latter is subject to races with other lookups updating the attribute cache concurrently. Some more details: - Add a variant of nfsm_postop_attr() to the old NFS client that can return a vattr structure with a copy of the post-op attributes. - Handle lookups of "." as a special case in the NFS clients since the name cache does not store name cache entries for ".", so we cannot get a useful timestamp. It didn't really make much sense to recheck the attributes on the directory to validate the namecache hit for "." anyway. - ABI compat shims for the name cache routines are present in this commit so that it is safe to MFC. MFC after: 2 weeks
*	Make sure all intermediate variables holding mount flags (mnt_flag)	mckusick	2012-01-17	1	-1/+1
\| \| \| \| \| \| \|	and that all internal kernel calls passing mount flags are declared as uint64_t so that flags in the top 32-bits are not lost. MFC after: 2 weeks
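The kind of loss this commit guards against can be shown with a short user-space sketch. The flag value and function names are hypothetical. The only point illustrated is that a 32-bit intermediate drops any bit above bit 31 of a 64-bit flag word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag living in the upper 32 bits of a 64-bit flag word. */
#define	EXAMPLE_HIGH_FLAG	(UINT64_C(1) << 40)

/* Problematic pattern: a 32-bit intermediate silently drops high flags. */
static uint64_t
pass_flags_narrow(uint64_t flags)
{
	uint32_t tmp = (uint32_t)flags;

	return (tmp);
}

/* Fixed pattern: the intermediate is as wide as the flag word itself. */
static uint64_t
pass_flags_wide(uint64_t flags)
{
	uint64_t tmp = flags;

	return (tmp);
}

/* Usage example: the high flag survives only the 64-bit path. */
static void
flags_selfcheck(void)
{
	uint64_t flags = EXAMPLE_HIGH_FLAG | UINT64_C(1);

	assert((pass_flags_narrow(flags) & EXAMPLE_HIGH_FLAG) == 0);
	assert((pass_flags_wide(flags) & EXAMPLE_HIGH_FLAG) != 0);
}
```

Low flags behave identically in both versions, which is why this class of bug tends to stay hidden until a flag is assigned a high bit.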
*	opt_inet6.h was missing from some files in the new NFS subsystem.	rmacklem	2012-01-08	3	-1/+3
\| \| \| \| \| \| \| \| \| \|	The effect of this was, for clients mounted via inet6 addresses, that the DRC cache would never have a hit in the server. It also broke NFSv4 callbacks when an inet6 address was the only one available in the client. This patch fixes the above, plus deletes opt_inet6.h from a couple of files it is not needed for. MFC after: 2 weeks
*	During investigation of an NFSv4 client crash reported by glebius@,	rmacklem	2011-12-23	1	-6/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	jhb@ spotted that nfscl_getstateid() might modify credentials when called from nfsrpc_read() for the case where p != NULL, whereas nfsrpc_read() only did a crdup() to get new credentials for p == NULL. This bug was introduced by r195510, since pre-r195510 nfscl_getstateid() only modified credentials for the p == NULL case. This patch modifies nfsrpc_read()/nfsrpc_write() so that they do crdup() for the p != NULL case. It is conceivable that this bug caused the crash reported by glebius@, but that will not be determined for some time, since the crash occurred after about 1 month of operation. Tested by: glebius Reviewed by: jhb MFC after: 2 weeks
*	Post r223774, the NFSv4 client no longer has multiple instances	rmacklem	2011-12-03	1	-52/+93
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	of the same lock_owner4 string. As such, the handling of cleanup of lock_owners could be simplified. This simplification permitted the client to do a ReleaseLockOwner operation when the process that the lock_owner4 string represents has exited. This permits the server to release any storage related to the lock_owner4 string before the associated open is closed. Without this change, it is possible to exhaust a server's storage when a long running process opens a file and then many child processes do locking on the file, because the open doesn't get closed. A similar patch was applied to the Linux NFSv4 client recently so that it wouldn't exhaust a server's storage. Reviewed by: zack MFC after: 2 weeks
*	Rename vm_page_set_valid() to vm_page_set_valid_range().	kib	2011-11-30	1	-1/+1
\| \| \| \| \| \| \|	The vm_page_set_valid() is the most reasonable name for the m->valid accessor. Reviewed by: attilio, alc
*	Clean up some cruft in the NFSv4 client left over from the	rmacklem	2011-11-21	1	-13/+6
\| \| \| \| \| \| \|	OpenBSD port, so that it is more readable. No logic change is made by this commit. MFC after: 2 weeks
*	Add two arguments to the nfsrpc_rellockown() function in the NFSv4	rmacklem	2011-11-20	1	-6/+5
\| \| \| \| \| \| \| \|	client. This does not change the client's behaviour, but prepares the code so that nfsrpc_rellockown() can be called elsewhere in a future commit. MFC after: 2 weeks
*	Since the nfscl_cleanup() function isn't used by the FreeBSD NFSv4 client,	rmacklem	2011-11-20	1	-33/+7
\| \| \| \| \| \| \|	delete the code and fix up the related comments. This should not have any functional effect on the client. MFC after: 2 weeks
*	Post r223774 the NFSv4 client never uses the linked list with the	rmacklem	2011-11-20	1	-58/+0
\| \| \| \| \| \| \| \| \|	head nfsc_defunctlockowner. This patch simply removes the code that loops through this always empty list, since the code no longer does anything useful. It should not have any effect on the client's behaviour. MFC after: 2 weeks
*	Modify the new NFS client so that nfs_fsync() only calls ncl_flush()	rmacklem	2011-11-15	1	-0/+10
\| \| \| \| \| \| \| \| \|	for regular files. Since other file types don't write into the buffer cache, calling ncl_flush() is almost a no-op. However, it does clear the NMODIFIED flag and this shouldn't be done by nfs_fsync() for directories. MFC after: 2 weeks
*	Move the setting of the default value for nm_wcommitsize to	rmacklem	2011-11-15	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	before the nfs_decode_args() call in the new NFS client, so that a specified command line value won't be overwritten. Also, modify the calculation for small values of desiredvnodes to avoid an unusually large value or a divide by zero crash. It seems that the default value for nm_wcommitsize is very conservative and may need to change at some time. PR: kern/159351 Submitted by: onwahe at gmail.com (earlier version) Reviewed by: jhb MFC after: 2 weeks
*	Finish making 'wcommitsize' an NFS client mount option.	jhb	2011-11-14	1	-1/+10
\| \| \| \| \|	Reviewed by: rmacklem MFC after: 1 week
*	Sync with the old NFS client: Remove an obsolete comment.	jhb	2011-11-14	1	-2/+0
*	Since NFSv4 byte range locking only works for regular files,	rmacklem	2011-11-14	1	-0/+2
\| \| \| \| \| \|	add a sanity check for the vnode type to the NFSv4 client. MFC after: 2 weeks
*	Move the assignment of default values for some mount options	rmacklem	2011-11-13	1	-3/+9
\| \| \| \| \| \| \|	to before the nfs_decode_args() call in the new NFS client, so they don't overwrite the value specified on the command line. MFC after: 2 weeks
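The ordering issue fixed here can be sketched in a few lines. All names below are hypothetical stand-ins, not the NFS client's structures. `decode_args()` plays the role of nfs_decode_args() and copies only the options the user actually supplied.

```c
#include <assert.h>
#include <stdbool.h>

#define	EXAMPLE_DEFAULT_RSIZE	32768	/* hypothetical default value */

/* Hypothetical, simplified mount arguments and per-mount state. */
struct mount_args {
	bool has_rsize;		/* true if rsize was given on the command line */
	int rsize;
};

struct mount_state {
	int rsize;
};

/* Copies only the options that were actually specified. */
static void
decode_args(struct mount_state *ms, const struct mount_args *args)
{
	if (args->has_rsize)
		ms->rsize = args->rsize;
}

/* Problematic order: the default clobbers the user's value. */
static void
mount_setup_buggy(struct mount_state *ms, const struct mount_args *args)
{
	decode_args(ms, args);
	ms->rsize = EXAMPLE_DEFAULT_RSIZE;
}

/* Fixed order: defaults first, then the specified values override them. */
static void
mount_setup_fixed(struct mount_state *ms, const struct mount_args *args)
{
	ms->rsize = EXAMPLE_DEFAULT_RSIZE;
	decode_args(ms, args);
}

/* Usage example: only the fixed order keeps the user's rsize. */
static void
mount_selfcheck(void)
{
	struct mount_args args = { .has_rsize = true, .rsize = 8192 };
	struct mount_state ms = { 0 };

	mount_setup_buggy(&ms, &args);
	assert(ms.rsize == EXAMPLE_DEFAULT_RSIZE);
	mount_setup_fixed(&ms, &args);
	assert(ms.rsize == 8192);
}
```

When no option is supplied, both orders end with the default, so the difference shows up only for explicitly specified values.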
*	Second-to-last commit implementing Capsicum capabilities in the FreeBSD	rwatson	2011-08-11	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	kernel for FreeBSD 9.0: Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op. Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions. In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit. Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent. Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc
*	Fix a LOR in the NFS client which could cause a deadlock.	rmacklem	2011-08-02	2	-2/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was reported to the mailing list freebsd-net@freebsd.org on July 21, 2011 under the subject "LOR with nfsclient sillyrename". The LOR occurred when nfs_inactive() called vrele(sp->s_dvp) while holding the vnode lock on the file in s_dvp. This patch modifies the client so that it performs the vrele(sp->s_dvp) as a separate task to avoid the LOR. This fix was discussed with jhb@ and kib@, who both proposed variations of it. Tested by: pho, jlott at averesystems.com Submitted by: jhb (earlier version) Reviewed by: kib Approved by: re (kib) MFC after: 2 weeks
*	The new NFS client failed to vput() the new vnode if a setattr	rmacklem	2011-07-30	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	failed after the file was created in nfs_create(). This would probably only happen during a forced dismount. The old NFS client does have a vput() for this case. Detected by pho during recent testing, where an open syscall returned with a vnode still locked. Tested by: pho Approved by: re (kib) MFC after: 2 weeks
*	Simple find/replace of VOP_ISLOCKED -> NFSVOPISLOCKED. This is done so that	zack	2011-07-16	3	-3/+3
\| \| \| \| \| \| \| \|	NFSVOPISLOCKED can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
*	Simple find/replace of VOP_UNLOCK -> NFSVOPUNLOCK. This is done so that	zack	2011-07-16	3	-17/+17
\| \| \| \| \| \| \| \|	NFSVOPUNLOCK can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
*	Simple find/replace of vn_lock -> NFSVOPLOCK. This is done so that	zack	2011-07-16	3	-10/+10
\| \| \| \| \| \| \| \|	NFSVOPLOCK can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
*	r222389 introduced a case where the NFSv4 client could	rmacklem	2011-07-13	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	loop in nfscl_getcl() when a forced dismount is in progress, because nfsv4_lock() will return 0 without sleeping when MNTK_UNMOUNTF is set. This patch fixes it so it won't loop calling nfsv4_lock() for this case. MFC after: 2 weeks
*	The algorithm used by nfscl_getopen() could have resulted in	rmacklem	2011-07-04	1	-81/+94
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	multiple instances of the same lock_owner when a process both inherited an open file descriptor plus opened the same file itself. Since some NFSv4 servers cannot handle multiple instances of the same lock_owner string, this patch changes the algorithm used by nfscl_getopen() in the new NFSv4 client to keep that from happening. The new algorithm is simpler, since there is no longer any need to ascend the process's parentage tree because all NFSv4 Closes for a file are done at VOP_INACTIVE()/VOP_RECLAIM(), making the Opens indistinct w.r.t. use with Lock Ops. This problem was discovered at the recent NFSv4 interoperability Bakeathon. MFC after: 2 weeks
*	Modify the new NFSv4 client so that it appends a file handle	rmacklem	2011-07-03	2	-21/+28
\| \| \| \| \| \| \| \| \| \| \| \| \|	to the lock_owner4 string that goes on the wire. Also, add code to do a ReleaseLockOwner Op on the lock_owner4 string before a Close. Apparently not all NFSv4 servers handle multiple instances of the same lock_owner4 string, at least not in a compatible way. This patch avoids having multiple instances, except for one unusual case, which will be fixed by a future commit. Found at the recent NFSv4 interoperability Bakeathon. Tested by: tdh at excfb.com MFC after: 2 weeks
*	Fix the new NFSv4 client so that it doesn't fill the cached	rmacklem	2011-06-28	2	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \|	mode attribute in as 0 when doing writes. The change adds the Mode attribute plus the others except Owner and Owner_group to the list requested by the NFSv4 Write Operation. This fixed a problem where an executable file built by "cc" would get mode 0111 instead of 0755 for some NFSv4 servers. Found at the recent NFSv4 interoperability Bakeathon. Tested by: tdh at excfb.com MFC after: 2 weeks
*	Fix the kgssapi so that it can be loaded as a module. Currently	rmacklem	2011-06-19	1	-10/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	the NFS subsystems use five of the rpcsec_gss/kgssapi entry points, but since it was not obvious which others might be useful, all nineteen were included. Basically the nineteen entry points are set in a structure called rpc_gss_entries and inline functions defined in sys/rpc/rpcsec_gss.h check for the entry points being non-NULL and then call them. A default value is returned otherwise. Requested by rwatson. Reviewed by: jhb MFC after: 2 weeks
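The indirection pattern described in this commit (a table of entry points plus inline wrappers that test for NULL) can be sketched as below. The table here is a cut-down, hypothetical stand-in with a single member. The names, the wrapper's signature and the default return value of -1 are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down entry point table with one member. */
struct gss_entries_sketch {
	int (*get_version)(void);
};

/* All members stay NULL until the (simulated) module fills them in. */
static struct gss_entries_sketch gss_entries;

/* Inline wrapper: call through the table if set, else return a default. */
static inline int
gss_get_version_call(void)
{
	if (gss_entries.get_version != NULL)
		return ((*gss_entries.get_version)());
	return (-1);	/* assumed default when the module is not loaded */
}

/* Hypothetical implementation that a loaded module would provide. */
static int
fake_get_version(void)
{
	return (2);
}

/* Usage example: behaviour before and after the module "loads". */
static void
gss_selfcheck(void)
{
	gss_entries.get_version = NULL;
	assert(gss_get_version_call() == -1);
	gss_entries.get_version = fake_get_version;
	assert(gss_get_version_call() == 2);
}
```

Because callers only ever touch the wrappers, the caller can be built and run without the module present, which is what allows the module to be loaded later.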
*	Add DTrace support to the new NFS client. This is essentially	rmacklem	2011-06-18	7	-5/+827
\| \| \| \| \| \| \|	cloned from the old NFS client, plus additions for NFSv4. A review of this code is in progress, however it was felt by the reviewer that it could go in now, before code slush. Any changes required by the review can be committed as bug fixes later.
*	Add support for flock(2) locks to the new NFSv4 client. I think this	rmacklem	2011-06-05	2	-3/+9
\| \| \| \| \| \| \| \| \| \|	should be ok, since the client now delays NFSv4 Close operations until VOP_INACTIVE()/VOP_RECLAIM(). As such, there should be no risk that the NFSv4 Open is closed while an associated byte range lock still exists. Tested by: avg MFC after: 2 weeks
*	The new NFSv4 client was erroneously using "p" instead of	rmacklem	2011-06-05	4	-58/+56
\| \| \| \| \| \| \| \| \| \|	"p_leader" for the "id" for POSIX byte range locking. I think this would only have affected processes created by rfork(2) with the RFTHREAD flag specified. This patch fixes that by passing the "id" down through the various functions from nfs_advlock(). MFC after: 2 weeks
*	Fix the new NFSv4 client so that it doesn't crash when	rmacklem	2011-06-05	1	-0/+4
\| \| \| \| \| \| \| \|	a mount is done for a VIMAGE kernel. Tested by: glz at hidden-powers dot com Reviewed by: bz MFC after: 2 weeks
*	In the VOP_PUTPAGES() implementations, change the default error from	kib	2011-06-01	1	-11/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	VM_PAGER_AGAIN to VM_PAGER_ERROR for the unwritten pages. Return VM_PAGER_AGAIN for the partially written page. Always forward at least one page in the loop of vm_object_page_clean(). VM_PAGER_ERROR causes the page reactivation and does not clear the page dirty state, so the write is not lost. The change fixes an infinite loop in vm_object_page_clean() when the filesystem returns permanent errors for some page writes. Reported and tested by: gavin Reviewed by: alc, rmacklem MFC after: 1 week
*	Fix the new NFS client so that it doesn't do an NFSv3	rmacklem	2011-05-31	1	-2/+11
\| \| \| \| \| \| \| \| \| \|	Pathconf RPC for cases where the reply doesn't include the answer. This fixes a problem reported by avg@ where the NFSv3 Pathconf RPC would fail when "ls -l" did an lpathconf(2) for _PC_ACL_NFS4. Tested by: avg MFC after: 2 weeks
*	Fix the new NFS client so that it handles NFSv4 state	rmacklem	2011-05-27	2	-17/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	correctly during a forced dismount. This required that the exclusive and shared (refcnt) sleep lock functions check for MNTK_UNMOUNTF before sleeping, so that they won't block while nfscl_umount() is getting rid of the state. As such, a "struct mount *" argument was added to the locking functions. I believe the only remaining case where a forced dismount can get hung in the kernel is when a thread is already attempting to do a TCP connect to a dead server when the krpc client structure called nr_client is NULL. This will only happen just after a "mount -u" with options that force a new TCP connection is done, so it shouldn't be a problem in practice. MFC after: 2 weeks
*	Add a check for MNTK_UNMOUNTF at the beginning of nfs_sync()	rmacklem	2011-05-26	1	-1/+11
\| \| \| \| \| \| \| \| \|	in the new NFS client so that a forced dismount doesn't get stuck in the VFS_SYNC() call that happens before VFS_UNMOUNT() in dounmount(). Additional changes are needed before forced dismounts will work. MFC after: 2 weeks
*	Add some missing mutex locking to the new NFS client.	rmacklem	2011-05-25	1	-0/+2
\| \| \| \|	MFC after: 2 weeks