path: root/sys/nfsclient
Commit message | Author | Age | Files | Lines
* Make EWOULDBLOCK a recoverable error so that the request is retransmitted.
  This bug results in data corruption with NFS/TCP. Writes are silently dropped on EWOULDBLOCK (because socket send buffer is full and sockbuf timer fires).
  Reviewed by: ups@
  Author: mohans | Age: 2006-10-31 | Files: 1 | Lines: -2/+2
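The idea behind this fix can be sketched as an error classifier: a send failure that is only transient should leave the request queued for retransmission instead of being treated as final. The helper name `send_error_is_recoverable` is hypothetical, and the set of errno values is an assumption limited to what the commit message states. This is not the code from the commit.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: report whether a send error is transient, meaning
 * the caller should keep the request and retransmit it later.
 * The commit message only says EWOULDBLOCK must be handled this way;
 * every other value is treated as non-recoverable in this sketch.
 */
static bool
send_error_is_recoverable(int error)
{
	switch (error) {
	case EWOULDBLOCK:	/* send buffer full and the send timed out */
		return true;
	default:
		return false;
	}
}
```

A caller would check this before discarding a write, so that a full socket buffer delays the request rather than dropping it.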
* Fixed some style bugs (especially ones involving long lines and use of __P(())). There are many more.
  Author: bde | Age: 2006-10-17 | Files: 1 | Lines: -17/+19
* Don't do null Setattr RPCs for VA_MARK_ATIME.
  When we added the VA_MARK_ATIME feature to fix POSIX conformance for execve() and mmap(), we thought that it was optimized well enough for the one file system that supports it (ffs) and harmless for other file systems (except layered ones which already get the layering for VOP_SETATTR() wrong). However, nfs_setattr() doesn't do much parameter checking, so when it gets a combination of parameters that it doesn't understand, it always does a Setattr RPC. This RPC can't do anything good, and for VA_MARK_ATIME it is null except for wasting a lot of time.
  This is the smallest and easiest to fix of several bugs that have increased the number of RPCs for kernel builds on nfs by more than 100% since 2004-11-05. The real-time increase depends on network latency and parallelization and can also be very large (approaching the same percentage for unparallelized operations like "make depend" on systems with fast CPUs and high-latency networks).
  Author: bde | Age: 2006-10-14 | Files: 1 | Lines: -2/+2
* First part of a little cleanup in the calendar/timezone/RTC handling.
  Move relevant variables to <sys/clock.h> and fix #includes as necessary. Use libkern's much more time- & space-efficient BCD routines.
  Author: phk | Age: 2006-10-02 | Files: 1 | Lines: -0/+1
* Add mnt_noasync counter to better handle interleaved calls to nmount(), sync() and sync_fsync() without losing MNT_ASYNC. Add MNTK_ASYNC flag which is set only when MNT_ASYNC is set and mnt_noasync is zero, and check that flag instead of MNT_ASYNC before initiating async io.
  Author: tegge | Age: 2006-09-26 | Files: 1 | Lines: -1/+1
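The counter-plus-derived-flag scheme described in this message can be sketched as follows. All `toy_` names and the flag values are placeholders invented for illustration. Only the rule itself (the kernel flag is set exactly when the async flag is set and the counter is zero) comes from the commit message, and locking is left out.

```c
#include <stdint.h>

/* Placeholder flag values for this sketch; not the kernel's definitions. */
#define TOY_MNT_ASYNC   0x1u	/* user asked for an async mount */
#define TOY_MNTK_ASYNC  0x1u	/* async I/O may actually be started */

struct toy_mount {
	uint32_t flag;		/* stands in for mnt_flag */
	uint32_t kern_flag;	/* stands in for mnt_kern_flag */
	int      noasync;	/* stands in for mnt_noasync */
};

/* Recompute the derived flag: set only if async is requested and the counter is zero. */
static void
toy_update_async(struct toy_mount *mp)
{
	if ((mp->flag & TOY_MNT_ASYNC) != 0 && mp->noasync == 0)
		mp->kern_flag |= TOY_MNTK_ASYNC;
	else
		mp->kern_flag &= ~TOY_MNTK_ASYNC;
}

/* Temporarily forbid async I/O without touching the user-visible flag. */
static void
toy_noasync_enter(struct toy_mount *mp)
{
	mp->noasync++;
	toy_update_async(mp);
}

/* Undo one toy_noasync_enter(); async I/O resumes once the counter hits zero. */
static void
toy_noasync_exit(struct toy_mount *mp)
{
	mp->noasync--;
	toy_update_async(mp);
}

/* What an I/O path would test before starting an async write. */
static int
toy_async_allowed(const struct toy_mount *mp)
{
	return (mp->kern_flag & TOY_MNTK_ASYNC) != 0;
}
```

Because callers only ever increment and decrement the counter, two overlapping "no async" sections can finish in any order and the original `TOY_MNT_ASYNC` request is still intact afterwards.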
* Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag. This eliminates a race where MNT_UPDATE flag could be lost when nmount() raced against sync(), sync_fsync() or quotactl().
  Author: tegge | Age: 2006-09-26 | Files: 1 | Lines: -3/+13
* Fixes up the handling of shared vnode lock lookups in the NFS client, adds a FS type specific flag indicating that the FS supports shared vnode lock lookups, adds some logic in vfs_lookup.c to test this flag and set lock flags appropriately.
  - amd on 6.x is a non-starter (without this change). Using amd under heavy load results in a deadlock (with cascading vnode locks all the way to the root) very quickly.
  - This change should also fix the more general problem of cascading vnode deadlocks when an NFS server goes down.
  Ideally, we wouldn't need these changes, as enabling shared vnode lock lookups globally would work. Unfortunately, UFS, for example isn't ready for shared vnode lock lookups, crashing pretty quickly.
  This change is the result of discussions with Stephan Uphoff (ups@).
  Reviewed by: ups@
  Author: mohans | Age: 2006-09-13 | Files: 5 | Lines: -14/+14
* Fix for a deadlock triggered by a 'umount -f' causing a NFS request to never retransmit (or return). Thanks to John Baldwin for helping nail this one.
  Found by: Kris Kennaway
  Author: mohans | Age: 2006-08-29 | Files: 1 | Lines: -2/+14
* Fix typos in comment.
  Author: thomas | Age: 2006-08-16 | Files: 1 | Lines: -1/+1
* Introduce a field to struct vm_page for storing flags that are synchronized by the lock on the object containing the page.
  Transition PG_WANTED and PG_SWAPINPROG to use the new field, eliminating the need for holding the page queues lock when setting or clearing these flags. Rename PG_WANTED and PG_SWAPINPROG to VPO_WANTED and VPO_SWAPINPROG, respectively.
  Eliminate the assertion that the page queues lock is held in vm_page_io_finish().
  Eliminate the acquisition and release of the page queues lock around calls to vm_page_io_finish() in kern_sendfile() and vfs_unbusy_pages().
  Author: alc | Age: 2006-08-09 | Files: 1 | Lines: -1/+1
* Add a new kernel environment variable "boot.netif.mtu" which is used to set the MTU prior to mounting root via NFS. This is required if the server supports a higher than default MTU because the client will not see the responses otherwise.
  MFC after: 3 weeks
  Author: brooks | Age: 2006-08-09 | Files: 1 | Lines: -0/+10
* soreceive_generic(), and sopoll_generic(). Add new functions sosend(), soreceive(), and sopoll(), which are wrappers for pru_sosend, pru_soreceive, and pru_sopoll, and are now used universally by socket consumers rather than either directly invoking the old so*() functions or directly invoking the protocol switch method (about an even split prior to this commit).
  This completes an architectural change that was begun in 1996 to permit protocols to provide substitute implementations, as now used by UDP. Consumers now uniformly invoke sosend(), soreceive(), and sopoll() to perform these operations on sockets -- in particular, distributed file systems and socket system calls.
  Architectural head nod: sam, gnn, wollman
  Author: rwatson | Age: 2006-07-24 | Files: 1 | Lines: -11/+6
* Signals may be delivered to process as well as to the thread. Check the thread-delivered signals in addition to the process one.
  Reviewed by: mohan
  MFC after: 1 month
  Approved by: kan (mentor)
  Author: kib | Age: 2006-07-08 | Files: 1 | Lines: -1/+3
* Always supply curthread as argument to nfs_asyncio and nfs_doio in nfs_strategy. Otherwise, for some buffers, signals would be ignored at the intr mounts.
  Reviewed by: mohan
  MFC after: 1 month
  Approved by: kan (mentor)
  Author: kib | Age: 2006-07-08 | Files: 1 | Lines: -8/+2
* There is a consensus that ifaddr.ifa_addr should never be NULL, except in places dealing with ifaddr creation or destruction; and in such special places incomplete ifaddrs should never be linked to system-wide data structures. Therefore we can eliminate all the superfluous checks for "ifa->ifa_addr != NULL" and get ready to the system crashing honestly instead of masking possible bugs.
  Suggested by: glebius, jhb, ru
  Author: yar | Age: 2006-06-29 | Files: 2 | Lines: -6/+7
* Use the elegant TAILQ_FOREACH() in place of a hand-rolled for() loop.
  Author: yar | Age: 2006-06-29 | Files: 1 | Lines: -3/+1
* Kris Kennaway found that for '/' NFS mounts, the MPSAFE mount flag was not being set, which means Giant would be acquired for these mounts.
  Author: mohans | Age: 2006-05-30 | Files: 1 | Lines: -1/+2
* Fix for a potential attempt to sleep while holding nm_mtx. Caught and reported by Witness (which forces the mbuf allocation flag to M_NOWAIT).
  Reported by: "sekes".
  Author: mohans | Age: 2006-05-26 | Files: 1 | Lines: -1/+1
* Call vm_object_page_clean() with the object lock held.
  Submitted by: kensmith@
  Reviewed by: mohans@
  MFC after: 6 days
  Author: ups | Age: 2006-05-25 | Files: 1 | Lines: -0/+2
* Do not set B_NOCACHE on buffers when releasing them in flushbuflist().
  If B_NOCACHE is set the pages of vm backed buffers will be invalidated. However clean buffers can be backed by dirty VM pages so invalidating them can lead to data loss.
  Add support for flush dirty page in the data invalidation function of some network file systems.
  This fixes data losses during vnode recycling (and other code paths using invalbuf(*,V_SAVE,*,*)) for data written using an mmaped file.
  Collaborative effort by: jhb@,mohans@,peter@,ps@,ups@
  Reviewed by: tegge@
  MFC after: 7 days
  Author: ups | Age: 2006-05-25 | Files: 1 | Lines: -0/+11
* Since NFSv4 is not SMP safe, nfsiod needs to acquire Giant for NFSv4 mounts before doing the read/write.
  Reported by: Chuck Lever.
  Author: mohans | Age: 2006-05-24 | Files: 2 | Lines: -0/+9
* Adjust minimum iod threads from 4 to 0 -- since we compile the NFS client into the kernel by default, and many users won't use NFS, don't start an extra 4 kernel threads that are unused. Once NFS becomes active, it will start nfsiod's as it needs them.
  We might consider mandating a minimum iod's equal to the number of active NFS mounts (truncated to some value), which would force some to remain available without having to create a new one if the file system is mostly inactive.
  PR: 70880
  MFC after: 2 weeks
  Prodded by: cel
  Head nod: peter
  Pointed out by: Joe <fbsd_user at a1poweruser dot com>
  Author: rwatson | Age: 2006-05-24 | Files: 1 | Lines: -1/+1
* NFS over TCP retransmit behavior should default to a 60 second time out, mimicking the NFS reference implementation. NFS over TCP does not need fast retransmit timeouts, since network loss and congestion are managed by the transport (TCP), unlike with NFS over UDP. A long timeout prevents the unnecessary retransmission of non-idempotent NFS requests.
  Reviewed by: mohans, silby, rees?
  Sponsored by: Network Appliance, Incorporated
  Author: cel | Age: 2006-05-23 | Files: 2 | Lines: -3/+9
* Refactor the NFS over UDP retransmit timeout estimation logic to allow the estimator to be more easily tuned and maintained.
  There should be no functional change except there is now a lower limit on the retransmit timeout to prevent the client from retransmitting faster than the server's disks can fill requests, and an upper limit to prevent the estimator from taking too long to retransmit during a server outage.
  Reviewed by: mohan, kris, silby
  Sponsored by: Network Appliance, Incorporated
  Author: cel | Age: 2006-05-23 | Files: 3 | Lines: -62/+158
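The lower and upper limits mentioned in this message amount to clamping whatever the estimator produces. A minimal sketch follows. The function name and both bounds are hypothetical, since the commit message does not give the actual values or units.

```c
/* Hypothetical bounds in milliseconds; the real limits are not stated in the log. */
#define TOY_RTO_MIN_MS   100
#define TOY_RTO_MAX_MS 30000

/*
 * Clamp an estimated retransmit timeout into [TOY_RTO_MIN_MS, TOY_RTO_MAX_MS].
 * The floor keeps the client from retransmitting faster than the server can
 * answer; the ceiling keeps it from waiting too long during an outage.
 */
static int
toy_clamp_rto(int estimate_ms)
{
	if (estimate_ms < TOY_RTO_MIN_MS)
		return TOY_RTO_MIN_MS;
	if (estimate_ms > TOY_RTO_MAX_MS)
		return TOY_RTO_MAX_MS;
	return estimate_ms;
}
```

Keeping the clamp separate from the estimator means the estimate itself can be tuned without touching the safety limits.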
* Vnode locks are recursive and the NFS client supports shared vnode locks.
  Found by: Kris Kennaway.
  Author: mohans | Age: 2006-05-23 | Files: 1 | Lines: -0/+5
* Changes to make the NFS client MP safe.
  Thanks to Kris Kennaway for testing and sending lots of bugs my way.
  Author: mohans | Age: 2006-05-19 | Files: 10 | Lines: -450/+919
* Fix a snafu caused while patching the previous fix from another branch.
  Author: mohans | Age: 2006-05-05 | Files: 1 | Lines: -1/+0
* Fix for a NFS/TCP client bug which would cause the NFS/TCP stream to get out of sync under heavy loads, forcing frequent reconnects, causing EBADRPC errors etc.
  Author: mohans | Age: 2006-05-05 | Files: 1 | Lines: -0/+31
* Keep track of the number of in-progress async direct IO writes in the nfsnode. Make fsync/close wait until all of these drain. Add a check to nfs_getpage() and nfs_putpage().
  Author: mohans | Age: 2006-04-06 | Files: 3 | Lines: -5/+36
* - Busy the filesystem in nfs_statfs to prevent us from creating a new vnode after vflush() has succeeded. This would cause a dangling vnode panic at unmount time otherwise. Other filesystems may have this problem via their VFS_VGET() routines.
  Found by: kris
  Sponsored by: Isilon Systems, Inc.
  Author: jeff | Age: 2006-04-01 | Files: 1 | Lines: -1/+7
* Fix a bug in the NFS/TCP retransmission path.
  The bug was that earlier, if a request was retransmitted, we would do subsequent retransmits every 10 msecs. This can cause data corruption under moderate loads by reordering operations as seen by the client NFS attribute cache, and on the server side when the retransmission occurs after the original request has left the duplicate cache, since the operation will be committed for a second time.
  Further work on retransmission handling is needed (e.g. they are still being sent too often since they are scaled by HZ, and the size of the dup cache is too small and easily overwhelmed on busy servers).
  Submitted by: mohans
  Author: kris | Age: 2006-03-23 | Files: 1 | Lines: -0/+1
* Actually I wanted 'nolockd' here instead of 'lockd'.
  MFC after: 2 days
  Author: pjd | Age: 2006-03-19 | Files: 1 | Lines: -1/+1
* If an NFS server returns more than a few EJUKEBOX errors for a given RPC request, the FreeBSD NFS client will quickly back off to an excessively long wait (days, then weeks) before retrying the request.
  Change the behavior of the FreeBSD NFS client to match the behavior of the reference NFS client implementation (Solaris). This provides a fixed delay of 10 seconds between each retry by default. A sysctl, called nfs3_jukebox_delay, is now available to tune the delay. Unlike Solaris, the sysctl value on FreeBSD is in seconds, rather than in HZ.
  Sponsored by: Network Appliance, Incorporated
  Reviewed by: rick
  Approved by: silby
  MFC after: 3 days
  Author: cel | Age: 2006-03-17 | Files: 1 | Lines: -8/+4
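The fixed-delay policy and the seconds-based tunable can be sketched like this. The `toy_` names are hypothetical stand-ins. Only the 10 second default, the fact that the tunable is expressed in seconds, and the absence of backoff come from the commit message.

```c
/* Stands in for the nfs3_jukebox_delay sysctl; default taken from the commit message. */
static int toy_jukebox_delay_sec = 10;

/* Convert the tunable (seconds) to clock ticks; hz is the number of ticks per second. */
static long
toy_jukebox_delay_ticks(int hz)
{
	return (long)toy_jukebox_delay_sec * hz;
}

/* Every retry waits the same amount of time, no matter how many came before. */
static long
toy_jukebox_wait(int retry_count, int hz)
{
	(void)retry_count;	/* deliberately unused: no backoff */
	return toy_jukebox_delay_ticks(hz);
}
```

Storing the value in seconds and multiplying by `hz` only at the point of use keeps the tunable meaningful across systems with different tick rates.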
* Fix a bug in NFSv3 READDIRPLUS reply processing
  The client's READDIRPLUS logic skips the attributes and filehandle of the ".." entry. If the server doesn't send attributes but does send a filehandle for "..", the client's logic doesn't account for the extra "value follows" field that indicates whether the filehandle is present, causing the remaining entries in the reply to be ignored.
  Sponsored by: Network Appliance, Inc.
  Reviewed by: rick, mohans
  Approved by: silby
  MFC after: 2 weeks
  Author: cel | Age: 2006-03-08 | Files: 1 | Lines: -1/+5
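The parsing rule at stake can be illustrated with a small sketch: in an NFSv3 READDIRPLUS entry, the attributes and the filehandle each have their own "value follows" word, so the second flag must be read even when the first one is false. The function below is a hypothetical illustration, not the client's code. It assumes the XDR words have already been converted to host byte order, and it uses the NFSv3 wire sizes (84 bytes of attributes, filehandles of at most 64 bytes).

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_FATTR3_WORDS 21	/* fattr3 occupies 84 bytes on the wire */
#define TOY_FH3_MAXBYTES 64	/* largest NFSv3 filehandle */

/*
 * Skip the optional attributes and optional filehandle that trail one
 * READDIRPLUS entry. `w` holds XDR words in host byte order and `n` is the
 * number of words available. Returns the number of words consumed, or 0 if
 * the buffer is too short or the handle length is invalid.
 */
static size_t
toy_skip_attr_and_fh(const uint32_t *w, size_t n)
{
	size_t pos = 0;

	/* "attributes follow" flag, then the attributes only if it is set. */
	if (pos >= n)
		return 0;
	if (w[pos++] != 0) {
		if (n - pos < TOY_FATTR3_WORDS)
			return 0;
		pos += TOY_FATTR3_WORDS;
	}

	/* "handle follows" flag is always present, whatever the first flag said. */
	if (pos >= n)
		return 0;
	if (w[pos++] != 0) {
		uint32_t fhlen;
		size_t fhwords;

		if (pos >= n)
			return 0;
		fhlen = w[pos++];		/* handle length in bytes */
		if (fhlen > TOY_FH3_MAXBYTES)
			return 0;
		fhwords = (fhlen + 3) / 4;	/* opaque data is padded to 4 bytes */
		if (n - pos < fhwords)
			return 0;
		pos += fhwords;
	}
	return pos;
}
```

For a reply with no attributes but an 8-byte handle, the words consumed are one flag, one flag, one length word and two data words.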
* Don't log an error on tcp connection reset, even if we don't get ECONNRESET.
  Submitted by: cel@citi.umich.edu
  Author: rees | Age: 2006-01-20 | Files: 1 | Lines: -2/+2
* I ran into an nfs client panic a couple of times in a row over the last few days. I tracked it down to the fact that nfs_reclaim() is setting vp->v_data to NULL _before_ calling vnode_destroy_object().
  After silence from the mailing list I checked further and discovered that ufs_reclaim() is unique among FreeBSD filesystems for calling vnode_destroy_object() early, long before tossing v_data or much of anything else, for that matter. The rest, including NFS, appear to be identical, as if they were just clones of one original routine.
  The enclosed patch fixes all file systems in essentially the same way, by moving the call to vnode_destroy_object() to early in the routine (before the call to vfs_hash_remove(), if any). I have only tested NFS, but I've now run for over eighteen hours with the patch where I wouldn't get past four or five without it.
  Submitted by: Frank Mayhar
  Requested by: Mohan Srinivasan
  MFC After: 1 week
  Author: alfred | Age: 2006-01-17 | Files: 1 | Lines: -1/+5
* In nfs_dolock(), GC now under-used ioflg, rendered obsolete when we moved from using a fifo to talk to rpc.lockd to using a special device node.
  Noticed by: Coverity Prevent analysis tool
  MFC after: 3 days
  Author: rwatson | Age: 2006-01-13 | Files: 1 | Lines: -4/+1
* Add marker vnodes to ensure that all vnodes associated with the mount point are iterated over when using MNT_VNODE_FOREACH.
  Reviewed by: truckman
  Author: tegge | Age: 2006-01-09 | Files: 1 | Lines: -2/+3
* Correct a typo
  Author: delphij | Age: 2005-12-28 | Files: 1 | Lines: -1/+1
* Improve upon rev 1.133 where NFS/TCP would not reconnect.
  Submitted by: Mohan Srinivasan
  Author: ps | Age: 2005-12-12 | Files: 1 | Lines: -13/+2
* Unexpand LLADDR().
  Author: ru | Age: 2005-11-29 | Files: 1 | Lines: -2/+2
* Fix for a bug where NFS/TCP would not reconnect (in the case where the server FIN'ed). Seen with Solaris NFS servers.
  Reported by: TOMITA Yoshinori <yoshint@flab.fujitsu.co.jp>
  Submitted by: Mohan Srinivasan
  Author: ps | Age: 2005-11-21 | Files: 1 | Lines: -1/+12
* - Always return success from NFS strategy. nfs_doio(), in the event of an error, does the right thing, in terms of setting the error flags in the buf header. That fixes a crash from bstrategy().
  - Treat ETIMEDOUT as a "recoverable" error, causing the buffer to be re-dirtied. ETIMEDOUT can occur on soft mounts, when the number of retries are exceeded, and we don't want data loss in that case.
  Submitted by: Mohan Srinivasan
  Author: ps | Age: 2005-11-21 | Files: 2 | Lines: -5/+4
* fix a problem with XID re-use when a server returns NFSERR_JUKEBOX.
  Submitted by: cel@citi.umich.edu
  Fixed by: rick@snowhite.cis.uoguelph.ca
  Approved by: alfred
  MFC after: 3 weeks
  Author: rees | Age: 2005-11-21 | Files: 3 | Lines: -7/+13
* fix a crash when an nfsv2 mount fails
  MFC after: 1 week
  Author: jon | Age: 2005-11-10 | Files: 1 | Lines: -2/+4
* Fix for a crash (from nfs_lookup() in an error case).
  Submitted by: Mohan Srinivasan
  Author: ps | Age: 2005-11-03 | Files: 1 | Lines: -1/+1
* In nfs_flush(), clear the NMODIFIED bit only if there are no dirty buffers *and* there are no buffers queued up for writing.
  The bug was that NMODIFIED was being cleared even while there were buffers scheduled to be written out, which leads to all sorts of interesting bugs - one where the file could shrink (because of a post-op getattr load, say) causing data in buffer(s) queued for write to be tossed, resulting in data corruption.
  Submitted by: Mohan Srinivasan
  Author: ps | Age: 2005-11-03 | Files: 1 | Lines: -1/+2
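The corrected condition can be written down as a two-part test. The structure, field names and flag value below are hypothetical placeholders for illustration. Only the rule (clear the bit when there are no dirty buffers and no writes still queued) is taken from the commit message.

```c
#define TOY_NMODIFIED 0x1u	/* placeholder value for the node's "modified" bit */

struct toy_node {
	unsigned flags;
	int dirty_bufs;		/* buffers that still hold unwritten data */
	int writes_in_flight;	/* buffers already queued for writing */
};

/* Clear the modified bit only when nothing is dirty and nothing is still queued. */
static void
toy_flush_done(struct toy_node *np)
{
	if (np->dirty_bufs == 0 && np->writes_in_flight == 0)
		np->flags &= ~TOY_NMODIFIED;
}
```

Checking only `dirty_bufs` would reproduce the bug described above: a queued write is no longer counted as dirty, yet its data has not reached the server.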
* Fix for a race between the thread transmitting the request and the thread processing the reply.
  Submitted by: Mohan Srinivasan
  Author: ps | Age: 2005-11-03 | Files: 1 | Lines: -1/+5
* Normalize a significant number of kernel malloc type names:
  - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat.
  - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters.
  - Disambiguate some collisions by adding subsystem prefixes to some memory types.
  - Generally prefer lower case to upper case.
  - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases.
  Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
  Author: rwatson | Age: 2005-10-31 | Files: 3 | Lines: -8/+8
* - Fix leak of struct nlminfo on process exit.
  - Fix malloc type collision, that made the above problem difficult to understand.
  Reported by: Vladimir Sharun <sharun ukr.net>
  Author: glebius | Age: 2005-10-26 | Files: 2 | Lines: -3/+15