path: root/sys/nfsclient/nfs_socket.c
Commit message | Author | Age | Files | Lines
* Move the NFS/RPC code away from lbolt.ed2008-07-221-5/+6
| | | | | | | | | | | | | | The kernel has a special wchan called `lbolt', which is triggered each second. It doesn't seem to be used a lot and it seems pretty redundant, because we can specify a timeout value to the *sleep() routines. In an attempt to eventually remove lbolt, make the NFS/RPC code use a timeout of `hz' when trying to reconnect. Only the TTY code (not MPSAFE TTY) and the VFS syncer seem to use lbolt now. Reviewed by: attilio, jhb Approved by: philip (mentor), alfred, dfr
* Replaced the misleading uses of a historical artefact M_TRYWAIT with M_WAIT.ru2008-03-251-2/+2
| | | | | | | | | | Removed dead code that assumed that M_TRYWAIT can return NULL; it's not true since the advent of MBUMA. Reviewed by: arch There are ongoing disputes as to whether we want to switch to directly using UMA flags M_WAITOK/M_NOWAIT for mbuf(9) allocation.
* Consolidate the code to generate a new XID for a NFS request into ajhb2008-02-131-8/+1
| | | | | | | | nfs_xid_gen() function instead of duplicating the logic in both nfsm_rpchead() and the NFS3ERR_JUKEBOX handling in nfs_request(). MFC after: 1 week Submitted by: mohans (a long while ago)
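As a rough illustration of what a single shared XID generator can look like, here is a minimal userspace sketch. It is not the kernel's nfs_xid_gen(); the `xid_state` type and the `xid_seed()`/`xid_next()` names are hypothetical, the caller is assumed to hold whatever lock protects the state, and skipping zero is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generator state; the caller is assumed to serialize access. */
struct xid_state {
    uint32_t last;
};

/* Set the starting point (a real client would likely pick a random seed). */
void xid_seed(struct xid_state *st, uint32_t seed)
{
    st->last = seed;
}

/* Return the next XID; 0 is skipped (an assumption of this sketch). */
uint32_t xid_next(struct xid_state *st)
{
    if (++st->last == 0)
        st->last = 1;
    return st->last;
}
```

With one function like this, both the request-header path and a retry path can ask for a fresh XID without duplicating the increment-and-skip logic.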
* The previous revision broke the case of reconnecting to a TCP NFS serverjhb2008-01-111-1/+22
| | | | | | | | | | | | | | | via a new socket during an NFS operation as that reconnect takes place in the context of an arbitrary thread with an arbitrary credential. Ideally we would like to use the mount point's credential for the entire process of setting up the socket to connect to the NFS server. Since some of the APIs (sobind(), etc.) only take a thread pointer and infer the credential from that instead of a direct credential, work around the problem by temporarily changing the current thread's credential to that of the mount point while connecting the socket and then reverting back to the original credential when we are done. Reviewed by: rwatson Tested on: UDP, TCP, TCP with forced reconnect
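The workaround described above is a save/swap/restore pattern. The sketch below models it in userspace with simplified stand-in structures; `struct cred`, `connect_as()`, and `connect_with_mount_cred()` are hypothetical names for illustration, and the real code also has to deal with credential reference counting and locking, which is omitted here.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel structures (illustration only). */
struct cred   { int uid; };
struct thread { struct cred *td_ucred; };
struct mount  { struct cred *mnt_cred; };

/* Stand-in for an API that infers the credential from the thread. */
static int connect_as(struct thread *td)
{
    return td->td_ucred->uid;   /* report which uid the call ran as */
}

/* Run the connect step with the mount's credential, then restore. */
int connect_with_mount_cred(struct thread *td, struct mount *mp)
{
    struct cred *saved = td->td_ucred;  /* remember the caller's credential */
    int result;

    td->td_ucred = mp->mnt_cred;        /* temporarily act as the mount */
    result = connect_as(td);
    td->td_ucred = saved;               /* always revert before returning */
    return result;
}
```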
* Pass curthread to various socket routines (socreate(), sobind(), andjhb2008-01-101-1/+1
| | | | | | | | | soconnect()) instead of &thread0 when establishing a connection to the NFS server. Otherwise inconsistent credentials may be used when setting up the NFS socket. MFC after: 1 week Reviewed by: rwatson
* NFS MP scaling changes.mohans2007-10-121-71/+123
| | | | | | | | | | | | | | - Eliminate the hideous nfs_sndlock that serialized NFS/TCP request senders thru the sndlock. - Institute a new nfs_connectlock that serializes NFS/TCP reconnects. Add logic to wait for pending request senders to finish sending before reconnecting. Dial down the sb_timeo for NFS/TCP sockets to 1 sec. - Break out the nfs xid manipulation under a new nfs xid lock, rather than overloading the nfs request lock for this purpose. - Fix some of the locking in nfs_request. Many thanks to Kris Kennaway for his help with this and for initiating the MP scaling analysis and work. Kris also tested this patch thoroughly. Approved by: re@ (Ken Smith)
* Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, whichrwatson2007-08-061-42/+14
| | | | | | | | | | | | | | | previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminating quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)
* In nfs_down(), if rep can be NULL, which we test for, then we shouldrwatson2007-05-181-3/+4
| | | | | | | | | lock and unlock conditionally, not just set the flag on it conditionally. In practice, this bug couldn't manifest, as in the current revision of the code, no callers pass a NULL rep. CID: 1416 Found with: Coverity Prevent(tm)
* Back out a change to nfs_timer() that inadvertently crept in the last checkin :(mohans2007-03-091-1/+1
|
* Over NFS, an open() call could result in multiple over-the-wiremohans2007-03-091-1/+1
| | | | | | | | | | | | GETATTRs being generated - one from lookup()/namei() and the other from nfs_open() (for cto consistency). This change eliminates the GETATTR in nfs_open() if an otw GETATTR was done from the namei() path. Instead of extending the vop interface, we timestamp each attr load, and use this to detect whether a GETATTR was done from namei() for this syscall. Introduces a thread-local variable that counts the syscalls made by the thread and uses <pid, tid, thread syscalls> as the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on thread state that could be used as the timestamp with minimal overhead.
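The timestamp comparison described above can be sketched as follows. This is an assumption-laden illustration, not the committed code: the `attr_stamp` structure, its field widths, and the `open_needs_getattr()` helper are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical <pid, tid, thread syscalls> stamp recorded on each attr load. */
struct attr_stamp {
    int32_t  pid;
    int32_t  tid;
    uint64_t syscalls;  /* per-thread count of syscalls made so far */
};

static bool stamp_equal(const struct attr_stamp *a, const struct attr_stamp *b)
{
    return a->pid == b->pid && a->tid == b->tid &&
        a->syscalls == b->syscalls;
}

/*
 * Return true if open should issue its own GETATTR, i.e. the last
 * attribute load was not done by this thread during this same syscall.
 */
bool open_needs_getattr(const struct attr_stamp *last_load,
    const struct attr_stamp *now)
{
    return !stamp_equal(last_load, now);
}
```

Because the per-thread syscall counter changes on every system call, a stamp match can only mean the lookup path loaded the attributes moments earlier within the same open().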
* Backing out an earlier change. It seems harmless for NFS to miss the "forcemohans2007-02-161-6/+0
| | | | | unmount" flag, making the acquisition of the MNT_ILOCK in nfs_request() and nfs_sigintr() unnecessary. Pointed out by tegge@.
* Add missing MNT_ILOCK around some mnt_kern_flag accesses.mohans2007-02-111-0/+6
|
* NetApp filers return corrupt post op attrs in the wcc on NFS error responses.mohans2006-12-111-1/+8
| | | | | | | This is easy to reproduce for EROFS. I am not sure if the attrs can be corrupt for other NFS error responses. For now, disabling wcc pre-op attr checks and post-op attr loads on NFS errors (sysctl'ed). Reported by: Kris Kennaway
* bde@ pointed out that tprintf() acquires Giant so callers of tprintf() don'tmohans2006-11-271-6/+4
| | | | | | have to explicitly acquire Giant (although they need to be aware of this and not hold any locks at that point). Remove the acquisitions of Giant in the NFS client wrapping tprintf().
* 1) Fix up locking in nfs_up() and nfs_down().mohans2006-11-201-30/+39
| | | | | | | | | | | | | | | 2) Reduce the acquisitions of the Giant lock in the nfs_socket.c paths significantly. - We don't need to acquire Giant before tsleeping on lbolt anymore, since jhb special-cased lbolt handling in msleep. - nfs_up() needs to acquire Giant only if printing the "server up" message. - nfs_timer() held Giant for the duration of the NFS timer processing, just because the printing of the message in nfs_down() needed it (and we acquire other locks in nfs_timer()). The acquisition of Giant is moved down into nfs_down() now, reducing the time Giant is held in that path. Reported by: Kris Kennaway
* Make EWOULDBLOCK a recoverable error so that the request is retransmitted.mohans2006-10-311-2/+2
| | | | | | | This bug results in data corruption with NFS/TCP. Writes are silently dropped on EWOULDBLOCK (because socket send buffer is full and sockbuf timer fires). Reviewed by: ups@
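A minimal sketch of the idea, assuming a hypothetical three-way classification of send results: EWOULDBLOCK leads to a retransmit instead of the request being silently dropped. The `send_action` enum and `classify_send_error()` are illustrative names, and treating every other nonzero error as fatal is a simplification for the example, not a claim about the real code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcomes for a send attempt. */
enum send_action { SEND_DONE, SEND_RETRANSMIT, SEND_FAIL };

/* Decide what to do with the request after a send attempt. */
enum send_action classify_send_error(int error)
{
    if (error == 0)
        return SEND_DONE;
    if (error == EWOULDBLOCK)   /* send buffer full and the timer fired */
        return SEND_RETRANSMIT; /* keep the request and try again */
    return SEND_FAIL;           /* simplification: all else is fatal */
}
```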
* Fix for a deadlock triggered by a 'umount -f' causing a NFS request to nevermohans2006-08-291-2/+14
| | | | | | retransmit (or return). Thanks to John Baldwin for helping nail this one. Found by : Kris Kennaway
* soreceive_generic(), and sopoll_generic(). Add new functions sosend(),rwatson2006-07-241-11/+6
| | | | | | | | | | | | | | | | soreceive(), and sopoll(), which are wrappers for pru_sosend, pru_soreceive, and pru_sopoll, and are now used universally by socket consumers rather than either directly invoking the old so*() functions or directly invoking the protocol switch method (about an even split prior to this commit). This completes an architectural change that was begun in 1996 to permit protocols to provide substitute implementations, as now used by UDP. Consumers now uniformly invoke sosend(), soreceive(), and sopoll() to perform these operations on sockets -- in particular, distributed file systems and socket system calls. Architectural head nod: sam, gnn, wollman
* Signals may be delivered to process as well as to the thread. Check thekib2006-07-081-1/+3
| | | | | | | | thread-delivered signals in addition to the process one. Reviewed by: mohan MFC after: 1 month Approved by: kan (mentor)
* Refactor the NFS over UDP retransmit timeout estimation logic to allowcel2006-05-231-60/+131
| | | | | | | | | | | | | the estimator to be more easily tuned and maintained. There should be no functional change except there is now a lower limit on the retransmit timeout to prevent the client from retransmitting faster than the server's disks can fill requests, and an upper limit to prevent the estimator from taking too long to retransmit during a server outage. Reviewed by: mohan, kris, silby Sponsored by: Network Appliance, Incorporated
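The lower and upper limits amount to clamping whatever the estimator produces. The sketch below shows only that clamping step; `clamp_rto()` and its parameters are hypothetical, and the actual bounds and units used by the commit are not stated in the log.

```c
#include <assert.h>

/*
 * Clamp an estimated retransmit timeout into [rto_min, rto_max].
 * All three values are assumed to be in the same unit (e.g. ticks).
 */
int clamp_rto(int estimate, int rto_min, int rto_max)
{
    assert(rto_min <= rto_max);
    if (estimate < rto_min)
        return rto_min;     /* never retransmit faster than the floor */
    if (estimate > rto_max)
        return rto_max;     /* never wait longer than the ceiling */
    return estimate;
}
```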
* Changes to make the NFS client MP safe.mohans2006-05-191-171/+272
| | | | Thanks to Kris Kennaway for testing and sending lots of bugs my way.
* Fix a snafu caused while patching the previous fix from another branch.mohans2006-05-051-1/+0
|
* Fix for a NFS/TCP client bug which would cause the NFS/TCP stream to getmohans2006-05-051-0/+31
| | | | | out of sync under heavy loads, forcing frequent reconnects, causing EBADRPC errors etc.
* Fix a bug in the NFS/TCP retransmission path.kris2006-03-231-0/+1
| | | | | | | | | | | | | | | | | The bug was that earlier, if a request was retransmitted, we would do subsequent retransmits every 10 msecs. This can cause data corruption under moderate loads by reordering operations as seen by the client NFS attribute cache, and on the server side when the retransmission occurs after the original request has left the duplicate cache, since the operation will be committed for a second time. Further work on retransmission handling is needed (e.g. they are still being sent too often since they are scaled by HZ, and the size of the dup cache is too small and easily overwhelmed on busy servers). Submitted by: mohans
* If an NFS server returns more than a few EJUKEBOX errors for a given RPCcel2006-03-171-8/+4
| | | | | | | | | | | | | | | | request, the FreeBSD NFS client will quickly back off to an excessively long wait (days, then weeks) before retrying the request. Change the behavior of the FreeBSD NFS client to match the behavior of the reference NFS client implementation (Solaris). This provides a fixed delay of 10 seconds between each retry by default. A sysctl, called nfs3_jukebox_delay, is now available to tune the delay. Unlike Solaris, the sysctl value on FreeBSD is in seconds, rather than in HZ. Sponsored by: Network Appliance, Incorporated Reviewed by: rick Approved by: silby MFC after: 3 days
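To make the "fixed delay, tunable in seconds" behavior concrete, here is a small sketch. The `jukebox_wait_ticks()` helper and the `JUKEBOX_DELAY_DEFAULT_SEC` macro are hypothetical; only the 10-second default and the seconds-based tunable come from the commit message.

```c
#include <assert.h>

#define JUKEBOX_DELAY_DEFAULT_SEC 10  /* default stated in the commit message */

/*
 * Return how long to wait before retrying, in ticks.
 * delay_sec is the tunable (in seconds), hz is ticks per second.
 * The retry number is deliberately ignored: the delay does not grow.
 */
int jukebox_wait_ticks(int delay_sec, int hz, int retry)
{
    (void)retry;
    return delay_sec * hz;
}
```

Keeping the tunable in seconds means its meaning does not change when the kernel's tick rate does; the conversion to ticks happens in one place.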
* Don't log an error on tcp connection reset, even if we don't get ECONNRESET.rees2006-01-201-2/+2
| | | | Submitted by: cel@citi.umich.edu
* Improve upon rev 1.133 where NFS/TCP would not reconnect.ps2005-12-121-13/+2
| | | | Submitted by: Mohan Srinivasan
* Fix for a bug where NFS/TCP would not reconnect (in the case whereps2005-11-211-1/+12
| | | | | | | the server FIN'ed). Seen with Solaris NFS servers. Reported by: TOMITA Yoshinori <yoshint@flab.fujitsu.co.jp> Submitted by: Mohan Srinivasan
* fix a problem with XID re-use when a server returns NFSERR_JUKEBOX.rees2005-11-211-3/+8
| | | | | | | Submitted by: cel@citi.umich.edu Fixed by: rick@snowhite.cis.uoguelph.ca Approved by: alfred MFC after: 3 weeks
* Fix for a race between the thread transmitting the request and theps2005-11-031-1/+5
| | | | | | thread processing the reply. Submitted by: Mohan Srinivasan
* Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(),rwatson2005-09-191-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | as they both interact with the tty code (!MPSAFE) and may sleep if the tty buffer is full (per comment). Modify all consumers of uprintf() and tprintf() to hold Giant around calls into these functions. In most cases, this means adding an acquisition of Giant immediately around the function. In some cases (nfs_timer()), it means acquiring Giant higher up in the callout. With these changes, UFS no longer panics on SMP when either blocks are exhausted or inodes are exhausted under load due to races in the tty code when running without Giant. NB: Some reduction in calls to uprintf() in the svr4 code is probably desirable. NB: In the case of nfs_timer(), calling uprintf() while holding a mutex, or even in a callout at all, is a bad idea, and will generate warnings and potential upset. This needs to be fixed, but was a problem before this change. NB: uprintf()/tprintf() sleeping is generally a bad idea, as is having non-MPSAFE tty code. MFC after: 1 week
* Fix for a bug in the change that made nfs_timer() MPSAFE. We need tops2005-07-271-0/+2
| | | | | | | grab Giant before calling pru_send() (if running with mpsafenet = 0). Found by: Jeremie Le Hen. Fixed by: Maxime Henrion
* Make nfs_timer() MPSAFE. With this change, the bottom half of the NFSps2005-07-191-11/+21
| | | | | | | client (the interface with the protocol stack and callouts) is Giant-free. Submitted by: Mohan Srinivasan.
* Fix for a NFS soft mounts bug where if the number of retries exceedsps2005-07-181-1/+2
| | | | | | | | the max rexmits, the request was not being bounced back with an ETIMEDOUT error. Reported by: Oliver Lehmann Submitted by: Mohan Srinivasan
* Fixes for NFS crashes on architectures that require strict alignment.ps2005-07-141-1/+2
| | | | | | | | | | - Fix nfsm_disct() so that after pulling up data, the remaining data is aligned if necessary. - Fix nfs_clnt_tcp_soupcall() to bcopy() the rpc length out of the mbuf (instead of casting m_data to a uint32). Submitted by: Pyun YongHyeon Reviewed by: Mohan Srinivasan
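The second fix above is the usual cure for unaligned reads: copy the bytes into an aligned variable instead of dereferencing a cast pointer. A userspace sketch, using memcpy() in place of bcopy() and hypothetical helper names; the flag/length split follows the standard RPC record-marking layout (top bit is the last-fragment flag, low 31 bits are the length).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/*
 * Read the 4-byte record mark from a buffer that may not be 4-byte
 * aligned. Casting p to uint32_t * and dereferencing it can fault on
 * strict-alignment machines; copying the bytes out is always safe.
 */
uint32_t read_record_mark(const unsigned char *p)
{
    uint32_t raw;

    memcpy(&raw, p, sizeof(raw));   /* byte copy, no alignment needed */
    return ntohl(raw);              /* wire format is big-endian */
}

/* Strip the last-fragment flag to get the fragment length in bytes. */
uint32_t record_length(uint32_t mark)
{
    return mark & 0x7fffffffu;
}
```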
* set R_MUSTRESEND flag in mark_for_reconnect so re-connected requests getrees2005-05-101-12/+6
| | | | | | | | | | | re-sent instead of timing out. don't log an error message on reconnection, which is not an error. remove unused nfs_mrep_before_tsleep. Reviewed by: Mohan Srinivasan Approved by: alfred
* Fix a bug in NFS/TCP where retransmissions would not reliably happenps2005-05-041-3/+11
| | | | | | | if the server rebooted or tore down the connection for any reason. Found by: Jonathan Noack. Submitted by: Mohan Srinivasan.
* TCP reconnect is not an error.rees2005-04-181-3/+3
| | | | | | Change the message from LOG_ERR to LOG_INFO. Approved by: alfred
* - The NFS client was incorrectly masking SIGSTOP (which isps2005-03-231-19/+6
| | | | | | | | | | | non-maskable). - The NFS client needs to guard against spurious wakeups while waiting for the response. ltrace causes the process under question to wake up (possibly from ptrace()), which causes NFS to wake up from tsleep without the response being delivered. Submitted by: Mohan Srinivasan
* Minor cleanup in nfs_request() and removal of a comment that doesn'tps2005-02-261-10/+1
| | | | | | reflect reality. Submitted by: Mohan Srinivasan
* Fix for a potential NFS client race where shared data is updated fromps2005-02-181-0/+4
| | | | | | base context as well as the socket callback. Submitted by: Mohan Srinivasan
* /* -> /*- for license, minor formatting changesimp2005-01-071-1/+1
|
* If the NFS/TCP stream is out of sync between the client and server,ps2005-01-051-1/+1
| | | | | | | | and if the client (erroneously) reads the RPC length as 0 bytes, the client can loop around in the socket callback. Explicitly check for the length being 0 case and teardown/re-connect. Submitted by: Mohan Srinivasan
* Always issue wakeups() to the NFS requestors under the mutexps2004-12-071-7/+17
| | | | | | to close all potential cases of missed wakeups. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
* Rewrite of the NFS client's reply handling. We now have NFS socketps2004-12-061-401/+530
| | | | | | | | upcalls which do RPC header parsing and match up the reply with the request. NFS calls now sleep on the nfsreq structure. This enables us to eliminate the NFS recvlock. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
* In nfs_timer(), pass curthread rather than &thread0 into the protocolrwatson2004-08-251-4/+2
| | | | | | | | | | | send routine. In IPv6 UDP, the thread will be passed to suser(), which asserts that if a thread is used for a super user check, it be curthread. Many of these protocol entry points probably need to accept credentials instead of threads. MT5 candidate. Noticed/tested by: kuriyama
* Turn off SO_REUSEADDR and SO_REUSEPORT, they were causing EADDRINUSEalfred2004-07-131-5/+1
| | | | | | to be returned from the protocol stack. Pointy hat to me for not grokking what those options _really_ mean.
* Rename Alfred's kern_setsockopt to so_setsockopt, as this seems adwmalone2004-07-121-2/+2
| | | | | | | | a better name. I have a kern_[sg]etsockopt which I plan to commit shortly, but the arguments to these function will be quite different from so_setsockopt. Approved by: alfred
* Use SO_REUSEADDR and SO_REUSEPORT when reconnecting NFS mounts.alfred2004-07-121-2/+10
| | | | | | | Tune the timeout from 5 seconds to 12 seconds. Provide a sysctl to show how many reconnects the NFS client has done. Seems to fix IPv6 from: kuriyama
* Acquire socket lock in nfs_connect() connection/sleep loop to protectrwatson2004-07-061-6/+6
| | | | socket state and avoid missed wakeups.