summaryrefslogtreecommitdiffstats
path: root/sys/rpc
Commit message (Collapse)AuthorAgeFilesLines
* MFC r261449:mav2014-02-071-1/+1
| | | | Fix lock acquisition in case no request space available, missed in r260097.
* MFC r260229, r260258, r260367, r260390, r260459, r260648:mav2014-01-224-22/+146
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rework NFS Duplicate Request Cache cleanup logic. - Introduce additional hash to group requests by hash of sockref. This allows to process TCP acknowledgements without looping though all the cache, and as result allows to do it every time. - Indroduce additional callbacks to notify application layer about sockets disconnection. Without this last few requests processed just before socket disconnection never processed their ACKs and stuck in cache for many hours. - Implement transport-specific method for tracking reply acknowledgements. New implementation does not cross multiple stack layers to get the data and does not have race conditions that previously made some requests stuck in cache. This could be done more efficiently at sockbuf layer, but that would broke some KBIs, while I don't know other consumers for it aside NFS. - Instead of traversing all DRC twice per request, run cleaning only once per request, and except in some conditions traverse only single hash slot at a time. Together this limits NFS DRC growth only to situations of real connectivity problems. If network is working well, and so all replies are acknowledged, cache remains almost empty even after hours of heavy load. Without this change on the same test cache was growing to many thousand requests even with perfectly working local network. As another result this reduces CPU time spent on the DRC handling during SPEC NFS benchmark from about 10% to 0.5%. Sponsored by: iXsystems, Inc.
* MFC r260097:mav2014-01-222-53/+54
| | | | | | | Move most of NFS file handle affinity code out of the heavily congested global RPC thread pool lock and protect it with own set of locks. On synthetic benchmarks this improves peak NFS request rate by 40%.
* MFC r260036:mav2014-01-224-11/+25
| | | | | | Introduce xprt_inactive_self() -- variant for use when sure that port is assigned to thread. For example, withing receive handlers. In that case the function reduces to single assignment and can avoid locking.
* MFC r260031:mav2014-01-221-1/+9
| | | | | In addition to r259632 completely block receive upcalls if we have more data than we need. This reduces lock pressure from xprt_active() side.
* MFC r259828:mav2014-01-221-4/+7
| | | | Fix a bug introduced at r259632, triggering infinite loop in some cases.
* MFC r259659, r259662:mav2014-01-222-41/+46
| | | | | | | | | | | | | Remove several linear list traversals per request from RPC server code. Do not insert active ports into pool->sp_active list if they are success- fully assigned to some thread. This makes that list include only ports that really require attention, and so traversal can be reduced to simple taking the first one. Remove idle thread from pool->sp_idlethreads list when assigning some work (port of requests) to it. That again makes possible to replace list traversals with simple taking the first element.
* MFC r259632:mav2014-01-221-128/+115
| | | | | | | | | | | | | | | | | | | | | Rework flow control for connection-oriented (TCP) RPC server. When processing receive buffer, write the amount of data, expected in present request record, into socket's so_rcv.sb_lowat to make stack aware about our needs. When processing following upcalls, ignore them until socket collect enough data to be read and processed in one turn. This change reduces number of context switches and other operations in RPC stack during large NFS writes (especially via non-Jumbo networks) by order of magnitude. After precessing current packet, take another look into the pending buffer to find out whether the next packet had been already received. If not, deactivate this port right there without making RPC code to push this port to another thread just to find that there is nothing. If the next packet is received partially, also deactivate the port, but also update socket's so_rcv.sb_lowat to not be woken up prematurely. This change additionally reduces number of context switches per NFS request about in half.
* MFC r258578, r258580, r258581 (by hrs):mav2014-01-2230-812/+748
| | | | | Replace Sun RPC license in TI-RPC library with a 3-clause BSD license with the explicit permissions.
* MFC r258132:mav2014-01-221-21/+27
| | | | | | | | Some minor tuning to rpc/svc.c: - close cosmetic race in svc_exit(); - do not set wait timeout for idle threads if we have no use for wakeups; - create new requested thread sooner, not only after some another thread wakeup, that may happen later under constant load.
* MFC r259842:dim2013-12-283-8/+1
| | | | | | | | | | | Remove some unused static const strings under sys/rpc, which have never been used since the initial commit (r177633). MFC r259843: Move a static const variable to the #if 0 part where it is only used. (Note the #if 0 part has been inactive since the initial commit, r177633, so maybe it should be removed altogether).
* It was reported via email that the cu_sent field used by thermacklem2013-09-061-0/+2
| | | | | | | | | | | | | | | krpc client side UDP was observed as way out of range and caused the rpc.lockd daemon to hang trying to do an RPC. Inspection of the code found two places where the RPC request is re-queued, but the value of cu_sent was not incremented. Since cu_sent is always decremented when the RPC request is dequeued, I think this could have caused cu_sent to go out of range. This patch adds lines to increment cu_sent for these two cases. Reported by: dwhite@ixsystems.com Discussed with: dwhite@ixsystems.com MFC after: 2 weeks
* Add support for host-based (Kerberos 5 service principal) initiatorrmacklem2013-07-092-18/+127
| | | | | | | | credentials to the kernel rpc. Modify the NFSv4 client to add support for the gssname and allgssname mount options to use this capability. Requires the gssd daemon to be running with the "-h" option. Reviewed by: jhb
* Fix a potential socket leak in the NFS server. If a client closes itsjhb2013-04-081-1/+4
| | | | | | | | | | | | | | connection after it was accepted by the userland nfsd process but before it was handled off to svc_vc_create() in the kernel, then svc_vc_create() would see it as a new listen socket and try to listen on it leaving a dangling reference to the socket. Instead, check for disconnected sockets and treat them like a connected socket. The call to pru_getaddr() should fail and cause svc_vc_create() to fail. Note that we need to lock the socket to get a consistent snapshot of so_state since there is a window in soisdisconnected() where both flags are clear. Reviewed by: dfr, rmacklem MFC after: 1 week
* Improve error handling when unwrapping received data.gnn2013-04-041-1/+16
| | | | | Submitted by: Rick Macklem MFC after: 1 week
* Revert 195703 and 195821 as this special stop handling in NFS is nowjhb2013-03-132-5/+3
| | | | | implemented via VFCF_SBDRY rather than passing PBDRY to individual sleep calls.
* Use m_get(), m_gethdr() and m_getcl() instead of historic macros.glebius2013-03-127-15/+8
| | | | Sponsored by: Nginx, Inc.
* Add support for backchannels to the kernel RPC. Backchannelsrmacklem2012-12-086-98/+405
| | | | | | | | | | | | are used by NFSv4.1 for callbacks. A backchannel is a connection established by the client, but used for RPCs done by the server on the client (callbacks). As a result, this patch mixes some client side calls in the server side and vice versa. Some definitions in the .c files were extracted out into a file called krpc.h, so that they could be included in multiple .c files. This code has been in projects/nfsv4.1-client for some time. Although no one has given it a formal review, I believe kib@ has taken a look at it.
* Mechanically substitute flags from historic mbuf allocator withglebius2012-12-058-14/+14
| | | | | | | | | malloc(9) flags within sys. Exceptions: - sys/contrib not touched - sys/mbuf.h edited manually
* Modify the comment to take out the names and URL.rmacklem2012-10-251-6/+3
| | | | | Requested by: kib MFC after: 3 days
* Add a comment describing why r241097 was done.rmacklem2012-10-151-0/+11
| | | | | Suggested by: rwatson MFC after: 1 week
* rpc: convert all uid and gid variables to u_int.pfg2012-10-041-4/+4
| | | | | | | | | | | After further discussion, instead of pretending to use uid_t and gid_t as upstream Solaris and linux try to, we are better using u_int, which is in fact what the code can handle and best approaches the range of values used by uid and gid. Discussed with: bde Reviewed by: bde
* libtirpc: be sure to free cl_netid and cl_tppfg2012-10-021-0/+4
| | | | | | | | | | | | | When creating a client with clnt_tli_create, it uses strdup to copy strings for these fields if nconf is passed in. clnt_dg_destroy frees these strings already. Make sure clnt_vc_destroy frees them in the same way. This change matches the reference (OpenSolaris) implementation. Tested by: David Wolfskill Obtained from: Bull GNU/Linux NFSv4 Project (libtirpc) MFC after: 2 weeks
* RPC: Convert all uid and gid variables of the type uid_t and gid_t.pfg2012-10-021-5/+4
| | | | | | | | This matches what upstream (OpenSolaris) does. Tested by: David Wolfskill Obtained from: Bull GNU/Linux NFSv4 project (libtirpc) MFC after: 3 days
* Attila Bogar and Herbert Poeckl both reported similar problemsrmacklem2012-10-011-3/+4
| | | | | | | | | | | | | | w.r.t. a Linux NFS client doing a krb5 NFS mount against the FreeBSD server. We determined this was a Linux bug: http://www.spinics.net/lists/linux-nfs/msg32466.html, however the mount failed to work, because the Destroy operation with a bogus encrypted checksum destroyed the authenticator handle. This patch changes the rpcsec_gss code so that it doesn't Destroy the authenticator handle for this case and, as such, the Linux mount will work. Tested by: Attila Bogar and Herbert Poeckl MFC after: 2 weeks
* Complete revert of r239963:pfg2012-09-272-11/+5
| | | | | | | | | | | | | | | The attempt to merge changes from the linux libtirpc caused rpc.lockd to exit after startup under unclear conditions. After many hours of selective experiments and inconsistent results the conclusion is that it's better to just revert everything and restart in a future time with a much smaller subset of the changes. ____ MFC after: 3 days Reported by: David Wolfskill Tested by: David Wolfskill
* Partial revert of r239963:pfg2012-09-241-4/+0
| | | | | | | | | | | | | | | | | The following change caused rpc.lockd to exit after startup: ____ libtirpc: be sure to free cl_netid and cl_tp When creating a client with clnt_tli_create, it uses strdup to copy strings for these fields if nconf is passed in. clnt_dg_destroy frees these strings already. Make sure clnt_vc_destroy frees them in the same way. ____ MFC after: 3 days Reported by: David Wolfskill Tested by: David Wolfskill
* Fix RPC headers for C++pfg2012-09-022-12/+12
| | | | | | | | | C++ mangling will cause trouble with variables like __rpc_xdr in xdr.h so rename this to XDR. While here add proper C++ guards to RPC headers. PR: 137443 MFC after: 2 weeks
* Bring some changes from Bull's NFSv4 libtirpc implementation.pfg2012-09-013-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We especifically ignored the glibc compatibility changes but this should help interaction with Solaris and Linux. ____ Fixed infinite loop in svc_run() author Steve Dickson Tue, 10 Jun 2008 12:35:52 -0500 (13:35 -0400) Fixed infinite loop in svc_run() ____ __rpc_taddr2uaddr_af() assumes the netbuf to always have a non-zero data. This is a bad assumption and can lead to a seg-fault. This patch adds a check for zero length and returns NULL when found. author Steve Dickson Mon, 27 Oct 2008 11:46:54 -0500 (12:46 -0400) ____ Changed clnt_spcreateerror() to return clearer and more concise error messages. author Steve Dickson Thu, 20 Nov 2008 08:55:31 -0500 (08:55 -0500) ____ Converted all uid and gid variables of the type uid_t and gid_t. author Steve Dickson Wed, 28 Jan 2009 12:44:46 -0500 (12:44 -0500) ____ libtirpc: set r_netid and r_owner in __rpcb_findaddr_timed These fields in the rpcbind GETADDR call are being passed uninitialized to CLNT_CALL. In the case of x86_64 at least, this usually leads to a segfault. On x86, it sometimes causes segfaults and other times causes garbage to be sent on the wire. rpcbind generally ignores the r_owner field for calls that come in over the wire, so it really doesn't matter what we send in that slot. We just need to send something. The reference implementation from Sun seems to send a blank string. Have ours follow suit. author Jeff Layton Fri, 13 Mar 2009 11:44:16 -0500 (12:44 -0400) ____ libtirpc: be sure to free cl_netid and cl_tp When creating a client with clnt_tli_create, it uses strdup to copy strings for these fields if nconf is passed in. clnt_dg_destroy frees these strings already. Make sure clnt_vc_destroy frees them in the same way. author Jeff Layton Fri, 13 Mar 2009 11:47:36 -0500 (12:47 -0400) Obtained from: Bull GNU/Linux NFSv4 Project MFC after: 3 weeks
* Both a crash reported on freebsd-current on Oct. 18 under thermacklem2011-11-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | subject heading "mtx_lock() of destroyed mutex on NFS" and PR# 156168 appear to be caused by clnt_dg_destroy() closing down the socket prematurely. When to close down the socket is controlled by a reference count (cs_refs), but clnt_dg_create() checks for sb_upcall being non-NULL to decide if a new socket is needed. I believe the crashes were caused by the following race: clnt_dg_destroy() finds cs_refs == 0 and decides to delete socket clnt_dg_destroy() then loses race with clnt_dg_create() for acquisition of the SOCKBUF_LOCK() clnt_dg_create() finds sb_upcall != NULL and increments cs_refs to 1 clnt_dg_destroy() then acquires SOCKBUF_LOCK(), sets sb_upcall to NULL and destroys socket This patch fixes the above race by changing clnt_dg_destroy() so that it acquires SOCKBUF_LOCK() before testing cs_refs. Tested by: bz PR: 156168 Reviewed by: dfr MFC after: 2 weeks
* Remove an extraneous "already" from a comment introduced by r226081.rmacklem2011-10-071-1/+1
| | | | | Submitted by: bf1783 at googlemail.com MFC after: 3 days
* A crash reported on freebsd-fs@ on Sep. 23, 2011 under the subjectrmacklem2011-10-071-10/+48
| | | | | | | | | | | | | | | | | | | | | heading "kernel panics with RPCSEC_GSS" appears to be caused by a corrupted tailq list for the client structure. Looking at the code, calls to the function svc_rpc_gss_forget_client() were done in an SMP unsafe manner, with the svc_rpc_gss_lock only being acquired in the function and not before it. As such, when multiple threads called svc_rpc_gss_forget_client() concurrently, it could try and remove the same client structure from the tailq lists multiple times. The patch fixes this by moving the critical code into a separate function called svc_rpc_gss_forget_client_locked(), which must be called with the lock held. For the one case where the caller would have no interest in the lock, svc_rpc_gss_forget_client() was retained, but a loop was added to check that the client structure is still in the tailq lists before removing it, to make it safe for multiple concurrent calls. Tested by: clinton.adams at gmail.com (earlier version) Reviewed by: zkirsch MFC after: 3 days
* Make sure RPC calls over UDP return RPC_INTR status is the process hasart2011-08-281-2/+5
| | | | | | | | | | been interrupted in a restartable syscall. Otherwise we could end up in an (almost) endless loop in clnt_reconnect_call(). PR: kern/160198 Reviewed by: rmacklem Approved by: re (kib), avg (mentor) MFC after: 1 week
* Fix the kgssapi so that it can be loaded as a module. Currentlyrmacklem2011-06-192-0/+269
| | | | | | | | | | | | | the NFS subsystems use five of the rpcsec_gss/kgssapi entry points, but since it was not obvious which others might be useful, all nineteen were included. Basically the nineteen entry points are set in a structure called rpc_gss_entries and inline functions defined in sys/rpc/rpcsec_gss.h check for the entry points being non-NULL and then call them. A default value is returned otherwise. Requested by rwatson. Reviewed by: jhb MFC after: 2 weeks
* This patch is believed to fix a problem in the kernel rpc forrmacklem2011-04-274-6/+11
| | | | | | | | | | | | | | non-interruptible NFS mounts, where a kernel thread will seem to be stuck sleeping on "rpccon". The msleep() in clnt_vc_create() that was waiting to a TCP connect to complete would return ERESTART, since PCATCH was specified. Then the tsleep() in clnt_reconnect_call() would sleep for 1 second and then try again and again and... The patch changes the msleep() in clnt_vc_create() so it only sets the PCATCH flag for interruptible cases. Tested by: pho Reviewed by: jhb MFC after: 2 weeks
* Fix a couple of mbuf leaks introduced by r217242. I dormacklem2011-04-132-2/+7
| | | | | | | | not believe that these leaks had a practical impact, since the situations in which they would have occurred would have been extremely rare. MFC after: 2 weeks
* Mfp4 CH=177274,177280,177284-177285,177297,177324-177325bz2011-02-166-28/+3
| | | | | | | | | | | | | | | | | | | | | | VNET socket push back: try to minimize the number of places where we have to switch vnets and narrow down the time we stay switched. Add assertions to the socket code to catch possibly unset vnets as seen in r204147. While this reduces the number of vnet recursion in some places like NFS, POSIX local sockets and some netgraph, .. recursions are impossible to fix. The current expectations are documented at the beginning of uipc_socket.c along with the other information there. Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH Reviewed by: jhb Tested by: zec Tested by: Mikolaj Golub (to.my.trociny gmail.com) MFC after: 2 weeks
* sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.mdf2011-01-121-2/+2
| | | | Commit the kernel changes.
* Fix a bug in the client side krpc where it was, sometimesrmacklem2011-01-103-17/+10
| | | | | | | | | | | | | | erroneously, assumed that 4 bytes of data were in the first mbuf of a list by replacing the bcopy() with m_copydata(). Also, replace the uses of m_pullup(), which can fail for reasons other than not enough data, with m_copydata(). For the cases where it isn't known that there is enough data in the mbuf list, check first via m_len and m_length(). This is believed to fix a problem reported by dpd at dpdtech.com and george+freebsd at m5p.com. Reviewed by: jhb MFC after: 8 days
* Fix the krpc so that it can handle NFSv3,UDP mounts with a read/writermacklem2010-10-133-7/+20
| | | | | | | | | | | | | data size greater than 8192. Since soreserve(so, 256*1024, 256*1024) would always fail for the default value of sb_max, modify clnt_dg.c so that it uses the calculated values and checks for an error return from soreserve(). Also, add a check for error return from soreserve() to clnt_vc.c and change __rpc_get_t_size() to use sb_max_adj instead of the bogus maxsize == 256*1024. PR: kern/150910 Reviewed by: jhb MFC after: 2 weeks
* Make the RPC specific __rpc_inet_ntop() and __rpc_inet_pton() generalattilio2010-09-244-416/+4
| | | | | | | | | | in the kernel (just as inet_ntoa() and inet_aton()) are and sync their prototype accordingly with already mentioned functions. Sponsored by: Sandvine Incorporated Reviewed by: emaste, rstone Approved by: dfr MFC after: 2 weeks
* Remove unnecessary weak reference that was apparently copied from theemaste2010-09-231-7/+0
| | | | | | version of this function in lib/libc/inet/inet_pton.c MFC after: 1 week
* - Check the result of malloc(M_NOWAIT) in replay_alloc(). The callerpjd2010-08-261-20/+25
| | | | | | | | | | (replay_alloc()) knows how to handle replay_alloc() failure. - Eliminate 'freed_one' variable, it is not needed - when no entry is found rce will be NULL. - Add locking assertions where we expect a rc_lock to be held. Reviewed by: rmacklem MFC after: 2 weeks
* Add mutex locking for the call to replay_prune() inrmacklem2010-08-251-0/+2
| | | | | | | replay_setsize(), since replay_prune() expects the rc_lock to be held when it is called. MFC after: 2 weeks
* If the first iteration of the do loop in replay_prune()rmacklem2010-08-251-1/+1
| | | | | | | | | | succeeded and a subsequent interation failed to find an entry to prune, it could loop infinitely, since the "freed" variable wasn't reset to FALSE. This patch moves setting freed FALSE to inside the loop to fix the problem. Tested by: alan.bryan at yahoo.com MFC after: 2 weeks
* When the regular NFS server replied to a UDP client out of the replayrmacklem2010-03-231-0/+2
| | | | | | | | | | | cache, it did not free the request argument mbuf list, resulting in a leak. This patch fixes that leak. Tested by: danny AT cs.huji.ac.il PR: kern/144330 Submitted by: to.my.trociny AT gmail.com (earlier version) Reviewed by: dfr MFC after: 2 weeks
* Replace the static NGROUPS=NGROUPS_MAX+1=1024 with a dynamicbrooks2010-01-121-3/+3
| | | | | | | | kern.ngroups+1. kern.ngroups can range from NGROUPS_MAX=1023 to INT_MAX-1. Given that the Windows group limit is 1024, this range should be sufficient for most applications. MFC after: 1 month
* Make options KGSSAPI build and add it to NOTES.brooks2010-01-082-8/+11
| | | | | | | | rpcsec_gss_prot.c: Use kernel printf and headers. vc_rpcsec_gss.c: Use a local RPCAUTH_UNIXGIDS definition for 16 instead of using NGROUPS.
* Remove extraneous semicolons, no functional changes.mbr2010-01-071-1/+1
| | | | | Submitted by: Marc Balmer <marc@msys.ch> MFC after: 1 week
* (S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument.antoine2009-12-281-2/+2
| | | | | | | | | Fix some wrong usages. Note: this does not affect generated binaries as this argument is not used. PR: 137213 Submitted by: Eygene Ryabinkin (initial version) MFC after: 1 month
OpenPOWER on IntegriCloud