summaryrefslogtreecommitdiffstats
path: root/net/sunrpc/xprtsock.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2014-06-121-3/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov. 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J Benniston. 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn Mork. 4) BPF now has a "random" opcode, from Chema Gonzalez. 5) Add more BPF documentation and improve test framework, from Daniel Borkmann. 6) Support TCP fastopen over ipv6, from Daniel Lee. 7) Add software TSO helper functions and use them to support software TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia. 8) Support software TSO in fec driver too, from Nimrod Andy. 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli. 10) Handle broadcasts more gracefully over macvlan when there are large numbers of interfaces configured, from Herbert Xu. 11) Allow more control over fwmark used for non-socket based responses, from Lorenzo Colitti. 12) Do TCP congestion window limiting based upon measurements, from Neal Cardwell. 13) Support busy polling in SCTP, from Neal Horman. 14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru. 15) Bridge promisc mode handling improvements from Vlad Yasevich. 16) Don't use inetpeer entries to implement ID generation any more, it performs poorly, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits) rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 tcp: fixing TLP's FIN recovery net: fec: Add software TSO support net: fec: Add Scatter/gather support net: fec: Increase buffer descriptor entry number net: fec: Factorize feature setting net: fec: Enable IP header hardware checksum net: fec: Factorize the .xmit transmit function bridge: fix compile error when compiling without IPv6 support bridge: fix smatch warning / potential null pointer dereference via-rhine: fix full-duplex with autoneg disable bnx2x: Enlarge the dorq threshold for VFs bnx2x: Check for UNDI in uncommon branch bnx2x: Fix 1G-baseT link bnx2x: Fix link for KR with swapped polarity lane sctp: Fix sk_ack_backlog wrap-around problem net/core: Add VF link state control policy net/fsl: xgmac_mdio is dependent on OF_MDIO net/fsl: Make xgmac_mdio read error message useful net_sched: drr: warn when qdisc is not work conserving ...
| * sunrpc: Remove sk_no_check settingTom Herbert2014-05-231-3/+0
| | | | | | | | | | | | | | Setting sk_no_check to UDP_CSUM_NORCV seems to have no effect. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | arch: Mass conversion of smp_mb__*()Peter Zijlstra2014-04-181-8/+8
|/ | | | | | | | | | | Mostly scripted conversion of the smp_mb__* barriers. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-arch@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2014-04-121-4/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull yet more networking updates from David Miller: 1) Various fixes to the new Redpine Signals wireless driver, from Fariya Fatima. 2) L2TP PPP connect code takes PMTU from the wrong socket, fix from Dmitry Petukhov. 3) UFO and TSO packets differ in whether they include the protocol header in gso_size, account for that in skb_gso_transport_seglen(). From Florian Westphal. 4) If VLAN untagging fails, we double free the SKB in the bridging output path. From Toshiaki Makita. 5) Several call sites of sk->sk_data_ready() were referencing an SKB just added to the socket receive queue in order to calculate the second argument via skb->len. This is dangerous because the moment the skb is added to the receive queue it can be consumed in another context and freed up. It turns out also that none of the sk->sk_data_ready() implementations even care about this second argument. So just kill it off and thus fix all these use-after-free bugs as a side effect. 6) Fix inverted test in tcp_v6_send_response(), from Lorenzo Colitti. 7) pktgen needs to do locking properly for LLTX devices, from Daniel Borkmann. 8) xen-netfront driver initializes TX array entries in RX loop :-) From Vincenzo Maffione. 9) After refactoring, some tunnel drivers allow a tunnel to be configured on top itself. Fix from Nicolas Dichtel. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits) vti: don't allow to add the same tunnel twice gre: don't allow to add the same tunnel twice drivers: net: xen-netfront: fix array initialization bug pktgen: be friendly to LLTX devices r8152: check RTL8152_UNPLUG net: sun4i-emac: add promiscuous support net/apne: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO net: ipv6: Fix oif in TCP SYN+ACK route lookup. drivers: net: cpsw: enable interrupts after napi enable and clearing previous interrupts drivers: net: cpsw: discard all packets received when interface is down net: Fix use after free by removing length arg from sk_data_ready callbacks. Drivers: net: hyperv: Address UDP checksum issues Drivers: net: hyperv: Negotiate suitable ndis version for offload support Drivers: net: hyperv: Allocate memory for all possible per-pecket information bridge: Fix double free and memory leak around br_allowed_ingress bonding: Remove debug_fs files when module init fails i40evf: program RSS LUT correctly i40evf: remove open-coded skb_cow_head ixgb: remove open-coded skb_cow_head igbvf: remove open-coded skb_cow_head ...
| * net: Fix use after free by removing length arg from sk_data_ready callbacks.David S. Miller2014-04-111-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several spots in the kernel perform a sequence like: skb_queue_tail(&sk->s_receive_queue, skb); sk->sk_data_ready(sk, skb->len); But at the moment we place the SKB onto the socket receive queue it can be consumed and freed up. So this skb->len access is potentially to freed up memory. Furthermore, the skb->len can be modified by the consumer so it is possible that the value isn't accurate. And finally, no actual implementation of this callback actually uses the length argument. And since nobody actually cared about it's value, lots of call sites pass arbitrary values in such as '0' and even '1'. So just remove the length argument from the callback, that way there is no confusion whatsoever and all of these use-after-free cases get fixed as a side effect. Based upon a patch by Eric Dumazet and his suggestion to audit this issue tree-wide. Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2014-04-081-17/+17
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd updates from Bruce Fields: "Highlights: - server-side nfs/rdma fixes from Jeff Layton and Tom Tucker - xdr fixes (a larger xdr rewrite has been posted but I decided it would be better to queue it up for 3.16). - miscellaneous fixes and cleanup from all over (thanks especially to Kinglong Mee)" * 'for-3.15' of git://linux-nfs.org/~bfields/linux: (36 commits) nfsd4: don't create unnecessary mask acl nfsd: revert v2 half of "nfsd: don't return high mode bits" nfsd4: fix memory leak in nfsd4_encode_fattr() nfsd: check passed socket's net matches NFSd superblock's one SUNRPC: Clear xpt_bc_xprt if xs_setup_bc_tcp failed NFSD/SUNRPC: Check rpc_xprt out of xs_setup_bc_tcp SUNRPC: New helper for creating client with rpc_xprt NFSD: Free backchannel xprt in bc_destroy NFSD: Clear wcc data between compound ops nfsd: Don't return NFS4ERR_STALE_STATEID for NFSv4.1+ nfsd4: fix nfs4err_resource in 4.1 case nfsd4: fix setclientid encode size nfsd4: remove redundant check from nfsd4_check_resp_size nfsd4: use more generous NFS4_ACL_MAX nfsd4: minor nfsd4_replay_cache_entry cleanup nfsd4: nfsd4_replay_cache_entry should be static nfsd4: update comments with obsolete function name rpc: Allow xdr_buf_subsegment to operate in-place NFSD: Using free_conn free connection SUNRPC: fix memory leak of peer addresses in XPRT ...
| * SUNRPC: Clear xpt_bc_xprt if xs_setup_bc_tcp failedKinglong Mee2014-03-301-0/+2
| | | | | | | | | | | | | | | | Don't move the assign of args->bc_xprt->xpt_bc_xprt out of xs_setup_bc_tcp, because rpc_ping (which is in rpc_create) will using it. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * NFSD/SUNRPC: Check rpc_xprt out of xs_setup_bc_tcpKinglong Mee2014-03-301-9/+0
| | | | | | | | | | | | | | | | Besides checking rpc_xprt out of xs_setup_bc_tcp, increase it's reference (it's important). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * NFSD: Free backchannel xprt in bc_destroyKinglong Mee2014-03-301-0/+4
| | | | | | | | | | | | | | | | Backchannel xprt isn't freed right now. Free it in bc_destroy, and put the reference of THIS_MODULE. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * SUNRPC: fix memory leak of peer addresses in XPRTKinglong Mee2014-03-281-8/+11
| | | | | | | | | | | | | | | | Creating xprt failed after xs_format_peer_addresses, sunrpc must free those memory of peer addresses in xprt. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | SUNRPC: RPC callbacks may be split across several TCP segmentsTrond Myklebust2014-02-111-20/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Since TCP is a stream protocol, our callback read code needs to take into account the fact that RPC callbacks are not always confined to a single TCP segment. This patch adds support for multiple TCP segments by ensuring that we only remove the rpc_rqst structure from the 'free backchannel requests' list once the data has been completely received. We rely on the fact that TCP data is ordered for the duration of the connection. Reported-by: shaobingqing <shaobingqing@bwstor.com.cn> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | SUNRPC: Fix races in xs_nospace()Trond Myklebust2014-02-111-1/+5
|/ | | | | | | | | | | | | | | When a send failure occurs due to the socket being out of buffer space, we call xs_nospace() in order to have the RPC task wait until the socket has drained enough to make it worth while trying again. The current patch fixes a race in which the socket is drained before we get round to setting up the machinery in xs_nospace(), and which is reported to cause hangs. Link: http://lkml.kernel.org/r/20140210170315.33dfc621@notabene.brown Fixes: a9a6b52ee1ba (SUNRPC: Don't start the retransmission timer...) Reported-by: Neil Brown <neilb@suse.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2014-01-301-4/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd updates from Bruce Fields: - Handle some loose ends from the vfs read delegation support. (For example nfsd can stop breaking leases on its own in a fewer places where it can now depend on the vfs to.) - Make life a little easier for NFSv4-only configurations (thanks to Kinglong Mee). - Fix some gss-proxy problems (thanks Jeff Layton). - miscellaneous bug fixes and cleanup * 'for-3.14' of git://linux-nfs.org/~bfields/linux: (38 commits) nfsd: consider CLAIM_FH when handing out delegation nfsd4: fix delegation-unlink/rename race nfsd4: delay setting current_fh in open nfsd4: minor nfs4_setlease cleanup gss_krb5: use lcm from kernel lib nfsd4: decrease nfsd4_encode_fattr stack usage nfsd: fix encode_entryplus_baggage stack usage nfsd4: simplify xdr encoding of nfsv4 names nfsd4: encode_rdattr_error cleanup nfsd4: nfsd4_encode_fattr cleanup minor svcauth_gss.c cleanup nfsd4: better VERIFY comment nfsd4: break only delegations when appropriate NFSD: Fix a memory leak in nfsd4_create_session sunrpc: get rid of use_gssp_lock sunrpc: fix potential race between setting use_gss_proxy and the upcall rpc_clnt sunrpc: don't wait for write before allowing reads from use-gss-proxy file nfsd: get rid of unused function definition Define op_iattr for nfsd4_open instead using macro NFSD: fix compile warning without CONFIG_NFSD_V3 ...
| * sunrpc: fix some typosWeng Meiling2013-12-101-4/+3
| | | | | | | | | | Signed-off-by: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | Merge tag 'nfs-for-3.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2014-01-281-5/+37
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Trond Myklebust: "Highlights include: - stable fix for an infinite loop in RPC state machine - stable fix for a use after free situation in the NFSv4 trunking discovery - stable fix for error handling in the NFSv4 trunking discovery - stable fix for the page write update code - stable fix for the NFSv4.1 mount time security negotiation - stable fix for the NFSv4 open code. - O_DIRECT locking fixes - fix an Oops in the pnfs file commit code - RPC layer needs finer grained handling of connection errors - more RPC GSS upcall fixes" * tag 'nfs-for-3.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (30 commits) pnfs: Proper delay for NFS4ERR_RECALLCONFLICT in layout_get_done pnfs: fix BUG in filelayout_recover_commit_reqs nfs4: fix discover_server_trunking use after free NFSv4.1: Handle errors correctly in nfs41_walk_client_list nfs: always make sure page is up-to-date before extending a write to cover the entire page nfs: page cache invalidation for dio nfs: take i_mutex during direct I/O reads nfs: merge nfs_direct_write into nfs_file_direct_write nfs: merge nfs_direct_read into nfs_file_direct_read nfs: increment i_dio_count for reads, too nfs: defer inode_dio_done call until size update is done nfs: fix size updates for aio writes nfs4.1: properly handle ENOTSUP in SECINFO_NO_NAME NFSv4.1: Fix a race in nfs4_write_inode NFSv4.1: Don't trust attributes if a pNFS LAYOUTCOMMIT is outstanding point to the right include file in a comment (left over from a9004abc3) NFS: dprintk() should not print negative fileids and inode numbers nfs: fix dead code of ipv6_addr_scope sunrpc: Fix infinite loop in RPC state machine SUNRPC: Add tracepoint for socket errors ...
| * | SUNRPC: Add tracepoint for socket errorsTrond Myklebust2013-12-311-0/+1
| | | | | | | | | | | | Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | SUNRPC: Report connection error values to rpc_tasks on the pending queueTrond Myklebust2013-12-311-5/+36
| |/ | | | | | | | | | | | | Currently we only report EAGAIN, which is not descriptive enough for softconn tasks. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | net: replace macros net_random and net_srandom with direct calls to prandomAruna-Hewapathirane2014-01-141-1/+1
|/ | | | | | | | | | | | This patch removes the net_random and net_srandom macros and replaces them with direct calls to the prandom ones. As new commits only seem to use prandom_u32 there is no use to keep them around. This change makes it easier to grep for users of prandom_u32. Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* SUNRPC: Fix a data corruption issue when retransmitting RPC callsTrond Myklebust2013-11-081-7/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following scenario can cause silent data corruption when doing NFS writes. It has mainly been observed when doing database writes using O_DIRECT. 1) The RPC client uses sendpage() to do zero-copy of the page data. 2) Due to networking issues, the reply from the server is delayed, and so the RPC client times out. 3) The client issues a second sendpage of the page data as part of an RPC call retransmission. 4) The reply to the first transmission arrives from the server _before_ the client hardware has emptied the TCP socket send buffer. 5) After processing the reply, the RPC state machine rules that the call to be done, and triggers the completion callbacks. 6) The application notices the RPC call is done, and reuses the pages to store something else (e.g. a new write). 7) The client NIC drains the TCP socket send buffer. Since the page data has now changed, it reads a corrupted version of the initial RPC call, and puts it on the wire. This patch fixes the problem in the following manner: The ordering guarantees of TCP ensure that when the server sends a reply, then we know that the _first_ transmission has completed. Using zero-copy in that situation is therefore safe. If a time out occurs, we then send the retransmission using sendmsg() (i.e. no zero-copy), We then know that the socket contains a full copy of the data, and so it will retransmit a faithful reproduction even if the RPC call completes, and the application reuses the O_DIRECT buffer in the meantime. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
* SUNRPC: Cleanup xs_destroy()Trond Myklebust2013-10-311-10/+5
| | | | | | There is no longer any need for a separate xs_local_destroy() helper. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: close a rare race in xs_tcp_setup_socket.NeilBrown2013-10-311-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have one report of a crash in xs_tcp_setup_socket. The call path to the crash is: xs_tcp_setup_socket -> inet_stream_connect -> lock_sock_nested. The 'sock' passed to that last function is NULL. The only way I can see this happening is a concurrent call to xs_close: xs_close -> xs_reset_transport -> sock_release -> inet_release inet_release sets: sock->sk = NULL; inet_stream_connect calls lock_sock(sock->sk); which gets NULL. All calls to xs_close are protected by XPRT_LOCKED as are most activations of the workqueue which runs xs_tcp_setup_socket. The exception is xs_tcp_schedule_linger_timeout. So presumably the timeout queued by the later fires exactly when some other code runs xs_close(). To protect against this we can move the cancel_delayed_work_sync() call from xs_destory() to xs_close(). As xs_close is never called from the worker scheduled on ->connect_worker, this can never deadlock. Signed-off-by: NeilBrown <neilb@suse.de> [Trond: Make it safe to call cancel_delayed_work_sync() on AF_LOCAL sockets] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* sunrpc: comment typo fixJ. Bruce Fields2013-10-281-2/+2
| | | | | Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Only update the TCP connect cookie on a successful connectTrond Myklebust2013-10-011-1/+1
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Enable the keepalive option for TCP socketsTrond Myklebust2013-10-011-0/+13
| | | | | | | | | | For NFSv4 we want to avoid retransmitting RPC calls unless the TCP connection breaks. However we still want to detect TCP connection breakage as soon as possible. Do this by setting the keepalive option with the idle timeout and count set to the 'timeo' and 'retrans' mount options. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Merge tag 'nfs-for-3.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2013-09-091-1/+12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Trond Myklebust: "Highlights include: - Fix NFSv4 recovery so that it doesn't recover lost locks in cases such as lease loss due to a network partition, where doing so may result in data corruption. Add a kernel parameter to control choice of legacy behaviour or not. - Performance improvements when 2 processes are writing to the same file. - Flush data to disk when an RPCSEC_GSS session timeout is imminent. - Implement NFSv4.1 SP4_MACH_CRED state protection to prevent other NFS clients from being able to manipulate our lease and file locking state. - Allow sharing of RPCSEC_GSS caches between different rpc clients. - Fix the broken NFSv4 security auto-negotiation between client and server. - Fix rmdir() to wait for outstanding sillyrename unlinks to complete - Add a tracepoint framework for debugging NFSv4 state recovery issues. - Add tracing to the generic NFS layer. - Add tracing for the SUNRPC socket connection state. - Clean up the rpc_pipefs mount/umount event management. - Merge more patches from Chuck in preparation for NFSv4 migration support" * tag 'nfs-for-3.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (107 commits) NFSv4: use mach cred for SECINFO_NO_NAME w/ integrity NFS: nfs_compare_super shouldn't check the auth flavour unless 'sec=' was set NFSv4: Allow security autonegotiation for submounts NFSv4: Disallow security negotiation for lookups when 'sec=' is specified NFSv4: Fix security auto-negotiation NFS: Clean up nfs_parse_security_flavors() NFS: Clean up the auth flavour array mess NFSv4.1 Use MDS auth flavor for data server connection NFS: Don't check lock owner compatability unless file is locked (part 2) NFS: Don't check lock owner compatibility in writes unless file is locked nfs4: Map NFS4ERR_WRONG_CRED to EPERM nfs4.1: Add SP4_MACH_CRED write and commit support nfs4.1: Add SP4_MACH_CRED stateid support nfs4.1: Add SP4_MACH_CRED secinfo support nfs4.1: Add SP4_MACH_CRED cleanup support nfs4.1: Add state protection handler nfs4.1: Minimal SP4_MACH_CRED implementation SUNRPC: Replace pointer values with task->tk_pid and rpc_clnt->cl_clid SUNRPC: Add an identifier for struct rpc_clnt SUNRPC: Ensure rpc_task->tk_pid is available for tracepoints ...
| * SUNRPC: Add tracepoints to help debug socket connection issuesTrond Myklebust2013-09-041-1/+12
| | | | | | | | | | | | | | Add client side debugging to help trace socket connection/disconnection and unexpected state change issues. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | net: add sk_stream_is_writeable() helperEric Dumazet2013-07-241-1/+1
|/ | | | | | | | | | | | | | | Several call sites use the hardcoded following condition : sk_stream_wspace(sk) >= sk_stream_min_wspace(sk) Lets use a helper because TCP_NOTSENT_LOWAT support will change this condition for TCP sockets. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2013-07-111-1/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd changes from Bruce Fields: "Changes this time include: - 4.1 enabled on the server by default: the last 4.1-specific issues I know of are fixed, so we're not going to find the rest of the bugs without more exposure. - Experimental support for NFSv4.2 MAC Labeling (to allow running selinux over NFS), from Dave Quigley. - Fixes for some delicate cache/upcall races that could cause rare server hangs; thanks to Neil Brown and Bodo Stroesser for extreme debugging persistence. - Fixes for some bugs found at the recent NFS bakeathon, mostly v4 and v4.1-specific, but also a generic bug handling fragmented rpc calls" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits) nfsd4: support minorversion 1 by default nfsd4: allow destroy_session over destroyed session svcrpc: fix failures to handle -1 uid's sunrpc: Don't schedule an upcall on a replaced cache entry. net/sunrpc: xpt_auth_cache should be ignored when expired. sunrpc/cache: ensure items removed from cache do not have pending upcalls. sunrpc/cache: use cache_fresh_unlocked consistently and correctly. sunrpc/cache: remove races with queuing an upcall. nfsd4: return delegation immediately if lease fails nfsd4: do not throw away 4.1 lock state on last unlock nfsd4: delegation-based open reclaims should bypass permissions svcrpc: don't error out on small tcp fragment svcrpc: fix handling of too-short rpc's nfsd4: minor read_buf cleanup nfsd4: fix decoding of compounds across page boundaries nfsd4: clean up nfs4_open_delegation NFSD: Don't give out read delegations on creates nfsd4: allow client to send no cb_sec flavors nfsd4: fail attempts to request gss on the backchannel nfsd4: implement minimal SP4_MACH_CRED ...
| * sunrpc: server back channel needs no rpcbind methodJ. Bruce Fields2013-05-151-1/+0
| | | | | | | | | | | | | | | | | | XPRT_BOUND is set on server backchannel xprts by xs_setup_bc_tcp() (using xprt_set_bound()), and is never cleared, so ->rpcbind() will never need to be called. Reported-by: "Myklebust, Trond" <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | net: Convert uses of typedef ctl_table to struct ctl_tableJoe Perches2013-06-131-2/+2
|/ | | | | | | | | | | | | | | | Reduce the uses of this unnecessary typedef. Done via perl script: $ git grep --name-only -w ctl_table net | \ xargs perl -p -i -e '\ sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \ s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge' Reflow the modified lines that now exceed 80 columns. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* SUNRPC: attempt AF_LOCAL connect on setupJ. Bruce Fields2013-04-261-0/+3
| | | | | | | | | | | In the gss-proxy case, setup time is when I know I'll have the right namespace for the connect. In other cases, it might be useful to get any connection errors earlier--though actually in practice it doesn't make any difference for rpcbind. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* Merge Trond's nfs-for-nextJ. Bruce Fields2013-04-261-5/+9
|\ | | | | | | | | | | | | Merging Trond's nfs-for-next branch, mainly to get b7993cebb841b0da7a33e9d5ce301a9fd3209165 "SUNRPC: Allow rpc_create() to request that TCP slots be unlimited", which a small piece of the gss-proxy work depends on.
| * SUNRPC: Allow rpc_create() to request that TCP slots be unlimitedTrond Myklebust2013-04-141-1/+5
| | | | | | | | | | | | | | This is mainly for use by NFSv4.1, where the session negotiation ultimately wants to decide how many RPC slots we can fill. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * SUNRPC: Report network/connection errors correctly for SOFTCONN rpc tasksTrond Myklebust2013-03-251-4/+4
| | | | | | | | | | | | | | | | | | | | In the case of a SOFTCONN rpc task, we really want to ensure that it reports errors like ENETUNREACH back to the caller. Currently, only some of these errors are being reported back (connect errors are not), and they are being converted by the RPC layer into EIO. Reported-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | sunrpc: don't attempt to cancel unitialized workJ. Bruce Fields2013-03-091-5/+10
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As of dc107402ae06286a9ed33c32daf3f35514a7cb8d "SUNRPC: make AF_LOCAL connect synchronous", we no longer initialize connect_worker in the AF_LOCAL case, resulting in warnings like: WARNING: at lib/debugobjects.c:261 debug_print_object+0x8c/0xb0() Hardware name: Bochs ODEBUG: assert_init not available (active state 0) object type: timer_list hint: stub_timer+0x0/0x20 Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfsd auth_rpcgss nfs_acl lockd sunrpc Pid: 4816, comm: nfsd Tainted: G W 3.8.0-rc2-00049-gdc10740 #801 Call Trace: [<ffffffff8156ec00>] ? free_obj_work+0x60/0xa0 [<ffffffff81046aaf>] warn_slowpath_common+0x7f/0xc0 [<ffffffff81046ba6>] warn_slowpath_fmt+0x46/0x50 [<ffffffff8156eccc>] debug_print_object+0x8c/0xb0 [<ffffffff81055030>] ? timer_debug_hint+0x10/0x10 [<ffffffff8156f7e3>] debug_object_assert_init+0xe3/0x120 [<ffffffff81057ebb>] del_timer+0x2b/0x80 [<ffffffff8109c4e6>] ? mark_held_locks+0x86/0x110 [<ffffffff81065a29>] try_to_grab_pending+0xd9/0x150 [<ffffffff81065b57>] __cancel_work_timer+0x27/0xc0 [<ffffffff81065c03>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0007067>] xs_destroy+0x27/0x80 [sunrpc] [<ffffffffa00040d8>] xprt_destroy+0x78/0xa0 [sunrpc] [<ffffffffa0006241>] xprt_put+0x21/0x30 [sunrpc] [<ffffffffa00030cf>] rpc_free_client+0x10f/0x1a0 [sunrpc] [<ffffffffa0002ff3>] ? rpc_free_client+0x33/0x1a0 [sunrpc] [<ffffffffa0002f7e>] rpc_release_client+0x6e/0xb0 [sunrpc] [<ffffffffa000325d>] rpc_shutdown_client+0xfd/0x1b0 [sunrpc] [<ffffffffa0017196>] rpcb_put_local+0x106/0x130 [sunrpc] ... Acked-by: "Myklebust, Trond" <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2013-02-281-8/+27
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd changes from J Bruce Fields: "Miscellaneous bugfixes, plus: - An overhaul of the DRC cache by Jeff Layton. The main effect is just to make it larger. This decreases the chances of intermittent errors especially in the UDP case. But we'll need to watch for any reports of performance regressions. - Containerized nfsd: with some limitations, we now support per-container nfs-service, thanks to extensive work from Stanislav Kinsbursky over the last year." Some notes about conflicts, since there were *two* non-data semantic conflicts here: - idr_remove_all() had been added by a memory leak fix, but has since become deprecated since idr_destroy() does it for us now. - xs_local_connect() had been added by this branch to make AF_LOCAL connections be synchronous, but in the meantime Trond had changed the calling convention in order to avoid a RCU dereference. There were a couple of more obvious actual source-level conflicts due to the hlist traversal changes and one just due to code changes next to each other, but those were trivial. * 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits) SUNRPC: make AF_LOCAL connect synchronous nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum svcrpc: fix rpc server shutdown races svcrpc: make svc_age_temp_xprts enqueue under sv_lock lockd: nlmclnt_reclaim(): avoid stack overflow nfsd: enable NFSv4 state in containers nfsd: disable usermode helper client tracker in container nfsd: use proper net while reading "exports" file nfsd: containerize NFSd filesystem nfsd: fix comments on nfsd_cache_lookup SUNRPC: move cache_detail->cache_request callback call to cache_read() SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function SUNRPC: rework cache upcall logic SUNRPC: introduce cache_detail->cache_request callback NFS: simplify and clean cache library NFS: use SUNRPC cache creation and destruction helper for DNS cache nfsd4: free_stid can be static nfsd: keep a checksum of the first 256 bytes of request sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer sunrpc: fix comment in struct xdr_buf definition ...
| * SUNRPC: make AF_LOCAL connect synchronousJ. Bruce Fields2013-02-281-8/+27
| | | | | | | | | | | | | | | | | | | | | | | | It doesn't appear that anyone actually needs to connect asynchronously. Also, using a workqueue for the connect means we lose the namespace information from the original process. This is a problem since there's no way to explicitly pass in a filesystem namespace for resolution of an AF_LOCAL address. Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * sunrpc: move address copy/cmp/convert routines and prototypes from clnt.h to ↵Jeff Layton2013-02-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | addr.h These routines are used by server and client code, so having them in a separate header would be best. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | SUNRPC: Pass pointers to struct rpc_xprt to the congestion windowTrond Myklebust2013-02-011-3/+3
| | | | | | | | | | | | Avoid access to task->tk_xprt Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | SUNRPC: Fix an RCU dereference in xs_local_rpcbindTrond Myklebust2013-02-011-1/+3
| | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | SUNRPC: Pass a pointer to struct rpc_xprt to the connect callbackTrond Myklebust2013-02-011-2/+2
| | | | | | | | | | | | | | Avoid another RCU dereference by passing the pointer to struct rpc_xprt from the caller. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | SUNRPC: Eliminate task->tk_xprt accesses that bypass rcu_dereference()Trond Myklebust2013-02-011-1/+1
|/ | | | | | | | tk_xprt is just a shortcut for tk_client->cl_xprt, however cl_xprt is defined as an __rcu variable. Replace dereferences of tk_xprt with non-rcu dereferences where it is safe to do so. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: variable 'svsk' is unused in function bc_send_requestTrond Myklebust2012-12-151-2/+0
| | | | | | Silence a compile time warning. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Handle ECONNREFUSED in xs_local_setup_socketTrond Myklebust2012-12-151-0/+4
| | | | | | | Silence the unnecessary warning "unhandled error (111) connecting to..." and convert it to a dprintk for debugging purposes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: remove BUG_ON from bc_mallocWeston Andros Adamson2012-11-041-2/+4
| | | | | | | | Replace BUG_ON() with WARN_ON_ONCE() and NULL return - the caller will handle this like a memory allocation failure. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: remove BUG_ONs from *_reclassify_socket*Weston Andros Adamson2012-11-041-3/+4
| | | | | | | | | Replace multiple BUG_ON() calls with WARN_ON_ONCE() and early return when sanity checking socket ownership (lock). The bind call will fail if the socket was unsuccessfully reclassified. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Get rid of the xs_error_report socket callbackTrond Myklebust2012-10-241-25/+0
| | | | | | | | | | | | | | | | | | Chris Perl reports that we're seeing races between the wakeup call in xs_error_report and the connect attempts. Basically, Chris has shown that in certain circumstances, the call to xs_error_report causes the rpc_task that is responsible for reconnecting to wake up early, thus triggering a disconnect and retry. Since the sk->sk_error_report() calls in the socket layer are always followed by a tcp_done() in the cases where we care about waking up the rpc_tasks, just let the state_change callbacks take responsibility for those wake ups. Reported-by: Chris Perl <chris.perl@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org Tested-by: Chris Perl <chris.perl@gmail.com>
* SUNRPC: Prevent races in xs_abort_connection()Trond Myklebust2012-10-241-5/+8
| | | | | | | | | | | | | | | | The call to xprt_disconnect_done() that is triggered by a successful connection reset will trigger another automatic wakeup of all tasks on the xprt->pending rpc_wait_queue. In particular it will cause an early wake up of the task that called xprt_connect(). All we really want to do here is clear all the socket-specific state flags, so we split that functionality out of xs_sock_mark_closed() into a helper that can be called by xs_abort_connection() Reported-by: Chris Perl <chris.perl@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org Tested-by: Chris Perl <chris.perl@gmail.com>
* Revert "SUNRPC: Ensure we close the socket on EPIPE errors too..."Trond Myklebust2012-10-241-1/+1
| | | | | | | | | | | | | This reverts commit 55420c24a0d4d1fce70ca713f84aa00b6b74a70e. Now that we clear the connected flag when entering TCP_CLOSE_WAIT, the deadlock described in this commit is no longer possible. Instead, the resulting call to xs_tcp_shutdown() can interfere with pending reconnection attempts. Reported-by: Chris Perl <chris.perl@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org Tested-by: Chris Perl <chris.perl@gmail.com>
* SUNRPC: Clear the connect flag when socket state is TCP_CLOSE_WAITTrond Myklebust2012-10-241-0/+1
| | | | | | | | | This is needed to ensure that we call xprt_connect() upon the next call to call_connect(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org Tested-by: Chris Perl <chris.perl@gmail.com>
OpenPOWER on IntegriCloud