summaryrefslogtreecommitdiffstats
path: root/net/sunrpc
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-linus' of ↵Linus Torvalds2012-05-234-11/+33
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace enhancements from Eric Biederman: "This is a course correction for the user namespace, so that we can reach an inexpensive, maintainable, and reasonably complete implementation. Highlights: - Config guards make it impossible to enable the user namespace and code that has not been converted to be user namespace safe. - Use of the new kuid_t type ensures the if you somehow get past the config guards the kernel will encounter type errors if you enable user namespaces and attempt to compile in code whose permission checks have not been updated to be user namespace safe. - All uids from child user namespaces are mapped into the initial user namespace before they are processed. Removing the need to add an additional check to see if the user namespace of the compared uids remains the same. - With the user namespaces compiled out the performance is as good or better than it is today. - For most operations absolutely nothing changes performance or operationally with the user namespace enabled. - The worst case performance I could come up with was timing 1 billion cache cold stat operations with the user namespace code enabled. This went from 156s to 164s on my laptop (or 156ns to 164ns per stat operation). - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value. Most uid/gid setting system calls treat these value specially anyway so attempting to use -1 as a uid would likely cause entertaining failures in userspace. - If setuid is called with a uid that can not be mapped setuid fails. I have looked at sendmail, login, ssh and every other program I could think of that would call setuid and they all check for and handle the case where setuid fails. - If stat or a similar system call is called from a context in which we can not map a uid we lie and return overflowuid. The LFS experience suggests not lying and returning an error code might be better, but the historical precedent with uids is different and I can not think of anything that would break by lying about a uid we can't map. - Capabilities are localized to the current user namespace making it safe to give the initial user in a user namespace all capabilities. My git tree covers all of the modifications needed to convert the core kernel and enough changes to make a system bootable to runlevel 1." Fix up trivial conflicts due to nearby independent changes in fs/stat.c * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) userns: Silence silly gcc warning. cred: use correct cred accessor with regards to rcu read lock userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq userns: Convert cgroup permission checks to use uid_eq userns: Convert tmpfs to use kuid and kgid where appropriate userns: Convert sysfs to use kgid/kuid where appropriate userns: Convert sysctl permission checks to use kuid and kgids. userns: Convert proc to use kuid/kgid where appropriate userns: Convert ext4 to user kuid/kgid where appropriate userns: Convert ext3 to use kuid/kgid where appropriate userns: Convert ext2 to use kuid/kgid where appropriate. userns: Convert devpts to use kuid/kgid where appropriate userns: Convert binary formats to use kuid/kgid where appropriate userns: Add negative depends on entries to avoid building code that is userns unsafe userns: signal remove unnecessary map_cred_ns userns: Teach inode_capable to understand inodes whose uids map to other namespaces. userns: Fail exec for suid and sgid binaries with ids outside our user namespace. userns: Convert stat to return values mapped from kuids and kgids userns: Convert user specfied uids and gids in chown into kuids and kgid userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs ...
| * userns: Convert group_info values from gid_t to kgid_t.Eric W. Biederman2012-05-034-11/+33
| | | | | | | | | | | | | | | | | | | | As a first step to converting struct cred to be all kuid_t and kgid_t values convert the group values stored in group_info to always be kgid_t values. Unless user namespaces are used this change should have no effect. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2012-05-218-44/+31
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking changes from David Miller: 1) Get rid of the error prone NLA_PUT*() macros that used an embedded goto. 2) Kill off the token-ring and MCA networking drivers, from Paul Gortmaker. 3) Reduce high-order allocations made by datagram AF_UNIX sockets, from Eric Dumazet. 4) Add PTP hardware clock support to IGB and IXGBE, from Richard Cochran and Jacob Keller. 5) Allow users to query timestamping capabilities of a card via ethtool, from Richard Cochran. 6) Add loadbalance mode to the teaming driver, from Jiri Pirko. Part of this is that we can now have BPF filters not attached to sockets, and the loadbalancing function is calculated using one. 7) Francois Romieu went through the network drivers removing gratuitous uses of netdev->base_addr, perhaps some day we can remove it completely but it's used for ISA probing still. 8) Add a BPF JIT for sparc. I know, who cares, right? :-) 9) Move networking sysctl registry away from using the compatability mode interfaces in the sysctl code. From Eric W Biederman. 10) Pavel Emelyanov added a way to save and restore TCP socket state via TCP_REPAIR, TCP_REPAIR_QUEUE, and TCP_QUEUE_SEQ socket options as well as a way to forcefully bind a socket to a port via the sk->sk_reuse value SK_FORCE_REUSE. There is also a TCP_REPAIR_OPTIONS which allows to reinstante the TCP options enabled on the connection. 11) Several enhancements from Eric Dumazet that, in particular, can enhance splice performance on TCP sockets significantly. a) Reset the offset of the per-socket sendmsg page when we know we're the only use of the page in linear_to_page(). b) Add facilities such that skb->data can be backed a page rather than SLAB kmalloc'd memory. In particular devices which were receiving into linear RX buffers can now end up providing paged data. The big result is that code like splice and GRO do not have to copy any more. 12) Allow a pure sender to more gracefully handle ACK backlogs in TCP. What can happen at high rates is that the sender hasn't grown his receive buffer limits at all (he's not receiving data so really doesn't need to), but the non-data ACKs consume receive buffer space. sk_add_backlog() is too aggressive in dropping frames in this case, so relax it's requirements by using the receive buffer plus the send buffer limit as the backlog limit instead of just the former. Also from Eric Dumazet. 13) Add ipv6 support to L2TP, from Benjamin LaHaise, James Chapman, and Chris Elston. 14) Implement TCP early retransmit (RFC 5827), from Yuchung Cheng. Basically, we can start fast retransmit before hiting the dupack threshold under certain conditions. 15) New CODEL active queue management packet scheduler, from Eric Dumazet based upon initial work by Dave Taht. Basically, the big feature is that packets are dropped (or ECN bits are set) based upon how long packets live in the queue, rather than the queue length (which is what RED uses). * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1341 commits) drivers/net/stmmac: seq_file fix memory leak ipv6/exthdrs: strict Pad1 and PadN check USB: qmi_wwan: Add ZTE (Vodafone) K3520-Z USB: qmi_wwan: Add ZTE (Vodafone) K3765-Z USB: qmi_wwan: Make forced int 4 whitelist generic net/ipv4: replace simple_strtoul with kstrtoul net/ipv4/ipconfig: neaten __setup placement net: qmi_wwan: Add Vodafone/Huawei K5005 support net: cdc_ether: Add ZTE WWAN matches before generic Ethernet ipv6: use skb coalescing in reassembly ipv4: use skb coalescing in defragmentation net: introduce skb_try_coalesce() net:ipv6:fixed space issues relating to operators. net:ipv6:fixed a trailing white space issue. ipv6: disable GSO on sockets hitting dst_allfrag tg3: use netdev_alloc_frag() API net: napi_frags_skb() is static ppp: avoid false drop_monitor false positives ipv6: bool/const conversions phase2 ipx: Remove spurious NULL checking in ipx_ioctl(). ...
| * \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2012-05-161-3/+4
| |\ \
| * | | net: Convert net_ratelimit uses to net_<level>_ratelimitedJoe Perches2012-05-153-36/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Standardize the net core ratelimited logging functions. Coalesce formats, align arguments. Change a printk then vprintk sequence to use printf extension %pV. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2012-05-073-20/+50
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/intel/e1000e/param.c drivers/net/wireless/iwlwifi/iwl-agn-rx.c drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c drivers/net/wireless/iwlwifi/iwl-trans.h Resolved the iwlwifi conflict with mainline using 3-way diff posted by John Linville and Stephen Rothwell. In 'net' we added a bug fix to make iwlwifi report a more accurate skb->truesize but this conflicted with RX path changes that happened meanwhile in net-next. In e1000e a conflict arose in the validation code for settings of adapter->itr. 'net-next' had more sophisticated logic so that logic was used. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | sock: Introduce named constants for sk_reusePavel Emelyanov2012-04-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Name them in a "backward compatible" manner, i.e. reuse or not are still 1 and 0 respectively. The reuse value of 2 means that the socket with it will forcibly reuse everyone else's port. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: cleanup unsigned to unsigned intEric Dumazet2012-04-155-7/+7
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | Use of "unsigned int" is preferred to bare "unsigned" in net tree. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge branch 'dentry-cleanups' (dcache access cleanups and optimizations)Linus Torvalds2012-05-212-8/+3
|\ \ \ \ | |_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This branch simplifies and clarifies the dcache lookup, and allows us to do certain nice optimizations when comparing dentries. It also cleans up the interface to __d_lookup_rcu(), especially around passing the inode information around. * dentry-cleanups: vfs: make it possible to access the dentry hash/len as one 64-bit entry vfs: move dentry name length comparison from dentry_cmp() into callers vfs: do the careful dentry name access for all dentry_cmp cases vfs: remove unnecessary d_unhashed() check from __d_lookup_rcu vfs: clean up __d_lookup_rcu() and dentry_cmp() interfaces
| * | | vfs: make it possible to access the dentry hash/len as one 64-bit entryLinus Torvalds2012-05-102-8/+3
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows comparing hash and len in one operation on 64-bit architectures. Right now only __d_lookup_rcu() takes advantage of this, since that is the case we care most about. The use of anonymous struct/unions hides the alternate 64-bit approach from most users, the exception being a few cases where we initialize a 'struct qstr' with a static initializer. This makes the problematic cases use a new QSTR_INIT() helper function for that (but initializing just the name pointer with a "{ .name = xyzzy }" initializer remains valid, as does just copying another qstr structure). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | auth_gss: the list of pseudoflavors not being parsed correctlySteve Dickson2012-05-031-3/+4
|/ / | | | | | | | | | | | | | | | | gss_mech_list_pseudoflavors() parses a list of registered mechanisms. On that list contains a list of pseudo flavors which was not being parsed correctly, causing only the first pseudo flavor to be found. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | SUNRPC: RPC client must use the current utsname hostname stringTrond Myklebust2012-04-301-4/+10
| | | | | | | | | | | | | | | | | | | | Now that the rpc client is namespace aware, it needs to use the utsname of the process that created it instead of using the init_utsname. Both rpc_new_client and rpc_clone_client need to be fixed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
* | SUNRPC: traverse clients tree on PipeFS eventStanislav Kinsbursky2012-04-271-4/+25
| | | | | | | | | | | | | | | | | | | | | | | | v2: recursion was replaced by loop If client is a clone, then it's parent can not be in the list. But parent's Pipefs dentries have to be created and destroyed. Note: event skip helper for clients introduced Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | SUNRPC: set per-net PipeFS superblock before notificationStanislav Kinsbursky2012-04-271-1/+2
| | | | | | | | | | | | | | | | | | | | There can be a case, when on MOUNT event RPC client (after it's dentries were created) is not longer hold by anyone except notification callback. I.e. on release this client will be destoroyed. And it's dentries have to be destroyed as well. Which in turn requires per-net PipeFS superblock to be set. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | SUNRPC: skip clients with program without PipeFS entriesStanislav Kinsbursky2012-04-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | 1) This is sane. 2) Otherwise there will be soft lockup: do { rpc_get_client_for_event (clnt->cl_dentry == NULL ==> choose) __rpc_pipefs_event (clnt->cl_program->pipe_dir_name == NULL ==> return) } while (1) Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | SUNRPC: skip dead but not buried clients on PipeFS eventsStanislav Kinsbursky2012-04-271-1/+2
| | | | | | | | | | | | | | These clients can't be safely dereferenced if their counter in 0. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | SUNRPC: register PipeFS file system after pernet sybsystemStanislav Kinsbursky2012-04-181-8/+9
|/ | | | | | | | | | PipeFS superblock creation routine relays on SUNRPC pernet data presense, which is created on register_pernet_subsys() call in SUNRPC module init function. Registering of PipeFS filesystem prior to registering of per-net subsystem leads to races (mount of PipeFS can dereference uninitialized data). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Merge branch 'for-3.4' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2012-03-2911-83/+59
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd changes from Bruce Fields: Highlights: - Benny Halevy and Tigran Mkrtchyan implemented some more 4.1 features, moving us closer to a complete 4.1 implementation. - Bernd Schubert fixed a long-standing problem with readdir cookies on ext2/3/4. - Jeff Layton performed a long-overdue overhaul of the server reboot recovery code which will allow us to deprecate the current code (a rather unusual user of the vfs), and give us some needed flexibility for further improvements. - Like the client, we now support numeric uid's and gid's in the auth_sys case, allowing easier upgrades from NFSv2/v3 to v4.x. Plus miscellaneous bugfixes and cleanup. Thanks to everyone! There are also some delegation fixes waiting on vfs review that I suppose will have to wait for 3.5. With that done I think we'll finally turn off the "EXPERIMENTAL" dependency for v4 (though that's mostly symbolic as it's been on by default in distro's for a while). And the list of 4.1 todo's should be achievable for 3.5 as well: http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues though we may still want a bit more experience with it before turning it on by default. * 'for-3.4' of git://linux-nfs.org/~bfields/linux: (55 commits) nfsd: only register cld pipe notifier when CONFIG_NFSD_V4 is enabled nfsd4: use auth_unix unconditionally on backchannel nfsd: fix NULL pointer dereference in cld_pipe_downcall nfsd4: memory corruption in numeric_name_to_id() sunrpc: skip portmap calls on sessions backchannel nfsd4: allow numeric idmapping nfsd: don't allow legacy client tracker init for anything but init_net nfsd: add notifier to handle mount/unmount of rpc_pipefs sb nfsd: add the infrastructure to handle the cld upcall nfsd: add a header describing upcall to nfsdcld nfsd: add a per-net-namespace struct for nfsd sunrpc: create nfsd dir in rpc_pipefs nfsd: add nfsd4_client_tracking_ops struct and a way to set it nfsd: convert nfs4_client->cl_cb_flags to a generic flags field NFSD: Fix nfs4_verifier memory alignment NFSD: Fix warnings when NFSD_DEBUG is not defined nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes) nfsd: rename 'int access' to 'int may_flags' in nfsd_open() ext4: return 32/64-bit dir name hash according to usage type fs: add new FMODE flags: FMODE_32bithash and FMODE_64bithash ...
| * sunrpc: skip portmap calls on sessions backchannelJ. Bruce Fields2012-03-261-0/+1
| | | | | | | | | | | | | | There's obviously no point to doing portmap calls over the sessions backchannel. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * sunrpc: create nfsd dir in rpc_pipefsJeff Layton2012-03-261-0/+5
| | | | | | | | | | | | | | | | Add a new top-level dir in rpc_pipefs to hold the pipe for the clientid upcall. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * Merge nfs containerization work from Trond's treeJ. Bruce Fields2012-03-2626-730/+1547
| |\ | | | | | | | | | | | | The nfs containerization work is a prerequisite for Jeff Layton's reboot recovery rework.
| * | svcrdma: Cleanup sparse warnings in the svcrdma moduleTom Tucker2012-02-176-80/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The svcrdma transport was un-marshalling requests in-place. This resulted in sparse warnings due to __beXX data containing both NBO and HBO data. The code has been restructured to do byte-swapping as the header is parsed instead of when the header is validated immediately after receipt. Also moved extern declarations for the workqueue and memory pools to the private header file. Signed-off-by: Tom Tucker <tom@ogc.us> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | SUNPRC: remove marking service temporary sockets with XPT_CHNGBUFStanislav Kinsbursky2012-02-031-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a cleanup patch. Service temporary sockets can be TCP or RDMA only. But XPT_CHNGBUF service socket flag is checked only for UDP sockets on receive. Thus (if I don't miss something non-obvious) this bit raising for temporary sockets can be removed. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | nfsd: remove some unneeded checksDan Carpenter2012-02-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | We check for zero length strings in the caller now, so these aren't needed. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | nfsd: don't allow zero length strings in cache_parse()Dan Carpenter2012-02-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no point in passing a zero length string here and quite a few of that cache_parse() implementations will Oops if count is zero. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | | Merge tag 'nfs-for-3.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2012-03-281-1/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client bugfixes for Linux 3.4 from Trond Myklebust Highlights include: - Fix infinite loops in the mount code - Fix a userspace buffer overflow in __nfs4_get_acl_uncached - Fix a memory leak due to a double reference count in rpcb_getport_async() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> * tag 'nfs-for-3.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Minor cleanups for nfs4_handle_exception and nfs4_async_handle_error NFSv4.1: Fix layoutcommit error handling NFSv4: Fix two infinite loops in the mount code SUNRPC: Use the already looked-up xprt in rpcb_getport_async() NFS4.1: remove duplicate variable declaration in filelayout_clear_request_commit Fix length of buffer copied in __nfs4_get_acl_uncached
| * | | SUNRPC: Use the already looked-up xprt in rpcb_getport_async()Bryan Schumaker2012-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rbcb_getport_async() was looking up the rpc_xprt (reference++) and then later looking it up again (reference++) to pass through the rpcbind_args. The xprt would only be dereferenced once, when we were done with the rpcbind_args (reference--). This leaves an extra reference to the transport that would never go away. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | | Merge tag 'split-asm_system_h-for-linus-20120328' of ↵Linus Torvalds2012-03-281-1/+0
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system Pull "Disintegrate and delete asm/system.h" from David Howells: "Here are a bunch of patches to disintegrate asm/system.h into a set of separate bits to relieve the problem of circular inclusion dependencies. I've built all the working defconfigs from all the arches that I can and made sure that they don't break. The reason for these patches is that I recently encountered a circular dependency problem that came about when I produced some patches to optimise get_order() by rewriting it to use ilog2(). This uses bitops - and on the SH arch asm/bitops.h drags in asm-generic/get_order.h by a circuituous route involving asm/system.h. The main difficulty seems to be asm/system.h. It holds a number of low level bits with no/few dependencies that are commonly used (eg. memory barriers) and a number of bits with more dependencies that aren't used in many places (eg. switch_to()). These patches break asm/system.h up into the following core pieces: (1) asm/barrier.h Move memory barriers here. This already done for MIPS and Alpha. (2) asm/switch_to.h Move switch_to() and related stuff here. (3) asm/exec.h Move arch_align_stack() here. Other process execution related bits could perhaps go here from asm/processor.h. (4) asm/cmpxchg.h Move xchg() and cmpxchg() here as they're full word atomic ops and frequently used by atomic_xchg() and atomic_cmpxchg(). (5) asm/bug.h Move die() and related bits. (6) asm/auxvec.h Move AT_VECTOR_SIZE_ARCH here. Other arch headers are created as needed on a per-arch basis." Fixed up some conflicts from other header file cleanups and moving code around that has happened in the meantime, so David's testing is somewhat weakened by that. We'll find out anything that got broken and fix it.. * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits) Delete all instances of asm/system.h Remove all #inclusions of asm/system.h Add #includes needed to permit the removal of asm/system.h Move all declarations of free_initmem() to linux/mm.h Disintegrate asm/system.h for OpenRISC Split arch_align_stack() out from asm-generic/system.h Split the switch_to() wrapper out of asm-generic/system.h Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h Create asm-generic/barrier.h Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h Disintegrate asm/system.h for Xtensa Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt] Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Sparc Disintegrate asm/system.h for SH Disintegrate asm/system.h for Score Disintegrate asm/system.h for S390 Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PA-RISC Disintegrate asm/system.h for MN10300 ...
| * | | Remove all #inclusions of asm/system.hDavid Howells2012-03-281-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove all #inclusions of asm/system.h preparatory to splitting and killing it. Performed with the following command: perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *` Signed-off-by: David Howells <dhowells@redhat.com>
* | | | Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2012-03-2326-730/+1547
|\ \ \ \ | |/ / / |/| | / | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates for Linux 3.4 from Trond Myklebust: "New features include: - Add NFS client support for containers. This should enable most of the necessary functionality, including lockd support, and support for rpc.statd, NFSv4 idmapper and RPCSEC_GSS upcalls into the correct network namespace from which the mount system call was issued. - NFSv4 idmapper scalability improvements Base the idmapper cache on the keyring interface to allow concurrent access to idmapper entries. Start the process of migrating users from the single-threaded daemon-based approach to the multi-threaded request-key based approach. - NFSv4.1 implementation id. Allows the NFSv4.1 client and server to mutually identify each other for logging and debugging purposes. - Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of having to use the more counterintuitive 'vers=4,minorversion=1'. - SUNRPC tracepoints. Start the process of adding tracepoints in order to improve debugging of the RPC layer. - pNFS object layout support for autologin. Important bugfixes include: - Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail to wake up all tasks when applied to priority waitqueues. - Ensure that we handle read delegations correctly, when we try to truncate a file. - A number of fixes for NFSv4 state manager loops (mostly to do with delegation recovery)." * tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits) NFS: fix sb->s_id in nfs debug prints xprtrdma: Remove assumption that each segment is <= PAGE_SIZE xprtrdma: The transport should not bug-check when a dup reply is received pnfs-obj: autologin: Add support for protocol autologin NFS: Remove nfs4_setup_sequence from generic rename code NFS: Remove nfs4_setup_sequence from generic unlink code NFS: Remove nfs4_setup_sequence from generic read code NFS: Remove nfs4_setup_sequence from generic write code NFS: Fix more NFS debug related build warnings SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined nfs: non void functions must return a value SUNRPC: Kill compiler warning when RPC_DEBUG is unset SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG NFS: Use cond_resched_lock() to reduce latencies in the commit scans NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner NFS: ncommit count is being double decremented SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up() Try using machine credentials for RENEW calls NFSv4.1: Fix a few issues in filelayout_commit_pagelist NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code ...
| * | xprtrdma: Remove assumption that each segment is <= PAGE_SIZETom Tucker2012-03-211-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | The xprtrdma FRMR mapping logic assumes that a segment is <= PAGE_SIZE. This is not true for NFS4. Signed-off-by: Tom Tucker <tom@ogc.us> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | xprtrdma: The transport should not bug-check when a dup reply is receivedTom Tucker2012-03-211-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | The client side RDMA transport will bug check if it receives a duplicate reply, instead we should simply drop the duplicate reply. Signed-off-by: Tom Tucker <tom@ogc.us> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefinedTrond Myklebust2012-03-211-14/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stephen Rothwell reports: net/sunrpc/rpcb_clnt.c: In function 'rpcb_enc_mapping': net/sunrpc/rpcb_clnt.c:820:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_getport': net/sunrpc/rpcb_clnt.c:837:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_set': net/sunrpc/rpcb_clnt.c:860:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_enc_getaddr': net/sunrpc/rpcb_clnt.c:892:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_getaddr': net/sunrpc/rpcb_clnt.c:914:19: warning: unused variable 'task' [-Wunused-variable] fs/lockd/svclock.c:49:20: warning: 'nlmdbg_cookie2a' declared 'static' but never defined [-Wunused-function] Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUGTrond Myklebust2012-03-201-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows us to turn on/off the dprintk() debugging interfaces for those distributions that don't ship the 'rpcdebug' utility. It also allows us to add Kbuild dependencies. Specifically, we already know that dprintk() in general relies on CONFIG_SYSCTL. Now it turns out that the NFS dprintks depend on CONFIG_CRC32 after we added support for the filehandle hash. Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()Trond Myklebust2012-03-191-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem is that for the case of priority queues, we have to assume that __rpc_remove_wait_queue_priority will move new elements from the tk_wait.links lists into the queue->tasks[] list. We therefore cannot use list_for_each_entry_safe() on queue->tasks[], since that will skip these new tasks that __rpc_remove_wait_queue_priority is adding. Without this fix, rpc_wake_up and rpc_wake_up_status will both fail to wake up all functions on priority wait queues, which can result in some nasty hangs. Reported-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
| * | SUNRPC: Don't use variable length automatic arrays in kernel codeTrond Myklebust2012-03-121-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | Replace the variable length array in the RPCSEC_GSS crypto code with a fixed length one. The size should be bounded by the variable GSS_KRB5_MAX_BLOCKSIZE, so use that. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: Fix a few sparse warningsTrond Myklebust2012-03-117-12/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | net/sunrpc/svcsock.c:412:22: warning: incorrect type in assignment (different address spaces) - svc_partial_recvfrom now takes a struct kvec, so the variable save_iovbase needs to be an ordinary (void *) Make a bunch of variables in net/sunrpc/xprtsock.c static Fix a couple of "warning: symbol 'foo' was not declared. Should it be static?" reports. Fix a couple of conflicting function declarations. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: Add API to acquire source addressChuck Lever2012-03-021-0/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFSv4.0 clients must send endpoint information for their callback service to NFSv4.0 servers during their first contact with a server. Traditionally on Linux, user space provides the callback endpoint IP address via the "clientaddr=" mount option. During an NFSv4 migration event, it is possible that an FSID may be migrated to a destination server that is accessible via a different source IP address than the source server was. The client must update callback endpoint information on the destination server so that it can maintain leases and allow delegation. Without a new "clientaddr=" option from user space, however, the kernel itself must construct an appropriate IP address for the callback update. Provide an API in the RPC client for upper layer RPC consumers to acquire a source address for a remote. The mechanism used by the mount.nfs command is copied: set up a connected UDP socket to the designated remote, then scrape the source address off the socket. We are careful to select the correct network namespace when setting up the temporary UDP socket. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: Move clnt->cl_server into struct rpc_xprtTrond Myklebust2012-03-024-49/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the cl_xprt field is updated, the cl_server field will also have to change. Since the contents of cl_server follow the remote endpoint of cl_xprt, just move that field to the rpc_xprt. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [ cel: simplify check_gss_callback_principal(), whitespace changes ] [ cel: forward ported to 3.4 ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: Use RCU to dereference the rpc_clnt.cl_xprt fieldTrond Myklebust2012-03-025-28/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A migration event will replace the rpc_xprt used by an rpc_clnt. To ensure this can be done safely, all references to cl_xprt must now use a form of rcu_dereference(). Special care is taken with rpc_peeraddr2str(), which returns a pointer to memory whose lifetime is the same as the rpc_xprt. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [ cel: fix lockdep splats and layering violations ] [ cel: forward ported to 3.4 ] [ cel: remove rpc_max_reqs(), add rpc_net_ns() ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: move waitq from RPC pipe to RPC inodeStanislav Kinsbursky2012-02-271-13/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, wait queue, used for polling of RPC pipe changes from user-space, is a part of RPC pipe. But the pipe data itself can be released on NFS umount prior to dentry-inode pair, connected to it (is case of this pair is open by some process). This is not a problem for almost all pipe users, because all PipeFS file operations checks pipe reference prior to using it. Except evenfd. This thing registers itself with "poll" file operation and thus has a reference to pipe wait queue. This leads to oopses on destroying eventfd after NFS umount (like rpc_idmapd do) since not pipe data left to the point already. The solution is to wait queue from pipe data to internal RPC inode data. This looks more logical, because this wiat queue used only for user-space processes, which already holds inode reference. Note: upcalls have to get pipe->dentry prior to dereferecing wait queue to make sure, that mount point won't disappear from underneath us. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: check RPC inode's pipe reference before dereferencingStanislav Kinsbursky2012-02-271-13/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are 2 tightly bound objects: pipe data (created for kernel needs, has reference to dentry, which depends on PipeFS mount/umount) and PipeFS dentry/inode pair (created on mount for user-space needs). They both independently may have or have not a valid reference to each other. This means, that we have to make sure, that pipe->dentry reference is valid on upcalls, and dentry->pipe reference is valid on downcalls. The latter check is absent - my fault. IOW, PipeFS dentry can be opened by some process (rpc.idmapd for example), but it's pipe data can belong to NFS mount, which was unmounted already and thus pipe data was destroyed. To fix this, pipe reference have to be set to NULL on rpc_unlink() and checked on PipeFS file operations instead of pipe->dentry check. Note: PipeFS "poll" file operation will be updated in next patch, because it's logic is more complicated. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: release per-net clients lock before calling PipeFS dentries creationStanislav Kinsbursky2012-02-271-4/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v3: 1) Lookup for client is performed from the beginning of the list on each PipeFS event handling operation. Lockdep is sad otherwise, because inode mutex is taken on PipeFS dentry creation, which can be called on mount notification, where this per-net client lock is taken on clients list walk. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: include filelayout DS rpc stats in mountstatsWeston Andros Adamson2012-02-172-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | Include RPC statistics from all data servers in /proc/self/mountstats for pNFS filelayout mounts. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: add sending,pending queue and max slot to xprt statsAndy Adamson2012-02-162-7/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With static RPC slots, the xprt backlog queue stats were useful in showing when the transport (TCP) was starved by lack of RPC slots. The new dynamic RPC slot code, commit d9ba131d8f58c0d2ff5029e7002ab43f913b36f9, always provides an RPC slot and so only uses the xprt backlog queue when the tcp_max_slot_table_entries value has been hit or when an allocation error occurs. All requests are now placed on the xprt sending or pending queue which need to be monitored for debugging. The max_slot stat shows the maximum number of dynamic RPC slots reached which is useful when debugging performance issues. Add the new fields at the end of the mountstats xprt stanza so that mountstats outputs the previous correct values and ignores the new fields. Bump NFS_IOSTATS_VERS. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: init per-net rpcbind spinlockStanislav Kinsbursky2012-02-161-0/+1
| | | | | | | | | | | | | | | Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: Ensure that we can trace waitqueues when !defined(CONFIG_SYSCTL)Trond Myklebust2012-02-151-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tracepoint code relies on the queue->name being defined in order to be able to display the name of the waitqueue on which an RPC task is sleeping. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Randy Dunlap <rdunlap@xenotime.net>
| * | Lockd: per-net up and down routines introducedStanislav Kinsbursky2012-02-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces per-net Lockd initialization and destruction routines. The logic is the same as in global Lockd up and down routines. Probably the solution is not the best one. But at least it looks clear. So per-net "up" routine are called only in case of lockd is running already. If per-net resources are not allocated yet, then service is being registered with local portmapper and lockd sockets created. Per-net "down" routine is called on every lockd_down() call in case of global users counter is not zero. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: service shutdown function in network namespace context introducedStanislav Kinsbursky2012-02-151-13/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function is enough for releasing resources, allocated for network namespace context, in case of sharing service between them. IOW, each service "user" (LockD, NFSd, etc), which wants to share service between network namespaces, have to release related resources by the function, introduced in this patch, instead of performing service shutdown (of course in case the service is shared already to the moment of release). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: service destruction in network namespace contextStanislav Kinsbursky2012-02-152-12/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: Added comment to BUG_ON's in svc_destroy() to make code looks clearer. This patch introduces network namespace filter for service destruction function. Nothing special here - just do exactly the same operations, but only for tranports in passed networks namespace context. BTW, BUG_ON() checks for empty service transports lists were returned into svc_destroy() function. This is because of swithing generic svc_close_all() to networks namespace dependable svc_close_net(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
OpenPOWER on IntegriCloud