op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message	Author	Age	Files	Lines
*	lockd: use rpc client's cl_nodename for id encoding	Stanislav Kinsbursky	2012-10-01	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	Taking hostname from uts namespace is not safe, because this could be performed during umount operation on child reaper death. And in this case current->nsproxy is NULL already. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: <stable@vger.kernel.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	lockd: per-net NSM client creation and destruction helpers introduced	Stanislav Kinsbursky	2012-10-01	3	-2/+54
\| \| \| \| \| \| \| \| \| \| \| \| \|	NSM RPC client can be required on NFSv3 umount, when child reaper is dying (and destroying its mount namespace). It means that current nsproxy is set to NULL already, but creation of RPC client requires UTS namespace for gaining hostname string. This patch introduces reference counted NFS RPC clients creation and destruction helpers (similar to RPCBIND RPC clients). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: <stable@vger.kernel.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
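
The reference counted, per-net creation and destruction helpers described in the commit above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not the kernel code: `PerNetClientRegistry`, `get`, `put`, `refcount` and the `factory` callable are hypothetical names. The idea it illustrates is that the client (and the hostname it needs) is created while the first user still has a valid namespace, so later users, including one running at umount time, only take a reference.

```python
class PerNetClientRegistry:
    """Hypothetical userspace model of per-net, reference counted clients."""

    def __init__(self, factory):
        # factory(net, nodename) builds a client; assumed to be supplied by the caller.
        self._factory = factory
        # Maps a network namespace key to a [client, refcount] pair.
        self._clients = {}

    def get(self, net, nodename):
        """Return the client for `net`, creating it on first use."""
        entry = self._clients.get(net)
        if entry is None:
            # The nodename is captured here, at creation time only.
            entry = [self._factory(net, nodename), 0]
            self._clients[net] = entry
        entry[1] += 1
        return entry[0]

    def put(self, net):
        """Drop one reference; return True if the client was destroyed."""
        entry = self._clients[net]
        entry[1] -= 1
        if entry[1] == 0:
            del self._clients[net]
            return True
        return False

    def refcount(self, net):
        """Return the current reference count for `net` (0 if absent)."""
        entry = self._clients.get(net)
        return entry[1] if entry else 0
```

In this model every `get()` must be paired with exactly one `put()`, which mirrors the usual refcount discipline; the last `put()` is the only call that tears the client down.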
*	NFS: add debug messages to callback down function	Stanislav Kinsbursky	2012-10-01	1	-0/+2
\| \| \| \| \|	Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: callback per-net usage counting introduced	Stanislav Kinsbursky	2012-10-01	2	-2/+18
\| \| \| \| \| \| \| \|	This patch also introduces refcount-aware nfs_callback_down_net() wrapper for svc_shutdown_net(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: make nfs_callback_tcpport6 per network context	Stanislav Kinsbursky	2012-10-01	4	-6/+4
\| \| \| \| \|	Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: make nfs_callback_tcpport per network context	Stanislav Kinsbursky	2012-10-01	4	-4/+8
\| \| \| \| \|	Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: callback up - users counting cleanup	Stanislav Kinsbursky	2012-10-01	1	-12/+10
\| \| \| \| \| \| \| \| \| \|	Usage counter is now increased only if the service was started successfully. Even if the service is running already, the goto is not required anymore, because service creation and start will be skipped. With this patch the code looks clearer. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: callback service start function introduced	Stanislav Kinsbursky	2012-10-01	1	-32/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is just a code move, which from my POV makes the code look better. I.e. now on start we have 3 different stages: 1) Service creation. 2) Service per-net data allocation. 3) Service start. Patch also renames goto label "out_err:" into "err_start:" to reflect new changes. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: callback up - transport backchannel cleanup	Stanislav Kinsbursky	2012-10-01	1	-17/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	No need to assign the transport's backchannel server explicitly in nfs41_callback_up() - there is the nfs_callback_bc_serv() function for this. By using it, nfs4_callback_up() and nfs41_callback_up() can be called without the transport argument. Note: the service has to be passed to nfs_callback_bc_serv() instead of the callback, since the callback link can be uninitialized. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: move per-net callback thread initialization to nfs_callback_up_net()	Stanislav Kinsbursky	2012-10-01	2	-48/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	v4: 1) Callback transport creation routine selection by version simplified. This new function is now called before nfs_minorversion_callback_svc_setup(). Also few small changes: 1) current network namespace in nfs_callback_up() was replaced by transport net. 2) svc_shutdown_net() was moved prior to callback usage counter decrement (because in case of per-net data allocation failure svc_shutdown_net() has to be skipped). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: callback service creation function introduced	Stanislav Kinsbursky	2012-10-01	1	-14/+49
\| \| \| \| \| \| \| \| \|	This function creates the service if it does not exist, or increases the usage counter of the existing one, and returns a pointer to it. The usage counter will be dropped by svc_destroy() later in nfs_callback_up(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: pass net to nfs_callback_down()	Stanislav Kinsbursky	2012-10-01	3	-4/+4
\| \| \| \| \|	Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4: Add ACCESS operation to OPEN compound	Weston Andros Adamson	2012-10-01	5	-13/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The OPEN operation has no way to differentiate an open for read and an open for execution - both look like read to the server. This allowed users to read files that didn't have READ access but did have EXEC access, which is obviously wrong. This patch adds an ACCESS call to the OPEN compound to handle the difference between OPENs for reading and execution. Since we're going through the trouble of calling ACCESS, we check all possible access bits and cache the results hopefully avoiding an ACCESS call in the future. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: Use kzalloc() instead of kmalloc() in the idmapper	Bryan Schumaker	2012-10-01	1	-4/+1
\| \| \| \| \| \| \| \|	This will allocate memory that has already been zeroed, allowing us to remove the memset later on. Signed-off-by: Bryan Schumaker <bjchuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: Remove bad delegations during open recovery	Bryan Schumaker	2012-10-01	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I put the client into an open recovery loop by: Client: open file, read half; Server: expire client (echo 0 > /sys/kernel/debug/nfsd/forget_clients); Client: drop vm cache (echo 3 > /proc/sys/vm/drop_caches), finish reading file. This causes a loop because the client never updates the nfs4_state after discovering that the delegation is invalid. This means it will keep trying to read using the bad delegation rather than attempting to re-open the file. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> CC: stable@vger.kernel.org [3.4+] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: Always use the open stateid when checking for expired opens	Bryan Schumaker	2012-10-01	1	-1/+1
\| \| \| \| \| \| \| \| \|	If we are reading through a delegation, and the delegation is OK then state->stateid will still point to a delegation stateid and not an open stateid. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	SUNRPC: Limit the rpciod workqueue concurrency	Trond Myklebust	2012-09-28	1	-1/+1
\| \| \| \| \| \| \|	We shouldn't need more than 1 worker thread per cpu, since rpciod is designed to run without sleeping in most cases. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: nfs4_proc_layoutreturn must always drop the plh_block_lgets count	Trond Myklebust	2012-09-28	1	-6/+6
\| \| \| \| \| \| \| \| \| \|	Currently it does not do so if the RPC call failed to start. Fix is to move the decrement of plh_block_lgets into nfs4_layoutreturn_release. Also remove a redundant test of task->tk_status in nfs4_layoutreturn_done: if lrp->res.lrs_present is set, then obviously the RPC call succeeded. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: _pnfs_return_layout() shouldn't invalidate the layout on failure	Trond Myklebust	2012-09-28	1	-2/+3
\| \| \| \| \| \| \| \|	Failure of the layoutreturn allocation is not a good reason to mark the pnfs_layout_hdr as having failed a layoutget or i/o. Just exit cleanly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Remove the NFS_LAYOUT_RETURNED state	Trond Myklebust	2012-09-28	2	-25/+1
\| \| \| \| \| \| \|	It serves no purpose that the test for whether or not we have valid layout segments doesn't already serve. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Clear NFS_LAYOUT_BULK_RECALL when the layout segments are freed	Trond Myklebust	2012-09-28	1	-0/+2
\| \| \| \| \| \| \|	Once all the affected layout segments have been freed up, clear the NFS_LAYOUT_BULK_RECALL flag so that we can reuse the pnfs_layout_hdr. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Get rid of the NFS_LAYOUT_DESTROYED state	Trond Myklebust	2012-09-28	3	-17/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already have a mechanism for blocking LAYOUTGET by means of the plh_block_lgets counter. The only "service" that NFS_LAYOUT_DESTROYED provides at this point is to block layoutget once the layout segment list is empty, which basically means that you have to wait until the pnfs_layout_hdr is destroyed before you can do pNFS on that file again. This patch enables the reuse of the pnfs_layout_hdr if the layout segment list is empty. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Remove unused 'default allocation' for pnfs_alloc_layout_hdr()	Trond Myklebust	2012-09-28	1	-3/+2
\| \| \| \| \| \|	...and ditto for pnfs_free_layout_hdr() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Get rid of pNFS spin lock debugging asserts...	Trond Myklebust	2012-09-28	1	-3/+0
\| \| \| \| \| \|	These are all in static declared functions that are called only once. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Balance pnfs_layout_hdr refcount in pnfs_layout_(insert\|remove)_lseg	Trond Myklebust	2012-09-28	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	Ensure that the reference count for pnfs_layout_hdr reverts to the original value after a call to pnfs_layout_remove_lseg(). Note that the caller is expected to hold a reference to the struct pnfs_layout_hdr. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Clean up pnfs_put_lseg()	Trond Myklebust	2012-09-28	1	-6/+3
\| \| \| \| \| \| \|	There is no longer a need to use pnfs_free_lseg_list(). Just call pnfs_free_lseg() directly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Clean up the removal of pnfs_layout_hdr from the server list	Trond Myklebust	2012-09-28	2	-20/+28
\| \| \| \| \| \| \| \|	Move the code into pnfs_free_layout_hdr(), and add checks to get_layout_by_fh_locked to ensure that they don't reference a layout that is being freed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Free the pnfs_layout_hdr outside the inode->i_lock	Trond Myklebust	2012-09-28	1	-12/+9
\| \| \| \| \| \| \|	None of the existing pNFS layout drivers seem to require the inode to be locked while they free the layout header. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Remove redundant reference to the pnfs_layout_hdr	Trond Myklebust	2012-09-28	1	-9/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Each layout segment already holds a reference to the pnfs_layout_hdr, so there is no need to hold an extra reference that is released once the last layout segment is freed. Ensure that pnfs_find_alloc_layout() always returns a reference to the pnfs_layout_hdr, which will be matched by the final call to pnfs_put_layout_hdr() in pnfs_update_layout(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Rename the pnfs_put_lseg_common to pnfs_layout_remove_lseg	Trond Myklebust	2012-09-28	1	-12/+15
\| \| \| \| \| \| \|	The latter name is more descriptive of the actual function. Also rename pnfs_insert_layout to pnfs_layout_insert_lseg. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: reset the inode MDS threshold counters on layout destruction	Trond Myklebust	2012-09-28	1	-4/+5
\| \| \| \| \| \| \|	Instead of resetting the inode MDS threshold counters when we mark the layout for destruction, do it as part of freeing the layout. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Get rid of pNFS layout state "NFS_LAYOUT_INVALID"	Trond Myklebust	2012-09-28	3	-10/+7
\| \| \| \| \| \| \| \| \| \|	In all cases where we set NFS_LAYOUT_INVALID, we also set NFS_LAYOUT_DESTROYED. Furthermore, in all cases where we test for NFS_LAYOUT_INVALID, we should also be testing for NFS_LAYOUT_DESTROYED, since the latter means that we hold no valid layout segments. Ergo the two are redundant. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Simplify the pNFS return-on-close code	Trond Myklebust	2012-09-28	3	-10/+5
\| \| \| \| \| \|	Confine it to the nfs4_do_close() code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Fix a race in the pNFS return-on-close code	Trond Myklebust	2012-09-28	3	-17/+17
\| \| \| \| \| \| \|	If we sleep after dropping the inode->i_lock, then we are no longer atomic with respect to the rpc_wake_up() call in pnfs_layout_remove_lseg(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: pnfs_layout_io_set_failed must clear invalid lsegs	Trond Myklebust	2012-09-28	1	-0/+8
\| \| \| \| \| \| \| \|	If pnfs_layout_io_test_failed() authorises a retry of the failed layoutgets, we should clear the existing layout segments so that we start afresh. Do this in pnfs_layout_io_set_failed(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Don't drop the pnfs_layout_hdr after a layoutget failure	Trond Myklebust	2012-09-28	1	-8/+32
\| \| \| \| \| \| \| \|	We want to cache the pnfs_layout_hdr after a layoutget or i/o failure so that pnfs_update_layout() can find it and know when it is time to retry. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Fix a reference leak in pnfs_update_layout	Trond Myklebust	2012-09-28	1	-3/+6
\| \| \| \| \| \| \|	If we exit after the call to pnfs_find_alloc_layout(), we have to ensure that we put the struct pnfs_layout_hdr. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: pNFS data servers may be temporarily offline	Trond Myklebust	2012-09-28	5	-17/+59
\| \| \| \| \| \| \| \| \| \| \|	In cases where the pNFS data server is just temporarily out of service, we want to mark it as such, and then try again later. Typically that will be in cases of network connection errors etc. This patch allows us to mark the devices as being "unavailable" for such transient errors, and will make them available for retries after a 2 minute timeout period. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
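
The "mark as unavailable, retry after a timeout" behaviour described in the commit above can be sketched as a small userspace model. Everything named here (`DeviceAvailability`, `mark_unavailable`, `is_unavailable`, the injectable `clock`) is hypothetical and chosen for illustration; only the 2 minute period comes from the commit message.

```python
import time

# The 2 minute retry period named in the commit message.
RETRY_TIMEOUT_SECONDS = 120


class DeviceAvailability:
    """Hypothetical model of transient 'unavailable' marking for data servers."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the timeout can be exercised without sleeping.
        self._clock = clock
        # Maps a device id to the time it was marked unavailable.
        self._unavailable_since = {}

    def mark_unavailable(self, device_id):
        """Record a transient error (e.g. a connection failure) for a device."""
        self._unavailable_since[device_id] = self._clock()

    def is_unavailable(self, device_id):
        """Return True while the device is inside its retry timeout window."""
        since = self._unavailable_since.get(device_id)
        if since is None:
            return False
        if self._clock() - since >= RETRY_TIMEOUT_SECONDS:
            # Timeout elapsed: forget the mark so the device can be retried.
            del self._unavailable_since[device_id]
            return False
        return True
```

A caller in this model would fall back to another I/O path while `is_unavailable()` returns True and attempt the data server again once it returns False.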
*	NFSv4.1: Retry pNFS after a 2 minute timeout	Trond Myklebust	2012-09-28	2	-1/+15
\| \| \| \| \| \| \| \|	If we had to fall back to read/write through MDS, then assume that we should retry pNFS after a suitable timeout period. The following patch sets a timeout of 2 minutes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Add helpers for setting/reading the I/O fail bit	Trond Myklebust	2012-09-28	2	-18/+26
\| \| \| \| \| \|	...and make them local to the pnfs.c file. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Replace dprintk() in pnfs_update_layout with something less buggy	Trond Myklebust	2012-09-28	1	-7/+11
\| \| \| \| \| \| \| \| \| \|	Dereferencing nfsi->layout in order to read plh_flags without holding a spin lock is bug prone. Furthermore, the dprintk() tells you nothing about whether or not the call succeeded. Replace it with something that tells you about whether or not a valid layout segment was returned for the inode in question. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Replace get_device_info() with filelayout_get_device_info()	Trond Myklebust	2012-09-28	4	-4/+4
\| \| \| \| \| \|	Fix the namespace pollution issue. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Cleanup; add "pnfs_" prefix to put_lseg() and get_lseg()	Trond Myklebust	2012-09-28	4	-31/+31
\| \| \| \|	Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Cleanup; add "pnfs_" prefix to get_layout_hdr() and put_layout_hdr()	Trond Myklebust	2012-09-28	4	-22/+22
\| \| \| \|	Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFSv4.1: Cleanup add a "pnfs_" prefix to mark_matching_lsegs_invalid	Trond Myklebust	2012-09-28	3	-6/+6
\| \| \| \|	Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: Clean up the pNFS layoutget interface	Trond Myklebust	2012-09-28	4	-17/+27
\| \| \| \| \| \| \| \|	Ensure that we do return errors from nfs4_proc_layoutget() and that we don't mark the layout as having failed if the error was due to a signal or resource problem on the client side. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	SUNRPC: Get rid of the redundant xprt->shutdown bit field	Trond Myklebust	2012-09-28	4	-40/+11
\| \| \| \| \| \| \| \| \|	It is only set after everyone has dereferenced the transport, and serves no useful purpose: setting it is racy, so all the socket code, etc still needs to be able to cope with the cases where they miss reading it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: Write the entire file if a server reboot occurs during fsync()	Trond Myklebust	2012-09-28	2	-0/+14
\| \| \| \| \| \| \|	This is to ensure that we don't clear the NFS_CONTEXT_RESEND_WRITES flag while there are still writes that haven't been resent. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
*	NFS: Fix fdatasync/fsync() when confronted with a server reboot	Trond Myklebust	2012-09-28	4	-22/+36
\| \| \| \| \| \| \| \| \| \| \| \| \|	If the server reboots before it can commit the unstable writes to disk, then nfs_commit_release_pages() will detect this when it compares the verifier returned by COMMIT to the one returned by WRITE. When this happens, the client needs to resend those writes in order to guarantee that they make it to stable storage. This patch adds a signalling mechanism to notify fsync() that it needs to retry all writes before it can exit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
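
The verifier comparison described in the commit above can be illustrated with a simplified userspace model. This is a sketch under assumptions, not the kernel's implementation: `fsync_model`, `send_write`, `send_commit` and `max_rounds` are hypothetical names, and the real client tracks this state per page and per open context rather than in a list.

```python
def fsync_model(writes, send_write, send_commit, max_rounds=8):
    """Flush `writes` and return the number of COMMIT rounds that were needed.

    send_write(write) is assumed to return the verifier the server attached
    to that unstable WRITE; send_commit() is assumed to return the verifier
    from COMMIT. A mismatch means the server rebooted before the data reached
    stable storage, so the affected writes must be sent again.
    """
    # Send every write once and remember the verifier returned for each.
    uncommitted = [(w, send_write(w)) for w in writes]
    for round_no in range(1, max_rounds + 1):
        commit_verifier = send_commit()
        # Writes whose verifier differs from COMMIT's were lost by the server.
        stale = [w for w, verifier in uncommitted if verifier != commit_verifier]
        if not stale:
            return round_no
        # Resend the lost writes; they pick up the server's new verifier.
        uncommitted = [(w, send_write(w)) for w in stale]
    raise RuntimeError("writes still uncommitted after max_rounds attempts")
```

The loop only returns once a COMMIT verifier matches every outstanding write, which is the guarantee fsync() needs before it can exit.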
*	NFSv4: Convert the nfs4_lock_state->ls_flags to a bit field	Trond Myklebust	2012-09-28	3	-11/+11
\| \| \| \|	Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>