path: root/fs/nfsd/nfs4state.c
Commit message (author, date, files changed, lines -removed/+added)
* BKL: remove references to lock_kernel from comments (Arnd Bergmann, 2010-11-17, 1 file, -4/+4)
|   lock_kernel is gone from the code, so the comments should be updated, too. nfsd now uses lock_flocks instead of lock_kernel to protect against POSIX file locks.
|   Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|   Acked-by: J. Bruce Fields <bfields@redhat.com>
|   Cc: linux-nfs@vger.kernel.org
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* nfsd4: fix 4.1 connection registration race (J. Bruce Fields, 2010-11-02, 1 file, -4/+12)
|   If a connection is closed just after a sequence or create_session is sent over it, we could end up trying to register a callback that will never get called since the xprt is already marked dead.
|   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* locks: let the caller free file_lock on ->setlease failure (Christoph Hellwig, 2010-10-31, 1 file, -0/+1)
|   The caller allocated it, the caller should free it. The only issue so far is that we could change the flp pointer even on an error return if the fl_change callback failed. But we can simply move the flp assignment after the fl_change invocation, as the callers don't care about the flp return value if the setlease call failed.
|   Signed-off-by: Christoph Hellwig <hch@lst.de>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* nfsd4: initialize delegation pointer to lease (J. Bruce Fields, 2010-10-30, 1 file, -17/+2)
|   The NFSv4 server was initializing the dp->dl_flock pointer by the somewhat ridiculous method of a locks_copy_lock callback. Now that setlease uses the passed-in lock instead of doing a copy, dl_flock no longer gets set, resulting in the lock leaking on delegation release, and later possible hangs (among other problems). So, initialize dl_flock and get rid of the callback.
|   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|   Acked-by: Arnd Bergmann <arnd@arndb.de>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* locks/nfsd: allocate file lock outside of spinlock (Arnd Bergmann, 2010-10-27, 1 file, -11/+15)
|   As suggested by Christoph Hellwig, this moves allocation of new file locks out of generic_setlease into the callers, nfs4_open_delegation and fcntl_setlease, in order to allow GFP_KERNEL allocations when lock_flocks has become a spinlock.
|   Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|   Acked-by: J. Bruce Fields <bfields@redhat.com>
* Merge branch 'for-2.6.37' of git://linux-nfs.org/~bfields/linux (Linus Torvalds, 2010-10-26, 1 file, -182/+311)
|\
| |   * 'for-2.6.37' of git://linux-nfs.org/~bfields/linux: (99 commits)
| |     svcrpc: svc_tcp_sendto XPT_DEAD check is redundant
| |     svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue
| |     svcrpc: assume svc_delete_xprt() called only once
| |     svcrpc: never clear XPT_BUSY on dead xprt
| |     nfsd4: fix connection allocation in sequence()
| |     nfsd4: only require krb5 principal for NFSv4.0 callbacks
| |     nfsd4: move minorversion to client
| |     nfsd4: delay session removal till free_client
| |     nfsd4: separate callback change and callback probe
| |     nfsd4: callback program number is per-session
| |     nfsd4: track backchannel connections
| |     nfsd4: confirm only on succesful create_session
| |     nfsd4: make backchannel sequence number per-session
| |     nfsd4: use client pointer to backchannel session
| |     nfsd4: move callback setup into session init code
| |     nfsd4: don't cache seq_misordered replies
| |     SUNRPC: Properly initialize sock_xprt.srcaddr in all cases
| |     SUNRPC: Use conventional switch statement when reclassifying sockets
| |     sunrpc/xprtrdma: clean up workqueue usage
| |     sunrpc: Turn list_for_each-s into the ..._entry-s
| |     ...
| |   Fix up trivial conflicts (two different deprecation notices added in separate branches) in Documentation/feature-removal-schedule.txt
| * nfsd4: fix connection allocation in sequence() (J. Bruce Fields, 2010-10-24, 1 file, -14/+17)
| |   We're doing an allocation under a spinlock, and ignoring the possibility of allocation failure. A better fix wouldn't require an unnecessary allocation in the common case, but we'll leave that for later.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: move minorversion to client (J. Bruce Fields, 2010-10-21, 1 file, -2/+10)
| |   The minorversion seems more a property of the client than the callback channel. Some time we should probably also enforce consistent minorversion usage from the client; for now, this is just a cosmetic change.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: delay session removal till free_client (J. Bruce Fields, 2010-10-21, 1 file, -8/+13)
| |   Have unhash_client_locked() remove client and associated sessions from global hashes, but delay further dismantling till free_client(). (After unhash_client_locked(), the only remaining references outside the destroying thread are from any connections which have xpt_user callbacks registered.) This will simplify locking on session destruction.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: separate callback change and callback probe (J. Bruce Fields, 2010-10-21, 1 file, -3/+4)
| |   Only one of the nfsd4_callback_probe callers actually cares about changing the callback information.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: callback program number is per-session (J. Bruce Fields, 2010-10-21, 1 file, -1/+1)
| |   The callback program is allowed to depend on the session which the callback is going over. No change in behavior yet, while we still only do callbacks over a single session for the lifetime of the client.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: track backchannel connections (J. Bruce Fields, 2010-10-21, 1 file, -4/+7)
| |   We need to keep track of which connections are available for use with the backchannel, which for the forechannel, and which for both.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * nfsd4: confirm only on succesful create_session (J. Bruce Fields, 2010-10-21, 1 file, -3/+5)
| |   Following RFC 5661, section 18.36.4: "If the session is not successfully created, then no changes are made to any client records on the server." We shouldn't be confirming or incrementing the sequence id in this case.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: make backchannel sequence number per-session (J. Bruce Fields, 2010-10-21, 1 file, -10/+12)
| |   Currently we don't deal well with a client that has multiple sessions associated with it (either simultaneously, or serially over the lifetime of the client). In particular, we don't attempt to keep the backchannel running after the original session disappears. We will fix that soon.
| |   Once we do that, we need the slot sequence number to be per-session; otherwise, for example, we cannot correctly handle a case like this:
| |     - All session 1 connections are lost.
| |     - The client creates session 2. We use it for the backchannel (since it's the only working choice).
| |     - The client gives us a new connection to use with session 1.
| |     - The client destroys session 2.
| |   At this point our only choice is to go back to using session 1. When we do so we must use the sequence number that is next for session 1. We therefore need to maintain multiple sequence number streams.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
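The scenario in the commit message above can be illustrated with a minimal sketch. It assumes each session simply carries its own counter; `struct session`, `cb_seq_nr`, and `next_cb_seq` are hypothetical names chosen for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical session record; field names are illustrative only. */
struct session {
    uint32_t id;
    uint32_t cb_seq_nr;   /* last backchannel sequence number used on this session */
};

/* Return the next backchannel sequence number for this particular session. */
static uint32_t next_cb_seq(struct session *s)
{
    assert(s != NULL);
    return ++s->cb_seq_nr;
}
```

With one counter per session, switching the backchannel to session 2 and later back to session 1 resumes session 1's stream where it left off, which a single per-client counter could not do.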
| * nfsd4: use client pointer to backchannel session (J. Bruce Fields, 2010-10-21, 1 file, -3/+1)
| |   Instead of copying the sessionid, use the new cl_cb_session pointer, which indicates which session we're using for the backchannel.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * nfsd4: move callback setup into session init code (J. Bruce Fields, 2010-10-21, 1 file, -15/+14)
| |   The backchannel should be associated with a session; it isn't really global to the client. We do, however, want a pointer global to the client which tracks which session we're currently using for client-based callbacks. This is a first step in that direction; for now, just reshuffling of code with no significant change in behavior.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * nfsd4: don't cache seq_misordered replies (J. Bruce Fields, 2010-10-21, 1 file, -2/+1)
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: return expired on unfound stateid's (J. Bruce Fields, 2010-10-02, 1 file, -2/+9)
| |   Commit 78155ed75f470710f2aecb3e75e3d97107ba8374 "nfsd4: distinguish expired from stale stateids" attempted to distinguish expired and stale stateid's using time information that may not have been completely reliable, so I reverted it.
| |   That was throwing out the baby with the bathwater; we still do want to return expired, but let's do that using the simpler approach of just assuming any stateid is expired if it looks like it was given out by the current server instance, but we can't find it any more. This may help clients that are recovering from network partitions.
| |   Reported-by: Bian Naimeng <biannm@cn.fujitsu.com>
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
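The rule described in the commit message above can be sketched roughly as follows. This assumes a stateid carries an identifier for the server instance that issued it; the enum, the function name, and the parameters are hypothetical and do not reproduce the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative outcomes; the real server returns NFSv4 status codes. */
enum stateid_status { STATEID_OK, STATEID_EXPIRED, STATEID_STALE };

/*
 * Classify a stateid lookup.
 *   stateid_boot: server-instance identifier embedded in the stateid (assumed)
 *   server_boot:  identifier of the currently running server instance
 *   found:        whether the stateid is still present in the state tables
 */
static enum stateid_status classify_stateid(uint32_t stateid_boot,
                                            uint32_t server_boot,
                                            bool found)
{
    if (stateid_boot != server_boot)
        return STATEID_STALE;      /* issued by an earlier server instance */
    if (!found)
        return STATEID_EXPIRED;    /* ours, but no longer known: assume expired */
    return STATEID_OK;
}
```

No timestamps are compared here, which is the point of the simpler approach: only identity with the current instance and presence in the tables matter.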
| * nfsd4: add new connections to session (J. Bruce Fields, 2010-10-01, 1 file, -2/+47)
| |   As long as we're not implementing any session security, we should just automatically add any new connections that come along to the list of connections associated with the session.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: refactor connection allocation (J. Bruce Fields, 2010-10-01, 1 file, -6/+26)
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: use callbacks on svc_xprt_deletion (J. Bruce Fields, 2010-10-01, 1 file, -9/+42)
| |   Remove connections from the list when they go down.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * nfsd4: keep per-session list of connections (J. Bruce Fields, 2010-10-01, 1 file, -15/+54)
| |   The spec requires us in various places to keep track of the connections associated with each session.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * nfsd4: clean up session allocation (J. Bruce Fields, 2010-10-01, 1 file, -122/+89)
| |   Changes:
| |     - make sure session memory reservation is released on failure path.
| |     - use min_t()/min() for more compact code in several places.
| |     - break alloc_init_session into smaller pieces.
| |     - miscellaneous other cleanup.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * nfsd4: fix alloc_init_session return type (J. Bruce Fields, 2010-10-01, 1 file, -3/+1)
| |   This returns an nfs error, not -ERRNO.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: fix alloc_init_session BUILD_BUG_ON() (J. Bruce Fields, 2010-10-01, 1 file, -1/+1)
| |   Note we're allocating an array of nfsd4_slot *'s, not nfsd4_slot's.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
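The distinction in the commit message above (sizing a check by the pointer type rather than the struct type) can be shown with a small sketch. It uses C11 `_Static_assert` as a userspace stand-in for the kernel's BUILD_BUG_ON(); `struct slot`, `MAX_SLOTS`, and `MAX_ALLOC` are hypothetical names and values, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a slot structure; contents are illustrative. */
struct slot {
    unsigned int seqid;
    char data[64];
};

#define MAX_SLOTS 160u    /* illustrative limit on slots per session */
#define MAX_ALLOC 4096u   /* illustrative cap on a single allocation */

/* Bytes needed for a table of nslots POINTERS to slots. */
static size_t slot_table_size(size_t nslots)
{
    return nslots * sizeof(struct slot *);
}

/*
 * Compile-time bound on the pointer table. Checking against
 * sizeof(struct slot) instead would test the wrong quantity,
 * because the table holds pointers, not whole structures.
 */
_Static_assert(MAX_SLOTS * sizeof(struct slot *) <= MAX_ALLOC,
               "slot pointer table exceeds allocation cap");
```

The same check written with `sizeof(struct slot)` would fail to compile with these illustrative numbers, even though the actual table fits comfortably.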
| * nfsd4: Move callback setup to callback queue (J. Bruce Fields, 2010-10-01, 1 file, -4/+3)
| |   Instead of creating the new rpc client from a regular server thread, set a flag, kick off a null call, and allow the null call to do the work of setting up the client on the callback workqueue. Use a spinlock to ensure the callback work gets a consistent view of the callback parameters.
| |   This allows, for example, changing the callback from contexts where sleeping is not allowed. I hope it will also keep the locking simple as we add more session and trunking features, by serializing most of the callback-specific work.
| |   This also closes a small race where the new cb_ident could be used with an old connection (or vice-versa).
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * nfsd4: use generic callback code in null case (J. Bruce Fields, 2010-10-01, 1 file, -0/+1)
| |   This will eventually allow us, for example, to kick off a null callback from contexts where we can't sleep.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * nfsd4: minor variable renaming (cb -> conn) (J. Bruce Fields, 2010-10-01, 1 file, -14/+14)
| |   Now that we have both nfsd4_callback and nfsd4_cb_conn structures, I get confused if variables of both types are always named cb....
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* | fs/locks.c: prepare for BKL removal (Arnd Bergmann, 2010-10-05, 1 file, -3/+3)
|/
|   This prepares the removal of the big kernel lock from the file locking code. We still use the BKL as long as fs/lockd uses it and ceph might sleep, but we can flip the definition to a private spinlock as soon as that's done. All users outside of fs/lockd get converted to use lock_flocks() instead of lock_kernel() where appropriate.
|   Based on an earlier patch to use a spinlock from Matthew Wilcox, who has attempted this a few times before; the earliest patch, from over 10 years ago, turned it into a semaphore, which ended up being slower than the BKL and was subsequently reverted.
|   Someone should do some serious performance testing when this becomes a spinlock, since this has caused problems before. Using a spinlock should be at least as good as the BKL in theory, but who knows...
|   Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|   Acked-by: Matthew Wilcox <willy@linux.intel.com>
|   Cc: Christoph Hellwig <hch@lst.de>
|   Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
|   Cc: "J. Bruce Fields" <bfields@fieldses.org>
|   Cc: Andrew Morton <akpm@linux-foundation.org>
|   Cc: Miklos Szeredi <mszeredi@suse.cz>
|   Cc: Frederic Weisbecker <fweisbec@gmail.com>
|   Cc: Ingo Molnar <mingo@redhat.com>
|   Cc: John Kacur <jkacur@redhat.com>
|   Cc: Sage Weil <sage@newdream.net>
|   Cc: linux-kernel@vger.kernel.org
|   Cc: linux-fsdevel@vger.kernel.org
* nfsd4: mask out non-access bits in nfs4_access_to_omode (J. Bruce Fields, 2010-09-02, 1 file, -1/+1)
|   This fixes an unnecessary BUG().
|   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* Merge commit 'v2.6.36-rc1' into HEAD (J. Bruce Fields, 2010-08-26, 1 file, -1/+1)
|\
| * Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux (Linus Torvalds, 2010-08-07, 1 file, -146/+235)
| |\
| | |   * 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: (34 commits)
| | |     nfsd4: fix file open accounting for RDWR opens
| | |     nfsd: don't allow setting maxblksize after svc created
| | |     nfsd: initialize nfsd versions before creating svc
| | |     net: sunrpc: removed duplicated #include
| | |     nfsd41: Fix a crash when a callback is retried
| | |     nfsd: fix startup/shutdown order bug
| | |     nfsd: minor nfsd read api cleanup
| | |     gcc-4.6: nfsd: fix initialized but not read warnings
| | |     nfsd4: share file descriptors between stateid's
| | |     nfsd4: fix openmode checking on IO using lock stateid
| | |     nfsd4: miscellaneous process_open2 cleanup
| | |     nfsd4: don't pretend to support write delegations
| | |     nfsd: bypass readahead cache when have struct file
| | |     nfsd: minor nfsd_svc() cleanup
| | |     nfsd: move more into nfsd_startup()
| | |     nfsd: just keep single lockd reference for nfsd
| | |     nfsd: clean up nfsd_create_serv error handling
| | |     nfsd: fix error handling in __write_ports_addxprt
| | |     nfsd: fix error handling when starting nfsd with rpcbind down
| | |     nfsd4: fix v4 state shutdown error paths
| | |     ...
| * | nfsd4: shut down callback queue outside state lock (J. Bruce Fields, 2010-06-08, 1 file, -1/+1)
| | |   This reportedly causes a lockdep warning on nfsd shutdown. That looks like a false positive to me, but there's no reason why this needs the state lock anyway.
| | |   Reported-by: Jeff Layton <jlayton@redhat.com>
| | |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* | | nfsd4: fix downgrade/lock logic (J. Bruce Fields, 2010-08-26, 1 file, -10/+15)
| | |   If we already had a RW open for a file, and get a readonly open, we were piggybacking on the existing RW open. That's inconsistent with the downgrade logic, which blows away the RW open assuming you'll still have a readonly open.
| | |   Also, make sure there is a readonly or writeonly open available for locking, again to prevent bad behavior in downgrade cases when any RW open may be lost.
| | |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | | nfsd4: bad BUG() in preprocess_stateid_op (J. Bruce Fields, 2010-08-26, 1 file, -1/+0)
| |/
|/|
| |   It's OK for this function to return without setting filp--we do it in the special-stateid case. And there's a legitimate case where we can hit this, since we do permit reads on write-only stateid's.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | nfsd4: fix file open accounting for RDWR opens (J. Bruce Fields, 2010-08-07, 1 file, -3/+21)
| |   Commit f9d7562fdb9dc0ada3a7aba5dbbe9d965e2a105d "nfsd4: share file descriptors between stateid's" didn't correctly account for O_RDWR opens. Symptoms include leaked files, resulting in failures to unmount and/or warnings about orphaned inodes on reboot.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | gcc-4.6: nfsd: fix initialized but not read warnings (Andi Kleen, 2010-07-29, 1 file, -2/+0)
| |   Fixes at least one real minor bug: the nfs4 recovery dir sysctl would not return its status properly.
| |   Also I finished Al's 1e41568d7378d ("Take ima_path_check() in nfsd past dentry_open() in nfsd_open()") commit: it moved the IMA code, but left the old path initializer in there.
| |   The rest is just dead code removed, I think, although I was not fully sure about the "is_borc" stuff. Some more review would still be good.
| |   Found by gcc 4.6's new warnings.
| |   Signed-off-by: Andi Kleen <ak@linux.intel.com>
| |   Cc: Al Viro <viro@zeniv.linux.org.uk>
| |   Cc: Neil Brown <neilb@suse.de>
| |   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | nfsd4: share file descriptors between stateid's (J. Bruce Fields, 2010-07-29, 1 file, -121/+182)
| |   The vfs doesn't really allow us to "upgrade" a file descriptor from read-only to read-write, and our attempt to do so in nfs4_upgrade_open is ugly and incomplete.
| |   Move to a different scheme where we keep multiple opens, shared between open stateid's, in the nfs4_file struct. Each file will be opened at most 3 times (for read, write, and read-write), and those opens will be shared between all clients and openers. On upgrade we will do another open if necessary instead of attempting to upgrade an existing open. We keep count of the number of readers and writers so we know when to close the shared files.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
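The scheme described in the commit message above (at most three shared opens per file, plus reader and writer counts) might be modeled along these lines. This is only a sketch under stated assumptions: the opens are represented by boolean flags rather than real file descriptors, the close policy is a simplification, and every name (`struct shared_file`, `get_access`, `put_access`, the `ACC_*` and `FD_*` constants) is hypothetical rather than taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Access requested by an opener (bit 0: read, bit 1: write). */
enum { ACC_READ = 1, ACC_WRITE = 2, ACC_BOTH = 3 };

/* Index of each of the (at most) three shared opens. */
enum { FD_RDONLY = 0, FD_WRONLY = 1, FD_RDWR = 2 };

struct shared_file {
    bool open[3];   /* which shared opens currently exist (stand-in for fds) */
    int readers;    /* number of openers holding read access */
    int writers;    /* number of openers holding write access */
};

/* Record a new opener; create the matching shared open if it is missing. */
static void get_access(struct shared_file *f, int access)
{
    assert(access >= ACC_READ && access <= ACC_BOTH);
    if (access & ACC_READ)
        f->readers++;
    if (access & ACC_WRITE)
        f->writers++;
    /* ACC_READ -> FD_RDONLY, ACC_WRITE -> FD_WRONLY, ACC_BOTH -> FD_RDWR */
    f->open[access - 1] = true;
}

/* Drop an opener; close shared opens that no longer have any user. */
static void put_access(struct shared_file *f, int access)
{
    assert(access >= ACC_READ && access <= ACC_BOTH);
    if (access & ACC_READ)
        f->readers--;
    if (access & ACC_WRITE)
        f->writers--;
    if (f->readers == 0)
        f->open[FD_RDONLY] = false;
    if (f->writers == 0)
        f->open[FD_WRONLY] = false;
    if (f->readers == 0 && f->writers == 0)
        f->open[FD_RDWR] = false;   /* simplified policy for the RDWR open */
}
```

An upgrade from read to read-write is then just a second `get_access` call with `ACC_BOTH`; nothing about the existing read-only open has to change.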
* | nfsd4: fix openmode checking on IO using lock stateid (J. Bruce Fields, 2010-07-29, 1 file, -1/+3)
| |   It is legal to perform a write using the lock stateid that was originally associated with a read lock, or with a file that was originally opened for read, but has since been upgraded.
| |   So, when checking the openmode, check the mode associated with the open stateid from which the lock was derived.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | nfsd4: miscellaneous process_open2 cleanup (J. Bruce Fields, 2010-07-29, 1 file, -9/+17)
| |   Move more work into helper functions.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | nfsd4: don't pretend to support write delegations (J. Bruce Fields, 2010-07-29, 1 file, -0/+7)
| |   The delegation code mostly pretends to support either read or write delegations. However, correct support for write delegations would require, for example, breaking of delegations (and/or implementation of cb_getattr) on stat.
| |   Currently all that stops us from handing out write delegations is a subtle reference-counting issue.
| |   Avoid confusion by adding an earlier check that explicitly refuses write delegations.
| |   For now, though, I'm not going so far as to rip out existing half-support for write delegations, in case we get around to using that soon.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | nfsd4: fix v4 state shutdown error paths (Jeff Layton, 2010-07-23, 1 file, -11/+1)
| |   If someone tries to shut down the laundry_wq while it isn't up it'll cause an oops.
| |   This can happen because write_ports can create a nfsd_svc before we really start the nfs server, and we may fail before the server is ever started.
| |   Also make sure state is shut down on error paths in nfsd_svc().
| |   Use a common global nfsd_up flag instead of nfs4_init, and create common helper functions for nfsd start/shutdown, as there will be other work that we want done only when the number of nfsd threads transitions between zero and nonzero.
| |   Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | nfsd4: remove some debugging code (J. Bruce Fields, 2010-06-22, 1 file, -2/+0)
| |   This is overkill.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* | nfsd4: translate memory errors to delay, not serverfault (J. Bruce Fields, 2010-06-22, 1 file, -3/+3)
| |   If the server is out of memory, it is better for clients to back off and retry than to just error out.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
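The policy in the commit message above can be sketched as a small mapping function. The numeric values are the NFSv4 status codes for NFS4ERR_SERVERFAULT (10006) and NFS4ERR_DELAY (10008); the function name `errno_to_nfs4_status` is hypothetical, and the kernel's real translation covers many more errno values and uses network byte order.

```c
#include <assert.h>
#include <errno.h>

#define NFS4_OK              0
#define NFS4ERR_SERVERFAULT  10006
#define NFS4ERR_DELAY        10008

/*
 * Translate a kernel-style negative errno into an NFSv4 status.
 * Only the cases relevant to the commit are shown; anything else
 * falls through to a generic server fault in this sketch.
 */
static int errno_to_nfs4_status(int err)
{
    switch (err) {
    case 0:
        return NFS4_OK;
    case -ENOMEM:
        return NFS4ERR_DELAY;        /* transient: ask the client to retry later */
    default:
        return NFS4ERR_SERVERFAULT;  /* simplification for illustration */
    }
}
```

Returning a delay status for memory pressure lets a well-behaved client back off and retry instead of failing the operation outright.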
* | nfsd4; fix session reference count leak (J. Bruce Fields, 2010-06-22, 1 file, -1/+0)
| |   Note the session has to be put() here regardless of what happens to the client.
| |   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* | nfsd4: fix use of op_share_access (J. Bruce Fields, 2010-05-31, 1 file, -4/+12)
|/
|   NFSv4.1 adds additional flags to the share_access argument of the open call. These flags need to be masked out in some of the existing code, but current code does that inconsistently.
|   Tested-by: Michael Groshans <groshans@citi.umich.edu>
|   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
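As a rough illustration of the masking described above: the low bits of share_access carry the read/write access value, while NFSv4.1 places its additional flags in higher bits. The sketch below assumes an access mask of 0x000F and treats 0x0100 as an example of an NFSv4.1 flag bit; the macro and function names are hypothetical, not the kernel's.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>

#define SHARE_ACCESS_READ   0x0001u
#define SHARE_ACCESS_WRITE  0x0002u
#define SHARE_ACCESS_BOTH   0x0003u
#define SHARE_ACCESS_MASK   0x000Fu   /* assumed: bits holding the access value */

/* Strip everything except the access bits. */
static uint32_t share_access_bits(uint32_t share_access)
{
    return share_access & SHARE_ACCESS_MASK;
}

/*
 * Convert a share_access argument to an open(2)-style mode.
 * Returns -1 when no valid access value remains after masking.
 */
static int share_access_to_omode(uint32_t share_access)
{
    switch (share_access_bits(share_access)) {
    case SHARE_ACCESS_READ:
        return O_RDONLY;
    case SHARE_ACCESS_WRITE:
        return O_WRONLY;
    case SHARE_ACCESS_BOTH:
        return O_RDWR;
    default:
        return -1;
    }
}
```

Masking in one helper, and calling that helper everywhere, avoids the inconsistency the commit message mentions, where some code paths saw the extra flags and others did not.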
* Revert "nfsd4: distinguish expired from stale stateids" (J. Bruce Fields, 2010-05-18, 1 file, -45/+11)
|   This reverts commit 78155ed75f470710f2aecb3e75e3d97107ba8374.
|   We're depending here on the boot time that we use to generate the stateid being monotonic, but get_seconds() is not necessarily.
|   We still depend at least on boot_time being different every time, but that is a safer bet.
|   We have a few reports of errors that might be explained by this problem, though we haven't been able to confirm any of them.
|   But the minor gain of distinguishing expired from stale errors seems not worth the risk.
|   Conflicts:
|     fs/nfsd/nfs4state.c
|   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: safer initialization order in find_file() (Pavel Emelyanov, 2010-05-18, 1 file, -3/+3)
|   alloc_init_file() first adds a file to the hash and then initializes its fi_inode, fi_id and fi_had_conflict. The uninitialized fi_inode could thus be erroneously checked by find_file(), so move the hash insertion lower.
|   The client_mutex should prevent this race in practice; however, we eventually hope to make less use of the client_mutex, so the ordering here is an accident waiting to happen.
|   I didn't find whether the same can be true for the two other fields, but common sense tells me it's better to initialize an object before putting it into a global hash table :)
|   Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
|   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
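The ordering principle in the commit message above (fill in every field first, publish to the shared table last) can be sketched as follows. This is a single-threaded userspace illustration; `struct file_entry`, `file_hash`, `alloc_init_entry`, and `find_entry` are hypothetical names, and real concurrent code would additionally need locking or memory barriers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define HASH_SIZE 16

/* Hypothetical per-file record kept in a global hash table. */
struct file_entry {
    unsigned long inode_no;
    int had_conflict;
    struct file_entry *next;
};

static struct file_entry *file_hash[HASH_SIZE];

/* Allocate an entry, initialize all fields, and only then hash it. */
static struct file_entry *alloc_init_entry(unsigned long inode_no)
{
    struct file_entry *fe = malloc(sizeof(*fe));
    if (fe == NULL)
        return NULL;

    /* Step 1: fully initialize the object while it is still private. */
    fe->inode_no = inode_no;
    fe->had_conflict = 0;

    /* Step 2: publish it; lookups can now only see initialized fields. */
    size_t bucket = inode_no % HASH_SIZE;
    fe->next = file_hash[bucket];
    file_hash[bucket] = fe;
    return fe;
}

/* Look up an entry by inode number; returns NULL if it is not hashed. */
static struct file_entry *find_entry(unsigned long inode_no)
{
    struct file_entry *fe;

    for (fe = file_hash[inode_no % HASH_SIZE]; fe != NULL; fe = fe->next)
        if (fe->inode_no == inode_no)
            return fe;
    return NULL;
}
```

If the insertion came before the field assignments, a lookup running between the two steps could compare against an uninitialized `inode_no`, which is the hazard the commit removes.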
* nfsd4: implement reclaim_complete (J. Bruce Fields, 2010-05-13, 1 file, -3/+30)
|   This is a mandatory operation. Also, here (not in open) is where we should be committing the reboot recovery information.
|   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd4: nfsd4_destroy_session must set callback client under the state lock (Benny Halevy, 2010-05-13, 1 file, -0/+2)
|   nfsd4_set_callback_client must be called under the state lock to atomically set or unset the callback client and shut down the previous one.
|   Signed-off-by: Benny Halevy <bhalevy@panasas.com>
|   Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>