op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	nfsd4: remove redundant check from nfsd4_open	J. Bruce Fields	2009-03-18	1	-4/+0
\| \| \| \| \| \| \|	Note that we already checked for this invalid case at the top of this function. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: don't do lookup within readdir in recovery code	J. Bruce Fields	2009-03-18	1	-40/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main nfsd code was recently modified to no longer do lookups from withing the readdir callback, to avoid locking problems on certain filesystems. This (rather hacky, and overdue for replacement) NFSv4 recovery code has the same problem. Fix it to build up a list of names (instead of dentries) and do the lookups afterwards. Reported symptoms were a deadlock in the xfs code (called from nfsd4_recdir_load), with /var/lib/nfs on xfs. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reported-by: David Warren <warren@atmos.washington.edu>
*	nfsd4: support putpubfh operation	J. Bruce Fields	2009-03-18	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently putpubfh returns NFSERR_OPNOTSUPP, which isn't actually allowed for v4. The right error is probably NFSERR_NOTSUPP. But let's just implement it; though rarely seen, it can be used by Solaris (with a special mount option), is mandated by the rfc, and is trivial for us to support. Thanks to Yang Hongyang for pointing out the original problem, and to Mike Eisler, Tom Talpey, Trond Myklebust, and Dave Noveck for further argument.... Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	Short write in nfsd becomes a full write to the client	David Shaw	2009-03-18	4	-11/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a filesystem being written to via NFS returns a short write count (as opposed to an error) to nfsd, nfsd treats that as a success for the entire write, rather than the short count that actually succeeded. For example, given a 8192 byte write, if the underlying filesystem only writes 4096 bytes, nfsd will ack back to the nfs client that all 8192 bytes were written. The nfs client does have retry logic for short writes, but this is never called as the client is told the complete write succeeded. There are probably other ways it could happen, but in my case it happened with a fuse (filesystem in userspace) filesystem which can rather easily have a partial write. Here is a patch to properly return the short write count to the client. Signed-off-by: David Shaw <dshaw@jabberwocky.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	NFSD: return nfsv4 error code nfserr_notsupp rather than nfsv[23]'s ↵	Benny Halevy	2009-03-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	nfserr_opnotsupp Thanks for Bill Baker at sun.com for catching this at Connectathon 2009. This bug was introduced in 2.6.27 Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: move rpc_client setup to a separate function	J. Bruce Fields	2009-03-18	1	-8/+21
\| \| \| \|	Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: fix do_probe_callback errors	J. Bruce Fields	2009-03-18	1	-8/+7
\| \| \| \| \| \| \| \| \|	The errors returned aren't used. Just return 0 and make them available to a dprintk(). Also, consistently use -ERRNO errors instead of nfs errors. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reviewed-by: Benny Halevy <bhalevy@panasas.com>
*	nfsd4: remove use of mutex for file_hashtable	J. Bruce Fields	2009-03-18	2	-26/+23
\| \| \| \| \| \| \| \| \| \| \| \| \|	As part of reducing the scope of the client_mutex, and in order to remove the need for mutexes from the callback code (so that callbacks can be done as asynchronous rpc calls), move manipulations of the file_hashtable under the recall_lock. Update the relevant comments while we're here. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Alexandros Batsakis <batsakis@netapp.com> Reviewed-by: Benny Halevy <bhalevy@panasas.com>
*	nfsd4: put_nfs4_client does not require state lock	J. Bruce Fields	2009-03-18	1	-1/+1
\| \| \| \| \| \| \| \| \|	Since free_client() is guaranteed to only be called once, and to only touch the client structure itself (not any common data structures), it has no need for the state lock. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Alexandros Batsakis <batsakis@netapp.com>
*	nfsd4: rename io_during_grace_disallowed	J. Bruce Fields	2009-03-18	1	-2/+2
\| \| \| \| \| \| \|	Use a slightly clearer, more concise name. Also removed unused argument. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: remove unused CHECK_FH flag	J. Bruce Fields	2009-03-18	2	-4/+4
\| \| \| \| \| \|	All users now pass this, so it's meaningless. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: fail when delegreturn gets a non-delegation stateid	J. Bruce Fields	2009-03-18	1	-2/+1
\| \| \| \| \| \| \| \|	Previous cleanup reveals an obvious (though harmless) bug: when delegreturn gets a stateid that isn't for a delegation, it should return an error rather than doing nothing. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: separate delegreturn case from preprocess_stateid_op	J. Bruce Fields	2009-03-18	1	-12/+28
\| \| \| \| \| \| \| \| \| \| \| \|	Delegreturn is enough a special case for preprocess_stateid_op to warrant just open-coding it in delegreturn. There should be no change in behavior here; we're just reshuffling code. Thanks to Yang Hongyang for catching a critical typo. Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: add a helper function to decide if stateid is delegation	J. Bruce Fields	2009-03-18	1	-1/+6
\| \| \| \| \| \|	Make this check self-documenting. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: remove some dprintk's	J. Bruce Fields	2009-03-18	1	-9/+2
\| \| \| \| \| \| \| \| \|	I can't recall ever seeing these printk's used to debug a problem. I'll happily put them back if we see a case where they'd be useful. (Though if we do that the find_XXX() errors would probably be better reported in find_XXX() functions themselves.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: remove unneeded local variable	J. Bruce Fields	2009-03-18	1	-5/+2
\| \| \| \| \| \|	We no longer need stidp. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: remove redundant "if" in nfs4_preprocess_stateid_op	J. Bruce Fields	2009-03-18	1	-11/+8
\| \| \| \| \| \| \| \| \|	Note that we exit this first big "if" with stp == NULL if and only if we took the first branch; therefore, the second "if" is redundant, and we can just combine the two, simplifying the logic. Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: move check_stateid_generation check	J. Bruce Fields	2009-03-18	1	-3/+6
\| \| \| \| \| \|	No change in behavior. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: trivial preprocess_stateid_op cleanup	J. Bruce Fields	2009-03-18	1	-6/+8
\| \| \| \| \| \|	Remove a couple redundant comments, adjust style; no change in behavior. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd(v2/v3): fix the failure of creation from HPUX client	wengang wang	2009-03-18	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sometimes HPUX nfs client sends a create request to linux nfs server(v2/v3). the dump of the request is like: obj_attributes mode: value follows set_it: value follows (1) mode: 00 uid: no value set_it: no value (0) gid: value follows set_it: value follows (1) gid: 8030 size: value follows set_it: value follows (1) size: 0 atime: don't change set_it: don't change (0) mtime: don't change set_it: don't change (0) note that mode is 00(havs no rwx privilege even for the owner) and it requires to set size to 0. as current nfsd(v2/v3) implementation, the server does mainly 2 steps: 1) creates the file in mode specified by calling vfs_create(). 2) sets attributes for the file by calling nfsd_setattr(). at step 2), it finally calls file system specific setattr() function which may fail when checking permission because changing size needs WRITE privilege but it has none since mode is 000. for this case, a new file created, we may simply ignore the request of setting size to 0, so that WRITE privilege is not needed and the open succeeds. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> -- vfs.c \| 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	lockd: clean up blocking lock cases of nlsmvc_lock()	Miklos Szeredi	2009-03-18	1	-5/+8
\| \| \| \| \| \| \| \|	No change in behavior, just rearranging the switch so that we break out of the switch if and only if we're in the wait case. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd: lock state around put client and delegation in nfsd4_cb_recall	Alexandros Batsakis	2009-03-18	1	-1/+2
\| \| \| \| \| \| \| \| \|	not having the state locked before putting the client/delegation causes a bug. Also removed the comment from the function header about the state being already locked Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: use helper for copying delegation filehandle	J. Bruce Fields	2009-03-18	2	-5/+3
\| \| \| \|	Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: use helper for copying filehandles for replay	J. Bruce Fields	2009-03-18	1	-11/+6
\| \| \| \|	Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: fix misplaced comment	J. Bruce Fields	2009-03-18	1	-1/+1
\| \| \| \|	Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd: clarify exclusive create bitmask result.	J. Bruce Fields	2009-03-18	1	-3/+5
\| \| \| \| \| \| \|	The use of \|= is confusing--the bitmask is always initialized to zero in this case, so we're effectively just doing an assignment here. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd : Define NFSD only when FILE_LOCKING is enabled	Manish Katiyar	2009-03-18	1	-0/+1
\| \| \| \| \| \| \| \|	Enable NFSD only when FILE_LOCKING is enabled, since we don't want to support NFSD without FILE_LOCKING. Signed-off-by: Manish Katiyar <mkatiyar@gmail.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	NFSD: cleanup for nfs3proc.c	Qinghuang Feng	2009-03-18	1	-2/+3
\| \| \| \| \| \| \| \|	MSDOS_SUPER_MAGIC is defined in <linux/magic.h>, so use MSDOS_SUPER_MAGIC directly. Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: split open/lockowner release code	J. Bruce Fields	2009-03-18	1	-23/+34
\| \| \| \| \| \| \| \|	The caller always knows specifically whether it's releasing a lockowner or an openowner, and the code is simpler if we use separate functions (and the apparent recursion is gone). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: remove a forward declaration	J. Bruce Fields	2009-03-18	1	-43/+42
\| \| \| \|	Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
*	nfsd4: split lockstateid/openstateid release logic	J. Bruce Fields	2009-03-18	1	-23/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The flags here attempt to make the code more general, but I find it actually just adds confusion. I think it's clearer to separate the logic for the open and lock cases entirely. And eventually we may want to separate the stateowner and stateid types as well, as many of the fields aren't shared between the lock and open cases. Also move to eliminate forward references. Start with the stateid's. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reviewed-by: Benny Halevy <bhalevy@panasas.com>
*	Merge branch 'for-2.6.29' of git://linux-nfs.org/~bfields/linux	Linus Torvalds	2009-03-18	1	-0/+1
\|\ \| \| \| \| \| \| \| \| \| \|	* 'for-2.6.29' of git://linux-nfs.org/~bfields/linux: nfsd: nfsd should drop CAP_MKNOD for non-root NFSD: provide encode routine for OP_OPENATTR
\| *	NFSD: provide encode routine for OP_OPENATTR	Benny Halevy	2009-03-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Although this operation is unsupported by our implementation we still need to provide an encode routine for it to merely encode its (error) status back in the compound reply. Thanks for Bill Baker at sun.com for testing with the Sun OpenSolaris' client, finding, and reporting this bug at Connectathon 2009. This bug was introduced in 2.6.27 Signed-off-by: Benny Halevy <bhalevy@panasas.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* \|	Merge branch 'for_linus' of ↵	Linus Torvalds	2009-03-17	3	-7/+16
\|\ \ \| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix bb_prealloc_list corruption due to wrong group locking ext4: fix bogus BUG_ONs in in mballoc code ext4: Print the find_group_flex() warning only once ext4: fix header check in ext4_ext_search_right() for deep extent trees.
\| *	ext4: fix bb_prealloc_list corruption due to wrong group locking	Eric Sandeen	2009-03-16	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is for Red Hat bug 490026: EXT4 panic, list corruption in ext4_mb_new_inode_pa ext4_lock_group(sb, group) is supposed to protect this list for each group, and a common code flow to remove an album is like this: ext4_get_group_no_and_offset(sb, pa->pa_pstart, &grp, NULL); ext4_lock_group(sb, grp); list_del(&pa->pa_group_list); ext4_unlock_group(sb, grp); so it's critical that we get the right group number back for this prealloc context, to lock the right group (the one associated with this pa) and prevent concurrent list manipulation. however, ext4_mb_put_pa() passes in (pa->pa_pstart - 1) with a comment, "-1 is to protect from crossing allocation group". This makes sense for the group_pa, where pa_pstart is advanced by the length which has been used (in ext4_mb_release_context()), and when the entire length has been used, pa_pstart has been advanced to the first block of the next group. However, for inode_pa, pa_pstart is never advanced; it's just set once to the first block in the group and not moved after that. So in this case, if we subtract one in ext4_mb_put_pa(), we are actually locking the previous group, and opening the race with the other threads which do not subtract off the extra block. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| *	ext4: fix bogus BUG_ONs in in mballoc code	Eric Sandeen	2009-03-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Thiemo Nagel reported that: # dd if=/dev/zero of=image.ext4 bs=1M count=2 # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \ -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4 # mount -o loop image.ext4 mnt/ # dd if=/dev/zero of=mnt/file oopsed, with a BUG_ON in ext4_mb_normalize_request because size == EXT4_BLOCKS_PER_GROUP It appears to me (esp. after talking to Andreas) that the BUG_ON is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should be allowed, though larger sizes do indicate a problem. Fix that an another (apparently rare) codepath with a similar check. Reported-by: Thiemo Nagel <thiemo.nagel@ph.tum.de> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| *	ext4: Print the find_group_flex() warning only once	Theodore Ts'o	2009-03-12	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a short-term warning, and even printk_ratelimit() can result in too much noise in system logs. So only print it once as a warning. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| *	ext4: fix header check in ext4_ext_search_right() for deep extent trees.	Eric Sandeen	2009-03-10	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ext4_ext_search_right() function is confusing; it uses a "depth" variable which is 0 at the root and maximum at the leaves, but the on-disk metadata uses a "depth" (actually eh_depth) which is opposite: maximum at the root, and 0 at the leaves. The ext4_ext_check_header() function is given a depth and checks the header agaisnt that depth; it expects the on-disk semantics, but we are giving it the opposite in the while loop in this function. We should be giving it the on-disk notion of "depth" which we can get from (p_depth - depth) - and if you look, the last (more commonly hit) call to ext4_ext_check_header() does just this. Sending in the wrong depth results in (incorrect) messages about corruption: EXT4-fs error (device sdb1): ext4_ext_search_right: bad header in inode #2621457: unexpected eh_depth - magic f30a, entries 340, max 340(0), depth 1(2) http://bugzilla.kernel.org/show_bug.cgi?id=12821 Reported-by: David Dindorp <ddi@dubex.dk> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* \|	Avoid 64-bit "switch()" statements on 32-bit architectures	Linus Torvalds	2009-03-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit ee6f779b9e0851e2f7da292a9f58e0095edf615a ("filp->f_pos not correctly updated in proc_task_readdir") changed the proc code to use filp->f_pos directly, rather than through a temporary variable. In the process, that caused the operations to be done on the full 64 bits, even though the offset is never that big. That's all fine and dandy per se, but for some unfathomable reason gcc generates absolutely horrid code when using 64-bit values in switch() statements. To the point of actually calling out to gcc helper functions like __cmpdi2 rather than just doing the trivial comparisons directly the way gcc does for normal compares. At which point we get link failures, because we really don't want to support that kind of crazy code. Fix this by just casting the f_pos value to "unsigned long", which is plenty big enough for /proc, and avoids the gcc code generation issue. Reported-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Zhang Le <r0bertz@gentoo.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	filp->f_pos not correctly updated in proc_task_readdir	Zhang Le	2009-03-16	1	-9/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	filp->f_pos only get updated at the end of the function. Thus d_off of those dirents who are in the middle will be 0, and this will cause a problem in glibc's readdir implementation, specifically endless loop. Because when overflow occurs, f_pos will be set to next dirent to read, however it will be 0, unless the next one is the last one. So it will start over again and again. There is a sample program in man 2 gendents. This is the output of the program running on a multithread program's task dir before this patch is applied: $ ./a.out /proc/3807/task --------------- nread=128 --------------- i-node# file type d_reclen d_off d_name 506442 directory 16 1 . 506441 directory 16 0 .. 506443 directory 16 0 3807 506444 directory 16 0 3809 506445 directory 16 0 3812 506446 directory 16 0 3861 506447 directory 16 0 3862 506448 directory 16 8 3863 This is the output after this patch is applied $ ./a.out /proc/3807/task --------------- nread=128 --------------- i-node# file type d_reclen d_off d_name 506442 directory 16 1 . 506441 directory 16 2 .. 506443 directory 16 3 3807 506444 directory 16 4 3809 506445 directory 16 5 3812 506446 directory 16 6 3861 506447 directory 16 7 3862 506448 directory 16 8 3863 Signed-off-by: Zhang Le <r0bertz@gentoo.org> Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds	2009-03-14	2	-4/+7
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'for-linus' of git://git.kernel.dk/linux-2.6-block: Fix Xilinx SystemACE driver to handle empty CF slot block: fix memory leak in bio_clone() block: Add gfp_mask parameter to bio_integrity_clone()
\| * \|	block: fix memory leak in bio_clone()	Li Zefan	2009-03-14	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If bio_integrity_clone() fails, bio_clone() returns NULL without freeing the newly allocated bio. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	block: Add gfp_mask parameter to bio_integrity_clone()	un'ichi Nomura	2009-03-14	2	-3/+4
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Stricter gfp_mask might be required for clone allocation. For example, request-based dm may clone bio in interrupt context so it has to use GFP_ATOMIC. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* \|	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6	Linus Torvalds	2009-03-14	6	-37/+171
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Fix the fix to Bugzilla #11061, when IPv6 isn't defined... SUNRPC: xprt_connect() don't abort the task if the transport isn't bound SUNRPC: Fix an Oops due to socket not set up yet... Bug 11061, NFS mounts dropped NFS: Handle -ESTALE error in access() NLM: Fix GRANT callback address comparison when IPv6 is enabled NLM: Shrink the IPv4-only version of nlm_cmp_addr() NFSv3: Fix posix ACL code NFS: Fix misparsing of nfsv4 fs_locations attribute (take 2) SUNRPC: Tighten up the task locking rules in __rpc_execute()
\| * \|	NFS: Fix the fix to Bugzilla #11061, when IPv6 isn't defined...	Trond Myklebust	2009-03-12	1	-29/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Stephen Rothwell reports: Today's linux-next build (powerpc ppc64_defconfig) failed like this: fs/built-in.o: In function `.nfs_get_client': client.c:(.text+0x115010): undefined reference to `.__ipv6_addr_type' Fix by moving the IPV6 specific parts of commit d7371c41b0cda782256b1df759df4e8d4724584c ("Bug 11061, NFS mounts dropped") into the '#ifdef IPV6..." section. Also fix up a couple of formatting issues. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
\| * \|	Bug 11061, NFS mounts dropped	Ian Dall	2009-03-10	1	-1/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=11061 sockaddr structures can't be reliably compared using memcmp() because there are padding bytes in the structure which can't be guaranteed to be the same even when the sockaddr structures refer to the same socket. Instead compare all the relevant fields. In the case of IPv6 sin6_flowinfo is not compared because it only affects QoS and sin6_scope_id is only compared if the address is "link local" because "link local" addresses need only be unique to a specific link. Signed-off-by: Ian Dall <ian@beware.dropbear.id.au> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
\| * \|	NFS: Handle -ESTALE error in access()	Suresh Jayaraman	2009-03-10	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Hi Trond, I have been looking at a bugreport where trying to open applications on KDE on a NFS mounted home fails temporarily. There have been multiple reports on different kernel versions pointing to this common issue: http://bugzilla.kernel.org/show_bug.cgi?id=12557 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/269954 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=508866.html This issue can be reproducible consistently by doing this on a NFS mounted home (KDE): 1. Open 2 xterm sessions 2. From one of the xterm session, do "ssh -X <remote host>" 3. "stat ~/.Xauthority" on the remote SSH session 4. Close the two xterm sessions 5. On the server do a "stat ~/.Xauthority" 6. Now on the client, try to open xterm This will fail. Even if the filehandle had become stale, the NFS client should invalidate the cache/inode and should repeat LOOKUP. Looking at the packet capture when the failure occurs shows that there were two subsequent ACCESS() calls with the same filehandle and both fails with -ESTALE error. I have tested the fix below. Now the client issue a LOOKUP after the ACCESS() call fails with -ESTALE. If all this makes sense to you, can you consider this for inclusion? Thanks, If the server returns an -ESTALE error due to stale filehandle in response to an ACCESS() call, we need to invalidate the cache and inode so that LOOKUP() can be retried. Without this change, the nfs client retries ACCESS() with the same filehandle, fails again and could lead to temporary failure of applications running on nfs mounted home. Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
\| * \|	NLM: Fix GRANT callback address comparison when IPv6 is enabled	Chuck Lever	2009-03-10	1	-1/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The NFS mount command may pass an AF_INET server address to lockd. If lockd happens to be using a PF_INET6 listener, the nlm_cmp_addr() in nlmclnt_grant() will fail to match requests from that host because they will all have a mapped IPv4 AF_INET6 address. Adopt the same solution used in nfs_sockaddr_match_ipaddr() for NFSv4 callbacks: if either address is AF_INET, map it to an AF_INET6 address before doing the comparison. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
\| * \|	NFSv3: Fix posix ACL code	Trond Myklebust	2009-03-10	2	-27/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a memory leak due to allocation in the XDR layer. In cases where the RPC call needs to be retransmitted, we end up allocating new pages without clearing the old ones. Fix this by moving the allocation into nfs3_proc_setacls(). Also fix an issue discovered by Kevin Rudd, whereby the amount of memory reserved for the acls in the xdr_buf->head was miscalculated, and causing corruption. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
\| * \|	NFS: Fix misparsing of nfsv4 fs_locations attribute (take 2)	Trond Myklebust	2009-03-10	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The changeset ea31a4437c59219bf3ea946d58984b01a45a289c (nfs: Fix misparsing of nfsv4 fs_locations attribute) causes the mountpath that is calculated at the beginning of try_location() to be clobbered when we later strncpy a non-nul terminated hostname using an incorrect buffer length. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>