op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	fix oops in fs/9p late mount failure	Al Viro	2010-01-26	1	-1/+2
\| \| \| \| \| \| \|	if 9P ->get_sb() fails late (at root inode or root dentry allocation), we'll hit its ->kill_sb() with NULL ->s_root Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	fix leak in romfs_fill_super()	Al Viro	2010-01-26	1	-0/+1
\| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	get rid of pointless checks after simple_pin_fs()	Al Viro	2010-01-26	1	-9/+2
\| \| \| \| \| \|	if we'd just got success from it, vfsmount won't be NULL Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	Fix failure exits in bfs_fill_super()	Al Viro	2010-01-26	1	-22/+21
\| \| \| \| \| \|	double iput(), leaks... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	fix affs parse_options()	Al Viro	2010-01-26	1	-6/+6
\| \| \| \| \| \| \| \| \| \|	Error handling in that sucker got broken back in 2003. If function returns 0 on failure, it's not nice to add return -EINVAL into it. Adding return 1 on other failure exits is also not a good thing (and yes, original success exits with 1 and some of failure exits with 0 are still there; so's the original logics in callers). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	Fix remount races with symlink handling in affs	Al Viro	2010-01-26	4	-8/+25
\| \| \| \| \| \| \| \|	A couple of fields in affs_sb_info is used in follow_link() and symlink() for handling AFFS "absolute" symlinks. Need locking against affs_remount() updates. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	Fix a leak in affs_fill_super()	Al Viro	2010-01-26	1	-0/+2
\| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	fnctl: f_modown should call write_lock_irqsave/restore	Greg Kroah-Hartman	2010-01-26	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 703625118069f9f8960d356676662d3db5a9d116 exposed that f_modown() should call write_lock_irqsave instead of just write_lock_irq so that because a caller could have a spinlock held and it would not be good to renable interrupts. Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Tavis Ormandy <taviso@google.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge branch 'for_linus' of ↵	Linus Torvalds	2010-01-25	3	-35/+77
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Drop EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE flag ext4: Fix quota accounting error with fallocate ext4: Handle -EDQUOT error on write
\| *	ext4: Drop EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE flag	Aneesh Kumar K.V	2010-01-15	3	-11/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We should update reserve space if it is delalloc buffer and that is indicated by EXT4_GET_BLOCKS_DELALLOC_RESERVE flag. So use EXT4_GET_BLOCKS_DELALLOC_RESERVE in place of EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
\| *	ext4: Fix quota accounting error with fallocate	Aneesh Kumar K.V	2010-01-25	3	-13/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we fallocate a region of the file which we had recently written, and which is still in the page cache marked as delayed allocated blocks we need to make sure we don't do the quota update on writepage path. This is because the needed quota updated would have already be done by fallocate. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
\| *	ext4: Handle -EDQUOT error on write	Aneesh Kumar K.V	2010-01-22	1	-14/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to release the journal before we do a write_inode. Otherwise we could deadlock. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
* \|	eventfd - allow atomic read and waitqueue remove	Davide Libenzi	2010-01-25	1	-15/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	KVM needs a wait to atomically remove themselves from the eventfd ->poll() wait queue head, in order to handle correctly their IRQfd deassign operation. This patch introduces such API, plus a way to read an eventfd from its context. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Avi Kivity <avi@redhat.com>
* \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6	Linus Torvalds	2010-01-21	1	-0/+3
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: tty: fix race in tty_fasync serial: serial_cs: oxsemi quirk breaks resume serial: imx: bit &/\| confusion serial: Fix crash if the minimum rate of the device is > 9600 baud serial-core: resume serial hardware with no_console_suspend serial: 8250_pnp: use wildcard for serial Wacom tablets nozomi: quick fix for the close/close bug compat_ioctl: Supress "unknown cmd" message on serial /dev/console
\| * \|	compat_ioctl: Supress "unknown cmd" message on serial /dev/console	Atsushi Nemoto	2010-01-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After the commit fb07a5f8 ("compat_ioctl: remove all VT ioctl handling"), I got this error message on 64-bit mips kernel with 32-bit busybox userland: ioctl32(init:1): Unknown cmd fd(0) cmd(00005600){t:'V';sz:0} arg(7fd76480) on /dev/console The cmd 5600 is VT_OPENQRY. The busybox's init issues this ioctl to know vt-console or serial-console. If the console was serial console, VT ioctls are not handled by the serial driver. And by quick search, I found some programs using VT_GETMODE to check vt-console is available or not. Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* \| \|	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds	2010-01-21	1	-1/+1
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'for-linus' of git://git.kernel.dk/linux-2.6-block: fs/bio.c: fix shadows sparse warning drbd: The kernel code is now equivalent to out of tree release 8.3.7 drbd: Allow online resizing of DRBD devices while peer not reachable (needs to be explicitly forced) drbd: Don't go into StandAlone mode when authentification failes because of network error drivers/block/drbd/drbd_receiver.c: correct NULL test cfq-iosched: Respect ioprio_class when preempting genhd: overlapping variable definition block: removed unused as_io_context DM: Fix device mapper topology stacking block: bdev_stack_limits wrapper block: Fix discard alignment calculation and printing block: Correct handling of bottom device misaligment drbd: check on CONFIG_LBDAF, not LBD drivers/block/drbd: Correct NULL test drbd: Silenced an assert that could triggered after changing write ordering method drbd: Kconfig fix drbd: Fix for a race between IO and a detach operation [Bugz 262] drbd: Use drbd_crypto_is_hash() instead of an open coded check
\| * \| \|	fs/bio.c: fix shadows sparse warning	Thiago Farina	2010-01-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fs/bio.c:81:33: warning: symbol 'bslab' shadows an earlier one fs/bio.c:74:25: originally declared here Signed-off-by: Thiago Farina <tfransosi@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* \| \| \|	Merge branch 'for-linus' of ↵	Linus Torvalds	2010-01-21	4	-50/+109
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6: ecryptfs: use after free ecryptfs: Eliminate useless code ecryptfs: fix interpose/interpolate typos in comments ecryptfs: pass matching flags to interpose as defined and used there ecryptfs: remove unnecessary d_drop calls in ecryptfs_link ecryptfs: don't ignore return value from lock_rename ecryptfs: initialize private persistent file before dereferencing pointer eCryptfs: Remove mmap from directory operations eCryptfs: Add getattr function eCryptfs: Use notify_change for truncating lower inodes
\| * \| \| \|	ecryptfs: use after free	Dan Carpenter	2010-01-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "full_alg_name" variable is used on a couple error paths, so we shouldn't free it until the end. Signed-off-by: Dan Carpenter <error27@gmail.com> Cc: stable@kernel.org Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
\| * \| \| \|	ecryptfs: Eliminate useless code	Julia Lawall	2010-01-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The variable lower_dentry is initialized twice to the same (side effect-free) expression. Drop one initialization. A simplified version of the semantic match that finds this problem is: (http://coccinelle.lip6.fr/) // <smpl> @forall@ idexpression x; identifier f!=ERR_PTR; @@ x = f(...) ... when != x ( x = f(...,<+...x...+>,...) \| x = f(...) ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
\| * \| \| \|	ecryptfs: fix interpose/interpolate typos in comments	Erez Zadok	2010-01-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Acked-by: Dustin Kirkland <kirkland@canonical.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
\| * \| \| \|	ecryptfs: pass matching flags to interpose as defined and used there	Erez Zadok	2010-01-19	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ecryptfs_interpose checks if one of the flags passed is ECRYPTFS_INTERPOSE_FLAG_D_ADD, defined as 0x00000001 in ecryptfs_kernel.h. But the only user of ecryptfs_interpose to pass a non-zero flag to it, has hard-coded the value as "1". This could spell trouble if any of these values changes in the future. Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Cc: Dustin Kirkland <kirkland@canonical.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
\| * \| \| \|	ecryptfs: remove unnecessary d_drop calls in ecryptfs_link	Erez Zadok	2010-01-19	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unnecessary because it would unhash perfectly valid dentries, causing them to have to be re-looked up the next time they're needed, which presumably is right after. Signed-off-by: Aseem Rastogi <arastogi@cs.sunysb.edu> Signed-off-by: Shrikar archak <shrikar84@gmail.com> Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Cc: Saumitra Bhanage <sbhanage@cs.sunysb.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
\| * \| \| \|	ecryptfs: don't ignore return value from lock_rename	Erez Zadok	2010-01-19	1	-1/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Cc: Dustin Kirkland <kirkland@canonical.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
\| * \| \| \|	ecryptfs: initialize private persistent file before dereferencing pointer	Erez Zadok	2010-01-19	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ecryptfs_open dereferences a pointer to the private lower file (the one stored in the ecryptfs inode), without checking if the pointer is NULL. Right afterward, it initializes that pointer if it is NULL. Swap order of statements to first initialize. Bug discovered by Duckjin Kang. Signed-off-by: Duckjin Kang <fromdj2k@gmail.com> Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Cc: Dustin Kirkland <kirkland@canonical.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@kernel.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
\| * \| \| \|	eCryptfs: Remove mmap from directory operations	Tyler Hicks	2010-01-19	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adrian reported that mkfontscale didn't work inside of eCryptfs mounts. Strace revealed the following: open("./", O_RDONLY\|O_NONBLOCK\|O_LARGEFILE\|O_DIRECTORY\|O_CLOEXEC) = 3 fcntl64(3, F_GETFD) = 0x1 (flags FD_CLOEXEC) open("./fonts.scale", O_WRONLY\|O_CREAT\|O_TRUNC, 0666) = 4 getdents(3, /* 80 entries */, 32768) = 2304 open("./.", O_RDONLY) = 5 fcntl64(5, F_SETFD, FD_CLOEXEC) = 0 fstat64(5, {st_mode=S_IFDIR\|0755, st_size=16384, ...}) = 0 mmap2(NULL, 16384, PROT_READ, MAP_PRIVATE, 5, 0) = 0xb7fcf000 close(5) = 0 --- SIGBUS (Bus error) @ 0 (0) --- +++ killed by SIGBUS +++ The mmap2() on a directory was successful, resulting in a SIGBUS signal later. This patch removes mmap() from the list of possible ecryptfs_dir_fops so that mmap() isn't possible on eCryptfs directory files. https://bugs.launchpad.net/ecryptfs/+bug/400443 Reported-by: Adrian C. <anrxc@sysphere.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
\| * \| \| \|	eCryptfs: Add getattr function	Tyler Hicks	2010-01-19	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The i_blocks field of an eCryptfs inode cannot be trusted, but generic_fillattr() uses it to instantiate the blocks field of a stat() syscall when a filesystem doesn't implement its own getattr(). Users have noticed that the output of du is incorrect on newly created files. This patch creates ecryptfs_getattr() which calls into the lower filesystem's getattr() so that eCryptfs can use its kstat.blocks value after calling generic_fillattr(). It is important to note that the block count includes the eCryptfs metadata stored in the beginning of the lower file plus any padding used to fill an extent before encryption. https://bugs.launchpad.net/ecryptfs/+bug/390833 Reported-by: Dominic Sacré <dominic.sacre@gmx.de> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
\| * \| \| \|	eCryptfs: Use notify_change for truncating lower inodes	Tyler Hicks	2010-01-19	1	-32/+67
\| \| \|/ / \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When truncating inodes in the lower filesystem, eCryptfs directly invoked vmtruncate(). As Christoph Hellwig pointed out, vmtruncate() is a filesystem helper function, but filesystems may need to do more than just a call to vmtruncate(). This patch moves the lower inode truncation out of ecryptfs_truncate() and renames the function to truncate_upper(). truncate_upper() updates an iattr for the lower inode to indicate if the lower inode needs to be truncated upon return. ecryptfs_setattr() then calls notify_change(), using the updated iattr for the lower inode, to complete the truncation. For eCryptfs functions needing to truncate, ecryptfs_truncate() is reintroduced as a simple way to truncate the upper inode to a specified size and then truncate the lower inode accordingly. https://bugs.launchpad.net/bugs/451368 Reported-by: Christoph Hellwig <hch@lst.de> Acked-by: Dustin Kirkland <kirkland@canonical.com> Cc: ecryptfs-devel@lists.launchpad.net Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
* \| \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable	Linus Torvalds	2010-01-21	7	-41/+125
\|\ \ \ \ \| \|/ / / \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: fix possible panic on unmount Btrfs: deal with NULL acl sent to btrfs_set_acl Btrfs: fix regression in orphan cleanup Btrfs: Fix race in btrfs_mark_extent_written Btrfs, fix memory leaks in error paths Btrfs: align offsets for btrfs_ordered_update_i_size btrfs: fix missing last-entry in readdir(3)
\| * \| \|	Btrfs: fix possible panic on unmount	Josef Bacik	2010-01-17	1	-13/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can race with the unmount of an fs and the stopping of a kthread where we will free the block group before we're done using it. The reason for this is because we do not hold a reference on the block group while its caching, since the allocator drops its reference once it exits or moves on to the next block group. This patch fixes the problem by taking a reference to the block group before we start caching and dropping it when we're done to make sure all accesses to the block group are safe. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| * \| \|	Btrfs: deal with NULL acl sent to btrfs_set_acl	Chris Mason	2010-01-17	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is legal for btrfs_set_acl to be sent a NULL acl. This makes sure we don't dereference it. A similar patch was sent by Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| * \| \|	Btrfs: fix regression in orphan cleanup	Josef Bacik	2010-01-17	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently orphan cleanup only ever gets triggered if we cross subvolumes during a lookup, which means that if we just mount a plain jane fs that has orphans in it, they will never get cleaned up. This results in panic's like these http://www.kerneloops.org/oops.php?number=1109085 where adding an orphan entry results in -EEXIST being returned and we panic. In order to fix this, we check to see on lookup if our root has had the orphan cleanup done, and if not go ahead and do it. This is easily reproduceable by running this testcase #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <string.h> #include <unistd.h> #include <stdio.h> int main(int argc, char *argv) { char data[4096]; char newdata[4096]; int fd1, fd2; memset(data, 'a', 4096); memset(newdata, 'b', 4096); while (1) { int i; fd1 = creat("file1", 0666); if (fd1 < 0) break; for (i = 0; i < 512; i++) write(fd1, data, 4096); fsync(fd1); close(fd1); fd2 = creat("file2", 0666); if (fd2 < 0) break; ftruncate(fd2, 4096 512); for (i = 0; i < 512; i++) write(fd2, newdata, 4096); close(fd2); i = rename("file2", "file1"); unlink("file1"); } return 0; } and then pulling the power on the box, and then trying to run that test again when the box comes back up. I've tested this locally and it fixes the problem. Thanks to Tomas Carnecky for helping me track this down initially. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| * \| \|	Btrfs: Fix race in btrfs_mark_extent_written	Yan, Zheng	2010-01-17	1	-20/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix bug reported by Johannes Hirte. The reason of that bug is btrfs_del_items is called after btrfs_duplicate_item and btrfs_del_items triggers tree balance. The fix is check that case and call btrfs_search_slot when needed. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| * \| \|	Btrfs, fix memory leaks in error paths	Jiri Slaby	2010-01-17	2	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Stanse found 2 memory leaks in relocate_block_group and __btrfs_map_block. cluster and multi are not freed/assigned on all paths. Fix that. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: linux-btrfs@vger.kernel.org Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| * \| \|	Btrfs: align offsets for btrfs_ordered_update_i_size	Yan, Zheng	2010-01-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some callers of btrfs_ordered_update_i_size can now pass in a NULL for the ordered extent to update against. This makes sure we properly align the offset they pass in when deciding how much to bump the on disk i_size. Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| * \| \|	btrfs: fix missing last-entry in readdir(3)	Jan Engelhardt	2010-01-17	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	parent 49313cdac7b34c9f7ecbb1780cfc648b1c082cd7 (v2.6.32-1-g49313cd) commit ff48c08e1c05c67e8348ab6f8a24de8034e0e34d Author: Jan Engelhardt <jengelh@medozas.de> Date: Wed Dec 9 22:57:36 2009 +0100 Btrfs: fix missing last-entry in readdir(3) When one does a 32-bit readdir(3), the last entry of a directory is missing. This is however not due to passing a large value to filldir, but it seems to have to do with glibc doing telldir or something quirky. In any case, this patch fixes it in practice. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* \| \| \|	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs	Linus Torvalds	2010-01-18	8	-130/+201
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: xfs_swap_extents needs to handle dynamic fork offsets xfs: fix missing error check in xfs_rtfree_range xfs: fix stale inode flush avoidance xfs: Remove inode iolock held check during allocation xfs: reclaim all inodes by background tree walks xfs: Avoid inodes in reclaim when flushing from inode cache xfs: reclaim inodes under a write lock
\| * \| \| \|	xfs: xfs_swap_extents needs to handle dynamic fork offsets	Dave Chinner	2010-01-15	1	-16/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When swapping extents, we can corrupt inodes by swapping data forks that are in incompatible formats. This is caused by the two indoes having different fork offsets due to the presence of an attribute fork on an attr2 filesystem. xfs_fsr tries to be smart about setting the fork offset, but the trick it plays only works on attr1 (old fixed format attribute fork) filesystems. Changing the way xfs_fsr sets up the attribute fork will prevent this situation from ever occurring, so in the kernel code we can get by with a preventative fix - check that the data fork in the defragmented inode is in a format valid for the inode it is being swapped into. This will lead to files that will silently and potentially repeatedly fail defragmentation, so issue a warning to the log when this particular failure occurs to let us know that xfs_fsr needs updating/fixing. To help identify how to improve xfs_fsr to avoid this issue, add trace points for the inodes being swapped so that we can determine why the swap was rejected and to confirm that the code is making the right decisions and modifications when swapping forks. A further complication is even when the swap is allowed to proceed when the fork offset is different between the two inodes then value for the maximum number of extents the data fork can hold can be wrong. Make sure these are also set correctly after the swap occurs. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
\| * \| \| \|	xfs: fix missing error check in xfs_rtfree_range	Dave Chinner	2010-01-15	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When xfs_rtfind_forw() returns an error, the block is returned uninitialised. xfs_rtfree_range() is not checking the error return, so could be using an uninitialised block number for modifying bitmap summary info. The problem was found by gcc when compiling the userspace libxfs code - it is an copy of the kernel code with the exact same bug. gcc gives an uninitialised variable warning on the userspace code but not on the kernel code. You gotta love the consistency (Mmmm, slightly chewy today!). Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Alex Elder <aelder@sgi.com>
\| * \| \| \|	xfs: fix stale inode flush avoidance	Dave Chinner	2010-01-15	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When reclaiming stale inodes, we need to guarantee that inodes are unpinned before returning with a "clean" status. If we don't we can reclaim inodes that are pinned, leading to use after free in the transaction subsystem as transactions complete. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
\| * \| \| \|	xfs: Remove inode iolock held check during allocation	Dave Chinner	2010-01-15	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	lockdep complains about a the lock not being initialised as we do an ASSERT based check that the lock is not held before we initialise it to catch inodes freed with the lock held. lockdep does this check for us in the lock initialisation code, so remove the ASSERT to stop the lockdep warning. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
\| * \| \| \|	xfs: reclaim all inodes by background tree walks	Dave Chinner	2010-01-15	1	-8/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We cannot do direct inode reclaim without taking the flush lock to ensure that we do not reclaim an inode under IO. We check the inode is clean before doing direct reclaim, but this is not good enough because the inode flush code marks the inode clean once it has copied the in-core dirty state to the backing buffer. It is the flush lock that determines whether the inode is still under IO, even though it is marked clean, and the inode is still required at IO completion so we can't reclaim it even though it is clean in core. Hence the requirement that we need to take the flush lock even on clean inodes because this guarantees that the inode writeback IO has completed and it is safe to reclaim the inode. With delayed write inode flushing, we coul dend up waiting a long time on the flush lock even for a clean inode. The background reclaim already handles this efficiently, so avoid all the problems by killing the direct reclaim path altogether. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
\| * \| \| \|	xfs: Avoid inodes in reclaim when flushing from inode cache	Dave Chinner	2010-01-15	1	-13/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The reclaim code will handle flushing of dirty inodes before reclaim occurs, so avoid them when determining whether an inode is a candidate for flushing to disk when walking the radix trees. This is based on a test patch from Christoph Hellwig. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
\| * \| \| \|	xfs: reclaim inodes under a write lock	Dave Chinner	2010-01-15	3	-87/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the inode tree reclaim walk exclusive to avoid races with concurrent sync walkers and lookups. This is a version of a patch posted by Christoph Hellwig that avoids all the code duplication. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* \| \| \| \|	Merge branch 'for-linus' of ↵	Linus Torvalds	2010-01-17	8	-77/+51
\|\ \ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: do_add_mount() should sanitize mnt_flags CIFS shouldn't make mountpoints shrinkable mnt_flags fixes in do_remount() attach_recursive_mnt() needs to hold vfsmount_lock over set_mnt_shared() may_umount() needs namespace_sem Fix configfs leak Fix the -ESTALE handling in do_filp_open() ecryptfs: Fix refcnt leak on ecryptfs_follow_link() error path Fix ACC_MODE() for real Unrot uml mconsole a bit hppfs: handle ->put_link() Kill 9p readlink() fix autofs/afs/etc. magic mountpoint breakage
\| * \| \| \| \|	do_add_mount() should sanitize mnt_flags	Al Viro	2010-01-16	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MNT_WRITE_HOLD shouldn't leak into new vfsmount and neither should MNT_SHARED (the latter will be set properly, along with the rest of shared-subtree data structures) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \| \| \| \|	CIFS shouldn't make mountpoints shrinkable	Al Viro	2010-01-16	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \| \| \| \|	mnt_flags fixes in do_remount()	Al Viro	2010-01-16	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* need vfsmount_lock over modifying it * need to preserve MNT_SHARED/MNT_UNBINDABLE Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \| \| \| \|	attach_recursive_mnt() needs to hold vfsmount_lock over set_mnt_shared()	Al Viro	2010-01-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	race in mnt_flags update Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \| \| \| \|	may_umount() needs namespace_sem	Al Viro	2010-01-16	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	otherwise it races with clone_mnt() changing mnt_share/mnt_slaves Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>