path: root/fs
Commit message (Author, Age, Files, Lines)
* fs: change return values from -EACCES to -EPERM (Zhao Hongjiang, 2013-02-26, 6 files, -16/+16)
| According to SUSv3:
|
| [EACCES] Permission denied. An attempt was made to access a file in a way forbidden by its file access permissions.
|
| [EPERM] Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resource.
|
| So -EPERM should be returned if a capability check fails.
|
| Strictly speaking this is an API change since the error code the user sees is altered.
|
| Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
| Acked-by: Jan Kara <jack@suse.cz>
| Acked-by: Steven Whitehouse <swhiteho@redhat.com>
| Acked-by: Ian Kent <raven@themaw.net>
| Cc: Al Viro <viro@zeniv.linux.org.uk>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
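The distinction drawn in this message can be sketched in userspace C. This is only an illustration under stated assumptions: `struct access_request` and `check_access()` are hypothetical names invented for the example, not kernel interfaces, and the boolean fields merely stand in for a capability check and a file-mode check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a request; not a kernel structure. */
struct access_request {
	bool needs_privilege;   /* operation is limited to privileged callers */
	bool caller_privileged; /* stand-in for a capability check */
	bool mode_allows;       /* stand-in for a file permission bits check */
};

/* Returns 0 on success or a negative errno value, kernel style. */
static int check_access(const struct access_request *req)
{
	assert(req != NULL);

	/* Missing privilege: the operation itself is not permitted. */
	if (req->needs_privilege && !req->caller_privileged)
		return -EPERM;

	/* The file access permissions forbid this access. */
	if (!req->mode_allows)
		return -EACCES;

	return 0;
}
```

The privilege test comes first in this sketch so that a failed capability-style check is never reported as a file permission problem.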
* fs/exec.c: make bprm_mm_init() static (Yuanhan Liu, 2013-02-26, 1 file, -1/+1)
| There is only one user of bprm_mm_init, and it's inside the same file.
|
| Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ocfs2/dlm: use GFP_ATOMIC inside a spin_lock (Dan Carpenter, 2013-02-26, 1 file, -1/+1)
| My static checker complains that this is called with a spin_lock held in dlm_master_requery_handler() from dlmrecovery.c.
|
| Probably the reason we have not received any bug reports about this is that recovery is not a common operation.
|
| Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
| Signed-off-by: Joel Becker <jlbec@evilplan.org>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ocfs2: fix possible use-after-free with AIO (Jan Kara, 2013-02-26, 1 file, -1/+1)
| A running AIO pins the inode in memory through a file reference. Once the AIO is completed using aio_complete(), the file reference is put and the inode can be freed from memory. So we have to be sure that calling aio_complete() is the last thing we do with the inode.
|
| Signed-off-by: Jan Kara <jack@suse.cz>
| Acked-by: Jeff Moyer <jmoyer@redhat.com>
| Acked-by: Joel Becker <jlbec@evilplan.org>
| Cc: Mark Fasheh <mfasheh@suse.com>
| Cc: Al Viro <viro@zeniv.linux.org.uk>
| Cc: <stable@vger.kernel.org>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
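The ordering rule in this message ("completion is the last thing we do with the inode") can be modelled in a few lines of userspace C. Everything here is an assumption made for illustration: `struct fake_inode`, `fake_aio_complete()` and `finish_io()` are hypothetical stand-ins, and a plain counter plays the role of the file reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an inode pinned only by the in-flight I/O. */
struct fake_inode {
	int refcount;
	long bytes_done;
};

static struct fake_inode *fake_inode_new(long bytes_done)
{
	struct fake_inode *inode = malloc(sizeof(*inode));

	if (inode) {
		inode->refcount = 1;	/* the reference held by the running I/O */
		inode->bytes_done = bytes_done;
	}
	return inode;
}

/* Models completion: drops the I/O's reference, which may free the inode. */
static void fake_aio_complete(struct fake_inode *inode)
{
	assert(inode->refcount > 0);
	if (--inode->refcount == 0)
		free(inode);
}

/* Reads everything it needs first; completion is the final use of inode. */
static long finish_io(struct fake_inode *inode)
{
	long done = inode->bytes_done;

	fake_aio_complete(inode);
	/* inode must not be touched from here on */
	return done;
}
```

Reading `bytes_done` after the call to `fake_aio_complete()` would be the use-after-free pattern the commit guards against.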
* ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path (Sunil Mushran, 2013-02-26, 1 file, -1/+1)
| Commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1 was missing a var init.
|
| Reported-and-Tested-by: Vincent Etienne <vetienne@aprogsys.com>
| Signed-off-by: Sunil Mushran <sunil.mushran@gmail.com>
| Signed-off-by: Joel Becker <jlbec@evilplan.org>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero (Al Viro, 2013-02-26, 2 files, -3/+0)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* export kernel_write(), convert open-coded instances (Al Viro, 2013-02-26, 2 files, -7/+4)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fs: encode_fh: return FILEID_INVALID if invalid fid_type (Namjae Jeon, 2013-02-26, 10 files, -19/+19)
| This patch is a follow-up to the patch below:
|
| [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type
| commit: 216b6cbdcbd86b1db0754d58886b466ae31f5a63
|
| Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
| Signed-off-by: Vivek Trivedi <t.vivek@samsung.com>
| Acked-by: Steven Whitehouse <swhiteho@redhat.com>
| Acked-by: Sage Weil <sage@inktank.com>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* kill f_vfsmnt (Al Viro, 2013-02-26, 3 files, -4/+4)
| very few users left...
|
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op (Jeff Layton, 2013-02-26, 7 files, -13/+53)
| The following set of operations on an NFS client and server will cause an ESTALE error:
|
|     server# mkdir a
|     client# cd a
|     server# mv a a.bak
|     client# sleep 30  # (or whatever the dir attrcache timeout is)
|     client# stat .
|     stat: cannot stat `.': Stale NFS file handle
|
| Obviously, we should not be getting an ESTALE error back there since the inode still exists on the server. The problem is that the lookup code will call d_revalidate on the dentry that "." refers to, because NFS has FS_REVAL_DOT set.
|
| nfs_lookup_revalidate will see that the parent directory has changed and will try to reverify the dentry by redoing a LOOKUP. That of course fails, so the lookup code returns ESTALE.
|
| The problem here is that d_revalidate is really a bad fit for this case. What we really want to know at this point is whether the inode is still good or not, but we don't really care what name it goes by or whether the dcache is still valid.
|
| Add a new d_op->d_weak_revalidate operation and have complete_walk call that instead of d_revalidate. The intent there is to allow for a "weaker" d_revalidate that just checks to see whether the inode is still good. This also gives us an opportunity to kill off the FS_REVAL_DOT special casing.
|
| [AV: changed method name, added note in porting, fixed confusion re having it possibly called from RCU mode (it won't be)]
|
| Cc: NeilBrown <neilb@suse.de>
| Signed-off-by: Jeff Layton <jlayton@redhat.com>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* nfsd: handle vfs_getattr errors in acl protocol (J. Bruce Fields, 2013-02-26, 4 files, -9/+24)
| We're currently ignoring errors from vfs_getattr.
|
| The correct thing to do is to do the stat in the main service procedure, not in the response encoding.
|
| Reported-by: Al Viro <viro@zeniv.linux.org.uk>
| Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* switch vfs_getattr() to struct path (Al Viro, 2013-02-26, 9 files, -30/+34)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ceph: prepopulate inodes only when request is aborted (Sage Weil, 2013-02-26, 1 file, -2/+38)
| If r_aborted is true, we do not hold the dir i_mutex, and cannot touch the dcache. However, we still need to update the inodes with the state returned by the MDS.
|
| Reported-by: Al Viro <viro@zeniv.linux.org.uk>
| Signed-off-by: Sage Weil <sage@inktank.com>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* d_hash_and_lookup(): export, switch open-coded instances (Al Viro, 2013-02-26, 4 files, -24/+18)
| * calling conventions change - ERR_PTR() is returned on ->d_hash() errors; NULL is just for dcache miss now.
| * exported, open-coded instances in ncpfs and cifs converted.
|
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
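The three-way calling convention described here (a valid pointer on a hit, NULL on a miss, an encoded error on failure) can be sketched in userspace C. The helpers below only imitate the kernel's `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()`; `toy_lookup()` and its table are hypothetical and are not the dcache code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

static long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static int IS_ERR(const void *ptr)
{
	/* Error values occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct entry {
	const char *name;
	int value;
};

static struct entry table[] = { { "a", 1 }, { "b", 2 } };

/*
 * Three outcomes: an entry pointer on a hit, NULL on a plain miss,
 * and ERR_PTR(-EINVAL) when the name cannot be processed at all
 * (standing in for a ->d_hash() failure).
 */
static struct entry *toy_lookup(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return ERR_PTR(-EINVAL);

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (strcmp(table[i].name, name) == 0)
			return &table[i];

	return NULL;
}
```

A caller following this convention tests `IS_ERR()` before testing for NULL, so that a hashing failure is not mistaken for a miss.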
* 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() (Al Viro, 2013-02-26, 3 files, -35/+33)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 9p: split dropping the acls from v9fs_set_create_acl() (Al Viro, 2013-02-26, 3 files, -21/+28)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 9p: switch v9fs_acl_chmod() from dentry to inode+fid (Al Viro, 2013-02-26, 3 files, -16/+13)
| caller has both, might as well pass them explicitly.
|
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 9p: switch v9fs_set_acl() from dentry to fid (Al Viro, 2013-02-26, 1 file, -5/+12)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 9p: lift the call of set_cached_acl() into the callers of v9fs_set_acl() (Al Viro, 2013-02-26, 1 file, -4/+3)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 9p: add fid-based variant of v9fs_xattr_set() (Al Viro, 2013-02-26, 2 files, -15/+20)
| ... making v9fs_xattr_set() a wrapper for it.
|
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* hugetlb_file_setup(): use d_alloc_pseudo() (Al Viro, 2013-02-26, 1 file, -4/+15)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* hostfs: directory methods have no business in non-directory inode_operations (Al Viro, 2013-02-22, 1 file, -8/+0)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* __d_materialise_unique() is too generic (Al Viro, 2013-02-22, 1 file, -14/+5)
| Its first argument is always non-root, while the second one is always root.
|
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fs: Fix possible use-after-free with AIO (Jan Kara, 2013-02-22, 1 file, -1/+1)
| A running AIO pins the inode in memory through a file reference. Once the AIO is completed using aio_complete(), the file reference is put and the inode can be freed from memory. So we have to be sure that calling aio_complete() is the last thing we do with the inode.
|
| CC: Christoph Hellwig <hch@infradead.org>
| CC: Jens Axboe <axboe@kernel.dk>
| CC: Jeff Moyer <jmoyer@redhat.com>
| CC: stable@vger.kernel.org
| Acked-by: Jeff Moyer <jmoyer@redhat.com>
| Signed-off-by: Jan Kara <jack@suse.cz>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* constify d_lookup() arguments (Al Viro, 2013-02-22, 1 file, -1/+1)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* constify __d_lookup() arguments (Al Viro, 2013-02-22, 1 file, -1/+1)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* lookup_slow: get rid of name argument (Al Viro, 2013-02-22, 1 file, -4/+3)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* lookup_fast: get rid of name argument (Al Viro, 2013-02-22, 1 file, -5/+5)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* get rid of name and type arguments of walk_component() (Al Viro, 2013-02-22, 1 file, -10/+8)
| ... always can be found in nameidata now.
|
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* link_path_walk(): move assignments to nd->last/nd->last_type up (Al Viro, 2013-02-22, 1 file, -12/+10)
| ... and clean the main loop a bit
|
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* vfs: remove d_path_with_unreachable (Jeff Layton, 2013-02-22, 1 file, -31/+0)
| The last caller was removed >2 years ago in commit 7b2a69ba7.
|
| Signed-off-by: Jeff Layton <jlayton@redhat.com>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fs: Preserve error code in get_empty_filp(), part 2 (Anatol Pomozov, 2013-02-22, 4 files, -15/+11)
| Allocating a file structure in function get_empty_filp() might fail for several reasons:
|  - not enough memory for file structures
|  - operation is not allowed
|  - user is over its limit
|
| Currently the function returns NULL in all cases and we lose the exact reason of the error. All callers of get_empty_filp() assume that the function can fail with ENFILE only.
|
| Return error through pointer. Change all callers to preserve this error code.
|
| [AV: cleaned up a bit, carved the get_empty_filp() part out into a separate commit (things remaining here deal with alloc_file()), removed pipe(2) behaviour change]
|
| Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
| Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* propagate error from get_empty_filp() to its callers (Al Viro, 2013-02-22, 3 files, -30/+28)
| Based on parts from Anatol's patch (the rest is the next commit).
|
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* new helper: file_inode(file) (Al Viro, 2013-02-22, 158 files, -375/+373)
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* mount: consolidate permission checks (Al Viro, 2013-02-22, 1 file, -33/+7)
| ... and ask for global CAP_SYS_ADMIN only for superblock-level remounts
|
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* get rid of unprotected dereferencing of mnt->mnt_ns (Al Viro, 2013-02-22, 1 file, -12/+17)
| It's safe only under namespace_sem or vfsmount_lock; all places in fs/namespace.c that want mnt->mnt_ns->user_ns actually want to use current->nsproxy->mnt_ns->user_ns (note the calls of check_mnt() in there).
|
| Cc: stable@vger.kernel.org
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge tag 'for-linus-v3.8-rc4' of git://oss.sgi.com/xfs/xfs (Linus Torvalds, 2013-01-16, 7 files, -42/+64)
|\
| | Pull xfs bugfixes from Ben Myers:
| |  - fix(es) for compound buffers
| |  - fix for dquot soft timer asserts due to overflow of d_blk_softlimit
| |  - fix for regression in dir v2 code introduced in commit 20f7e9f3726a ("xfs: factor dir2 block read operations")
| |
| | * tag 'for-linus-v3.8-rc4' of git://oss.sgi.com/xfs/xfs:
| |   xfs: recalculate leaf entry pointer after compacting a dir2 block
| |   xfs: remove int casts from debug dquot soft limit timer asserts
| |   xfs: fix the multi-segment log buffer format
| |   xfs: fix segment in xfs_buf_item_format_segment
| |   xfs: rename bli_format to avoid confusion with bli_formats
| |   xfs: use b_maps[] for discontiguous buffers
| * xfs: recalculate leaf entry pointer after compacting a dir2 block (Eric Sandeen, 2013-01-16, 1 file, -2/+4)
| | Dave Jones hit this assert when doing a compile on recent git, with CONFIG_XFS_DEBUG enabled:
| |
| |     XFS: Assertion failed: (char *)dup - (char *)hdr == be16_to_cpu(*xfs_dir2_data_unused_tag_p(dup)), file: fs/xfs/xfs_dir2_data.c, line: 828
| |
| | Upon further digging, the tag found by xfs_dir2_data_unused_tag_p(dup) contained "2" and not the proper offset, and I found that this value was changed after the memmoves under "Use a stale leaf for our new entry." in xfs_dir2_block_addname(), i.e.
| |
| |     memmove(&blp[mid + 1], &blp[mid], (highstale - mid) * sizeof(*blp));
| |
| | overwrote it.
| |
| | What has happened is that the previous call to xfs_dir2_block_compact() has rearranged things; it changes btp->count as well as the blp array. So after we make that call, we must recalculate the proper pointer to the leaf entries by making another call to xfs_dir2_block_leaf_p().
| |
| | Dave provided a metadump image which led to a simple reproducer (create a particular filename in the affected directory) and this resolves the testcase as well as the bug on his live system.
| |
| | Thanks also to dchinner for looking at this one with me.
| |
| | Signed-off-by: Eric Sandeen <sandeen@redhat.com>
| | Tested-by: Dave Jones <davej@redhat.com>
| | Reviewed-by: Dave Chinner <dchinner@redhat.com>
| | Reviewed-by: Mark Tinguely <tinguely@sgi.com>
| | Signed-off-by: Ben Myers <bpm@sgi.com>
| * xfs: remove int casts from debug dquot soft limit timer asserts (Brian Foster, 2013-01-16, 1 file, -2/+2)
| | The int casts here make it easy to trigger an assert with a large soft limit. For example, set a >4TB soft limit on an empty volume to reproduce a (0 > -x) comparison due to an overflow of d_blk_softlimit.
| |
| | Signed-off-by: Brian Foster <bfoster@redhat.com>
| | Reviewed-by: Ben Myers <bpm@sgi.com>
| | Signed-off-by: Ben Myers <bpm@sgi.com>
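The failure mode described here, a 64-bit limit turning negative once it is narrowed to int, can be reproduced in a small userspace sketch. The function names are hypothetical and the values are not tied to XFS block units; `truncate_to_int32()` models what such a cast yields on common two's-complement compilers while keeping the arithmetic well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Keeps the low 32 bits and reinterprets them as a signed value. */
static int32_t truncate_to_int32(uint64_t v)
{
	uint32_t low = (uint32_t)v;

	if (low <= (uint32_t)INT32_MAX)
		return (int32_t)low;
	return (int32_t)((int64_t)low - ((int64_t)1 << 32));
}

/* Comparison through the narrowing cast: wrong for large limits. */
static bool below_limit_truncated(uint64_t used, uint64_t softlimit)
{
	return truncate_to_int32(used) < truncate_to_int32(softlimit);
}

/* Comparison at full width: what removing the casts amounts to. */
static bool below_limit_wide(uint64_t used, uint64_t softlimit)
{
	return used < softlimit;
}
```

With a usage of 0 and a limit of 2^31, the truncated comparison sees the limit as a negative number and reports the limit as exceeded, while the full-width comparison does not.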
| * xfs: fix the multi-segment log buffer format (Mark Tinguely, 2013-01-16, 2 files, -5/+15)
| | Per Dave Chinner's suggestion, this patch:
| |  1) Corrects the detection of whether a multi-segment buffer is still tracking data.
| |  2) Clears all the buffer log formats for a multi-segment buffer.
| |
| | Signed-off-by: Mark Tinguely <tinguely@sgi.com>
| | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | Reviewed-by: Dave Chinner <dchinner@redhat.com>
| | Signed-off-by: Ben Myers <bpm@sgi.com>
| * xfs: fix segment in xfs_buf_item_format_segment (Mark Tinguely, 2013-01-16, 1 file, -5/+15)
| | Not every segment in a multi-segment buffer is dirty in a transaction, and segments that are not dirty will not be output. The assert in xfs_buf_item_format_segment() that checks for at least one chunk of data in the segment to be used is not necessarily true for multi-segment buffers.
| |
| | Signed-off-by: Mark Tinguely <tinguely@sgi.com>
| | Reviewed-by: Dave Chinner <dchinner@redhat.com>
| | Signed-off-by: Ben Myers <bpm@sgi.com>
| * xfs: rename bli_format to avoid confusion with bli_formats (Mark Tinguely, 2013-01-16, 3 files, -24/+24)
| | Rename the bli_format structure to __bli_format to avoid accidentally confusing it with the bli_formats pointer.
| |
| | Signed-off-by: Mark Tinguely <tinguely@sgi.com>
| | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | Reviewed-by: Dave Chinner <dchinner@redhat.com>
| | Signed-off-by: Ben Myers <bpm@sgi.com>
| * xfs: use b_maps[] for discontiguous buffers (Mark Tinguely, 2013-01-16, 2 files, -9/+9)
| | Commits starting at 77c1a08 introduced multiple segment support to xfs_buf. xfs_trans_buf_item_match() could not find a multi-segment buffer in the transaction because it was looking at the single segment block number rather than the multi-segment b_maps[0].bm_bn. This results in a recursive buffer lock that can never be satisfied.
| |
| | This patch:
| |  1) Changes the remaining b_map accesses to be b_maps[0] accesses.
| |  2) Renames the single segment b_map structure to __b_map to avoid future confusion.
| |
| | Signed-off-by: Mark Tinguely <tinguely@sgi.com>
| | Reviewed-by: Dave Chinner <dchinner@redhat.com>
| | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | Signed-off-by: Ben Myers <bpm@sgi.com>
* | Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs (Linus Torvalds, 2013-01-16, 2 files, -2/+4)
|\ \
| | | Pull ext3 and udf fixes from Jan Kara:
| | |  "One ext3 performance regression fix and one udf regression fix (oops on interrupted mount)."
| | |
| | | * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
| | |   UDF: Fix a null pointer dereference in udf_sb_free_partitions
| | |   jbd: don't wake kjournald unnecessarily
| * | UDF: Fix a null pointer dereference in udf_sb_free_partitions (Namjae Jeon, 2013-01-14, 1 file, -1/+2)
| | | This patch fixes a regression caused by commit bff943af6fe "udf: Fix memory leak when mounting", which triggered a kernel null pointer dereference in case of an interrupted mount or when allocating memory for sbi->s_partmaps failed in function udf_sb_alloc_partition_maps.
| | |
| | | Reported-and-tested-by: James Hogan <james@albanarts.com>
| | | Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
| | | Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
| | | Signed-off-by: Jan Kara <jack@suse.cz>
| * | jbd: don't wake kjournald unnecessarily (Eric Sandeen, 2013-01-14, 1 file, -1/+2)
| | | Don't send an extra wakeup to kjournald in the case where we already have the proper target in j_commit_request, i.e. that commit has already been requested.
| | |
| | | Commit d9b0193 "jbd: fix fsync() tid wraparound bug" changed the logic leading to a wakeup, but it caused some extra wakeups which were found to lead to a measurable performance regression.
| | |
| | | Signed-off-by: Eric Sandeen <sandeen@redhat.com>
| | | Signed-off-by: Jan Kara <jack@suse.cz>
* | | vfs: add missing virtual cache flush after editing partial pages (Linus Torvalds, 2013-01-14, 1 file, -0/+1)
| | | Andrew Morton pointed this out a month ago, and then I completely forgot about it.
| | |
| | | If we read a partial last page of a block device, we will zero out the end of the page, but since that page can then be mapped into user space, we should also make sure to flush the cache on architectures that have virtual caches. We have the flush_dcache_page() function for this, so use it.
| | |
| | | Now, in practice this really never matters, because nobody sane uses virtual caches to begin with, and they largely exist on old broken RISC architectures.
| | |
| | | And even if you did run on one of those obsolete CPUs, the whole "mmap and access the last partial page of a block device" behavior probably doesn't actually exist. The normal IO functions (read/write) will never see the zeroed-out part of the page that might not be coherent in the cache, because they honor the size of the device.
| | |
| | | So I'm marking this for stable (3.7 only), but I'm not sure anybody will ever care.
| | |
| | | Pointed-out-by: Andrew Morton <akpm@linux-foundation.org>
| | | Cc: stable@vger.kernel.org # 3.7
| | | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge tag 'driver-core-3.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core (Linus Torvalds, 2013-01-14, 1 file, -1/+1)
|\ \ \
| |/ /
|/| |
| | | Pull driver core fixes from Greg Kroah-Hartman:
| | |  "Here are two patches for 3.8-rc3.
| | |
| | |   One removes the __dev* defines from init.h now that all usages of them are gone from your tree. The other fix is for debugfs's parameter that was using the wrong base for the option.
| | |
| | |   Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
| | |
| | | * tag 'driver-core-3.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
| | |   debugfs: convert gid= argument from decimal, not octal
| | |   Remove __dev* markings from init.h
| * | debugfs: convert gid= argument from decimal, not octal (Dave Reisner, 2013-01-11, 1 file, -1/+1)
| | | This patch technically breaks userspace, but I suspect that anyone who actually used this flag would have encountered this brokenness, declared it lunacy, and already sent a patch.
| | |
| | | Signed-off-by: Dave Reisner <dreisner@archlinux.org>
| | | Reviewed-by: Vasiliy Kulikov <segoon@openwall.com>
| | | Acked-by: Kees Cook <keescook@chromium.org>
| | | Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
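Why the base matters for an option such as gid= can be seen with the standard `strtoul()` function. This is a userspace sketch only: `parse_id()` is a hypothetical helper and does not reproduce the debugfs option parser.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parses an option value in the given base; returns 0 or -EINVAL. */
static int parse_id(const char *text, int base, unsigned long *out)
{
	char *end;

	assert(text != NULL && out != NULL);

	errno = 0;
	*out = strtoul(text, &end, base);

	/* Reject overflow, empty input and trailing characters. */
	if (errno != 0 || end == text || *end != '\0')
		return -EINVAL;
	return 0;
}
```

Parsing the text "1000" in base 10 gives 1000, whereas base 8 gives 512, so a group id written in decimal would silently map to a different group if it were read as octal.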
* | | fs/exec.c: work around icc miscompilation (Xi Wang, 2013-01-11, 1 file, -1/+2)
| | | The tricky problem is this check:
| | |
| | |     if (i++ >= max)
| | |
| | | icc (mis)optimizes this check as:
| | |
| | |     if (++i > max)
| | |
| | | The check now becomes a no-op since max is MAX_ARG_STRINGS (0x7FFFFFFF).
| | |
| | | This is "allowed" by the C standard, assuming i++ never overflows, because signed integer overflow is undefined behavior. This optimization effectively reverts the previous commit 362e6663ef23 ("exec.c, compat.c: fix count(), compat_count() bounds checking") that tries to fix the check.
| | |
| | | This patch simply moves ++ after the check.
| | |
| | | Signed-off-by: Xi Wang <xi.wang@gmail.com>
| | | Cc: Jason Baron <jbaron@redhat.com>
| | | Cc: Al Viro <viro@zeniv.linux.org.uk>
| | | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
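The shape of the fix, comparing first and incrementing afterwards, can be sketched as a userspace counting loop. `count_strings()` is a hypothetical analogue written for this illustration and is not the kernel's count() function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Counts NULL-terminated strings, refusing more than max of them. */
static int count_strings(const char *const *argv, int max)
{
	int i = 0;

	assert(argv != NULL);

	for (;;) {
		if (argv[i] == NULL)
			break;

		/*
		 * Compare the current value, then increment in a separate
		 * statement, so the comparison cannot be rewritten in terms
		 * of an increment that is assumed never to overflow.
		 */
		if (i >= max)
			return -E2BIG;
		++i;
	}
	return i;
}
```

In this form the increment is only reached when `i < max`, so `i` never exceeds `max` and the addition itself cannot overflow.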