summaryrefslogtreecommitdiffstats
path: root/fs/ioctl.c
Commit message (Collapse)AuthorAgeFilesLines
* vfs: cap dedupe request structure size at PAGE_SIZEDarrick J. Wong2016-09-151-0/+4
| | | | | | | | | | | | | | | | | | | | | Kirill A Shutemov reports that the kernel doesn't try to cap dest_count in any way, and uses the number to allocate kernel memory. This causes high order allocation warnings in the kernel log if someone passes in a big enough value. We should clamp the allocation at PAGE_SIZE to avoid stressing the VM. The two existing users of the dedupe ioctl never send more than 120 requests, so we can safely clamp dest_range at PAGE_SIZE, because with 4k pages we can handle up to 127 dedupe candidates. Given the max extent length of 16MB, we can end up doing 2GB of IO which is plenty. [ Note: the "offsetof()" can't overflow, because 'count' is just a 16-bit integer. That's not obvious in the limited context of the patch, so I'm noting it here because it made me go look. - Linus ] Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* vfs: fix return type of ioctl_file_dedupe_rangeDarrick J. Wong2016-09-151-1/+1
| | | | | | | | | | All the VFS functions in the dedupe ioctl path return int status, so the ioctl handler ought to as well. Found by Coverity, CID 1350952. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* vfs: ioctl: prevent double-fetch in dedupe ioctlScott Bauer2016-07-281-0/+1
| | | | | | | | | This prevents a double-fetch from user space that can lead to to an undersized allocation and heap overflow. Fixes: 54dbc1517237 ("vfs: hoist the btrfs deduplication ioctl to the vfs") Signed-off-by: Scott Bauer <sbauer@plzdonthack.me> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* wrappers for ->i_mutex accessAl Viro2016-01-221-2/+2
| | | | | | | | | | | parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge branch 'work.copy_file_range' of ↵Linus Torvalds2016-01-121-0/+67
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs copy_file_range updates from Al Viro: "Several series around copy_file_range/CLONE" * 'work.copy_file_range' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: btrfs: use new dedupe data function pointer vfs: hoist the btrfs deduplication ioctl to the vfs vfs: wire up compat ioctl for CLONE/CLONE_RANGE cifs: avoid unused variable and label nfsd: implement the NFSv4.2 CLONE operation nfsd: Pass filehandle to nfs4_preprocess_stateid_op() vfs: pull btrfs clone API to vfs layer locks: new locks_mandatory_area calling convention vfs: Add vfs_copy_file_range() support for pagecache copies btrfs: add .copy_file_range file operation x86: add sys_copy_file_range to syscall tables vfs: add copy_file_range syscall and vfs helper
| * vfs: hoist the btrfs deduplication ioctl to the vfsDarrick J. Wong2016-01-011-0/+38
| | | | | | | | | | | | | | | | Hoist the btrfs EXTENT_SAME ioctl up to the VFS and make the name more systematic (FIDEDUPERANGE). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * vfs: pull btrfs clone API to vfs layerChristoph Hellwig2015-12-071-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The btrfs clone ioctls are now adopted by other file systems, with NFS and CIFS already having support for them, and XFS being under active development. To avoid growth of various slightly incompatible implementations, add one to the VFS. Note that clones are different from file copies in several ways: - they are atomic vs other writers - they support whole file clones - they support 64-bit legth clones - they do not allow partial success (aka short writes) - clones are expected to be a fast metadata operation Because of that it would be rather cumbersome to try to piggyback them on top of the recent clone_file_range infrastructure. The converse isn't true and the clone_file_range system call could try clone file range as a first attempt to copy, something that further patches will enable. Based on earlier work from Peng Tao. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | compat_ioctl: don't pass fd around when not neededAl Viro2016-01-081-2/+2
|/ | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fsioctl.c: make generic_block_fiemap() signal-tolerantDmitry Monakhov2015-02-101-0/+5
| | | | | | | | | | | | | | | | | | | | __generic_block_fiemap may spin very long time for large sparse files. Without this patch an unprivileged user may abuse system resources simply by spawning a vast number of unkilable busyloops (works on ext2/ext3): truncate --size 1T test for ((i=0;i<1024;i++)) do filefrag test > /dev/null & done Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-3.19' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2014-12-161-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd updates from Bruce Fields: "A comparatively quieter cycle for nfsd this time, but still with two larger changes: - RPC server scalability improvements from Jeff Layton (using RCU instead of a spinlock to find idle threads). - server-side NFSv4.2 ALLOCATE/DEALLOCATE support from Anna Schumaker, enabling fallocate on new clients" * 'for-3.19' of git://linux-nfs.org/~bfields/linux: (32 commits) nfsd4: fix xdr4 count of server in fs_location4 nfsd4: fix xdr4 inclusion of escaped char sunrpc/cache: convert to use string_escape_str() sunrpc: only call test_bit once in svc_xprt_received fs: nfsd: Fix signedness bug in compare_blob sunrpc: add some tracepoints around enqueue and dequeue of svc_xprt sunrpc: convert to lockless lookup of queued server threads sunrpc: fix potential races in pool_stats collection sunrpc: add a rcu_head to svc_rqst and use kfree_rcu to free it sunrpc: require svc_create callers to pass in meaningful shutdown routine sunrpc: have svc_wake_up only deal with pool 0 sunrpc: convert sp_task_pending flag to use atomic bitops sunrpc: move rq_cachetype field to better optimize space sunrpc: move rq_splice_ok flag into rq_flags sunrpc: move rq_dropme flag into rq_flags sunrpc: move rq_usedeferral flag to rq_flags sunrpc: move rq_local field to rq_flags sunrpc: add a generic rq_flags field to svc_rqst and move rq_secure to it nfsd: minor off by one checks in __write_versions() sunrpc: release svc_pool_map reference when serv allocation fails ...
| * VFS: Rename do_fallocate() to vfs_fallocate()Anna Schumaker2014-11-071-1/+1
| | | | | | | | | | | | | | | | | | | | This function needs to be exported so it can be used by the NFSD module when responding to the new ALLOCATE and DEALLOCATE operations in NFS v4.2. Christoph Hellwig suggested renaming the function to stay consistent with how other vfs functions are named. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | fs: add freeze_super/thaw_super fs hooksBenjamin Marzinski2014-11-171-1/+5
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, freezing a filesystem involves calling freeze_super, which locks sb->s_umount and then calls the fs-specific freeze_fs hook. This makes it hard for gfs2 (and potentially other cluster filesystems) to use the vfs freezing code to do freezes on all the cluster nodes. In order to communicate that a freeze has been requested, and to make sure that only one node is trying to freeze at a time, gfs2 uses a glock (sd_freeze_gl). The problem is that there is no hook for gfs2 to acquire this lock before calling freeze_super. This means that two nodes can attempt to freeze the filesystem by both calling freeze_super, acquiring the sb->s_umount lock, and then attempting to grab the cluster glock sd_freeze_gl. Only one will succeed, and the other will be stuck in freeze_super, making it impossible to finish freezing the node. To solve this problem, this patch adds the freeze_super and thaw_super hooks. If a filesystem implements these hooks, they are called instead of the vfs freeze_super and thaw_super functions. This means that every filesystem that implements these hooks must call the vfs freeze_super and thaw_super functions itself within the hook function to make use of the vfs freezing code. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* file->f_op is never NULL...Al Viro2013-10-241-2/+2
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* new helper: file_inode(file)Al Viro2013-02-221-6/+6
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* switch simple cases of fget_light to fdgetAl Viro2012-09-261-16/+9
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fs: reduce the use of module.h wherever possiblePaul Gortmaker2012-02-281-1/+1
| | | | | | | | | | For files only using THIS_MODULE and/or EXPORT_SYMBOL, map them onto including export.h -- or if the file isn't even using those, then just delete the include. Fix up any implicit include dependencies that were being masked by module.h along the way. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* vfs: fix up ENOIOCTLCMD error handlingLinus Torvalds2012-01-051-1/+1
| | | | | | | | | | | | | | | | | | | | | We're doing some odd things there, which already messes up various users (see the net/socket.c code that this removes), and it was going to add yet more crud to the block layer because of the incorrect error code translation. ENOIOCTLCMD is not an error return that should be returned to user mode from the "ioctl()" system call, but it should *not* be translated as EINVAL ("Invalid argument"). It should be translated as ENOTTY ("Inappropriate ioctl for device"). That EINVAL confusion has apparently so permeated some code that the block layer actually checks for it, which is sad. We continue to do so for now, but add a big comment about how wrong that is, and we should remove it entirely eventually. In the meantime, this tries to keep the changes localized to just the EINVAL -> ENOTTY fix, and removing code that makes it harder to do the right thing. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* vfs: cleanup do_vfs_ioctl()Namhyung Kim2011-03-211-13/+8
| | | | | | | | | | Move declaration of 'inode' to beginning of the function. Since it is referenced directly or indirectly (in case of FIFREEZE/FITHAW/ FS_IOC_FIEMAP) it's not harmful IMHO. And remove unnecessary casts using 'argp' instead. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fs: make block fiemap mapping length at least blocksize longJosef Bacik2011-02-021-0/+7
| | | | | | | | | | | | | | Some filesystems don't deal well with being asked to map less than blocksize blocks (GFS2 for example). Since we are always mapping at least blocksize sections anyway, just make sure len is at least as big as a blocksize so we don't trip up any filesystems. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: fix address space warnings in ioctl_fiemap()Namhyung Kim2011-01-171-5/+5
| | | | | | | | | | | | | | | | | | | | | | The fi_extents_start field of struct fiemap_extent_info is a user pointer but was not marked as __user. This makes sparse emit following warnings: CHECK fs/ioctl.c fs/ioctl.c:114:26: warning: incorrect type in argument 1 (different address spaces) fs/ioctl.c:114:26: expected void [noderef] <asn:1>*dst fs/ioctl.c:114:26: got struct fiemap_extent *[assigned] dest fs/ioctl.c:202:14: warning: incorrect type in argument 1 (different address spaces) fs/ioctl.c:202:14: expected void const volatile [noderef] <asn:1>*<noident> fs/ioctl.c:202:14: got struct fiemap_extent *[assigned] fi_extents_start fs/ioctl.c:212:27: warning: incorrect type in argument 1 (different address spaces) fs/ioctl.c:212:27: expected void [noderef] <asn:1>*dst fs/ioctl.c:212:27: got char *<noident> Also add 'ufiemap' variable to eliminate unnecessary casts. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge branch 'for_linus' of ↵Linus Torvalds2010-11-191-39/+0
|\ | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Add EXT4_IOC_TRIM ioctl to handle batched discard fs: Do not dispatch FITRIM through separate super_operation ext4: ext4_fill_super shouldn't return 0 on corruption jbd2: fix /proc/fs/jbd2/<dev> when using an external journal ext4: missing unlock in ext4_clear_request_list() ext4: fix setting random pages PageUptodate
| * fs: Do not dispatch FITRIM through separate super_operationLukas Czerner2010-11-191-39/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | There was concern that FITRIM ioctl is not common enough to be included in core vfs ioctl, as Christoph Hellwig pointed out there's no real point in dispatching this out to a separate vector instead of just through ->ioctl. So this commit removes ioctl_fstrim() from vfs ioctl and trim_fs from super_operation structure. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* | BKL: remove extraneous #include <smp_lock.h>Arnd Bergmann2010-11-171-1/+0
|/ | | | | | | | | | The big kernel lock has been removed from all these files at some point, leaving only the #include. Remove this too as a cleanup. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: Add FITRIM ioctlLukas Czerner2010-10-271-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds an filesystem independent ioctl to allow implementation of file system batched discard support. I takes fstrim_range structure as an argument. fstrim_range is definec in the include/fs.h and its definition is as follows. struct fstrim_range { start; len; minlen; } start - first Byte to trim len - number of Bytes to trim from start minlen - minimum extent length to trim, free extents shorter than this number of Bytes will be ignored. This will be rounded up to fs block size. It is also possible to specify NULL as an argument. In this case the arguments will set itself as follows: start = 0; len = ULLONG_MAX; minlen = 0; So it will trim the whole file system at one run. After the FITRIM is done, the number of actually discarded Bytes is stored in fstrim_range.len to give the user better insight on how much storage space has been really released for wear-leveling. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* bkl: Remove locked .ioctl file operationArnd Bergmann2010-08-141-14/+4
| | | | | | | | | | The last user is gone, so we can safely remove this Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: John Kacur <jkacur@redhat.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
* Introduce freeze_super and thaw_super for the fsfreeze ioctlJosef Bacik2010-05-211-13/+2
| | | | | | | | | | | | | | | | | | | | | | | | Currently the way we do freezing is by passing sb>s_bdev to freeze_bdev and then letting it do all the work. But freezing is more of an fs thing, and doesn't really have much to do with the bdev at all, all the work gets done with the super. In btrfs we do not populate s_bdev, since we can have multiple bdev's for one fs and setting s_bdev makes removing devices from a pool kind of tricky. This means that freezing a btrfs filesystem fails, which causes us to corrupt with things like tux-on-ice which use the fsfreeze mechanism. So instead of populating sb->s_bdev with a random bdev in our pool, I've broken the actual fs freezing stuff into freeze_super and thaw_super. These just take the super_block that we're freezing and does the appropriate work. It's basically just copy and pasted from freeze_bdev. I've then converted freeze_bdev over to use the new super helpers. I've tested this with ext4 and btrfs and verified everything continues to work the same as before. The only new gotcha is multiple calls to the fsfreeze ioctl will return EBUSY if the fs is already frozen. I thought this was a better solution than adding a freeze counter to the super_block, but if everybody hates this idea I'm open to suggestions. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Cleanup generic block based fiemapJosef Bacik2010-04-231-39/+53
| | | | | | | | | | | This cleans up a few of the complaints of __generic_block_fiemap. I've fixed all the typing stuff, used inline functions instead of macros, gotten rid of a couple of variables, and made sure the size and block requests are all block aligned. It also fixes a problem where sometimes FIEMAP_EXTENT_LAST wasn't being set properly. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* __generic_block_fiemap(): fix for files bigger than 4GBMike Hommey2009-11-121-1/+1
| | | | | | | | | | | | | | | | | Because of an integer overflow on start_blk, various kind of wrong results would be returned by the generic_block_fiemap() handler, such as no extents when there is a 4GB+ hole at the beginning of the file, or wrong fe_logical when an extent starts after the first 4GB. Signed-off-by: Mike Hommey <mh@glandium.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Eric Sandeen <sandeen@sgi.com> Cc: Josef Bacik <jbacik@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* vfs: explicitly cast s_maxbytes in fiemap_check_rangesJeff Layton2009-09-241-4/+5
| | | | | | | | | | | | | | | | | If fiemap_check_ranges is passed a large enough value, then it's possible that the value would be cast to a signed value for comparison against s_maxbytes when we change it to loff_t. Make sure that doesn't happen by explicitly casting s_maxbytes to an unsigned value for the purposes of comparison. Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Robert Love <rlove@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mandeep Singh Baines <msb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fs: Add new pre-allocation ioctls to vfs for compatibility with legacy xfs ↵Ankit Jain2009-06-241-0/+35
| | | | | | | | | | | | | | | | ioctls This patch adds ioctls to vfs for compatibility with legacy XFS pre-allocation ioctls (XFS_IOC_*RESVP*). The implementation effectively invokes sys_fallocate for the new ioctls. Also handles the compat_ioctl case. Note: These legacy ioctls are also implemented by OCFS2. [AV: folded fixes from hch] Signed-off-by: Ankit Jain <me@ankitjain.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* No instance of ->bmap() needs BKLAl Viro2009-06-171-2/+0
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* vfs: Enable FS_IOC_FIEMAP and FIGETBSZ for all filetypesAneesh Kumar K.V2009-05-131-4/+10
| | | | | | | | The fiemap and get_blk_size ioctls should be enabled even for directories. So move it outisde file_ioctl. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* fiemap: fix problem with setting FIEMAP_EXTENT_LASTJosef Bacik2009-05-061-20/+55
| | | | | | | | | | | | | | | | | | | Fix a problem where the generic block based fiemap stuff would not properly set FIEMAP_EXTENT_LAST on the last extent. I've reworked things to keep track if we go past the EOF, and mark the last extent properly. The problem was reported by and tested by Eric Sandeen. Tested-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-btrfs@vger.kernel.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <Joel.Becker@oracle.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Rationalize fasync return valuesJonathan Corbet2009-03-161-1/+1
| | | | | | | | | | | | | | | | | | | | Most fasync implementations do something like: return fasync_helper(...); But fasync_helper() will return a positive value at times - a feature used in at least one place. Thus, a number of other drivers do: err = fasync_helper(...); if (err < 0) return err; return 0; In the interests of consistency and more concise code, it makes sense to map positive return values onto zero where ->fasync() is called. Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
* Move FASYNC bit handling to f_op->fasync()Jonathan Corbet2009-03-161-12/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Removing the BKL from FASYNC handling ran into the challenge of keeping the setting of the FASYNC bit in filp->f_flags atomic with regard to calls to the underlying fasync() function. Andi Kleen suggested moving the handling of that bit into fasync(); this patch does exactly that. As a result, we have a couple of internal API changes: fasync() must now manage the FASYNC bit, and it will be called without the BKL held. As it happens, every fasync() implementation in the kernel with one exception calls fasync_helper(). So, if we make fasync_helper() set the FASYNC bit, we can avoid making any changes to the other fasync() functions - as long as those functions, themselves, have proper locking. Most fasync() implementations do nothing but call fasync_helper() - which has its own lock - so they are easily verified as correct. The BKL had already been pushed down into the rest. The networking code has its own version of fasync_helper(), so that code has been augmented with explicit FASYNC bit handling. Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: David Miller <davem@davemloft.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
* Use f_lock to protect f_flagsJonathan Corbet2009-03-161-3/+4
| | | | | | | | | | | | Traditionally, changes to struct file->f_flags have been done under BKL protection, or with no protection at all. This patch causes all f_flags changes after file open/creation time to be done under protection of f_lock. This allows the removal of some BKL usage and fixes a number of longstanding (if microscopic) races. Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
* [CVE-2009-0029] System call wrappers part 15Heiko Carstens2009-01-141-1/+1
| | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* filesystem freeze: implement generic freeze featureTakashi Sato2009-01-091-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ioctls for the generic freeze feature are below. o Freeze the filesystem int ioctl(int fd, int FIFREEZE, arg) fd: The file descriptor of the mountpoint FIFREEZE: request code for the freeze arg: Ignored Return value: 0 if the operation succeeds. Otherwise, -1 o Unfreeze the filesystem int ioctl(int fd, int FITHAW, arg) fd: The file descriptor of the mountpoint FITHAW: request code for unfreeze arg: Ignored Return value: 0 if the operation succeeds. Otherwise, -1 Error number: If the filesystem has already been unfrozen, errno is set to EINVAL. [akpm@linux-foundation.org: fix CONFIG_BLOCK=n] Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com> Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-ext4@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* GFS2: Support for FIEMAP ioctlSteven Whitehouse2009-01-051-10/+34
| | | | | | | | | | | | | | | | | | This patch implements the FIEMAP ioctl for GFS2. We can use the generic code (aside from a lock order issue, solved as per Ted Tso's suggestion) for which I've introduced a new variant of the generic function. We also have one exception to deal with, namely stuffed files, so we do that "by hand", setting all the required flags. This has been tested with a modified (I could only find an old version) of Eric's test program, and appears to work correctly. This patch does not currently support FIEMAP of xattrs, but the plan is to add that feature at some future point. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Eric Sandeen <sandeen@redhat.com>
* Fix a race condition in FASYNC handlingJonathan Corbet2008-12-051-4/+8
| | | | | | | | | | | | | | | | | | Changeset a238b790d5f99c7832f9b73ac8847025815b85f7 (Call fasync() functions without the BKL) introduced a race which could leave file->f_flags in a state inconsistent with what the underlying driver/filesystem believes. Revert that change, and also fix the same races in ioctl_fioasync() and ioctl_fionbio(). This is a minimal, short-term fix; the real fix will not involve the BKL. Reported-by: Oleg Nesterov <oleg@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* provide generic_block_fiemap() only with BLOCK=yAdrian Bunk2008-10-121-0/+4
| | | | | | | | | | | | | | | | | | This fixes the following compile error with CONFIG_BLOCK=n caused by commit 68c9d702bb72f367f3b148963ec6cf5e07ff7f65 ("generic block based fiemap implementation"): CC fs/ioctl.o fs/ioctl.c: In function 'generic_block_fiemap': fs/ioctl.c:249: error: storage size of 'tmp' isn't known fs/ioctl.c:272: error: invalid application of 'sizeof' to incomplete type 'struct buffer_head' fs/ioctl.c:280: error: implicit declaration of function 'buffer_mapped' fs/ioctl.c:249: warning: unused variable 'tmp' make[2]: *** [fs/ioctl.o] Error 1 Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* generic block based fiemap implementationJosef Bacik2008-10-031-0/+118
| | | | | | | | | | | | | | | | | | Any block based fs (this patch includes ext3) just has to declare its own fiemap() function and then call this generic function with its own get_block_t. This works well for block based filesystems that will map multiple contiguous blocks at one time, but will work for filesystems that only map one block at a time, you will just end up with an "extent" for each block. One gotcha is this will not play nicely where there is hole+data after the EOF. This function will assume its hit the end of the data as soon as it hits a hole after the EOF, so if there is any data past that it will not pick that up. AFAIK no block based fs does this anyway, but its in the comments of the function anyway just in case. Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-fsdevel@vger.kernel.org
* vfs: vfs-level fiemap interfaceMark Fasheh2008-10-081-0/+155
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Basic vfs-level fiemap infrastructure, which sets up a new ->fiemap inode operation. Userspace can get extent information on a file via fiemap ioctl. As input, the fiemap ioctl takes a struct fiemap which includes an array of struct fiemap_extent (fm_extents). Size of the extent array is passed as fm_extent_count and number of extents returned will be written into fm_mapped_extents. Offset and length fields on the fiemap structure (fm_start, fm_length) describe a logical range which will be searched for extents. All extents returned will at least partially contain this range. The actual extent offsets and ranges returned will be unmodified from their offset and range on-disk. The fiemap ioctl returns '0' on success. On error, -1 is returned and errno is set. If errno is equal to EBADR, then fm_flags will contain those flags which were passed in which the kernel did not understand. On all other errors, the contents of fm_extents is undefined. As fiemap evolved, there have been many authors of the vfs patch. As far as I can tell, the list includes: Kalpak Shah <kalpak.shah@sun.com> Andreas Dilger <adilger@sun.com> Eric Sandeen <sandeen@redhat.com> Mark Fasheh <mfasheh@suse.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: linux-api@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org
* make vfs_ioctl() staticAdrian Bunk2008-04-291-2/+2
| | | | | | | | | Make the needlessly global vfs_ioctl() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fix up kerneldoc in fs/ioctl.c a little bitChristoph Hellwig2008-02-091-4/+4
| | | | | | | | | | | - remove non-standard in/out markers - use tabs for formatting Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Cc: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* VFS: factor out three helpers for FIBMAP/FIONBIO/FIOASYNC file ioctlsErez Zadok2008-02-071-54/+75
| | | | | | | | | | | Factor out file-specific ioctl code into smaller helper functions, away from file_ioctl(). This helps code readability and also reduces indentation inside case statements. Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* VFS: swap do_ioctl and vfs_ioctl namesErez Zadok2008-02-071-8/+20
| | | | | | | | | | | | | | | | | | | Rename old vfs_ioctl to do_ioctl, because the comment above it clearly indicates that it is an internal function not to be exported to modules; therefore it should have a more traditional do_XXX name. The new do_ioctl is exported in fs.h but not to modules. Rename the old do_ioctl to vfs_ioctl because the names vfs_XXX should preferably be reserved to callable VFS functions which modules may call, as many other vfs_XXX functions already do. Export the new vfs_ioctl to GPL modules so others can use it (including Unionfs and eCryptfs). Add DocBook for new vfs_ioctl. [akpm@linux-foundation.org: fix build] Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* VFS: apply coding standards to fs/ioctl.cErez Zadok2008-02-071-80/+84
| | | | | | | Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drop obsolete sys_ioctl exportChristoph Hellwig2007-07-161-8/+0
| | | | | | | | | | | | | | | sys_ioctl() was only exported for our first version of compat ioctl handling. Now that the whole compat ioctl handling mess is more or less sorted out there are no more modular users left and we can kill it. There's one exception and that's sparc64's solaris compat module, but sparc64 has it's own export predating the generic one by years for that which this patch leaves untouched. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* revert "vanishing ioctl handler debugging"Andrew Morton2007-07-161-11/+3
| | | | | | | | | Revert my do_ioctl() debugging patch: Paul fixed the bug. Cc: Paul Fulghum <paulkf@microgate.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OpenPOWER on IntegriCloud