summaryrefslogtreecommitdiffstats
path: root/fs/xfs/linux-2.6
Commit message (Collapse)AuthorAgeFilesLines
* xfs: split xfs_sync_inodesChristoph Hellwig2009-06-084-26/+51
| | | | | | | | | | | | | | | | | | | | | | | xfs_sync_inodes is used to write back either file data or inode metadata. In general we always do these separately, except for one fishy case in xfs_fs_put_super that does both. So separate xfs_sync_inodes into separate xfs_sync_data and xfs_sync_attr functions. In xfs_fs_put_super we first call the data sync and then the attr sync as that was the previous order. The moved log force in that path doesn't make a difference because we will force the log again as part of the real unmount process. The filesystem readonly checks are not performed by the new function but instead moved into the callers, given that most callers alredy have it further up in the stack. Also add debug checks that we do not pass in incorrect flags in the new xfs_sync_data and xfs_sync_attr function and fix the one place that did pass in a wrong flag. Also remove a comment mentioning xfs_sync_inodes that has been incorrect for a while because we always take either the iolock or ilock in the sync path these days. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
* xfs: use generic inode iterator in xfs_qm_dqrele_all_inodesChristoph Hellwig2009-06-082-2/+8
| | | | | | | | | | Use xfs_inode_ag_iterator instead of opencoding the inode walk in the quota code. Mark xfs_inode_ag_iterator and xfs_sync_inode_valid non-static to allow using them from the quota code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
* xfs: introduce a per-ag inode iteratorDave Chinner2009-06-081-166/+150
| | | | | | | | | | | | | | | | | | | Given that we walk across the per-ag inode lists so often, it makes sense to introduce an iterator for this. Convert the sync and reclaim code to use this new iterator, quota code will follow in the next patch. Also change xfs_reclaim_inode to return -EGAIN instead of 1 for an inode already under reclaim. This simplifies the AG iterator and doesn't matter for the only other caller. [hch: merged the lookup and execute callbacks back into one to get the pag_ici_lock locking correct and simplify the code flow] Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
* xfs: remove unused parameter from xfs_reclaim_inodesDave Chinner2009-06-082-19/+5
| | | | | | | | | The noblock parameter of xfs_reclaim_inodes is only ever set to zero. Remove it and all the conditional code that is never executed. Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
* xfs: factor out inode validation for syncDave Chinner2009-06-081-22/+37
| | | | | | | | | Separate the validation of inodes found by the radix tree walk from the radix tree lookup. Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
* xfs: split inode flushing from xfs_sync_inodes_agChristoph Hellwig2009-06-081-17/+33
| | | | | | | | | | | In many cases we only want to sync inode metadata. Split out the inode flushing into a separate helper to prepare factoring the inode sync code. Based on a patch from Dave Chinner, but redone to keep the current behaviour exactly and leave changes to the flushing logic to another patch. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
* xfs: split inode data writeback from xfs_sync_inodes_agDave Chinner2009-06-081-20/+32
| | | | | | | | | | | In many cases we only want to sync inode data. Start spliting the inode sync into data sync and inode sync by factoring out the inode data flush. [hch: minor cleanups] Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
* xfs: kill xfs_qmopsChristoph Hellwig2009-06-083-22/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kill the quota ops function vector and replace it with direct calls or stubs in the CONFIG_XFS_QUOTA=n case. Make sure we check XFS_IS_QUOTA_RUNNING in the right spots. We can remove the number of those checks because the XFS_TRANS_DQ_DIRTY flag can't be set otherwise. This brings us back closer to the way this code worked in IRIX and earlier Linux versions, but we keep a lot of the more useful factoring of common code. Eventually we should also kill xfs_qm_bhv.c, but that's left for a later patch. Reduces the size of the source code by about 250 lines and the size of XFS module by about 1.5 kilobytes with quotas enabled: text data bss dec hex filename 615957 2960 3848 622765 980ad fs/xfs/xfs.o 617231 3152 3848 624231 98667 fs/xfs/xfs.o.old Fallout: - xfs_qm_dqattach is split into xfs_qm_dqattach_locked which expects the inode locked and xfs_qm_dqattach which does the locking around it, thus removing XFS_QMOPT_ILOCKED. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
* xfs: prevent deadlock in xfs_qm_shake()Felix Blyakher2009-06-011-1/+1
| | | | | | | | | | | | It's possible to recurse into filesystem from the memory allocation, which deadlocks in xfs_qm_shake(). Add check for __GFP_FS, and bail out if it is not set. Signed-off-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Hedi Berriche <hedi@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
* xfs: block callers of xfs_flush_inodes() correctlyDave Chinner2009-04-062-3/+10
| | | | | | | | | | | | | | | | | | | | | xfs_flush_inodes() currently uses a magic timeout to wait for some inodes to be flushed before returning. This isn't really reliable but used to be the best that could be done due to deadlock potential of waiting for the entire flush. Now the inode flush is safe to execute while we hold page and inode locks, we can wait for all the inodes to flush synchronously. Convert the wait mechanism to a completion to do this efficiently. This should remove all remaining spurious ENOSPC errors from the delayed allocation reservation path. This is extracted almost line for line from a larger patch from Mikulas Patocka. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: make inode flush at ENOSPC synchronousDave Chinner2009-04-063-27/+17
| | | | | | | | | | | | | | | | | | | When we are writing to a single file and hit ENOSPC, we trigger a background flush of the inode and try again. Because we hold page locks and the iolock, the flush won't proceed until after we release these locks. This occurs once we've given up and ENOSPC has been reported. Hence if this one is the only dirty inode in the system, we'll get an ENOSPC prematurely. To fix this, remove the async flush from the allocation routines and move it to the top of the write path where we can do a synchronous flush and retry the write again. Only retry once as a second ENOSPC indicates that we really are ENOSPC. This avoids a page cache deadlock when trying to do this flush synchronously in the allocation layer that was identified by Mikulas Patocka. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: use xfs_sync_inodes() for device flushingDave Chinner2009-04-063-28/+36
| | | | | | | | | | | | | | | | Currently xfs_device_flush calls sync_blockdev() which is a no-op for XFS as all it's metadata is held in a different address to the one sync_blockdev() works on. Call xfs_sync_inodes() instead to flush all the delayed allocation blocks out. To do this as efficiently as possible, do it via two passes - one to do an async flush of all the dirty blocks and a second to wait for all the IO to complete. This requires some modification to the xfs-sync_inodes_ag() flush code to do efficiently. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: prevent unwritten extent conversion from blocking I/O completionDave Chinner2009-04-063-17/+31
| | | | | | | | | | | | | | | Unwritten extent conversion can recurse back into the filesystem due to memory allocation. Memory reclaim requires I/O completions to be processed to allow the callers to make progress. If the I/O completion workqueue thread is doing the recursion, then we have a deadlock situation. Move unwritten extent completion into it's own workqueue so it doesn't block I/O completions for normal delayed allocation or overwrite data. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: pagecache usage optimizationHisashi Hifumi2009-03-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hi. I introduced "is_partially_uptodate" aops for XFS. A page can have multiple buffers and even if a page is not uptodate, some buffers can be uptodate on pagesize != blocksize environment. This aops checks that all buffers which correspond to a part of a file that we want to read are uptodate. If so, we do not have to issue actual read IO to HDD even if a page is not uptodate because the portion we want to read are uptodate. "block_is_partially_uptodate" function is already used by ext2/3/4. With the following patch random read/write mixed workloads or random read after random write workloads can be optimized and we can get performance improvement. I did a performance test using the sysbench. #sysbench --num-threads=4 --max-requests=100000 --test=fileio --file-num=1 \ --file-block-size=8K --file-total-size=1G --file-test-mode=rndrw \ --file-fsync-freq=0 --file-rw-ratio=0.5 run -2.6.29-rc6 Test execution summary: total time: 123.8645s total number of events: 100000 total time taken by event execution: 442.4994 per-request statistics: min: 0.0000s avg: 0.0044s max: 0.3387s approx. 95 percentile: 0.0118s -2.6.29-rc6-patched Test execution summary: total time: 108.0757s total number of events: 100000 total time taken by event execution: 417.7505 per-request statistics: min: 0.0000s avg: 0.0042s max: 0.3217s approx. 95 percentile: 0.0118s arch: ia64 pagesize: 16k blocksize: 4k Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com>
* xfs: kill ino64 mount optionChristoph Hellwig2009-03-292-24/+3
| | | | | | | | | | | The ino64 mount option adds a fixed offset to 32bit inode numbers to bring them into the 64bit range. There's no need for this kind of debug tool given that it's easy to produce real 64bit inode numbers for testing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com>
* xfs: kill mutex_t typedefChristoph Hellwig2009-03-292-26/+1
| | | | | | | | | People continue to complain about this for weird reasons, but there's really no point in keeping this typedef for a couple of users anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com>
* xfs: kill VN_BADChristoph Hellwig2009-03-161-8/+0
| | | | | | | Remove this rather pointless wrapper and use is_bad_inode directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
* xfs: kill vn_atime_* helpers.Christoph Hellwig2009-03-161-19/+0
| | | | | | | | Two out of three are unused already, and the third is better done open-coded with a comment describing what's going on here. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
* xfs: include header files for prototypesHannes Eder2009-03-061-0/+1
| | | | | | | | | | | | | | | | | | Fix this sparse warnings: fs/xfs/linux-2.6/xfs_ioctl.c:72:1: warning: symbol 'xfs_find_handle' was not declared. Should it be static? fs/xfs/linux-2.6/xfs_ioctl.c:249:1: warning: symbol 'xfs_open_by_handle' was not declared. Should it be static? fs/xfs/linux-2.6/xfs_ioctl.c:361:1: warning: symbol 'xfs_readlink_by_handle' was not declared. Should it be static? fs/xfs/linux-2.6/xfs_ioctl.c:496:1: warning: symbol 'xfs_attrmulti_attr_get' was not declared. Should it be static? fs/xfs/linux-2.6/xfs_ioctl.c:525:1: warning: symbol 'xfs_attrmulti_attr_set' was not declared. Should it be static? fs/xfs/linux-2.6/xfs_ioctl.c:555:1: warning: symbol 'xfs_attrmulti_attr_remove' was not declared. Should it be static? fs/xfs/linux-2.6/xfs_ioctl.c:657:1: warning: symbol 'xfs_ioc_space' was not declared. Should it be static? fs/xfs/linux-2.6/xfs_ioctl.c:1340:1: warning: symbol 'xfs_file_ioctl' was not declared. Should it be static? fs/xfs/support/debug.c:65:1: warning: symbol 'xfs_fs_vcmn_err' was not declared. Should it be static? fs/xfs/support/debug.c:112:1: warning: symbol 'xfs_hex_dump' was not declared. Should it be static? Signed-off-by: Hannes Eder <hannes@hanneseder.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>
* xfs: make symbols staticHannes Eder2009-03-061-3/+3
| | | | | | | | | | | | | | | | Instead of the keyword 'static' the macro 'STATIC' is used, so the symbols are still global with CONFIG_XFS_DEBUG. Fix this sparse warnings: fs/xfs/linux-2.6/xfs_super.c:638:1: warning: symbol 'xfs_blkdev_get' was not declared. Should it be static? fs/xfs/linux-2.6/xfs_super.c:655:1: warning: symbol 'xfs_blkdev_put' was not declared. Should it be static? fs/xfs/linux-2.6/xfs_super.c:876:1: warning: symbol 'xfsaild' was not declared. Should it be static? fs/xfs/xfs_bmap.c:6208:1: warning: symbol 'xfs_check_block' was not declared. Should it be static? fs/xfs/xfs_dir2_leaf.c:553:1: warning: symbol 'xfs_dir2_leaf_check' was not declared. Should it be static? Signed-off-by: Hannes Eder <hannes@hanneseder.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>
* xfs: only issues a cache flush on unmount if barriers are enabledChristoph Hellwig2009-03-043-8/+16
| | | | | | | | | | | Currently we unconditionally issue a flush from xfs_free_buftarg, but since 2.6.29-rc1 this gives a warning in the style of end_request: I/O error, dev vdb, sector 0 Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Felix Blyakher <felixb@sgi.com>
* Revert "[XFS] remove old vmap cache"Felix Blyakher2009-02-181-1/+74
| | | | | | | | | | This reverts commit d2859751cd0bf586941ffa7308635a293f943c17. This commit caused regression. We'll try to fix use of new vmap API for next release. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>
* Revert "[XFS] use scalable vmap API"Felix Blyakher2009-02-181-3/+3
| | | | | | | | | | This reverts commit 95f8e302c04c0b0c6de35ab399a5551605eeb006. This commit caused regression. We'll try to fix use of new vmap API for next release. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>
* xfs: get rid of indirections in the quotaops implementationChristoph Hellwig2009-02-095-71/+161
| | | | | | | | | Currently we call from the nicely abstracted linux quotaops into a ugly multiplexer just to split the calls out at the same boundary again. Rewrite the quota ops handling to remove that obfucation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
* xfs: merge xfs_mkdir into xfs_createChristoph Hellwig2009-02-091-21/+9
| | | | | | | | | xfs_create and xfs_mkdir only have minor differences, so merge both of them into a sigle function. While we're at it also make the error handling code more straight-forward. Signed-off-by: Christoph Hellwig <hch@lst.de> Dave Chinner <david@fromorbit.com>
* xfs: remove uchar_t/ushort_t/uint_t/ulong_t typesChristoph Hellwig2009-02-091-1/+1
| | | | | | | Just another set of types obsfucating the code, remove them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
* xfs: cleanup xfs_find_handleChristoph Hellwig2009-02-081-62/+44
| | | | | | | | Remove the superflous igrab by keeping a reference on the path/file all the time and clean up various bits of surrounding code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com>
* xfs: merge xfs_inode_flush into xfs_fs_write_inodeChristoph Hellwig2009-02-042-11/+37
| | | | | | | | | Splitting the task for a VFS-induced inode flush into two functions doesn't make any sense, so merge the two functions dealing with it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Reviewed-by: Dave Chinner <david@fromorbit.com>
* [XFS] Warn on transaction in flight on read-only remountFelix Blyakher2009-01-271-1/+5
| | | | | | | | | Till VFS can correctly support read-only remount without racing, use WARN_ON instead of BUG_ON on detecting transaction in flight after quiescing filesystem. Signed-off-by: Felix Blyakher <felixb@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: fix bad_features2 fixups for the root filesystemChristoph Hellwig2009-01-191-1/+16
| | | | | | | | | | | | | Currently the bad_features2 fixup and the alignment updates in the superblock are skipped if we mount a filesystem read-only. But for the root filesystem the typical case is to mount read-only first and only later remount writeable so we'll never perform this update at all. It's not a big problem but means the logs of people needing the fixup get spammed at every boot because they never happen on disk. Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
* xfs: use mnt_want_write in compat_attrmulti ioctlChristoph Hellwig2009-01-191-0/+9
| | | | | | | | | The compat version of the attrmulti ioctl needs to ask for and then later release write access to the mount just like the native version, otherwise we could potentially write to read-only mounts. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
* xfs: fix dentry aliasing issues in open_by_handleChristoph Hellwig2009-01-193-300/+195
| | | | | | | | | | | | | | | | | | Open by handle just grabs an inode by handle and then creates itself a dentry for it. While this works for regular files it is horribly broken for directories, where the VFS locking relies on the fact that there is only just one single dentry for a given inode, and that these are always connected to the root of the filesystem so that it's locking algorithms work (see Documentations/filesystems/Locking) Remove all the existing open by handle code and replace it with a small wrapper around the exportfs code which deals with all these issues. At the same time we also make the checks for a valid handle strict enough to reject all not perfectly well formed handles - given that we never hand out others that's okay and simplifies the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
* Merge branch 'master' of ↵Lachlan McIlroy2009-01-143-21/+4
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
| * filesystem freeze: remove XFS specific ioctl interfaces for freeze featureTakashi Sato2009-01-092-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It removes XFS specific ioctl interfaces and request codes for freeze feature. This patch has been supplied by David Chinner. Signed-off-by: Dave Chinner <dgc@sgi.com> Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com> Cc: Dave Chinner <david@fromorbit.com> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-ext4@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * filesystem freeze: add error handling of write_super_lockfs/unlockfsTakashi Sato2009-01-091-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, ext3 in mainline Linux doesn't have the freeze feature which suspends write requests. So, we cannot take a backup which keeps the filesystem's consistency with the storage device's features (snapshot and replication) while it is mounted. In many case, a commercial filesystem (e.g. VxFS) has the freeze feature and it would be used to get the consistent backup. If Linux's standard filesystem ext3 has the freeze feature, we can do it without a commercial filesystem. So I have implemented the ioctls of the freeze feature. I think we can take the consistent backup with the following steps. 1. Freeze the filesystem with the freeze ioctl. 2. Separate the replication volume or create the snapshot with the storage device's feature. 3. Unfreeze the filesystem with the unfreeze ioctl. 4. Take the backup from the separated replication volume or the snapshot. This patch: VFS: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that they can return an error. Rename write_super_lockfs and unlockfs of the super block operation freeze_fs and unfreeze_fs to avoid a confusion. ext3, ext4, xfs, gfs2, jfs: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that write_super_lockfs returns an error if needed, and unlockfs always returns 0. reiserfs: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that they always return 0 (success) to keep a current behavior. Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com> Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-ext4@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'for-linus' of git+ssh://git.melbourne.sgi.com/git/xfsLachlan McIlroy2009-01-092-6/+19
|\ \
| * | [XFS] Remove several unused typedefs.Eric Sandeen2009-01-091-2/+0
| | | | | | | | | | | | | | | | | | Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
| * | [XFS] pass XFS_IGET_BULKSTAT to xfs_iget for handle operationsChristoph Hellwig2009-01-091-4/+19
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFS clients or users of the handle ioctls can pass us arbitrary inode numbers through the exportfs interface. Make sure we use the XFS_IGET_BULKSTAT so that these don't cause shutdowns due to the corruption checks. Also translate the EINVAL we get back for invalid inode clusters into an ESTALE which is more appropinquate, and remove the useless check for a NULL inode on a successfull xfs_iget return. I have a testcase to reproduce this using the handle interface which I will submit to xfsqa. Reported-by: Mario Becroft <mb@gem.win.co.nz> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6Lachlan McIlroy2009-01-081-1/+1
|\ \ | |/
| * trivial: fix then -> than typos in comments and documentationFrederik Schwarzer2009-01-061-1/+1
| | | | | | | | | | | | | | - (better, more, bigger ...) then -> (...) than Signed-off-by: Frederik Schwarzer <schwarzerf@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | [XFS] use scalable vmap APINick Piggin2009-01-061-3/+3
| | | | | | | | | | | | | | | | | | | | Implement XFS's large buffer support with the new vmap APIs. See the vmap rewrite (db64fe02) for some numbers. The biggest improvement that comes from using the new APIs is avoiding the global KVA allocation lock on every call. Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* | [XFS] remove old vmap cacheNick Piggin2009-01-061-74/+1
|/ | | | | | | | | | | | | | | | | | | XFS's vmap batching simply defers a number (up to 64) of vunmaps, and keeps track of them in a list. To purge the batch, it just goes through the list and calls vunamp on each one. This is pretty poor: a global TLB flush is generally still performed on each vunmap, with the most expensive parts of the operation being the broadcast IPIs and locking involved in the SMP callouts, and the locking involved in the vmap management -- none of these are avoided by just batching up the calls. I'm actually surprised it ever made much difference. (Now that the lazy vmap allocator is upstream, this description is not quite right, but the vunmap batching still doesn't seem to do much) Rip all this logic out of XFS completely. I will improve vmap performance and scalability directly in subsequent patch. Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* [XFS] Fix merge failuresLachlan McIlroy2008-12-292-4/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: fs/xfs/linux-2.6/xfs_cred.h fs/xfs/linux-2.6/xfs_globals.h fs/xfs/linux-2.6/xfs_ioctl.c fs/xfs/xfs_vnodeops.h Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
| * CRED: Pass credentials through dentry_open()David Howells2008-11-141-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Pass credentials through dentry_open() so that the COW creds patch can have SELinux's flush_unauthorized_files() pass the appropriate creds back to itself when it opens its null chardev. The security_dentry_open() call also now takes a creds pointer, as does the dentry_open hook in struct security_operations. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>
| * CRED: Separate task security context from task_structDavid Howells2008-11-142-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Separate the task security context from task_struct. At this point, the security data is temporarily embedded in the task_struct with two pointers pointing to it. Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in entry.S via asm-offsets. With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
| * CRED: Wrap task credential accesses in the XFS filesystemDavid Howells2008-11-142-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: xfs@oss.sgi.com Signed-off-by: James Morris <jmorris@namei.org>
* | [XFS] Fix race in xfs_write() between direct and buffered I/O with DMAPILachlan McIlroy2008-12-241-15/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The iolock is dropped and re-acquired around the call to XFS_SEND_NAMESP(). While the iolock is released the file can become cached. We then 'goto retry' and - if we are doing direct I/O - mapping->nrpages may now be non zero but need_i_mutex will be zero and we will hit the WARN_ON(). Since we have dropped the I/O lock then the file size may have also changed so what we need to do here is 'goto start' like we do for the XFS_SEND_DATA() DMAPI event. We also need to update the filesize before releasing the iolock so that needs to be done before the XFS_SEND_NAMESP event. If we drop the iolock before setting the filesize we could race with a truncate. Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* | [XFS] Remove XFS_BUF_SHUT() and friendsLachlan McIlroy2008-12-221-4/+0
| | | | | | | | | | | | | | Code does nothing so remove it. Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* | [XFS] Use the incore inode size in xfs_file_readdir()Lachlan McIlroy2008-12-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | We should be using the incore inode size here not the linux inode size. The incore inode size is always up to date for directories whereas the linux inode size is not updated for directories. We've hit assertions in xfs_bmap() and traced it back to the linux inode size being zero but the incore size being correct. Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* | Merge branch 'master' of git+ssh://git.melbourne.sgi.com/git/xfsLachlan McIlroy2008-12-128-188/+70
|\ \
OpenPOWER on IntegriCloud