summaryrefslogtreecommitdiffstats
path: root/fs/xfs/xfs_vnodeops.c
Commit message (Collapse)AuthorAgeFilesLines
* xfs: Check new inode size is OK before preallocatingDave Chinner2010-05-281-1/+1
| | | | | | | | | | | | | | | | | | | The new xfsqa test 228 tries to preallocate more space than the filesystem contains. it should fail, but instead triggers an assert about lock flags. The failure is due to the size extension failing in vmtruncate() due to rlimit being set. Check this before we start the preallocation to avoid allocating space that will never be used. Also the path through xfs_vn_allocate already holds the IO lock, so it should not be present in the lock flags when the setattr fails. Hence the assert needs to take this into account. This will prevent other such callers from hitting this incorrect ASSERT. (Fixed a reference to "newsize" to read "new_size". -Alex) Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: remove wrapper for the fsync file operationChristoph Hellwig2010-03-011-107/+0
| | | | | | | | | | | | | Currently the fsync file operation is divided into a low-level routine doing all the work and one that implements the Linux file operation and does minimal argument wrapping. This is a leftover from the days of the vnode operations layer and can be removed to simplify the code a bit, as well as preparing for the implementation of an optimized fdatasync which needs to look at the Linux inode state. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: optimize log flushing in xfs_fsyncChristoph Hellwig2010-02-121-2/+8
| | | | | | | | | | | If we have a pinned inode it must have a log item attached to it. Usually that log item will have ili_last_lsn already set, in which case we only need to flush the log up to that LSN instead of doing a full log force. This gives speedups of about 5% in some fsync heavy workloads. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: remove invalid barrier optimization from xfs_fsyncChristoph Hellwig2010-02-021-10/+2
| | | | | | | | | We always need to flush the disk write cache and can't skip it just because the no inode attributes have changed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
* xfs: cleanup up xfs_log_force calling conventionsChristoph Hellwig2010-01-211-3/+2
| | | | | | | | | | | | | | | | Remove the XFS_LOG_FORCE argument which was always set, and the XFS_LOG_URGE define, which was never used. Split xfs_log_force into a two helpers - xfs_log_force which forces the whole log, and xfs_log_force_lsn which forces up to the specified LSN. The underlying implementations already were entirely separate, as were the users. Also re-indent the new _xfs_log_force/_xfs_log_force which previously had a weird coding style. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: remove duplicate buffer flagsChristoph Hellwig2010-01-211-2/+2
| | | | | | | | | | | | | Currently we define aliases for the buffer flags in various namespaces, which only adds confusion. Remove all but the XBF_ flags to clean this up a bit. Note that we still abuse XFS_B_ASYNC/XBF_ASYNC for some non-buffer uses, but I'll clean that up later. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: convert DM ops to use unsigned char namesDave Chinner2010-01-201-2/+4
| | | | | | | | | dmops uses a signed char for it's namespace event. To be consistent with the rest of the code, convert them to unsigned char for the namespace string. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: fix timestamp handling in xfs_setattrChristoph Hellwig2010-01-101-54/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently have some rather odd code in xfs_setattr for updating the a/c/mtime timestamps: - first we do a non-transaction update if all three are updated together - second we implicitly update the ctime for various changes instead of relying on the ATTR_CTIME flag - third we set the timestamps to the current time instead of the arguments in the iattr structure in many cases. This patch makes sure we update it in a consistent way: - always transactional - ctime is only updated if ATTR_CTIME is set or we do a size update, which is a special case - always to the times passed in from the caller instead of the current time The only non-size caller of xfs_setattr that doesn't come from the VFS is updated to set ATTR_CTIME and pass in a valid ctime value. Reported-by: Eric Blake <ebb9@byu.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: event tracing supportChristoph Hellwig2009-12-141-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert the old xfs tracing support that could only be used with the out of tree kdb and xfsidbg patches to use the generic event tracer. To use it make sure CONFIG_EVENT_TRACING is enabled and then enable all xfs trace channels by: echo 1 > /sys/kernel/debug/tracing/events/xfs/enable or alternatively enable single events by just doing the same in one event subdirectory, e.g. echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable or set more complex filters, etc. In Documentation/trace/events.txt all this is desctribed in more detail. To reads the events do a cat /sys/kernel/debug/tracing/trace Compared to the last posting this patch converts the tracing mostly to the one tracepoint per callsite model that other users of the new tracing facility also employ. This allows a very fine-grained control of the tracing, a cleaner output of the traces and also enables the perf tool to use each tracepoint as a virtual performance counter, allowing us to e.g. count how often certain workloads git various spots in XFS. Take a look at http://lwn.net/Articles/346470/ for some examples. Also the btree tracing isn't included at all yet, as it will require additional core tracing features not in mainline yet, I plan to deliver it later. And the really nice thing about this patch is that it actually removes many lines of code while adding this nice functionality: fs/xfs/Makefile | 8 fs/xfs/linux-2.6/xfs_acl.c | 1 fs/xfs/linux-2.6/xfs_aops.c | 52 - fs/xfs/linux-2.6/xfs_aops.h | 2 fs/xfs/linux-2.6/xfs_buf.c | 117 +-- fs/xfs/linux-2.6/xfs_buf.h | 33 fs/xfs/linux-2.6/xfs_fs_subr.c | 3 fs/xfs/linux-2.6/xfs_ioctl.c | 1 fs/xfs/linux-2.6/xfs_ioctl32.c | 1 fs/xfs/linux-2.6/xfs_iops.c | 1 fs/xfs/linux-2.6/xfs_linux.h | 1 fs/xfs/linux-2.6/xfs_lrw.c | 87 -- fs/xfs/linux-2.6/xfs_lrw.h | 45 - fs/xfs/linux-2.6/xfs_super.c | 104 --- fs/xfs/linux-2.6/xfs_super.h | 7 fs/xfs/linux-2.6/xfs_sync.c | 1 fs/xfs/linux-2.6/xfs_trace.c | 75 ++ fs/xfs/linux-2.6/xfs_trace.h | 1369 +++++++++++++++++++++++++++++++++++++++++ fs/xfs/linux-2.6/xfs_vnode.h | 4 fs/xfs/quota/xfs_dquot.c | 110 --- fs/xfs/quota/xfs_dquot.h | 21 fs/xfs/quota/xfs_qm.c | 40 - fs/xfs/quota/xfs_qm_syscalls.c | 4 fs/xfs/support/ktrace.c | 323 --------- fs/xfs/support/ktrace.h | 85 -- fs/xfs/xfs.h | 16 fs/xfs/xfs_ag.h | 14 fs/xfs/xfs_alloc.c | 230 +----- fs/xfs/xfs_alloc.h | 27 fs/xfs/xfs_alloc_btree.c | 1 fs/xfs/xfs_attr.c | 107 --- fs/xfs/xfs_attr.h | 10 fs/xfs/xfs_attr_leaf.c | 14 fs/xfs/xfs_attr_sf.h | 40 - fs/xfs/xfs_bmap.c | 507 +++------------ fs/xfs/xfs_bmap.h | 49 - fs/xfs/xfs_bmap_btree.c | 6 fs/xfs/xfs_btree.c | 5 fs/xfs/xfs_btree_trace.h | 17 fs/xfs/xfs_buf_item.c | 87 -- fs/xfs/xfs_buf_item.h | 20 fs/xfs/xfs_da_btree.c | 3 fs/xfs/xfs_da_btree.h | 7 fs/xfs/xfs_dfrag.c | 2 fs/xfs/xfs_dir2.c | 8 fs/xfs/xfs_dir2_block.c | 20 fs/xfs/xfs_dir2_leaf.c | 21 fs/xfs/xfs_dir2_node.c | 27 fs/xfs/xfs_dir2_sf.c | 26 fs/xfs/xfs_dir2_trace.c | 216 ------ fs/xfs/xfs_dir2_trace.h | 72 -- fs/xfs/xfs_filestream.c | 8 fs/xfs/xfs_fsops.c | 2 fs/xfs/xfs_iget.c | 111 --- fs/xfs/xfs_inode.c | 67 -- fs/xfs/xfs_inode.h | 76 -- fs/xfs/xfs_inode_item.c | 5 fs/xfs/xfs_iomap.c | 85 -- fs/xfs/xfs_iomap.h | 8 fs/xfs/xfs_log.c | 181 +---- fs/xfs/xfs_log_priv.h | 20 fs/xfs/xfs_log_recover.c | 1 fs/xfs/xfs_mount.c | 2 fs/xfs/xfs_quota.h | 8 fs/xfs/xfs_rename.c | 1 fs/xfs/xfs_rtalloc.c | 1 fs/xfs/xfs_rw.c | 3 fs/xfs/xfs_trans.h | 47 + fs/xfs/xfs_trans_buf.c | 62 - fs/xfs/xfs_vnodeops.c | 8 70 files changed, 2151 insertions(+), 2592 deletions(-) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: simplify xfs_buf_get / xfs_buf_read interfacesChristoph Hellwig2009-12-111-3/+2
| | | | | | | | | | | | | | Currently the low-level buffer cache interfaces are highly confusing as we have a _flags variant of each that does actually respect the flags, and one without _flags which has a flags argument that gets ignored and overriden with a default set. Given that very few places use the default arguments get rid of the duplication and convert all callers to pass the flags explicitly. Also remove the now confusing _flags postfix. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: fix mmap_sem/iolock inversion in xfs_free_eofblocksChristoph Hellwig2009-12-111-8/+26
| | | | | | | | | | | | | | When xfs_free_eofblocks is called from ->release the VM might already hold the mmap_sem, but in the write path we take the iolock before taking the mmap_sem in the generic write code. Switch xfs_free_eofblocks to only trylock the iolock if called from ->release and skip trimming the prellocated blocks in that case. We'll still free them later on the final iput. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: simplify inode teardownChristoph Hellwig2009-12-111-40/+0
| | | | | | | | | | | | | | | | | | | | | | | | | Currently the reclaim code for the case where we don't reclaim the final reclaim is overly complicated. We know that the inode is clean but instead of just directly reclaiming the clean inode we go through the whole process of marking the inode reclaimable just to directly reclaim it from the calling context. Besides being overly complicated this introduces a race where iget could recycle an inode between marked reclaimable and actually being reclaimed leading to panics. This patch gets rid of the existing reclaim path, and replaces it with a simple call to xfs_ireclaim if the inode was clean. While we're at it we also use the slightly more lax xfs_inode_clean check we'd use later to determine if we need to flush the inode here. Finally get rid of xfs_reclaim function and place the remaining small bits of reclaim code directly into xfs_fs_destroy_inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Patrick Schreurs <patrick@news-service.com> Reported-by: Tommy van Leeuwen <tommy@news-service.com> Tested-by: Patrick Schreurs <patrick@news-service.com> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: implement ->dirty_inode to fix timestamp handlingChristoph Hellwig2009-10-081-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is picking up on Felix's repost of Dave's patch to implement a .dirty_inode method. We really need this notification because the VFS keeps writing directly into the inode structure instead of going through methods to update this state. In addition to the long-known atime issue we now also have a caller in VM code that updates c/mtime that way for shared writeable mmaps. And I found another one that no one has noticed in practice in the FIFO code. So implement ->dirty_inode to set i_update_core whenever the inode gets externally dirtied, and switch the c/mtime handling to the same scheme we already use for atime (always picking up the value from the Linux inode). Note that this patch also removes the xfs_synchronize_atime call in xfs_reclaim it was superflous as we already synchronize the time when writing the inode via the log (xfs_inode_item_format) or the normal buffers (xfs_iflush_int). In addition also remove the I_CLEAR check before copying the Linux timestamps - now that we always have the Linux inode available we can always use the timestamps in it. Also switch to just using file_update_time for regular reads/writes - that will get us all optimization done to it for free and make sure we notice early when it breaks. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: use correct log reservation when handling ENOSPC in xfs_createChristoph Hellwig2009-09-091-2/+2
| | | | | | | | | | We added the ENOSPC handling patch in xfs_create just after it got mered with xfs_mkdir. Change the log reservation to the variable for either the create or mkdir value so it does the right thing if get here for creating a directory. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: merge fsync and O_SYNC handlingChristoph Hellwig2009-09-011-8/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The guarantees for O_SYNC are exactly the same as the ones we need to make for an fsync call (and given that Linux O_SYNC is O_DSYNC the equivalent is fdadatasync, but we treat both the same in XFS), except with a range data writeout. Jan Kara has started unifying these two path for filesystems using the generic helpers, and I've started to look at XFS. The actual transaction commited by xfs_fsync and xfs_write_sync_logforce has a different transaction number, but actually is exactly the same. We'll only use the fsync transaction going forward. One major difference is that xfs_write_sync_logforce never issues a cache flush unless we commit a transaction causing that as a side-effect, which is an obvious bug in the O_SYNC handling. Second all the locking and i_update_size vs i_update_core changes from 978b7237123d007b9fa983af6e0e2fa8f97f9934 never made it to xfs_write_sync_logforce, so we add them back. To make xfs_fsync easily usable from the O_SYNC path, the filemap_fdatawait call is moved up to xfs_file_fsync, so that we don't wait on the whole file after we already waited for our portion in xfs_write. We'll also use a plain call to filemap_write_and_wait_range instead of the previous sync_page_rang which did it in two steps including an half-hearted inode write out that doesn't help us. Once we're done with this also remove the now useless i_update_size tracking. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
* xfs: add more statics & drop some unused functionsEric Sandeen2009-08-311-1/+1
| | | | | | | | | | A lot more functions could be made static, but they need forward declarations; this does some easy ones, and also found a few unused functions in the process. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>
* xfs: switch to NOFS allocation under i_lock in xfs_readlink_bmapChristoph Hellwig2009-08-121-1/+3
| | | | | | | | | | | | xfs_readlink_bmap is called with i_lock held, but i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
* xfs: use generic Posix ACL codeChristoph Hellwig2009-06-101-1/+14
| | | | | | | | | | | | This patch rips out the XFS ACL handling code and uses the generic fs/posix_acl.c code instead. The ondisk format is of course left unchanged. This also introduces the same ACL caching all other Linux filesystems do by adding pointers to the acl and default acl in struct xfs_inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
* xfs: kill xfs_qmopsChristoph Hellwig2009-06-081-48/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kill the quota ops function vector and replace it with direct calls or stubs in the CONFIG_XFS_QUOTA=n case. Make sure we check XFS_IS_QUOTA_RUNNING in the right spots. We can remove the number of those checks because the XFS_TRANS_DQ_DIRTY flag can't be set otherwise. This brings us back closer to the way this code worked in IRIX and earlier Linux versions, but we keep a lot of the more useful factoring of common code. Eventually we should also kill xfs_qm_bhv.c, but that's left for a later patch. Reduces the size of the source code by about 250 lines and the size of XFS module by about 1.5 kilobytes with quotas enabled: text data bss dec hex filename 615957 2960 3848 622765 980ad fs/xfs/xfs.o 617231 3152 3848 624231 98667 fs/xfs/xfs.o.old Fallout: - xfs_qm_dqattach is split into xfs_qm_dqattach_locked which expects the inode locked and xfs_qm_dqattach which does the locking around it, thus removing XFS_QMOPT_ILOCKED. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
* xfs: flush delayed allcoation blocks on ENOSPC in createDave Chinner2009-04-061-0/+7
| | | | | | | | | | | | | | If we are creating lots of small files, we can fail to get a reservation for inode create earlier than we should due to EOF preallocation done during delayed allocation reservation. Hence on the first reservation ENOSPC failure flush all the delayed allocation blocks out of the system and retry. This fixes the last commonly triggered spurious ENOSPC issue that has been reported. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: fix various typosMalcolm Parsons2009-03-291-1/+1
| | | | | Signed-off-by: Malcolm Parsons <malcolm.parsons@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: kill VN_BADChristoph Hellwig2009-03-161-2/+2
| | | | | | | Remove this rather pointless wrapper and use is_bad_inode directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
* xfs: merge xfs_mkdir into xfs_createChristoph Hellwig2009-02-091-268/+83
| | | | | | | | | xfs_create and xfs_mkdir only have minor differences, so merge both of them into a sigle function. While we're at it also make the error handling code more straight-forward. Signed-off-by: Christoph Hellwig <hch@lst.de> Dave Chinner <david@fromorbit.com>
* xfs: merge xfs_inode_flush into xfs_fs_write_inodeChristoph Hellwig2009-02-041-45/+0
| | | | | | | | | Splitting the task for a VFS-induced inode flush into two functions doesn't make any sense, so merge the two functions dealing with it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Reviewed-by: Dave Chinner <david@fromorbit.com>
* xfs: tiny cleanup for xfs_linkChristoph Hellwig2009-02-041-2/+4
| | | | | | | | | | The source and target inodes are guaranteed to never be the same by the VFS, so no need to check for that (and we would get into bad trouble later anyway if that were the case). Also clean up the error handling to use two gotos instead of nested conditions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com>
* [XFS] Remove the rest of the macro-to-function indirections.Eric Sandeen2009-01-161-10/+10
| | | | | | | | Remove the last of the macros-defined-to-static-functions. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* [XFS] use inode_change_ok for setattr permission checkingChristoph Hellwig2008-12-111-113/+36
| | | | | | | | | | | | | Instead of implementing our own checks use inode_change_ok to check for necessary permission in setattr. There is a slight change in behaviour as inode_change_ok doesn't allow i_mode updates to add the suid or sgid without superuser privilegues while the old XFS code just stripped away those bits from the file mode. (First sent on Semptember 29th) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* [XFS] Check return value of xfs_buf_get_noaddr()Lachlan McIlroy2008-12-051-0/+2
| | | | | | | | | We check the return value of all other calls to xfs_buf_get_noaddr(). Make sense to do it here too. Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
* move vn_iowait / vn_iowake into xfs_aops.cChristoph Hellwig2008-12-041-3/+4
| | | | | | | | | | The whole machinery to wait on I/O completion is related to the I/O path and should be there instead of in xfs_vnode.c. Also give the functions more descriptive names. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
* [XFS] remove i_gen from incore inodeDave Chinner2008-12-011-27/+2
| | | | | | | | | | | | | | | i_gen is incremented in directory operations when the directory is changed. It is never read or otherwise used so it should be removed to help reduce the size of the struct xfs_inode. The patch also removes a duplicate logging of the directory inode core. We only need to do this once per transaction so kill the one associated with the i_gen increment. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Niv Sardi <xaiki@sgi.com>
* [XFS] fix error inversion problems with data flushingDave Chinner2008-12-011-1/+1
| | | | | | | | | | | | | | | | | XFS gets the sign of the error wrong in several places when gathering the error from generic linux functions. These functions return negative error values, while the core XFS code returns positive error values. Hence when XFS inverts the error to be returned to the VFS, it can incorrectly invert a negative error and this error will be ignored by the syscall return. Fix all the problems related to calling filemap_* functions. Problem initially identified by Nick Piggin in xfs_fsync(). Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Niv Sardi <xaiki@sgi.com>
* [XFS] wire up ->open for directoriesChristoph Hellwig2008-12-011-22/+0
| | | | | | | | | | | | | | | | | | Currently there's no ->open method set for directories on XFS. That means we don't perform any check for opening too large directories without O_LARGEFILE, we don't check for shut down filesystems, and we don't actually do the readahead for the first block in the directory. Instead of just setting the directories open routine to xfs_file_open we merge the shutdown check directly into xfs_file_open and create a new xfs_dir_open that first calls xfs_file_open and then performs the readahead for block 0. (First sent on September 29th) Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
* [XFS] Fix double free of log ticketsDave Chinner2008-11-171-0/+6
| | | | | | | | | | | | | | | | | | | | | | | When an I/O error occurs during an intermediate commit on a rolling transaction, xfs_trans_commit() will free the transaction structure and the related ticket. However, the duplicate transaction that gets used as the transaction continues still contains a pointer to the ticket. Hence when the duplicate transaction is cancelled and freed, we free the ticket a second time. Add reference counting to the ticket so that we hold an extra reference to the ticket over the transaction commit. We drop the extra reference once we have checked that the transaction commit did not return an error, thus avoiding a double free on commit error. Credit to Nick Piggin for tripping over the problem. SGI-PV: 989741 Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* [XFS] remove restricted chown parameter from xfs linuxTim Shimmin2008-10-301-11/+2
| | | | | | | | | | | | | | | | | | | | On Linux all filesystems are supposed to be operating under Posix' restricted chown. Restricted chown means it restricts chown to the owner unless you have CAP_FOWNER. NOTE: that 2 files outside of fs/xfs have been modified too for this change. Reviewed-by: Dave Chinner <david@fromorbit.com> SGI-PV: 988919 SGI-Modid: xfs-linux-melb:xfs-kern:32413a Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* [XFS] kill sys_credChristoph Hellwig2008-10-301-4/+2
| | | | | | | | | | | | | | | capable_cred has been unused for a while so we can kill it and sys_cred. That also means the cred argument to xfs_setattr and xfs_change_file_space can be removed now. SGI-PV: 988918 SGI-Modid: xfs-linux-melb:xfs-kern:32412a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* [XFS] Trivial xfs_remove comment fixupChristoph Hellwig2008-10-301-2/+2
| | | | | | | | | | | | | | | The dp to ip comment should be for the unconditional xfs_droplink call, and the "." link obviously only exists for directories, so it should be in the is_dir conditional. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32374a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* [XFS] kill deleted inodes listDavid Chinner2008-10-301-11/+1
| | | | | | | | | | | | | Now that the deleted inodes list is unused, kill it. This also removes the i_reclaim list head from the xfs_inode, shrinking it by two pointers. SGI-PV: 988142 SGI-Modid: xfs-linux-melb:xfs-kern:32334a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
* [XFS] mark inodes for reclaim via a tag in the inode radix treeDavid Chinner2008-10-301-0/+1
| | | | | | | | | | | | | | Prepare for removing the deleted inode list by marking inodes for reclaim in the inode radix trees so that we can use the radix trees to find reclaimable inodes. SGI-PV: 988142 SGI-Modid: xfs-linux-melb:xfs-kern:32331a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
* [XFS] rename inode reclaim functionsDavid Chinner2008-10-301-1/+1
| | | | | | | | | | | | | | The function names xfs_finish_reclaim and xfs_finish_reclaim_all are not very descriptive of what they are reclaiming. Rename to xfs_reclaim_inode[s] to match the xfs_sync_inodes() function. SGI-PV: 988142 SGI-Modid: xfs-linux-melb:xfs-kern:32330a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
* [XFS] move inode reclaim functions to xfs_sync.cDavid Chinner2008-10-301-90/+0
| | | | | | | | | | | | | | Background inode reclaim is run by the xfssyncd. Move the reclaim worker functions to be close to the sync code as the are very similar in structure and are both run from the same background thread. SGI-PV: 988142 SGI-Modid: xfs-linux-melb:xfs-kern:32329a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
* [XFS] Combine the XFS and Linux inodesDavid Chinner2008-10-301-10/+3
| | | | | | | | | | | | | | | | | | | | | To avoid issues with different lifecycles of XFS and Linux inodes, embedd the linux inode inside the XFS inode. This means that the linux inode has the same lifecycle as the XFS inode, even when it has been released by the OS. XFS inodes don't live much longer than this (a short stint in reclaim at most), so there isn't significant memory usage penalties here. Version 3 o kill xfs_icount() Version 2 o remove unused commented out code from xfs_iget(). o kill useless cast in VFS_I() SGI-PV: 988141 SGI-Modid: xfs-linux-melb:xfs-kern:32323a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
* [XFS] Remove xfs_iflush_all and clean up xfs_finish_reclaim_all()David Chinner2008-10-301-24/+18
| | | | | | | | | | | | | | | | | | | | | | | | | xfs_iflush_all() walks the m_inodes list to find inodes that need reclaiming. We already have such a list - the m_del_inodes list. Replace xfs_iflush_all() with a call to xfs_finish_reclaim_all() and clean up xfs_finish_reclaim_all() to handle the different flush modes now needed. Originally based on a patch from Christoph Hellwig. Version 3 o rediff against new linux-2.6/xfs_sync.c code Version 2 o revert xfs_syncsub() inode reclaim behaviour back to original code o xfs_quiesce_fs() should use XFS_IFLUSH_DELWRI_ELSE_ASYNC, not XFS_IFLUSH_ASYNC, to prevent change of behaviour. SGI-PV: 988139 SGI-Modid: xfs-linux-melb:xfs-kern:32284a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
* [XFS] Don't do I/O beyond eof when unreserving spaceLachlan McIlroy2008-09-171-0/+18
| | | | | | | | | | | | | | | | | | When unreserving space with boundaries that are not block aligned we round up the start and round down the end boundaries and then use this function, xfs_zero_remaining_bytes(), to zero the parts of the blocks that got dropped during the rounding. The problem is we don't consider if these blocks are beyond eof. Worse still is if we encounter delayed allocations beyond eof we will try to use the magic delayed allocation block number as a real block number. If the file size is ever extended to expose these blocks then we'll go through xfs_zero_eof() to zero them anyway. SGI-PV: 983683 SGI-Modid: xfs-linux-melb:xfs-kern:32055a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
* [XFS] Prevent lockdep false positives when locking two inodes.David Chinner2008-09-171-0/+8
| | | | | | | | | | | | | | | | | | | | | If we call xfs_lock_two_inodes() to grab both the iolock and the ilock, then drop the ilocks on both inodes, then grab them again (as xfs_swap_extents() does) then lockdep will report a locking order problem. This is a false positive. To avoid this, disallow xfs_lock_two_inodes() fom locking both inode locks at once - force calers to make two separate calls. This means that nested dropping and regaining of the ilocks will retain the same lockdep subclass and so lockdep will not see anything wrong with this code. SGI-PV: 986238 SGI-Modid: xfs-linux-melb:xfs-kern:31999a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Peter Leckie <pleckie@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* CRED: Introduce credential access wrappersDavid Howells2008-08-141-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | The patches that are intended to introduce copy-on-write credentials for 2.6.28 require abstraction of access to some fields of the task structure, particularly for the case of one task accessing another's credentials where RCU will have to be observed. Introduced here are trivial no-op versions of the desired accessors for current and other tasks so that other subsystems can start to be converted over more easily. Wrappers are introduced into a new header (linux/cred.h) for UID/GID, EUID/EGID, SUID/SGID, FSUID/FSGID, cap_effective and current's subscribed user_struct. These wrappers are macros because the ordering between header files mitigates against making them inline functions. linux/cred.h is #included from linux/sched.h. Further, XFS is modified such that it no longer defines and uses parameterised versions of current_fs[ug]id(), thus getting rid of the namespace collision otherwise incurred. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
* [XFS] update timestamp in xfs_ialloc manuallyChristoph Hellwig2008-08-131-1/+0
| | | | | | | | | | | | | | | | | In xfs_ialloc we just want to set all timestamps to the current time. We don't need to mark the inode dirty like xfs_ichgtime does, and we don't need nor want the opimizations in xfs_ichgtime that I will introduce in the next patch. So just opencode the timestamp update in xfs_ialloc, and remove the new unused XFS_ICHGTIME_ACC case in xfs_ichgtime. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31825a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* [XFS] kill bhv_vnode_tChristoph Hellwig2008-08-131-1/+1
| | | | | | | | | | | | | | | All remaining bhv_vnode_t instance are in code that's more or less Linux specific. (Well, for xfs_acl.c that could be argued, but that code is on the removal list, too). So just do an s/bhv_vnode_t/struct inode/ over the whole tree. We can clean up variable naming and some useless helpers later. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31781a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* [XFS] remove some easy bhv_vnode_t instancesChristoph Hellwig2008-08-131-13/+8
| | | | | | | | | | | | In various places we can just move a VFS_I call into the argument list of called functions/macros instead of having a local bhv_vnode_t. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31776a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* [XFS] kill xfs_lock_dir_and_entryChristoph Hellwig2008-08-131-121/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | When multiple inodes are locked in XFS it happens in order of the inode number, with the everything but the first inode trylocked if any of the previous inodes is in the AIL. Except for the sorting of the inodes this logic is implemented in xfs_lock_inodes, but also partially duplicated in xfs_lock_dir_and_entry in a particularly stupid way adds a lock roundtrip if the inode ordering is not optimal. This patch adds a new helper xfs_lock_two_inodes that takes two inodes and locks them in the most optimal way according to the above locking protocol and uses it for all places that want to lock two inodes. The only caller of xfs_lock_inodes is xfs_rename which might lock up to four inodes. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31772a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* [XFS] kill vn_to_inodeChristoph Hellwig2008-08-131-2/+2
| | | | | | | | | | | | bhv_vnode_t is just a typedef for struct inode, so there's no need for a helper to convert between the two. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31761a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
OpenPOWER on IntegriCloud