op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	xfs: remove block number from inode lookup code	Dave Chinner	2010-06-24	1	-8/+4
\| \| \| \| \| \| \| \| \| \|	The block number comes from bulkstat based inode lookups to shortcut the mapping calculations. We ar enot able to trust anything from bulkstat, so drop the block number as well so that the correct lookups and mappings are always done. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
*	xfs: rename XFS_IGET_BULKSTAT to XFS_IGET_UNTRUSTED	Dave Chinner	2010-06-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Inode numbers may come from somewhere external to the filesystem (e.g. file handles, bulkstat information) and so are inherently untrusted. Rename the flag we use for these lookups to make it obvious we are doing a lookup of an untrusted inode number and need to verify it completely before trying to read it from disk. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
*	xfs: always use iget in bulkstat	Christoph Hellwig	2010-06-23	1	-241/+40
\| \| \| \| \| \| \| \| \| \| \|	The non-coherent bulkstat versionsthat look directly at the inode buffers causes various problems with performance optimizations that make increased use of just logging inodes. This patch makes bulkstat always use iget, which should be fast enough for normal use with the radix-tree based inode cache introduced a while ago. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
*	xfs: return inode fork offset in bulkstat for fsr	Dave Chinner	2010-03-05	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	So that fsr can attempt to get the fork offset of the temporary inode it uses the same as the inode it is defragmenting, pass the fork offset out in the bulkstat information. The bulkstat structure has padding that has always been zeroed, so userspace can tell if this field is set or not by use of the xattr present flag and a non-zero value for the fork offset. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: replace KM_LARGE with explicit vmalloc use	Christoph Hellwig	2010-01-21	1	-3/+5
\| \| \| \| \| \| \| \| \|	We use the KM_LARGE flag to make kmem_alloc and friends use vmalloc if necessary. As we only need this for a few boot/mount time allocations just switch to explicit vmalloc calls there. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: Replace per-ag array with a radix tree	Dave Chinner	2010-01-15	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The use of an array for the per-ag structures requires reallocation of the array when growing the filesystem. This requires locking access to the array to avoid use after free situations, and the locking is difficult to get right. To avoid needing to reallocate an array, change the per-ag structures to an allocated object per ag and index them using a tree structure. The AGs are always densely indexed (hence the use of an array), but the number supported is 2^32 and lookups tend to be random and hence indexing needs to scale. A simple choice is a radix tree - it works well with this sort of index. This change also removes another large contiguous allocation from the mount/growfs path in XFS. The growing process now needs to change to only initialise the new AGs required for the extra space, and as such only needs to exclusively lock the tree for inserts. The rest of the code only needs to lock the tree while doing lookups, and hence this will remove all the deadlocks that currently occur on the m_perag_lock as it is now an innermost lock. The lock is also changed to a spinlock from a read/write lock as the hold time is now extremely short. To complete the picture, the per-ag structures will need to be reference counted to ensure that we don't free/modify them while they are still in use. This will be done in subsequent patch. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: implement ->dirty_inode to fix timestamp handling	Christoph Hellwig	2009-10-08	1	-8/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is picking up on Felix's repost of Dave's patch to implement a .dirty_inode method. We really need this notification because the VFS keeps writing directly into the inode structure instead of going through methods to update this state. In addition to the long-known atime issue we now also have a caller in VM code that updates c/mtime that way for shared writeable mmaps. And I found another one that no one has noticed in practice in the FIFO code. So implement ->dirty_inode to set i_update_core whenever the inode gets externally dirtied, and switch the c/mtime handling to the same scheme we already use for atime (always picking up the value from the Linux inode). Note that this patch also removes the xfs_synchronize_atime call in xfs_reclaim it was superflous as we already synchronize the time when writing the inode via the log (xfs_inode_item_format) or the normal buffers (xfs_iflush_int). In addition also remove the I_CLEAR check before copying the Linux timestamps - now that we always have the Linux inode available we can always use the timestamps in it. Also switch to just using file_update_time for regular reads/writes - that will get us all optimization done to it for free and make sure we notice early when it breaks. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: rationalize xfs_inobt_lookup*	Christoph Hellwig	2009-09-01	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currenly we have a xfs_inobt_lookup* variant for each comparism direction, and all these get all three fields of the inobt records passed, while the common case is just looking for the inode number and we have only marginally more callers than xfs_inobt_lookup* variants. So opencode a direct call to xfs_btree_lookup for the single case where we need all fields, and replace xfs_inobt_lookup* with a xfs_inobt_looku that just takes the inode number and the direction for all other callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
*	xfs: improve xfs_inobt_get_rec prototype	Christoph Hellwig	2009-09-01	1	-40/+44
\| \| \| \| \| \| \| \| \| \| \|	Most callers of xfs_inobt_get_rec need to fill a xfs_inobt_rec_incore_t, and those who don't yet are fine with a xfs_inobt_rec_incore_t, instead of the three individual variables, too. So just change xfs_inobt_get_rec to write the output into a xfs_inobt_rec_incore_t directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
*	xfs: add more statics & drop some unused functions	Eric Sandeen	2009-08-31	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	A lot more functions could be made static, but they need forward declarations; this does some easy ones, and also found a few unused functions in the process. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>
*	xfs: fix various typos	Malcolm Parsons	2009-03-29	1	-1/+1
\| \| \| \| \|	Signed-off-by: Malcolm Parsons <malcolm.parsons@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
*	xfs: kill vn_atime_* helpers.	Christoph Hellwig	2009-03-16	1	-1/+6
\| \| \| \| \| \| \| \|	Two out of three are unused already, and the third is better done open-coded with a comment describing what's going on here. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
*	[XFS] Remove the rest of the macro-to-function indirections.	Eric Sandeen	2009-01-16	1	-3/+3
\| \| \| \| \| \| \| \|	Remove the last of the macros-defined-to-static-functions. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] Fix xfs_bulkstat_one size checks & error handling	sandeen@sandeen.net	2008-12-02	1	-10/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 32-bit xfs_blkstat_one handler was failing because a size check checked whether the remaining (32-bit) user buffer was less than the (64-bit) bulkstat buffer, and failed with ENOMEM if so. Move this check into the respective handlers so that they check the correct sizes. Also, the formatters were returning negative errors or positive bytes copied; this was odd in the positive error value world of xfs, and handled wrong by at least some of the callers, which treated the bytes returned as an error value. Move the bytes-used assignment into the formatters. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] Make the bulkstat_one compat ioctl handling more sane	sandeen@sandeen.net	2008-12-02	1	-4/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the compat formatter was handled by passing in "private_data" for the xfs_bulkstat_one formatter, which was really just another formatter... IMHO this got confusing. Instead, just make a new xfs_bulkstat_one_compat formatter for xfs_bulkstat, and call it via a wrapper. Also, don't translate the ioctl nrs into their native counterparts, that just clouds the issue; we're in a compat handler anyway, just switch on the 32-bit cmds. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] kill the XFS_IMAP_BULKSTAT flag	Christoph Hellwig	2008-12-01	1	-1/+1
\| \| \| \| \| \| \| \| \|	Just pass down the XFS_IGET_* flags all the way down to xfs_imap instead of translating them mid-way. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
*	[XFS] embededd struct xfs_imap into xfs_inode	Christoph Hellwig	2008-12-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Most uses of struct xfs_imap are to map and inode to a buffer. To avoid copying around the inode location information we should just embedd a strcut xfs_imap into the xfs_inode. To make sure it doesn't bloat an inode the im_len is changed to a ushort, which is fine as that's what the users exepect anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
*	[XFS] kill XFS_DINODE_VERSION_ defines	Christoph Hellwig	2008-12-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	These names don't add any value at all over just using the numerical values. (First sent on October 9th) Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
*	[XFS] kill xfs_dinode_core_t	Christoph Hellwig	2008-12-01	1	-12/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have a separate xfs_icdinode_t for the in-core inode which gets logged there is no need anymore for the xfs_dinode vs xfs_dinode_core split - the fact that part of the structure gets logged through the inode log item and a small part not can better be described in a comment. All sizeof operations on the dinode_core either really wanted the icdinode and are switched to that one, or had already added the size of the agi unlinked list pointer. Later both will be replaced with helpers once we get the larger CRC-enabled dinode. Removing the data and attribute fork unions also has the advantage that xfs_dinode.h doesn't need to pull in every header under the sun. While we're at it also add some more comments describing the dinode structure. (First sent on October 7th) Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
*	[XFS] stop using xfs_itobp in xfs_bulkstat	Christoph Hellwig	2008-10-30	1	-13/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	xfs_bulkstat only wants the dinode, offset and buffer from a given inode number. Instead of using xfs_itobp on a fake inode which is complicated and currently leads to leaks of the security data just use xfs_inotobp which is designed to do exactly the kind of lookup xfs_bulkstat wants. The only thing that's missing in xfs_inotobp is a flags paramter that let's us pass down XFS_IMAP_BULKSTAT, but that can easily added. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32397a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>
*	[XFS] implement generic xfs_btree_increment	Christoph Hellwig	2008-10-30	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From: Dave Chinner <dgc@sgi.com> Because this is the first major generic btree routine this patch includes some infrastrucure, first a few routines to deal with a btree block that can be either in short or long form, second xfs_btree_read_buf_block, which is the new central routine to read a btree block given a cursor, and third the new xfs_btree_ptr_addr routine to calculate the address for a given btree pointer record. [hch: split out from bigger patch and minor adaptions] SGI-PV: 985583 SGI-Modid: xfs-linux-melb:xfs-kern:32190a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Bill O'Donnell <billodo@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>
*	[XFS] split up xfs_btree_init_cursor	Christoph Hellwig	2008-10-30	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	xfs_btree_init_cursor contains close to little shared code for the different btrees and will get even more non-common code in the future. Split it up into one routine per btree type. Because xfs_btree_dup_cursor needs to call the init routine for a generic btree cursor add a new btree operation vector that contains a dup_cursor method that initializes a new cursor based on an existing one. The btree operations vector is based on an idea and code from Dave Chinner and will grow more entries later during this series. SGI-PV: 985583 SGI-Modid: xfs-linux-melb:xfs-kern:32176a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Bill O'Donnell <billodo@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>
*	[XFS] Make use of the init-once slab optimisation.	David Chinner	2008-10-30	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To avoid having to initialise some fields of the XFS inode on every allocation, we can use the slab init-once feature to initialise them. All we have to guarantee is that when we free the inode, all it's entries are in the initial state. Add asserts where possible to ensure debug kernels check this initial state before freeing and after allocation. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31925a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
*	[XFS] remove some easy bhv_vnode_t instances	Christoph Hellwig	2008-08-13	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \|	In various places we can just move a VFS_I call into the argument list of called functions/macros instead of having a local bhv_vnode_t. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31776a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] Kill shouty XFS_ITOV() macro	David Chinner	2008-08-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Replace XFS_ITOV() with the new VFS_I() inline. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31724a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] Remove unused arg from kmem_free()	Denys Vlasenko	2008-07-28	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	kmem_free() function takes (ptr, size) arguments but doesn't actually use second one. This patch removes size argument from all callsites. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31050a Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] kill di_mode checks after xfs_iget	Christoph Hellwig	2008-04-29	1	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unless XFS_IGET_CREATE is passed xfs_iget will return ENOENT if it encounters an inode with di_mode == 0. Remove the duplicated checks in the callers. (the log recovery case is not touched for now) SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30898a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] xfs_bulkstat_one_dinode() never returns an error.	David Chinner	2008-04-18	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \|	Mark it void. SGI-PV: 980084 SGI-Modid: xfs-linux-melb:xfs-kern:30828a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] Don't block pdflush when writing back inodes	David Chinner	2008-04-18	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When pdflush is writing back inodes, it can get stuck on inode cluster buffers that are currently under I/O. This occurs when we write data to multiple inodes in the same inode cluster at the same time. Effectively, delayed allocation marks the inode dirty during the data writeback. Hence if the inode cluster was flushed during the writeback of the first inode, the writeback of the second inode will block waiting for the inode cluster write to complete before writing it again for the newly dirtied inode. Basically, we want to avoid this from happening so we don't block pdflush and slow down all of writeback. Hence we introduce a non-blocking async inode flush flag that pdflush uses. If this flag is set, we use non-blocking operations (e.g. try locks) whereever we can to avoid blocking or extra I/O being issued. SGI-PV: 970925 SGI-Modid: xfs-linux-melb:xfs-kern:30501a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] remove shouting-indirection macros from xfs_sb.h	Eric Sandeen	2008-04-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Remove macro-to-small-function indirection from xfs_sb.h, and remove some which are completely unused. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30528a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] Remove CFORK macros and use code directly in IFORK and DFORK macros.	Christoph Hellwig	2008-02-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently XFS_IFORK_* and XFS_DFORK* are implemented by means of XFS_CFORK* macros. But given that XFS_IFORK_* operates on an xfs_inode that embedds and xfs_icdinode_core and XFS_DFORK_* operates on an xfs_dinode that embedds a xfs_dinode_core one will have to do endian swapping while the other doesn't. Instead of having the current mess with the CFORK macros that have byteswapping and non-byteswapping version (which are inconsistantly named while we're at it) just define each family of the macros to stand by itself and simplify the whole matter. A few direct references to the CFORK variants were cleaned up to use IFORK or DFORK to make this possible. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30163a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] Remove the BPCSHIFT and NB* based macros from XFS.	Tim Shimmin	2008-02-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	The BPCSHIFT based macros, btoc, ctob, offtoc* and ctooff are either not used or don't need to be used. The NDPP, NDPP, NBBY macros don't need to be used but instead are replaced directly by PAGE_SIZE and PAGE_CACHE_SIZE where appropriate. Initial patch and motivation from Nicolas Kaiser. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30096a Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] Make xfs_bulkstat() to report unlinked but referenced inodes	Vlad Apostolov	2008-02-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need xfs_bulkstat() to report inode stat for inodes with link count zero but reference count non zero. The fix here: http://oss.sgi.com/archives/xfs/2007-09/msg00266.html changed this behavior and made xfs_bulkstat() to filter all unlinked inodes including those that are not destroyed yet but held by reference. The attached patch returns back to the original behavior by marking the on-disk inode buffer "dirty" when di_mode is cleared (at that time both inode link and reference counter are zero). SGI-PV: 972004 SGI-Modid: xfs-linux-melb:xfs-kern:29914a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
*	[XFS] 971064 Various fixups for xfs_bulkstat().	Lachlan McIlroy	2007-12-10	1	-14/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- sanity check for NULL user buffer in xfs_ioc_bulkstat[_compat]() - remove the special case for XFS_IOC_FSBULKSTAT with count == 1. This special case causes bulkstat to fail because the special case uses xfs_bulkstat_single() instead of xfs_bulkstat() and the two functions have different semantics. xfs_bulkstat() will return the next inode after the one supplied while skipping internal inodes (ie quota inodes). xfs_bulkstate_single() will only lookup the inode supplied and return an error if it is an internal inode. - in xfs_bulkstat(), need to initialise 'lastino' to the inode supplied so in cases were we return without examining any inodes the scan wont restart back at zero. - sanity check for valid *ubcountp values. Cannot sanity check for valid ubuffer here because some users of xfs_bulkstat() don't supply a buffer. - checks against 'ubleft' (the space left in the user's buffer) should be against 'statstruct_size' which is the supplied minimum object size. The mixture of checks against statstruct_size and 0 was one of the reasons we were skipping inodes. - if the formatter function returns BULKSTAT_RV_NOTHING and an error and the error is not ENOENT or EINVAL then we need to abort the scan. ENOENT is for inodes that are no longer valid and we just skip them. EINVAL is returned if we try to lookup an internal inode so we skip them too. For a DMF scan if the inode and DMF attribute cannot fit into the space left in the user's buffer it would return ERANGE. We didn't handle this error and skipped the inode. We would continue to skip inodes until one fitted into the user's buffer or we completed the scan. - put back the recalculation of agino (that got removed with the last fix) at the end of the while loop. This is because the code at the start of the loop expects agino to be the last inode examined if it is non-zero. - if we found some inodes but then encountered an error, return success this time and the error next time. If the formatter aborted with ENOMEM we will now return this error but only if we couldn't read any inodes. Previously if we encountered ENOMEM without reading any inodes we returned a zero count and no error which falsely indicated the scan was complete. SGI-PV: 973431 SGI-Modid: xfs-linux-melb:xfs-kern:30089a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com>
*	[XFS] This fix prevents bulkstat from spinning in an infinite loop.	Lachlan McIlroy	2007-10-16	1	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Here 'agino' increments through the inodes in an allocation group. At the end of the innermost 'for' loop it will hold the value of the next inode to look at (ie the first inode in the next cluster/chunk). Assigning 'lastino' to 'agino' resets it to the last inode in the last inode cluster we just looked at. This causes us to look up the very same cluster and examine all the inodes all over again, and again, and again... We also want to set 'lastino' for the cases when we're not interested in the inode so that the next call to bulkstat won't re-examine the same uninteresting inodes. SGI-PV: 971064 SGI-Modid: xfs-linux-melb:xfs-kern:29840a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] get_bulkall() could return incorrect inode state	Vlad Apostolov	2007-10-16	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the following scenario xfs_bulkstat() returns incorrect stale inode state: 1. File_A is created and its inode synced to disk. 2. File_A is unlinked and doesn't exist anymore. 3. Filesystem sync is invoked. 4. File_B is created. File_B happens to reclaim File_A's inode. 5. xfs_bulkstat() is called and detects File_B but reports the incorrect File_A inode state. Explanation for the incorrect inode state is that inodes are not immediately synced on file create for performance reasons. This leaves the on-disk inode buffer uninitialized (or with old state from a previous generation inode) and this is what xfs_bulkstat() would report. The patch marks the on-disk inode buffer "dirty" on unlink. When the inode is reclaimed (by a new file create), xfs_bulkstat() would filter this inode by the "dirty" mark. Once the inode is flushed to disk, the on-disk buffer "dirty" mark is automatically removed and a following xfs_bulkstat() would return the correct inode state. Marking the on-disk inode buffer "dirty" on unlink is achieved by setting the on-disk di_nlink field to 0. Note that the in-core di_nlink has already been set to 0 and a corresponding transaction logged by xfs_droplink(). This is an exception from the rule that any on-disk inode buffer changes has to be followed by a disk write (inode flush). Synchronizing the in-core to on-disk di_nlink values in advance (before the actual inode flush to disk) should be fine in this case because the inode is already unlinked and it would never change its di_nlink again for this inode generation. SGI-PV: 970842 SGI-Modid: xfs-linux-melb:xfs-kern:29757a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Mark Goodwin <markgw@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] dinode endianess annotations	Christoph Hellwig	2007-10-15	1	-28/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Biggest bit is duplicating the dinode structure so we have one annotated for native endianess and one for disk endianess. The other significant change is that xfs_xlate_dinode_core is split into one helper per direction to allow for proper annotations, everything else is trivial. As a sidenode splitting out the incore dinode means we can move it into xfs_inode.h in a later patch and severely improving on the include hell in xfs. SGI-PV: 968563 SGI-Modid: xfs-linux-melb:xfs-kern:29476a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] Fix XFS_IOC_FSBULKSTAT{,_SINGLE} & XFS_IOC_FSINUMBERS in compat mode	Michal Marek	2007-07-14	1	-8/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 32bit struct xfs_fsop_bulkreq has different size and layout of members, no matter the alignment. Move the code out of the #else branch (why was it there in the first place?). Define _32 variants of the ioctl constants. * 32bit struct xfs_bstat is different because of time_t and on i386 because of different padding. Make xfs_bulkstat_one() accept a custom "output formatter" in the private_data argument which takes care of the xfs_bulkstat_one_compat() that takes care of the different layout in the compat case. * i386 struct xfs_inogrp has different padding. Add a similar "output formatter" mecanism to xfs_inumbers(). SGI-PV: 967354 SGI-Modid: xfs-linux-melb:xfs-kern:29102a Signed-off-by: Michal Marek <mmarek@suse.cz> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	Fix occurrences of "the the "	Michael Opdenacker	2007-05-09	1	-1/+1
\| \| \| \| \|	Signed-off-by: Michael Opdenacker <michael@free-electrons.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
*	[XFS] 955947: Infinite loop in xfs_bulkstat() on formatter() error	Vlad Apostolov	2006-09-28	1	-0/+2
\| \| \| \| \| \| \| \|	SGI-PV: 955947 SGI-Modid: xfs-linux-melb:xfs-kern:26986a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] pv 956241, author: nathans, rv: vapo - make ino validation checks	Vlad Apostolov	2006-09-28	1	-10/+14
\| \| \| \| \| \| \| \| \| \|	consistent in bulkstat SGI-PV: 956241 SGI-Modid: xfs-linux-melb:xfs-kern:26984a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] Really fix use after free in xfs_iunpin.	David Chinner	2006-09-28	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous attempts to fix the linux inode use-after-free in xfs_iunpin simply made the problem harder to hit. We actually need complete exclusion between xfs_reclaim and xfs_iunpin, as well as ensuring that the i_flags are consistent during both of these functions. Introduce a new spinlock for exclusion and the i_flags, and fix up xfs_iunpin to use igrab before marking the inode dirty. SGI-PV: 952967 SGI-Modid: xfs-linux-melb:xfs-kern:26964a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] Fix kmem_zalloc_greedy warnings on 64 bit platforms.	Nathan Scott	2006-09-28	1	-1/+1
\| \| \| \| \| \| \| \|	SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26907a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] pv 955157, rv bnaujok - break the loop on EFAULT formatter() error	Vlad Apostolov	2006-09-28	1	-3/+3
\| \| \| \| \| \| \| \|	SGI-PV: 955157 SGI-Modid: xfs-linux-melb:xfs-kern:26869a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] pv 955157, rv bnaujok - break the loop on formatter() error	Vlad Apostolov	2006-09-28	1	-0/+5
\| \| \| \| \| \| \| \|	SGI-PV: 955157 SGI-Modid: xfs-linux-melb:xfs-kern:26866a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] Add a greedy allocation interface, allocating within a min/max size	Nathan Scott	2006-09-28	1	-14/+2
\| \| \| \| \| \| \| \| \| \|	range. SGI-PV: 955302 SGI-Modid: xfs-linux-melb:xfs-kern:26803a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] Remove last bulkstat false-positives with debug kernels.	Nathan Scott	2006-09-28	1	-1/+2
\| \| \| \| \| \| \| \|	SGI-PV: 953819 SGI-Modid: xfs-linux-melb:xfs-kern:26628a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] Increase the size of the buffer holding the local inode cluster	Nathan Scott	2006-09-28	1	-6/+16
\| \| \| \| \| \| \| \| \| \| \|	list, to increase our potential readahead window and in turn improve bulkstat performance. SGI-PV: 944409 SGI-Modid: xfs-linux-melb:xfs-kern:26607a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] Drop unneeded endian conversion in bulkstat and start readahead for	Nathan Scott	2006-09-28	1	-37/+31
\| \| \| \| \| \| \| \| \| \| \|	batches of inode cluster buffers at once, before any blocking reads are issued. SGI-PV: 944409 SGI-Modid: xfs-linux-melb:xfs-kern:26606a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
*	[XFS] Rework DMAPI bulkstat calls in such a way that we can directly	Nathan Scott	2006-09-28	1	-10/+55
\| \| \| \| \| \| \| \| \| \| \| \|	extract inline attributes out of the bulkstat buffer (for that case), rather than using an (extremely expensive for large icount filesystems) iget for fetching attrs. SGI-PV: 944409 SGI-Modid: xfs-linux-melb:xfs-kern:26602a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>