op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	ext4: don't use ext4_error in ext4_check_descriptors	Josef Bacik	2008-04-29	1	-16/+13
\| \| \| \| \| \| \| \| \| \| \| \|	Because ext4_check_descriptors is called at mount time you can't use ext4_error as it calls ext4_commit_sb, which since the sb isn't all the way initialized causes bad things to happen (ie a panic). This patch changes the ext4_error's to printk's to keep this problem from happening. Thanks much, Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: mark inode dirty after initializing the extent tree	Aneesh Kumar K.V	2008-04-29	1	-6/+7
\| \| \| \| \| \| \| \| \| \|	We should mark the inode dirty only after initializing the extent tree. Also if we fail during extent initialization we need to call DQUOT_FREE_INODE. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: update ctime and mtime for truncate with extents.	Solofo Ramangalahy	2008-04-29	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The recently announced "Linux POSIX file system test suite" caught a truncate issue when using extents: mtime and ctime are not updated when truncate is successful. This is the single issue caught with "default" ext4 (mkfs and mount with minimal options). The testsuite does not report failure with -o noextents. With the following patch, all tests of the testsuite pass. Signed-off-by: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Don't do GFP_NOFS allocations after taking ext4_lock_group	Aneesh Kumar K.V	2008-04-29	1	-14/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can't do GFP_NOFS allocation after taking ext4_lock_group BUG: sleeping function called from invalid context at mm/slab.c:3054 in_atomic():1, irqs_disabled():0 1 lock held by vi/2426: #0: (&ei->i_data_sem){----}, at: [<c01cf665>] ext4_release_file+0x23/0x66 Pid: 2426, comm: vi Not tainted 2.6.25-rc7 #24 [<c011a3dc>] __might_sleep+0xbe/0xc5 [<c01620c9>] kmem_cache_alloc+0x22/0xa6 [<c01e382a>] ext4_mb_release_inode_pa+0x73/0x1b3 [<c01e6adf>] ext4_mb_discard_inode_preallocations+0x22d/0x2d4 [<c013000a>] ? param_set_ushort+0x32/0x39 [<c01ceba1>] ext4_discard_reservation+0x27/0x6a [<c01cf66c>] ext4_release_file+0x2a/0x66 [<c0165bd6>] __fput+0xae/0x155 [<c0165e46>] fput+0x17/0x19 [<c0163756>] filp_close+0x50/0x5a [<c01647c0>] sys_close+0x71/0xad [<c0104aba>] sysenter_past_esp+0x5f/0xa5 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: move headers out of include/linux	Christoph Hellwig	2008-04-29	27	-64/+61
\| \| \| \| \| \| \| \| \| \|	Move ext4 headers out of include/linux. This is just the trivial move, there's some more thing that could be done later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: fix wrong gfp type under transaction	Josef Bacik	2008-04-29	4	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes the allocations with GFP_KERNEL while under a transaction problems in ext4. This patch is the same as its ext3 counterpart, just switches these to GFP_NOFS. Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Fix hang on umount with quotas when journal is aborted	Jan Kara	2008-04-29	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	Call dquot_drop() from ext4_dquot_drop() even if we fail to start a transaction. Otherwise we never get to dropping references to quota structures from the inode and umount will hang indefinitely. Thanks to Payphone LIOU for spotting the problem. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> CC: Payphone LIOU <lioupayphone@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Fix update of mtime and ctime on rename	Jan Kara	2008-04-29	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	The patch below makes ext4 update mtime and ctime of the directory into which we move file even if the directory entry already exists. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	jdb2: replace remaining __FUNCTION__ occurrences	Harvey Harrison	2008-04-17	3	-12/+12
\| \| \| \| \| \| \| \| \|	__FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: replace remaining __FUNCTION__ occurrences	Harvey Harrison	2008-04-17	10	-97/+97
\| \| \| \| \| \| \| \| \|	__FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	jbd2: only create debugfs and stats entries if init is successful	Duane Griffin	2008-04-29	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \|	jbd2 debugfs and stats entries should only be created if cache initialisation is successful. At the moment they are being created unconditionally which will leave them dangling if cache (and hence module) initialisation fails. Signed-off-by: Duane Griffin <duaneg@dghda.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	jbd2: fix kernel-doc notation	Randy Dunlap	2008-04-17	2	-3/+5
\| \| \| \| \| \| \| \| \|	Fix kernel-doc notation in jbd2. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	jbd2: replace potentially false assertion with if block	Duane Griffin	2008-04-17	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an error occurs during jbd2 cache initialisation it is possible for the journal_head_cache to be NULL when jbd2_journal_destroy_journal_head_cache is called. Replace the J_ASSERT with an if block to handle the situation correctly. Note that even with this fix things will break badly if jbd2 is statically compiled in and cache initialisation fails. Signed-off-by: Duane Griffin <duaneg@dghda.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	jbd2: eliminate duplicated code in revocation table init/destroy functions	Duane Griffin	2008-04-17	1	-76/+51
\| \| \| \| \| \| \| \| \| \| \| \| \|	The revocation table initialisation/destruction code is repeated for each of the two revocation tables stored in the journal. Refactoring the duplicated code into functions is tidier, simplifies the logic in initialisation in particular, and slightly reduces the code size. There should not be any functional change. Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	jbd2: tidy up revoke cache initialisation and destruction	Duane Griffin	2008-04-28	1	-14/+22
\| \| \| \| \| \| \| \| \| \| \|	Make revocation cache destruction safe to call if initialisation fails partially or entirely. This allows it to be used to cleanup in the case of initialisation failure, simplifying the code slightly. Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
*	ext4: remove duplicate include of ext4_fs_i.h header file	Joe Perches	2008-04-29	1	-2/+0
\| \| \| \| \| \| \|	include/linux/ext4_fs_i.h is included in include/linux/ext_fs.h twice Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: make ext4_xattr_list() static	Mingming Cao	2008-04-17	2	-8/+3
\| \| \| \| \| \| \| \|	This patch makes the needlessly global ext4_xattr_list() static. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: remove extra define of ext4_new_blocks_old from mballoc.c	Mingming Cao	2008-04-17	1	-2/+0
\| \| \| \| \| \| \| \|	The function prototype of ext4_new_blocks_old() is defined in ext4_fs.h, so we don't need the extra function prototype in mballoc.c Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: check ext4_journal_get_write_access() errors	Akinobu Mita	2008-04-17	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \|	Check ext4_journal_get_write_access() errors. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Stephen Tweedie <sct@redhat.com> Cc: adilger@clusterfs.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mingming Cao <cmm@us.ibm.com>
*	ext4: use ext4_get_group_desc()	Akinobu Mita	2008-04-17	1	-17/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Use ext4_get_group_desc() in ext4_get_inode_block() instead of open coding the functionality. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Stephen Tweedie <sct@redhat.com> Cc: adilger@clusterfs.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mingming Cao <cmm@us.ibm.com>
*	ext4: use ext4_group_first_block_no()	Akinobu Mita	2008-04-17	2	-7/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Use ext4_group_first_block_no() and assign the return values to ext4_fsblk_t variables. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Stephen Tweedie <sct@redhat.com> Cc: adilger@clusterfs.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mingming Cao <cmm@us.ibm.com>
*	ext4: convert byte order of constant instead of variable	Marcin Slusarz	2008-04-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Convert byte order of constant instead of variable which can be done at compile time (vs run time). Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: le*_add_cpu conversion	Marcin Slusarz	2008-04-17	7	-38/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	replace all: little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) + expression_in_cpu_byteorder); with: leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder); generated with semantic patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-ext4@vger.kernel.org Cc: sct@redhat.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: adilger@clusterfs.com Cc: Mingming Cao <cmm@us.ibm.com>
*	ext4: Convert list_for_each_rcu() to list_for_each_entry_rcu()	Aneesh Kumar K.V	2008-04-17	1	-13/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The list_for_each_entry_rcu() primitive should be used instead of list_for_each_rcu(), as the former is easier to use and provides better type safety. http://groups.google.com/group/linux.kernel/browse_thread/thread/45749c83451cebeb/0633a65759ce7713?lnk=raot Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: reduce mballoc stack usage with noinline_for_stack	Eric Sandeen	2008-04-29	1	-16/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mballoc.c is a whole lot of static functions, which gcc seems to really like to inline. With the changes below, on x86, I can at least get from: 432 ext4_mb_new_blocks 240 ext4_mb_free_blocks 208 ext4_mb_discard_group_preallocations 188 ext4_mb_seq_groups_show 164 ext4_mb_init_cache 152 ext4_mb_release_inode_pa 136 ext4_mb_seq_history_show ... to 220 ext4_mb_free_blocks 188 ext4_mb_seq_groups_show 176 ext4_mb_regular_allocator 164 ext4_mb_init_cache 156 ext4_mb_new_blocks 152 ext4_mb_release_inode_pa 136 ext4_mb_seq_history_show 124 ext4_mb_release_group_pa ... which still has some big functions in there, but not 432 bytes! Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	Convert ext4 to use unlocked_ioctl	Andi Kleen	2008-04-29	4	-13/+6
\| \| \| \| \| \| \| \| \|	I checked ext4_ioctl and it looked largely safe to not be used without BKL. So convert it over to unlocked_ioctl. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
*	ext4: Cache the correct extent length for uninit extents	Aneesh Kumar K.V	2008-04-29	1	-5/+43
\| \| \| \| \| \| \| \| \| \| \| \| \|	When we convert an uninitialized extent to an initialized extent we need to make sure we return the number of blocks in the extent from the file system block corresponding to logical file block. Otherwise we cache wrong extent details and this results in file system corruption. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Return unwritten buffer head when trying to read from prealloc space.	Aneesh Kumar K.V	2008-04-29	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	ext4_ext_get_blocks() returns the number of blocks allocated with buffer head unmapped for a read from prealloc space. This is needed so that delayed allocation doesn't do block reservation for prealloc space since the blocks are already reserved on disk. Mark the buffer head unwritten. Some code paths try to read the block if the buffer_head is not new and no uptodate. Marking the buffer head unwritten avoids this reading. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: make ext4_ext_get_blocks always return <= max_blocks	Aneesh Kumar K.V	2008-04-29	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \|	ext4_ext_get_blocks() returns number of blocks allocated with buffer heads unmapped for a read from prealloc space. This is needed so that delayed allocation doesn't do block reservation for prealloc space since the blocks are already resevred on disk. Fix ext4_ext_get_blocks to not return greater than max_blocks, since some of the code paths cannot handle such a return value. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Fix fallocate to update the file size in each transaction	Aneesh Kumar K.V	2008-04-29	1	-51/+38
\| \| \| \| \| \| \| \| \| \| \| \|	ext4_fallocate needs to update file size in each transaction. Otherwise if we crash the file size won't be seen. We were also not marking the inode dirty after updating file size before. Also when we try to retry allocation due to ENOSPC, make sure we reset the variable ret so that we actually do a retry. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Fix race between migration and mmap write	Aneesh Kumar K.V	2008-04-29	3	-6/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fail migrate if we allocated new blocks via mmap write. If we write to holes in the file via mmap, we end up allocating new blocks. This block allocation happens without taking inode->i_mutex. Since migrate is protected by i_mutex and migrate expects that no new blocks get allocated during migrate, fail migrate if new blocks get allocated. We can't take inode->i_mutex in the mmap write path because that would result in a locking order violation between i_mutex and mmap_sem. Also adding a separate rw_sempahore for protection is really high overhead for a rare operation such as migrate. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: zero out small extents when writing to prealloc area.	Aneesh Kumar K.V	2008-04-17	1	-0/+63
\| \| \| \| \| \| \| \| \| \| \| \|	If the preallocated area is small zero out the full extent instead of splitting them. This should avoid the "write every alternate block" problem that could grow the number of extents dramatically. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: ENOSPC error handling for writing to an uninitialized extent	Aneesh Kumar K.V	2008-04-29	1	-7/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch handles possible ENOSPC errors when writing to an uninitialized extent in case the filesystem is full. A write to a prealloc area causes the split of an unititalized extent into initialized and uninitialized extents. If we don't have space to add new extent information, instead of returning error, convert the existing uninitialized extent to initialized one. We need to zero out the blocks corresponding to the entire extent to prevent uninitialized data reaching userspace. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	sparc: Export symbols for ZERO_PAGE usage in modules.	Aneesh Kumar K.V	2008-04-29	2	-0/+3
\| \| \| \| \| \| \| \| \| \|	ext4 uses ZERO_PAGE(0) to zero out blocks. We need to export different symbols in different arches for the usage of ZERO_PAGE in modules. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	m68k: Export empty_zero_page for ZERO_PAGE usage in modules.	Aneesh Kumar K.V	2008-04-29	1	-0/+1
\| \| \| \| \| \| \| \| \|	ext4 uses ZERO_PAGE(0) to zero out blocks. We need to export different symbols in different arches for the usage of ZERO_PAGE in modules. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	arm: Export empty_zero_page for ZERO_PAGE usage in modules.	Aneesh Kumar K.V	2008-04-29	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	ext4 uses ZERO_PAGE(0) to zero out blocks. We need to export different symbols in different arches for the usage of ZERO_PAGE in modules. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Enable extent format for symlinks.	Aneesh Kumar K.V	2008-04-29	2	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \|	This patch enables extent-formatted normal symlinks. Using extents format allows a symlink to refer to a block number larger than 2^32 on large filesystems. We still don't enable extent format for fast symlinks, which are contained in the inode itself. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Fix fallocate error path	Aneesh Kumar K.V	2008-04-17	1	-2/+24
\| \| \| \| \| \| \| \| \| \|	Put the old extent details back if we fail to split the uninitialized extent. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: fix mount option parsing	Josef Bacik	2008-04-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	The "resize" option won't be noticed as it comes after the NULL option, so if you try to mount (or in this case remount) with that option it won't be recognized. Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: fdatasync should skip metadata writeout when overwriting	Hisashi Hifumi	2008-04-17	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently fdatasync is identical to fsync in ext3. I think fdatasync should skip journal flush in data=ordered and data=writeback mode when it overwrites to already-instantiated blocks on HDD. When I_DIRTY_DATASYNC flag is not set, fdatasync should skip journal writeout because this indicates only atime or/and mtime updates. Following patch is the same approach of ext2's fsync code(ext2_sync_file). I did a performance test using the sysbench. #sysbench --num-threads=128 --max-requests=50000 --test=fileio --file-total-size=128G --file-test-mode=rndwr --file-fsync-mode=fdatasync run The result on ext3 was: -2.6.24 Operations performed: 0 Read, 50080 Write, 59600 Other = 109680 Total Read 0b Written 782.5Mb Total transferred 782.5Mb (12.116Mb/sec) 775.45 Requests/sec executed Test execution summary: total time: 64.5814s total number of events: 50080 total time taken by event execution: 3713.9836 per-request statistics: min: 0.0000s avg: 0.0742s max: 0.9375s approx. 95 percentile: 0.2901s Threads fairness: events (avg/stddev): 391.2500/23.26 execution time (avg/stddev): 29.0155/1.99 -2.6.24-patched Operations performed: 0 Read, 50009 Write, 61596 Other = 111605 Total Read 0b Written 781.39Mb Total transferred 781.39Mb (16.419Mb/sec) 1050.83 Requests/sec executed Test execution summary: total time: 47.5900s total number of events: 50009 total time taken by event execution: 2934.5768 per-request statistics: min: 0.0000s avg: 0.0587s max: 0.8938s approx. 95 percentile: 0.1993s Threads fairness: events (avg/stddev): 390.6953/22.64 execution time (avg/stddev): 22.9264/1.17 Filesystem I/O throughput was improved. Signed-off-by :Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Acked-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: check return of ext4_orphan_get properly	Josef Bacik	2008-04-29	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	This patch fix a panic while running fsfuzzer. We are improperly checking the return of ext4_orphan_get. Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	jbd2: fix possible journal overflow issues	Josef Bacik	2008-04-17	2	-3/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are several cases where the running transaction can get buffers added to its BJ_Metadata list which it never dirtied, which makes its t_nr_buffers counter end up larger than its t_outstanding_credits counter. This will cause issues when starting new transactions as while we are logging buffers we decrement t_outstanding_buffers, so when t_outstanding_buffers goes negative, we will report that we need less space in the journal than we actually need, so transactions will be started even though there may not be enough room for them. In the worst case scenario (which admittedly is almost impossible to reproduce) this will result in the journal running out of space. The fix is to only refile buffers from the committing transaction to the running transactions BJ_Modified list when b_modified is set on that journal, which is the only way to be sure if the running transaction has modified that buffer. This patch also fixes an accounting error in journal_forget, it is possible that we can call journal_forget on a buffer without having modified it, only gotten write access to it, so instead of freeing a credit, we only do so if the buffer was modified. The assert will help catch if this problem occurs. Without these two patches I could hit this assert within minutes of running postmark, with them this issue no longer arises. Cc: <linux-ext4@vger.kernel.org> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	jbd2: fix the way the b_modified flag is cleared	Josef Bacik	2008-04-17	2	-16/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently at the start of a journal commit we loop through all of the buffers on the committing transaction and clear the b_modified flag (the flag that is set when a transaction modifies the buffer) under the j_list_lock. The problem is that everywhere else this flag is modified only under the jbd2 lock buffer flag, so it will race with a running transaction who could potentially set it, and have it unset by the committing transaction. This is also a big waste, you can have several thousands of buffers that you are clearing the modified flag on when you may not need to. This patch removes this code and instead clears the b_modified flag upon entering do_get_write_access/journal_get_create_access, so if that transaction does indeed use the buffer then it will be accounted for properly, and if it does not then we know we didn't use it. That will be important for the next patch in this series. Tested thoroughly by myself using postmark/iozone/bonnie++. Cc: <linux-ext4@vger.kernel.org> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	Merge branch 'release' of ↵	Linus Torvalds	2008-04-29	10	-25/+61
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Provide ACPI fixup for /proc/cpuinfo/physical_id [IA64] Remove printk noise on unimplemented SAL_PHYSICAL_ID_INFO [IA64] allocate multiple contiguous pages via uncached allocator [IA64] bugfix: nptcg breaks cpu-hotadd
\| *	[IA64] Provide ACPI fixup for /proc/cpuinfo/physical_id	Alex Chiang	2008-04-29	4	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Legacy HP ia64 platforms currently cannot provide /proc/cpuinfo/physical_id due to legacy SAL/PAL implementations. However, that physical topology information can be obtained via ACPI. Provide an interface that gives ACPI one last chance to provide physical_id for these legacy platforms. This logic only comes into play iff: - ACPI actually provides slot information for the CPU - we lack a valid socket_id Otherwise, we don't do anything. Since x86 uses the ACPI processor driver as well, we provide a nop stub function for arch_fix_phys_package_id() in asm-x86/topology.h Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
\| *	[IA64] Remove printk noise on unimplemented SAL_PHYSICAL_ID_INFO	Alex Chiang	2008-04-29	1	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 113134fcbca83619be4c68d0ca66db6093777b5d changed the flow of control when calling PAL_LOGICAL_TO_PHYSICAL and SAL_PHYSICAL_ID_INFO. With the change, if a platform did not implement the latter, a useless printk would appear in the boot log: ia64_sal_pltid failed with -1 So let's check the return code and only printk on a true error, and do not print anything in the unimplemented case. While we're in there, clean up some stylistic issues too. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
\| *	[IA64] allocate multiple contiguous pages via uncached allocator	Dean Nelson	2008-04-29	4	-21/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enable the uncached allocator to allocate multiple pages of contiguous uncached memory. Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
\| *	[IA64] bugfix: nptcg breaks cpu-hotadd	Hidetoshi Seto	2008-04-29	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If "max_purges" from PAL is 0, it actually means 1. However it was not handled later when a hot-added cpu pass the max_purges from PAL. This makes systems easy to go BUG_ON(). Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
* \|	Merge branch 'for-linus' of ↵	Linus Torvalds	2008-04-29	1	-0/+155
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: Smack: Integrate Smack with Audit Security: Typecast CAP_*_SET macros Security: Make secctx_to_secid() take const secdata
\| * \|	Smack: Integrate Smack with Audit	Ahmed S. Darwish	2008-04-30	1	-0/+155
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Setup the new Audit hooks for Smack. SELinux Audit rule fields are recycled to avoid `auditd' userspace modifications. Currently only equality testing is supported on labels acting as a subject (AUDIT_SUBJ_USER) or as an object (AUDIT_OBJ_USER). Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com>