op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	[PATCH] zoned vm counters: conversion of nr_unstable to per zone counter	Christoph Lameter	2006-06-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conversion of nr_unstable to a per zone counter We need to do some special modifications to the nfs code since there are multiple cases of disposition and we need to have a page ref for proper accounting. This converts the last critical page state of the VM and therefore we need to remove several functions that were depending on GET_PAGE_STATE_LAST in order to make the kernel compile again. We are only left with event type counters in page state. [akpm@osdl.org: bugfixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] zoned vm counters: conversion of nr_dirty to per zone counter	Christoph Lameter	2006-06-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This makes nr_dirty a per zone counter. Looping over all processors is avoided during writeback state determination. The counter aggregation for nr_dirty had to be undone in the NFS layer since we summed up the page counts from multiple zones. Someone more familiar with NFS should probably review what I have done. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] Kill PF_SYNCWRITE flag	Jens Axboe	2006-06-23	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A process flag to indicate whether we are doing sync io is incredibly ugly. It also causes performance problems when one does a lot of async io and then proceeds to sync it. Part of the io will go out as async, and the other part as sync. This causes a disconnect between the previously submitted io and the synced io. For io schedulers such as CFQ, this will cause us lost merges and suboptimal behaviour in scheduling. Remove PF_SYNCWRITE completely from the fsync/msync paths, and let the O_DIRECT path just directly indicate that the writes are sync by using WRITE_SYNC instead. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] writeback: fix range handling	OGAWA Hirofumi	2006-06-23	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a writeback_control's `start' and `end' fields are used to indicate a one-byte-range starting at file offset zero, the required values of .start=0,.end=0 mean that the ->writepages() implementation has no way of telling that it is being asked to perform a range request. Because we're currently overloading (start == 0 && end == 0) to mean "this is not a write-a-range request". To make all this sane, the patch changes range of writeback_control. So caller does: If it is calling ->writepages() to write pages, it sets range (range_start/end or range_cyclic) always. And if range_cyclic is true, ->writepages() thinks the range is cyclic, otherwise it just uses range_start and range_end. This patch does, - Add LLONG_MAX, LLONG_MIN, ULLONG_MAX to include/linux/kernel.h -1 is usually ok for range_end (type is long long). But, if someone did, range_end += val; range_end is "val - 1" u64val = range_end >> bits; u64val is "~(0ULL)" or something, they are wrong. So, this adds LLONG_MAX to avoid nasty things, and uses LLONG_MAX for range_end. - All callers of ->writepages() sets range_start/end or range_cyclic. - Fix updates of ->writeback_index. It seems already bit strange. If it starts at 0 and ended by check of nr_to_write, this last index may reduce chance to scan end of file. So, this updates ->writeback_index only if range_cyclic is true or whole-file is scanned. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Nathan Scott <nathans@sgi.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Steven French <sfrench@us.ibm.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] Move cond_resched() after iput() in sync_sb_inodes()	OGAWA Hirofumi	2006-03-25	1	-1/+1
\| \| \| \| \| \| \| \|	In here, I think the following order is more cache-friendly. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] kernel-docs: fix kernel-doc format problems	Randy Dunlap	2005-11-07	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert to proper kernel-doc format. Some have extra blank lines (not allowed immed. after the function name) or need blank lines (after all parameters). Function summary must be only one line. Colon (":") in a function description does weird things (causes kernel-doc to think that it's a new section head sadly). Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] write_inode_now(): write inode if not BDI_CAP_NO_WRITEBACK	Andrew Morton	2005-11-07	1	-1/+1
\| \| \| \| \| \| \| \| \|	If the backing_dev_info doesn't have BDI_CAP_NO_WRITEBACK we're not supposed to write back an inode's pages. But in this situation write_inode_now() refuses to write the inode itself as well. Fix. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] fix __writeback_single_inode WARN_ON	Andrea Arcangeli	2005-10-31	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the inode count is zero in inode writeback, the WARN_ON(!(inode->i_state & I_WILL_FREE)); is broken, and needs to test for either I_WILL_FREE\|I_FREEING. When the inode is in I_FREEING state, it's already out of the visibility of the vm so it can't be freed so it doesn't require the __iget and the generic_delete_inode path can call the sync internally to the lowlevel fs callback during the last iput. So the inode being in I_FREEING is also a valid condition for calling the sync with i_count == 0. The specific stack trace is this: 0xc00000007b8fb6e0 0xc00000000010118c .__writeback_single_inode +0x5c 0xc00000007b8fb6e0 0xc0000000001014dc (lr) .sync_inode +0x3c 0xc00000007b8fb790 0xc0000000001014dc .sync_inode +0x3c 0xc00000007b8fb820 0xc0000000001a5020 .ext2_sync_inode +0x64 0xc00000007b8fb8f0 0xc0000000001a65b4 .ext2_truncate +0x3f8 0xc00000007b8fba40 0xc0000000001a6940 .ext2_delete_inode +0xdc 0xc00000007b8fbac0 0xc0000000000f7a5c .generic_delete_inode +0x124 0xc00000007b8fbb50 0xc0000000000f5fe0 .iput +0xb8 0xc00000007b8fbbe0 0xc0000000000e9fd4 .sys_unlink +0x2a8 0xc00000007b8fbd10 0xc00000000001048c .ret_from_syscall_1 +0x0 Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] fix nr_unused accounting, and avoid recursing in iput with ↵	Andrea Arcangeli	2005-10-30	1	-12/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I_WILL_FREE set list_move(&inode->i_list, &inode_in_use); } else { list_move(&inode->i_list, &inode_unused); + inodes_stat.nr_unused++; } } wake_up_inode(inode); Are you sure the above diff is correct? It was added somewhere between 2.6.5 and 2.6.8. I think it's wrong. The only way I can imagine the i_count to be zero in the above path, is that I_WILL_FREE is set. And if I_WILL_FREE is set, then we must not increase nr_unused. So I believe the above change is buggy and it will definitely overstate the number of unused inodes and it should be backed out. Note that __writeback_single_inode before calling __sync_single_inode, can drop the spinlock and we can have both the dirty and locked bitflags clear here: spin_unlock(&inode_lock); __wait_on_inode(inode); iput(inode); XXXXXXX spin_lock(&inode_lock); } use inode again here a construct like the above makes zero sense from a reference counting standpoint. Either we don't ever use the inode again after the iput, or the inode_lock should be taken _before_ executing the iput (i.e. a __iput would be required). Taking the inode_lock after iput means the iget was useless if we keep using the inode after the iput. So the only chance the 2.6 was safe to call __writeback_single_inode with the i_count == 0, is that I_WILL_FREE is set (I_WILL_FREE will prevent the VM to free the inode in XXXXX). Potentially calling the above iput with I_WILL_FREE was also wrong because it would recurse in iput_final (the second mainline bug). The below (untested) patch fixes the nr_unused accounting, avoids recursing in iput when I_WILL_FREE is set and makes sure (with the BUG_ON) that we don't corrupt memory and that all holders that don't set I_WILL_FREE, keeps a reference on the inode! Signed-off-by: Andrea Arcangeli <andrea@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] O(1) sb list traversing on syncs	Kirill Korotaev	2005-06-23	1	-37/+27
\| \| \| \| \| \| \| \| \| \| \|	This patch removes O(n^2) super block loops in sync_inodes(), sync_filesystems() etc. in favour of using __put_super_and_need_restart() which I introduced earlier. We faced a noticably long freezes on sb syncing when there are thousands of super blocks in the system. Signed-Off-By: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] DocBook: fix some descriptions	Martin Waitz	2005-05-01	1	-1/+3
\| \| \| \| \| \| \| \| \|	Some KernelDoc descriptions are updated to match the current code. No code changes. Signed-off-by: Martin Waitz <tali@admingilde.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	Linux-2.6.12-rc2v2.6.12-rc2	Linus Torvalds	2005-04-16	1	-0/+695
	Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!