op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge branch 'for-linus' of ↵	Linus Torvalds	2007-04-27	15	-773/+695
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: selinux: preserve boolean values across policy reloads selinux: change numbering of boolean directory inodes in selinuxfs selinux: remove unused enumeration constant from selinuxfs selinux: explicitly number all selinuxfs inodes selinux: export initial SID contexts via selinuxfs selinux: remove userland security class and permission definitions SELinux: move security_skb_extlbl_sid() out of the security server MAINTAINERS: update selinux entry SELinux: rename selinux_netlabel.h to netlabel.h SELinux: extract the NetLabel SELinux support from the security server NetLabel: convert a BUG_ON in the CIPSO code to a runtime check NetLabel: cleanup and document CIPSO constants
\| *	selinux: preserve boolean values across policy reloads	Stephen Smalley	2007-04-26	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At present, the userland policy loading code has to go through contortions to preserve boolean values across policy reloads, and cannot do so atomically. As this is what we always want to do for reloads, let the kernel preserve them instead. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Karl MacMillan <kmacmillan@mentalrootkit.com> Signed-off-by: James Morris <jmorris@namei.org>
\| *	selinux: change numbering of boolean directory inodes in selinuxfs	James Carter	2007-04-26	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the numbering of the booleans directory inodes in selinuxfs to provide more room for new inodes without a conflict in inode numbers and to be consistent with how inode numbering is done in the initial_contexts directory. Signed-off-by: James Carter <jwcart2@tycho.nsa.gov> Acked-by: Eric Paris <eparis@parisplace.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
\| *	selinux: remove unused enumeration constant from selinuxfs	James Carter	2007-04-26	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the unused enumeration constant, SEL_AVC, from the sel_inos enumeration in selinuxfs. Signed-off-by: James Carter <jwcart2@tycho.nsa.gov> Acked-by: Eric Paris <eparis@parisplace.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
\| *	selinux: explicitly number all selinuxfs inodes	James Carter	2007-04-26	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Explicitly number all selinuxfs inodes to prevent a conflict between inodes numbered using last_ino when created with new_inode() and those labeled explicitly. Signed-off-by: James Carter <jwcart2@tycho.nsa.gov> Acked-by: Eric Paris <eparis@parisplace.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
\| *	selinux: export initial SID contexts via selinuxfs	James Carter	2007-04-26	3	-0/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the initial SID contexts accessible to userspace via selinuxfs. An initial use of this support will be to make the unlabeled context available to libselinux for use for invalidated userspace SIDs. Signed-off-by: James Carter <jwcart2@tycho.nsa.gov> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
\| *	selinux: remove userland security class and permission definitions	Stephen Smalley	2007-04-26	6	-314/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove userland security class and permission definitions from the kernel as the kernel only needs to use and validate its own class and permission definitions and userland definitions may change. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
\| *	SELinux: move security_skb_extlbl_sid() out of the security server	Paul Moore	2007-04-26	3	-35/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As suggested, move the security_skb_extlbl_sid() function out of the security server and into the SELinux hooks file. Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
\| *	MAINTAINERS: update selinux entry	Stephen Smalley	2007-04-26	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Add Eric Paris as an SELinux maintainer. Signed-off-by: James Morris <jmorris@namei.org>
\| *	SELinux: rename selinux_netlabel.h to netlabel.h	Paul Moore	2007-04-26	3	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the beginning I named the file selinux_netlabel.h to avoid potential namespace colisions. However, over time I have realized that there are several other similar cases of multiple header files with the same name so I'm changing the name to something which better fits with existing naming conventions. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
\| *	SELinux: extract the NetLabel SELinux support from the security server	Paul Moore	2007-04-26	6	-405/+481
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Up until this patch the functions which have provided NetLabel support to SELinux have been integrated into the SELinux security server, which for various reasons is not really ideal. This patch makes an effort to extract as much of the NetLabel support from the security server as possibile and move it into it's own file within the SELinux directory structure. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
\| *	NetLabel: convert a BUG_ON in the CIPSO code to a runtime check	Paul Moore	2007-04-26	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch changes a BUG_ON in the CIPSO code to a runtime check. It should also increase the readability of the code as it replaces an unexplained constant with a well defined macro. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
\| *	NetLabel: cleanup and document CIPSO constants	Paul Moore	2007-04-26	1	-8/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch collects all of the CIPSO constants and puts them in one place; it also documents each value explaining how the value is derived. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
* \|	make SysRq-T show all tasks again	Ingo Molnar	2007-04-27	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	show_state() (SysRq-T) developed the buggy habbit of not showing TASK_RUNNING tasks. This was due to the mistaken belief that state_filter == -1 would be a pass-through filter - while in reality it did not let TASK_RUNNING == 0 p->state values through. Fix this by restoring the original '!state_filter means all tasks' special-case i had in the original version. Test-built and test-booted on i686, SysRq-T now works as intended. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	seqlocks: trivial remove weird whitespace	Daniel Walker	2007-04-27	1	-4/+4
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Daniel Walker <dwalker@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	Merge branch 'for-linus' of git://git.infradead.org/ubi-2.6	Linus Torvalds	2007-04-27	30	-0/+12114
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'for-linus' of git://git.infradead.org/ubi-2.6: UBI: remove unused variable UBI: add me to MAINTAINERS JFFS2: add UBI support UBI: Unsorted Block Images
\| * \|	UBI: remove unused variable	Artem Bityutskiy	2007-04-27	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
\| * \|	UBI: add me to MAINTAINERS	Artem Bityutskiy	2007-04-27	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
\| * \|	JFFS2: add UBI support	Artem Bityutskiy	2007-04-27	3	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch make JFFS2 able to work with UBI volumes via the emulated MTD devices which are directly mapped to these volumes. Signed-off-by: Artem Bityutskiy <dedekind@infradead.org>
\| * \|	UBI: Unsorted Block Images	Artem B. Bityutskiy	2007-04-27	26	-0/+12065
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	UBI (Latin: "where?") manages multiple logical volumes on a single flash device, specifically supporting NAND flash devices. UBI provides a flexible partitioning concept which still allows for wear-levelling across the whole flash device. In a sense, UBI may be compared to the Logical Volume Manager (LVM). Whereas LVM maps logical sector numbers to physical HDD sector numbers, UBI maps logical eraseblocks to physical eraseblocks. More information may be found at http://www.linux-mtd.infradead.org/doc/ubi.html Partitioning/Re-partitioning An UBI volume occupies a certain number of erase blocks. This is limited by a configured maximum volume size, which could also be viewed as the partition size. Each individual UBI volume's size can be changed independently of the other UBI volumes, provided that the sum of all volume sizes doesn't exceed a certain limit. UBI supports dynamic volumes and static volumes. Static volumes are read-only and their contents are protected by CRC check sums. Bad eraseblocks handling UBI transparently handles bad eraseblocks. When a physical eraseblock becomes bad, it is substituted by a good physical eraseblock, and the user does not even notice this. Scrubbing On a NAND flash bit flips can occur on any write operation, sometimes also on read. If bit flips persist on the device, at first they can still be corrected by ECC, but once they accumulate, correction will become impossible. Thus it is best to actively scrub the affected eraseblock, by first copying it to a free eraseblock and then erasing the original. The UBI layer performs this type of scrubbing under the covers, transparently to the UBI volume users. Erase Counts UBI maintains an erase count header per eraseblock. This frees higher-level layers (like file systems) from doing this and allows for centralized erase count management instead. The erase counts are used by the wear-levelling algorithm in the UBI layer. The algorithm itself is exchangeable. Booting from NAND For booting directly from NAND flash the hardware must at least be capable of fetching and executing a small portion of the NAND flash. Some NAND flash controllers have this kind of support. They usually limit the window to a few kilobytes in erase block 0. This "initial program loader" (IPL) must then contain sufficient logic to load and execute the next boot phase. Due to bad eraseblocks, which may be randomly scattered over the flash device, it is problematic to store the "secondary program loader" (SPL) statically. Also, due to bit-flips it may become corrupted over time. UBI allows to solve this problem gracefully by storing the SPL in a small static UBI volume. UBI volumes vs. static partitions UBI volumes are still very similar to static MTD partitions: * both consist of eraseblocks (logical eraseblocks in case of UBI volumes, and physical eraseblocks in case of static partitions; * both support three basic operations - read, write, erase. But UBI volumes have the following advantages over traditional static MTD partitions: * there are no eraseblock wear-leveling constraints in case of UBI volumes, so the user should not care about this; * there are no bit-flips and bad eraseblocks in case of UBI volumes. So, UBI volumes may be considered as flash devices with relaxed restrictions. Where can it be found? Documentation, kernel code and applications can be found in the MTD gits. What are the applications for? The applications help to create binary flash images for two purposes: pfi files (partial flash images) for in-system update of UBI volumes, and plain binary images, with or without OOB data in case of NAND, for a manufacturing step. Furthermore some tools are/and will be created that allow flash content analysis after a system has crashed.. Who did UBI? The original ideas, where UBI is based on, were developed by Andreas Arnez, Frank Haverkamp and Thomas Gleixner. Josh W. Boyer and some others were involved too. The implementation of the kernel layer was done by Artem B. Bityutskiy. The user-space applications and tools were written by Oliver Lohmann with contributions from Frank Haverkamp, Andreas Arnez, and Artem. Joern Engel contributed a patch which modifies JFFS2 so that it can be run on a UBI volume. Thomas Gleixner did modifications to the NAND layer. Alexander Schmidt made some testing work as well as core functionality improvements. Signed-off-by: Artem B. Bityutskiy <dedekind@linutronix.de> Signed-off-by: Frank Haverkamp <haver@vnet.ibm.com>
* \|	Merge branch 'upstream-linus' of ↵	Linus Torvalds	2007-04-27	31	-2237/+4697
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (27 commits) ocfs2: Cache extent records ocfs2: Remember rw lock level during direct io ocfs2: Fix up i_blocks calculation to know about holes ocfs2: Fix extent lookup to return true size of holes ocfs2: Read from an unwritten extent returns zeros ocfs2: make room for unwritten extents flag ocfs2: Use own splice write actor ocfs2: Use do_sync_mapping_range() in ocfs2_zero_tail_for_truncate() [PATCH] Turn do_sync_file_range() into do_sync_mapping_range() ocfs2: zero tail of sparse files on truncate ocfs2: Teach ocfs2_get_block() about holes ocfs2: remove ocfs2_prepare_write() and ocfs2_commit_write() ocfs2: teach ocfs2_file_aio_write() about sparse files ocfs2: Turn off shared writeable mmap for local files systems with holes. ocfs2: abstract out allocation locking ocfs2: teach extend/truncate about sparse files ocfs2: temporarily remove extent map caching ocfs2: sparse b-tree support ocfs2: small cleanup of ocfs2_request_delete() ocfs2: remove unused code ...
\| * \|	ocfs2: Cache extent records	Mark Fasheh	2007-04-26	7	-0/+289
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The extent map code was ripped out earlier because of an inability to deal with holes. This patch adds back a simpler caching scheme requiring far less code. Our old extent map caching was designed back when meta data block caching in Ocfs2 didn't work very well, resulting in many disk reads. These days our metadata caching is much better, resulting in no un-necessary disk reads. As a result, extent caching doesn't have to be as fancy, nor does it have to cache as many extents. Keeping the last 3 extents seen should be sufficient to give us a small performance boost on some streaming workloads. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Remember rw lock level during direct io	Mark Fasheh	2007-04-26	3	-7/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Cluster locking might have been redone because a direct write won't complete, so this needs to be reflected in the iocb. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Fix up i_blocks calculation to know about holes	Mark Fasheh	2007-04-26	9	-30/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Older file systems which didn't support holes did a dumb calculation of i_blocks based on i_size. This is no longer accurate, so fix things up to take actual allocation into account. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Fix extent lookup to return true size of holes	Mark Fasheh	2007-04-26	5	-12/+109
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Initially, we had wired things to return a size '1' of holes. Cook up a small amount of code to find the next extent and calculate the number of clusters between the virtual offset and the next allocated extent. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Read from an unwritten extent returns zeros	Mark Fasheh	2007-04-26	10	-23/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Return an optional extent flags field from our lookup functions and wire up callers to treat unwritten regions as holes for the purpose of returning zeros to the user. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: make room for unwritten extents flag	Mark Fasheh	2007-04-26	6	-69/+151
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Due to the size of our group bitmaps, we'll never have a leaf node extent record with more than 16 bits worth of clusters. Split e_clusters up so that leaf nodes can get a flags field where we can mark unwritten extents. Interior nodes whose length references all the child nodes beneath it can't split their e_clusters field, so we use a union to preserve sizing there. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Use own splice write actor	Mark Fasheh	2007-04-26	3	-1/+162
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to fill holes during a splice write. Provide our own splice write actor which can call ocfs2_file_buffered_write() with a splice-specific callback. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Use do_sync_mapping_range() in ocfs2_zero_tail_for_truncate()	Mark Fasheh	2007-04-26	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do this instead of filemap_fdatawrite() - this way we sync only the range between i_size and the cluster boundary. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	[PATCH] Turn do_sync_file_range() into do_sync_mapping_range()	Mark Fasheh	2007-04-26	2	-7/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	do_sync_file_range() accepts a file * from which it takes an address_space to sync. Abstract out the bulk of the function into do_sync_mapping_range() which takes the address_space directly. This way callers who want to sync an address_space directly can take advantage of the functionality provided. do_sync_file_range() is preserved as a small wrapper around do_sync_mapping_range(). Ocfs2 in particular would like to use this to initiate a sync of a specific inode range during truncate, where a file * may not be available. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
\| * \|	ocfs2: zero tail of sparse files on truncate	Mark Fasheh	2007-04-26	7	-25/+328
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since we don't zero on extend anymore, truncate needs to be fixed up to zero the part of a file between i_size and and end of it's cluster. Otherwise a subsequent extend could expose bad data. This introduced a new helper, which can be used in ocfs2_write(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Teach ocfs2_get_block() about holes	Mark Fasheh	2007-04-26	1	-38/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ocfs2_get_block() didn't understand sparse files, fix that. Also remove some code that isn't really useful anymore. We can fix up ocfs2_direct_IO_get_blocks() at the same time. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: remove ocfs2_prepare_write() and ocfs2_commit_write()	Mark Fasheh	2007-04-26	1	-120/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These are no longer used, and can't handle file systems with sparse file allocation. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: teach ocfs2_file_aio_write() about sparse files	Mark Fasheh	2007-04-26	7	-57/+1076
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unfortunately, ocfs2 can no longer make use of generic_file_aio_write_nlock() because allocating writes will require zeroing of pages adjacent to the I/O for cluster sizes greater than page size. Implement a custom file write here, which can order page locks for zeroing. This also has the advantage that cluster locks can easily be ordered outside of the page locks. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Turn off shared writeable mmap for local files systems with holes.	Mark Fasheh	2007-04-26	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This will be turned back on once we can do allocation in ->page_mkwrite(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: abstract out allocation locking	Mark Fasheh	2007-04-26	1	-27/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right now, file allocation for ocfs2 is done within ocfs2_extend_file(), which is either called from ->setattr() (for an i_size change), or at the top of ocfs2_file_aio_write(). Inodes on file systems with sparse file support will want to do their allocation during the actual write call. In either case the cluster locking decisions are the same. We abstract out that code into a new function, ocfs2_lock_allocators() which will be used by a later patch to enable writing to sparse files. This also provides a nice cleanup of ocfs2_extend_allocation(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: teach extend/truncate about sparse files	Mark Fasheh	2007-04-26	3	-237/+320
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For ocfs2_truncate_file(), we eliminate the "simple" truncate case which no longer exists since i_size is not tied to i_clusters. In ocfs2_extend_file(), we skip the allocation / page zeroing code for file systems which understand sparse files. The core truncate code is changed to do a bottom up tree traversal. This gets abstracted out into it's own function. To make things more readable, most of the special case handling for in-inode extents from ocfs2_do_truncate() is also removed. Though write support for sparse files comes in a later patch, we at least update ocfs2_prepare_inode_for_write() to skip allocation for sparse files. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: temporarily remove extent map caching	Mark Fasheh	2007-04-26	14	-996/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The code in extent_map.c is not prepared to deal with a subtree being rotated between lookups. This can happen when filling holes in sparse files. Instead of a lengthy patch to update the code (which would likely lose the benefit of caching subtree roots), we remove most of the algorithms and implement a simple path based lookup. A less ambitious extent caching scheme will be added in a later patch. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: sparse b-tree support	Mark Fasheh	2007-04-26	8	-497/+2002
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce tree rotations into the b-tree code. This will allow ocfs2 to support sparse files. Much of the added code is designed to be generic (in the ocfs2 sense) so that it can later be re-used to implement large extended attributes. This patch only adds the rotation code and does minimal updates to callers of the extent api. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: small cleanup of ocfs2_request_delete()	Mark Fasheh	2007-04-26	1	-33/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are two checks in there (one for inode newness, one for other mounted nodes) which are unnecessary, so remove them. The DLM will allow the trylock in either case without any messaging overhead. Removing these makes ocfs2_request_delete() a one liner function, so just move the trylock out one level into ocfs2_query_inode_wipe(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: remove unused code	Tiger Yang	2007-04-26	4	-329/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove node messaging code that becomes unused with the delete inode vote removal. [Removed even more cruft which I spotted during review --Mark] Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Remove delete inode vote	Tiger Yang	2007-04-26	10	-38/+205
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ocfs2 currently does cluster-wide node messaging to check the open state of an inode during delete. This patch removes that mechanism in favor of an inode cluster lock which is taken at shared read when an inode is first read and dropped in clear_inode(). This allows a deleting node to test the liveness of an inode by attempting to take an exclusive lock. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: filter more error prints	Mark Fasheh	2007-04-26	2	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't want to print anything at all in ocfs2_lookup() when getting an error from ocfs2_iget() - it could be something as innocuous as a signal being detected in the dlm. ocfs2_permission() should filter on -ENOENT which ocfs2_meta_lock() can return if the inode was deleted on another node. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Replace panic() with emergency_restart() when fencing	Sunil Mushran	2007-04-26	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have noticed panic() hanging leading us to a situation in which the node, while otherwise dead, is still disk heartbeating. This leads to a hung cluster as the other nodes are waiting for this node to stop disk heartbeating. This situation is only resolved by power resetting the box. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Silence compiler warnings	Sunil Mushran	2007-04-26	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2: Local mounts should skip inode updates	Mark Fasheh	2007-04-26	1	-12/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't want the extent map and uptodate cache destruction in ocfs2_meta_lock_update() on a local mount, so skip that. This fixes several bugs with uptodate being cleared on buffers and extent maps being corrupted. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2_dlm: Call cond_resched_lock() once per hash bucket scan	Sunil Mushran	2007-04-26	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In dlm_migrate_all_locks(), we currently call cond_resched_lock() after processing each lockres in a hash bucket. Move it outside the loop so as to call it only after the entire hash bucket has been processed. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
\| * \|	ocfs2_dlm: fix race in dlm_remaster_locks	Srinivas Eeda	2007-04-26	1	-0/+2
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a possibility that dlm_remaster_locks could overwride node->state with DLM_RECO_NODE_DATA_REQUESTED after dlm_reco_data_done_handler sets the node->state to DLM_RECO_NODE_DATA_DONE. This could lead to recovery getting stuck and requires a cluster reboot. Synchronize with dlm_reco_state_lock spinlock. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* \|	Merge branch 'e1000-fixes' of ↵	Linus Torvalds	2007-04-27	1	-46/+58
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'e1000-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: e1000: FIX: Stop raw interrupts disabled nag from RT e1000: FIX: firmware handover bits e1000: FIX: be ready for incoming irq at pci_request_irq
\| * \|	e1000: FIX: Stop raw interrupts disabled nag from RT	Mark Huth	2007-04-26	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current e1000_xmit_frame spews raw interrupt disabled nag messages when used with RT kernel patches. This patch uses spin_trylock_irqsave, which allows RT patches to properly manage the irq semantics. Signed-off-by: Mark Huth <mhuth@mvista.com> Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>