op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message	Author	Age	Files	Lines
*	[IA64] Altix system controller event handling	Greg Howard	2005-04-25	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following is an update of the patch I sent yesterday (3/9/05) incorporating suggestions from Christoph Hellwig and Andreas Schwab. It allows Altix and Altix-like systems to handle environmental events generated by the system controllers, and should apply on top of Jack Steiner's patch of 3/1/05 ("New chipset support for SN platform") and Mark Goodwin's patch of 3/8/05 ("Altix SN topology support for new chipsets and pci topology"). Signed-off-by: Greg Howard <ghoward@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] vector sharing (Large I/O system support)	Kenji Kaneshige	2005-04-25	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current ia64 linux cannot handle more than 184 interrupt sources because of the lack of vectors. The following patch enables ia64 linux to handle more than 184 interrupt sources by allowing the same vector number to be shared by multiple IOSAPIC's RTEs. The design of this patch is based on the "Intel(R) Itanium(R) Processor Family Interrupt Architecture Guide". Even if you don't have a large I/O system, you can see the behavior of vector sharing by changing IOSAPIC_LAST_DEVICE_VECTOR to a smaller value. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] multi-core/multi-thread identification	Suresh Siddha	2005-04-25	4	-0/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Version 3 - rediffed to apply on top of Ashok's hotplug cpu patch. /proc/cpuinfo output in step with x86. This is an updated MC/MT identification patch based on the previous discussions on list. Add the Multi-core and Multi-threading detection for IPF. - Add new core and threading related fields in /proc/cpuinfo: physical id, core id, thread id, siblings. - Set up the cpu_core_map and cpu_sibling_map appropriately. - Handle hot-plug CPU. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Gordon Jin <gordon.jin@intel.com> Signed-off-by: Rohit Seth <rohit.seth@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64-SGI] Altix SN add support for slots in geoid_t locator	Mark Goodwin	2005-04-25	3	-21/+31
\| \| \| \| \| \| \| \| \| \| \|	This patch against ia64-test-2.6.12 is needed for forthcoming Altix chipsets. It renames geoid_any_t to geoid_common_t and splits the 8bit 'slab' field into two 4bit fields for 'slab' and 'slot'. Similar changes in the Altix SAL will retain backward compatibility for old kernels. Signed-off-by: Mark Goodwin <markgw@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64-SGI] Shub2 BTE support - BTE recovery code	Russ Anderson	2005-04-25	1	-2/+17
\| \| \| \| \| \| \| \| \| \|	patch 2: Shub2 BTE recovery code will be implemented in SAL. Define the SAL interface. Modify bte_error to call SAL for shub2. Signed-off-by: Russ Anderson <rja@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64-SGI] Add new MMR definitions/Modify BTE initialiation&copy.	Russ Anderson	2005-04-25	4	-11/+69
\| \| \| \| \| \| \| \| \| \|	patch 1: Add new MMR definitions. Modify BTE initialization. Modify BTE copy. Signed-off-by: Russ Anderson <rja@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] fix: warning: `ql_size' might be used uninitialized	Tony Luck	2005-04-25	1	-1/+1
\| \| \| \| \| \|	Oops. Should have caught this before I checked it in. Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] Percpu quicklist for combined allocator for pgd/pmd/pte.	Robin Holt	2005-04-25	2	-86/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch introduces using the quicklists for pgd, pmd, and pte levels by combining the alloc and free functions into a common set of routines. This greatly simplifies the reading of this header file. This patch is simple but necessary for large numa configurations. It simply ensures that only pages from the local node are added to a cpu's quicklist. This prevents the trapping of pages on a remote node's quicklist by starting a process, touching a large number of pages to fill pmd and pte entries, migrating to another node, and then unmapping or exiting. With those conditions, the pages get trapped and if the machine has more than 100 nodes of the same size, the calculation of the pgtable high water mark will be larger than any single node so page table cache flushing will never occur. I ran lmbench lat_proc fork and lat_proc exec on a zx1 with and without this patch and did not notice any change. On an sn2 machine, there was a slight improvement which is possibly due to pages from other nodes trapped on the test node before starting the run. I did not investigate further. This patch shrinks the quicklist based upon free memory on the node instead of the high/low water marks. I have written it to enable preemption periodically and recalculate the amount to shrink every time we have freed enough pages that the quicklist size should have grown. I rescan the node's zones each pass because other processes may be draining node memory at the same time as we are adding. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64-SGI] Bus driver for the CX port of SGI's TIO chip.	Bruce Losure	2005-04-25	2	-0/+74
\| \| \| \| \| \| \| \|	This patch is to provide CX port infrastructure for SGI TIO-based h/w. Also a 'core services' driver for SGI FPGA-based h/w. Signed-off-by: Bruce Losure <blosure@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] perfmon: make pfm_sysctl a global, and other cleanup	Stephane Eranian	2005-04-25	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- make pfm_sysctl a global such that it is possible to enable/disable debug printk in sampling formats using PFM_DEBUG. - remove unused pfm_debug_var variable - fix a bug in pfm_handle_work where a BUG_ON() could be triggered. There is a path where pfm_handle_work() can be called with interrupts enabled, i.e., when TIF_NEED_RESCHED is set. The fix corrects the masking and unmasking of interrupts in pfm_handle_work() such that we restore the interrupt mask as it was upon entry. signed-off-by: stephane eranian <eranian@hpl.hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64-SGI] support variable length nasids in shub2	Colin Ngam	2005-04-25	1	-1/+2
\| \| \| \| \| \| \| \|	This patch enables our TIO IO chipset to support variable length nasids in Shub2 chipset. Signed-off-by: Colin Ngam <cngam@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64-SGI] Shub2 provides an addition of 2 External Interrupt events.	Colin Ngam	2005-04-25	1	-0/+17
\| \| \| \| \|	Signed-off-by: Colin Ngam <cngam@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64-SGI] Altix SN topology support for new chipsets and pci topology	Mark Goodwin	2005-04-25	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \|	Please accept this patch to the Altix SN platform topology export interface to support new chipsets and to export PCI topology. This follows on top of Jack Steiner's patch dated March 1st ("New chipset support for SN platform"). Signed-off-by: Mark Goodwin <markgw@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64-SGI] Change SAL call request code for SN systems	Jack Steiner	2005-04-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Change the value of the SAL call number for a new SAL request. The initial implementation in the PROM did not match what the OS expected. Since the OS can run on PROMs that do not implement the new call, changing the call number avoids the issue. New PROMs will implement the new call number. (This avoids problems with the 4.05 PROM). Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64-SGI] altix: tioca chip driver (agp)	Mark Maule	2005-04-25	3	-1/+804
\| \| \| \| \| \| \| \|	Provide a driver for the altix TIOCA AGP chipset. An agpgart backend will be provided as a separate patch. Signed-off-by: Mark Maule <maule@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64-SGI] sn2-move-pci-headers.patch	Mark Maule	2005-04-25	2	-0/+109
\| \| \| \| \| \| \| \|	Move a couple of headers out of arch/ia64/sn/include/pci and into include/asm-ia64/sn. Signed-off-by: Mark Maule <maule@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[PATCH] ppc iomem annotations: ->io_base_virt	Al Viro	2005-04-25	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* ->io_base_virt in struct pci_controller is iomem pointer. Marked as such. Most of the places that used it are already annotated to expect iomem. * places that did gratuitous (and wrong) casts a-la isa_io_base = (unsigned long)ioremap(...); hose->io_base_virt = (void *)isa_io_base; turned into hose->io_base_virt = ioremap(...); isa_io_base = (unsigned long)hose->io_base_virt; * pci_bus_io_base() annotated as returning iomem pointer. Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] ppc user annotations: sigcontext	Al Viro	2005-04-25	1	-1/+1
\| \| \| \| \| \| \|	sigcontext.regs is a userland pointer Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	Automatic merge of rsync://rsync.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6.git	Linus Torvalds	2005-04-25	2	-4/+4
\|\
\| *	[SPARC64]: Fix SMP build.	David S. Miller	2005-04-24	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Kill build failures in the SMP+!PREEMPT case introduced by Al Viro's spinlock.h changes. Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[SPARC]: Fix mxcc warning	Tom 'spot' Callaway	2005-04-24	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Peter Jones uncovered this one while we were debugging the framebuffer issues. There are some references to -1 in the mxcc asm code, which should be 0xffffffff. This patch gets rid of the -1s. Signed-off-by: Tom 'spot' Callaway <tcallawa@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[SELINUX]: Fix ipv6_skip_exthdr() invocation causing OOPS.	Herbert Xu	2005-04-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SELinux hooks invoke ipv6_skip_exthdr() with an incorrect length final argument. However, the length argument turns out to be superfluous. I was just reading ipv6_skip_exthdr and it occurred to me that we can get rid of len altogether. The only place where len is used is to check whether the skb has two bytes for ipv6_opt_hdr. This check is done by skb_header_pointer/skb_copy_bits anyway. Now it might appear that we've made the code slower by deferring the check to skb_copy_bits. However, this check should not trigger in the common case so this is OK. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[PKT_SCHED]: Introduce simple actions.	Jamal Hadi Salim	2005-04-24	3	-0/+176
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	And provide an example simple action in order to demonstrate usage. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[TCP]: skb pcount with MTU discovery	David S. Miller	2005-04-24	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The problem is that when doing MTU discovery, the too-large segments in the write queue will be calculated as having a pcount of >1. When tcp_write_xmit() is trying to send, tcp_snd_test() fails the cwnd test when pcount > cwnd. The segments are eventually transmitted one at a time by keepalive, but this can take a long time. This patch checks if TSO is enabled when setting pcount. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
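The rule described in the message above can be sketched roughly as follows. This is an illustration only, not the kernel code: the function name `segment_pcount` and its parameters are hypothetical, and the real patch works on `struct sk_buff` and socket state rather than plain integers. The idea is that a segment larger than the current MSS is counted as several packets only when TSO is enabled; otherwise it counts as one, so the cwnd test is not failed by a pcount larger than cwnd.

```c
/*
 * Hypothetical helper (not a kernel function): how many packets a queued
 * segment of `len` bytes is counted as, given the current `mss`.
 */
static unsigned int segment_pcount(unsigned int len, unsigned int mss,
				   int tso_enabled)
{
	/* Small segments, or any segment when TSO is off, count as one. */
	if (!tso_enabled || mss == 0 || len <= mss)
		return 1;

	/* With TSO, count the MSS-sized pieces: ceil(len / mss). */
	return (len + mss - 1) / mss;
}
```

For example, a 3000-byte segment with an MSS of 1000 counts as 3 packets with TSO enabled and as 1 packet without it.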
* \|	[AX25] Introduce ax25_type_trans	Arnaldo Carvalho de Melo	2005-04-24	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replacing the open coded equivalents and making ax25 look more like a linux network protocol, i.e. more similar to inet. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[NETFILTER]: Fix NAT sequence number adjustment	Patrick McHardy	2005-04-24	1	-0/+3
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The NAT changes in 2.6.11 changed the position where helpers are called and perform packet mangling. Before 2.6.11, a NAT helper was called before the packet was NATed and had its sequence number adjusted. Since 2.6.11, the helpers get packets with already adjusted sequence numbers. This breaks sequence number adjustment, adjust_tcp_sequence() needs the original sequence number to determine whether a packet was a retransmission and to store it for further corrections. It can't be reconstructed without more information than available, so this patch restores the old order by calling helpers from a new conntrack hook two priorities below ip_conntrack_confirm() and adjusting the sequence number from a new NAT hook one priority below ip_conntrack_confirm(). Tracked down by Phil Oester <kernel@linuxace.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[PATCH] ppc trivial iomem annotations: chrp	Al Viro	2005-04-24	1	-1/+1
\| \| \| \| \|	Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] mostek bogus sparse annotations fixed	Al Viro	2005-04-24	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	void * __iomem foo is not a pointer to iomem - it's an iomem variable containing void *. A pile of such guys in arch/sparc64/kernel/time.c, drivers/sbus/char/rtc.c and include/asm-sparc64/mostek.h turned into intended void __iomem *. Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] __get_unaligned() turned into macro	Al Viro	2005-04-24	1	-41/+42
\| \| \| \| \| \| \| \| \| \|	Turns __get_unaligned() and __put_unaligned into macros. That is definitely safe; leaving them as inlines breaks on e.g. alpha [try to build ncpfs there and you'll get unresolved symbols since we end up getting __get_unaligned() not inlined]. Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[IA64] cpu hotplug: return offlined cpus to SAL	Ashok Raj	2005-04-22	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is required to support cpu removal for IPF systems. Existing code just fakes the real offline by keeping the cpu running the idle thread, and polling for the bit to re-appear in the cpu_state to get out of the idle loop. For the cpu-offline to work correctly, we need to pass control of this CPU back to SAL so it can continue in the boot-rendez mode. This gives the SAL control to not pick this cpu as the monarch processor for global MCA events, and in addition to not wait for this cpu to check in with SAL for global MCA events. The handoff is implemented as documented in SAL specification section 3.2.5.1 "OS_BOOT_RENDEZ to SAL return State" Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6.git	Linus Torvalds	2005-04-22	2	-3/+8
\|\
\| *	[SPARC]: Provide generic ioctls in Sparc RTC driver.	David S. Miller	2005-04-21	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Provide support for drivers/char/rtc.c ioctls in the Mostek rtc driver as well as the Sparc specific RTCGET and RTCSET. This allows userspace to be much less messy. Currently util-linux and other spots jump through hoops trying various ioctl variants until they hit the one that the driver actually in use supports. Eventually all of this should move over to the genrtc.c driver, but not today... While we are here, fix up the register types for sparse. Thanks to Frans Pop for helping point out this issue. Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[SPARC64]: Provide a pgprot_noncached() implementation.	David S. Miller	2005-04-21	1	-0/+5
\| \| \| \| \| \| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[TG3]: add bcm5752 entry to pci_ids.h	John W. Linville	2005-04-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add proper entry for bcm5752 PCI ID to pci_ids.h, and use it in tg3. I did this separately in case patches like this (i.e. new PCI IDs) need to come from more "official" sources. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[AX25]: make ax25_queue_xmit a net_device parameter	Arnaldo Carvalho de Melo	2005-04-21	1	-1/+1
\|/ \| \| \| \| \| \| \| \| \| \| \|	I.e. not using skb->dev as a way to pass the parameter used to fill... skb->dev :-) Also to get the _type_trans open coded sequence grouped, next changesets will introduce ax25_type_trans. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6.git	Linus Torvalds	2005-04-21	2	-7/+24
\|\
\| *	[IA64] fix fls()	David Mosberger-Tang	2005-04-21	2	-7/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ia64-version of fls() never worked as intended (the bitnumbering was off by 1 and fls(0) was undefined). This patch fixes the problem by using a popcnt-based fls(), which on McKinley-derived cores is slightly faster than both ia64_fls() and generic_fls(). The resulting code, however, is bigger (7-8 bundles instead of about 3 bundles). Also switch ia64_popcnt() to __builtin_popcountl() for GCC v3.4 or newer since the compiler can predicate that and schedule it better. Thanks to Simon Derr and Matt Mackall for tracking down this bug. Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
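A popcount-based fls() of the kind the message above mentions can be sketched in portable C. This is a minimal illustration under stated assumptions, not the ia64 implementation: the names `popcount32` and `fls_popcount` are hypothetical, the bit count is done in software rather than with the popcnt instruction, and only the 32-bit case is shown. The convention assumed is the generic one, where bits are numbered from 1 and fls(0) is 0.

```c
#include <stdint.h>

/* Count set bits; software stand-in for a hardware popcnt instruction. */
static int popcount32(uint32_t x)
{
	int n = 0;

	while (x) {
		x &= x - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Find the last (most significant) set bit, numbered from 1.
 * Returns 0 when no bit is set.
 */
static int fls_popcount(uint32_t x)
{
	/* Smear the highest set bit into every lower bit position. */
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;

	/* The number of set bits now equals the position of the top bit. */
	return popcount32(x);
}
```

After the smearing step every bit at or below the highest set bit is 1, so counting the set bits gives the 1-based index directly and the zero case needs no special handling.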
* \|	[PATCH] alpha: key management syscalls	Richard Henderson	2005-04-21	1	-1/+4
\|/ \| \| \| \|	Allocate syscall numbers for add_key, request_key, keyctl.
*	[SPARC64]: sparc64 preempt + smp	Al Viro	2005-04-20	1	-17/+31
\| \| \| \| \| \| \|	PREEMPT+SMP support - see if it looks sane... Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: skbuff: remove old NET_CALLER macro	Stephen Hemminger	2005-04-19	1	-6/+0
\| \| \| \| \| \| \| \|	Here is a revised alternative that uses BUG_ON/WARN_ON (as suggested by Herbert Xu) to eliminate NET_CALLER. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IPV6]: IPV6_CHECKSUM socket option can corrupt kernel memory	Herbert Xu	2005-04-19	1	-0/+2
\| \| \| \| \| \| \| \|	So here is a patch that introduces skb_store_bits -- the opposite of skb_copy_bits, and uses them to read/write the csum field in rawv6. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: Shave sizeof(ptr) bytes off dst_entry	Herbert Xu	2005-04-19	1	-3/+2
\| \| \| \| \|	Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[PATCH] freepgt: arch FIRST_USER_ADDRESS 0	Hugh Dickins	2005-04-19	20	-21/+21
\| \| \| \| \| \| \| \| \|	Replace misleading definition of FIRST_USER_PGD_NR 0 by definition of FIRST_USER_ADDRESS 0 in all the MMU architectures beyond arm and arm26. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] freepgt: arm26 FIRST_USER_ADDRESS PAGE_SIZE	Hugh Dickins	2005-04-19	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	ARM26 define FIRST_USER_ADDRESS as PAGE_SIZE (beyond the machine vectors when they are mapped low), and use that definition in place of locally defined MIN_MAP_ADDR. Previously, ARM26 permitted user mappings at 0 if the machine vectors were mapped high; but that's inconsistent with ARM, and FIRST_USER_ADDRESS would then have to be determined at runtime. Let's fix it at PAGE_SIZE throughout the architecture. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] freepgt: arm FIRST_USER_ADDRESS PAGE_SIZE	Hugh Dickins	2005-04-19	1	-0/+7
\| \| \| \| \| \| \| \| \| \|	ARM define FIRST_USER_ADDRESS as PAGE_SIZE (beyond the machine vectors when they are mapped low), and use that definition in place of locally defined MIN_MAP_ADDR. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] freepgt: remove arch pgd_addr_end	Hugh Dickins	2005-04-19	3	-46/+3
\| \| \| \| \| \| \| \| \| \| \| \|	ia64 and sparc64 hurriedly had to introduce their own variants of pgd_addr_end, to leapfrog over the holes in their virtual address spaces which the final clear_page_range suddenly presented when converted from pgd_index to pgd_addr_end. But now that free_pgtables respects the vma list, those holes are never presented, and the arch variants can go. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] freepgt: hugetlb_free_pgd_range	Hugh Dickins	2005-04-19	5	-9/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ia64 and ppc64 had hugetlb_free_pgtables functions which were no longer being called, and it wasn't obvious what to do about them. The ppc64 case turns out to be easy: the associated tables are noted elsewhere and freed later, safe to either skip its hugetlb areas or go through the motions of freeing nothing. Since ia64 does need a special case, restore to ppc64 the special case of skipping them. The ia64 hugetlb case has been broken since pgd_addr_end went in, though it probably appeared to work okay if you just had one such area; in fact it's been broken much longer if you consider a long munmap spanning from another region into the hugetlb region. In the ia64 hugetlb region, more virtual address bits are available than in the other regions, yet the page tables are structured the same way: the page at the bottom is larger. Here we need to scale down each addr before passing it to the standard free_pgd_range. Was about to write a hugely_scaled_down macro, but found htlbpage_to_page already exists for just this purpose. Fixed off-by-one in ia64 is_hugepage_only_range. Uninline free_pgd_range to make it available to ia64. Make sure the vma-gathering loop in free_pgtables cannot join a hugepage_only_range to any other (safe to join huges? probably but don't bother). Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] freepgt: remove MM_VM_SIZE(mm)	Hugh Dickins	2005-04-19	4	-21/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There's only one usage of MM_VM_SIZE(mm) left, and it's a troublesome macro because mm doesn't contain the (32-bit emulation?) info needed. But it too is only needed because we ignore the end from the vma list. We could make flush_pgtables return that end, or unmap_vmas. Choose the latter, since it's a natural fit with unmap_mapping_range_vma needing to know its restart addr. This does make more than minimal change, but if unmap_vmas had returned the end before, this is how we'd have done it, rather than storing the break_addr in zap_details. unmap_vmas used to return count of vmas scanned, but that's just debug which hasn't been useful in a while; and if we want the map_count 0 on exit check back, it can easily come from the final remove_vm_struct loop. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] freepgt: free_pgtables use vma list	Hugh Dickins	2005-04-19	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recent woes with some arches needing their own pgd_addr_end macro; and 4-level clear_page_range regression since 2.6.10's clear_page_tables; and its long-standing well-known inefficiency in searching throughout the higher-level page tables for those few entries to clear and free: all can be blamed on ignoring the list of vmas when we free page tables. Replace exit_mmap's clear_page_range of the total user address space by free_pgtables operating on the mm's vma list; unmap_region use it in the same way, giving floor and ceiling beyond which it may not free tables. This brings lmbench fork/exec/sh numbers back to 2.6.10 (unless preempt is enabled, in which case latency fixes spoil unmap_vmas throughput). Beware: the do_mmap_pgoff driver failure case must now use unmap_region instead of zap_page_range, since a page table might have been allocated, and can only be freed while it is touched by some vma. Move free_pgtables from mmap.c to memory.c, where its lower levels are adapted from the clear_page_range levels. (Most of free_pgtables' old code was actually for a non-existent case, prev not properly set up, dating from before hch gave us split_vma.) Pass mmu_gather** in the public interfaces, since we might want to add latency lockdrops later; but no attempt to do so yet, going by vma should itself reduce latency. But what if is_hugepage_only_range? Those ia64 and ppc64 cases need careful examination: put that off until a later patch of the series. What of x86_64's 32bit vdso page __map_syscall32 maps outside any vma? And the range to sparc64's flush_tlb_pgtables? It's less clear to me now that we need to do more than is done here - every PMD_SIZE ever occupied will be flushed, do we really have to flush every PGDIR_SIZE ever partially occupied? A shame to complicate it unnecessarily. 
Special thanks to David Miller for time spent repairing my ceilings. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	Merge with kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6.git/	Linus Torvalds	2005-04-19	2	-3/+17
\|\ \| \| \| \| \| \|	for 13 driver core, sysfs, and debugfs fixes.