summaryrefslogtreecommitdiffstats
path: root/arch/x86_64/mm
Commit message (Collapse)AuthorAgeFilesLines
* [PATCH] x86-64: fix perms/range of vsyscall vma in /proc/*/mapsErnie Petrides2006-12-071-3/+4
| | | | | | | | | | | | | | | | | | | | | The final line of /proc/<pid>/maps on x86_64 for native 64-bit tasks shows an incorrect ending address and incorrect permissions. There is only a single page mapped in this vsyscall region, and it is accessible for both read and execute. The patch below fixes this. (Since 32-bit-compat tasks have a real vma with correct perms/range, no change is necessary for that scenario.) Before the patch, a "cat /proc/self/maps | tail -1" shows this: ffffffffff600000-ffffffffffe00000 ---p 00000000 [...] After the patch, this is the output: ffffffffff600000-ffffffffff601000 r-xp 00000000 [...] Signed-off-by: Ernie Petrides <petrides@redhat.com> Signed-off-by: Andi Kleen <ak@suse.de>
* [PATCH] x86-64: Use probe_kernel_address in arch/x86_64/*Andi Kleen2006-12-071-5/+5
| | | | | | Instead of open coded __get_user Signed-off-by: Andi Kleen <ak@suse.de>
* [PATCH] x86-64: Speed and clean up cache flushing in change_page_attrAndi Kleen2006-12-071-26/+32
| | | | | | | | | | | | CLFLUSH is a lot faster than WBINVD so avoid the later if at all possible. Always pass the complete list of pages to other CPUs to cut down the number of IPIs. Minor other cleanup and sync with i386 version. Signed-off-by: Andi Kleen <ak@suse.de>
* [PATCH] x86_64: fix memory hotplug build with NUMA=nYasunori Goto2006-11-201-8/+1
| | | | | | | | | | | | | | | | | | | | | | This is to fix compile error of x86-64 memory hotplug without any NUMA option. CC arch/x86_64/mm/init.o arch/x86_64/mm/init.c:501: error: redefinition of 'memory_add_physaddr_to_nid' include/linux/memory_hotplug.h:71: error: previous definition of 'memory_add_phys addr_to_nid' was here arch/x86_64/mm/init.c:509: error: redefinition of 'memory_add_physaddr_to_nid' arch/x86_64/mm/init.c:501: error: previous definition of 'memory_add_physaddr_to_ nid' was here I confirmed compile completion with !NUMA, (NUMA & !ACPI_NUMA), or (NUMA & ACPI_NUMA). Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Acked-by: Andi Kleen <ak@suse.de> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] x86-64: Handle reserve_bootmem_generic beyond end_pfnAndi Kleen2006-11-141-1/+14
| | | | | | | | | | | This can happen on kexec kernels with some configurations, in particularly on Unisys ES7000 systems. Analysis by Amul Shah Cc: Amul Shah <amul.shah@unisys.com> Signed-off-by: Andi Kleen <ak@suse.de>
* [PATCH] x86-64: x86_64 hot-add memory srat.c fixkeith mannthey2006-10-211-2/+2
| | | | | | | | | | | | | This patch corrects the logic used in srat.c to figure out what parsing what action to take when registering hot-add areas. Hot-add areas should only be added to the node information for the MEMORY_HOTPLUG_RESERVE case. When booting MEMORY_HOTPLUG_SPARSE hot-add areas on everything but the last node are getting include in the node data and during kernel boot the pages are setup then the kernel dies when the pages are used. This patch fixes this issue. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>
* [PATCH] mm: use symbolic names instead of indices for zone initialisationMel Gorman2006-10-112-6/+11
| | | | | | | | | | | | | | | | | | | | | | | Arch-independent zone-sizing is using indices instead of symbolic names to offset within an array related to zones (max_zone_pfns). The unintended impact is that ZONE_DMA and ZONE_NORMAL is initialised on powerpc instead of ZONE_DMA and ZONE_HIGHMEM when CONFIG_HIGHMEM is set. As a result, the the machine fails to boot but will boot with CONFIG_HIGHMEM turned off. The following patch properly initialises the max_zone_pfns[] array and uses symbolic names instead of indices in each architecture using arch-independent zone-sizing. Two users have successfully booted their powerpcs with it (one an ibook G4). It has also been boot tested on x86, x86_64, ppc64 and ia64. Please merge for 2.6.19-rc2. Credit to Benjamin Herrenschmidt for identifying the bug and rolling the first fix. Additional credit to Johannes Berg and Andreas Schwab for reporting the problem and testing on powerpc. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Generic ioremap_page_range: x86_64 conversionHaavard Skinnemoen2006-10-011-104/+7
| | | | | | | | | | Convert x86_64 to use generic ioremap_page_range() [akpm@osdl.org: build fix] Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] hot-add-mem x86_64: use CONFIG_MEMORY_HOTPLUG_RESERVEKeith Mannthey2006-10-011-4/+13
| | | | | | | | | | | | | The api for hot-add memory already has a construct for finding nodes based on an address, memory_add_physaddr_to_nid. This patch allows the fucntion to do something besides return 0. It uses the nodes_add infomation to lookup to node info for a hot add event. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] hot-add-mem x86_64: memory_add_physaddr_to_nid node fixupKeith Mannthey2006-10-012-0/+3
| | | | | | | | | | | | | | | | | | | In cases where the acpi memory-add event does not containe the pxm (node) infomation allow the driver to look up node info based on the address. The acpi_get_node call returns -1 if it can't decode the pxm info, this causes add_memory to panic. acpi_get_node would have to decode the resource from the handle (a lenghty proposition). This seems to be the cleanist point to interject the hook. [kamezawa.hiroyu@jp.fujitsu.com: build fixes] [y-goto@jp.fujitsu.com: build fixes] Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] hot-add-mem x86_64: memory_add_physaddr_to_nid enableKeith Mannthey2006-10-012-14/+19
| | | | | | | | | | | | | The api for hot-add memory already has a construct for finding nodes based on an address, memory_add_physaddr_to_nid. This patch allows the fucntion to do something besides return 0. It uses the nodes_add infomation to lookup to node info for a hot add event. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] hot-add-mem x86_64: Enable SPARSEMEM in srat.cKeith Mannthey2006-10-011-22/+29
| | | | | | | | | | | | Enable x86_64 srat.c to share code between both reserve and sparsemem based add memory paths. Both paths need the hot-add area node locality infomration (nodes_add). This code refactors the code path to allow this. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] pidspace: is_init()Sukadev Bhattiprolu2006-09-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is an updated version of Eric Biederman's is_init() patch. (http://lkml.org/lkml/2006/2/6/280). It applies cleanly to 2.6.18-rc3 and replaces a few more instances of ->pid == 1 with is_init(). Further, is_init() checks pid and thus removes dependency on Eric's other patches for now. Eric's original description: There are a lot of places in the kernel where we test for init because we give it special properties. Most significantly init must not die. This results in code all over the kernel test ->pid == 1. Introduce is_init to capture this case. With multiple pid spaces for all of the cases affected we are looking for only the first process on the system, not some other process that has pid == 1. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: <lxc-devel@lists.sourceforge.net> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] make PROT_WRITE imply PROT_READJason Baron2006-09-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make PROT_WRITE imply PROT_READ for a number of architectures which don't support write only in hardware. While looking at this, I noticed that some architectures which do not support write only mappings already take the exact same approach. For example, in arch/alpha/mm/fault.c: " if (cause < 0) { if (!(vma->vm_flags & VM_EXEC)) goto bad_area; } else if (!cause) { /* Allow reads even for write-only mappings */ if (!(vma->vm_flags & (VM_READ | VM_WRITE))) goto bad_area; } else { if (!(vma->vm_flags & VM_WRITE)) goto bad_area; } " Thus, this patch brings other architectures which do not support write only mappings in-line and consistent with the rest. I've verified the patch on ia64, x86_64 and x86. Additional discussion: Several architectures, including x86, can not support write-only mappings. The pte for x86 reserves a single bit for protection and its two states are read only or read/write. Thus, write only is not supported in h/w. Currently, if i 'mmap' a page write-only, the first read attempt on that page creates a page fault and will SEGV. That check is enforced in arch/blah/mm/fault.c. However, if i first write that page it will fault in and the pte will be set to read/write. Thus, any subsequent reads to the page will succeed. It is this inconsistency in behavior that this patch is attempting to address. Furthermore, if the page is swapped out, and then brought back the first read will also cause a SEGV. Thus, any arbitrary read on a page can potentially result in a SEGV. According to the SuSv3 spec, "if the application requests only PROT_WRITE, the implementation may also allow read access." Also as mentioned, some archtectures, such as alpha, shown above already take the approach that i am suggesting. The counter-argument to this raised by Arjan, is that the kernel is enforcing the write only mapping the best it can given the h/w limitations. This is true, however Alan Cox, and myself would argue that the inconsitency in behavior, that is applications can sometimes work/sometimes fails is highly undesireable. If you read through the thread, i think people, came to an agreement on the last patch i posted, as nobody has objected to it... Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Andi Kleen <ak@muc.de> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Ian Molton <spyro@f2s.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Allow an arch to expand node boundariesMel Gorman2006-09-271-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Arch-independent zone-sizing determines the size of a node (pgdat->node_spanned_pages) based on the physical memory that was registered by the architecture. However, when CONFIG_MEMORY_HOTPLUG_RESERVE is set, the architecture expects that the spanned_pages will be much larger and that mem_map will be allocated that is used lated on memory hot-add. This patch allows an architecture that sets CONFIG_MEMORY_HOTPLUG_RESERVE to call push_node_boundaries() which will set the node beginning and end to at *least* the requested boundary. Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Andi Kleen <ak@muc.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Keith Mannthey" <kmannth@gmail.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Account for holes that are outside the range of physical memoryMel Gorman2006-09-271-1/+3
| | | | | | | | | | | | | | | | | | | absent_pages_in_range() made the assumption that users of the API would not care about holes beyound the end of physical memory. This was not the case. This patch will account for ranges outside of physical memory as holes correctly. Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Andi Kleen <ak@muc.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Keith Mannthey" <kmannth@gmail.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Account for memmap and optionally the kernel image as holesMel Gorman2006-09-271-1/+3
| | | | | | | | | | | | | | | | | | | | | The x86_64 code accounted for memmap and some portions of the the DMA zone as holes. This was because those areas would never be reclaimed and accounting for them as memory affects min watermarks. This patch will account for the memmap as a memory hole. Architectures may optionally use set_dma_reserve() if they wish to account for a portion of memory in ZONE_DMA as a hole. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Andi Kleen <ak@muc.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Keith Mannthey" <kmannth@gmail.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Have x86_64 use add_active_range() and free_area_init_nodesMel Gorman2006-09-274-72/+28
| | | | | | | | | | | | | | | | | Size zones and holes in an architecture independent manner for x86_64. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Andi Kleen <ak@muc.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Keith Mannthey" <kmannth@gmail.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6Linus Torvalds2006-09-266-63/+57
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits) [PATCH] Don't set calgary iommu as default y [PATCH] i386/x86-64: New Intel feature flags [PATCH] x86: Add a cumulative thermal throttle event counter. [PATCH] i386: Make the jiffies compares use the 64bit safe macros. [PATCH] x86: Refactor thermal throttle processing [PATCH] Add 64bit jiffies compares (for use with get_jiffies_64) [PATCH] Fix unwinder warning in traps.c [PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1 [PATCH] x86: Move direct PCI scanning functions out of line [PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI [PATCH] Don't leak NT bit into next task [PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder [PATCH] Fix some broken white space in ia32_signal.c [PATCH] Initialize argument registers for 32bit signal handlers. [PATCH] Remove all traces of signal number conversion [PATCH] Don't synchronize time reading on single core AMD systems [PATCH] Remove outdated comment in x86-64 mmconfig code [PATCH] Use string instructions for Core2 copy/clear [PATCH] x86: - restore i8259A eoi status on resume [PATCH] i386: Split multi-line printk in oops output. ...
| * [PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing ↵Andi Kleen2006-09-261-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | conf1 Some buggy systems can machine check when config space accesses happen for some non existent devices. i386/x86-64 do some early device scans that might trigger this. Allow pci=noearly to disable this. Also when type 1 is disabling also don't do any early accesses which are always type1. This moves the pci= configuration parsing to be a early parameter. I don't think this can break anything because it only changes a single global that is only used by PCI. Cc: gregkh@suse.de Cc: Trammell Hudson <hudson@osresearch.net> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] Use proper accessors to change PSE bits in change_page_attr()Andi Kleen2006-09-261-10/+6
| | | | | | | | | | | | | | Use normal pte accessors in change_page_attr() to access the PSE bits. Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] Fix pte_exec/mkexec and use it in change_page_attr()Andi Kleen2006-09-261-3/+5
| | | | | | | | | | | | | | | | | | | | | | Fix the pte_exec/mkexec page table accessor functions to really use the NX bit. Previously they only checked the USER bit, but weren't actually used for anything. Then use them in change_page_attr() to manipulate the NX bit properly. Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] Remove bogus warning from early_ioremapAndi Kleen2006-09-261-1/+0
| | | | | | | | | | | | | | It is correct for its only caller right now, but not for possible future others. Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] make numa_emulation() __initAndrew Morton2006-09-261-1/+1
| | | | | | | | | | Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86_64 kernel mapping fixKeith Mannthey2006-09-261-25/+26
| | | | | | | | | | | | | | | | | | | | | | | | Fix for the x86_64 kernel mapping code. Without this patch the update path only inits one pmd_page worth of memory and tramples any entries on it. now the calling convention to phys_pmd_init and phys_init is to always pass a [pmd/pud] page not an offset within a page. Signed-off-by: Keith Mannthey<kmannth@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org>
| * [PATCH] make fault notifier unconditional and export itAndi Kleen2006-09-261-9/+3
| | | | | | | | | | | | | | | | | | | | It's needed for external debuggers and overhead is very small. Also make the actual notifier chain they use static Cc: jbeulich@novell.com Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] Add sparse annotations to quiet sparse in arch/x86_64/mm/fault.cAndi Kleen2006-09-261-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes linux/arch/x86_64/mm/fault.c:125:7: warning: incorrect type in argument 1 (different address spaces) linux/arch/x86_64/mm/fault.c:125:7: expected void [noderef] *<noident><asn:1> linux/arch/x86_64/mm/fault.c:125:7: got unsigned char *[assigned] instr linux/arch/x86_64/mm/fault.c:163:8: warning: incorrect type in argument 1 (different address spaces) linux/arch/x86_64/mm/fault.c:163:8: expected void [noderef] *<noident><asn:1> linux/arch/x86_64/mm/fault.c:163:8: got unsigned char *[assigned] instr linux/arch/x86_64/mm/fault.c:179:9: warning: incorrect type in argument 1 (different address spaces) linux/arch/x86_64/mm/fault.c:179:9: expected void [noderef] *<noident><asn:1> linux/arch/x86_64/mm/fault.c:179:9: got unsigned long *<noident> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] Clean up acpi_numa variableAndi Kleen2006-09-261-0/+2
| | | | | | | | | | | | | | | | | | Move it into srat.c No need to clutter up setup.c for it And remove use in setup.c completely - it only guarded a printk which can be done unconditionally. Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] Convert x86-64 to early paramAndi Kleen2006-09-261-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of hackish manual parsing Requires earlier i386 patchkit, but also fixes i386 early_printk again. I removed some obsolete really early parameters which didn't do anything useful. Also made a few parameters that needed it early (mostly oops printing setup) Also removed one panic check that wasn't visible without early console anyways (the early console is now initialized after that panic) This cleans up a lot of code. Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] initialize end of memory variables as early as possibleJan Beulich2006-09-261-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | While an earlier patch already did a small step into that direction, this patch moves initialization of all memory end variables to as early as possible, so that dependent code doesn't need to check whether these variables have already been set. Also, remove a misleading (perhaps just outdated) comment, and make static a variable only used in a single file. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>
* | [PATCH] Standardize pxx_page macrosDave McCracken2006-09-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the changes necessary for shared page tables is to standardize the pxx_page macros. pte_page and pmd_page have always returned the struct page associated with their entry, while pte_page_kernel and pmd_page_kernel have returned the kernel virtual address. pud_page and pgd_page, on the other hand, return the kernel virtual address. Shared page tables needs pud_page and pgd_page to return the actual page structures. There are very few actual users of these functions, so it is simple to standardize their usage. Since this is basic cleanup, I am submitting these changes as a standalone patch. Per Hugh Dickins' comments about it, I am also changing the pxx_page_kernel macros to pxx_page_vaddr to clarify their meaning. Signed-off-by: Dave McCracken <dmccr@us.ibm.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] reduce MAX_NR_ZONES: remove two strange uses of MAX_NR_ZONESChristoph Lameter2006-09-261-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I keep seeing zones on various platforms that are never used and wonder why we compile support for them into the kernel. Counters show up for HIGHMEM and DMA32 that are alway zero. This patch allows the removal of ZONE_DMA32 for non x86_64 architectures and it will get rid of ZONE_HIGHMEM for arches not using highmem (like 64 bit architectures). If an arch does not define CONFIG_HIGHMEM then ZONE_HIGHMEM will not be defined. Similarly if an arch does not define CONFIG_ZONE_DMA32 then ZONE_DMA32 will not be defined. No current architecture uses all the 4 zones (DMA,DMA32,NORMAL,HIGH) that we have now. The patchset will reduce the number of zones for all platforms. On many platforms that do not have DMA32 or HIGHMEM this will reduce the number of zones by 50%. F.e. ia64 only uses DMA and NORMAL. Large amounts of memory can be saved for larger systemss that may have a few hundred NUMA nodes. With ZONE_DMA32 and ZONE_HIGHMEM support optional MAX_NR_ZONES will be 2 for many non i386 platforms and even for i386 without CONFIG_HIGHMEM set. Tested on ia64, x86_64 and on i386 with and without highmem. The patchset consists of 11 patches that are following this message. One could go even further than this patchset and also make ZONE_DMA optional because some platforms do not need a separate DMA zone and can do DMA to all of memory. This could reduce MAX_NR_ZONES to 1. Such a patchset will hopefully follow soon. This patch: Fix strange uses of MAX_NR_ZONES Sometimes we use MAX_NR_ZONES - x to refer to a zone. Make that explicit. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] lockdep: beautify x86_64 stacktracesIngo Molnar2006-07-031-1/+0
| | | | | | | | | | Beautify x86_64 stacktraces to be more readable. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] add __[start|end]_rodata sections to asm-generic/sections.hHeiko Carstens2006-07-011-4/+3
| | | | | | | | | | | | | | Add __start_rodata and __end_rodata to sections.h to avoid extern declarations. Needed by s390 code (see following patch). [akpm@osdl.org: update architectures] Cc: Arjan van de Ven <arjan@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Andi Kleen <ak@muc.de> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Remove obsolete #include <linux/config.h>Jörn Engel2006-06-305-5/+0
| | | | | Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
* typo fixes: occuring -> occurringAdrian Bunk2006-06-301-1/+1
| | | | Signed-off-by: Adrian Bunk <bunk@stusta.de>
* [PATCH] add poison.h and patch primary usersRandy Dunlap2006-06-271-2/+5
| | | | | | | | | | | | | | | | | Localize poison values into one header file for better documentation and easier/quicker debugging and so that the same values won't be used for multiple purposes. Use these constants in core arch., mm, driver, and fs code. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] pgdat allocation for new node add (specify node id)Yasunori Goto2006-06-271-28/+38
| | | | | | | | | | | | | | | | Change the name of old add_memory() to arch_add_memory. And use node id to get pgdat for the node at NODE_DATA(). Note: Powerpc's old add_memory() is defined as __devinit. However, add_memory() is usually called only after bootup. I suppose it may be redundant. But, I'm not well known about powerpc. So, I keep it. (But, __meminit is better at least.) Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge branch 'x86-64'Linus Torvalds2006-06-263-29/+32
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * x86-64: (83 commits) [PATCH] x86_64: x86_64 stack usage debugging [PATCH] x86_64: (resend) x86_64 stack overflow debugging [PATCH] x86_64: msi_apic.c build fix [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs [PATCH] x86_64: Avoid broadcasting NMI IPIs [PATCH] x86_64: fix apic error on bootup [PATCH] x86_64: enlarge window for stack growth [PATCH] x86_64: Minor string functions optimizations [PATCH] x86_64: Move export symbols to their C functions [PATCH] x86_64: Standardize i386/x86_64 handling of NMI_VECTOR [PATCH] x86_64: Fix modular pc speaker [PATCH] x86_64: remove sys32_ni_syscall() [PATCH] x86_64: Do not use -ffunction-sections for modules [PATCH] x86_64: Add cpu_relax to apic_wait_icr_idle [PATCH] x86_64: adjust kstack_depth_to_print default [PATCH] i386/x86-64: adjust /proc/interrupts column headings [PATCH] x86_64: Fix race in cpu_local_* on preemptible kernels [PATCH] x86_64: Fix fast check in safe_smp_processor_id [PATCH] x86_64: x86_64 setup.c - printing cmp related boottime information [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status ... Manual resolve of trivial conflict in arch/i386/kernel/Makefile
| * [PATCH] x86_64: enlarge window for stack growthChuck Ebbert2006-06-261-2/+4
| | | | | | | | | | | | | | | | | | | | Allow stack growth so the 'enter' instruction works. Also fixes problem in compat_sys_kexec_load() which could allocate more than 128 bytes using compat_alloc_user_space(). Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * [PATCH] x86_64: Move export symbols to their C functionsAndi Kleen2006-06-261-0/+5
| | | | | | | | | | | | | | | | Only exports for assembler files are left in x8664_ksyms.c Originally inspired by a patch from Al Viro Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * [PATCH] x86_64: miscellaneous mm/init.c fixesJan Beulich2006-06-261-6/+6
| | | | | | | | | | | | | | | | | | | | | | - fix an off-by-one error in phys_pmd_init() - prevent phys_pmd_init() from removing mappings established earlier - fix the direct mapping early printk to in fact show the end of the range - remove an apparently orphan comment Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * [PATCH] x86_64: Calgary IOMMU - IOMMU abstractionsJon Mason2006-06-261-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch creates a new interface for IOMMUs by adding a centralized location for IOMMU allocation (for translation tables/apertures) and IOMMU initialization. In creating these, code was moved around for abstraction, uniformity, and consiceness. Take note of the move of the iommu_setup bootarg parsing code to __setup. This is enabled by moving back the location of the aperture allocation/detection to mem init (which while ugly, was already the location of the swiotlb_init). While a slight departure from the previous patch, I belive this provides the true intention of the previous versions of the patch which changed this code. It also makes the addition of the upcoming calgary code much cleaner than previous patches. [AK: Removed one broken change. iommu_setup still has to be called early] Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com> Signed-off-by: Jon Mason <jdmason@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * [PATCH] x86_64: Get rid of pud_offset_k / __pud_offset_kAndi Kleen2006-06-262-2/+2
| | | | | | | | | | | | | | | | pud_offset_k() equivalent to pud_offset() now. Pointed out by Jan Beulich Similar for __pud_offset_ok, which needs a small change in the callers. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * [PATCH] x86_64: x86_64 version of the smp alternative patch.Gerd Hoffmann2006-06-261-13/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes are largely identical to the i386 version: * alternative #define are moved to the new alternative.h file. * one new elf section with pointers to the lock prefixes which can be nop'ed out for non-smp. * two new elf sections simliar to the "classic" alternatives to replace SMP code with simpler UP code. * fixup headers to use alternative.h instead of defining their own LOCK / LOCK_PREFIX macros. The patch reuses the i386 version of the alternatives code to avoid code duplication. The code in alternatives.c was shuffled around a bit to reduce the number of #ifdefs needed. It also got some tweaks needed for x86_64 (vsyscall page handling) and new features (noreplacement option which was x86_64 only up to now). Debug printk's are changed from compile-time to runtime. Loosely based on a early version from Bastian Blank <waldi@debian.org> Signed-off-by: Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] Notify page fault call chain for x86_64Anil S Keshavamurthy2006-06-261-2/+37
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently in the do_page_fault() code path, we call notify_die(DIE_PAGE_FAULT, ...) to notify the page fault. Since notify_die() is highly overloaded, this page fault notification is currently being sent to all the components registered with register_die_notification() which uses the same die_chain to loop for all the registered components which is unnecessary. In order to optimize the do_page_fault() code path, this critical page fault notification is now moved to different call chain and the test results showed great improvements. And the kprobes which is interested in this notifications, now registers onto this new call chain only when it need to, i.e Kprobes now registers for page fault notification only when their are an active probes and unregisters from this page fault notification when no probes are active. I have incorporated all the feedback given by Ananth and Keith and everyone, and thanks for all the review feedback. This patch: Overloading of page fault notification with the notify_die() has performance issues(since the only interested components for page fault is kprobes and/or kdb) and hence this patch introduces the new notifier call chain exclusively for page fault notifications their by avoiding notifying unnecessary components in the do_page_fault() code path. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Unify pxm_to_node() and node_to_pxm()Yasunori Goto2006-06-231-32/+1
| | | | | | | | | | | | | Consolidate the various arch-specific implementations of pxm_to_node() and node_to_pxm() into a single generic version. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] x86_64: Handle empty node zeroDaniel Yeisley2006-05-301-1/+3
| | | | | | | | | | | | | | | From: Daniel Yeisley <dan.yeisley@unisys.com> It is possible to boot a Unisys ES7000 with CPUs from multiple cells, and not also include the memory from those cells. This can create a scenario where node 0 has cpus, but no associated memory. The system will boot fine in a configuration where node 0 has memory, but nodes 2 and 3 do not. [AK: I rechecked the code and generic code seems to indeed handle that already. Dan's original patch had a change for mm/slab.c that seems to be already in now.] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] x86_64: Fix memory hotadd heuristicsAndi Kleen2006-05-161-4/+11
| | | | | | | | | | | | | | | | | This fixes some boot failures on Dell and Unisys systems with memory hotadd added. - Set hotadd_percent to 0 by default. This means anybody using hotadd memory needs to specify the value on the command line. That's because there are lots of Intel boxes which have a bogus hotplug area in their SRAT and they would waste a lot of memory before. - Fix calculation of how much memory to use when the hotplug area exceeds hotadd_percent - Fix fallback when the - Fix fallback if memory hotadd is not compiled in. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] x86_64: sparsemem does not need node_mem_mapAndy Whitcroft2006-04-221-0/+2
| | | | | | | | | | | | | | | Seems we are trying to init the node_mem_map when we don't need to, for example when SPARSEMEM is enabled. This causes the error below during compilation. Use CONFIG_FLAT_NODE_MEM_MAP to gate allocation and init. arch/x86_64/mm/numa.c: In function `setup_node_zones': arch/x86_64/mm/numa.c:191: error: structure has no member named `node_mem_map' Signed-off-by: Andy Whitcroft <apw@shadowen.org> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
OpenPOWER on IntegriCloud