summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/mm
Commit message (Collapse)AuthorAgeFilesLines
...
* [POWERPC] Introduce lowmem_end_addr to distinguish from total_lowmemKumar Gala2008-04-175-9/+16
| | | | | | | | | | | | | | total_lowmem represents the amount of low memory, not the physical address that low memory ends at. If the start of memory is at 0 it happens that total_lowmem can be used as both the size and the address that lowmem ends at (or more specifically one byte beyond the end). To make the code a bit more clear and deal with the case when the start of memory isn't at physical 0, we introduce lowmem_end_addr that represents one byte beyond the last physical address in the lowmem region. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] Remove and replace uses of PPC_MEMSTART with memstart_addrKumar Gala2008-04-176-22/+15
| | | | | | | | | | | | | | | | A number of users of PPC_MEMSTART (40x, ppc_mmu_32) can just always use 0 as we don't support booting these kernels at non-zero physical addresses since their exception vectors must be at 0 (or 0xfffx_xxxx). For the sub-arches that support relocatable interrupt vectors (book-e), it's reasonable to have memory start at a non-zero physical address. For those cases use the variable memstart_addr instead of the #define PPC_MEMSTART since the only uses of PPC_MEMSTART are for initialization and in the future we can set memstart_addr at runtime to have a relocatable kernel. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* Merge branch 'linux-2.6'Paul Mackerras2008-04-141-2/+6
|\
| * [POWERPC] Fix deadlock with mmu_hash_lock in hash_page_syncBenjamin Herrenschmidt2008-04-031-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | hash_page_sync() takes and releases the low level mmu hash lock in order to sync with other processors disposing of page tables. Because that lock can be needed to service hash misses triggered by interrupt handlers, taking it must be done with interrupts off. However, hash_page_sync() appears to be called with interrupts enabled, thus causing occasional deadlocks. We fix it by making sure hash_page_sync() masks interrupts while holding the lock. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] htab_remove_mapping is only used by MEMORY_HOTPLUGStephen Rothwell2008-04-071-0/+2
| | | | | | | | | | | | | | | | This eliminates a warning in builds that don't define CONFIG_MEMORY_HOTPLUG. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Remove redundant display of free swap space in show_mem()Johannes Weiner2008-04-011-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | show_mem() has no need to print the amount of free swap space manually because show_free_areas() does this already and is called by the former. The two outputs only differ in text formatting: printk("Free swap = %lukB\n", ...); printk("Free swap: %6ldkB\n", ...); Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Replace remaining __FUNCTION__ occurrencesHarvey Harrison2008-04-012-2/+2
| | | | | | | | | | | | | | | | | | __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] arch_add_memory() cannot be __devinitGeert Uytterhoeven2008-04-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | WARNING: vmlinux.o(.text+0xb41b0): Section mismatch in reference from the function .add_memory() to the function .devinit.text:.arch_add_memory() The function .add_memory() references the function __devinit .arch_add_memory(). This is often because .add_memory lacks a __devinit annotation or the annotation of .arch_add_memory is wrong. arch_add_memory() is also not __devinit on other architectures Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Add error return from htab_remove_mapping()Badari Pulavarty2008-04-011-5/+9
| | | | | | | | | | | | | | | | If the platform doesn't support hpte_removebolted(), gracefully return failure rather than success. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | Merge branch 'linux-2.6'Paul Mackerras2008-03-262-3/+14
|\ \ | |/
| * [POWERPC] Don't use 64k pages for ioremap on pSeriesPaul Mackerras2008-03-241-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | On pSeries, the hypervisor doesn't let us map in the eHEA ethernet adapter using 64k pages, and thus the ehea driver will fail if 64k pages are configured. This works around the problem by always using 4k pages for ioremap on pSeries (but not on other platforms). A better fix would be to check whether the partition could ever have an eHEA adapter, and only force 4k pages if it could, but this will do for 2.6.25. This is based on an earlier patch by Tony Breeds. Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [POWERPC] Fix PMU + soft interrupt disable bugAnton Blanchard2008-03-201-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the PMU is an NMI now, it can come at any time we are only soft disabled. We must hard disable around the two places we allow the kernel stack SLB and r1 to go out of sync. Otherwise the PMU exception can force a kernel stack SLB into another slot, which can lead to it getting evicted, which can lead to a nasty unrecoverable SLB miss in the exception entry code. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | Merge branch 'linux-2.6'Paul Mackerras2008-03-131-2/+2
|\ \ | |/
| * [POWERPC] Fix large hash table allocation on Cell bladesMichael Ellerman2008-03-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | My recent hack to allocate the hash table under 1GB on cell was poorly tested, *cough*. It turns out on blades with large amounts of memory we fail to allocate the hash table at all. This is because RTAS has been instantiated just below 768MB, and 0-x MB are used by the kernel, leaving no areas that are both large enough and also naturally-aligned. For the cell IOMMU hack the page tables must be under 2GB, so use that as the limit instead. This has been tested on real hardware and boots happily. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Add code for removing HPTEs for parts of the linear mappingBadari Pulavarty2008-02-261-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | For memory remove, we need to clean up htab mappings for the section of the memory we are removing. This implements support for removing htab bolted mappings for pSeries logical partitions. Other sub-archs may need to implement similar functionality for hotplug memory remove to work on them. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [LIB]: Make PowerPC LMB code generic so sparc64 can use it too.David S. Miller2008-02-139-365/+12
|/ | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'for-2.6.25' of ↵Linus Torvalds2008-02-081-0/+33
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Add arch-specific walk_memory_remove() for 64-bit powerpc [POWERPC] Enable hotplug memory remove for 64-bit powerpc [POWERPC] Add remove_memory() for 64-bit powerpc [POWERPC] Make cell IOMMU fixed mapping printk more useful [POWERPC] Fix potential cell IOMMU bug when switching back to default DMA ops [POWERPC] Don't enable cell IOMMU fixed mapping if there are no dma-ranges [POWERPC] Fix cell IOMMU null pointer explosion on old firmwares [POWERPC] spufs: Fix timing dependent false return from spufs_run_spu [POWERPC] spufs: No need to have a runnable SPU for libassist update [POWERPC] spufs: Update SPU_Status[CISHP] in backing runcntl write [POWERPC] spufs: Fix state_mutex leaks [POWERPC] Disable G5 NAP mode during SMU commands on U3
| * [POWERPC] Add arch-specific walk_memory_remove() for 64-bit powerpcBadari Pulavarty2008-02-081-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | walk_memory_resource() verifies if there are holes in a given memory range, by checking against /proc/iomem. On x86/ia64 system memory is represented in /proc/iomem. On powerpc, we don't show system memory as IO resource in /proc/iomem - instead it's maintained in /proc/device-tree. This provides a way for an architecture to provide its own walk_memory_resource() function. On powerpc, the memory region is small (16MB), contiguous and non-overlapping. So extra checking against the device-tree is not needed. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@gate.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [POWERPC] Add remove_memory() for 64-bit powerpcBadari Pulavarty2008-02-081-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | Supply remove_memory() function for 64-bit powerpc. This is still not quite complete as it needs to do some more arch-specific stuff, which will be added in a later patch. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | CONFIG_HIGHPTE vs. sub-page page tables.Martin Schwidefsky2008-02-081-6/+8
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Background: I've implemented 1K/2K page tables for s390. These sub-page page tables are required to properly support the s390 virtualization instruction with KVM. The SIE instruction requires that the page tables have 256 page table entries (pte) followed by 256 page status table entries (pgste). The pgstes are only required if the process is using the SIE instruction. The pgstes are updated by the hardware and by the hypervisor for a number of reasons, one of them is dirty and reference bit tracking. To avoid wasting memory the standard pte table allocation should return 1K/2K (31/64 bit) and 2K/4K if the process is using SIE. Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means the s390 version for pte_alloc_one cannot return a pointer to a struct page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one cannot return a pointer to a pte either, since that would require more than 32 bit for the return value of pte_alloc_one (and the pte * would not be accessible since its not kmapped). Solution: The only solution I found to this dilemma is a new typedef: a pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced with a later patch. For everybody else it will be a (struct page *). The additional problem with the initialization of the ptl lock and the NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and a destructor pgtable_page_dtor. The page table allocation and free functions need to call these two whenever a page table page is allocated or freed. pmd_populate will get a pgtable_t instead of a struct page pointer. To get the pgtable_t back from a pmd entry that has been installed with pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page call in free_pte_range and apply_to_pte_range. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-2.6.25' of ↵Linus Torvalds2008-02-072-4/+69
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (69 commits) [POWERPC] Add SPE registers to core dumps [POWERPC] Use regset code for compat PTRACE_*REGS* calls [POWERPC] Use generic compat_sys_ptrace [POWERPC] Use generic compat_ptrace_request [POWERPC] Use generic ptrace peekdata/pokedata [POWERPC] Use regset code for PTRACE_*REGS* requests [POWERPC] Switch to generic compat_binfmt_elf code [POWERPC] Switch to using user_regset-based core dumps [POWERPC] Add user_regset compat support [POWERPC] Add user_regset_view definitions [POWERPC] Use user_regset accessors for GPRs [POWERPC] ptrace accessors for special regs MSR and TRAP [POWERPC] Use user_regset accessors for SPE regs [POWERPC] Use user_regset accessors for altivec regs [POWERPC] Use user_regset accessors for FP regs [POWERPC] mpc52xx: fix compile error introduce when rebasing patch [POWERPC] 4xx: PCIe indirect DCR spinlock fix. [POWERPC] Add missing native dcr dcr_ind_lock spinlock [POWERPC] 4xx: Fix offset value on Warp board [POWERPC] 4xx: Add 440EPx Sequoia ehci dts entry ...
| * [POWERPC] Fake NUMA emulation for PowerPCBalbir Singh2008-02-071-3/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here's a dumb simple implementation of fake NUMA nodes for PowerPC. Fake NUMA nodes can be specified using the following command line option numa=fake=<node range> node range is of the format <range1>,<range2>,...<rangeN> Each of the rangeX parameters is passed using memparse(). I find the patch useful for fake NUMA emulation on my simple PowerPC machine. I've tested it on a numa box with the following arguments numa=fake=512M numa=fake=512M,768M numa=fake=256M,512M mem=512M numa=fake=1G mem=768M numa=fake= without any numa= argument The other side-effect introduced by this patch is that; in the case where we don't have NUMA information, we now set a node online after adding each LMB. This node could very well be node 0, but in the case that we enable fake NUMA nodes, when we cross node boundaries, we need to set the new node online. Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [POWERPC] update_mmu_cache: Don't cache-flush non-readable pagesScott Wood2008-02-061-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, update_mmu_cache will crash if given a no-access PTE. There's no need to synchronize dcache/icache unless it's an exec mapping -- however, due to the existence of older glibc versions that execute out of a read-but-no-exec page, readability is tested instead. This assumes no exec-only mappings; if such mappings become supported, they will need to go through the kmap_atomic() version of dcache/icache synchronization. This fixes a bug reported by some users where the kernel would crash while dumping core on a threaded program. Signed-off-by: Scott Wood <scottwood@freescale.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | Introduce flags for reserve_bootmem()Bernhard Walle2008-02-072-4/+6
|/ | | | | | | | | | | | | | | | | | | | | | | | This patchset adds a flags variable to reserve_bootmem() and uses the BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions between crashkernel area and already used memory. This patch: Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE. If that flag is set, the function returns with -EBUSY if the memory already has been reserved in the past. This is to avoid conflicts. Because that code runs before SMP initialisation, there's no race condition inside reserve_bootmem_core(). [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix powerpc build] Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: <linux-arch@vger.kernel.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* add mm argument to pte/pmd/pud/pgd_freeBenjamin Herrenschmidt2008-02-051-3/+3
| | | | | | | | | | | | | | | | | | (with Martin Schwidefsky <schwidefsky@de.ibm.com>) The pgd/pud/pmd/pte page table allocation functions get a mm_struct pointer as first argument. The free functions do not get the mm_struct argument. This is 1) asymmetrical and 2) to do mm related page table allocations the mm argument is needed on the free function as well. [kamalesh@linux.vnet.ibm.com: i386 fix] [akpm@linux-foundation.org: coding-syle fixes] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [POWERPC] Allocate the hash table under 1G on cellMichael Ellerman2008-01-311-3/+9
| | | | | | | | | | | | In order to support the fixed IOMMU mapping (in a subsequent patch), we need the hash table to be inside the IOMMUs DMA window. This is usually 2G, but let's make sure the hash table is under 1G as that will satisfy the IOMMU requirements and also means the hash table will be on node 0. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
* Revert "[POWERPC] Fake NUMA emulation for PowerPC"Paul Mackerras2008-01-261-54/+5
| | | | | | | | | | | | | | | | | | This reverts commit 5c3f5892a2db6757a72ce8b27cba90db06683e1d, basically because it changes behaviour even when no fake NUMA information is specified on the kernel command line. Firstly, it changes the nid, thus destroying the real NUMA information. Secondly, it also changes behaviour in that if a node ends up with no memory in it because of the memory limit, we used to set it online and now we don't. Also, in the non-NUMA case with no fake NUMA information, we do node_set_online once for each LMB now, whereas previously we only did it once. I don't know if that is actually a problem, but it does seem unnecessary. Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] Make setjmp/longjmp code usable outside of xmonMichael Neuling2008-01-251-4/+2
| | | | | | | | | This makes the setjmp/longjmp code used by xmon, generically available to other code. It also removes the requirement for debugger hooks to be only called on 0x300 (data storage) exception. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* Merge branch 'for-2.6.25' of ↵Paul Mackerras2008-01-243-5/+35
|\ | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/galak/powerpc into for-2.6.25
| * [POWERPC] 85xx: Respect KERNELBASE, PAGE_OFFSET, and PHYSICAL_START on e500Dale Farnsworth2008-01-231-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | The e500 MMU init code previously assumed KERNELBASE always equaled PAGE_OFFSET and PHYSICAL_START was 0. This is useful for kdump support as well as asymetric multicore. For the initial kdump support the secondary kernel will run at 32M but need access to all of memory so we bump the initial TLB up to 64M. This also matches with the forth coming ePAPR spec. Signed-off-by: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| * [POWERPC] Fix handling of memreserve if the range lands in highmemKumar Gala2008-01-232-2/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There were several issues if a memreserve range existed and happened to be in highmem: * The bootmem allocator is only aware of lowmem so calling reserve_bootmem with a highmem address would cause a BUG_ON * All highmem pages were provided to the buddy allocator Added a lmb_is_reserved() api that we now use to determine if a highem page should continue to be PageReserved or provided to the buddy allocator. Also, we incorrectly reported the amount of pages reserved since all highmem pages are initally marked reserved and we clear the PageReserved flag as we "free" up the highmem pages. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* | Merge branch 'linux-2.6'Paul Mackerras2008-01-241-0/+2
|\ \
| * | [POWERPC] Fix boot failure on POWER6Paul Mackerras2008-01-151-8/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 473980a99316c0e788bca50996375a2815124ce1 added a call to clear the SLB shadow buffer before registering it. Unfortunately this means that we clear out the entries that slb_initialize has previously set in there. On POWER6, the hypervisor uses the SLB shadow buffer when doing partition switches, and that means that after the next partition switch, each non-boot CPU has no SLB entries to map the kernel text and data, which causes it to crash. This fixes it by reverting most of 473980a9 and instead clearing the 3rd entry explicitly in slb_initialize. This fixes the problem that 473980a9 was trying to solve, but without breaking POWER6. Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | [POWERPC] Fix CPU hotplug when using the SLB shadow bufferMichael Neuling2008-01-111-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before we register the SLB shadow buffer, we need to invalidate the entries in the buffer, otherwise we can end up stale entries from when we previously offlined the CPU. This does this invalidate as well as unregistering the buffer with PHYP before we offline the cpu. Tested and fixes crashes seen on 970MP (thanks to tonyb) and POWER5. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | | [POWERPC] Provide a way to protect 4k subpages when using 64k pagesPaul Mackerras2008-01-244-18/+295
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using 64k pages on 64-bit PowerPC systems makes life difficult for emulators that are trying to emulate an ISA, such as x86, which use a smaller page size, since the emulator can no longer use the MMU and the normal system calls for controlling page protections. Of course, the emulator can emulate the MMU by checking and possibly remapping the address for each memory access in software, but that is pretty slow. This provides a facility for such programs to control the access permissions on individual 4k sub-pages of 64k pages. The idea is that the emulator supplies an array of protection masks to apply to a specified range of virtual addresses. These masks are applied at the level where hardware PTEs are inserted into the hardware page table based on the Linux PTEs, so the Linux PTEs are not affected. Note that this new mechanism does not allow any access that would otherwise be prohibited; it can only prohibit accesses that would otherwise be allowed. This new facility is only available on 64-bit PowerPC and only when the kernel is configured for 64k pages. The masks are supplied using a new subpage_prot system call, which takes a starting virtual address and length, and a pointer to an array of protection masks in memory. The array has a 32-bit word per 64k page to be protected; each 32-bit word consists of 16 2-bit fields, for which 0 allows any access (that is otherwise allowed), 1 prevents write accesses, and 2 or 3 prevent any access. Implicit in this is that the regions of the address space that are protected are switched to use 4k hardware pages rather than 64k hardware pages (on machines with hardware 64k page support). In fact the whole process is switched to use 4k hardware pages when the subpage_prot system call is used, but this could be improved in future to switch only the affected segments. The subpage protection bits are stored in a 3 level tree akin to the page table tree. The top level of this tree is stored in a structure that is appended to the top level of the page table tree, i.e., the pgd array. Since it will often only be 32-bit addresses (below 4GB) that are protected, the pointers to the first four bottom level pages are also stored in this structure (each bottom level page contains the protection bits for 1GB of address space), so the protection bits for addresses below 4GB can be accessed with one fewer loads than those for higher addresses. Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Add hugepagesz boot-time parameterJon Tollefson2008-01-172-34/+96
| | | | | | | | | | | | | | | | | | | | | | | | This adds the hugepagesz boot-time parameter for ppc64. It lets one pick the size for huge pages. The choices available are 64K and 16M when the base page size is 4k. It defaults to 16M (previously the only only choice) if nothing or an invalid choice is specified. Tested 64K huge pages successfully with the libhugetlbfs 1.2. Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Fake NUMA emulation for PowerPCBalbir Singh2007-12-201-5/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here's a dumb simple implementation of fake NUMA nodes for PowerPC. Fake NUMA nodes can be specified using the following command line option numa=fake=<node range> node range is of the format <range1>,<range2>,...<rangeN> Each of the rangeX parameters is passed using memparse(). I find this useful for fake NUMA emulation on my simple PowerPC machine. I've tested it on a non-numa box with the following arguments: numa=fake=1G numa=fake=1G,2G name=fake=1G,512M,2G numa=fake=1500M,2800M mem=3500M numa=fake=1G mem=512M numa=fake=1G mem=1G Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Use SLB size from the device treeMichael Neuling2007-12-113-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently we hardwire the number of SLBs to 64, but PAPR says we should use the ibm,slb-size property to obtain the number of SLB entries. This uses this property instead of assuming 64. If no property is found, we assume 64 entries as before. This soft patches the SLB handler, so it shouldn't change performance at all. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Add missing spaces in printk formatsjoe@perches.com2007-12-031-1/+1
|/ | | | | Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] Fix 8xx build breakage due to _tlbie changesBenjamin Herrenschmidt2007-11-202-2/+2
| | | | | | | | | | | My changes to _tlbie to fix 4xx unfortunately broke 8xx build in a couple of places. This fixes it. Spotted by Olof Johansson. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Vitaly Bordug <vitb@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] Fix build failure on legacy iSeriesKamalesh Babulal2007-11-201-0/+1
| | | | | | | | | | | | | | | | | | Include <asm/iseries/hv_call.h> in arch/powerpc/mm/stab.c to fix the following compile error (found with randconfig): CC arch/powerpc/mm/stab.o arch/powerpc/mm/stab.c: In function "stab_initialize": arch/powerpc/mm/stab.c:282: error: implicit declaration of function "HvCall1" arch/powerpc/mm/stab.c:282: error: "HvCallBaseSetASR" undeclared (first use in this function) arch/powerpc/mm/stab.c:282: error: (Each undeclared identifier is reported only once arch/powerpc/mm/stab.c:282: error: for each function it appears in.) make[1]: *** [arch/powerpc/mm/stab.o] Error 1 make: *** [arch/powerpc/mm] Error 2 Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] Silence an annoying boot messageStephen Rothwell2007-11-131-12/+4
| | | | | | | | | | | | vmemmap_populate will printk (with KERN_WARNING) for a lot of pages if CONFIG_SPARSEMEM_VMEMMAP is enabled (at least it does on iSeries). Use pr_debug for it instead. Replace the only other use of DBG in this file with pr_debug as well. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] Fix CONFIG_SMP=n build error on ppc64Olof Johansson2007-11-131-2/+0
| | | | | | | | | | | | | | | | | | | | The patch "KVM: fix !SMP build error" change the way smp_call_function() actually uses the passed in function names on non-SMP builds. So previously it was never caught that the function passed in was never actually defined. This causes a build error on ppc64_defconfig + CONFIG_SMP=n: arch/powerpc/mm/tlb_64.c: In function 'pgtable_free_now': arch/powerpc/mm/tlb_64.c:71: error: 'pte_free_smp_sync' undeclared (first use in this function) arch/powerpc/mm/tlb_64.c:71: error: (Each undeclared identifier is reported only once arch/powerpc/mm/tlb_64.c:71: error: for each function it appears in.) So we need to define it even if CONFIG_SMP is off. Either that or ifdef out the smp_call_function() call, but that's ugly. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
* Merge branch 'for-2.6.24' of ↵Paul Mackerras2007-11-084-12/+12
|\ | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jwboyer/powerpc-4xx into merge
| * [POWERPC] ppc405 Fix arithmatic rollover bug when memory size under 16MGrant Likely2007-11-011-9/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mmu_mapin_ram() loops over total_lowmem to setup page tables. However, if total_lowmem is less that 16M, the subtraction rolls over and results in a number just under 4G (because total_lowmem is an unsigned value). This patch rejigs the loop from countup to countdown to eliminate the bug. Special thanks to Magnus Hjorth who wrote the original patch to fix this bug. This patch improves on his by making the loop code simpler (which also eliminates the possibility of another rollover at the high end) and also applies the change to arch/powerpc. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
| * [POWERPC] 4xx: Deal with 44x virtually tagged icacheBenjamin Herrenschmidt2007-11-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 44x family has an interesting "feature" which is a virtually tagged instruction cache (yuck !). So far, we haven't dealt with it properly, which means we've been mostly lucky or people didn't report the problems, unless people have been running custom patches in their distro... This is an attempt at fixing it properly. I chose to do it by setting a global flag whenever we change a PTE that was previously marked executable, and flush the entire instruction cache upon return to user space when that happens. This is a bit heavy handed, but it's hard to do more fine grained flushes as the icbi instruction, on those processor, for some very strange reasons (since the cache is virtually mapped) still requires a valid TLB entry for reading in the target address space, which isn't something I want to deal with. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
| * [POWERPC] 4xx: Fix 4xx flush_tlb_page()Benjamin Herrenschmidt2007-11-012-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 4xx CPUs, the current implementation of flush_tlb_page() uses a low level _tlbie() assembly function that only works for the current PID. Thus, invalidations caused by, for example, a COW fault triggered by get_user_pages() from a different context will not work properly, causing among other things, gdb breakpoints to fail. This patch adds a "pid" argument to _tlbie() on 4xx processors, and uses it to flush entries in the right context. FSL BookE also gets the argument but it seems they don't need it (their tlbivax form ignores the PID when invalidating according to the document I have). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
* | [POWERPC] Fix switch_slb handling of 1T ESID valueswill schmidt2007-11-081-3/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have 1TB segment size support, we need to be using the GET_ESID_1T macro when comparing ESID values for pc, stack, and unmapped_base within switch_slb(). A new helper function called esids_match() contains the logic for deciding when to call GET_ESID and GET_ESID_1T. This fixes a duplicate-slb-entry inspired machine-check exception I was seeing when trying to run java on a power6 partition. Tested on power6 and power5. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Include udbg.h when using udbg_printfwill schmidt2007-11-082-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | This fixes the error error: implicit declaration of function "udbg_printf" We have a few spots where we reference udbg_printf() without #including udbg.h. These are within #ifdef DEBUG blocks, so unnoticed until we do a #define DEBUG or #define DEBUG_LOW nearby. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] powerpc: Fix demotion of segments to 4K pagesBenjamin Herrenschmidt2007-10-292-4/+7
|/ | | | | | | | | | | | | | | | | | When demoting a process to use 4K HW pages (instead of 64K), which happens under various circumstances such as doing cache inhibited mappings on machines that do not support 64K CI pages, the assembly hash code calls back into the C function flush_hash_page(). This function prototype was recently changed to accomodate for 1T segments but the assembly call site was not updated, causing applications that do demotion to hang. In addition, when updating the per-CPU PACA for the new sizes, we didn't properly update the slice "map", thus causing the SLB miss code to re-insert segments for the wrong size. This fixes both and adds a warning comment next to the C implementation to try to avoid problems next time someone changes it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
OpenPOWER on IntegriCloud