summaryrefslogtreecommitdiffstats
path: root/include/asm-powerpc
Commit message (Collapse)AuthorAgeFilesLines
* [PATCH] scheduler cache-hot-autodetectakpm@osdl.org2006-01-121-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ) From: Ingo Molnar <mingo@elte.hu> This is the latest version of the scheduler cache-hot-auto-tune patch. The first problem was that detection time scaled with O(N^2), which is unacceptable on larger SMP and NUMA systems. To solve this: - I've added a 'domain distance' function, which is used to cache measurement results. Each distance is only measured once. This means that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT distances 0 and 1, and on SMP distance 0 is measured. The code walks the domain tree to determine the distance, so it automatically follows whatever hierarchy an architecture sets up. This cuts down on the boot time significantly and removes the O(N^2) limit. The only assumption is that migration costs can be expressed as a function of domain distance - this covers the overwhelming majority of existing systems, and is a good guess even for more assymetric systems. [ People hacking systems that have assymetries that break this assumption (e.g. different CPU speeds) should experiment a bit with the cpu_distance() function. Adding a ->migration_distance factor to the domain structure would be one possible solution - but lets first see the problem systems, if they exist at all. Lets not overdesign. ] Another problem was that only a single cache-size was used for measuring the cost of migration, and most architectures didnt set that variable up. Furthermore, a single cache-size does not fit NUMA hierarchies with L3 caches and does not fit HT setups, where different CPUs will often have different 'effective cache sizes'. To solve this problem: - Instead of relying on a single cache-size provided by the platform and sticking to it, the code now auto-detects the 'effective migration cost' between two measured CPUs, via iterating through a wide range of cachesizes. The code searches for the maximum migration cost, which occurs when the working set of the test-workload falls just below the 'effective cache size'. I.e. real-life optimized search is done for the maximum migration cost, between two real CPUs. This, amongst other things, has the positive effect hat if e.g. two CPUs share a L2/L3 cache, a different (and accurate) migration cost will be found than between two CPUs on the same system that dont share any caches. (The reliable measurement of migration costs is tricky - see the source for details.) Furthermore i've added various boot-time options to override/tune migration behavior. Firstly, there's a blanket override for autodetection: migration_cost=1000,2000,3000 will override the depth 0/1/2 values with 1msec/2msec/3msec values. Secondly, there's a global factor that can be used to increase (or decrease) the autodetected values: migration_factor=120 will increase the autodetected values by 20%. This option is useful to tune things in a workload-dependent way - e.g. if a workload is cache-insensitive then CPU utilization can be maximized by specifying migration_factor=0. I've tested the autodetection code quite extensively on x86, on 3 P3/Xeon/2MB, and the autodetected values look pretty good: Dual Celeron (128K L2 cache): --------------------- migration cost matrix (max_cache_size: 131072, cpu: 467 MHz): --------------------- [00] [01] [00]: - 1.7(1) [01]: 1.7(1) - --------------------- cacheflush times [2]: 0.0 (0) 1.7 (1784008) --------------------- Here the slow memory subsystem dominates system performance, and even though caches are small, the migration cost is 1.7 msecs. Dual HT P4 (512K L2 cache): --------------------- migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz): --------------------- [00] [01] [02] [03] [00]: - 0.4(1) 0.0(0) 0.4(1) [01]: 0.4(1) - 0.4(1) 0.0(0) [02]: 0.0(0) 0.4(1) - 0.4(1) [03]: 0.4(1) 0.0(0) 0.4(1) - --------------------- cacheflush times [2]: 0.0 (33900) 0.4 (448514) --------------------- Here it can be seen that there is no migration cost between two HT siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory system makes inter-physical-CPU migration pretty cheap: 0.4 msecs. 8-way P3/Xeon [2MB L2 cache]: --------------------- migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz): --------------------- [00] [01] [02] [03] [04] [05] [06] [07] [00]: - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) [01]: 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) [02]: 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) [03]: 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) [04]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) [05]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) [06]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) [07]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - --------------------- cacheflush times [2]: 0.0 (0) 19.2 (19281756) --------------------- This one has huge caches and a relatively slow memory subsystem - so the migration cost is 19 msecs. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Cc: <wilder@us.ibm.com> Signed-off-by: John Hawkes <hawkes@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] sched: add cacheflush() asmIngo Molnar2006-01-121-0/+10
| | | | | | | | | | Add per-arch sched_cacheflush() which is a write-back cacheflush used by the migration-cost calibration code at bootup time. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-mergeLinus Torvalds2006-01-116-5/+92
|\
| * powerpc/32: Fix compile error caused by pud_t/pgt_t confusionPaul Mackerras2006-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | PPC32 is still using asm-generic/4level-fixup.h, but asm-powerpc/page.h was defining pud_t and pgd_t. Depending on the order in which files got included, this could result in a compilation error. Tweak the ifdef so that page.h doesn't try to define pud_t on ppc32 (which uses 2-level page tables). Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc/64: per cpu data optimisationsAnton Blanchard2006-01-112-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current ppc64 per cpu data implementation is quite slow. eg: lhz 11,18(13) /* smp_processor_id() */ ld 9,.LC63-.LCTOC1(30) /* per_cpu__variable_name */ ld 8,.LC61-.LCTOC1(30) /* __per_cpu_offset */ sldi 11,11,3 /* form index into __per_cpu_offset */ mr 10,9 ldx 9,11,8 /* __per_cpu_offset[smp_processor_id()] */ ldx 0,10,9 /* load per cpu data */ 5 loads for something that is supposed to be fast, pretty awful. One reason for the large number of loads is that we have to synthesize 2 64bit constants (per_cpu__variable_name and __per_cpu_offset). By putting __per_cpu_offset into the paca we can avoid the 2 loads associated with it: ld 11,56(13) /* paca->data_offset */ ld 9,.LC59-.LCTOC1(30) /* per_cpu__variable_name */ ldx 0,9,11 /* load per cpu data Longer term we can should be able to do even better than 3 loads. If per_cpu__variable_name wasnt a 64bit constant and paca->data_offset was in a register we could cut it down to one load. A suggestion from Rusty is to use gcc's __thread extension here. In order to do this we would need to free up r13 (the __thread register and where the paca currently is). So far Ive had a few unsuccessful attempts at doing that :) The patch also allocates per cpu memory node local on NUMA machines. This patch from Rusty has been sitting in my queue _forever_ but stalled when I hit the compiler bug. Sorry about that. Finally I also only allocate per cpu data for possible cpus, which comes straight out of the x86-64 port. On a pseries kernel (with NR_CPUS == 128) and 4 possible cpus we see some nice gains: total used free shared buffers cached Mem: 4012228 212860 3799368 0 0 162424 total used free shared buffers cached Mem: 4016200 212984 3803216 0 0 162424 A saving of 3.75MB. Quite nice for smaller machines. Note: we now have to be careful of per cpu users that touch data for !possible cpus. At this stage it might be worth making the NUMA and possible cpu optimisations generic, but per cpu init is done so early we have to be careful that all architectures have their possible map setup correctly. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: parallel port init fixMichael Neuling2006-01-111-2/+26
| | | | | | | | | | | | | | This stops parport from accessing nonexistent parallel ports. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: Make early debugging configurable via KconfigMichael Ellerman2006-01-112-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds Kconfig entries to control the early debugging options, currently in setup_64.c. Doing this via Kconfig rather than #defines means you can have one source tree, which is buildable for multiple platforms - and you can enable the correct early debug option for each platform via .config. I made udbg_early_init() a static inline because otherwise GCC is to daft to optimise it away when debugging is off. Now that we have udbg_init_rtas() we can make call_rtas_display_status* static. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | asm-powerpc: header included twiceNicolas Kaiser2006-01-111-1/+0
|/ | | | | | | Header included twice. Signed-off-by: Nicolas Kaiser <nikai@nikai.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-mergeLinus Torvalds2006-01-105-10/+41
|\
| * powerpc: Introduce a new config symbol to control 16550 early debug codePaul Mackerras2006-01-101-0/+4
| | | | | | | | | | | | | | | | | | | | The previous change by Kumar Gala in this area led to legacy_serial.c and udbg_16550.c being built as modules when CONFIG_SERIAL_8250=m. Fix this by introducing a new symbol, CONFIG_PPC_UDBG_16550, to control whether these files get built, and arrange for it to be selected for those platforms that need it. Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: Save device BARs much earlier in the boot sequenceLinas Vepstas2006-01-102-6/+5
| | | | | | | | | | | | | | | | | | | | 241-eeh-save-bars-earlier.patch Save the PCI device bars *before* any PCI probing is done. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from 76c902b919098860f3d4e125f847abcc4cb1782a commit)
| * [PATCH] powerpc: Don't continue with PCI Error recovery if slot reset failed.Linas Vepstas2006-01-101-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | 238-eeh-stop-if-reset_failed.patch If the firmware is unable to reset the PCI slot for some reason, then don't attempt any further recovery steps after that point. Instead, mark the device as permanently failed. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from e06b942521eb2cdaf232726f45a820d5837acb12 commit)
| * [PATCH] powerpc: Remove duplicate codeLinas Vepstas2006-01-101-0/+3
| | | | | | | | | | | | | | | | | | | | | | 234-eeh-find-pe.patch The find_device_pe() routine is duplicated in two files. Remove one of the two copies, declare the other extern. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from 48408e708282d4d0269136ff27ea5acbd9410b5a commit)
| * [PATCH] powerpc: Add "partitionable endpoint" supportLinas Vepstas2006-01-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | 26-eeh-partition-endpoint.patch New versions of firmware introduce a new method by which the "partitionable endpoint" (the point at which the pci bus is cut) should be located. This code adds the support for this (mandatory) new feature. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from 9fcfb5d35b5294659f9299aa9cae6fd16325c07e commit)
| * [PATCH] powerpc: Split out PCI address cache to its own fileLinas Vepstas2006-01-101-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 25-pci-address-cache.patch The core EEH file is rather large. This patch splits out a self-contained chunk of it into its own file. This is the chunk that performes the caching and lookup of pci devices based on the i/o addresses of thier resoures. This code is almos architecture-independent and could be used by any system that wanted to find a pci device based only on the i/o address used by the device. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from b0b291d59906d4a9a89ed9e34d9fd684c7188924 commit)
| * [PATCH] powerpc: PCI Error Recovery: PPC64 core recovery routinesLinas Vepstas2006-01-103-5/+19
| | | | | | | | | | | | | | | | | | | | Various PCI bus errors can be signaled by newer PCI controllers. The core error recovery routines are architecture dependent. This patch adds a recovery infrastructure for the PPC64 pSeries systems. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> (cherry picked from e8ca11b460c4c9c7fa6b529be221529ebd770e38 commit)
* | [PATCH] kprobes: fix build breakageAnanth N Mavinakayanahalli2006-01-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | The following patch (against 2.6.15-rc5-mm3) fixes a kprobes build break due to changes introduced in the kprobe locking in 2.6.15-rc5-mm3. In addition, the patch reverts back the open-coding of kprobe_mutex. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] kprobes: arch_remove_kprobeAnil S Keshavamurthy2006-01-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently arch_remove_kprobes() is only implemented/required for x86_64 and powerpc. All other architecture like IA64, i386 and sparc64 implementes a dummy function which is being called from arch independent kprobes.c file. This patch removes the dummy functions and replaces it with #define arch_remove_kprobe(p, s) do { } while(0) Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] kprobes: cleanup include/asm/kprobes.hAnil S Keshavamurthy2006-01-101-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The arch specific kprobes.h files never gets included when CONFIG_KPROBES is turned off. Hence check for CONFIG_KPROBES is not appropriate here in this arch specific kprobes.h files. Also the below defined function kprobes_exception_notify() is not needed when CONFIG_KPROBES is off. Compile tested for both CONFIG_KPROBES=y and N. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] kprobes: enable funcions only for required archAnil S Keshavamurthy2006-01-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kernel/kprobes.c defines get_insn_slot() and free_insn_slot() which are currently required _only_ for x86_64 and powerpc (which has no-exec support). FYI, get{free}_insn_slot() functions manages the memory page which is mapped as executable, required for instruction emulation. This patch moves those two functions under __ARCH_WANT_KPROBES_INSN_SLOT and defines __ARCH_WANT_KPROBES_INSN_SLOT in arch specific kprobes.h file. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] Kdump: powerpc and s390 build failure fixakpm@osdl.org2006-01-101-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ) From: Vivek Goyal <vgoyal@in.ibm.com> crash_setup_regs() is an architecture dependent function which is called in architecture independent section. So every architecture supporting kexec should at least provide a dummy definition of crash_setup_regs() even if crash dumping is not implemented yet, to avoid build failures. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] kdump: dynamic per cpu allocation of memory for saving cpu registersVivek Goyal2006-01-101-3/+0
|/ | | | | | | | | | | | | | | | | | | | | - In case of system crash, current state of cpu registers is saved in memory in elf note format. So far memory for storing elf notes was being allocated statically for NR_CPUS. - This patch introduces dynamic allocation of memory for storing elf notes. It uses alloc_percpu() interface. This should lead to better memory usage. - Introduced based on Andi Kleen's and Eric W. Biederman's suggestions. - This patch also moves memory allocation for elf notes from architecture dependent portion to architecture independent portion. Now crash_notes is architecture independent. The whole idea is that size of memory to be allocated per cpu (MAX_NOTE_BYTES) can be architecture dependent and allocation of this memory can be architecture independent. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge master.kernel.org:/pub/scm/linux/kernel/git/mingo/mutex-2.6Linus Torvalds2006-01-092-0/+10
|\
| * [PATCH] mutex subsystem, add default include/asm-*/mutex.h filesArjan van de Ven2006-01-091-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | add the per-arch mutex.h files for the remaining architectures. We default to asm-generic/mutex-dec.h, because that performs quite well on most arches. Arches that do not have atomic decrement/increment instructions should switch to mutex-xchg.h instead. Arches can also provide their own implementation for the mutex fastpath primitives. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * [PATCH] mutex subsystem, add atomic_xchg() to all archesIngo Molnar2006-01-091-0/+1
| | | | | | | | | | | | | | add atomic_xchg() to all the architectures. Needed by the new mutex code. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@infradead.org>
* | spelling: s/retreive/retrieve/Adrian Bunk2006-01-101-3/+3
|/ | | | Signed-off-by: Adrian Bunk <bunk@stusta.de>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-mergeLinus Torvalds2006-01-0991-182/+2041
|\
| * [PATCH] ppc64: Fix oprofile when compiled as a moduleAnton Blanchard2006-01-091-2/+9
| | | | | | | | | | | | | | | | | | My recent changes to oprofile broke it when built as a module. Fix it by using an enum instead of a function pointer. This way we still retain the oprofile configuration in the cputable. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] 3/5 powerpc: Add platform functions interpreterBenjamin Herrenschmidt2006-01-093-0/+275
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the platform function interpreter itself along with the backends for UniN/U3/U4, mac-io, GPIOs and i2c. It adds the ability to execute those do-platform-* scripts in the device-tree (at least for most devices for which a backend is provided). This should replace the clock spreading hacks properly. It might also have an impact on all sort of machines since some of the scripts marked "at init" will now be executed on boot (or some other on sleep/wakeup), those will possibly do things that the kernel didn't do at all, like setting some values into some i2c devices (changing thermal sensor calibration or conversion rate) etc... Thus regression testing is MUCH welcome. Also loook for errors in dmesg. That's also why I've left rather verbose debugging enabled in this version of the patch. (I do expect some Windtunnel G4s to show some errors as they have an i2c clock chip on the PMU bus that uses some primitives that the i2c backend doesn't implement yet. I really need users that have one of those machine to come back to me so we can get that done right, though the errors themselves should be harmless, I suspect the machine might not run at full speed). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] 2/5 powerpc: Rework PowerMac i2c part 2Benjamin Herrenschmidt2006-01-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the continuation of the previous patch. This one removes the old PowerMac i2c drivers (i2c-keywest and i2c-pmac-smu) and replaces them both with a single stub driver that uses the new PowerMac low i2c layer. Now that i2c-keywest is gone, the low-i2c code is extended to support interrupt driver transfers. All i2c busses now appear as platform devices. Compatibility with existing drivers should be maintained as the i2c bus names have been kept identical, except for the SMU bus but in that later case, all users has been fixed. With that patch added, matching a device node to an i2c_adapter becomes trivial. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] 1/5 powerpc: Rework PowerMac i2c part 1Benjamin Herrenschmidt2006-01-093-24/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first part of a rework of the PowerMac i2c code. It completely reworks the "low_i2c" layer. It is now more flexible, supports KeyWest, SMU and PMU i2c busses, and provides functions to match device nodes to i2c busses and adapters. This patch also extends & fix some bugs in the SMU driver related to i2c support and removes the clock spreading hacks from the pmac feature code rather than adapting them to the new API since they'll be replaced by the platform function code completely in patch 3/5 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] spufs: set irq affinity for running threadsArnd Bergmann2006-01-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For far, all SPU triggered interrupts always end up on the first SMT thread, which is a bad solution. This patch implements setting the affinity to the CPU that was running last when entering execution on an SPU. This should result in a significant reduction in IPI calls and better cache locality for SPE thread specific data. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] spufs: fix allocation on 64k pagesArnd Bergmann2006-01-091-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | The size of the local store is architecture defined and independent from the page size, so it should not be defined in terms of pages in the first place. This mistake broke a few places when building for 64kb pages. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] spufs: abstract priv1 register access.Arnd Bergmann2006-01-091-6/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In a hypervisor based setup, direct access to the first priviledged register space can typically not be allowed to the kernel and has to be implemented through hypervisor calls. As suggested by Masato Noguchi, let's abstract the register access trough a number of function calls. Since there is currently no public specification of actual hypervisor calls to implement this, I only provide a place that makes it easier to hook into. Cc: Masato Noguchi <Masato.Noguchi@jp.sony.com> Cc: Geoff Levand <geoff.levand@am.sony.com> Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] spufs: clean up use of bitopsArnd Bergmann2006-01-091-4/+2
| | | | | | | | | | | | | | | | | | | | checking bits manually might not be synchonized with the use of set_bit/clear_bit. Make sure we always use the correct bitops by removing the unnecessary identifiers. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] cell: enable pause(0) in cpu_idleArnd Bergmann2006-01-093-7/+21
| | | | | | | | | | | | | | | | | | | | | | | | This patch enables support for pause(0) power management state for the Cell Broadband Processor, which is import for power efficient operation. The pervasive infrastructure will in the future enable us to introduce more functionality specific to the Cell's pervasive unit. From: Maximino Aguilar <maguilar@us.ibm.com> Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] Small fix in eeh definitions when CONFIG_EEH not enabledHaren Myneni2006-01-091-0/+3
| | | | | | | | | | | | | | | | | | Undefined symbols (eeh_add_device_tree_early and eeh_remove_bus_device) when EEH is not enabled. This small patch will fix this. Acked-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: added a udbg_progressKumar Gala2006-01-091-0/+1
| | | | | | | | | | | | | | Added a common udbg_progress for use by ppc_md.progress() Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * powerpc: Fix some #ifndef __KERNEL__ that should be #ifdefPaul Mackerras2006-01-093-3/+3
| | | | | | | | | | | | Grrr.... Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: sanitize header files for user space includesArnd Bergmann2006-01-0975-29/+183
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | include/asm-ppc/ had #ifdef __KERNEL__ in all header files that are not meant for use by user space, include/asm-powerpc does not have this yet. This patch gets us a lot closer there. There are a few cases where I was not sure, so I left them out. I have verified that no CONFIG_* symbols are used outside of __KERNEL__ any more and that there are no obvious compile errors when including any of the headers in user space libraries. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: G4+ oprofile supportAndy Fleming2006-01-092-16/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds oprofile support for the 7450 and all its multitudinous derivatives. * Added 7450 (and derivatives) support for oprofile * Changed e500 cputable to have oprofile model and cpu_type fields * Added support for classic 32-bit performance monitor interrupt * Cleaned up common powerpc oprofile code to be as common as possible * Cleaned up oprofile_impl.h to reflect 32 bit classic code * Added 32-bit MMCRx bitfield definitions and SPR numbers Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: pci_address_to_pio fixBenjamin Herrenschmidt2006-01-091-3/+3
| | | | | | | | | | | | | | | | | | This fixes pci_address_to_pio() to return an unsigned long (to be safe) and fixes a bug in the implementation that caused it to return a bogus IO port number Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: Replace VMALLOCBASE with VMALLOC_STARTDavid Gibson2006-01-092-10/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On ppc64, we independently define VMALLOCBASE and VMALLOC_START to be the same thing: the start of the vmalloc() area at 0xd000000000000000. VMALLOC_START is used much more widely, including in generic code, so this patch gets rid of the extraneous VMALLOCBASE. This does require moving the definitions of region IDs from page_64.h to pgtable.h, but they don't clearly belong in the former rather than the latter, anyway. While we're moving them, clean up the definitions of the REGION_IDs: - Abolish REGION_SIZE, it was only used once, to define REGION_MASK anyway - Define the specific region ids in terms of the REGION_ID() macro. - Define KERNEL_REGION_ID in terms of PAGE_OFFSET rather than KERNELBASE. It amounts to the same thing, but conceptually this is about the region of the linear mapping (which starts at PAGE_OFFSET) rather than of the kernel text itself (which is at KERNELBASE). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: Experimental support for new G5 Macs (#2)Benjamin Herrenschmidt2006-01-093-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | This adds some very basic support for the new machines, including the Quad G5 (tested), and other new dual core based machines and iMac G5 iSight (untested). This is still experimental ! There is no thermal control yet, there is no proper handing of MSIs, etc.. but it boots, I have all 4 cores up on my machine. Compared to the previous version of this patch, this one adds DART IOMMU support for the U4 chipset and thus should work fine on setups with more than 2Gb of RAM. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: export PCI fixup routinelinas2006-01-091-0/+1
| | | | | | | | | | | | | | | | | | There is code in the RPAPHP directory that is identical to this routine; I'll be removing that code in an upcoming patch, but this patch is needed to expose the function to make it callable. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: Update MPIC workaroundsSegher Boessenkool2006-01-091-1/+2
| | | | | | | | | | | | | | | | | | Cleanup the MPIC IO-APIC workarounds, make them a bit more generic, smaller and faster. Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: Remove device_node addrs/n_addrBenjamin Herrenschmidt2006-01-092-41/+9
| | | | | | | | | | | | | | | | | | | | | | | | The pre-parsed addrs/n_addrs fields in struct device_node are finally gone. Remove the dodgy heuristics that did that parsing at boot and remove the fields themselves since we now have a good replacement with the new OF parsing code. This patch also fixes a bunch of drivers to use the new code instead, so that at least pmac32, pseries, iseries and g5 defconfigs build. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] powerpc: Dont set 32bit cputable bits on 64bitAnton Blanchard2006-01-091-9/+11
| | | | | | | | | | | | | | | | Milton and I were looking at the cputable code and it looks like we can set spurious bits on 64bit. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] ppc64: Add NUMA cpu summary at bootAnton Blanchard2006-01-091-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to print a NUMA cpu summary at boot before the hotplug cpu code was added. This has been useful for catching machine configuration as well as firmware bugs in the past. This patch restores that functionality. An example of the output is: Node 0 CPUs: 0-7 Node 1 CPUs: 8-15 Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * [PATCH] spufs: Improved SPU preemptability [part 2].Arnd Bergmann2006-01-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch reduces lock complexity of SPU scheduler, particularly for involuntary preemptive switches. As a result the new code does a better job of mapping the highest priority tasks to SPUs. Lock complexity is reduced by using the system default workqueue to perform involuntary saves. In this way we avoid nasty lock ordering problems that the previous code had. A "minimum timeslice" for SPU contexts is also introduced. The intent here is to avoid thrashing. While the new scheduler does a better job at prioritization it still does nothing for fairness. From: Mark Nutter <mnutter@us.ibm.com> Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
OpenPOWER on IntegriCloud