summaryrefslogtreecommitdiffstats
path: root/arch/powerpc
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2014-10-312-12/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, plus on the kernel side: - a revert for a newly introduced PMU driver which isn't complete yet and where we ran out of time with fixes (to be tried again in v3.19) - this makes up for a large chunk of the diffstat. - compilation warning fixes - a printk message fix - event_idx usage fixes/cleanups" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf probe: Trivial typo fix for --demangle perf tools: Fix report -F dso_from for data without branch info perf tools: Fix report -F dso_to for data without branch info perf tools: Fix report -F symbol_from for data without branch info perf tools: Fix report -F symbol_to for data without branch info perf tools: Fix report -F mispredict for data without branch info perf tools: Fix report -F in_tx for data without branch info perf tools: Fix report -F abort for data without branch info perf tools: Make CPUINFO_PROC an array to support different kernel versions perf callchain: Use global caching provided by libunwind perf/x86/intel: Revert incomplete and undocumented Broadwell client support perf/x86: Fix compile warnings for intel_uncore perf: Fix typos in sample code in the perf_event.h header perf: Fix and clean up initialization of pmu::event_idx perf: Fix bogus kernel printk perf diff: Add missing hists__init() call at tool start
| * perf: Fix and clean up initialization of pmu::event_idxPeter Zijlstra2014-10-282-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andy reported that the current state of event_idx is rather confused. So remove all but the x86_pmu implementation and change the default to return 0 (the safe option). Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Lameter <cl@linux.com> Cc: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Cody P Schafer <dev@codyps.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Himangi Saraogi <himangi774@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paul Mackerras <paulus@samba.org> Cc: sukadev@linux.vnet.ibm.com <sukadev@linux.vnet.ibm.com> Cc: Thomas Huth <thuth@linux.vnet.ibm.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux390@de.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | powerpc/numa: ensure per-cpu NUMA mappings are correct on topology updateNishanth Aravamudan2014-10-291-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | We received a report of warning in kernel/sched/core.c where the sched group was NULL on an LPAR after a topology update. This seems to occur because after the topology update has moved the CPUs, cpu_to_node is returning the old value still, which ends up breaking the consistency of the NUMA topology in the per-cpu maps. Ensure that we update the per-cpu fields when we re-map CPUs. Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/numa: use cached value of update->cpu in update_cpu_topologyNishanth Aravamudan2014-10-291-2/+2
| | | | | | | | | | | | | | | | There isn't any need to keep referring to update->cpu, as we've already checked cpu == update->cpu at this point. Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/mm: Use appropriate ESID mask in copro_calculate_slb()Ian Munsie2014-10-281-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes copro_calculate_slb() mask the ESID by the correct mask for 1T vs 256M segments. This has no effect by itself as the extra bits were ignored, but it makes debugging the segment table entries easier and means that we can directly compare the ESID values for duplicates without needing to worry about masking in the comparison. This will be used to simplify a comparison in the following patch. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | Revert "powerpc/powernv: Fix endian bug in LPC bus debugfs accessors"Michael Ellerman2014-10-281-3/+1
| | | | | | | | | | | | | | | | | | | | This reverts commit bf7588a0859580a45c63cb082825d77c13eca357. Ben says although the code is not correct "[this] fix was completely wrong and does more damages than it fixes things." Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powernv: Use _GLOBAL_TOC for opal wrappersJeremy Kerr2014-10-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we can't call opal wrappers from modules when using the LE ABIv2, which requires a TOC init. If we do we'll try and load the opal entry point using the wrong toc and probably explode or worse jump to the wrong address. Nothing in upstream is making opal calls from a module, but we do export one of the wrappers so we should fix this anyway. This change uses the _GLOBAL_TOC() macro (rather than _GLOBAL) for the opal wrappers, so that we can do non-local calls to them. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Wire up sys_bpf() syscallPranith Kumar2014-10-223-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch wires up the new syscall sys_bpf() on powerpc. Passes the tests in samples/bpf: #0 add+sub+mul OK #1 unreachable OK #2 unreachable2 OK #3 out of range jump OK #4 out of range jump2 OK #5 test1 ld_imm64 OK #6 test2 ld_imm64 OK #7 test3 ld_imm64 OK #8 test4 ld_imm64 OK #9 test5 ld_imm64 OK #10 no bpf_exit OK #11 loop (back-edge) OK #12 loop2 (back-edge) OK #13 conditional loop OK #14 read uninitialized register OK #15 read invalid register OK #16 program doesn't init R0 before exit OK #17 stack out of bounds OK #18 invalid call insn1 OK #19 invalid call insn2 OK #20 invalid function call OK #21 uninitialized stack1 OK #22 uninitialized stack2 OK #23 check valid spill/fill OK #24 check corrupted spill/fill OK #25 invalid src register in STX OK #26 invalid dst register in STX OK #27 invalid dst register in ST OK #28 invalid src register in LDX OK #29 invalid dst register in LDX OK #30 junk insn OK #31 junk insn2 OK #32 junk insn3 OK #33 junk insn4 OK #34 junk insn5 OK #35 misaligned read from stack OK #36 invalid map_fd for function call OK #37 don't check return value before access OK #38 access memory with incorrect alignment OK #39 sometimes access memory with incorrect alignment OK #40 jump test 1 OK #41 jump test 2 OK #42 jump test 3 OK #43 jump test 4 OK Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> [mpe: test using samples/bpf] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/mm: Remove redundant #if caseAneesh Kumar K.V2014-10-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Remove the check of CONFIG_PPC_SUBPAGE_PROT when deciding if is_hugepage_only_range() is extern or inline. The extern version is in slice.c and is built if CONFIG_PPC_MM_SLICES=y. There was no build break possible because CONFIG_PPC_SUBPAGE_PROT is only selectable under conditions which also mean CONFIG_PPC_MM_SLICES will be selected. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/mm: Fix build error with hugetlfs disabledAneesh Kumar K.V2014-10-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch/powerpc/mm/slice.c:704:5: error: expected identifier or ‘(’ before numeric constant int is_hugepage_only_range(struct mm_struct *mm, unsigned long addr, ^ make[1]: *** [arch/powerpc/mm/slice.o] Error 1 make: *** [arch/powerpc/mm/slice.o] Error 2 This got introduced via 1217d34b531c76362217057ca70a8ce8950574e0 "powerpc: Ensure global functions include their prototype". We started including linux/hugetlb.h with that patch and now we have #define is_hugepage_only_range(mm, addr, len) 0 with hugetlbfs disabled. Fixes: 1217d34b531c ("powerpc: Ensure global functions include their prototype") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | Merge branch 'for-linus' of ↵Linus Torvalds2014-10-2127-119/+255
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull more powerpc updates from Michael Ellerman: "Here's some more updates for powerpc for 3.18. They are a bit late I know, though must are actually bug fixes. In my defence I nearly cut the top of my finger off last weekend in a gruesome bike maintenance accident, so I spent a good part of the week waiting around for doctors. True story, I can send photos if you like :) Probably the most interesting fix is the sys_call_table one, which enables syscall tracing for powerpc. There's a fix for HMI handling for old firmware, more endian fixes for firmware interfaces, more EEH fixes, Anton fixed our routine that gets the current stack pointer, and a few other misc bits" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (22 commits) powerpc: Only do dynamic DMA zone limits on platforms that need it powerpc: sync pseries_le_defconfig with pseries_defconfig powerpc: Add printk levels to setup_system output powerpc/vphn: NUMA node code expects big-endian powerpc/msi: Use WARN_ON() in msi bitmap selftests powerpc/msi: Fix the msi bitmap alignment tests powerpc/eeh: Block CFG upon frozen Shiner adapter powerpc/eeh: Don't collect logs on PE with blocked config space powerpc/eeh: Block PCI config access upon frozen PE powerpc/pseries: Drop config requests in EEH accessors powerpc/powernv: Drop config requests in EEH accessors powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED powerpc/eeh: Fix condition for isolated state powerpc/pseries: Make CPU hotplug path endian safe powerpc/pseries: Use dump_stack instead of show_stack powerpc: Rename __get_SP() to current_stack_pointer() powerpc: Reimplement __get_SP() as a function not a define powerpc/numa: Add ability to disable and debug topology updates powerpc/numa: check error return from proc_create powerpc/powernv: Fallback to old HMI handling behavior for old firmware ...
| * powerpc: Only do dynamic DMA zone limits on platforms that need itMichael Ellerman2014-10-171-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Scott's patch 1c98025c6c95 "Dynamic DMA zone limits" changed dma_direct_alloc_coherent() to start using dev->coherent_dma_mask. That seems fair enough, but it exposes the fact that some of the drivers we care about on IBM platforms aren't setting the coherent mask. The proper fix is to have drivers set the coherent mask and also have the platform code honor it. For now, just restrict the dynamic DMA zone limits to the platforms that need it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Scott Wood <scottwood@freescale.com>
| * powerpc: sync pseries_le_defconfig with pseries_defconfigAnton Blanchard2014-10-161-1/+6
| | | | | | | | | | | | | | | | Now KVM is working on LE, enable it. Also enable transarent hugepage which has already been enabled on BE. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc: Add printk levels to setup_system outputAnton Blanchard2014-10-161-16/+16
| | | | | | | | | | Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/vphn: NUMA node code expects big-endianGreg Kurz2014-10-161-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The associativity domain numbers are obtained from the hypervisor through registers and written into memory by the guest: the packed array passed to vphn_unpack_associativity() is then native-endian, unlike what was assumed in the following commit: commit b08a2a12e44eaec5024b2b969f4fcb98169d1ca3 Author: Alistair Popple <alistair@popple.id.au> Date: Wed Aug 7 02:01:44 2013 +1000 powerpc: Make NUMA device node code endian safe This issue fills the topology with bogus data and makes it unusable. It may lead to severe performance breakdowns. We should ideally patch the vphn_unpack_associativity() function to do the 64-bit loads, but this requires some more brain storming. In the meantime, let's go for a suboptimal and temporary bug fix: this patch converts each 64-bit value of the packed array to big endian, as expected by the current parsing code in vphn_unpack_associativity(). Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/msi: Use WARN_ON() in msi bitmap selftestsMichael Ellerman2014-10-151-30/+24
| | | | | | | | | | | | | | | | | | | | | | As demonstrated in the previous commit, the failure message from the msi bitmap selftests is a bit subtle, it's easy to miss a failure in a busy boot log. So drop our check() macro and use WARN_ON() instead. This necessitates inverting all the conditions as well. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/msi: Fix the msi bitmap alignment testsMichael Ellerman2014-10-151-8/+18
| | | | | | | | | | | | | | | | | | | | | | | | When we added the alignment tests recently we failed to check they were actually passing - oops. They weren't passing, because the bitmap was full. We should also be a bit more careful when checking the return code, a negative error return could by divisible by our alignment value. Fixes: b0345bbc6d09 ("powerpc/msi: Improve IRQ bitmap allocator") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/eeh: Block CFG upon frozen Shiner adapterGavin Shan2014-10-151-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Broadcom Shiner 2-ports 10G ethernet adapter has same problem commit 6f20bda0 ("powerpc/eeh: Block PCI config access upon frozen PE") fixes. Put it to the black list as well. # lspci -s 0004:01:00.0 0004:01:00.0 Ethernet controller: Broadcom Corporation \ NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10) # lspci -n -s 0004:01:00.0 0004:01:00.0 0200: 14e4:168e (rev 10) Reported-by: John Walthour <jwalthour@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/eeh: Don't collect logs on PE with blocked config spaceGavin Shan2014-10-151-0/+7
| | | | | | | | | | | | | | | | | | When the PE's config space is marked as blocked, PCI config read requests always return 0xFF's. It's pointless to collect logs in this case. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/eeh: Block PCI config access upon frozen PEGavin Shan2014-10-153-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem was found when I tried to inject PCI config error by PHB3 PAPR error injection registers into Broadcom Austin 4-ports NIC adapter. The frozen PE was reported successfully and EEH core started to recover it. However, I run into fenced PHB when dumping PCI config space as EEH logs. I was told that PCI config requests should not be progagated to the adapter until PE reset is done successfully. Otherise, we would run out of PHB internal credits and trigger PCT (PCIE Completion Timeout), which leads to the fenced PHB. The patch introduces another PE flag EEH_PE_CFG_RESTRICTED, which is set during PE initialization time if the PE includes the specific PCI devices that need block PCI config access until PE reset is done. When the PE becomes frozen for the first time, EEH_PE_CFG_BLOCKED is set if the PE has flag EEH_PE_CFG_RESTRICTED. Then the PCI config access to the PE will be dropped by platform PCI accessors until PE reset is done successfully. The mechanism is shared by PowerNV platform owned PE or userland owned ones. It's not used on pSeries platform yet. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/pseries: Drop config requests in EEH accessorsGavin Shan2014-10-151-19/+11
| | | | | | | | | | | | | | | | | | | | | | The pSeires EEH config accessors rely on rtas_{read, write}_config() and the condition to check if the PE's config space is blocked should be moved to those 2 functions so that config requests from kernel, userland, EEH core can be dropped to avoid recursive EEH error if necessary. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/powernv: Drop config requests in EEH accessorsGavin Shan2014-10-151-2/+35
| | | | | | | | | | | | | | | | | | | | | | It's bad idea to access the PCI config registers of the adapters, which is experiencing reset. It leads to recursive EEH error without exception. The patch drops PCI config requests in EEH accessors if the PE has been marked to accept PCI config requests, for example during PE reseet time. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKEDGavin Shan2014-10-156-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | The flag EEH_PE_RESET indicates blocking config space of the PE during reset time. We potentially need block PE's config space other than reset time. So it's reasonable to replace it with EEH_PE_CFG_BLOCKED to indicate its usage. There are no substantial code or logic changes in this patch. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/eeh: Fix condition for isolated stateGavin Shan2014-10-151-1/+1
| | | | | | | | | | | | | | | | | | Function eeh_pe_state_mark() could possibly have combination of multiple EEH PE state as its argument. The patch fixes the condition used to check if EEH_PE_ISOLATED is included. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/pseries: Make CPU hotplug path endian safeBharata B Rao2014-10-153-14/+15
| | | | | | | | | | | | | | | | | | | | | | - ibm,rtas-configure-connector should treat the RTAS data as big endian. - Treat ibm,ppc-interrupt-server#s as big-endian when setting smp_processor_id during hotplug. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/pseries: Use dump_stack instead of show_stackAnton Blanchard2014-10-151-6/+5
| | | | | | | | | | | | | | | | We can use the simpler dump_stack() instead of show_stack(current, __get_SP()) Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc: Rename __get_SP() to current_stack_pointer()Anton Blanchard2014-10-157-7/+7
| | | | | | | | | | | | | | | | Michael points out that __get_SP() is a pretty horrible function name. Let's give it a better name. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc: Reimplement __get_SP() as a function not a defineAnton Blanchard2014-10-156-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Li Zhong points out an issue with our current __get_SP() implementation. If ftrace function tracing is enabled (ie -pg profiling using _mcount) we spill a stack frame on 64bit all the time. If a function calls __get_SP() and later calls a function that is tail call optimised, we will pop the stack frame and the value returned by __get_SP() is no longer valid. An example from Li can be found in save_stack_trace -> save_context_stack: c0000000000432c0 <.save_stack_trace>: c0000000000432c0: mflr r0 c0000000000432c4: std r0,16(r1) c0000000000432c8: stdu r1,-128(r1) <-- stack frame for _mcount c0000000000432cc: std r3,112(r1) c0000000000432d0: bl <._mcount> c0000000000432d4: nop c0000000000432d8: mr r4,r1 <-- __get_SP() c0000000000432dc: ld r5,632(r13) c0000000000432e0: ld r3,112(r1) c0000000000432e4: li r6,1 c0000000000432e8: addi r1,r1,128 <-- pop stack frame c0000000000432ec: ld r0,16(r1) c0000000000432f0: mtlr r0 c0000000000432f4: b <.save_context_stack> <-- tail call optimized save_context_stack ends up with a stack pointer below the current one, and it is likely to be scribbled over. Fix this by making __get_SP() a function which returns the callers stack frame. Also replace inline assembly which grabs the stack pointer in save_stack_trace and show_stack with __get_SP(). This also fixes an issue with perf_arch_fetch_caller_regs(). It currently unwinds the stack once, which will skip a valid stack frame on a leaf function. With the __get_SP() fixes in this patch, we never need to unwind the stack frame to get to the first interesting frame. We have to export __get_SP() because perf_arch_fetch_caller_regs() (which is used in modules) calls it from a header file. Reported-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/numa: Add ability to disable and debug topology updatesNishanth Aravamudan2014-10-131-1/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have hit a few customer issues with the topology update code (VPHN and PRRN). It would be nice to be able to debug the notifications coming from the hypervisor in both cases to the LPAR, as well as to disable responding to the notifications at boot-time, to narrow down the source of the problems. Add a basic level of such functionality, similar to the numa= command-line parameter. We already have a toggle in /proc/powerpc/topology_updates that allows run-time enabling/disabling, so the updates can be started at run-time if desired. But the bugs we've run into have occured during boot or very shortly after coming to login, and have resulted in a broken NUMA topology. Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/numa: check error return from proc_createNishanth Aravamudan2014-10-131-1/+2
| | | | | | | | | | | | | | | | | | proc_create can fail, we should check the return value and pass up the failure. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/powernv: Fallback to old HMI handling behavior for old firmwareMahesh Salgaonkar2014-10-131-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recently we moved HMI handling into Linux kernel instead of taking HMI directly in OPAL. This new change is dependent on new OPAL call for HMI recovery which was introduced in newer firmware. While this new change works fine with latest OPAL firmware, we broke the HMI handling if we run newer kernel on old OPAL firmware that results in system hang. This patch fixes this issue by falling back to old HMI behavior on older OPAL firmware. This patch introduces a check for opal token OPAL_HANDLE_HMI to see if we are running on newer firmware or old firmware. On newer firmware this check would return OPAL_TOKEN_PRESENT, otherwise we are running on old firmware and fallback to old HMI behavior. Old firmware: POWER8 System Firmware Release as of today <= SV810_087 Action: Let OPAL handle HMIs Newer firmware: in development/yet to be released. Action: Let Linux host handle HMIs. This patch depends on opal check token patch posted at ppc-devel https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-August/120224.html Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> [mpe: Minor comment and printk rewording] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/book3s: Don't clear MSR_RI in hmi handler.Mahesh Salgaonkar2014-10-101-5/+0
| | | | | | | | | | | | | | | | In HMI interrupt handler we don't touch SRR0/SRR1, instead we touch HSRR0/HSRR1. Hence we don't need to clear MSR_RI bit. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc: Fix sys_call_table declaration to enable syscall tracingRomeo Cane2014-10-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Declaring sys_call_table as a pointer causes the compiler to generate the wrong lookup code in arch_syscall_addr(). <arch_syscall_addr>: lis r9,-16384 rlwinm r3,r3,2,0,29 - lwz r11,30640(r9) - lwzx r3,r11,r3 + addi r9,r9,30640 + lwzx r3,r9,r3 blr The actual sys_call_table symbol, declared in assembler, is an array. If we lie about that to the compiler we get the wrong code generated, as above. This definition seems only to be used by the syscall tracing code in kernel/trace/trace_syscalls.c. With this patch I can successfully use the syscall tracepoints: bash-3815 [002] .... 333.239082: sys_write -> 0x2 bash-3815 [002] .... 333.239087: sys_dup2(oldfd: a, newfd: 1) bash-3815 [002] .... 333.239088: sys_dup2 -> 0x1 bash-3815 [002] .... 333.239092: sys_fcntl(fd: a, cmd: 1, arg: 0) bash-3815 [002] .... 333.239093: sys_fcntl -> 0x1 bash-3815 [002] .... 333.239094: sys_close(fd: a) bash-3815 [002] .... 333.239094: sys_close -> 0x0 Signed-off-by: Romeo Cane <romeo.cane.ext@coriant.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | Merge git://git.infradead.org/users/eparis/auditLinus Torvalds2014-10-192-5/+8
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull audit updates from Eric Paris: "So this change across a whole bunch of arches really solves one basic problem. We want to audit when seccomp is killing a process. seccomp hooks in before the audit syscall entry code. audit_syscall_entry took as an argument the arch of the given syscall. Since the arch is part of what makes a syscall number meaningful it's an important part of the record, but it isn't available when seccomp shoots the syscall... For most arch's we have a better way to get the arch (syscall_get_arch) So the solution was two fold: Implement syscall_get_arch() everywhere there is audit which didn't have it. Use syscall_get_arch() in the seccomp audit code. Having syscall_get_arch() everywhere meant it was a useless flag on the stack and we could get rid of it for the typical syscall entry. The other changes inside the audit system aren't grand, fixed some records that had invalid spaces. Better locking around the task comm field. Removing some dead functions and structs. Make some things static. Really minor stuff" * git://git.infradead.org/users/eparis/audit: (31 commits) audit: rename audit_log_remove_rule to disambiguate for trees audit: cull redundancy in audit_rule_change audit: WARN if audit_rule_change called illegally audit: put rule existence check in canonical order next: openrisc: Fix build audit: get comm using lock to avoid race in string printing audit: remove open_arg() function that is never used audit: correct AUDIT_GET_FEATURE return message type audit: set nlmsg_len for multicast messages. audit: use union for audit_field values since they are mutually exclusive audit: invalid op= values for rules audit: use atomic_t to simplify audit_serial() kernel/audit.c: use ARRAY_SIZE instead of sizeof/sizeof[0] audit: reduce scope of audit_log_fcaps audit: reduce scope of audit_net_id audit: arm64: Remove the audit arch argument to audit_syscall_entry arm64: audit: Add audit hook in syscall_trace_enter/exit() audit: x86: drop arch from __audit_syscall_entry() interface sparc: implement is_32bit_task sparc: properly conditionalize use of TIF_32BIT ...
| * | sparc: simplify syscall_get_arch()Eric Paris2014-09-231-8/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Include linux/thread_info.h so we can use is_32_bit_task() cleanly. Then just simplify syscall_get_arch() since is_32_bit_task() works for all configuration options. Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Eric Paris <eparis@redhat.com>
| * | ARCH: AUDIT: audit_syscall_entry() should not require the archEric Paris2014-09-231-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a function where the arch can be queried, syscall_get_arch(). So rather than have every single piece of arch specific code use and/or duplicate syscall_get_arch(), just have the audit code use the syscall_get_arch() code. Based-on-patch-by: Richard Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-ia64@vger.kernel.org Cc: microblaze-uclinux@itee.uq.edu.au Cc: linux-mips@linux-mips.org Cc: linux@lists.openrisc.net Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: user-mode-linux-devel@lists.sourceforge.net Cc: linux-xtensa@linux-xtensa.org Cc: x86@kernel.org
| * | ARCH: AUDIT: implement syscall_get_arch for all archesEric Paris2014-09-231-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For all arches which support audit implement syscall_get_arch() They are all pretty easy and straight forward, stolen from how the call to audit_syscall_entry() determines the arch. Based-on-patch-by: Richard Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com> Cc: linux-ia64@vger.kernel.org Cc: microblaze-uclinux@itee.uq.edu.au Cc: linux-mips@linux-mips.org Cc: linux@lists.openrisc.net Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: sparclinux@vger.kernel.org
* | | powerpc/pci: Fix IO space breakage after of_pci_range_to_resource() changeMichael Ellerman2014-10-161-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0b0b0893d49b "of/pci: Fix the conversion of IO ranges into IO resources" changed the behaviour of of_pci_range_to_resource(). Previously it simply populated the resource based on the arguments. Now it calls pci_register_io_range() and pci_address_to_pio(). These both have two implementations depending on whether PCI_IOBASE is defined, which it is not for powerpc. Further complicating matters, both routines are weak, and powerpc implements it's own version of one - pci_address_to_pio(). However powerpc's implementation depends on other initialisations which are done later in boot. The end result is incorrectly initialised IO space. Often we can get away with that, because we don't make much use of IO space. However virtio requires it, so we see eg: pci_bus 0000:00: root bus resource [io 0xffff] (bus address [0xffffffffffffffff-0xffffffffffffffff]) PCI: Cannot allocate resource region 0 of device 0000:00:01.0, will remap virtio-pci 0000:00:01.0: can't enable device: BAR 0 [io size 0x0020] not assigned The simplest fix for now is to just stop using of_pci_range_to_resource(), and open-code the original implementation, that's all we want it to do. Fixes: 0b0b0893d49b ("of/pci: Fix the conversion of IO ranges into IO resources") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | Merge branch 'for-3.18-consistent-ops' of ↵Linus Torvalds2014-10-151-3/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu consistent-ops changes from Tejun Heo: "Way back, before the current percpu allocator was implemented, static and dynamic percpu memory areas were allocated and handled separately and had their own accessors. The distinction has been gone for many years now; however, the now duplicate two sets of accessors remained with the pointer based ones - this_cpu_*() - evolving various other operations over time. During the process, we also accumulated other inconsistent operations. This pull request contains Christoph's patches to clean up the duplicate accessor situation. __get_cpu_var() uses are replaced with with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr(). Unfortunately, the former sometimes is tricky thanks to C being a bit messy with the distinction between lvalues and pointers, which led to a rather ugly solution for cpumask_var_t involving the introduction of this_cpu_cpumask_var_ptr(). This converts most of the uses but not all. Christoph will follow up with the remaining conversions in this merge window and hopefully remove the obsolete accessors" * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits) irqchip: Properly fetch the per cpu offset percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write. percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t Revert "powerpc: Replace __get_cpu_var uses" percpu: Remove __this_cpu_ptr clocksource: Replace __this_cpu_ptr with raw_cpu_ptr sparc: Replace __get_cpu_var uses avr32: Replace __get_cpu_var with __this_cpu_write blackfin: Replace __get_cpu_var uses tile: Use this_cpu_ptr() for hardware counters tile: Replace __get_cpu_var uses powerpc: Replace __get_cpu_var uses alpha: Replace __get_cpu_var ia64: Replace __get_cpu_var uses s390: cio driver &__get_cpu_var replacements s390: Replace __get_cpu_var uses mips: Replace __get_cpu_var uses MIPS: Replace __get_cpu_var uses in FPU emulator. arm: Replace __this_cpu_ptr with raw_cpu_ptr ...
| * | | Revert "powerpc: Replace __get_cpu_var uses"Tejun Heo2014-08-2732-105/+103
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 5828f666c069af74e00db21559f1535103c9f79a due to build failure after merging with pending powerpc changes. Link: http://lkml.kernel.org/g/20140827142243.6277eaff@canb.auug.org.au Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc: Replace __get_cpu_var usesChristoph Lameter2014-08-2632-103/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : #define __get_cpu_var(var) (*this_cpu_ptr(&(var))) __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) tj: Folded a fix patch. http://lkml.kernel.org/g/alpine.DEB.2.11.1408172143020.9652@gentwo.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | alpha: Replace __get_cpu_varChristoph Lameter2014-08-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : #define __get_cpu_var(var) (*this_cpu_ptr(&(var))) __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Acked-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* | | | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2014-10-132-4/+3
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave Hansen) - Various sched/idle refinements for better idle handling (Nicolas Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot) - sched/numa updates and optimizations (Rik van Riel) - sysbench speedup (Vincent Guittot) - capacity calculation cleanups/refactoring (Vincent Guittot) - Various cleanups to thread group iteration (Oleg Nesterov) - Double-rq-lock removal optimization and various refactorings (Kirill Tkhai) - various sched/deadline fixes ... and lots of other changes" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) sched/dl: Use dl_bw_of() under rcu_read_lock_sched() sched/fair: Delete resched_cpu() from idle_balance() sched, time: Fix build error with 64 bit cputime_t on 32 bit systems sched: Improve sysbench performance by fixing spurious active migration sched/x86: Fix up typo in topology detection x86, sched: Add new topology for multi-NUMA-node CPUs sched/rt: Use resched_curr() in task_tick_rt() sched: Use rq->rd in sched_setaffinity() under RCU read lock sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask' sched: Use dl_bw_of() under RCU read lock sched/fair: Remove duplicate code from can_migrate_task() sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW sched: print_rq(): Don't use tasklist_lock sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock() sched: Fix the task-group check in tg_has_rt_tasks() sched/fair: Leverage the idle state info when choosing the "idlest" cpu sched: Let the scheduler see CPU idle states sched/deadline: Fix inter- exclusive cpusets migrations sched/deadline: Clear dl_entity params when setscheduling to different class sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault() ...
| * | | | sched, time: Fix build error with 64 bit cputime_t on 32 bit systemsRik van Riel2014-10-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32 bit systems cmpxchg cannot handle 64 bit values, so some additional magic is required to allow a 32 bit system with CONFIG_VIRT_CPU_ACCOUNTING_GEN=y enabled to build. Make sure the correct cmpxchg function is used when doing an atomic swap of a cputime_t. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: umgwanakikbuti@gmail.com Cc: fweisbec@gmail.com Cc: srao@redhat.com Cc: lwoodman@redhat.com Cc: atheurer@redhat.com Cc: oleg@redhat.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: linux390@de.ibm.com Cc: linux-arch@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Link: http://lkml.kernel.org/r/20140930155947.070cdb1f@annuminas.surriel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | sched: Add helper for task stack page overrun checkingAaron Tomlin2014-09-191-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This facility is used in a few places so let's introduce a helper function to improve code readability. Signed-off-by: Aaron Tomlin <atomlin@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: aneesh.kumar@linux.vnet.ibm.com Cc: dzickus@redhat.com Cc: bmr@redhat.com Cc: jcastillo@redhat.com Cc: oleg@redhat.com Cc: riel@redhat.com Cc: prarit@redhat.com Cc: jgh@redhat.com Cc: minchan@kernel.org Cc: mpe@ellerman.id.au Cc: tglx@linutronix.de Cc: hannes@cmpxchg.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1410527779-8133-3-git-send-email-atomlin@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | init/main.c: Give init_task a canaryAaron Tomlin2014-09-191-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tasks get their end of stack set to STACK_END_MAGIC with the aim to catch stack overruns. Currently this feature does not apply to init_task. This patch removes this restriction. Note that a similar patch was posted by Prarit Bhargava some time ago but was never merged: http://marc.info/?l=linux-kernel&m=127144305403241&w=2 Signed-off-by: Aaron Tomlin <atomlin@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: aneesh.kumar@linux.vnet.ibm.com Cc: dzickus@redhat.com Cc: bmr@redhat.com Cc: jcastillo@redhat.com Cc: jgh@redhat.com Cc: minchan@kernel.org Cc: tglx@linutronix.de Cc: hannes@cmpxchg.org Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Daeseok Youn <daeseok.youn@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Fabian Frederick <fabf@skynet.be> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Michael Opdenacker <michael.opdenacker@free-electrons.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1410527779-8133-2-git-send-email-atomlin@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | Merge branch 'locking-arch-for-linus' of ↵Linus Torvalds2014-10-131-121/+77
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull arch atomic cleanups from Ingo Molnar: "This is a series kept separate from the main locking tree, which cleans up and improves various details in the atomics type handling: - Remove the unused atomic_or_long() method - Consolidate and compress atomic ops implementations between architectures, to reduce linecount and to make it easier to add new ops. - Rewrite generic atomic support to only require cmpxchg() from an architecture - generate all other methods from that" * 'locking-arch-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) locking,arch: Use ACCESS_ONCE() instead of cast to volatile in atomic_read() locking, mips: Fix atomics locking, sparc64: Fix atomics locking,arch: Rewrite generic atomic support locking,arch,xtensa: Fold atomic_ops locking,arch,sparc: Fold atomic_ops locking,arch,sh: Fold atomic_ops locking,arch,powerpc: Fold atomic_ops locking,arch,parisc: Fold atomic_ops locking,arch,mn10300: Fold atomic_ops locking,arch,mips: Fold atomic_ops locking,arch,metag: Fold atomic_ops locking,arch,m68k: Fold atomic_ops locking,arch,m32r: Fold atomic_ops locking,arch,ia64: Fold atomic_ops locking,arch,hexagon: Fold atomic_ops locking,arch,cris: Fold atomic_ops locking,arch,avr32: Fold atomic_ops locking,arch,arm64: Fold atomic_ops locking,arch,arm: Fold atomic_ops ...
| * | | | | locking,arch,powerpc: Fold atomic_opsPeter Zijlstra2014-08-141-121/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many of the atomic op implementations are the same except for one instruction; fold the lot into a few CPP macros and reduce LoC. Requires asm_op because PPC asm is weird :-) Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20140508135852.713980957@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2014-10-11198-1551/+3177
|\ \ \ \ \ \ | | |_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: "Here's a first pull request for powerpc updates for 3.18. The bulk of the additions are for the "cxl" driver, for IBM's Coherent Accelerator Processor Interface (CAPI). Most of it's in drivers/misc, which Greg & Arnd maintain, Greg said he was happy for us to take it through our tree. There's the usual minor cleanups and fixes, including a bit of noise in drivers from some of those. A bunch of updates to our EEH code, which has been getting more testing. Several nice speedups from Anton, including 20% in clear_page(). And a bunch of updates for freescale from Scott" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (130 commits) cxl: Fix afu_read() not doing finish_wait() on signal or non-blocking cxl: Add documentation for userspace APIs cxl: Add driver to Kbuild and Makefiles cxl: Add userspace header file cxl: Driver code for powernv PCIe based cards for userspace access cxl: Add base builtin support powerpc/mm: Add hooks for cxl powerpc/opal: Add PHB to cxl mode call powerpc/mm: Add new hash_page_mm() powerpc/powerpc: Add new PCIe functions for allocating cxl interrupts cxl: Add new header for call backs and structs powerpc/powernv: Split out set MSI IRQ chip code powerpc/mm: Export mmu_kernel_ssize and mmu_linear_psize powerpc/msi: Improve IRQ bitmap allocator powerpc/cell: Make spu_flush_all_slbs() generic powerpc/cell: Move data segment faulting code out of cell platform powerpc/cell: Move spu_handle_mm_fault() out of cell platform powerpc/pseries: Use new defines when calling H_SET_MODE powerpc: Update contact info in Documentation files powerpc/perf/hv-24x7: Simplify catalog_read() ...
| * | | | | powerpc/mm: Add hooks for cxlIan Munsie2014-10-082-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds hooks into the core powerpc mm code for cxl. The core powerpc code sometimes uses local tlbie. Unfortunately this won't work with the current cxl driver as it relies on snooping tlbie broadcasts. The cxl hardware can have TLB entries invalidated via MMIO but this is not currently supported by the driver. In future we can make local tlbie smarter so that it invalidates cxl contexts via MMIO when it needs to but for now we have this workaround. This workaround checks for any active cxl contexts and if so, disables local tlbie. This also adds a hook for when SLBs are invalidated. This ensures any corresponding SLBs in cxl are also invalidated at the same time. This is required for segment demotion. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
OpenPOWER on IntegriCloud