summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/kernel
Commit message (Collapse)AuthorAgeFilesLines
* powerpc/tm: Abort on emulation and alignment faultsMichael Neuling2013-06-011-0/+29
| | | | | | | | | | | | | | | | | | | | | | | If we are emulating an instruction inside an active user transaction that touches memory, the kernel can't emulate it as it operates in transactional suspend context. We need to abort these transactions and send them back to userspace for the hardware to rollback. We can service these if the user transaction is in suspend mode, since the kernel will operate in the same suspend context. This adds a check to all alignment faults and to specific instruction emulations (only string instructions for now). If the user process is in an active (non-suspended) transaction, we abort the transaction go back to userspace allowing the HW to roll back the transaction and tell the user of the failure. This also adds new tm abort cause codes to report the reason of the persistent error to the user. Crappy test case here http://neuling.org/devel/junkcode/aligntm.c Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> # v3.9 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Make radeon 32-bit MSI quirk work on powernvBenjamin Herrenschmidt2013-05-242-0/+18
| | | | | | | | | This moves the quirk itself to pci_64.c as to get built on all ppc64 platforms (the only ones with a pci_dn), factors the two implementations of get_pdn() into a single pci_get_dn() and use the quirk to do 32-bit MSIs on IODA based powernv platforms. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Context switch more PMU related SPRsMichael Ellerman2013-05-242-0/+34
| | | | | | | | | | | In commit 9353374 "Context switch the new EBB SPRs" we added support for context switching some new EBB SPRs. However despite four of us signing off on that patch we missed some. To be fair these are not actually new SPRs, but they are now potentially user accessible so need to be context switched. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pci: Fix bogus message at boot about empty memory resourcesBenjamin Herrenschmidt2013-05-241-3/+4
| | | | | | | The message is only meant to be displayed if resource 0 is empty, but was displayed if any is. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Fix TLB cleanup at boot on POWER8Benjamin Herrenschmidt2013-05-241-2/+6
| | | | | | | The TLB has 512 congruence classes (2048 entries 4 way set associative) while P7 had 128 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge branch 'merge' of ↵Linus Torvalds2013-05-1415-40/+235
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Benjamin Herrenschmidt: "This is mostly bug fixes (some of them regressions, some of them I deemed worth merging now) along with some patches from Li Zhong hooking up the new context tracking stuff (for the new full NO_HZ)" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits) powerpc: Set show_unhandled_signals to 1 by default powerpc/perf: Fix setting of "to" addresses for BHRB powerpc/pmu: Fix order of interpreting BHRB target entries powerpc/perf: Move BHRB code into CONFIG_PPC64 region powerpc: select HAVE_CONTEXT_TRACKING for pSeries powerpc: Use the new schedule_user API on userspace preemption powerpc: Exit user context on notify resume powerpc: Exception hooks for context tracking subsystem powerpc: Syscall hooks for context tracking subsystem powerpc/booke64: Fix kernel hangs at kernel_dbg_exc powerpc: Fix irq_set_affinity() return values powerpc: Provide __bswapdi2 powerpc/powernv: Fix starting of secondary CPUs on OPALv2 and v3 powerpc/powernv: Detect OPAL v3 API version powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning again powerpc: Make CONFIG_RTAS_PROC depend on CONFIG_PROC_FS powerpc: Bring all threads online prior to migration/hibernation powerpc/rtas_flash: Fix validate_flash buffer overflow issue powerpc/kexec: Fix kexec when using VMX optimised memcpy powerpc: Fix build errors STRICT_MM_TYPECHECKS ...
| * powerpc: Set show_unhandled_signals to 1 by defaultBenjamin Herrenschmidt2013-05-141-1/+1
| | | | | | | | | | | | Just like other architectures Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Use the new schedule_user API on userspace preemptionLi Zhong2013-05-141-1/+2
| | | | | | | | | | | | | | | | | | This patch corresponds to [PATCH] x86: Use the new schedule_user API on userspace preemption commit 0430499ce9d78691f3985962021b16bf8f8a8048 Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Exit user context on notify resumeLi Zhong2013-05-141-0/+5
| | | | | | | | | | | | | | | | | | | | This patch allows RCU usage in do_notify_resume, e.g. signal handling. It corresponds to [PATCH] x86: Exit RCU extended QS on notify resume commit edf55fda35c7dc7f2d9241c3abaddaf759b457c6 Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Exception hooks for context tracking subsystemLi Zhong2013-05-141-22/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the exception hooks for context tracking subsystem, including data access, program check, single step, instruction breakpoint, machine check, alignment, fp unavailable, altivec assist, unknown exception, whose handlers might use RCU. This patch corresponds to [PATCH] x86: Exception hooks for userspace RCU extended QS commit 6ba3c97a38803883c2eee489505796cb0a727122 But after the exception handling moved to generic code, and some changes in following two commits: 56dd9470d7c8734f055da2a6bac553caf4a468eb context_tracking: Move exception handling to generic code 6c1e0256fad84a843d915414e4b5973b7443d48d context_tracking: Restore correct previous context state on exception exit it is able for exception hooks to use the generic code above instead of a redundant arch implementation. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Syscall hooks for context tracking subsystemLi Zhong2013-05-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the syscall slow path hooks for context tracking subsystem, corresponding to [PATCH] x86: Syscall hooks for userspace RCU extended QS commit bf5a3c13b939813d28ce26c01425054c740d6731 TIF_MEMDIE is moved to the second 16-bits (with value 17), as it seems there is no asm code using it. TIF_NOHZ is added to _TIF_SYCALL_T_OR_A, so it is better for it to be in the same 16 bits with others in the group, so in the asm code, andi. with this group could work. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/booke64: Fix kernel hangs at kernel_dbg_excScott Wood2013-05-142-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MSR_DE is not cleared on entry to the kernel, and we don't clear it explicitly outside of debug code. If we have MSR_DE set in prime_debug_regs(), and the new thread has events enabled in DBCR0 (e.g. ICMP is set in thread->dbsr0, even though it was cleared in the real DBCR0 when the thread got scheduled out), we'll end up taking a debug exception in the kernel when DBCR0 is loaded. DSRR0 will not point to an exception vector, and the kernel ends up hanging at kernel_dbg_exc. Fix this by always clearing MSR_DE when we load new debug state. Another observed source of kernel_dbg_exc hangs is with the branch taken event. If this event is active, but we take a non-debug trap (e.g. a TLB miss or an asynchronous interrupt) before the next branch. We end up taking a branch-taken debug exception on the initial branch instruction of the exception vector, but because the debug exception is DBSR_BT rather than DBSR_IC we branch to kernel_dbg_exc before even checking the DSRR0 address. Fix this by checking for DBSR_BT as well as DBSR_IC, which is what 32-bit does and what the comments suggest was intended in the 64-bit code as well. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Provide __bswapdi2David Woodhouse2013-05-143-1/+24
| | | | | | | | | | | | | | | | | | | | Some versions of GCC apparently expect this to be provided by libgcc. Updates from Mikey to fix 32 bit version and adding "r" to registers. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning againLi Zhong2013-05-143-4/+1
| | | | | | | | | | | | | | | | | | | | | | Saw this warning again, and this time from the ret_from_fork path. It seems we could clear the back chain earlier in copy_thread(), which could cover both path, and also fix potential lockdep usage in schedule_tail(), or exception occurred before we clear the back chain. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Bring all threads online prior to migration/hibernationRobert Jennings2013-05-141-0/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch brings online all threads which are present but not online prior to migration/hibernation. After migration/hibernation those threads are taken back offline. During migration/hibernation all online CPUs must call H_JOIN, this is required by the hypervisor. Without this patch, threads that are offline (H_CEDE'd) will not be woken to make the H_JOIN call and the OS will be deadlocked (all threads either JOIN'd or CEDE'd). Cc: <stable@kernel.org> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/rtas_flash: Fix validate_flash buffer overflow issueVasant Hegde2013-05-141-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ibm,validate-flash-image RTAS call output buffer contains 150 - 200 bytes of data on latest system. Presently we have output buffer size as 64 bytes and we use sprintf to copy data from RTAS buffer to local buffer. This causes kernel oops (see below call trace). This patch increases local buffer size to 256 and also uses snprintf instead of sprintf to copy data from RTAS buffer. Kernel call trace : ------------------- Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=1024 NUMA pSeries Modules linked in: nfs fscache lockd auth_rpcgss nfs_acl sunrpc fuse loop dm_mod ipv6 ipv6_lib usb_storage ehea(X) sr_mod qlge ses cdrom enclosure st be2net sg ext3 jbd mbcache usbhid hid ohci_hcd ehci_hcd usbcore qla2xxx usb_common sd_mod crc_t10dif scsi_dh_hp_sw scsi_dh_rdac scsi_dh_alua scsi_dh_emc scsi_dh lpfc scsi_transport_fc scsi_tgt ipr(X) libata scsi_mod Supported: Yes NIP: 4520323031333130 LR: 4520323031333130 CTR: 0000000000000000 REGS: c0000001b91779b0 TRAP: 0400 Tainted: G X (3.0.13-0.27-ppc64) MSR: 8000000040009032 <EE,ME,IR,DR> CR: 44022488 XER: 20000018 TASK = c0000001bca1aba0[4736] 'cat' THREAD: c0000001b9174000 CPU: 36 GPR00: 4520323031333130 c0000001b9177c30 c000000000f87c98 000000000000009b GPR04: c0000001b9177c4a 000000000000000b 3520323031333130 2032303133313031 GPR08: 3133313031350a4d 000000000000009b 0000000000000000 c0000000003664a4 GPR12: 0000000022022448 c000000003ee6c00 0000000000000002 00000000100e8a90 GPR16: 00000000100cb9d8 0000000010093370 000000001001d310 0000000000000000 GPR20: 0000000000008000 00000000100fae60 000000000000005e 0000000000000000 GPR24: 0000000010129350 46573738302e3030 2046573738302e30 300a4d4720323031 GPR28: 333130313520554e 4b4e4f574e0a4d47 2032303133313031 3520323031333130 NIP [4520323031333130] 0x4520323031333130 LR [4520323031333130] 0x4520323031333130 Call Trace: [c0000001b9177c30] [4520323031333130] 0x4520323031333130 (unreliable) Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/kexec: Fix kexec when using VMX optimised memcpyAnton Blanchard2013-05-141-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit b3f271e86e5a (powerpc: POWER7 optimised memcpy using VMX and enhanced prefetch) uses VMX when it is safe to do so (ie not in interrupt). It also looks at the task struct to decide if we have to save the current tasks' VMX state. kexec calls memcpy() at a point where the task struct may have been overwritten by the new kexec segments. If it has been overwritten then when memcpy -> enable_altivec looks up current->thread.regs->msr we get a cryptic oops or lockup. I also notice we aren't initialising thread_info->cpu, which means smp_processor_id is broken. Fix that too. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@vger.kernel.org> # 3.6+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Fix build errors STRICT_MM_TYPECHECKSAneesh Kumar K.V2013-05-141-3/+2
| | | | | | | | | | Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Add an in memory udbg consoleAlistair Popple2013-05-081-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a new udbg early debug console which utilises statically defined input and output buffers stored within the kernel BSS. It is primarily designed to assist with bring up of new hardware which may not have a working console but which has a method of reading/writing kernel memory. This version incorporates comments made by Ben H (thanks!). Changes from v1: - Add memory barriers. - Ensure updating of read/write positions is atomic. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | Merge git://git.infradead.org/users/eparis/auditLinus Torvalds2013-05-111-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull audit changes from Eric Paris: "Al used to send pull requests every couple of years but he told me to just start pushing them to you directly. Our touching outside of core audit code is pretty straight forward. A couple of interface changes which hit net/. A simple argument bug calling audit functions in namei.c and the removal of some assembly branch prediction code on ppc" * git://git.infradead.org/users/eparis/audit: (31 commits) audit: fix message spacing printing auid Revert "audit: move kaudit thread start from auditd registration to kaudit init" audit: vfs: fix audit_inode call in O_CREAT case of do_last audit: Make testing for a valid loginuid explicit. audit: fix event coverage of AUDIT_ANOM_LINK audit: use spin_lock in audit_receive_msg to process tty logging audit: do not needlessly take a lock in tty_audit_exit audit: do not needlessly take a spinlock in copy_signal audit: add an option to control logging of passwords with pam_tty_audit audit: use spin_lock_irqsave/restore in audit tty code helper for some session id stuff audit: use a consistent audit helper to log lsm information audit: push loginuid and sessionid processing down audit: stop pushing loginid, uid, sessionid as arguments audit: remove the old depricated kernel interface audit: make validity checking generic audit: allow checking the type of audit message in the user filter audit: fix build break when AUDIT_DEBUG == 2 audit: remove duplicate export of audit_enabled Audit: do not print error when LSMs disabled ...
| * | powerpc: Remove static branch prediction in 64bit traced syscall pathAnton Blanchard2013-04-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Some distros enable auditing by default which forces us through the syscall trace path. Remove the static branch prediction in our 64bit syscall handler and let the hardware do the prediction. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Eric Paris <eparis@redhat.com>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2013-05-101-8/+0
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull stray syscall bits from Al Viro: "Several syscall-related commits that were missing from the original" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: switch compat_sys_sysctl to COMPAT_SYSCALL_DEFINE unicore32: just use mmap_pgoff()... unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE x86, vm86: fix VM86 syscalls: use SYSCALL_DEFINEx(...)
| * | unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINEAl Viro2013-05-091-8/+0
| | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | powerpc/topology: Fix spurr attribute permissionBenjamin Herrenschmidt2013-05-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | We are registering the attribute with permission 0600 but it doesn't have a store callback, which causes WARN_ON's during boot. Fix the permission. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | powerpc/pci: Support per-aperture memory offsetBenjamin Herrenschmidt2013-05-063-71/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PCI core supports an offset per aperture nowadays but our arch code still has a single offset per host bridge representing the difference betwen CPU memory addresses and PCI MMIO addresses. This is a problem as new machines and hypervisor versions are coming out where the 64-bit windows will have a different offset (basically mapped 1:1) from the 32-bit windows. This fixes it by using separate offsets. In the long run, we probably want to get rid of that intermediary struct pci_controller and have those directly stored into the pci_host_bridge as they are parsed but this will be a more invasive change. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | powerpc/pci: Don't add bogus empty resources to PHBsBenjamin Herrenschmidt2013-05-061-14/+16
| | | | | | | | | | | | | | | | | | | | | | | | When converting to use the new pci_add_resource_offset() we didn't properly account for empty resources (0 flags) and add those bogons to the PHBs. The result is some annoying messages in the log. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | powerpc/cputable: Advertise support for ISEL/HTM/DSCR/TAR on POWER8Nishanth Aravamudan2013-05-061-0/+5
| | | | | | | | | | | | | | | Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | powerpc/cputable: Advertise ISEL support on appropriate embedded processorsNishanth Aravamudan2013-05-061-0/+5
| | | | | | | | | | | | | | | Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | powerpc/cputable: Advertise DSCR support on P7/P7+Nishanth Aravamudan2013-05-061-0/+4
| | | | | | | | | | | | | | | Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | powerpc/pseries: Perform proper max_bus_speed detectionKleber Sacilotto de Souza2013-05-061-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On pseries machines the detection for max_bus_speed should be done through an OpenFirmware property. This patch adds a function to perform this detection and a hook to perform dynamic adding of the function only for pseries. This is done by overwriting the weak pcibios_root_bridge_prepare function which is called by pci_create_root_bus(). From: Lucas Kannebley Tavares <lucaskt@linux.vnet.ibm.com> Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | powerpc: Emulate non privileged DSCR read and writeAnton Blanchard2013-05-061-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POWER8 allows read and write of the DSCR in userspace. We added kernel emulation so applications could always use the instructions regardless of the CPU type. Unfortunately there are two SPRs for the DSCR and we only added emulation for the privileged one. Add code to match the non privileged one. A simple test was created to verify the fix: http://ozlabs.org/~anton/junkcode/user_dscr_test.c Without the patch we get a SIGILL and it passes with the patch. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | Merge tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2013-05-051-0/+4
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm updates from Gleb Natapov: "Highlights of the updates are: general: - new emulated device API - legacy device assignment is now optional - irqfd interface is more generic and can be shared between arches x86: - VMCS shadow support and other nested VMX improvements - APIC virtualization and Posted Interrupt hardware support - Optimize mmio spte zapping ppc: - BookE: in-kernel MPIC emulation with irqfd support - Book3S: in-kernel XICS emulation (incomplete) - Book3S: HV: migration fixes - BookE: more debug support preparation - BookE: e6500 support ARM: - reworking of Hyp idmaps s390: - ioeventfd for virtio-ccw And many other bug fixes, cleanups and improvements" * tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits) kvm: Add compat_ioctl for device control API KVM: x86: Account for failing enable_irq_window for NMI window request KVM: PPC: Book3S: Add API for in-kernel XICS emulation kvm/ppc/mpic: fix missing unlock in set_base_addr() kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/write kvm/ppc/mpic: remove users kvm/ppc/mpic: fix mmio region lists when multiple guests used kvm/ppc/mpic: remove default routes from documentation kvm: KVM_CAP_IOMMU only available with device assignment ARM: KVM: iterate over all CPUs for CPU compatibility check KVM: ARM: Fix spelling in error message ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally KVM: ARM: Fix API documentation for ONE_REG encoding ARM: KVM: promote vfp_host pointer to generic host cpu context ARM: KVM: add architecture specific hook for capabilities ARM: KVM: perform HYP initilization for hotplugged CPUs ARM: KVM: switch to a dual-step HYP init code ARM: KVM: rework HYP page table freeing ARM: KVM: enforce maximum size for identity mapped code ARM: KVM: move to a KVM provided HYP idmap ...
| * | | KVM: PPC: Book3S HV: Speed up wakeups of CPUs on HV KVMBenjamin Herrenschmidt2013-04-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we wake up a CPU by sending a host IPI with smp_send_reschedule() to thread 0 of that core, which will take all threads out of the guest, and cause them to re-evaluate their interrupt status on the way back in. This adds a mechanism to differentiate real host IPIs from IPIs sent by KVM for guest threads to poke each other, in order to target the guest threads precisely when possible and avoid that global switch of the core to host state. We then use this new facility in the in-kernel XICS code. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
| * | | KVM: PPC: Book3S HV: Report VPA and DTL modifications in dirty mapPaul Mackerras2013-04-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At present, the KVM_GET_DIRTY_LOG ioctl doesn't report modifications done by the host to the virtual processor areas (VPAs) and dispatch trace logs (DTLs) registered by the guest. This is because those modifications are done either in real mode or in the host kernel context, and in neither case does the access go through the guest's HPT, and thus no change (C) bit gets set in the guest's HPT. However, the changes done by the host do need to be tracked so that the modified pages get transferred when doing live migration. In order to track these modifications, this adds a dirty flag to the struct representing the VPA/DTL areas, and arranges to set the flag when the VPA/DTL gets modified by the host. Then, when we are collecting the dirty log, we also check the dirty flags for the VPA and DTL for each vcpu and set the relevant bit in the dirty log if necessary. Doing this also means we now need to keep track of the guest physical address of the VPA/DTL areas. So as not to lose track of modifications to a VPA/DTL area when it gets unregistered, or when a new area gets registered in its place, we need to transfer the dirty state to the rmap chain. This adds code to kvmppc_unpin_guest_page() to do that if the area was dirty. To simplify that code, we now require that all VPA, DTL and SLB shadow buffer areas fit within a single host page. Guests already comply with this requirement because pHyp requires that these areas not cross a 4k boundary. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
| * | | KVM: PPC: booke: Added debug handlerBharat Bhushan2013-03-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Installed debug handler will be used for guest debug support and debug facility emulation features (patches for these features will follow this patch). Signed-off-by: Liu Yu <yu.liu@freescale.com> [bharat.bhushan@freescale.com: Substantial changes] Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
* | | | Merge branch 'next' of ↵Linus Torvalds2013-05-0234-169/+442
|\ \ \ \ | |_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc update from Benjamin Herrenschmidt: "The main highlights this time around are: - A pile of addition POWER8 bits and nits, such as updated performance counter support (Michael Ellerman), new branch history buffer support (Anshuman Khandual), base support for the new PCI host bridge when not using the hypervisor (Gavin Shan) and other random related bits and fixes from various contributors. - Some rework of our page table format by Aneesh Kumar which fixes a thing or two and paves the way for THP support. THP itself will not make it this time around however. - More Freescale updates, including Altivec support on the new e6500 cores, new PCI controller support, and a pile of new boards support and updates. - The usual batch of trivial cleanups & fixes" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits) powerpc: Fix build error for book3e powerpc: Context switch the new EBB SPRs powerpc: Turn on the EBB H/FSCR bits powerpc: Replace CPU_FTR_BCTAR with CPU_FTR_ARCH_207S powerpc: Setup BHRB instructions facility in HFSCR for POWER8 powerpc: Fix interrupt range check on debug exception powerpc: Update tlbie/tlbiel as per ISA doc powerpc: Print page size info during boot powerpc: print both base and actual page size on hash failure powerpc: Fix hpte_decode to use the correct decoding for page sizes powerpc: Decode the pte-lp-encoding bits correctly. powerpc: Use encode avpn where we need only avpn values powerpc: Reduce PTE table memory wastage powerpc: Move the pte free routines from common header powerpc: Reduce the PTE_INDEX_SIZE powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format powerpc: New hugepage directory format powerpc: Don't truncate pgd_index wrongly powerpc: Don't hard code the size of pte page powerpc: Save DAR and DSISR in pt_regs on MCE ...
| * | | powerpc: Context switch the new EBB SPRsMichael Ellerman2013-05-022-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This context switches the new Event Based Branching (EBB) SPRs. The three new SPRs are: - Event Based Branch Handler Register (EBBHR) - Event Based Branch Return Register (EBBRR) - Branch Event Status and Control Register (BESCR) Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc: Turn on the EBB H/FSCR bitsMichael Neuling2013-05-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This turns Event Based Branching (EBB) on in the Hypervisor Facility Status and Control Register (HFSCR) and Facility Status and Control Register (FSCR). Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc: Replace CPU_FTR_BCTAR with CPU_FTR_ARCH_207SMichael Ellerman2013-05-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are getting low on cpu feature bits. So rather than add a separate bit for every new Power8 feature, add a bit for arch 2.07 server catagory and use that instead. Hijack the value we had for BCTAR, but swap the value with CFAR so that all the ARCH defines are together. Note we don't touch CPU_FTR_TM, because it is conditionally enabled if the kernel is built with TM support. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc: Setup BHRB instructions facility in HFSCR for POWER8Anshuman Khandual2013-05-021-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Make BHRB instructions available in problem and privileged states. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc: Fix interrupt range check on debug exceptionBharat Bhushan2013-05-023-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do not want to take single step and branch-taken debug exception in kernel exception code. But the address range check was not covering all kernel exception handlers address range. With this patch we defined the interrupt_end label which defines the end on kernel exception code. So now we check interrupt_base to interrupt_end range for not handling debug exception in kernel exception entry. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc: Reduce PTE table memory wastageAneesh Kumar K.V2013-04-301-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We allocate one page for the last level of linux page table. With THP and large page size of 16MB, that would mean we are wasting large part of that page. To map 16MB area, we only need a PTE space of 2K with 64K page size. This patch reduce the space wastage by sharing the page allocated for the last level of linux page table with multiple pmd entries. We call these smaller chunks PTE page fragments and allocated page, PTE page. In order to support systems which doesn't have 64K HPTE support, we also add another 2K to PTE page fragment. The second half of the PTE fragments is used for storing slot and secondary bit information of an HPTE. With this we now have a 4K PTE fragment. We use a simple approach to share the PTE page. On allocation, we bump the PTE page refcount to 16 and share the PTE page with the next 16 pte alloc request. This should help in the node locality of the PTE page fragment, assuming that the immediate pte alloc request will mostly come from the same NUMA node. We don't try to reuse the freed PTE page fragment. Hence we could be waisting some space. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc: Save DAR and DSISR in pt_regs on MCEAneesh Kumar K.V2013-04-301-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We were not saving DAR and DSISR on MCE. Save then and also print the values along with exception details in xmon. Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc/booke: Remove obsolete macro FINISH_EXCEPTIONKevin Hao2013-04-301-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is stale and not used by anyone now. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc/rtas_flash: Fix bad memory accessVasant Hegde2013-04-301-10/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use kmem_cache_alloc() to allocate memory to hold the new firmware which will be flashed. kmem_cache_alloc() calls rtas_block_ctor() to set memory to NULL. But these constructor is called only for newly allocated slabs. If we run below command multiple time without rebooting, allocator may allocate memory from the area which was free'd by kmem_cache_free and it will not call constructor. In this situation we may hit kernel oops. dd if=<fw image> of=/proc/ppc64/rtas/firmware_flash bs=4096 oops message: ------------- [ 1602.399755] Oops: Kernel access of bad area, sig: 11 [#1] [ 1602.399772] SMP NR_CPUS=1024 NUMA pSeries [ 1602.399779] Modules linked in: rtas_flash nfsd lockd auth_rpcgss nfs_acl sunrpc fuse loop dm_mod sg ipv6 ses enclosure ehea ehci_pci ohci_hcd ehci_hcd usbcore sd_mod usb_common crc_t10dif scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw scsi_dh_rdac scsi_dh ipr libata scsi_mod [ 1602.399817] NIP: d00000000a170b9c LR: d00000000a170b64 CTR: c00000000079cd58 [ 1602.399823] REGS: c0000003b9937930 TRAP: 0300 Not tainted (3.9.0-rc4-0.27-ppc64) [ 1602.399828] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 22000428 XER: 20000000 [ 1602.399841] SOFTE: 1 [ 1602.399844] CFAR: c000000000005f24 [ 1602.399848] DAR: 8c2625a820631fef, DSISR: 40000000 [ 1602.399852] TASK = c0000003b4520760[3655] 'dd' THREAD: c0000003b9934000 CPU: 3 GPR00: 8c2625a820631fe7 c0000003b9937bb0 d00000000a179f28 d00000000a171f08 GPR04: 0000000010040000 0000000000001000 c0000003b9937df0 c0000003b5fb2080 GPR08: c0000003b58f7200 d00000000a179f28 c0000003b40058d4 c00000000079cd58 GPR12: d00000000a171450 c000000007f40900 0000000000000005 0000000010178d20 GPR16: 00000000100cb9d8 000000000000001d 0000000000000000 000000001003ffff GPR20: 0000000000000001 0000000000000000 00003fffa0b50d30 000000001001f010 GPR24: 0000000010020888 0000000010040000 d00000000a171f08 d00000000a172808 GPR28: 0000000000001000 0000000010040000 c0000003b4005880 8c2625a820631fe7 [ 1602.399924] NIP [d00000000a170b9c] .rtas_flash_write+0x7c/0x1e8 [rtas_flash] [ 1602.399930] LR [d00000000a170b64] .rtas_flash_write+0x44/0x1e8 [rtas_flash] [ 1602.399934] Call Trace: [ 1602.399939] [c0000003b9937bb0] [d00000000a170b64] .rtas_flash_write+0x44/0x1e8 [rtas_flash] (unreliable) [ 1602.399948] [c0000003b9937c60] [c000000000282830] .proc_reg_write+0x90/0xe0 [ 1602.399955] [c0000003b9937ce0] [c0000000001ff374] .vfs_write+0x114/0x238 [ 1602.399961] [c0000003b9937d80] [c0000000001ff5d8] .SyS_write+0x70/0xe8 [ 1602.399968] [c0000003b9937e30] [c000000000009cdc] syscall_exit+0x0/0xa0 [ 1602.399973] Instruction dump: [ 1602.399977] eb698010 801b0028 2f80dcd6 419e00a4 2fbc0000 419e009c ebfb0030 2fbf0000 [ 1602.399989] 409e0010 480000d8 60000000 7c1f0378 <e81f0008> 2fa00000 409efff4 e81f0000 [ 1602.400012] ---[ end trace b4136d115dc31dac ]--- [ 1602.402178] [ 1602.402185] Sending IPI to other CPUs [ 1602.403329] IPI complete This patch uses kmem_cache_zalloc() instead of kmem_cache_alloc() to allocate memory, which makes sure memory is set to 0 before using. Also removes rtas_block_ctor(), which is no longer required. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | Merge remote-tracking branch 'kumar/next' into nextBenjamin Herrenschmidt2013-04-305-5/+101
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From Kumar Gala: << Add support for T4 and B4 SoC families from Freescale, e6500 altivec support, some various board fixes and other minor cleanups. >>
| | * | | powerpc: Add paravirt idle loop for 64-bit Book-EStuart Yoder2013-03-132-2/+32
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * | | powerpc/85xx: Add AltiVec support for e6500Kumar Gala2013-03-123-3/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The e6500 core adds support for AltiVec on a Book-E class processor. Connect up all the various exception handling code and build config mechanisms to allow user spaces apps to utilize AltiVec. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| * | | | powerpc: Initialise PMU related regs on Power8Michael Ellerman2013-04-261-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For both HV and guest kernels, intialise PMU regs to something sane. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc: Fix "attempt to move .org backwards" errorPaul Mackerras2013-04-261-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building a 64-bit powerpc kernel with PR KVM enabled currently gives this error: AS arch/powerpc/kernel/head_64.o arch/powerpc/kernel/exceptions-64s.S: Assembler messages: arch/powerpc/kernel/exceptions-64s.S:258: Error: attempt to move .org backwards make[2]: *** [arch/powerpc/kernel/head_64.o] Error 1 This happens because the MASKABLE_EXCEPTION_PSERIES macro turns into 33 instructions, but we only have space for 32 at the decrementer interrupt vector (from 0x900 to 0x980). In the code generated by the MASKABLE_EXCEPTION_PSERIES macro, we currently have two instances of the HMT_MEDIUM macro, which has the effect of setting the SMT thread priority to medium. One is the first instruction, and is overwritten by a no-op on processors where we save the PPR (processor priority register), that is, POWER7 or later. The other is after we have saved the PPR. In order to reduce the code at 0x900 by one instruction, we omit the first HMT_MEDIUM. On processors without SMT this will have no effect since HMT_MEDIUM is a no-op there. On POWER5 and RS64 machines this will mean that the first few instructions take a little longer in the case where a decrementer interrupt occurs when the hardware thread is running at low SMT priority. On POWER6 and later machines, the hardware automatically boosts the thread priority when a decrementer interrupt is taken if the thread priority was below medium, so this change won't make any difference. The alternative would be to branch out of line after saving the CFAR. However, that would incur an extra overhead on all processors, whereas the approach adopted here only adds overhead on older threaded processors. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
OpenPOWER on IntegriCloud