summaryrefslogtreecommitdiffstats
path: root/Documentation/virtual
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'kvm-arm-for-v4.12-round2' of ↵Paolo Bonzini2017-05-092-0/+127
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD Second round of KVM/ARM Changes for v4.12. Changes include: - A fix related to the 32-bit idmap stub - A fix to the bitmask used to deode the operands of an AArch32 CP instruction - We have moved the files shared between arch/arm/kvm and arch/arm64/kvm to virt/kvm/arm - We add support for saving/restoring the virtual ITS state to userspace
| * KVM: arm/arm64: Clarification and relaxation to ITS save/restore ABIChristoffer Dall2017-05-091-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Clarify what is meant by the save/restore ABI only supporting virtual physical interrupts. Relax the requirement of the order that the collection entries are written in and be clear that there is no particular ordering enforced. Some cosmetic changes in the capitalization of ID names to align with the GICv3 manual and remove the empty line in the bottom of the patch. Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
| * KVM: arm/arm64: Add GICV3 pending table save API documentationEric Auger2017-05-081-0/+6
| | | | | | | | | | | | | | | | | | Add description for how to save GICV3 LPI pending bit into guest RAM pending tables. Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Christoffer Dall <cdall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
| * KVM: arm/arm64: Add ITS save/restore API documentationEric Auger2017-05-081-0/+120
| | | | | | | | | | | | | | | | Add description for how to access ITS registers and how to save/restore ITS tables into/from memory. Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Eric Auger <eric.auger@redhat.com>
* | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2017-05-086-216/+271
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Paolo Bonzini: "ARM: - HYP mode stub supports kexec/kdump on 32-bit - improved PMU support - virtual interrupt controller performance improvements - support for userspace virtual interrupt controller (slower, but necessary for KVM on the weird Broadcom SoCs used by the Raspberry Pi 3) MIPS: - basic support for hardware virtualization (ImgTec P5600/P6600/I6400 and Cavium Octeon III) PPC: - in-kernel acceleration for VFIO s390: - support for guests without storage keys - adapter interruption suppression x86: - usual range of nVMX improvements, notably nested EPT support for accessed and dirty bits - emulation of CPL3 CPUID faulting generic: - first part of VCPU thread request API - kvm_stat improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) kvm: nVMX: Don't validate disabled secondary controls KVM: put back #ifndef CONFIG_S390 around kvm_vcpu_kick Revert "KVM: Support vCPU-based gfn->hva cache" tools/kvm: fix top level makefile KVM: x86: don't hold kvm->lock in KVM_SET_GSI_ROUTING KVM: Documentation: remove VM mmap documentation kvm: nVMX: Remove superfluous VMX instruction fault checks KVM: x86: fix emulation of RSM and IRET instructions KVM: mark requests that need synchronization KVM: return if kvm_vcpu_wake_up() did wake up the VCPU KVM: add explicit barrier to kvm_vcpu_kick KVM: perform a wake_up in kvm_make_all_cpus_request KVM: mark requests that do not need a wakeup KVM: remove #ifndef CONFIG_S390 around kvm_vcpu_wake_up KVM: x86: always use kvm_make_request instead of set_bit KVM: add kvm_{test,clear}_request to replace {test,clear}_bit s390: kvm: Cpu model support for msa6, msa7 and msa8 KVM: x86: remove irq disablement around KVM_SET_CLOCK/KVM_GET_CLOCK kvm: better MWAIT emulation for guests KVM: x86: virtualize cpuid faulting ...
| * | KVM: Documentation: remove VM mmap documentationJann Horn2017-04-281-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | Since commit 80f5b5e700fa9c ("KVM: remove vm mmap method"), the VM mmap handler is gone. Remove the corresponding documentation. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | Merge tag 'kvm-arm-for-v4.12' of ↵Paolo Bonzini2017-04-272-0/+94
| |\ \ | | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM Changes for v4.12. Changes include: - Using the common sysreg definitions between KVM and arm64 - Improved hyp-stub implementation with support for kexec and kdump on the 32-bit side - Proper PMU exception handling - Performance improvements of our GIC handling - Support for irqchip in userspace with in-kernel arch-timers and PMU support - A fix for a race condition in our PSCI code Conflicts: Documentation/virtual/kvm/api.txt include/uapi/linux/kvm.h
| | * KVM: arm/arm64: Add ARM user space interrupt signaling ABIAlexander Graf2017-04-091-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have 2 modes for dealing with interrupts in the ARM world. We can either handle them all using hardware acceleration through the vgic or we can emulate a gic in user space and only drive CPU IRQ pins from there. Unfortunately, when driving IRQs from user space, we never tell user space about events from devices emulated inside the kernel, which may result in interrupt line state changes, so we lose out on for example timer and PMU events if we run with user space gic emulation. Define an ABI to publish such device output levels to userspace. Reviewed-by: Alexander Graf <agraf@suse.de> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * arm/arm64: Add hyp-stub API documentationMarc Zyngier2017-04-091-0/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to help people understanding the hyp-stub API that exists between the host kernel and the hypervisor mode (whether a hypervisor has been installed or not), let's document said API. As with any form of documentation, I expect it to become obsolete and completely misleading within 20 minutes after having being merged. Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
| * | Merge tag 'kvm-s390-next-4.12-3' of ↵Paolo Bonzini2017-04-271-1/+2
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: MSA8 feature for guests - Detect all function codes for KMA and export the features for use in the cpu model
| | * | s390: kvm: Cpu model support for msa6, msa7 and msa8Jason J. Herne2017-04-261-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | msa6 and msa7 require no changes. msa8 adds kma instruction and feature area. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | | kvm: better MWAIT emulation for guestsMichael S. Tsirkin2017-04-211-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Guests that are heavy on futexes end up IPI'ing each other a lot. That can lead to significant slowdowns and latency increase for those guests when running within KVM. If only a single guest is needed on a host, we have a lot of spare host CPU time we can throw at the problem. Modern CPUs implement a feature called "MWAIT" which allows guests to wake up sleeping remote CPUs without an IPI - thus without an exit - at the expense of never going out of guest context. The decision whether this is something sensible to use should be up to the VM admin, so to user space. We can however allow MWAIT execution on systems that support it properly hardware wise. This patch adds a CAP to user space and a KVM cpuid leaf to indicate availability of native MWAIT execution. With that enabled, the worst a guest can do is waste as many cycles as a "jmp ." would do, so it's not a privilege problem. We consciously do *not* expose the feature in our CPUID bitmap, as most people will want to benefit from sleeping vCPUs to allow for over commit. Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> [agraf: fix amd, change commit message] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | KVM: PPC: VFIO: Add in-kernel acceleration for VFIOAlexey Kardashevskiy2017-04-201-2/+16
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT and H_STUFF_TCE requests targeted an IOMMU TCE table used for VFIO without passing them to user space which saves time on switching to user space and back. This adds H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE handlers to KVM. KVM tries to handle a TCE request in the real mode, if failed it passes the request to the virtual mode to complete the operation. If it a virtual mode handler fails, the request is passed to the user space; this is not expected to happen though. To avoid dealing with page use counters (which is tricky in real mode), this only accelerates SPAPR TCE IOMMU v2 clients which are required to pre-register the userspace memory. The very first TCE request will be handled in the VFIO SPAPR TCE driver anyway as the userspace view of the TCE table (iommu_table::it_userspace) is not allocated till the very first mapping happens and we cannot call vmalloc in real mode. If we fail to update a hardware IOMMU table unexpected reason, we just clear it and move on as there is nothing really we can do about it - for example, if we hot plug a VFIO device to a guest, existing TCE tables will be mirrored automatically to the hardware and there is no interface to report to the guest about possible failures. This adds new attribute - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE - to the VFIO KVM device. It takes a VFIO group fd and SPAPR TCE table fd and associates a physical IOMMU table with the SPAPR TCE table (which is a guest view of the hardware IOMMU table). The iommu_table object is cached and referenced so we do not have to look up for it in real mode. This does not implement the UNSET counterpart as there is no use for it - once the acceleration is enabled, the existing userspace won't disable it unless a VFIO container is destroyed; this adds necessary cleanup to the KVM_DEV_VFIO_GROUP_DEL handler. This advertises the new KVM_CAP_SPAPR_TCE_VFIO capability to the user space. This adds real mode version of WARN_ON_ONCE() as the generic version causes problems with rcu_sched. Since we testing what vmalloc_to_phys() returns in the code, this also adds a check for already existing vmalloc_to_phys() call in kvmppc_rm_h_put_tce_indirect(). This finally makes use of vfio_external_user_iommu_id() which was introduced quite some time ago and was considered for removal. Tests show that this patch increases transmission speed from 220MB/s to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
| * | Merge tag 'kvm-s390-next-4.12-1' of ↵Radim Krčmář2017-04-112-3/+55
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux From: Christian Borntraeger <borntraeger@de.ibm.com> KVM: s390: features for 4.12 1. guarded storage support for guests This contains an s390 base Linux feature branch that is necessary to implement the KVM part 2. Provide an interface to implement adapter interruption suppression which is necessary for proper zPCI support 3. Use more defines instead of numbers 4. Provide logging for lazy enablement of runtime instrumentation
| | * | KVM: s390: introduce AIS capabilityYi Min Zhao2017-04-071-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a cap to enable AIS facility bit, and add documentation for this capability. Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | KVM: s390: introduce adapter interrupt inject functionYi Min Zhao2017-04-061-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inject adapter interrupts on a specified adapter which allows to retrieve the adapter flags, e.g. if the adapter is subject to AIS facility or not. And add documentation for this interface. For adapters subject to AIS, handle the airq injection suppression for a given ISC according to the interruption mode: - before injection, if NO-Interruptions Mode, just return 0 and suppress, otherwise, allow the injection. - after injection, if SINGLE-Interruption Mode, change it to NO-Interruptions Mode to suppress the following interrupts. Besides, add tracepoint for suppressed airq and AIS mode transitions. Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | KVM: s390: introduce ais mode modify functionFei Li2017-04-061-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide an interface for userspace to modify AIS (adapter-interruption-suppression) mode state, and add documentation for the interface. Allowed target modes are ALL-Interruptions mode and SINGLE-Interruption mode. We introduce the 'simm' and 'nimm' fields in kvm_s390_float_interrupt to store interruption modes for each ISC. Each bit in 'simm' and 'nimm' targets to one ISC, and collaboratively indicate three modes: ALL-Interruptions, SINGLE-Interruption and NO-Interruptions. This interface can initiate most transitions between the states; transition from SINGLE-Interruption to NO-Interruptions via adapter interrupt injection will be introduced in a following patch. The meaningful combinations are as follows: interruption mode | simm bit | nimm bit ------------------|----------|---------- ALL | 0 | 0 SINGLE | 1 | 0 NO | 1 | 1 Besides, add tracepoint to track AIS mode transitions. Co-Authored-By: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | KVM: s390: interface for suppressible I/O adaptersFei Li2017-04-061-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to properly implement adapter-interruption suppression, we need a way for userspace to specify which adapters are subject to suppression. Let's convert the existing (and unused) 'pad' field into a 'flags' field and define a flag value for suppressible adapters. Besides, add documentation for the interface. Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | KVM: s390: gs support for kvm guestsFan Zhang2017-03-221-0/+9
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds guarded storage support for KVM guest. We need to setup the necessary control blocks, the kvm_run structure for the new registers, the necessary wrappers for VSIE, as well as the machine check save areas. GS is enabled lazily and the register saving and reloading is done in KVM code. As this feature adds new content for migration, we provide a new capability for enablement (KVM_CAP_S390_GS). Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: x86: drop legacy device assignmentPaolo Bonzini2017-04-071-204/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | Legacy device assignment has been deprecated since 4.2 (released 1.5 years ago). VFIO is better and everyone should have switched to it. If they haven't, this should convince them. :) Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | Merge tag 'kvm_mips_4.12_1' of ↵Radim Krčmář2017-04-062-1/+94
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/kvm-mips From: James Hogan <james.hogan@imgtec.com> KVM: MIPS: VZ support, Octeon III, and TLBR Add basic support for the MIPS Virtualization Module (generally known as MIPS VZ) in KVM. We primarily support the ImgTec P5600, P6600, I6400, and Cavium Octeon III cores so far. Support is included for the following VZ / guest hardware features: - MIPS32 and MIPS64, r5 (VZ requires r5 or later) and r6 - TLBs with GuestID (IMG cores) or Root ASID Dealias (Octeon III) - Shared physical root/guest TLB (IMG cores) - FPU / MSA - Cop0 timer (up to 1GHz for now due to soft timer limit) - Segmentation control (EVA) - Hardware page table walker (HTW) both for root and guest TLB Also included is a proper implementation of the TLBR instruction for the trap & emulate MIPS KVM implementation. Preliminary MIPS architecture changes are applied directly with Ralf's ack.
| | * | KVM: MIPS/VZ: Emulate MAARs when necessaryJames Hogan2017-03-281-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add emulation of Memory Accessibility Attribute Registers (MAARs) when necessary. We can't actually do anything with whatever the guest provides, but it may not be possible to clear Guest.Config5.MRP so we have to emulate at least a pair of MAARs. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-doc@vger.kernel.org
| | * | KVM: MIPS/VZ: Support guest hardware page table walkerJames Hogan2017-03-281-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for VZ guest CP0_PWBase, CP0_PWField, CP0_PWSize, and CP0_PWCtl registers for controlling the guest hardware page table walker (HTW) present on P5600 and P6600 cores. These guest registers need initialising on R6, context switching, and exposing via the KVM ioctl API when they are present. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-doc@vger.kernel.org
| | * | KVM: MIPS/VZ: Support guest segmentation controlJames Hogan2017-03-281-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for VZ guest CP0_SegCtl0, CP0_SegCtl1, and CP0_SegCtl2 registers, as found on P5600 and P6600 cores. These guest registers need initialising, context switching, and exposing via the KVM ioctl API when they are present. They also require the GVA -> GPA translation code for handling a GVA root exception to be updated to interpret the segmentation registers and decode the faulting instruction enough to detect EVA memory access instructions. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-doc@vger.kernel.org
| | * | KVM: MIPS/VZ: Support guest CP0_[X]ContextConfigJames Hogan2017-03-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for VZ guest CP0_ContextConfig and CP0_XContextConfig (MIPS64 only) registers, as found on P5600 and P6600 cores. These guest registers need initialising, context switching, and exposing via the KVM ioctl API when they are present. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-doc@vger.kernel.org
| | * | KVM: MIPS/VZ: Support guest CP0_BadInstr[P]James Hogan2017-03-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for VZ guest CP0_BadInstr and CP0_BadInstrP registers, as found on most VZ capable cores. These guest registers need context switching, and exposing via the KVM ioctl API when they are present. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-doc@vger.kernel.org
| | * | KVM: MIPS: Implement VZ supportJames Hogan2017-03-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the main support for the MIPS Virtualization ASE (A.K.A. VZ) to MIPS KVM. The bulk of this work is in vz.c, with various new state and definitions elsewhere. Enough is implemented to be able to run on a minimal VZ core. Further patches will fill out support for guest features which are optional or can be disabled. Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-doc@vger.kernel.org
| | * | KVM: MIPS: Add 64BIT capabilityJames Hogan2017-03-281-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new KVM_CAP_MIPS_64BIT capability to indicate that 64-bit MIPS guests are available and supported. In this case it should still be possible to run 32-bit guest code. If not available it won't be possible to run 64-bit guest code and the instructions may not be available, or the kernel may not support full context switching of 64-bit registers. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-doc@vger.kernel.org
| | * | KVM: MIPS: Add VZ & TE capabilitiesJames Hogan2017-03-281-1/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add new KVM_CAP_MIPS_VZ and KVM_CAP_MIPS_TE capabilities, and in order to allow MIPS KVM to support VZ without confusing old users (which expect the trap & emulate implementation), define and start checking KVM_CREATE_VM type codes. The codes available are: - KVM_VM_MIPS_TE = 0 This is the current value expected from the user, and will create a VM using trap & emulate in user mode, confined to the user mode address space. This may in future become unavailable if the kernel is only configured to support VZ, in which case the EINVAL error will be returned and KVM_CAP_MIPS_TE won't be available even though KVM_CAP_MIPS_VZ is. - KVM_VM_MIPS_VZ = 1 This can be provided when the KVM_CAP_MIPS_VZ capability is available to create a VM using VZ, with a fully virtualized guest virtual address space. If VZ support is unavailable in the kernel, the EINVAL error will be returned (although old kernels without the KVM_CAP_MIPS_VZ capability may well succeed and create a trap & emulate VM). This is designed to allow the desired implementation (T&E vs VZ) to be potentially chosen at runtime rather than being fixed in the kernel configuration. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-doc@vger.kernel.org
| | * | KVM: MIPS: Implement HYPCALL emulationJames Hogan2017-03-281-0/+5
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Emulate the HYPCALL instruction added in the VZ ASE and used by the MIPS paravirtualised guest support that is already merged. The new hypcall.c handles arguments and the return value. No actual hypercalls are yet supported, but this still allows us to safely step over hypercalls and set an error code in the return value for forward compatibility. Non-zero HYPCALL codes are not handled. We also document the hypercall ABI which asm/kvm_para.h uses. Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com> Cc: David Daney <david.daney@cavium.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-doc@vger.kernel.org
* | | Merge tag 'kvm-arm-for-v4.11-rc6' of ↵Radim Krčmář2017-04-051-0/+6
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm From: Christoffer Dall <cdall@linaro.org> KVM/ARM Fixes for v4.11-rc6 Fixes include: - Fix a problem with GICv3 userspace save/restore - Clarify GICv2 userspace save/restore ABI - Be more careful in clearing GIC LRs - Add missing synchronization primitive to our MMU handling code
| * | KVM: arm/arm64: vgic: Fix GICC_PMR uaccess on GICv3 and clarify ABIChristoffer Dall2017-04-041-0/+6
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As an oversight, for GICv2, we accidentally export the GICC_PMR register in the format of the GICH_VMCR.VMPriMask field in the lower 5 bits of a word, meaning that userspace must always use the lower 5 bits to communicate with the KVM device and must shift the value left by 3 places to obtain the actual priority mask level. Since GICv3 supports the full 8 bits of priority masking in the ICH_VMCR, we have to fix the value we export when emulating a GICv2 on top of a hardware GICv3 and exporting the emulated GICv2 state to userspace. Take the chance to clarify this aspect of the ABI. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
* | KVM: Documentation: document MCE ioctlsLuiz Capitulino2017-03-201-0/+63
|/ | | | | | Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* KVM: Add documentation for KVM_CAP_NR_MEMSLOTSLinu Cherian2017-03-091-0/+4
| | | | | | | | Add documentation for KVM_CAP_NR_MEMSLOTS capability. Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Linu Cherian <linu.cherian@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* Merge tag 'docs-4.11-fixes' of git://git.lwn.net/linuxLinus Torvalds2017-03-041-3/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | Pull documentation fixes from Jonathan Corbet: "A few fixes for the docs tree, including one for a 4.11 build regression" * tag 'docs-4.11-fixes' of git://git.lwn.net/linux: Documentation/sphinx: fix primary_domain configuration docs: Fix htmldocs build failure doc/ko_KR/memory-barriers: Update control-dependencies section pcieaer doc: update the link Documentation: Update path to sysrq.txt
| * Documentation: Update path to sysrq.txtKrzysztof Kozlowski2017-03-031-3/+3
| | | | | | | | | | | | | | | | | | Commit 9d85025b0418 ("docs-rst: create an user's manual book") moved the sysrq.txt leaving old paths in the kernel docs. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
* | KVM: race-free exit from KVM_RUN without POSIX signalsPaolo Bonzini2017-02-171-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick" a VCPU out of KVM_RUN through a POSIX signal. A signal is attached to a dummy signal handler; by blocking the signal outside KVM_RUN and unblocking it inside, this possible race is closed: VCPU thread service thread -------------------------------------------------------------- check flag set flag raise signal (signal handler does nothing) KVM_RUN However, one issue with KVM_SET_SIGNAL_MASK is that it has to take tsk->sighand->siglock on every KVM_RUN. This lock is often on a remote NUMA node, because it is on the node of a thread's creator. Taking this lock can be very expensive if there are many userspace exits (as is the case for SMP Windows VMs without Hyper-V reference time counter). As an alternative, we can put the flag directly in kvm_run so that KVM can see it: VCPU thread service thread -------------------------------------------------------------- raise signal signal handler set run->immediate_exit KVM_RUN check run->immediate_exit Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | Merge tag 'kvmarm-for-4.11' of ↵Paolo Bonzini2017-02-091-3/+8
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD kvmarm updates for 4.11 - GICv3 save restore - Cache flushing fixes - MSI injection fix for GICv3 ITS - Physical timer emulation support
| * | KVM: arm/arm64: Documentation: Update arm-vgic-v3.txtVijaya Kumar K2017-01-301-3/+8
| |/ | | | | | | | | | | | | | | | | | | Update error code returned for Invalid CPU interface register value and access in AArch32 mode. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* | Merge tag 'kvm_mips_4.11_1' of ↵Paolo Bonzini2017-02-071-0/+10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/kvm-mips into HEAD KVM: MIPS: GVA/GPA page tables, dirty logging, SYNC_MMU etc Numerous MIPS KVM fixes, improvements, and features for 4.11, many of which continue to pave the way for VZ support, the most interesting of which are: - Add GVA->HPA page tables for T&E, to cache GVA mappings. - Generate fast-path TLB refill exception handler which loads host TLB entries from GVA page table, avoiding repeated guest memory translation and guest TLB lookups. - Use uaccess macros when T&E needs to access guest memory, which with GVA page tables and the Linux TLB refill handler improves robustness against TLB faults and fixes EVA hosts. - Use BadInstr/BadInstrP registers when available to obtain instruction encodings after a synchronous trap. - Add GPA->HPA page tables to replace the inflexible linear array, allowing for multiple sparsely arranged memory regions. - Properly implement dirty page logging. - Add KVM_CAP_SYNC_MMU support so that changes in GPA mappings become effective in guests even if they are already running, allowing for copy-on-write, KSM, idle page tracking, swapping, and guest memory ballooning. - Add KVM_CAP_READONLY_MEM support, so writes to specified memory regions are treated as MMIO. - Implement proper CP0_EBase support in T&E. - Expose a few more missing CP0 registers to userland. - Add KVM_CAP_NR_VCPUS and KVM_CAP_MAX_VCPUS support, and allow up to 8 VCPUs to be created in a VM. - Various cleanups and dropping of dead and duplicated code.
| * | KVM: MIPS/T&E: Expose read-only CP0_IntCtl registerJames Hogan2017-02-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Expose the CP0_IntCtl register through the KVM register access API, which is a required register since MIPS32r2. It is currently read-only since the VS field isn't implemented due to lack of Config3.VInt or Config3.VEIC. It is implemented in trap_emul.c so that a VZ implementation can allow writes. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
| * | KVM: MIPS/T&E: Expose CP0_EntryLo0/1 registersJames Hogan2017-02-031-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Expose the CP0_EntryLo0 and CP0_EntryLo1 registers through the KVM register access API. This is fairly straightforward for trap & emulate since we don't support the RI and XI bits. For the sake of future proofing (particularly for VZ) it is explicitly specified that the API always exposes the 64-bit version of these registers (i.e. with the RI and XI bits in bit positions 63 and 62 respectively), and they are implemented in trap_emul.c rather than mips.c to allow them to be implemented differently for VZ. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
| * | KVM: MIPS/T&E: Implement CP0_EBase registerJames Hogan2017-02-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The CP0_EBase register is a standard feature of MIPS32r2, so we should always have been implementing it properly. However the register value was ignored and wasn't exposed to userland. Fix the emulation of exceptions and interrupts to use the value stored in guest CP0_EBase, and fix the masks so that the top 3 bits (rather than the standard 2) are fixed, so that it is always in the guest KSeg0 segment. Also add CP0_EBASE to the KVM one_reg interface so it can be accessed by userland, also allowing the CPU number field to be written (which isn't permitted by the guest). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
* | | Merge branch 'kvm-ppc-next' of ↵Paolo Bonzini2017-02-071-7/+187
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD The big feature this time is support for POWER9 using the radix-tree MMU for host and guest. This required some changes to arch/powerpc code, so I talked with Michael Ellerman and he created a topic branch with this patchset, which I merged into kvm-ppc-next and which Michael will pull into his tree. Michael also put in some patches from Nick Piggin which fix bugs in the interrupt vector code in relocatable kernels when coming from a KVM guest. Other notable changes include: * Add the ability to change the size of the hashed page table, from David Gibson. * XICS (interrupt controller) emulation fixes and improvements, from Li Zhong. * Bug fixes from myself and Thomas Huth. These patches define some new KVM capabilities and ioctls, but there should be no conflicts with anything else currently upstream, as far as I am aware.
| * | | KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB ioctl() to change HPT sizeDavid Gibson2017-01-311-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The KVM_PPC_ALLOCATE_HTAB ioctl() is used to set the size of hashed page table (HPT) that userspace expects a guest VM to have, and is also used to clear that HPT when necessary (e.g. guest reboot). At present, once the ioctl() is called for the first time, the HPT size can never be changed thereafter - it will be cleared but always sized as from the first call. With upcoming HPT resize implementation, we're going to need to allow userspace to resize the HPT at reset (to change it back to the default size if the guest changed it). So, we need to allow this ioctl() to change the HPT size. This patch also updates Documentation/virtual/kvm/api.txt to reflect the new behaviour. In fact the documentation was already slightly incorrect since 572abd5 "KVM: PPC: Book3S HV: Don't fall back to smaller HPT size in allocation ioctl" Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
| * | | KVM: PPC: Book3S HV: HPT resizing documentation and reserved numbersDavid Gibson2017-01-311-0/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a new powerpc-specific KVM_CAP_SPAPR_RESIZE_HPT capability to advertise whether KVM is capable of handling the PAPR extensions for resizing the hashed page table during guest runtime. It also adds definitions for two new VM ioctl()s to implement this extension, and documentation of the same. Note that, HPT resizing is already possible with KVM PR without kernel modification, since the HPT is managed within userspace (qemu). The capability defined here will only be set where an in-kernel implementation of resizing is necessary, i.e. for KVM HV. To determine if the userspace resize implementation can be used, it's necessary to check KVM_CAP_PPC_ALLOC_HTAB. Unfortunately older kernels incorrectly set KVM_CAP_PPC_ALLOC_HTAB even with KVM PR. If userspace it want to support resizing with KVM PR on such kernels, it will need a workaround. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
| * | | Documentation: Correct duplicate section number in kvm/api.txtDavid Gibson2017-01-311-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both KVM_CREATE_SPAPR_TCE_64 and KVM_REINJECT_CONTROL have section number 4.98 in Documentation/virtual/kvm/api.txt, presumably due to a naive merge. This corrects the duplication. [paulus@ozlabs.org - correct section numbers for following sections, KVM_PPC_CONFIGURE_V3_MMU and KVM_PPC_GET_RMMU_INFO, as well.] Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
| * | | Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-nextPaul Mackerras2017-01-311-0/+83
| |\ \ \ | | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | This merges in the POWER9 radix MMU host and guest support, which was put into a topic branch because it touches both powerpc and KVM code. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
| | * | KVM: PPC: Book3S HV: Add userspace interfaces for POWER9 MMUPaul Mackerras2017-01-311-0/+83
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds two capabilities and two ioctls to allow userspace to find out about and configure the POWER9 MMU in a guest. The two capabilities tell userspace whether KVM can support a guest using the radix MMU, or using the hashed page table (HPT) MMU with a process table and segment tables. (Note that the MMUs in the POWER9 processor cores do not use the process and segment tables when in HPT mode, but the nest MMU does). The KVM_PPC_CONFIGURE_V3_MMU ioctl allows userspace to specify whether a guest will use the radix MMU or the HPT MMU, and to specify the size and location (in guest space) of the process table. The KVM_PPC_GET_RMMU_INFO ioctl gives userspace information about the radix MMU. It returns a list of supported radix tree geometries (base page size and number of bits indexed at each level of the radix tree) and the encoding used to specify the various page sizes for the TLB invalidate entry instruction. Initially, both capabilities return 0 and the ioctls return -EINVAL, until the necessary infrastructure for them to operate correctly is added. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | KVM: x86: add KVM_HC_CLOCK_PAIRING hypercallMarcelo Tosatti2017-02-071-0/+35
|/ / | | | | | | | | | | | | | | | | | | Add a hypercall to retrieve the host realtime clock and the TSC value used to calculate that clock read. Used to implement clock synchronization between host and guest. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
OpenPOWER on IntegriCloud