summaryrefslogtreecommitdiffstats
path: root/arch/s390/include/asm/kvm_host.h
Commit message (Collapse)AuthorAgeFilesLines
* KVM: s390: fix memory overwrites when vx is disabledDavid Hildenbrand2016-01-261-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | The kernel now always uses vector registers when available, however KVM has special logic if support is really enabled for a guest. If support is disabled, guest_fpregs.fregs will only contain memory for the fpu. The kernel, however, will store vector registers into that area, resulting in crazy memory overwrites. Simply extending that area is not enough, because the format of the registers also changes. We would have to do additional conversions, making the code even more complex. Therefore let's directly use one place for the vector/fpu registers + fpc (in kvm_run). We just have to convert the data properly when accessing it. This makes current code much easier. Please note that vector/fpu registers are now always stored to vcpu->run->s.regs.vrs. Although this data is visible to QEMU and used for migration, we only guarantee valid values to user space when KVM_SYNC_VRS is set. As that is only the case when we have vector register support, we are on the safe side. Fixes: b5510d9b68c3 ("s390/fpu: always enable the vector facility if it is available") Cc: stable@vger.kernel.org # v4.4 d9a3a09af54d s390/kvm: remove dependency on struct save_area definition Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [adopt to d9a3a09af54d]
* KVM: move architecture-dependent requests to arch/Paolo Bonzini2016-01-081-0/+4
| | | | | | | | | Since the numbers now overlap, it makes sense to enumerate them in asm/kvm_host.h rather than linux/kvm_host.h. Functions that refer to architecture-specific requests are also moved to arch/. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: s390: implement the RI support of guestFan Zhang2016-01-071-1/+2
| | | | | | | | | | | | | | | | This patch adds runtime instrumentation support for KVM guest. We need to setup a save area for the runtime instrumentation-controls control block(RICCB) and implement the necessary interfaces to live migrate the guest settings. We setup the sie control block in a way, that the runtime instrumentation instructions of a guest are handled by hardware. We also add a capability KVM_CAP_S390_RI to make this feature opt-in as it needs migration support. Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: fix mismatch between user and in-kernel guest limitDominik Dingel2015-12-151-0/+1
| | | | | | | | | | | | | | | | | | | | While the userspace interface requests the maximum size the gmap code expects to get a maximum address. This error resulted in bigger page tables than necessary for some guest sizes, e.g. a 2GB guest used 3 levels instead of 2. At the same time we introduce KVM_S390_NO_MEM_LIMIT, which allows in a bright future that a guest spans the complete 64 bit address space. We also switch to TASK_MAX_SIZE for the initial memory size, this is a cosmetic change as the previous size also resulted in a 4 level pagetable creation. Reported-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: Enable up to 248 VCPUs per VMEugene (jno) Dvurechenski2015-11-301-1/+1
| | | | | | | | | This patch allows s390 to have more than 64 VCPUs for a guest (up to 248 for memory usage considerations), if supported by the underlaying hardware (sclp.has_esca). Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: Introduce switching codeEugene (jno) Dvurechenski2015-11-301-0/+1
| | | | | | | | | | | | | | This patch adds code that performs transparent switch to Extended SCA on addition of 65th VCPU in a VM. Disposal of ESCA is added too. The entier ESCA functionality, however, is still not enabled. The enablement will be provided in a separate patch. This patch also uses read/write lock protection of SCA and its subfields for possible disposal at the BSCA-to-ESCA transition. While only Basic SCA needs such a protection (for the swap), any SCA access is now guarded. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: Make provisions for ESCA utilizationEugene (jno) Dvurechenski2015-11-301-1/+2
| | | | | | | | | | | This patch updates the routines (sca_*) to provide transparent access to and manipulation on the data for both Basic and Extended SCA in use. The kvm.arch.sca is generalized to (void *) to handle BSCA/ESCA cases. Also the kvm.arch.use_esca flag is provided. The actual functionality is kept the same. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: Introduce new structuresEugene (jno) Dvurechenski2015-11-301-6/+41
| | | | | | | | | | | | | This patch adds new structures and updates some existing ones to provide the base for Extended SCA functionality. The old sca_* structures were renamed to bsca_* to keep things uniform. The access to fields of SIGP controls were turned into bitfields instead of hardcoded bitmasks. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2015-11-051-0/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Paolo Bonzini: "First batch of KVM changes for 4.4. s390: A bunch of fixes and optimizations for interrupt and time handling. PPC: Mostly bug fixes. ARM: No big features, but many small fixes and prerequisites including: - a number of fixes for the arch-timer - introducing proper level-triggered semantics for the arch-timers - a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding) - some tracepoint improvements - a tweak for the EL2 panic handlers - some more VGIC cleanups getting rid of redundant state x86: Quite a few changes: - support for VT-d posted interrupts (i.e. PCI devices can inject interrupts directly into vCPUs). This introduces a new component (in virt/lib/) that connects VFIO and KVM together. The same infrastructure will be used for ARM interrupt forwarding as well. - more Hyper-V features, though the main one Hyper-V synthetic interrupt controller will have to wait for 4.5. These will let KVM expose Hyper-V devices. - nested virtualization now supports VPID (same as PCID but for vCPUs) which makes it quite a bit faster - for future hardware that supports NVDIMM, there is support for clflushopt, clwb, pcommit - support for "split irqchip", i.e. LAPIC in kernel + IOAPIC/PIC/PIT in userspace, which reduces the attack surface of the hypervisor - obligatory smattering of SMM fixes - on the guest side, stable scheduler clock support was rewritten to not require help from the hypervisor" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (123 commits) KVM: VMX: Fix commit which broke PML KVM: x86: obey KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0() KVM: x86: allow RSM from 64-bit mode KVM: VMX: fix SMEP and SMAP without EPT KVM: x86: move kvm_set_irq_inatomic to legacy device assignment KVM: device assignment: remove pointless #ifdefs KVM: x86: merge kvm_arch_set_irq with kvm_set_msi_inatomic KVM: x86: zero apic_arb_prio on reset drivers/hv: share Hyper-V SynIC constants with userspace KVM: x86: handle SMBASE as physical address in RSM KVM: x86: add read_phys to x86_emulate_ops KVM: x86: removing unused variable KVM: don't pointlessly leave KVM_COMPAT=y in non-KVM configs KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr() KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings KVM: arm/arm64: Optimize away redundant LR tracking KVM: s390: use simple switch statement as multiplexer KVM: s390: drop useless newline in debugging data KVM: s390: SCA must not cross page boundaries KVM: arm: Do not indent the arguments of DECLARE_BITMAP ...
| * KVM: Add kvm_arch_vcpu_{un}blocking callbacksChristoffer Dall2015-10-221-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Some times it is useful for architecture implementations of KVM to know when the VCPU thread is about to block or when it comes back from blocking (arm/arm64 needs to know this to properly implement timers, for example). Therefore provide a generic architecture callback function in line with what we do elsewhere for KVM generic-arch interactions. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* | s390/fpu: split fpu-internal.h into fpu internals, api, and type headersHendrik Brueckner2015-10-161-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | Split the API and FPU type definitions into separate header files similar to "x86/fpu: Rename fpu-internal.h to fpu/internal.h" (78f7f1e54b). The new header files and their meaning are: asm/fpu/types.h: FPU related data types, needed for 'struct thread_struct' and 'struct task_struct'. asm/fpu/api.h: FPU related 'public' functions for other subsystems and device drivers. asm/fpu/internal.h: FPU internal functions mainly used to convert FPU register contents in signal handling. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* KVM: disable halt_poll_ns as default for s390xDavid Hildenbrand2015-09-251-0/+1
| | | | | | | | | | | | We observed some performance degradation on s390x with dynamic halt polling. Until we can provide a proper fix, let's enable halt_poll_ns as default only for supported architectures. Architectures are now free to set their own halt_poll_ns default value. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: add halt_attempted_poll to VCPU statsPaolo Bonzini2015-09-161-0/+1
| | | | | | | | | | | | | | | | | This new statistic can help diagnosing VCPUs that, for any reason, trigger bad behavior of halt_poll_ns autotuning. For example, say halt_poll_ns = 480000, and wakeups are spaced exactly like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes 10+20+40+80+160+320+480 = 1110 microseconds out of every 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then is consuming about 30% more CPU than it would use without polling. This would show as an abnormally high number of attempted polling compared to the successful polls. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com< Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2015-08-311-3/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "The big one is support for fake NUMA, splitting a really large machine in more manageable piece improves performance in some cases, e.g. for a KVM host. The FICON Link Incident handling has been improved, this helps the operator to identify degraded or non-operational FICON connections. The save and restore of floating point and vector registers has been overhauled to allow the future use of vector registers in the kernel. A few small enhancement, magic sys-requests for the vt220 console via SCLP, some more assembler code has been converted to C, the PCI error handling is improved. And the usual cleanup and bug fixing" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (59 commits) s390/jump_label: Use %*ph to print small buffers s390/sclp_vt220: support magic sysrequests s390/ctrlchar: improve handling of magic sysrequests s390/numa: remove superfluous ARCH_WANT defines s390/3270: redraw screen on unsolicited device end s390/dcssblk: correct out of bounds array indexes s390/mm: simplify page table alloc/free code s390/pci: move debug messages to debugfs s390/nmi: initialize control register 0 earlier s390/zcrypt: use msleep() instead of mdelay() s390/hmcdrv: fix interrupt registration s390/setup: fix novx parameter s390/uaccess: remove uaccess_primary kernel parameter s390: remove unneeded sizeof(void *) comparisons s390/facilities: remove transactional-execution bits s390/numa: re-add DIE sched_domain_topology_level s390/dasd: enhance CUIR scope detection s390/dasd: fix failing path verification s390/vdso: emit a GNU hash s390/numa: make core to node mapping data dynamic ...
| * s390/kernel: lazy restore fpu registersHendrik Brueckner2015-07-221-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve the save and restore behavior of FPU register contents to use the vector extension within the kernel. The kernel does not use floating-point or vector registers and, therefore, saving and restoring the FPU register contents are performed for handling signals or switching processes only. To prepare for using vector instructions and vector registers within the kernel, enhance the save behavior and implement a lazy restore at return to user space from a system call or interrupt. To implement the lazy restore, the save_fpu_regs() sets a CPU information flag, CIF_FPU, to indicate that the FPU registers must be restored. Saving and setting CIF_FPU is performed in an atomic fashion to be interrupt-safe. When the kernel wants to use the vector extension or wants to change the FPU register state for a task during signal handling, the save_fpu_regs() must be called first. The CIF_FPU flag is also set at process switch. At return to user space, the FPU state is restored. In particular, the FPU state includes the floating-point or vector register contents, as well as, vector-enablement and floating-point control. The FPU state restore and clearing CIF_FPU is also performed in an atomic fashion. For KVM, the restore of the FPU register state is performed when restoring the general-purpose guest registers before the SIE instructions is started. Because the path towards the SIE instruction is interruptible, the CIF_FPU flag must be checked again right before going into SIE. If set, the guest registers must be reloaded again by re-entering the outer SIE loop. This is the same behavior as if the SIE critical section is interrupted. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | KVM: s390: Provide global debug logChristian Borntraeger2015-07-291-1/+0
| | | | | | | | | | | | | | | | In addition to the per VM debug logs, let's provide a global one for KVM-wide events, like new guests or fatal errors. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
* | KVM: s390: add kvm stat counter for all diagnosesChristian Borntraeger2015-07-291-0/+3
|/ | | | | | | | | | | | Sometimes kvm stat counters are the only performance metric to check after something went wrong. Let's add additional counters for some diagnoses. In addition do the count for diag 10 all the time, even if we inject a program interrupt. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
* KVM: add memslots argument to kvm_arch_memslots_updatedPaolo Bonzini2015-05-261-1/+1
| | | | | | | Prepare for the case of multiple address spaces. Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: s390: make exit_sie_sync more robustChristian Borntraeger2015-05-081-1/+2
| | | | | | | | | | | | | | | | | exit_sie_sync is used to kick CPUs out of SIE and prevent reentering at any point in time. This is used to reload the prefix pages and to set the IBS stuff in a way that guarantees that after this function returns we are no longer in SIE. All current users trigger KVM requests. The request must be set before we block the CPUs to avoid races. Let's make this implicit by adding the request into a new function kvm_s390_sync_requests that replaces exit_sie_sync and split out s390_vcpu_block and s390_vcpu_unblock, that can be used to keep CPUs out of SIE independent of requests. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
* KVM: s390: Enable guest EDAT2 supportGuenther Hutzl2015-05-081-0/+1
| | | | | | | | | | | | | | | | | | 1. Enable EDAT2 in the list of KVM facilities 2. Handle 2G frames in pfmf instruction If we support EDAT2, we may enable handling of 2G frames if not in 24 bit mode. 3. Enable EDAT2 in sie_block If the EDAT2 facility is available we enable GED2 mode control in the sie_block. Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Guenther Hutzl <hutzl@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: deliver floating interrupts in order of priorityJens Freimann2015-03-311-3/+27
| | | | | | | | | | | | | | | | | | | | This patch makes interrupt handling compliant to the z/Architecture Principles of Operation with regard to interrupt priorities. Add a bitmap for pending floating interrupts. Each bit relates to a interrupt type and its list. A turned on bit indicates that a list contains items (interrupts) which need to be delivered. When delivering interrupts on a cpu we can merge the existing bitmap for cpu-local interrupts and floating interrupts and have a single mechanism for delivery. Currently we have one list for all kinds of floating interrupts and a corresponding spin lock. This patch adds a separate list per interrupt type. An exception to this are service signal and machine check interrupts, as there can be only one pending interrupt at a time. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
* KVM: s390: represent SIMD cap in kvm facilityMichael Mueller2015-03-171-1/+0
| | | | | | | | | | | | | | The patch represents capability KVM_CAP_S390_VECTOR_REGISTERS by means of the SIMD facility bit. This allows to a) disable the use of SIMD when used in conjunction with a not-SIMD-aware QEMU, b) to enable SIMD when used with a SIMD-aware version of QEMU and c) finally by means of a QEMU version using the future cpu model ioctls. Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com> Tested-by: Eric Farman <farman@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: introduce post handlers for STSIEkaterina Tumanova2015-03-171-0/+1
| | | | | | | | | | | | | | | | | | The Store System Information (STSI) instruction currently collects all information it relays to the caller in the kernel. Some information, however, is only available in user space. An example of this is the guest name: The kernel always sets "KVMGuest", but user space knows the actual guest name. This patch introduces a new exit, KVM_EXIT_S390_STSI, guarded by a capability that can be enabled by user space if it wants to be able to insert such data. User space will be provided with the target buffer and the requested STSI function code. Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: Enable vector support for capable guestEric Farman2015-03-061-1/+3
| | | | | | | | | | | | | We finally have all the pieces in place, so let's include the vector facility bit in the mask of available hardware facilities for the guest to recognize. Also, enable the vector functionality in the guest control blocks, to avoid a possible vector data exception that would otherwise occur when a vector instruction is issued by the guest operating system. Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: Add new SIGP order to kernel countersEric Farman2015-03-061-0/+1
| | | | | | | | | | | The new SIGP order Store Additional Status at Address is totally handled by user space, but we should still record the occurrence of this order in the kernel code. Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: Vector exceptionsEric Farman2015-03-061-0/+1
| | | | | | | | | | | A new exception type for vector instructions is introduced with the new processor, but is handled exactly like a Data Exception which is already handled by the system. Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: Allocate and save/restore vector registersEric Farman2015-03-061-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | Define and allocate space for both the host and guest views of the vector registers for a given vcpu. The 32 vector registers occupy 128 bits each (512 bytes total), but architecturally are paired with 512 additional bytes of reserved space for future expansion. The kvm_sync_regs structs containing the registers are union'ed with 1024 bytes of padding in the common kvm_run struct. The addition of 1024 bytes of new register information clearly exceeds the existing union, so an expansion of that padding is required. When changing environments, we need to appropriately save and restore the vector registers viewed by both the host and guest, into and out of the sync_regs space. The floating point registers overlay the upper half of vector registers 0-15, so there's a bit of data duplication here that needs to be carefully avoided. Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: include guest facilities in kvm facility testMichael Mueller2015-03-041-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Most facility related decisions in KVM have to take into account: - the facilities offered by the underlying run container (LPAR/VM) - the facilities supported by the KVM code itself - the facilities requested by a guest VM This patch adds the KVM driver requested facilities to the test routine. It additionally renames struct s390_model_fac to kvm_s390_fac and its field names to be more meaningful. The semantics of the facilities stored in the KVM architecture structure is changed. The address arch.model.fac->list now points to the guest facility list and arch.model.fac->mask points to the KVM facility mask. This patch fixes the behaviour of KVM for some facilities for guests that ignore the guest visible facility bits, e.g. guests could use transactional memory intructions on hosts supporting them even if the chosen cpu model would not offer them. The userspace interface is not affected by this change. Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: add cpu model supportMichael Mueller2015-02-091-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | This patch enables cpu model support in kvm/s390 via the vm attribute interface. During KVM initialization, the host properties cpuid, IBC value and the facility list are stored in the architecture specific cpu model structure. During vcpu setup, these properties are taken to initialize the related SIE state. This mechanism allows to adjust the properties from user space and thus to implement different selectable cpu models. This patch uses the IBC functionality to block instructions that have not been implemented at the requested CPU type and GA level compared to the full host capability. Userspace has to initialize the cpu model before vcpu creation. A cpu model change of running vcpus is not possible. Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: use facilities and cpu_id per KVMMichael Mueller2015-02-091-0/+21
| | | | | | | | | | | The patch introduces facilities and cpu_ids per virtual machine. Different virtual machines may want to expose different facilities and cpu ids to the guest, so let's make them per-vm instead of global. Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390/CPACF: Choose crypto control block formatTony Krowiak2015-02-091-0/+2
| | | | | | | | | | | We need to specify a different format for the crypto control block depending on whether the APXA facility is installed or not. Let's test for it by executing the PQAP(QCI) function and use either a format-1 or a format-2 crypto control block accordingly. This is a host only change for z13 and does not affect the guest view. Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* kvm: add halt_poll_ns module parameterPaolo Bonzini2015-02-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a new module parameter for the KVM module; when it is present, KVM attempts a bit of polling on every HLT before scheduling itself out via kvm_vcpu_block. This parameter helps a lot for latency-bound workloads---in particular I tested it with O_DSYNC writes with a battery-backed disk in the host. In this case, writes are fast (because the data doesn't have to go all the way to the platters) but they cannot be merged by either the host or the guest. KVM's performance here is usually around 30% of bare metal, or 50% if you use cache=directsync or cache=writethrough (these parameters avoid that the guest sends pointless flush requests, and at the same time they are not slow because of the battery-backed cache). The bad performance happens because on every halt the host CPU decides to halt itself too. When the interrupt comes, the vCPU thread is then migrated to a new physical CPU, and in general the latency is horrible because the vCPU thread has to be scheduled back in. With this patch performance reaches 60-65% of bare metal and, more important, 99% of what you get if you use idle=poll in the guest. This means that the tunable gets rid of this particular bottleneck, and more work can be done to improve performance in the kernel or QEMU. Of course there is some price to pay; every time an otherwise idle vCPUs is interrupted by an interrupt, it will poll unnecessarily and thus impose a little load on the host. The above results were obtained with a mostly random value of the parameter (500000), and the load was around 1.5-2.5% CPU usage on one of the host's core for each idle guest vCPU. The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll, that can be used to tune the parameter. It counts how many HLT instructions received an interrupt during the polling period; each successful poll avoids that Linux schedules the VCPU thread out and back in, and may also avoid a likely trip to C1 and back for the physical CPU. While the VM is idle, a Linux 4 VCPU VM halts around 10 times per second. Of these halts, almost all are failed polls. During the benchmark, instead, basically all halts end within the polling period, except a more or less constant stream of 50 per second coming from vCPUs that are not running the benchmark. The wasted time is thus very low. Things may be slightly different for Windows VMs, which have a ~10 ms timer tick. The effect is also visible on Marcelo's recently-introduced latency test for the TSC deadline timer. Though of course a non-RT kernel has awful latency bounds, the latency of the timer is around 8000-10000 clock cycles compared to 20000-120000 without setting halt_poll_ns. For the TSC deadline timer, thus, the effect is both a smaller average latency and a smaller variance. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: s390/cpacf: Enable/disable protected key functions for kvm guestTony Krowiak2015-01-231-2/+8
| | | | | | | | | | | | | | | | Created new KVM device attributes for indicating whether the AES and DES/TDES protected key functions are available for programs running on the KVM guest. The attributes are used to set up the controls in the guest SIE block that specify whether programs running on the guest will be given access to the protected key functions available on the s390 hardware. Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael Mueller <mimu@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [split MSA4/protected key into two patches]
* KVM: s390: Provide guest TOD Clock Get/Set ControlsJason J. Herne2015-01-231-0/+1
| | | | | | | | | | | | | | | | Provide controls for setting/getting the guest TOD clock based on the VM attribute interface. Provide TOD and TOD_HIGH vm attributes on s390 for managing guest Time Of Day clock value. TOD_HIGH is presently always set to 0. In the future it will contain a high order expansion of the tod clock value after it overflows the 64-bits of the TOD. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: forward most SIGP orders to user spaceDavid Hildenbrand2015-01-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Most SIGP orders are handled partially in kernel and partially in user space. In order to: - Get a correct SIGP SET PREFIX handler that informs user space - Avoid race conditions between concurrently executed SIGP orders - Serialize SIGP orders per VCPU We need to handle all "slow" SIGP orders in user space. The remaining ones to be handled completely in kernel are: - SENSE - SENSE RUNNING - EXTERNAL CALL - EMERGENCY SIGNAL - CONDITIONAL EMERGENCY SIGNAL According to the PoP, they have to be fast. They can be executed without conflicting to the actions of other pending/concurrently executing orders (e.g. STOP vs. START). This patch introduces a new capability that will - when enabled - forward all but the mentioned SIGP orders to user space. The instruction counters in the kernel are still updated. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: clear the pfault queue if user space sets the invalid tokenDavid Hildenbrand2015-01-231-1/+0
| | | | | | | | | | | | | We need a way to clear the async pfault queue from user space (e.g. for resets and SIGP SET ARCHITECTURE). This patch simply clears the queue as soon as user space sets the invalid pfault token. The definition of the invalid token is moved to uapi. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: only one external call may be pending at a timeDavid Hildenbrand2015-01-231-3/+5
| | | | | | | | | | | | | | | | | | | | Only one external call may be pending at a vcpu at a time. For this reason, we have to detect whether the SIGP externcal call interpretation facility is available. If so, all external calls have to be injected using this mechanism. SIGP EXTERNAL CALL orders have to return whether another external call is already pending. This check was missing until now. SIGP SENSE hasn't returned yet in all conditions whether an external call was pending. If a SIGP EXTERNAL CALL irq is to be injected and one is already pending, -EBUSY is returned. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: handle stop irqs without action_bitsDavid Hildenbrand2015-01-231-5/+0
| | | | | | | | | | | | | | | | | | | | | This patch removes the famous action_bits and moves the handling of SIGP STOP AND STORE STATUS directly into the SIGP STOP interrupt. The new local interrupt infrastructure is used to track pending stop requests. STOP irqs are the only irqs that don't get actively delivered. They remain pending until the stop function is executed (=stop intercept). If another STOP irq is already pending, -EBUSY will now be returned (needed for the SIGP handling code). Migration of pending SIGP STOP (AND STORE STATUS) orders should now be supported out of the box. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: new parameter for SIGP STOP irqsDavid Hildenbrand2015-01-231-0/+2
| | | | | | | | | | | | | | | In order to get rid of the action_flags and to properly migrate pending SIGP STOP irqs triggered e.g. by SIGP STOP AND STORE STATUS, we need to remember whether to store the status when stopping. For this reason, a new parameter (flags) for the SIGP STOP irq is introduced. These flags further define details of the requested STOP and can be easily migrated. Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: handle pending local interrupts via bitmapJens Freimann2014-11-281-2/+0
| | | | | | | | | | | | | | | | | | | This patch adapts handling of local interrupts to be more compliant with the z/Architecture Principles of Operation and introduces a data structure which allows more efficient handling of interrupts. * get rid of li->active flag, use bitmap instead * Keep interrupts in a bitmap instead of a list * Deliver interrupts in the order of their priority as defined in the PoP * Use a second bitmap for sigp emergency requests, as a CPU can have one request pending from every other CPU in the system. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: add bitmap for handling cpu-local interruptsJens Freimann2014-11-281-0/+86
| | | | | | | | | | | | Adds a bitmap to the vcpu structure which is used to keep track of local pending interrupts. Also add enum with all interrupt types sorted in order of priority (highest to lowest) Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: Fix rewinding of the PSW pointing to an EXECUTE instructionThomas Huth2014-11-281-1/+1
| | | | | | | | | | | | | | | | | A couple of our interception handlers rewind the PSW to the beginning of the instruction to run the intercepted instruction again during the next SIE entry. This normally works fine, but there is also the possibility that the instruction did not get run directly but via an EXECUTE instruction. In this case, the PSW does not point to the instruction that caused the interception, but to the EXECUTE instruction! So we've got to rewind the PSW to the beginning of the EXECUTE instruction instead. This is now accomplished with a new helper function kvm_s390_rewind_psw(). Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: sigp: instruction counters for all sigp ordersDavid Hildenbrand2014-10-281-0/+7
| | | | | | | | | This patch introduces instruction counters for all known sigp orders and also a separate one for unknown orders that are passed to user space. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: Make the simple ipte mutex specific to a VM instead of globalThomas Huth2014-10-281-0/+2
| | | | | | | | | | | | The ipte-locking should be done for each VM seperately, not globally. This way we avoid possible congestions when the simple ipte-lock is used and multiple VMs are running. Suggested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: count vcpu wakeups in stat.halt_wakeupDavid Hildenbrand2014-10-011-0/+1
| | | | | | | | | This patch introduces the halt_wakeup counter used by common code and uses it to count vcpu wakeups done in s390 arch specific code. Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: CPACF: Enable MSA4 instructions for kvm guestTony Krowiak2014-09-101-1/+13
| | | | | | | | | | | | | | We have to provide a per guest crypto block for the CPUs to enable MSA4 instructions. According to icainfo on z196 or later this enables CCM-AES-128, CMAC-AES-128, CMAC-AES-192 and CMAC-AES-256. Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael Mueller <mimu@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [split MSA4/protected key into two patches]
* KVM: remove garbage arg to *hardware_{en,dis}ableRadim Krčmář2014-08-291-1/+1
| | | | | | | | | | In the beggining was on_each_cpu(), which required an unused argument to kvm_arch_ops.hardware_{en,dis}able, but this was soon forgotten. Remove unnecessary arguments that stem from this. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: static inline empty kvm_arch functionsRadim Krčmář2014-08-291-0/+14
| | | | | | | | | | | | | | | Using static inline is going to save few bytes and cycles. For example on powerpc, the difference is 700 B after stripping. (5 kB before) This patch also deals with two overlooked empty functions: kvm_arch_flush_shadow was not removed from arch/mips/kvm/mips.c 2df72e9bc KVM: split kvm_arch_flush_shadow and kvm_arch_sched_in never made it into arch/ia64/kvm/kvm-ia64.c. e790d9ef6 KVM: add kvm_arch_sched_in Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: forward declare structs in kvm_types.hPaolo Bonzini2014-08-291-2/+3
| | | | | | | | | | Opaque KVM structs are useful for prototypes in asm/kvm_host.h, to avoid "'struct foo' declared inside parameter list" warnings (and consequent breakage due to conflicting types). Move them from individual files to a generic place in linux/kvm_types.h. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: s390: remove the tasklet used by the hrtimerDavid Hildenbrand2014-07-211-1/+0
| | | | | | | | | | We can get rid of the tasklet used for waking up a VCPU in the hrtimer code but wakeup the VCPU directly. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
OpenPOWER on IntegriCloud