summaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAgeFilesLines
* KVM: VMX: fix sparse warningHannes Eder2008-12-311-1/+1
| | | | | | | | | Impact: make global function static arch/x86/kvm/vmx.c:134:3: warning: symbol 'vmx_capability' was not declared. Should it be static? Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Remove extraneous semicolon after do/whileAvi Kivity2008-12-311-1/+1
| | | | | | Notices by Guillaume Thouvenin. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: fix popf emulationAvi Kivity2008-12-311-0/+2
| | | | | | Set operand type and size to get correct writeback behavior. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: fix ret emulationAvi Kivity2008-12-311-0/+2
| | | | | | | 'ret' did not set the operand type or size for the destination, so writeback ignored it. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: switch 'pop reg' instruction to emulate_pop()Avi Kivity2008-12-311-7/+4
| | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: allow pop from mmioAvi Kivity2008-12-311-3/+3
| | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: Extract 'pop' sequence into a functionAvi Kivity2008-12-311-4/+17
| | | | | | Switch 'pop r/m' instruction to use the new function. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: s390: Fix memory leak of vcpu->runChristian Borntraeger2008-12-311-2/+2
| | | | | | | | | | | The s390 backend of kvm never calls kvm_vcpu_uninit. This causes a memory leak of vcpu->run pages. Lets call kvm_vcpu_uninit in kvm_arch_vcpu_destroy to free the vcpu->run. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: s390: Fix refcounting and allow module unloadChristian Borntraeger2008-12-311-14/+21
| | | | | | | | | | | | Currently it is impossible to unload the kvm module on s390. This patch fixes kvm_arch_destroy_vm to release all cpus. This make it possible to unload the module. In addition we stop messing with the module refcount in arch code. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: consolidate emulation of two operand instructionsAvi Kivity2008-12-311-51/+28
| | | | | | No need to repeat the same assembly block over and over. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: reduce duplication in one operand emulation thunksAvi Kivity2008-12-311-43/+23
| | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: optimize set_spte for page syncMarcelo Tosatti2008-12-311-0/+9
| | | | | | | | | | | | | | | | | | The write protect verification in set_spte is unnecessary for page sync. Its guaranteed that, if the unsync spte was writable, the target page does not have a write protected shadow (if it had, the spte would have been write protected under mmu_lock by rmap_write_protect before). Same reasoning applies to mark_page_dirty: the gfn has been marked as dirty via the pagefault path. The cost of hash table and memslot lookups are quite significant if the workload is pagetable write intensive resulting in increased mmu_lock contention. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* x86: KVM guest: sign kvmclock as paravirtGlauber Costa2008-12-311-0/+2
| | | | | | | | | | Currently, we only set the KVM paravirt signature in case of CONFIG_KVM_GUEST. However, it is possible to have it turned off, while CONFIG_KVM_CLOCK is turned on. This is also a paravirt case, and should be shown accordingly. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Conditionally request interrupt window after injecting irqAvi Kivity2008-12-311-0/+2
| | | | | | | | | | | If we're injecting an interrupt, and another one is pending, request an interrupt window notification so we don't have excess latency on the second interrupt. This shouldn't happen in practice since an EOI will be issued, giving a second chance to request an interrupt window, but... Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ia64: Clean up vmm_ivt.S using tab to indent every lineXiantao Zhang2008-12-311-741/+729
| | | | | | | Using tab for indentation for vmm_ivt.S. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ia64: Add handler for crashed vmmXiantao Zhang2008-12-314-12/+44
| | | | | | | | | Since vmm runs in an isolated address space and it is just a copy of host's kvm-intel module, so once vmm crashes, we just crash all guests running on it instead of crashing whole kernel. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ia64: Add some debug points to provide crash infomationXiantao Zhang2008-12-315-33/+88
| | | | | | | Use printk infrastructure to print out some debug info once VM crashes. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ia64: Define printk function for kvm-intel moduleXiantao Zhang2008-12-315-1/+54
| | | | | | | | | | kvm-intel module is relocated to an isolated address space with kernel, so it can't call host kernel's printk for debug purpose. In the module, we implement the printk to output debug info of vmm. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* x86: disable VMX on all CPUs on rebootEduardo Habkost2008-12-311-2/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On emergency_restart, we may need to use an NMI to disable virtualization on all CPUs. We do that using nmi_shootdown_cpus() if VMX is enabled. Note: With this patch, we will run the NMI stuff only when the CPU where emergency_restart() was called has VMX enabled. This should work on most cases because KVM enables VMX on all CPUs, but we may miss the small window where KVM is doing that. Also, I don't know if all code using VMX out there always enable VMX on all CPUs like KVM does. We have two other alternatives for that: a) Have an API that all code that enables VMX on any CPU should use to tell the kernel core that it is going to enable VMX on the CPUs. b) Always call nmi_shootdown_cpus() if the CPU supports VMX. This is a bit intrusive and more risky, as it would run nmi_shootdown_cpus() on emergency_reboot() even on systems where virtualization is never enabled. Finding a proper point to hook the nmi_shootdown_cpus() call isn't trivial, as the non-emergency machine_restart() (that doesn't need the NMI tricks) uses machine_emergency_restart() directly. The solution to make this work without adding a new function or argument to machine_ops was setting a 'reboot_emergency' flag that tells if native_machine_emergency_restart() needs to do the virt cleanup or not. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* kdump: forcibly disable VMX and SVM on machine_crash_shutdown()Eduardo Habkost2008-12-311-0/+18
| | | | | | | | | | | | | We need to disable virtualization extensions on all CPUs before booting the kdump kernel, otherwise the kdump kernel booting will fail, and rebooting after the kdump kernel did its task may also fail. We do it using cpu_emergency_vmxoff() and cpu_emergency_svm_disable(), that should always work, because those functions check if the CPUs support SVM or VMX before doing their tasks. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* x86: cpu_emergency_svm_disable() functionEduardo Habkost2008-12-311-0/+8
| | | | | | | | This function can be used by the reboot or kdump code to forcibly disable SVM on the CPU. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: move svm_hardware_disable() code to asm/virtext.hEduardo Habkost2008-12-312-5/+15
| | | | | | | Create cpu_svm_disable() function. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: move has_svm() code to asm/virtext.hEduardo Habkost2008-12-312-14/+46
| | | | | | | | | Use a trick to keep the printk()s on has_svm() working as before. gcc will take care of not generating code for the 'msg' stuff when the function is called with a NULL msg argument. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* x86: cpu_emergency_vmxoff() functionEduardo Habkost2008-12-311-0/+23
| | | | | | | | Add cpu_emergency_vmxoff() and its friends: cpu_vmx_enabled() and __cpu_emergency_vmxoff(). Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: extract kvm_cpu_vmxoff() from hardware_disable()Eduardo Habkost2008-12-311-2/+11
| | | | | | | | Along with some comments on why it is different from the core cpu_vmxoff() function. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* x86: asm/virtext.h: add cpu_vmxoff() inline functionEduardo Habkost2008-12-311-0/+15
| | | | | | | | | Unfortunately we can't use exactly the same code from vmx hardware_disable(), because the KVM function uses the __kvm_handle_fault_on_reboot() tricks. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: move cpu_has_kvm_support() to an inline on asm/virtext.hEduardo Habkost2008-12-312-2/+33
| | | | | | | | It will be used by core code on kdump and reboot, to disable vmx if needed. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: move ASM_VMX_* definitions from asm/kvm_host.h to asm/vmx.hEduardo Habkost2008-12-312-12/+15
| | | | | | | | | | | Those definitions will be used by code outside KVM, so move it outside of a KVM-specific source file. Those definitions are used only on kvm/vmx.c, that already includes asm/vmx.h, so they can be moved safely. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: move svm.h to include/asmEduardo Habkost2008-12-312-1/+1
| | | | | | | | svm.h will be used by core code that is independent of KVM, so I am moving it outside the arch/x86/kvm directory. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: move vmx.h to include/asmEduardo Habkost2008-12-313-2/+2
| | | | | | | | vmx.h will be used by core code that is independent of KVM, so I am moving it outside the arch/x86/kvm directory. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: fix userspace mapping invalidation on context switchHollis Blanchard2008-12-313-22/+20
| | | | | | | | | | | | | | | | We used to defer invalidating userspace TLB entries until jumping out of the kernel. This was causing MMU weirdness most easily triggered by using a pipe in the guest, e.g. "dmesg | tail". I believe the problem was that after the guest kernel changed the PID (part of context switch), the old process's mappings were still present, and so copy_to_user() on the "return to new process" path ended up using stale mappings. Testing with large pages (64K) exposed the problem, probably because with 4K pages, pressure on the TLB faulted all process A's mappings out before the guest kernel could insert any for process B. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: use prefetchable mappings for guest memoryHollis Blanchard2008-12-311-2/+7
| | | | | | | | | | | | | | | | | | | Bare metal Linux on 440 can "overmap" RAM in the kernel linear map, so that it can use large (256MB) mappings even if memory isn't a multiple of 256MB. To prevent the hardware prefetcher from loading from an invalid physical address through that mapping, it's marked Guarded. However, KVM must ensure that all guest mappings are backed by real physical RAM (since a deliberate access through a guarded mapping could still cause a machine check). Accordingly, we don't need to make our mappings guarded, so let's allow prefetching as the designers intended. Curiously this patch didn't affect performance at all on the quick test I tried, but it's clearly the right thing to do anyways and may improve other workloads. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: use MMUCR accessor to obtain TIDHollis Blanchard2008-12-311-1/+1
| | | | | | | We have an accessor; might as well use it. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ia64: Remove some macro definitions in asm-offsets.c.Xiantao Zhang2008-12-311-10/+1
| | | | | | | Use kernel's corresponding macro instead. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: fix Kconfig constraintsHollis Blanchard2008-12-311-10/+8
| | | | | | | | Make sure that CONFIG_KVM cannot be selected without processor support (currently, 440 is the only processor implementation available). Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Fix cpuid iteration on multiple leaves per eacNitin A Kamble2008-12-311-1/+2
| | | | | | | | | | | | | | | | | | The code to traverse the cpuid data array list for counting type of leaves is currently broken. This patches fixes the 2 things in it. 1. Set the 1st counting entry's flag KVM_CPUID_FLAG_STATE_READ_NEXT. Without it the code will never find a valid entry. 2. Also the stop condition in the for loop while looking for the next unflaged entry is broken. It needs to stop when it find one matching entry; and in the case of count of 1, it will be the same entry found in this iteration. Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Fix cpuid leaf 0xb loop terminationNitin A Kamble2008-12-311-1/+1
| | | | | | | | | For cpuid leaf 0xb the bits 8-15 in ECX register define the end of counting leaf. The previous code was using bits 0-7 for this purpose, which is a bug. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: improve trap emulationHollis Blanchard2008-12-311-2/+2
| | | | | | | | | | set ESR[PTR] when emulating a guest trap. This allows Linux guests to properly handle WARN_ON() (i.e. detect that it's a non-fatal trap). Also remove debugging printk in trap emulation. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: optimize irq delivery pathHollis Blanchard2008-12-314-156/+140
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | In kvmppc_deliver_interrupt is just one case left in the switch and it is a rare one (less than 8%) when looking at the exit numbers. Therefore we can at least drop the switch/case and if an if. I inserted an unlikely too, but that's open for discussion. In kvmppc_can_deliver_interrupt all frequent cases are in the default case. I know compilers are smart but we can make it easier for them. By writing down all options and removing the default case combined with the fact that ithe values are constants 0..15 should allow the compiler to write an easy jump table. Modifying kvmppc_can_deliver_interrupt pointed me to the fact that gcc seems to be unable to reduce priority_exception[x] to a build time constant. Therefore I changed the usage of the translation arrays in the interrupt delivery path completely. It is now using priority without translation to irq on the full irq delivery path. To be able to do that ivpr regs are stored by their priority now. Additionally the decision made in kvmppc_can_deliver_interrupt is already sufficient to get the value of interrupt_msr_mask[x]. Therefore we can replace the 16x4byte array used here with a single 4byte variable (might still be one miss, but the chance to find this in cache should be better than the right entry of the whole array). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: optimize find first bitHollis Blanchard2008-12-311-1/+1
| | | | | | | | Since we use a unsigned long here anyway we can use the optimized __ffs. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: optimize kvm stat handlingHollis Blanchard2008-12-311-23/+9
| | | | | | | | | | | | | | | | | | | Currently we use an unnecessary if&switch to detect some cases. To be honest we don't need the ligh_exits counter anyway, because we can calculate it out of others. Sum_exits can also be calculated, so we can remove that too. MMIO, DCR and INTR can be counted on other places without these additional control structures (The INTR case was never hit anyway). The handling of BOOKE_INTERRUPT_EXTERNAL/BOOKE_INTERRUPT_DECREMENTER is similar, but we can avoid the additional if when copying 3 lines of code. I thought about a goto there to prevent duplicate lines, but rewriting three lines should be better style than a goto cross switch/case statements (its also not enough code to justify a new inline function). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: fix set regs to take care of msr changeHollis Blanchard2008-12-311-1/+1
| | | | | | | | | | | | | | When changing some msr bits e.g. problem state we need to take special care of that. We call the function in our mtmsr emulation (not needed for wrtee[i]), but we don't call kvmppc_set_msr if we change msr via set_regs ioctl. It's a corner case we never hit so far, but I assume it should be kvmppc_set_msr in our arch set regs function (I found it because it is also a corner case when using pv support which would miss the update otherwise). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: adjust vcpu types to support 64-bit coresHollis Blanchard2008-12-314-33/+33
| | | | | | | | However, some of these fields could be split into separate per-core structures in the future. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: create struct kvm_vcpu_44x and introduce container_of() accessorHollis Blanchard2008-12-319-63/+154
| | | | | | | | | | | This patch doesn't yet move all 44x-specific data into the new structure, but is the first step down that path. In the future we may also want to create a struct kvm_vcpu_booke. Based on patch from Liu Yu <yu.liu@freescale.com>. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: Move the last bits of 44x code out of booke.cHollis Blanchard2008-12-313-44/+58
| | | | | | | Needed to port to other Book E processors. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: refactor instruction emulation into generic and core-specific piecesHollis Blanchard2008-12-318-276/+415
| | | | | | | | | | | | | | | | | | Cores provide 3 emulation hooks, implemented for example in the new 4xx_emulate.c: kvmppc_core_emulate_op kvmppc_core_emulate_mtspr kvmppc_core_emulate_mfspr Strictly speaking the last two aren't necessary, but provide for more informative error reporting ("unknown SPR"). Long term I'd like to have instruction decoding autogenerated from tables of opcodes, and that way we could aggregate universal, Book E, and core-specific instructions more easily and without redundant switch statements. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* ppc: Create disassemble.h to extract instruction fieldsHollis Blanchard2008-12-312-56/+81
| | | | | | | | | | | | | This is used in a couple places in KVM, but isn't KVM-specific. However, this patch doesn't modify other in-kernel emulation code: - xmon uses a direct copy of ppc_opc.c from binutils - emulate_instruction() doesn't need it because it can use a series of mask tests. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: Refactor powerpc.c to relocate 440-specific codeHollis Blanchard2008-12-319-130/+207
| | | | | | | | | | | | This introduces a set of core-provided hooks. For 440, some of these are implemented by booke.c, with the rest in (the new) 44x.c. Note that these hooks are link-time, not run-time. Since it is not possible to build a single kernel for both e500 and 440 (for example), using function pointers would only add overhead. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: combine booke_guest.c and booke_host.cHollis Blanchard2008-12-314-89/+66
| | | | | | | | The division was somewhat artificial and cumbersome, and had no functional benefit anyways: we can only guests built for the real host processor. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: Rename "struct tlbe" to "struct kvmppc_44x_tlbe"Hollis Blanchard2008-12-316-34/+33
| | | | | | | | | This will ease ports to other cores. Also remove unused "struct kvm_tlb" while we're at it. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
OpenPOWER on IntegriCloud