path: root/arch/x86/kvm/vmx.c
Commit message | Author | Age | Files | Lines
* KVM: VMX: Rename VMX_EPT_IGMT_BIT to VMX_EPT_IPAT_BIT | Sheng Yang | 2010-03-01 | 1 | -2/+2
  Following the new SDM. Now the bit is named "Ignore PAT memory type".
  Signed-off-by: Sheng Yang <sheng@linux.intel.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Remove redundant test in vmx_set_efer() | Julia Lawall | 2010-03-01 | 1 | -2/+0
  msr was tested above, so the second test is not needed.
  A simplified version of the semantic match that finds this problem is as follows (http://coccinelle.lip6.fr/):
  // <smpl>
  @r@
  expression *x;
  expression e;
  identifier l;
  @@
  if (x == NULL || ...) {
      ... when forall
      return ...; }
  ... when != goto l;
      when != x = e
      when != &x
  *x == NULL
  // </smpl>
  Signed-off-by: Julia Lawall <julia@diku.dk>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Wire up .fpu_activate() callback | Avi Kivity | 2010-03-01 | 1 | -0/+1
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Remove redundant check in vm_need_virtualize_apic_accesses() | Gui Jianfeng | 2010-03-01 | 1 | -3/+1
  flexpriority_enabled implies cpu_has_vmx_virtualize_apic_accesses() returning true, so we don't need this check here.
  Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Trace failed msr reads and writes | Avi Kivity | 2010-03-01 | 1 | -2/+3
  Record failed msr reads and writes, along with the fact that they failed.
  Signed-off-by: Avi Kivity <avi@redhat.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: VMX: Pass cr0.mp through to the guest when the fpu is active | Avi Kivity | 2010-03-01 | 1 | -6/+9
  When cr0.mp is clear, the guest doesn't expect a #NM in response to a WAIT instruction. Because we always keep cr0.mp set, it will get a #NM, and potentially be confused.
  Fix by keeping cr0.mp set only when the fpu is inactive, and passing it through when active.
  Reported-by: Lorenzo Martignoni <martignlo@gmail.com>
  Analyzed-by: Gleb Natapov <gleb@redhat.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
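The cr0.mp rule described above can be sketched as a small pure function. This is only an illustration: `hw_cr0_for_guest` is a hypothetical name, not a function from vmx.c, and forcing cr0.ts together with cr0.mp while the fpu is inactive is an assumption about the surrounding code. The bit positions (MP is bit 1, TS is bit 3) are architectural.

```c
#include <stdbool.h>

#define X86_CR0_MP (1UL << 1)  /* monitor coprocessor */
#define X86_CR0_TS (1UL << 3)  /* task switched */

/* Hypothetical helper: derive the cr0 value loaded into hardware for the guest. */
static unsigned long hw_cr0_for_guest(unsigned long guest_cr0, bool fpu_active)
{
    unsigned long hw_cr0 = guest_cr0;

    if (!fpu_active) {
        /* Guest fpu not loaded: force TS and MP so fpu use and WAIT raise #NM. */
        hw_cr0 |= X86_CR0_TS | X86_CR0_MP;
    }
    /* Guest fpu loaded: the guest's own MP value passes through unchanged. */
    return hw_cr0;
}
```

With the fpu active and cr0.mp clear in the guest, the hardware value keeps MP clear, so a WAIT instruction no longer produces an unexpected #NM.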
* KVM: Rename vcpu->shadow_efer to efer | Avi Kivity | 2010-03-01 | 1 | -7/+7
  None of the other registers have the shadow_ prefix.
  Signed-off-by: Avi Kivity <avi@redhat.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Add a helper for checking if the guest is in protected mode | Avi Kivity | 2010-03-01 | 1 | -2/+2
  Signed-off-by: Avi Kivity <avi@redhat.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Activate fpu on clts | Avi Kivity | 2010-03-01 | 1 | -0/+1
  Assume that if the guest executes clts, it knows what it's doing, and load the guest fpu to prevent an #NM exception.
  Signed-off-by: Avi Kivity <avi@redhat.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: VMX: Clean up DR6 emulation | Jan Kiszka | 2010-03-01 | 1 | -17/+6
  As we trap all debug register accesses, we do not need to switch the real DR6 at all. Clean up update_exception_bitmap while at it.
  Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: VMX: Fix emulation of DR4 and DR5 | Jan Kiszka | 2010-03-01 | 1 | -9/+26
  Make sure DR4 and DR5 are aliased to DR6 and DR7, respectively, if CR4.DE is not set.
  Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
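A minimal sketch of the aliasing rule, assuming a hypothetical helper `effective_dr` that is not taken from vmx.c. The commit message only states the CR4.DE-clear case; returning -1 to signal #UD when CR4.DE is set reflects the architectural rule as I understand it and is labeled as an assumption here.

```c
#define X86_CR4_DE (1UL << 3)  /* debugging extensions */

/*
 * Hypothetical helper: map the debug register number named by the
 * instruction to the register that is actually accessed.
 * Returns -1 when the access should raise #UD instead (assumption).
 */
static int effective_dr(int dr, unsigned long cr4)
{
    if (dr == 4 || dr == 5) {
        if (cr4 & X86_CR4_DE)
            return -1;      /* DR4/DR5 are not accessible when CR4.DE is set */
        return dr + 2;      /* alias: DR4 -> DR6, DR5 -> DR7 */
    }
    return dr;              /* all other registers map to themselves */
}
```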
* KVM: VMX: Fix exceptions of mov to dr | Jan Kiszka | 2010-03-01 | 1 | -5/+8
  Injecting GP without an error code is a bad idea (causes unhandled guest exits). Moreover, we must not skip the instruction if we injected an exception.
  Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: VMX: Remove emulation failure report | Sheng Yang | 2010-03-01 | 1 | -1/+0
  As Avi noted:
  > There are two problems with the kernel failure report. First, it
  > doesn't report enough data - registers, surrounding instructions, etc.
  > that are needed to explain what is going on. Second, it can flood
  > dmesg, which is a pretty bad thing to do.
  So we remove the emulation failure report in handle_invalid_guest_state(), and will inspect the guest with a userspace tool in the future.
  Signed-off-by: Sheng Yang <sheng@linux.intel.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: VMX: Give the guest ownership of cr0.ts when the fpu is active | Avi Kivity | 2010-03-01 | 1 | -2/+9
  If the guest fpu is loaded, there is nothing interesting about cr0.ts; let the guest play with it as it will. This makes context switches between fpu-intensive guest processes faster, as we won't trap the clts and cr0 write instructions.
  [marcelo: fix cr0 read shadow update on fpu deactivation; kills F8 install]
  Signed-off-by: Avi Kivity <avi@redhat.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Lazify fpu activation and deactivation | Avi Kivity | 2010-03-01 | 1 | -16/+9
  Defer fpu deactivation as much as possible - if the guest fpu is loaded, keep it loaded until the next heavyweight exit (where we are forced to unload it). This reduces unnecessary exits.
  We also defer fpu activation on clts; while clts signals the intent to use the fpu, we can't be sure the guest will actually use it.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Allow the guest to own some cr0 bits | Avi Kivity | 2010-03-01 | 1 | -0/+9
  We will use this later to give the guest ownership of cr0.ts.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Replace read accesses of vcpu->arch.cr0 by an accessor | Avi Kivity | 2010-03-01 | 1 | -8/+8
  Since we'd like to allow the guest to own a few bits of cr0 at times, we need to know when we access those bits.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: trace clts and lmsw instructions as cr accesses | Avi Kivity | 2010-03-01 | 1 | -1/+4
  clts writes cr0.ts; lmsw writes cr0[0:15] - record that in ftrace.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Enable EPT 1GB page support | Sheng Yang | 2010-03-01 | 1 | -1/+10
  Signed-off-by: Sheng Yang <sheng@linux.intel.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: Rename gb_page_enable() to get_lpage_level() in kvm_x86_ops | Sheng Yang | 2010-03-01 | 1 | -3/+3
  The callback can then provide the maximum supported large page level, which is more flexible.
  Also make the gb page support x86_64-specific.
  Signed-off-by: Sheng Yang <sheng@linux.intel.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Fill out ftrace exit reason strings | Avi Kivity | 2010-03-01 | 1 | -19/+39
  Some exit reasons missed their strings; fill out the table.
  Signed-off-by: Avi Kivity <avi@redhat.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: convert slots_lock to a mutex | Marcelo Tosatti | 2010-03-01 | 1 | -4/+4
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: switch vcpu context to use SRCU | Marcelo Tosatti | 2010-03-01 | 1 | -3/+3
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU update | Marcelo Tosatti | 2010-03-01 | 1 | -1/+5
  Use two steps for memslot deletion: mark the slot invalid (which stops instantiation of new shadow pages for that slot, but allows destruction), then instantiate the new empty slot.
  Also simplifies kvm_handle_hva locking.
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: modify memslots layout in struct kvm | Marcelo Tosatti | 2010-03-01 | 1 | -2/+2
  Have a pointer to an allocated region inside struct kvm.
  [alex: fix ppc book 3s]
  Signed-off-by: Alexander Graf <agraf@suse.de>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: VMX: Add instruction rdtscp support for guest | Sheng Yang | 2010-03-01 | 1 | -3/+57
  Before this support is enabled, executing "rdtscp" in the guest results in #UD.
  Signed-off-by: Sheng Yang <sheng@linux.intel.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Add cpuid_update() callback to kvm_x86_ops | Sheng Yang | 2010-03-01 | 1 | -0/+6
  Sometimes we need to adjust some state to reflect the guest CPUID setting, e.g. if we don't expose rdtscp to the guest, we won't want to enable it on hardware. cpuid_update() is introduced for this purpose.
  Also export kvm_find_cpuid_entry() for later use.
  Signed-off-by: Sheng Yang <sheng@linux.intel.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Remove redundant variable | Sheng Yang | 2010-03-01 | 1 | -2/+0
  It's no longer necessary.
  Signed-off-by: Sheng Yang <sheng@linux.intel.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Fold ept_update_paging_mode_cr4() into its caller | Avi Kivity | 2010-03-01 | 1 | -12/+8
  ept_update_paging_mode_cr4() accesses vcpu->arch.cr4 directly, which usually needs to be accessed via kvm_read_cr4(). In this case, we can't, since cr4 is in the process of being updated. Instead of adding inane comments, fold the function into its caller (vmx_set_cr4), so it can use the not-yet-committed cr4 directly.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: When using ept, allow the guest to own cr4.pge | Avi Kivity | 2010-03-01 | 1 | -0/+2
  We make no use of cr4.pge if ept is enabled, but the guest does (to flush global mappings, as with vmap()), so give the guest ownership of this bit.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Make guest cr4 mask more conservative | Avi Kivity | 2010-03-01 | 1 | -4/+6
  Instead of specifying the bits which we want to trap on, specify the bits which we allow the guest to change transparently. This is safer wrt future changes to cr4.
  Signed-off-by: Avi Kivity <avi@redhat.com>
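The allow-list idea can be illustrated with a short sketch. The names `cr4_guest_host_mask` and `cr4_write_traps` are hypothetical, and the trap test is a simplification of the hardware rule (a write traps when it changes a host-owned bit relative to the value the guest currently sees). Because the mask is derived by inverting the guest-owned set, any bit the code does not know about is host-owned by default.

```c
#include <stdbool.h>

#define X86_CR4_PGE (1UL << 7)  /* page global enable */

/* Hypothetical helper: a set bit in the mask means the host owns that bit. */
static unsigned long cr4_guest_host_mask(unsigned long guest_owned_bits)
{
    return ~guest_owned_bits;
}

/* Simplified model: a cr4 write traps if it changes any host-owned bit. */
static bool cr4_write_traps(unsigned long seen_cr4, unsigned long new_cr4,
                            unsigned long guest_owned_bits)
{
    unsigned long changed = seen_cr4 ^ new_cr4;

    return (changed & cr4_guest_host_mask(guest_owned_bits)) != 0;
}
```

With only cr4.pge guest-owned, toggling PGE does not trap, while a write that flips any other bit, including one defined by a later processor generation, does.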
* KVM: Add accessor for reading cr4 (or some bits of cr4) | Avi Kivity | 2010-03-01 | 1 | -5/+8
  Some bits of cr4 can be owned by the guest on vmx, so when we read them, we copy them to the vcpu structure. In preparation for making the set of guest-owned bits dynamic, use helpers to access these bits so we don't need to know where the bit resides.
  No changes to svm since all bits are host-owned there.
  Signed-off-by: Avi Kivity <avi@redhat.com>
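A toy model of such an accessor, under stated assumptions: `struct toy_vcpu`, `toy_decache_cr4_guest_bits` and `toy_read_cr4_bits` are hypothetical stand-ins, and the `hw_cr4` field simulates the value the hardware holds rather than a real VMCS read. The point is that callers ask for bits through a helper and never need to know whether a bit lives in the cached copy or in hardware.

```c
/* Toy stand-in for the vcpu state; field names are illustrative only. */
struct toy_vcpu {
    unsigned long cr4;              /* cached copy kept by the hypervisor */
    unsigned long hw_cr4;           /* simulated value held by the hardware */
    unsigned long guest_owned_bits; /* bits the guest may change without a trap */
};

/* Refresh the guest-owned bits of the cached copy from the hardware value. */
static void toy_decache_cr4_guest_bits(struct toy_vcpu *v)
{
    v->cr4 &= ~v->guest_owned_bits;
    v->cr4 |= v->hw_cr4 & v->guest_owned_bits;
}

/* Read selected cr4 bits, refreshing the cache only if a guest-owned bit is requested. */
static unsigned long toy_read_cr4_bits(struct toy_vcpu *v, unsigned long mask)
{
    if (mask & v->guest_owned_bits)
        toy_decache_cr4_guest_bits(v);
    return v->cr4 & mask;
}
```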
* KVM: VMX: Move some cr[04] related constants to vmx.c | Avi Kivity | 2010-03-01 | 1 | -0/+13
  They have no place in common code.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Trap and invalid MWAIT/MONITOR instruction | Sheng Yang | 2010-03-01 | 1 | -0/+10
  We don't support these instructions, but the guest can execute them even if the feature ('monitor') hasn't been exposed in CPUID. So we trap and inject a #UD if the guest tries to execute them.
  Cc: stable@kernel.org
  Signed-off-by: Sheng Yang <sheng@linux.intel.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Fix comparison of guest efer with stale host value | Avi Kivity | 2009-12-03 | 1 | -4/+5
  update_transition_efer() masks out some efer bits when deciding whether to switch the msr during guest entry; for example, NX is emulated using the mmu so we don't need to disable it, and LMA/LME are handled by the hardware.
  However, with shared msrs, the comparison is made against a stale value; at the time of the guest switch we may be running with another guest's efer.
  Fix by deferring the mask/compare to the actual point of guest entry.
  Noted by Marcelo.
  Signed-off-by: Avi Kivity <avi@redhat.com>
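The mask-and-compare step can be sketched as follows. `efer_needs_switch` is a hypothetical name, and the set of ignored bits is taken from the examples in the message above (NX, LMA, LME), not from the actual source. The EFER bit positions are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFER_SCE (1ULL << 0)   /* syscall enable */
#define EFER_LME (1ULL << 8)   /* long mode enable */
#define EFER_LMA (1ULL << 10)  /* long mode active */
#define EFER_NX  (1ULL << 11)  /* no-execute enable */

/*
 * Hypothetical helper: decide whether the efer msr must be switched.
 * hw_efer_now must be the value in the msr at the point of guest entry,
 * not a value sampled earlier, which may belong to another guest.
 */
static bool efer_needs_switch(uint64_t guest_efer, uint64_t hw_efer_now)
{
    const uint64_t ignore_bits = EFER_NX | EFER_LMA | EFER_LME;

    return ((guest_efer ^ hw_efer_now) & ~ignore_bits) != 0;
}
```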
* KVM: VMX: Disable unrestricted guest when EPT disabled | Sheng Yang | 2009-12-03 | 1 | -1/+3
  Otherwise, using ept=0 on processors that support unrestricted guest would cause a VM entry failure.
  Signed-off-by: Sheng Yang <sheng@linux.intel.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: Add KVM_GET/SET_VCPU_EVENTS | Jan Kiszka | 2009-12-03 | 1 | -0/+30
  This new IOCTL exports all state related to exceptions, interrupts, and NMIs that was so far invisible to user space. Together with appropriate user space changes, this fixes sporadic problems of vmsave/restore, live migration and system reset.
  [avi: future-proof abi by adding a flags field]
  Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Report unexpected simultaneous exceptions as internal errors | Avi Kivity | 2009-12-03 | 1 | -3/+8
  These happen when we trap an exception when another exception is being delivered; we only expect these with MCEs and page faults. If something unexpected happens, things probably went south and we're better off reporting an internal error and freezing.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Allow internal errors reported to userspace to carry extra data | Avi Kivity | 2009-12-03 | 1 | -0/+1
  Usually userspace will freeze the guest so we can inspect it, but some internal state is not available. Add extra data to internal error reporting so we can expose it to the debugger. Extra data is specific to the suberror.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Remove vmx->msr_offset_efer | Avi Kivity | 2009-12-03 | 1 | -7/+3
  This variable is used to communicate between a caller and a callee; switch to a function argument instead.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: move CR3/PDPTR update to vmx_set_cr3 | Marcelo Tosatti | 2009-12-03 | 1 | -4/+1
  GUEST_CR3 is updated via kvm_set_cr3 whenever CR3 is modified from outside guest context. Similarly pdptrs are updated via load_pdptrs.
  Let kvm_set_cr3 perform the update, removing it from the vcpu_run fast path.
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
  Acked-by: Sheng Yang <sheng@linux.intel.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Use shared msr infrastructure | Avi Kivity | 2009-12-03 | 1 | -70/+42
  Instead of reloading syscall MSRs on every preemption, use the new shared msr infrastructure to reload them at the last possible minute (just before exit to userspace).
  Improves vcpu/idle/vcpu switches by about 2000 cycles (when EFER needs to be reloaded as well).
  [jan: fix slot index missing indirection]
  Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Move MSR_KERNEL_GS_BASE out of the vmx autoload msr area | Avi Kivity | 2009-12-03 | 1 | -13/+26
  Currently MSR_KERNEL_GS_BASE is saved and restored as part of the guest/host msr reloading. Since we wish to lazy-restore all the other msrs, save and reload MSR_KERNEL_GS_BASE explicitly instead of using the common code.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Use macros instead of hex value on cr0 initialization | Eduardo Habkost | 2009-12-03 | 1 | -1/+1
  This should have no effect, it is just to make the code clearer.
  Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: fix handle_pause declaration | Marcelo Tosatti | 2009-12-03 | 1 | -1/+1
  There's no kvm_run argument anymore.
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: VMX: Add support for Pause-Loop Exiting | Zhai, Edwin | 2009-12-03 | 1 | -1/+50
  New NHM processors will support Pause-Loop Exiting by adding 2 VM-execution control fields:
  PLE_Gap - upper bound on the amount of time between two successive executions of PAUSE in a loop.
  PLE_Window - upper bound on the amount of time a guest is allowed to execute in a PAUSE loop.
  If the time between this execution of PAUSE and the previous one exceeds PLE_Gap, the processor considers this PAUSE to belong to a new loop. Otherwise, the processor determines the total execution time of this loop (since the 1st PAUSE in this loop), and triggers a VM exit if the total time exceeds PLE_Window.
  * Refer to SDM volume 3b, sections 21.6.13 and 22.1.3.
  Pause-Loop Exiting can be used to detect Lock-Holder Preemption, where one VP is scheduled out after taking a spinlock, and other VPs waiting for the same lock are then scheduled in and waste CPU time.
  Our tests indicate that most spinlocks are held for less than 2^12 cycles. Performance tests show that with 2X LP over-commitment we can get a +2% perf improvement for a kernel build (even more perf gain with more LPs).
  Signed-off-by: Zhai Edwin <edwin.zhai@intel.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
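The gap/window rule can be modelled in software to make the behaviour concrete. This is a sketch only: `first_ple_exit` is a hypothetical function that replays a list of PAUSE timestamps, whereas the real check is performed by the processor. The parameters `gap` and `window` correspond to PLE_Gap and PLE_Window.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: given the timestamps (in cycles, ascending) at which
 * a guest executed PAUSE, return the index of the first PAUSE that would
 * trigger a VM exit, or -1 if none does.
 */
static int first_ple_exit(const uint64_t *pause_tsc, size_t n,
                          uint64_t gap, uint64_t window)
{
    uint64_t loop_start = 0;
    uint64_t prev = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t now = pause_tsc[i];

        if (i == 0 || now - prev > gap) {
            /* Too long since the previous PAUSE: this one starts a new loop. */
            loop_start = now;
        } else if (now - loop_start > window) {
            /* Same loop, and it has been spinning longer than the window. */
            return (int)i;
        }
        prev = now;
    }
    return -1;
}
```

A tight spin (PAUSEs closer together than the gap) eventually exceeds the window and exits, while widely spaced PAUSEs keep restarting the loop and never exit.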
* KVM: x86: Refactor guest debug IOCTL handling | Jan Kiszka | 2009-12-03 | 1 | -17/+1
  Much of the so far vendor-specific code for setting up guest debug can actually be handled by the generic code. This also fixes a minor deficit in the SVM part wrt processing KVM_GUESTDBG_ENABLE.
  Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Fix hotplug of CPUs | Zachary Amsden | 2009-12-03 | 1 | -2/+4
  Both VMX and SVM require per-cpu memory allocation, which is done at module init time, for only online cpus.
  The backend was not allocating structures for all possible CPUs, so new CPUs coming online could not be hardware enabled.
  Signed-off-by: Zachary Amsden <zamsden@redhat.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Activate Virtualization On Demand | Alexander Graf | 2009-12-03 | 1 | -3/+8
  X86 CPUs need to have some magic happening to enable the virtualization extensions on them. This magic can result in unpleasant results for users, like blocking other VMMs from working (vmx) or using invalid TLB entries (svm).
  Currently KVM activates virtualization when the respective kernel module is loaded. This blocks us from autoloading KVM modules without breaking other VMMs.
  To circumvent this problem at least a bit, this patch introduces on demand activation of virtualization. This means that virtualization is instead enabled on creation of the first virtual machine and disabled on destruction of the last one.
  So using this, KVM can be easily autoloaded, while keeping other hypervisors usable.
  Signed-off-by: Alexander Graf <agraf@suse.de>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
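The first-VM/last-VM rule amounts to a usage counter. The sketch below uses hypothetical names (`toy_create_vm`, `toy_destroy_vm`, `toy_hardware_enable`, `toy_hardware_disable`) and a plain flag in place of the real per-cpu enable sequence; locking, which real code would need around the counter, is left out.

```c
#include <assert.h>

static int usage_count;  /* number of live virtual machines */
static int hw_enabled;   /* stand-in for "virtualization extensions are on" */

static void toy_hardware_enable(void)  { hw_enabled = 1; }
static void toy_hardware_disable(void) { hw_enabled = 0; }

/* Enable virtualization only when the first VM is created. */
static void toy_create_vm(void)
{
    if (usage_count++ == 0)
        toy_hardware_enable();
}

/* Disable virtualization again when the last VM is destroyed. */
static void toy_destroy_vm(void)
{
    assert(usage_count > 0);
    if (--usage_count == 0)
        toy_hardware_disable();
}
```

Loading the module alone leaves `hw_enabled` at zero in this model, which is what lets other hypervisors keep working until a VM is actually created.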
* KVM: VMX: Enhance invalid guest state emulation | Mohammed Gamal | 2009-12-03 | 1 | -24/+20
  - Change handle_invalid_guest_state() to return relevant exit codes
  - Move triggering the emulation from vmx_vcpu_run() to vmx_handle_exit()
  - Return to userspace instead of repeatedly trying to emulate instructions that have already failed
  Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>