summaryrefslogtreecommitdiffstats
path: root/arch/x86/kvm/mmu.c
Commit message (Collapse)AuthorAgeFilesLines
* KVM: MMU: add "oos_shadow" parameter to disable oosMarcelo Tosatti2008-10-151-1/+4
| | | | | | | Subject says it all. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: speed up mmu_unsync_walkMarcelo Tosatti2008-10-151-12/+60
| | | | | | | Cache the unsynced children information in a per-page bitmap. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: out of sync shadow coreMarcelo Tosatti2008-10-151-17/+193
| | | | | | | | | | | Allow guest pagetables to go out of sync. Instead of emulating write accesses to guest pagetables, or unshadowing them, we un-write-protect the page table and allow the guest to modify it at will. We rely on invlpg executions to synchronize individual ptes, and will synchronize the entire pagetable on tlb flushes. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: mmu_convert_notrap helperMarcelo Tosatti2008-10-151-0/+14
| | | | | | | | Need to convert shadow_notrap_nonpresent -> shadow_trap_nonpresent when unsyncing pages. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: awareness of new kvm_mmu_zap_page behaviourMarcelo Tosatti2008-10-151-4/+9
| | | | | | | | kvm_mmu_zap_page will soon zap the unsynced children of a page. Restart list walk in such case. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: mmu_parent_walkMarcelo Tosatti2008-10-151-0/+27
| | | | | | | Introduce a function to walk all parents of a given page, invoking a handler. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86: trap invlpgMarcelo Tosatti2008-10-151-0/+18
| | | | | | | | | | With pages out of sync invlpg needs to be trapped. For now simply nuke the entry. Untested on AMD. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: sync roots on mmu reloadMarcelo Tosatti2008-10-151-0/+36
| | | | | Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: mode specific sync_pageMarcelo Tosatti2008-10-151-0/+10
| | | | | | | | Examine guest pagetable and bring the shadow back in sync. Caller is responsible for local TLB flush before re-entering guest mode. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: do not write-protect large mappingsMarcelo Tosatti2008-10-151-2/+8
| | | | | | | | | | | | | There is not much point in write protecting large mappings. This can only happen when a page is shadowed during the window between is_largepage_backed and mmu_lock acquision. Zap the entry instead, so the next pagefault will find a shadowed page via is_largepage_backed and fallback to 4k translations. Simplifies out of sync shadow. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: move local TLB flush to mmu_set_spteMarcelo Tosatti2008-10-151-4/+4
| | | | | | | Since the sync page path can collapse flushes. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: split mmu_set_spteMarcelo Tosatti2008-10-151-44/+57
| | | | | | | Split the spte entry creation code into a new set_spte function. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: switch to get_user_pages_fastMarcelo Tosatti2008-10-151-14/+9
| | | | | | | | | Convert gfn_to_pfn to use get_user_pages_fast, which can do lockless pagetable lookups on x86. Kernel compilation on 4-way guest is 3.7% faster on VMX. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: Modify kvm_shadow_walk.entry to accept u64 addrSheng Yang2008-10-151-5/+5
| | | | | | | | | | | | | EPT is 4 level by default in 32pae(48 bits), but the addr parameter of kvm_shadow_walk->entry() only accept unsigned long as virtual address, which is 32bit in 32pae. This result in SHADOW_PT_INDEX() overflow when try to fetch level 4 index. Fix it by extend kvm_shadow_walk->entry() to accept 64bit addr in parameter. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Fix setting the accessed bit on non-speculative sptesAvi Kivity2008-10-151-1/+1
| | | | | | | | | | | The accessed bit was accidentally turned on in a random flag word, rather than, the spte itself, which was lucky, since it used the non-EPT compatible PT_ACCESSED_MASK. Fix by turning the bit on in the spte and changing it to use the portable accessed mask. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Flush tlbs after clearing write permission when accessing dirty logAvi Kivity2008-10-151-0/+1
| | | | | | | Otherwise, the cpu may allow writes to the tracked pages, and we lose some display bits or fail to migrate correctly. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Add locking around kvm_mmu_slot_remove_write_access()Avi Kivity2008-10-151-0/+2
| | | | | | | It was generally safe due to slots_lock being held for write, but it wasn't very nice. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Account for npt/ept/realmode page faultsAvi Kivity2008-10-151-0/+1
| | | | | | | Now that two-dimensional paging is becoming common, account for tdp page faults. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Convert direct maps to use the generic shadow walkerAvi Kivity2008-10-151-38/+55
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Add generic shadow walkerAvi Kivity2008-10-151-0/+34
| | | | | | | | | | | | | | We currently walk the shadow page tables in two places: direct map (for real mode and two dimensional paging) and paging mode shadow. Since we anticipate requiring a third walk (for invlpg), it makes sense to have a generic facility for shadow walk. This patch adds such a shadow walker, walks the page tables and calls a method for every spte encountered. The method can examine the spte, modify it, or even instantiate it. The walk can be aborted by returning nonzero from the method. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Infer shadow root level in direct_map()Avi Kivity2008-10-151-5/+4
| | | | | | | In all cases the shadow root level is available in mmu.shadow_root_level, so there is no need to pass it as a parameter. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Unify direct map 4K and large page pathsAvi Kivity2008-10-151-8/+3
| | | | | | | The two paths are equivalent except for one argument, which is already available. Merge the two codepaths. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Move SHADOW_PT_INDEX to mmu.cAvi Kivity2008-10-151-0/+2
| | | | | | It is not specific to the paging mode, so can be made global (and reusable). Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Reduce stack usage in kvm_pv_mmu_op()Dave Hansen2008-10-151-15/+8
| | | | | | | | | | | | | We're in a hot path. We can't use kmalloc() because it might impact performance. So, we just stick the buffer that we need into the kvm_vcpu_arch structure. This is used very often, so it is not really a waste. We also have to move the buffer structure's definition to the arch-specific x86 kvm header. Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Simplify kvm_mmu_zap_page()Avi Kivity2008-10-151-9/+5
| | | | | | | | | The twisty maze of conditionals can be reduced. [joerg: fix tlb flushing] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Separate the code for unlinking a shadow page from its parentsAvi Kivity2008-10-151-2/+7
| | | | | | Place into own function, in preparation for further cleanups. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Always return old for clear_flush_young() when using EPTSheng Yang2008-09-111-0/+4
| | | | | | | As well as discard fake accessed bit and dirty bit of EPT. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Synchronize guest physical memory map to host virtual memory mapAndrea Arcangeli2008-07-291-0/+100
| | | | | | | | | | Synchronize changes to host virtual addresses which are part of a KVM memory slot to the KVM shadow mmu. This allows pte operations like swapping, page migration, and madvise() to transparently work with KVM. Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Avoid instruction emulation when event delivery is pendingAvi Kivity2008-07-271-0/+1
| | | | | | | | | | | | | | | | When an event (such as an interrupt) is injected, and the stack is shadowed (and therefore write protected), the guest will exit. The current code will see that the stack is shadowed and emulate a few instructions, each time postponing the injection. Eventually the injection may succeed, but at that time the guest may be unwilling to accept the interrupt (for example, the TPR may have changed). This occurs every once in a while during a Windows 2008 boot. Fix by unshadowing the fault address if the fault was due to an event injection. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: allow enabling/disabling NPT by reloading only the architecture moduleJoerg Roedel2008-07-271-0/+6
| | | | | | | | | | If NPT is enabled after loading both KVM modules on AMD and it should be disabled, both KVM modules must be reloaded. If only the architecture module is reloaded the behavior is undefined. With this patch it is possible to disable NPT only by reloading the kvm_amd module. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Fix potential race setting upper shadow ptes on nonpae hostsAvi Kivity2008-07-201-3/+4
| | | | | | | | | | | | | The direct mapped shadow code (used for real mode and two dimensional paging) sets upper-level ptes using direct assignment rather than calling set_shadow_pte(). A nonpae host will split this into two writes, which opens up a race if another vcpu accesses the same memory area. Fix by calling set_shadow_pte() instead of assigning directly. Noticed by Izik Eidus. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: improve invalid shadow root page handlingMarcelo Tosatti2008-07-201-1/+4
| | | | | | | | Harden kvm_mmu_zap_page() against invalid root pages that had been shadowed from memslots that are gone. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be heldMarcelo Tosatti2008-07-201-0/+3
| | | | | | | | | | | | kvm_mmu_zap_page() needs slots lock held (rmap_remove->gfn_to_memslot, for example). Since kvm_lock spinlock is held in mmu_shrink(), do a non-blocking down_read_trylock(). Untested. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Fix printk formatAvi Kivity2008-07-201-1/+1
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: When debug is enabled, make it a run-time parameterAvi Kivity2008-07-201-1/+2
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Avoid page prefetch on SVMAvi Kivity2008-07-201-1/+4
| | | | | | | | SVM cannot benefit from page prefetching since guest page fault bypass cannot by made to work there. Avoid accessing the guest page table in this case. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Move nonpaging_prefetch_page()Avi Kivity2008-07-201-9/+9
| | | | | | In preparation for next patch. No code change. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Fix false flooding when a pte points to page tableAvi Kivity2008-07-201-1/+16
| | | | | | | | | | | | | | | | | | | | | | The KVM MMU tries to detect when a speculative pte update is not actually used by demand fault, by checking the accessed bit of the shadow pte. If the shadow pte has not been accessed, we deem that page table flooded and remove the shadow page table, allowing further pte updates to proceed without emulation. However, if the pte itself points at a page table and only used for write operations, the accessed bit will never be set since all access will happen through the emulator. This is exactly what happens with kscand on old (2.4.x) HIGHMEM kernels. The kernel points a kmap_atomic() pte at a page table, and then proceeds with read-modify-write operations to look at the dirty and accessed bits. We get a false flood trigger on the kmap ptes, which results in the mmu spending all its time setting up and tearing down shadows. Fix by setting the shadow accessed bit on emulated accesses. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: add statics were possible, function definition in lapic.hHarvey Harrison2008-07-201-1/+1
| | | | | | | | | | | | | | | | Noticed by sparse: arch/x86/kvm/vmx.c:1583:6: warning: symbol 'vmx_disable_intercept_for_msr' was not declared. Should it be static? arch/x86/kvm/x86.c:3406:5: warning: symbol 'kvm_task_switch_16' was not declared. Should it be static? arch/x86/kvm/x86.c:3429:5: warning: symbol 'kvm_task_switch_32' was not declared. Should it be static? arch/x86/kvm/mmu.c:1968:6: warning: symbol 'kvm_mmu_remove_one_alloc_mmu_page' was not declared. Should it be static? arch/x86/kvm/mmu.c:2014:6: warning: symbol 'mmu_destroy_caches' was not declared. Should it be static? arch/x86/kvm/lapic.c:862:5: warning: symbol 'kvm_lapic_get_base' was not declared. Should it be static? arch/x86/kvm/i8254.c:94:5: warning: symbol 'pit_get_gate' was not declared. Should it be static? arch/x86/kvm/i8254.c:196:5: warning: symbol '__pit_timer_fn' was not declared. Should it be static? arch/x86/kvm/i8254.c:561:6: warning: symbol '__inject_pit_timer_intr' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Fix oops on guest userspace access to guest pagetableAvi Kivity2008-06-241-6/+0
| | | | | | | | | | | | | | | | KVM has a heuristic to unshadow guest pagetables when userspace accesses them, on the assumption that most guests do not allow userspace to access pagetables directly. Unfortunately, in addition to unshadowing the pagetables, it also oopses. This never triggers on ordinary guests since sane OSes will clear the pagetables before assigning them to userspace, which will trigger the flood heuristic, unshadowing the pagetables before the first userspace access. One particular guest, though (Xenner) will run the kernel in userspace, triggering the oops. Since the heuristic is incorrect in this case, we can simply remove it. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: large page update_pte issue with non-PAE 32-bit guests (resend)Marcelo Tosatti2008-06-241-5/+7
| | | | | | | | | | | | | | kvm_mmu_pte_write() does not handle 32-bit non-PAE large page backed guests properly. It will instantiate two 2MB sptes pointing to the same physical 2MB page when a guest large pte update is trapped. Instead of duplicating code to handle this, disallow directory level updates to happen through kvm_mmu_pte_write(), so the two 2MB sptes emulating one guest 4MB pte can be correctly created by the page fault handling path. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Fix rmap_write_protect() hugepage iteration bugMarcelo Tosatti2008-06-241-0/+1
| | | | | | | | rmap_next() does not work correctly after rmap_remove(), as it expects the rmap chains not to change during iteration. Fix (for now) by restarting iteration from the beginning. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Fix is_empty_shadow_page() checkAvi Kivity2008-06-061-1/+1
| | | | | | The check is only looking at one of two possible empty ptes. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: reschedule during shadow teardownAvi Kivity2008-06-061-0/+1
| | | | | | | Shadows for large guests can take a long time to tear down, so reschedule occasionally to avoid softlockup warnings. Signed-off-by: Avi Kivity <avi@qumranet.com>
* namespacecheck: automated fixesIngo Molnar2008-05-231-1/+1
| | | | Signed-off-by: Ingo Molnar <mingo@elte.hu>
* KVM: MMU: Allow more than PAGES_PER_HPAGE write protections per large pageAvi Kivity2008-05-041-1/+0
| | | | | | | | | nonpae guests can call rmap_write_protect twice per page (for page tables) or four times per page (for page directories), triggering a bogus warning. Remove the warning. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Enable EPT feature for KVMSheng Yang2008-05-041-2/+3
| | | | | Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Remove #ifdef CONFIG_X86_64 to support 4 level EPTSheng Yang2008-05-041-4/+0
| | | | | | | | Currently EPT level is 4 for both pae and x86_64. The patch remove the #ifdef for alloc root_hpa and free root_hpa to support EPT. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Add EPT supportSheng Yang2008-05-041-10/+33
| | | | | | | Enable kvm_set_spte() to generate EPT entries. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add kvm_x86_ops get_tdp_level()Sheng Yang2008-05-041-2/+2
| | | | | | | | The function get_tdp_level() provided the number of tdp level for EPT and NPT rather than the NPT specific macro. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
OpenPOWER on IntegriCloud