summaryrefslogtreecommitdiffstats
path: root/virt/kvm/iommu.c
Commit message (Collapse)AuthorAgeFilesLines
* KVM: lock slots_lock around device assignmentAlex Williamson2012-04-191-8/+15
| | | | | | | | | | | | | | | As pointed out by Jason Baron, when assigning a device to a guest we first set the iommu domain pointer, which enables mapping and unmapping of memory slots to the iommu. This leaves a window where this path is enabled, but we haven't synchronized the iommu mappings to the existing memory slots. Thus a slot being removed at that point could send us down unexpected code paths removing non-existent pinnings and iommu mappings. Take the slots_lock around creating the iommu domain and initial mappings as well as around iommu teardown to avoid this race. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: unmap pages from the iommu when slots are removedAlex Williamson2012-04-111-1/+6
| | | | | | | | | | | | | | | | | | | We've been adding new mappings, but not destroying old mappings. This can lead to a page leak as pages are pinned using get_user_pages, but only unpinned with put_page if they still exist in the memslots list on vm shutdown. A memslot that is destroyed while an iommu domain is enabled for the guest will therefore result in an elevated page reference count that is never cleared. Additionally, without this fix, the iommu is only programmed with the first translation for a gpa. This can result in peer-to-peer errors if a mapping is destroyed and replaced by a new mapping at the same gpa as the iommu will still be pointing to the original, pinned memory address. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* module_param: make bool parameters really bool (drivers & misc)Rusty Russell2012-01-131-1/+1
| | | | | | | | | | | | module_param(bool) used to counter-intuitively take an int. In fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy trick. It's time to remove the int/unsigned int option. For this version it'll simply give a warning, but it'll break next kernel version. Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds2012-01-101-4/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (53 commits) iommu/amd: Set IOTLB invalidation timeout iommu/amd: Init stats for iommu=pt iommu/amd: Remove unnecessary cache flushes in amd_iommu_resume iommu/amd: Add invalidate-context call-back iommu/amd: Add amd_iommu_device_info() function iommu/amd: Adapt IOMMU driver to PCI register name changes iommu/amd: Add invalid_ppr callback iommu/amd: Implement notifiers for IOMMUv2 iommu/amd: Implement IO page-fault handler iommu/amd: Add routines to bind/unbind a pasid iommu/amd: Implement device aquisition code for IOMMUv2 iommu/amd: Add driver stub for AMD IOMMUv2 support iommu/amd: Add stat counter for IOMMUv2 events iommu/amd: Add device errata handling iommu/amd: Add function to get IOMMUv2 domain for pdev iommu/amd: Implement function to send PPR completions iommu/amd: Implement functions to manage GCR3 table iommu/amd: Implement IOMMUv2 TLB flushing routines iommu/amd: Add support for IOMMUv2 domain mode iommu/amd: Add amd_iommu_domain_direct_map function ...
| * iommu/core: split mapping to page sizes as supported by the hardwareOhad Ben-Cohen2011-11-101-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When mapping a memory region, split it to page sizes as supported by the iommu hardware. Always prefer bigger pages, when possible, in order to reduce the TLB pressure. The logic to do that is now added to the IOMMU core, so neither the iommu drivers themselves nor users of the IOMMU API have to duplicate it. This allows a more lenient granularity of mappings; traditionally the IOMMU API took 'order' (of a page) as a mapping size, and directly let the low level iommu drivers handle the mapping, but now that the IOMMU core can split arbitrary memory regions into pages, we can remove this limitation, so users don't have to split those regions by themselves. Currently the supported page sizes are advertised once and they then remain static. That works well for OMAP and MSM but it would probably not fly well with intel's hardware, where the page size capabilities seem to have the potential to be different between several DMA remapping devices. register_iommu() currently sets a default pgsize behavior, so we can convert the IOMMU drivers in subsequent patches. After all the drivers are converted, the temporary default settings will be removed. Mainline users of the IOMMU API (kvm and omap-iovmm) are adopted to deal with bytes instead of page order. Many thanks to Joerg Roedel <Joerg.Roedel@amd.com> for significant review! Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Cc: David Brown <davidb@codeaurora.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <Joerg.Roedel@amd.com> Cc: Stepan Moskovchenko <stepanm@codeaurora.org> Cc: KyongHo Cho <pullip.cho@samsung.com> Cc: Hiroshi DOYU <hdoyu@nvidia.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: kvm@vger.kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
* | KVM: introduce kvm_for_each_memslot macroXiao Guangrong2011-12-271-8/+9
|/ | | | | | | Introduce kvm_for_each_memslot to walk all valid memslot Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* kvm: iommu.c file requires the full module.h present.Paul Gortmaker2011-10-311-0/+1
| | | | | | | | | | | | | This file has things like module_param_named() and MODULE_PARM_DESC() so it needs the full module.h header present. Without it, you'll get: CC arch/x86/kvm/../../../virt/kvm/iommu.o virt/kvm/iommu.c:37: error: expected ‘)’ before ‘bool’ virt/kvm/iommu.c:39: error: expected ‘)’ before string constant make[3]: *** [arch/x86/kvm/../../../virt/kvm/iommu.o] Error 1 make[2]: *** [arch/x86/kvm] Error 2 Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* kvm: fix implicit use of stat.h header filePaul Gortmaker2011-10-311-0/+1
| | | | | | | | This was coming in via an implicit module.h (and its sub-includes) before, but we'll be cleaning that up shortly. Call out the stat.h include requirement in advance. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds2011-10-301-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (33 commits) iommu/core: Remove global iommu_ops and register_iommu iommu/msm: Use bus_set_iommu instead of register_iommu iommu/omap: Use bus_set_iommu instead of register_iommu iommu/vt-d: Use bus_set_iommu instead of register_iommu iommu/amd: Use bus_set_iommu instead of register_iommu iommu/core: Use bus->iommu_ops in the iommu-api iommu/core: Convert iommu_found to iommu_present iommu/core: Add bus_type parameter to iommu_domain_alloc Driver core: Add iommu_ops to bus_type iommu/core: Define iommu_ops and register_iommu only with CONFIG_IOMMU_API iommu/amd: Fix wrong shift direction iommu/omap: always provide iommu debug code iommu/core: let drivers know if an iommu fault handler isn't installed iommu/core: export iommu_set_fault_handler() iommu/omap: Fix build error with !IOMMU_SUPPORT iommu/omap: Migrate to the generic fault report mechanism iommu/core: Add fault reporting mechanism iommu/core: Use PAGE_SIZE instead of hard-coded value iommu/core: use the existing IS_ALIGNED macro iommu/msm: ->unmap() should return order of unmapped page ... Fixup trivial conflicts in drivers/iommu/Makefile: "move omap iommu to dedicated iommu folder" vs "Rename the DMAR and INTR_REMAP config options" just happened to touch lines next to each other.
| * iommu/core: Convert iommu_found to iommu_presentJoerg Roedel2011-10-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | With per-bus iommu_ops the iommu_found function needs to work on a bus_type too. This patch adds a bus_type parameter to that function and converts all call-places. The function is also renamed to iommu_present because the function now checks if an iommu is present for a given bus and does not check for a global iommu anymore. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
| * iommu/core: Add bus_type parameter to iommu_domain_allocJoerg Roedel2011-10-211-1/+1
| | | | | | | | | | | | | | | | This is necessary to store a pointer to the bus-specific iommu_ops in the iommu-domain structure. It will be used later to call into bus-specific iommu-ops. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
* | pci: Add flag indicating device has been assigned by KVMGreg Rose2011-09-231-0/+4
|/ | | | | | | | | | | | | | | | Device drivers that create and destroy SR-IOV virtual functions via calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic failures if they attempt to destroy VFs while they are assigned to guest virtual machines. By adding a flag for use by the KVM module to indicate that a device is assigned a device driver can check that flag and avoid destroying VFs while they are assigned and avoid system failures. CC: Ian Campbell <ijc@hellion.org.uk> CC: Konrad Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* KVM: IOMMU: Disable device assignment without interrupt remappingAlex Williamson2011-07-241-0/+18
| | | | | | | | | | | | | | | | | | | | IOMMU interrupt remapping support provides a further layer of isolation for device assignment by preventing arbitrary interrupt block DMA writes by a malicious guest from reaching the host. By default, we should require that the platform provides interrupt remapping support, with an opt-in mechanism for existing behavior. Both AMD IOMMU and Intel VT-d2 hardware support interrupt remapping, however we currently only have software support on the Intel side. Users wishing to re-enable device assignment when interrupt remapping is not supported on the platform can use the "allow_unsafe_assigned_interrupts=1" module option. [avi: break long lines] Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Use u64 for frame data typesJoerg Roedel2010-08-021-1/+1
| | | | | | | | | | | | For 32bit machines where the physical address width is larger than the virtual address width the frame number types in KVM may overflow. Fix this by changing them to u64. [sfr: fix build on 32-bit ppc] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Fix IOMMU memslot reference warningSheng Yang2010-08-011-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the following warning. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- include/linux/kvm_host.h:259 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 no locks held by qemu-system-x86/29679. stack backtrace: Pid: 29679, comm: qemu-system-x86 Not tainted 2.6.35-rc3+ #200 Call Trace: [<ffffffff810a224e>] lockdep_rcu_dereference+0xa8/0xb1 [<ffffffffa018a06f>] kvm_iommu_unmap_memslots+0xc9/0xde [kvm] [<ffffffffa018a0c4>] kvm_iommu_unmap_guest+0x40/0x4e [kvm] [<ffffffffa018f772>] kvm_arch_destroy_vm+0x1a/0x186 [kvm] [<ffffffffa01800d0>] kvm_put_kvm+0x110/0x167 [kvm] [<ffffffffa0180ecc>] kvm_vcpu_release+0x18/0x1c [kvm] [<ffffffff81156f5d>] fput+0x22a/0x3a0 [<ffffffff81152288>] filp_close+0xb4/0xcd [<ffffffff8106599f>] put_files_struct+0x1b7/0x36b [<ffffffff81065830>] ? put_files_struct+0x48/0x36b [<ffffffff8131ee59>] ? do_raw_spin_unlock+0x118/0x160 [<ffffffff81065bc0>] exit_files+0x6d/0x75 [<ffffffff81068348>] do_exit+0x47d/0xc60 [<ffffffff8177e7b5>] ? _raw_spin_unlock_irq+0x30/0x36 [<ffffffff81068bfa>] do_group_exit+0xcf/0x134 [<ffffffff81080790>] get_signal_to_deliver+0x732/0x81d [<ffffffff81095996>] ? cpu_clock+0x4e/0x60 [<ffffffff81002082>] do_notify_resume+0x117/0xc43 [<ffffffff810a2fa3>] ? trace_hardirqs_on+0xd/0xf [<ffffffff81080d79>] ? sys_rt_sigtimedwait+0x2b5/0x3bf [<ffffffff8177d9f2>] ? trace_hardirqs_off_thunk+0x3a/0x3c [<ffffffff81003221>] ? sysret_signal+0x5/0x3d [<ffffffff8100343b>] int_signal+0x12/0x17 Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Update Red Hat copyrightsAvi Kivity2010-08-011-0/+2
| | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Fix order passed to iommu_unmapJan Kiszka2010-06-091-1/+1
| | | | | | | | | | This is obviously a left-over from the the old interface taking the size. Apparently a mostly harmless issue with the current iommu_unmap implementation. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* Merge branch 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2010-05-211-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (269 commits) KVM: x86: Add missing locking to arch specific vcpu ioctls KVM: PPC: Add missing vcpu_load()/vcpu_put() in vcpu ioctls KVM: MMU: Segregate shadow pages with different cr0.wp KVM: x86: Check LMA bit before set_efer KVM: Don't allow lmsw to clear cr0.pe KVM: Add cpuid.txt file KVM: x86: Tell the guest we'll warn it about tsc stability x86, paravirt: don't compute pvclock adjustments if we trust the tsc x86: KVM guest: Try using new kvm clock msrs KVM: x86: export paravirtual cpuid flags in KVM_GET_SUPPORTED_CPUID KVM: x86: add new KVMCLOCK cpuid feature KVM: x86: change msr numbers for kvmclock x86, paravirt: Add a global synchronization point for pvclock x86, paravirt: Enable pvclock flags in vcpu_time_info structure KVM: x86: Inject #GP with the right rip on efer writes KVM: SVM: Don't allow nested guest to VMMCALL into host KVM: x86: Fix exception reinjection forced to true KVM: Fix wallclock version writing race KVM: MMU: Don't read pdptrs with mmu spinlock held in mmu_alloc_roots KVM: VMX: enable VMXON check with SMX enabled (Intel TXT) ...
| * KVM: use the correct RCU API for PROVE_RCU=yLai Jiangshan2010-05-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The RCU/SRCU API have already changed for proving RCU usage. I got the following dmesg when PROVE_RCU=y because we used incorrect API. This patch coverts rcu_deference() to srcu_dereference() or family API. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- arch/x86/kvm/mmu.c:3020 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by qemu-system-x86/8550: #0: (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa011a6ac>] kvm_set_memory_region+0x29/0x50 [kvm] #1: (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<ffffffffa012262d>] kvm_arch_commit_memory_region+0xa6/0xe2 [kvm] stack backtrace: Pid: 8550, comm: qemu-system-x86 Not tainted 2.6.34-rc4-tip-01028-g939eab1 #27 Call Trace: [<ffffffff8106c59e>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffffa012f6c1>] kvm_mmu_calculate_mmu_pages+0x44/0x7d [kvm] [<ffffffffa012263e>] kvm_arch_commit_memory_region+0xb7/0xe2 [kvm] [<ffffffffa011a5d7>] __kvm_set_memory_region+0x636/0x6e2 [kvm] [<ffffffffa011a6ba>] kvm_set_memory_region+0x37/0x50 [kvm] [<ffffffffa015e956>] vmx_set_tss_addr+0x46/0x5a [kvm_intel] [<ffffffffa0126592>] kvm_arch_vm_ioctl+0x17a/0xcf8 [kvm] [<ffffffff810a8692>] ? unlock_page+0x27/0x2c [<ffffffff810bf879>] ? __do_fault+0x3a9/0x3e1 [<ffffffffa011b12f>] kvm_vm_ioctl+0x364/0x38d [kvm] [<ffffffff81060cfa>] ? up_read+0x23/0x3d [<ffffffff810f3587>] vfs_ioctl+0x32/0xa6 [<ffffffff810f3b19>] do_vfs_ioctl+0x495/0x4db [<ffffffff810e6b2f>] ? fget_light+0xc2/0x241 [<ffffffff810e416c>] ? do_sys_open+0x104/0x116 [<ffffffff81382d6d>] ? retint_swapgs+0xe/0x13 [<ffffffff810f3ba6>] sys_ioctl+0x47/0x6a [<ffffffff810021db>] system_call_fastpath+0x16/0x1b Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | kvm: Change kvm_iommu_map_pages to map large pagesJoerg Roedel2010-03-071-22/+91
|/ | | | | | | | | This patch changes the implementation of of kvm_iommu_map_pages to map the pages with the host page size into the io virtual address space. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-By: Avi Kivity <avi@redhat.com>
* KVM: enable PCI multiple-segments for pass-through deviceZhai, Edwin2010-03-011-3/+6
| | | | | | | | | Enable optional parameter (default 0) - PCI segment (or domain) besides BDF, when assigning PCI device to guest. Signed-off-by: Zhai Edwin <edwin.zhai@intel.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU updateMarcelo Tosatti2010-03-011-2/+2
| | | | | | | | | | Use two steps for memslot deletion: mark the slot invalid (which stops instantiation of new shadow pages for that slot, but allows destruction), then instantiate the new empty slot. Also simplifies kvm_handle_hva locking. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: use gfn_to_pfn_memslot in kvm_iommu_map_pagesMarcelo Tosatti2010-03-011-7/+6
| | | | | | | So its possible to iommu map a memslot before making it visible to kvm. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: modify memslots layout in struct kvmMarcelo Tosatti2010-03-011-6/+12
| | | | | | | | | Have a pointer to an allocated region inside struct kvm. [alex: fix ppc book 3s] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Enable snooping control for supported hardwareSheng Yang2009-06-101-3/+24
| | | | | | | | | | | | | | | | Memory aliases with different memory type is a problem for guest. For the guest without assigned device, the memory type of guest memory would always been the same as host(WB); but for the assigned device, some part of memory may be used as DMA and then set to uncacheable memory type(UC/WC), which would be a conflict of host memory type then be a potential issue. Snooping control can guarantee the cache correctness of memory go through the DMA engine of VT-d. [avi: fix build on ia64] Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Fix assigned devices circular locking dependencyMark McLoughlin2009-02-151-4/+2
| | | | | | | | | | | kvm->slots_lock is outer to kvm->lock, so take slots_lock in kvm_vm_ioctl_assign_device() before taking kvm->lock, rather than taking it in kvm_iommu_map_memslots(). Cc: stable@kernel.org Signed-off-by: Mark McLoughlin <markmc@redhat.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* kvm/iommu: fix compile warningJoerg Roedel2009-01-031-1/+1
| | | | | | | This fixes a compile warning about a variable thats maybe used uninitialized in the function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
* KVM: change KVM to use IOMMU APIJoerg Roedel2009-01-031-24/+21
| | | | Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
* KVM: rename vtd.c to iommu.cJoerg Roedel2009-01-031-0/+217
Impact: file renamed The code in the vtd.c file can be reused for other IOMMUs as well. So rename it to make it clear that it handle more than VT-d. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
OpenPOWER on IntegriCloud