vfio: hugepage support for vfio_iommu_type1

We currently send all mappings to the iommu in PAGE_SIZE chunks, which prevents the iommu from enabling support for larger page sizes. We still need to pin pages, which means we step through them in PAGE_SIZE chunks, but we can batch up contiguous physical memory chunks to allow the iommu the opportunity to use larger pages.

The approach here is a bit different than the one currently used for legacy KVM device assignment. Rather than looking at the vma page size and using that as the maximum size to pass to the iommu, we instead simply look at whether the next page is physically contiguous. This means we might ask the iommu to map a 4MB region, while legacy KVM might limit itself to a maximum of 2MB.

Splitting our mapping path also allows us to be smarter about locked memory because we can more easily unwind if the user attempts to exceed the limit. Therefore, rather than assuming that a mapping will result in locked memory, we test each page as it is pinned to determine whether it locks RAM vs an mmap'd MMIO region. This should result in better locking granularity and fewer locked-page fudge factors in userspace.

The unmap path uses the same algorithm as legacy KVM. We don't want to track the pfn for each mapping ourselves, but we need the pfn in order to unpin pages. We therefore ask the iommu for the iova to physical address translation, ask it to unpin a page, and see how many pages were actually unpinned. iommus supporting large pages will often return something bigger than a page here, which we know will be physically contiguous, so we can unpin a batch of pfns. iommus not supporting large mappings won't see an improvement in batching here as they only unmap a page at a time.

With this change, we also make a clarification to the API for mapping and unmapping DMA. We can only guarantee unmaps at the same granularity as used for the original mapping. In other words, unmapping a subregion of a previous mapping is not guaranteed and may result in a larger or smaller unmapping than requested. The size field in the unmapping structure is updated to reflect this. Previously this field was left unmodified, always returning the requested unmap size. It is now updated to return the actual unmap size on success, allowing userspace to appropriately track mappings.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
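
The batching idea described above can be illustrated with a small userspace sketch. This is not the vfio_iommu_type1 code; it only shows the "is the next page physically contiguous?" test applied to an array of pfns. The names map_contiguous_runs, map_chunk_fn and SKETCH_PAGE_SIZE are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size, illustration only */

/* Hypothetical stand-in for whatever hands one chunk to the iommu. */
typedef void (*map_chunk_fn)(uint64_t iova, uint64_t pfn, size_t npages,
                             void *ctx);

/*
 * Step through pfns[] one page at a time (as pinning requires), but keep
 * extending the current chunk while the next pfn is physically contiguous
 * with it. Each finished chunk is passed to map() if one is supplied.
 * Returns the number of chunks emitted.
 */
static size_t map_contiguous_runs(uint64_t iova, const uint64_t *pfns,
                                  size_t npages, map_chunk_fn map, void *ctx)
{
    size_t chunks = 0;
    size_t start = 0;

    while (start < npages) {
        size_t len = 1;

        /* Grow the run while the following page is the next pfn. */
        while (start + len < npages &&
               pfns[start + len] == pfns[start] + len)
            len++;

        if (map)
            map(iova + start * SKETCH_PAGE_SIZE, pfns[start], len, ctx);
        chunks++;
        start += len;
    }
    return chunks;
}
```

For pfns {100, 101, 102, 200, 201, 500} this emits three chunks of 3, 2 and 1 pages, where a strict PAGE_SIZE walk would issue six separate mappings.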
author: Alex Williamson <alex.williamson@redhat.com> 2013-06-21 09:38:02 -0600
committer: Alex Williamson <alex.williamson@redhat.com> 2013-06-21 09:38:02 -0600
commit: 166fd7d94afdac040b28c473e45241820ca522a2 (patch)
tree: 044cd4540cb2a949ed8a55949cc39471b05a73b3 /include/uapi
parent: cd9b22685e4ccd728550d51fbe108c473f89df4f (diff)
1 files changed, 6 insertions, 2 deletions
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 284ff24..5136006 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -361,10 +361,14 @@ struct vfio_iommu_type1_dma_map {
 #define VFIO_IOMMU_MAP_DMA _IO(VFIO_TYPE, VFIO_BASE + 13)
 
 /**
- * VFIO_IOMMU_UNMAP_DMA - _IOW(VFIO_TYPE, VFIO_BASE + 14, struct vfio_dma_unmap)
+ * VFIO_IOMMU_UNMAP_DMA - _IOWR(VFIO_TYPE, VFIO_BASE + 14,
+ *							struct vfio_dma_unmap)
  *
  * Unmap IO virtual addresses using the provided struct vfio_dma_unmap.
- * Caller sets argsz.
+ * Caller sets argsz.  The actual unmapped size is returned in the size
+ * field.  No guarantee is made to the user that arbitrary unmaps of iova
+ * or size different from those used in the original mapping call will
+ * succeed.
  */
 struct vfio_iommu_type1_dma_unmap {
 	__u32	argsz;
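
From userspace, the clarified unmap semantics suggest reading the size field back after the ioctl. The following is a minimal sketch, assuming container_fd is a VFIO container that has already been set up with the type1 iommu; the helper name unmap_dma_range is hypothetical, while the structure and ioctl come from <linux/vfio.h>.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Hypothetical helper: request an unmap of [iova, iova + size).
 * On success returns 0 and stores the size the kernel reports as actually
 * unmapped in *unmapped. On failure returns -1 (errno is set by ioctl())
 * and leaves *unmapped untouched.
 */
static int unmap_dma_range(int container_fd, uint64_t iova, uint64_t size,
                           uint64_t *unmapped)
{
    struct vfio_iommu_type1_dma_unmap unmap;

    memset(&unmap, 0, sizeof(unmap));
    unmap.argsz = sizeof(unmap);   /* caller sets argsz */
    unmap.iova = iova;
    unmap.size = size;

    if (ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap) < 0)
        return -1;

    /* The kernel writes back the size that was actually unmapped. */
    *unmapped = unmap.size;
    return 0;
}
```

A caller would compare *unmapped with the requested size and adjust its own bookkeeping if they differ, since unmaps that do not match the iova and size of the original mapping are not guaranteed to behave as requested.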