summaryrefslogtreecommitdiffstats
path: root/hw/vfio_pci.c
Commit message (Collapse)AuthorAgeFilesLines
* vfio-pci: Enable PCIe extended config spaceAlex Williamson2013-01-301-0/+4
| | | | | | | | | | | | | We don't know pre-init time whether the device we're exposing is PCIe or legacy PCI. We could ask for it to be specified via a device option, but that seems like too much to ask of the user. Instead we can assume everything will be PCIe, which makes PCI-core allocate enough config space. Removing the flag during init leaves the space allocated, but allows legacy PCI devices to report the real device config space size to rest of Qemu. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vfio-pci: Loosen sanity checks to allow future featuresAlex Williamson2013-01-081-2/+2
| | | | | | | | | | | VFIO_PCI_NUM_REGIONS and VFIO_PCI_NUM_IRQS should never have been used in this manner as it locks a specific kernel implementation. Future features may introduce new regions or interrupt entries (VGA may add legacy ranges, AER might add an IRQ for error signalling). Fix this before it gets us into trouble. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: qemu-stable@nongnu.org
* vfio-pci: Make host MSI-X enable track guestAlex Williamson2013-01-081-4/+26
| | | | | | | | | | | | | | | | | | | | | | | Guests typically enable MSI-X with all of the vectors in the MSI-X vector table masked. Only when the vector is enabled does the vector get unmasked, resulting in a vector_use callback. These two points, enable and unmask, correspond to pci_enable_msix() and request_irq() for Linux guests. Some drivers rely on VF/PF or PF/fw communication channels that expect the physical state of the device to match the guest visible state of the device. They don't appreciate lazily enabling MSI-X on the physical device. To solve this, enable MSI-X with a single vector when the MSI-X capability is enabled and immediate disable the vector. This leaves the physical device in exactly the same state between host and guest. Furthermore, the brief gap where we enable vector 0, it fires into userspace, not KVM, so the guest doesn't get spurious interrupts. Ideally we could call VFIO_DEVICE_SET_IRQS with the right parameters to enable MSI-X with zero vectors, but this will currently return an error as the Linux MSI-X interfaces do not allow it. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: qemu-stable@nongnu.org
* msi: add API to get notified about pending bit pollMichael S. Tsirkin2012-12-261-1/+1
| | | | | | Update all users. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* softmmu: move include files to include/sysemu/Paolo Bonzini2012-12-191-1/+1
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* misc: move include files to include/qemu/Paolo Bonzini2012-12-191-4/+4
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec: move include files to include/exec/Paolo Bonzini2012-12-191-2/+2
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Merge commit '1dd3a74d2ee2d873cde0b390b536e45420b3fe05' into HEADPaolo Bonzini2012-12-171-3/+3
|\ | | | | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * pci: update all users to look in pci/Michael S. Tsirkin2012-12-171-3/+3
| | | | | | | | | | | | update all users so we can remove the makefile hack. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vfio-pci: Don't use kvm_irqchip_in_kernelAlex Williamson2012-12-101-2/+3
|/ | | | | | | | | | | kvm_irqchip_in_kernel() has an architecture specific meaning, so we shouldn't be using it to determine whether to enabled KVM INTx bypass. kvm_irqfds_enabled() seems most appropriate. Also use this to protect our other call to kvm_check_extension() as that explodes when KVM isn't enabled. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: qemu-stable@nongnu.org
* vfio-pci: Use common msi_get_messageAlex Williamson2012-11-131-23/+1
| | | | | | We can get rid of our local version now that a helper exists. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: Add KVM INTx accelerationAlex Williamson2012-11-131-0/+186
| | | | | | | | | | | | This makes use of the new level irqfd support enabling bypass of qemu userspace both on INTx injection and unmask. This significantly boosts the performance of devices making use of legacy interrupts (ex. ~60% better netperf TCP_RR scores for an e1000e assigned to a Linux guest and booted with pci=nomsi). This also avoids flipping mmaps on and off to simulate EOIs, so greatly improves performance of device access in addition to interrupt latency. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* Rename target_phys_addr_t to hwaddrAvi Kivity2012-10-231-18/+18
| | | | | | | | | | | | | | | target_phys_addr_t is unwieldly, violates the C standard (_t suffixes are reserved) and its purpose doesn't match the name (most target_phys_addr_t addresses are not target specific). Replace it with a finger-friendly, standards conformant hwaddr. Outstanding patchsets can be fixed up with the command git rebase -i --exec 'find -name "*.[ch]" | xargs s/target_phys_addr_t/hwaddr/g' origin Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* Merge remote-tracking branch 'awilliam/tags/vfio-pci-for-qemu-20121017.0' ↵Anthony Liguori2012-10-221-3/+8
|\ | | | | | | | | | | | | | | into staging * awilliam/tags/vfio-pci-for-qemu-20121017.0: vfio-pci: Mark non-migratable vfio-pci: Fix debug build
| * vfio-pci: Mark non-migratableAlex Williamson2012-10-171-0/+6
| | | | | | | | | | | | We haven't magically fixed this yet. Toss in a description too. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
| * vfio-pci: Fix debug buildAlex Williamson2012-10-171-3/+2
| | | | | | | | | | | | Stray variable from before MSI-X rework Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* | memory: use AddressSpace for MemoryListener filteringAvi Kivity2012-10-221-2/+1
| | | | | | | | | | | | | | | | Using the AddressSpace type reduces confusion, as you can't accidentally supply the MemoryRegion you're interested in. Reviewed-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | vfio: drop no-op MemoryListener callbacksAvi Kivity2012-10-151-29/+0
|/ | | | | | Removes quite a bit of useless code. Signed-off-by: Avi Kivity <avi@redhat.com>
* vfio-pci: Fix BAR->VFIODevice translation inJan Kiszka2012-10-081-2/+2
| | | | | | | | | | | DO_UPCAST is supposed to translate from the first member of a struct to that struct, not from arbitrary ones. And it (usually) breaks the build when neglecting this rule. Use container_of to fix the build breakage and likely also the runtime behavior. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> aw: runtime behavior is actually the same, but clearly misuse of DO_UPCAST Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: Clang cleanupAlex Williamson2012-10-081-43/+58
| | | | | | | | | | | Blue Swirl reports that Clang doesn't like the structure we define to avoid dynamic allocation for a number of calls to VFIO_DEVICE_SET_IRQS. Adding an element after a variable sized type is a GNU extension. Switch back to dynamic allocation, which really isn't a problem since this is only done on interrupt setup changes. Cc: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: Cleanup on INTx setup failureAlex Williamson2012-10-081-0/+2
| | | | | | Missing some unwind code. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: Extend resetAlex Williamson2012-10-081-7/+22
| | | | | | | | | Take what we've learned from pci-assign and apply it to vfio-pci. On reset, disable previous interrupt config, perform a device reset if available, re-enable INTx, and disable memory regions on the device to prevent continuing DMA. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: Remove setting of MSI qsizeAlex Williamson2012-10-081-18/+0
| | | | | | | This was a misinterpretation of the spec, hardware doesn't get to specify how many were actually enabled through this field. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: Use uintptr_t for void* castAlex Williamson2012-10-081-1/+1
| | | | | | | We don't seem to run into any sign extension problems, but unsigned looks more correct. Signed-off-by: Alex williamson <alex.williamson@redhat.com>
* vfio-pci: Don't peak at msi_supportedAlex Williamson2012-10-081-16/+6
| | | | | | Let the init function fail, just don't warn for -ENOTSUP. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: Roll the header into the .c fileAlex Williamson2012-10-081-1/+96
| | | | | | | It's only ~100 lines and nobody else should be using this. Suggested by Michael Tsirkin. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: No spurious MSIsAlex Williamson2012-10-081-15/+0
| | | | | | | FreeBSD doesn't like these spurious MSIs, remove them as they're mostly paranoia anyway. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: Rework MSIX setup/teardownAlex Williamson2012-10-081-53/+55
| | | | | | | | | | | | | | We try to do lazy initialization of MSIX since we don't actually need to setup anything until MSIX vectors start getting used. This leads to problems if MSIX is enabled, but never used (we can end up trying to re-enable INTx while it's still enabled). We also run into problems trying to expand our reset function to tear down interrupts as we can then get vector release notifications after we've released data structures. By making explicit initialization and teardown we can avoid both of these problems and behave more similar to bare metal. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: Unmap and retry DMA mappingAlex Williamson2012-10-081-4/+11
| | | | | | | | | | Occasionally we get regions added that overlap with existing mappings. These always seems to be in the VGA ROM range. VFIO returns EBUSY for these mapping attempts. We can try a little harder and assume that the latest mapping is correct by removing any overlapping ranges and retrying the original request. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: Re-order map/unmapAlex Williamson2012-10-081-18/+18
| | | | | | This cleans up the next patch that calls unmap from map. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio-pci: Update slow path INTx algorithmAlex Williamson2012-10-081-23/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | We can't afford the overhead of switching out and back into mmap mode around each interrupt, but we can do it lazily via a timer. On INTx interrupt, disable the mmap'd memory regions and set a timer. On every interrupt, push the timer out. If the timer expires and the interrupt is no longer pending, switch back to mmap mode. This has the benefit that things like graphics cards, which rarely or never, fire an interrupt don't need manual user intervention to add the x-intx=off parameter. They'll just remain in mmap mode until they trigger an interrupt, and if they don't continue to regularly fire interrupts, they'll switch back. The default timeout is tuned for network cards so that a ping is just enough to keep them in non-mmap mode, where they have much better latency. It is tunable with an experimental option, x-intx-mmap-timeout-ms. A value of 0 keeps the device in non-mmap mode after the first interrupt. It's possible we could look at the class code of devices and come up with reasonable per-class defaults based on expected interrupt frequency and latency. None of this is used for MSI interrupts and also won't be used if we can bypass through KVM. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio_pci: fix build on 32-bit systemsAnthony Liguori2012-10-011-1/+1
| | | | | | | | | We cannot cast directly from pointer to uint64. Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Alex Barcelo <abarcelo@ac.upc.edu> Reported-by: Alex Barcelo <abarcelo@ac.upc.edu> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* vfio: vfio-pci device assignment driverAlex Williamson2012-10-011-0/+1864
This adds the core of the QEMU VFIO-based PCI device assignment driver. To make use of this driver, enable CONFIG_VFIO, CONFIG_VFIO_IOMMU_TYPE1, and CONFIG_VFIO_PCI in your host Linux kernel config. Load the vfio-pci module. To assign device 0000:05:00.0 to a guest, do the following: for dev in $(ls /sys/bus/pci/devices/0000:05:00.0/iommu_group/devices); do vendor=$(cat /sys/bus/pci/devices/$dev/vendor) device=$(cat /sys/bus/pci/devices/$dev/device) if [ -e /sys/bus/pci/devices/$dev/driver ]; then echo $dev > /sys/bus/pci/devices/$dev/driver/unbind fi echo $vendor $device > /sys/bus/pci/drivers/vfio-pci/new_id done See Documentation/vfio.txt in the Linux kernel tree for further description of IOMMU groups and VFIO. Then launch qemu including the option: -device vfio-pci,host=0000:05:00.0 Legacy PCI interrupts (INTx) currently makes use of a kludge where we trap BAR accesses and assume the access is in response to an interrupt, therefore de-asserting and unmasking the interrupt. It's not quite as targetted as using the EOI for this, but it's self contained and seems to work across all architectures. The side-effect is a significant performance slow-down for device in INTx mode. Some devices, like graphics cards, don't really use their interrupt, so this can be turned off with the x-intx=off option, which disables INTx alltogether. This should be considered an experimental option until we refine this code. Both MSI and MSI-X are supported and avoid these issues. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
OpenPOWER on IntegriCloud