summaryrefslogtreecommitdiffstats
path: root/usr.sbin/bhyve/pci_passthru.c
Commit message (Collapse)AuthorAgeFilesLines
* MFC r302362,r302363,r302364,r302365,r302373:ngie2016-07-131-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r302362: Fix gcc warnings - Remove -Wunused-but-set-variable (newcpu) - Always return VMEXIT_CONTINUE as the code always set retval to that value. r302363: Fix gcc warnings Put cfl/prdt under AHCI_DEBUG #defines as they are only used in those cases. r302364: Fix gcc warnings Add `WRAPPED_CTASSERT` macro by annotating CTASSERTs with __unused to deal with -Wunused-local-typedefs warnings from gcc 4.8+. All other compilers (clang, etc) use CTASSERT as-is. A more generic solution for this issue will be proposed after ^/stable/11 is forked. Consolidate all CTASSERTs under one block instead of inlining them in functions. r302365: Fix gcc warnings Remove -Wunused-but-set-variable (`error`). Cast calls with `(void)` to note that the return value is explicitly ignored. r302373: Fix CTASSERT issue in a more clean way - Replace all CTASSERT macro instances with static_assert's. - Remove the WRAPPED_CTASSERT macro; it's now an unnecessary obfuscation. - Localize all static_assert's to the structures being tested. - Sort some headers per-style(9).
* MFC 297932,298295:jhb2016-04-271-32/+133
| | | | | | | | | | | | | | | | | | | | Improvements for PCI passthru devices. 297932: Handle PBA that shares a page with MSI-X table for passthrough devices. If the PBA shares a page with the MSI-X table, map the shared page via /dev/mem and emulate accesses to the portion of the PBA in the shared page by accessing the mapped page. 298295: Always emit an error message on passthru configuration errors. Previously, many errors (such as the PCI device not being attached to the ppt(4) driver) resulted in bhyve silently exiting without starting the virtual machine. Now any errors encountered when configuring a virtual slot for a PCI passthru device should be noted on stderr.
* MFC r284539, r284630, r284688, r284877, r285217, r285218,grehan2016-02-011-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r286837, r286838, r288470, r288522, r288524, r288826, r289001 Pull in bhyve bug fixes and changes to allow UEFI booting. This provides Windows support. Tested on Intel and AMD with: - Arch Linux i386+amd64 (kernel 4.3.3) - Ubuntu 15.10 server 64-bit - FreeBSD-CURRENT/amd64 20160127 snap - FreeBSD 10.2 i386+amd64 - OpenBSD 5.8 i386+amd64 - SmartOS latest - Windows 10 build 1511' Huge thanks to Yamagi Burmeister who submitted the patch and did the majority of the testing. r284539 - bootrom mem allocation support r284630 - Add SO_REUSEADDR when starting debug port r284688 - Fix a regression in "movs" emulation r284877 - verify_gla() non-zero segment base fix r285217 - Always assert DCD and DSR in the uart r285218 - devmem nodes moved to /dev/vmm.io/ r286837 - Add define for SATA Check-Power-Mode r286838 - Add simple (no-op) SATA cmd emulations r288470 - Increase virtio-blk indirect descs r288522 - Firmware guest query interface r288524 - Fix post-test typo r288826 - Clean up SATA unimplemented cmd msg r289001 - Add -l option to specify userboot path Submitted by: Yamagi Burmeister Approved by: re (kib)
* MFC 264353,264509,264768,264770,264825,264846,264988,265114,265165,265365,jhb2014-07-211-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | 265941,265951,266390,266550,266910: Various bhyve fixes: - Don't save host's return address in 'struct vmxctx'. - Permit non-32-bit accesses to local APIC registers. - Factor out common ioport handler code. - Use calloc() in favor of malloc + memset. - Change the vlapic timer frequency to be in the ballpark of contemporary hardware. - Allow the guest to read the TSC via MSR 0x10. - A VMCS is always inactive when it exits the vmx_run() loop. Remove redundant code and the misleading comment that suggest otherwise. - Ignore writes to microcode update MSR. This MSR is accessed by RHEL7 guest. Add KTR tracepoints to annotate wrmsr and rdmsr VM exits. - Provide an alias for the userboot console and name it 'comconsole'. - Use EV_ADD to create an mevent and EV_ENABLE to enable it. - abort(3) the process in response to a VMEXIT_ABORT. - Don't include the guest memory segments in the bhyve(8) process core dump. - Make the vmx asm code dtrace-fbt-friendly. - Allow vmx_getdesc() and vmx_setdesc() to be called for a vcpu that is in the VCPU_RUNNING state. - Enable VMX in the IA32_FEATURE_CONTROL MSR if it not enabled and the MSR isn't locked.
* MFC 261904,261905,262143,262184,264921,265211,267169,267292,267294:jhb2014-07-191-18/+48
| | | | | | | | | | | Various PCI fixes: - Allow PCI devices to be configured on all valid bus numbers from 0 to 255. - Tweak the handling of PCI capabilities in emulated devices to remove the non-standard zero capability list terminator. - Add a check to validate that memory BARs of passthru devices are 4KB aligned. - Respect and track the enable bit in the PCI configuration address word. - Handle quad-word access to 32-bit register pairs.
* MFC 258859,259081,259085,259205,259213,259275,259482,259537,259702,259779:jhb2014-02-231-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several changes to the local APIC support in bhyve: - Rename 'vm_interrupt_hostcpu()' to 'vcpu_notify_event()'. - If a vcpu disables its local apic and then executes a 'HLT' then spin down the vcpu and destroy its thread context. Also modify the 'HLT' processing to ignore pending interrupts in the IRR if interrupts have been disabled by the guest. The interrupt cannot be injected into the guest in any case so resuming it is futile. - Use callout(9) to drive the vlapic timer instead of clocking it on each VM exit. - When the guest is bringing up the APs in the x2APIC mode a write to the ICR register will now trigger a return to userspace with an exitcode of VM_EXITCODE_SPINUP_AP. - Change the vlapic timer lock to be a spinlock because the vlapic can be accessed from within a critical section (vm run loop) when guest is using x2apic mode. - Fix the vlapic version register. - Add a command to bhyvectl to inject an NMI on a specific vcpu. - Add an API to deliver message signalled interrupts to vcpus. This allows callers to treat the MSI 'addr' and 'data' fields as opaque and also lets bhyve implement multiple destination modes: physical, flat and clustered. - Rename the ambiguously named 'vm_setup_msi()' and 'vm_setup_msix()' to 'vm_setup_pptdev_msi()' and 'vm_setup_pptdev_msix()' respectively. - Consolidate the virtual apic initialization in a single function: vlapic_reset() - Add a generic routine to trigger an LVT interrupt that supports both fixed and NMI delivery modes. - Add an ioctl and bhyvectl command to trigger local interrupts inside a guest. In particular, a global NMI similar to that raised by SERR# or PERR# can be simulated by asserting LINT1 on all vCPUs. - Extend the LVT table in the vCPU local APIC to support CMCI. - Flesh out the local APIC error reporting a bit to cache errors and report them via ESR when ESR is written to. Add support for asserting the error LVT when an error occurs. Raise illegal vector errors when attempting to signal an invalid vector for an interrupt or when sending an IPI. - Export table entries in the MADT and MP Table advertising the stock x86 config of LINT0 set to ExtInt and LINT1 wired to NMI.
* Convert the offset into the bar that contains the MSI-X table to an offsetneel2013-03-111-0/+3
| | | | | | | | | | | | into the MSI-X table before using it to calculate the table index. In the common case where the MSI-X table is located at the begining of the BAR these two offsets are identical and thus the code was working by accident. This change will fix the case where the MSI-X table is located in the middle or at the end of the BAR that contains it. Obtained from: NetApp
* Fix a broken assumption in the passthru implementation that the MSI-X tableneel2013-02-011-23/+37
| | | | | | | | | | | can only be located at the beginning or the end of the BAR. If the MSI-table is located in the middle of a BAR then we will split the BAR into two and create two mappings - one before the table and one after the table - leaving a hole in place of the table so accesses to it can be trapped and emulated. Obtained from: NetApp
* Fix a bug in the passthru implementation where it would assume that allneel2013-02-011-4/+8
| | | | | | | | | | | devices are MSI-X capable. This in turn would lead it to treat bar 0 as the MSI-X table bar even if the underlying device did not support MSI-X. Fix this by providing an API to query the MSI-X table index of the emulated device. If the underlying device does not support MSI-X then this API will return -1. Obtained from: NetApp
* Allocate the memory for the MSI-X table dynamically instead of allocating 32KBneel2013-01-211-10/+28
| | | | | | | | | | statically. In most cases the number of table entries will be far less than the maximum of 2048 allowed by the PCI specification. Reuse macros from pcireg.h to interpret the MSI-X capability instead of rolling our own. Obtained from: NetApp
* Get rid of redundant 'table_size' field in struct pi_msix. If needed it canneel2013-01-211-2/+1
| | | | | | always be calculated from the number of entries in the MSI-X table. Obtained from: NetApp
* Revamp the x86 instruction emulation in bhyve.neel2012-11-281-1/+0
| | | | | | | | | | | | | | | | | | | On a nested page table fault the hypervisor will: - fetch the instruction using the guest %rip and %cr3 - decode the instruction in 'struct vie' - emulate the instruction in host kernel context for local apic accesses - any other type of mmio access is punted up to user-space (e.g. ioapic) The decoded instruction is passed as collateral to the user-space process that is handling the PAGING exit. The emulation code is fleshed out to include more addressing modes (e.g. SIB) and more types of operands (e.g. imm8). The source code is unified into a single file (vmm_instruction_emul.c) that is compiled into vmm.ko as well as /usr/sbin/bhyve. Reviewed by: grehan Obtained from: NetApp
* MSI-X does not need to be enabled in the message control register for theneel2012-11-221-2/+2
| | | | | | guest to access the MSI-x tables. Obtained from: NetApp
* Rework how guest MMIO regions are dealt with.grehan2012-10-191-126/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - New memory region interface. An RB tree holds the regions, with a last-found per-vCPU cache to deal with the common case of repeated guest accesses to MMIO registers in the same page. - Support memory-mapped BARs in PCI emulation. mem.c/h - memory region interface instruction_emul.c/h - remove old region interface. Use gpa from EPT exit to avoid a tablewalk to determine operand address. Determine operand size and use when calling through to region handler. fbsdrun.c - call into region interface on paging exit. Distinguish between instruction emul error and region not found pci_emul.c/h - implement new BAR callback api. Split BAR alloc routine into routines that require/don't require the BAR phys address. ioapic.c pci_passthru.c pci_virtio_block.c pci_virtio_net.c pci_uart.c - update to new BAR callback i/f Reviewed by: neel Obtained from: NetApp
* MSI-x interrupt support for PCI pass-thru devices.grehan2012-04-281-13/+257
| | | | | | | | | | Includes instruction emulation for memory r/w access. This opens the door for io-apic, local apic, hpet timer, and legacy device emulation. Submitted by: ryan dot berryhill at sandvine dot com Reviewed by: grehan Obtained from: Sandvine
* Import of bhyve hypervisor and utilities, part 1.grehan2011-05-131-0/+508
vmm.ko - kernel module for VT-x, VT-d and hypervisor control bhyve - user-space sequencer and i/o emulation vmmctl - dump of hypervisor register state libvmm - front-end to vmm.ko chardev interface bhyve was designed and implemented by Neel Natu. Thanks to the following folk from NetApp who helped to make this available: Joe CaraDonna Peter Snyder Jeff Heller Sandeep Mann Steve Miller Brian Pawlowski
OpenPOWER on IntegriCloud