summaryrefslogtreecommitdiffstats
path: root/usr.sbin/bhyve/pci_passthru.c
Commit message (Collapse)AuthorAgeFilesLines
* MFC 264353,264509,264768,264770,264825,264846,264988,265114,265165,265365,jhb2014-07-211-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | 265941,265951,266390,266550,266910: Various bhyve fixes: - Don't save host's return address in 'struct vmxctx'. - Permit non-32-bit accesses to local APIC registers. - Factor out common ioport handler code. - Use calloc() in favor of malloc + memset. - Change the vlapic timer frequency to be in the ballpark of contemporary hardware. - Allow the guest to read the TSC via MSR 0x10. - A VMCS is always inactive when it exits the vmx_run() loop. Remove redundant code and the misleading comment that suggest otherwise. - Ignore writes to microcode update MSR. This MSR is accessed by RHEL7 guest. Add KTR tracepoints to annotate wrmsr and rdmsr VM exits. - Provide an alias for the userboot console and name it 'comconsole'. - Use EV_ADD to create an mevent and EV_ENABLE to enable it. - abort(3) the process in response to a VMEXIT_ABORT. - Don't include the guest memory segments in the bhyve(8) process core dump. - Make the vmx asm code dtrace-fbt-friendly. - Allow vmx_getdesc() and vmx_setdesc() to be called for a vcpu that is in the VCPU_RUNNING state. - Enable VMX in the IA32_FEATURE_CONTROL MSR if it not enabled and the MSR isn't locked.
* MFC 261904,261905,262143,262184,264921,265211,267169,267292,267294:jhb2014-07-191-18/+48
| | | | | | | | | | | Various PCI fixes: - Allow PCI devices to be configured on all valid bus numbers from 0 to 255. - Tweak the handling of PCI capabilities in emulated devices to remove the non-standard zero capability list terminator. - Add a check to validate that memory BARs of passthru devices are 4KB aligned. - Respect and track the enable bit in the PCI configuration address word. - Handle quad-word access to 32-bit register pairs.
* MFC 258859,259081,259085,259205,259213,259275,259482,259537,259702,259779:jhb2014-02-231-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several changes to the local APIC support in bhyve: - Rename 'vm_interrupt_hostcpu()' to 'vcpu_notify_event()'. - If a vcpu disables its local apic and then executes a 'HLT' then spin down the vcpu and destroy its thread context. Also modify the 'HLT' processing to ignore pending interrupts in the IRR if interrupts have been disabled by the guest. The interrupt cannot be injected into the guest in any case so resuming it is futile. - Use callout(9) to drive the vlapic timer instead of clocking it on each VM exit. - When the guest is bringing up the APs in the x2APIC mode a write to the ICR register will now trigger a return to userspace with an exitcode of VM_EXITCODE_SPINUP_AP. - Change the vlapic timer lock to be a spinlock because the vlapic can be accessed from within a critical section (vm run loop) when guest is using x2apic mode. - Fix the vlapic version register. - Add a command to bhyvectl to inject an NMI on a specific vcpu. - Add an API to deliver message signalled interrupts to vcpus. This allows callers to treat the MSI 'addr' and 'data' fields as opaque and also lets bhyve implement multiple destination modes: physical, flat and clustered. - Rename the ambiguously named 'vm_setup_msi()' and 'vm_setup_msix()' to 'vm_setup_pptdev_msi()' and 'vm_setup_pptdev_msix()' respectively. - Consolidate the virtual apic initialization in a single function: vlapic_reset() - Add a generic routine to trigger an LVT interrupt that supports both fixed and NMI delivery modes. - Add an ioctl and bhyvectl command to trigger local interrupts inside a guest. In particular, a global NMI similar to that raised by SERR# or PERR# can be simulated by asserting LINT1 on all vCPUs. - Extend the LVT table in the vCPU local APIC to support CMCI. - Flesh out the local APIC error reporting a bit to cache errors and report them via ESR when ESR is written to. Add support for asserting the error LVT when an error occurs. Raise illegal vector errors when attempting to signal an invalid vector for an interrupt or when sending an IPI. - Export table entries in the MADT and MP Table advertising the stock x86 config of LINT0 set to ExtInt and LINT1 wired to NMI.
* Convert the offset into the bar that contains the MSI-X table to an offsetneel2013-03-111-0/+3
| | | | | | | | | | | | into the MSI-X table before using it to calculate the table index. In the common case where the MSI-X table is located at the begining of the BAR these two offsets are identical and thus the code was working by accident. This change will fix the case where the MSI-X table is located in the middle or at the end of the BAR that contains it. Obtained from: NetApp
* Fix a broken assumption in the passthru implementation that the MSI-X tableneel2013-02-011-23/+37
| | | | | | | | | | | can only be located at the beginning or the end of the BAR. If the MSI-table is located in the middle of a BAR then we will split the BAR into two and create two mappings - one before the table and one after the table - leaving a hole in place of the table so accesses to it can be trapped and emulated. Obtained from: NetApp
* Fix a bug in the passthru implementation where it would assume that allneel2013-02-011-4/+8
| | | | | | | | | | | devices are MSI-X capable. This in turn would lead it to treat bar 0 as the MSI-X table bar even if the underlying device did not support MSI-X. Fix this by providing an API to query the MSI-X table index of the emulated device. If the underlying device does not support MSI-X then this API will return -1. Obtained from: NetApp
* Allocate the memory for the MSI-X table dynamically instead of allocating 32KBneel2013-01-211-10/+28
| | | | | | | | | | statically. In most cases the number of table entries will be far less than the maximum of 2048 allowed by the PCI specification. Reuse macros from pcireg.h to interpret the MSI-X capability instead of rolling our own. Obtained from: NetApp
* Get rid of redundant 'table_size' field in struct pi_msix. If needed it canneel2013-01-211-2/+1
| | | | | | always be calculated from the number of entries in the MSI-X table. Obtained from: NetApp
* Revamp the x86 instruction emulation in bhyve.neel2012-11-281-1/+0
| | | | | | | | | | | | | | | | | | | On a nested page table fault the hypervisor will: - fetch the instruction using the guest %rip and %cr3 - decode the instruction in 'struct vie' - emulate the instruction in host kernel context for local apic accesses - any other type of mmio access is punted up to user-space (e.g. ioapic) The decoded instruction is passed as collateral to the user-space process that is handling the PAGING exit. The emulation code is fleshed out to include more addressing modes (e.g. SIB) and more types of operands (e.g. imm8). The source code is unified into a single file (vmm_instruction_emul.c) that is compiled into vmm.ko as well as /usr/sbin/bhyve. Reviewed by: grehan Obtained from: NetApp
* MSI-X does not need to be enabled in the message control register for theneel2012-11-221-2/+2
| | | | | | guest to access the MSI-x tables. Obtained from: NetApp
* Rework how guest MMIO regions are dealt with.grehan2012-10-191-126/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - New memory region interface. An RB tree holds the regions, with a last-found per-vCPU cache to deal with the common case of repeated guest accesses to MMIO registers in the same page. - Support memory-mapped BARs in PCI emulation. mem.c/h - memory region interface instruction_emul.c/h - remove old region interface. Use gpa from EPT exit to avoid a tablewalk to determine operand address. Determine operand size and use when calling through to region handler. fbsdrun.c - call into region interface on paging exit. Distinguish between instruction emul error and region not found pci_emul.c/h - implement new BAR callback api. Split BAR alloc routine into routines that require/don't require the BAR phys address. ioapic.c pci_passthru.c pci_virtio_block.c pci_virtio_net.c pci_uart.c - update to new BAR callback i/f Reviewed by: neel Obtained from: NetApp
* MSI-x interrupt support for PCI pass-thru devices.grehan2012-04-281-13/+257
| | | | | | | | | | Includes instruction emulation for memory r/w access. This opens the door for io-apic, local apic, hpet timer, and legacy device emulation. Submitted by: ryan dot berryhill at sandvine dot com Reviewed by: grehan Obtained from: Sandvine
* Import of bhyve hypervisor and utilities, part 1.grehan2011-05-131-0/+508
vmm.ko - kernel module for VT-x, VT-d and hypervisor control bhyve - user-space sequencer and i/o emulation vmmctl - dump of hypervisor register state libvmm - front-end to vmm.ko chardev interface bhyve was designed and implemented by Neel Natu. Thanks to the following folk from NetApp who helped to make this available: Joe CaraDonna Peter Snyder Jeff Heller Sandeep Mann Steve Miller Brian Pawlowski
OpenPOWER on IntegriCloud