path: root/usr.sbin/bhyve
Commit message  Author  Age  Files  Lines
* Specify the length of the mapping requested from 'paddr_guest2host()'.neel2013-03-017-37/+49
| | | | | | | | | This seems prudent to do in its own right but it also opens up the possibility of not having to mmap the entire guest address space in the 'bhyve' process context. Discussed with: grehan Obtained from: NetApp
* Ignore the BARRIER flag in the virtio block header.neel2013-02-261-4/+11
| | | | | | | | This capability is not advertised by the host so ignore it even if the guest insists on setting the flag. Reviewed by: grehan Obtained from: NetApp
* Get rid of unused struct member.neel2013-02-251-1/+0
| | | | | Pointed out by: Gopakumar T Obtained from: NetApp
* Add the ability to have a 'fallback' search for memory ranges.grehan2013-02-223-17/+65
| | | | | | | | | | | | | | | This set of ranges will be looked at if a standard memory range isn't found, and won't be installed in the cache. Use this to implement the memory behaviour of the PCI hole on x86 systems, where writes are ignored and reads always return -1. This allows breakpoints to be set when issuing a 'boot -d', which has the side effect of accessing the PCI hole when changing the PTE protection on kernel code, since the pmap layer hasn't been initialized (a bug, but present in existing FreeBSD releases so has to be handled). Reviewed by: neel Obtained from: NetApp
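A minimal sketch of the PCI hole behaviour described above. The handler name, its signature and the `mem_dir` enum are assumptions made for illustration; they are not taken from the bhyve source.

```c
#include <stdint.h>

/* Hypothetical direction flag; not taken from the bhyve source. */
enum mem_dir { MEM_READ, MEM_WRITE };

/*
 * Sketch of a fallback handler for the PCI hole: writes are dropped and
 * reads return -1 (all bits set). Always reports success (0).
 */
int
pci_hole_access(enum mem_dir dir, uint64_t gpa, int size, uint64_t *val)
{
	(void)gpa;	/* every address in the hole behaves the same */
	(void)size;	/* the caller is assumed to truncate to 'size' */

	if (dir == MEM_READ)
		*val = ~(uint64_t)0;
	/* MEM_WRITE: intentionally ignored */
	return (0);
}
```

Because such a handler is only consulted when no standard range matches, it never needs to be entered into the lookup cache.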
* Advertise PCI-E capability in the hostbridge device presented to the guest.neel2013-02-153-0/+72
| | | | | | | FreeBSD wants to see this capability in at least one device in the PCI hierarchy before it allows use of MSI or MSI-X. Obtained from: NetApp
* Implement guest vcpu pinning using 'pthread_setaffinity_np(3)'.neel2013-02-111-1/+5
| | | | | | | | | | | | | | Prior to this change pinning was implemented via an ioctl (VM_SET_PINNING) that called 'sched_bind()' on behalf of the user thread. The ULE implementation of 'sched_bind()' bumps up 'td_pinned' which in turn runs afoul of the assertion '(td_pinned == 0)' in userret(). Using the cpuset affinity to implement pinning of the vcpu threads works with both 4BSD and ULE schedulers and has the happy side-effect of getting rid of a bunch of code in vmm.ko. Discussed with: grehan
* Install <dev/agp/agpreg.h> and <dev/pci/pcireg.h> as userland headersjhb2013-02-051-2/+0
| | | | | | in /usr/include. MFC after: 2 weeks
* Add support for MSI-X interrupts in the virtio block device and make thatneel2013-02-011-8/+98
| | | | | | | | | | | | the default. The current behavior of advertising a single MSI vector can be requested by setting the environment variable "BHYVE_USE_MSI" to "yes". The use of MSI is not compliant with the virtio specification and will be eventually phased out. Submitted by: Gopakumar T Obtained from: NetApp
* Fix a broken assumption in the passthru implementation that the MSI-X tableneel2013-02-011-23/+37
| | | | | | | | | | | can only be located at the beginning or the end of the BAR. If the MSI-X table is located in the middle of a BAR then we will split the BAR into two and create two mappings - one before the table and one after the table - leaving a hole in place of the table so accesses to it can be trapped and emulated. Obtained from: NetApp
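The split described above can be sketched as follows. `struct bar_seg` and `split_bar()` are hypothetical names, and page rounding of the table boundaries (which a real mapping would need) is left out.

```c
#include <stdint.h>

/* Hypothetical description of one directly mapped piece of a BAR. */
struct bar_seg {
	uint64_t off;	/* offset from the start of the BAR */
	uint64_t len;	/* length in bytes; 0 means "no mapping" */
};

/*
 * Split a BAR of 'bar_len' bytes around an MSI-X table occupying
 * [tbl_off, tbl_off + tbl_len). 'before' and 'after' receive the two
 * regions that can be mapped directly; the gap between them is left
 * unmapped so that accesses to the table trap.
 * Assumes the table lies entirely within the BAR.
 */
void
split_bar(uint64_t bar_len, uint64_t tbl_off, uint64_t tbl_len,
    struct bar_seg *before, struct bar_seg *after)
{
	before->off = 0;
	before->len = tbl_off;
	after->off = tbl_off + tbl_len;
	after->len = bar_len - after->off;
}
```

A table at the very start or end of the BAR falls out as the special case where one of the two segments has length 0.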
* Fix a bug in the passthru implementation where it would assume that allneel2013-02-014-8/+34
| | | | | | | | | | | devices are MSI-X capable. This in turn would lead it to treat bar 0 as the MSI-X table bar even if the underlying device did not support MSI-X. Fix this by providing an API to query the MSI-X table index of the emulated device. If the underlying device does not support MSI-X then this API will return -1. Obtained from: NetApp
* Add support for MSI-X interrupts in the virtio network device and make thatneel2013-01-304-30/+348
| | | | | | | | | | | | the default. The current behavior of advertising a single MSI vector can be requested by setting the environment variable "BHYVE_USE_MSI" to "true". The use of MSI is not compliant with the virtio specification and will be eventually phased out. Submitted by: Gopakumar T Obtained from: NetApp
* Improve correctness of rtc register implementation.grehan2013-01-251-5/+15
| | | | Submitted by: tycho nightingale at pluribusnetworks com
* Use the correct type (uint64_t) to retrieve sysctl machdep.tsc_freq.neel2013-01-251-5/+2
| | | | | | | | Simplify the function a bit by falling through after initialization and return via the normal code path. Reviewed by: grehan Obtained from: NetApp
* Allocate the memory for the MSI-X table dynamically instead of allocating 32KBneel2013-01-212-15/+32
| | | | | | | | | | statically. In most cases the number of table entries will be far less than the maximum of 2048 allowed by the PCI specification. Reuse macros from pcireg.h to interpret the MSI-X capability instead of rolling our own. Obtained from: NetApp
* Get rid of redundant 'table_size' field in struct pi_msix. If needed it canneel2013-01-212-3/+1
| | | | | | always be calculated from the number of entries in the MSI-X table. Obtained from: NetApp
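A short sketch of the arithmetic behind the two MSI-X commits above: the table size field in the Message Control register holds the entry count minus one in bits 10:0, and each entry is 16 bytes, so the 2048-entry maximum works out to 32KB. The macro and function names here are local stand-ins, not the pcireg.h names.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the pcireg.h macros; names are illustrative. */
#define MSIX_CTRL_TABLE_SIZE_MASK	0x07ff	/* bits 10:0, encoded as N-1 */
#define MSIX_ENTRY_SIZE			16	/* bytes per table entry */

/* Number of table entries advertised by the Message Control register. */
int
msix_table_count(uint16_t msgctrl)
{
	return ((msgctrl & MSIX_CTRL_TABLE_SIZE_MASK) + 1);
}

/* Table size in bytes, derived from the entry count alone. */
size_t
msix_table_bytes(int count)
{
	return ((size_t)count * MSIX_ENTRY_SIZE);
}
```

Since the byte size follows directly from the count, keeping a separate 'table_size' field adds nothing.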
* Use <vmname> in a consistent manner in usage messages output by 'bhyve',neel2013-01-201-1/+2
| | | | | | 'bhyveload' and 'bhyvectl'. Pointed out by: joel@
* Don't completely drain the read file descriptor. Instead, onlygrehan2013-01-071-10/+34
| | | | | | | | | | | | fill up to the uart's rx fifo size, and leave any remaining input for when the rx fifo is read. This allows cut'n'paste of long lines to be done into the bhyve console without truncation. Also, introduce a mutex since the file input will run in the mevent thread context and may corrupt state accessed by a vCPU thread. Reviewed by: neel Approved by: NetApp
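A minimal sketch of the bounded read described above, assuming a 16-byte rx fifo (the depth of a 16550). The structure and function names are hypothetical, the buffer is treated as linear and not as a ring, and the mutex mentioned in the commit is omitted.

```c
#include <stddef.h>
#include <unistd.h>

#define FIFO_SZ	16	/* assumed fifo depth for this sketch */

struct rx_fifo {
	unsigned char buf[FIFO_SZ];
	size_t num;	/* bytes currently queued */
};

/*
 * Read from 'fd' only as many bytes as the fifo has room for. Whatever
 * is left stays queued in the descriptor until the fifo is drained.
 * Returns the number of bytes added, 0 if the fifo is full, or -1 on a
 * read error.
 */
ssize_t
rx_fifo_fill(struct rx_fifo *f, int fd)
{
	size_t room = FIFO_SZ - f->num;
	ssize_t n;

	if (room == 0)
		return (0);
	n = read(fd, f->buf + f->num, room);
	if (n > 0)
		f->num += (size_t)n;
	return (n);
}
```

With a long pasted line, the first call takes 16 bytes and the rest waits in the descriptor, so nothing is dropped.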
* Use 64-bit arithmetic throughout, and lock accesses to globals.grehan2013-01-071-6/+13
| | | | | | | | With this change, dbench with >= 4 processes runs without getting weird jumps forward in time when the ACPI pmtimer is the default timecounter. Obtained from: NetApp
* The "unrestricted guest" capability is a feature of Intel VT-x that allowsneel2013-01-042-80/+31
| | | | | | | | | | | | | | | the guest to execute real or unpaged protected mode code - bhyve relies on this feature to execute the AP bootstrap code. Get rid of the hack that allowed bhyve to support SMP guests on processors that do not have the "unrestricted guest" capability. This hack was entirely FreeBSD-specific and would not work with any other guest OS. Instead, limit the number of vcpus to 1 when executing on processors without "unrestricted guest" capability. Suggested by: grehan Obtained from: NetApp
* Change thread name for the main kqueue event loop to "<vmname> mevent" sogrehan2012-12-201-0/+13
| | | | | | | it can be easily distinguished from other non-vCPU threads in forthcoming changes. Obtained from: NetApp
* Rename fbsdrun.* -> bhyverun.*grehan2012-12-1312-11/+11
| | | | | | | | | bhyve is intended to be a generic hypervisor, and not FreeBSD-specific. (renaming internal routines will come later) Reviewed by: neel Obtained from: NetApp
* Properly reset the tx/rx rings when a guest requests a device reset.grehan2012-12-121-0/+19
| | | | Obtained from: NetApp
* Create unique MAC addresses for virtio devices that aregrehan2012-12-121-5/+6
| | | | | | | | created with non-zero PCI function numbers. Remove obsolete reference to CFE. Obtained from: NetApp
* Determine the correct length and sector size for raw devices.grehan2012-12-081-3/+22
| | | | | Obtained from: NetApp Tested by: Michael Dexter with iscsi LUNs
* - Add in an XSDT to stop acpidump from exiting with agrehan2012-11-301-8/+48
| 'XSDT corrupted' error
| - Fix up OEMID/OEM Table ID string padding in the DSDT.
|
| Output on a verbose boot now looks like
|
| ...
| ACPI: RSDP 0xf0400 00024 (v02 BHYVE )
| ACPI: XSDT 0xf0480 00034 (v01 BHYVE BVXSDT 00000001 INTL 20120320)
| ACPI: APIC 0xf0500 0004A (v01 BHYVE BVMADT 00000001 INTL 20120320)
| ACPI: FACP 0xf0600 0010C (v05 BHYVE BVFACP 00000001 INTL 20120320)
| ACPI: DSDT 0xf0800 000F2 (v02 BHYVE BVDSDT 00000001 INTL 20120320)
| ACPI: FACS 0xf0780 00040
| ...
|
| Obtained from: NetApp
* Cleanup the user-space paging exit handler now that the unified instructionneel2012-11-283-6/+3
| | | | | | emulation is in place. Obtained from: NetApp
* Revamp the x86 instruction emulation in bhyve.neel2012-11-288-697/+44
|
| On a nested page table fault the hypervisor will:
| - fetch the instruction using the guest %rip and %cr3
| - decode the instruction in 'struct vie'
| - emulate the instruction in host kernel context for local apic accesses
| - any other type of mmio access is punted up to user-space (e.g. ioapic)
|
| The decoded instruction is passed as collateral to the user-space process
| that is handling the PAGING exit.
|
| The emulation code is fleshed out to include more addressing modes (e.g. SIB)
| and more types of operands (e.g. imm8). The source code is unified into a
| single file (vmm_instruction_emul.c) that is compiled into vmm.ko as well
| as /usr/sbin/bhyve.
|
| Reviewed by: grehan
| Obtained from: NetApp
* MSI-X does not need to be enabled in the message control register for theneel2012-11-221-2/+2
| | | | | | guest to access the MSI-X tables. Obtained from: NetApp
* Mask the %eax register properly based on whether the "out" instruction isneel2012-11-211-0/+16
| | | | | | | | operating on 1, 2 or 4 bytes. There could be garbage in the unused bytes so zero them off. Obtained from: NetApp
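The masking described above amounts to keeping only the low 1, 2 or 4 bytes of %eax. A small sketch, with a hypothetical helper name:

```c
#include <stdint.h>

/*
 * Return the value an "out" instruction of 'bytes' width actually
 * writes, discarding whatever is in the unused upper bytes of %eax.
 */
uint32_t
out_value(uint32_t eax, int bytes)
{
	switch (bytes) {
	case 1:
		return (eax & 0xff);
	case 2:
		return (eax & 0xffff);
	default:	/* 4-byte access uses the whole register */
		return (eax);
	}
}
```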
* ACPI support for bhyve.grehan2012-11-205-5/+958
| | | | | | | | | | | | | | | | | The -A option will create the minimal set of required ACPI tables in guest memory. Since ACPI mandates an IOAPIC, the -I option must also be used. Template ASL files are created, and then passed to the iasl compiler to generate AML files. These are then loaded into guest physical mem. In support of this, the ACPI PM timer is implemented, in 32-bit mode. Tested on 7.4/8.*/9.*/10-CURRENT. Reviewed by: neel Obtained from: NetApp Discussed with: jhb (a long while back)
* IFC @ r242684neel2012-11-111-1/+1
|
* Change the thread name of the vCPU threads to contain thegrehan2012-10-311-1/+8
| | | | | | | | name of the VM and the vCPU number. This helps hugely when using top -H to identify what a VM is doing. Reviewed by: neel Obtained from: NetApp
* Exit if the requested num vCPUs exceeds the maximum rathergrehan2012-10-311-16/+27
| | | | | | | | | than waiting until AP bringup detects an out-of-range vCPU. While here, fix all error output to use fprintf(stderr, ... Reviewed by: neel Reported by: @allanjude
* Present the bvm dbgport to the guest only when explicitly requested vianeel2012-10-271-27/+40
| | | | | | | the "-g" command line option. Suggested by: grehan Obtained from: NetApp
* Present the bvm console device to the guest only when explicitly requested vianeel2012-10-273-3/+25
| | | | | | | the "-b" command line option. Reviewed by: grehan Obtained from: NetApp
* Ignore PCI configuration accesses to all bus numbers other than PCI bus 0.neel2012-10-271-1/+5
| | | | Obtained from: NetApp
* Remove mptable generation code from libvmmapi and move it to bhyve.grehan2012-10-267-219/+437
| | | | | | | | | | | | Firmware tables require too much knowledge of system configuration, and it's difficult to pass that information in general terms to a library. The upcoming ACPI work exposed this - it will also live in bhyve. Also, remove code specific to NetApp from the mptable name, and remove the -n option from bhyve. Reviewed by: neel Obtained from: NetApp
* Rework how guest MMIO regions are dealt with.grehan2012-10-1913-374/+850
|
| - New memory region interface. An RB tree holds the regions, with a
|   last-found per-vCPU cache to deal with the common case of repeated
|   guest accesses to MMIO registers in the same page.
|
| - Support memory-mapped BARs in PCI emulation.
|
| mem.c/h - memory region interface
|
| instruction_emul.c/h - remove old region interface.
|   Use gpa from EPT exit to avoid a tablewalk to determine operand address.
|   Determine operand size and use when calling through to region handler.
|
| fbsdrun.c - call into region interface on paging exit.
|   Distinguish between instruction emul error and region not found
|
| pci_emul.c/h - implement new BAR callback api.
|   Split BAR alloc routine into routines that require/don't require
|   the BAR phys address.
|
| ioapic.c
| pci_passthru.c
| pci_virtio_block.c
| pci_virtio_net.c
| pci_uart.c - update to new BAR callback i/f
|
| Reviewed by: neel
| Obtained from: NetApp
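A sketch of the last-found per-vCPU cache idea from the commit above. To keep it short, a linear scan over an array stands in for the RB tree search; all names and the `MAX_VCPUS` limit are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_VCPUS	8	/* illustrative limit */

struct mem_range {
	uint64_t base;
	uint64_t size;
};

/* Last range that satisfied a lookup, one slot per vCPU. */
static const struct mem_range *last_hit[MAX_VCPUS];

const struct mem_range *
lookup_range(int vcpu, uint64_t gpa, const struct mem_range *tbl, size_t n)
{
	const struct mem_range *r = last_hit[vcpu];
	size_t i;

	/* Fast path: repeated access to the same region. */
	if (r != NULL && gpa >= r->base && gpa - r->base < r->size)
		return (r);

	/* Slow path: a linear scan stands in for the RB tree search. */
	for (i = 0; i < n; i++) {
		r = &tbl[i];
		if (gpa >= r->base && gpa - r->base < r->size) {
			last_hit[vcpu] = r;
			return (r);
		}
	}
	return (NULL);
}
```

Keeping one slot per vCPU means the fast path needs no locking, since each vCPU thread only touches its own entry.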
* Deal with transient EBUSY error return from vm_run() by retrying the operation.neel2012-10-121-2/+13
|
* Add an option "-a" to present the local apic in the XAPIC mode instead of theneel2012-09-263-3/+31
| | | | default X2APIC mode to the guest.
* Add an explicit exit code 'SPINUP_AP' to tell the controlling process that anneel2012-09-255-219/+240
| | | | | | | | | | AP needs to be activated by spinning up an execution context for it. The local apic emulation is now completely done in the hypervisor and it will detect writes to the ICR_LO register that try to bring up the AP. In response to such writes it will return to userspace with an exit code of SPINUP_AP. Reviewed by: grehan
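A sketch of how a write to ICR_LO can be recognised as a request to start an AP. In the local APIC's interrupt command register, bits 10:8 hold the delivery mode (6 means Start-up) and bits 7:0 hold the vector, which gives the AP's start address as vector << 12. The function and macro names are hypothetical, and the actual vmm.ko logic may check more than this (destination, AP state).

```c
#include <stdbool.h>
#include <stdint.h>

#define ICR_VECTOR_MASK		0x000000ffu
#define ICR_DELMODE_MASK	0x00000700u
#define ICR_DELMODE_STARTUP	0x00000600u

/*
 * Return true if 'icr_lo' encodes a Start-up IPI and, if so, store the
 * real-mode start address of the AP in '*start_addr'.
 */
bool
icr_is_startup_ipi(uint32_t icr_lo, uint64_t *start_addr)
{
	if ((icr_lo & ICR_DELMODE_MASK) != ICR_DELMODE_STARTUP)
		return (false);
	*start_addr = (uint64_t)(icr_lo & ICR_VECTOR_MASK) << 12;
	return (true);
}
```

When this returns true, the hypervisor would return to userspace with the SPINUP_AP exit code so that the controlling process can create an execution context for the AP.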
* Fix a bug in how a 64-bit bar in a pci passthru device would be presented toneel2012-08-061-1/+6
| | | | | | | | | | the guest. Prior to the fix it was possible for such a bar to appear as a 32-bit bar as long as it was allocated from the region below 4GB. This had the potential to confuse some drivers that were particular about the size of the bars. Obtained from: NetApp
* Add support for emulating PCI multi-function devices.neel2012-08-061-54/+146
| | | | | | | | | The function number is specified by an optional [:<func>] after the slot number: -s 1:0,virtio-net,tap0 Ditto for the mptable naming: -n 1:0,e0a Obtained from: NetApp
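A minimal sketch of parsing the `<slot>[:<func>]` prefix shown above. `parse_slot()` is a hypothetical helper, not the bhyve parser; the limits of 32 slots and 8 functions are the standard PCI limits.

```c
#include <stdio.h>

/*
 * Parse the leading "<slot>[:<func>]" of a -s argument such as
 * "1:0,virtio-net,tap0". The function number defaults to 0 when it is
 * omitted. Returns 0 on success and -1 on a malformed or out-of-range
 * value.
 */
int
parse_slot(const char *opt, int *slot, int *func)
{
	int n;

	*func = 0;
	n = sscanf(opt, "%d:%d", slot, func);
	if (n < 1)
		return (-1);
	if (*slot < 0 || *slot > 31 || *func < 0 || *func > 7)
		return (-1);
	return (0);
}
```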
* Device model for ioapic emulation.neel2012-08-055-3/+347
| | | | | | With this change the uart emulation is entirely interrupt driven. Obtained from: NetApp
* The displacement field in the decoded instruction should be treated as an 8-bitneel2012-08-041-20/+14
| | | | | | | | | | | or 32-bit signed integer. Simplify the handling of indirect addressing with displacement by unconditionally adding the 'instruction->disp' to the target address. This is alright since 'instruction->disp' is non-zero only for the addressing modes that specify a displacement. Obtained from: NetApp
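A sketch of the sign extension and the unconditional add described above. The helper names are hypothetical, and the displacement width is passed in explicitly here, which may differ from how the decoder tracks it.

```c
#include <stdint.h>

/*
 * Sign-extend a raw displacement of 'disp_bytes' (1 or 4) to 64 bits.
 * Addressing modes without a displacement yield 0.
 */
int64_t
disp_value(uint32_t raw, int disp_bytes)
{
	switch (disp_bytes) {
	case 1:
		return ((int8_t)raw);
	case 4:
		return ((int32_t)raw);
	default:
		return (0);
	}
}

/* The displacement can be added unconditionally because it is 0 if unused. */
uint64_t
effective_addr(uint64_t base, int64_t disp)
{
	return (base + (uint64_t)disp);
}
```

Treating the field as signed matters for negative offsets: a one-byte displacement of 0xf0 means -16, not +240.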
* Add the "-I" option to control whether or not an ioapic is visible to the guest.neel2012-08-041-5/+10
| | | | Obtained from: NetApp
* Use the correct variable to index into the 'lirq[]' array to check the legacyneel2012-08-041-1/+1
| | | | IRQ ownership.
* Check that 'opts' is actually not NULL before dereferencing it. It is expectedneel2012-08-041-1/+1
| | | | that 'opts' will be NULL for the second serial port (-S <slot>,uart)
* Add 16550 uart emulation as a PCI device. This allows it togrehan2012-05-035-27/+725
| | | | | | | | | | | | | | | | | | | | | | | | be activated as part of the slot config options. The syntax is: -s <slotnum>,uart[,stdio] The stdio parameter instructs the code to perform i/o using stdin/stdout. It can only be used for one instance. To allow legacy i/o ports/irqs to be used, a new variant of the slot command, -S, is introduced. When used to specify a slot, the device will use legacy resources if it supports them; otherwise it will be treated the same as the '-s' option. Specifying the -S option with the uart will first use the 0x3f8/irq 4 config, and the second -S will use 0x2F8/irq 3. Interrupt delivery is awaiting the arrival of the i/o apic code, but this works fine in uart(4)'s polled mode. This code was written by Cynthia Lu @ MIT while an intern at NetApp, with further work from neel@ and grehan@. Obtained from: NetApp
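A sketch of the legacy resource assignment described above: the first -S uart gets 0x3f8/irq 4 and the second gets 0x2F8/irq 3. The table, the structure and the allocator name are hypothetical, and returning NULL once both are taken is an assumption about how exhaustion would be signalled.

```c
#include <stddef.h>

struct legacy_res {
	int iobase;
	int irq;
};

/* Legacy COM port resources, in the order they are handed out. */
static const struct legacy_res uart_legacy[] = {
	{ 0x3f8, 4 },	/* first -S uart */
	{ 0x2f8, 3 },	/* second -S uart */
};

/*
 * Hand out the next unused legacy resource pair, advancing '*next'.
 * Returns NULL once both pairs have been assigned.
 */
const struct legacy_res *
uart_legacy_alloc(size_t *next)
{
	if (*next >= sizeof(uart_legacy) / sizeof(uart_legacy[0]))
		return (NULL);
	return (&uart_legacy[(*next)++]);
}
```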
* MSI-x interrupt support for PCI pass-thru devices.grehan2012-04-287-20/+953
| | | | | | | | | | Includes instruction emulation for memory r/w access. This opens the door for io-apic, local apic, hpet timer, and legacy device emulation. Submitted by: ryan dot berryhill at sandvine dot com Reviewed by: grehan Obtained from: Sandvine