path: root/sys/amd64
Commit message  Author  Age  Files  Lines
* Add svn properties to the recently merged bhyve source files.  neel  2013-01-20  2  -2/+2
| The pre-commit hook will not allow any commits without the svn:keywords property in head.
* Merge projects/bhyve to head.  neel  2013-01-19  46  -0/+12243
|\
| | 'bhyve' was developed by grehan@ and myself at NetApp (thanks!). Special thanks to Peter Snyder, Joe Caradonna and Michael Dexter for their support and encouragement. Obtained from: NetApp
| * IFC @ r245509  neel  2013-01-17  2  -0/+17
| |\
| * | IFC @ r245205  neel  2013-01-09  1  -4/+3
| | |
| * | IFC @ r245178  neel  2013-01-09  1  -9/+1
| |\ \
| * | | Revert changes for x2apic support from projects/bhyve.  neel  2013-01-06  3  -40/+18
| | | | During the early days of bhyve it did not support instruction emulation which necessitated the use of x2apic to access the local apic. This is no longer the case and the dependency on x2apic has gone away. The x2apic patches can be considered independently of bhyve and will be merged into head via projects/x2apic. Discussed with: grehan
| * | | bhyve does not require a custom configuration file anymore so make the GENERIC  neel  2013-01-05  1  -6/+1
| | | | identical to the one in HEAD. Obtained from: NetApp
| * | | IFC @ r244983.  neel  2013-01-04  2  -9/+48
| |\ \ \
| * | | | There is no need for a special 'BHYVE' kernel configuration file anymore -  neel  2013-01-04  1  -345/+0
| | | | | 'GENERIC' works fine. Obtained from: NetApp
| * | | | There is no need for 'start_emulating()' and 'stop_emulating()' to be defined  neel  2013-01-04  2  -19/+2
| | | | | in <machine/cpufunc.h> so remove them from there. Obtained from: NetApp
| * | | | The "unrestricted guest" capability is a feature of Intel VT-x that allows  neel  2013-01-04  1  -43/+0
| | | | | the guest to execute real or unpaged protected mode code - bhyve relies on this feature to execute the AP bootstrap code. Get rid of the hack that allowed bhyve to support SMP guests on processors that do not have the "unrestricted guest" capability. This hack was entirely FreeBSD-specific and would not work with any other guest OS. Instead, limit the number of vcpus to 1 when executing on processors without "unrestricted guest" capability. Suggested by: grehan Obtained from: NetApp
| * | | | Modify the default behavior of bhyve such that it no longer forces the use of  neel  2012-12-16  1  -3/+1
| | | | | x2apic mode on the guest. The guest can decide whether or not it wants to use legacy mmio or x2apic access to the APIC by writing to the MSR_APICBASE register. Obtained from: NetApp
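The mode selection described in the commit above can be illustrated with the architectural layout of the APIC base MSR (IA32_APIC_BASE, MSR 0x1B), where bit 11 is the APIC global enable and bit 10 enables x2APIC mode. This is a minimal sketch, not code from the bhyve sources; the enum, macro, and function names are hypothetical.

```c
#include <stdint.h>

/* Architectural bits of the IA32_APIC_BASE MSR (MSR 0x1B). */
#define SKETCH_APICBASE_X2APIC   (1ULL << 10)  /* x2APIC mode enable */
#define SKETCH_APICBASE_ENABLED  (1ULL << 11)  /* APIC global enable */

/* Hypothetical names used only for this illustration. */
enum apic_mode {
	APIC_DISABLED,
	APIC_XAPIC_MMIO,	/* legacy memory-mapped access */
	APIC_X2APIC		/* MSR-based access */
};

/*
 * Classify the access mode a guest has selected from the value it
 * wrote to the APIC base MSR. A value with the x2APIC bit set but the
 * global enable bit clear is architecturally invalid; this sketch
 * simply reports it as disabled.
 */
static enum apic_mode
apic_mode_from_apicbase(uint64_t apicbase)
{
	if ((apicbase & SKETCH_APICBASE_ENABLED) == 0)
		return (APIC_DISABLED);
	if ((apicbase & SKETCH_APICBASE_X2APIC) != 0)
		return (APIC_X2APIC);
	return (APIC_XAPIC_MMIO);
}
```

A hypervisor that leaves the choice to the guest would evaluate something like this on each intercepted write to the MSR rather than fixing the mode at vcpu creation.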
| * | | | Prefer x2apic mode when running inside a virtual machine.  neel  2012-12-16  2  -0/+3
| | | | | Provide a tunable 'machdep.x2apic_desired' to let the administrator override the default behavior. Provide a read-only sysctl 'machdep.x2apic' to let the administrator know whether the kernel is using x2apic or legacy mmio to access local apic. Tested with Parallels Desktop 8 and bhyve hypervisors. Also tested running on bare metal Intel Xeon E5-2658. Obtained from: NetApp Discussed with: jhb, attilio, avg, grehan
| * | | | IFC @r243836  neel  2012-12-04  2  -25/+31
| |\ \ \ \
| * | | | | Properly screen for the AND 0x81 instruction from the set  grehan  2012-11-30  1  -0/+7
| | | | | | of group1 0x81 instructions that use the reg bits as an extended opcode. Still todo: properly update rflags. Pointed out by: jilles@
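The screening mentioned above relies on how x86 encodes "group 1" instructions: opcode 0x81 is shared by eight operations, and bits 5:3 of the ModRM byte (the reg field) select which one, with the value 4 meaning AND. The following is a minimal sketch of that check; the function names are hypothetical and are not taken from vmm_instruction_emul.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract bits 5:3 (the "reg" field) of a ModRM byte. */
static inline uint8_t
modrm_reg(uint8_t modrm)
{
	return ((modrm >> 3) & 0x7);
}

/*
 * Opcode 0x81 is a group 1 opcode. ModRM.reg acts as an extended
 * opcode: 0=ADD, 1=OR, 2=ADC, 3=SBB, 4=AND, 5=SUB, 6=XOR, 7=CMP.
 * Only the AND form is accepted here; everything else is rejected.
 */
static bool
is_group1_and(uint8_t opcode, uint8_t modrm)
{
	return (opcode == 0x81 && modrm_reg(modrm) == 4);
}
```

Without the reg-field test, an emulator that only matched on the opcode byte would treat CMP or SUB as if they were AND.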
| * | | | | Remove debug printf.  grehan  2012-11-29  1  -1/+0
| | | | | | Pointed out by: emaste
| * | | | | Add support for the 0x81 AND instruction, now generated  grehan  2012-11-29  2  -4/+34
| | | | | | by clang in the local APIC code. 0x81 is a read-modify-write instruction - the EPT check that only allowed read or write and not both has been relaxed to allow read and write. Reviewed by: neel Obtained from: NetApp
| * | | | | Cleanup the user-space paging exit handler now that the unified instruction  neel  2012-11-28  2  -4/+0
| | | | | | emulation is in place. Obtained from: NetApp
| * | | | | Change emulate_rdmsr() and emulate_wrmsr() to return 0 on success and errno on  neel  2012-11-28  3  -65/+37
| | | | | | failure. The conversion from the return value to HANDLED or UNHANDLED can be done locally in vmx_exit_process(). Obtained from: NetApp
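The convention described above (emulation routines return 0 or an errno value, and only the exit handler maps that to HANDLED or UNHANDLED) can be sketched as follows. The numeric values of HANDLED and UNHANDLED, the stand-in emulation routine, and the MSR it accepts are all assumptions made for illustration; the real emulate_rdmsr() signature is not shown in this log.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed values for this sketch only. */
#define UNHANDLED 0
#define HANDLED   1

/*
 * Hypothetical stand-in for an MSR read emulation routine:
 * returns 0 on success and an errno value on failure.
 */
static int
emulate_rdmsr_sketch(uint32_t msr, uint64_t *val)
{
	if (msr == 0x1b) {		/* APIC base MSR, as an example */
		*val = 0xfee00900ULL;
		return (0);
	}
	return (EINVAL);
}

/*
 * The exit handler performs the errno-to-HANDLED conversion locally,
 * so the emulation routine stays independent of the exit-processing
 * vocabulary.
 */
static int
handle_rdmsr_exit_sketch(uint32_t msr, uint64_t *val)
{
	int error;

	error = emulate_rdmsr_sketch(msr, val);
	return (error == 0 ? HANDLED : UNHANDLED);
}
```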
| * | | | | Revamp the x86 instruction emulation in bhyve.  neel  2012-11-28  8  -218/+605
| | | | | | On a nested page table fault the hypervisor will:
| | | | | | - fetch the instruction using the guest %rip and %cr3
| | | | | | - decode the instruction in 'struct vie'
| | | | | | - emulate the instruction in host kernel context for local apic accesses
| | | | | | - any other type of mmio access is punted up to user-space (e.g. ioapic)
| | | | | | The decoded instruction is passed as collateral to the user-space process that is handling the PAGING exit. The emulation code is fleshed out to include more addressing modes (e.g. SIB) and more types of operands (e.g. imm8). The source code is unified into a single file (vmm_instruction_emul.c) that is compiled into vmm.ko as well as /usr/sbin/bhyve. Reviewed by: grehan Obtained from: NetApp
| * | | | | Fix a bug in the MSI-X resource allocation for PCI passthrough devices.  neel  2012-11-22  1  -37/+26
| | | | | | In the case where the underlying host had disabled MSI-X via the "hw.pci.enable_msix" tunable, the ppt_setup_msix() function would fail and return an error without properly cleaning up. This in turn would cause a page fault on the next boot of the guest. Fix this by calling ppt_teardown_msix() in all the error return paths. Obtained from: NetApp
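The shape of this fix, calling the teardown routine on every error return so that no half-initialized state survives, can be sketched in isolation. Everything below is hypothetical: the structure, the function names, and the use of a boolean to model the disabled tunable are illustrative and do not reflect the actual ppt_setup_msix() code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-device MSI-X bookkeeping. */
struct msix_state_sketch {
	int	*vectors;
	int	 nvec;
};

/* Release everything setup may have allocated; safe to call twice. */
static void
teardown_msix_sketch(struct msix_state_sketch *s)
{
	free(s->vectors);
	s->vectors = NULL;
	s->nvec = 0;
}

/*
 * Allocate bookkeeping first, then attempt the step that can fail.
 * On failure the teardown routine runs before returning, so the
 * caller never sees stale pointers from an aborted setup.
 */
static int
setup_msix_sketch(struct msix_state_sketch *s, int nvec, bool host_msix_enabled)
{
	s->vectors = calloc((size_t)nvec, sizeof(int));
	if (s->vectors == NULL)
		return (ENOMEM);
	s->nvec = nvec;

	if (!host_msix_enabled) {
		teardown_msix_sketch(s);	/* the cleanup the fix adds */
		return (ENXIO);
	}
	return (0);
}
```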
| * | | | | Get rid of redundant comparison which is guaranteed to be "true" for unsigned  neel  2012-11-22  1  -1/+1
| | | | | | integers. Obtained from: NetApp
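The log does not say which comparison was removed, so the following is only a generic sketch of the pattern: for an unsigned operand a test of the form "x >= 0" can never be false, which leaves the upper bound as the only meaningful check. The function name and parameters are hypothetical.

```c
#include <stdbool.h>

/*
 * Range check on an unsigned index. A lower-bound test such as
 * "idx >= 0" would always be true for an unsigned type, so only the
 * upper bound is tested. A negative value converted to unsigned wraps
 * to a large number and is rejected by the same test.
 */
static bool
index_in_range(unsigned int idx, unsigned int limit)
{
	return (idx < limit);
}
```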
| * | | | | Handle CPUID leaf 0x7 now that FreeBSD is using it.  grehan  2012-11-20  2  -0/+2
| | | | | | Return 0's for now. Reviewed by: neel Obtained from: NetApp
| * | | | | IFC @ r243164  neel  2012-11-17  2  -6/+3
| |\ \ \ \ \
| * \ \ \ \ \ IFC @ r242940  neel  2012-11-13  1  -1/+15
| |\ \ \ \ \ \
| * \ \ \ \ \ \ IFC @ r242684  neel  2012-11-11  27  -1134/+458
| |\ \ \ \ \ \ \
| * | | | | | | | Fix issue found with clang build. Avoid code insertion by the compiler  grehan  2012-11-06  1  -29/+48
| | | | | | | | | between inline asm statements that would in turn modify the flags value set by the first asm, and used by the second. Solve by making the common error block a string that can be pulled into the first inline asm, and using symbolic labels for asm variables. bhyve can now build/run fine when compiled with clang. Reviewed by: neel Obtained from: NetApp
| * | | | | | | | Convert VMCS_ENTRY_INTR_INFO field into a vmcs identifier before passing it  neel  2012-10-29  1  -1/+1
| | | | | | | | | to vmcs_getreg(). Without this conversion vmcs_getreg() will return EINVAL. In particular this prevented injection of the breakpoint exception into the guest via the "-B" option to /usr/sbin/bhyve which is hugely useful when debugging guest hangs. This was broken in r241921. Pointy hat: me Obtained from: NetApp
| * | | | | | | | Corral all the host state associated with the virtual machine into its own file.  neel  2012-10-29  5  -24/+218
| | | | | | | | | This state is independent of the type of hardware assist used so there is really no need for it to be in Intel-specific code. Obtained from: NetApp
| * | | | | | | | Set the valid field of the newly allocated field as all other  grehan  2012-10-26  1  -0/+1
| | | | | | | | | vm page allocators do. This fixes a panic when a virtio block device is mounted as root, with the host system dying in vm_page_dirty with invalid bits. Reviewed by: neel Obtained from: NetApp
| * | | | | | | | Unconditionally enable fpu emulation by setting CR0.TS in the host after the  neel  2012-10-26  2  -1/+22
| | | | | | | | | guest does a vm exit. This allows us to trap any fpu access in the host context while the fpu still has "dirty" state belonging to the guest. Reported by: "s vas" on freebsd-virtualization@ Obtained from: NetApp
| * | | | | | | | If the guest vcpu wants to idle then use that opportunity to relinquish the  neel  2012-10-25  2  -23/+95
| | | | | | | | | host cpu to the scheduler until the guest is ready to run again. This implies that the host cpu utilization will now closely mirror the actual load imposed by the guest vcpu. Also, the vcpu mutex now needs to be of type MTX_SPIN since we need to acquire it inside a critical section. Obtained from: NetApp
| * | | | | | | | Hide the monitor/mwait instruction capability from the guest until we know how  neel  2012-10-25  1  -0/+6
| | | | | | | | | to properly intercept it. Obtained from: NetApp
| * | | | | | | | Maintain state regarding NMI delivery to guest vcpu in VT-x independent manner.  neel  2012-10-24  5  -34/+45
| | | | | | | | | Also add a stats counter to count the number of NMIs delivered per vcpu. Obtained from: NetApp
| * | | | | | | | Test for AST pending with interrupts disabled right before entering the guest.  neel  2012-10-23  4  -28/+86
| | | | | | | | | If an IPI was delivered to this cpu before interrupts were disabled then return right away via vmx_setjmp() with a return value of VMX_RETURN_AST. Obtained from: NetApp
| * | | | | | | | Calculate the number of host ticks until the next guest timer interrupt.  neel  2012-10-20  4  -56/+65
| | | | | | | | | This information will be used in conjunction with guest "HLT exiting" to yield the thread hosting the virtual cpu. Obtained from: NetApp
| * | | | | | | | Add the guest physical address and r/w/x bits to  grehan  2012-10-12  2  -0/+4
| | | | | | | | | the paging exit in preparation for a rework of bhyve MMIO handling. Reviewed by: neel Obtained from: NetApp
| * | | | | | | | Provide per-vcpu locks instead of relying on a single big lock.  neel  2012-10-12  7  -76/+134
| | | | | | | | | This also gets rid of all the witness.watch warnings related to calling malloc(M_WAITOK) while holding a mutex. Reviewed by: grehan
| * | | | | | | | Fix warnings generated by 'debug.witness.watch' during VM creation and  neel  2012-10-11  3  -39/+59
| | | | | | | | | destruction for calling malloc() with M_WAITOK while holding a mutex. Do not allow vmm.ko to be unloaded until all virtual machines are destroyed.
| * | | | | | | | Deliver the MSI to the correct guest virtual cpu.  neel  2012-10-11  1  -4/+1
| | | | | | | | | Prior to this change the MSI was being delivered unconditionally to vcpu 0 regardless of how the guest programmed the MSI delivery.
| * | | | | | | | Allocate memory pages for the guest from the host's free page queue.  neel  2012-10-08  8  -374/+193
| | | | | | | | | It is no longer necessary to hard-partition the memory between the host and guests at boot time.
| * | | | | | | | Change vm_malloc() to map pages in the guest physical address space in 4KB  neel  2012-10-04  5  -20/+53
| | | | | | | | | chunks. This breaks the assumption that the entire memory segment is contiguously allocated in the host physical address space. This also paves the way to satisfy the 4KB page allocations by requesting free pages from the VM subsystem as opposed to hard-partitioning host memory at boot time.
| * | | | | | | | Get rid of assumptions in the hypervisor that the host physical memory  neel  2012-10-03  2  -2/+8
| | | | | | | | | associated with guest physical memory is contiguous. Add check to vm_gpa2hpa() that the range indicated by [gpa,gpa+len) is all contained within a single 4KB page.
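The single-page containment check described above comes down to comparing the requested length against the space left in the page after the starting offset. The sketch below shows one way to write it without risking overflow in gpa+len; the macro and function names are hypothetical and this is not the actual vm_gpa2hpa() code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1)

/*
 * Return true if the half-open range [gpa, gpa + len) lies entirely
 * within one 4KB page. The offset within the page is at most 4095, so
 * the subtraction cannot underflow, and gpa + len is never computed.
 */
static bool
range_within_one_page(uint64_t gpa, size_t len)
{
	uint64_t offset = gpa & SKETCH_PAGE_MASK;

	return (len <= SKETCH_PAGE_SIZE - offset);
}
```

Once guest memory is no longer host-contiguous, a translation result is only valid up to the end of the page it was looked up for, which is why a range that crosses a page boundary has to be rejected or split by the caller.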
| * | | | | | | | Get rid of assumptions in the hypervisor that the host physical memory  neel  2012-10-03  6  -45/+134
| | | | | | | | | associated with guest physical memory is contiguous. Rewrite vm_gpa2hpa() to get the GPA to HPA mapping by querying the nested page tables.
| * | | | | | | | Get rid of assumptions in the hypervisor that the host physical memory  neel  2012-09-29  3  -11/+53
| | | | | | | | | associated with guest physical memory is contiguous. In this case vm_malloc() was using vm_gpa2hpa() to indirectly infer whether or not the address range had already been allocated. Replace this instead with an explicit API 'vm_gpa_available()' that returns TRUE if a page is available for allocation in guest physical address space.
| * | | | | | | | Intel VT-x provides the length of the instruction at the time of the nested  neel  2012-09-27  5  -35/+44
| | | | | | | | | page table fault. Use this when fetching the instruction bytes from the guest memory. Also modify the lapic_mmio() API so that a decoded instruction is fed into it instead of having it fetch the instruction bytes from the guest. This is useful for hardware assists like SVM that provide the faulting instruction as part of the vmexit.
| * | | | | | | | Add an option "-a" to present the local apic in the XAPIC mode instead of the  neel  2012-09-26  3  -10/+19
| | | | | | | | | default X2APIC mode to the guest.
| * | | | | | | | Add support for trapping MMIO writes to local apic registers and emulating them.  neel  2012-09-25  10  -25/+676
| | | | | | | | | The default behavior is still to present the local apic to the guest in the x2apic mode.
| * | | | | | | | Add ioctls to control the X2APIC capability exposed by the virtual machine to  neel  2012-09-25  4  -0/+61
| | | | | | | | | the guest. At the moment this simply sets the state in the 'vcpu' instance but there is no code that acts upon these settings.
| * | | | | | | | Add an explicit exit code 'SPINUP_AP' to tell the controlling process that an  neel  2012-09-25  3  -5/+67
| | | | | | | | | AP needs to be activated by spinning up an execution context for it. The local apic emulation is now completely done in the hypervisor and it will detect writes to the ICR_LO register that try to bring up the AP. In response to such writes it will return to userspace with an exit code of SPINUP_AP. Reviewed by: grehan