path: root/sys/amd64
Commit message [Author, Age, Files, Lines]
* MFcalloutng: [davide, 2013-02-28, 1 file, -9/+10]
| When the CPU becomes idle, cpu_idleclock() calculates the time to the next timer event in order to reprogram the hw timer. Return that time in sbintime_t to the caller and pass it to acpi_cpu_idle(), where it can be used as one more factor (quite precise) to estimate further sleep time and choose the optimal sleep state. This is a preparatory change for further callout improvements that will be committed in the next days. The commit is not targeted for MFC.
* Merge from vmobj-rwlock: [attilio, 2013-02-27, 1 file, -3/+2]
| VM_OBJECT_LOCKED() macro is only used to implement a custom version of lock assertions right now (which likely spread out thanks to copy and paste). Remove it and implement actual assertions.
| Sponsored by: EMC / Isilon storage division
| Reviewed by: alc
| Tested by: pho
* Convert machine/elf.h, machine/frame.h, machine/sigframe.h, [kib, 2013-02-20, 5 files, -454/+15]
| machine/signal.h and machine/ucontext.h into common x86 includes, copying from amd64 and merging with i386. Kernel-only compat definitions are kept in the i386/include/sigframe.h and i386/include/signal.h, to reduce amd64 kernel namespace pollution. The amd64 compat uses its own definitions so far. The _MACHINE_ELF_WANT_32BIT definition is to allow the sys/boot/userboot/userboot/elf32_freebsd.c to use i386 ELF definitions on the amd64 compile host. The same hack could be usefully abused by other code too.
* Consistently use round_page(x) rather than roundup(x, PAGE_SIZE). There is [jkim, 2013-02-15, 2 files, -5/+5]
| no functional change.
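The claim of "no functional change" can be checked with a small sketch. The macros below are illustrative copies written for this example (the demo_ and DEMO_ names are not from the commit), and a 4 KiB page size is assumed.

```c
/*
 * Illustrative stand-ins for the kernel macros. The names are prefixed
 * to avoid clashing with system headers; 4 KiB pages are an assumption.
 */
#define DEMO_PAGE_SIZE	4096UL
#define DEMO_PAGE_MASK	(DEMO_PAGE_SIZE - 1)

/* Generic form: round x up to the next multiple of y. */
#define demo_roundup(x, y)	((((x) + ((y) - 1)) / (y)) * (y))

/* Page-specific form: add the mask, then clear the low bits. */
#define demo_round_page(x)	(((unsigned long)(x) + DEMO_PAGE_MASK) & ~DEMO_PAGE_MASK)

/* Returns 1 when both spellings agree for the given byte count. */
static int
demo_same_rounding(unsigned long x)
{
	return (demo_roundup(x, DEMO_PAGE_SIZE) == demo_round_page(x));
}
```

Because the page size is a power of two, the divide-and-multiply form and the mask form yield the same value; the page-specific spelling mainly states the intent more directly.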
* Print slightly more useful information on the 'bad pte' panic. [kib, 2013-02-14, 1 file, -2/+4]
| No objections from: alc
| MFC after: 1 week
* Assert that user address is never qremoved. [kib, 2013-02-14, 1 file, -0/+1]
| No objections from: alc
| MFC after: 1 week
* Requests for invalid CPUID leaves should map to the highest known leaf instead. [neel, 2013-02-13, 1 file, -2/+2]
| Reviewed by: grehan
| Obtained from: NetApp
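One plausible reading of this mapping is sketched below. The function name and the split between basic and extended leaves are assumptions made for illustration; the commit itself is not shown on this page and may differ in detail.

```c
#include <stdint.h>

/*
 * Hypothetical helper: clamp a requested CPUID leaf to the highest
 * known one. Extended leaves (0x80000000 and up) are clamped against
 * max_extended, all others against max_basic.
 */
static uint32_t
demo_clamp_cpuid_leaf(uint32_t leaf, uint32_t max_basic, uint32_t max_extended)
{
	if (max_extended != 0 && leaf >= 0x80000000u) {
		if (leaf > max_extended)
			leaf = max_extended;
	} else if (leaf > max_basic) {
		leaf = max_basic;
	}
	return (leaf);
}
```

A caller would then execute or emulate CPUID with the returned leaf instead of the one the guest asked for.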
* Implement guest vcpu pinning using 'pthread_setaffinity_np(3)'. [neel, 2013-02-11, 5 files, -111/+2]
| Prior to this change pinning was implemented via an ioctl (VM_SET_PINNING) that called 'sched_bind()' on behalf of the user thread. The ULE implementation of 'sched_bind()' bumps up 'td_pinned' which in turn runs afoul of the assertion '(td_pinned == 0)' in userret(). Using the cpuset affinity to implement pinning of the vcpu threads works with both 4BSD and ULE schedulers and has the happy side-effect of getting rid of a bunch of code in vmm.ko.
| Discussed with: grehan
* Compute the number of initial kernel page table pages (NKPT) dynamically. [neel, 2013-02-06, 3 files, -19/+59]
| This eliminates the need to recompile the kernel when the default value of NKPT is not big enough, for example when loading large kernel modules or memory disk images from the loader. If NKPT is defined in the kernel configuration file then it overrides the dynamic calculation.
| Reviewed by: alc, kib
* cpususpend_handler: mark AP as resumed only after fully setting up lapic [avg, 2013-02-02, 1 file, -2/+2]
| Reviewed by: jhb
| Tested by: Sergey V. Dyatko <sergey.dyatko@gmail.com>, KAHO Toshikazu <kaho@elam.kais.kyoto-u.ac.jp>
| MFC after: 12 days
* x86 suspend/resume: suspend pics and pseudo-pics in reverse order [avg, 2013-02-02, 1 file, -1/+1]
| - change 'pics' from STAILQ to TAILQ
| - ensure that Local APIC is always first in 'pics'
| Reviewed by: jhb
| Tested by: Sergey V. Dyatko <sergey.dyatko@gmail.com>, KAHO Toshikazu <kaho@elam.kais.kyoto-u.ac.jp>
| MFC after: 12 days
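The switch from STAILQ to TAILQ matters because only the doubly linked TAILQ can be walked from tail to head. The sketch below uses the queue(3) macros with a hypothetical record type (struct demo_pic and demo_suspend_order() are invented for this example) to show the reverse walk a suspend path could use.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for the kernel's per-controller record. */
struct demo_pic {
	const char *name;
	TAILQ_ENTRY(demo_pic) link;
};
TAILQ_HEAD(demo_pic_list, demo_pic);

/*
 * Walk the list tail to head, as a suspend path would, and record the
 * visited names in order. Returns the number of names stored.
 */
static size_t
demo_suspend_order(struct demo_pic_list *pics, const char **out, size_t max)
{
	struct demo_pic *p;
	size_t n = 0;

	TAILQ_FOREACH_REVERSE(p, pics, demo_pic_list, link) {
		if (n < max)
			out[n++] = p->name;
	}
	return (n);
}
```

With the Local APIC kept at the head of the list, a reverse walk suspends it last, and a forward walk on resume restores it first.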
* Remove support for plip from the GENERIC kernel as no systems in the [eadler, 2013-02-01, 1 file, -1/+0]
| last 10 years require this support.
| Discussed with: db
| Discussed with: kib
| Reviewed by: imp
| Reviewed by: jhb
| Reviewed by: -hackers
| Approved by: cperciva (mentor)
* Fix a broken assumption in the passthru implementation that the MSI-X table [neel, 2013-02-01, 1 file, -1/+10]
| can only be located at the beginning or the end of the BAR. If the MSI-X table is located in the middle of a BAR then we will split the BAR into two and create two mappings - one before the table and one after the table - leaving a hole in place of the table so accesses to it can be trapped and emulated.
| Obtained from: NetApp
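The split described above can be sketched as a range computation. Everything here is hypothetical: the names, the use of offsets relative to the start of the BAR, and the choice to widen the hole to 4 KiB page boundaries (on the assumption that trapping works at page granularity) are made for illustration only.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE	4096u

struct demo_range {
	uint64_t start;	/* offset from the start of the BAR */
	uint64_t len;
};

/*
 * Split [0, bar_len) around a table at [tbl_off, tbl_off + tbl_len).
 * The hole is widened to page boundaries. Returns the number of
 * mappings written to out (0, 1 or 2).
 */
static int
demo_split_bar(uint64_t bar_len, uint64_t tbl_off, uint64_t tbl_len,
    struct demo_range out[2])
{
	uint64_t mask = (uint64_t)DEMO_PAGE_SIZE - 1;
	uint64_t hole_start = tbl_off & ~mask;
	uint64_t hole_end = (tbl_off + tbl_len + mask) & ~mask;
	int n = 0;

	if (hole_end > bar_len)
		hole_end = bar_len;
	if (hole_start > 0) {		/* mapping before the table */
		out[n].start = 0;
		out[n].len = hole_start;
		n++;
	}
	if (hole_end < bar_len) {	/* mapping after the table */
		out[n].start = hole_end;
		out[n].len = bar_len - hole_end;
		n++;
	}
	return (n);
}
```

A table at the very start or end of the BAR degenerates to a single mapping, which matches the two cases the old code already handled.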
* Increase the number of passthru devices supported by bhyve. [neel, 2013-02-01, 2 files, -17/+29]
| The maximum length of an environment variable puts a limitation on the number of passthru devices that can be specified via a single variable. The workaround is to allow the user to specify passthru devices via multiple environment variables instead of a single one.
| Obtained from: NetApp
* Add emulation support for instruction "88/r: mov r/m8, r8". [neel, 2013-01-30, 2 files, -1/+60]
| This instruction moves a byte from a register to a memory location.
| Tested by: tycho nightingale at pluribusnetworks com
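A minimal sketch of what emulating this instruction involves: pick the 8-bit source register named by the ModRM reg field and store it at the destination. The function names and the flat register array (indexed in hardware encoding order, RAX = 0 through R15 = 15) are assumptions for this example, not the structures used in vmm.ko.

```c
#include <stdint.h>

/*
 * Read the 8-bit register selected by ModRM.reg. rex is the REX prefix
 * byte (0x40..0x4f), or 0 when the instruction has no REX prefix.
 */
static uint8_t
demo_read_r8(const uint64_t regs[16], uint8_t modrm, uint8_t rex)
{
	unsigned reg = (modrm >> 3) & 7;

	if (rex != 0) {
		if (rex & 0x4)		/* REX.R extends the reg field */
			reg |= 8;
		return ((uint8_t)regs[reg]);
	}
	if (reg >= 4)			/* AH, CH, DH, BH without REX */
		return ((uint8_t)(regs[reg - 4] >> 8));
	return ((uint8_t)regs[reg]);
}

/* Emulate the store half of "mov r/m8, r8" into an already resolved byte. */
static void
demo_emulate_mov_store8(const uint64_t regs[16], uint8_t modrm, uint8_t rex,
    uint8_t *dst)
{
	*dst = demo_read_r8(regs, modrm, rex);
}
```

Resolving the r/m operand to a guest physical address is a separate decoding step and is left out of the sketch.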
* Reduce duplication between i386/linux/linux.h and amd64/linux32/linux.h [jhb, 2013-01-29, 2 files, -160/+1]
| by moving bits that are MI out into headers in compat/linux.
| Reviewed by: Chagin Dmitry dmitry | gmail
| MFC after: 2 weeks
* Always allow access to the sysenter cs/esp/eip MSRs since they [grehan, 2013-01-25, 1 file, -0/+7]
| are automatically saved and restored in the VMCS.
| Reviewed by: neel
| Obtained from: NetApp
* Don't assume that all Linux TCP-level socket options are identical to [jhb, 2013-01-23, 1 file, -0/+7]
| FreeBSD TCP-level socket options (only the first two are). Instead, use a mapping function and fail unsupported options as we do for other socket option levels.
| MFC after: 2 weeks
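A mapping function of the kind described might look like the sketch below. The function name and the DEMO_ constants are hypothetical, and the option numbers are written out only for illustration; real code should take them from the Linux compat and netinet/tcp.h headers.

```c
/* Illustrative Linux-side option numbers (assumed for the sketch). */
enum {
	DEMO_LINUX_TCP_NODELAY = 1,
	DEMO_LINUX_TCP_MAXSEG = 2,
	DEMO_LINUX_TCP_KEEPIDLE = 4
};

/* Illustrative FreeBSD-side option numbers (assumed for the sketch). */
enum {
	DEMO_BSD_TCP_NODELAY = 1,
	DEMO_BSD_TCP_MAXSEG = 2,
	DEMO_BSD_TCP_KEEPIDLE = 256
};

/* Translate a Linux TCP-level option, or return -1 if it is unsupported. */
static int
demo_linux_to_bsd_tcp_sockopt(int opt)
{
	switch (opt) {
	case DEMO_LINUX_TCP_NODELAY:
		return (DEMO_BSD_TCP_NODELAY);
	case DEMO_LINUX_TCP_MAXSEG:
		return (DEMO_BSD_TCP_MAXSEG);
	case DEMO_LINUX_TCP_KEEPIDLE:
		return (DEMO_BSD_TCP_KEEPIDLE);
	default:
		return (-1);	/* caller fails the request */
	}
}
```

The first two options happen to share a number on both systems, which is why passing values through unchanged appeared to work until other options were tried.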
* Postpone vmm module initialization until after SMP is initialized - particularly [neel, 2013-01-21, 1 file, -4/+8]
| that 'smp_started != 0'. This is required because the VT-x initialization calls smp_rendezvous() to set the CR4_VMXE bit on all the cpus. With this change we can preload vmm.ko from the loader.
| Reported by: alfred@, sbruno@
| Obtained from: NetApp
* Add svn properties to the recently merged bhyve source files. [neel, 2013-01-20, 2 files, -2/+2]
| The pre-commit hook will not allow any commits without the svn:keywords property in head.
* Merge projects/bhyve to head. [neel, 2013-01-19, 46 files, -0/+12243]
|\
| | 'bhyve' was developed by grehan@ and myself at NetApp (thanks!). Special thanks to Peter Snyder, Joe Caradonna and Michael Dexter for their support and encouragement.
| | Obtained from: NetApp
| * IFC @ r245509 [neel, 2013-01-17, 2 files, -0/+17]
| |\
| * | IFC @ r245205 [neel, 2013-01-09, 1 file, -4/+3]
| | |
| * | IFC @ r245178 [neel, 2013-01-09, 1 file, -9/+1]
| |\ \
| * | | Revert changes for x2apic support from projects/bhyve. [neel, 2013-01-06, 3 files, -40/+18]
| | | | During the early days of bhyve it did not support instruction emulation which necessitated the use of x2apic to access the local apic. This is no longer the case and the dependency on x2apic has gone away. The x2apic patches can be considered independently of bhyve and will be merged into head via projects/x2apic.
| | | | Discussed with: grehan
| * | | bhyve does not require a custom configuration file anymore so make the GENERIC [neel, 2013-01-05, 1 file, -6/+1]
| | | | identical to the one in HEAD.
| | | | Obtained from: NetApp
| * | | IFC @ r244983. [neel, 2013-01-04, 2 files, -9/+48]
| |\ \ \
| * | | | There is no need for a special 'BHYVE' kernel configuration file anymore - [neel, 2013-01-04, 1 file, -345/+0]
| | | | | 'GENERIC' works fine.
| | | | | Obtained from: NetApp
| * | | | There is no need for 'start_emulating()' and 'stop_emulating()' to be defined [neel, 2013-01-04, 2 files, -19/+2]
| | | | | in <machine/cpufunc.h> so remove them from there.
| | | | | Obtained from: NetApp
| * | | | The "unrestricted guest" capability is a feature of Intel VT-x that allows [neel, 2013-01-04, 1 file, -43/+0]
| | | | | the guest to execute real or unpaged protected mode code - bhyve relies on this feature to execute the AP bootstrap code. Get rid of the hack that allowed bhyve to support SMP guests on processors that do not have the "unrestricted guest" capability. This hack was entirely FreeBSD-specific and would not work with any other guest OS. Instead, limit the number of vcpus to 1 when executing on processors without "unrestricted guest" capability.
| | | | | Suggested by: grehan
| | | | | Obtained from: NetApp
| * | | | Modify the default behavior of bhyve such that it no longer forces the use of [neel, 2012-12-16, 1 file, -3/+1]
| | | | | x2apic mode on the guest. The guest can decide whether or not it wants to use legacy mmio or x2apic access to the APIC by writing to the MSR_APICBASE register.
| | | | | Obtained from: NetApp
| * | | | Prefer x2apic mode when running inside a virtual machine. [neel, 2012-12-16, 2 files, -0/+3]
| | | | | Provide a tunable 'machdep.x2apic_desired' to let the administrator override the default behavior. Provide a read-only sysctl 'machdep.x2apic' to let the administrator know whether the kernel is using x2apic or legacy mmio to access local apic. Tested with Parallels Desktop 8 and bhyve hypervisors. Also tested running on bare metal Intel Xeon E5-2658.
| | | | | Obtained from: NetApp
| | | | | Discussed with: jhb, attilio, avg, grehan
| * | | | IFC @r243836 [neel, 2012-12-04, 2 files, -25/+31]
| |\ \ \ \
| * | | | | Properly screen for the AND 0x81 instruction from the set [grehan, 2012-11-30, 1 file, -0/+7]
| | | | | | of group1 0x81 instructions that use the reg bits as an extended opcode. Still todo: properly update rflags.
| | | | | | Pointed out by: jilles@
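Opcode 0x81 names a whole group of instructions, and bits 5:3 of the ModRM byte (the reg field) select which one; the value 4 selects AND. A hypothetical screening helper (the name is invented for this sketch) could look like this:

```c
#include <stdint.h>

/*
 * Group 1 extended opcodes for 0x81: /0 ADD, /1 OR, /2 ADC, /3 SBB,
 * /4 AND, /5 SUB, /6 XOR, /7 CMP. Returns nonzero only for AND.
 */
static int
demo_is_and_imm32(uint8_t opcode, uint8_t modrm)
{
	return (opcode == 0x81 && ((modrm >> 3) & 7) == 4);
}
```

Without the reg-field check, an emulator keyed only on the opcode byte would treat every 0x81 instruction (for example CMP or SUB) as an AND.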
| * | | | | Remove debug printf. [grehan, 2012-11-29, 1 file, -1/+0]
| | | | | | Pointed out by: emaste
| * | | | | Add support for the 0x81 AND instruction, now generated [grehan, 2012-11-29, 2 files, -4/+34]
| | | | | | by clang in the local APIC code. 0x81 is a read-modify-write instruction - the EPT check that only allowed read or write and not both has been relaxed to allow read and write.
| | | | | | Reviewed by: neel
| | | | | | Obtained from: NetApp
| * | | | | Cleanup the user-space paging exit handler now that the unified instruction [neel, 2012-11-28, 2 files, -4/+0]
| | | | | | emulation is in place.
| | | | | | Obtained from: NetApp
| * | | | | Change emulate_rdmsr() and emulate_wrmsr() to return 0 on success and errno on [neel, 2012-11-28, 3 files, -65/+37]
| | | | | | failure. The conversion from the return value to HANDLED or UNHANDLED can be done locally in vmx_exit_process().
| | | | | | Obtained from: NetApp
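The convention described here, where the emulation routine reports 0 or an errno value and only the exit handler translates that into HANDLED or UNHANDLED, can be sketched as follows. All names, the single supported MSR and its placeholder value are hypothetical and chosen only for the example.

```c
#include <errno.h>
#include <stdint.h>

enum demo_exit { DEMO_UNHANDLED = 0, DEMO_HANDLED = 1 };

#define DEMO_MSR_APICBASE	0x1b

/* Returns 0 on success and an errno value on failure. */
static int
demo_emulate_rdmsr(uint32_t msr, uint64_t *val)
{
	switch (msr) {
	case DEMO_MSR_APICBASE:
		*val = 0xfee00000u;	/* placeholder value for the sketch */
		return (0);
	default:
		return (EINVAL);
	}
}

/* The exit handler alone maps the error code to HANDLED or UNHANDLED. */
static enum demo_exit
demo_exit_rdmsr(uint32_t msr, uint64_t *val)
{
	int error = demo_emulate_rdmsr(msr, val);

	return (error == 0 ? DEMO_HANDLED : DEMO_UNHANDLED);
}
```

Keeping the errno convention in the emulation routines lets other callers reuse them without knowing about the exit-handling vocabulary.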
| * | | | | Revamp the x86 instruction emulation in bhyve. [neel, 2012-11-28, 8 files, -218/+605]
| | | | | | On a nested page table fault the hypervisor will:
| | | | | | - fetch the instruction using the guest %rip and %cr3
| | | | | | - decode the instruction in 'struct vie'
| | | | | | - emulate the instruction in host kernel context for local apic accesses
| | | | | | - any other type of mmio access is punted up to user-space (e.g. ioapic)
| | | | | | The decoded instruction is passed as collateral to the user-space process that is handling the PAGING exit. The emulation code is fleshed out to include more addressing modes (e.g. SIB) and more types of operands (e.g. imm8). The source code is unified into a single file (vmm_instruction_emul.c) that is compiled into vmm.ko as well as /usr/sbin/bhyve.
| | | | | | Reviewed by: grehan
| | | | | | Obtained from: NetApp
| * | | | | Fix a bug in the MSI-X resource allocation for PCI passthrough devices. [neel, 2012-11-22, 1 file, -37/+26]
| | | | | | In the case where the underlying host had disabled MSI-X via the "hw.pci.enable_msix" tunable, the ppt_setup_msix() function would fail and return an error without properly cleaning up. This in turn would cause a page fault on the next boot of the guest. Fix this by calling ppt_teardown_msix() in all the error return paths.
| | | | | | Obtained from: NetApp
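The fix pattern, calling the teardown routine on every error return so that no partially initialized state survives, can be illustrated in user space. The structure and functions below are invented for this sketch (heap allocations stand in for interrupt resources) and are not the ppt code.

```c
#include <errno.h>
#include <stdlib.h>

struct demo_msix {
	void **vec;	/* one allocation per vector */
	int nvec;	/* number of vectors set up so far */
};

/* Release everything set up so far; safe on partial state. */
static void
demo_teardown(struct demo_msix *m)
{
	for (int i = 0; i < m->nvec; i++)
		free(m->vec[i]);
	free(m->vec);
	m->vec = NULL;
	m->nvec = 0;
}

/* fail_at simulates a failure at that vector index; -1 means none. */
static int
demo_setup(struct demo_msix *m, int nvec, int fail_at)
{
	m->vec = NULL;
	m->nvec = 0;
	if (nvec <= 0)
		return (EINVAL);
	m->vec = calloc((size_t)nvec, sizeof(*m->vec));
	if (m->vec == NULL)
		return (ENOMEM);
	for (int i = 0; i < nvec; i++) {
		void *p = (i == fail_at) ? NULL : malloc(16);

		if (p == NULL) {
			demo_teardown(m);	/* undo the partial setup */
			return (ENOMEM);
		}
		m->vec[i] = p;
		m->nvec = i + 1;
	}
	return (0);
}
```

After a failed setup the structure is back in its empty state, so a later attempt starts clean instead of tripping over stale pointers.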
| * | | | | Get rid of redundant comparison which is guaranteed to be "true" for unsigned [neel, 2012-11-22, 1 file, -1/+1]
| | | | | | integers.
| | | | | | Obtained from: NetApp
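As a small illustration of why such a comparison is redundant (the function name is hypothetical and the commit's actual expression is not shown on this page): an unsigned value can never be below zero, so only the upper bound needs testing.

```c
/*
 * Bounds check on an unsigned index. A test such as "idx >= 0" would
 * always be true here, so only the upper bound is compared.
 */
static int
demo_index_ok(unsigned int idx, unsigned int limit)
{
	return (idx < limit);
}
```

A negative value converted to unsigned wraps to a very large number, so the upper-bound test alone still rejects it.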
| * | | | | Handle CPUID leaf 0x7 now that FreeBSD is using it. [grehan, 2012-11-20, 2 files, -0/+2]
| | | | | | Return 0's for now.
| | | | | | Reviewed by: neel
| | | | | | Obtained from: NetApp
| * | | | | IFC @ r243164 [neel, 2012-11-17, 2 files, -6/+3]
| |\ \ \ \ \
| * \ \ \ \ \ IFC @ r242940 [neel, 2012-11-13, 1 file, -1/+15]
| |\ \ \ \ \ \
| * \ \ \ \ \ \ IFC @ r242684 [neel, 2012-11-11, 27 files, -1134/+458]
| |\ \ \ \ \ \ \
| * | | | | | | | Fix issue found with clang build. Avoid code insertion by the compiler [grehan, 2012-11-06, 1 file, -29/+48]
| | | | | | | | | between inline asm statements that would in turn modify the flags value set by the first asm, and used by the second. Solve by making the common error block a string that can be pulled into the first inline asm, and using symbolic labels for asm variables. bhyve can now build/run fine when compiled with clang.
| | | | | | | | | Reviewed by: neel
| | | | | | | | | Obtained from: NetApp
| * | | | | | | | Convert VMCS_ENTRY_INTR_INFO field into a vmcs identifier before passing it [neel, 2012-10-29, 1 file, -1/+1]
| | | | | | | | | to vmcs_getreg(). Without this conversion vmcs_getreg() will return EINVAL. In particular this prevented injection of the breakpoint exception into the guest via the "-B" option to /usr/sbin/bhyve which is hugely useful when debugging guest hangs. This was broken in r241921.
| | | | | | | | | Pointy hat: me
| | | | | | | | | Obtained from: NetApp
| * | | | | | | | Corral all the host state associated with the virtual machine into its own file. [neel, 2012-10-29, 5 files, -24/+218]
| | | | | | | | | This state is independent of the type of hardware assist used so there is really no need for it to be in Intel-specific code.
| | | | | | | | | Obtained from: NetApp
| * | | | | | | | Set the valid field of the newly allocated field as all other [grehan, 2012-10-26, 1 file, -0/+1]
| | | | | | | | | vm page allocators do. This fixes a panic when a virtio block device is mounted as root, with the host system dying in vm_page_dirty with invalid bits.
| | | | | | | | | Reviewed by: neel
| | | | | | | | | Obtained from: NetApp
| * | | | | | | | Unconditionally enable fpu emulation by setting CR0.TS in the host after the [neel, 2012-10-26, 2 files, -1/+22]
| | | | | | | | | guest does a vm exit. This allows us to trap any fpu access in the host context while the fpu still has "dirty" state belonging to the guest.
| | | | | | | | | Reported by: "s vas" on freebsd-virtualization@
| | | | | | | | | Obtained from: NetApp