path: root/sys/amd64
Commit message | Author | Age | Files | Lines
* Fix a leak of the wired pages when unwiring of the PROT_NONE-mapped | kib | 2014-09-01 | 1 | -35/+79
    wired region. Rework the handling of unwire to do it in batch, both
    at pmap and object level.

    All commits below are by alc.

    MFC r268327: Introduce pmap_unwire().
    MFC r268591: Implement pmap_unwire() for powerpc.
    MFC r268776: Implement pmap_unwire() for arm.
    MFC r268806: pmap_unwire(9) man page.
    MFC r269134: When unwiring a region of an address space, do not
      assume that the underlying physical pages are mapped by the pmap.
      This fixes a leak of the wired pages on the unwiring of the region
      mapped with no access allowed.
    MFC r269339: In the implementation of the new function pmap_unwire(),
      the call to MOEA64_PVO_TO_PTE() must be performed before any changes
      are made to the PVO. Otherwise, MOEA64_PVO_TO_PTE() will panic.
    MFC r269365: Correct a long-standing problem in moea{,64}_pvo_enter()
      that was revealed by the combination of r268591 and r269134: When we
      attempt to add the wired attribute to an existing mapping,
      moea{,64}_pvo_enter() do nothing. (They only set the wired attribute
      on newly created mappings.)
    MFC r269433: Handle wiring failures in vm_map_wire() with the new
      functions pmap_unwire() and vm_object_unwire(). Retire
      vm_fault_{un,}wire(), since they are no longer used.
    MFC r269438: Rewrite a loop in vm_map_wire() so that gcc doesn't think
      that the variable "rv" is uninitialized.
    MFC r269485: Retire pmap_change_wiring().

    Reviewed by: alc
* MFC 270438 | grehan | 2014-08-27 | 1 | -4/+4
    Change __inline style to be consistent with FreeBSD usage, and also
    fix gcc build.

    PR: 192880
* MFC r270202: | kib | 2014-08-27 | 1 | -1/+1
    Increase max number of physical segments on amd64 to 63.
* Merge the changes to pmap_enter(9) for sleep-less operation (requested | kib | 2014-08-24 | 1 | -8/+19
    by flag). The ia64 pmap.c changes are direct commit, since ia64 is
    removed on head.

    MFC r269368 (by alc): Retire PVO_EXECUTABLE.
    MFC r269728: Change pmap_enter(9) interface to take flags parameter
      and superpage mapping size (currently unused).
    MFC r269759 (by alc): Update the text of a KASSERT() to reflect the
      changes in r269728.
    MFC r269822 (by alc): Change {_,}pmap_allocpte() so that they look for
      the flag PMAP_ENTER_NOSLEEP instead of M_NOWAIT/M_WAITOK when
      deciding whether to sleep on page table page allocation.
    MFC r270151 (by alc): Replace KASSERT that no PV list locks are held
      with a conditional unlock.

    Reviewed by: alc
    Approved by: re (gjb)
    Sponsored by: The FreeBSD Foundation
* MFC r263822: amd64: Parse the EFI memory map if present | emaste | 2014-08-22 | 2 | -3/+111
    With this change (and loader.efi from [HEAD]) we can now boot under
    qemu using the OVMF UEFI firmware image with the limitation that a
    serial console is required.

    Sponsored by: The FreeBSD Foundation
* MFC r267921, r267934, r267949, r267959, r267966, r268202, r268276, | grehan | 2014-08-19 | 14 | -334/+1529
    r268427, r268428, r268521, r268638, r268639, r268701, r268777,
    r268889, r268922, r269008, r269042, r269043, r269080, r269094,
    r269108, r269109, r269281, r269317, r269700, r269896, r269962,
    r269989.

    Catch bhyve up to CURRENT.

    Lightly tested with FreeBSD i386/amd64, Linux i386/amd64, and
    OpenBSD/amd64. Still resolving an issue with OpenBSD/i386.

    Many thanks to jhb@ for all the hard work on the prior MFCs !

    r267921 - support the "mov r/m8, imm8" instruction
    r267934 - document options
    r267949 - set DMI vers/date to fixed values
    r267959 - doc: sort cmd flags
    r267966 - EPT misconf post-mortem info
    r268202 - use correct flag for event index
    r268276 - 64-bit virtio capability api
    r268427 - invalidate guest TLB when cr3 is updated, needed for TSS
    r268428 - identify vcpu's operating mode
    r268521 - use correct offset in guest logical-to-linear translation
    r268638 - chs value
    r268639 - chs fake values
    r268701 - instr emul operand/address size override prefix support
    r268777 - emulation for legacy x86 task switching
    r268889 - nested exception support
    r268922 - fix INVARIANTS build
    r269008 - emulate instructions found in the OpenBSD/i386 5.5 kernel
    r269042 - fix fault injection
    r269043 - Reduce VMEXIT_RESTARTs in task_switch.c
    r269080 - fix issues in PUSH emulation
    r269094 - simplify return values from the inout handlers
    r269108 - don't return -1 from the push emulation handler
    r269109 - avoid permanent sleep in vm_handle_hlt()
    r269281 - list VT-x features in base kernel dmesg
    r269317 - Mark AHCI fatal errors as not completed
    r269700 - Support PCI extended config space in bhyve
    r269896 - Minor cleanup
    r269962 - use max guest memory when creating IOMMU domain
    r269989 - fix interrupt mode names
* MFC r267338 | grehan | 2014-08-17 | 1 | -50/+47
    Replace enum forward declarations with complete definitions
* MFC r267311, r267330, r267811, r267884 | grehan | 2014-08-17 | 5 | -39/+120
    Turn on interrupt window exiting unconditionally when an ExtINT is
    being injected into the guest.

    Add helper functions to populate VM exit information for rendezvous
    and astpending exits.

    Provide APIs to get 'lowmem' and 'highmem' size directly.

    Expose the amount of resident and wired memory from the guest's
    vmspace
* MFC r267178, r267300 | grehan | 2014-08-17 | 3 | -53/+192
    Support guest accesses to %cr8

    Add reserved bit checking when doing %CR8 emulation and inject #GP
    if required.
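The reserved-bit check mentioned above can be illustrated with a minimal sketch. It is not the bhyve code; the function and macro names are hypothetical. It assumes only the architectural layout of %cr8: bits 3:0 hold the task priority, bits 63:4 are reserved, and the priority corresponds to bits 7:4 of the local APIC TPR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 63:4 of %cr8 are reserved; only bits 3:0 carry the priority. */
#define CR8_RESERVED_MASK	(~(uint64_t)0xf)

/*
 * Hypothetical helper: returns false when a guest write to %cr8 sets a
 * reserved bit, in which case an emulator would inject #GP.
 */
static bool
cr8_write_valid(uint64_t val)
{
	return ((val & CR8_RESERVED_MASK) == 0);
}

/* %cr8[3:0] maps to bits 7:4 of the local APIC task priority register. */
static uint32_t
cr8_to_tpr(uint64_t cr8)
{
	return ((uint32_t)(cr8 & 0xf) << 4);
}

/* Inverse mapping, used when the guest reads %cr8. */
static uint64_t
tpr_to_cr8(uint32_t tpr)
{
	return ((tpr >> 4) & 0xf);
}
```

A caller would check cr8_write_valid() first and only update the virtual TPR when it returns true.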
* MFC r267216 | grehan | 2014-08-17 | 6 | -80/+153
    Add ioctl(VM_REINIT) to reinitialize the virtual machine state
    maintained by vmm.ko. This allows the virtual machine to be
    restarted without having to destroy it first.
* MFC r266933 | grehan | 2014-08-17 | 5 | -15/+70
    Activate vcpus from bhyve(8) using the ioctl VM_ACTIVATE_CPU instead
    of doing it implicitly in vmm.ko.
* MFC r266826, r266827 | markj | 2014-08-09 | 1 | -22/+0
    Move some duplicated hook definitions from machine-dependent files
    to kern_dtrace.c.
* MFC r258436: Refactor amd64 startup SMAP parsing | emaste | 2014-08-01 | 1 | -33/+44
    Extracted from the projects/uefi branch, this change is a reasonable
    cleanup and will reduce the diffs to review when bringing in the
    UEFI work.
* MFC r263329: | markj | 2014-07-29 | 1 | -19/+21
    Only invoke fasttrap hooks for traps from user mode, and ensure that
    they're called with interrupts enabled. Calling fasttrap_pid_probe()
    with interrupts disabled can lead to deadlock if fasttrap writes to
    the process' address space.
* MFC: r269051 | marius | 2014-07-29 | 1 | -0/+8
    Copying pages via temporary mappings in the !DMAP case of
    pmap_copy_pages() involves updating the corresponding page tables
    followed by accesses to the pages in question. This sequence is
    subject to the situation exactly described in the "AMD64
    Architecture Programmer's Manual Volume 2: System Programming"
    rev. 3.23, "7.3.1 Special Coherency Considerations" [1, p. 171 f.].
    Therefore, issuing the INVLPG right after modifying the PTE bits is
    crucial (see also r269050, MFCed to stable/10 in r269235).

    For the amd64 PMAP code, the order of instructions was already
    correct. The above fact still is worth documenting, though.

    1: http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/24593_APM_v21.pdf

    Reviewed by: alc
    Sponsored by: Bally Wulff Games & Entertainment GmbH
* MFC r267213 (by alc): | kib | 2014-07-24 | 1 | -3/+1
    Add a page size field to struct vm_page.

    Approved by: alc
* MFC r258471: Don't abort SMAP processing after an entry of length 0 | emaste | 2014-07-24 | 1 | -1/+1
    Length 0 is not special and should just be skipped. This is the same
    behaviour as i386.

    Sponsored by: The FreeBSD Foundation
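The difference between skipping and aborting on a zero-length entry can be sketched as follows. This is an illustration only, not the FreeBSD startup code: the struct and function names are hypothetical, and the only outside fact assumed is the BIOS E820 convention that type 1 marks usable memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one BIOS SMAP (E820) entry. */
struct smap_entry {
	uint64_t base;
	uint64_t length;
	uint32_t type;
};

#define SMAP_TYPE_USABLE	1	/* E820 "available memory" type */

/*
 * Sum the usable memory described by the map.  A zero-length entry is
 * skipped with 'continue'; using 'break' here would discard every
 * entry that follows it, which is the behaviour the commit removes.
 */
static uint64_t
smap_usable_bytes(const struct smap_entry *map, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (map[i].length == 0)
			continue;
		if (map[i].type != SMAP_TYPE_USABLE)
			continue;
		total += map[i].length;
	}
	return (total);
}
```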
* MFC r268660: | kib | 2014-07-24 | 1 | -5/+53
    Make amd64 pmap_copy_pages() functional for pages not mapped by DMAP.
* MFC 266424,266476,266524,266573,266595,266626,266627,266633,266641,266642, | jhb | 2014-07-22 | 10 | -174/+762
    266708,266724,266934,266935,268521:

    Emulation of the "ins" and "outs" instructions.

    Various fixes for translating guest linear addresses to guest
    physical addresses.
* MFC 266125: | jhb | 2014-07-22 | 5 | -0/+57
    Implement a PCI interrupt router to route PCI legacy INTx interrupts
    to the legacy 8259A PICs.
* MFC 264353,264509,264768,264770,264825,264846,264988,265114,265165,265365, | jhb | 2014-07-21 | 9 | -64/+92
    265941,265951,266390,266550,266910:

    Various bhyve fixes:
    - Don't save host's return address in 'struct vmxctx'.
    - Permit non-32-bit accesses to local APIC registers.
    - Factor out common ioport handler code.
    - Use calloc() in favor of malloc + memset.
    - Change the vlapic timer frequency to be in the ballpark of
      contemporary hardware.
    - Allow the guest to read the TSC via MSR 0x10.
    - A VMCS is always inactive when it exits the vmx_run() loop. Remove
      redundant code and the misleading comment that suggests otherwise.
    - Ignore writes to microcode update MSR. This MSR is accessed by
      RHEL7 guest. Add KTR tracepoints to annotate wrmsr and rdmsr VM
      exits.
    - Provide an alias for the userboot console and name it 'comconsole'.
    - Use EV_ADD to create an mevent and EV_ENABLE to enable it.
    - abort(3) the process in response to a VMEXIT_ABORT.
    - Don't include the guest memory segments in the bhyve(8) process
      core dump.
    - Make the vmx asm code dtrace-fbt-friendly.
    - Allow vmx_getdesc() and vmx_setdesc() to be called for a vcpu that
      is in the VCPU_RUNNING state.
    - Enable VMX in the IA32_FEATURE_CONTROL MSR if it is not enabled and
      the MSR isn't locked.
* MFC 264347: | jhb | 2014-07-21 | 1 | -1/+10
    Account for the "plus 1" encoding of the CPUID Function 4 reported
    core per package and cache sharing values.
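To make the "plus 1" encoding concrete, here is a minimal sketch of decoding the two fields from the EAX value returned by CPUID function 4. The struct and function names are hypothetical and this is not the code from the commit; the field positions (EAX[31:26] and EAX[25:14]) follow the Intel CPUID definition as I understand it.

```c
#include <stdint.h>

/* Hypothetical container for the two decoded counts. */
struct cpuid4_counts {
	uint32_t cores_per_package;	/* EAX[31:26] + 1 */
	uint32_t threads_per_cache;	/* EAX[25:14] + 1 */
};

/*
 * Both fields store "count minus one", so the raw field value must be
 * incremented before use.  A raw value of 0 therefore means 1.
 */
static struct cpuid4_counts
decode_cpuid4_eax(uint32_t eax)
{
	struct cpuid4_counts c;

	c.cores_per_package = ((eax >> 26) & 0x3f) + 1;
	c.threads_per_cache = ((eax >> 14) & 0xfff) + 1;
	return (c);
}
```

An emulator that wants to report N cores to a guest would do the reverse and store N - 1 in the field.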
* MFC 263780,264516,265062,265101,265203,265364: | jhb | 2014-07-21 | 6 | -72/+220
    Add an ioctl to suspend a virtual machine (VM_SUSPEND).

    Add logic in the HLT exit handler to detect if the guest has put all
    vcpus to sleep permanently by executing a HLT with interrupts
    disabled.

    When this condition is detected the guest will be suspended with a
    reason of VM_SUSPEND_HALT and the bhyve(8) process will exit.

    This logic can be disabled via the tunable 'hw.vmm.halt_detection'.
* MFC 260847,264055,264867: | jhb | 2014-07-21 | 1 | -0/+1
    - Add a very simple virtio_random(4) driver for FreeBSD guests to
      harvest entropy from hypervisors.
    - Add support to bhyve for the virtio RNG entropy-source device to
      provide entropy to bhyve guests.
* MFC 259942,262274,263035,263054,263211,263744,264179,264324,264468,264631, | jhb | 2014-07-19 | 13 | -14/+1650
    264648,264650,264651,266572,267558:

    Flesh out the AT PIC and 8254 PIT emulations and move them into the
    kernel.
* MFC r263749,267146: | imp | 2014-07-17 | 1 | -1/+1
    >r267146 | imp | 2014-06-05 22:08:55 -0600 (Thu, 05 Jun 2014) | 4 lines
    >Restore comments accidentally removed.

    >r263749 | imp | 2014-03-25 16:08:31 -0600 (Tue, 25 Mar 2014) | 18 lines
    >Rather than require a makeoptions DEBUG to get debug correct,
    >add it in kern.mk, but only if we're using clang. While this
    >option is supported by both clang and gcc, in the future there
    >may be changes to clang which change the defaults that require
    >a tweak to build our kernel such that other tools in our tree
    >will work. Set a good example by forcing -gdwarf-2 only for
    >clang builds, and only if the user hasn't specified another
    >dwarf level already. Update UPDATING to reflect the changed
    >state of affairs. This also keeps us from having to update
    >all the ARM kernels to add this, and also keeps us from
    >in the future having to update all the MIPS kernels and is
    >one less place the user will have to know to do something
    >special for clang and one less thing developers will need
    >to do when moving an architecture to clang.
* MFC r268471: | kib | 2014-07-16 | 1 | -1/+3
    For safety, ensure that any consumer of the set_regs() and
    ptrace_set_pc() use the correct return to userspace using iret.
* MFC r268383: | kib | 2014-07-15 | 1 | -0/+4
    Correct si_code for the SIGBUS signal generated by the alignment trap.
* Temporarily disable build of vt_efifb vt(4) driver, not all parts of UEFI support | ray | 2014-07-07 | 1 | -1/+0
    here yet. This is a direct commit to STABLE-10, because HEAD already
    supports UEFI FB.

    Sponsored by: The FreeBSD Foundation
* 267622 Log: | ray | 2014-07-07 | 2 | -14/+5
    Rename vt(4) vga module to dismiss interference with syscons(4) vga
    module.

    267623 Log:
    Remove stale link to deleted vt(4) xboxfb driver.

    267624 Log:
    syscons(4) and vt(4) can be built together now.

    267625 Log:
    Allow to disable syscons(4) if "hw.syscons.disable" kenv is set.

    267626 Log:
    Suspend vt(4) initialization if "kern.vt.disable" kenv is set.

    267965 by emaste@ Log:
    Use a common tunable to choose between vt(4)/sc(4)
    With this change and previous work from ray@ it will be possible to
    put both in GENERIC, and have one enabled by default, but allow the
    other to be selected via the loader.
    (The previous implementation had separate kern.vt.disable and
    hw.syscons.disable tunables, and would panic if both drivers were
    compiled in and neither was explicitly disabled.)

    268175 by emaste@ Log:
    Fix vt(4) detection in kbdcontrol and vidcontrol
    As sc(4) and vt(4) coexist and are both enabled in GENERIC, the
    existence of a vt(4) sysctl is not sufficient to determine that
    vt(4) is in use.
    Reported by: Trond Endrestøl

    268045 by emaste@ Log:
    Add vt(4) to GENERIC and retire the separate VT config
    vt(4) and sc(4) can now coexist in the same kernel. To choose the
    vt driver, set the loader tunable kern.vty=vt .

    Sponsored by: The FreeBSD Foundation
* MFC r267767: | kib | 2014-06-30 | 2 | -0/+9
    Add FPU_KERN_KTHR flag to fpu_kern_enter(9). Apply the flag to
    padlock(4) and aesni(4). In aesni_cipher_process(), do not leak FPU
    context state on error.
* MFC 261781: | jhb | 2014-06-27 | 1 | -8/+5
    Don't waste a page of KVA for the boot-time memory test on x86. For
    amd64, reuse the first page of the crashdumpmap as CMAP1/CADDR1. For
    i386, remove CMAP1/CADDR1 entirely and reuse CMAP3/CADDR3 for the
    memory test.
* MFC r266901 | neel | 2014-06-17 | 1 | -1/+1
    Allocate a zeroed LDT. Failing to do this might result in the LDT
    appearing to run out of free descriptors because of random junk in
    the descriptor's 'sd_type' field.
* Revert MFC r266925 because it can lead to instant panic at fexecve(): | dchagin | 2014-06-17 | 1 | -2/+2
    To allow to run interpreter itself add a new ELF branding type.

    Pointed out by: kib, mjg
* MFC 262139,262140,262236,262281,262532: | jhb | 2014-06-13 | 9 | -64/+268
    Various x2APIC fixes and enhancements:
    - Use spinlocks for the vioapic.
    - Handle the SELF_IPI MSR.
    - Simplify the APIC mode switching between MMIO and x2APIC. The guest
      is no longer allowed to switch modes at runtime. Instead, the
      desired mode is set when the virtual machine is created.
    - Disallow MMIO access in x2APIC mode and MSR access in xAPIC mode.
    - Add support for x2APIC virtualization assist in Intel VT-x.
* MFC 262615,262624: | jhb | 2014-06-12 | 1 | -0/+7
    Workaround an apparent bug in VMWare Fusion's nested VT support where
    it triggers a VM exit with the exit reason of an external interrupt
    but without a valid interrupt set in the exit interrupt information.
* MFC 261638,262144,262506,266765: | jhb | 2014-06-12 | 10 | -109/+377
    Add virtualized XSAVE support to bhyve which permits guests to use
    XSAVE and XSAVE-enabled features like AVX.
    - Store a per-cpu guest xcr0 register and handle xsetbv VM exits by
      emulating the instruction.
    - Only expose XSAVE to guests if XSAVE is enabled in the host. Only
      expose a subset of XSAVE features currently supported by the guest
      and for which the proper emulation of xsetbv is known. Currently
      this includes X87, SSE, AVX, AVX-512, and Intel MPX.
    - Add support for injecting hardware exceptions into the guest and
      use this to trigger exceptions in the guest for invalid xsetbv
      operations instead of potentially faulting in the host.
    - Queue pending exceptions in the 'struct vcpu' instead of directly
      updating the processor-specific VMCS or VMCB. The pending exception
      will be delivered right before entering the guest.
    - Rename the unused ioctl VM_INJECT_EVENT to VM_INJECT_EXCEPTION and
      restrict it to only deliver x86 hardware exceptions. This new ioctl
      is now used to inject a protection fault when the guest accesses an
      unimplemented MSR.
    - Expose a subset of known-safe features from leaf 0 of the
      structured extended features to guests if they are supported on the
      host including RDFSBASE/RDGSBASE, BMI1/2, AVX2, AVX-512, HLE, ERMS,
      and RTM. Aside from AVX-512, these features are all new
      instructions available for use in ring 3 with no additional
      hypervisor changes needed.
* MFC 266263,266551,266552: | jhb | 2014-06-12 | 2 | -3/+17
    - Add definitions for more structured extended features as well as
      XSAVE Extended Features for AVX512 and MPX (Memory Protection
      Extensions).
    - Don't permit users to request a subset of the AVX512 or MPX xsave
      masks.
* MFC 261504: | jhb | 2014-06-12 | 5 | -30/+129
    Add support for FreeBSD/i386 guests under bhyve.
* MFC 261503,264501: | jhb | 2014-06-12 | 1 | -0/+123
    Emulate the byte move and zero/sign extend instructions.
* MFC 260239,261268,265058: | jhb | 2014-06-12 | 2 | -1/+7
    Expand the support for PCI INTx interrupts including providing
    interrupt routing information for INTx interrupts to I/O APIC pins
    and enabling INTx interrupts in the virtio and AHCI backends.
* MFC r266846: | kib | 2014-06-05 | 1 | -3/+20
    When usermode loaded non-default segment selector into the %gs,
    correctly prepare KGSBASE msr to restore the user descriptor base on
    the last swapgs during return to usermode.
* MFC 260972: | jhb | 2014-06-04 | 4 | -7/+34
    There is no need to initialize the IOMMU if no passthru devices have
    been configured for bhyve to use.
* MFC r266925: | dchagin | 2014-06-03 | 1 | -2/+2
    To allow to run the interpreter itself add a new ELF branding type.
    Allow Linux ABI to run ELF interpreter.
* MFC 260802,260836,260863,261001,261074,261617: | jhb | 2014-05-23 | 4 | -81/+232
    Various fixes for NMI and interrupt injection.
    - If a VM-exit happens during an NMI injection then clear the "NMI
      Blocking" bit in the Guest Interruptibility-state VMCS field.
    - If the guest exits due to a fault while it is executing IRET then
      restore the state of "Virtual NMI blocking" in the guest's
      interruptibility-state field before resuming the guest.
    - Inject a pending NMI only if NMI_BLOCKING, MOVSS_BLOCKING,
      STI_BLOCKING are all clear. If any of these bits are set then
      enable "NMI window exiting" and inject the NMI in the VM-exit
      handler.
    - Handle a VM-exit due to a NMI properly by vectoring to the host's
      NMI handler via a software interrupt.
    - Set "Interrupt Window Exiting" in the case where there is a vector
      to be injected into the vcpu but the VM-entry interruption
      information field already has the valid bit set.
    - For VM-exits due to an NMI, handle the NMI with interrupts disabled
      in addition to "blocking by NMI" already established by the
      VM-exit.
* MFC 260237: | jhb | 2014-05-20 | 1 | -67/+84
    Fix a bug in the HPET emulation where a timer interrupt could be lost
    when the guest disables the HPET.

    The HPET timer interrupt is triggered from the callout handler
    associated with the timer. It is possible for the callout handler to
    be delayed before it gets a chance to execute. If the guest disables
    the HPET during this window then the handler never gets a chance to
    execute and the timer interrupt is lost.

    This is now fixed by injecting a timer interrupt into the guest if
    the callout time is detected to be in the past when the HPET is
    disabled.
* MFC 259737, 262646: | jhb | 2014-05-18 | 3 | -25/+63
    Fix a couple of issues with vcpu state:
    - Add a parameter to 'vcpu_set_state()' to enforce that the vcpu is
      in the IDLE state before the requested state transition. This
      guarantees that there is exactly one ioctl() operating on a vcpu at
      any point in time and prevents unintended state transitions.
    - Fix a race between VMRUN() and vcpu_notify_event() due to
      'vcpu->hostcpu' being updated outside of the vcpu_lock().
* MFC 259641,259863,259924,259937,259961,259978,260380,260383,260410,260466, | jhb | 2014-05-17 | 24 | -593/+1753
    260531,260532,260550,260619,261170,261453,261621,263280,263290,264516:

    Add support for local APIC hardware-assist.
    - Restructure vlapic access and register handling to support
      hardware-assist for the local APIC.
    - Use the 'Virtual Interrupt Delivery' and 'Posted Interrupt
      Processing' feature of Intel VT-x if supported by hardware.
    - Add an API to rendezvous all active vcpus in a virtual machine and
      use it to support level triggered interrupts with VT-x 'Virtual
      Interrupt Delivery'.
    - Use a cheaper IPI handler than IPI_AST for nested page table
      shootdowns and avoid doing unnecessary nested TLB invalidations.

    Reviewed by: neel
* MFC 263301 | ian | 2014-05-17 | 1 | -2/+2
    In kernel config files, it is supposed to be 'options<space><tab>'
    not 'options<tab><tab>', per long standing (but recently not so
    strictly enforced) convention.
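The whitespace convention described above is easy to check mechanically. The following is a small sketch of such a check, not a tool that exists in the FreeBSD tree; the function name is hypothetical, and it only looks at lines that begin with the keyword "options".

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical lint helper: returns true when a kernel config line
 * either is not an "options" line or follows the
 * 'options<space><tab>' convention.  Any other separator after the
 * keyword (for example two tabs) is reported as a violation.
 */
static bool
options_line_ok(const char *line)
{
	static const char kw[] = "options";
	size_t n = sizeof(kw) - 1;

	if (strncmp(line, kw, n) != 0)
		return (true);	/* not an options line, nothing to check */
	/* Short-circuit keeps this safe when the line ends at the keyword. */
	return (line[n] == ' ' && line[n + 1] == '\t');
}
```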
* MFC 263246: Align all comments in config files on same column. | ian | 2014-05-17 | 1 | -188/+188