path: root/sys/amd64/amd64/cpu_switch.S
Commit message  Author  Age  Files  Lines
* Merge remote-tracking branch 'origin/releng/11.1' into RELENG_2_4Luiz Souza2018-05-081-3/+3
|\
| * Add mitigations for two classes of speculative execution vulnerabilitiesgordon2018-03-141-3/+10
| | | | | | | | | | | | | | | | | | on amd64. [FreeBSD-SA-18:03.speculative_execution] Approved by: so Security: FreeBSD-SA-18:03.speculative_execution Security: CVE-2017-5715 Security: CVE-2017-5754
* | Revert "Revert "MFC ↵Luiz Souza2018-02-231-2/+9
| | | | | | | | | | | | r328083,328096,328116,328119,328120,328128,328135,328153,328157,"" This reverts commit d3d59b01294138e59995b31d2bcbbbdf45e26a3c.
* | Revert "Revert "MFC r327817:""Luiz Souza2018-02-231-1/+1
| | | | | | | | This reverts commit 671279ad5be3f1feb75857cf032daae1f04972dd.
* | Revert "Revert "MFC r324301:""Luiz Souza2018-02-231-1/+1
| | | | | | | | This reverts commit 4635c206cc4dfcc984a17067a62e67d75095c647.
* | Revert "Revert "MFC r322940:""Luiz Souza2018-02-231-1/+1
| | | | | | | | This reverts commit 72b499fe038338698da9878361ba68f79cd05af6.
* | Revert "Revert "MFC r322762, r322799, r322832, r322833:""Luiz Souza2018-02-231-1/+24
| | | | | | | | This reverts commit 5919c0a9658dde48bd090704915aa3a85a6c0d26.
* | Revert "MFC r322762, r322799, r322832, r322833:"Luiz Souza2018-02-211-24/+1
| | | | | | | | This reverts commit 2589da26b930eaf9441b6bf27c0f410062adf507.
* | Revert "MFC r322940:"Luiz Souza2018-02-211-1/+1
| | | | | | | | This reverts commit a9197dec5d4dc4631abb11db58f5cc72ce0625fd.
* | Revert "MFC r324301:"Luiz Souza2018-02-211-1/+1
| | | | | | | | This reverts commit 6501017038915547fe361a5ae4ca94ba466d0e4e.
* | Revert "MFC r327817:"Luiz Souza2018-02-211-1/+1
| | | | | | | | This reverts commit 242fd9ef5c10f63b2abc67e7479b7c4d83f0f4c3.
* | Revert "MFC r328083,328096,328116,328119,328120,328128,328135,328153,328157,"Luiz Souza2018-02-211-9/+2
| | | | | | | | This reverts commit 430a2bea3907149b30cc75fc722b6cf1f81da82a.
* | MFC r328083,328096,328116,328119,328120,328128,328135,328153,328157,kib2018-02-191-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 328166,328177,328199,328202,328205,328468,328470,328624,328625,328627, 328628,329214,329297,329365: Meltdown mitigation by PTI, PCID optimization of PTI, and kernel use of IBRS for some mitigations of Spectre. Tested by: emaste, Arshan Khanifar <arshankhanifar@gmail.com> Discussed with: jkim Sponsored by: The FreeBSD Foundation (cherry picked from commit 6dd025b40ee6870bea6ba670f30dcf684edc3f6c)
* | MFC r327817:kib2018-02-191-1/+1
| | | | | | | | | | | | Rename COMMON_TSS_RSP0 to TSS_RSP0. (cherry picked from commit 18a2f90a6ea9bb9ba24aa12792dd50864d7fe8c7)
* | MFC r324301:kib2018-02-191-1/+1
| | | | | | | | | | | | Update comment. (cherry picked from commit 5596db6a009420f7f1e764cc67d15e03ecb75601)
* | MFC r322940:rlibby2018-02-191-1/+1
| | | | | | | | | | | | amd64: drop q suffix from rd[fg]sbase for gas compatibility (cherry picked from commit c78f11f66bbfbc66d4b5ed31a9dc66831eacdf19)
* | MFC r322762, r322799, r322832, r322833:kib2018-02-191-1/+24
|/ | | | | | Make WRFSBASE and WRGSBASE instructions functional. (cherry picked from commit b1a7a7418e73251aad628dc4f9418e550a9fd3d7)
* MFC r318318:kib2017-05-291-1/+1
| | | | | | Ensure that resume path on amd64 only accesses page tables for normal operation after processor is configured to allow all required features.
* Rewrite amd64 PCID implementation to follow an algorithm described inkib2015-05-091-60/+15
| Vahalia's "Unix Internals" section 15.12 "Other TLB Consistency Algorithms". The same algorithm is already utilized by the MIPS pmap to handle ASIDs.
| The PCID for the address space is now allocated per-cpu during context switch to the thread using pmap, when no PCID on the cpu was ever allocated, or the current PCID is invalidated. If the PCID is reused, bit 63 of %cr3 can be set to avoid TLB flush.
| Each cpu has a PCID algorithm generation count, which is saved in the pmap pcpu block when the pcpu PCID is allocated. On invalidation, the pmap generation count is zeroed, which signals the context switch code that the already allocated PCID is no longer valid. The implication is the TLB shootdown for the given cpu/address space, due to the allocation of a new PCID.
| The pm_save mask no longer has to be tracked, which (significantly) reduces the targets of the TLB shootdown IPIs. Previously, pm_save was reset only on pmap_invalidate_all(), which made it accumulate the cpuids of all processors on which the thread was scheduled between full TLB shootdowns.
| Besides reducing the number of TLB shootdowns and removing atomics to update pm_saves in the context switch code, the algorithm is much simpler than the maintenance of pm_save and selection of the right address space in the shootdown IPI handler.
| Reviewed by: alc
| Tested by: pho
| Sponsored by: The FreeBSD Foundation
| MFC after: 3 weeks
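The generation-count scheme described in this commit message can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code from the commit: the struct names, `pcid_activate()`, `pcid_invalidate()`, the 12-bit PCID limit constant, and the choice to reserve PCID 0 are all hypothetical. Only the ideas of a per-cpu generation count, zeroing the pmap's saved generation on invalidation, and setting bit 63 of %cr3 on reuse come from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define PCID_LIMIT   4096u            /* assumed 12-bit PCID space */
#define CR3_NO_FLUSH (1ULL << 63)     /* bit 63 of %cr3: keep TLB entries */

/* Hypothetical per-cpu allocator state. */
struct cpu_pcid_state {
	uint32_t next_pcid;           /* next PCID to hand out on this cpu */
	uint32_t gen;                 /* current generation, never 0 */
};

/* Hypothetical per-pmap, per-cpu slot. */
struct pmap_pcid {
	uint32_t pcid;                /* PCID last allocated on this cpu */
	uint32_t gen;                 /* generation at allocation; 0 = invalid */
};

/*
 * Called on context switch. Returns the PCID bits for %cr3, with the
 * no-flush bit set only when the previously allocated PCID is still valid.
 */
static uint64_t
pcid_activate(struct cpu_pcid_state *cpu, struct pmap_pcid *pm)
{
	assert(cpu->gen != 0);
	if (pm->gen == cpu->gen)      /* never true for an invalidated slot */
		return (pm->pcid | CR3_NO_FLUSH);

	if (cpu->next_pcid == PCID_LIMIT) {
		/* PCIDs exhausted: a new generation makes all old ones stale. */
		cpu->next_pcid = 1;   /* PCID 0 kept reserved in this sketch */
		if (++cpu->gen == 0)
			cpu->gen = 1;
	}
	pm->pcid = cpu->next_pcid++;
	pm->gen = cpu->gen;
	return (pm->pcid);            /* no-flush bit clear: TLB is flushed */
}

/* Invalidation only zeroes the saved generation; no IPI is modelled here. */
static void
pcid_invalidate(struct pmap_pcid *pm)
{
	pm->gen = 0;
}
```

In this sketch an invalidated slot costs nothing until the address space is next switched to on that cpu, at which point a fresh PCID is allocated and the load of %cr3 without the no-flush bit discards any stale entries.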
* Create a separate structure for per-CPU state saved across suspend andjhb2014-09-061-28/+1
| | | | | | | | | resume that is a superset of a pcb. Move the FPU state out of the pcb and into this new structure. As part of this, move the FPU resume code on amd64 into a C function. This allows resumectx() to still operate only on a pcb and more closely mirrors the i386 code. Reviewed by: kib (earlier version)
* Move fpusave() wrapper for suspend handler to sys/amd64/amd64/fpu.c.jkim2014-03-041-13/+0
| | | | Inspired by: jhb
* Properly save and restore CR0.jkim2014-03-041-0/+2
| | | | MFC after: 3 days
* Remove dead code since r230426, fix a comment, and tidy up.jkim2014-03-041-7/+5
| | | | | Reported by: jhb MFC after: 3 days
* Implement support for the process-context identifiers ('PCID') onkib2013-08-301-7/+27
| Intel CPUs. The feature tags TLB entries with the Id of the address space and allows the kernel to avoid TLB invalidation on the context switch; it is available only in long mode. In the microbenchmarks, using the PCID decreased latency of the context switches by ~30% on SandyBridge class desktop CPUs, measured with the lat_ctx program from lmbench.
| If available, use the INVPCID instruction when a TLB entry in a non-current address space needs to be invalidated. The instruction is typically available on Haswell.
| If needed, the use of PCID can be turned off with the vm.pmap.pcid_enabled loader tunable set to 0. The state of the feature is reported by the vm.pmap.pcid_enabled sysctl. The sysctl vm.pmap.pcid_save_cnt reports the number of context switches which avoided invalidating the TLB; compare with the total number of context switches, available as sysctl vm.stats.sys.v_swtch.
| Sponsored by: The FreeBSD Foundation
| Reviewed by: alc
| Tested by: pho, bf
* Add support for the XSAVEOPT instruction use. Our XSAVE/XRSTOR usagekib2012-07-141-0/+3
| mostly meets the guidelines set by the Intel SDM:
| 1. We use XRSTOR and XSAVE from the same CPL using the same linear address for the store area.
| 2. Contrary to the recommendations, we cannot zero the FPU save area for a new thread, since fork semantics require a copy of the previous state. This advice seemingly contradicts the advice from item 6.
| 3. We do use XSAVEOPT in the context switch code only, and the area for XSAVEOPT already always contains the data saved by XSAVE.
| 4. We do not modify the save area between XRSTOR, when the area is loaded into FPU context, and XSAVE. We always spill the fpu context into the save area and start emulation when directly writing into FPU context.
| 5. We do not use segmented addressing to access the save area, or rather, always address it using %ds basing.
| 6. XSAVEOPT can be only executed in the area which was previously loaded with XRSTOR, since the context switch code checks for FPU use by the outgoing thread before saving, and a thread which stopped emulation forcibly gets its context loaded with XRSTOR.
| 7. The PCB cannot be paged out while FPU emulation is turned off, since the stack of the executing thread is never swapped out.
| The context switch code is patched to issue XSAVEOPT instead of XSAVE if supported. This approach eliminates one conditional in the context switch code, which would be needed otherwise.
| For user-visible machine context to have proper data, fpugetregs() checks for unsaved extension blocks and manually copies pristine FPU state into them, according to the description provided by CPUID leaf 0xd.
| MFC after: 1 month
* Use assembler mnemonic instead of manually assembling, continuation of r238142.kib2012-07-061-6/+3
| | | | | Reviewed by: jhb MFC after: 1 month
* - Remove unused code for CR3 and CR4.jkim2012-06-131-2/+3
| | | | - Fix few style(9) nits while I am here.
* - Fix resumectx() prototypes to reflect reality.jkim2012-06-131-1/+1
| | | | | - For i386, simply jump to resumectx() with PCB in %ecx. - Fix a style(9) nit while I am here.
* Add x86/acpica/acpi_wakeup.c for amd64 and i386. Difference ofiwasaki2012-06-091-0/+159
| | | | | | | | | | | | | | | | | | | | | | suspend/resume procedures are minimized among them. common: - Add global cpuset suspended_cpus to indicate APs are suspended/resumed. - Remove acpi_waketag and acpi_wakemap from acpivar.h (no longer used). - Add some variables in acpi_wakecode.S in order to minimize the difference among amd64 and i386. - Disable load_cr3() because now CR3 is restored in resumectx(). amd64: - Add suspend/resume related members (such as MSR) in PCB. - Modify savectx() for above new PCB members. - Merge acpi_switch.S into cpu_switch.S as resumectx(). i386: - Merge(and remove) suspendctx() into savectx() in order to match with amd64 code. Reviewed by: attilio@, acpi@
* Update incorrect comment.jhb2012-02-271-1/+1
|
* Add support for the extended FPU states on amd64, both for nativekib2012-01-211-6/+24
| 64bit and 32bit ABIs. As a side-effect, it enables AVX on capable CPUs. In particular:
| - Query the CPU support for XSAVE, list of the supported extensions and the required size of FPU save area. The hw.use_xsave tunable is provided for disabling XSAVE, and hw.xsave_mask may be used to select the enabled extensions.
| - Remove the FPU save area from PCB and dynamically allocate the (run-time sized) user save area on the top of the kernel stack, right above the PCB. Reorganize the thread0 PCB initialization to postpone it after BSP is queried for save area size.
| - The dumppcb, stoppcbs and susppcbs now do not carry the FPU state as well. FPU state is only useful for suspend, where it is saved in dynamically allocated suspfpusave area.
| - Use XSAVE and XRSTOR to save/restore FPU state, if supported and enabled.
| - Define new mcontext_t flag _MC_HASFPXSTATE, indicating that mcontext_t has a valid pointer to out-of-struct extended FPU state. Signal handlers are supplied with stack-allocated fpu state. The sigreturn(2) and setcontext(2) syscalls honour the flag, allowing the signal handlers to inspect and manipulate extended state in the interrupted context.
| - The getcontext(2) never returns extended state, since there is no place in the fixed-sized mcontext_t to place a variable-sized save area. And, since mcontext_t is embedded into ucontext_t, this is impossible to fix in a reasonable way. Instead of extending the getcontext(2) syscall, provide a sysarch(2) facility to query extended FPU state.
| - Add ptrace(2) support for getting and setting extended state; while there, implement missed PT_I386_{GET,SET}XMMREGS for 32bit binaries.
| - Change fpu_kern KPI to not expose struct fpu_kern_ctx layout to consumers, making it opaque. Internally, struct fpu_kern_ctx now contains a space for the extended state. Convert in-kernel consumers of fpu_kern KPI both on i386 and amd64.
First version of the support for AVX was submitted by Tim Bird <tim.bird am sony com> on behalf of Sony. This version was written from scratch. Tested by: pho (previous version), Yamagi Burmeister <lists yamagi org> MFC after: 1 month
* Increase size of pcb_flags to four bytes.jkim2010-12-221-3/+3
| | | | Requested by: bde, jhb
* Improve PCB flags handling and make it more robust. Add two new functionsjkim2010-12-221-3/+3
| | | | | | | | | | | | | | | | for manipulating pcb_flags. These inline functions are very similar to atomic_set_char(9) and atomic_clear_char(9) but without unnecessary LOCK prefix for SMP. Add comments about the rationale[1]. Use these functions wherever possible. Although there are some places where it is not strictly necessary (e.g., a PCB is copied to create a new PCB), it is done across the board for sake of consistency. Turn pcb_full_iret into a PCB flag as it is safe now. Move rarely used fields before pcb_flags and reduce size of pcb_flags to one byte. Fix some style(9) nits in pcb.h while I am in the neighborhood. Reviewed by: kib Submitted by: kib[1] MFC after: 2 months
* Change ambiguous (or invalid, depending on how strict you want to be :)dim2010-11-241-1/+1
| | | | | | | | assembly instruction "movw %rcx,2(%rax)" to "movw %cx,2(%rax)", since the intent was to move 16 bits of data, in this case. Found by: clang Reviewed by: kib
* Save MSR_FSBASE, MSR_GSBASE and MSR_KGSBASE directly to PCB as we do not usejkim2010-08-301-9/+6
| | | | these values in the function.
* savectx() has not been used for fork(2) for about 15 years. [1]jkim2010-08-031-60/+32
| | | | | | | Do not clobber FPU thread's PCB as it is more harmful. When we resume CPU, unconditionally reload FPU state. Pointed out by: bde [1]
* - Merge savectx2() with savectx() and struct xpcb with struct pcb. [1]jkim2010-08-021-90/+55
| savectx() is only used for panic dump (dumppcb) and kdb (stoppcbs). Thus, saving additional information does not hurt and it may be even beneficial. Unfortunately, struct pcb has grown larger to accommodate more data. Move 512-byte long pcb_user_save to the end of struct pcb while I am here.
| - savectx() now saves FPU state unconditionally and copies it to the PCB of the FPU thread if necessary. This gives panic dump and kdb a chance to take a look at the current FPU state even if the FPU is "supposedly" not used.
| - Resuming CPU now unconditionally reinitializes FPU. If the saved FPU state was irrelevant, it could be in an unknown state.
| Suggested by: bde [1]
* Fix another fallout from r208833. savectx() is used to save CPU contextjkim2010-07-291-1/+1
| for crash dump (dumppcb) and kdb (stoppcbs). In both cases, there cannot be a valid pointer in pcb_save. This should restore the previous behaviour.
* Rename PCB_USER_FPU to PCB_USERFPU not to clash with a macro from fpu.h.jkim2010-07-291-1/+1
|
* Re-implement FPU suspend/resume for amd64. This removes superfluous usesjkim2010-07-261-0/+7
| | | | | | | of critical_enter(9) and critical_exit(9) by fpugetregs() and fpusetregs(). Also, we do not touch PCB flags any more. MFC after: 1 month
* When switching the thread from the processor, store %dr7 contentkib2010-07-121-1/+1
| into the pcb before disabling watchpoints. Otherwise, when the thread is restored on a processor, watchpoints are still disabled.
| Submitted by: Tijl Coosemans <tijl coosemans org> (I would be much happier if Tijl committed this himself)
| MFC after: 1 week
* Correctly maintain the per-cpu field "curpmap" on amd64 just like wealc2010-07-081-12/+10
| | | | | | | | | | | | | | | | | do on i386. The consequences of not doing so on amd64 became apparent with the introduction of the COUNT_IPIS and COUNT_XINVLTLB_HITS options. Specifically, single-threaded applications were generating unnecessary IPIs to shoot-down the TLB on other processors. However, this is clearly nonsensical because a single-threaded application is only running on the current processor. The reason that this happens is that pmap_activate() is unable to properly update the old pmap's field "pm_active" without the correct "curpmap". So, in effect, stale bits in "pm_active" were leading pmap_protect(), pmap_remove(), pmap_remove_pages(), etc. to flush the TLB contents on some arbitrary processor that wasn't even running the same application. Reviewed by: kib MFC after: 3 weeks
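The dependence of "pm_active" on a correct "curpmap" can be illustrated with a small C sketch. It is a simplified model under assumptions, not the kernel code: the struct and function names are hypothetical, the cpu set is a plain 64-bit mask, and the atomics a real kernel would need are omitted. It only shows why the switch path must know the outgoing pmap in order to clear this cpu's bit, which is what keeps later shootdowns from targeting processors that no longer run the address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified pmap: one bit per cpu currently using it. */
struct pmap_sketch {
	uint64_t pm_active;
};

/* Hypothetical per-cpu data with the field the commit keeps up to date. */
struct pcpu_sketch {
	unsigned int cpuid;
	struct pmap_sketch *curpmap;
};

/*
 * Switch this cpu to a new pmap. Without a correct curpmap the old
 * pmap's bit could not be cleared and would remain as a stale entry.
 */
static void
pmap_activate_sketch(struct pcpu_sketch *pc, struct pmap_sketch *newpm)
{
	assert(pc->cpuid < 64);
	uint64_t bit = UINT64_C(1) << pc->cpuid;

	if (pc->curpmap != NULL)
		pc->curpmap->pm_active &= ~bit;
	newpm->pm_active |= bit;
	pc->curpmap = newpm;
}

/* Nonzero when some cpu other than the caller still has the pmap active. */
static int
needs_remote_shootdown(const struct pcpu_sketch *pc,
    const struct pmap_sketch *pm)
{
	uint64_t self = UINT64_C(1) << pc->cpuid;

	return ((pm->pm_active & ~self) != 0);
}
```

With the bookkeeping done this way, a single-threaded process that has only ever been active on the current cpu yields an empty remote set, so no IPI would be needed in this model.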
* Introduce the x86 kernel interfaces to allow kernel code to usekib2010-06-051-3/+3
| | | | | | | | | | | | | | | | FPU/SSE hardware. Caller should provide a save area that is chained into the stack of the areas; pcb save_area for usermode FPU state is on top. The pcb now contains a pointer to the current FPU saved area, used during FPUDNA handling and context switches. There is also a facility to allow the kernel thread to use pcb save_area. Change the dreaded warnings "npxdna in kernel mode!" into the panics when FPU usage is not registered. KPI discussed with: fabient Tested by: pho, fabient Hardware provided by: Sentex Communications MFC after: 1 month
* Restore the segment registers and segment base MSRs for amd64 syscallkib2009-07-091-0/+1
| return path only when neither the thread was context switched while executing syscall code nor the syscall explicitly modified LDT or MSRs.
| Save segment registers in trap handlers before interrupts are enabled, to not allow context switches to happen before registers are saved. Use a separate byte in pcb for indication of fast/full return, since pcb_flags are not synchronized with context switches.
| The change puts back syscall microbenchmark numbers that were slowed down after commit of the support for LDT on amd64.
| Reviewed by: jeff
| Tested (and tested, and tested ...) by: pho
| Approved by: re (kensmith)
* Save and restore segment registers on amd64 when entering and leavingkib2009-04-011-94/+59
| the kernel. Fill and read segment registers for mcontext and signals. Handle traps caused by restoration of the invalidated selectors.
| Implement user-mode creation and manipulation of the process-specific LDT descriptors for amd64, see sysarch(2).
| Implement support for TSS i/o port access permission bitmap for amd64.
| Context-switch LDT and TSS. Do not save and restore segment registers on the context switch, that is handled by kernel enter/leave trampolines now. Remove segment restore code from the signal trampolines for freebsd/amd64, freebsd/ia32 and linux/i386 for the same reason.
| Implement amd64-specific compat shims for sysarch.
| Linuxolator (temporary ?) switched to use gsbase for thread_area pointer.
| TODO: Currently, gdb is not adapted to show segment registers from struct reg. Also, no machine-dependent ptrace command is added to set segment registers for the debugged process.
| In collaboration with: pho
| Discussed with: peter
| Reviewed by: jhb
| Linuxolator tested by: dchagin
* Initial suspend/resume support for amd64.jkim2009-03-171-2/+72
| | | | | | This code is heavily inspired by Takanori Watanabe's experimental SMP patch for i386 and large portion was shamelessly cut and pasted from Peter Wemm's AP boot code.
* Change some movl's to mov's. Newer GAS no longer accepts 'movl' instructionsobrien2009-01-311-8/+8
| | | | | | for moving between a segment register and a 32-bit memory location. Looked at by: jhb
* The context switch to the 32bit binary does not properly restorekib2009-01-201-1/+2
| | | | | | | | | | | the fsbase value. The switch loads the fs segment register, that invalidates the value in fsbase msr, thus value in %r9 can not be considered the current value for fsbase anymore. Unconditionally reload fsbase when switching to 32bit binary. PR: 130526 MFC after: 3 weeks
* The pcb_gs32p should be per-cpu, not per-thread pointer. This iskib2008-09-081-2/+2
| | | | | | | | location in GDT where the segment descriptor from pcb_gs32sd is copied, and the location is in GDT local to CPU. Noted and reviewed by: peter MFC after: 1 week
* - When executing FreeBSD/amd64 binaries from FreeBSD/i386 or Linux/i386kib2008-09-021-2/+18
| processes, clear PCB_32BIT and PCB_GS32BIT bits [1].
| - Reread the fs and gs bases from the msr unconditionally, not believing the values in pcb_fsbase and pcb_gsbase, since usermode may reload segment registers, invalidating the cache. [2].
| Both problems resulted in the wrong fs base, causing the wrong TLS pointer to be dereferenced in usermode.
| Reported and tested by: Vyacheslav Bocharov <adeepv at gmail com> [1]
| Reported by: Bernd Walter <ticsoat cicely7 cicely de>, Artem Belevich <fbsdlist at src cx> [2]
| Reviewed by: peter
| MFC after: 3 days