summaryrefslogtreecommitdiffstats
path: root/sys/amd64/include/pcb.h
Commit message (Collapse)AuthorAgeFilesLines
* - Remove unused code for CR3 and CR4.jkim2012-06-131-1/+1
| | | | - Fix few style(9) nits while I am here.
* - Fix resumectx() prototypes to reflect reality.jkim2012-06-131-2/+2
| | | | | - For i386, simply jump to resumectx() with PCB in %ecx. - Fix a style(9) nit while I am here.
* Add x86/acpica/acpi_wakeup.c for amd64 and i386. Difference ofiwasaki2012-06-091-1/+13
| | | | | | | | | | | | | | | | | | | | | | suspend/resume procedures are minimized among them. common: - Add global cpuset suspended_cpus to indicate APs are suspended/resumed. - Remove acpi_waketag and acpi_wakemap from acpivar.h (no longer used). - Add some variables in acpi_wakecode.S in order to minimize the difference among amd64 and i386. - Disable load_cr3() because now CR3 is restored in resumectx(). amd64: - Add suspend/resume related members (such as MSR) in PCB. - Modify savectx() for above new PCB members. - Merge acpi_switch.S into cpu_switch.S as resumectx(). i386: - Merge(and remove) suspendctx() into savectx() in order to match with amd64 code. Reviewed by: attilio@, acpi@
* Add a convenience macro for the returns_twice attribute, and apply it todim2012-04-291-1/+1
| | | | | | | the prototypes of the appropriate functions (getcontext, savectx, setjmp, sigsetjmp and vfork). MFC after: 2 weeks
* Add support for the extended FPU states on amd64, both for nativekib2012-01-211-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 64bit and 32bit ABIs. As a side-effect, it enables AVX on capable CPUs. In particular: - Query the CPU support for XSAVE, list of the supported extensions and the required size of FPU save area. The hw.use_xsave tunable is provided for disabling XSAVE, and hw.xsave_mask may be used to select the enabled extensions. - Remove the FPU save area from PCB and dynamically allocate the (run-time sized) user save area on the top of the kernel stack, right above the PCB. Reorganize the thread0 PCB initialization to postpone it after BSP is queried for save area size. - The dumppcb, stoppcbs and susppcbs now do not carry the FPU state as well. FPU state is only useful for suspend, where it is saved in dynamically allocated suspfpusave area. - Use XSAVE and XRSTOR to save/restore FPU state, if supported and enabled. - Define new mcontext_t flag _MC_HASFPXSTATE, indicating that mcontext_t has a valid pointer to out-of-struct extended FPU state. Signal handlers are supplied with stack-allocated fpu state. The sigreturn(2) and setcontext(2) syscall honour the flag, allowing the signal handlers to inspect and manipilate extended state in the interrupted context. - The getcontext(2) never returns extended state, since there is no place in the fixed-sized mcontext_t to place variable-sized save area. And, since mcontext_t is embedded into ucontext_t, makes it impossible to fix in a reasonable way. Instead of extending getcontext(2) syscall, provide a sysarch(2) facility to query extended FPU state. - Add ptrace(2) support for getting and setting extended state; while there, implement missed PT_I386_{GET,SET}XMMREGS for 32bit binaries. - Change fpu_kern KPI to not expose struct fpu_kern_ctx layout to consumers, making it opaque. Internally, struct fpu_kern_ctx now contains a space for the extended state. Convert in-kernel consumers of fpu_kern KPI both on i386 and amd64. First version of the support for AVX was submitted by Tim Bird <tim.bird am sony com> on behalf of Sony. This version was written from scratch. Tested by: pho (previous version), Yamagi Burmeister <lists yamagi org> MFC after: 1 month
* Increase size of pcb_flags to four bytes.jkim2010-12-221-7/+7
| | | | Requested by: bde, jhb
* Improve PCB flags handling and make it more robust. Add two new functionsjkim2010-12-221-10/+42
| | | | | | | | | | | | | | | | for manipulating pcb_flags. These inline functions are very similar to atomic_set_char(9) and atomic_clear_char(9) but without unnecessary LOCK prefix for SMP. Add comments about the rationale[1]. Use these functions wherever possible. Although there are some places where it is not strictly necessary (e.g., a PCB is copied to create a new PCB), it is done across the board for sake of consistency. Turn pcb_full_iret into a PCB flag as it is safe now. Move rarely used fields before pcb_flags and reduce size of pcb_flags to one byte. Fix some style(9) nits in pcb.h while I am in the neighborhood. Reviewed by: kib Submitted by: kib[1] MFC after: 2 months
* Retire write-only PCB_FULLCTX pcb flag on amd64.kib2010-12-071-1/+0
| | | | | | Reminded by: Petr Salinger <Petr.Salinger seznam cz> Tested by: pho MFC after: 1 week
* Rearrange struct pcb. r177532 (CVS r1.64 of pcb.h) moved pcb_flags to makejkim2010-08-021-9/+9
| | | | | better use of cache lines by placing it before pcb_save (now pcb_user_save), which is moved to the end of pcb since r210777.
* - Merge savectx2() with savectx() and struct xpcb with struct pcb. [1]jkim2010-08-021-22/+19
| | | | | | | | | | | | | | savectx() is only used for panic dump (dumppcb) and kdb (stoppcbs). Thus, saving additional information does not hurt and it may be even beneficial. Unfortunately, struct pcb has grown larger to accommodate more data. Move 512-byte long pcb_user_save to the end of struct pcb while I am here. - savectx() now saves FPU state unconditionally and copy it to the PCB of FPU thread if necessary. This gives panic dump and kdb a chance to take a look at the current FPU state even if the FPU is "supposedly" not used. - Resuming CPU now unconditionally reinitializes FPU. If the saved FPU state was irrelevant, it could be in an unknown state. Suggested by: bde [1]
* Introduce the x86 kernel interfaces to allow kernel code to usekib2010-06-051-1/+4
| | | | | | | | | | | | | | | | FPU/SSE hardware. Caller should provide a save area that is chained into the stack of the areas; pcb save_area for usermode FPU state is on top. The pcb now contains a pointer to the current FPU saved area, used during FPUDNA handling and context switches. There is also a facility to allow the kernel thread to use pcb save_area. Change the dreaded warnings "npxdna in kernel mode!" into the panics when FPU usage is not registered. KPI discussed with: fabient Tested by: pho, fabient Hardware provided by: Sentex Communications MFC after: 1 month
* Restore the segment registers and segment base MSRs for amd64 syscallkib2009-07-091-1/+2
| | | | | | | | | | | | | | | | | return path only when neither thread was context switched while executing syscall code nor syscall explicitely modified LDT or MSRs. Save segment registers in trap handlers before interrupts are enabled, to not allow context switches to happen before registers are saved. Use separated byte in pcb for indication of fast/full return, since pcb_flags are not synchronized with context switches. The change puts back syscall microbenchmark numbers that were slowed down after commit of the support for LDT on amd64. Reviewed by: jeff Tested (and tested, and tested ...) by: pho Approved by: re (kensmith)
* Garbage collect unused stack segment since r190620.jkim2009-04-011-1/+0
|
* Save and restore segment registers on amd64 when entering and leavingkib2009-04-011-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the kernel on amd64. Fill and read segment registers for mcontext and signals. Handle traps caused by restoration of the invalidated selectors. Implement user-mode creation and manipulation of the process-specific LDT descriptors for amd64, see sysarch(2). Implement support for TSS i/o port access permission bitmap for amd64. Context-switch LDT and TSS. Do not save and restore segment registers on the context switch, that is handled by kernel enter/leave trampolines now. Remove segment restore code from the signal trampolines for freebsd/amd64, freebsd/ia32 and linux/i386 for the same reason. Implement amd64-specific compat shims for sysarch. Linuxolator (temporary ?) switched to use gsbase for thread_area pointer. TODO: Currently, gdb is not adapted to show segment registers from struct reg. Also, no machine-depended ptrace command is added to set segment registers for debugged process. In collaboration with: pho Discussed with: peter Reviewed by: jhb Linuxolator tested by: dchagin
* Initial suspend/resume support for amd64.jkim2009-03-171-0/+14
| | | | | | This code is heavily inspired by Takanori Watanabe's experimental SMP patch for i386 and large portion was shamelessly cut and pasted from Peter Wemm's AP boot code.
* A better fix for handling different FPU initial control words for differentjhb2009-03-051-0/+1
| | | | | | | | | | | | | | | | ABIs: - Store the FPU initial control word in the pcb for each thread. - When first using the FPU, load the initial control word after restoring the clean state if it is not the standard control word. - Provide a correct control word for Linux/i386 binaries under FreeBSD/amd64. - Adjust the control word returned for fpugetregs()/npxgetregs() when a thread hasn't used the FPU yet to reflect the real initial control word for the current ABI. - The Linux/i386 ABI for FreeBSD/i386 now properly sets the right control word instead of trashing whatever the current state of the FPU is. Reviewed by: bde
* Move the PCB flag macros up next to the 'pcb_flags' member in the struct.jhb2009-03-051-5/+6
|
* The pcb_gs32p should be per-cpu, not per-thread pointer. This iskib2008-09-081-1/+0
| | | | | | | | location in GDT where the segment descriptor from pcb_gs32sd is copied, and the location is in GDT local to CPU. Noted and reviewed by: peter MFC after: 1 week
* Bring back the save/restore of the %ds, %es, %fs and %gs registers forkib2008-07-301-0/+1
| | | | | | | | | | | | | | | the 32bit images on amd64. Change the semantic of the PCB_32BIT pcb flag to request the context switch code to operate on the segment registers. Its previous meaning of saving or restoring the %gs base offset is assigned to the new PCB_GS32BIT flag. FreeBSD 32bit image activator sets the PCB_32BIT flag, while Linux 32bit emulation sets PCB_32BIT | PCB_GS32BIT. Reviewed by: peter MFC after: 2 weeks
* Move pcb_flags to make trivially better use of cache lines.peter2008-03-231-1/+1
|
* MFP4: Linux set_thread_area syscall (aka TLS) support for amd64.jkim2007-03-301-0/+5
| | | | | | Initial version was submitted by Divacky Roman and mostly rewritten by me. Tested by: emulation
* I believe the stack underflows during early development that caused me topeter2005-09-271-1/+0
| | | | | add spare padding at the beginning of the pcb are long gone. Remove the padding fields.
* Kill pcb_rflags. It served no purpose.peter2005-09-271-1/+0
| | | | Reported by: bde
* Implement makectx(). The makectx() function is used by KDB to createmarcel2004-07-101-0/+3
| | | | | | | | | a PCB from a trapframe for purposes of unwinding the stack. The PCB is used as the thread context and all but the thread that entered the debugger has a valid PCB. This function can also be used to create a context for the threads running on the CPUs that have been stopped when the debugger got entered. This however is not done at the time of this commit.
* Checkpoint some of what I was starting to tinker with for having somepeter2004-05-161-0/+1
| | | | | | | | | | | different context support for 32 vs 64 bit processes. This simply omits the save/restore of the segment selector registers for non 32 bit processes. This avoids the rdmsr/rwmsr juggling when restoring %gs clobbers the kernel msr that holds the gsbase. However, I suspect it might be better to conditionally do this at user<->kernel transition where we wouldn't need to do the juggling in the first place. Or have per-thread extended context save/restore hooks.
* Remove advertising clause from University of California Regent's license,imp2004-04-051-4/+0
| | | | | | per letter dated July 22, 1999 and email from Peter Wemm. Approved by: core, peter
* Add dbreg struct definitions for /proc/*/dbregs and a place to store thepeter2004-01-281-2/+9
| | | | registers in the pcb
* Update the graffiti.peter2003-11-081-3/+4
|
* The great s/npx/fpu/gipeter2003-11-081-1/+1
|
* Rename npx* to fpu*. I haven't done the flags/function names yet.peter2003-11-081-5/+4
|
* Collect the nastiness for preserving the kernel MSR_GSBASE around thepeter2003-05-151-1/+0
| | | | | | | | | | load_gs() calls into a single place that is less likely to go wrong. Eliminate the per-process context switching of MSR_GSBASE, because it should be constant for a single cpu. Instead, save/restore it during the loading of the new %gs selector for the new process. Approved by: re (amd64/* blanket)
* Add BASIC i386 binary support for the amd64 kernel. This is largelypeter2003-05-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | stolen from the ia64/ia32 code (indeed there was a repocopy), but I've redone the MD parts and added and fixed a few essential syscalls. It is sufficient to run i386 binaries like /bin/ls, /usr/bin/id (dynamic) and p4. The ia64 code has not implemented signal delivery, so I had to do that. Before you say it, yes, this does need to go in a common place. But we're in a freeze at the moment and I didn't want to risk breaking ia64. I will sort this out after the freeze so that the common code is in a common place. On the AMD64 side, this required adding segment selector context switch support and some other support infrastructure. The %fs/%gs etc code is hairy because loading %gs will clobber the kernel's current MSR_GSBASE setting. The segment selectors are not used by the kernel, so they're only changed at context switch time or when changing modes. This still needs to be optimized. Approved by: re (amd64/* blanket)
* Give a %fs and %gs to userland. Use swapgs to obtain the kernel %GS.basepeter2003-05-121-0/+2
| | | | | | | | | | | | | | | | | value on entry and exit. This isn't as easy as it sounds because when we recursively trap or interrupt, we have to avoid duplicating the swapgs instruction or we end up back with the userland %gs. I implemented this by testing TF_CS to see if we're coming from supervisor mode already, and check for returning to supervisor. To avoid a race with interrupts in the brief period after beginning executing the handler and before the swapgs, convert all trap gates to interrupt gates, and reenable interrupts immediately after the swapgs. I am not happy with this. There are other possible ways to do this that should be investigated. (eg: storing the GS.base MSR value in the trapframe) Add some sysarch functions to let the userland code get to this. Approved by: re (blanket amd64/*)
* Commit MD parts of a loosely functional AMD64 port. This is based onpeter2003-05-011-25/+16
| | | | | | | | | | | | | | | | | | | | | | a heavily stripped down FreeBSD/i386 (brutally stripped down actually) to attempt to get a stable base to start from. There is a lot missing still. Worth noting: - The kernel runs at 1GB in order to cheat with the pmap code. pmap uses a variation of the PAE code in order to avoid having to worry about 4 levels of page tables yet. - It boots in 64 bit "long mode" with a tiny trampoline embedded in the i386 loader. This simplifies locore.s greatly. - There are still quite a few fragments of i386-specific code that have not been translated yet, and some that I cheated and wrote dumb C versions of (bcopy etc). - It has both int 0x80 for syscalls (but using registers for argument passing, as is native on the amd64 ABI), and the 'syscall' instruction for syscalls. int 0x80 preserves all registers, 'syscall' does not. - I have tried to minimize looking at the NetBSD code, except in a couple of places (eg: to find which register they use to replace the trashed %rcx register in the syscall instruction). As a result, there is not a lot of similarity. I did look at NetBSD a few times while debugging to get some ideas about what I might have done wrong in my first attempt.
* 1.Fix smp race between kernel vm86 BIOS calling and userland vm86 mode code,davidxu2002-11-071-0/+2
| | | | | | | | | | remove global variable in_vm86call, set vm86 calling flag in PCB flags. 2.Fix vm86 BIOS calling preempted problem by changing vm86_lock mutex type from MTX_DEF to MTX_SPIN. vm86pcb is not remembered in thread struct, when the thread calling vm86 BIOS is preempted by interrupt thread, and later switching back to the thread would cause incorrect context be loaded into CPU registers, this leads to kernel crash.
* The a.out md_coredump stuff isn't referenced anywhere anymore, andpeter2002-10-151-10/+0
| | | | hasn't been filled in for ages.. Nuked.
* It is too much work convincing lint why we would want empty structures,phk2002-10-011-0/+3
| | | | so make the non-empty #ifdef lint.
* Add kernel support needed for the KSE-aware libpthread:mini2002-09-161-0/+1
| | | | | | | | - Maintain fpu state across signals. - Save and restore FPU state properly in ucontext_t's. Reviewed by: deischen, julian Approved by: -arch
* Compromise for critical*()/cpu_critical*() recommit. Cleanup the interruptdillon2002-03-271-1/+2
| | | | | | | | | | | | | | | | | | | disablement assumptions in kern_fork.c by adding another API call, cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it from MI to MD. Temporarily move cpu_critical*() from <arch>/include/cpufunc.h to <arch>/<arch>/critical.c (stage-2 will clean this up). Implement interrupt deferral for i386 that allows interrupts to remain enabled inside critical sections. This also fixes an IPI interlock bug, and requires uses of icu_lock to be enclosed in a true interrupt disablement. This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized, and will move cpu_critical*() into its own header file(s) + other things. This commit may break non-i386 architectures in trivial ways. This should be temporary. Reviewed by: core Approved by: core
* Remove __P.alfred2002-03-201-1/+1
|
* revert last commit temporarily due to whining on the lists.dillon2002-02-261-2/+1
|
* STAGE-1 of 3 commit - allow (but do not require) interrupts to remaindillon2002-02-261-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled in critical sections and streamline critical_enter() and critical_exit(). This commit allows an architecture to leave interrupts enabled inside critical sections if it so wishes. Architectures that do not wish to do this are not effected by this change. This commit implements the feature for the I386 architecture and provides a sysctl, debug.critical_mode, which defaults to 1 (use the feature). For now you can turn the sysctl on and off at any time in order to test the architectural changes or track down bugs. This commit is just the first stage. Some areas of the code, specifically the MACHINE_CRITICAL_ENTER #ifdef'd code, is strictly temporary and will be cleaned up in the STAGE-2 commit when the critical_*() functions are moved entirely into MD files. The following changes have been made: * critical_enter() and critical_exit() for I386 now simply increment and decrement curthread->td_critnest. They no longer disable hard interrupts. When critical_exit() decrements the counter to 0 it effectively calls a routine to deal with whatever interrupts were deferred during the time the code was operating in a critical section. Other architectures are unaffected. * fork_exit() has been conditionalized to remove MD assumptions for the new code. Old code will still use the old MD assumptions in regards to hard interrupt disablement. In STAGE-2 this will be turned into a subroutine call into MD code rather then hardcoded in MI code. The new code places the burden of entering the critical section in the trampoline code where it belongs. * I386: interrupts are now enabled while we are in a critical section. The interrupt vector code has been adjusted to deal with the fact. If it detects that we are in a critical section it currently defers the interrupt by adding the appropriate bit to an interrupt mask. * In order to accomplish the deferral, icu_lock is required. This is i386-specific. Thus icu_lock can only be obtained by mainline i386 code while interrupts are hard disabled. This change has been made. * Because interrupts may or may not be hard disabled during a context switch, cpu_switch() can no longer simply assume that PSL_I will be in a consistent state. Therefore, it now saves and restores eflags. * FAST INTERRUPT PROVISION. Fast interrupts are currently deferred. The intention is to eventually allow them to operate either while we are in a critical section or, if we are able to restrict the use of sched_lock, while we are not holding the sched_lock. * ICU and APIC vector assembly for I386 cleaned up. The ICU code has been cleaned up to match the APIC code in regards to format and macro availability. Additionally, the code has been adjusted to deal with deferred interrupts. * Deferred interrupts use a per-cpu boolean int_pending, and masks ipending, spending, and fpending. Being per-cpu variables it is not currently necessary to lock; bus cycles modifying them. Note that the same mechanism will enable preemption to be incorporated as a true software interrupt without having to further hack up the critical nesting code. * Note: the old critical_enter() code in kern/kern_switch.c is currently #ifdef to be compatible with both the old and new methodology. In STAGE-2 it will be moved entirely to MD code. Performance issues: One of the purposes of this commit is to enhance critical section performance, specifically to greatly reduce bus overhead to allow the critical section code to be used to protect per-cpu caches. These caches, such as Jeff's slab allocator work, can potentially operate very quickly making the effective savings of the new critical section code's performance very significant. The second purpose of this commit is to allow architectures to enable certain interrupts while in a critical section. Specifically, the intention is to eventually allow certain FAST interrupts to operate rather then defer. The third purpose of this commit is to begin to clean up the critical_enter()/critical_exit()/cpu_critical_enter()/ cpu_critical_exit() API which currently has serious cross pollution in MI code (in fork_exit() and ast() for example). The fourth purpose of this commit is to provide a framework that allows kernel-preempting software interrupts to be implemented cleanly. This is currently used for two forward interrupts in I386. Other architectures will have the choice of using this infrastructure or building the functionality directly into critical_enter()/ critical_exit(). Finally, this commit is designed to greatly improve the flexibility of various architectures to manage critical section handling, software interrupts, preemption, and other highly integrated architecture-specific details.
* Changed the type of pcb_flags from u_char to u_int and adjusted things.bde2002-01-171-1/+1
| | | | | This removes the only atomic operation on a char type in the entire kernel.
* Split the per-process Local Descriptor Table out of the PCB and intojhb2001-10-251-1/+0
| | | | | | | struct mdproc. Submitted by: Andrew R. Reiter <arr@watson.org> Silence on: -current
* The #define for pcb_savefpu seems to do more harm than good.peter2001-07-121-1/+0
|
* Activate SSE/SIMD. This is the extra context switching support thatpeter2001-07-121-1/+2
| | | | | | | | | | | | | | | we are required to do if we let user processes use the extra 128 bit registers etc. This is the base part of the diff I got from: http://www.issei.org/issei/FreeBSD/sse.html I believe this is by: Mr. SUZUKI Issei <issei@issei.org> SMP support apparently by: Takekazu KATO <kato@chino.it.okayama-u.ac.jp> Test code by: NAKAMURA Kazushi <kaz@kobe1995.net>, see http://kobe1995.net/~kaz/FreeBSD/SSE.en.html I have fixed a couple of style(9) deviations. I have some followup commits to fix a couple of non-style things.
* Convert npx interrupts into traps instead of vice versa. This is muchbde2001-05-221-0/+1
| | | | | | | | simpler for npx exceptions that start as traps (no assembly required...) and works better for npx exceptions that start as interrupts (there is no longer a problem for nested interrupts). Submitted by: original (pre-SMPng) version by luoqi
* Activate USER_LDT by default. The new thread libraries are going topeter2001-02-231-4/+0
| | | | | | | | depend on this. The linux ABI emulator tries to use it for some linux binaries too. VM86 had a bigger cost than this and it was made default a while ago. Reviewed by: jhb, imp
* - Don't call clear_resched() in userret(), instead, clear the resched flagjhb2001-02-201-1/+0
| | | | | | | | | | | | in mi_switch() just before calling cpu_switch() so that the first switch after a resched request will satisfy the request. - While I'm at it, move a few things into mi_switch() and out of cpu_switch(), specifically set the p_oncpu and p_lastcpu members of proc in mi_switch(), and handle the sched_lock state change across a context switch in mi_switch(). - Since cpu_switch() no longer handles the sched_lock state change, we have to setup an initial state for sched_lock in fork_exit() before we release it.
* Declare or #define per-cpu globals in <machine/globals.h> in all cases.bde2000-10-271-6/+0
| | | | The i386 UP case was messily different.
OpenPOWER on IntegriCloud