path: root/sys/amd64
Commit message | Author | Age | Files | Lines
* Rename the IVY_RNG option to RDRAND_RNG.kib2012-09-131-1/+1
| | | | | Based on submission by: Arthur Mesh <arthurmesh@gmail.com> MFC after: 2 weeks
* Simplify pmap_unmapdev(). Since kmem_free() eventually calls pmap_remove(),alc2012-09-101-4/+1
| | | | | | | | | | | | | pmap_unmapdev()'s own direct efforts to destroy the page table entries are redundant, so eliminate them. Don't set PTE_W on the page table entry in pmap_kenter{,_attr}() on MIPS. Setting PTE_W on MIPS is inconsistent with the implementation of this function on other architectures. Moreover, PTE_W should not be set, unless the pmap's wired mapping count is incremented, which pmap_kenter{,_attr}() doesn't do. MFC after: 10 days
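The entry above ties PTE_W to the pmap's wired-mapping count. A minimal userland sketch of that invariant, with hypothetical flag values and structure names (not the real FreeBSD pmap):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical PTE flags and pmap, for illustration only. */
#define PTE_V 0x1   /* valid */
#define PTE_W 0x2   /* wired */

struct pmap {
    int wired_count;    /* count of wired mappings in this pmap */
};

/*
 * Sketch of the invariant the commit describes: a PTE with PTE_W set
 * and the pmap's wired-mapping count must change together.  A
 * pmap_kenter()-style helper creates unwired mappings, so it must
 * neither set PTE_W nor touch the count.
 */
static uint64_t
make_pte(struct pmap *pm, uint64_t pa, int wired)
{
    uint64_t pte = pa | PTE_V;

    if (wired) {
        pte |= PTE_W;
        pm->wired_count++;  /* PTE_W set => count incremented */
    }
    return pte;
}
```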
* userret() already checks for td_locks when INVARIANTS is enabled, soattilio2012-09-081-1/+0
| | | | | | | there is no need to check if Giant is acquired after it. Reviewed by: kib MFC after: 1 week
* Add support for new Intel on-CPU Bull Mountain random numberkib2012-09-051-0/+2
| | | | | | | | | | | | | | | | | | | generator, found on IvyBridge and supposedly later CPUs, accessible with RDRAND instruction. From the Intel whitepapers and articles about Bull Mountain, it seems that we do not need to perform post-processing of RDRAND results, like AES-encryption of the data with random IV and keys, which was done for Padlock. Intel claims that sanitization is performed in hardware. Make both Padlock and Bull Mountain random generators support code covered by kernel config options, for the benefit of people who prefer minimal kernels. Also add the tunables to disable hardware generator even if detected. Reviewed by: markm, secteam (simon) Tested by: bapt, Michael Moll <kvedulv@kvedulv.de> MFC after: 3 weeks
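The entry above puts both hardware generators behind kernel config options. A hypothetical kernel-config excerpt showing how a minimal kernel would opt in (RDRAND_RNG is the option name after the rename at the top of this log; PADLOCK_RNG is the presumed companion option name for the Padlock driver):

```
# Hardware random number generators (optional, for minimal kernels)
options 	PADLOCK_RNG		# VIA Padlock RNG
options 	RDRAND_RNG		# Intel Bull Mountain (RDRAND) RNG
```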
* Rename {_,}pmap_unwire_pte_hold() to {_,}pmap_unwire_ptp() and update thealc2012-09-051-22/+22
| | | | | | | | | | comment describing them. Both the function names and the comment had grown stale. Quite some time has passed since these pmap implementations last used the page's hold count to track the number of valid mappings within a page table page. Also, returning TRUE from pmap_unwire_ptp() rather than _pmap_unwire_ptp() eliminates a few instructions from callers like pmap_enter_quick_locked() where pmap_unwire_ptp()'s return value is used directly by a conditional statement.
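The entry above notes that pmap_unwire_ptp() returns TRUE so callers can branch on it directly. A minimal sketch of that shape in portable C, with hypothetical names standing in for the page-table-page machinery:

```c
#include <stdbool.h>

/*
 * Hypothetical page-table-page descriptor; ref_count stands in for the
 * wire count the commit mentions (no longer the page's hold count).
 */
struct ptpage {
    int  ref_count;     /* number of valid mappings in this PT page */
    bool freed;         /* set once the page has been released */
};

/* Internal helper that actually frees the page, like _pmap_unwire_ptp(). */
static void
_unwire_ptp(struct ptpage *ptp)
{
    ptp->freed = true;
}

/*
 * Drop one mapping's reference; return true only when the page was
 * freed, so callers can use the result directly in a conditional, as
 * the commit notes for pmap_enter_quick_locked().
 */
static bool
unwire_ptp(struct ptpage *ptp)
{
    if (--ptp->ref_count == 0) {
        _unwire_ptp(ptp);
        return true;
    }
    return false;
}
```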
* Add hpt27xx to GENERIC kernel for amd64 and i386 systems.delphij2012-09-041-0/+1
| | | | MFC after: 2 weeks
* Fix duplicate entries for mwl(4):jhb2012-09-041-6/+1
| | | | | | | - Move mwlfw from {amd64,i386}/conf/NOTES to sys/conf/NOTES (mwl(4) is already present in sys/conf/NOTES). - Remove duplicate mwl(4) entries from {amd64,i386}/conf/NOTES. - While here, add a description to the sfxge line in amd64/conf/NOTES.
* Fix misspelled "Infiniband".jhb2012-08-281-1/+1
| | | | | Submitted by: gcooper MFC after: 3 days
* Grammar fix: s/NIC's/NICs/gjb2012-08-261-1/+1
| | | | MFC after: 3 days
* As discussed on -current, remove the hardcoded default maxswzone.des2012-08-141-8/+0
| | | | MFC after: 3 weeks
* Add a hackish debugging facility to provide a bit of information aboutkib2012-08-141-2/+20
| | | | | | | | | | | | | | the reason for a generated trap. The dump of basic signal information and 8 bytes of the faulting instruction are printed on the controlling terminal of the process, if the machdep.uprintf_signal sysctl is enabled. The print is the only practical way to debug traps from a.out processes I am aware of. Because I have to reimplement it each time I debug an issue with a.out support on amd64, commit the hack to main tree. MFC after: 1 week
* Real hardware, as opposed to QEMU, does not allow a call gatekib2012-08-142-6/+27
| | | | | | | | | | | | in long mode which transfers control to a 32bit code segment. Unbreak the lcall $7,$0 implementation on amd64 by putting the 64bit user code segment's selector into the call gate, and execute the 64bit trampoline which converts the return frame into 32bit format and switches back to 32bit mode for executing the int $0x80 trampoline. Note that all jumps over the hoops are performed in user mode. MFC after: 1 week
* Remove the deassert INIT IPI from the IPI startup sequence for APs.jhb2012-08-131-16/+1
| | | | | | | | | It is not listed in the boot sequence in the MP specification (1.4), and it is explicitly ignored on modern CPUs. It was only ever required when bootstrapping systems with external APICs (that is, SMP machines with 486s), which FreeBSD has never supported (and never will). While here, tidy some comments and remove some banal ones.
* Add a 10 millisecond delay after sending the initial INIT IPI. Thisjhb2012-08-131-1/+2
| | | | | | matches the algorithm in the MP specification (1.4). Previously we were sending out the deassert INIT IPI immediately after the initial INIT IPI was sent.
* Build modules along with the XENHVM kernels.cperciva2012-08-131-2/+0
| | | | | No objections from: freebsd-xen mailing list MFC after: 1 week
* The assertion that I added in r238889 could legitimately fail when aalc2012-08-081-1/+2
| | | | | | | debugger creates a breakpoint. Replace that assertion with a narrower one that still achieves my objective. Reported and tested by: kib
* Do not apply errata 721 workaround when under hypervisor, sincekib2012-08-071-1/+7
| | | | | | | | | a typical hypervisor does not implement access to the required MSR, causing #GP on boot. Reported and tested by: olgeni PR: amd64/170388 MFC after: 3 days
* Remove duplicate header inclusion of <sys/sysent.h>pluknet2012-08-071-1/+0
| | | | Discussed with: bz
* Shave off a few more cycles from the average execution time of pmap_enter()alc2012-08-051-7/+2
| | | | by simplifying the control flow and reducing the live range of "om".
* Add lfence().kib2012-08-011-0/+7
| | | | MFC after: 1 week
* Revise pmap_enter()'s handling of mapping updates that change thealc2012-08-011-22/+30
| | | | | | | | | | | | | | PTE's PG_M and PG_RW bits but not the physical page frame. First, only perform vm_page_dirty() on a managed vm_page when the PG_M bit is being cleared. If the updated PTE continues to have PG_M set, then there is no requirement to perform vm_page_dirty(). Second, flush the mapping from the TLB when PG_M alone is cleared, not just when PG_M and PG_RW are cleared. Otherwise, a stale TLB entry may stop PG_M from being set again on the next store to the virtual page. However, since the vm_page's dirty field already shows the physical page as being dirty, no actual harm comes from the PG_M bit not being set. Nonetheless, it is potentially confusing to someone expecting to see the PTE change after a store to the virtual page.
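The update rule above (dirty the page only when PG_M transitions from set to clear, and flush the TLB on that same transition) can be sketched in portable userland C; the flag values match x86, but the VM-layer calls are stand-in booleans, not the real FreeBSD functions:

```c
#include <stdbool.h>
#include <stdint.h>

#define PG_RW 0x002   /* writable */
#define PG_M  0x040   /* modified (dirty) bit, as on x86 */

/* Stand-ins recording what the real code would do. */
static bool dirtied;        /* vm_page_dirty(m) was called */
static bool tlb_flushed;    /* pmap_invalidate_page(pmap, va) was called */

/*
 * Sketch of the rule from the commit: when an existing mapping's PTE is
 * replaced, call vm_page_dirty() only if PG_M goes from set to clear --
 * if PG_M stays set there is nothing to record -- and flush the TLB
 * whenever PG_M is cleared, even when PG_RW remains set, so a stale
 * entry cannot keep PG_M from being set again on the next store.
 */
static void
update_pte(uint64_t *pte, uint64_t newpte)
{
    uint64_t origpte = *pte;

    *pte = newpte;
    if ((origpte & PG_M) != 0 && (newpte & PG_M) == 0) {
        dirtied = true;       /* vm_page_dirty(m) */
        tlb_flushed = true;   /* pmap_invalidate_page(pmap, va) */
    }
}
```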
* Change (unused) prototype for stmxcsr() to match reality.kib2012-07-301-1/+1
| | | | | Noted by: jhb MFC after: 1 week
* Shave off a few more cycles from pmap_enter()'s critical section. Inalc2012-07-291-14/+17
| | | | particular, do a little less work with the PV list lock held.
* Forcibly shut up clang warning about NULL pointer dereference.kib2012-07-231-0/+7
| | | | MFC after: 3 weeks
* Consistently use 2-space sentence breaks.kib2012-07-211-2/+2
| | | | | Submitted by: bde MFC after: 1 week
* Stop caching curpcb in the local variable.kib2012-07-211-16/+12
| | | | | Requested by: bde MFC after: 1 week
* The PT_I386_{GET,SET}XMMREGS and PT_{GET,SET}XSTATE operate on thekib2012-07-211-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | stopped threads. Implementation assumes that the thread's FPU context is spilled into the PCB due to stop. This is mostly true, except when FPU state for the thread is not initialized. Then the requests operate on the garbage state which is currently left in the PCB, causing confusion. The situation is indeed observed after a signal delivery and before #NM fault on execution of any FPU instruction in the signal handler, since sendsig(9) drops FPU state for current thread, clearing PCB_FPUINITDONE. When inspecting context state for the signal handler, debugger sees the FPU state of the main program context instead of the clear state supposed to be provided to handler. Fix this by forcing clean FPU state in PCB user FPU save area by performing getfpuregs(9) before accessing user FPU save area in ptrace_machdep.c. Note: this change will be merged to i386 kernel as well, where it is much more important, since e.g. gdb on i386 uses PT_I386_GETXMMREGS to inspect FPU context on CPUs that support SSE. Amd64 version of gdb uses PT_GETFPREGS to inspect both 64 and 32 bit processes, which does not exhibit the bug. Reported by: bde MFC after: 1 week
* Stop clearing x87 exceptions in the #MF handler on amd64. If user codekib2012-07-211-18/+13
| | | | | | | | | | | | | | | | | | | | understands FPU hardware enough to catch SIGFPE and unmask exceptions in control word, then it may as well properly handle return from SIGFPE without causing an infinite loop of #MF exceptions due to faulting instruction restart, when needed. Clearing exceptions causes information loss for handlers which do understand FPU hardware, and struct siginfo si_code member cannot be considered adequate replacement for en_sw content due to translation. Supposed reason for clearing the exceptions, which is IRQ13 handling oddities, were never applicable to amd64. Note: this change will be merged to i386 kernel as well, since we do not support IRQ13 delivery of #MF notifications for some time. Requested by: bde MFC after: 1 week
* Introduce curpcb magic variable, similar to curthread, which is MDkib2012-07-195-14/+30
| | | | | | | | | | | | | | | | | | | | | amd64. It is implemented as __pure2 inline with non-volatile asm read from pcpu, which allows a compiler to cache its results. Convert most PCPU_GET(pcb) and curthread->td_pcb accesses into curpcb. Note that __curthread() uses magic value 0 as an offsetof(struct pcpu, pc_curthread). It seems to be done this way due to machine/pcpu.h needs to be processed before sys/pcpu.h, because machine/pcpu.h contributes machine-depended fields to the struct pcpu definition. As result, machine/pcpu.h cannot use struct pcpu yet. The __curpcb() also uses a magic constant instead of offsetof(struct pcpu, pc_curpcb) for the same reason. The constants are now defined as symbols and CTASSERTs are added to ensure that future KBI changes do not break the code. Requested and reviewed by: bde MFC after: 3 weeks
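The CTASSERT technique the entry above describes, where a hard-coded pcpu offset is pinned to the real structure layout at compile time, can be sketched with C11 `_Static_assert`; the structure and constants here are illustrative, not the actual FreeBSD `struct pcpu`:

```c
#include <stddef.h>

/* Hypothetical pcpu layout; only the first two members are shown. */
struct pcpu_sketch {
    void *pc_curthread;   /* must stay at offset 0 for __curthread() */
    void *pc_curpcb;      /* offset hard-coded in inline asm */
};

/* The "magic constants" the inline asm would use. */
#define PC_CURTHREAD_OFF 0
#define PC_CURPCB_OFF    (sizeof(void *))

/*
 * CTASSERT() equivalents: if a future KBI change moves these members,
 * the build breaks here instead of silently reading the wrong field.
 */
_Static_assert(offsetof(struct pcpu_sketch, pc_curthread) == PC_CURTHREAD_OFF,
    "pc_curthread moved; update the inline asm offset");
_Static_assert(offsetof(struct pcpu_sketch, pc_curpcb) == PC_CURPCB_OFF,
    "pc_curpcb moved; update the inline asm offset");
```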
* Don't unnecessarily set PGA_REFERENCED in pmap_enter().alc2012-07-191-7/+5
|
* On AMD64, provide siginfo.si_code for floating point errors when errorkib2012-07-183-15/+34
| | | | | | | | | | | | | | occurs using the SSE math processor. Update comments describing the handling of the exception status bits in coprocessors control words. Remove GET_FPU_CW and GET_FPU_SW macros which were used only once. Prefer to use curpcb to access pcb_save over the longer path of referencing pcb through the thread structure. Based on the submission by: Ed Alley <wea llnl gov> PR: amd64/169927 Reviewed by: bde MFC after: 3 weeks
* Add stmxcsr.kib2012-07-181-0/+2
| | | | | | Submitted by: Ed Alley <wea llnl gov> PR: amd64/169927 MFC after: 3 weeks
* Add support for the XSAVEOPT instruction use. Our XSAVE/XRSTOR usagekib2012-07-143-1/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mostly meets the guidelines set by the Intel SDM: 1. We use XRSTOR and XSAVE from the same CPL using the same linear address for the store area 2. Contrary to the recommendations, we cannot zero the FPU save area for a new thread, since fork semantics require a copy of the previous state. This advice seemingly contradicts the advice in item 6. 3. We use XSAVEOPT in the context switch code only, and the area for XSAVEOPT already always contains the data saved by XSAVE. 4. We do not modify the save area between XRSTOR, when the area is loaded into the FPU context, and XSAVE. We always spill the fpu context into the save area and start emulation when directly writing into the FPU context. 5. We do not use segmented addressing to access the save area, or rather, always address it using %ds basing. 6. XSAVEOPT can only be executed on an area which was previously loaded with XRSTOR, since the context switch code checks for FPU use by the outgoing thread before saving, and a thread which stopped emulation forcibly gets its context loaded with XRSTOR. 7. The PCB cannot be paged out while FPU emulation is turned off, since the stack of the executing thread is never swapped out. The context switch code is patched to issue XSAVEOPT instead of XSAVE if supported. This approach eliminates one conditional in the context switch code, which would be needed otherwise. For user-visible machine context to have proper data, fpugetregs() checks for unsaved extension blocks and manually copies pristine FPU state into them, according to the description provided by CPUID leaf 0xd. MFC after: 1 month
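The entry above patches the context-switch instruction bytes at boot so no per-switch conditional is needed. The same design choice can be illustrated portably with a function pointer selected once at init time; the function names and stubs below are hypothetical:

```c
#include <stdbool.h>

/* Stubs standing in for the two save instructions. */
static int last_used;                 /* 1 = xsave path, 2 = xsaveopt path */
static void fpu_xsave(void)    { last_used = 1; }
static void fpu_xsaveopt(void) { last_used = 2; }

/*
 * Select the save routine once, based on a CPU feature probe, instead
 * of testing the feature on every context switch.  FreeBSD achieves
 * the same effect by patching the instruction in place; the pointer is
 * just the portable way to show the idea.
 */
static void (*ctx_save)(void) = fpu_xsave;   /* safe default */

static void
fpu_init(bool cpu_has_xsaveopt)
{
    if (cpu_has_xsaveopt)
        ctx_save = fpu_xsaveopt;
}
```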
* Wring a few cycles out of pmap_enter(). In particular, on a user-spacealc2012-07-131-75/+86
| | | | pmap, avoid walking the page table twice.
* Add a clts() wrapper around the 'clts' instruction to <machine/cpufunc.h>jhb2012-07-092-6/+13
| | | | | | | | on x86 and use that to implement stop_emulating() in the fpu/npx code. Reimplement start_emulating() in the non-XEN case by using load_cr0() and rcr0() instead of the 'lmsw' and 'smsw' instructions. Intel explicitly discourages the use of 'lmsw' and 'smsw' on 80386 and later processors in the description of these instructions in Volume 2 of the SDM. Reviewed by: kib MFC after: 1 month
* Partially revert r217515 so that the mem_range_softc variable is alwaysjhb2012-07-092-2/+3
| | | | | | | present on x86 kernels. This fixes the build of kernels that include 'device acpi' but do not include 'device mem'. MFC after: 1 month
* Use assembler mnemonic instead of manually assembling, continuation of r238142.kib2012-07-061-6/+3
| | | | | Reviewed by: jhb MFC after: 1 month
* Several fixes to the amd64 disassembler:jhb2012-07-061-22/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | | - Add generic support for opcodes that are escape bytes used for multi-byte opcodes (such as the 0x0f prefix). Use this to replace the hard-coded 0x0f special case and add support for three-byte opcodes that use the 0x0f38 prefix. - Decode all Intel VMX instructions. invept and invvpid in particular are three-byte opcodes that use the 0x0f38 escape prefix. - Rework how the special 'SDEP' size flag works such that the default instruction name (i_name) is the instruction when the data size prefix (0x66) is not specified, and the alternate name in i_extra is used when the prefix is included. - Add a new 'ADEP' size flag similar to 'SDEP' except that it chooses between i_name and i_extra based on the address size prefix (0x67). Use this to fix the decoding for jrcxz vs jecxz which is determined by the address size prefix, not the operand size prefix. Also, jcxz is not possible in 64-bit mode, but jrcxz is the default instruction for that opcode. - Add support for handling instructions that have a mandatory 'rep' prefix (this means not outputting the 'repe ' prefix until determining if it is used as part of an opcode). Make 'pause' less of a special case this way. - Decode 'cmpxchg16b' and 'cdqe' which are variants of other instructions but with a REX.W prefix. MFC after: 1 month
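The escape-byte mechanism described above (an entry either names an instruction or points at a follow-on table keyed by the next byte) can be sketched as a tiny two-level decoder. The opcode values are real one- and two-byte x86 encodings; the table shape is a simplification, not the actual disassembler's `struct inst`:

```c
#include <stddef.h>
#include <stdint.h>

struct inst {
    const char        *i_name;    /* mnemonic, or NULL for an escape */
    const struct inst *i_table;   /* next-level table when escaping */
};

/* Second-level table reached through the 0x0f escape byte. */
static const struct inst table_0f[256] = {
    [0x05] = { "syscall", NULL },
    [0xa2] = { "cpuid",   NULL },
};

/* First-level table: 0x0f is an escape entry, not an opcode. */
static const struct inst table_one[256] = {
    [0x0f] = { NULL, table_0f },
    [0x90] = { "nop", NULL },
};

/* Follow escape entries until a named instruction (or NULL) is found. */
static const char *
decode(const uint8_t *p)
{
    const struct inst *ip = &table_one[*p++];

    while (ip->i_name == NULL && ip->i_table != NULL)
        ip = &ip->i_table[*p++];
    return ip->i_name;   /* NULL for opcodes this sketch cannot decode */
}
```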
* Make pmap_enter()'s management of PV entries consistent with the other pmapalc2012-07-061-20/+15
| | | | | | | | functions that manage PV entries. Specifically, remove the PV entry from the containing PV list only after the corresponding PTE is destroyed. Update the pmap's wired mapping count in pmap_enter() before the PV list lock is acquired.
* Now that our assembler supports the xsave family of instructions, use themjhb2012-07-052-19/+23
| | | | | | | | natively rather than hand-assembled versions. For xgetbv/xsetbv, add a wrapper API to deal with xcr* registers: rxcr() and load_xcr(). Reviewed by: kib MFC after: 1 month
* Calculate the new PTE value in pmap_enter() before acquiring any locks.alc2012-07-051-32/+27
| | | | Move an assertion to the beginning of pmap_enter().
* Correct an error in r237513. The call to reserve_pv_entries() must comealc2012-07-051-8/+12
| | | | | | | | before pmap_demote_pde() updates the PDE. Otherwise, pmap_pv_demote_pde() can crash. Crash reported by: kib Patch tested by: kib
* Decode the 'xsave', 'xrstor', 'xsaveopt', 'xgetbv', 'xsetbv', andjhb2012-07-041-3/+18
| | | | | | 'rdtscp' instructions. MFC after: 1 month
* tws(4) is interfaced with CAM so move it to the same section.delphij2012-07-011-1/+1
| | | | | Reported by: joel MFC after: 3 days
* Optimize reserve_pv_entries() using the popcnt instruction.alc2012-06-302-3/+18
|
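The popcnt optimization above counts free PV entries in a chunk's freemask. A hedged sketch of the counting step in portable C: the compiler builtin lowers to a single `popcnt` instruction when built for a CPU that has it, with a generic bit-clearing loop as the fallback (the function name and freemask layout are hypothetical):

```c
#include <stdint.h>

/*
 * Count free PV entries in a chunk, tracked as set bits in a 64-bit
 * freemask.  __builtin_popcountll() compiles to the popcnt instruction
 * with -mpopcnt; the loop is the portable fallback.
 */
static int
pc_free_count(uint64_t freemask)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcountll(freemask);
#else
    int n = 0;

    while (freemask != 0) {
        freemask &= freemask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
#endif
}
```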
* In r237592, I forgot that pmap_enter() might already hold a PV list lockalc2012-06-291-15/+30
| | | | | | | | at the point that it calls get_pv_entry(). Thus, pmap_enter()'s PV list lock pointer must be passed to get_pv_entry() for those rare occasions when get_pv_entry() calls reclaim_pv_chunk(). Update some related comments.
* Avoid some unnecessary PV list locking in pmap_enter().alc2012-06-281-1/+1
|
* Optimize pmap_pv_demote_pde().alc2012-06-281-10/+35
|
* Add new pmap layer locks to the predefined lock order. Change the namesalc2012-06-271-3/+3
| | | | of a few existing VM locks to follow a consistent naming scheme.
* Introduce RELEASE_PV_LIST_LOCK().alc2012-06-261-8/+11
|