summaryrefslogtreecommitdiffstats
path: root/sys/amd64/include
Commit message (Collapse)AuthorAgeFilesLines
* Handling all the three clocks (hardclock, softclock, profclock) with theattilio2010-01-151-1/+7
| | | | | | | | | | | | | | | | | | LAPIC may lead to aliasing for softclock and profclock because frequencies are sized in order to fit mainly hardclock. atrtc used to take care of the softclock and profclock and it does still do, if the LAPIC can't handle the clocks properly. Revert the change when the LAPIC started taking charge of all three of them and let atrtc handle softclock and profclock if not explicitly requested. Such request can be made setting != 0 the new tunable machdep.lapic_allclocks or if the new device ATPIC is not present within the i386 kernel config (atrtc is linked to atpic presence). Diagnosed by: Sandvine Incorporated Reviewed by: jhb, emaste Sponsored by: Sandvine Incorporated MFC: 3 weeks
* Use io(4) for I/O port access on ia64, rather than through sysarch(2).marcel2010-01-111-0/+1
| | | | | | | | | | | | | | | | | I/O port access is implemented on Itanium by reading and writing to a special region in memory. To hide details and avoid misaligned memory accesses, a process did I/O port reads and writes by making a MD system call. There's one fatal problem with this approach: unprivileged access was not being prevented. /dev/io serves that purpose on amd64/i386, so employ it on ia64 as well. Use an ioctl for doing the actual I/O and remove the sysarch(2) interface. Backward compatibility is not being considered. The sysarch(2) approach was added to support X11, but support for FreeBSD/ia64 was never fully implemented in X11. Thus, nothing gets broken that didn't need more work to begin with. MFC after: 1 week
* Quiet variable "shadows" warning:obrien2010-01-011-18/+18
| | | | | | | sys/vmmeter.h: warning: shadowed declaration is here machine/cpufunc.h: In function 'insw': machine/cpufunc.h: warning: declaration of 'cnt' shadows a global declaration ..snip..
* mca: improve status checking, recording and reportingavg2009-12-021-0/+1
| | | | | | | | | | - directly print mca information in case we fail to allocate memory for a record - include bank number into mca record - print raw mca status value for extended information Reviewed by: jhb MFC after: 10 days
* x86 cpu features: add MOVBE reporting and flagavg2009-11-301-0/+1
| | | | | The check is glimpsed from Linux and OpenSolaris. MOVBE instruction is found in Intel Atom processors.
* Uppercase the UL suffix on a constant, so Flexelint doesn't worry thatphk2009-11-161-1/+1
| | | | | 'u1' might have been intended. No, that does not make sense and yes I have told them.
* Amd64 init_secondary() calls initializecpu() while curthread is stillkib2009-11-131-0/+1
| | | | | | | | | | | | | | | | not properly set up. r199067 added the call to TUNABLE_INT_FETCH() to initializecpu() that results in hang because AP are started when kernel environment is already dynamic and thus needs to acquire mutex, that is too early in AP start sequence to work. Extract the code that should be executed only once, because it sets up global variables, from initializecpu() to initializecpucache(), and call the later only from hammer_time() executed on BSP. Now, TUNABLE_INT_FETCH() is done only once at BSP at the early boot stage. In collaboration with: Mykola Dzham <freebsd levsha org ua> Reviewed by: jhb Tested by: ed, battlez
* Add a facility for associating optional descriptions with active interruptjhb2009-10-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | handlers. This is primarily intended as a way to allow devices that use multiple interrupts (e.g. MSI) to meaningfully distinguish the various interrupt handlers. - Add a new BUS_DESCRIBE_INTR() method to the bus interface to associate a description with an active interrupt handler setup by BUS_SETUP_INTR. It has a default method (bus_generic_describe_intr()) which simply passes the request up to the parent device. - Add a bus_describe_intr() wrapper around BUS_DESCRIBE_INTR() that supports printf(9) style formatting using var args. - Reserve MAXCOMLEN bytes in the intr_handler structure to hold the name of an interrupt handler and copy the name passed to intr_event_add_handler() into that buffer instead of just saving the pointer to the name. - Add a new intr_event_describe_handler() which appends a description string to an interrupt handler's name. - Implement support for interrupt descriptions on amd64 and i386 by having the nexus(4) driver supply a custom bus_describe_intr method that invokes a new intr_describe() MD routine which in turn looks up the associated interrupt event and invokes intr_event_describe_handler(). Requested by: many Reviewed by: scottl MFC after: 2 weeks
* Define architectural load bases for PIE binaries. Addresses were selectedkib2009-10-101-0/+6
| | | | | | | | | | by looking at the bases used for non-relocatable executables by gnu ld(1), and adjusting it slightly. Discussed with: bz Reviewed by: kan Tested by: bz (i386, amd64), bsam (linux) MFC after: some time
* atomic_cmpset_barr_* was added in order to cope with compilers willing toattilio2009-10-091-32/+44
| | | | | | | | | | | | | | | | | specify their own version of atomic_cmpset_* which could have been different than the membar version. Right now, however, FreeBSD is bound mostly to GCC-like compilers and it is desired to add new support and compat shim mostly when there is a real necessity, in order to avoid too much compatibility bloats. In this optic, bring back atomic_cmpset_{acq, rel}_* to be the same as atomic_cmpset_* and unwind the atomic_cmpset_barr_* introduction. Requested by: jhb Reviewed by: jhb Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* - All the functions in atomic.h needs to be in "physical" form (likeattilio2009-10-061-46/+29
| | | | | | | | | | | | not defined through macros or similar) in order to be later compiled in the kernel and offer this way the support for modules (and compatibility among the UP case and SMP case). Fix this for the newly introduced atomic_cmpset_barr_* cases by defining and specifying a template. Note that the new DEFINE_CMPSET_GEN() template save more typing on amd64 than the current code. [1] - Fix the style for memory barriers on amd64. [1] Reported by: Paul B. Mahol <onemda at gmail dot com>
* Per their definition, atomic instructions used in conjuction withattilio2009-10-061-46/+67
| | | | | | | | | | | | | | | | | | | | | | | | memory barriers should also ensure that the compiler doesn't reorder paths where they are used. GCC, however, does that aggressively, even in presence of volatile operands. The most reliable way GCC offers for avoid instructions reordering is clobbering "memory" even if that is theoretically an heavy-weight operation, flushing the content of all the registers and forcing reload of them (We could rely, however, on gcc DTRT by just understanding the purpose as this is a well-known pattern for many modern operating-systems). Not all our memory barriers, right now, clobber memory for GCC-like compilers. The most notable cases are IA32 and amd64 where the memory barrier are treacted the same as normal atomic instructions. Fix this by offering the possibility to implement atomic instructions with memory barriers separately from the normal version and implement the GCC-like specific one using memory clobbering. Thanks to Chris Lattner (@apple) for his discussion on llvm specifics. Reported by: jhb Reviewed by: jhb Tested by: rdivacky, Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* cpufunc.h: unify/correct style of c extension namesavg2009-09-301-3/+3
| | | | | | | | | | i386 and amd64 archs only. inline => __inline. [1] __asm__ => __asm. [2] Reviewed by: kib, jhb [1] Suggested by: kib [2] MFC after: 1 week
* Copy apm(4) emulation from sys/i386/acpica/acpi_machdep.c andjkim2009-09-271-0/+264
| | | | install apm(8) and apm_bios.h on amd64.
* Extract the code to find and map the MADT ACPI table during early kerneljhb2009-09-231-0/+3
| | | | | | | | | startup and genericize it so it can be reused to map other tables as well: - Add a routine to walk a list of ACPI subtables such as those used in the APIC and SRAT tables in the MI acpi(4) driver. - Move the routines for mapping and unmapping an ACPI table as well as mapping the RSDT or XSDT and searching for a table with a given signature out into acpica_machdep.c for both amd64 and i386.
* Add a new sysctl for reporting all of the supported page sizes.alc2009-09-181-0/+2
| | | | | Reviewed by: jhb MFC after: 3 weeks
* Consolidate CPUID to CPU family/model macros for amd64 and i386 to reducejkim2009-09-101-2/+2
| | | | unnecessary #ifdef's for shared code between them.
* Get rid of the _NO_NAMESPACE_POLLUTION kludge by creating anphk2009-09-082-14/+56
| | | | | architecture specific include file containing the _ALIGN* stuff which <sys/socket.h> needs.
* Move multi-include protection back up to the top of the file andphk2009-09-081-4/+4
| | | | name after the physical file rather than the aliased name.
* Adjust the handling of the local APIC PMC interrupt vector:jhb2009-08-142-1/+3
| | | | | | | | | | | | | | | | - Provide lapic_disable_pmc(), lapic_enable_pmc(), and lapic_reenable_pmc() routines in the local APIC code that the hwpmc(4) driver can use to manage the local APIC PMC interrupt vector. - Do not enable the local APIC PMC interrupt vector by default when HWPMC_HOOKS is enabled. Instead, the hwpmc(4) driver explicitly enables the interrupt when it is succesfully initialized and disables the interrupt when it is unloaded. This avoids enabling the interrupt on unsupported CPUs which may result in spurious NMIs. Reported by: rnoland Reviewed by: jkoshy Approved by: re (kib) MFC after: 2 weeks
* * Completely Remove the option STOP_NMI from the kernel. This optionattilio2009-08-132-9/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | has proven to have a good effect when entering KDB by using a NMI, but it completely violates all the good rules about interrupts disabled while holding a spinlock in other occasions. This can be the cause of deadlocks on events where a normal IPI_STOP is expected. * Adds an new IPI called IPI_STOP_HARD on all the supported architectures. This IPI is responsible for sending a stop message among CPUs using a privileged channel when disponible. In other cases it just does match a normal IPI_STOP. Right now the IPI_STOP_HARD functionality uses a NMI on ia32 and amd64 architectures, while on the other has a normal IPI_STOP effect. It is responsibility of maintainers to eventually implement an hard stop when necessary and possible. * Use the new IPI facility in order to implement a new userend SMP kernel function called stop_cpus_hard(). That is specular to stop_cpu() but it does use the privileged channel for the stopping facility. * Let KDB use the newly introduced function stop_cpus_hard() and leave stop_cpus() for all the other cases * Disable interrupts on CPU0 when starting the process of APs suspension. * Style cleanup and comments adding This patch should fix the reboot/shutdown deadlocks many users are constantly reporting on mailing lists. Please don't forget to update your config file with the STOP_NMI option removal Reviewed by: jhb Tested by: pho, bz, rink Approved by: re (kib)
* When the page caching attributes are changed, after new mapping iskib2009-07-222-0/+15
| | | | | | | | | | | | | | | established, OS shall flush the caches on all processors that may have used the mapping previously. This operation is not needed if processors support self-snooping. If not, but clflush instruction is implemented on the CPU, series of the clflush can be used on the mapping region. Otherwise, we have to flush the whole cache. The later operation is very expensive, and AMD-made CPUs do not have self-snooping. Implement cache flush for remapped region by using clflush for amd64, when supported by CPU. Proposed and reviewed by: alc Approved by: re (kensmith)
* Add support to the virtual memory system for configuring machine-alc2009-07-122-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dependent memory attributes: Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the fact that there are machine-dependent memory attributes that have nothing to do with controlling the cache's behavior. Introduce vm_object_set_memattr() for setting the default memory attributes that will be given to an object's pages. Introduce and use pmap_page_{get,set}_memattr() for getting and setting a page's machine-dependent memory attributes. Add full support for these functions on amd64 and i386 and stubs for them on the other architectures. The function pmap_page_set_memattr() is also responsible for any other machine-dependent aspects of changing a page's memory attributes, such as flushing the cache or updating the direct map. The uses include kmem_alloc_contig(), vm_page_alloc(), and the device pager: kmem_alloc_contig() can now be used to allocate kernel memory with non-default memory attributes on amd64 and i386. vm_page_alloc() and the device pager will set the memory attributes for the real or fictitious page according to the object's default memory attributes. Update the various pmap functions on amd64 and i386 that map pages to incorporate each page's memory attributes in the mapping. Notes: (1) Inherent to this design are safety features that prevent the specification of inconsistent memory attributes by different mappings on amd64 and i386. In addition, the device pager provides a warning when a device driver creates a fictitious page with memory attributes that are inconsistent with the real page that the fictitious page is an alias for. (2) Storing the machine-dependent memory attributes for amd64 and i386 as a dedicated "int" in "struct md_page" represents a compromise between space efficiency and the ease of MFCing these changes to RELENG_7. In collaboration with: jhb Approved by: re (kib)
* Restore the segment registers and segment base MSRs for amd64 syscallkib2009-07-091-1/+2
| | | | | | | | | | | | | | | | | return path only when neither thread was context switched while executing syscall code nor syscall explicitely modified LDT or MSRs. Save segment registers in trap handlers before interrupts are enabled, to not allow context switches to happen before registers are saved. Use separated byte in pcb for indication of fast/full return, since pcb_flags are not synchronized with context switches. The change puts back syscall microbenchmark numbers that were slowed down after commit of the support for LDT on amd64. Reviewed by: jeff Tested (and tested, and tested ...) by: pho Approved by: re (kensmith)
* Cleanup ALIGNED_POINTER:sam2009-07-051-10/+7
| | | | | | | | | | | o add to platforms where it was missing (arm, i386, powerpc, sparc64, sun4v) o define as "1" on amd64 and i386 where there is no restriction o make the type returned consistent with ALIGN o remove _ALIGNED_POINTER o make associated comments consistent Reviewed by: bde, imp, marcel Approved by: re (kensmith)
* Improve the handling of cpuset with interrupts.jhb2009-07-011-1/+1
| | | | | | | | | | | | | | | | | | | | | - For x86, change the interrupt source method to assign an interrupt source to a specific CPU to return an error value instead of void, thus allowing it to fail. - If moving an interrupt to a CPU fails due to a lack of IDT vectors in the destination CPU, fail the request with ENOSPC rather than panicing. - For MSI interrupts on x86 (but not MSI-X), only allow cpuset to be used on the first interrupt in a group. Moving the first interrupt in a group moves the entire group. - Use the icu_lock to protect intr_next_cpu() on x86 instead of the intr_table_lock to fix a LOR introduced in the last set of MSI changes. - Add a new privilege PRIV_SCHED_CPUSET_INTR for using cpuset with interrupts. Previously, binding an interrupt to a CPU only performed a privilege check if the interrupt had an interrupt thread. Interrupts without a thread could be bound by non-root users as a result. - If an interrupt event's assign_cpu method fails, then restore the original cpuset mask for the associated interrupt thread. Approved by: re (kib)
* Correct the #endif comment.alc2009-06-261-1/+1
| | | | | Noticed by: jmallett Approved by: re (kib)
* This change is the next step in implementing the cache control functionalityalc2009-06-261-0/+45
| | | | | | | | | | | required by video card drivers. Specifically, this change introduces vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all architectures. In addition, this changes adds a vm_cache_mode_t parameter to kmem_alloc_contig() and vm_phys_alloc_contig(). These will be the interfaces for allocating mapped kernel memory and physical memory, respectively, with non-default cache modes. In collaboration with: jhb
* Fix kernels compiled without SMP support. Make intr_next_cpu() availablejhb2009-06-251-2/+0
| | | | | | | for UP kernels but as a stub that always returns the single CPU's local APIC ID. Reported by: kib
* - Restore the behavior of pre-allocating IDT vectors for MSI interrupts.jhb2009-06-251-0/+3
| | | | | | | | | | | | | | | | | | This is mostly important for the multiple MSI message case where the IDT vectors for the entire group need to be allocated together. This also restores the assumptions made by the PCI bus code that it could invoke PCIB_MAP_MSI() once MSI vectors were allocated. - To avoid whiplash with CPU assignments, change the way that CPUs are assigned to interrupt sources on activation. Instead of assigning the CPU via pic_assign_cpu() before calling enable_intr(), allow the different interrupt source drivers to ask the MD interrupt code which CPU to use when they allocate an IDT vector. I/O APIC interrupt pins do this in their pic_enable_intr() routines giving the same behavior as before. MSI sources do it when the IDT vectors are allocated during msi_alloc() and msix_alloc(). - Change the intr_table_lock from an sx lock to a mutex. Tested by: rnoland
* Eliminate dead code. These definitions should have been deleted with thealc2009-06-221-10/+0
| | | | | | introduction of i686_mem.c in r45405. Merge adjacent #ifdef _KERNEL/#endif blocks.
* Now that amd64's kernel map is 512GB (SVN rev 192216), there is no reasonalc2009-06-081-9/+0
| | | | | | to cap its buffer map at 1GB. MFC after: 6 weeks
* Bump CACHE_LINE_SIZE to 128 for x86. Intel's manuals explicitly recommendjhb2009-05-181-1/+1
| | | | using 128 byte alignment for locks. (See IA-32 SDM Vol 3A 7.11.6.7)
* correct range in commentkmacy2009-05-161-1/+1
| | | | pointed out by alc
* update vm map commentkmacy2009-05-161-1/+0
| | | | pointed out by Larry Rosenman
* Increase default kernel map to 512GBkmacy2009-05-161-2/+2
| | | | | I briefly discussed this with alc. It could lead to problems for greater than 64GB. However, that seems unlikely in practice.
* FreeBSD right now support 32 CPUs on all the architectures at least.attilio2009-05-141-4/+4
| | | | | | | | | | | | | | | | With the arrival of 128+ cores it is necessary to handle more than that. One of the first thing to change is the support for cpumask_t that needs to handle more than 32 bits masking (which happens now). Some places, however, still assume that cpumask_t is a 32 bits mask. Fix that situation by using always correctly cpumask_t when needed. While here, remove the part under STOP_NMI for the Xen support as it is broken in any case. Additively make ipi_nmi_pending as static. Reviewed by: jhb, kmacy Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* Implement simple machine check support for amd64 and i386.jhb2009-05-132-0/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - For CPUs that only support MCE (the machine check exception) but not MCA (i.e. Pentium), all this does is print out the value of the machine check registers and then panic when a machine check exception occurs. - For CPUs that support MCA (the machine check architecture), the support is a bit more involved. - First, there is limited support for decoding the CPU-independent MCA error codes in the kernel, and the kernel uses this to output a short description of any machine check events that occur. - When a machine check exception occurs, all of the MCx banks on the current CPU are scanned and any events are reported to the console before panic'ing. - To catch events for correctable errors, a periodic timer kicks off a task which scans the MCx banks on all CPUs. The frequency of these checks is controlled via the "hw.mca.interval" sysctl. - Userland can request an immediate scan of the MCx banks by writing a non-zero value to "hw.mca.force_scan". - If any correctable events are encountered, the appropriate details are stored in a 'struct mca_record' (defined in <machine/mca.h>). The "hw.mca.count" is a count of such records and each record may be queried via the "hw.mca.records" tree by specifying the record index (0 .. count - 1) as the next name in the MIB similar to using PIDs with the kern.proc.* sysctls. The idea is to export machine check events to userland for more detailed processing. - The periodic timer and hw.mca sysctls are only present if the CPU supports MCA. Discussed with: emaste (briefly) MFC after: 1 month
* Fix XENHVM build.dfr2009-05-061-1/+1
|
* Rename statclock_disable variable to atrtcclock_disable that it actually is,mav2009-05-031-1/+0
| | | | | | | | | | | | | and hide it inside of atrtc driver. Add new tunable hint.atrtc.0.clock controlling it. Setting it to 0 disables using RTC clock as stat-/ profclock sources. Teach i386 and amd64 SMP platforms to emulate stat-/profclocks using i8254 hardclock, when LAPIC and RTC clocks are disabled. This allows to reduce global interrupt rate of idle system down to about 100 interrupts per core, permitting C3 and deeper C-states provide maximum CPU power efficiency.
* Add support for using i8254 and rtc timers as event sources for amd64 SMPmav2009-05-022-1/+10
| | | | system. Redistribute hard-/stat-/profclock events to other CPUs using IPIs.
* - Add support for cpuid leaf 0xb. This allows us to determine thejeff2009-04-292-4/+7
| | | | | | | | topology of nehalem/corei7 based systems. - Remove the cpu_cores/cpu_logical detection from identcpu. - Describe the layout of the system in cpu_mp_announce(). Sponsored by: Nokia
* Don't conditionally define CACHE_LINE_SHIFT, as we anticipate sizingrwatson2009-04-201-2/+0
| | | | | | | | | | | | a fair number of static data structures, making this an unlikely option to try to change without also changing source code. [1] Change default cache line size on ia64, sparc64, and sun4v to 128 bytes, as this was what rtld-elf was already using on those platforms. [2] Suggested by: bde [1], jhb [2] MFC after: 2 weeks
* Add description and cautionary note regarding CACHE_LINE_SIZE.rwatson2009-04-191-0/+4
| | | | | MFC after: 2 weeks Suggested by: alc
* For each architecture, define CACHE_LINE_SHIFT and a derivedrwatson2009-04-191-0/+4
| | | | | | | | | | | | | CACHE_LINE_SIZE constant. These constants are intended to over-estimate the cache line size, and be used at compile-time when a run-time tuning alternative isn't appropriate or available. Defaults for all architectures are 64 bytes, except powerpc where it is 128 bytes (used on G5 systems). MFC after: 2 weeks Discussed on: arch@
* A simple rewrite of biossmap.c:jkim2009-04-152-0/+5
| | | | | | | | | | | | | - Do not iterate int 15h, function e820h twice. Instead, we use STAILQ to store each return buffer and copy all at once. - Export optional extended attributes defined in ACPI 3.0 as separate metadata. Currently, there are only two bits defined in the specification. For example, if the descriptor has extended attributes and it is not enabled, it has to be ignored by OS. We may implement it in the kernel later if it is necessary and proven correct in reality. - Check return buffer size strictly as suggested in ACPI 3.0. Reviewed by: jhb
* Simplify in/out functions (for i386 and AMD64).ed2009-04-111-79/+8
| | | | | | | | Remove a hack to generate more efficient code for port numbers below 0x100, which has been obsolete for at least ten years, because GCC has an asm constraint to specify that. Submitted by: Christoph Mallon <christoph mallon gmx de>
* Also remove the unused __word_swap_int*() macros.ed2009-04-081-19/+0
| | | | Submitted by: Christoph Mallon <christoph.mallon@gmx.de>
* Implement __bswap16() without using inline assembly.ed2009-04-081-22/+1
| | | | | | | Most compilers nowadays (including GCC) are smart enough to know what's going on and generate more efficient code anyway. Submitted by: Christoph Mallon <christoph.mallon@gmx.de>
* Don't explicitly force ecx to be used for MSR_FSBASE/MSR_GSBASE.ed2009-04-071-10/+4
| | | | | | | | | Because the "c" input constaint is used, the compiler will already place the MSR_FSBASE/MSR_GSBASE constants in ecx. Using __asm("ecx") makes LLVM crash. Even though this is also an LLVM bug, we'd better remove the unnecessary GCCism as well. Submitted by: Christoph Mallon <christoph.mallon@gmx.de>
OpenPOWER on IntegriCloud