summaryrefslogtreecommitdiffstats
path: root/sys/amd64
Commit message (Collapse)AuthorAgeFilesLines
* Regen.ru2006-03-193-3/+4
|
* Unbreak COMPAT_LINUX32 option support on amd64.ru2006-03-192-0/+2
| | | | Broken by: netchild
* regennetchild2006-03-181-3/+0
|
* Enable global pages TLB extension on Application Processors.ups2006-03-181-0/+7
| | | | MFC after: 3 days
* regen after COMPAT_43 removalnetchild2006-03-185-12/+42
|
* Get rid of the need of COMPAT_43 in the linuxolator.netchild2006-03-181-9/+5
| | | | | Submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz> Obtained from: DragonFly (some parts)
* Don't allow userland to set hardware watch points on kernel memory at all.jhb2006-03-141-23/+20
| | | | | | | | | | | | | | Previously, we tried to allow this only for root. However, we were calling suser() on the *target* process rather than the current process. This means that if you can ptrace() a process running as root you can set a hardware watch point in the kernel. In practice I think you probably have to be root in order to pass the p_candebug() checks in ptrace() to attach to a process running as root anyway. Rather than fix the suser(), I just axed the entire idea, as I can't think of any good reason _at all_ for userland to set hardware watch points for KVM. MFC after: 3 days Also thinks hardware watch points on KVM from userland are bad: bde, rwatson
* Merge/sync with i386: various cosmetic tweakspeter2006-03-143-2/+13
|
* MFi386: The SIGFPE macros were moved to signal.h (FPE_INTOVF etc)peter2006-03-141-10/+0
|
* MFi386: rename pcib_devclass to hostb_devclass (cosmetic here)peter2006-03-131-2/+2
|
* MFi386: add a TRAP_INTERRUPT casepeter2006-03-131-0/+8
|
* Cosmetic sync with i386peter2006-03-134-7/+6
|
* Fix the format/display descriptor of vm.kmem_size and vm.kmem_freeps2006-03-131-2/+2
| | | | | | | to be 'long' instead of 'int' so that sysctl(8) correctly displays the 8 returned bytes as a single 'long' instead of two 'int' values. Submitted by: peter
* Flip the switch and don't route interrupts to hyperthreads in a HT system.jhb2006-03-091-2/+2
| | | | | | | | | In at least one benchmark this showed around a 20% performance increase. If other workloads do benefit from having hyperthreads service interrupts, we can always make this a loader tunable. MFC after: 3 days Tested by: ps
* Fix exec_map resource leaks.ups2006-03-081-9/+14
| | | | Tested by: kris@
* MFi386 revision 1.1220: options TDFX_LINUX --> device tdfx_linuxyar2006-03-061-3/+2
|
* guard function decls with _KERNEL so user code can include this filesam2006-03-011-1/+2
|
* Rework how we wire up interrupt sources to CPUs:jhb2006-02-287-135/+152
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Throw out all of the logical APIC ID stuff. The Intel docs are somewhat ambiguous, but it seems that the "flat" cluster model we are currently using is only supported on Pentium and P6 family CPUs. The other "hierarchy" cluster model that is supported on all Intel CPUs with local APICs is severely underdocumented. For example, it's not clear if the OS needs to glean the topology of the APIC hierarchy from somewhere (neither ACPI nor MP Table include it) and setup the logical clusters based on the physical hierarchy or not. Not only that, but on certain Intel chipsets, even though there were 4 CPUs in a logical cluster, all the interrupts were only sent to one CPU anyway. - We now bind interrupts to individual CPUs using physical addressing via the local APIC IDs. This code has also moved out of the ioapic PIC driver and into the common interrupt source code so that it can be shared with MSI interrupt sources since MSI is addressed to APICs the same way that I/O APIC pins are. - Interrupt source classes grow a new method pic_assign_cpu() to bind an interrupt source to a specific local APIC ID. - The SMP code now tells the interrupt code which CPUs are avaiable to handle interrupts in a simpler and more intuitive manner. For one thing, it means we could now choose to not route interrupts to HT cores if we wanted to (this code is currently in place in fact, but under an #if 0 for now). - For now we simply do static round-robin of IRQs to CPUs when the first interrupt handler just as before, with the change that IRQs are now bound to individual CPUs rather than groups of up to 4 CPUs. - Because the IRQ to CPU mapping has now been moved up a layer, it would be easier to manage this mapping from higher levels. For example, we could allow drivers to specify a CPU affinity map for their interrupts, or we could allow a userland tool to bind IRQs to specific CPUs. The MFC is tentative, but I want to see if this fixes problems some folks had with UP APIC kernels on 6.0 on SMP machines (an SMP kernel would work fine, but a UP APIC kernel (such as GENERIC in RELENG_6) would lose interrupts). MFC after: 1 week
* It seems bit 5 of cpu_feature2 is the VMX (Virtual Machine Extensions)dwmalone2006-02-151-2/+2
| | | | | bit. While I'm here, delete a comment that was cut and past from the cpu_features code that doesn't belong here.
* CPU time accounting speedup (step 2)phk2006-02-111-0/+1
| | | | | | | | | | | | | | | | | | | Keep accounting time (in per-cpu) cputicks and the statistics counts in the thread and summarize into struct proc when at context switch. Don't reach across CPUs in calcru(). Add code to calibrate the top speed of cpu_tickrate() for variable cpu_tick hardware (like TSC on power managed machines). Don't enforce monotonicity (at least for now) in calcru. While the calibrated cpu_tickrate ramps up it may not be true. Use 27MHz counter on i386/Geode. Use TSC on amd64 & i386 if present. Use tick counter on sparc64
* Simplify system time accounting for profiling.phk2006-02-082-9/+6
| | | | | | | | | | Rename struct thread's td_sticks to td_pticks, we will need the other name for more appropriately named use shortly. Reduce it from uint64_t to u_int. Clear td_pticks whenever we enter the kernel instead of recording its value as reference for userret(). Use the absolute value of td->pticks in userret() and eliminate third argument.
* Modify the way we account for CPU time spent (step 1)phk2006-02-071-1/+1
| | | | | | | | | | | | | | | | Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers. For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.) On slower machines, the avoided multiplications to normalize timestams at every context switch, comes out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.
* - Always call exec_free_args() in kern_execve() instead of doing it in alljhb2006-02-061-1/+0
| | | | | | the callers if the exec either succeeds or fails early. - Move the code to call exit1() if the exec fails after the vmspace is gone to the bottom of kern_execve() to cut down on some code duplication.
* Call the audit syscall enter/exit functions for the amd64 architecture,wsalamon2006-02-042-1/+10
| | | | | | | both 32-bit and 64-bit paths. System calls will now be audited. Obtained from: TrustedBSD Project Approved by: rwatson (mentor)
* MFi386:davidxu2006-02-031-1/+2
| | | | | Clear carry flag in get_mconetxt so that setcontext does not return a bogus error.
* Make PV entries dynamic on amd64. i386 has a pre-reserved block of kvapeter2006-02-031-6/+36
| | | | | | | | | | | | | | | | | | dedicated to storing pv entries, originally so that kva didn't have to be allocated at inconvenient times. For amd64, we can get the same effect by using the direct map area. Allocating pages is the same as with the object backed method, but now we can just lookup the page in the direct map area. Thus, no more pageable kva is reserved. This is the single largest consumer of kva on our work machines and this change should help conserve the fixed size 2GB pageable kva on the amd64 kernel. There are a pair of sysctl nodes introduced, named the same as their tunable counterparts. vm.pmap.shpgperproc and vm.pmap.pv_entry_max They work just like the tunables of the same path, except the values are linked. The pv entry cap is now dynamically changeable. I didn't make them totally unlimited because we need some sort of safety limit still. One could consume all physical memory without a cap.
* Call WITNESS_CHECK() in the page fault handler and immediately assume itjhb2006-01-271-1/+9
| | | | | | | | | | | is a fatal fault if we are holding any non-sleepable locks. This should cut down on the number of bogus LORs we currently get when the kernel panics due to a NULL (or bogus) pointer dereference that goes wandering off into the VM system which tries to acquire locks and then kicks off the spurious LORs. This should probably be ported to all the archs at some point. Tested on: i386
* Free the newtag if we exit with a failure from alloc_bounce_zone().scottl2006-01-141-1/+3
| | | | Found by: Coverity Prevent(tm)
* Move linux support to the linux section.obrien2006-01-121-1/+1
|
* Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43)phk2006-01-101-0/+1
| | | | | | | | | | | | to COMPAT_43TTY. Add COMPAT_43TTY to NOTES and */conf/GENERIC Compile tty_compat.c only under the new option. Spit out #warning "Old BSD tty API used, please upgrade." if ioctl_compat.h gets #included from userland.
* By popular demand, move __HAVE_ACPI and __PCI_REROUTE_INTERRUPT intoimp2006-01-092-2/+3
| | | | | | | | param.h. Per request, I've placed these just after the _NO_NAMESPACE_POLLUTION ifndef. I've not renamed anything yet, but may since we don't need the __. Submitted by: bde, jhb, scottl, many others.
* - Make pcib_devclass private to sys/dev/pci/pci_pci.c and change all thejhb2006-01-062-22/+9
| | | | | | | various pcib drivers to use their own private devclass_t variables for their modules. - Use the DEFINE_CLASS_0() macro to declare drivers for the various pcib drivers while I'm here.
* Fix various places that were testing td_critnest to see if interruptsjhb2006-01-061-3/+3
| | | | | should remain disabled during a trap or not to check td_md.md_spinlock_count instead.
* - Explicitly validate an empty filter to match bpf_filter() comment[1].jkim2006-01-031-0/+4
| | | | | | - Do not use BPF JIT compiler for an empty filter. [1] Pointed out by: darrenr
* Define __HAVE_ACPI and/or __PCI_REROUTE_INTERRUPT, as appropriate forimp2006-01-011-0/+2
| | | | | each platform. These will be used in the pci code in preference to the complicated #ifdefs we have there now.
* Unbreak kernel build.netchild2006-01-011-3/+3
| | | | | | | | A happy new year to all. Submitted by: Goran Gajic <ggajic@afrodita.rcub.bg.ac.yu>, bz Pointy hat to: netchild Appologies to: all
* MI changes:netchild2005-12-311-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | - provide an interface (macros) to the page coloring part of the VM system, this allows to try different coloring algorithms without the need to touch every file [1] - make the page queue tuning values readable: sysctl vm.stats.pagequeue - autotuning of the page coloring values based upon the cache size instead of options in the kernel config (disabling of the page coloring as a kernel option is still possible) MD changes: - detection of the cache size: only IA32 and AMD64 (untested) contains cache size detection code, every other arch just comes with a dummy function (this results in the use of default values like it was the case without the autotuning of the page coloring) - print some more info on Intel CPU's (like we do on AMD and Transmeta CPU's) Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue" and report if the cache* values are zero (= bug in the cache detection code) or not. Based upon work by: Chad David <davidc@acns.ab.ca> [1] Reviewed by: alc, arch (in 2004) Discussed with: alc, Chad David, arch (in 2004)
* Fix watch address truncation. The address was truncated when it was passed topjd2005-12-271-3/+3
| | | | | | | amd64_set_watch() as 'unsigned int' and 'unsigned int' is 32bit long on amd64. Even with that fix hardware watchpoint don't work for me on amd64, ie. when I set the watchpoint and write a byte there, nothing happens.
* Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structuresobomax2005-12-262-0/+4
| | | | | | | | | | with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually allow executing elf dynamic binaries (aka shared libraries). When it is requested to execute ET_DYN elf image check if this flag is on after we know the elf brand allowing execution if so. PR: kern/87615 Submitted by: Marcin Koziej <creep@desk.pl>
* - Improve the INKERNEL macro such that it can no longer give false positives.jeff2005-12-231-1/+5
| | | | | | This fixes the stack(9) functionality. Submitted by: Antoine Brodin <antoine.brodin@laposte.net>
* Tweak how the MD code calls the fooclock() methods some. Instead ofjhb2005-12-228-49/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | passing a pointer to an opaque clockframe structure and requiring the MD code to supply CLKF_FOO() macros to extract needed values out of the opaque structure, just pass the needed values directly. In practice this means passing the pair (usermode, pc) to hardclock() and profclock() and passing the boolean (usermode) to hardclock_cpu() and hardclock_process(). Other details: - Axe clockframe and CLKF_FOO() macros on all architectures. Basically, all the archs were taking a trapframe and converting it into a clockframe one way or another. Now they can just extract the PC and usermode values directly out of the trapframe and pass it to fooclock(). - Renamed hardclock_process() to hardclock_cpu() as the latter is more accurate. - On Alpha, we now run profclock() at hz (profhz == hz) rather than at the slower stathz. - On Alpha, for the TurboLaser machines that don't have an 8254 timecounter, call hardclock() directly. This removes an extra conditional check from every clock interrupt on Alpha on the BSP. There is probably room for even further pruning here by changing Alpha to use the simplified timecounter we use on x86 with the lapic timer since we don't get interrupts from the 8254 on Alpha anyway. - On x86, clkintr() shouldn't ever be called now unless using_lapic_timer is false, so add a KASSERT() to that affect and remove a condition to slightly optimize the non-lapic case. - Change prototypeof arm_handler_execute() so that it's first arg is a trapframe pointer rather than a void pointer for clarity. - Use KCOUNT macro in profclock() to lookup the kernel profiling bucket. Tested on: alpha, amd64, arm, i386, ia64, sparc64 Reviewed by: bde (mostly)
* Move the hostb driver out of the i386 and amd64 PCI code (where it wasjhb2005-12-201-58/+0
| | | | | | | duplicated anyways) and into a single MI driver. Extend the driver a bit to implement the bus and PCI kobj interfaces such that other drivers can attach to it and transparently act as if their parent device is the PCI bus (for the most part).
* Make our ELF64 type definitions match standards. In particular thismarcel2005-12-181-1/+1
| | | | | | | | | | | | | means: o Remove Elf64_Quarter, o Redefine Elf64_Half to be 16-bit, o Redefine Elf64_Word to be 32-bit, o Add Elf64_Xword and Elf64_Sxword for 64-bit entities, o Use Elf_Size in MI code to abstract the difference between Elf32_Word and Elf64_Word. o Add Elf_Ssize as the signed counterpart of Elf_Size. MFC after: 2 weeks
* Don peril sensitive sunglasses and jack up the MAX_BPAGES limit to 8192scottl2005-12-161-1/+1
| | | | | on amd64. If you're going to stuff >4GB into your box, reserving 32MB for bonce pages amounts to a rounding error in the overall scheme of things.
* Remove linux_mib_destroy() (which I actually added in between 5.0 and 5.1)jhb2005-12-151-1/+0
| | | | | | | | | which existed to cleanup the linux_osname mutex. Now that MTX_SYSINIT() has grown a SYSUNINIT to destroy mutexes on unload, the extra destroy here was redundant and resulted in panics in debug kernels. MFC after: 1 week Reported by: Goran Gajic ggajic at afrodita dot rcub dot bg dot ac dot yu
* Fix stale comment.jhb2005-12-141-2/+1
|
* Revert previous commit. The BIOS braindamage is even worse than Ijhb2005-12-131-0/+4
| | | | | | | | originally thought. The BIOS that cleared CPUID_APIC actually managed to disable the local APIC entirely and even Windows 64 doesn't boot on it. Reported by: bz
* Don't check the CPUID_APIC bit in the cpu_features flags field to determinejhb2005-12-131-4/+0
| | | | | | | | | | | | | | if the boot CPU has a local APIC because some BIOS vendors are not competent enough to set this bit. Instead, just assume that we always have a local APIC on amd64. For i386 the check is a bit more subtle. FreeBSD requires either an MP Table or an ACPI MADT table to enumerate APICs. The only systems that have one of those tables that don't have local APICs are some presumably rare (and old) SMP 486 systems using external APICs. Thus, instead of checking the CPUID_APIC flag, check the CPU class and abort if we are running on a 486. MFC after: 1 week Reported by: bz
* For the amd64 platform, we can depend on the TSC being present. This patchpeter2005-12-121-0/+15
| | | | | | | changes DELAY to use the TSC once it has been calibrated. This does NOT use the TSC for long-term timekeeping. It only uses it to bound the DELAY() spinloop. This should not be affected by the Athlon64 X2 TSC quirks because the cpu is not halted while we use DELAY().
* Sync with i386, fix compiling for non-SMP.davidxu2005-12-091-0/+2
|
OpenPOWER on IntegriCloud