path: root/sys/amd64
Commit message  Author  Age  Files  Lines
* Remove ctl(4) from GENERIC. Also remove 'options CTL_DISABLE'  trasz  2013-04-12  1  -4/+1
| and kern.cam.ctl.disable tunable; those were introduced as a workaround to make it possible to boot GENERIC on low memory machines. With ctl(4) being built as a module and automatically loaded by ctladm(8), this makes CTL work out of the box.
| Reviewed by: ken
| Sponsored by: FreeBSD Foundation
* If vmm.ko could not be initialized correctly then prevent the creation of  neel  2013-04-12  3  -8/+21
| virtual machines subsequently.
| Submitted by: Chris Torek
* Make the code to check if VMX is enabled more readable by using macros  neel  2013-04-11  1  -1/+2
| instead of magic numbers.
| Discussed with: Chris Torek
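The idea of replacing magic numbers with named bits can be sketched as follows. This is a minimal illustration, not the committed code: the bit positions follow the documented layout of the IA32_FEATURE_CONTROL MSR (bit 0 is the lock bit, bit 2 enables VMX outside SMX), while the macro names, the helper `vmx_enabled_by_firmware()`, and the assumption that both bits must be set are chosen here for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Named bits of the IA32_FEATURE_CONTROL MSR (names are illustrative). */
#define FEATURE_CONTROL_LOCK    (1ULL << 0)  /* MSR contents are locked */
#define FEATURE_CONTROL_VMX_EN  (1ULL << 2)  /* VMX allowed outside SMX */

/*
 * Hypothetical helper: the caller passes the MSR value it has already read.
 * Assumption: VMX is considered usable only if both bits are set.
 */
static bool
vmx_enabled_by_firmware(uint64_t feature_control)
{
	const uint64_t required = FEATURE_CONTROL_LOCK | FEATURE_CONTROL_VMX_EN;

	/* A magic-number version would compare against 0x5 directly. */
	return ((feature_control & required) == required);
}
```

With named bits, a reader can tell which firmware setting is missing without consulting the processor manual.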
* Unsynchronized TSCs on the host require special handling in bhyve:  neel  2013-04-10  2  -1/+21
| - use clock_gettime(2) as the time base for the emulated ACPI timer instead of directly using rdtsc().
| - don't advertise the invariant TSC capability to the guest to discourage it from using the TSC as its time base.
| Discussed with: jhb@ (about making 'smp_tsc' a global)
| Reported by: Dan Mack on freebsd-virtualization@
| Obtained from: NetApp
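Deriving an emulated ACPI timer value from clock_gettime(2) rather than rdtsc() could look roughly like the sketch below. The 3.579545 MHz frequency is the ACPI PM timer rate; the function names `pmtmr_ticks()` and `pmtmr_read()`, the use of CLOCK_MONOTONIC, and the 32-bit counter width are assumptions made for this example and are not taken from the bhyve sources.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* ACPI PM timer frequency in Hz. */
#define PMTMR_FREQ 3579545ULL

/* Convert an elapsed interval to PM timer ticks; truncation to 32 bits wraps. */
static uint32_t
pmtmr_ticks(const struct timespec *elapsed)
{
	uint64_t ticks;

	assert(elapsed->tv_nsec >= 0 && elapsed->tv_nsec < 1000000000L);
	ticks = (uint64_t)elapsed->tv_sec * PMTMR_FREQ;
	ticks += ((uint64_t)elapsed->tv_nsec * PMTMR_FREQ) / 1000000000ULL;
	return ((uint32_t)ticks);
}

/* Hypothetical read path: counter value relative to a saved start time. */
static uint32_t
pmtmr_read(const struct timespec *start)
{
	struct timespec now, diff;

	clock_gettime(CLOCK_MONOTONIC, &now);
	diff.tv_sec = now.tv_sec - start->tv_sec;
	diff.tv_nsec = now.tv_nsec - start->tv_nsec;
	if (diff.tv_nsec < 0) {
		/* Borrow one second so tv_nsec stays in [0, 1e9). */
		diff.tv_sec--;
		diff.tv_nsec += 1000000000L;
	}
	return (pmtmr_ticks(&diff));
}
```

Because the value comes from a system-wide clock, it does not depend on which host CPU the vcpu thread happens to be running on.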
* Merge from projects/counters: counter(9).  glebius  2013-04-08  1  -0/+51
| Introduce the counter(9) API, which implements fast and raceless counters, intended for (but not limited to) gathering statistical data.
| See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html for more details.
| In collaboration with: kib
| Reviewed by: luigi
| Tested by: ae, ray
| Sponsored by: Nginx, Inc.
* Merge from projects/counters:  glebius  2013-04-08  1  -1/+3
| Pad struct pcpu so that its size is a divisor of PAGE_SIZE. This is done to reduce memory waste in UMA_PCPU_ZONE zones.
| Sponsored by: Nginx, Inc.
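Padding a per-CPU structure so that a whole number of copies fits in one page can be illustrated as follows. The field list, the 256-byte target size, and the 4096-byte page size are all assumptions for this sketch; the real struct pcpu has many more members and its own padding.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096  /* assumed page size */
#define SKETCH_PCPU_SIZE 256   /* assumed power-of-two target size */

/* Hypothetical payload; stands in for the real per-CPU fields. */
struct pcpu_fields {
	void *curthread;
	unsigned int cpuid;
	long counters[20];
};

/* Payload plus explicit padding up to the target size. */
struct pcpu_sketch {
	struct pcpu_fields f;
	char pad[SKETCH_PCPU_SIZE - sizeof(struct pcpu_fields)];
};

/* Compile-time checks: the size is exact and divides the page size. */
_Static_assert(sizeof(struct pcpu_sketch) == SKETCH_PCPU_SIZE,
    "padding does not bring the struct to the target size");
_Static_assert(SKETCH_PAGE_SIZE % sizeof(struct pcpu_sketch) == 0,
    "struct size must divide the page size");

/* Number of per-CPU slots that fit in one page with no remainder. */
static size_t
pcpus_per_page(void)
{
	return (SKETCH_PAGE_SIZE / sizeof(struct pcpu_sketch));
}
```

When the size divides the page size, an allocator that carves pages into per-CPU slots leaves no unused tail at the end of each page.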
* Don't panic when a valid divisor of 1 has been requested.  grehan  2013-04-05  1  -0/+2
| Obtained from: NetApp
* Remove all legacy ATA code parts, not used since options ATA_CAM enabled in  mav  2013-04-04  1  -1/+0
| most kernels before FreeBSD 9.0. Remove such modules and respective kernel options: atadisk, ataraid, atapicd, atapifd, atapist, atapicam. Remove the atacontrol utility and some man pages. Remove the now-useless options ATA_CAM.
| No objections: current@, stable@
| MFC after: never
* Add counter to keep track of the number of timer interrupts generated by  neel  2013-03-31  1  -0/+4
| the local apic for each virtual cpu.
* Add some more stats to keep track of all the reasons that a vcpu is exiting.  neel  2013-03-30  3  -1/+42
|
* Allow caller to skip 'guest linear address' validation when doing instruction  neel  2013-03-28  2  -5/+16
| decode. This is to accommodate hardware assist implementations that do not provide the 'guest linear address' as part of nested page fault collateral.
| Submitted by: Anish Gupta (akgupt3 at gmail dot com)
* Implement the concept of the unmapped VMIO buffers, i.e. buffers which  kib  2013-03-19  1  -0/+2
| do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminates the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads.
| The unmapped buffer should be explicitly requested by the GB_UNMAPPED flag by the consumer. For unmapped buffer, no KVA reservation is performed at all. The consumer might request unmapped buffer which does have a KVA reserve, to manually map it without recursing into buffer cache and blocking, with the GB_KVAALLOC flag. When the mapped buffer is requested and unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation.
| Unmapped buffer is translated into unmapped bio in g_vfs_strategy(). Unmapped bios carry a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitly specify a readiness to accept unmapped bio, otherwise g_down geom thread performs the transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap.
| The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it could be manually tuned by kern.bio_transient_maxcnt tunable, in the units of the transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests.
| Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable, disabling which makes the buffer (or cluster) creation requests ignore the GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested.
| In the rework, filesystem metadata is no longer subject to the maxbufspace limit. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, are accounted against maxbufspace, as before. Effectively, this means that the maxbufspace is enforced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not work, because buffer_map fragmentation does not allow the limit to be reached.
| At Jeff Roberson's request, the getnewbuf() function was split into smaller single-purpose functions.
| Sponsored by: The FreeBSD Foundation
| Discussed with: jeff (previous version)
| Tested by: pho, scottl (previous version), jhb, bf
| MFC after: 2 weeks
* MFC  attilio  2013-03-17  5  -11/+68
|\
| * Fix the '-Wtautological-compare' warning emitted by clang for comparing the  neel  2013-03-16  1  -1/+1
| | unsigned enum type with a negative value.
| | Obtained from: NetApp
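The class of warning mentioned here arises when a value of an enum type that the compiler represents as unsigned is tested against a negative bound, so the test can never be true. The sketch below shows one way to write a range check that avoids such a comparison; the enum `vm_stat_kind` and the helper `stat_kind_valid()` are hypothetical and do not reproduce the actual change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum with no negative enumerators. */
enum vm_stat_kind {
	VM_STAT_FIRST,
	VM_STAT_SECOND,
	VM_STAT_LAST
};

/*
 * A check such as "kind < 0" is always false when the compiler gives the
 * enum an unsigned underlying type, which is what clang reports. Casting to
 * int first makes the lower-bound test meaningful.
 */
static bool
stat_kind_valid(enum vm_stat_kind kind)
{
	return ((int)kind >= 0 && (int)kind < (int)VM_STAT_LAST);
}
```

An alternative with the same effect is to drop the lower-bound test entirely and rely on the upper bound, since an out-of-range unsigned value fails that test as well.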
| * Allow vmm stats to be specific to the underlying hardware assist technology.  neel  2013-03-16  4  -10/+43
| | This can be done by using the new macros VMM_STAT_INTEL() and VMM_STAT_AMD(). Statistic counters that are common across the two are defined using VMM_STAT().
| | Suggested by: Anish Gupta
| | Discussed with: grehan
| | Obtained from: NetApp
| * Add pmap function pmap_copy_pages(), which copies the content of the  kib  2013-03-14  1  -0/+24
| | pages around, taking an array of vm_page_t for both source and destination. Starting offsets and total transfer size are specified. The function implements an optimal algorithm for copying using the platform-specific optimizations. For instance, on the architectures where the direct map is available, no transient mappings are created; for i386 the per-cpu ephemeral page frame is used. The code was typically borrowed from the pmap_copy_page() for the same architecture.
| | Only i386/amd64, powerpc aim and arm/arm-v6 implementations were tested at the time of commit. High-level code, not committed yet to the tree, ensures that the use of the function is only allowed after explicit enablement.
| | For sparc64, the existing code has known issues and a stub is added instead, to allow the kernel linking.
| | Sponsored by: The FreeBSD Foundation
| | Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6)
| | MFC after: 2 weeks
* | Simplify the interface to vm_radix_insert() by eliminating the parameter  alc  2013-03-17  1  -1/+1
| | "index". The content of a radix tree leaf, or at least its "key", is not opaque to the other radix tree operations. Specifically, they know how to extract the "key" from a leaf. So, eliminating the parameter "index" isn't breaking the abstraction. Moreover, eliminating the parameter "index" effectively prevents the caller from passing an inconsistent "index" and leaf to vm_radix_insert().
| | Reviewed by: attilio
| | Sponsored by: EMC / Isilon Storage Division
* | When a superpage promotion occurs, the page table page that the superpage  alc  2013-03-12  2  -123/+15
| | mapping replaces is added to an ordered collection of page table pages. Rather than preserving the code that implements the splay tree of pages in the pmap for just this one purpose, use the new MI radix tree. The extra overhead of using a radix tree for this purpose is small enough, about 4% added run-time to pmap_promote_pde(), that I don't see the point of preserving the splay tree code.
* | MFC  attilio  2013-03-11  1  -1/+0
|\ \
| |/
| * The kernel pmap is statically allocated, so there is really no need to  alc  2013-03-10  1  -1/+0
| | explicitly initialize its pm_root field to zero.
| | Sponsored by: EMC / Isilon Storage Division
* | MFC  attilio  2013-03-09  2  -6/+7
|\ \
| |/
| * MFC  attilio  2013-03-08  2  -1/+11
| |\
| * \ MFC  attilio  2013-03-02  4  -38/+39
| |\ \
| * | | Hide the details for the assertion for VM_OBJECT_LOCK operations.  attilio  2013-02-21  1  -6/+6
| | | | Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into VM_OBJECT_ASSERT_WLOCKED(foo)
| | | | Sponsored by: EMC / Isilon storage division
| | | | Requested by: alc
| * | | MFC  attilio  2013-02-21  5  -454/+15
| |\ \ \
| * | | | There is no need to use VM_OBJECT_LOCKED() as the assertion won't  attilio  2013-02-20  1  -3/+2
| | | | | make the check available in any case if INVARIANTS is switched off. Remove VM_OBJECT_LOCKED().
| * | | | Switch vm_object lock to be a rwlock.  attilio  2013-02-20  2  -5/+6
| | | | | * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations
| | | | | * VM_OBJECT_SLEEP() is introduced as a general purpose primitive to get a sleep operation using a VM_OBJECT_LOCK() as protection
| | | | | * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h
* | | | | MFC  attilio  2013-03-07  2  -1/+11
|\ \ \ \ \
| | |_|_|/
| |/| | |
| * | | | Remove the virtio dependency entry for the VirtIO device drivers. This  bryanv  2013-03-06  1  -0/+7
| | | | | will prevent the kernel from linking if the device drivers are included without the virtio module. Remove pci and scbus for the same reason.
| | | | | Also explain the relationship and necessity of the virtio and virtio_pci modules. Currently in FreeBSD, we only support VirtIO PCI, but it could be replaced with a different interface (like MMIO) and the device (network, block, etc) will still function.
| | | | | Requested by: luigi
| | | | | Approved by: grehan (mentor)
| | | | | MFC after: 3 days
| * | | | Re-enable CTL in GENERIC on i386 and amd64, but turn on the CTL disable  ken  2013-03-04  1  -1/+4
| | |_|/
| |/| |
| | | | tunable by default. This will allow GENERIC configurations to boot on small memory boxes, but not require end users who want to use CTL to recompile their kernel. They can simply set kern.cam.ctl.disable=0 in loader.conf.
| | | | The eventual solution to the memory usage problem is to change the way CTL allocates memory to be more configurable, but this should fix things for small memory situations in the mean time.
| | | | UPDATING: Explain the change in the CTL configuration, and how users can enable CTL if they would like to use it.
| | | | sys/conf/options: Add a new option, CTL_DISABLE, that prevents CTL from initializing.
| | | | ctl.c: If CTL_DISABLE is turned on, don't initialize.
| | | | i386/conf/GENERIC, amd64/conf/GENERIC: Re-enable device ctl, and add the CTL_DISABLE option.
* | | | MFC  attilio  2013-03-02  3  -11/+12
|\ \ \ \
| |/ / /
| * | | Merge from vmc-playground branch:  attilio  2013-03-02  2  -28/+28
| | | | Rename the pv_entry_t iterator from pv_list to pv_next. Besides being more correct technically (as the name seems to suggest this is a list while it is an iterator), it will also be needed by vm_radix work to avoid a nameclash on macro expansions.
| | | | Sponsored by: EMC / Isilon storage division
| | | | Reviewed by: alc, jeff
| | | | Tested by: flo, pho, jhb, davide
| * | | Disable the ctl driver in GENERIC.  adrian  2013-03-02  1  -1/+1
| | | | It unfortunately steals a fair chunk of RAM at startup even if it's not actively used, which prevents FreeBSD VMs of 128MB from successfully booting and running.
| * | | MFcalloutng:  davide  2013-02-28  1  -9/+10
| | | | When CPU becomes idle, cpu_idleclock() calculates time to the next timer event in order to reprogram hw timer. Return that time in sbintime_t to the caller and pass it to acpi_cpu_idle(), where it can be used as one more factor (quite precise) to estimate further sleep time and choose optimal sleep state. This is a preparatory change for further callout improvements that will be committed in the next days. The commit is not targeted for MFC.
| * | | Merge from vmobj-rwlock:  attilio  2013-02-27  1  -3/+2
| | |/
| |/|
| | | VM_OBJECT_LOCKED() macro is only used to implement a custom version of lock assertions right now (which likely spread out thanks to copy and paste). Remove it and implement actual assertions.
| | | Sponsored by: EMC / Isilon storage division
| | | Reviewed by: alc
| | | Tested by: pho
* | | MFC  attilio  2013-02-27  1  -3/+2
| | |
* | | MFC  attilio  2013-02-24  7  -459/+20
|\ \ \
| |/ /
| * | Convert machine/elf.h, machine/frame.h, machine/sigframe.h,  kib  2013-02-20  5  -454/+15
| |/
| | machine/signal.h and machine/ucontext.h into common x86 includes, copying from amd64 and merging with i386.
| | Kernel-only compat definitions are kept in the i386/include/sigframe.h and i386/include/signal.h, to reduce amd64 kernel namespace pollution. The amd64 compat uses its own definitions so far.
| | The _MACHINE_ELF_WANT_32BIT definition is to allow the sys/boot/userboot/userboot/elf32_freebsd.c to use i386 ELF definitions on the amd64 compile host. The same hack could be usefully abused by other code too.
| * Consistently use round_page(x) rather than roundup(x, PAGE_SIZE). There is  jkim  2013-02-15  2  -5/+5
| | no functional change.
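Why the two forms are interchangeable can be checked with a small user-space sketch. The macros below are local stand-ins written for this example (prefixed `sk_` to avoid clashing with system headers) and assume a 4096-byte page; they mirror the usual shape of such macros rather than quoting the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_PAGE_SIZE 4096UL               /* assumed page size */
#define SK_PAGE_MASK (SK_PAGE_SIZE - 1)

/* Generic round-up to a multiple of y, using division and multiplication. */
#define sk_roundup(x, y) ((((x) + ((y) - 1)) / (y)) * (y))

/* Page-specific round-up, using a mask because the size is a power of two. */
#define sk_round_page(x) (((unsigned long)(x) + SK_PAGE_MASK) & ~SK_PAGE_MASK)

/* Returns true when both forms give the same result for x. */
static bool
rounding_agrees(unsigned long x)
{
	/* The mask form is only valid for power-of-two page sizes. */
	assert((SK_PAGE_SIZE & SK_PAGE_MASK) == 0);
	return (sk_round_page(x) == sk_roundup(x, SK_PAGE_SIZE));
}
```

Using the page-specific macro states the intent directly and avoids repeating the page size at every call site.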
* | MFC  attilio  2013-02-15  1  -2/+5
|\ \
| |/
| * Print slightly more useful information on the 'bad pte' panic.  kib  2013-02-14  1  -2/+4
| | No objections from: alc
| | MFC after: 1 week
| * Assert that user address is never qremoved.  kib  2013-02-14  1  -0/+1
| | No objections from: alc
| | MFC after: 1 week
* | MFC  attilio  2013-02-14  1  -2/+2
|\ \
| |/
| * Requests for invalid CPUID leaves should map to the highest known leaf instead.  neel  2013-02-13  1  -2/+2
| | Reviewed by: grehan
| | Obtained from: NetApp
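The mapping described in this entry can be sketched as a small leaf-selection function. This is an assumption-laden illustration: the helper name `effective_cpuid_leaf()`, its parameters, and the choice to redirect out-of-range extended leaves to the highest basic leaf (the behaviour Intel documents for its processors) are not taken from the vmm sources.

```c
#include <assert.h>
#include <stdint.h>

#define CPUID_EXT_BASE 0x80000000u  /* first extended CPUID leaf */

/*
 * Hypothetical helper: given the requested leaf and the highest supported
 * basic and extended leaves, return the leaf whose data should be reported.
 */
static uint32_t
effective_cpuid_leaf(uint32_t leaf, uint32_t max_basic, uint32_t max_ext)
{
	assert(max_basic < CPUID_EXT_BASE && max_ext >= CPUID_EXT_BASE);

	if (leaf >= CPUID_EXT_BASE) {
		/* Extended range: valid up to max_ext. */
		if (leaf > max_ext)
			return (max_basic);
	} else if (leaf > max_basic) {
		/* Basic range: valid up to max_basic. */
		return (max_basic);
	}
	return (leaf);
}
```

For example, with a highest basic leaf of 0xd, a request for leaf 0x20 is answered with the data for leaf 0xd.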
* | MFC  attilio  2013-02-13  5  -111/+2
|\ \
| |/
| * Implement guest vcpu pinning using 'pthread_setaffinity_np(3)'.  neel  2013-02-11  5  -111/+2
| | Prior to this change pinning was implemented via an ioctl (VM_SET_PINNING) that called 'sched_bind()' on behalf of the user thread. The ULE implementation of 'sched_bind()' bumps up 'td_pinned' which in turn runs afoul of the assertion '(td_pinned == 0)' in userret().
| | Using the cpuset affinity to implement pinning of the vcpu threads works with both 4BSD and ULE schedulers and has the happy side-effect of getting rid of a bunch of code in vmm.ko.
| | Discussed with: grehan
* | MFC  attilio  2013-02-06  3  -19/+59
|\ \
| |/
| * Compute the number of initial kernel page table pages (NKPT) dynamically.  neel  2013-02-06  3  -19/+59
| | This eliminates the need to recompile the kernel when the default value of NKPT is not big enough, e.g. when loading large kernel modules or memory disk images from the loader.
| | If NKPT is defined in the kernel configuration file then it overrides the dynamic calculation.
| | Reviewed by: alc, kib
* | MFC  attilio  2013-02-03  53  -157/+12367
|\ \
| |/
| * cpususpend_handler: mark AP as resumed only after fully setting up lapic  avg  2013-02-02  1  -2/+2
| | Reviewed by: jhb
| | Tested by: Sergey V. Dyatko <sergey.dyatko@gmail.com>, KAHO Toshikazu <kaho@elam.kais.kyoto-u.ac.jp>
| | MFC after: 12 days