Commit message | Author | Age | Files | Lines
* Fix the incomplete conversion from atomic_t to long for test_bit(). | jkim | 2013-08-29 | 1 | -1/+1
* Clarify confusions between atomic_t and bitmap. Fix bitmap operations | jkim | 2013-08-29 | 2 | -13/+19
  accordingly.
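The distinction behind the two fixes above can be illustrated roughly as follows: Linux-style bit operations such as test_bit() index into an array of longs, whereas atomic_t wraps a single int counter, so the two are not interchangeable. This is a minimal user-space sketch, not the code from the commits; the helper names bitmap_set_bit() and bitmap_test_bit() are hypothetical, and the helpers are deliberately non-atomic.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of bits stored in one bitmap word (an unsigned long). */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: set bit `nr` in a bitmap kept as an array of longs. */
static void
bitmap_set_bit(size_t nr, unsigned long *map)
{
	assert(map != NULL);
	map[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

/* Hypothetical helper: return 1 if bit `nr` is set, 0 otherwise. */
static int
bitmap_test_bit(size_t nr, const unsigned long *map)
{
	assert(map != NULL);
	return ((map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL) != 0;
}
```

Passing the address of an int-sized counter to helpers like these would read or write the wrong width on LP64 platforms, which is the kind of mismatch the commit subjects describe.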
* Implement vector callback for PVHVM and unify event channel implementations | gibbs | 2013-08-29 | 64 | -2654/+2433
  Re-structure Xen HVM support so that:
    - Xen is detected and hypercalls can be performed very early in system startup.
    - Xen interrupt services are implemented using FreeBSD's native interrupt delivery infrastructure.
    - the Xen interrupt service implementation is shared between PV and HVM guests.
    - Xen interrupt handlers can optionally use a filter handler in order to avoid the overhead of dispatch to an interrupt thread.
    - interrupt load can be distributed among all available CPUs.
    - the overhead of accessing the emulated local and I/O APICs on HVM is removed for event channel port events.
    - a similar optimization can eventually, and fairly easily, be used to optimize MSI.

  Early Xen detection, HVM refactoring, PVHVM interrupt infrastructure, and misc Xen cleanups:
  Sponsored by: Spectra Logic Corporation

  Unification of PV & HVM interrupt infrastructure, bug fixes, and misc Xen cleanups:
  Submitted by: Roger Pau Monné
  Sponsored by: Citrix Systems R&D

  sys/x86/x86/local_apic.c:
  sys/amd64/include/apicvar.h:
  sys/i386/include/apicvar.h:
  sys/amd64/amd64/apic_vector.S:
  sys/i386/i386/apic_vector.s:
  sys/amd64/amd64/machdep.c:
  sys/i386/i386/machdep.c:
  sys/i386/xen/exception.s:
  sys/x86/include/segments.h:
    Reserve IDT vector 0x93 for the Xen event channel upcall interrupt handler. On Hypervisors that support the direct vector callback feature, we can request that this vector be called directly by an injected HVM interrupt event, instead of a simulated PCI interrupt on the Xen platform PCI device. This avoids all of the overhead of dealing with the emulated I/O APIC and local APIC. It also means that the Hypervisor can inject these events on any CPU, allowing upcalls for different ports to be handled in parallel.

  sys/amd64/amd64/mp_machdep.c:
  sys/i386/i386/mp_machdep.c:
    Map Xen per-vcpu area during AP startup.

  sys/amd64/include/intr_machdep.h:
  sys/i386/include/intr_machdep.h:
    Increase the FreeBSD IRQ vector table to include space for event channel interrupt sources.

  sys/amd64/include/pcpu.h:
  sys/i386/include/pcpu.h:
    Remove Xen HVM per-cpu variable data. These fields are now allocated via the dynamic per-cpu scheme. See xen_intr.c for details.

  sys/amd64/include/xen/hypercall.h:
  sys/dev/xen/blkback/blkback.c:
  sys/i386/include/xen/xenvar.h:
  sys/i386/xen/clock.c:
  sys/i386/xen/xen_machdep.c:
  sys/xen/gnttab.c:
    Prefer FreeBSD primitives to Linux ones in Xen support code.

  sys/amd64/include/xen/xen-os.h:
  sys/i386/include/xen/xen-os.h:
  sys/xen/xen-os.h:
  sys/dev/xen/balloon/balloon.c:
  sys/dev/xen/blkback/blkback.c:
  sys/dev/xen/blkfront/blkfront.c:
  sys/dev/xen/console/xencons_ring.c:
  sys/dev/xen/control/control.c:
  sys/dev/xen/netback/netback.c:
  sys/dev/xen/netfront/netfront.c:
  sys/dev/xen/xenpci/xenpci.c:
  sys/i386/i386/machdep.c:
  sys/i386/include/pmap.h:
  sys/i386/include/xen/xenfunc.h:
  sys/i386/isa/npx.c:
  sys/i386/xen/clock.c:
  sys/i386/xen/mp_machdep.c:
  sys/i386/xen/mptable.c:
  sys/i386/xen/xen_clock_util.c:
  sys/i386/xen/xen_machdep.c:
  sys/i386/xen/xen_rtc.c:
  sys/xen/evtchn/evtchn_dev.c:
  sys/xen/features.c:
  sys/xen/gnttab.c:
  sys/xen/gnttab.h:
  sys/xen/hvm.h:
  sys/xen/xenbus/xenbus.c:
  sys/xen/xenbus/xenbus_if.m:
  sys/xen/xenbus/xenbusb_front.c:
  sys/xen/xenbus/xenbusvar.h:
  sys/xen/xenstore/xenstore.c:
  sys/xen/xenstore/xenstore_dev.c:
  sys/xen/xenstore/xenstorevar.h:
    Pull common Xen OS support functions/settings into xen/xen-os.h.

  sys/amd64/include/xen/xen-os.h:
  sys/i386/include/xen/xen-os.h:
  sys/xen/xen-os.h:
    Remove constants, macros, and functions unused in FreeBSD's Xen support.

  sys/xen/xen-os.h:
  sys/i386/xen/xen_machdep.c:
  sys/x86/xen/hvm.c:
    Introduce new functions xen_domain(), xen_pv_domain(), and xen_hvm_domain(). These are used in favor of #ifdefs so that FreeBSD can dynamically detect and adapt to the presence of a hypervisor. The goal is to have an HVM optimized GENERIC, but more is necessary before this is possible.

  sys/amd64/amd64/machdep.c:
  sys/dev/xen/xenpci/xenpcivar.h:
  sys/dev/xen/xenpci/xenpci.c:
  sys/x86/xen/hvm.c:
  sys/sys/kernel.h:
    Refactor magic ioport, Hypercall table and Hypervisor shared information page setup, and move it to a dedicated HVM support module. HVM mode initialization is now triggered during the SI_SUB_HYPERVISOR phase of system startup. This currently occurs just after the kernel VM is fully set up, which is just enough infrastructure to allow the hypercall table and shared info page to be properly mapped.

  sys/xen/hvm.h:
  sys/x86/xen/hvm.c:
    Add definitions and a method for configuring Hypervisor event delivery via a direct vector callback.

  sys/amd64/include/xen/xen-os.h:
  sys/x86/xen/hvm.c:
  sys/conf/files:
  sys/conf/files.amd64:
  sys/conf/files.i386:
    Adjust kernel build to reflect the refactoring of early Xen startup code and Xen interrupt services.

  sys/dev/xen/blkback/blkback.c:
  sys/dev/xen/blkfront/blkfront.c:
  sys/dev/xen/blkfront/block.h:
  sys/dev/xen/control/control.c:
  sys/dev/xen/evtchn/evtchn_dev.c:
  sys/dev/xen/netback/netback.c:
  sys/dev/xen/netfront/netfront.c:
  sys/xen/xenstore/xenstore.c:
  sys/xen/evtchn/evtchn_dev.c:
  sys/dev/xen/console/console.c:
  sys/dev/xen/console/xencons_ring.c:
    Adjust drivers to use new xen_intr_*() API.

  sys/dev/xen/blkback/blkback.c:
    Since blkback defers all event handling to a taskqueue, convert this task queue to a "fast" taskqueue, and schedule it via an interrupt filter. This avoids an unnecessary ithread context switch.

  sys/xen/xenstore/xenstore.c:
    The xenstore driver is MPSAFE. Indicate as much when registering its interrupt handler.

  sys/xen/xenbus/xenbus.c:
  sys/xen/xenbus/xenbusvar.h:
    Remove unused event channel APIs.

  sys/xen/evtchn.h:
    Remove all kernel Xen interrupt service API definitions from this file. It is now only used for structure and ioctl definitions related to the event channel userland device driver. Update the definitions in this file to match those from NetBSD. Implementing this interface will be necessary for Dom0 support.

  sys/xen/evtchn/evtchnvar.h:
    Add a header file for implementation internal APIs related to managing event channel event delivery. This is used to allow, for example, the event channel userland device driver to access low-level routines that typical kernel consumers of event channel services should never access.

  sys/xen/interface/event_channel.h:
  sys/xen/xen_intr.h:
    Standardize on the evtchn_port_t type for referring to an event channel port id. In order to prevent low-level event channel APIs from leaking to kernel consumers who should not have access to this data, the type is defined twice: once in the Xen provided event_channel.h, and again in xen/xen_intr.h. The double declaration is protected by __XEN_EVTCHN_PORT_DEFINED__ to ensure it is never declared twice within a given compilation unit.

  sys/xen/xen_intr.h:
  sys/xen/evtchn/evtchn.c:
  sys/x86/xen/xen_intr.c:
  sys/dev/xen/xenpci/evtchn.c:
  sys/dev/xen/xenpci/xenpcivar.h:
    New implementation of Xen interrupt services. This is similar in many respects to the i386 PV implementation with the exception that events bound to event channel ports (i.e. not IPI, virtual IRQ, or physical IRQ) are further optimized to avoid mask/unmask operations that aren't necessary for these edge triggered events. Stubs exist for supporting physical IRQ binding, but will need additional work before this implementation can be fully shared between PV and HVM.

  sys/amd64/amd64/mp_machdep.c:
  sys/i386/i386/mp_machdep.c:
  sys/i386/xen/mp_machdep.c:
  sys/x86/xen/hvm.c:
    Add support for placing vcpu_info into an arbitrary memory page instead of using HYPERVISOR_shared_info->vcpu_info. This allows the creation of domains with more than 32 vcpus.

  sys/i386/i386/machdep.c:
  sys/i386/xen/clock.c:
  sys/i386/xen/xen_machdep.c:
  sys/i386/xen/exception.s:
    Add support for new event channel implementation.
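The runtime predicates named in the commit above, xen_domain(), xen_pv_domain(), and xen_hvm_domain(), could look roughly like the sketch below. Only the three function names come from the commit message; the enum, the global variable that records the detected mode, and the interrupt_path() consumer are assumptions made for illustration, and the real definitions in xen/xen-os.h may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed representation: one global records what was detected at boot. */
enum xen_domain_type {
	XEN_NATIVE,     /* not running as a Xen guest */
	XEN_PV_DOMAIN,  /* paravirtualized Xen guest */
	XEN_HVM_DOMAIN  /* hardware-virtualized Xen guest */
};

static enum xen_domain_type xen_domain_type = XEN_NATIVE;

static bool
xen_domain(void)
{
	return xen_domain_type != XEN_NATIVE;
}

static bool
xen_pv_domain(void)
{
	return xen_domain_type == XEN_PV_DOMAIN;
}

static bool
xen_hvm_domain(void)
{
	return xen_domain_type == XEN_HVM_DOMAIN;
}

/* Hypothetical consumer: a runtime branch where an #ifdef XEN used to be. */
static const char *
interrupt_path(void)
{
	if (!xen_domain())
		return "native";
	return xen_hvm_domain() ? "hvm-event-channel" : "pv-event-channel";
}
```

Because the decision is made at run time, a single kernel binary can take either path, which matches the stated goal of an HVM-optimized GENERIC kernel.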
* - Remove test_and_set_bit() macro. It is unused since r255037. | jkim | 2013-08-29 | 1 | -5/+3
  - Relax atomic_read() and atomic_set() macros. Linux does not require any memory barrier. Also, these macros may even be reordered or optimized away according to the API documentation:
    https://www.kernel.org/doc/Documentation/atomic_ops.txt
* Convert the if_lagg rwlock to an rmlock. | adrian | 2013-08-29 | 2 | -33/+56
  We've been seeing lots of cache line contention (but not lock contention!) in our workloads between the various TX and RX threads going on.

  The write lock is only grabbed when configuration changes are made - which are infrequent.

  With this patch, the contention and cycles spent waiting for updates disappear.

  Sponsored by: Netflix, Inc.
* Fix atomic operations on context_flag without altering semantics. | jkim | 2013-08-29 | 1 | -3/+3
* Add directories that are installed as part of bsdconfig. | delphij | 2013-08-29 | 1 | -0/+74
  These are included unconditionally for now because bsdconfig is currently installed unconditionally.

  This fixes a 'make -j 17 installworld' failure caused by a race condition.

  MFC candidate.
* Add a few missing language directories for /usr. | delphij | 2013-08-29 | 1 | -0/+8
* Update to 2013-08-29 NetBSD libexecinfo snapshot | emaste | 2013-08-29 | 3 | -13/+27
  This adds my patch to use the kern.proc.pathname sysctl instead of relying on procfs(5).
* Fix some issues in change 254760 pointed out by Bruce Evans: | ken | 2013-08-29 | 1 | -11/+8
  - Remove excessive parentheses
  - Use KNF continuation indentation
  - Cut down on excessive continuation lines
  - More consistent style in messages
  - Use uprintf() instead of printf()

  Submitted by: bde
* Work-around a timing problem with the ITE IT8513E now that the core | marcel | 2013-08-29 | 1 | -1/+13
  calls ns8250_bus_ipend() almost immediately after ns8250_bus_attach(). As it appears, a line break condition is being signalled for almost all received characters due to this. A delay of 150ms seems enough to allow the H/W to settle and to avoid the problem. More analysis is needed, but for now a regression has been addressed.

  Reported by: kevlo@
  Tested by: kevlo@
* Don't return an error for socket timeouts that are too large. Just | jhb | 2013-08-29 | 1 | -7/+2
  cap them to INT_MAX ticks instead.

  PR: kern/181416 (r254699 really)
  Requested by: bde
  MFC after: 2 weeks
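Capping rather than rejecting an oversized timeout, as the commit above describes, amounts to a saturating conversion from a time value to ticks. The sketch below is an illustration under assumptions, not the kernel code: timeout_to_ticks() is a hypothetical helper, and the tick rate is passed in as a plain parameter.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a non-negative timeout to scheduler ticks at
 * `hz` ticks per second, saturating at INT_MAX instead of reporting an error.
 */
static int
timeout_to_ticks(int64_t sec, int64_t usec, int hz)
{
	int64_t ticks;

	assert(sec >= 0 && usec >= 0 && usec < 1000000 && hz > 0);

	/* If the seconds part alone would exceed INT_MAX ticks, saturate now. */
	if (sec > INT_MAX / hz)
		return INT_MAX;

	/* Round the microsecond part up so a short timeout never becomes zero. */
	ticks = sec * hz + (usec * hz + 999999) / 1000000;
	return ticks > INT_MAX ? INT_MAX : (int)ticks;
}
```

With hz = 1000, a one-second timeout yields 1000 ticks, while a timeout of billions of seconds yields INT_MAX instead of an error return.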
* Fix after r255014 | antoine | 2013-08-29 | 2 | -4/+4
* Significantly reduce the cost, i.e., run time, of calls to madvise(..., | alc | 2013-08-29 | 13 | -11/+419
  MADV_DONTNEED) and madvise(..., MADV_FREE). Specifically, introduce a new pmap function, pmap_advise(), that operates on a range of virtual addresses within the specified pmap, allowing for a more efficient implementation of MADV_DONTNEED and MADV_FREE. Previously, the implementation of MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as pmap_clear_reference(). Intuitively, the problem with this implementation is that the pmap-level locks are acquired and released and the page table traversed repeatedly, once for each resident page in the range that was specified to madvise(2). A more subtle flaw with the previous implementation is that pmap_clear_reference() would clear the reference bit on all mappings to the specified page, not just the mapping in the range specified to madvise(2).

  Since our malloc(3) makes heavy use of madvise(2), this change can have a measurable impact. For example, the system time for completing a parallel "buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%.

  Note: This change only contains pmap_advise() implementations for a subset of our supported architectures. I will commit implementations for the remaining architectures after further testing. For now, a stub function is sufficient because of the advisory nature of pmap_advise().

  Discussed with: jeff, jhb, kib
  Tested by: pho (i386), marcel (ia64)
  Sponsored by: EMC / Isilon Storage Division
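The locking cost described in the commit above can be made concrete with a toy model. Everything here is hypothetical: struct toy_pmap, its counter, and the toy_ functions only simulate the shape of per-page versus per-range operations and do not reflect the real pmap data structures or pmap_advise() signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 16

/* Toy address space: per-page state plus a count of lock acquisitions. */
struct toy_pmap {
	bool resident[TOY_NPAGES];    /* is the page currently mapped? */
	bool referenced[TOY_NPAGES];  /* simulated reference bit */
	unsigned lock_acquisitions;   /* times the pmap lock was taken */
};

/* Old shape: every page costs one lock/unlock round trip. */
static void
toy_advise_per_page(struct toy_pmap *pm, size_t start, size_t end)
{
	assert(pm != NULL && start <= end && end <= TOY_NPAGES);
	for (size_t i = start; i < end; i++) {
		if (!pm->resident[i])
			continue;
		pm->lock_acquisitions++;      /* lock, clear one bit, unlock */
		pm->referenced[i] = false;
	}
}

/* New shape: take the lock once, then walk the whole range. */
static void
toy_advise_range(struct toy_pmap *pm, size_t start, size_t end)
{
	assert(pm != NULL && start <= end && end <= TOY_NPAGES);
	pm->lock_acquisitions++;              /* single lock for the range */
	for (size_t i = start; i < end; i++) {
		if (pm->resident[i])
			pm->referenced[i] = false;
	}
}
```

For a range with eight resident pages, the per-page variant records eight lock acquisitions and the range variant records one, while both leave the same reference bits cleared.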
* Migrate iwn(4) to use the new ieee80211_tx_complete() API. | adrian | 2013-08-29 | 1 | -35/+23
  Tested:
  * Intel 5100, STA mode
* Remove the duplicate LLC_MISS event and put it in the right order. | adrian | 2013-08-29 | 1 | -3/+2
* Prevent the full restart cycle every time arge_start() is called. Only | loos | 2013-08-29 | 1 | -1/+2
  (re)start the interface when it is down.

  This change fixes a race with BOOTP where the response packet is lost because the interface is being reset by a netmask change right after sending the packet.

  PR: 178318
  Approved by: adrian (mentor)
* Remove GNU_PATCH leftover. | andreast | 2013-08-29 | 1 | -5/+0
* Merge r254736 from user/np/cxl_tuning. | np | 2013-08-29 | 1 | -12/+14
  Don't leak tags when M_NOFREE | M_PKTHDR mbufs are freed.

  Reviewed by: andre
* Merge r254386 from user/np/cxl_tuning. Add an INET|INET6 check missing | np | 2013-08-29 | 3 | -0/+17
  in said revision.

  r254386:
  Flush inactive LRO entries periodically.
* Drop build option switch for the older GNU patch. | pfg | 2013-08-29 | 4 | -19/+0
  As promised, drop the option to make the older GNU patch the default.

  GNU patch is still being built but something drastic may happen to it before Release.
* Correct atomic operations in i915. | jkim | 2013-08-28 | 3 | -11/+11
* Fix a compiler warning and add couple of VM map types. | jkim | 2013-08-28 | 1 | -3/+21
* Whitespace nit. | np | 2013-08-28 | 1 | -1/+1
* Merge r254336 from user/np/cxl_tuning. | np | 2013-08-28 | 2 | -1/+26
  Add a last-modified timestamp to each LRO entry and provide an interface to flush all inactive entries. Drivers decide when to flush and what the inactivity threshold should be.

  Network drivers that process an rx queue to completion can enter a livelock type situation when the rate at which packets are received reaches equilibrium with the rate at which the rx thread is processing them. When this happens the final LRO flush (normally when the rx routine is done) does not occur. Pure ACKs and segments with total payload < 64K can get stuck in an LRO entry. Symptoms are that TCP tx-mostly connections' performance falls off a cliff during heavy, unrelated rx on the interface.

  Flushing only inactive LRO entries works better than any of these alternates that I tried:
  - don't LRO pure ACKs
  - flush _all_ LRO entries periodically (every 'x' microseconds or every 'y' descriptors)
  - stop rx processing in the driver periodically and schedule remaining work for later.

  Reviewed by: andre
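The mechanism described in the commit above, a per-entry last-modified timestamp plus a flush that skips recently touched entries, can be sketched as follows. The structure and function names are hypothetical and the time unit is an assumption; the real LRO interface and entry layout may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy LRO entry: only the fields needed to show the inactivity test. */
struct toy_lro_entry {
	bool     active;       /* entry currently holds aggregated segments */
	uint64_t last_mod_us;  /* time of the last append, in microseconds */
};

/*
 * Flush every active entry that has been idle for at least `threshold_us`.
 * The caller (the driver) chooses both when to call this and the threshold.
 * Returns the number of entries flushed.
 */
static size_t
toy_lro_flush_inactive(struct toy_lro_entry *e, size_t n, uint64_t now_us,
    uint64_t threshold_us)
{
	size_t flushed = 0;

	assert(e != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!e[i].active)
			continue;
		assert(now_us >= e[i].last_mod_us);
		if (now_us - e[i].last_mod_us >= threshold_us) {
			e[i].active = false;  /* stand-in for passing data up the stack */
			flushed++;
		}
	}
	return flushed;
}
```

Entries that are still receiving segments keep aggregating, while stale ones, such as a stuck pure ACK, are released even if the rx routine never reaches its final flush.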
* Fix a compiler warning. With this fix, a negative time can be converted to | jkim | 2013-08-28 | 1 | -1/+1
  a struct timeval and back to the original nanoseconds correctly.
* Support storing 7 additional file flags in tmpfs: | ken | 2013-08-28 | 1 | -3/+4
  UF_SYSTEM, UF_SPARSE, UF_OFFLINE, UF_REPARSE, UF_ARCHIVE, UF_READONLY, and UF_HIDDEN.

  Sort the file flags tmpfs supports alphabetically. tmpfs now supports the same flags as UFS, with the exception of SF_SNAPSHOT.

  Reported by: bdrewery, antoine
  Sponsored by: Spectra Logic
* libutil: Use O_CLOEXEC for internal file descriptors from open(). | jilles | 2013-08-28 | 5 | -9/+12
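Opening an internal descriptor with O_CLOEXEC, as the libutil commit above does, sets the close-on-exec flag atomically at open time so the descriptor cannot leak into a child that calls exec. A minimal sketch, assuming hypothetical helper names (open_internal() and has_cloexec() are not libutil functions):

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose O_CLOEXEC under strict compiler modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open read-only with close-on-exec set atomically. */
static int
open_internal(const char *path)
{
	assert(path != NULL);
	return open(path, O_RDONLY | O_CLOEXEC);
}

/* Returns 1 if close-on-exec is set on fd, 0 if not, -1 on error. */
static int
has_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	return flags == -1 ? -1 : (flags & FD_CLOEXEC) != 0;
}
```

Setting the flag in the open() call avoids the window that exists when a separate fcntl(F_SETFD) follows the open in a multithreaded program.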
* Change t4_list_lock and t4_uld_list_lock from mutexes to sx'es. | np | 2013-08-28 | 3 | -35/+34
  - tom_uninit had to be reworked not to hold the adapter lock (a mutex) around t4_deactivate_uld, which acquires the uld_list_lock.
  - the ifc_match for the interface cloner that creates the tracer ifnet had to be reworked as the kernel calls ifc_match with the global if_cloners_mtx held.
* Add hooks in base cxgbe(4) for the iWARP upper-layer driver. Update a | np | 2013-08-28 | 6 | -8/+30
  couple of assertions in the TOE driver as well.
* Reduce diff against stable/9 slightly. | jkim | 2013-08-28 | 1 | -3/+2
* ql_minidump() should be performed only by port 0. | davidcs | 2013-08-28 | 1 | -2/+2
  Submitted by: David C Somayajulu
* Xref capsicum(4) and procdesc(4) from pdfork(2). | rwatson | 2013-08-28 | 1 | -4/+18
  Suggested by: sbruno
  MFC after: 3 days
* Add a simple procdesc(4) man page describing "options PROCDESC" and the | rwatson | 2013-08-28 | 3 | -5/+103
  high-level facility, supplementing pdfork(2) and friends. Update capsicum.4 to xref.

  Suggested by: sbruno
  MFC after: 3 days
* Do not save/restore video memory if we are not using linear frame buffer. | jkim | 2013-08-28 | 1 | -22/+15
  Note this partially reverts r233896.
* Make sure to free stale buffer before allocating new one for safety. | jkim | 2013-08-28 | 1 | -2/+6
* Avoid unnecessary signedness conversion. | jkim | 2013-08-28 | 1 | -3/+3
* In looking at block layouts as part of fixing filesystem block | mckusick | 2013-08-28 | 1 | -2/+2
  allocations under low free-space conditions (-r254995), determine that the old block-preference search order used before -r249782 worked a bit better. This change reverts to that block-preference search order.

  MFC after: 2 weeks
* A performance problem was reported in PR kern/181226: | mckusick | 2013-08-28 | 1 | -2/+14
    I have 25TB Dell PERC 6 RAID5 array. When it becomes almost full (10-20GB free), processes which write data to it start eating 100% CPU and write speed drops below 1MB/sec (normally it gives 400MB/sec). The revision at which it first became apparent was http://svnweb.freebsd.org/changeset/base/249782.

  The offending change reserved an area in each cylinder group to store metadata. The new algorithm attempts to save this area for metadata and allows its use for non-metadata only after all the data areas have been exhausted. The size of the reserved area defaults to half of minfree, so the filesystem reports full before the data area can completely fill. However, in this report, the filesystem has had minfree reduced to 1% thus forcing the metadata area to be used for data. As the filesystem approached full, it had only metadata areas left to allocate. The result was that every block allocation had to scan summary data for 30,000 cylinder groups before falling back to searching up to 30,000 metadata areas.

  The fix is to give up on saving the metadata areas once the free space reserve drops below 2%. The effect of this change is to use the old algorithm of just accepting the first available block that we find. Since most filesystems use the default 5% minfree, this will have no effect on their operation. For those that want to push to the limit, they will get their crappy block placements quickly.

  Submitted by: Dmitry Sivachenko
  Fix Tested by: Dmitry Sivachenko
  PR: kern/181226
  MFC after: 2 weeks
* * Whitespace. | kargl | 2013-08-28 | 1 | -1/+1
* Add firmware for Centrino 2200-N wireless devices. | gnn | 2013-08-28 | 2 | -0/+12256
  Driver software for this firmware will be updated in a following commit.
* After writing a kernel core dump into /var/crash, call sync(8). | gavin | 2013-08-28 | 1 | -0/+1
  If we panic again shortly after boot (say, within 30 seconds), any core dump we wrote out may be lost on reboot. In this situation, we really want to keep that core file, as it may be the only way to have the issue resolved. Call sync(8) after writing out the core file and running crashinfo(8), in the hope that these will not be lost if we panic again. sync(8) is only called in the case where there is a core dump to be written out, so won't be called during normal boots.

  Discovered by: Trying to debug an IPSEC panic
  MFC after: 1 week
* Fix a few typos for s25fl types. | loos | 2013-08-28 | 1 | -2/+2
  Approved by: adrian (mentor)
* Make ar71xx_spi attach the next free unit of spibus and not only spibus0. | loos | 2013-08-28 | 1 | -1/+1
  Approved by: adrian (mentor)
* Add the default hints to make the GPIO pins, rf led and reset switch work | loos | 2013-08-28 | 1 | -0/+18
  out of the box on RouterStation.

  PR: 177832
  Submitted by: Petko Bordjukov (bordjukov@gmail.com)
  Approved by: adrian (mentor)
* Properly free gpiobus ivars when gpiobus_parse_pins() fails and also on | loos | 2013-08-28 | 1 | -4/+18
  gpiobus detachment.

  Suggested by: imp
  Approved by: adrian (mentor)
* Support the PCI-Express SSD in the new MacBook Air (model A1465) | gavin | 2013-08-28 | 1 | -0/+1
  Submitted by: Johannes Lundberg <johannes brilliantservice.co.jp>
  MFC after: 3 days
* Take a very small step toward the Century of the Anchovy by increasing the | ivoras | 2013-08-28 | 1 | -1/+1
  time dirhash entries stay in memory before being considered for eviction to 1 minute.
* Fix 'make depend' | uqs | 2013-08-28 | 1 | -1/+1
* mdoc fix | joel | 2013-08-28 | 1 | -1/+1