summaryrefslogtreecommitdiffstats
path: root/sys/sun4v
Commit message (Collapse)AuthorAgeFilesLines
* Remove actual files supporting sun4v.attilio2011-05-14158-32685/+0
| | | | Approved by: re
* Move the ZERO_REGION_SIZE to a machine-dependent file, as on manymdf2011-05-131-0/+2
| | | | | | | | | | | | | | | | | | architectures (i386, for example) the virtual memory space may be constrained enough that 2MB is a large chunk. Use 64K for arches other than amd64 and ia64, with special handling for sparc64 due to differing hardware. Also commit the comment changes to kmem_init_zero_region() that I missed due to not saving the file. (Darn the unfamiliar development environment). Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you see fit. Requested by: alc MFC after: 1 week MFC with: r221853
* This patch changes head so that the default NFS client is now the newrmacklem2011-04-271-2/+2
| | | | | | | | | | | | | | NFS client (which I guess is no longer experimental). The fstype "newnfs" is now "nfs" and the regular/old NFS client is now fstype "oldnfs". Although mounts via fstype "nfs" will usually work without userland changes, an updated mount_nfs(8) binary is needed for kernels built with "options NFSCL" but not "options NFSCLIENT". Updated mount_nfs(8) and mount(8) binaries are needed to do mounts for fstype "oldnfs". The GENERIC kernel configs have been changed to use options NFSCL and NFSD (the new client and server) instead of NFSCLIENT and NFSSERVER. For kernels being used on diskless NFS root systems, "options NFSCL" must be in the kernel config. Discussed on freebsd-fs@.
* Switch the GENERIC kernels for all architectures to the new CAM-based ATAmav2011-04-241-7/+1
| | | | | | | | | | | | | stack. It means that all legacy ATA drivers are disabled and replaced by respective CAM drivers. If you are using ATA device names in /etc/fstab or other places, make sure to update them respectively (adX -> adaY, acdX -> cdY, afdX -> daY, astX -> saY, where 'Y's are the sequential numbers for each type in order of detection, unless configured otherwise with tunables, see cam(4)). ataraid(4) functionality is now supported by the RAID GEOM class. To use it you can load geom_raid kernel module and use graid(8) tool for management. Instead of /dev/arX device names, use /dev/raid/rX.
* Correct spelling in comments.marius2011-04-222-2/+2
| | | | Submitted by: brucec
* MF sparc64: r181701 (partial), r182020 (partial), r182730 (partial), r216628,marius2011-04-221-31/+25
| | | | | | | | | | | r216801 - cosmetic changes and style fixes - Trick GAS/GCC into compiling access to TICK/(S)TICK_COMPARE independently of the selected instruction set. Moreover, sun4v doesn't need the WAR for BlackBird CPUs. - Rename the "xor" parameter to "xorval" as the former is a reserved keyword in C++.
* Remove the advertising clause from the UCB license according to themarius2011-03-134-16/+0
| | | | July 22, 1999 addendum.
* Sync licenses and the corresponding RCS IDs with NetBSD, mainly switchingmarius2011-03-127-92/+47
| | | | | | the licenses of Matthew R. Green and the TNF to 2-clause. Obtained from: NetBSD
* Mostly revert r219468, as I had misremembered the C standard regardingmdf2011-03-111-1/+1
| | | | | | the size of an extern array. Keep one change from strncpy to strlcpy.
* Use MAXPATHLEN rather than the size of an extern array when copying themdf2011-03-101-1/+1
| | | | | | kernel name. Also consistenly use strlcpy(). Suggested by: Warner Losh
* Add a small change to the comment in the GENRIC config files that include udbpjulian2011-03-091-1/+1
| | | | | Submitted by: Chris Forgron, cforgeron at acsi dot ca MFC after: 1 week
* Remove pmap fields that are either unused or not fully implemented.alc2011-02-171-2/+0
| | | | Discussed with: kib
* Put the general logic for being a CPU hog into a new functionmdf2011-02-021-2/+1
| | | | | | | | | | should_yield(). Use this in various places. Encapsulate the common case of check-and-yield into a new function maybe_yield(). Change several checks for a magic number of iterations to use should_yield() instead. MFC after: 1 week
* Make MSGBUF_SIZE kernel option a loader tunable kern.msgbufsize.pluknet2011-01-212-4/+2
| | | | | | | Submitted by: perryh pluto.rain.com (previous version) Reviewed by: jhb Approved by: kib (mentor) Tested by: universe
* Add reader/writer lock around mem_range_attr_get() and mem_range_attr_set().jkim2011-01-171-1/+4
| | | | | | | | | Compile sys/dev/mem/memutil.c for all supported platforms and remove now unnecessary dev_mem_md_init(). Consistently define mem_range_softc from mem.c for all platforms. Add missing #include guards for machine/memdev.h and sys/memrange.h. Clean up some nearby style(9) nits. MFC after: 1 month
* Remove unneeded includes of <sys/linker_set.h>. Other headers that usejhb2011-01-113-3/+0
| | | | | | it internally contain nested includes. Reviewed by: bde
* Move repeated MAXSLP definition from machine/vmparam.h to sys/vmmeter.h.kib2011-01-091-11/+0
| | | | | | | Update the outdated comments describing MAXSLP and the process selection algorithm for swap out. Comments wording and reviewed by: alc
* Fix the value for DECIMAL_DIG on UltraSparcs. The previous value ofdas2011-01-091-1/+1
| | | | | 35 wasn't quite big enough to ensure correct rounding for very-close- to-halfway cases.
* On mixed 32/64 bit architectures (mips, powerpc) use __LP64__ rather thantijl2011-01-081-2/+2
| | | | | | | | | | | | | architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and corresponding macros) are different from 32 bit. [1] Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX. Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition for (u)intmax_t. Do this on all architectures for consistency. Suggested by: bde [1] Approved by: kib (mentor)
* Fix types of some values in machine/_limits.h.tijl2011-01-081-9/+7
| | | | | | | | | | | | | | | | | On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int. However, lacking integer suffixes for types smaller than int, their type should correspond to that of an object of type unsigned char (or short) when used in an expression with objects of type int. In that case unsigned char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and USHRT_MAX should also be int. Where MIN/MAX constants implicitly have the correct type the suffix has been removed. While here, correct some comments. Reviewed by: bde Approved by: kib (mentor)
* Add AT_STACKPROT elf aux vector. Will be used to inform rtld about thekib2011-01-071-1/+2
| | | | initial stack protection set by the kernel image activator.
* Revert r216134. This checkin broke platforms where bus_space are macros:brucec2010-12-031-59/+56
| | | | | they need to be a single statement, and do { } while (0) doesn't work in this situation so revert until a solution can be devised.
* Disallow passing in a count of zero bytes to the bus_space(9) functions.brucec2010-12-021-56/+59
| | | | | | | | | Passing a count of zero on i386 and amd64 for [I386|AMD64]_BUS_SPACE_MEM causes a crash/hang since the 'loop' instruction decrements the counter before checking if it's zero. PR: kern/80980 Discussed with: jhb
* Fix a few more places to use cpumask_t rather than 'u_int'. These arejhb2010-11-113-10/+9
| | | | just cosmetic.
* - Remove <machine/mutex.h>. Most of the headers were empty, and thejhb2010-11-091-32/+0
| | | | | | | | | | | | contents of the ones that were not empty were stale and unused. - Now that <machine/mutex.h> no longer exists, there is no need to allow it to override various helper macros in <sys/mutex.h>. - Rename various helper macros for low-level operations on mutexes to live in the _mtx_* or __mtx_* namespaces. While here, change the names to more closely match the real API functions they are backing. - Drop support for including <sys/mutex.h> in assembly source files. Suggested by: bde (1, 2)
* Adjust the order of operations in spinlock_enter() and spinlock_exit() tojhb2010-11-051-6/+7
| | | | | | | | | | | | | | | | | | work properly with single-stepping in a kernel debugger. Specifically, these routines have always disabled interrupts before increasing the nesting count and restored the prior state of interrupts after decreasing the nesting count to avoid problems with a nested interrupt not disabling interrupts when acquiring a spin lock. However, trap interrupts for single-stepping can still occur even when interrupts are disabled. Now the saved state of interrupts is not saved in the thread until after interrupts have been disabled and the nesting count has been increased. Similarly, the saved state from the thread cannot be read once the nesting count has been decreased to zero. To fix this, use temporary variables to store interrupt state and shuffle it between the thread's MD area and the appropriate registers. In cooperation with: bde MFC after: 1 month
* Just use the sparc64 version of this header rather than duplicating it.marius2010-10-081-117/+2
|
* Follow r213098, kernel POSIX semaphore module is no longerdavidxu2010-09-261-1/+0
| | | | needed.
* Sync with other platforms:marius2010-09-151-9/+17
| | | | | | - make dflt_lock() always panic, - add kludge to use contigmalloc() when the alignment is larger than the size and print a diagnostic when we didn't satisfy the alignment.
* Remove a KASSERT which will also trigger for perfectly valid combinationsmarius2010-09-141-3/+0
| | | | | | | of small maxsize and "large" (including BUS_SPACE_UNRESTRICTED) nsegments parameters. Generally using a presz of 0 (which indeed might indicate the use of bogus parameters for DMA tag creation) is not fatal, it just means that no additional DVMA space will be preallocated.
* Refactor timer management code with priority to one-shot operation mode.mav2010-09-134-14/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main goal of this is to generate timer interrupts only when there is some work to do. When CPU is busy interrupts are generating at full rate of hz + stathz to fullfill scheduler and timekeeping requirements. But when CPU is idle, only minimum set of interrupts (down to 8 interrupts per second per CPU now), needed to handle scheduled callouts is executed. This allows significantly increase idle CPU sleep time, increasing effect of static power-saving technologies. Also it should reduce host CPU load on virtualized systems, when guest system is idle. There is set of tunables, also available as writable sysctls, allowing to control wanted event timer subsystem behavior: kern.eventtimer.timer - allows to choose event timer hardware to use. On x86 there is up to 4 different kinds of timers. Depending on whether chosen timer is per-CPU, behavior of other options slightly differs. kern.eventtimer.periodic - allows to choose periodic and one-shot operation mode. In periodic mode, current timer hardware taken as the only source of time for time events. This mode is quite alike to previous kernel behavior. One-shot mode instead uses currently selected time counter hardware to schedule all needed events one by one and program timer to generate interrupt exactly in specified time. Default value depends of chosen timer capabilities, but one-shot mode is preferred, until other is forced by user or hardware. kern.eventtimer.singlemul - in periodic mode specifies how much times higher timer frequency should be, to not strictly alias hardclock() and statclock() events. Default values are 2 and 4, but could be reduced to 1 if extra interrupts are unwanted. kern.eventtimer.idletick - makes each CPU to receive every timer interrupt independently of whether they busy or not. By default this options is disabled. If chosen timer is per-CPU and runs in periodic mode, this option has no effect - all interrupts are generating. As soon as this patch modifies cpu_idle() on some platforms, I have also refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions (if supported) under high sleep/wakeup rate, as fast alternative to other methods. It allows SMP scheduler to wake up sleeping CPUs much faster without using IPI, significantly increasing performance on some highly task-switching loads. Tested by: many (on i386, amd64, sparc64 and powerc) H/W donated by: Gheorghe Ardelean Sponsored by: iXsystems, Inc.
* bus_add_child: change type of order parameter to u_intavg2010-09-102-2/+2
| | | | | | | | | | This reflects actual type used to store and compare child device orders. Change is mostly done via a Coccinelle (soon to be devel/coccinelle) semantic patch. Verified by LINT+modules kernel builds. Followup to: r212213 MFC after: 10 days
* Remove unused KTRACE includes.jhb2010-08-191-5/+0
|
* Supply some useful information to the started image using ELF aux vectors.kib2010-08-171-1/+7
| | | | | | | | In particular, provide pagesize and pagesizes array, the canary value for SSP use, number of host CPUs and osreldate. Tested by: marius (sparc64) MFC after: 1 month
* Update various places that store or manipulate CPU masks to use cpumask_tjhb2010-08-113-4/+4
| | | | | instead of int or u_int. Since cpumask_t is currently u_int on all platforms this should just be a cosmetic change.
* Add a new ipi_cpu() function to the MI IPI API that can be used to send anjhb2010-08-062-5/+28
| | | | | | | | | | | | IPI to a specific CPU by its cpuid. Replace calls to ipi_selected() that constructed a mask for a single CPU with calls to ipi_cpu() instead. This will matter more in the future when we transition from cpumask_t to cpuset_t for CPU masks in which case building a CPU mask is more expensive. Submitted by: peter, sbruno Reviewed by: rookie Obtained from: Yahoo! (x86) MFC after: 1 month
* Adapt sparc64 and sun4v timer code for the new event timers infrastructure.mav2010-07-299-56/+109
| | | | Reviewed by: marius@
* Add MALLOC_DEBUG_MAXZONES debug malloc(9) option to use multiple umamdf2010-07-281-0/+1
| | | | | | | | | | | | | | | | | | | | | zones for each malloc bucket size. The purpose is to isolate different malloc types into hash classes, so that any buffer overruns or use-after-free will usually only affect memory from malloc types in that hash class. This is purely a debugging tool; by varying the hash function and tracking which hash class was corrupted, the intersection of the hash classes from each instance will point to a single malloc type that is being misused. At this point inspection or memguard(9) can be used to catch the offending code. Add MALLOC_DEBUG_MAXZONES=8 to -current GENERIC configuration files. The suggestion to have this on by default came from Kostik Belousov on -arch. This code is based on work by Ron Steinke at Isilon Systems. Reviewed by: -arch (mostly silence) Reviewed by: zml Approved by: zml (mentor)
* Very rough first cut at NUMA support for the physical page allocator. Forjhb2010-07-271-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | now it uses a very dumb first-touch allocation policy. This will change in the future. - Each architecture indicates the maximum number of supported memory domains via a new VM_NDOMAIN parameter in <machine/vmparam.h>. - Each cpu now has a PCPU_GET(domain) member to indicate the memory domain a CPU belongs to. Domain values are dense and numbered from 0. - When a platform supports multiple domains, the default freelist (VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain. The MD code is required to populate an array of mem_affinity structures. Each entry in the array defines a range of memory (start and end) and a domain for the range. Multiple entries may be present for a single domain. The list is terminated by an entry where all fields are zero. This array of structures is used to split up phys_avail[] regions that fall in VM_FREELIST_DEFAULT into per-domain freelists. - Each memory domain has a separate lookup-array of freelists that is used when fulfulling a physical memory allocation. Right now the per-domain freelists are listed in a round-robin order for each domain. In the future a table such as the ACPI SLIT table may be used to order the per-domain lookup lists based on the penalty for each memory domain relative to a specific domain. The lookup lists may be examined via a new vm.phys.lookup_lists sysctl. - The first-touch policy is implemented by using PCPU_GET(domain) to pick a lookup list when allocating memory. Reviewed by: alc
* KTR_CTx are long time aliased by existing classes so they can't serveattilio2010-07-211-1/+1
| | | | | | | | their purpose anymore. Axe them out. Sponsored by: Sandvine Incorporated Discussed with: jhb, emaste Possible MFC: TBD
* Allocate proper ammount of memory for interrupt names on sparc64 andmav2010-07-162-2/+1
| | | | | | | sun4v, same as done on other architectures. This removes garbage from `vmstat -ia` output. Reviewed by: marius@
* Add a missing architecture declaration to the machine specificationnwhitehorn2010-07-131-1/+1
| | | | for sun4v.
* Move prototypes for kern_sigtimedwait() and kern_sigprocmask() tojhb2010-06-301-0/+1
| | | | <sys/syscallsubr.h> where all other kern_<syscall> prototypes live.
* Missed change to sun4v while adding iparent lookup to the OFW interruptnwhitehorn2010-06-181-1/+1
| | | | map interface.
* Relax one of the new assertions in pmap_enter() a little. Specifically,alc2010-06-111-1/+2
| | | | | | allow pmap_enter() to be performed on an unmanaged page that doesn't have VPO_BUSY set. Having VPO_BUSY set really only matters for managed pages. (See, for example, pmap_remove_write().)
* Reduce the scope of the page queues lock and the number ofalc2010-06-101-11/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PG_REFERENCED changes in vm_pageout_object_deactivate_pages(). Simplify this function's inner loop using TAILQ_FOREACH(), and shorten some of its overly long lines. Update a stale comment. Assert that PG_REFERENCED may be cleared only if the object containing the page is locked. Add a comment documenting this. Assert that a caller to vm_page_requeue() holds the page queues lock, and assert that the page is on a page queue. Push down the page queues lock into pmap_ts_referenced() and pmap_page_exists_quick(). (As of now, there are no longer any pmap functions that expect to be called with the page queues lock held.) Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever be passed an unmanaged page. Assert this rather than returning "0" and "FALSE" respectively. ARM: Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH(). Push down the page queues lock inside of pmap_clearbit(), simplifying pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write(). Additionally, this allows for avoiding the acquisition of the page queues lock in some cases. PowerPC/AIM: moea*_page_exits_quick() and moea*_page_wired_mappings() will never be called before pmap initialization is complete. Therefore, the check for moea_initialized can be eliminated. Push down the page queues lock inside of moea*_clear_bit(), simplifying moea*_clear_modify() and moea*_clear_reference(). The last parameter to moea*_clear_bit() is never used. Eliminate it. PowerPC/BookE: Simplify mmu_booke_page_exists_quick()'s control flow. Reviewed by: kib@
* Merge portions of r208645 and supporting code from the i386 pmap:alc2010-06-011-13/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | When I pushed down the page queues lock into pmap_is_modified(), I created an ordering dependence: A pmap operation that clears PG_WRITEABLE and calls vm_page_dirty() must perform the call first. Otherwise, pmap_is_modified() could return FALSE without acquiring the page queues lock because the page is not (currently) writeable, and the caller to pmap_is_modified() might believe that the page's dirty field is clear because it has not seen the effect of the vm_page_dirty() call. When I pushed down the page queues lock into pmap_is_modified(), I overlooked one place where this ordering dependence is violated: pmap_enter(). In a rare situation pmap_enter() can be called to replace a dirty mapping to one page with a mapping to another page. (I say rare because replacements generally occur as a result of a copy-on-write fault, and so the old page is not dirty.) This change delays clearing PG_WRITEABLE until after vm_page_dirty() has been called. Fixing the ordering dependency also makes it easy to introduce a small optimization: When pmap_enter() used to replace a mapping to one page with a mapping to another page, it freed the pv entry for the first mapping and later called the pv entry allocator for the new mapping. Now, pmap_enter() attempts to recycle the old pv entry, saving two calls to the pv entry allocator. There is no point in setting PG_WRITEABLE on unmanaged pages, so don't.
* Eliminate a stale comment.alc2010-05-311-4/+0
|
* Simplify the inner loop of get_pv_entry(): While iterating over the page'salc2010-05-301-2/+2
| | | | | pv list, there is no point in checking whether or not the pv list is empty, wait instead until the loop completes.
* Push down page queues lock acquisition in pmap_enter_object() andalc2010-05-261-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock. Assert that the page is managed in pmap_is_referenced(). On powerpc/aim, push down the page queues lock acquisition from moea*_is_modified() and moea*_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock. Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment. Correct a spelling error in vm_page_dontneed(). Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable. Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked. Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty(). Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions. Reviewed by: kib
OpenPOWER on IntegriCloud