path: root/sys/sparc64
Commit log, newest first. Each entry: author, date (files changed, lines -removed/+added), followed by the commit message.
* marius, 2010-10-19 (3 files, -17/+19):
|   - Wrap exchanging td_intr_frame and calling the event timer callback
|     in a critical section (see the sketch below), as apparently
|     required by both. I don't think either belongs in the event timer
|     front-ends; the callback should handle this as necessary instead,
|     just as intr_event_handle() does, for example. However, this is how
|     the other architectures currently handle it, either explicitly or
|     implicitly.
|   - Further rename and reword references to hardclock, as this
|     front-end no longer has a notion of actually calling it.
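A minimal sketch of the pattern described above, assuming the usual shape of an event timer front-end (the function name, the argument passing, and the et_active check are illustrative, not the committed diff; struct eventtimer fields per sys/timeet.h):

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/proc.h>
    #include <sys/timeet.h>
    #include <machine/frame.h>

    /* Kernel-context sketch: swap td_intr_frame and run the event
     * timer callback inside a critical section. */
    static void
    tick_intr_sketch(struct trapframe *tf, struct eventtimer *et)
    {
            struct trapframe *oldframe;
            struct thread *td;

            td = curthread;
            critical_enter();
            oldframe = td->td_intr_frame;   /* save any outer frame */
            td->td_intr_frame = tf;
            if (et->et_active)
                    et->et_event_cb(et, et->et_arg);
            td->td_intr_frame = oldframe;
            critical_exit();
    }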
* marius, 2010-10-17 (1 file, -45/+41):
|   - In oneshot-mode it doesn't make sense to try to compensate the
|     clock drift in order to achieve a more stable clock, as the tick
|     intervals may vary in the first place. In fact I haven't seen this
|     code kick in when in oneshot-mode, so just skip it in that case.
|   - There's no need to explicitly stop the (S)TICK counter in
|     oneshot-mode with every tick, as it just won't trigger again with
|     the (S)TICK compare register set to a value in the past (with a
|     wrap-around once every ~195 years of uptime at 1.5 GHz, per the
|     arithmetic below, this isn't something we have to worry about in
|     practice).
|   - Given that we'll disable interrupts completely anyway, there's no
|     need to enter critical sections.
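The ~195-year figure checks out for a 63-bit counter (assuming the top bit of the TICK register is reserved for the NPT bit, leaving 63 counting bits):

    2^63 ticks / (1.5 * 10^9 ticks/s) ~= 6.15 * 10^9 s ~= 195 years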
* marius, 2010-10-14 (1 file, -0/+1):
|   Explicitly lower the PIL to 0 as part of enabling interrupts
|   (sketched below), similar to what is done on other platforms. Unlike
|   with the sched_throw(NULL) called on BSPs during their startup,
|   apparently there's nothing which will reliably lower it on APs. I'm
|   unsure why this only came up on V215 though, breaking these with
|   r207248. My best guess is that these are the only supported machines
|   so far fast enough to lose some race.
|   PR:        151404
|   MFC after: 3 days
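The shape of the one-line addition, sketched with the sparc64 primitives from machine/cpufunc.h (the placement is illustrative, not the exact diff):

    /* Lower the processor interrupt level before enabling interrupts. */
    wrpr(pil, 0, 0);        /* PIL = 0: accept all interrupt levels */
    intr_enable();          /* set PSTATE.IE */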
* marius, 2010-10-14 (1 file, -1/+11):
|   - In the spirit of r212559, add a comment describing what will
|     eventually lower the PIL.
|   - Just as with the APs, ensure that the (S)TICK timer(s) are in a
|     known state when starting BSPs.
* marius, 2010-10-08 (1 file, -12/+14):
|   In the replacement text of the __bswapN_const() macros, cast the
|   argument to the expected type (illustrated below) so they work like
|   the corresponding __bswapN_var() functions and the compiler doesn't
|   complain when arguments of different width are passed.
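An illustrative 16-bit version of the fix (the real header also covers the 32- and 64-bit variants):

    /* Cast the argument first so a wider operand is truncated to the
     * expected width, just as __bswap16_var()'s prototyped parameter
     * would truncate it. */
    #define __bswap16_const(x)      ((__uint16_t) \
            ((((__uint16_t)(x) & 0xff00) >> 8) | \
             (((__uint16_t)(x) & 0x00ff) << 8)))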
* neel, 2010-09-29 (1 file, -1/+1):
|   Fix bogus error message from bus_dmamem_alloc() about incorrect
|   alignment. The check for alignment should be made against the
|   physical address and not the virtual address that maps it (see the
|   sketch below).
|   Sponsored by: NetApp
|   Submitted by: Will McGovern (will at netapp dot com)
|   Reviewed by:  mjacob, jhb
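A hedged sketch of the corrected check; vaddr and alignment stand in for the driver's locals, and vtophys(9) translates the kernel virtual address:

    /* Check the physical address that backs the mapping, not the KVA. */
    vm_paddr_t paddr = vtophys(vaddr);

    if ((paddr & (alignment - 1)) != 0)
            printf("bus_dmamem_alloc: failed to align memory properly\n");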
* marius, 2010-09-24 (1 file, -10/+6):
|   Minor simplifications and cosmetics.
* davidxu, 2010-09-24 (1 file, -1/+0):
|   Userland POSIX semaphores are now based on umtx. The kernel module
|   is only used to support binary compatibility; if you want to run old
|   binaries, you need to kldload the module.
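For context, the userland API whose implementation moved onto umtx; a minimal usage sketch (the semaphore name is arbitrary):

    #include <fcntl.h>
    #include <semaphore.h>

    int
    main(void)
    {
            sem_t *s;

            /* Now umtx-backed; old binaries need the kernel module. */
            if ((s = sem_open("/demo", O_CREAT, 0600, 1)) == SEM_FAILED)
                    return (1);
            sem_wait(s);
            /* critical section */
            sem_post(s);
            sem_close(s);
            sem_unlink("/demo");
            return (0);
    }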
* kib, 2010-09-22 (1 file, -18/+22):
|   For sparc64 relocations that directly put bits of the symbol value
|   into the location, apply elf_relocaddr to the symbol value (see the
|   sketch below) to get the right values for symbols from the dpcpu
|   segment.
|   PR:             kern/147769
|   Discussed with: avg
|   Tested by:      marius
|   MFC after:      2 weeks
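A sketch of where the call lands in the relocation path (lf, where, symvalue and addend are stand-ins for the linker's locals; relocation-type dispatch elided):

    /* Run the resolved value through elf_relocaddr() so a symbol in
     * the dpcpu segment resolves to its relocated per-CPU copy. */
    Elf_Addr value;

    value = elf_relocaddr(lf, symvalue + addend);
    *where = value;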
* marius, 2010-09-16 (1 file, -2/+0):
|   Remove accidentally committed test code which effectively prevented
|   the use of the SPARC64 V VIS-based block copy function added in
|   r212709.
|   Reported by: Michael Moll
* marius, 2010-09-15 (3 files, -1/+121):
|   Add a VIS-based block copy function for SPARC64 V and later, which
|   additionally takes advantage of the prefetch cache of these CPUs.
|   Unlike the uncommitted US-III version, which provided no measurable
|   speedup or even resulted in a slight slowdown on certain CPU models
|   compared to using the US-I version with these, the SPARC64 version
|   actually results in a slight improvement.
* marius, 2010-09-15 (1 file, -1/+8):
|   Add macros for alternate entry points.
* marius, 2010-09-15 (1 file, -9/+17):
|   Sync with other platforms:
|   - make dflt_lock() always panic,
|   - add a kludge to use contigmalloc() when the alignment is larger
|     than the size (sketched below), and print a diagnostic when we
|     didn't satisfy the alignment.
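A sketch of the kludge named above, with illustrative stand-ins for the tag fields: malloc(9) only guarantees alignment that tracks the allocation size, so an alignment larger than the size needs contigmalloc()'s explicit alignment argument.

    if (alignment <= maxsize)
            vaddr = malloc(maxsize, M_DEVBUF, mflags);
    else
            /* low 0, high ~0: any physical range, but force alignment */
            vaddr = contigmalloc(maxsize, M_DEVBUF, mflags, 0UL, ~0UL,
                alignment, 0UL);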
* marius, 2010-09-15 (1 file, -5/+2):
|   - Update the comment in swi_vm() regarding busdma bounce buffers;
|     it's unlikely that support for these will ever be implemented on
|     sparc64, as the IOMMUs are able to translate up to the maximum
|     physical address supported by the respective machine, bypassing
|     the IOMMU is affected by hardware errata, and being able to
|     support DMA engines which cannot do at least 32-bit DMA does not
|     justify the costs.
|   - The page zeroing in uma_small_alloc() may use the VIS-based block
|     zero function, so take advantage of it.
* marius, 2010-09-14 (1 file, -3/+0):
|   Remove a KASSERT which will also trigger for perfectly valid
|   combinations of small maxsize and "large" (including
|   BUS_SPACE_UNRESTRICTED) nsegments parameters. Generally, using a
|   presz of 0 (which indeed might indicate the use of bogus parameters
|   for DMA tag creation) is not fatal; it just means that no additional
|   DVMA space will be preallocated.
* marius, 2010-09-14 (2 files, -2/+0):
|   Remove redundant raising of the PIL to PIL_TICK, as the respective
|   locore code already did that.
* mav, 2010-09-13 (4 files, -14/+12):
|   Refactor timer management code, giving priority to one-shot
|   operation mode. The main goal of this is to generate timer
|   interrupts only when there is some work to do. When a CPU is busy,
|   interrupts are generated at the full rate of hz + stathz to fulfill
|   scheduler and timekeeping requirements. But when a CPU is idle, only
|   the minimum set of interrupts needed to handle scheduled callouts is
|   executed (down to 8 interrupts per second per CPU now). This
|   significantly increases idle CPU sleep time, increasing the effect
|   of static power-saving technologies. It should also reduce host CPU
|   load on virtualized systems when the guest system is idle.
|
|   There is a set of tunables, also available as writable sysctls (see
|   the query sketch below), controlling the event timer subsystem's
|   behavior:
|   - kern.eventtimer.timer: chooses the event timer hardware to use.
|     On x86 there are up to 4 different kinds of timers. Depending on
|     whether the chosen timer is per-CPU, the behavior of the other
|     options differs slightly.
|   - kern.eventtimer.periodic: chooses between periodic and one-shot
|     operation mode. In periodic mode, the current timer hardware is
|     taken as the only source of time for time events. This mode is
|     quite similar to the previous kernel behavior. One-shot mode
|     instead uses the currently selected time counter hardware to
|     schedule all needed events one by one and programs the timer to
|     generate an interrupt at exactly the specified time. The default
|     value depends on the chosen timer's capabilities, but one-shot
|     mode is preferred unless the other is forced by the user or the
|     hardware.
|   - kern.eventtimer.singlemul: in periodic mode, specifies how many
|     times higher the timer frequency should be, so as not to strictly
|     alias the hardclock() and statclock() events. The default values
|     are 2 and 4, but they can be reduced to 1 if extra interrupts are
|     unwanted.
|   - kern.eventtimer.idletick: makes each CPU receive every timer
|     interrupt independently of whether it is busy or not. By default
|     this option is disabled. If the chosen timer is per-CPU and runs
|     in periodic mode, this option has no effect - all interrupts are
|     generated anyway.
|
|   As this patch modifies cpu_idle() on some platforms, the x86 one has
|   also been refactored. It now makes use of the MONITOR/MWAIT
|   instructions (if supported) under high sleep/wakeup rates, as a fast
|   alternative to the other methods. This allows the SMP scheduler to
|   wake up sleeping CPUs much faster without using an IPI,
|   significantly increasing performance on some highly task-switching
|   loads.
|   Tested by:      many (on i386, amd64, sparc64 and powerpc)
|   H/W donated by: Gheorghe Ardelean
|   Sponsored by:   iXsystems, Inc.
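A small userland sketch showing how one of the sysctls above can be queried programmatically (sysctl kern.eventtimer.timer from the shell does the same):

    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <stdio.h>

    int
    main(void)
    {
            char timer[32];
            size_t len = sizeof(timer);

            if (sysctlbyname("kern.eventtimer.timer", timer, &len,
                NULL, 0) == 0)
                    printf("active event timer: %s\n", timer);
            return (0);
    }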
* mav, 2010-09-11 (1 file, -1/+1):
|   Sparc64 uses a dummy cpu_idle() method; its CPUs never sleep. Tell
|   the scheduler that it doesn't need to use an IPI to "wake up" a CPU.
* avg, 2010-09-10 (1 file, -1/+1):
|   bus_add_child: change the type of the order parameter to u_int. This
|   reflects the actual type used to store and compare child device
|   orders. The change was mostly done via a Coccinelle (soon to be
|   devel/coccinelle) semantic patch and verified by LINT+modules kernel
|   builds.
|   Followup to: r212213
|   MFC after:   10 days
* jhb, 2010-09-09 (1 file, -1/+1):
|   Catch up to the rename of the constant for the Master Data Parity
|   Error bit in the PCI status register.
|   Pointed out by: mdf
|   Pointy hat to:  jhb
* yongari, 2010-09-02 (1 file, -1/+1):
|   Enable sis(4). sis(4) should work on all architectures.
* marius, 2010-08-21 (1 file, -3/+4):
|   Skip a KASSERT which isn't appropriate when not employing page
|   coloring.
|   Reported by: Michael Moll
* jhb, 2010-08-19 (1 file, -5/+0):
|   Remove unused KTRACE includes.
* kib, 2010-08-17 (1 file, -1/+7):
|   Supply some useful information to the started image using ELF aux
|   vectors. In particular, provide the pagesize and the pagesizes
|   array, the canary value for SSP use, the number of host CPUs and
|   osreldate.
|   Tested by: marius (sparc64)
|   MFC after: 1 month
* jhb, 2010-08-11 (3 files, -9/+9):
|   Update various places that store or manipulate CPU masks to use
|   cpumask_t instead of int or u_int. Since cpumask_t is currently
|   u_int on all platforms, this should just be a cosmetic change.
* marius, 2010-08-08 (1 file, -5/+3):
|   Wrap some sun4u-only symbols.
* marius, 2010-08-08 (5 files, -34/+113):
|   - As it is not possible for sched_bind(9) to context switch with
|     td_critnest > 1 when not already running on the desired CPU, read
|     the TICK counter of the BSP via a direct cross trap request in
|     that case instead.
|   - Treat the STICK-based timecounter the same way as the TICK-based
|     one regarding its quality and obtaining the counter value from the
|     BSP. Like the TICK timers, the STICK ones are also only
|     synchronized during their startup (which might not result in good
|     synchronicity in the first place) but not afterwards, and might
|     drift over time, causing problems when the time is read from
|     different CPUs (see r135972).
* marius, 2010-08-08 (2 files, -25/+176):
|   - Introduce a cpu_ipi_single() function pointer in order to send
|     IPIs to single CPUs more efficiently with Cheetah(-class) and
|     Jalapeno CPUs. Besides being used to implement the ipi_cpu()
|     introduced in r210939, cpu_ipi_single() will also be used
|     internally by the sparc64 MD code.
|   - Factor out the Jalapeno support from the Cheetah IPI send
|     functions in order to be able to more easily and efficiently
|     implement support for more than 32 target CPUs as well as a
|     workaround for Cheetah+ erratum 25 for the latter.
* marius, 2010-08-08 (6 files, -60/+80):
|   For CPUs which ignore TD_CV and support hardware unaliasing, don't
|   bother doing page coloring. This results in a small but measurable
|   performance improvement in buildworld times.
* jhb, 2010-08-06 (1 file, -0/+11):
|   Add a new ipi_cpu() function to the MI IPI API that can be used to
|   send an IPI to a specific CPU by its cpuid. Replace calls to
|   ipi_selected() that constructed a mask for a single CPU with calls
|   to ipi_cpu() instead (see the before/after sketch below). This will
|   matter more in the future when we transition from cpumask_t to
|   cpuset_t for CPU masks, in which case building a CPU mask is more
|   expensive.
|   Submitted by:  peter, sbruno
|   Reviewed by:   rookie
|   Obtained from: Yahoo! (x86)
|   MFC after:     1 month
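The substitution this describes, sketched (IPI_AST is just an example vector):

    ipi_selected(1 << cpuid, IPI_AST);  /* old: hand-built one-CPU mask */
    ipi_cpu(cpuid, IPI_AST);            /* new: address the CPU by id */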
* mav, 2010-07-29 (8 files, -64/+152):
|   Adapt sparc64 and sun4v timer code for the new event timers
|   infrastructure.
|   Reviewed by: marius@
* mdf, 2010-07-28 (1 file, -0/+1):
|   Add a MALLOC_DEBUG_MAXZONES debug malloc(9) option to use multiple
|   uma zones for each malloc bucket size. The purpose is to isolate
|   different malloc types into hash classes (see the conceptual sketch
|   below), so that any buffer overruns or use-after-free will usually
|   only affect memory from malloc types in that hash class. This is
|   purely a debugging tool; by varying the hash function and tracking
|   which hash class was corrupted, the intersection of the hash classes
|   from each instance will point to a single malloc type that is being
|   misused. At this point inspection or memguard(9) can be used to
|   catch the offending code.
|   Add MALLOC_DEBUG_MAXZONES=8 to -current GENERIC configuration files.
|   The suggestion to have this on by default came from Kostik Belousov
|   on -arch.
|   This code is based on work by Ron Steinke at Isilon Systems.
|   Reviewed by: -arch (mostly silence)
|   Reviewed by: zml
|   Approved by: zml (mentor)
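A conceptual sketch of the hash-class idea, assuming only the behavior described above (mtp_zone_index is a hypothetical helper, not the committed code):

    /* Map a malloc type to one of MALLOC_DEBUG_MAXZONES zones per
     * bucket size; varying the hash between runs bisects the set of
     * malloc types sharing memory with a corrupted buffer. */
    static int
    mtp_zone_index(struct malloc_type *mtp)
    {
            return ((uintptr_t)mtp / sizeof(*mtp) % MALLOC_DEBUG_MAXZONES);
    }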
* jhb, 2010-07-27 (1 file, -0/+7):
|   Very rough first cut at NUMA support for the physical page
|   allocator. For now it uses a very dumb first-touch allocation
|   policy. This will change in the future.
|   - Each architecture indicates the maximum number of supported memory
|     domains via a new VM_NDOMAIN parameter in <machine/vmparam.h>.
|   - Each cpu now has a PCPU_GET(domain) member to indicate the memory
|     domain a CPU belongs to. Domain values are dense and numbered
|     from 0.
|   - When a platform supports multiple domains, the default freelist
|     (VM_FREELIST_DEFAULT) is split up into N freelists, one for each
|     domain. The MD code is required to populate an array of
|     mem_affinity structures (see the sketch below). Each entry in the
|     array defines a range of memory (start and end) and a domain for
|     the range. Multiple entries may be present for a single domain.
|     The list is terminated by an entry where all fields are zero. This
|     array of structures is used to split up phys_avail[] regions that
|     fall in VM_FREELIST_DEFAULT into per-domain freelists.
|   - Each memory domain has a separate lookup-array of freelists that
|     is used when fulfilling a physical memory allocation. Right now
|     the per-domain freelists are listed in a round-robin order for
|     each domain. In the future a table such as the ACPI SLIT table may
|     be used to order the per-domain lookup lists based on the penalty
|     for each memory domain relative to a specific domain. The lookup
|     lists may be examined via a new vm.phys.lookup_lists sysctl.
|   - The first-touch policy is implemented by using PCPU_GET(domain) to
|     pick a lookup list when allocating memory.
|   Reviewed by: alc
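A sketch of the MD table described in the list above; the addresses are invented for illustration:

    /* Physical ranges tagged with their domain; an all-zero entry ends
     * the list. struct mem_affinity holds { start, end, domain }. */
    static struct mem_affinity sketch_affinity[] = {
            { 0x000000000UL, 0x07fffffffUL, 0 },  /* first 2GB -> domain 0 */
            { 0x080000000UL, 0x0ffffffffUL, 1 },  /* next 2GB  -> domain 1 */
            { 0, 0, 0 },
    };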
* attilio, 2010-07-21 (3 files, -3/+3):
|   KTR_CTx have long been aliased by existing classes, so they can't
|   serve their purpose anymore. Axe them out.
|   Sponsored by:   Sandvine Incorporated
|   Discussed with: jhb, emaste
|   Possible MFC:   TBD
* mav, 2010-07-16 (3 files, -2/+2):
|   Allocate the proper amount of memory for interrupt names on sparc64
|   and sun4v, same as done on other architectures. This removes garbage
|   from `vmstat -ia` output.
|   Reviewed by: marius@
* marius, 2010-07-04 (1 file, -8/+24):
|   - Pin the IPI cache and TLB demap functions in order to prevent
|     migration between determining the other CPUs and calling
|     cpu_ipi_selected(), which apart from generally doing the wrong
|     thing can lead to a panic when a CPU is told to IPI itself (which
|     sun4u doesn't support).
|     Reported and tested by: Nathaniel W Filardo
|   - Add __unused where appropriate.
|   MFC after: 3 days
* jhb, 2010-06-30 (1 file, -0/+1):
|   Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to
|   <sys/syscallsubr.h> where all other kern_<syscall> prototypes live.
* nwhitehorn, 2010-06-18 (6 files, -6/+6):
|   Provide for multiple, cascaded PICs on PowerPC systems, and extend
|   the OFW interrupt map interface to also return the device's
|   interrupt parent.
|   MFC after: 8.1-RELEASE
* marius, 2010-06-13 (1 file, -1/+1):
|   Update a branch missed in r207537.
|   MFC after: 3 days
* alc, 2010-06-11 (1 file, -1/+2):
|   Relax one of the new assertions in pmap_enter() a little.
|   Specifically, allow pmap_enter() to be performed on an unmanaged
|   page that doesn't have VPO_BUSY set. Having VPO_BUSY set really only
|   matters for managed pages. (See, for example, pmap_remove_write().)
* alc, 2010-06-10 (1 file, -9/+15):
|   Reduce the scope of the page queues lock and the number of
|   PG_REFERENCED changes in vm_pageout_object_deactivate_pages().
|   Simplify this function's inner loop using TAILQ_FOREACH(), and
|   shorten some of its overly long lines. Update a stale comment.
|   Assert that PG_REFERENCED may be cleared only if the object
|   containing the page is locked. Add a comment documenting this.
|   Assert that a caller to vm_page_requeue() holds the page queues
|   lock, and assert that the page is on a page queue.
|   Push down the page queues lock into pmap_ts_referenced() and
|   pmap_page_exists_quick(). (As of now, there are no longer any pmap
|   functions that expect to be called with the page queues lock held.)
|   Neither pmap_ts_referenced() nor pmap_page_exists_quick() should
|   ever be passed an unmanaged page. Assert this rather than returning
|   "0" and "FALSE" respectively.
|   ARM: Simplify pmap_page_exists_quick() by switching to
|   TAILQ_FOREACH(). Push down the page queues lock inside of
|   pmap_clearbit(), simplifying pmap_clear_modify(),
|   pmap_clear_reference(), and pmap_remove_write(). Additionally, this
|   allows for avoiding the acquisition of the page queues lock in some
|   cases.
|   PowerPC/AIM: moea*_page_exists_quick() and
|   moea*_page_wired_mappings() will never be called before pmap
|   initialization is complete. Therefore, the check for
|   moea_initialized can be eliminated. Push down the page queues lock
|   inside of moea*_clear_bit(), simplifying moea*_clear_modify() and
|   moea*_clear_reference(). The last parameter to moea*_clear_bit() is
|   never used. Eliminate it.
|   PowerPC/BookE: Simplify mmu_booke_page_exists_quick()'s control
|   flow.
|   Reviewed by: kib@
* alc, 2010-06-05 (1 file, -3/+5):
|   Don't set PG_WRITEABLE in pmap_enter() unless the page is managed.
|   Correct a typo in a nearby comment on sparc64.
* alc, 2010-05-26 (1 file, -6/+13):
|   Push down page queues lock acquisition in pmap_enter_object() and
|   pmap_is_referenced(). Eliminate the corresponding page queues lock
|   acquisitions from vm_map_pmap_enter() and mincore(), respectively.
|   In mincore(), this allows some additional cases to complete without
|   ever acquiring the page queues lock.
|   Assert that the page is managed in pmap_is_referenced().
|   On powerpc/aim, push down the page queues lock acquisition from
|   moea*_is_modified() and moea*_is_referenced() into
|   moea*_query_bit(). Again, this will allow some additional cases to
|   complete without ever acquiring the page queues lock.
|   Reorder a few statements in vm_page_dontneed() so that a race can't
|   lead to an old reference persisting. This scenario is described in
|   detail by a comment. Correct a spelling error in vm_page_dontneed().
|   Assert that the object is locked in vm_page_clear_dirty(), and
|   restrict the page queues lock assertion to just those cases in which
|   the page is currently writeable.
|   Add object locking to vnode_pager_generic_putpages(). This was the
|   one and only place where vm_page_clear_dirty() was being called
|   without the object being locked. Eliminate an unnecessary
|   vm_page_lock() around vnode_pager_setsize()'s call to
|   vm_page_clear_dirty().
|   Change vnode_pager_generic_putpages() to the modern style of
|   function definition. Also, change the name of one of the parameters
|   to follow virtual memory system naming conventions.
|   Reviewed by: kib
* alc, 2010-05-24 (1 file, -12/+40):
|   Roughly half of a typical pmap_mincore() implementation is
|   machine-independent code. Move this code into mincore(), and
|   eliminate the page queues lock from pmap_mincore().
|   Push down the page queues lock into pmap_clear_modify(),
|   pmap_clear_reference(), and pmap_is_modified(). Assert that these
|   functions are never passed an unmanaged page.
|   Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m:
|   Contrary to what the comment says, pmap_mincore() is not simply an
|   optimization. Without a complete pmap_mincore() implementation,
|   mincore() cannot return either MINCORE_MODIFIED or
|   MINCORE_REFERENCED because only the pmap can provide this
|   information.
|   Eliminate the page queues lock from vfs_setdirty_locked_object(),
|   vm_pageout_clean(), vm_object_page_collect_flush(), and
|   vm_object_page_clean(). Generally speaking, these are all accesses
|   to the page's dirty field, which are synchronized by the containing
|   vm object's lock.
|   Reduce the scope of the page queues lock in vm_object_madvise() and
|   vm_page_dontneed().
|   Reviewed by: kib (an earlier version)
* kib, 2010-05-23 (3 files, -110/+31):
|   Reorganize syscall entry and leave handling.
|   Extend struct sysvec with three new elements (see the sketch below):
|   - sv_fetch_syscall_args: the method to fetch syscall arguments from
|     usermode into struct syscall_args. The structure is
|     machine-dependent (this might be reconsidered after all
|     architectures are converted).
|   - sv_set_syscall_retval: the method to set a return value for
|     usermode from the syscall. It is a generalization of
|     cpu_set_syscall_retval(9) to allow ABIs to override the way to set
|     a return value.
|   - sv_syscallnames: the table of syscall names.
|   Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
|   the call to cpu_set_syscall_retval().
|   The new functions syscallenter(9) and syscallret(9) are provided
|   that use the sv_*syscall* pointers and contain the common repeated
|   code from the syscall() implementations for the
|   architecture-specific syscall trap handlers. syscallenter() fetches
|   arguments, calls the syscall implementation from the ABI sysent
|   table, and sets up the return frame. The end of syscall bookkeeping
|   is done by syscallret().
|   Take advantage of the single place for MI syscall handling code and
|   implement the ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
|   PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
|   thread is stopped at the syscall entry or return point respectively.
|   The EXEC flag augments SCX and notifies the debugger that the
|   process address space was changed by one of the exec(2)-family
|   syscalls.
|   The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
|   changed to use syscallenter()/syscallret(). MIPS and arm are not
|   converted and use the mostly unchanged syscall() implementation.
|   Reviewed by: jhb, marcel, marius, nwhitehorn, stas
|   Tested by:   marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
|                stas (mips)
|   MFC after:   1 month
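A sketch of the three new members being wired into an ABI's sysvec (the handler names are placeholders; each ABI supplies its own):

    struct sysentvec sketch_sysvec = {
            /* existing members unchanged */
            .sv_fetch_syscall_args = cpu_fetch_syscall_args,
            .sv_set_syscall_retval = cpu_set_syscall_retval,
            .sv_syscallnames = syscallnames,
    };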
* marius, 2010-05-20 (2 files, -7/+4):
|   Change ad_firmware_geom_adjust() to operate on a struct disk * only
|   and hook it up to ada(4) also. While at it, rename
|   *ad_firmware_geom_adjust() to *ata_disk_firmware_geom_adjust() etc.
|   now that these are no longer limited to ad(4).
|   Reviewed by: mav
|   MFC after:   3 days
* alc, 2010-05-16 (1 file, -1/+12):
|   On entry to pmap_enter(), assert that the page is busy (see the
|   sketch below). While I'm here, make the style of assertion used by
|   pmap_enter() consistent across all architectures.
|   On entry to pmap_remove_write(), assert that the page is neither
|   unmanaged nor fictitious, since we cannot remove write access to
|   either kind of page.
|   With the push down of the page queues lock, pmap_remove_write()
|   cannot condition its behavior on the state of the PG_WRITEABLE flag
|   if the page is busy. Assert that the object containing the page is
|   locked. This allows us to know that the page will neither become
|   busy nor will PG_WRITEABLE be set on it while pmap_remove_write() is
|   running.
|   Correct a long-standing bug in vm_page_cowsetup(). We cannot
|   possibly do copy-on-write-based zero-copy transmit on unmanaged or
|   fictitious pages, so don't even try. Previously, the call to
|   pmap_remove_write() would have failed silently.
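The style of assertion referred to, as a sketch (VPO_BUSY was the object-locked busy flag of this era; note the 2010-06-11 entry above later exempts unmanaged pages):

    KASSERT((m->oflags & VPO_BUSY) != 0,
        ("pmap_enter: page %p is not busy", m));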
* marius, 2010-05-14 (3 files, -41/+91):
|   - Enable DMA write parity error interrupts on Schizo with a working
|     implementation.
|   - Revert the Sun Fire V890 WAR of r205254. Instead, let
|     schizo_pci_bus() only panic in case of fatal errors, as the
|     interrupt triggered by the error caused by the firmware of these
|     (and also Sun Fire 280R with version 7 Schizo) may happen as late
|     as when using the HBA, and not only prior to touching the PCI bus
|     (in the former case the actual error is still fatal, but we clear
|     it before touching the PCI bus). While at it, count and export
|     non-fatal error interrupts via sysctl(9).
|   - Remove unnecessary locking from schizo_ue().
* alc, 2010-05-08 (1 file, -3/+8):
|   Push down the page queues lock into vm_page_cache(),
|   vm_page_try_to_cache(), and vm_page_try_to_free(). Consequently,
|   push down the page queues lock into pmap_enter_quick(),
|   pmap_page_wired_mapped(), pmap_remove_all(), and
|   pmap_remove_write().
|   Push down the page queues lock into Xen's pmap_page_is_mapped(). (I
|   overlooked the Xen pmap in r207702.)
|   Switch to a per-processor counter for the total number of pages
|   cached.
* alc, 2010-05-06 (1 file, -5/+10):
|   Push down the page queues lock inside of vm_page_free_toq() and
|   pmap_page_is_mapped() in preparation for removing page queues
|   locking around calls to vm_page_free(). Setting aside the assertion
|   that calls pmap_page_is_mapped(), vm_page_free_toq() now acquires
|   and holds the page queues lock just long enough to actually add or
|   remove the page from the paging queues.
|   Update vm_page_unhold() to reflect the above change.