summaryrefslogtreecommitdiffstats
path: root/sys/sun4v
Commit message (Collapse)AuthorAgeFilesLines
* Add a new ipi_cpu() function to the MI IPI API that can be used to send anjhb2010-08-062-5/+28
| | | | | | | | | | | | IPI to a specific CPU by its cpuid. Replace calls to ipi_selected() that constructed a mask for a single CPU with calls to ipi_cpu() instead. This will matter more in the future when we transition from cpumask_t to cpuset_t for CPU masks in which case building a CPU mask is more expensive. Submitted by: peter, sbruno Reviewed by: rookie Obtained from: Yahoo! (x86) MFC after: 1 month
* Adapt sparc64 and sun4v timer code for the new event timers infrastructure.mav2010-07-299-56/+109
| | | | Reviewed by: marius@
* Add MALLOC_DEBUG_MAXZONES debug malloc(9) option to use multiple umamdf2010-07-281-0/+1
| | | | | | | | | | | | | | | | | | | | | zones for each malloc bucket size. The purpose is to isolate different malloc types into hash classes, so that any buffer overruns or use-after-free will usually only affect memory from malloc types in that hash class. This is purely a debugging tool; by varying the hash function and tracking which hash class was corrupted, the intersection of the hash classes from each instance will point to a single malloc type that is being misused. At this point inspection or memguard(9) can be used to catch the offending code. Add MALLOC_DEBUG_MAXZONES=8 to -current GENERIC configuration files. The suggestion to have this on by default came from Kostik Belousov on -arch. This code is based on work by Ron Steinke at Isilon Systems. Reviewed by: -arch (mostly silence) Reviewed by: zml Approved by: zml (mentor)
* Very rough first cut at NUMA support for the physical page allocator. Forjhb2010-07-271-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | now it uses a very dumb first-touch allocation policy. This will change in the future. - Each architecture indicates the maximum number of supported memory domains via a new VM_NDOMAIN parameter in <machine/vmparam.h>. - Each cpu now has a PCPU_GET(domain) member to indicate the memory domain a CPU belongs to. Domain values are dense and numbered from 0. - When a platform supports multiple domains, the default freelist (VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain. The MD code is required to populate an array of mem_affinity structures. Each entry in the array defines a range of memory (start and end) and a domain for the range. Multiple entries may be present for a single domain. The list is terminated by an entry where all fields are zero. This array of structures is used to split up phys_avail[] regions that fall in VM_FREELIST_DEFAULT into per-domain freelists. - Each memory domain has a separate lookup-array of freelists that is used when fulfulling a physical memory allocation. Right now the per-domain freelists are listed in a round-robin order for each domain. In the future a table such as the ACPI SLIT table may be used to order the per-domain lookup lists based on the penalty for each memory domain relative to a specific domain. The lookup lists may be examined via a new vm.phys.lookup_lists sysctl. - The first-touch policy is implemented by using PCPU_GET(domain) to pick a lookup list when allocating memory. Reviewed by: alc
* KTR_CTx are long time aliased by existing classes so they can't serveattilio2010-07-211-1/+1
| | | | | | | | their purpose anymore. Axe them out. Sponsored by: Sandvine Incorporated Discussed with: jhb, emaste Possible MFC: TBD
* Allocate proper ammount of memory for interrupt names on sparc64 andmav2010-07-162-2/+1
| | | | | | | sun4v, same as done on other architectures. This removes garbage from `vmstat -ia` output. Reviewed by: marius@
* Add a missing architecture declaration to the machine specificationnwhitehorn2010-07-131-1/+1
| | | | for sun4v.
* Move prototypes for kern_sigtimedwait() and kern_sigprocmask() tojhb2010-06-301-0/+1
| | | | <sys/syscallsubr.h> where all other kern_<syscall> prototypes live.
* Missed change to sun4v while adding iparent lookup to the OFW interruptnwhitehorn2010-06-181-1/+1
| | | | map interface.
* Relax one of the new assertions in pmap_enter() a little. Specifically,alc2010-06-111-1/+2
| | | | | | allow pmap_enter() to be performed on an unmanaged page that doesn't have VPO_BUSY set. Having VPO_BUSY set really only matters for managed pages. (See, for example, pmap_remove_write().)
* Reduce the scope of the page queues lock and the number ofalc2010-06-101-11/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PG_REFERENCED changes in vm_pageout_object_deactivate_pages(). Simplify this function's inner loop using TAILQ_FOREACH(), and shorten some of its overly long lines. Update a stale comment. Assert that PG_REFERENCED may be cleared only if the object containing the page is locked. Add a comment documenting this. Assert that a caller to vm_page_requeue() holds the page queues lock, and assert that the page is on a page queue. Push down the page queues lock into pmap_ts_referenced() and pmap_page_exists_quick(). (As of now, there are no longer any pmap functions that expect to be called with the page queues lock held.) Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever be passed an unmanaged page. Assert this rather than returning "0" and "FALSE" respectively. ARM: Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH(). Push down the page queues lock inside of pmap_clearbit(), simplifying pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write(). Additionally, this allows for avoiding the acquisition of the page queues lock in some cases. PowerPC/AIM: moea*_page_exits_quick() and moea*_page_wired_mappings() will never be called before pmap initialization is complete. Therefore, the check for moea_initialized can be eliminated. Push down the page queues lock inside of moea*_clear_bit(), simplifying moea*_clear_modify() and moea*_clear_reference(). The last parameter to moea*_clear_bit() is never used. Eliminate it. PowerPC/BookE: Simplify mmu_booke_page_exists_quick()'s control flow. Reviewed by: kib@
* Merge portions of r208645 and supporting code from the i386 pmap:alc2010-06-011-13/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | When I pushed down the page queues lock into pmap_is_modified(), I created an ordering dependence: A pmap operation that clears PG_WRITEABLE and calls vm_page_dirty() must perform the call first. Otherwise, pmap_is_modified() could return FALSE without acquiring the page queues lock because the page is not (currently) writeable, and the caller to pmap_is_modified() might believe that the page's dirty field is clear because it has not seen the effect of the vm_page_dirty() call. When I pushed down the page queues lock into pmap_is_modified(), I overlooked one place where this ordering dependence is violated: pmap_enter(). In a rare situation pmap_enter() can be called to replace a dirty mapping to one page with a mapping to another page. (I say rare because replacements generally occur as a result of a copy-on-write fault, and so the old page is not dirty.) This change delays clearing PG_WRITEABLE until after vm_page_dirty() has been called. Fixing the ordering dependency also makes it easy to introduce a small optimization: When pmap_enter() used to replace a mapping to one page with a mapping to another page, it freed the pv entry for the first mapping and later called the pv entry allocator for the new mapping. Now, pmap_enter() attempts to recycle the old pv entry, saving two calls to the pv entry allocator. There is no point in setting PG_WRITEABLE on unmanaged pages, so don't.
* Eliminate a stale comment.alc2010-05-311-4/+0
|
* Simplify the inner loop of get_pv_entry(): While iterating over the page'salc2010-05-301-2/+2
| | | | | pv list, there is no point in checking whether or not the pv list is empty, wait instead until the loop completes.
* Push down page queues lock acquisition in pmap_enter_object() andalc2010-05-261-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock. Assert that the page is managed in pmap_is_referenced(). On powerpc/aim, push down the page queues lock acquisition from moea*_is_modified() and moea*_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock. Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment. Correct a spelling error in vm_page_dontneed(). Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable. Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked. Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty(). Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions. Reviewed by: kib
* Roughly half of a typical pmap_mincore() implementation is machine-alc2010-05-241-2/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | independent code. Move this code into mincore(), and eliminate the page queues lock from pmap_mincore(). Push down the page queues lock into pmap_clear_modify(), pmap_clear_reference(), and pmap_is_modified(). Assert that these functions are never passed an unmanaged page. Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m: Contrary to what the comment says, pmap_mincore() is not simply an optimization. Without a complete pmap_mincore() implementation, mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED because only the pmap can provide this information. Eliminate the page queues lock from vfs_setdirty_locked_object(), vm_pageout_clean(), vm_object_page_collect_flush(), and vm_object_page_clean(). Generally speaking, these are all accesses to the page's dirty field, which are synchronized by the containing vm object's lock. Reduce the scope of the page queues lock in vm_object_madvise() and vm_page_dontneed(). Reviewed by: kib (an earlier version)
* Reorganize syscall entry and leave handling.kib2010-05-232-106/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend struct sysvec with three new elements: sv_fetch_syscall_args - the method to fetch syscall arguments from usermode into struct syscall_args. The structure is machine-depended (this might be reconsidered after all architectures are converted). sv_set_syscall_retval - the method to set a return value for usermode from the syscall. It is a generalization of cpu_set_syscall_retval(9) to allow ABIs to override the way to set a return value. sv_syscallnames - the table of syscall names. Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding the call to cpu_set_syscall_retval(). The new functions syscallenter(9) and syscallret(9) are provided that use sv_*syscall* pointers and contain the common repeated code from the syscall() implementations for the architecture-specific syscall trap handlers. Syscallenter() fetches arguments, calls syscall implementation from ABI sysent table, and set up return frame. The end of syscall bookkeeping is done by syscallret(). Take advantage of single place for MI syscall handling code and implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the thread is stopped at syscall entry or return point respectively. The EXEC flag augments SCX and notifies debugger that the process address space was changed by one of exec(2)-family syscalls. The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are changed to use syscallenter()/syscallret(). MIPS and arm are not converted and use the mostly unchanged syscall() implementation. Reviewed by: jhb, marcel, marius, nwhitehorn, stas Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc), stas (mips) MFC after: 1 month
* On entry to pmap_enter(), assert that the page is busy. While I'malc2010-05-161-1/+13
| | | | | | | | | | | | | | | | | | | | here, make the style of assertion used by pmap_enter() consistent across all architectures. On entry to pmap_remove_write(), assert that the page is neither unmanaged nor fictitious, since we cannot remove write access to either kind of page. With the push down of the page queues lock, pmap_remove_write() cannot condition its behavior on the state of the PG_WRITEABLE flag if the page is busy. Assert that the object containing the page is locked. This allows us to know that the page will neither become busy nor will PG_WRITEABLE be set on it while pmap_remove_write() is running. Correct a long-standing bug in vm_page_cowsetup(). We cannot possibly do copy-on-write-based zero-copy transmit on unmanaged or fictitious pages, so don't even try. Previously, the call to pmap_remove_write() would have failed silently.
* Push down the page queues into vm_page_cache(), vm_page_try_to_cache(), andalc2010-05-081-3/+11
| | | | | | | | | | | vm_page_try_to_free(). Consequently, push down the page queues lock into pmap_enter_quick(), pmap_page_wired_mapped(), pmap_remove_all(), and pmap_remove_write(). Push down the page queues lock into Xen's pmap_page_is_mapped(). (I overlooked the Xen pmap in r207702.) Switch to a per-processor counter for the total number of pages cached.
* On Alan's advice, rather than do a wholesale conversion on a singlekmacy2010-04-302-2/+8
| | | | | | | | | | | | architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps. Supported by: Bitgravity Inc. Discussed with: alc, jeffr, and kib
* MFamd64/i386 r207205alc2010-04-291-11/+4
| | | | | | | Clearing a page table entry's accessed bit and setting the page's PG_REFERENCED flag in pmap_protect() can't really be justified, so don't do it. Moreover, on ia64, don't set the page's dirty field unless pmap_protect() is removing write access.
* Style: use #define<TAB> instead of #define<SPACE>.kib2010-04-271-1/+1
| | | | | Noted by: bde, pluknet gmail com MFC after: 11 days
* Add OF_getscsinitid(), a helper similar to OF_getetheraddr() but formarius2010-04-261-0/+1
| | | | | obtaining the initiator ID to be used for SPI controllers from the Open Firmware device tree.
* Resurrect pmap_is_referenced() and use it in mincore(). Essentially,alc2010-04-241-0/+11
| | | | | | | | | | | | | | | | | pmap_ts_referenced() is not always appropriate for checking whether or not pages have been referenced because it clears any reference bits that it encounters. For example, in mincore(), clearing the reference bits has two negative consequences. First, it throws off the activity count calculations performed by the page daemon. Specifically, a page on which mincore() has called pmap_ts_referenced() looks less active to the page daemon than it should. Consequently, the page could be deactivated prematurely by the page daemon. Arguably, this problem could be fixed by having mincore() duplicate the activity count calculation on the page. However, there is a second problem for which that is not a solution. In order to clear a reference on a 4KB page, it may be necessary to demote a 2/4MB page mapping. Thus, a mincore() by one process can have the side effect of demoting a superpage mapping within another process!
* Move the constants specifying the size of struct kinfo_proc intokib2010-04-241-0/+2
| | | | | | | | | | machine-specific header files. Add KINFO_PROC32_SIZE for struct kinfo_proc32 for architectures providing COMPAT_FREEBSD32. Add CTASSERT for the size of struct kinfo_proc32. Submitted by: pluknet Reviewed by: imp, jhb, nwhitehorn MFC after: 2 weeks
* Change USB_DEBUG to #ifdef and allow it to be turned off. Previously this hadthompsa2010-04-221-0/+1
| | | | | | the illusion of a tunable setting but was always turned on regardless. MFC after: 1 week
* Change the arguments of exec_setregs() so that it receives a pointernwhitehorn2010-03-251-3/+3
| | | | | | | | to the image_params struct instead of several members of that struct individually. This makes it easier to expand its arguments in the future without touching all platforms. Reviewed by: jhb
* Remove COMPAT_43TTY from stock kernel configuration files.ed2010-03-131-1/+0
| | | | | | | | COMPAT_43TTY enables the sgtty interface. Even though its exposure has only been removed in FreeBSD 8.0, it wasn't used by anything in the base system in FreeBSD 5.x (possibly even 4.x?). On those releases, if your ports/packages are less than two years old, they will prefer termios over sgtty.
* The NetBSD Foundation has granted permission to remove clause 3 and 4 fromjoel2010-03-031-7/+0
| | | | | | the software. Obtained from: NetBSD
* Adjust style (following the already existing rules) for the newlyattilio2010-02-151-1/+1
| | | | | | introduced option DEADLKRES. Reported by: danfe, julian, avg
* - Assert that HEAPSZ is a multiple of PAGE_SIZE as at least the firmwaremarius2010-02-134-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | of Sun Fire V1280 doesn't round up the size itself but instead lets claiming of non page-sized amounts of memory fail. - Change parameters and variables related to the TLB slots to unsigned which is more appropriate. - Search the whole OFW device tree instead of only the children of the root nexus device for the BSP as starting with UltraSPARC IV the 'cpu' nodes hang off of from 'cmp' (chip multi-threading processor) or 'core' or combinations thereof. Also in large UltraSPARC III based machines the 'cpu' nodes hang off of 'ssm' (scalable shared memory) nodes which group snooping-coherency domains together instead of directly from the nexus. - Add support for UltraSPARC IV and IV+ BSPs. Due to the fact that these are multi-core each CPU has two Fireplane config registers and thus the module/target ID has to be determined differently so the one specific to a certain core is used. Similarly, starting with UltraSPARC IV the individual cores use a different property in the OFW device tree to indicate the CPU/core ID as it no longer is in coincidence with the shared slot/socket ID. While at it additionally distinguish between CPUs with Fireplane and JBus interconnects as these also use slightly different sizes for the JBus/agent/module/target IDs. - Check the return value of init_heap(). This requires moving it after cons_probe() so we can panic when appropriate. This should be fine as the PowerPC OFW loader uses that order for quite some time now.
* Add the options DEADLKRES (introducing the deadlock resolver thread) inattilio2010-02-101-0/+1
| | | | | | | | | | the 'debugging' section of any HEAD kernel and enable for the mainstream ones, excluding the embedded architectures. It may, of course, enabled on a case-by-case basis. Sponsored by: Sandvine Incorporated Requested by: emaste Discussed with: kib
* Merge r178860 from sparc64:marius2010-02-011-81/+90
| | | | | | | | - Remove the BUS_HANDLE_MIN checking in the __BUS_DEBUG_ACCESS macro; for UPA it should have fulfilled its purpose by now and Fireplane- and JBus-based machines are way to messy in organization to implement something equivalent. - Fix a bunch of style(9) bugs.
* Merge r177565 from sparc64:marius2010-02-012-14/+13
| | | | | | - Const'ify the bus_stream_asi and bus_type_asi arrays. - Replace hard-coded functions names missed in bus_machdep.c with __func__. - Break some long lines.
* Merge r157224 from sparc64:marius2010-02-011-4/+4
| | | | | Sync with the other archs and declare the memory location referenced by the address argument of the bus_space_write_multi_*() familiy as const.
* Add INCLUDE_CONFIG_FILE in GENERIC on all non-embedded platforms.imp2010-01-101-1/+1
| | | | | | # This is the resolution of removing it from DEFAULTS... MFC after: 5 days
* In sys/<arch>/conf/Makefile set TARGET to <arch>. That allowsbz2010-01-081-0/+2
| | | | | | | | | | | | | | | | sys/conf/makeLINT.mk to only do certain things for certain architectures. Note that neither arm nor mips have the Makefile there, thus essentially not (yet) supporting LINT. This would enable them do add special treatment to sys/conf/makeLINT.mk as well chosing one of the many configurations as LINT. This is a hack of doing this and keeping it in a separate commit will allow us to more easily identify and back it out. Discussed on/with: arch, jhb (as part of the LINT-VIMAGE thread) MFC after: 1 month
* Revert 200594. This file isn't intended for these sorts of things.imp2010-01-041-7/+0
|
* Add vlan(4) to all GENERIC kernels.brooks2010-01-031-0/+1
| | | | MFC after: 1 week
* Hook ebus(4) and isa(4) up to the sun4v LINT build in order tomarius2009-12-231-1/+1
| | | | | ensure that their compilation doesn't break as they are expected to work as-is now (but aren't actually run-time tested).
* - Remove devices which are/were only relevant for sun4u.marius2009-12-231-19/+0
|
* Add INCLUDE_CONFIG_FILE, and a note in comments about how to alsodougb2009-12-161-0/+7
| | | | include the comments with CONFIGARGS
* Add additional checks of the kernel stack addresses in order tomarius2009-12-081-3/+11
| | | | | | ensure we don't overrun the end of the call chain. MFC after: 1 week
* Simplify the invocation of vm_fault(). Specifically, eliminate the flagalc2009-11-271-6/+3
| | | | | | | VM_FAULT_DIRTY. The information provided by this flag can be trivially inferred by vm_fault(). Discussed with: kib
* Extract the code that records syscall results in the frame into MDkib2009-11-102-36/+41
| | | | | | | | | | | function cpu_set_syscall_retval(). Suggested by: marcel Reviewed by: marcel, davidxu PowerPC, ARM, ia64 changes: marcel Sparc64 tested and reviewed by: marius, also sunv reviewed MIPS tested by: gonzo MFC after: 1 month
* In r197963, a race with thread being selected for signal deliverykib2009-10-271-5/+1
| | | | | | | | | | | | | while in kernel mode, and later changing signal mask to block the signal, was fixed for sigprocmask(2) and ptread_exit(3). The same race exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls. Use kern_sigprocmask() instead of direct manipulation of td_sigmask to reschedule newly blocked signals, closing the race. Reviewed by: davidxu Tested by: pho MFC after: 1 month
* o Introduce vm_sync_icache() for making the I-cache coherent withmarcel2009-10-211-0/+5
| | | | | | | | | | | | | | | | | | | | | the memory or D-cache, depending on the semantics of the platform. vm_sync_icache() is basically a wrapper around pmap_sync_icache(), that translates the vm_map_t argumument to pmap_t. o Introduce pmap_sync_icache() to all PMAP implementation. For powerpc it replaces the pmap_page_executable() function, added to solve the I-cache problem in uiomove_fromphys(). o In proc_rwmem() call vm_sync_icache() when writing to a page that has execute permissions. This assures that when breakpoints are written, the I-cache will be coherent and the process will actually hit the breakpoint. o This also fixes the Book-E PMAP implementation that was missing necessary locking while trying to deal with the I-cache coherency in pmap_enter() (read: mmu_booke_enter_locked). The key property of this change is that the I-cache is made coherent *after* writes have been done. Doing it in the PMAP layer when adding or changing a mapping means that the I-cache is made coherent *before* any writes happen. The difference is key when the I-cache prefetches.
* Change the load base to below 2GB so PIE binaries work including whenmarius2009-10-181-1/+1
| | | | | | | | | compiled to use the Medium/Low code model, which we currently default to for the userland. GNU/Linux has moved their default to Medium/Middle some time ago, which probably explains why the current GNU ld(1) uses a base in the range between 32 and 44 bits instead. Submitted by: kib
* Define architectural load bases for PIE binaries. Addresses were selectedkib2009-10-101-0/+2
| | | | | | | | | | by looking at the bases used for non-relocatable executables by gnu ld(1), and adjusting it slightly. Discussed with: bz Reviewed by: kan Tested by: bz (i386, amd64), bsam (linux) MFC after: some time
* Add a new sysctl for reporting all of the supported page sizes.alc2009-09-181-0/+2
| | | | | Reviewed by: jhb MFC after: 3 weeks
OpenPOWER on IntegriCloud