path: root/sys/powerpc/aim
Commit message [Author, Age, Files, Lines]
* Fix physical address type to vm_paddr_t also for powerpc64.
  [raj, 2012-05-25, 1 file, -11/+11]
* Fix physical address type to vm_paddr_t.
  [raj, 2012-05-24, 1 file, -11/+11]
* Replace the list of PVOs owned by each PMAP with an RB tree. This simplifies range operations like pmap_remove() and pmap_protect() as well as allowing simple operations like pmap_extract() not to involve any global state. This substantially reduces lock coverage for the global table lock and improves concurrency.
  [nwhitehorn, 2012-05-20, 2 files, -176/+57]
* Fix final bugs in memory barriers on PowerPC:
  - Use isync/lwsync unconditionally for acquire/release. Use of isync guarantees a complete memory barrier, which is important for serialization of bus space accesses with mutexes on multi-processor systems.
  - Go back to using sync as the I/O memory barrier, which solves the same problem as above with respect to mutex release using lwsync, while not penalizing non-I/O operations like a return to sync on the atomic release operations would.
  - Place an acquisition barrier around thread lock acquisition in cpu_switchin().
  [nwhitehorn, 2012-05-04, 2 files, -2/+4]
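The acquire/release pairing that the entry above refers to can be illustrated in portable C11. This is a minimal sketch, not the FreeBSD mutex code: `toy_lock` and its functions are hypothetical names, and the C11 memory orderings are a stand-in for the PowerPC barrier instructions named in the commit message. The sketch makes no claim about which instructions a compiler emits for them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock for illustration; not the FreeBSD mutex code. */
struct toy_lock {
	atomic_bool locked;
};

static void
toy_lock_init(struct toy_lock *l)
{
	atomic_init(&l->locked, false);
}

/*
 * Try once to take the lock. The acquire ordering keeps accesses made
 * inside the critical section from being reordered before the exchange.
 */
static bool
toy_lock_try_acquire(struct toy_lock *l)
{
	return (!atomic_exchange_explicit(&l->locked, true,
	    memory_order_acquire));
}

/* Spin until the lock is taken. */
static void
toy_lock_acquire(struct toy_lock *l)
{
	while (!toy_lock_try_acquire(l))
		;
}

/*
 * Drop the lock. The release ordering keeps accesses made inside the
 * critical section from being reordered after the store.
 */
static void
toy_lock_release(struct toy_lock *l)
{
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
	atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

A caller would bracket its critical section with toy_lock_acquire() and toy_lock_release(). Whether acquire/release ordering alone is enough for device (bus space) accesses is exactly the question the commit addresses with a stronger I/O barrier, and the sketch does not model that.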
* Fix build on 32-bit systems.
  [nwhitehorn, 2012-04-28, 1 file, -1/+1]
* After switching mutexes to use lwsync, they no longer provide sufficient guarantees on acquire for the tlbie mutex. Conversely, the TLB invalidation sequence provides guarantees that do not need to be redundantly applied on release. Roll a small custom lock that is just right. Simultaneously, convert the SLB tree changes back to lwsync, as changing them to sync was a misdiagnosis of the tlbie barrier problem this commit actually fixes.
  [nwhitehorn, 2012-04-28, 2 files, -30/+19]
* Revert r234581 for this file. The lockless SLB tree code does in fact need a heavyweight sync instead of a lightweight sync to function properly. Thanks to mdf for the clarification.
  [nwhitehorn, 2012-04-24, 1 file, -2/+2]
* Use lwsync to provide memory barriers on systems that support it instead of sync (lwsync is an alternate encoding of sync on systems that do not support it, providing graceful fallback). This provides more than an order of magnitude reduction in the time required to acquire or release a mutex.
  MFC after: 2 months
  [nwhitehorn, 2012-04-22, 1 file, -2/+2]
* Avoid a lock order reversal in pmap_extract_and_hold() from relocking the page. This PMAP requires an additional lock besides the PMAP lock in pmap_extract_and_hold(), which vm_page_pa_tryrelock() did not release.
  Suggested by: kib
  MFC after: 4 days
  [nwhitehorn, 2012-04-22, 1 file, -1/+32]
* Make sure all pending operations have completed on the existing thread before (potentially) migrating it to a different CPU.
  MFC after: 5 days
  [nwhitehorn, 2012-04-20, 2 files, -0/+2]
* We don't need kcopy() in any of the remaining places it is used, so remove it.
  MFC after: 2 weeks
  [nwhitehorn, 2012-04-11, 3 files, -33/+3]
* Only manipulate the PGA_EXECUTABLE flag on managed pages. This is a proxy for whether the page is physical. On dense phys mem systems (32-bit), VM_PHYS_TO_PAGE will not return NULL for device memory pages if device memory is above physical memory even if there is no allocated vm_page. Attempting to use the returned page could then cause either memory corruption or a page fault.
  [nwhitehorn, 2012-04-11, 1 file, -14/+10]
* Fix error in r233949. Synchronizing icaches on uncacheable pages turns out not to be a good idea, and of course the PV entry list for a page is never empty after the page has been mapped.
  [nwhitehorn, 2012-04-11, 1 file, -2/+4]
* Execute an initial ptesync if and only if the PTE is actually being invalidated, as opposed to a ref/changed bit update.
  [nwhitehorn, 2012-04-06, 1 file, -14/+7]
* Substantially reduce the scope of the locks held in pmap_enter(), which improves concurrency slightly.
  [nwhitehorn, 2012-04-06, 1 file, -34/+8]
* Reduce the frequency that the PowerPC/AIM pmaps invalidate instruction caches, by invalidating kernel icaches only when needed and not flushing user caches for shared pages.
  Suggested by: kib
  MFC after: 2 weeks
  [nwhitehorn, 2012-04-06, 3 files, -57/+29]
* More PMAP performance improvements: skip 256 MB segments entirely if they are not mapped during ranged operations and reduce the scope of the tlbie lock only to the actual tlbie instruction instead of the entire sequence. There are a few more optimization possibilities here as well.
  [nwhitehorn, 2012-03-28, 2 files, -11/+26]
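The segment-skipping idea in the entry above can be sketched as a loop that advances one 256 MB segment at a time and only descends into segments that have mappings. This is a minimal sketch under stated assumptions, not the moea64 code: the names `toy_pages_to_visit`, `toy_segment_mapped_fn` and `toy_only_first_segment_mapped` are hypothetical, the 4 KB page size is assumed, and addresses are assumed to be page-aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SEGMENT_SIZE ((uint64_t)256 << 20)	/* 256 MB, as in the commit */
#define TOY_PAGE_SIZE	 ((uint64_t)4096)	/* assumed page size */

/* Hypothetical predicate: does the segment starting at seg_base have mappings? */
typedef bool (*toy_segment_mapped_fn)(uint64_t seg_base);

/*
 * Count the pages a ranged operation over [start, end) must examine when
 * unmapped segments are skipped in a single step.
 */
static uint64_t
toy_pages_to_visit(uint64_t start, uint64_t end, toy_segment_mapped_fn mapped)
{
	uint64_t va, seg_base, seg_end, stop, visited = 0;

	assert(start <= end);
	for (va = start; va < end; va = stop) {
		seg_base = va & ~(TOY_SEGMENT_SIZE - 1);
		seg_end = seg_base + TOY_SEGMENT_SIZE;
		/* Clamp to the range end, also if seg_end wrapped around. */
		stop = (seg_end < seg_base || seg_end > end) ? end : seg_end;
		if (mapped(seg_base))
			visited += (stop - va) / TOY_PAGE_SIZE;
		/* Otherwise the whole segment is skipped without per-page work. */
	}
	return (visited);
}

/* Example predicate for illustration: only the first segment is mapped. */
static bool
toy_only_first_segment_mapped(uint64_t seg_base)
{
	return (seg_base == 0);
}
```

With the example predicate, a range covering two full segments costs per-page work only for the first one; the second is dismissed with a single check.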
* Make sure to call vm_page_dirty() before the pmap lock is released to prevent a race where another process could conclude the page was clean.
  Submitted by: alc
  [nwhitehorn, 2012-03-27, 1 file, -2/+2]
* More PMAP concurrency improvements: replace the table lock and (almost) all uses of the page queues mutex with a new rwlock that protects the page table and the PV lists. This reduces system time during a parallel buildworld by 35%.
  Reviewed by: alc
  [nwhitehorn, 2012-03-27, 1 file, -86/+100]
* More PMAP performance improvements: on powerpc64, when TLBIE can be run with exceptions enabled, leave them enabled and use a regular mutex to guard TLB invalidations instead of a spinlock.
  [nwhitehorn, 2012-03-25, 1 file, -4/+11]
* Only call vm_page_dirty() on pages that are writable in order not to confuse the VM.
  [nwhitehorn, 2012-03-24, 1 file, -4/+12]
* Following suggestions from alc, skip wired mappings in pmap_remove_pages() and remove moea64_attr_*() in favor of direct calls to vm_page_dirty() and friends.
  [nwhitehorn, 2012-03-24, 1 file, -51/+29]
* Remove acquisition of VM page queues lock from pmap_protect(). Any actual manipulation of the pvo_vlink and pvo_olink entries is already protected by the table lock, so most remaining instances of the acquisition of the page queues lock can likely be replaced with the table lock, or removed if the table lock is already held.
  Reviewed by: alc
  [nwhitehorn, 2012-03-18, 1 file, -2/+0]
* Implement pmap_remove_pages(). This will be added later to the 32-bit MMU module.
  Suggested by: alc
  [nwhitehorn, 2012-03-15, 1 file, -0/+18]
* Improve algorithm for deciding whether to loop through all process pages or look them up individually in pmap_remove() and apply the same logic in the other ranged operation (pmap_protect). This speeds up make installworld by a factor of 2 on powerpc64.
  MFC after: 1 week
  [nwhitehorn, 2012-03-15, 1 file, -40/+58]
* Use LIST_FOREACH_SAFE() instead of LIST_FOREACH() in pmap_remove(), since the point of this loop is to remove elements. This worked by accident before.
  MFC after: 2 days
  [nwhitehorn, 2012-03-14, 2 files, -4/+4]
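The reason the _SAFE variant matters in a removal loop can be shown with a small standalone list. This is a minimal sketch, not the pmap_remove() code: `struct toy_entry` and the helper functions are hypothetical, and the fallback macro is included only because some non-BSD versions of <sys/queue.h> omit LIST_FOREACH_SAFE (it is written to match the BSD definition).

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Fallback for <sys/queue.h> versions that lack the _SAFE variant. */
#ifndef LIST_FOREACH_SAFE
#define LIST_FOREACH_SAFE(var, head, field, tvar)			\
	for ((var) = LIST_FIRST((head));				\
	    (var) && ((tvar) = LIST_NEXT((var), field), 1);		\
	    (var) = (tvar))
#endif

/* Hypothetical element type for illustration. */
struct toy_entry {
	int key;
	LIST_ENTRY(toy_entry) link;
};
LIST_HEAD(toy_list, toy_entry);

static void
toy_insert(struct toy_list *head, int key)
{
	struct toy_entry *e = malloc(sizeof(*e));

	assert(e != NULL);
	e->key = key;
	LIST_INSERT_HEAD(head, e, link);
}

/* Remove and free every entry with lo <= key < hi; return how many. */
static int
toy_remove_range(struct toy_list *head, int lo, int hi)
{
	struct toy_entry *e, *tmp;
	int removed = 0;

	LIST_FOREACH_SAFE(e, head, link, tmp) {
		if (e->key >= lo && e->key < hi) {
			LIST_REMOVE(e, link);
			free(e);	/* tmp already holds the next element */
			removed++;
		}
	}
	return (removed);
}

static int
toy_count(struct toy_list *head)
{
	struct toy_entry *e;
	int n = 0;

	LIST_FOREACH(e, head, link)
		n++;
	return (n);
}
```

Plain LIST_FOREACH reads the next pointer from the current element after the loop body has run, so freeing that element inside the body leaves the loop reading freed memory. That is the "worked by accident" case the commit message describes.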
* Revert the _NOPROF entries on cpu_throw, cpu_switch and savectx. They can be profiled too now.
  MFC after: 2 weeks
  [andreast, 2012-02-05, 1 file, -3/+3]
* Fix build for the case of powerpc64 kernel without COMPAT_FREEBSD32.
  MFC after: 2 months
  [kib, 2012-01-30, 1 file, -0/+3]
* Finally, try to enable the nxstacks on amd64 and powerpc64 for both 64bit and 32bit ABIs. Also try to enable nxstacks for PAE/i386 when supported, and some variants of powerpc32.
  MFC after: 2 months (if ever)
  [kib, 2012-01-30, 1 file, -0/+4]
* This commit adds profiling support for powerpc64. Now we can do application profiling and kernel profiling. To enable kernel profiling one has to build kgmon(8). I will enable the build once I managed to build and test powerpc (32-bit) kernels with profiling support.
  - add a powerpc64 PROF_PROLOGUE for _mcount.
  - add macros to avoid adding the PROF_PROLOGUE in certain assembly entries.
  - apply these macros where needed.
  - add size information to the MCOUNT function.
  MFC after: 3 weeks, together with r230291
  [andreast, 2012-01-20, 3 files, -7/+8]
* Rework SLB trap handling so that double-faults into an SLB trap handler are possible, and double faults within an SLB trap handler are not. The result is that it is possible to take an SLB fault at any time, on any address, for any reason, at any point in the kernel. This lets us do two important things. First, it removes the (soft) 16 GB RAM ceiling on PPC64 as well as any architectural limitations on KVA space. Second, it lets the kernel tolerate poorly designed hypervisors that have a tendency to fail to restore the SLB properly after a hypervisor context switch.
  MFC after: 6 weeks
  [nwhitehorn, 2012-01-15, 4 files, -57/+229]
* Implement hwpmc counting PMC support for PowerPC G4+ (MPC745x/MPC744x). Sampling is in progress.
  Approved by: nwhitehorn (mentor)
  MFC after: 9.0-RELEASE
  [jhibbits, 2011-12-24, 2 files, -0/+16]
* Allow this to work on embedded systems without Open Firmware by making lack of a /chosen non-fatal, and manually removing memory in use by the kernel from the physical memory map.
  Submitted by: rpaulo
  [nwhitehorn, 2011-12-16, 1 file, -35/+67]
* Zero BSS on start, in case the ELF loader that started the kernel did not do this for us. This can happen on some embedded systems.
  Submitted by: rpaulo
  [nwhitehorn, 2011-12-16, 1 file, -0/+11]
* Eliminate vestiges of page coloring.
  [alc, 2011-12-15, 2 files, -4/+2]
* Keep track of PVO entries in each pmap, which allows much faster pmap_remove() for large sparse requests. This can prevent pmap_remove() operations on 64-bit process destruction or swapout that would take several hundred times the lifetime of the universe to complete. This behavior is largely indistinguishable from a hang.
  [nwhitehorn, 2011-12-11, 2 files, -9/+41]
* - There's no need to overwrite the default device method with the default one. Interestingly, these are actually the default for quite some time (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9) since r52045) but even recently added device drivers do this unnecessarily.
    Discussed with: jhb, marcel
  - While at it, use DEVMETHOD_END.
    Discussed with: jhb
  - Also while at it, use __FBSDID.
  [marius, 2011-11-22, 1 file, -4/+4]
* Use a global __pure2 function instead of a global register variable for curthread, like on x86 and sparc64. This makes the kernel somewhat more clang friendly, which doesn't support global register variables.
  [nwhitehorn, 2011-11-17, 2 files, -2/+12]
* Add an extra invariant here which was useful on 64-bit CPUs.
  [nwhitehorn, 2011-11-17, 1 file, -0/+2]
* Refactor the code that performs physically contiguous memory allocation, yielding a new public interface, vm_page_alloc_contig(). This new function addresses some of the limitations of the current interfaces, contigmalloc() and kmem_alloc_contig(). For example, the physically contiguous memory that is allocated with those interfaces can only be allocated to the kernel vm object and must be mapped into the kernel virtual address space. It also provides functionality that vm_phys_alloc_contig() doesn't, such as wiring the returned pages. Moreover, unlike that function, it respects the low water marks on the paging queues and wakes up the page daemon when necessary. That said, at present, this new function can't be applied to all types of vm objects. However, that restriction will be eliminated in the coming weeks.
  From a design standpoint, this change also addresses an inconsistency between vm_phys_alloc_contig() and the other vm_phys_alloc*() functions. Specifically, vm_phys_alloc_contig() manipulated vm_page fields that other functions in vm/vm_phys.c didn't. Moreover, vm_phys_alloc_contig() knew about vnodes and reservations. Now, vm_page_alloc_contig() is responsible for these things.
  Reviewed by: kib
  Discussed with: jhb
  [alc, 2011-11-16, 1 file, -7/+9]
* Fix a bug where the pmap_cpu_bootstrap() ap argument could be clobbered. Luckily, it mostly wasn't important, so this didn't cause major problems. Also improve register reuse when setting up trap frames very slightly.
  Submitted by: Justin Hibbits <chmeeedalf at gmail dot com>
  MFC after: 5 days
  [nwhitehorn, 2011-11-09, 2 files, -2/+4]
* Inline the syscallenter() and syscallret(). This reduces the time measured by the syscall entry speed microbenchmarks by ~10% on amd64.
  Submitted by: jhb
  Approved by: re (bz)
  MFC after: 2 weeks
  [kib, 2011-09-11, 1 file, -0/+2]
* Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags.
  Document the changes to flags field to only require the page lock.
  Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced.
  Reviewed by: alc, attilio
  Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64)
  Approved by: re (bz)
  [kib, 2011-09-06, 2 files, -26/+26]
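The lock-free flag updates described in the entry above can be sketched with C11 atomics. This is a minimal sketch, not the vm_page implementation: `struct toy_page`, the `TOY_AFLAG_*` values and the `toy_aflag_*` functions are hypothetical, and the sketch makes the flags field itself a 32-bit atomic rather than operating on a containing word as the commit message describes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag values for illustration. */
#define TOY_AFLAG_REFERENCED	0x01u
#define TOY_AFLAG_WRITEABLE	0x02u

/* Hypothetical page structure holding only the atomic flags field. */
struct toy_page {
	_Atomic uint32_t aflags;
};

static void
toy_page_init(struct toy_page *m)
{
	atomic_init(&m->aflags, 0);
}

/* Set bits with an atomic read-modify-write; no lock is taken. */
static void
toy_aflag_set(struct toy_page *m, uint32_t bits)
{
	atomic_fetch_or(&m->aflags, bits);
}

/* Clear bits with an atomic read-modify-write; no lock is taken. */
static void
toy_aflag_clear(struct toy_page *m, uint32_t bits)
{
	atomic_fetch_and(&m->aflags, ~bits);
}

static uint32_t
toy_aflag_get(struct toy_page *m)
{
	return (atomic_load(&m->aflags));
}
```

Because each update is a single atomic operation, two threads setting or clearing different bits cannot lose each other's changes, which is what lets the flags be modified without holding a vm lock.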
* - Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag to VPO_UNMANAGED (and also making the flag protected by the vm object lock, instead of vm page queue lock).
  - Mark the fake pages with both PG_FICTITIOUS (as it is now) and VPO_UNMANAGED. As a consequence, pmap code now can use just VPO_UNMANAGED to decide whether the page is unmanaged.
  Reviewed by: alc
  Tested by: pho (x86, previous version), marius (sparc64), marcel (arm, ia64, powerpc), ray (mips)
  Sponsored by: The FreeBSD Foundation
  Approved by: re (bz)
  [kib, 2011-08-09, 2 files, -39/+29]
* This is a follow-up commit from r224216 for powerpc 32-bit. Increase the storage size for sintrcnt/sintrnames to .long.
  Reviewed by: nwhitehorn
  Approved by: re (kib)
  [andreast, 2011-07-25, 1 file, -2/+2]
* On 64 bit architectures size_t is 8 bytes, thus it should use an 8 bytes storage. Fix the sintrcnt/sintrnames specification.
  No MFC is previewed for this patch.
  Reported, reviewed and tested by: marcel
  Approved by: re (kib)
  [attilio, 2011-07-19, 1 file, -2/+2]
* - Remove the eintrcnt/eintrnames usage and introduce the concept of sintrcnt/sintrnames which are symbols containing the size of the 2 tables.
  - For amd64/i386 remove the storage of intr* stuff from assembly files.
  This area can be widely improved by applying the same to other architectures and likely finding an unified approach among them and move the whole code to be MI. More work in this area is expected to happen fairly soon.
  No MFC is previewed for this patch.
  Tested by: pluknet
  Reviewed by: jhb
  Approved by: re (kib)
  [attilio, 2011-07-18, 2 files, -4/+10]
* With retirement of cpumask_t and usage of cpuset_t for representing a mask of CPUs, pc_other_cpus and pc_cpumask become highly inefficient. Remove them and replace their usage with custom pc_cpuid magic (as, atm, pc_cpumask can be easily represented by (1 << pc_cpuid) and pc_other_cpus by (all_cpus & ~(1 << pc_cpuid))).
  This change is not targeted for MFC because of struct pcpu members removal and dependency by cpumask_t retirement.
  MD review by: marcel, marius, alc
  Tested by: pluknet
  MD testing by: marcel, marius, gonzo, andreast
  [attilio, 2011-07-04, 2 files, -12/+4]
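The two expressions quoted in the entry above can be checked with a single-word mask. This is a minimal sketch assuming at most 64 CPUs: `toy_cpumask_t`, `toy_self_mask` and `toy_other_cpus` are hypothetical names, and a plain integer stands in for the kernel's CPU set type, which is a bitset rather than necessarily a single word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-word CPU mask; assumes no more than 64 CPUs. */
typedef uint64_t toy_cpumask_t;

/* Mask containing only the given CPU: (1 << cpuid). */
static toy_cpumask_t
toy_self_mask(unsigned int cpuid)
{
	assert(cpuid < 64);
	return ((toy_cpumask_t)1 << cpuid);
}

/* Every CPU in all_cpus except the given one: all_cpus & ~(1 << cpuid). */
static toy_cpumask_t
toy_other_cpus(toy_cpumask_t all_cpus, unsigned int cpuid)
{
	return (all_cpus & ~toy_self_mask(cpuid));
}
```

For a four-CPU system (all_cpus = 0xF), CPU 2 has the self mask 0x4 and the other-CPUs mask 0xB. Both values are derived from the CPU id alone, so nothing needs to be stored per CPU.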
* Revert r223479. It is unnecessary and served only to slightly ameliorate some manifestations of the bug actually fixed in r223485.
  [nwhitehorn, 2011-06-26, 2 files, -4/+0]
* Use the ABI-mandated thread pointer register (r2 for ppc32, r13 for ppc64) instead of a PCPU field for curthread. This averts a race on SMP systems with a high interrupt rate where the thread looking up the value of curthread could be preempted and migrated between obtaining the PCPU pointer and reading the value of pc_curthread, resulting in curthread being observed to be the current thread on the thread's original CPU. This played merry havoc with the system, in particular with mutexes. Many thanks to jhb for helping me work this one out.
  Note that Book-E is in principle susceptible to the same problem, but has not been modified yet due to lack of Book-E hardware.
  MFC after: 2 weeks
  [nwhitehorn, 2011-06-23, 10 files, -49/+54]