path: root/sys/powerpc/aim
Commit message / Author / Age / Files / Lines
* Fix two small bugs. The PowerPC 970 does not support non-coherent memorynwhitehorn2010-03-151-2/+2
| | | | | | access, and reflects this by autonomously writing LPTE_M into PTE entries. As such, we should not panic if LPTE_M changes by itself. While here, fix a harmless typo in moea64_sync_icache().
* Place interrupt handling in a critical section and remove doublenwhitehorn2010-03-091-2/+4
| | | | | | | | | | | counting in incrementing the interrupt nesting level. This fixes a number of bugs in which the interrupt thread could be preempted by an IPI, indefinitely delaying acknowledgement of the interrupt to the PIC, causing interrupt starvation and hangs. Reported by: linimon Reviewed by: marcel, jhb MFC after: 1 week
* Fix an obvious lock escape and fix a typo in a comment.nwhitehorn2010-03-041-2/+4
|
* Patch some more concurrency issues here. This expands the page tablenwhitehorn2010-03-041-23/+43
| | | | | lock to cover the PVOs, and removes the scratchpage PTEs from the PVOs entirely to avoid the system trying to be helpful and rewriting them.
* The NetBSD Foundation has granted permission to remove clause 3 and 4 fromjoel2010-03-031-7/+0
| | | | | | their software. Obtained from: NetBSD
* Move the OEA64 scratchpage to the end of KVA from the beginning, and setnwhitehorn2010-02-251-9/+14
| | | | | | | | | | | its PVO to map physical address 0 instead of kernelstart. This fixes a situation in which a user process could attempt to return this address via KVM, have it fault while being modified, and then panic the kernel because (a) it is supposed to map a valid address and (b) it lies in the no-fault region between VM_MIN_KERNEL_ADDRESS and virtual_avail. While here, move msgbuf and dpcpu into regular KVA space for consistency with other implementations.
* Provide an implementation of pmap_dev_direct_mapped() on OEA64. This isnwhitehorn2010-02-251-1/+16
| | | | | required in order to be able to mmap the running kernel, which is in turn required to avoid fstat returning gibberish.
* Use dcbz instead of word stores for page zeroing, providing a factor ofnwhitehorn2010-02-241-9/+21
| | | | 3-4 speedup.
* Close a race involving the OEA64 scratchpage. When the scratch page'snwhitehorn2010-02-241-10/+14
| | | | | | | | | | | | | | | physical address is changed, there is a brief window during which its PTE is invalid. Since moea64_set_scratchpage_pa() does not and cannot hold the page table lock, it was possible for another CPU to insert a new PTE into the scratch page's PTEG slot during this interval, corrupting both mappings. Solve this by creating a new flag, LPTE_LOCKED, such that moea64_pte_insert will avoid claiming locked PTEG slots even if they are invalid. This change also incorporates some additional paranoia added to solve things I thought might be this bug. Reported by: linimon
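The slot-selection rule described in this commit can be sketched in plain C. This is only an illustration: the flag values, the `sk_` names, and the struct layout are hypothetical and are not taken from the FreeBSD sources. The point is that a slot is claimable only if it is neither valid nor locked, so a slot whose PTE is briefly invalid while its owner rewrites it stays reserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values only; the real LPTE_* constants are not shown in this log. */
#define SK_PTE_VALID   0x1u
#define SK_PTE_LOCKED  0x2u
#define SK_PTEG_SLOTS  8        /* a PowerPC PTE group holds eight PTEs */

struct sk_pte {
	uint32_t flags;
};

/*
 * Return the index of the first slot that is neither valid nor locked,
 * or -1 if the group has no claimable slot.
 */
static int
sk_pteg_find_free(const struct sk_pte *pteg, size_t nslots)
{
	assert(pteg != NULL);
	for (size_t i = 0; i < nslots; i++) {
		if ((pteg[i].flags & (SK_PTE_VALID | SK_PTE_LOCKED)) == 0)
			return ((int)i);
	}
	return (-1);
}
```

With slot 0 valid and slot 1 invalid but locked, the search skips both and returns slot 2, which is the behavior the commit message asks of moea64_pte_insert.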
* Allow user programs to execute mfpvr instructions. Linux allows this, andnwhitehorn2010-02-221-1/+22
| | | | | | | some math-related software like GMP expects to be able to use it to pick a target appropriately. MFC after: 1 week
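Since mfpvr is a privileged instruction, one common way to allow it from user mode is to recognize it in the illegal/privileged-instruction trap path and emulate it. The sketch below assumes that approach; the function and macro names are hypothetical. The encoding itself follows from the instruction format: mfpvr rD is mfspr rD,287, with primary opcode 31, extended opcode 339, and the two 5-bit halves of the SPR number swapped in the instruction word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* mfspr rD,287 (PVR): every bit except the 5-bit rD field is fixed. */
#define SK_MFPVR_MASK   0xfc1fffffu
#define SK_MFPVR_MATCH  0x7c1f42a6u

/* If instr is "mfpvr rD", store rD in *rd and return true. */
static bool
sk_decode_mfpvr(uint32_t instr, unsigned *rd)
{
	assert(rd != NULL);
	if ((instr & SK_MFPVR_MASK) != SK_MFPVR_MATCH)
		return (false);
	*rd = (instr >> 21) & 0x1f;
	return (true);
}

/*
 * Emulate the instruction against a saved user register file.
 * The pvr argument stands in for the value the kernel would read itself.
 */
static bool
sk_emulate_mfpvr(uint32_t instr, uint32_t pvr, uint64_t fixreg[32])
{
	unsigned rd;

	if (!sk_decode_mfpvr(instr, &rd))
		return (false);
	fixreg[rd] = pvr;
	return (true);
}
```

A real handler would also advance the saved program counter past the emulated instruction before returning to user mode; that step is omitted here.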
* Reduce KVA pressure on OEA64 systems running in bridge mode by mappingnwhitehorn2010-02-204-63/+43
| | | | | | | | | | | | | UMA segments at their physical addresses instead of into KVA. This emulates the direct mapping behavior of OEA32 in an ad-hoc way. Making this work properly required sharing the entire kernel PMAP with Open Firmware, so ofw_pmap is transformed into a stub on 64-bit CPUs. Also implement some more tweaks to get more mileage out of our limited amount of KVA, principally by extending KVA into segment 16 until the beginning of the first OFW mapping. Reported by: linimon
* Fix a bug where pages being removed from memory entirely no longer havenwhitehorn2010-02-182-55/+39
| | | | | | | | | | | PVOs, and so the modified state of the page can no longer be communicated to the VM layer, causing pages not to be flushed to swap when needed, in turn causing memory corruption. Also make several correctness adjustments to I-Cache synchronization and TLB invalidation for 64-bit Book-S CPUs. Obtained from: projects/ppc64 Discussed with: grehan MFC after: 2 weeks
* Remove extraneous semicolons, no functional changes.mbr2010-01-072-6/+6
| | | | | Submitted by: Marc Balmer <marc@msys.ch> MFC after: 1 week
* The first argument of dcbz interprets r0 as a literal zero, not the second.nwhitehorn2009-12-031-1/+1
| | | | | | This worked before by accident. MFC after: 1 week
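The operand rule behind this fix can be modeled in a few lines of C. For indexed instructions such as dcbz, the effective address is (rA|0) + rB: register number 0 in the rA position means the constant zero, while register number 0 in the rB position means the contents of r0. The function name and the register-file array below are hypothetical, for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of an indexed instruction: (rA|0) + rB. */
static uint64_t
sk_dcbz_ea(const uint64_t gpr[32], unsigned ra, unsigned rb)
{
	assert(ra < 32 && rb < 32);
	/* rA == 0 selects a literal zero, not the contents of r0. */
	uint64_t base = (ra == 0) ? 0 : gpr[ra];
	return (base + gpr[rb]);
}
```

With the target address in r4, `dcbz 0,r4` always zeroes the intended line, whereas `dcbz r4,0` adds whatever r0 happens to hold, which is correct only when r0 is zero.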
* Add a CPU features framework on PowerPC and simplify CPU setup a littlenwhitehorn2009-11-282-4/+8
| more. This provides three new sysctls to user space:
|   hw.cpu_features - A bitmask of available CPU features
|   hw.floatingpoint - Whether or not there is hardware FP support
|   hw.altivec - Whether or not Altivec is available
| PR: powerpc/139154
| MFC after: 10 days
* Simplify the invocation of vm_fault(). Specifically, eliminate the flagalc2009-11-271-3/+1
| | | | | | | VM_FAULT_DIRTY. The information provided by this flag can be trivially inferred by vm_fault(). Discussed with: kib
* Garbage collect some code that was never compiled in to handle Altivecnwhitehorn2009-11-221-6/+0
| | | | during traps. It predates actual Altivec support and was never used.
* Provide a real fix to the too-many-translations problem when bootingnwhitehorn2009-11-121-56/+62
| | | | | | | | | | | | from CD on 64-bit hardware to replace existing band-aids. This occurred when the preloaded mdroot required too many mappings for the static buffer. Since we only use the translations buffer once, allocate a dynamic buffer on the stack. This early in the boot process, the call chain is quite short and we can be assured of having sufficient stack space. Reviewed by: grehan
* Extract the code that records syscall results in the frame into MDkib2009-11-102-37/+57
| | | | | | | | | | | function cpu_set_syscall_retval(). Suggested by: marcel Reviewed by: marcel, davidxu PowerPC, ARM, ia64 changes: marcel Sparc64 tested and reviewed by: marius, also sunv reviewed MIPS tested by: gonzo MFC after: 1 month
* Spell sz correctly.nwhitehorn2009-11-091-1/+1
| | | | Pointed out by: jmallett
* Increase the size of the OFW translations buffer to handle G5 systemsnwhitehorn2009-11-091-1/+4
| | | | | | | | that use many translation regions in firmware, and add bounds checking to prevent buffer overflows in case even the new value is exceeded. Reported by: Jacob Lambert MFC after: 3 days
* Unbreak cpu_switch(). The register allocator in my brain is clearlynwhitehorn2009-10-311-4/+6
| | | | | broken. Also, Altivec context switching worked before only by accident, but should work now by design.
* Remove an unnecessary sync that crept in the last commit.nwhitehorn2009-10-311-1/+0
|
* Fix a race in casuword() exposed by csup. casuword() non-atomically readnwhitehorn2009-10-311-2/+13
| | | | | | the current value of its argument before atomically replacing it, which could occasionally return the wrong value on an SMP system. This resulted in user mutex operations hanging when using threaded applications.
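The shape of the fix can be illustrated with C11 atomics in user space. This is a sketch of the general technique, not the kernel's casuword(), which is written in assembly around lwarx/stwcx. and also has to handle faults on the user address. The name `sk_casuword` is hypothetical. The returned value comes from the atomic operation itself rather than from a separate earlier load, so it cannot disagree with what the swap actually compared against.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compare-and-set that returns the value observed by the atomic
 * operation. The store happened if and only if the return value
 * equals 'expected'.
 */
static uintptr_t
sk_casuword(_Atomic uintptr_t *p, uintptr_t expected, uintptr_t newval)
{
	assert(p != NULL);
	uintptr_t observed = expected;

	/* On failure, 'observed' is overwritten with the current contents. */
	atomic_compare_exchange_strong(p, &observed, newval);
	return (observed);
}
```

A caller such as a user-space mutex compares the return value with `expected` to decide whether it won; a stale value from a separate read could make that comparison lie.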
* Loop on blocked threads when using ULE scheduler, removing annwhitehorn2009-10-311-9/+21
| | | | XXX MP comment.
* Garbage collect set_user_sr(), which is declared static inline andnwhitehorn2009-10-311-9/+0
| | | | never called.
* Turn off Altivec data-stream prefetching before going into power-savenwhitehorn2009-10-291-3/+21
| | | | mode on those CPUs that need it.
* In r197963, a race with thread being selected for signal deliverykib2009-10-271-7/+1
| | | | | | | | | | | | while in kernel mode, and later changing signal mask to block the signal, was fixed for sigprocmask(2) and pthread_exit(3). The same race exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls. Use kern_sigprocmask() instead of direct manipulation of td_sigmask to reschedule newly blocked signals, closing the race. Reviewed by: davidxu Tested by: pho MFC after: 1 month
* Remove debugging printf that snuck in here.nwhitehorn2009-10-231-1/+0
| | | | Pointy hat to: me
* Add some more paranoia to setting HID registers, and update the AIMnwhitehorn2009-10-233-12/+27
| | | | | | | clock routines to work better with SMP. This makes SMP work fully and stably on an Xserve G5. Obtained from: Book-E (clock bits)
* Do not map the trap vectors into the kernel's address space. They arenwhitehorn2009-10-232-6/+13
| | | | | | | | only used in real mode and keeping them mapped only serves to make NULL a valid address, which results in silent NULL pointer dereferences. Suggested by: Patrick Kerharo Obtained from: projects/ppc64
* Add SMP support on U3-based G5 systems. This does not yet work perfectly:nwhitehorn2009-10-233-78/+184
| | | | | | | | | | at least on my Xserve, getting the decrementer and timebase on APs to tick requires setting up a clock chip over I2C, which is not yet done. While here, correct the 64-bit tlbie function to set the CPU to 64-bit mode correctly. Hardware donated by: grehan
* o Introduce vm_sync_icache() for making the I-cache coherent withmarcel2009-10-212-24/+60
|   the memory or D-cache, depending on the semantics of the platform. vm_sync_icache() is basically a wrapper around pmap_sync_icache() that translates the vm_map_t argument to pmap_t.
| o Introduce pmap_sync_icache() to all PMAP implementations. For powerpc it replaces the pmap_page_executable() function, added to solve the I-cache problem in uiomove_fromphys().
| o In proc_rwmem() call vm_sync_icache() when writing to a page that has execute permissions. This assures that when breakpoints are written, the I-cache will be coherent and the process will actually hit the breakpoint.
| o This also fixes the Book-E PMAP implementation that was missing necessary locking while trying to deal with the I-cache coherency in pmap_enter() (read: mmu_booke_enter_locked).
|
| The key property of this change is that the I-cache is made coherent *after* writes have been done. Doing it in the PMAP layer when adding or changing a mapping means that the I-cache is made coherent *before* any writes happen. The difference is key when the I-cache prefetches.
* Don't assume that physical addresses are identity mapped. This allowsnwhitehorn2009-10-181-1/+8
| | | | | the second processor on G5 systems to start. Note that SMP is still non-functional on these systems because of IPI delivery problems.
* Correct another typo. Actually save the condition register insteadnwhitehorn2009-10-111-1/+1
| | | | of overwriting r12 by mistake.
* Correct a typo here and actually save DSISR instead of overwriting it.nwhitehorn2009-10-111-1/+1
|
* Increase the size of the page table on 64-bit PowerPC machines as anwhitehorn2009-07-121-2/+0
| | | | | | | | | bandaid to prevent exhaustion of the primary and secondary hash groups in the event of extreme stress on the PMAP layer (e.g. a forkbomb). This wastes memory, and should be revised to properly handle PTEG spills instead. Suggested by: grehan Approved by: re (kensmith)
* Implement a facility for dynamic per-cpu variables.jeff2009-06-232-0/+30
| - Modules and kernel code alike may use DPCPU_DEFINE(), DPCPU_GET(), DPCPU_SET(), etc. akin to the statically defined PCPU_*. Requires only one extra instruction more than PCPU_* and is virtually the same as __thread for builtin and much faster for shared objects. DPCPU variables can be initialized when defined.
| - Modules are supported by relocating the module's per-cpu linker set over space reserved in the kernel. Modules may fail to load if there is insufficient space available.
| - Track space available for modules with a one-off extent allocator. Free may block for memory to allocate space for an extent.
| Reviewed by: jhb, rwatson, kan, sam, grehan, marius, marcel, stas
* Get the gdb/psim emulator functioning again.grehan2009-06-102-6/+26
| aim/machdep.c:
|   - the RI status register bit needs to be set when doing the mtmsrd 64-bit instruction test
|   - psim doesn't implement the dcbz instruction so the run-time cacheline test fails. Set the cacheline size to 32 to avoid infinite loops in future calls to __syncicache()
| aim/platform_chrp.c:
|   - if after iterating through / and a name property of "cpus" still isn't found, just search directly for '/cpus'.
|   - psim doesn't put a "reg" property on its cpu nodes, so assume 0 since it is uniprocessor-only at this point
| powerpc/openpic.c:
|   - the number of CPUs reported is 1 too many on psim's openpic
| Reviewed by: nwhitehorn
| MFC after: 1 week (openpic part)
* Introduce support for cpufreq on PowerPC with the dynamic frequencynwhitehorn2009-05-311-8/+0
| | | | switching capabilities of the MPC7447A and MPC7448.
* Add cpu_flush_dcache() for use after non-DMA based I/O so that amarcel2009-05-181-0/+10
| | | | | | | | | | | | | | | | | | | | | possible future I-cache coherency operation can succeed. On ARM for example the L1 cache can be (is) virtually mapped, which means that any I/O that uses temporary mappings will not see the I-cache made coherent. On ia64 a similar behaviour has been observed. By flushing the D-cache, execution of binaries backed by md(4) and/or NFS work reliably. For Book-E (powerpc), execution over NFS exhibits SIGILL once in a while as well, though cpu_flush_dcache() hasn't been implemented yet. Doing an explicit D-cache flush as part of the non-DMA based I/O read operation eliminates the need to do it as part of the I-cache coherency operation itself and as such avoids pessimizing the DMA-based I/O read operations for which the D-cache is already flushed/invalidated. It also allows future optimizations whereby the bcopy() followed by the D-cache flush can be integrated in a single operation, which could be implemented using on-chip DMA engines, by-passing the D-cache altogether.
* PowerPC common SMP startup and time base rework.raj2009-05-141-11/+8
| - make mftb() shared, rewrite in C, provide complementary mttb()
| - adjust SMP startup per the above, additional comments, minor naming changes
| - eliminate redundant TB defines, other minor cosmetics
| Reviewed by: marcel, nwhitehorn
| Obtained from: Freescale, Semihalf
* Factor out platform dependent things unrelated to device drivers into anwhitehorn2009-05-148-157/+269
| | | | | | | | | | new platform module. These are probed in early boot, and have the responsibility of determining the layout of physical memory, determining the CPU timebase frequency, and handling the zoo of SMP mechanisms found on PowerPC. Reviewed by: marcel, raj Book-E parts by: raj
* Zero PCB during early AIM PowerPC init.raj2009-04-241-0/+1
| | | | | | | | | When memory is not zero'ed by firmware, uninitialized PCB can have bogus contents, which appear as a saved onfault condition, Altivec context to restore etc. and lead to corruption/crashes. This commit fixes such issues. Submitted by: Michal Mazur arg ! semihalf dot com Tested by: Andreas Tobler andreast-list ! fgznet dot ch
* Fix a typo in the SRR1 comparison for program exceptions. While here,nwhitehorn2009-04-191-3/+2
| | | | | | | | | | | replace magic numbers with constants to keep this from happening again. Without this fix, some programs would occasionally get SIGTRAP instead of SIGILL on an illegal instruction. This affected Altivec detection in pixman, and possibly other software. Reported by: Andreas Tobler MFC after: 1 week
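The "constants instead of magic numbers" idea can be sketched as follows. The macro and struct names are hypothetical. The bit positions follow the architecture's SRR1 layout for program interrupts as I understand it; treat them as an assumption and check them against the CPU manual before reuse. The signal chosen for each cause is likewise only one plausible policy, not a statement of what trap.c does.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed SRR1 cause bits for a program exception (hypothetical names). */
#define SK_PGM_FPENABLED  (UINT32_C(1) << 20)
#define SK_PGM_ILLEGAL    (UINT32_C(1) << 19)
#define SK_PGM_PRIV       (UINT32_C(1) << 18)
#define SK_PGM_TRAP       (UINT32_C(1) << 17)

/* Minimal stand-in for the saved trap state. */
struct sk_trapframe {
	uint32_t srr1;
};

/* Pick the signal to deliver for a user-mode program exception. */
static int
sk_pgm_signal(const struct sk_trapframe *frame)
{
	assert(frame != NULL);
	if (frame->srr1 & SK_PGM_TRAP)
		return (SIGTRAP);       /* trap instruction, e.g. a breakpoint */
	if (frame->srr1 & SK_PGM_FPENABLED)
		return (SIGFPE);        /* enabled floating-point exception */
	return (SIGILL);                /* illegal or privileged instruction */
}
```

Testing against a named bit makes an off-by-one-bit mistake visible in review, which is the failure mode this commit describes: software probing for Altivec expects SIGILL and misbehaves on SIGTRAP.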
* Changing the overflow trap to use bla to branch to dbtrap in r190946 wasnwhitehorn2009-04-141-1/+1
| | | | bogus. Revert to a branch that does not set LR. It's been a long week...
* Rework the way we get the cacheline size. Instead of having a table ofnwhitehorn2009-04-121-13/+38
| | | | | | CPUs known to use 128 byte cache lines and defaulting to 32, use the dcbz instruction to measure it. Also make dcbz behave the way you would expect on PPC 970.
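The measurement idea can be shown with a portable simulation. dcbz zeroes the whole cache block containing the given address, so filling a buffer with a nonzero pattern, zeroing one block, and counting the zero bytes yields the line size. The code below is a sketch under that assumption: the `sk_` names are hypothetical, and the two `sk_fake_dcbz_*` functions merely stand in for the real instruction so the example runs on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_PROBE_BYTES 1024

/* Zeroes the "cache line" of buf that contains byte 'offset'. */
typedef void (*sk_zero_line_fn)(unsigned char *buf, size_t offset);

static size_t
sk_measure_cacheline(sk_zero_line_fn zero_line)
{
	unsigned char buf[SK_PROBE_BYTES];
	size_t zeroed = 0;

	assert(zero_line != NULL);
	memset(buf, 0xff, sizeof(buf));         /* nonzero background */
	zero_line(buf, sizeof(buf) / 2);        /* zero exactly one line */
	for (size_t i = 0; i < sizeof(buf); i++)
		if (buf[i] == 0)
			zeroed++;
	return (zeroed);
}

/* Stand-ins for dcbz on CPUs with 32-byte and 128-byte lines. */
static void
sk_fake_dcbz_32(unsigned char *buf, size_t offset)
{
	memset(buf + (offset & ~(size_t)31), 0, 32);
}

static void
sk_fake_dcbz_128(unsigned char *buf, size_t offset)
{
	memset(buf + (offset & ~(size_t)127), 0, 128);
}
```

On real hardware the probe buffer would need to be aligned to the largest possible line size, because dcbz rounds the actual address down; the simulation sidesteps that by working with offsets.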
* Fix recognition of kernel-mode traps that pass through the KDB trap handlernwhitehorn2009-04-111-4/+2
| | | | | | | | | | but do not actually invoke KDB. This includes recoverable machine checks encountered in kernel mode. This patch causes machines with Grackle host-PCI bridges to be able to correctly enumerate them again. MFC after: 3 days
* Fix the build when KDB is disabled. The second instance of rfi innwhitehorn2009-04-051-0/+3
| | | | | trap_subr.S that is patched at runtime to rfid on 64-bit systems is inside KDB-specific code, so don't patch it without KDB.
* Perform a dummy stwcx. when we switch contexts. The contextmarcel2009-04-041-0/+6
| being switched out may hold a reservation. The stwcx. will clear the reservation. This is architecturally recommended. The scenario this addresses is as follows:
|   1. Thread 1 performs a lwarx and as such holds a reservation.
|   2. Thread 1 gets switched out (before doing the matching stwcx.) and thread 2 is switched in.
|   3. Thread 2 performs a stwcx. to the same reservation granule. This will succeed because the processor has the reservation even though thread 2 didn't do the lwarx.
| Note that on some processors the address given the stwcx. is not checked. On these processors the mere condition of having a reservation would cause the stwcx. to succeed, irrespective of whether the addresses are the same. The dummy stwcx. is especially important for those processors.
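The scenario can be replayed with a toy model of the reservation. This is a deliberately simplified sketch: the `sk_` names are hypothetical, the "CPU" is a single flag, and the model behaves like the processors mentioned in the note, which do not compare the store address against the reserved one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One reservation per CPU; the address is not tracked in this model. */
struct sk_cpu {
	bool reserved;
};

/* Load and reserve. */
static uint32_t
sk_lwarx(struct sk_cpu *cpu, const uint32_t *addr)
{
	cpu->reserved = true;
	return (*addr);
}

/* Store conditional: succeeds only if a reservation is held, then clears it. */
static bool
sk_stwcx(struct sk_cpu *cpu, uint32_t *addr, uint32_t val)
{
	if (!cpu->reserved)
		return (false);
	*addr = val;
	cpu->reserved = false;
	return (true);
}

/* Context switch; optionally issue the dummy stwcx. to a scratch word. */
static void
sk_context_switch(struct sk_cpu *cpu, bool dummy_stwcx)
{
	static uint32_t scratch;

	assert(cpu != NULL);
	if (dummy_stwcx)
		(void)sk_stwcx(cpu, &scratch, 0);
}
```

Without the dummy store, a stwcx. issued by the incoming thread succeeds on the outgoing thread's reservation; with it, the same stwcx. fails and the incoming thread has to retry from its own lwarx.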