| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Define the vm_ooffset_t and vm_pindex_t types as machine-independend.
(cherry picked from commit 2f90047bce7ecbc5d94c29e91fbee08aa4212dbc)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
mp_maxid or CPU_FOREACH() as appropriate. This fixes a number of places in
the kernel that assumed CPU IDs are dense in [0, mp_ncpus) and would try,
for example, to run tasks on CPUs that did not exist or to allocate too
few buffers on systems with sparse CPU IDs in which there are holes in the
range and mp_maxid > mp_ncpus. Such circumstances generally occur on
systems with SMT, but on which SMT is disabled. This patch restores system
operation at least on POWER8 systems configured in this way.
There are a number of other places in the kernel with potential problems
in these situations, but where sparse CPU IDs are not currently known
to occur, mostly in the ARM machine-dependent code. These will be fixed
in a follow-up commit after the stable/11 branch.
PR: kern/210106
Reviewed by: jhb
Approved by: re (glebius)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
PowerPC Book-E SMP is currently broken for unknown reasons. Pull in
Semihalf changes made c2012 for e500mc/e5500, which enables booting SMP.
This eliminates the shared software TLB1 table, replacing it with
tlb1_read_entry() function.
This does not yet support ePAPR SMP booting, and doesn't handle resetting CPUs
already released (ePAPR boot releases APs to a spin loop waiting on a specific
address). This will be addressed in the near future by using the MPIC to reset
the AP into our own alternate boot address.
This does include a change to the dpaa/dtsec(4) driver, to mark the portals as
CPU-private.
Test Plan:
Tested on Amiga X5000/20 (P5020). Boots, prints the following
messages:
Adding CPU 0, pir=0, awake=1
Waking up CPU 1 (dev=1)
Adding CPU 1, pir=20, awake=1
SMP: AP CPU #1 launched
top(1) shows CPU1 active.
Obtained from: Semihalf
Relnotes: Yes
Differential Revision: https://reviews.freebsd.org/D5945
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There is currently a 1GB hole between user and kernel address spaces
into which direct (1:1 PA:VA) device mappings go. This appears to go largely
unused, leaving all devices to contend with the 128MB block at the end of the
32-bit space (0xf8000000-0xffffffff). This easily fills up, and needs to be
densely packed. However, dense packing wastes precious TLB1 space, of which
there are only 16 (e500v2) or 64(e5500) entries available.
Change this by using the 1GB space for all device mappings, and allow the kernel
to use the entire upper 1GB for KVA. This also allows us to use sparse device
mappings, freeing up TLB entries.
Test Plan: Boot tested on p5020.
Differential Revision: https://reviews.freebsd.org/D5832
|
|
|
|
| |
Sponsored by: Alex Perez/Inertial Computing
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Freescale's QorIQ line includes a new ethernet controller, based on their
Datapath Acceleration Architecture (DPAA). This uses a combination of a Frame
manager, Buffer manager, and Queue manager to improve performance across all
interfaces by being able to pass data directly between hardware acceleration
interfaces.
As part of this import, Freescale's Netcomm Software (ncsw) driver is imported.
This was an attempt by Freescale to create an OS-agnostic sub-driver for
managing the hardware, using shims to interface to the OS-specific APIs. This
work was abandoned, and Freescale's primary work is in the Linux driver (dual
BSD/GPL license). Hence, this was imported directly to sys/contrib, rather than
going through the vendor area. Going forward, FreeBSD-specific changes may be
made to the ncsw code, diverging from the upstream in potentially incompatible
ways. An alternative could be to import the Linux driver itself, using the
linuxKPI layer, as that would maintain parity with the vendor-maintained driver.
However, the Linux driver has not been evaluated for reliability yet, and may
have issues with the import, whereas the ncsw-based driver in this commit was
completed by Semihalf 4 years ago, and is very stable.
Other SoC modules based on DPAA, which could be added in the future:
* Security and Encryption engine (SEC4.x, SEC5.x)
* RAID engine
Additional work to be done:
* Implement polling mode
* Test vlan support
* Add support for the Pattern Matching Engine, which can do regular expression
matching on packets.
This driver has been tested on the P5020 QorIQ SoC. Others listed in the
dtsec(4) manual page are expected to work as the same DPAA engine is included in
all.
Obtained from: Semihalf
Relnotes: Yes
Sponsored by: Alex Perez/Inertial Computing
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Some drivers need special memory requirements. X86 solves this with a
pmap_change_attr() API, which DRM uses for changing the mapping of the GART and
other memory regions. Implement the same function for PowerPC. AIM currently
does not need this, but will in the future for DRM, so a default is added for
that, for business as usual. Book-E has some drivers coming down that do
require non-default memory coherency. In this case, the Datapath Acceleration
Architecture (DPAA) based ethernet controller has 2 regions for the buffer
portals: cache-inhibited, and cache-enabled. By default, device memory is
cache-inhibited. If the cache-enabled memory regions are mapped
cache-inhibited, an alignment exception is thrown on access.
Test Plan:
Tested with a new driver to be added after this (DPAA dTSEC ethernet driver).
No alignment exceptions thrown, driver works as expected with this.
Reviewed By: nwhitehorn
Sponsored by: Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D5471
|
|
|
|
|
|
| |
PTE was getting overwritten by just the flags.
Pointy-hat to: jhibbits
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ucontext_t available. Our code even has XXX comment about this.
Add a bit of compliance by moving struct __ucontext definition into
sys/_ucontext.h and including it into signal.h and sys/ucontext.h.
Several machine/ucontext.h headers were changed to use namespace-safe
types (like uint64_t->__uint64_t) to not depend on sys/types.h.
struct __stack_t from sys/signal.h is made always visible in private
namespace to satisfy sys/_ucontext.h requirements.
Apparently mips _types.h pollutes global namespace with f_register_t
type definition. This commit does not try to fix the issue.
PR: 207079
Reported and tested by: Ting-Wei Lan <lantw44@gmail.com>
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The revised Book-E spec, adding the specification for the MMUv2 and e6500,
includes a hardware PTE layout for indirect page tables. In order to support
this in the future, migrate the PTE format to match the MMUv2 hardware PTE
format.
Test Plan: Boot tested on a P5020 board. Booted to multiuser mode.
Differential Revision: https://reviews.freebsd.org/D5224
|
|
|
|
|
|
|
|
|
|
|
| |
The PT_{GET,SET}FPREGS requests use 'struct fpreg' and the NT_FPREGSET
core note stores a copy of 'struct fpreg'. As with x86 and the floating
point state there compared to the extended state in XSAVE, struct fpreg
on powerpc now only holds the 'base' FP state, and setting it via
PT_SETFPREGS leaves the extended vector state in a thread unchanged.
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D5004
|
|
|
|
|
|
| |
This part was a botched revert of a test change.
Spotted by: alc
|
|
|
|
|
|
| |
VM_MAX_KERNEL_ADDERESS is the maximum KVA address. 0xf8000000 is the start of
device mapping space. Since several conditional checks use '<=' against
VM_MAX_KERNEL_ADDRESS, bad things could feasibly happen.
|
|
|
|
|
| |
setfault() when testing for faults. This should also help the compiler
do the right thing with this complicated-to-optimize function.
|
|
|
|
|
|
|
| |
proliferation of them on large IBM systems and add some error checking if
we exceed that number.
MFC after: 1 week
|
|
|
|
| |
MFC after: 1 week
|
|
|
|
|
|
|
|
| |
Newer Book-E cores (e500mc, e5500, e6500) do not support the WE bit in the MSR,
and instead delegate CPU idling to the SoC.
Perhaps in the future the QORIQ_DPAA option for the mpc85xx platform will become
a subclass, which will eliminate most of the #ifdef's.
|
|
|
|
|
|
|
|
|
| |
Summary:
With some additional changes for AIM, that could also support much
larger physmem sizes. Given that 32-bit AIM is more or less obsolete, though,
it's not worth it at this time.
Differential Revision: https://reviews.freebsd.org/D4345
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
into a new function that other platforms can share.
This creates a new ofw_reg_to_paddr() function (in a new ofw_subr.c file)
that contains most of the existing ppc implementation, mostly unchanged.
The ppc code now calls the new MI code from the MD code, then creates a
ppc-specific bus_space mapping from the results. The new arm implementation
does the same in an arm-specific way.
This also moves the declaration of OF_decode_addr() from ofw_machdep.h to
openfirm.h, except on sparc64 which uses a different function signature.
This will help all FDT platforms to set up early console access using
OF_decode_addr().
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
e500mc, e5500, and e6500 all use the normal FPU, with the same behavior as AIM
hardware. e6500 also supports Altivec, so, although we don't yet have e6500
hardware to test on, add these IVORs as well. Theoretically, since it boots the
same as a e5500, it should work, single-threaded, single-core, with full altivec
support as of this commit.
With this commit, and some other patches to be committed shortly FreeBSD now
boots on the P5020, single-core, all the way to user space, and should boot just
fine on e500mc.
Relnotes: Yes (e500mc, e5500 support)
Sponsored by: Alex Perez/Inertial Computing
|
|
|
|
|
|
| |
is available commercially with up to 96 threads per socket.
MFC after: 3 weeks
|
|
|
|
| |
or gcc) and binutils >= 2.24. Not enabled by default.
|
|
|
|
|
|
|
| |
Bits in mcsr indicate if the address is valid, and whether it's a physical
address or effective address.
Sponsored by: Alex Perez/Inertial Computing
|
|
|
|
| |
separate commit.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lwsync instruction, which does not provide Store/Load barrier. Fix
this by using "full" sync barrier for mb().
atomic_store_rel() does not need full barrier, change mb() call there
to the lwsync instruction if not hitting the known CPU erratas
(i.e. on 32bit). Provide powerpc_lwsync() helper to isolate the
lwsync/sync compile time selection, and use it in atomic_store_rel()
and several other places which duplicate the code.
Noted by: alc
Reviewed and tested by: nwhitehorn
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
|
| |
new, simplified, ELF ABI that avoids some of the stranger aspects of the
existing 64-bit PowerPC ABI (function descriptors, in particular). Actually
generating such executables requires a new version of binutils and a newer
compiler (either GCC or clang) than GCC 4.2.1.
|
|
|
|
| |
Pointy-hat to: jhibbits
|
|
|
|
|
|
| |
Increase BUS_SPACE_MAXADDR to allow for this.
Sponsored by: Alex Perez/Inertial Computing
|
| |
|
|
|
|
|
| |
This was missed in r235936. With recent work for 36-bit paddr, this is now
needed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As part of this, clean up tlb1_init(), since bootinfo is always NULL here just
eliminate the loop altogether.
Also, fix a bug in mmu_booke_mapdev_attr() where it's possible to map a larger
immediately following a smaller page, causing the mappings to overlap. Instead,
break up the new mapping into smaller chunks. The downside to this is that it
uses more precious TLB1 entries, which, on smaller chips (e500v2) it could cause
problems with TLB1 being out of space (e500v2 only has 16 TLB1 entries).
Obtained from: Semihalf (partial)
Sponsored by: Alex Perez/Inertial Computing
|
|
|
|
|
|
| |
Missed these files, from the original diff.
Sponsored by: Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D3027
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
initial thread stack is not adjusted by the tunable, the stack is
allocated too early to get access to the kernel environment. See
TD0_KSTACK_PAGES for the thread0 stack sizing on i386.
The tunable was tested on x86 only. From the visual inspection, it
seems that it might work on arm and powerpc. The arm
USPACE_SVC_STACK_TOP and powerpc USPACE macros seems to be already
incorrect for the threads with non-default kstack size. I only
changed the macros to use variable instead of constant, since I cannot
test.
On arm64, mips and sparc64, some static data structures are sized by
KSTACK_PAGES, so the tunable is disabled.
Sponsored by: The FreeBSD Foundation
MFC after: 2 week
|
|
|
|
|
|
|
| |
Remove the advertising clause from the Regents of the University of
California's license, per the letter dated July 22, 1999.
Update clause numbering.
|
|
|
|
|
|
|
| |
Remove the advertising clause from the Regents of the University of
California's license, per the letter dated July 22, 1999.
Update clause numbering.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vm_offset_t pmap_quick_enter_page(vm_page_t m)
void pmap_quick_remove_page(vm_offset_t kva)
These will create and destroy a temporary, CPU-local KVA mapping of a specified page.
Guarantees:
--Will not sleep and will not fail.
--Safe to call under a non-sleepable lock or from an ithread
Restrictions:
--Not guaranteed to be safe to call from an interrupt filter or under a spin mutex on all platforms
--Current implementation does not guarantee more than one page of mapping space across all platforms. MI code should not make nested calls to pmap_quick_enter_page.
--MI code should not perform locking while holding onto a mapping created by pmap_quick_enter_page
The idea is to use this in busdma, for bounce buffer copies as well as virtually-indexed cache maintenance on mips and arm.
NOTE: the non-i386, non-amd64 implementations of these functions still need review and testing.
Reviewed by: kib
Approved by: kib (mentor)
Differential Revision: http://reviews.freebsd.org/D3013
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
provide a semantic defined by the C11 fences with corresponding
memory_order.
atomic_thread_fence_acq() gives r | r, w, where r and w are read and
write accesses, and | denotes the fence itself.
atomic_thread_fence_rel() is r, w | w.
atomic_thread_fence_acq_rel() is the combination of the acquire and
release in single operation. Note that reads after the acq+rel fence
could be made visible before writes preceeding the fence.
atomic_thread_fence_seq_cst() orders all accesses before/after the
fence, and the fence itself is globally ordered against other
sequentially consistent atomic operations.
Reviewed by: alc
Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
|
|
|
|
|
|
|
| |
On Book-E, physical addresses are actually 36-bits, not 32-bits. This is
currently worked around by ignoring the top bits. However, in some cases, the
boot loader configures CCSR to something above the 32-bit mark. This is stage 1
in updating the pmap to handle 36-bit physaddr.
|
|
|
|
|
|
|
| |
This will print out the Memory Subsystem Status Register on MPC745x (G4+ class),
and the Machine Check Status Register on Book-E class CPUs, to aid in debugging
machine checks. Other relevant registers, for other CPUs, can be added in the
future.
|
|
|
|
|
|
| |
Differential Revision: https://reviews.freebsd.org/D2712
Reviewed by: kib
Sponsored by: EMC / Isilon Storage Division
|
|
|
|
|
|
|
|
| |
This supports e500v1, e500v2, and e500mc. Tested only on e500v2, but the
performance counters are identical across all, with e500mc having some
additional events.
Relnotes: Yes
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and export them to userland.
- Define __HAVE_REG32 on platforms that define a reg32 structure and check
for this in <sys/procfs.h> to control when to export prstatus32, etc.
- Add prstatus32_t and prpsinfo32_t typedefs for the 32-bit structures.
libbfd looks for these types, and having them fixes 'gcore' in gdb of a
32-bit process on a 64-bit platform.
- Use the structure definitions from <sys/procfs.h> in gcore's elf32 core
dump code instead of duplicating the definitions.
Differential Revision: https://reviews.freebsd.org/D2142
Reviewed by: kib, nathanw (powerpc bits)
MFC after: 1 week
|
|
|
|
|
| |
Renumber EXC_DEBUG to be above EXC_LAST, so as not to conflict with AIM trap
vectors.
|
|
|
|
| |
prevents contamination from a previous kernel (e.g. after shutdown -r).
|
| |
|
|
|
|
|
|
| |
have the same meaning and occupy the same memory address in the trapframe
courtesy of union. Avoid some pointless #ifdef by spelling them both 'DAR'
in the trapframe.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this change is to improve concurrency:
- Drop global state stored in the shadow overflow page table (and all other
global state)
- Remove all global locks
- Use per-PTE lock bits to allow parallel page insertion
- Reconstruct state when requested for evicted PTEs instead of buffering
it during overflow
This drops total wall time for make buildworld on a 32-thread POWER8 system
by a factor of two and system time by a factor of three, providing performance
20% better than similarly clocked Core i7 Xeons per-core. Performance on
smaller SMP systems, where PMAP lock contention was not as much of an issue,
is nearly unchanged.
Tested on: POWER8, POWER5+, G5 UP, G5 SMP (64-bit and 32-bit kernels)
Merged from: user/nwhitehorn/ppc64-pmap-rework
Looked over by: jhibbits, andreast
MFC after: 3 months
Relnotes: yes
Sponsored by: FreeBSD Foundation
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and POWER8. This instruction set unifies the 32 64-bit scalar floating
point registers with the 32 128-bit vector registers into a single bank
of 64 128-bit registers. Kernel support mostly amounts to saving and
restoring the wider version of the floating point registers and making
sure that both scalar FP and vector registers are enabled once a VSX
instruction is executed. get_mcontext() and friends currently cannot
see the high bits, which will require a little more work.
As the system compiler (GCC 4.2) does not support VSX, making use of this
from userland requires either newer GCC or clang.
Relnotes: yes
Sponsored by: FreeBSD Foundation
|