| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
This reverts commit 70d1caf0ad967030b2ce835dc0f116ed1733c82c.
|
| |
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
on amd64. [FreeBSD-SA-18:03.speculative_execution]
Approved by: so
Security: FreeBSD-SA-18:03.speculative_execution
Security: CVE-2017-5715
Security: CVE-2017-5754
|
| |
| |
| |
| |
| |
| | |
r328083,328096,328116,328119,328120,328128,328135,328153,328157,""
This reverts commit d3d59b01294138e59995b31d2bcbbbdf45e26a3c.
|
|\ \ |
|
| | |
| | |
| | |
| | |
| | |
| | | |
Replace many instances of VM_WAIT with blocking page allocation flags.
(cherry picked from commit 2069f0080fbdcf49b623bc3c1eda76524a4d1a77)
|
|/ /
| |
| |
| | |
This reverts commit 430a2bea3907149b30cc75fc722b6cf1f81da82a.
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
| |
328166,328177,328199,328202,328205,328468,328470,328624,328625,328627,
328628,329214,329297,329365:
Meltdown mitigation by PTI, PCID optimization of PTI, and kernel use of IBRS
for some mitigations of Spectre.
Tested by: emaste, Arshan Khanifar <arshankhanifar@gmail.com>
Discussed with: jkim
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 6dd025b40ee6870bea6ba670f30dcf684edc3f6c)
|
|
|
|
|
|
|
|
|
|
| |
Refine the fix from r312954. Specifically, add a new PDE-only flag,
PG_PROMOTED, that indicates whether lingering 4KB page mappings might
need to be flushed on a PDE change that restricts or destroys a 2MB
page mapping. This flag allows the pmap to avoid range invalidations
that are both unnecessary and costly.
Approved by: re (kib)
|
|
|
|
|
| |
r308489, r308706:
Add PQ_LAUNDRY and remove PG_CACHED pages.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Correct page frame mask constant used in pmap_change_attr_locked
This was introduced in r290156. It's present in 11.0, but not any 10.x
release unless someone decided to MFC it.
It affects ordinary pages right above the DMAP limit, which is effectively
system memory rounded up to a 1 GB (3rd level superpage) boundary (or up to
a minimum of 4 GB, on small systems).
Sponsored by: Dell EMC
|
|
|
|
|
|
| |
sys: Replace zero with NULL for pointers.
Found with: devel/coccinelle
|
|
|
|
|
| |
In pmap_enter(), set the PG_MANAGED flag on the new PTE in one place,
rather two places, and do so before the pmap lock is acquired.
|
|
|
|
|
| |
Microoptimize pmap_protect_pde() on amd64, i386 and
pmap_protect_pte1() on armv6.
|
|
|
|
|
| |
Do not leave stale 4K TLB entries on pde (superpage) removal or
protection change.
|
|
|
|
| |
Use SFENCE for ordering CLFLUSHOPT.
|
|
|
|
| |
Coalesce TLB shootdowns of global PTEs in pmap_advise() on x86.
|
|
|
|
|
|
| |
For machines which support PCID but not have INVPCID instruction,
i.e. SandyBridge and IvyBridge, correct a race between pmap_activate()
and invltlb_pcid_handler().
|
|
|
|
|
|
|
|
|
|
|
|
| |
As an optimization to the machine-independent layer, change the machine-
dependent pmap_ts_referenced() so that it updates the page's dirty field
if a modified bit is found while counting reference bits. This
opportunistic update can be performed at low cost and can eliminate the
need for some future calls to pmap_is_modified() by the machine-
independent layer.
Replace the number 4 in sparc64's pmap_ts_referenced() by
PMAP_TS_REFERENCED_MAX, like we've done elsewhere, e.g., amd64.
|
|
|
|
|
| |
Export the pmap_cache_bits() and pmap_pinit_pml4() functions from the
amd64 pmap.
|
|
|
|
| |
Move pmap_p*e_index() inline functions from pmap.c to pmap.h.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add explicit detection of KVM hypervisor
Set vm_guest to a new enum value (VM_GUEST_KVM) when kvm is detected and use
vm_guest in conditionals testing for KVM.
Also, fix a conditional checking if we're running in a VM which caught only
the generic VM case, but not more specific VMs (KVM, VMWare, etc.). (Spotted
by: vangyzen).
Sponsored by: Dell Inc.
Approved by: vangyzen (mentor)
|
|
|
|
|
| |
The pmap_delayed_invl_wait() function blocks on turnstile, it does not
spin, in the committed version. Remove stray '*' in the text.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
does not cover the dynamically registered ficititious ranges, and
fictitious pages mappings are not promoted. Offer a dummy struct
md_page to fetch constant superpage pv list generation to satisfy
logic. Also, by initializing the pv_dummy pv_list to empty, we can
remove several explicit PG_FICTITIOUS tests.
Reported and tested by: Michael Butler <imb@protected-networks.net>
(previous version)
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D6728
Approved by: re (hrs)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not try to change attributes for DMAP when working on a mapping
which is not covered by the DMAP. This was reported on real system
where a BAR of a device (NTB) was mapped outside the PCI window.
Reported and tested by: mav
Reviewed by: jhb, mav
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D6668
|
|
|
|
|
|
|
| |
emulation. Assert that syscalls do not leak DI.
Reported by: gjb
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
|
|
|
|
|
| |
http://docs.freebsd.org/cgi/mid.cgi?552BFEB2.8040407
Re-implement it entirely in inline assembly not to let compilers do silly
spilling to memory. For non-POPCNT case, use newly added bit_count(3).
Reported by: alc
Reviewed by: alc, kib
Differential Revision: https://reviews.freebsd.org/D6541
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only current purpose of the pvh lock was explained there
On Wed, Jan 09, 2013 at 11:46:13PM -0600, Alan Cox wrote:
> Let me lay out one example for you in detail. Suppose that we have
> three processors and two of these processors are actively using the same
> pmap. Now, one of the two processors sharing the pmap performs a
> pmap_remove(). Suppose that one of the removed mappings is to a
> physical page P. Moreover, suppose that the other processor sharing
> that pmap has this mapping cached with write access in its TLB. Here's
> where the trouble might begin. As you might expect, the processor
> performing the pmap_remove() will acquire the fine-grained lock on the
> PV list for page P before destroying the mapping to page P. Moreover,
> this processor will ensure that the vm_page's dirty field is updated
> before releasing that PV list lock. However, the TLB shootdown for this
> mapping may not be initiated until after the PV list lock is released.
> The processor performing the pmap_remove() is not problematic, because
> the code being executed by that processor won't presume that the mapping
> is destroyed until the TLB shootdown has completed and pmap_remove() has
> returned. However, the other processor sharing the pmap could be
> problematic. Specifically, suppose that the third processor is
> executing the page daemon and concurrently trying to reclaim page P.
> This processor performs a pmap_remove_all() on page P in preparation for
> reclaiming the page. At this instant, the PV list for page P may
> already be empty but our second processor still has a stale TLB entry
> mapping page P. So, changes might still occur to the page after the
> page daemon believes that all mappings have been destroyed. (If the PV
> entry had still existed, then the pmap lock would have ensured that the
> TLB shootdown completed before the pmap_remove_all() finished.) Note,
> however, the page daemon will know that the page is dirty. It can't
> possibly mistake a dirty page for a clean one. However, without the
> current pvh global locking, I don't think anything is stopping the page
> daemon from starting the laundering process before the TLB shootdown has
> completed.
>
> I believe that a similar example could be constructed with a clean page
> P' and a stale read-only TLB entry. In this case, the page P' could be
> "cached" in the cache/free queues and recycled before the stale TLB
> entry is flushed.
TLBs for addresses with updated PTEs are always flushed before pmap
lock is unlocked. On the other hand, amd64 pmap code does not always
flushes TLBs before PV list locks are unlocked, if previously PTEs
were cleared and PV entries removed.
To handle the situations where a thread might notice empty PV list but
third thread still having access to the page due to TLB invalidation
not finished yet, introduce delayed invalidation. Comparing with the
pvh_global_lock, DI does not block entered thread when
pmap_remove_all() or pmap_remove_write() (callers of
pmap_delayed_invl_wait()) are executed in parallel. But _invl_wait()
callers are blocked until all previously noted DI blocks are leaved,
thus ensuring that neccessary TLB invalidations were performed before
returning from pmap_remove_all() or pmap_remove_write().
See comments for detailed description of the mechanism, and also for
the explanations why several pmap methods, most important
pmap_enter(), do not need DI protection.
Reviewed by: alc, jhb (turnstile KPI usage)
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D5747
|
|
|
|
|
|
| |
used to implement PCID support on amd64.
Reviewed by: kib
|
|
|
|
|
|
|
|
|
| |
call pmap_invalidate_page() even though they are not destroying a leaf-
level page table entry.
Eliminate some bogus white-space characters in a comment.
Reviewed by: kib
|
|
|
|
|
|
| |
Use param.h howmany() instead of hand-rolled version.
Sponsored by: EMC / Isilon Storage Division
|
| |
|
|
|
|
|
|
|
|
|
|
| |
rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.
This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Typical TLBs have 40-512 entries available. At some point, iterating
every single page in a requested invalidation range and issuing invlpg
on it is more expensive than flushing the TLB and allowing it to reload
on demand.
Broadwell CPUs have 1536 L2 TLB entries, so I've picked the arbitrary
number 4096 entries as a hueristic at which point we flush TLB rather
than invalidating every single potential page.
Reviewed by: alc
Feedback from: jhb, kib
MFC notes: Depends on r291688
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D4280
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the PG_G global pte flag, pmap_invalidate_all() fails to flush global
TLB entries [*]. This is because TLB shootdown handler for such
configs reloads CR3, and on i386 pmap_invalidate_all() does the same
for the initiating CPU. Note that current code does not issue total
invalidation requests for the kernel_pmap.
Rename amd64 function invltlb_globpcid() to invltlb_glob(), it is not
specific for PCID for quite some time, and implement the same
functionality for i386. Use the function instead of invltlb() in
shootdown handlers and in i386 pmap_invalidate_all(), but only for the
kernel pmap (which maps pages with the PG_G attribute set), which
takes care of PG_G TLB entries on flush.
To detect the affected pmap in i386 TLB shootdown handler, pmap should
be passed to the smp_masked_invltlb() function, which makes amd64 and
i386 TLB shootdown code almost identical. Merge the code under x86/.
Noted by: jhb [*]
Reviewed by: cem, jhb, pho
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D4346
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pmap_change_attr must change the memory type of both the requested KVA
and the corresponding DMAP mappings (if such mappings exist), to satisfy
an Intel requirement that two or more mappings to the same physical
pages must have the same memory type.
However, not all kernel mapped pages have corresponding DMAP mappings --
for example, 64-bit BARs. Skip fixing up the DMAP for out-of-bounds
addresses.
Submitted by: Steve Wahl <steve_wahl@dell.com>
Reviewed by: alc, jhb
Sponsored by: Dell Compellent
Differential Revision: https://reviews.freebsd.org/D4030
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ordered with the MFENCE instruction. Similar weak guarantees are also
specified by the AMD APM vol. 3 rev. 3.22. x86 pmap methods
pmap_invalidate_cache_range() and pmap_invalidate_cache_pages() braced
CLFLUSH loop with MFENCE both before and after the loop.
In the revision 56 of SDM, Intel stated that all existing
implementations of CLFLUSH are strict, CLFLUSH instructions execution
is ordered WRT other CLFLUSH and writes. Also, the strict behaviour
is made architectural.
A new instruction CLFLUSHOPT (which was documented for some time in
the Instruction Set Extensions Programming Reference) provides the
weak behaviour which was previously attributed to CLFLUSH.
Use CLFLUSHOPT when available. When CLFLUSH is used on Intel CPUs, do
not execute MFENCE before and after the flushing loop.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
|
|
|
|
|
| |
belong to a vm object, they can't be paged out. Since they can't be paged
out, they are never enqueued in a paging queue. Nonetheless, passing
PQ_INACTIVE to vm_page_unwire() creates the appearance that these pages
are being enqueued in the inactive queue. As of r288122, we can avoid
this false impression by passing PQ_NONE.
Submitted by: kmacy (an earlier version)
Differential Revision: https://reviews.freebsd.org/D1674
|
|
|
|
|
|
|
|
|
|
| |
is a little tight in and by itself, but severily insufficient
when one needs to map a large frame buffer as part of console
initialization. 64MB slop should be enough for a while. As an
example: a 15" MacBook Pro with retina display needs ~28MB of
KVA for the frame buffer.
PR: 193745
|
|
|
|
|
|
|
|
|
|
|
| |
map. Handle busdma bouncing and ata PIO accesses by using global
frame used by the current CPU locally for the duration of
pmap_quick_enter/remove_page(). A spin mutex protects the concurent
frame use and prevents thread migration.
Noted by: royger
Reviewed by: alc, jah, royger (previous version)
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
frame buffers and memory mapped UARTs.
1. Delay calling cninit() until after pmap_bootstrap(). This makes
sure we have PMAP initialized enough to add translations. Keep
kdb_init() after cninit() so that we have console when we need
to break into the debugger on boot.
2. Unfortunately, the ATPIC code had be moved as well so as to
avoid a spurious trap #30. The reason for which is not known
at this time.
3. In pmap_mapdev_attr(), when we need to map a device prior to the
VM system being initialized, use virtual_avail as the KVA to map
the device at. In particular, avoid using the direct map on amd64
because we can't demote by virtue of not being able to allocate
yet. Keep track of the translation.
Re-use the translation after the VM has been initialized to not
waste KVA and to satisfy the assumption in uart(4) that the handle
returned for the low-level console is the same as later returned
when the device is probed and attached.
4. In pmap_unmapdev() remove the mapping from the table when called
pre-init. Otherwise keep the mapping. During bus probe and attach
device resources are mapped and unmapped multiple times, which
would have us destroy the mapping used by the low-level console.
5. In pmap_init(), set pmap_initialized to signal that we're not
pre-init anymore. On amd64, bring the direct map in sync with the
translations created at that time.
6. Implement bus_space_map() and bus_space_unmap() for real: when
the tag corresponds to memory space, call the corresponding
pmap_mapdev() and pmap_unmapdev() functions to construct and
actual handle.
7. In efifb.c and vt_vga.c, remove the crutches and hacks and simply
call pmap_mapdev_attr() or bus_space_map() as desired.
Notes:
1. uart(4) already used bus_space_map() during low-level console
setup but since serial ports have traditionally been I/O port
based, the lack of a proper implementation for said function
was not a problem. It has always supported memory mapped UARTs
for low-level consoles by setting hw.uart.console accordingly.
2. The use of the direct map on amd64 without setting caching
attributes has been a bigger problem than previously thought.
This change has the fortunate (and unexpected) side-effect of
fixing various EFI frame buffer problems (though not all).
PR: 191564, 194952
Special thanks to:
1. XipLink, Inc -- generously donated an Intel Bay Trail E3800
based eval board (ADLE3800PC).
2. The FreeBSD Foundation, in particular emaste@ -- for UEFI
support in general and testing.
3. Everyone who tested the proposed for PR 191564.
4. jhb@ and kib@ for being a soundboard and applying a clue bat
if so needed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vm_offset_t pmap_quick_enter_page(vm_page_t m)
void pmap_quick_remove_page(vm_offset_t kva)
These will create and destroy a temporary, CPU-local KVA mapping of a specified page.
Guarantees:
--Will not sleep and will not fail.
--Safe to call under a non-sleepable lock or from an ithread
Restrictions:
--Not guaranteed to be safe to call from an interrupt filter or under a spin mutex on all platforms
--Current implementation does not guarantee more than one page of mapping space across all platforms. MI code should not make nested calls to pmap_quick_enter_page.
--MI code should not perform locking while holding onto a mapping created by pmap_quick_enter_page
The idea is to use this in busdma, for bounce buffer copies as well as virtually-indexed cache maintenance on mips and arm.
NOTE: the non-i386, non-amd64 implementations of these functions still need review and testing.
Reviewed by: kib
Approved by: kib (mentor)
Differential Revision: http://reviews.freebsd.org/D3013
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
buildkernel run.
Some of them were write-only under some kernel options, e.g. variables
keeping values only used by CTR() macros. It costs nothing to the
code readability and correctness to eliminate the warnings in those
cases too by removing the local cached values used only for
single-access.
Review: https://reviews.freebsd.org/D2665
Reviewed by: rodrigc
Looked at by: bjk
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
|
|
|
|
|
| |
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
|
|
|
|
|
|
|
|
|
|
|
|
| |
particular, switch to the proc0 pmap to have expected %cr3 and PCID
for the thread0 during initialization, and the up to date pm_active
mask.
pmap_pinit0() should be done after proc0->p_vmspace is assigned so
that the amd64 pmap_activate() find the correct curproc pmap.
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
|
|
|
|
|
|
|
| |
Requested by: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
|