author: andrew <andrew@FreeBSD.org> 2016-09-16 12:39:21 +0000
committer: andrew <andrew@FreeBSD.org> 2016-09-16 12:39:21 +0000
commit: d7bd0f0c8e5a12e8d04d7be4febc8f3d863fcb8d
tree: f23ad757d40869fdb065f8da4ce07c4e03df36e3 /sys/arm64
parent: 6d9a5a5ea4aec39d2e6553956a6faf2c2a89b061
MFC 305545:
Only call cpu_icache_sync_range when inserting an executable page. If the page is non-executable the contents of the i-cache are unimportant so this call is just adding unneeded overhead when inserting pages.

While doing research using gem5 with an O3 pipeline and 1k/32k/1M iTLB/L1 iCache/L2 Bjoern Zeeb (bz@) observed a fairly high rate of calls into arm64_icache_sync_range() from pmap_enter() along with a high number of instruction fetches and iTLB/iCache hits.

Limiting the calls to arm64_icache_sync_range() to only executable pages, we observe the iTLB and iCache Hit going down by about 43%. These numbers are quite misleading when looked at alone as at the same time instructions retired were reduced by 19.2% and instruction fetches were reduced by 38.8%. Overall this reduced the runtime of the test program by 22.4%.

On Juno hardware, in steady-state, running the same test, using the cycle count to determine runtime, we do see a reduction of up to 28.9% in runtime.

While these numbers certainly depend on the program executed, we expect an overall performance improvement.

Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Diffstat (limited to 'sys/arm64')
-rw-r--r-- sys/arm64/arm64/pmap.c | 5
1 files changed, 3 insertions, 2 deletions
diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
index 71acdd2..500aeb3 100644
--- a/sys/arm64/arm64/pmap.c
+++ b/sys/arm64/arm64/pmap.c
@@ -2939,8 +2939,9 @@ validate:
 	pmap_invalidate_page(pmap, va);
 	if (pmap != pmap_kernel()) {
-		if (pmap == &curproc->p_vmspace->vm_pmap)
-			cpu_icache_sync_range(va, PAGE_SIZE);
+		if (pmap == &curproc->p_vmspace->vm_pmap &&
+		    (prot & VM_PROT_EXECUTE) != 0)
+			cpu_icache_sync_range(va, PAGE_SIZE);
 		if ((mpte == NULL || mpte->wire_count == NL3PG) &&
 		    pmap_superpages_enabled() &&
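
The condition introduced by this hunk can be read in isolation as a small predicate. The following is a minimal userland sketch, not kernel code: the function name needs_icache_sync and its boolean parameters are hypothetical stand-ins for the pmap comparisons in pmap_enter(), and the VM_PROT_* bits are redefined locally (with the values used by FreeBSD's vm headers) so the fragment compiles outside the kernel tree.

```c
#include <stdbool.h>

/*
 * Local copies of the protection bits so this sketch builds in userland.
 * In the kernel these come from the VM headers; do not redefine them there.
 */
#define VM_PROT_READ	0x01
#define VM_PROT_WRITE	0x02
#define VM_PROT_EXECUTE	0x04

/*
 * Hypothetical helper mirroring the check after this change:
 *   - is_kernel_pmap  models (pmap == pmap_kernel())
 *   - is_current_pmap models (pmap == &curproc->p_vmspace->vm_pmap)
 * Returns true only when the i-cache sync would still be issued.
 */
bool
needs_icache_sync(bool is_kernel_pmap, bool is_current_pmap, unsigned int prot)
{
	/* The sync lives inside the "pmap != pmap_kernel()" branch. */
	if (is_kernel_pmap)
		return (false);

	/* New in this commit: skip the sync for non-executable mappings. */
	return (is_current_pmap && (prot & VM_PROT_EXECUTE) != 0);
}
```

Under these assumptions, a read/write data page entered into the current process's pmap no longer triggers the sync, while a read/execute page still does; before the change both cases would have called cpu_icache_sync_range().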