path: root/doc/tuning.md
Diffstat (limited to 'doc/tuning.md')
-rw-r--r-- doc/tuning.md 28
1 files changed, 27 insertions, 1 deletions
diff --git a/doc/tuning.md b/doc/tuning.md
index 474553b..53e682b 100644
--- a/doc/tuning.md
+++ b/doc/tuning.md
@@ -1,6 +1,7 @@
# Tuning Guide
## Content Overview
+* [Windows](#windows)
* [NVIDIA Backend](#nvidia-backend)
* [Choose Value for `threads` and `blocks`](#choose-value-for-threads-and-blocks)
* [Add more GPUs](#add-more-gpus)
@@ -8,6 +9,14 @@
* [Choose `intensity` and `worksize`](#choose-intensity-and-worksize)
* [Add more GPUs](#add-more-gpus)
* [Increase Memory Pool](#increase-memory-pool)
+ * [Scratchpad Indexing](#scratchpad-indexing)
+* [CPU Backend](#cpu-backend)
+ * [Choose Value for `low_power_mode`](#choose-value-for-low_power_mode)
+
+## Windows
+On Windows 7, the "Run As Administrator" (UAC) prompt must be confirmed to use large pages.
+On Windows 10, confirmation is needed only once, to set up the account to use them.
+Disable the dialog with the command line option `--noUAC`.
## NVIDIA Backend
@@ -80,4 +89,21 @@ export GPU_MAX_ALLOC_PERCENT=99
export GPU_SINGLE_ALLOC_PERCENT=99
```
-*Note:* Windows user must use `set` instead of `export` to define an environment variable.
\ No newline at end of file
+*Note:* Windows users must use `set` instead of `export` to define an environment variable.
+
+### Scratchpad Indexing
+
+The layout of the hash scratchpad memory can be changed for each GPU with the option `strided_index` in `amd.txt`.
+Try changing the value from the default `true` to `false`.
+
+## CPU Backend
+
+By default, the CPU backend is tuned in the config file `cpu.txt`.
+
+### Choose Value for `low_power_mode`
+
+The optimal value for `low_power_mode` depends on the cache size of your CPU and the number of threads.
+
+`low_power_mode` can be set to a number from `1` to `5`. When set to a value `N` greater than `1`, this mode increases single-thread performance by a factor of `N`, but also requires at least `2*N` MB of cache per thread. It can also be set to `false` or `true`. The value `false` is equivalent to `1`, and `true` is equivalent to `2`.
+
+This setting is particularly useful for CPUs with a very large cache. For example, Intel Crystal Well processors have 128 MB of L4 cache, enough to run 8 threads at a `low_power_mode` value of `5` (8 threads at `2*5` MB each need 80 MB).
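
Reviewer note: the cache rule in the new `low_power_mode` section can be checked with a little arithmetic. The sketch below is not part of the patch or of the miner. The function names `cache_needed_mb` and `max_low_power_mode` are hypothetical, and it assumes the `2*N` MB per thread rule also applies at `N = 1`, which the text does not state explicitly.

```python
def cache_needed_mb(threads: int, low_power_mode: int) -> int:
    """Cache needed in MB, using the guide's rule of at least 2*N MB per thread.

    Assumption: the rule is applied to N = 1 as well.
    """
    if not 1 <= low_power_mode <= 5:
        raise ValueError("low_power_mode must be between 1 and 5")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return threads * 2 * low_power_mode


def max_low_power_mode(cache_mb: float, threads: int) -> int:
    """Largest N in 1..5 whose cache requirement fits in cache_mb, or 0 if none fits."""
    for n in range(5, 0, -1):
        if cache_needed_mb(threads, n) <= cache_mb:
            return n
    return 0


if __name__ == "__main__":
    # The Crystal Well example from the guide: 128 MB L4 cache, 8 threads.
    print(cache_needed_mb(8, 5))
    print(max_low_power_mode(128, 8))
```

This only checks whether the stated minimum cache fits. Whether a given value is actually fastest on a particular CPU still has to be measured.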