summaryrefslogtreecommitdiffstats
path: root/include
diff options
context:
space:
mode:
authorEric Dumazet <dada1@cosmosbay.com>2007-05-08 00:32:57 -0700
committerLinus Torvalds <torvalds@woody.linux-foundation.org>2007-05-08 11:15:17 -0700
commit5517d86bea237c1d7078840182d9ebc0fe4c1afc (patch)
tree67f1999895313878bfa904c66dffb7066f3c8d91 /include
parent46cb4b7c88fa5517f64b5bee42939ea3614cddcb (diff)
downloadop-kernel-dev-5517d86bea237c1d7078840182d9ebc0fe4c1afc.zip
op-kernel-dev-5517d86bea237c1d7078840182d9ebc0fe4c1afc.tar.gz
Speed up divides by cpu_power in scheduler
I noticed expensive divides done in try_to_wakeup() and find_busiest_group() on a bi dual core Opteron machine (total of 4 cores), moderatly loaded (15.000 context switch per second) oprofile numbers : CPU: AMD64 processors, speed 2600.05 MHz (estimated) Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 50000 samples % symbol name ... 613914 1.0498 try_to_wake_up 834 0.0013 :ffffffff80227ae1: div %rcx 77513 0.1191 :ffffffff80227ae4: mov %rax,%r11 608893 1.0413 find_busiest_group 1841 0.0031 :ffffffff802260bf: div %rdi 140109 0.2394 :ffffffff802260c2: test %sil,%sil Some of these divides can use the reciprocal divides we introduced some time ago (currently used in slab AFAIK) We can assume a load will fit in a 32bits number, because with a SCHED_LOAD_SCALE=128 value, its still a theorical limit of 33554432 When/if we reach this limit one day, probably cpus will have a fast hardware divide and we can zap the reciprocal divide trick. Ingo suggested to rename cpu_power to __cpu_power to make clear it should not be modified without changing its reciprocal value too. I did not convert the divide in cpu_avg_load_per_task(), because tracking nr_running changes may be not worth it ? We could use a static table of 32 reciprocal values but it would add a conditional branch and table lookup. [akpm@linux-foundation.org: !SMP build fix] Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'include')
-rw-r--r--include/linux/sched.h8
1 files changed, 7 insertions, 1 deletions
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 15ab3e0..3d95c48 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -680,8 +680,14 @@ struct sched_group {
/*
* CPU power of this group, SCHED_LOAD_SCALE being max power for a
* single CPU. This is read only (except for setup, hotplug CPU).
+ * Note : Never change cpu_power without recompute its reciprocal
*/
- unsigned long cpu_power;
+ unsigned int __cpu_power;
+ /*
+ * reciprocal value of cpu_power to avoid expensive divides
+ * (see include/linux/reciprocal_div.h)
+ */
+ u32 reciprocal_cpu_power;
};
struct sched_domain {
OpenPOWER on IntegriCloud