author | marius <marius@FreeBSD.org> | 2011-10-01 00:11:03 +0000 |
---|---|---|
committer | marius <marius@FreeBSD.org> | 2011-10-01 00:11:03 +0000 |
commit | 3c6a68b66d050e0c7b507e89beaf57b5df7636a7 (patch) | |
tree | 72cce6cbedec749f35c8f5e3da23b6d0efb8f6af /sys/sparc64 | |
parent | c4a8a741fd80ee26b85664d65c614df7abeb15cd (diff) | |
In total store order, which we use for running the kernel and all of the
userland, atomic operations behave as if they were followed by a memory
barrier, so there's no need to include one in the acquire variants of atomic(9).
Removing these barriers results in a small performance improvement; specifically,
this is sufficient to compensate for the performance loss seen in the worldstone
benchmark when using SCHED_ULE instead of SCHED_4BSD.
This change is inspired by Linux, which did the equivalent thing, even more
radically, some time ago.
Thanks go to Peter Jeremy for additional testing.
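For readers unfamiliar with the acquire and release variants discussed above, the following is a minimal userland sketch of the pairing, written with C11 <stdatomic.h> rather than the kernel's atomic(9) macros. The demo_lock_t type and the demo_trylock()/demo_unlock() functions are hypothetical names made up for illustration; they are not part of this commit or of FreeBSD.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock word for illustration: 0 = free, 1 = held. */
typedef struct {
	atomic_uint_fast32_t word;
} demo_lock_t;

/*
 * Acquire side: a compare-and-swap with acquire ordering, so accesses
 * after it in program order cannot be moved before it.  Returns 1 if
 * the lock was taken, 0 if it was already held.
 */
static int
demo_trylock(demo_lock_t *l)
{
	uint_fast32_t expected = 0;

	return (atomic_compare_exchange_strong_explicit(&l->word, &expected,
	    1, memory_order_acquire, memory_order_relaxed));
}

/*
 * Release side: a store with release ordering, so accesses before it
 * in program order cannot be moved after it.
 */
static void
demo_unlock(demo_lock_t *l)
{
	atomic_store_explicit(&l->word, 0, memory_order_release);
}
```

On a machine running in total store order the acquire side needs no extra barrier instruction after the compare-and-swap, which is the observation this commit relies on; the release side keeps its barrier.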
Diffstat (limited to 'sys/sparc64')
-rw-r--r-- | sys/sparc64/include/atomic.h | 17 |
1 files changed, 9 insertions, 8 deletions
diff --git a/sys/sparc64/include/atomic.h b/sys/sparc64/include/atomic.h
index d663fbc..cd15d2d 100644
--- a/sys/sparc64/include/atomic.h
+++ b/sys/sparc64/include/atomic.h
@@ -74,12 +74,16 @@
  *
  * the return value of cas is used to avoid the extra reload.
  *
- * The memory barriers provided by the acq and rel variants are intended
- * to be sufficient for use of relaxed memory ordering. Due to the
- * suggested assembly syntax of the membar operands containing a #
- * character, they cannot be used in macros. The cmask and mmask bits
+ * We only include a memory barrier in the rel variants as in total store
+ * order which we use for running the kernel and all of the userland atomic
+ * loads and stores behave as if the were followed by a membar with a mask
+ * of #LoadLoad | #LoadStore | #StoreStore. In order to be also sufficient
+ * for use of relaxed memory ordering, the atomic_cas() in the acq variants
+ * additionally would have to be followed by a membar #LoadLoad | #LoadStore.
+ * Due to the suggested assembly syntax of the membar operands containing a
+ * # character, they cannot be used in macros. The cmask and mmask bits thus
  * are hard coded in machine/cpufunc.h and used here through macros.
- * Hopefully sun will choose not to change the bit numbers.
+ * Hopefully the bit numbers won't change in the future.
  */
 
 #define	itype(sz)	uint ## sz ## _t
@@ -93,7 +97,6 @@
 #define	atomic_cas_acq(p, e, s, sz) ({					\
 	itype(sz) v;							\
 	v = atomic_cas(p, e, s, sz);					\
-	membar(LoadLoad | LoadStore);					\
 	v;								\
 })
 
@@ -118,7 +121,6 @@
 #define	atomic_op_acq(p, op, v, sz) ({					\
 	itype(sz) t;							\
 	t = atomic_op(p, op, v, sz);					\
-	membar(LoadLoad | LoadStore);					\
 	t;								\
 })
 
@@ -135,7 +137,6 @@
 #define	atomic_load_acq(p, sz) ({					\
 	itype(sz) v;							\
 	v = atomic_load(p, sz);						\
-	membar(LoadLoad | LoadStore);					\
 	v;								\
 })
 
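The acq macros touched by this diff share one shape: a token-pasted integer type, a GCC statement expression, and (before this change) a trailing barrier. The sketch below reproduces that shape in portable userland C, assuming a compiler with GCC/Clang extensions. The demo_ names are hypothetical, and the __atomic builtins stand in for the sparc64 load and membar primitives, which are not reproduced here; an acquire fence is used as the rough equivalent of membar #LoadLoad | #LoadStore.

```c
#include <stdint.h>

/* Same token-pasting trick as itype(sz) in the header. */
#define	demo_itype(sz)	uint ## sz ## _t

/* Hypothetical stand-in for atomic_load(): a relaxed load via a builtin. */
#define	demo_load(p, sz)						\
	((demo_itype(sz))__atomic_load_n((p), __ATOMIC_RELAXED))

/*
 * Analogue of the pre-change atomic_load_acq(): a plain load followed
 * by an explicit acquire fence.  The statement expression evaluates to
 * the loaded value v.  Under total store order the fence is redundant,
 * which is why the commit drops the corresponding membar.
 */
#define	demo_load_acq(p, sz) ({						\
	demo_itype(sz) v;						\
	v = demo_load(p, sz);						\
	__atomic_thread_fence(__ATOMIC_ACQUIRE);			\
	v;								\
})
```

Passing the size as a macro argument, as in demo_load_acq(&x, 32), mirrors how the header generates the 32-bit and 64-bit variants from a single definition.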