author: vangyzen <vangyzen@FreeBSD.org> 2015-10-26 16:21:56 +0000
committer: vangyzen <vangyzen@FreeBSD.org> 2015-10-26 16:21:56 +0000
commit: e22b77611897ae7d2d5eb4de12f21d2a5fc1ccb6
tree: eddbd2bd9d74250a0f1379938c2e3424071536d9 /libexec
parent: 582d35681a4d2ceee539e894e0183d3b4552aecf
Disable SSE in libthr
Clang emits SSE instructions on amd64 in the common path of pthread_mutex_unlock. If the thread does not otherwise use SSE, this usage incurs a context-switch of the FPU/SSE state, which reduces the performance of multiple real-world applications by a non-trivial amount (3-5% in one application).

Instead of this change, I experimented with eagerly switching the FPU state at context-switch time. This did not help. Most of the cost seems to be in the read/write of memory--as kib@ stated--and not in the #NM handling. I tested on machines with and without XSAVEOPT.

One counter-argument to this change is that most applications already use SIMD, and the number of applications and amount of SIMD usage are only increasing. This is absolutely true. I agree that--in general and in principle--this change is in the wrong direction. However, there are applications that do not use enough SSE to offset the extra context-switch cost. SSE does not provide a clear benefit in the current libthr code with the current compiler, but it does provide a clear loss in some cases. Therefore, disabling SSE in libthr is a non-loss for most, and a gain for some.

I refrained from disabling SSE in libc--as was suggested--because I can't make the above argument for libc. It provides a wide variety of code; each case should be analyzed separately.

https://lists.freebsd.org/pipermail/freebsd-current/2015-March/055193.html

Suggestions from: dim, jmg, rpaulo
Sponsored by: Dell Inc.
Diffstat (limited to 'libexec')
-rw-r--r--  libexec/rtld-elf/amd64/Makefile.inc  2
-rw-r--r--  libexec/rtld-elf/i386/Makefile.inc   2
2 files changed, 2 insertions, 2 deletions
diff --git a/libexec/rtld-elf/amd64/Makefile.inc b/libexec/rtld-elf/amd64/Makefile.inc
index 7528dbe..a09db6f 100644
--- a/libexec/rtld-elf/amd64/Makefile.inc
+++ b/libexec/rtld-elf/amd64/Makefile.inc
@@ -1,6 +1,6 @@
# $FreeBSD$
-CFLAGS+= -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -msoft-float
+CFLAGS+= ${CFLAGS_NO_SIMD} -msoft-float
# Uncomment this to build the dynamic linker as an executable instead
# of a shared library:
#LDSCRIPT= ${.CURDIR}/${MACHINE_CPUARCH}/elf_rtld.x
diff --git a/libexec/rtld-elf/i386/Makefile.inc b/libexec/rtld-elf/i386/Makefile.inc
index 7528dbe..a09db6f 100644
--- a/libexec/rtld-elf/i386/Makefile.inc
+++ b/libexec/rtld-elf/i386/Makefile.inc
@@ -1,6 +1,6 @@
# $FreeBSD$
-CFLAGS+= -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -msoft-float
+CFLAGS+= ${CFLAGS_NO_SIMD} -msoft-float
# Uncomment this to build the dynamic linker as an executable instead
# of a shared library:
#LDSCRIPT= ${.CURDIR}/${MACHINE_CPUARCH}/elf_rtld.x
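
Both hunks swap the hard-coded flag list for `${CFLAGS_NO_SIMD}`, but the variable's definition lies outside this libexec-limited view. As a rough sketch only, assuming the variable simply carries the flags removed above and is set in a shared make include that both rtld-elf and libthr can read (the placement and exact flag list are assumptions, not taken from this diff), it could look like:

```make
# Hypothetical sketch: the real definition is not shown in this diff.
# Assumes a shared include read by both rtld-elf and libthr, and that
# the variable holds the same flags the hunks above removed.
.if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386"
CFLAGS_NO_SIMD= -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3
.endif
```

Keeping the list in one variable means a future change to the set of disabled instruction sets would touch a single definition instead of every Makefile.inc that needs SIMD-free code.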