locking/pvqspinlock: Queue node adaptive spinning

In an overcommitted guest where some vCPUs have to be halted to make forward progress in other areas, it is highly likely that a vCPU later in the spinlock queue will be spinning while the ones earlier in the queue would have been halted. The spinning in the later vCPUs is then just a waste of precious CPU cycles because they are not going to get the lock soon as the earlier ones have to be woken up and take their turn to get the lock. This patch implements an adaptive spinning mechanism where the vCPU will call pv_wait() if the previous vCPU is not running. Linux kernel builds were run in KVM guest on an 8-socket, 4 cores/socket Westmere-EX system and a 4-socket, 8 cores/socket Haswell-EX system. Both systems are configured to have 32 physical CPUs. The kernel build times before and after the patch were: Westmere Haswell Patch 32 vCPUs 48 vCPUs 32 vCPUs 48 vCPUs ----- -------- -------- -------- -------- Before patch 3m02.3s 5m00.2s 1m43.7s 3m03.5s After patch 3m03.0s 4m37.5s 1m43.0s 2m47.2s For 32 vCPUs, this patch doesn't cause any noticeable change in performance. For 48 vCPUs (over-committed), there is about 8% performance improvement. Signed-off-by: Waiman Long <Waiman.Long@hpe.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Douglas Hatch <doug.hatch@hpe.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Scott J Norton <scott.norton@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1447114167-47185-8-git-send-email-Waiman.Long@hpe.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
author: Waiman Long <Waiman.Long@hpe.com> 2015-11-09 19:09:27 -0500
committer: Ingo Molnar <mingo@kernel.org> 2015-12-04 11:39:51 +0100
commit: cd0272fab785077c121aa91ec2401090965bbc37 (patch)
tree: a83b168dc2b3fa4086e5409398855ac699272d83 /kernel/locking/qspinlock.c
parent: 1c4941fd53afb46ab15826628e4819866d008a28 (diff)
download: op-kernel-dev-cd0272fab785077c121aa91ec2401090965bbc37.zip
op-kernel-dev-cd0272fab785077c121aa91ec2401090965bbc37.tar.gz
1 files changed, 3 insertions, 2 deletions
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 2ea4299..393d187 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -248,7 +248,8 @@ static __always_inline void set_locked(struct qspinlock *lock)
  */
 
 static __always_inline void __pv_init_node(struct mcs_spinlock *node) { }
-static __always_inline void __pv_wait_node(struct mcs_spinlock *node) { }
+static __always_inline void __pv_wait_node(struct mcs_spinlock *node,
+					   struct mcs_spinlock *prev) { }
 static __always_inline void __pv_kick_node(struct qspinlock *lock,
 					   struct mcs_spinlock *node) { }
 static __always_inline u32  __pv_wait_head_or_lock(struct qspinlock *lock,
@@ -407,7 +408,7 @@ queue:
 		prev = decode_tail(old);
 		WRITE_ONCE(prev->next, node);
 
-		pv_wait_node(node);
+		pv_wait_node(node, prev);
 		arch_mcs_spin_lock_contended(&node->locked);
 
 		/*
author	Waiman Long <Waiman.Long@hpe.com>	2015-11-09 19:09:27 -0500
committer	Ingo Molnar <mingo@kernel.org>	2015-12-04 11:39:51 +0100
commit	cd0272fab785077c121aa91ec2401090965bbc37 (patch)
tree	a83b168dc2b3fa4086e5409398855ac699272d83 /kernel/locking/qspinlock.c
parent	1c4941fd53afb46ab15826628e4819866d008a28 (diff)
download	op-kernel-dev-cd0272fab785077c121aa91ec2401090965bbc37.zip op-kernel-dev-cd0272fab785077c121aa91ec2401090965bbc37.tar.gz