[PATCH] dm: Fix deadlock under high i/o load in raid1 setup.

On an nForce4-equipped machine with two SATA disks in a raid1 setup using dmraid, we experienced frequent deadlocks of the system under high i/o load. 'cat /dev/zero > ~/zero' was the most reliable way to reproduce them: randomly after a few GB, 'cat' would be left in 'D' state along with kjournald and kmirrord. The functions in which cat and kjournald were blocked varied, but kmirrord's wchan always pointed to 'mempool_alloc()'. We've seen this pattern on 2.6.15 and 2.6.17 kernels. http://lkml.org/lkml/2005/4/20/142 indicates that this problem has been around even longer.

So much for the facts, here's my interpretation: mempool_alloc() first tries to atomically allocate the requested memory, or falls back to handing out preallocated chunks from the mempool. If both fail, it puts the calling process (kmirrord in this case) on a private waitqueue until somebody refills the pool. But the only 'somebody' is kmirrord itself, so we have a deadlock.

I worked around this problem by falling back to a (blocking) kmalloc in the case where kmirrord would previously have ended up on the waitqueue. This defeats part of the benefit of using the mempool, but at least keeps the system running. And it could be done with a two-line change. Note that mempool_alloc() clears the GFP_NOIO flag internally, and only uses it to decide whether to wait or return an error if immediate allocation fails, so the attached patch doesn't change behaviour in the non-deadlocking case.

Patch is against current git (2.6.18-rc4), but should apply to earlier versions as well. I've tested on 2.6.15, where this patch makes the difference between random lockups and a stable system.

Signed-off-by: Daniel Kobras <kobras@linux.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
author: Daniel Kobras <kobras@linux.de> 2006-08-27 01:23:24 -0700
committer: Linus Torvalds <torvalds@g5.osdl.org> 2006-08-27 11:01:28 -0700
commit: c06aad854fdb9da38fcc22dccfe9d72919453e43 (patch)
tree: a27fc99fe974cc5df08393c5b16b4499b07aa3e5
parent: 9a654518e1b774b8e8f74a819fd12a931e7672c9 (diff)
1 files changed, 3 insertions, 1 deletions
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index be48ced..c54de98 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -255,7 +255,9 @@ static struct region *__rh_alloc(struct region_hash *rh, region_t region)
 	struct region *reg, *nreg;
 
 	read_unlock(&rh->hash_lock);
-	nreg = mempool_alloc(rh->region_pool, GFP_NOIO);
+	nreg = mempool_alloc(rh->region_pool, GFP_ATOMIC);
+	if (unlikely(!nreg))
+		nreg = kmalloc(sizeof(struct region), GFP_NOIO);
 	nreg->state = rh->log->type->in_sync(rh->log, region, 1) ?
 		RH_CLEAN : RH_NOSYNC;
 	nreg->rh = rh;
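
The allocation pattern in the hunk above (try the reserve without waiting, then fall back to a general allocator) can be illustrated with a minimal userspace sketch. This is not the kernel code: `region_pool`, `pool_try_alloc()`, `region_alloc()`, `region_free()`, `POOL_SIZE` and the `from_pool` flag are hypothetical names invented for illustration, and `malloc()` merely stands in for the blocking `kmalloc(..., GFP_NOIO)` call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SIZE 2	/* hypothetical reserve size, for illustration only */

struct region {
	int state;
	bool from_pool;	/* sketch-only: remembers which allocator was used */
};

struct region_pool {
	struct region slots[POOL_SIZE];
	bool in_use[POOL_SIZE];
};

/*
 * Non-blocking attempt: returns NULL instead of waiting when the reserve
 * is exhausted. This plays the role of mempool_alloc(pool, GFP_ATOMIC).
 */
static struct region *pool_try_alloc(struct region_pool *pool)
{
	assert(pool != NULL);
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!pool->in_use[i]) {
			pool->in_use[i] = true;
			pool->slots[i].from_pool = true;
			return &pool->slots[i];
		}
	}
	return NULL;
}

/*
 * Try the reserve first; if it is empty, fall back to the general
 * allocator rather than sleeping until someone refills the reserve.
 */
static struct region *region_alloc(struct region_pool *pool)
{
	struct region *reg = pool_try_alloc(pool);

	if (!reg) {
		reg = malloc(sizeof(*reg));	/* stand-in for kmalloc() */
		if (reg)
			reg->from_pool = false;
	}
	return reg;
}

/* Return the element to whichever allocator handed it out. */
static void region_free(struct region_pool *pool, struct region *reg)
{
	if (!reg)
		return;
	if (reg->from_pool)
		pool->in_use[reg - pool->slots] = false;
	else
		free(reg);
}
```

With a reserve of two elements, the first two calls to `region_alloc()` are served from the pool and the third falls through to `malloc()`; no caller ever waits on the pool being refilled. Unlike the patch, the sketch checks the fallback allocation for failure before touching the returned memory.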