From 841c1316c7da6199a7df473893c141943991a756 Mon Sep 17 00:00:00 2001
From: Ming Lei
Date: Fri, 17 Mar 2017 00:12:31 +0800
Subject: md: raid1: improve write behind

This patch improves handling of write behind in the following ways:

- introduce behind master bio to hold all write behind pages
- fast clone bios from behind master bio
- avoid changing the bvec table directly
- use bio_copy_data() and make the code cleaner

Suggested-by: Shaohua Li
Signed-off-by: Ming Lei
Signed-off-by: Shaohua Li
---
 drivers/md/raid1.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

(limited to 'drivers/md/raid1.h')

diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h
index dd22a37..4271cd7 100644
--- a/drivers/md/raid1.h
+++ b/drivers/md/raid1.h
@@ -153,9 +153,13 @@ struct r1bio {
 	int read_disk;
 
 	struct list_head retry_list;
-	/* Next two are only valid when R1BIO_BehindIO is set */
-	struct bio_vec *behind_bvecs;
-	int behind_page_count;
+
+	/*
+	 * When R1BIO_BehindIO is set, we store pages for write behind
+	 * in behind_master_bio.
+	 */
+	struct bio *behind_master_bio;
+
 	/*
 	 * if the IO is in WRITE direction, then multiple bios are used.
 	 * We choose the number when they are allocated.
-- 
cgit v1.1


From c230e7e53526c223a3e1caf40747d6e37c0e4394 Mon Sep 17 00:00:00 2001
From: NeilBrown
Date: Wed, 5 Apr 2017 14:05:50 +1000
Subject: md/raid1: simplify the splitting of requests.

raid1 currently splits requests in two different ways for two different
reasons.

First, bio_split() is used to ensure the bio fits within a resync
accounting region.
Second, multiple r1bios are allocated for each bio to handle the
possibility of known bad blocks on some devices.

This can be simplified to just use bio_split() once, and not use
multiple r1bios.
We delay the split until we know a maximum bio size that can
be handled with a single r1bio, and then split the bio and queue
the remainder for later handling.

This avoids all loops inside raid1.c request handling.
Just a single read, or a single set of writes, is submitted to
lower-level devices for each bio that comes from
generic_make_request().

When the bio needs to be split, generic_make_request() will do the
necessary looping and call md_make_request() multiple times.

raid1_make_request() no longer queues requests for raid1 to handle,
so we can remove that branch from the 'if'.

This patch also creates a new private bio_set
(conf->bio_split) for splitting bios. Using fs_bio_set
is wrong, as it is meant to be used by filesystems, not
block devices. Using it inside md can lead to deadlocks
under high memory pressure.

Delete unused variable in raid1_write_request() (Shaohua)

Signed-off-by: NeilBrown
Signed-off-by: Shaohua Li
---
 drivers/md/raid1.h | 2 ++
 1 file changed, 2 insertions(+)

(limited to 'drivers/md/raid1.h')

diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h
index 4271cd7..b0ab0da 100644
--- a/drivers/md/raid1.h
+++ b/drivers/md/raid1.h
@@ -107,6 +107,8 @@ struct r1conf {
 	mempool_t *r1bio_pool;
 	mempool_t *r1buf_pool;
 
+	struct bio_set *bio_split;
+
 	/* temporary buffer to synchronous IO when attempting to repair
 	 * a read error.
 	 */
-- 
cgit v1.1


From 43ac9b84a399bc10210a2d9f4e0778b7c6059c07 Mon Sep 17 00:00:00 2001
From: Xiao Ni
Date: Thu, 27 Apr 2017 16:28:49 +0800
Subject: md/raid1: Use a new variable to count flighting sync requests

In the new barrier code, raise_barrier waits if conf->nr_pending[idx] is
not zero. Once all the conditions are met, the resync request can go on
to be handled, but it increments conf->nr_pending[idx] again. The next
resync request that hits the same bucket idx then has to wait for the
resync request submitted before it, so the performance of
resync/recovery is degraded.
So we should use a new variable to count sync requests which are in flight.

I did a simple test:
1. Without the patch, create a raid1 with two disks.
The resync speed:
Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sdb          0.00     0.00   166.00     0.00    10.38     0.00   128.00     0.03     0.20     0.20     0.00     0.19     3.20
sdc          0.00     0.00     0.00   166.00     0.00    10.38   128.00     0.96     5.77     0.00     5.77     5.75    95.50
2. With the patch, the result is:
sdb       2214.00     0.00   766.00     0.00   185.69     0.00   496.46     2.80     3.66     3.66     0.00     1.03    79.10
sdc          0.00  2205.00     0.00   769.00     0.00   186.44   496.52     5.25     6.84     0.00     6.84     1.30   100.10

Suggested-by: Shaohua Li
Signed-off-by: Xiao Ni
Acked-by: Coly Li
Signed-off-by: Shaohua Li
---
 drivers/md/raid1.h | 1 +
 1 file changed, 1 insertion(+)

(limited to 'drivers/md/raid1.h')

diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h
index b0ab0da..c8894ef 100644
--- a/drivers/md/raid1.h
+++ b/drivers/md/raid1.h
@@ -84,6 +84,7 @@ struct r1conf {
 	 */
 	wait_queue_head_t wait_barrier;
 	spinlock_t resync_lock;
+	atomic_t nr_sync_pending;
 	atomic_t *nr_pending;
 	atomic_t *nr_waiting;
 	atomic_t *nr_queued;
-- 
cgit v1.1
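
Note on the nr_sync_pending change: the accounting problem described in the
last commit message can be modelled in a few lines of userspace C. This is
only a rough sketch under stated assumptions, not the kernel code. The
struct model_conf, MODEL_BUCKETS and all model_* functions are hypothetical
names invented for illustration, C11 atomics stand in for the kernel's
atomic_t, and the real raise_barrier() checks more conditions than the
single one modelled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bucket count for this model; not the kernel's value. */
#define MODEL_BUCKETS 4

/* Hypothetical stand-in for the barrier counters in struct r1conf. */
struct model_conf {
	atomic_int nr_sync_pending;           /* in-flight resync requests */
	atomic_int nr_pending[MODEL_BUCKETS]; /* in-flight regular I/O per bucket */
};

static void model_init(struct model_conf *c)
{
	atomic_init(&c->nr_sync_pending, 0);
	for (int i = 0; i < MODEL_BUCKETS; i++)
		atomic_init(&c->nr_pending[i], 0);
}

/*
 * Models only the one condition named in the commit message: a resync
 * request may raise the barrier on bucket idx only while nr_pending[idx]
 * is zero.
 */
static bool model_can_raise_barrier(struct model_conf *c, int idx)
{
	assert(idx >= 0 && idx < MODEL_BUCKETS);
	return atomic_load(&c->nr_pending[idx]) == 0;
}

/* Old accounting: the resync request is counted in nr_pending[idx]. */
static void model_sync_start_old(struct model_conf *c, int idx)
{
	assert(idx >= 0 && idx < MODEL_BUCKETS);
	atomic_fetch_add(&c->nr_pending[idx], 1);
}

/* New accounting: the resync request is counted in nr_sync_pending. */
static void model_sync_start_new(struct model_conf *c)
{
	atomic_fetch_add(&c->nr_sync_pending, 1);
}
```

With the old accounting, one in-flight resync request makes
model_can_raise_barrier() return false for the same bucket, so the next
resync request has to wait behind it. With the separate counter the bucket
stays clear, which is the behaviour the commit message attributes the
resync speed-up to.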