From 659b254fa7392e32b59a30d4b61fb12c4cd440ff Mon Sep 17 00:00:00 2001 From: Guoqing Jiang Date: Mon, 21 Dec 2015 10:50:59 +1100 Subject: md-cluster: remove a disk asynchronously from cluster environment For cluster raid, if one disk couldn't be reach in one node, then other nodes would receive the REMOVE message for the disk. In receiving node, we can't call md_kick_rdev_from_array to remove the disk from array synchronously since the disk might still be busy in this node. So let's set a ClusterRemove flag on the disk, then let the thread to do the removal job eventually. Signed-off-by: Guoqing Jiang Signed-off-by: Goldwyn Rodrigues Signed-off-by: NeilBrown --- drivers/md/md.h | 1 + 1 file changed, 1 insertion(+) (limited to 'drivers/md/md.h') diff --git a/drivers/md/md.h b/drivers/md/md.h index ca0b643..f7b17ae 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -183,6 +183,7 @@ enum flag_bits { * Usually, this device should be faster * than other devices in the array */ + ClusterRemove, }; #define BB_LEN_MASK (0x00000000000001FFULL) -- cgit v1.1 From 15858fa5b00c1067a8a8e53ea32f4a65f8bebbb8 Mon Sep 17 00:00:00 2001 From: Guoqing Jiang Date: Mon, 21 Dec 2015 10:51:00 +1100 Subject: md-cluster: Defer MD reloading to mddev->thread Reloading of superblock must be performed under reconfig_mutex. However, this cannot be done with md_reload_sb because it would deadlock with the message DLM lock. So, we defer it in md_check_recovery() which is executed by mddev->thread. This introduces a new flag, MD_RELOAD_SB, which if set, will reload the superblock. And good_device_nr is also added to 'struct mddev' which is used to get the num of the good device within cluster raid. Signed-off-by: Goldwyn Rodrigues Signed-off-by: Guoqing Jiang Signed-off-by: NeilBrown --- drivers/md/md.h | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'drivers/md/md.h') diff --git a/drivers/md/md.h b/drivers/md/md.h index f7b17ae..8817e62 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -235,6 +235,9 @@ struct mddev { */ #define MD_JOURNAL_CLEAN 5 /* A raid with journal is already clean */ #define MD_HAS_JOURNAL 6 /* The raid array has journal feature set */ +#define MD_RELOAD_SB 7 /* Reload the superblock because another node + * updated it. + */ int suspended; atomic_t active_io; @@ -465,6 +468,7 @@ struct mddev { struct work_struct event_work; /* used by dm to report failure event */ void (*sync_super)(struct mddev *mddev, struct md_rdev *rdev); struct md_cluster_info *cluster_info; + unsigned int good_device_nr; /* good device num within cluster raid */ }; static inline int __must_check mddev_lock(struct mddev *mddev) -- cgit v1.1 From 9ebc6ef188a0656f3620835f9be7fe22c1644c1c Mon Sep 17 00:00:00 2001 From: Deepa Dinamani Date: Mon, 21 Dec 2015 10:51:01 +1100 Subject: drivers: md: use ktime_get_real_seconds() get_seconds() API is not y2038 safe on 32 bit systems and the API is deprecated. Replace it with calls to ktime_get_real_seconds() API instead. Change mddev structure types to time64_t accordingly. 32 bit signed timestamps will overflow in the year 2038. Change the user interface mdu_array_info_s structure timestamps: ctime and utime values used in ioctls GET_ARRAY_INFO and SET_ARRAY_INFO to unsigned int. This will extend the field to last until the year 2106. The long term plan is to get rid of ctime and utime values in this structure as this information can be read from the on-disk meta data directly. Clamp the tim64_t timestamps to positive values with a max of U32_MAX when returning from GET_ARRAY_INFO ioctl to accommodate above changes in the data type of timestamps to unsigned int. v0.90 on disk meta data uses u32 for maintaining time stamps. So this will also last until year 2106. Assumption is that the usage of v0.90 will be deprecated by year 2106. Timestamp fields in the on disk meta data for v1.0 version already use 64 bit data types. Remove the truncation of the bits while writing to or reading from these from the disk. Signed-off-by: Deepa Dinamani Reviewed-by: Arnd Bergmann Signed-off-by: NeilBrown --- drivers/md/md.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'drivers/md/md.h') diff --git a/drivers/md/md.h b/drivers/md/md.h index 8817e62..e16a17c 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -264,7 +264,7 @@ struct mddev { * managed externally */ char metadata_type[17]; /* externally set*/ int chunk_sectors; - time_t ctime, utime; + time64_t ctime, utime; int level, layout; char clevel[16]; int raid_disks; -- cgit v1.1 From 274d8cbde1bc3bdfb31c5d6a58113dff5cee4f87 Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Mon, 4 Jan 2016 16:16:58 +1100 Subject: md: Remove 'ready' field from mddev. This field is always set in tandem with ->pers, and when it is tested ->pers is also tested. So ->ready is not needed. It was needed once, but code rearrangement and locking changes have removed that needed. Signed-off-by: NeilBrown --- drivers/md/md.h | 2 -- 1 file changed, 2 deletions(-) (limited to 'drivers/md/md.h') diff --git a/drivers/md/md.h b/drivers/md/md.h index e16a17c..fc6f7bb 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -246,8 +246,6 @@ struct mddev { * are happening, so run/ * takeover/stop are not safe */ - int ready; /* See when safe to pass - * IO requests down */ struct gendisk *gendisk; struct kobject kobj; -- cgit v1.1 From 1501efadc524a0c99494b576923091589a52d2a4 Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Wed, 13 Jan 2016 16:00:07 -0800 Subject: md/raid: only permit hot-add of compatible integrity profiles It is not safe for an integrity profile to be changed while i/o is in-flight in the queue. Prevent adding new disks or otherwise online spares to an array if the device has an incompatible integrity profile. The original change to the blk_integrity_unregister implementation in md, commmit c7bfced9a671 "md: suspend i/o during runtime blk_integrity_unregister" introduced an immediate hang regression. This policy of disallowing changes the integrity profile once one has been established is shared with DM. Here is an abbreviated log from a test run that: 1/ Creates a degraded raid1 with an integrity-enabled device (pmem0s) [ 59.076127] 2/ Tries to add an integrity-disabled device (pmem1m) [ 90.489209] 3/ Retries with an integrity-enabled device (pmem1s) [ 205.671277] [ 59.076127] md/raid1:md0: active with 1 out of 2 mirrors [ 59.078302] md: data integrity enabled on md0 [..] [ 90.489209] md0: incompatible integrity profile for pmem1m [..] [ 205.671277] md: super_written gets error=-5 [ 205.677386] md/raid1:md0: Disk failure on pmem1m, disabling device. [ 205.677386] md/raid1:md0: Operation continuing on 1 devices. [ 205.683037] RAID1 conf printout: [ 205.684699] --- wd:1 rd:2 [ 205.685972] disk 0, wo:0, o:1, dev:pmem0s [ 205.687562] disk 1, wo:1, o:1, dev:pmem1s [ 205.691717] md: recovery of RAID array md0 Fixes: c7bfced9a671 ("md: suspend i/o during runtime blk_integrity_unregister") Cc: Cc: Mike Snitzer Reported-by: NeilBrown Signed-off-by: Dan Williams Signed-off-by: NeilBrown --- drivers/md/md.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'drivers/md/md.h') diff --git a/drivers/md/md.h b/drivers/md/md.h index fc6f7bb..a491e22 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -660,7 +660,7 @@ extern void md_wait_for_blocked_rdev(struct md_rdev *rdev, struct mddev *mddev); extern void md_set_array_sectors(struct mddev *mddev, sector_t array_sectors); extern int md_check_no_bitmap(struct mddev *mddev); extern int md_integrity_register(struct mddev *mddev); -extern void md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev); +extern int md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev); extern int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale); extern void mddev_init(struct mddev *mddev); -- cgit v1.1