UBIFS: expect corruption only in last journal head LEBs

This patch improves UBIFS recovery and teaches it to expect corruption only in the last buds. Currently we just recover all buds, which is incorrect because only the last buds can have corruptions in case of a power cut. So it is inconsistent with the rest of the recovery strategy, which tries hard to distinguish between corruptions caused by power cuts and other types of corruptions.

This patch also adds one quirk - slightly older UBIFS could have corruption in the next-to-last bud because of the way it switched buds: when bud A was full, it first searched for the next bud B, then wrote a reference node to the log about B, and then synchronized the write-buffer of A. So we could end up with buds A and B, where B is the last, but A has corruption. The UBIFS behavior was fixed, though, so currently it always first synchronizes A's write-buffer and only after this adds B to the log.

However, to make sure that we handle unclean (after a power cut) UBIFS images belonging to older UBIFS, we need to add a quirk and keep it for some time: we need to check for the situation described above.

Thankfully, it is easy to check for that situation. When UBIFS adds B to the log, it always first unmaps B, then maps it, and then syncs A's write-buffer. Thus, in that situation we can check that B is empty, in which case it is OK to have corruption in A. To check that B is empty it is enough to just read the first few bytes of the bud and compare them with 0xFFs.

This quirk may be removed in a couple of years.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
author: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> 2011-05-15 13:11:00 +0300
committer: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> 2011-05-16 14:11:25 +0300
commit: 91c66083fca36cdf496e927ef8bea19e6b1bbdce (patch)
tree: 8298bc056e929e1c946b1b2d6acbcc21dd54e235 /fs
parent: cb14a18465686ea6add51b1008865b8174c28bd7 (diff)
1 files changed, 71 insertions, 4 deletions
diff --git a/fs/ubifs/replay.c b/fs/ubifs/replay.c
index 0f50fbf..6617280 100644
--- a/fs/ubifs/replay.c
+++ b/fs/ubifs/replay.c
@@ -473,6 +473,65 @@ int ubifs_validate_entry(struct ubifs_info *c,
 }
 
 /**
+ * is_last_bud - check if the bud is the last in the journal head.
+ * @c: UBIFS file-system description object
+ * @bud: bud description object
+ *
+ * This function checks if bud @bud is the last bud in its journal head. This
+ * information is then used by 'replay_bud()' to decide whether the bud can
+ * have corruptions or not. Indeed, only last buds can be corrupted by power
+ * cuts. Returns %1 if this is the last bud, and %0 if not.
+ */
+static int is_last_bud(struct ubifs_info *c, struct ubifs_bud *bud)
+{
+	struct ubifs_jhead *jh = &c->jheads[bud->jhead];
+	struct ubifs_bud *next;
+	uint32_t data;
+	int err;
+
+	if (list_is_last(&bud->list, &jh->buds_list))
+		return 1;
+
+	/*
+	 * The following is a quirk to make sure we work correctly with UBIFS
+	 * images used with older UBIFS.
+	 *
+	 * Normally, the last bud will be the last in the journal head's list
+	 * of buds. However, there is one exception if the UBIFS image belongs
+	 * to older UBIFS. This is fairly unlikely: one would need to use old
+	 * UBIFS, then have a power cut exactly at the right point, and then
+	 * try to mount this image with new UBIFS.
+	 *
+	 * The exception is: it is possible to have 2 buds A and B, A goes
+	 * before B, and B is the last, bud B contains no data, and bud A is
+	 * corrupted at the end. The reason is that in older versions when the
+	 * journal code switched to the next bud (from A to B), it first added a
+	 * log reference node for the new bud (B), and only after this it
+	 * synchronized the write-buffer of current bud (A). But later this was
+	 * changed and UBIFS started to always synchronize the write-buffer of
+	 * the bud (A) before writing the log reference for the new bud (B).
+	 *
+	 * But because older UBIFS always synchronized A's write-buffer before
+	 * writing to B, we can recognize this exceptional situation by
+	 * checking the contents of bud B - if it is empty, then A can be
+	 * treated as the last and we can recover it.
+	 *
+	 * TODO: remove this piece of code in a couple of years (today it is
+	 * 16.05.2011).
+	 */
+	next = list_entry(bud->list.next, struct ubifs_bud, list);
+	if (!list_is_last(&next->list, &jh->buds_list))
+		return 0;
+
+	err = ubi_read(c->ubi, next->lnum, (char *)&data,
+		       next->start, 4);
+	if (err)
+		return 0;
+
+	return data == 0xFFFFFFFF;
+}
+
+/**
  * replay_bud - replay a bud logical eraseblock.
  * @c: UBIFS file-system description object
  * @b: bud entry which describes the bud
@@ -483,15 +542,23 @@ int ubifs_validate_entry(struct ubifs_info *c,
  */
 static int replay_bud(struct ubifs_info *c, struct bud_entry *b)
 {
+	int is_last = is_last_bud(c, b->bud);
 	int err = 0, used = 0, lnum = b->bud->lnum, offs = b->bud->start;
-	int jhead = b->bud->jhead;
 	struct ubifs_scan_leb *sleb;
 	struct ubifs_scan_node *snod;
 
-	dbg_mnt("replay bud LEB %d, head %d, offs %d", lnum, jhead, offs);
+	dbg_mnt("replay bud LEB %d, head %d, offs %d, is_last %d",
+		lnum, b->bud->jhead, offs, is_last);
 
-	if (c->need_recovery)
-		sleb = ubifs_recover_leb(c, lnum, offs, c->sbuf, jhead != GCHD);
+	if (c->need_recovery && is_last)
+		/*
+		 * Recover only last LEBs in the journal heads, because power
+		 * cuts may cause corruptions only in these LEBs, because only
+		 * these LEBs could possibly be written to at the power cut
+		 * time.
+		 */
+		sleb = ubifs_recover_leb(c, lnum, offs, c->sbuf,
+					 b->bud->jhead != GCHD);
 	else
 		sleb = ubifs_scan(c, lnum, offs, c->sbuf, 0);
 	if (IS_ERR(sleb))
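
The decision that 'is_last_bud()' makes can be illustrated outside the kernel with a small model. The following is only a sketch under stated assumptions, not the UBIFS implementation: the names 'model_bud', 'model_bud_is_empty()' and 'model_may_be_corrupted()' are hypothetical, the journal head's bud list is modeled as a plain array in log order, and each bud is reduced to the first 4 bytes found at its start offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a bud: only the first 4 bytes at its start. */
struct model_bud {
	uint8_t first_bytes[4];
};

/* A bud is treated as empty if its first 4 bytes are all 0xFF. */
static bool model_bud_is_empty(const struct model_bud *bud)
{
	static const uint8_t erased[4] = { 0xFF, 0xFF, 0xFF, 0xFF };

	return memcmp(bud->first_bytes, erased, sizeof(erased)) == 0;
}

/*
 * Decide whether the bud at index idx may carry power-cut corruption.
 * buds[] models one journal head's buds in log order (an assumption of
 * this sketch; the kernel uses a linked list instead).
 */
static bool model_may_be_corrupted(const struct model_bud *buds,
				   size_t count, size_t idx)
{
	assert(buds != NULL || count == 0);

	if (idx >= count)
		return false;

	/* Normal case: only the last bud of the head can be corrupted. */
	if (idx == count - 1)
		return true;

	/*
	 * Quirk for images written by older UBIFS: the next-to-last bud
	 * may be corrupted, but only if the last bud is still empty.
	 */
	if (idx == count - 2)
		return model_bud_is_empty(&buds[count - 1]);

	return false;
}
```

With three buds where the last one is still erased, this model allows corruption in the last two buds and in no others. Once the last bud holds any data, only the last bud qualifies.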