path: root/fs/xfs/xfs_inode_buf.h
author: Dave Chinner <dchinner@redhat.com> 2013-08-28 21:22:47 +1000
committer: Ben Myers <bpm@sgi.com> 2013-08-30 13:44:53 -0500
commit: 50d5c8d8e938e3c4c0d21db9fc7d64282dc7be20
tree: f3695befa5404b0abd5f1e18ddfcb59d97943401 /fs/xfs/xfs_inode_buf.h
parent: b58fa554e9b940083a0691f7234c13240fc09377
xfs: check LSN ordering for v5 superblocks during recovery
Log recovery has some strict ordering requirements which unordered or reordered metadata writeback can defeat. This can occur when an item is logged in a transaction, written back to disk, and then logged in a new transaction before the tail of the log is moved past the original modification.

The result of this is that when we read an object off disk for recovery purposes, the buffer that we read may not contain the object type that recovery is expecting, and hence at the end of the checkpoint being recovered we have an invalid object in memory.

This isn't usually a problem, as recovery will then replay all the other checkpoints and that brings the object back to a valid and correct state. The issue is that while the object is in the invalid state it can be flushed to disk. This results in the object verifier failing and triggering a corruption shutdown of log recovery. This is correct behaviour for the verifiers; the problem is that we are not detecting that the object we've read off disk is newer than the transaction we are replaying.

All metadata in v5 filesystems has the LSN of its last modification stamped in it. This enables log recovery to read that field and determine the age of the object on disk correctly. If the LSN of the object on disk is older than the transaction being replayed, then we replay the modification. If the LSN of the object matches or is more recent than the transaction's LSN, then we should avoid overwriting the object, as that is what leads to the transient corrupt state.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
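The replay rule described above can be sketched in a few lines of C. This is a minimal illustration, not the code from this commit: the names sketch_lsn_t, sketch_make_lsn, sketch_lsn_cmp and sketch_should_replay are hypothetical, and the sketch assumes an LSN is a 64-bit value with a log cycle number in the upper 32 bits and a block number in the lower 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a log sequence number (assumption of this
 * sketch): cycle number in the high 32 bits, block number in the low 32.
 */
typedef int64_t sketch_lsn_t;

/* Build an LSN from its two components. */
static inline sketch_lsn_t sketch_make_lsn(uint32_t cycle, uint32_t block)
{
    return (sketch_lsn_t)(((uint64_t)cycle << 32) | block);
}

/* Three-way compare: cycle first, then block. Returns -1, 0 or 1. */
static inline int sketch_lsn_cmp(sketch_lsn_t a, sketch_lsn_t b)
{
    uint32_t cycle_a = (uint32_t)((uint64_t)a >> 32);
    uint32_t cycle_b = (uint32_t)((uint64_t)b >> 32);
    uint32_t block_a = (uint32_t)a;
    uint32_t block_b = (uint32_t)b;

    if (cycle_a != cycle_b)
        return cycle_a < cycle_b ? -1 : 1;
    if (block_a != block_b)
        return block_a < block_b ? -1 : 1;
    return 0;
}

/*
 * Replay only when the on-disk object is strictly older than the
 * transaction being recovered. An equal or newer on-disk LSN means the
 * object already reflects this change (or a later one), so overwriting
 * it would recreate the transient invalid state.
 */
static inline bool sketch_should_replay(sketch_lsn_t disk_lsn,
                                        sketch_lsn_t trans_lsn)
{
    return sketch_lsn_cmp(disk_lsn, trans_lsn) < 0;
}
```

Under these assumptions, an object stamped at cycle 1, block 100 would be replayed by a transaction at cycle 1, block 200, while an object stamped at cycle 2, block 10 would be left untouched by that same transaction.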
Diffstat (limited to 'fs/xfs/xfs_inode_buf.h')
0 files changed, 0 insertions, 0 deletions