summaryrefslogtreecommitdiffstats
path: root/fs/btrfs/scrub.c
Commit message (Collapse)AuthorAgeFilesLines
...
| * Btrfs: check return value of bio_alloc() properlyTsutomu Itoh2012-04-121-0/+4
| | | | | | | | | | | | | | | | bio_alloc() has the possibility of returning NULL. So, it is necessary to check the return value. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2012-03-301-372/+1035
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes and features from Chris Mason: "We've merged in the error handling patches from SuSE. These are already shipping in the sles kernel, and they give btrfs the ability to abort transactions and go readonly on errors. It involves a lot of churn as they clarify BUG_ONs, and remove the ones we now properly deal with. Josef reworked the way our metadata interacts with the page cache. page->private now points to the btrfs extent_buffer object, which makes everything faster. He changed it so we write an whole extent buffer at a time instead of allowing individual pages to go down,, which will be important for the raid5/6 code (for the 3.5 merge window ;) Josef also made us more aggressive about dropping pages for metadata blocks that were freed due to COW. Overall, our metadata caching is much faster now. We've integrated my patch for metadata bigger than the page size. This allows metadata blocks up to 64KB in size. In practice 16K and 32K seem to work best. For workloads with lots of metadata, this cuts down the size of the extent allocation tree dramatically and fragments much less. Scrub was updated to support the larger block sizes, which ended up being a fairly large change (thanks Stefan Behrens). We also have an assortment of fixes and updates, especially to the balancing code (Ilya Dryomov), the back ref walker (Jan Schmidt) and the defragging code (Liu Bo)." Fixed up trivial conflicts in fs/btrfs/scrub.c that were just due to removal of the second argument to k[un]map_atomic() in commit 7ac687d9e047. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (75 commits) Btrfs: update the checks for mixed block groups with big metadata blocks Btrfs: update to the right index of defragment Btrfs: do not bother to defrag an extent if it is a big real extent Btrfs: add a check to decide if we should defrag the range Btrfs: fix recursive defragment with autodefrag option Btrfs: fix the mismatch of page->mapping Btrfs: fix race between direct io and autodefrag Btrfs: fix deadlock during allocating chunks Btrfs: show useful info in space reservation tracepoint Btrfs: don't use crc items bigger than 4KB Btrfs: flush out and clean up any block device pages during mount btrfs: disallow unequal data/metadata blocksize for mixed block groups Btrfs: enhance superblock sanity checks Btrfs: change scrub to support big blocks Btrfs: minor cleanup in scrub Btrfs: introduce common define for max number of mirrors Btrfs: fix infinite loop in btrfs_shrink_device() Btrfs: fix memory leak in resolver code Btrfs: allow dup for data chunks in mixed mode Btrfs: validate target profiles only if we are going to use them ...
| * Merge git://git.jan-o-sch.net/btrfs-unstable into for-linusChris Mason2012-03-281-2/+2
| |\ | | | | | | | | | | | | | | | | | | Conflicts: fs/btrfs/transaction.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
| | * Btrfs: fix regression in scrub path resolvingJan Schmidt2012-03-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 4692cf58 we introduced new backref walking code for btrfs. This assumes we're searching live roots, which requires a transaction context. While scrubbing, however, we must not join a transaction because this could deadlock with the commit path. Additionally, what scrub really wants to do is resolving a logical address in the commit root it's currently checking. This patch adds support for logical to path resolving on commit roots and makes scrub use that. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * | Merge branch 'error-handling' into for-linusChris Mason2012-03-281-11/+13
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: fs/btrfs/ctree.c fs/btrfs/disk-io.c fs/btrfs/extent-tree.c fs/btrfs/extent_io.c fs/btrfs/extent_io.h fs/btrfs/inode.c fs/btrfs/scrub.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
| | * | btrfs: replace many BUG_ONs with proper error handlingJeff Mahoney2012-03-221-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | btrfs currently handles most errors with BUG_ON. This patch is a work-in- progress but aims to handle most errors other than internal logic errors and ENOMEM more gracefully. This iteration prevents most crashes but can run into lockups with the page lock on occasion when the timing "works out." Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| | * | btrfs: enhance transaction abort infrastructureJeff Mahoney2012-03-221-2/+6
| | | | | | | | | | | | | | | | Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| | * | btrfs: return void in functions without error conditionsJeff Mahoney2012-03-221-26/+10
| | |/ | | | | | | | | | Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| * | Btrfs: change scrub to support big blocksStefan Behrens2012-03-271-340/+1013
| | | | | | | | | | | | | | | | | | | | | | | | | | | Scrub used to be coded for nodesize == leafsize == sectorsize == PAGE_SIZE. This is now changed to support sizes for nodesize and leafsize which are N * PAGE_SIZE. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: minor cleanup in scrubStefan Behrens2012-03-271-36/+25
| |/ | | | | | | | | | | | | Just a minor cleanup commit in preparation for the big block changes. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* | btrfs: remove the second argument of k[un]map_atomic()Cong Wang2012-03-201-4/+4
|/ | | | Signed-off-by: Cong Wang <amwang@redhat.com>
* btrfs: don't check DUP chunks twiceArne Jansen2012-02-151-3/+5
| | | | | | | | | | | Because scrub enumerates the dev extent tree to find the chunks to scrub, it currently finds each DUP chunk twice and also scrubs it twice. This patch makes sure that scrub_chunk only checks that part of the chunk the dev extent has been found for. This only changes the behaviour for DUP chunks. Reported-and-tested-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Arne Jansen <sensille@gmx.net>
* Merge branch 'integrity-check-patch-v2' of ↵Chris Mason2012-01-161-2/+3
|\ | | | | | | | | | | | | | | | | | | git://btrfs.giantdisaster.de/git/btrfs into integration Conflicts: fs/btrfs/ctree.h fs/btrfs/super.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: integrate integrity check module into btrfsStefan Behrens2011-12-211-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the last part of the patch series. It modifies the btrfs code to use the integrity check module if configured to do so with the define BTRFS_FS_CHECK_INTEGRITY. If this define is not set, the only effective change is that code is added that handles the mount option to activate the integrity check. If the mount option is set and the define BTRFS_FS_CHECK_INTEGRITY is not set, that code complains in the log and the mount fails with EINVAL. Add the mount option to activate the usage of the integrity check code. Add invocation of btrfs integrity check code init and cleanup function on mount and umount, respectively. Add hook to call btrfs integrity check code version of submit_bh/submit_bio. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
* | Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into ↵Chris Mason2012-01-161-3/+4
|\ \ | |/ |/| | | integration
| * Btrfs: new backref walking codeJan Schmidt2012-01-051-3/+4
| | | | | | | | | | | | | | | | | | | | The old backref iteration code could only safely be used on commit roots. Besides this limitation, it had bugs in finding the roots for these references. This commit replaces large parts of it by btrfs_find_all_roots() which a) really finds all roots and the correct roots, b) works correctly under heavy file system load, c) considers delayed refs. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
* | Btrfs: fix num_workers_starting bug and other bugs in async threadJosef Bacik2011-12-151-2/+6
|/ | | | | | | | | | | | | | | | | Al pointed out we have some random problems with the way we account for num_workers_starting in the async thread stuff. First of all we need to make sure to decrement num_workers_starting if we fail to start the worker, so make __btrfs_start_workers do this. Also fix __btrfs_start_workers so that it doesn't call btrfs_stop_workers(), there is no point in stopping everybody if we failed to create a worker. Also check_pending_worker_creates needs to call __btrfs_start_work in it's work function since it already increments num_workers_starting. People only start one worker at a time, so get rid of the num_workers argument everywhere, and make btrfs_queue_worker a void since it will always succeed. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
* btrfs scrub: handle -ENOMEM from init_ipath()Dan Carpenter2011-11-301-0/+5
| | | | | | init_ipath() can return an ERR_PTR(-ENOMEM). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
* btrfs: Fix up 32/64-bit compatibility for new ioctlsJeff Mahoney2011-11-201-1/+1
| | | | | | | | | | | | | | This patch casts to unsigned long before casting to a pointer and fixes the following warnings: fs/btrfs/extent_io.c:2289:20: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] fs/btrfs/ioctl.c:2933:37: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] fs/btrfs/ioctl.c:2937:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] fs/btrfs/ioctl.c:3020:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] fs/btrfs/scrub.c:275:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] fs/btrfs/backref.c:686:27: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: handle bio_add_page failure gracefully in scrubArne Jansen2011-11-111-35/+29
| | | | | | | | | | Currently scrub fails with ENOMEM when bio_add_page fails. Unfortunately dm based targets accept only one page per bio, thus making scrub always fails. This patch just submits the current bio when an error is encountered and starts a new one. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: fix a potential btrfs_bio leak on scrub fixupsIlya Dryomov2011-11-061-0/+1
| | | | | | | | In case we were able to map less than we wanted (length < PAGE_SIZE clause is true) btrfs_bio is still allocated and we have to free it. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: fix the new inspection ioctls for 32 bit compatChris Mason2011-11-061-1/+1
| | | | | | | | The new ioctls to follow backrefs are not clean for 32/64 bit compat. This reworks them for u64s everywhere. They are brand new, so there are no problems with changing the interface now. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Merge git://git.jan-o-sch.net/btrfs-unstable into integrationChris Mason2011-11-061-43/+433
|\ | | | | | | | | | | | | | | | | | | Conflicts: fs/btrfs/Makefile fs/btrfs/extent_io.c fs/btrfs/extent_io.h fs/btrfs/scrub.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * btrfs: integrating raid-repair and scrub-fixup-nodatasumJan Schmidt2011-09-291-25/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | This ties nodatasum fixup in scrub together with raid repair patches. While both series are working fine alone, scrub will report uncorrectable errors if they occur in a nodatasum extent *and* the page is in the page cache. Previously, we would have triggered readpage to find good data and do the repair. However, readpage wouldn't read anything in the case where the page is up to date in the cache. So, we simply take that good data we have and call repair_io_failure directly (unless the page in the cache is dirty). Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * btrfs: btrfs_multi_bio replaced with btrfs_bioJan Schmidt2011-09-291-10/+10
| | | | | | | | | | | | | | | | | | btrfs_bio is a bio abstraction able to split and not complete after the last bio has returned (like the old btrfs_multi_bio). Additionally, btrfs_bio tracks the mirror_num used to read data which can be used for error correction purposes. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * btrfs scrub: add fixup code for errors on nodatasum filesJan Schmidt2011-09-291-6/+182
| | | | | | | | | | | | | | | | | | | | | | | | | | This removes a FIXME comment and introduces the first part of nodatasum fixup: It gets the corresponding inode for a logical address and triggers a regular readpage for the corrupted sector. Once we have on-the-fly error correction our error will be automatically corrected. The correction code is expected to clear the newly introduced EXTENT_DAMAGED flag, making scrub report that error as "corrected" instead of "uncorrectable" eventually. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * btrfs scrub: use int for mirror_num, not u64Jan Schmidt2011-09-291-4/+4
| | | | | | | | | | | | the rest of the code uses int mirror_num, and so should scrub Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * btrfs scrub: bugfix: mirror_num off by oneJan Schmidt2011-09-291-6/+6
| | | | | | | | | | | | | | | | Fix the mirror_num determination in scrub_stripe. The rest of the scrub code did not use mirror_num for anything important and that error went unnoticed. The nodatasum fixup patch of this set depends on a correct mirror_num. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * btrfs scrub: print paths of corrupted filesJan Schmidt2011-09-291-6/+163
| | | | | | | | | | | | | | | | | | While scrubbing, we may encounter various errors. Previously, a logical address was printed to the log only. Now, all paths belonging to that address are resolved and printed separately. That should work for hardlinks as well as reflinks. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * btrfs scrub: added unverified_errorsJan Schmidt2011-09-291-11/+26
| | | | | | | | | | | | | | | | | | | | | | | | In normal operation, scrub is reading data sequentially in large portions. In case of an i/o error, we try to find the corrupted area(s) by issuing page sized read requests. With this commit we increment the unverified_errors counter if all of the small size requests succeed. Userland patches carrying such conspicous events to the administrator should already be around. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
* | Merge branch 'for-chris' of git://github.com/sensille/linux into integrationChris Mason2011-11-061-62/+50
|\ \ | | | | | | | | | | | | | | | | | | Conflicts: fs/btrfs/ctree.h Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | btrfs: use readahead API for scrubArne Jansen2011-10-021-62/+50
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Scrub uses a simple tree-enumeration to bring the relevant portions of the extent- and csum-tree into the page cache before starting the scrub-I/O. This is now replaced by using the new readahead-API. During readahead the scrub is being accounted as paused, so it won't hold off transaction commits. This change raises the average disk bandwith utilisation on my test volume from 70% to 90%. On another volume, the time for a test run went down from 89s to 43s. Changes v5: - reada1/2 are now of type struct reada_control * Signed-off-by: Arne Jansen <sensille@gmx.net>
* | btrfs: separate superblock items out of fs_infoDavid Sterba2011-11-061-1/+1
|/ | | | | | | | | | | | fs_info has now ~9kb, more than fits into one page. This will cause mount failure when memory is too fragmented. Top space consumers are super block structures super_copy and super_for_commit, ~2.8kb each. Allocate them dynamically. fs_info will be ~3.5kb. (measured on x86_64) Add a wrapper for freeing fs_info and all of it's dynamically allocated members. Signed-off-by: David Sterba <dsterba@suse.cz>
* btrfs: remove unneeded includes from scrub.cArne Jansen2011-06-101-6/+0
| | | | Signed-off-by: Arne Jansen <sensille@gmx.net>
* btrfs: reinitialize scrub workersArne Jansen2011-06-101-1/+5
| | | | | | | | Scrub starts the workers each time a scrub starts and stops them after it finished. This patch adds an initialization for the workers before each start, otherwise the workers behave strangely. Signed-off-by: Arne Jansen <sensille@gmx.net>
* btrfs: scrub: errors in tree enumerationArne Jansen2011-06-101-23/+34
| | | | | | | | due to the semantics of btrfs_search_slot the path can point to an invalid slot when ret > 0. This condition went unnoticed, which in turn could have led to an incomplete scrubbing. Signed-off-by: Arne Jansen <sensille@gmx.net>
* btrfs: add helper for fs_info->closingDavid Sterba2011-06-041-1/+1
| | | | | | | wrap checking of filesystem 'closing' flag and fix a few missing memory barriers. Signed-off-by: David Sterba <dsterba@suse.cz>
* btrfs: scrub: add explicit pluggingArne Jansen2011-06-041-3/+4
| | | | | | | | With the removal of the implicit plugging scrub ends up doing more and smaller I/O than necessary. This patch adds explicit plugging per chunk. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* btrfs: scrub: don't reuse bios and pagesArne Jansen2011-06-041-49/+65
| | | | | | | | | | | | The current scrub implementation reuses bios and pages as often as possible, allocating them only on start and releasing them when finished. This leads to more problems with the block layer than it's worth. The elevator gets confused when there are more pages added to the bio than bi_size suggests. This patch completely rips out the reuse of bios and pages and allocates them freshly for each submit. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Chris Maosn <chris.mason@oracle.com>
* btrfs scrub: don't coalesce pages that are logically discontiguousArne Jansen2011-05-261-1/+2
| | | | | | | | | scrub_page collects several pages into one bio as long as they are physically contiguous. As we only save one logical address for the whole bio, don't collect pages that are physically contiguous but logically discontiguous. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Merge branch 'for-chris' of ↵Chris Mason2011-05-231-5/+5
| | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arne/btrfs-unstable-arne into inode_numbers Conflicts: fs/btrfs/Makefile fs/btrfs/ctree.h fs/btrfs/volumes.h Signed-off-by: Chris Mason <chris.mason@oracle.com>
* btrfs: add readonly flagArne Jansen2011-05-121-10/+13
| | | | | | setting the readonly flag prevents writes in case an error is detected Signed-off-by: Arne Jansen <sensille@gmx.net>
* btrfs scrub: make fixups syncIlya Dryomov2011-05-121-207/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | btrfs scrub - make fixups sync, don't reuse fixup bios Fixups are already sync for csum failures, this patch makes them sync for EIO case as well. Fixups are now sharing pages with the parent sbio - instead of allocating a separate page to do a fixup we grab the page from the sbio buffer. Fixup bios are no longer reused. struct fixup is no longer needed, instead pass [sbio pointer, index]. Originally this was added to look at the possibility of sharing the code between drive swap and scrub, but it actually fixes a serious bug in scrub code where errors that could be corrected were ignored and reported as uncorrectable. btrfs scrub - restore bios properly after media errors The current code reallocates a bio after a media error. This is a temporary measure introduced in v3 after a serious problem related to bio reuse was found in v2 of scrub patchset. Basically we did not reset bv_offset and bv_len fields of the bio_vec structure. They are changed in case I/O error happens, for example, at offset 512 or 1024 into the page. Also bi_flags field wasn't properly setup before reusing the bio. Signed-off-by: Arne Jansen <sensille@gmx.net>
* btrfs: scrubArne Jansen2011-05-121-0/+1492
This adds an initial implementation for scrub. It works quite straightforward. The usermode issues an ioctl for each device in the fs. For each device, it enumerates the allocated device chunks. For each chunk, the contained extents are enumerated and the data checksums fetched. The extents are read sequentially and the checksums verified. If an error occurs (checksum or EIO), a good copy is searched for. If one is found, the bad copy will be rewritten. All enumerations happen from the commit roots. During a transaction commit, the scrubs get paused and afterwards continue from the new roots. This commit is based on the series originally posted to linux-btrfs with some improvements that resulted from comments from David Sterba, Ilya Dryomov and Jan Schmidt. Signed-off-by: Arne Jansen <sensille@gmx.net>
OpenPOWER on IntegriCloud