summaryrefslogtreecommitdiffstats
path: root/fs/btrfs/volumes.c
Commit message (Collapse)AuthorAgeFilesLines
* Btrfs: Throttle for async bio submits higher up the chainChris Mason2008-09-251-6/+0
| | | | | | | | | | | | | | | The current code waits for the count of async bio submits to get below a given threshold if it is too high right after adding the latest bio to the work queue. This isn't optimal because the caller may have sequential adjacent bios pending they are waiting to send down the pipe. This changeset requires the caller to wait on the async bio count, and changes the async checksumming submits to wait for async bios any time they self throttle. The end result is much higher sequential throughput. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Wait for async bio submissions to make some progress at queue timeChris Mason2008-09-251-1/+17
| | | | | | | | Before, the btrfs bdi congestion function was used to test for too many async bios. This keeps that check to throttle pdflush, but also adds a check while queuing bios. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Count async bios separately from async checksum work itemsChris Mason2008-09-251-3/+3
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Fix the multi-bio code to save the original bio for completionChris Mason2008-09-251-1/+10
| | | | | | | | | | | | | | | | The multi-bio code is responsible for duplicating blocks in raid1 and single spindle duplication. It has counters to make sure all of the locations for a given extent are properly written before io completion is returned to the higher layers. But, it didn't always complete the same bio it was given, sometimes a clone was completed instead. This lead to problems with the async work queues because they saved a pointer to the bio in a struct off bi_private. The fix is to remember the original bio and only complete that one. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Hold a reference on bios during submit_bio, add some extra bio checksChris Mason2008-09-251-1/+9
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: implement memory reclaim for leaf reference cacheYan2008-09-251-1/+0
| | | | | | | | | | | | | | The memory reclaiming issue happens when snapshot exists. In that case, some cache entries may not be used during old snapshot dropping, so they will remain in the cache until umount. The patch adds a field to struct btrfs_leaf_ref to record create time. Besides, the patch makes all dead roots of a given snapshot linked together in order of create time. After a old snapshot was completely dropped, we check the dead root list and remove all cache entries created before the oldest dead root in the list. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add locking around volume management (device add/remove/balance)Chris Mason2008-09-251-14/+44
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Replace the transaction work queue with kthreadsChris Mason2008-09-251-4/+8
| | | | | | | This creates one kthread for commits and one kthread for deleting old snapshots. All the work queues are removed. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Replace the big fs_mutex with a collection of other locksChris Mason2008-09-251-6/+13
| | | | | | | | Extent alloctions are still protected by a large alloc_mutex. Objectid allocations are covered by a objectid mutex Other btree operations are protected by a lock on individual btree nodes Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add a thread pool just for submit_bioChris Mason2008-09-251-1/+2
| | | | | | | | If a bio submission is after a lock holder waiting for the bio on the work queue, it is possible to deadlock. Move the bios into their own pool. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add async worker threads for pre and post IO checksummingChris Mason2008-09-251-5/+157
| | | | | | | | | | | | | | | | | | | | Btrfs has been using workqueues to spread the checksumming load across other CPUs in the system. But, workqueues only schedule work on the same CPU that queued the work, giving them a limited benefit for systems with higher CPU counts. This code adds a generic facility to schedule work with pools of kthreads, and changes the bio submission code to queue bios up. The queueing is important to make sure large numbers of procs on the system don't turn streaming workloads into random workloads by sending IO down concurrently. The end result of all of this is much higher performance (and CPU usage) when doing checksumming on large machines. Two worker pools are created, one for writes and one for endio processing. The two could deadlock if we tried to service both from a single pool. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Allocator fix variety packChris Mason2008-09-251-8/+4
| | | | | | | | | | | | | | * Force chunk allocation when find_free_extent has to do a full scan * Record the max key at the start of defrag so it doesn't run forever * Block groups might not be contiguous, make a forward search for the next block group in extent-tree.c * Get rid of extra checks for total fs size * Fix relocate_one_reference to avoid relocating the same file data block twice when referenced by an older transaction * Use the open device count when allocating chunks so that we don't try to allocate from devices that don't exist Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Use kzalloc on the fs_devices allocationChris Mason2008-09-251-2/+1
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Handle transid == 0 while opening devicesChris Mason2008-09-251-1/+1
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Fix btrfs_open_devices to deal with changes since the scan ioctlsChris Mason2008-09-251-11/+59
| | | | | | | Devices can change after the scan ioctls are done, and btrfs_open_devices needs to be able to verify them as they are opened and used by the FS. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add mount -o degraded to allow mounts to continue with missing devicesChris Mason2008-09-251-78/+201
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Handle write errors on raid1 and raid10Chris Mason2008-09-251-3/+8
| | | | | | | | | | | | When duplicate copies exist, writes are allowed to fail to one of those copies. This changeset includes a few changes that allow the FS to continue even when some IOs fail. It also adds verification of the parent generation number for btree blocks. This generation is stored in the pointer to a block, and it ensures that missed writes to are detected. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Chunk relocation fine tuning, and add a few printks to show progressChris Mason2008-09-251-0/+2
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Only open block devices once during mount -o subvol=Chris Mason2008-09-251-0/+3
| | | | | | | btrfs_open_devices needed a check to see if the device was already open. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add support for online device removalChris Mason2008-09-251-9/+212
| | | | | | | | | | | | | This required a few structural changes to the code that manages bdev pointers: The VFS super block now gets an anon-bdev instead of a pointer to the lowest bdev. This allows us to avoid swapping the super block bdev pointer around at run time. The code to read in the super block no longer goes through the extent buffer interface. Things got ugly keeping the mapping constant. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Compile warning fixup in volume.cChris Mason2008-09-251-1/+1
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Tune stripe selection for raid1 and raid10Chris Mason2008-09-251-10/+7
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Deal with failed writes in mirrored configurationsChris Mason2008-09-251-3/+14
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Drop some verbose printksChris Mason2008-09-251-2/+0
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add balance ioctl to restripe the chunksChris Mason2008-09-251-9/+106
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add new ioctl to add devicesChris Mason2008-09-251-0/+75
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Make the resizer work based on shrinking and growing devicesChris Mason2008-09-251-12/+312
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add failure handling for read_sys_arrayChris Mason2008-09-251-7/+9
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Fix btrfs_get_extent and get_block corner cases, and disable O_DIRECT readsChris Mason2008-09-251-1/+1
| | | | | | | The generic O_DIRECT code assumes all the bios have the same bdev, which isn't true for multi-device btrfs. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add a special device list for chunk allocationsChris Mason2008-09-251-5/+10
| | | | | | | This allows other code that needs to walk every device in the FS to do so without locking against allocations. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Simplify device selection for mirrored readsChris Mason2008-09-251-16/+7
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Make an unplug function that doesn't unplug every spindleChris Mason2008-09-251-22/+57
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add 1MB to the min_free in alloc_chunkChris Mason2008-09-251-0/+3
| | | | | | This properly reflects the first 1MB we skip at the start of the device Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Fix chunk allocation when some devices don't have enough room for stripesChris Mason2008-09-251-16/+29
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Calculate appropriate chunk sizes for both small and large filesystemsChris Mason2008-09-251-7/+61
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add support for labels in the super blockChris Mason2008-09-251-8/+9
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Check device uuids along with devidsChris Mason2008-09-251-7/+23
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Avoid 64 bit div for RAID10Chris Mason2008-09-251-1/+1
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Use the extent map cache to find the logical disk block during data ↵Chris Mason2008-09-251-0/+3
| | | | | | | | | | | | | | | | | | | retries The data read retry code needs to find the logical disk block before it can resubmit new bios. But, finding this block isn't allowed to take the fs_mutex because that will deadlock with a number of different callers. This changes the retry code to use the extent map cache instead, but that requires the extent map cache to have the extent we're looking for. This is a problem because btrfs_drop_extent_cache just drops the entire extent instead of the little tiny part it is invalidating. The bulk of the code in this patch changes btrfs_drop_extent_cache to invalidate only a portion of the extent cache, and changes btrfs_get_extent to deal with the results. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add RAID10 supportChris Mason2008-09-251-5/+41
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add chunk uuids and update multi-device back referencesChris Mason2008-09-251-26/+50
| | | | | | | | | | | | | | | | | | | | Block headers now store the chunk tree uuid Chunk items records the device uuid for each stripes Device extent items record better back refs to the chunk tree Block groups record better back refs to the chunk tree The chunk tree format has also changed. The objectid of BTRFS_CHUNK_ITEM_KEY used to be the logical offset of the chunk. Now it is a chunk tree id, with the logical offset being stored in the offset field of the key. This allows a single chunk tree to record multiple logical address spaces, upping the number of bytes indexed by a chunk tree from 2^64 to 2^128. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: A few updates for 2.6.18 and versions older than 2.6.25Chris Mason2008-09-251-8/+7
| | | | | | | This includes fixing a missing spinlock init call that caused oops on mount for most kernels other than 2.6.25. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: bio_endio support for linux 2.6.23 and older.Miguel2008-09-251-0/+4
| | | | | | | bio_endio() changed prototype on linux 2.6.24, support older kernels using the older prototype. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Write out all super blocks on commit, and bring back proper barrier ↵Chris Mason2008-09-251-3/+5
| | | | | | support Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Retry metadata reads in the face of checksum failuresChris Mason2008-09-251-4/+35
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Change btrfs_map_block to return a structure with mappings for all stripesChris Mason2008-09-251-60/+75
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add support for duplicate blocks on a single spindleChris Mason2008-09-251-4/+28
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add support for mirroring across drivesChris Mason2008-09-251-28/+126
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Fix btrfs_fill_super to return -EINVAL when no FS foundYan2008-09-251-1/+1
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Implement raid0 when multiple devices are presentChris Mason2008-09-251-30/+100
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
OpenPOWER on IntegriCloud