op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	dm snapshot: fix metadata corruption	Mikulas Patocka	2014-03-03	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 55494bf2947dccdf2 ("dm snapshot: use dm-bufio") broke snapshots. Before that 3.14-rc1 commit, loading a snapshot's list of exceptions involved reading exception areas one by one into ps->area and inserting those exceptions into the hash table. Commit 55494bf2947dccdf2 changed it so that dm-bufio with prefetch is used to load exceptions in batchs. Exceptions are loaded correctly, but ps->area is left uninitialized. When a new exception is allocated, it is stored in this uninitialized ps->area which will be written to the disk. This causes metadata corruption. Fix this corruption by copying the last area that was read via dm-bufio into ps->area. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
*	dm snapshot: use dm-bufio prefetch	Mikulas Patocka	2014-01-14	1	-1/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch modifies dm-snapshot so that it prefetches the buffers when loading the exceptions. The number of buffers read ahead is specified in the DM_PREFETCH_CHUNKS macro. The current value for DM_PREFETCH_CHUNKS (12) was found to provide the best performance on a single 15k SCSI spindle. In the future we may modify this default or make it configurable. Also, introduce the function dm_bufio_set_minimum_buffers to setup bufio's number of internal buffers before freeing happens. dm-bufio may hold more buffers if enough memory is available. There is no guarantee that the specified number of buffers will be available - if you need a guarantee, use the argument reserved_buffers for dm_bufio_client_create. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
*	dm snapshot: use dm-bufio	Mikulas Patocka	2014-01-14	1	-7/+32
\| \| \| \| \| \| \| \|	Use dm-bufio for initial loading of the exceptions. Introduce a new function dm_bufio_forget that frees the given buffer. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
*	dm snapshot: prepare for switch to using dm-bufio	Mikulas Patocka	2014-01-14	1	-12/+14
\| \| \| \| \| \| \| \| \| \| \|	Change the functions get_exception, read_exception and insert_exceptions so that ps->area is passed as an argument. This patch doesn't change any functionality, but it refactors the code to allow for a cleaner switch over to using dm-bufio. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
*	dm snapshot: call destroy_work_on_stack() to pair with INIT_WORK_ONSTACK()	Chuansheng Liu	2014-01-07	1	-0/+1
\| \| \| \| \| \| \| \| \|	In case CONFIG_DEBUG_OBJECTS_WORK is defined, it is needed to call destroy_work_on_stack() which frees the debug object to pair with INIT_WORK_ONSTACK(). Signed-off-by: Liu, Chuansheng <chuansheng.liu@intel.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
*	dm snapshot: fix data corruption	Mikulas Patocka	2013-10-16	1	-6/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes a particular type of data corruption that has been encountered when loading a snapshot's metadata from disk. When we allocate a new chunk in persistent_prepare, we increment ps->next_free and we make sure that it doesn't point to a metadata area by further incrementing it if necessary. When we load metadata from disk on device activation, ps->next_free is positioned after the last used data chunk. However, if this last used data chunk is followed by a metadata area, ps->next_free is positioned erroneously to the metadata area. A newly-allocated chunk is placed at the same location as the metadata area, resulting in data or metadata corruption. This patch changes the code so that ps->next_free skips the metadata area when metadata are loaded in function read_exceptions. The patch also moves a piece of code from persistent_prepare_exception to a separate function skip_metadata to avoid code duplication. CVE-2013-4299 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Cc: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: workaround for a false positive lockdep warning	Mikulas Patocka	2013-09-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The kernel reports a lockdep warning if a snapshot is invalidated because it runs out of space. The lockdep warning was triggered by commit 0976dfc1d0cd80a4e9dfaf87bd87 ("workqueue: Catch more locking problems with flush_work()") in v3.5. The warning is false positive. The real cause for the warning is that the lockdep engine treats different instances of md->lock as a single lock. This patch is a workaround - we use flush_workqueue instead of flush_work. This code path is not performance sensitive (it is called only on initialization or invalidation), thus it doesn't matter that we flush the whole workqueue. The real fix for the problem would be to teach the lockdep engine to treat different instances of md->lock as separate locks. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 3.5+
*	md: Add in export.h for files using EXPORT_SYMBOL	Paul Gortmaker	2011-10-31	1	-0/+1
\| \| \| \| \| \| \| \|	These files were getting the defines for EXPORT_SYMBOL because device.h was including module.h. But we are going to put an end to that. So add the proper export.h include now. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	dm snapshot: style cleanups	Jonathan Brassow	2011-08-02	1	-2/+2
\| \| \| \| \| \| \|	Coding style cleanups. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
*	dm: use vzalloc	Joe Perches	2011-08-02	1	-2/+1
\| \| \| \| \| \| \|	Use vzalloc() instead of vmalloc()+memset(). Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm: suppress endian warnings	Alasdair G Kergon	2011-08-02	1	-33/+38
\| \| \| \| \| \| \|	Suppress sparse warnings about cpu_to_le32() by using __le32 types for on-disk data etc. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: flush disk cache when merging	Mikulas Patocka	2011-08-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes dm-snapshot flush disk cache when writing metadata for merging snapshot. Without cache flushing the disk may reorder metadata write and other data writes and there is a possibility of data corruption in case of power fault. Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm io: use fixed initial mempool size	Mikulas Patocka	2011-05-29	1	-12/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replace the arbitrary calculation of an initial io struct mempool size with a constant. The code calculated the number of reserved structures based on the request size and used a "magic" multiplication constant of 4. This patch changes it to reserve a fixed number - itself still chosen quite arbitrarily. Further testing might show if there is a better number to choose. Note that if there is no memory pressure, we can still allocate an arbitrary number of "struct io" structures. One structure is enough to process the whole request. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: persistent make metadata_wq multithreaded	Tejun Heo	2011-01-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	metadata_wq serves on-stack work items from chunk_io(). Even if multiple chunk_io() are simultaneously in progress, each is independent and queued only once, so multithreaded workqueue can be safely used. Switch metadata_wq to multithread and flush the work item instead of the workqueue in chunk_io(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm: convert workqueues to alloc_ordered	Tejun Heo	2011-01-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Convert all create[_singlethread]_work() users to the new alloc[_ordered]_workqueue(). This conversion is mechanical and doesn't introduce any behavior change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	workqueues: s/ON_STACK/ONSTACK/	Andrew Morton	2010-10-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Silly though it is, completions and wait_queue_heads use foo_ONSTACK (COMPLETION_INITIALIZER_ONSTACK, DECLARE_COMPLETION_ONSTACK, __WAIT_QUEUE_HEAD_INIT_ONSTACK and DECLARE_WAIT_QUEUE_HEAD_ONSTACK) so I guess workqueues should do the same thing. s/INIT_WORK_ON_STACK/INIT_WORK_ONSTACK/ s/INIT_DELAYED_WORK_ON_STACK/INIT_DELAYED_WORK_ONSTACK/ Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	dm: implement REQ_FLUSH/FUA support for bio-based dm	Tejun Heo	2010-09-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch converts bio-based dm to support REQ_FLUSH/FUA instead of now deprecated REQ_HARDBARRIER. * -EOPNOTSUPP handling logic dropped. * Preflush is handled as before but postflush is dropped and replaced with passing down REQ_FUA to member request_queues. This replaces one array wide cache flush w/ member specific FUA writes. * __split_and_process_bio() now calls __clone_and_map_flush() directly for flushes and guarantees all FLUSH bio's going to targets are zero ` length. * It's now guaranteed that all FLUSH bio's which are passed onto dm targets are zero length. bio_empty_barrier() tests are replaced with REQ_FLUSH tests. * Empty WRITE_BARRIERs are replaced with WRITE_FLUSHes. * Dropped unlikely() around REQ_FLUSH tests. Flushes are not unlikely enough to be marked with unlikely(). * Block layer now filters out REQ_FLUSH/FUA bio's if the request_queue doesn't support cache flushing. Advertise REQ_FLUSH \| REQ_FUA capability. * Request based dm isn't converted yet. dm_init_request_based_queue() resets flush support to 0 for now. To avoid disturbing request based dm code, dm->flush_error is added for bio based dm while requested based dm continues to use dm->barrier_error. Lightly tested linear, stripe, raid1, snap and crypt targets. Please proceed with caution as I'm not familiar with the code base. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: dm-devel@redhat.com Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
*	dm snapshot: persistent use define for disk header chunk size	Tomohiro Kusumi	2010-08-12	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	This patch fixes hard-coded value for the size of a chunk that includes disk header for persistent snapshot. It should be changed to existing macro NUM_SNAPSHOT_HDR_CHUNKS instead of using hard-coded value 1. Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@jp.fujitsu.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: persistent annotate work_queue as on stack	Mike Snitzer	2010-02-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	chunk_io() declares its 'struct mdata_req' on the stack and then initializes its 'struct work_struct' member. Annotate the initialization of this workqueue with INIT_WORK_ON_STACK to suppress a debugobjects warning seen when CONFIG_DEBUG_OBJECTS_WORK is enabled. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm exception store: add merge specific methods	Mikulas Patocka	2009-12-10	1	-2/+112
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add functions that decide how many consecutive chunks of snapshot to merge back into the origin next and to update the metadata afterwards. prepare_merge provides a pointer to the most recent still-to-be-merged chunk and returns how many previous ones are consecutive and can be processed together. commit_merge removes the nr_merged most-recent chunks permanently from the exception store. The number must not exceed that returned by prepare_merge. Introduce NUM_SNAPSHOT_HDR_CHUNKS to show where the snapshot header chunk is accounted for. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: move cow ref from exception store to snap core	Mike Snitzer	2009-12-10	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Store the reference to the snapshot cow device in the core snapshot code instead of each exception store. It can be accessed through the new function dm_snap_cow(). Exception stores should each now maintain a reference to their parent snapshot struct. This is cleaner and makes part of the forthcoming snapshot merge code simpler. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Cc: Mikulas Patocka <mpatocka@redhat.com>
*	dm snapshot: add allocated metadata to snapshot status	Mike Snitzer	2009-12-10	1	-6/+17
\| \| \| \| \| \| \| \| \| \| \| \|	Add number of sectors used by metadata to the end of the snapshot's status line. Renamed dm_exception_store_type's 'fraction_full' to 'usage'. Renamed arguments to be clearer about what is being returned. Also added 'metadata_sectors'. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: rename dm_snap_exception to dm_exception	Jon Brassow	2009-12-10	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	The exception structure is not necessarily just a snapshot element (especially after we pull it out of dm-snap.c). Renaming appropriately. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: avoid else clause in persistent_read_metadata	Jon Brassow	2009-12-10	1	-25/+21
\| \| \| \| \| \| \| \| \|	Minor code touch-up. We don't need the 'else'. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: use unsigned integer chunk size	Mikulas Patocka	2009-10-16	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use unsigned integer chunk size. Maximum chunk size is 512kB, there won't ever be need to use 4GB chunk size, so the number can be 32-bit. This fixes compiler failure on 32-bit systems with large block devices. Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: fix on disk chunk size validation	Mikulas Patocka	2009-09-04	1	-8/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix some problems seen in the chunk size processing when activating a pre-existing snapshot. For a new snapshot, the chunk size can either be supplied by the creator or a default value can be used. For an existing snapshot, the chunk size in the snapshot header on disk should always be used. If someone attempts to load an existing snapshot and has the 'default chunk size' option set, the kernel uses its default value even when it is incorrect for the snapshot being loaded. This patch ensures the correct on-disk value is always used. Secondly, when the code does use the chunk size stored on the disk it is prudent to revalidate it, so the code can exit cleanly if it got corrupted as happened in https://bugzilla.redhat.com/show_bug.cgi?id=461506 . Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: fix header corruption race on invalidation	Mikulas Patocka	2009-09-04	1	-10/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a persistent snapshot fills up, a race can corrupt the on-disk header which causes a crash on any future attempt to activate the snapshot (typically while booting). This patch fixes the race. When the snapshot overflows, __invalidate_snapshot is called, which calls snapshot store method drop_snapshot. It goes to persistent_drop_snapshot that calls write_header. write_header constructs the new header in the "area" location. Concurrently, an existing kcopyd job may finish, call copy_callback and commit_exception method, that goes to persistent_commit_exception. persistent_commit_exception doesn't do locking, relying on the fact that callbacks are single-threaded, but it can race with snapshot invalidation and overwrite the header that is just being written while the snapshot is being invalidated. The result of this race is a corrupted header being written that can lead to a crash on further reactivation (if chunk_size is zero in the corrupted header). The fix is to use separate memory areas for each. See the bug: https://bugzilla.redhat.com/show_bug.cgi?id=461506 Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: refactor zero_disk_area to use chunk_io	Mikulas Patocka	2009-09-04	1	-19/+7
\| \| \| \| \| \| \| \| \| \| \|	Refactor chunk_io to prepare for the fix in the following patch. Pass an area pointer to chunk_io and simplify zero_disk_area to use chunk_io. No functional change. Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: use barrier when writing exception store	Mikulas Patocka	2009-06-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Send barrier requests when updating the exception area. Exception area updates need to be ordered w.r.t. data writes, so that the writes are not reordered in hardware disk cache. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	block: Do away with the notion of hardsect_size	Martin K. Petersen	2009-05-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Until now we have had a 1:1 mapping between storage device physical block size and the logical block sized used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512-bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	dm snapshot: persistent fix dtr cleanup	Jonathan Brassow	2009-04-02	1	-5/+15
\| \| \| \| \| \| \| \| \| \| \|	The persistent exception store destructor does not properly account for all conditions in which it can be called. If it is called after 'ctr' but before 'read_metadata' (e.g. if something else in 'snapshot_ctr' fails) then it will attempt to free areas of memory that haven't been allocated yet. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: move status to exception store	Jonathan Brassow	2009-04-02	1	-4/+12
\| \| \| \| \| \| \| \| \| \| \|	Let the exception store types print out their status through the new API, rather than having the snapshot code do it. Adjust the buffer position to allow for the preceding DMEMIT in the arguments to type->status(). Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: remove dm_snap header use	Jonathan Brassow	2009-04-02	1	-29/+27
\| \| \| \| \| \| \|	Move useful functions out of dm-snap.h and stop using dm-snap.h. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm exception store: move cow pointer	Jonathan Brassow	2009-04-02	1	-5/+5
\| \| \| \| \| \| \|	Move COW device from snapshot to exception store. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm exception store: move chunk_fields	Jonathan Brassow	2009-04-02	1	-23/+24
\| \| \| \| \| \| \|	Move chunk fields from snapshot to exception store. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm exception store: introduce registry	Jonathan Brassow	2009-04-02	1	-10/+57
\| \| \| \| \| \| \|	Move exception stores into a registry. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm exception store: separate type from instance	Jonathan Brassow	2009-04-02	1	-6/+7
\| \| \| \| \| \| \|	Introduce struct dm_exception_store_type. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: extend exception store functions	Jonathan Brassow	2009-01-06	1	-16/+26
\| \| \| \| \| \| \| \| \|	Supply dm_add_exception as a callback to the read_metadata function. Add a status function ready for a later patch and name the functions consistently. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
*	dm snapshot: split out exception store implementations	Alasdair G Kergon	2009-01-06	1	-0/+694
	Move the existing snapshot exception store implementations out into separate files. Later patches will place these behind a new interface in preparation for alternative implementations. Signed-off-by: Alasdair G Kergon <agk@redhat.com>