summaryrefslogtreecommitdiffstats
path: root/include/block
Commit message (Collapse)AuthorAgeFilesLines
* block: extract bdrv_setup_io_funcs()Stefan Hajnoczi2015-04-281-0/+8
| | | | | | | | Move the code to install coroutine and aio emulation function pointers in a BlockDriver to its own function. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: add bdrv_set_dirty()/bdrv_reset_dirty() to block_int.hStefan Hajnoczi2015-04-281-0/+4
| | | | | | | | | The dirty bitmap functions are called from the block I/O processing code. Make them visible to block_int.h users so they can be used outside block.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Ensure consistent bitmap function prototypesJohn Snow2015-04-281-6/+5
| | | | | | | | | | | | We often don't need the BlockDriverState for functions that operate on bitmaps. Remove it. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-15-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qmp: add block-dirty-bitmap-clearJohn Snow2015-04-281-0/+1
| | | | | | | | | | | | | | | | | Add bdrv_clear_dirty_bitmap and a matching QMP command, qmp_block_dirty_bitmap_clear that enables a user to reset the bitmap attached to a drive. This allows us to reset a bitmap in the event of a full drive backup. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-12-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qmp: Add support of "dirty-bitmap" sync mode for drive-backupJohn Snow2015-04-282-0/+3
| | | | | | | | | | | | | | | | For "dirty-bitmap" sync mode, the block job will iterate through the given dirty bitmap to decide if a sector needs backup (backup all the dirty clusters and skip clean ones), just as allocation conditions of "top" sync mode. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-11-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Add bitmap successorsJohn Snow2015-04-281-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A bitmap successor is an anonymous BdrvDirtyBitmap that is intended to be created just prior to a sensitive operation (e.g. Incremental Backup) that can either succeed or fail, but during the course of which we still want a bitmap tracking writes. On creating a successor, we "freeze" the parent bitmap which prevents its deletion, enabling, anonymization, or creating a bitmap with the same name. On success, the parent bitmap can "abdicate" responsibility to the successor, which will inherit its name. The successor will have been tracking writes during the course of the backup operation. The parent will be safely deleted. On failure, we can "reclaim" the successor from the parent, unifying them such that the resulting bitmap describes all writes occurring since the last successful backup, for instance. Reclamation will thaw the parent, but not explicitly re-enable it. BdrvDirtyBitmap operations that target a single bitmap are protected by assertions that the bitmap is not frozen and/or disabled. BdrvDirtyBitmap operations that target a group of bitmaps, such as bdrv_{set,reset}_dirty will ignore frozen/disabled drives with a conditional instead. Internal functions that enable/disable dirty bitmaps have assertions added to them to prevent modifying frozen bitmaps. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-10-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Add bitmap disabled statusJohn Snow2015-04-281-0/+3
| | | | | | | | | | | | | | | | | | | | | Add a status indicating the enabled/disabled state of the bitmap. A bitmap is by default enabled, but you can lock the bitmap into a read-only state by setting disabled = true. A previous version of this patch added a QMP interface for changing the state of the bitmap, but it has since been removed for now until a use case emerges where this state must be revealed to the user. The disabled state WILL be used internally for bitmap migration and bitmap persistence. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-9-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Introduce bdrv_dirty_bitmap_granularity()John Snow2015-04-281-0/+1
| | | | | | | | | | | | | | This returns the granularity (in bytes) of dirty bitmap, which matches the QMP interface and the existing query interface. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-6-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-removeJohn Snow2015-04-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new command pair is added to manage a user created dirty bitmap. The dirty bitmap's name is mandatory and must be unique for the same device, but different devices can have bitmaps with the same names. The granularity is an optional field. If it is not specified, we will choose a default granularity based on the cluster size if available, clamped to between 4K and 64K to mirror how the 'mirror' code was already choosing granularity. If we do not have cluster size info available, we choose 64K. This code has been factored out into a helper shared with block/mirror. This patch also introduces the 'block_dirty_bitmap_lookup' helper, which takes a device name and a dirty bitmap name and validates the lookup, returning NULL and setting errp if there is a problem with either field. This helper will be re-used in future patches in this series. The types added to block-core.json will be re-used in future patches in this series, see: 'qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable}' Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-5-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qmp: Ensure consistent granularity typeJohn Snow2015-04-282-2/+2
| | | | | | | | | | | | | We treat this field with a variety of different types everywhere in the code. Now it's just uint32_t. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-4-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qapi: Add optional field "name" to block dirty bitmapFam Zheng2015-04-281-1/+6
| | | | | | | | | | | | | | | | | | | | | | This field will be set for user created dirty bitmap. Also pass in an error pointer to bdrv_create_dirty_bitmap, so when a name is already taken on this BDS, it can report an error message. This is not global check, two BDSes can have dirty bitmap with a common name. Implemented bdrv_find_dirty_bitmap to find a dirty bitmap by name, will be used later when other QMP commands want to reference dirty bitmap by name. Add bdrv_dirty_bitmap_make_anon. This unsets the name of dirty bitmap. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-3-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qmp: fill in the image field in BlockDeviceInfoAlberto Garcia2015-04-282-2/+2
| | | | | | | | | | | | | | | | | | | The image field in BlockDeviceInfo is supposed to contain an ImageInfo object. However that is being filled in by bdrv_query_info(), not by bdrv_block_device_info(), which is where BlockDeviceInfo is actually created. Anyone calling bdrv_block_device_info() directly will get a null image field. As a consequence of this, the HMP command 'info block -n -v' crashes QEMU. This patch moves the code that fills in that field from bdrv_query_info() to bdrv_block_device_info(). Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 1429271563-3765-1-git-send-email-berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: add bdrv_get_device_or_node_name()Alberto Garcia2015-04-281-0/+1
| | | | | | | | | | | | This function gets the device name associated with a BlockDriverState, or its node name if the device name is empty. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 4fa30aa8d61d9052ce266fd5429a59a14e941255.1428485266.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* blockjob: Allow nested pauseFam Zheng2015-04-281-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes block_job_pause to increase the pause counter and block_job_resume to decrease it. The counter will allow calling block_job_pause/block_job_resume unconditionally on a job when we need to suspend the IO temporarily. From now on, each block_job_resume must be paired with a block_job_pause to keep the counter balanced. The user pause from QMP or HMP will only trigger block_job_pause once until it's resumed, this is achieved by adding a user_paused flag in BlockJob. One occurrence of block_job_resume in mirror_complete is replaced with block_job_enter which does what is necessary. In block_job_cancel, the cancel flag is good enough to instruct coroutines to quit loop, so use block_job_enter to replace the unpaired block_job_resume. Upon block job IO error, user is notified about the entering to the pause state, so this pause belongs to user pause, set the flag accordingly and expect a matching QMP resume. [Extended doc comments as suggested by Paolo Bonzini <pbonzini@redhat.com>. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1428069921-2957-2-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* AioContext: acquire/release AioContext during aio_pollPaolo Bonzini2015-04-281-6/+7
| | | | | | | | | | | This is the first step in pushing down acquire/release, and will let rfifolock drop the contention callback feature. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424449612-18215-3-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* aio-posix: move pollfds to thread-local storagePaolo Bonzini2015-04-281-3/+0
| | | | | | | | | | | | | | | | By using thread-local storage, aio_poll can stop using global data during g_poll_ns. This will make it possible to drop callbacks from rfifolock. [Moved npfd = 0 assignment to end of walking_handlers region as suggested by Paolo. This resolves the assert(npfd == 0) assertion failure in pollfds_cleanup(). --Stefan] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424449612-18215-2-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* nbd: Set block size to BDRV_SECTOR_SIZEMax Reitz2015-03-181-2/+2
| | | | | | Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <1424887718-10800-13-git-send-email-mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* nbd: Fix potential signed overflow issuesMax Reitz2015-03-181-2/+2
| | | | | | Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <1424887718-10800-11-git-send-email-mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* nbd: Handle blk_getlength() failureMax Reitz2015-03-181-1/+2
| | | | | | Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <1424887718-10800-9-git-send-email-mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* block: Drop bdrv_findFam Zheng2015-03-161-1/+0
| | | | | | | | | All callers are converted, so drop it. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1425296209-1476-5-git-send-email-famz@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* block: add bdrv functions for geometry and blocksizeEkaterina Tumanova2015-03-102-0/+28
| | | | | | | | | | | | Add driver functions for geometry and blocksize detection Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424087278-49393-2-git-send-email-tumanova@linux.vnet.ibm.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Allow creation with refcount order != 4Max Reitz2015-03-101-0/+1
| | | | | | | | | | | Add a creation option to qcow2 for setting the refcount order of images to be created, and respect that option's value. This breaks some test outputs, fix them. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* sheepdog: selectable object size supportTeruaki Ishizaki2015-03-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Previously, qemu block driver of sheepdog used hard-coded VDI object size. This patch enables users to handle VDI object size. When you start qemu, you don't need to specify additional command option. But when you create the VDI which doesn't have default object size with qemu-img command, you specify object_size option. If you want to create a VDI of 8MB object size, you need to specify following command option. # qemu-img create -o object_size=8M sheepdog:test1 100M In addition, when you don't specify qemu-img command option, a default value of sheepdog cluster is used for creating VDI. # qemu-img create sheepdog:test2 100M Signed-off-by: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp> Acked-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* coroutine: Clean up qemu_coroutine_enter()Kevin Wolf2015-03-091-0/+1
| | | | | | | | | | | | | qemu_coroutine_enter() is now the only user of coroutine_swap(). Both functions are short, so inline it. Also, using COROUTINE_YIELD is now even more confusing because this code is never called during qemu_coroutine_yield() any more. In fact, this value is never read back, so we can just introduce a new COROUTINE_ENTER which documents the purpose of the task switch better. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
* block: Forbid bdrv_set_aio_context outside BQLFam Zheng2015-02-271-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Even if the caller has both the old and the new AioContext's, there can be a deadlock, due to the leading bdrv_drain_all. Suppose there are four io threads (A, B, A0, B0) with A and B owning a BDS for each (bs_a, bs_b); Now A wants to move bs_a to iothread A0, and B wants to move bs_b to B0, at the same time: iothread A iothread B -------------------------------------------------------------------------- aio_context_acquire(A0) /* OK */ aio_context_acquire(B0) /* OK */ bdrv_set_aio_context(bs_a, A0) bdrv_set_aio_context(bs_b, B0) -> bdrv_drain_all() -> bdrv_drain_all() -> acquire A /* OK */ -> acquire A /* blocked */ -> acquire B /* blocked */ -> acquire B ... ... Deadlock happens because A is waiting for B, and B is waiting for A. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1423969591-23646-2-git-send-email-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* block: Remove "growable" from BDSMax Reitz2015-02-161-3/+0
| | | | | | | | | | | | Now that request clamping is done in the BlockBackend, the "growable" field can be removed from the BlockDriverState. All BDSs are now treated as being "growable" (that is, they are allowed to grow; they are not necessarily actually able to). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1423162705-32065-16-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* block: Add Error parameter to bdrv_find_protocol()Max Reitz2015-02-161-1/+2
| | | | | | | | | | | | | The argument given to bdrv_find_protocol() is just a file name, which makes it difficult for the caller to reconstruct what protocol bdrv_find_protocol() was hoping to find. This patch adds an Error parameter to that function to solve this issue. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1423162705-32065-4-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* nbd: Drop BDS backpointerMax Reitz2015-02-161-1/+0
| | | | | | | | | | | | | | Before this patch, the "opaque" pointer in an NBD BDS points to a BDRVNBDState, which contains an NbdClientSession object, which in turn contains a pointer to the BDS. This pointer may become invalid due to bdrv_swap(), so drop it, and instead pass the BDS directly to the nbd-client.c functions which then retrieve the NbdClientSession object from there. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1423256778-3340-2-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* block: New bdrv_add_key(), convert monitor to use itMarkus Armbruster2015-02-061-0/+1
| | | | | | | | Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1422524221-8566-4-git-send-email-armbru@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* block: introduce BDRV_REQUEST_MAX_SECTORSPeter Lieven2015-02-061-0/+3
| | | | | | | | | | | | | | | | we check and adjust request sizes at several places with sometimes inconsistent checks or default values: INT_MAX INT_MAX >> BDRV_SECTOR_BITS UINT_MAX >> BDRV_SECTOR_BITS SIZE_MAX >> BDRV_SECTOR_BITS This patches introdocues a macro for the maximal allowed sectors per request and uses it at several places. Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* nbd: Improve error messagesMax Reitz2015-02-061-1/+1
| | | | | | | | | | | | | | | | | | | This patch makes use of the Error object for nbd_receive_negotiate() so that errors during negotiation look nicer. Furthermore, this patch adds an additional error message if the received magic was wrong, but would be correct for the other protocol version, respectively: So if an export name was specified, but the NBD server magic corresponds to an old handshake, this condition is explicitly signaled to the user, and vice versa. As these messages are now part of the "Could not open image" error message, additional filtering has to be employed in iotest 083, which this patch does as well. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: add event when disk usage exceeds thresholdFrancesco Romani2015-02-062-0/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Managing applications, like oVirt (http://www.ovirt.org), make extensive use of thin-provisioned disk images. To let the guest run smoothly and be not unnecessarily paused, oVirt sets a disk usage threshold (so called 'high water mark') based on the occupation of the device, and automatically extends the image once the threshold is reached or exceeded. In order to detect the crossing of the threshold, oVirt has no choice but aggressively polling the QEMU monitor using the query-blockstats command. This lead to unnecessary system load, and is made even worse under scale: deployments with hundreds of VMs are no longer rare. To fix this, this patch adds: * A new monitor command `block-set-write-threshold', to set a mark for a given block device. * A new event `BLOCK_WRITE_THRESHOLD', to report if a block device usage exceeds the threshold. * A new `write_threshold' field into the `BlockDeviceInfo' structure, to report the configured threshold. This will allow the managing application to use smarter and more efficient monitoring, greatly reducing the need of polling. [Updated qemu-iotests 067 output to add the new 'write_threshold' property. --Stefan] [Changed g_assert_false() to !g_assert() to fix the build on older glib versions. --Kevin] Signed-off-by: Francesco Romani <fromani@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1421068273-692-1-git-send-email-fromani@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: add accounting for merged requestsPeter Lieven2015-02-061-0/+3
| | | | | | | Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: update string sizes for filename,backing_file,exact_filenameJeff Cody2015-01-231-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | The string field entries 'filename', 'backing_file', and 'exact_filename' in the BlockDriverState struct are defined as 1024 bytes. However, many places that use these values accept a maximum of PATH_MAX bytes, so we have a mixture of 1024 byte and PATH_MAX byte allocations. This patch makes the BlockDriverStruct field string sizes match usage. This patch also does a few fixes related to the size that needs to happen now: * the block qapi driver is updated to use PATH_MAX bytes * the qcow and qcow2 drivers have an additional safety check * the block vvfat driver is updated to use PATH_MAX bytes for the size of backing_file, for systems where PATH_MAX is < 1024 bytes. * qemu-img uses PATH_MAX rather than 1024. These instances were not changed to be dynamically allocated, however, as the extra temporary 3K in stack usage for qemu-img does not seem worrisome. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell2015-01-141-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mostly bugfixes and cleanups from qemu-devel. Yet another small patch from the record/replay series, and a few SCSI and i386 patches as well. # gpg: Signature made Wed 14 Jan 2015 09:39:14 GMT using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: cpus: consistently use QEMU_CLOCK_VIRTUAL_RT for icount_warp_rt timer qemu-timer: rename timer_init to timer_init_tl scsi: fix cancellation when I/O was completed but DMA was not. rules.mak: Fix module build hw/scsi/lsi53c895a: add support for additional diag / debug registers qemu-common.h: optimise muldiv64 if int128 is available target-i386: do not memcpy in and out of xmm_regs target-i386: fix movntsd on big-endian hosts vl.c: fix regression when reading memory size from config file vl: Don't silently change topology when all -smp options were set vl: fix max_cpus check vl: Avoid unnecessary 'if' nesting 9pfs: changed to use event_notifier instead of qemu_pipe vl.c: fix regression when reading machine type from config file char: restore stdio echo on resume from suspend. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * qemu-timer: rename timer_init to timer_init_tlPaolo Bonzini2015-01-141-1/+1
| | | | | | | | | | | | | | timer_init is not called that often. Free the name for an equivalent of timer_new. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | block: Split BLOCK_OP_TYPE_COMMIT to BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET}Fam Zheng2015-01-131-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET, block-commit involves two asymmetric devices. This change is not user-visible (yet), because commit only works with device names. But once we enable backing reference in blockdev-add, or specifying node-name in block-commit command, we don't want the user to start two commit jobs on the same backing chain, which will corrupt things because of the final bdrv_swap. Before we have per category blockers, splitting this type is still better. [Resolved virtio-blk dataplane conflict by replacing BLOCK_OP_TYPE_COMMIT with both BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET}. They are safe since the block job runs in the same AioContext as the dataplane IOThread. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* | coroutine: drop qemu_coroutine_adjust_pool_sizePaolo Bonzini2015-01-131-10/+0
| | | | | | | | | | | | | | | | | | This is not needed anymore. The new TLS-based algorithm is adaptive. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1417518350-6167-7-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* | block: fix spoiling all dirty bitmaps by mirror and migrationVladimir Sementsov-Ogievskiy2015-01-131-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mirror and migration use dirty bitmaps for their purposes, and since commit [block: per caller dirty bitmap] they use their own bitmaps, not the global one. But they use old functions bdrv_set_dirty and bdrv_reset_dirty, which change all dirty bitmaps. Named dirty bitmaps series by Fam and Snow are affected: mirroring and migration will spoil all (not related to this mirroring or migration) named dirty bitmaps. This patch fixes this by adding bdrv_set_dirty_bitmap and bdrv_reset_dirty_bitmap, which change concrete bitmap. Also, to prevent such mistakes in future, old functions bdrv_(set,reset)_dirty are made static, for internal block usage. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com> CC: John Snow <jsnow@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: Denis V. Lunev <den@openvz.org> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1417081246-3593-1-git-send-email-vsementsov@parallels.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* | block: JSON filenames and relative backing filesMax Reitz2015-01-131-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using a relative backing file name, qemu needs to know the directory of the top image file. For JSON filenames, such a directory cannot be easily determined (e.g. how do you determine the directory of a qcow2 BDS directly on top of a quorum BDS?). Therefore, do not allow relative filenames for the backing file of BDSs only having a JSON filename. Furthermore, BDS::exact_filename should be used whenever possible. If BDS::filename is not equal to BDS::exact_filename, the former will always be a JSON object. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* | block: Get full backing filename from stringMax Reitz2015-01-131-0/+3
|/ | | | | | | | | | | | | | Introduce bdrv_get_full_backing_filename_from_filename(), a function which takes the name of the backed file and a potentially relative backing filename to produce the full (absolute) backing filename. Use this function from bdrv_get_full_backing_filename(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: drop unused bdrv_clear_incoming_migration_all() prototypeStefan Hajnoczi2014-12-121-2/+0
| | | | | | | | | | The bdrv_clear_incoming_migration_all() function has not existed since commit 7ea2d269cb84ca7a2f4b7c3735634176f7c1dc35 ("block/migration: Disable cache invalidate for incoming migration"). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1418212937-22222-1-git-send-email-stefanha@redhat.com
* vmdk: Fix error for JSON descriptor file namesMax Reitz2014-12-121-0/+1
| | | | | | | | | | | | | | | If vmdk blindly tries to use path_combine() using bs->file->filename as the base file name, this will result in a bad error message for JSON file names when calling bdrv_open(). It is better to only try bs->file->exact_filename; if that is empty, bs->file->filename will be useless for path_combine() and an error should be emitted (containing bs->file->filename because desc_file_path (which is bs->file->exact_filename) is empty). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1417615043-26174-2-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* block: Make essential BlockDriver objects publicMax Reitz2014-12-101-0/+8
| | | | | | | | | | | | | | There are some block drivers which are essential to QEMU and may not be removed: These are raw, file and qcow2 (as the default non-raw format). Make their BlockDriver objects public so they can be directly referenced throughout the block layer without needing to call bdrv_find_format() and having to deal with an error at runtime, while the real problem occurred during linking (where raw, file or qcow2 were not linked into qemu). Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* raw: Prohibit dangerous writes for probed imagesKevin Wolf2014-12-101-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the user neglects to specify the image format, QEMU probes the image to guess it automatically, for convenience. Relying on format probing is insecure for raw images (CVE-2008-2004). If the guest writes a suitable header to the device, the next probe will recognize a format chosen by the guest. A malicious guest can abuse this to gain access to host files, e.g. by crafting a QCOW2 header with backing file /etc/shadow. Commit 1e72d3b (April 2008) provided -drive parameter format to let users disable probing. Commit f965509 (March 2009) extended QCOW2 to optionally store the backing file format, to let users disable backing file probing. QED has had a flag to suppress probing since the beginning (2010), set whenever a raw backing file is assigned. All of these additions that allow to avoid format probing have to be specified explicitly. The default still allows the attack. In order to fix this, commit 79368c8 (July 2010) put probed raw images in a restricted mode, in which they wouldn't be able to overwrite the first few bytes of the image so that they would identify as a different image. If a write to the first sector would write one of the signatures of another driver, qemu would instead zero out the first four bytes. This patch was later reverted in commit 8b33d9e (September 2010) because it didn't get the handling of unaligned qiov members right. Today's block layer that is based on coroutines and has qiov utility functions makes it much easier to get this functionality right, so this patch implements it. The other differences of this patch to the old one are that it doesn't silently write something different than the guest requested by zeroing out some bytes (it fails the request instead) and that it doesn't maintain a list of signatures in the raw driver (it calls the usual probe function instead). Note that this change doesn't introduce new breakage for false positive cases where the guest legitimately writes data into the first sector that matches the signatures of an image format (e.g. for nested virt): These cases were broken before, only the failure mode changes from corruption after the next restart (when the wrong format is probed) to failing the problematic write request. Also note that like in the original patch, the restrictions only apply if the image format has been guessed by probing. Explicitly specifying a format allows guests to write anything they like. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1416497234-29880-8-git-send-email-kwolf@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Read only one sector for format probingKevin Wolf2014-12-101-0/+2
| | | | | | | | | | | | | | The only image format driver that even potentially accesses anything after 512 bytes in its bdrv_probe() implementation is VMDK, which reads a plain-text descriptor file. In practice, the field it's looking for seems to come first and will be well within the first 512 bytes, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1416497234-29880-7-git-send-email-kwolf@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* nbd: Change external interface to BlockBackendMax Reitz2014-12-101-4/+3
| | | | | | | | | | | Substitute BlockDriverState by BlockBackend in every globally visible function provided by nbd. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1416309679-333-5-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Add bdrv_get_node_nameFam Zheng2014-12-101-0/+1
| | | | | | | | | | | This returns the node name of a BDS. Remove the TODO comment and expect the callers to be explicit. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Add bdrv_next_nodeFam Zheng2014-12-101-0/+1
| | | | | | | | | | | Similar to bdrv_next, this traverses through graph_bdrv_states. Will be useful to enumerate all the named nodes. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* Merge remote-tracking branch ↵Peter Maydell2014-11-111-1/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'remotes/mjt/tags/pull-trivial-patches-2014-11-11' into staging trivial patches for 2014-11-11 # gpg: Signature made Tue 11 Nov 2014 14:38:39 GMT using RSA key ID A4C3D7DB # gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" # gpg: aka "Michael Tokarev <mjt@corpit.ru>" # gpg: aka "Michael Tokarev <mjt@debian.org>" * remotes/mjt/tags/pull-trivial-patches-2014-11-11: block: Fix comment for bdrv_co_get_block_status sysbus: Correct SYSTEM_BUS(obj) defines target-i386: cpu: keeping function parameters alignment on new line xen-hvm: Remove redundant variable 'xstate' coroutine-sigaltstack: Change jmp_buf to sigjmp_buf pc-bios: petalogix-s3adsp1800.dtb: Use 'xlnx, xps-ethernetlite-2.00.a' instead of 'xlnx, xps-ethernetlite-2.00.b' gdbstub: Add a missing case of signal number translation in gdbstub numa: make 'info numa' take into account hotplugged memory slirp/smbd: modify/set several parameters in generated smbd.conf qemu-doc.texi: fix typos in x509 examples icc_bus: fix typo ICC_BRIGDE -> ICC_BRIDGE Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
OpenPOWER on IntegriCloud