summaryrefslogtreecommitdiffstats
path: root/block_int.h
Commit message (Collapse)AuthorAgeFilesLines
* block: change discard to co_discardPaolo Bonzini2011-10-211-2/+0
| | | | | | | | | | | | Since coroutine operation is now mandatory, convert both bdrv_discard implementations to coroutines. For qcow2, this means taking the lock around the operation. raw-posix remains synchronous. The bdrv_discard callback is then unused and can be eliminated. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: change flush to co_flushPaolo Bonzini2011-10-211-1/+0
| | | | | | | | | | | | Since coroutine operation is now mandatory, convert all bdrv_flush implementations to coroutines. For qcow2, this means taking the lock. Other implementations are simpler and just forward bdrv_flush to the underlying protocol, so they can avoid the lock. The bdrv_flush callback is then unused and can be eliminated. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: add bdrv_co_discard and bdrv_aio_discard supportPaolo Bonzini2011-10-211-2/+7
| | | | | | | This similarly adds support for coroutine and asynchronous discard. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: unify flush implementationsPaolo Bonzini2011-10-211-0/+1
| | | | | | | | | Add coroutine support for flush and apply the same emulation that we already do for read/write. bdrv_aio_flush is simplified to always go through a coroutine. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Keep track of devices' I/O statusLuiz Capitulino2011-10-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds support to the BlockDriverState type to keep track of devices' I/O status. There are three possible status: BDRV_IOS_OK (no error), BDRV_IOS_ENOSPC (no space error) and BDRV_IOS_FAILED (any other error). The distinction between no space and other errors is important because a management application may want to watch for no space in order to extend the space assigned to the VM and put it to run again. Qemu devices supporting the I/O status feature have to enable it explicitly by calling bdrv_iostatus_enable() _and_ have to be configured to stop the VM on errors (ie. werror=stop|enospc or rerror=stop). In case of multiple errors being triggered in sequence only the first one is stored. The I/O status is always reset to BDRV_IOS_OK when the 'cont' command is issued. Next commits will add support to some devices and extend the query-block/info block commands to return the I/O status information. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Move BlockConf & friends from block_int.h to block.hMarkus Armbruster2011-09-121-35/+0
| | | | | | | | | | | It's convenience stuff for block device models, so block.h isn't the ideal home either, but better than block_int.h. Permits moving some #include "block_int.h" from device model .h into .c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Drop BlockDriverState member removableMarkus Armbruster2011-09-121-1/+0
| | | | | | | It's a confused mess (see previous commit). No users remain. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Rename bdrv_set_locked() to bdrv_lock_medium()Markus Armbruster2011-09-121-1/+1
| | | | | | | While there, make the locked parameter bool. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Drop medium lock tracking, ask device models insteadMarkus Armbruster2011-09-121-1/+0
| | | | | | | | Requires new BlockDevOps member is_medium_locked(). Implement for IDE and SCSI CD-ROMs. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Drop tray status tracking, no longer usedMarkus Armbruster2011-09-121-1/+0
| | | | | | | Commit 4be9762a is now completely redone. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Declare qemu_blockalign() in block.h, not block_int.hMarkus Armbruster2011-09-061-2/+0
| | | | | | | | Device models should be able to use it without an unclean include of block_int.h. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Leave tracking media change to device modelsMarkus Armbruster2011-09-061-1/+0
| | | | | | | hw/fdc.c is the only one that cares. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Split change_cb() into change_media_cb(), resize_cb()Markus Armbruster2011-09-061-3/+0
| | | | | | | Multiplexing callbacks complicates matters needlessly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Generalize change_cb() to BlockDevOpsMarkus Armbruster2011-09-061-3/+2
| | | | | | | So we can more easily add device model callbacks. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Attach non-qdev devices as wellMarkus Armbruster2011-09-061-1/+2
| | | | | | | | | | | | | For now, this just protects against programming errors like having the same drive back multiple non-qdev devices, or untimely bdrv_delete(). Later commits will add other interesting uses. While there, rename BlockDriverState member peer to dev, bdrv_attach() to bdrv_attach_dev(), bdrv_detach() to bdrv_detach_dev(), and bdrv_get_attached() to bdrv_get_attached_dev(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: latency accountingChristoph Hellwig2011-08-261-0/+2
| | | | | | | | | Account the total latency for read/write/flush requests. This allows management tools to average it based on a snapshot of the nr ops counters and allow checking for SLAs or provide statistics. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: explicit I/O accountingChristoph Hellwig2011-08-251-5/+2
| | | | | | | | | | | | | | | | | | | | | | | Decouple the I/O accounting from bdrv_aio_readv/writev/flush and make the hardware models call directly into the accounting helpers. This means: - we do not count internal requests from image formats in addition to guest originating I/O - we do not double count I/O ops if the device model handles it chunk wise - we only account I/O once it actuall is done - can extent I/O accounting to synchronous or coroutine I/O easily - implement I/O latency tracking easily (see the next patch) I've conveted the existing device model callers to the new model, device models that are using synchronous I/O and weren't accounted before haven't been updated yet. Also scsi hasn't been converted to the end-to-end accounting as I want to defer that after the pending scsi layer overhaul. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: include flush requests in info blockstatsChristoph Hellwig2011-08-231-0/+1
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Add bdrv_co_readv/writevKevin Wolf2011-08-021-0/+6
| | | | | | | | | Add new block driver callbacks bdrv_co_readv/writev, which work on a QEMUIOVector like bdrv_aio_*, but don't need a callback. The function may only be called inside a coroutine, so a block driver implementing this interface can yield instead of blocking during I/O. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Make BlockDriver method bdrv_eject() return voidMarkus Armbruster2011-08-011-1/+1
| | | | | | | | | | | Callees always return 0, except for FreeBSD's cdrom_eject(), which returns -ENOTSUP when the device is in a terminally wedged state. The only caller is bdrv_eject(), and it maps -ENOTSUP to 0 since commit 4be9762a. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Make BlockDriver method bdrv_set_locked() return voidMarkus Armbruster2011-08-011-1/+1
| | | | | | | | | | | The only caller is bdrv_set_locked(), and it ignores the value. Callees always return 0, except for FreeBSD's cdrom_set_locked(), which returns -ENOTSUP when the device is in a terminally wedged state. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: add bdrv_get_allocated_file_size() operationFam Zheng2011-07-191-0/+1
| | | | | | | | | | | | | | | qemu-img.c wants to count allocated file size of image. Previously it counts a single bs->file by 'stat' or Window API. As VMDK introduces multiple file support, the operation becomes format specific with platform specific meanwhile. The functions are moved to block/raw-{posix,win32}.c and qemu-img.c calls bdrv_get_allocated_file_size to count the bs. And also added VMDK code to count his own extents. Signed-off-by: Fam Zheng <famcool@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* VMDK: create different subformatsFam Zheng2011-07-191-0/+1
| | | | | | | | | | | | | Add create option 'format', with enums: monolithicSparse monolithicFlat twoGbMaxExtentSparse twoGbMaxExtentFlat Each creates a subformat image file. The default is monolithicSparse. Signed-off-by: Fam Zheng <famcool@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* Replaced tabs with spaces in block.h and block_int.hDevin Nakamura2011-06-151-2/+2
| | | | | Signed-off-by: Devin Nakamura <devin122@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Remove type hint, it's guest matter, doesn't belong hereMarkus Armbruster2011-05-191-1/+0
| | | | | | | | No users of bdrv_get_type_hint() left. bdrv_set_type_hint() can make the media removable by side effect. Make that explicit. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* Add flag to indicate external users to block deviceMarcelo Tosatti2011-02-071-0/+1
| | | | | | | | | | Certain operations such as drive_del or resize cannot be performed while external users (eg. block migration) reference the block device. Add a flag to indicate that. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: tell drivers about an image resizeChristoph Hellwig2011-01-311-1/+4
| | | | | | | | Extend the change_cb callback with a reason argument, and use it to tell drivers about size changes. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qed: Add QEMU Enhanced Disk image formatStefan Hajnoczi2010-12-171-0/+1
| | | | | | | | This patch introduces the qed on-disk layout and implements image creation. Later patches add read/write and other functionality. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: add discard supportChristoph Hellwig2010-12-171-1/+6
| | | | | | | | | Add a new bdrv_discard method to free blocks in a mapping image, and a new drive property to set the granularity for these discard. If no discard granularity support is set discard support is disabled. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qemu-img: Deprecate obsolete -6 and -e optionsJes Sorensen2010-12-141-1/+0
| | | | | | | | | If -6 or -e is specified, an error message is printed and we exit. It does not print help() to avoid the error message getting lost in the noise. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* Add bootindex parameter to net/block/fd deviceGleb Natapov2010-12-111-1/+3
| | | | | | | | | If bootindex is specified on command line a string that describes device in firmware readable way is added into sorted list. Later this list will be passed into firmware to control boot order. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* block: Allow bdrv_flush to return errorsKevin Wolf2010-11-041-1/+1
| | | | | | | | This changes bdrv_flush to return 0 on success and -errno in case of failure. It's a requirement for implementing proper error handle in users of bdrv_flush. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
* Copy snapshots out of QCOW2 diskedison2010-10-221-0/+2
| | | | | | | | | | | | | | In order to backup snapshots, created from QCOW2 iamge, we want to copy snapshots out of QCOW2 disk to a seperate storage. The following patch adds a new option in "qemu-img": qemu-img convert -f qcow2 -O qcow2 -s snapshot_name src_img bck_img. Right now, it only supports to copy the full snapshot, delta snapshot is on the way. Changes from V1: all the comments from Kevin are addressed: Add read-only checking Fix coding style Change the name from bdrv_snapshot_load to bdrv_snapshot_load_tmp Signed-off-by: Disheng Su <edison@cloud.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* Revert "Make default invocation of block drivers safer (v3)"Anthony Liguori2010-09-081-1/+0
| | | | | | | | | | | | | This reverts commit 79368c81bf8cf93864d7afc88b81b05d8f0a2c90. Conflicts: block.c I haven't been able to come up with a solution yet for the corruption caused by unaligned requests from the IDE disk so revert until a solution can be written. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* block: Change bdrv_eject() not to drop the imageMarkus Armbruster2010-08-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bdrv_eject() gets called when a device model opens or closes the tray. If the block driver implements method bdrv_eject(), that method gets called. Drivers host_cdrom implements it, and it opens and closes the physical tray, and nothing else. When a device model opens, then closes the tray, media changes only if the user actively changes the physical media while the tray is open. This is matches how physical hardware behaves. If the block driver doesn't implement method bdrv_eject(), we do something quite different: opening the tray severs the connection to the image by calling bdrv_close(), and closing the tray does nothing. When the device model opens, then closes the tray, media is gone, unless the user actively inserts another one while the tray is open, with a suitable change command in the monitor. This isn't how physical hardware behaves. Rather inconvenient when programs "helpfully" eject media to give you a chance to change it. The way bdrv_eject() behaves here turns that chance into a must, which is not what these programs or their users expect. Change the default action not to call bdrv_close(). Instead, note the tray status in new BlockDriverState member tray_open. Use it in bdrv_is_inserted(). Arguably, the device models should keep track of tray status themselves. But this is less invasive. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Fix bdrv_has_zero_initKevin Wolf2010-08-031-2/+5
| | | | | | | | Assuming that any image on a block device is not properly zero-initialized is actually wrong: Only raw images have this problem. Any other image format shouldn't care about it, they initialize everything properly themselves. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: default to 0 minimal / optiomal I/O sizeChristoph Hellwig2010-07-261-2/+2
| | | | | | | | | | Currently we set them to 512 bytes unless manually specified. Unforuntaly some brain-dead partitioning tools create unaligned partitions if they get low enough optiomal I/O size values, so don't report any at all unless explicitly set. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* Make default invocation of block drivers safer (v3)Anthony Liguori2010-07-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CVE-2008-2004 described a vulnerability in QEMU whereas a malicious user could trick the block probing code into accessing arbitrary files in a guest. To mitigate this, we added an explicit format parameter to -drive which disabling block probing. Fast forward to today, and the vast majority of users do not use this parameter. libvirt does not use this by default nor does virt-manager. Most users want block probing so we should try to make it safer. This patch adds some logic to the raw device which attempts to detect a write operation to the beginning of a raw device. If the first 4 bytes happen to match an image file that has a backing file that we support, it scrubs the signature to all zeros. If a user specifies an explicit format parameter, this behavior is disabled. I contend that while a legitimate guest could write such a signature to the header, we would behave incorrectly anyway upon the next invocation of QEMU. This simply changes the incorrect behavior to not involve a security vulnerability. I've tested this pretty extensively both in the positive and negative case. I'm not 100% confident in the block layer's ability to deal with zero sized writes particularly with respect to the aio functions so some additional eyes would be appreciated. Even in the case of a single sector write, we have to make sure to invoked the completion from a bottom half so just removing the zero sized write is not an option. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* qcow2/vdi: Change check to distinguish error casesKevin Wolf2010-07-061-2/+5
| | | | | | | This distinguishes between harmless leaks and real corruption. Hopefully users better understand what qemu-img check wants to tell them. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Catch attempt to attach multiple devices to a blockdevMarkus Armbruster2010-07-021-0/+2
| | | | | | | | | | | | | | | For instance, -device scsi-disk,drive=foo -device scsi-disk,drive=foo happily creates two SCSI disks connected to the same block device. It's all downhill from there. Device usb-storage deliberately attaches twice to the same blockdev, which fails with the fix in place. Detach before the second attach there. Also catch attempt to delete while a guest device model is attached. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qdev: Decouple qdev_prop_drive from DriveInfoMarkus Armbruster2010-07-021-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Make the property point to BlockDriverState, cutting out the DriveInfo middleman. This prepares the ground for block devices that don't have a DriveInfo. Currently all user-defined ones have a DriveInfo, because the only way to define one is -drive & friends (they go through drive_init()). DriveInfo is closely tied to -drive, and like -drive, it mixes information about host and guest part of the block device. I'm working towards a new way to define block devices, with clean host/guest separation, and I need to get DriveInfo out of the way for that. Fortunately, the device models are perfectly happy with BlockDriverState, except for two places: ide_drive_initfn() and scsi_disk_initfn() need to check the DriveInfo for a serial number set with legacy -drive serial=... Use drive_get_by_blockdev() there. Device model code should now use DriveInfo only when explicitly dealing with drives defined the old way, i.e. without -device. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: fix physical_block_size calculationChristoph Hellwig2010-06-221-1/+3
| | | | | | | | | | Both SCSI and virtio expect the physical block size relative to the logical block size. So get the factor first before calculating the log2. Reported-by: Mike Cao <bcao@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Move error actions from DriveInfo to BlockDriverStateMarkus Armbruster2010-06-151-0/+1
| | | | | | | | That's where they belong semantically (block device host part), even though the actions are actually executed by guest device code. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Add wr_highest_sector blockstatKevin Wolf2010-05-031-0/+1
| | | | | | | | | | | | | | | This adds the wr_highest_sector blockstat which implements what is generally known as the high watermark. It is the highest offset of a sector written to the respective BlockDriverState since it has been opened. The query-blockstat QMP command is extended to add this value to the result, and also to add the statistics of the underlying protocol in a new "parent" field. Note that to get the "high watermark" of a qcow2 image, you need to look into the wr_highest_sector field of the parent (which can be a file, a host_device, ...). The wr_highest_sector of the qcow2 BlockDriverState itself is the highest offset on the _virtual_ disk that the guest has written to. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Open the underlying image file in generic codeKevin Wolf2010-05-031-1/+4
| | | | | | | | | | | | | | | Format drivers shouldn't need to bother with things like file names, but rather just get an open BlockDriverState for the underlying protocol. This patch introduces this behaviour for bdrv_open implementation. For protocols which need to access the filename to open their file/device/connection/... a new callback bdrv_file_open is introduced which doesn't get an underlying file opened. For now, also some of the more obscure formats use bdrv_file_open because they open() the file themselves instead of using the block.c functions. They need to be fixed in later patches. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Convert first_drv to QLISTStefan Hajnoczi2010-04-231-1/+1
| | | | | Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Convert bdrv_first to QTAILQStefan Hajnoczi2010-04-231-1/+2
| | | | | Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Do not export bdrv_firstStefan Hajnoczi2010-04-231-2/+0
| | | | | | | | | The bdrv_first linked list of BlockDriverStates is currently extern so that block migration can iterate the list. However, since there is already a bdrv_iterate() function there is no need to expose bdrv_first. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* blkdebug: Add events and rulesKevin Wolf2010-04-231-0/+2
| | | | | | | | | | | | | Block drivers can trigger a blkdebug event whenever they reach a place where it could be useful to inject an error for testing/debugging purposes. Rules are read from a blkdebug config file and describe which action is taken when an event is triggered. For now this is only injecting an error (with a few options) or changing the state (which is an integer). Rules can be declared to be active only in a specific state; this way later rules can distiguish on which path we came to trigger their event. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: add logical_block_size propertyChristoph Hellwig2010-03-171-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Add a logical block size attribute as various guest side tools only increase the filesystem sector size based on it, not the advisory physical block size. For scsi we already have support for a different logical block size in place for CDROMs that we can built upon. Only my recent block device characteristics VPD page needs some fixups. Note that we leave the logial block size for CDROMs hardcoded as the 2k value is expected for it in general. For virtio-blk we already have a feature flag claiming to support a variable logical block size that was added for the s390 kuli hypervisor. Interestingly it does not actually change the units in which the protocol works, which is still fixed at 512 bytes, but only communicates a different minimum I/O granularity. So all we need to do in virtio is to add a trap for unaligned I/O and round down the device size to the next multiple of the logical block size. IDE does not support any other logical block size than 512 bytes. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
OpenPOWER on IntegriCloud