summaryrefslogtreecommitdiffstats
path: root/block
Commit message (Collapse)AuthorAgeFilesLines
* qed: Refuse to create images on block devicesStefan Hajnoczi2011-01-241-0/+6
| | | | | | | | QED relies on the underlying filesystem to extend the file and maintain its size. Check that images are not created on a block device. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Batch flushes for COWKevin Wolf2011-01-243-4/+19
| | | | | | | | | qcow2 calls bdrv_flush() after performing COW in order to ensure that the L2 table change is never written before the copy is safe on disk. Now that the L2 table is cached, we can wait with flushing until we write out the next L2 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Use QcowCacheKevin Wolf2011-01-245-298/+240
| | | | | | | Use the new functions of qcow2-cache.c for everything that works on refcount block and L2 tables. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Add QcowCacheKevin Wolf2011-01-242-0/+309
| | | | | | | | | | | | | | | | | | | | | | This adds some new cache functions to qcow2 which can be used for caching refcount blocks and L2 tables. When used with cache=writethrough they work like the old caching code which is spread all over qcow2, so for this case we have merely a cleanup. The interesting case is with writeback caching (this includes cache=none) where data isn't written to disk immediately but only kept in cache initially. This leads to some form of metadata write batching which avoids the current "write to refcount block, flush, write to L2 table" pattern for each single request when a lot of cluster allocations happen. Instead, cache entries are only written out if its required to maintain the right order. In the pure cluster allocation case this means that all metadata updates for requests are done in memory initially and on sync, first the refcount blocks are written to disk, then fsync, then L2 tables. This improves performance of scenarios with lots of cluster allocations noticably (e.g. installation or after taking a snapshot). Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: fix unaligned accessAurelien Jarno2011-01-241-1/+1
| | | | | | | | | | cpu_to_be64w() is called with an obviously non-aligned pointer. Use cpu_to_be64wu() instead. It fixes unaligned accesses errors on IA64 hosts. Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* vpc: fix a file descriptor leakBlue Swirl2011-01-121-17/+30
| | | | | | | Fix a file descriptor leak, reported by cppcheck: [/src/qemu/block/vpc.c:524]: (error) Resource leak: fd Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* vvfat: fix a file descriptor leakBlue Swirl2011-01-121-0/+1
| | | | | | | Fix a file descriptor leak, reported by cppcheck: [/src/qemu/block/vvfat.c:759]: (error) Resource leak: dir Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* Add proper -errno error return values to qcow2_open()Jes Sorensen2010-12-171-18/+42
| | | | | | | | In addition this adds missing braces to the function to be consistent with the coding style. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/qcow2.c: rename qcow_ functions to qcow2_Jes Sorensen2010-12-173-98/+104
| | | | | | | | It doesn't really make sense for functions in qcow2.c to be named qcow_ so convert the names to match correctly. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qed: Consistency check supportStefan Hajnoczi2010-12-173-3/+336
| | | | | | | | | | | | | | | This patch adds support for the qemu-img check command. It also introduces a dirty bit in the qed header to mark modified images as needing a check. This bit is cleared when the image file is closed cleanly. If an image file is opened and it has the dirty bit set, a consistency check will run and try to fix corrupted table offsets. These corruptions may occur if there is power loss while an allocating write is performed. Once the image is fixed it opens as normal again. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qed: Read/write supportStefan Hajnoczi2010-12-172-2/+652
| | | | | | | | | | | | | This patch implements the read/write state machine. Operations are fully asynchronous and multiple operations may be active at any time. Allocating writes lock tables to ensure metadata updates do not interfere with each other. If two allocating writes need to update the same L2 table they will run sequentially. If two allocating writes need to update different L2 tables they will run in parallel. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qed: Table, L2 cache, and cluster functionsStefan Hajnoczi2010-12-176-1/+854
| | | | | | | | | | | | | | This patch adds code to look up data cluster offsets in the image via the L1/L2 tables. The L2 tables are writethrough cached in memory for performance (each read/write requires a lookup so it is essential to cache the tables). With cluster lookup code in place it is possible to implement bdrv_is_allocated() to query the number of contiguous allocated/unallocated clusters. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qed: Add QEMU Enhanced Disk image formatStefan Hajnoczi2010-12-172-0/+702
| | | | | | | | This patch introduces the qed on-disk layout and implements image creation. Later patches add read/write and other functionality. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* raw-posix: add discard supportChristoph Hellwig2010-12-171-0/+45
| | | | | | | | | Add support to discard blocks in a raw image residing on an XFS filesystem by calling the XFS_IOC_UNRESVSP64 ioctl to punch holes. Support for other hole punching mechanisms can be added when they become available. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: add discard supportChristoph Hellwig2010-12-171-0/+6
| | | | | | | | | Add a new bdrv_discard method to free blocks in a mapping image, and a new drive property to set the granularity for these discard. If no discard granularity support is set discard support is disabled. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* ceph/rbd block driver for qemu-kvmChristian Brunner2010-12-142-0/+1130
| | | | | | | | | | | | | | RBD is an block driver for the distributed file system Ceph (http://ceph.newdream.net/). This driver uses librados (which is part of the Ceph server) for direct access to the Ceph object store and is running entirely in userspace (Yehuda also wrote a driver for the linux kernel, that can be used to access rbd volumes as a block device). Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Christian Brunner <chb@muc.de> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* raw-posix: raw_pwrite comment fixupChristoph Hellwig2010-11-261-1/+1
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Remove unused s->hd in various driversKevin Wolf2010-11-245-6/+0
| | | | | | | | All drivers use bs->file instead of s->hd for quite a while now, so it's time to remove s->hd. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
* Fix win32 buildBlue Swirl2010-11-071-1/+1
| | | | | | | Fix a return value change missed by 205ef7961f781496366e0a93a4ec621ad3724bd7. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* block: avoid a warning on 64 bit hosts with long as int64_tBlue Swirl2010-11-041-2/+2
| | | | | | | | | | | | | | When building on a 64 bit host which uses 'long' for int64_t, GCC emits a warning: CC block/blkverify.o /src/qemu/block/blkverify.c: In function `blkverify_verify_readv': /src/qemu/block/blkverify.c:304: warning: long long int format, long unsigned int arg (arg 3) Rework a77cffe7e916f4dd28f2048982ea2e0d98143b11 to avoid the warning. Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Invalidate cache after failed readKevin Wolf2010-11-042-0/+2
| | | | | | | | The cache content may be destroyed after a failed read, better not use it any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
* vpc: Implement bdrv_flushKevin Wolf2010-11-041-8/+13
| | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Allow bdrv_flush to return errorsKevin Wolf2010-11-0410-19/+26
| | | | | | | | This changes bdrv_flush to return 0 on success and -errno in case of failure. It's a requirement for implementing proper error handle in users of bdrv_flush. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
* Merge remote branch 'kwolf/for-anthony' into stagingAnthony Liguori2010-10-265-198/+146
|\
| * block: Use GCC_FMT_ATTR and fix a format errorStefan Weil2010-10-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding the gcc format attribute detects a format bug which is fixed here. v2: Don't use type cast. BDRV_SECTOR_SIZE is unsigned long long, so %lld should be the correct format specifier. Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * Copy snapshots out of QCOW2 diskedison2010-10-223-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to backup snapshots, created from QCOW2 iamge, we want to copy snapshots out of QCOW2 disk to a seperate storage. The following patch adds a new option in "qemu-img": qemu-img convert -f qcow2 -O qcow2 -s snapshot_name src_img bck_img. Right now, it only supports to copy the full snapshot, delta snapshot is on the way. Changes from V1: all the comments from Kevin are addressed: Add read-only checking Fix coding style Change the name from bdrv_snapshot_load to bdrv_snapshot_load_tmp Signed-off-by: Disheng Su <edison@cloud.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * qcow2: Remove old image creation functionKevin Wolf2010-10-221-224/+0
| | | | | | | | | | | | They have been #ifdef'd out by the previous patch. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * qcow2: Simplify image creationKevin Wolf2010-10-221-1/+132
| | | | | | | | | | | | | | | | | | | | | | | | | | Instead of doing lots of magic for setting up initial refcount blocks and stuff create a minimal (inconsistent) image, open it and initialize the rest with regular qcow2 functions. This is a complete rewrite of the image creation function. The old implementating is #ifdef'd out and will be removed by the next patch (removing it here would have made the diff unreadable because diff tries to find similarities when it's really a rewrite) Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * qcow2: Support exact L1 table growthStefan Hajnoczi2010-10-224-12/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The L1 table grow operation includes a size calculation that bumps up the new L1 table size in order to anticipate the size needs of vmstate data. This helps reduce the number of times that the L1 table has to be grown when vmstate data is appended. This size overhead is not necessary during image creation, bdrv_truncate(), or snapshot goto operations. In fact, existing qemu-iotests that exercise table growth are no longer able to trigger it because image creation preallocates an L1 table that is too large after changes to qcow_create2(). This patch keeps the size calculation but also adds exact growth for callers that do not want to inflate the L1 table size unnecessarily. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* | qemu-timer: move commonly used timer code to qemu-timer-commonBlue Swirl2010-10-231-6/+6
|/ | | | | | | | | | | | | | | | | | | | Move timer init functions to a new file, qemu-timer-common.c. Make other critical timer functions inlined to preserve performance in qemu-timer.c, also move muldiv64() (used by the inline functions) to qemu-timer.h. Adjust block/raw-posix.c and simpletrace.c to use get_clock() directly. Remove a similar/duplicate definition in qemu-tool.c. Adjust hw/omap_clk.c to include qemu-timer.h because muldiv64() is used there. After this change, tracing can be used also for user code and simpletrace on Win32. Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* block: avoid a write only variableBlue Swirl2010-10-131-0/+1
| | | | | | | | | | | Compiling with GCC 4.6.0 20100925 produced a warning: /src/qemu/block/qcow2-refcount.c: In function 'update_refcount': /src/qemu/block/qcow2-refcount.c:552:13: error: variable 'dummy' set but not used [-Werror=unused-but-set-variable] Fix by adding a dummy cast so that the result is not unused. Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* block/vvfat: Fix compiler warning in debug codeStefan Weil2010-10-031-1/+0
| | | | | | | | | | Fix this compiler warning: ./block/vvfat.c:2285: error: comparison of unsigned expression >= 0 is always true Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* block-verify: fix 32-bit buildAnthony Liguori2010-09-221-1/+1
| | | | | Reported-by: Peter Lemenkov <lemenkov@gmail.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* blkverify: Add block driver for verifying I/OStefan Hajnoczi2010-09-211-0/+382
| | | | | | | | | | | | | The blkverify block driver makes investigating image format data corruption much easier. A raw image initialized with the same contents as the test image (e.g. qcow2 file) must be provided. The raw image mirrors read/write operations and is used to verify that data read from the test image is correct. See docs/blkverify.txt for more information. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Avoid bounce buffers for AIO write requestsKevin Wolf2010-09-211-23/+18
| | | | | | | | | | qcow2 used to use bounce buffers for any AIO requests. This does not only imply unnecessary copying, but also unbounded allocations which should be avoided. This patch removes bounce buffers from the normal AIO write path. Encrypted images continue to use a bounce buffer, however with constant size. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Avoid bounce buffers for AIO read requestsKevin Wolf2010-09-213-30/+68
| | | | | | | | | | qcow2 used to use bounce buffers for any AIO requests. This does not only imply unnecessary copying, but also unbounded allocations which should be avoided. This patch removes bounce buffers from the normal AIO read path, and constrains them to a constant size for encrypted images. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Get rid of additional sync on COWKevin Wolf2010-09-211-2/+8
| | | | | | | We always have a sync for the refcount update when a new cluster is allocated. If we move this past the COW, we can save an additional sync. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Move sync out of qcow2_alloc_clustersKevin Wolf2010-09-213-2/+7
| | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Move sync out of update_refcountKevin Wolf2010-09-211-2/+11
| | | | | | | Note that the flush is omitted intentionally in qcow2_free_clusters. If anything, we can leak clusters here if we lose the writes. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Move sync out of write_refcount_block_entriesKevin Wolf2010-09-211-1/+3
| | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* nbd: correctly manage default portLaurent Vivier2010-09-211-2/+0
| | | | | | | | block/nbd.c: use default port number when none is specified qemu-nbd.c: use IANA-assigned port number: 10809 Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* raw-posix: handle > 512 byte alignment correctlyChristoph Hellwig2010-09-211-33/+46
| | | | | | | | | | | Replace the hardcoded handling of 512 byte alignment with bs->buffer_alignment to handle larger sector size devices correctly. Note that we can not rely on it to be initialize in bdrv_open, so deal with the worst case there. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* vvfat: Use cache=unsafeKevin Wolf2010-09-211-4/+10
| | | | | | | | The qcow file used for write support in vvfat is a temporary file, so we can use cache=unsafe there. Without this, write support is just too slow to be of any use. Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
* vvfat: Fix double free for opening the image rwKevin Wolf2010-09-211-3/+4
| | | | | | | | Allocation and deallocation of bs->opaque is not in the control of a block driver. Therefore it should not set bs->opaque to a data structure used by another bs, or closing the image will lead to a double free. Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
* vvfat: Fix segfault on write to read-only diskKevin Wolf2010-09-211-0/+5
| | | | | | | | | | | | vvfat tries to set the readonly flag in its open function, but nowadays this is overwritted with the readonly=... command line option. Check in bdrv_write if the vvfat was opened read-only and return an error in this case. Without this check, vvfat tries to access the qcow bs, which is NULL without enabled write support. Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
* blkdebug: fix enum comparisonBlue Swirl2010-09-181-3/+1
| | | | | | | | | | | | | | The signedness of enum types depend on the compiler implementation. Therefore the check for negative values may or may not be meaningful. Fix by explicitly casting to a signed integer. Since the values are also checked earlier against event_names table, this is an internal error. Change the 'if' to 'assert'. This also avoids a warning with GCC flag -Wtype-limits. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* Revert "Make default invocation of block drivers safer (v3)"Anthony Liguori2010-09-081-130/+0
| | | | | | | | | | | | | This reverts commit 79368c81bf8cf93864d7afc88b81b05d8f0a2c90. Conflicts: block.c I haven't been able to come up with a solution yet for the corruption caused by unaligned requests from the IDE disk so revert until a solution can be written. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* qcow2: Remove unnecessary flush after L2 writeKevin Wolf2010-09-081-4/+12
| | | | | | | | When a new cluster was allocated, we only need a flush after the write to the L2 table if it was a COW and we need to decrease the refcounts of the old clusters. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* raw-posix: improve detection of scsi-generic devicesBernhard Kohl2010-09-081-2/+8
| | | | | | | Allow symbolic links which point to /dev/sgX devices. Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* raw-posix: Don't use file name for host_cdrom detection on LinuxKevin Wolf2010-09-081-3/+0
| | | | | | | On Linux, we have code to detect CD-ROMs using an ioctl. We shouldn't lose anything but false positives by removing the check for a /dev/cd* path. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
OpenPOWER on IntegriCloud