summaryrefslogtreecommitdiffstats
path: root/async.c
Commit message (Collapse)AuthorAgeFilesLines
* Revert "iothread: release iothread around aio_poll"Stefan Hajnoczi2015-06-121-1/+7
| | | | | | | | | | | | | | | | | | | | | | This reverts commit a0710f7995f914e3044e5899bd8ff6c43c62f916. In qemu-devel email message <556DBF87.2020908@de.ibm.com>, Christian Borntraeger writes: Having many guests all with a kernel/ramdisk (via -kernel) and several null block devices will result in hangs. All hanging guests are in partition detection code waiting for an I/O to return so very early maybe even the first I/O. Reverting that commit "fixes" the hangs. Reverting this commit for the 2.4 release. More time is needed to investigate and correct this patch. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* iothread: release iothread around aio_pollPaolo Bonzini2015-04-281-7/+1
| | | | | | | | | | | | | | | | This is the first step towards having fine-grained critical sections in dataplane threads, which resolves lock ordering problems between address_space_* functions (which need the BQL when doing MMIO, even after we complete RCU-based dispatch) and the AioContext. Because AioContext does not use contention callbacks anymore, the unit test has to be changed. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424449612-18215-4-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* aio-posix: move pollfds to thread-local storagePaolo Bonzini2015-04-281-2/+0
| | | | | | | | | | | | | | | | By using thread-local storage, aio_poll can stop using global data during g_poll_ns. This will make it possible to drop callbacks from rfifolock. [Moved npfd = 0 assignment to end of walking_handlers region as suggested by Paolo. This resolves the assert(npfd == 0) assertion failure in pollfds_cleanup(). --Stefan] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424449612-18215-2-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* aio: strengthen memory barriers for bottom half schedulingPaolo Bonzini2015-04-091-16/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two problems with memory barriers in async.c. The fix is to use atomic_xchg in order to achieve sequential consistency between the scheduling of a bottom half and the corresponding execution. First, if bh->scheduled is already 1 in qemu_bh_schedule, QEMU does not execute a memory barrier to order any writes needed by the callback before the read of bh->scheduled. If the other side sees req->state as THREAD_ACTIVE, the callback is not invoked and you get deadlock. Second, the memory barrier in aio_bh_poll is too weak. Without this patch, it is possible that bh->scheduled = 0 is not "published" until after the callback has returned. Another thread wants to schedule the bottom half, but it sees bh->scheduled = 1 and does nothing. This causes a lost wakeup. The memory barrier should have been changed to smp_mb() in commit 924fe12 (aio: fix qemu_bh_schedule() bh->ctx race condition, 2014-06-03) together with qemu_bh_schedule()'s. Guess who reviewed that patch? Both of these involve a store and a load, so they are reproducible on x86_64 as well. It is however much easier on aarch64, where the libguestfs test suite triggers the bug fairly easily. Even there the failure can go away or appear depending on compiler optimization level, tracing options, or even kernel debugging options. Paul Leveille however reported how to trigger the problem within 15 minutes on x86_64 as well. His (untested) recipe, reproduced here for reference, is the following: 1) Qcow2 (or 3) is critical – raw files alone seem to avoid the problem. 2) Use “cache=directsync” rather than the default of “cache=none” to make it happen easier. 3) Use a server with a write-back RAID controller to allow for rapid IO rates. 4) Run a random-access load that (mostly) writes chunks to various files on the virtual block device. a. I use ‘diskload.exe c:25’, a Microsoft HCT load generator, on Windows VMs. b. Iometer can probably be configured to generate a similar load. 5) Run multiple VMs in parallel, against the same storage device, to shake the failure out sooner. 6) IvyBridge and Haswell processors for certain; not sure about others. A similar patch survived over 12 hours of testing, where an unpatched QEMU would fail within 15 minutes. This bug is, most likely, also the cause of failures in the libguestfs testsuite on AArch64. Thanks to Laszlo Ersek for initially reporting this bug, to Stefan Hajnoczi for suggesting closer examination of qemu_bh_schedule, and to Paul for providing test input and a prototype patch. Reported-by: Laszlo Ersek <lersek@redhat.com> Reported-by: Paul Leveille <Paul.Leveille@stratus.com> Reported-by: John Snow <jsnow@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1428419779-26062-1-git-send-email-pbonzini@redhat.com Suggested-by: Paul Leveille <Paul.Leveille@stratus.com> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* block: replace g_new0 with g_new for bottom half allocation.Paolo Bonzini2015-01-131-4/+6
| | | | | | | | | This saves about 15% of the clock cycles spent on allocation. Using the slice allocator does not add a visible improvement; allocation is faster than malloc, while freeing seems to be slower. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: mark AioContext as recursivePaolo Bonzini2015-01-131-0/+1
| | | | | | | | | | AioContext can be accessed recursively, in fact that's what we do with aio_poll. Marking the GSource as recursive avoids that GLib blocks it and unblocks it around every call to aio_dispatch, which is a pretty expensive operation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Use g_new0() for a bit of extra type checkingMarkus Armbruster2014-12-101-1/+1
| | | | | | | | | | | | g_new(T, 1) is safer than g_malloc(sizeof(T)), because it returns T * rather than void *, which lets the compiler catch more type errors. Missed in commit 02c4f26. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-id: 1417697709-13087-1-git-send-email-armbru@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* async: aio_context_new(): Handle event_notifier_init failureChrysostomos Nanakos2014-09-221-5/+11
| | | | | | | | | | | | | | | | | | | | | On a system with a low limit of open files the initialization of the event notifier could fail and QEMU exits without printing any error information to the user. The problem can be easily reproduced by enforcing a low limit of open files and start QEMU with enough I/O threads to hit this limit. The same problem raises, without the creation of I/O threads, while QEMU initializes the main event loop by enforcing an even lower limit of open files. This commit adds an error message on failure: # qemu [...] -object iothread,id=iothread0 -object iothread,id=iothread1 qemu: Failed to initialize event notifier: Too many open files in system Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* AioContext: introduce aio_preparePaolo Bonzini2014-08-291-0/+5
| | | | | | | | | | | This will be used to implement socket polling on Windows. On Windows, select() and g_poll() are completely different; sockets are polled with select() before calling g_poll, and the g_poll must be nonblocking if select() says a socket is ready. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* AioContext: export and use aio_dispatchPaolo Bonzini2014-08-291-1/+1
| | | | | | | | | | | | | | | | | | | | So far, aio_poll's scheme was dispatch/poll/dispatch, where the first dispatch phase was used only in the GSource case in order to avoid a blocking poll. Earlier patches changed it to dispatch/prepare/poll/dispatch, where prepare is aio_compute_timeout. By making aio_dispatch public, we can remove the first dispatch phase altogether, so that both aio_poll and the GSource use the same prepare/poll/dispatch scheme. This patch breaks the invariant that aio_poll(..., true) will not block the first time it returns false. This used to be fundamental for qemu_aio_flush's implementation as "while (qemu_aio_wait()) {}" but no code in QEMU relies on this invariant anymore. The return value of aio_poll() is now comparable with that of g_main_context_iteration. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* AioContext: take bottom halves into account when computing aio_poll timeoutPaolo Bonzini2014-08-291-14/+18
| | | | | | | | | | | | | | | | | Right now, QEMU invokes aio_bh_poll before the "poll" phase of aio_poll. It is simpler to do it afterwards and skip the "poll" phase altogether when the OS-dependent parts of AioContext are invoked from GSource. This way, AioContext behaves more similarly when used as a GSource vs. when used as stand-alone. As a start, take bottom halves into account when computing the poll timeout. If a bottom half is ready, do a non-blocking poll. As a side effect, this makes idle bottom halves work with aio_poll; an improvement, but not really an important one since they are deprecated. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* AioContext: speed up aio_notifyPaolo Bonzini2014-07-091-1/+18
| | | | | | | | | | | | | | | | | | | | | In many cases, the call to event_notifier_set in aio_notify is unnecessary. In particular, if we are executing aio_dispatch, or if aio_poll is not blocking, we know that we will soon get to the next loop iteration (if necessary); the thread that hosts the AioContext's event loop does not need any nudging. The patch includes a Promela formal model that shows that this really works and does not need any further complication such as generation counts. It needs a memory barrier though. The generation counts are not needed because any change to ctx->dispatching after the memory barrier is okay for aio_notify. If it changes from zero to one, it is the right thing to skip event_notifier_set. If it changes from one to zero, the event_notifier_set is unnecessary but harmless. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* aio: fix qemu_bh_schedule() bh->ctx race conditionStefan Hajnoczi2014-06-041-4/+10
| | | | | | | | | | | | | | | | | qemu_bh_schedule() is supposed to be thread-safe at least the first time it is called. Unfortunately this is not quite true: bh->scheduled = 1; aio_notify(bh->ctx); Since another thread may run the BH callback once it has been scheduled, there is a race condition if the callback frees the BH before aio_notify(bh->ctx) has a chance to run. Reported-by: Stefan Priebe <s.priebe@profihost.ag> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Stefan Priebe <s.priebe@profihost.ag>
* aio: add aio_context_acquire() and aio_context_release()Stefan Hajnoczi2014-03-131-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | It can be useful to run an AioContext from a thread which normally does not "own" the AioContext. For example, request draining can be implemented by acquiring the AioContext and looping aio_poll() until all requests have been completed. The following pattern should work: /* Event loop thread */ while (running) { aio_context_acquire(ctx); aio_poll(ctx, true); aio_context_release(ctx); } /* Another thread */ aio_context_acquire(ctx); bdrv_read(bs, 0x1000, buf, 1); aio_context_release(ctx); This patch implements aio_context_acquire() and aio_context_release(). Note that existing aio_poll() callers do not need to worry about acquiring and releasing - it is only needed when multiple threads will call aio_poll() on the same AioContext. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* aio / timers: aio_ctx_prepare sets timeout from AioContext timersAlex Bligh2013-08-221-1/+12
| | | | | | | | | | Calculate the timeout in aio_ctx_prepare taking into account the timers attached to the AioContext. Alter aio_ctx_check similarly. Signed-off-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* aio / timers: Add a notify callback to QEMUTimerListAlex Bligh2013-08-221-1/+6
| | | | | | | | Add a notify pointer to QEMUTimerList so it knows what to notify on a timer change. Signed-off-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* aio / timers: Add QEMUTimerListGroup to AioContextAlex Bligh2013-08-221-0/+2
| | | | | | | | | Add a QEMUTimerListGroup each AioContext (meaning a QEMUTimerList associated with each clock is added) and delete it when the AioContext is freed. Signed-off-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* aio: drop io_flush argumentStefan Hajnoczi2013-08-191-2/+2
| | | | | | | | | | | The .io_flush() handler no longer exists and has no users. Drop the io_flush argument to aio_set_fd_handler() and related functions. The AioFlushEventNotifierHandler and AioFlushHandler typedefs are no longer used and are dropped too. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* QEMUBH: make AioContext's bh re-entrantLiu Ping Fan2013-07-191-2/+31
| | | | | | | | | | | | BH will be used outside big lock, so introduce lock to protect between the writers, ie, bh's adders and deleter. The lock only affects the writers and bh's callback does not take this extra lock. Note that for the same AioContext, aio_bh_poll() can not run in parallel yet. Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* aio: add a ThreadPool instance to AioContextStefan Hajnoczi2013-03-151-0/+11
| | | | | | | | | | | | | | | | | | This patch adds a ThreadPool to AioContext. It's possible that some AioContext instances will never use the ThreadPool, so defer creation until aio_get_thread_pool(). The reason why AioContext should have the ThreadPool is because the ThreadPool is bound to a AioContext instance where the work item's callback function is invoked. It doesn't make sense to keep the ThreadPool pointer anywhere other than AioContext. For example, block/raw-posix.c can get its AioContext's ThreadPool and submit work. Special note about headers: I used struct ThreadPool in aio.h because there is a circular dependency if aio.h includes thread-pool.h. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
* aio: convert aio_poll() to g_poll(3)Stefan Hajnoczi2013-02-211-0/+2
| | | | | | | | | | | | | | | AioHandler already has a GPollFD so we can directly use its events/revents. Add the int pollfds_idx field to AioContext so we can map g_poll(3) results back to AioHandlers. Reuse aio_dispatch() to invoke handlers after g_poll(3). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Message-id: 1361356113-11049-10-git-send-email-stefanha@redhat.com Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* misc: move include files to include/qemu/Paolo Bonzini2012-12-191-1/+1
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* block: move include files to include/block/Paolo Bonzini2012-12-191-1/+1
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* aio: Get rid of qemu_aio_flush()Kevin Wolf2012-12-111-5/+0
| | | | | | | There are no remaining users, and new users should probably be using bdrv_drain_all() in the first place. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* aio: fix aio_ctx_prepare with idle bottom halvesPaolo Bonzini2012-11-121-4/+2
| | | | | | | | | | | | Commit ed2aec4867f0d5f5de496bb765347b5d0cfe113d changed the return value of aio_ctx_prepare from false to true when only idle bottom halves are available. This broke PC old-style DMA, which uses them. Fix this by making aio_ctx_prepare return true only when non-idle bottom halves are scheduled to run. Reported-by: malc <av1474@comtv.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: malc <av1474@comtv.ru>
* aio: clean up now-unused functionsPaolo Bonzini2012-10-301-16/+7
| | | | | | | | Some cleanups can now be made, now that the main loop does not anymore need hooks into the bottom half code. Reviewed-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* aio: add aio_notifyPaolo Bonzini2012-10-301-4/+26
| | | | | | | | With this change async.c does not rely anymore on any service from main-loop.c, i.e. it is completely self-contained. Reviewed-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* aio: make AioContexts GSourcesPaolo Bonzini2012-10-301-1/+64
| | | | | | This lets AioContexts be used (optionally) with a glib main loop. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* aio: add non-blocking variant of aio_waitPaolo Bonzini2012-10-301-1/+1
| | | | | | | This will be used when polling the GSource attached to an AioContext. Reviewed-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* aio: add I/O handlers to the AioContext interfacePaolo Bonzini2012-10-301-0/+6
| | | | | | | With this patch, I/O handlers (including event notifier handlers) can be attached to a single AioContext. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* aio: introduce AioContext, move bottom halves therePaolo Bonzini2012-10-301-15/+15
| | | | | | | | | | | | Start introducing AioContext, which will let us remove globals from aio.c/async.c, and introduce multiple I/O threads. The bottom half functions now take an additional AioContext argument. A bottom half is created with a specific AioContext that remains the same throughout the lifetime. qemu_bh_new is just a wrapper that uses a global context. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* async: Use bool for boolean struct members and remove a holeStefan Weil2012-05-011-3/+3
| | | | | | | | Using bool reduces the size of the structure and improves readability. A hole in the structure was removed. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
* main_loop_wait: block indefinitelyStefano Stabellini2012-04-261-1/+1
| | | | | | | | | | | | | | | - remove qemu_calculate_timeout; - explicitly size timeout to uint32_t; - introduce slirp_update_timeout; - pass NULL as timeout argument to select in case timeout is the maximum value; Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Paul Brook <paul@codesourcery.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* main-loop: create main-loop.hPaolo Bonzini2011-10-211-0/+1
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* async: Allow nested qemu_bh_poll callsKevin Wolf2011-09-061-8/+16
| | | | | | | | | | | | | | qemu may segfault when a BH handler first deletes a BH and then (possibly indirectly) calls a nested qemu_bh_poll(). This is because the inner instance frees the BH and deletes it from the list that the outer one processes. This patch deletes BHs only in the outermost qemu_bh_poll instance. Commit 7887f620 already tried to achieve the same, but it assumed that the BH handler would only delete its own BH. With a nested qemu_bh_poll(), this isn't guaranteed, so that commit wasn't enough. Hope this one fixes it for real. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* Use glib memory allocation and free functionsAnthony Liguori2011-08-201-2/+2
| | | | | | qemu_malloc/qemu_free no longer exist after this commit. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* async: Remove AsyncContextKevin Wolf2011-08-021-91/+7
| | | | | | | | | | | The purpose of AsyncContexts was to protect qcow and qcow2 against reentrancy during an emulated bdrv_read/write (which includes a qemu_aio_wait() call and can run AIO callbacks of different requests if it weren't for AsyncContexts). Now both qcow and qcow2 are protected by CoMutexes and AsyncContexts can be removed. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* Allow nested qemu_bh_poll() after BH deletionKevin Wolf2011-06-151-2/+3
| | | | | | | | | | | | Without this, qemu segfaults when a BH handler first deletes its BH and then calls another function which involves a nested qemu_bh_poll() call. This can be reproduced by generating an I/O error (e.g. with blkdebug) on an IDE device and using rerror/werror=stop to stop the VM. When continuing the VM, qemu segfaults. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
* Introduce contexts for asynchronous callbacksKevin Wolf2009-10-271-7/+93
| | | | | | | | | Add the possibility to use AIO and BHs without allowing foreign callbacks to be run. Basically, you put your own AIOs and BHs in a separate context. For details see the comments in the source. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* Split out bottom halvesKevin Wolf2009-10-271-0/+130
Instead of putting more and more stuff into vl.c, let's have the generic functions that deal with asynchronous callbacks in their own file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
OpenPOWER on IntegriCloud