summaryrefslogtreecommitdiffstats
path: root/include/qemu/queue.h
Commit message (Collapse)AuthorAgeFilesLines
* queue: fix QSLIST_INSERT_HEAD_ATOMIC racePaolo Bonzini2015-03-121-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a not-so-subtle race in QSLIST_INSERT_HEAD_ATOMIC. Because atomic_cmpxchg returns the old value instead of a success flag, QSLIST_INSERT_HEAD_ATOMIC was checking for success by comparing against the second argument to atomic_cmpxchg. Unfortunately, this only works if the second argument is a local or thread-local variable. If it is in memory, it can be subject to common subexpression elimination (and then everything's fine) or reloaded after the atomic_cmpxchg, depending on the compiler's whims. If the latter happens, the race can happen. A thread can sneak in, doing something on elm->field.sle_next after the atomic_cmpxchg and before the comparison. This causes a wrong failure, and then two threads are using "elm" at the same time. In the case discovered by Christian, the sequence was likely something like this: thread 1 | thread 2 QSLIST_INSERT_HEAD_ATOMIC | atomic_cmpxchg succeeds | elm added to list | | steal release_pool | QSLIST_REMOVE_HEAD | elm removed from list | ... | QSLIST_INSERT_HEAD_ATOMIC | (overwrites sle_next) spurious failure | atomic_cmpxchg succeeds | elm added to list again | | steal release_pool | QSLIST_REMOVE_HEAD | elm removed again | The last three steps could be done by a third thread as well. A reproducer that failed in a matter of seconds is as follows: - the guest has 32 VCPUs on a 28 core host (hyperthreading was enabled), memory was 16G just to err on the safe side (the host has 64G, but hey at least you need no s390) - the guest has 24 null-aio virtio-blk devices using dataplane (-object iothread,id=ioN -drive if=none,id=blkN,driver=null-aio,size=500G -device virtio-blk-pci,iothread=ioN,drive=blkN) - the guest also has a single network interface. It's only doing loopback tests so slirp vs. tap and the model doesn't matter. - the guest is running fio with the following script: [global] rw=randread blocksize=16k ioengine=libaio runtime=10m buffered=0 fallocate=none time_based iodepth=32 [virtio1a] filename=/dev/block/252\:16 [virtio1b] filename=/dev/block/252\:16 ... [virtio24a] filename=/dev/block/252\:384 [virtio24b] filename=/dev/block/252\:384 [listen1] protocol=tcp ioengine=net port=12345 listen rw=read bs=4k size=1000g [connect1] protocol=tcp hostname=localhost ioengine=net port=12345 protocol=tcp rw=write startdelay=1 size=1000g ... [listen8] protocol=tcp ioengine=net port=12352 listen rw=read bs=4k size=1000g [connect8] protocol=tcp hostname=localhost ioengine=net port=12352 rw=write startdelay=1 size=1000g Moral of the story: I should refrain from writing more clever stuff. At least it looks like it is not too clever to be undebuggable. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1426002357-6889-1-git-send-email-pbonzini@redhat.com Fixes: c740ad92d0d958fa785e5d7aa1b67ecaf30a6a54 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* rcu: introduce RCU-enabled QLISTMike Day2015-02-161-11/+0
| | | | | | | | | | | | | Add RCU-enabled variants on the existing bsd DQ facility. Each operation has the same interface as the existing (non-RCU) version. Also, each operation is implemented as macro. Using the RCU-enabled QLIST, existing QLIST users will be able to convert to RCU without using a different list interface. Signed-off-by: Mike Day <ncmike@ncultra.org> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* rcu: add rcu libraryPaolo Bonzini2015-02-021-0/+13
| | | | | | | | | | | | | | | | | | This includes a (mangled) copy of the liburcu code. The main changes are: 1) removing dependencies on many other header files in liburcu; 2) removing for simplicity the tentative busy waiting in synchronize_rcu, which has limited performance effects; 3) replacing futexes in synchronize_rcu with QemuEvents for Win32 portability. The API is the same as liburcu, so it should be possible in the future to require liburcu on POSIX systems for example and use our copy only on Windows. Among the various versions available I chose urcu-mb, which is the least invasive implementation even though it does not have the fastest rcu_read_{lock,unlock} implementation. The urcu flavor can be changed later, after benchmarking. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* QSLIST: add lock-free operationsPaolo Bonzini2015-01-131-2/+13
| | | | | | | | | | | | These operations are trivial to implement and do not have ABA problems. They are enough to implement simple multiple-producer, single consumer lock-free lists or, as in the next patch, the multiple consumers can steal a whole batch of elements and process them at their leisure. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1417518350-6167-5-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* linux-aio: simplify removal of completed iocbs from the listPaolo Bonzini2014-12-121-0/+11
| | | | | | | | | | | There is no need to do another O(n) pass on the list; the iocb to split the list at is already available through the array we passed to io_submit. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1418305950-30924-6-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* misc: move include files to include/qemu/Paolo Bonzini2012-12-191-0/+414
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
OpenPOWER on IntegriCloud