summaryrefslogtreecommitdiffstats
path: root/hw/net
Commit message (Collapse)AuthorAgeFilesLines
* net: ne2000: fix bounds check in ioport operationsPrasad J Pandit2016-03-221-4/+6
| | | | | | | | | | | | | While doing ioport r/w operations, ne2000 device emulation suffers from OOB r/w errors. Update respective array bounds check to avoid OOB access. Reported-by: Ling Liu <liuling-it@360.cn> Cc: qemu-stable@nongnu.org Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit aa7f9966dfdff500bbbf1956d9e115b1fa8987a6) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
* e1000: eliminate infinite loops on out-of-bounds transfer startLaszlo Ersek2016-03-171-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The start_xmit() and e1000_receive_iov() functions implement DMA transfers iterating over a set of descriptors that the guest's e1000 driver prepares: - the TDLEN and RDLEN registers store the total size of the descriptor area, - while the TDH and RDH registers store the offset (in whole tx / rx descriptors) into the area where the transfer is supposed to start. Each time a descriptor is processed, the TDH and RDH register is bumped (as appropriate for the transfer direction). QEMU already contains logic to deal with bogus transfers submitted by the guest: - Normally, the transmit case wants to increase TDH from its initial value to TDT. (TDT is allowed to be numerically smaller than the initial TDH value; wrapping at or above TDLEN bytes to zero is normal.) The failsafe that QEMU currently has here is a check against reaching the original TDH value again -- a complete wraparound, which should never happen. - In the receive case RDH is increased from its initial value until "total_size" bytes have been received; preferably in a single step, or in "s->rxbuf_size" byte steps, if the latter is smaller. However, null RX descriptors are skipped without receiving data, while RDH is incremented just the same. QEMU tries to prevent an infinite loop (processing only null RX descriptors) by detecting whether RDH assumes its original value during the loop. (Again, wrapping from RDLEN to 0 is normal.) What both directions miss is that the guest could program TDLEN and RDLEN so low, and the initial TDH and RDH so high, that these registers will immediately be truncated to zero, and then never reassume their initial values in the loop -- a full wraparound will never occur. The condition that expresses this is: xdh_start >= s->mac_reg[XDLEN] / sizeof(desc) i.e., TDH or RDH start out after the last whole rx or tx descriptor that fits into the TDLEN or RDLEN sized area. This condition could be checked before we enter the loops, but pci_dma_read() / pci_dma_write() knows how to fill in buffers safely for bogus DMA addresses, so we just extend the existing failsafes with the above condition. This is CVE-2016-1981. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Prasad Pandit <ppandit@redhat.com> Cc: Michael Roth <mdroth@linux.vnet.ibm.com> Cc: Jason Wang <jasowang@redhat.com> Cc: qemu-stable@nongnu.org RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1296044 Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit dd793a74882477ca38d49e191110c17dfee51dcc) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
* net: set endianness on all backend devicesLaurent Vivier2016-03-171-12/+11
| | | | | | | | | | | | | | | | | | | | commit 5be7d9f1b1452613b95c6ba70b8d7ad3d0797991 vhost-net: tell tap backend about the vnet endianness makes vhost net to set the endianness of the device, but only for the first device. In case of multiqueue, we have multiple devices... This patch sets the endianness for all the devices of the interface. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit a407644079c8639002e7ea635d851953b10a38c3) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
* net: ne2000: check ring buffer control registersPrasad J Pandit2016-03-171-0/+4
| | | | | | | | | | | | | | | Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152) bytes to process network packets. Registers PSTART & PSTOP define ring buffer size & location. Setting these registers to invalid values could lead to infinite loop or OOB r/w access issues. Add check to avoid it. Reported-by: Yang Hongke <yanghongke@huawei.com> Tested-by: Yang Hongke <yanghongke@huawei.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 415ab35a441eca767d033a2702223e785b9d5190) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
* net: rocker: fix an incorrect array bounds checkPrasad J Pandit2016-03-151-4/+4
| | | | | | | | | | | | | | While processing transmit(tx) descriptors in 'tx_consume' routine the switch emulator suffers from an off-by-one error, if a descriptor was to have more than allowed(ROCKER_TX_FRAGS_MAX=16) fragments. Fix an incorrect bounds check to avoid it. Reported-by: Qinghao Tang <luodalongde@gmail.com> Cc: qemu-stable@nongnu.org Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 007cd223de527b5f41278f2d886c1a4beb3e67aa) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
* net: vmxnet3: avoid memory leakage in activate_deviceP J P2016-03-151-8/+16
| | | | | | | | | | | | | | | | Vmxnet3 device emulator does not check if the device is active before activating it, also it did not free the transmit & receive buffers while deactivating the device, thus resulting in memory leakage on the host. This patch fixes both these issues to avoid host memory leakage. Reported-by: Qinghao Tang <luodalongde@gmail.com> Reviewed-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Cc: qemu-stable@nongnu.org Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit aa4a3dce1c88ed51b616806b8214b7c8428b7470) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
* Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵Peter Maydell2015-12-074-11/+29
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging # gpg: Signature made Mon 07 Dec 2015 14:06:07 GMT using RSA key ID 398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: lan9118: log and ignore access to invalid registers, rather than aborting lan9118: fix emulation of MAC address loaded bit in E2P_CMD register vmxnet3: silence warning pcnet: fix rx buffer overflow(CVE-2015-7512) net: pcnet: add check to validate receive data size(CVE-2015-7504) e1000: fix hang of win2k12 shutdown with flood ping Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * lan9118: log and ignore access to invalid registers, rather than abortingAndrew Baumann2015-12-071-4/+8
| | | | | | | | | | | | | | | | | | | | | | With this change, access to invalid/unimplemented device registers are logged as a "guest error" rather than aborting qemu with hw_error. This enables drivers for similar devices (e.g. SMSC 9221), by simply ignoring the unimplemented writes. It's also closer to what real hardware does. Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * lan9118: fix emulation of MAC address loaded bit in E2P_CMD registerAndrew Baumann2015-12-071-3/+5
| | | | | | | | | | | | | | | | | | | | There appears to have been a longstanding typo in the implementation of the "MAC address loaded" bit in the E2P_CMD (EEPROM command) register. The code was using 0x10, but the controller spec says it should be bit 8 (0x100). Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * vmxnet3: silence warningMichael S. Tsirkin2015-12-071-1/+0
| | | | | | | | | | | | | | | | | | | | vmxnet3 always produces a warning under qtest. This is not a user error, don't warn. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * pcnet: fix rx buffer overflow(CVE-2015-7512)Jason Wang2015-12-071-0/+6
| | | | | | | | | | | | | | | | | | | | | | Backends could provide a packet whose length is greater than buffer size. Check for this and truncate the packet to avoid rx buffer overflow in this case. Cc: Prasad J Pandit <pjp@fedoraproject.org> Cc: qemu-stable@nongnu.org Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * net: pcnet: add check to validate receive data size(CVE-2015-7504)Prasad J Pandit2015-12-071-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | In loopback mode, pcnet_receive routine appends CRC code to the receive buffer. If the data size given is same as the buffer size, the appended CRC code overwrites 4 bytes after s->buffer. Added a check to avoid that. Reported by: Qinghao Tang <luodalongde@gmail.com> Cc: qemu-stable@nongnu.org Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * e1000: fix hang of win2k12 shutdown with flood pingDenis V. Lunev2015-12-071-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | e1000 driver in Win2k12 is really well rotten. It 100% hangs on shutdown of UP VM under flood ping. The guest checks card state and reinjects itself interrupt in a loop. This is fatal for UP machine. There is no good way to fix this misbehavior but to kludge it. The emulation has interrupt throttling register aka ITR which limits interrupt rate and allows the guest to proceed this phase. There is no problem with this kludge for Linux guests - it adjust the value of it itself. On the other hand according to the initial research in commit e9845f0985f088dd01790f4821026df0afba5795 Author: Vincenzo Maffione <v.maffione@gmail.com> Date: Fri Aug 2 18:30:52 2013 +0200 e1000: add interrupt mitigation support ... Interrupt mitigation boosts performance when the guest suffers from an high interrupt rate (i.e. receiving short UDP packets at high packet rate). For some numerical results see the following link http://info.iet.unipi.it/~luigi/papers/20130520-rizzo-vm.pdf this should also boost performance a bit. See https://bugzilla.redhat.com/show_bug.cgi?id=874406 for additional details. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Vincenzo Maffione <v.maffione@gmail.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* | eepro100: Prevent two endless loopsStefan Weil2015-11-271-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | http://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg04592.html shows an example how an endless loop in function action_command can be achieved. During my code review, I noticed a 2nd case which can result in an endless loop. Reported-by: Qinghao Tang <luodalongde@gmail.com> Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Jason Wang <jasowang@redhat.com>
* | vhost-user: ignore qemu-only featuresMichael S. Tsirkin2015-11-181-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | Some features (such as ctrl vq) are supported by qemu without need to communicate with the backend. Drop them from the feature mask so we set them unconditionally. Reported-by: Victor Kaplansky <vkaplans@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost: don't send RESET_OWNER at stopYuanhan Liu2015-11-161-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | First of all, RESET_OWNER message is sent incorrectly, as it's sent before GET_VRING_BASE. And the reset message would let the later call get nothing correct. And, sending SET_VRING_ENABLE at stop, which has already been done, makes more sense than RESET_OWNER. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵Peter Maydell2015-11-122-109/+375
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging # gpg: Signature made Thu 12 Nov 2015 08:01:55 GMT using RSA key ID 398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: net: netmap: use error_setg() helpers in place of error_report() net: netmap: Fix compilation issue e1000: Introducing backward compatibility command line parameter e1000: Implementing various counters e1000: Fixing the packet address filtering procedure e1000: Fixing the received/transmitted octets' counters e1000: Fixing the received/transmitted packets' counters e1000: Trivial implementation of various MAC registers e1000: Introduced an array to control the access to the MAC registers e1000: Add support for migrating the entire MAC registers' array e1000: Cosmetic and alignment fixes slirp: Fix type casts and format strings in debug code Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | e1000: Introducing backward compatibility command line parameterLeonid Bloch2015-11-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This follows the previous patches, where support for migrating the entire MAC registers' array, and some new MAC registers were introduced. This patch introduces the e1000-specific boolean parameter "extra_mac_registers", which is on by default. Setting it to off will enable migration to older versions of QEMU, but will disable the read and write access to the new registers, that were introduced since adding the ability to migrate the entire MAC array. Example for usage to enable backward compatibility and to disable the new MAC registers: qemu-system-x86_64 -device e1000,extra_mac_registers=off,... ... As mentioned above, the default value is "on". Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * | e1000: Implementing various countersLeonid Bloch2015-11-121-5/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements the following Statistic registers (various counters) according to Intel's specs: TSCTC GOTCL GOTCH GORCL GORCH MPRC BPRC RUC ROC BPTC MPTC PTC... PRC... PLEASE NOTE: these registers will not be active, nor will migrate, until a compatibility flag will be set (in the next patch in this series). Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * | e1000: Fixing the packet address filtering procedureLeonid Bloch2015-11-121-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, if promiscuous unicast was enabled, a packet was received straight away, even if it was a multicast or a broadcast packet. This patch fixes that behavior, while making the filtering procedure a bit more human-readable. Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * | e1000: Fixing the received/transmitted octets' countersLeonid Bloch2015-11-121-8/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, these 64-bit registers did not stick at their maximal values when (and if) they reached them, as they should do, according to the specs. This patch introduces a function that takes care of such registers, avoiding code duplication, making the relevant parts more compatible with the QEMU coding style, while ensuring that in the unlikely case of reaching the maximal value, the counter will stick there, as it supposed to. Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * | e1000: Fixing the received/transmitted packets' countersLeonid Bloch2015-11-121-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to Intel's specs, these counters (as the other Statistic registers) stick at 0xffffffff when this maximal value is reached. Previously, they would reset after the max. value. Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * | e1000: Trivial implementation of various MAC registersLeonid Bloch2015-11-122-3/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These registers appear in Intel's specs, but were not implemented. These registers are now implemented trivially, i.e. they are initiated with zero values, and if they are RW, they can be written or read by the driver, or read only if they are R (essentially retaining their zero values). For these registers no other procedures are performed. For the trivially implemented Diagnostic registers, a debug warning is produced on read/write attempts. PLEASE NOTE: these registers will not be active, nor will migrate, until a compatibility flag will be set (in a later patch in this series). The registers implemented here are: Transmit: RW: AIT Management: RW: WUC WUS IPAV IP6AT* IP4AT* FFLT* WUPM* FFMT* FFVT* Diagnostic: RW: RDFH RDFT RDFHS RDFTS RDFPC PBM* TDFH TDFT TDFHS TDFTS TDFPC Statistic: RW: FCRUC R: RNBC TSCTFC MGTPRC MGTPDC MGTPTC RFC RJC SCC ECOL LATECOL MCC COLC DC TNCRS SEC CEXTERR RLEC XONRXC XONTXC XOFFRXC XOFFTXC Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * | e1000: Introduced an array to control the access to the MAC registersLeonid Bloch2015-11-121-12/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The array of uint8_t's which is introduced here, contains access metadata about the MAC registers: if a register is accessible, but partly implemented, or if a register requires a certain compatibility flag in order to be accessed. Currently, 6 hypothetical flags are supported (3 exist for e1000 so far) but in the future, if more than 6 flags will be needed, the datatype of this array can simply be swapped for a larger one. This patch is intended to solve the following current problems: 1) In a scenario of migration between different versions of QEMU, which differ by the MAC registers implemented in them, some registers need not to be active if a compatibility flag is set, in order to preserve the machine's state perfectly for the older version. Checking this for each register individually, would create a lot of clutter in the code. 2) Some registers are (or may be) only partly implemented (e.g. placeholders that allow reading and writing, but lack other functions). In such cases it is better to print a debug warning on read/write attempts. As above, dealing with this functionality on a per-register level, would require longer and more messy code. Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * | e1000: Add support for migrating the entire MAC registers' arrayLeonid Bloch2015-11-121-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes the migration of the entire array of MAC registers possible during live migration. The entire array is just 128 KB long, so practically no penalty should be felt when transmitting it, additionally to the previously transmitted individual registers. The advantage here is eliminating the need to introduce new vmstate subsections in the future, when additional MAC registers will be implemented. Backward compatibility is preserved by introducing a e1000-specific boolean parameter (in a later patch), which will be on by default. Setting it to off would enable migration to older versions of QEMU. Additionally, this parameter will be used to control the access to the extra MAC registers in the future. Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * | e1000: Cosmetic and alignment fixesLeonid Bloch2015-11-122-79/+89
| |/ | | | | | | | | | | | | | | | | | | | | This fixes some alignment and cosmetic issues. The changes are made in order that the following patches in this series will look like integral parts of the code surrounding them, while conforming to the coding style. Although some changes in unrelated areas are also made. Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* | error: More error_setg() usageEric Blake2015-11-112-12/+6
|/ | | | | | | | | | | A few uses of error_set(ERROR_CLASS_GENERIC_ERROR) were missed in c6bd8c706, or have snuck in since. Nuke them. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1447224690-9743-19-git-send-email-eblake@redhat.com> Acked-by: Andreas Färber <afaerber@suse.de> [Indentation tidied up, commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com>
* i.MX: Standardize i.MX FEC debugJean-Christophe Dubois2015-10-271-32/+32
| | | | | | | | | | | | | | | | The goal is to have debug code always compiled during build. We standardize all debug output on the following format: [QOM_TYPE_NAME]reporting_function: debug message The qemu_log_mask() output is following the same format as the above debug. Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Message-id: 57e565982db94fb433c32dfa17608888464d21de.1445781957.git.jcd@tribudubois.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* vmxnet3: Do not fill stats if device is inactiveShmulik Ladkani2015-10-271-0/+4
| | | | | | | | | | | | | | | | | | | Guest OS may issue VMXNET3_CMD_GET_STATS even before device was activated (for example in linux, after insmod but prior net-dev open). Accessing shared descriptors prior device activation is illegal as the VMXNET3State structures have not been fully initialized. As a result, guest memory gets corrupted and may lead to guest OS crashes. Fix, by not filling the stats descriptors if device is inactive. Reported-by: Leonid Shatz <leonid.shatz@ravellosystems.com> Acked-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Dana Rubin <dana.rubin@ravellosystems.com> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* net: cadence_gem: Set initial MAC addressSebastian Huber2015-10-271-0/+6
| | | | | | | | | Set initial MAC address to the one specified by the command line. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* vhost user: add rarp sending after live migration for legacy guestThibaut Collet2015-10-221-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | A new vhost user message is added to allow QEMU to ask to vhost user backend to broadcast a fake RARP after live migration for guest without GUEST_ANNOUNCE capability. This new message is sent only if the backend supports the new VHOST_USER_PROTOCOL_F_RARP protocol feature. The payload of this new message is the MAC address of the guest (not known by the backend). The MAC address is copied in the first 6 bytes of a u64 to avoid to create a new payload message type. This new message has no equivalent ioctl so a new callback is added in the userOps structure to send the request. Upon reception of this new message the vhost user backend must generate and broadcast a fake RARP request to notify the migration is terminated. Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com> [Rebased and fixed checkpatch errors - Marc-André] Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
* vhost user: add support of live migrationThibaut Collet2015-10-221-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Some vhost user backends are able to support live migration. To provide this service the following features must be added: 1. Add the VIRTIO_NET_F_GUEST_ANNOUNCE capability to vhost-net when netdev backend is vhost-user. 2. Provide a nop receive callback to vhost-user. This callback is called by: * qemu_announce_self after a migration to send fake RARP to avoid network outage for peers talking to the migrated guest. - For guest with GUEST_ANNOUNCE capabilities, guest already sends GARP when the bit VIRTIO_NET_S_ANNOUNCE is set. => These packets must be discarded. - For guest without GUEST_ANNOUNCE capabilities, migration termination is notified when the guest sends packets. => These packets can be discarded. * virtio_net_tx_bh with a dummy boot to send fake bootp/dhcp request. BIOS guest manages virtio driver to send 4 bootp/dhcp request in case of dummy boot. => These packets must be discarded. Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
* vhost: use a function for each callMarc-André Lureau2015-10-221-10/+6
| | | | | | | | | | | | | | | | Replace the generic vhost_call() by specific functions for each function call to help with type safety and changing arguments. While doing this, I found that "unsigned long long" and "uint64_t" were used interchangeably and causing compilation warnings, using uint64_t instead, as the vhost & protocol specifies. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> [Fix enum usage and MQ - Thibaut Collet] Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
* Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵Peter Maydell2015-10-124-15/+37
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging # gpg: Signature made Mon 12 Oct 2015 08:56:47 BST using RSA key ID 398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: tests: add test cases for netfilter object netfilter: add a netbuffer filter net/queue: export qemu_net_queue_append_iov netfilter: print filter info associate with the netdev netfilter: add an API to pass the packet to next filter net/queue: introduce NetQueueDeliverFunc net: merge qemu_deliver_packet and qemu_deliver_packet_iov netfilter: hook packets before net queue send init/cleanup of netfilter object vl.c: init delayed object after net_init_clients vmxnet3: Add support for VMXNET3_CMD_GET_ADAPTIVE_RING_INFO command e1000: use alias for default model vmxnet3: Support reading IMR registers on bar0 net/vmxnet3: Refine l2 header validation Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * vmxnet3: Add support for VMXNET3_CMD_GET_ADAPTIVE_RING_INFO commandShmulik Ladkani2015-10-122-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | Some drivers (e.g. vmware-tools) issue the VMXNET3_CMD_GET_ADAPTIVE_RING_INFO command. Currently, due to lack of support, a bogus value (-1) is returned. Support this command, returning the "adaptive-ring disabled" flag. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * e1000: use alias for default modelJason Wang2015-10-121-7/+1
| | | | | | | | | | | | | | | | | | Instead of duplicating the "e1000-82540em" device model as "e1000", make the latter an alias for the former. Cc: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com Reviewed-by: Markus Armbruster <armbru@redhat.com>
| * vmxnet3: Support reading IMR registers on bar0Shmulik Ladkani2015-10-121-1/+5
| | | | | | | | | | | | | | | | | | Instead of asserting, return the actual IMR register value. This is aligned with what's returned on ESXi. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com> Tested-by: Dana Rubin <dana.rubin@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * net/vmxnet3: Refine l2 header validationDana Rubin2015-10-122-6/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Validation of l2 header length assumed minimal packet size as eth_header + 2 * vlan_header regardless of the actual protocol. This caused crash for valid non-IP packets shorter than 22 bytes, as 'tx_pkt->packet_type' hasn't been assigned for such packets, and 'vmxnet3_on_tx_done_update_stats()' expects it to be properly set. Refine header length validation in 'vmxnet_tx_pkt_parse_headers'. Check its return value during packet processing flow. As a side effect, in case IPv4 and IPv6 header validation failure, corrupt packets will be dropped. Signed-off-by: Dana Rubin <dana.rubin@ravellosystems.com> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* | rocker: Use g_new() & friends where that makes obvious senseMarkus Armbruster2015-10-084-9/+8
|/ | | | | | | | | | | | | | | | g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Same Coccinelle semantic patchas in commit b45c03f. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* virtio-net: correctly drop truncated packetsJason Wang2015-10-011-7/+1
| | | | | | | | | | | | | | | | | | | When packet is truncated during receiving, we drop the packets but neither discard the descriptor nor add and signal used descriptor. This will lead several issues: - sg mappings are leaked - rx will be stalled if a lots of packets were truncated In order to be consistent with vhost, fix by discarding the descriptor in this case. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Merge remote-tracking branch ↵Peter Maydell2015-09-252-10/+7
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'remotes/vivier-misc/tags/pull-muldiv64-20150925' into staging Remove muldiv64() by using period instead of frequency # gpg: Signature made Fri 25 Sep 2015 14:54:37 BST using RSA key ID 3F2FBE3C # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" # gpg: aka "Laurent Vivier <laurent@vivier.eu>" # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier-misc/tags/pull-muldiv64-20150925: net: remove muldiv64() bt: remove muldiv64() hpet: remove muldiv64() arm: clarify the use of muldiv64() openrisc: remove muldiv64() mips: remove muldiv64() pcnet: remove muldiv64() rtl8139: remove muldiv64() i6300esb: remove muldiv64() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * pcnet: remove muldiv64()Laurent Vivier2015-09-251-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally, timers were ticks based, and it made sense to add ticks to current time to know when to trigger an alarm. But since commit: 7447545 change all other clock references to use nanosecond resolution accessors All timers use nanoseconds and we need to convert ticks to nanoseconds, by doing something like: y = muldiv64(x, get_ticks_per_sec(), PCI_FREQUENCY) where x is the number of device ticks and y the number of system ticks. y is used as nanoseconds in timer functions, it works because 1 tick is 1 nanosecond. (get_ticks_per_sec() is 10^9) But as PCI frequency is 33 MHz, we can also do: y = x * 30; /* 33 MHz PCI period is 30 ns */ Which is much more simple. This implies a 33.333333 MHz PCI frequency, but this is correct. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * rtl8139: remove muldiv64()Laurent Vivier2015-09-251-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally, timers were ticks based, and it made sense to add ticks to current time to know when to trigger an alarm. But since commit: 7447545 change all other clock references to use nanosecond resolution accessors All timers use nanoseconds and we need to convert ticks to nanoseconds, by doing something like: y = muldiv64(x, get_ticks_per_sec(), PCI_FREQUENCY) where x is the number of device ticks and y the number of system ticks. y is used as nanoseconds in timer functions, it works because 1 tick is 1 nanosecond. (get_ticks_per_sec() is 10^9) But as PCI frequency is 33 MHz, we can also do: y = x * 30; /* 33 MHz PCI period is 30 ns */ Which is much more simple. This implies a 33.333333 MHz PCI frequency, but this is correct. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
* | vhost-user: add a new message to disable/enable a specific virt queue.Changchun Ouyang2015-09-242-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new message, VHOST_USER_SET_VRING_ENABLE, to enable or disable a specific virt queue, which is similar to attach/detach queue for tap device. virtio driver on guest doesn't have to use max virt queue pair, it could enable any number of virt queue ranging from 1 to max virt queue pair. Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Tested-by: Marcel Apfelbaum <marcel@redhat.com>
* | vhost-user: add multiple queue supportChangchun Ouyang2015-09-241-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is initially based a patch from Nikolay Nikolaev. This patch adds vhost-user multiple queue support, by creating a nc and vhost_net pair for each queue. Qemu exits if find that the backend can't support the number of requested queues (by providing queues=# option). The max number is queried by a new message, VHOST_USER_GET_QUEUE_NUM, and is sent only when protocol feature VHOST_USER_PROTOCOL_F_MQ is present first. The max queue check is done at vhost-user initiation stage. We initiate one queue first, which, in the meantime, also gets the max_queues the backend supports. In older version, it was reported that some messages are sent more times than necessary. Here we came an agreement with Michael that we could categorize vhost user messages to 2 types: non-vring specific messages, which should be sent only once, and vring specific messages, which should be sent per queue. Here I introduced a helper function vhost_user_one_time_request(), which lists following messages as non-vring specific messages: VHOST_USER_SET_OWNER VHOST_USER_RESET_DEVICE VHOST_USER_SET_MEM_TABLE VHOST_USER_GET_QUEUE_NUM For above messages, we simply ignore them when they are not sent the first time. Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Tested-by: Marcel Apfelbaum <marcel@redhat.com>
* | vhost-user: add VHOST_USER_GET_QUEUE_NUM messageYuanhan Liu2015-09-241-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is for querying how many queues the backend supports if it has mq support(when VHOST_USER_PROTOCOL_F_MQ flag is set from the quried protocol features). vhost_net_get_max_queues() is the interface to export that value, and to tell if the backend supports # of queues user requested, which is done in the following patch. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Marcel Apfelbaum <marcel@redhat.com>
* | vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICEYuanhan Liu2015-09-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Quote from Michael: We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Tested-by: Marcel Apfelbaum <marcel@redhat.com>
* | vhost-user: add protocol feature negotiationMichael S. Tsirkin2015-09-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support a separate bitmask for vhost-user protocol features, and messages to get/set protocol features. Invoke them at init. No features are defined yet. [ leverage vhost_user_call for request handling -- Yuanhan Liu ] Signed-off-by: Michael S. Tsirkin <address@hidden> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Tested-by: Marcel Apfelbaum <marcel@redhat.com>
* | virtio-net: unbreak self announcement and guest offloads after migrationJason Wang2015-09-241-17/+23
|/ | | | | | | | | | | | | | | | | | | | After commit 019a3edbb25f1571e876f8af1ce4c55412939e5d ("virtio: make features 64bit wide"). Device's guest_features was actually set after vdc->load(). This breaks the assumption that device specific load() function can check guest_features. For virtio-net, self announcement and guest offloads won't work after migration. Fixing this by defer them to virtio_net_load() where guest_features were guaranteed to be set. Other virtio devices looks fine. Fixes: 019a3edbb25f1571e876f8af1ce4c55412939e5d ("virtio: make features 64bit wide") Cc: qemu-stable@nongnu.org Cc: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
* Fix bad error handling after memory_region_init_ram()Markus Armbruster2015-09-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Symptom: $ qemu-system-x86_64 -m 10000000 Unexpected error in ram_block_add() at /work/armbru/qemu/exec.c:1456: upstream-qemu: cannot set up guest memory 'pc.ram': Cannot allocate memory Aborted (core dumped) Root cause: commit ef701d7 screwed up handling of out-of-memory conditions. Before the commit, we report the error and exit(1), in one place, ram_block_add(). The commit lifts the error handling up the call chain some, to three places. Fine. Except it uses &error_abort in these places, changing the behavior from exit(1) to abort(), and thus undoing the work of commit 3922825 "exec: Don't abort when we can't allocate guest memory". The three places are: * memory_region_init_ram() Commit 4994653 (right after commit ef701d7) lifted the error handling further, through memory_region_init_ram(), multiplying the incorrect use of &error_abort. Later on, imitation of existing (bad) code may have created more. * memory_region_init_ram_ptr() The &error_abort is still there. * memory_region_init_rom_device() Doesn't need fixing, because commit 33e0eb5 (soon after commit ef701d7) lifted the error handling further, and in the process changed it from &error_abort to passing it up the call chain. Correct, because the callers are realize() methods. Fix the error handling after memory_region_init_ram() with a Coccinelle semantic patch: @r@ expression mr, owner, name, size, err; position p; @@ memory_region_init_ram(mr, owner, name, size, ( - &error_abort + &error_fatal | err@p ) ); @script:python@ p << r.p; @@ print "%s:%s:%s" % (p[0].file, p[0].line, p[0].column) When the last argument is &error_abort, it gets replaced by &error_fatal. This is the fix. If the last argument is anything else, its position is reported. This lets us check the fix is complete. Four positions get reported: * ram_backend_memory_alloc() Error is passed up the call chain, ultimately through user_creatable_complete(). As far as I can tell, it's callers all handle the error sanely. * fsl_imx25_realize(), fsl_imx31_realize(), dp8393x_realize() DeviceClass.realize() methods, errors handled sanely further up the call chain. We're good. Test case again behaves: $ qemu-system-x86_64 -m 10000000 qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory [Exit 1 ] The next commits will repair the rest of commit ef701d7's damage. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1441983105-26376-3-git-send-email-armbru@redhat.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
OpenPOWER on IntegriCloud