summaryrefslogtreecommitdiffstats
path: root/hw
Commit message (Collapse)AuthorAgeFilesLines
...
* virtio-balloon: reset the statistic timer to load devicePavel Butsykin2019-11-291-0/+4
| | | | | | | | | | | | | | | | If before loading snapshot we had set the timer of statistics, then after applying snapshot the expiry time would be irrelevant for the restored state of the virtual clocks. A simple fix is just to restart the timer after loading snapshot. For the user it may look like a long delay of statistics update after switch to the snapshot. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Migration: Add i82801b11 migration dataDr. David Alan Gilbert2019-11-291-0/+9
| | | | | | | | | | | | | | | The i82801b11 bridge didn't have a vmsd and thus didn't send any migration data, including that of its parent PCIBridge object. The symptom being if the guest used any devices behind the bridge the guest crashed (mostly with various interrupt related issues). Note: This will cause migration from old qemus that used this device to explicitly fail during migration as opposed to the guest crashing. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Suggested-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Sort the fw_cfg file listGerd Hoffmann2019-11-295-8/+143
| | | | | | | | | | | | | | | | | | | | | | | | Entries are inserted in filename order instead of being appended to the end in case sorting is enabled. This will avoid any future issues of moving the file creation around, it doesn't matter what order they are created now, the will always be in filename order. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Added machine type handling for compatibility. This was a fairly complex change, this will preserve the order of fw_cfg for older versions no matter what order the firmware files actually come in. A list is kept of the correct legacy order and the entries will be inserted based upon their order in the list. Except that some entries are ordered (in a specific area of the list) based upon what order they appear on the command line. Special handling is added for those entries. Signed-off-by: Corey Minyard <cminyard@mvista.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* xen: piix reuse pci generic class init functionMichael S. Tsirkin2019-11-291-13/+1
| | | | | | | | | | | piix3_ide_xen_class_init is identical to piix3_ide_class_init except it's buggy as it does not set exit and does not disable hotplug properly. Switch to the generic one. Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci-testdev: fast mmio supportMichael S. Tsirkin2019-11-291-1/+6
| | | | | | | | | | | Teach PCI testdev to use fast MMIO when kvm makes it available. Before: mmio-wildcard-eventfd:pci-mem 2271 After: mmio-wildcard-eventfd:pci-mem 1218 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* rtl8139: using CP_TX_OWN for ownership transferring during txJason Wang2019-11-291-1/+1
| | | | | | | Through CP_TX_OWN and CP_RX_OWN points to the same bit, we'd better use CP_TX_OWN for tx descriptor handling. Signed-off-by: Jason Wang <jasowang@redhat.com>
* spapr_drc: enable immediate detach for unsignalled devicesMichael Roth2019-11-292-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently spapr doesn't support "aborting" hotplug of PCI devices by allowing device_del to immediately remove the device if we haven't signalled the presence of the device to the guest. In the past this wasn't an issue, since we always immediately signalled device attach and simply relied on full guest-aware add->remove path for device removal. However, as of 788d259, we now defer signalling for PCI functions until function 0 is attached, so now we need to deal with these "abort" operations for cases where a user hotplugs a non-0 function, then opts to remove it prior hotplugging function 0. Currently they'd have to reboot before the unplug completed. PCIe multifunction hotplug does not have this requirement however, so from a management implementation perspective it would be good to address this within the same release as 788d259. We accomplish this by simply adding a 'signalled' flag to track whether a device hotplug event has been sent to the guest. If it hasn't, we allow immediate removal under the assumption that the guest will not be using the device. Devices present at boot/reset time are also assumed to be 'signalled'. For CPU/memory/etc, signalling will still happen immediately as part of device_add, so only PCI functions should be affected. Cc: bharata@linux.vnet.ibm.com Cc: david@gibson.dropbear.id.au Cc: sbhat@linux.vnet.ibm.com Cc: qemu-ppc@nongnu.org Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> [dwg: This fixes a regression where an incorrect hot-add of a non-zero function can no longer be backed out until function 0 is added] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ppc: Rework POWER7 & POWER8 exception modelCédric Le Goater2019-11-291-15/+1
| | | | | | | | | | | | | | | | | | From: Benjamin Herrenschmidt <benh@kernel.crashing.org> This patch fixes the current AIL implementation for POWER8. The interrupt vector address can be calculated directly from LPCR when the exception is handled. The excp_prefix update becomes useless and we can cleanup the H_SET_MODE hcall. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: Removed LPES0/1 handling for HV vs. !HV Fixed LPCR_ILE case for POWERPC_EXCP_POWER8 ] Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> [dwg: This was written as a cleanup, but it also fixes a real bug where setting an alternative interrupt location would not be correctly migrated] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* hw/arm/bcm2836: Wire up CPU timer interrupts correctlyPeter Maydell2019-11-291-1/+5
| | | | | | | | | | | | Wire up the CPU timer interrupts in the right order, with the nonsecure physical timer on cntpnsirq, the hyp timer on cnthpirq, and the secure physical timer on cntpsirq. (We did get the virt timer right, at least.) Reported-by: Antonio Huete Jiménez <tuxillo@quantumachine.net> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Message-id: 1458210790-6621-1-git-send-email-peter.maydell@linaro.org
* opencores_eth: indicate autonegotiation completionMax Filippov2019-11-291-1/+1
| | | | | | | | Indicate that autonegotiation is complete in the MII BMSR. This fixes networking on xtfpga platform in linux v4.5. Cc: qemu-stable@nongnu.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
* block: m25p80: at25128a/at25256a modelsMarcin Krzeminski2019-11-291-2/+13
| | | | | | | Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-id: 1458719789-29868-12-git-send-email-marcin.krzeminski@nokia.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* block: m25p80: n25q256a/n25q512a modelsMarcin Krzeminski2019-11-291-1/+2
| | | | | | | Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-id: 1458719789-29868-11-git-send-email-marcin.krzeminski@nokia.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* block: m25p80: Implemented FSR registerMarcin Krzeminski2019-11-291-0/+15
| | | | | | | | | Implements FSR register, it is used for busy waits. Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-id: 1458719789-29868-10-git-send-email-marcin.krzeminski@nokia.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* block: m25p80: Fast read and 4bytes commandsMarcin Krzeminski2019-11-291-4/+46
| | | | | | | | | | Adds fast read and 4bytes commands family. This work is based on Pawel Lenkow patch from v1. Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-id: 1458719789-29868-9-git-send-email-marcin.krzeminski@nokia.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* block: m25p80: Dummy cycles for N25Q256/512Marcin Krzeminski2019-11-291-3/+11
| | | | | | | | | | Use the setting from the volatile cfg register to correctly set the number of dummy cycles. Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-id: 1458719789-29868-8-git-send-email-marcin.krzeminski@nokia.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* block: m25p80: Add configuration registersMarcin Krzeminski2019-11-291-0/+128
| | | | | | | | | | | | | This patch adds both volatile and non volatile configuration registers and commands to allow modify them. It is needed for proper handling dummy cycles. Initialization of those registers and flash state has been included as well. Some of this registers are used by kernel. Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com> Acked-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-id: 1458719789-29868-7-git-send-email-marcin.krzeminski@nokia.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* block: m25p80: 4byte address modeMarcin Krzeminski2019-11-291-10/+33
| | | | | | | | | | This patch adds only 4byte address mode (does not cover dummy cycles). This mode is needed to access more than 16 MiB of flash. Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-id: 1458719789-29868-6-git-send-email-marcin.krzeminski@nokia.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* block: m25p80: Extend address modeMarcin Krzeminski2019-11-291-0/+27
| | | | | | | | | | | Extend address mode allows to switch flash 16 MiB banks, allowing user to access all flash sectors. This access mode is used by u-boot. Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-id: 1458719789-29868-5-git-send-email-marcin.krzeminski@nokia.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* block: m25p80: Widen flags variableMarcin Krzeminski2019-11-291-1/+1
| | | | | | | | | | | Extend the width of the flags variable to support the already existing (but unused) WR_1 flag, which is above the range of 8 bits. This allows support of EEPROM emulation which requires the WR_1 feature. Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-id: 1458719789-29868-4-git-send-email-marcin.krzeminski@nokia.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* block: m25p80: RESET_ENABLE and RESET_MEMORY commandsMarcin Krzeminski2019-11-291-1/+40
| | | | | | | Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-id: 1458719789-29868-3-git-send-email-marcin.krzeminski@nokia.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* block: m25p80: Removed unused variableMarcin Krzeminski2019-11-291-2/+0
| | | | | | | Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-id: 1458719789-29868-2-git-send-email-marcin.krzeminski@nokia.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* ARM: Virt: Use gpio_key for power buttonShannon Zhao2019-11-291-2/+5
| | | | | | | | | | | | There is a problem for power button that it will not work if an early system_powerdown request happens before guest gpio driver loads. Fix this problem by using gpio_key. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Message-id: 1458221140-15232-3-git-send-email-zhaoshenglong@huawei.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* hw/gpio: Add the emulation of gpio_keyShannon Zhao2019-11-292-0/+105
| | | | | | | | | | | This will be used by ARM virt machine as a power button. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Message-id: 1458221140-15232-2-git-send-email-zhaoshenglong@huawei.com [PMM: Use hyphen rather than underscore in type names; add a comment briefly describing what the device does] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* xen_disk: Call blk_set_enable_write_cache() explicitlyKevin Wolf2019-11-291-1/+4
| | | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
* hw/mips/cps: enable ITU for multithreading processorsLeon Alrae2019-11-291-0/+32
| | | | | | | Make ITU available in the system if CPU supports multithreading and is part of CPS. Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips: implement ITC Storage - Bypass ViewLeon Alrae2019-11-291-0/+27
| | | | | | | | | | Bypass View does not cause issuing thread to block and does not affect any of the cells state bit. Read from a FIFO cell returns the value of the oldest entry. Store to a FIFO cell changes the value of the newest entry. Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips: implement ITC Storage - P/V Sync and Try ViewsLeon Alrae2019-11-291-0/+68
| | | | | | | | | | | | | | | | P/V Synchronized and Try Views can be used to access Semaphore cells. Load returns current value and post-decrements the value in the cell (until it reaches zero). Stores increment the value (until it saturates at 0xFFFF). P/V Synchronized View causes the issuing thread to block on read if value is 0. P/V Try View does not block the thread, it returns 0 in this case. Cell's Empty and Full bits are not modified. Trap bit (i.e. Gating Storage exceptions) not implemented. Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips: implement ITC Storage - Empty/Full Sync and Try ViewsLeon Alrae2019-11-291-0/+113
| | | | | | | | | | | | | | | | | | | | Empty/Full Synchronized and Try views can be used to access FIFO cells. Store to the FIFO cell pushes the value into the queue, load pops the oldest element from the queue. Cell's Full and Empty bits are automatically updated to reflect new state of the cell. Empty/Full Synchronized View causes the issuing thread to block when FIFO is empty while thread is performing a read, or FIFO is full while thread is performing a write. Empty/Full Try View never blocks the thread. If cell is full then write is ignored, if cell is empty then load returns 0. Trap bit (i.e. Gating Storage exceptions) not implemented. Store Conditional support for E/F Try View (i.e. indicate failure if FIFO is full) not implemented. Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips: implement ITC Storage - Control ViewLeon Alrae2019-11-291-0/+104
| | | | | | | | | Control view is used to access the ITC Storage Cell Tags. It never causes the issuing thread to block. Guest can empty the FIFO cell by setting Empty bit to 1. Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips: implement ITC Configuration Tags and Storage CellsLeon Alrae2019-11-292-0/+215
| | | | | | | | | | | | | | | | | | | | | | Implement ITC as a single object consisting of two memory regions: 1) tag_io: ITC Configuration Tags (i.e. ITCAddressMap{0,1} registers) which are accessible by the CPU via CACHE instruction. Also adding MemoryRegion *itc_tag to the CPUMIPSState so that CACHE instruction will dispatch reads/writes directly. 2) storage_io: memory-mapped ITC Storage whose address space is configurable (i.e. enabled/remapped/resized) by writing to ITCAddressMap{0,1} registers. ITC Storage contains FIFO and Semaphore cells. Read-only FIFO bit in the ITC cell tag indicates the type of the cell. If the ITC Storage contains both types of cells then FIFOs are located before Semaphores. Since issuing thread can get blocked on the access to a cell (in E/F Synchronized and P/V Synchronized Views) each cell has a bitmap to track which threads are currently blocked. Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips_malta: add CPS to Malta boardLeon Alrae2019-11-291-11/+49
| | | | | | | | | | If the user specifies smp > 1 and the CPU with CM GCR support, then create Coherent Processing System (which takes care of instantiating CPUs) rather than CPUs directly and connect i8259 and cbus to the pins exposed by CPS. However, there is no GIC yet, thus CPS exposes CPU's IRQ pins so use the same pin numbers as before. Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips_malta: move CPU creation to a separate functionLeon Alrae2019-11-291-29/+39
| | | | Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips_malta: remove redundant irq and clock initLeon Alrae2019-11-291-4/+0
| | | | | | | | | | | Global smp_cpus is never zero (even if user provides -smp 0), thus clocks and irqs are always initialized for each created CPU in the loop at the beginning of mips_malta_init. These two lines cause a leak of already allocated timer and irqs for the first CPU - remove them. Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips_malta: remove CPUMIPSState from the write_bootloader()Leon Alrae2019-11-291-4/+4
| | | | | | | Remove CPUMIPSState from the write_bootloader() argument list as it is not used in the function. Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips/cps: create CPC block inside CPSLeon Alrae2019-11-292-0/+69
| | | | | | | | | | | Create Cluster Power Controller and add a link to the CPC MemoryRegion in GCR. Guest can enable / map CPC to any physical address by writing to the memory-mapped GCR_CPC_BASE register. Set vp-start-reset property to 1 to allow only first VP to run from reset. Others are brought up by the guest via CPC memory-mapped registers. Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips: add initial Cluster Power Controller supportLeon Alrae2019-11-292-0/+178
| | | | | | | | | | | | | | Cluster Power Controller (CPC) is responsible for power management in multiprocessing system. It provides registers to control the power and the clock frequency of the individual elements in the system. This patch implements only three registers that are used to control the power state of each VP on a single core: * VP Run is a write-only register used to set each VP to the run state * VP Stop is a write-only register used to set each VP to the suspend state * VP Running is a read-only register indicating the run state of each VP Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips/cps: create GCR block inside CPSLeon Alrae2019-11-291-0/+23
| | | | Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips: add initial Global Config Register supportYongbok Kim2019-11-292-0/+108
| | | | | | | | | | | | | | Add initial GCR support to indicate number of VPs present in the system, L2 bypass mode and revision number. Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com> [leon.alrae@imgtec.com: * removed GIC part, * changed commit message, * replaced %lx format spec. with PRIx64, * renamed mips_gcr.{c,h} to mips_cmgcr.{c,h}, * replaced CONFIG_MIPS_GIC with CONFIG_MIPS_CPS] Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* hw/mips: implement generic MIPS Coherent Processing System containerLeon Alrae2019-11-292-0/+110
| | | | | | | | | Implement generic MIPS Coherent Processing System (CPS) which in this commit just creates VPs, but it will serve as a container also for other components like Global Configuration Registers and Cluster Power Controller. Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
* Revert "e1000: fix hang of win2k12 shutdown with flood ping"Sameeh Jubran2019-11-291-5/+0
| | | | | | | | | | | This reverts commit 9596ef7c7b8528bedb240792ea1fb598543ad3c4. This workaround in order to fix endless interrupts is no longer needed because it was superseded by the previous patch (e1000: Fixing interrupt pace). Signed-off-by: Sameeh Jubran <sameeh@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* e1000: Fixing interrupts pace.Sameeh Jubran2019-11-291-0/+8
| | | | | | | | | | | | | | | | | | | | | | This patch introduces an upper bound for number of interrupts per second. Without this bound an interrupt storm can occur as it has been observed on Windows 10 when disabling the device. According to the SPEC - Intel PCI/PCI-X Family of Gigabit Ethernet Controllers Software Developer's Manual, section 13.4.18 - the Ethernet controller guarantees a maximum observable interrupt rate of 7813 interrupts/sec. If there is no upper bound this could lead to an interrupt storm by e1000 (when mit_delay < 500) causing interrupts to fire at a very high pace. Thus if mit_delay < 500 then the delay should be set to the minimum delay possible which is 500. This can be calculated easily as follows: Interval = 10^9 / (7813 * 256) = 500. Signed-off-by: Sameeh Jubran <sameeh@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* vfio: convert to 128 bit arithmetic calculations when adding mem regionsBandan Das2019-11-291-8/+11
| | | | | | | | | | | | | vfio_listener_region_add for a iommu mr results in an overflow assert since iommu memory region is initialized with UINT64_MAX. Convert calculations to 128 bit arithmetic for iommu memory regions and let int128_get64 assert for non iommu regions if there's an overflow. Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bandan Das <bsd@redhat.com> [missed (end - 1) on 2nd trace call, move llsize closer to use] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* hw/net/spapr_llan: Enable the RX buffer pools by default for new machinesThomas Huth2019-11-292-2/+7
| | | | | | | | | | | RX buffer pools are now enabled by default for new machine types. For older machine types, they are still disabled to avoid breaking migration. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* hw/net/spapr_llan: Fix receive buffer handling for better performanceThomas Huth2019-11-291-2/+216
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tl;dr: This patch introduces an alternate way of handling the receive buffers of the spapr-vlan device, resulting in much better receive performance for the guest. Full story: One of our testers recently discovered that the performance of the spapr-vlan device is very poor compared to other NICs, and that a simple "ping -i 0.2 -s 65507 someip" in the guest can result in more than 50% lost ping packets (especially with older guest kernels < 3.17). After doing some analysis, it was clear that there is a problem with the way we handle the receive buffers in spapr_llan.c: The ibmveth driver of the guest Linux kernel tries to add a lot of buffers into several buffer pools (with 512, 2048 and 65536 byte sizes by default, but it can be changed via the entries in the /sys/devices/vio/1000/pool* directories of the guest). However, the spapr-vlan device of QEMU only tries to squeeze all receive buffer descriptors into one single page which has been supplied by the guest during the H_REGISTER_LOGICAL_LAN call, without taking care of different buffer sizes. This has two bad effects: First, only a very limited number of buffer descriptors is accepted at all. Second, we also hand 64k buffers to the guest even if the 2k buffers would fit better - and this results in dropped packets in the IP layer of the guest since too much skbuf memory is used. Though it seems at a first glance like PAPR says that we should store the receive buffer descriptors in the page that is supplied during the H_REGISTER_LOGICAL_LAN call, chapter 16.4.1.2 in the LoPAPR spec declares that "the contents of these descriptors are architecturally opaque, none of these descriptors are manipulated by code above the architected interfaces". That means we don't have to store the RX buffer descriptors in this page, but can also manage the receive buffers at the hypervisor level only. This is now what we are doing here: Introducing proper RX buffer pools which are also sorted by size of the buffers, so we can hand out a buffer with the best fitting size when a packet has been received. To avoid problems with migration from/to older version of QEMU, the old behavior is also retained and enabled by default. The new buffer management has to be enabled via a new "use-rx-buffer-pools" property. Now with the new buffer pool management enabled, the problem with "ping -s 65507" is fixed for me, and the throughput of a simple test with wget increases from creeping 3MB/s up to 20MB/s! Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* hw/net/spapr_llan: Extract rx buffer code into separate functionsThomas Huth2019-11-291-36/+70
| | | | | | | | | | | | | Refactor the code a little bit by extracting the code that reads and writes the receive buffer list page into separate functions. There should be no functional change in this patch, this is just a preparation for the upcoming extensions that introduce receive buffer pools. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ppc: Create cpu_ppc_set_papr() helperBenjamin Herrenschmidt2019-11-291-9/+2
| | | | | | | | | | | | | And move the code adjusting the MSR mask and calling kvmppc_set_papr() to it. This allows us to add a few more things such as disabling setting of MSR:HV and appropriate LPCR bits which will be used when fixing the exception model. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [clg: removed LPCR setting ] Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr/target-ppc/kvm: Only add hcall-instructions if KVM supports itAlexey Kardashevskiy2019-11-291-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ePAPR defines "hcall-instructions" device-tree property which contains code to call hypercalls in ePAPR paravirtualized guests. In general pseries guests won't use this property, instead using the PAPR defined hypercall interface. However, this property has been re-used to implement a hack to allow PR KVM to run (slightly modified) guests in some situations where it otherwise wouldn't be able to (because the system's L0 hypervisor doesn't forward the PAPR hypercalls to the PR KVM kernel). Hence, this property is always present in the device tree for pseries guests. All KVM guests use it at least to read features via the KVM_HC_FEATURES hypercall. The property is populated by the code returned from the KVM's KVM_PPC_GET_PVINFO ioctl; if not implemented in the KVM, QEMU supplies code which will fail all hypercall attempts. If QEMU does not create the property, and the guest kernel is compiled with CONFIG_EPAPR_PARAVIRT (which is normally the case), there is exactly the same stub at @epapr_hypercall_start already. Rather than maintaining this fairly useless stub implementation, it makes more sense not to create the property in the device tree in the first place if the host kernel does not implement it. This changes kvmppc_get_hypercall() to return 1 if the host kernel does not implement KVM_CAP_PPC_GET_PVINFO. The caller can use it to decide on whether to create the property or not. This changes the pseries machine to not create the property if KVM does not implement KVM_PPC_GET_PVINFO. In practice this means that from now on the property will not be created if either HV KVM or TCG is used. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [reworded commit message for clarity --dwg] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* util: move declarations out of qemu-common.hVeronia Bahaa2019-11-2949-6/+52
| | | | | | | | | | Move declarations out of qemu-common.h for functions declared in utils/ files: e.g. include/qemu/path.h for utils/path.c. Move inline functions out of qemu-common.h and into new files (e.g. include/qemu/bcd.h) Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Replaced get_tick_per_sec() by NANOSECONDS_PER_SECONDRutuja Shah2019-11-2942-108/+123
| | | | | | | | | | | | | | | | | | This patch replaces get_ticks_per_sec() calls with the macro NANOSECONDS_PER_SECOND. Also, as there are no callers, get_ticks_per_sec() is then removed. This replacement improves the readability and understandability of code. For example, timer_mod(fdctrl->result_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + (get_ticks_per_sec() / 50)); NANOSECONDS_PER_SECOND makes it obvious that qemu_clock_get_ns matches the unit of the expression on the right side of the plus. Signed-off-by: Rutuja Shah <rutu.shah.26@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* hw: explicitly include qemu-common.h and cpu.hPaolo Bonzini2019-11-2985-1/+164
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
OpenPOWER on IntegriCloud