path: root/numa.c
Commit message | Author | Age | Files | Lines
* QemuOpts: Convert qemu_opts_foreach() to Error (Author: Markus Armbruster; Age: 2015-06-09; Files: 1; Lines: -2/+2)
  Retain the function value for now, to permit selective conversion of its callers.

  Signed-off-by: Markus Armbruster <armbru@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Acked-by: Kevin Wolf <kwolf@redhat.com>
* QemuOpts: Drop qemu_opts_foreach() parameter abort_on_failure (Author: Markus Armbruster; Age: 2015-06-08; Files: 1; Lines: -2/+1)
  When the argument is non-zero, qemu_opts_foreach() stops on callback returning non-zero, and returns that value. When the argument is zero, it doesn't stop, and returns the bit-wise inclusive or of all the return values. Funky :)

  The callers that pass zero could just as well pass one, because their callbacks can't return anything but zero:

  * qemu_add_globals()'s callback qdev_add_one_global()
  * qemu_config_write()'s callback config_write_opts()
  * main()'s callbacks default_driver_check(), drive_enable_snapshot(), vnc_init_func()

  Drop the parameter, and always stop.

  Signed-off-by: Markus Armbruster <armbru@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Acked-by: Kevin Wolf <kwolf@redhat.com>
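The two iteration behaviours described in this commit message can be modelled with a small sketch. This is not QEMU's code: the callback type `item_func`, the functions `foreach_old()` and `foreach_new()`, and the helper `echo_item()` are hypothetical names invented for illustration, and the real qemu_opts_foreach() iterates over QemuOpts rather than an array of integers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type for this sketch only. */
typedef int (*item_func)(int item, void *opaque);

/*
 * Model of the old behaviour: with abort_on_failure set, stop at the first
 * non-zero callback result and return it; otherwise visit every item and
 * return the bit-wise inclusive OR of all results.
 */
int foreach_old(const int *items, size_t n, item_func func, void *opaque,
                int abort_on_failure)
{
    int rc = 0;

    assert(func != NULL);
    for (size_t i = 0; i < n; i++) {
        int r = func(items[i], opaque);

        if (abort_on_failure) {
            if (r) {
                return r;
            }
        } else {
            rc |= r;
        }
    }
    return rc;
}

/* Model of the behaviour after the parameter is dropped: always stop. */
int foreach_new(const int *items, size_t n, item_func func, void *opaque)
{
    assert(func != NULL);
    for (size_t i = 0; i < n; i++) {
        int r = func(items[i], opaque);

        if (r) {
            return r;
        }
    }
    return 0;
}

/* Example callback: reports the item itself as its status. */
int echo_item(int item, void *opaque)
{
    (void)opaque;
    return item;
}
```

When every callback returns zero, as the commit message says is the case for the callers that passed zero, both variants return zero, which is why dropping the parameter is safe for them.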
* numa: Print warning if no node is assigned to a CPU (Author: Eduardo Habkost; Age: 2015-03-19; Files: 1; Lines: -0/+10)
  We need all possible CPUs (including hotplug ones) to be present in the SRAT when QEMU starts. QEMU already does that correctly today, the only problem is that when a CPU is omitted from the NUMA configuration, it is silently assigned to node 0.

  Check if all CPUs up to max_cpus are present in the NUMA configuration and warn about missing CPUs. Make it just a warning, to allow management software to be updated if necessary. In the future we may make it a fatal error instead.

  Command-line examples:

  * Correct, no warning:

    $ qemu-system-x86_64 -smp 2,maxcpus=4
    $ qemu-system-x86_64 -smp 2,maxcpus=4 -numa node,cpus=0-3

  * Incomplete, with warnings:

    $ qemu-system-x86_64 -smp 2,maxcpus=4 -numa node,cpus=0
    qemu-system-x86_64: warning: CPU(s) not present in any NUMA nodes: 1 2 3
    qemu-system-x86_64: warning: All CPU(s) up to maxcpus should be described in NUMA config

    $ qemu-system-x86_64 -smp 2,maxcpus=4 -numa node,cpus=0-2
    qemu-system-x86_64: warning: CPU(s) not present in any NUMA nodes: 3
    qemu-system-x86_64: warning: All CPU(s) up to maxcpus should be described in NUMA config

  Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
  ---
  v1 -> v2: (no changes)
  v2 -> v3:
  * Use enumerate_cpus() and error_report() for error message
  * Simplify logic using bitmap_full()
  v3 -> v4:
  * Clarify error message, mention that all CPUs up to maxcpus need to be described in NUMA config
  v4 -> v5:
  * Commit log update, to make problem description clearer
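The check described in this commit message can be sketched as a union of per-node CPU masks followed by a scan up to max_cpus. The sketch below is only an approximation under stated assumptions: it uses one 64-bit word per node instead of QEMU's bitmap helpers, and `find_unassigned_cpus()` and `SKETCH_MAX_CPUS` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 64  /* assumption: one 64-bit word per node */

/*
 * Collect the CPU indexes below max_cpus that appear in no node's mask.
 * 'missing' must have room for max_cpus entries. Returns how many CPUs
 * are missing; zero means no warning would be needed.
 */
int find_unassigned_cpus(const uint64_t *node_cpus, int nb_nodes,
                         int max_cpus, int *missing)
{
    uint64_t seen = 0;
    int count = 0;

    assert(max_cpus >= 0 && max_cpus <= SKETCH_MAX_CPUS);

    if (nb_nodes == 0) {
        /* No NUMA configuration at all: nothing to warn about. */
        return 0;
    }
    for (int n = 0; n < nb_nodes; n++) {
        seen |= node_cpus[n];
    }
    for (int cpu = 0; cpu < max_cpus; cpu++) {
        if (!(seen & (UINT64_C(1) << cpu))) {
            missing[count++] = cpu;
        }
    }
    return count;
}
```

With a single node whose mask is 0x1 and max_cpus set to 4, the function reports CPUs 1, 2 and 3, which matches the first "incomplete" example in the commit message.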
* numa: introduce machine callback for VCPU to node mapping (Author: Igor Mammedov; Age: 2015-03-19; Files: 1; Lines: -5/+13)
  The current default round-robin way of distributing VCPUs among NUMA nodes might be wrong in the case of multi-core/multi-thread CPUs, confusing guests about the topology because cores from the same socket end up on different nodes.

  Allow a machine to override the default mapping by providing a MachineClass::cpu_index_to_socket_id() callback, which allows it to group VCPUs from a socket on the same NUMA node.

  Signed-off-by: Igor Mammedov <imammedo@redhat.com>
  Reviewed-by: Andreas Färber <afaerber@suse.de>
  Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
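The difference between the round-robin default and a socket-aware mapping can be illustrated with a minimal sketch. The names `socket_id_func`, `node_for_cpu()` and `four_per_socket()` are hypothetical, and distributing sockets over nodes with a modulo is an assumption made for this example rather than a statement about the exact QEMU implementation.

```c
#include <assert.h>

/* Hypothetical stand-in for a machine's cpu-index-to-socket callback. */
typedef unsigned (*socket_id_func)(unsigned cpu_index);

/*
 * Pick a default node for a VCPU. Without a callback, VCPUs are spread
 * round-robin; with one, all VCPUs of a socket land on the same node.
 */
unsigned node_for_cpu(unsigned cpu_index, unsigned nb_nodes,
                      socket_id_func to_socket)
{
    assert(nb_nodes > 0);
    if (to_socket) {
        return to_socket(cpu_index) % nb_nodes;
    }
    return cpu_index % nb_nodes;
}

/* Hypothetical machine with four VCPUs per socket. */
unsigned four_per_socket(unsigned cpu_index)
{
    return cpu_index / 4;
}
```

With two nodes, plain round-robin puts VCPU 0 on node 0 and VCPU 1 on node 1 even though both belong to socket 0; with the callback, VCPUs 0 to 3 all map to node 0.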
* numa: Reject configuration if CPU appears on multiple nodes (Author: Eduardo Habkost; Age: 2015-03-19; Files: 1; Lines: -0/+37)
  Each CPU can appear in only one NUMA node on the NUMA config. Reject configuration if a CPU appears in multiple nodes.

  Reviewed-by: Igor Mammedov <imammedo@redhat.com>
  Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
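One way to detect a CPU that appears in more than one node is to intersect each node's mask with the CPUs already seen. This is a sketch under the assumption of one 64-bit mask per node; `first_shared_cpu()` is a hypothetical name, and QEMU's own code works on its bitmap type instead.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the lowest CPU index that appears in more than one node mask,
 * or -1 if every CPU appears in at most one node.
 */
int first_shared_cpu(const uint64_t *node_cpus, int nb_nodes)
{
    uint64_t seen = 0;

    assert(nb_nodes >= 0);
    for (int n = 0; n < nb_nodes; n++) {
        uint64_t overlap = seen & node_cpus[n];

        if (overlap) {
            int cpu = 0;

            /* Find the lowest set bit of the overlap. */
            while (!(overlap & 1)) {
                overlap >>= 1;
                cpu++;
            }
            return cpu;
        }
        seen |= node_cpus[n];
    }
    return -1;
}
```

For example, node masks 0x3 (CPUs 0 and 1) and 0x6 (CPUs 1 and 2) share CPU 1, so such a configuration would be rejected.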
* numa: Reject CPU indexes > max_cpus (Author: Eduardo Habkost; Age: 2015-03-19; Files: 1; Lines: -3/+5)
  CPU index is always less than max_cpus, as documented at sysemu.h:

  > The following shall be true for all CPUs:
  >   cpu->cpu_index < max_cpus <= MAX_CPUMASK_BITS

  Reject configuration which uses invalid CPU indexes.

  Reviewed-by: Igor Mammedov <imammedo@redhat.com>
  Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* numa: Fix off-by-one error at MAX_CPUMASK_BITS check (Author: Eduardo Habkost; Age: 2015-03-19; Files: 1; Lines: -2/+2)
  Fix the CPU index check to ensure we don't go beyond the size of the node_cpu bitmap. CPU index is always less than MAX_CPUMASK_BITS, as documented at sysemu.h:

  > The following shall be true for all CPUs:
  >   cpu->cpu_index < max_cpus <= MAX_CPUMASK_BITS

  Reviewed-by: Igor Mammedov <imammedo@redhat.com>
  Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
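The invariant quoted from sysemu.h implies that an index equal to the bitmap size is already out of range. The sketch below contrasts a strict comparison with the kind of off-by-one comparison the commit title refers to. All names are hypothetical, the limit of 255 is a value chosen for illustration, and the exact form of the original faulty check is not shown in this log, so the "off by one" variant is an assumption about the class of bug.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_CPUMASK_BITS 255  /* value chosen for illustration */

/* Off-by-one variant: wrongly accepts an index equal to the bitmap size. */
bool fits_cpu_bitmap_off_by_one(unsigned cpu_index)
{
    return !(cpu_index > SKETCH_MAX_CPUMASK_BITS);
}

/* Strict variant: valid bit positions are 0 .. SKETCH_MAX_CPUMASK_BITS - 1. */
bool fits_cpu_bitmap(unsigned cpu_index)
{
    return cpu_index < SKETCH_MAX_CPUMASK_BITS;
}

/* Full invariant: cpu_index < max_cpus <= MAX_CPUMASK_BITS. */
bool cpu_index_valid(unsigned cpu_index, unsigned max_cpus)
{
    assert(max_cpus <= SKETCH_MAX_CPUMASK_BITS);
    return cpu_index < max_cpus;
}
```

Checking against max_cpus, as the later commit in this log does, is the stricter test, because max_cpus itself can never exceed the bitmap size.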
* numa: remove superfluous '\n' around error_setg (Author: Gonglei; Age: 2015-03-10; Files: 1; Lines: -3/+3)
  Signed-off-by: Gonglei <arei.gonglei@huawei.com>
  Reviewed-by: Markus Armbruster <armbru@redhat.com>
  Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* Merge remote-tracking branch 'remotes/ehabkost/tags/numa-pull-request' into staging (Author: Peter Maydell; Age: 2015-03-02; Files: 1; Lines: -4/+16)
|\
| |   NUMA fixes queue
| |
| |   # gpg: Signature made Mon Feb 23 19:28:42 2015 GMT using RSA key ID 984DC5A6
| |   # gpg: Can't check signature: public key not found
| |
| |   * remotes/ehabkost/tags/numa-pull-request:
| |     numa: Rename set_numa_modes() to numa_post_machine_init()
| |     numa: Rename option parsing functions
| |     numa: Move QemuOpts parsing to set_numa_nodes()
| |     numa: Make max_numa_nodeid static
| |     numa: Move NUMA globals to numa.c
| |     vl.c: Remove unnecessary zero-initialization of NUMA globals
| |     numa: Move NUMA declarations from sysemu.h to numa.h
| |
| |   Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * numa: Rename set_numa_modes() to numa_post_machine_init() (Author: Eduardo Habkost; Age: 2015-02-23; Files: 1; Lines: -1/+1)
| |   This function does some initialization that needs to be done after machine init. The function may eventually be removed if we move the CPUState.numa_node initialization to the CPU init code, but while the function exists, let's give it a name that makes sense.
| |
| |   Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
| |   Acked-by: Michael S. Tsirkin <mst@redhat.com>
| |   Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
| * numa: Rename option parsing functions (Author: Eduardo Habkost; Age: 2015-02-23; Files: 1; Lines: -3/+3)
| |   Renaming set_numa_nodes() and numa_init_func() to parse_numa_opts() and parse_numa() makes the purpose of those functions clearer.
| |
| |   Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
| |   Acked-by: Michael S. Tsirkin <mst@redhat.com>
| |   Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
| * numa: Move QemuOpts parsing to set_numa_nodes() (Author: Eduardo Habkost; Age: 2015-02-23; Files: 1; Lines: -1/+8)
| |   This allows us to make numa_init_func() static.
| |
| |   Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
| |   Acked-by: Michael S. Tsirkin <mst@redhat.com>
| |   Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
| * numa: Make max_numa_nodeid static (Author: Eduardo Habkost; Age: 2015-02-23; Files: 1; Lines: -1/+3)
| |   Now the only code that uses the variable is inside numa.c.
| |
| |   Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
| |   Acked-by: Michael S. Tsirkin <mst@redhat.com>
| |   Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
| * numa: Move NUMA globals to numa.c (Author: Eduardo Habkost; Age: 2015-02-23; Files: 1; Lines: -0/+3)
| |   Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
| |   Acked-by: Michael S. Tsirkin <mst@redhat.com>
| |   Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
| * numa: Move NUMA declarations from sysemu.h to numa.h (Author: Eduardo Habkost; Age: 2015-02-23; Files: 1; Lines: -1/+1)
| |   Not all sysemu.h users need the NUMA declarations, and keeping them in a separate file makes it easier to see which interfaces numa.c provides.
| |
| |   Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
| |   Acked-by: Michael S. Tsirkin <mst@redhat.com>
| |   Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* | numa: Avoid qerror_report_err() outside QMP command handlers (Author: Markus Armbruster; Age: 2015-02-18; Files: 1; Lines: -5/+3)
|/
  qerror_report_err() is a transitional interface to help with converting existing monitor commands to QMP. It should not be used elsewhere.

  Replace by error_report_err() in initial startup helper numa_init_func() and board setup helper memory_region_allocate_system_memory().

  Signed-off-by: Markus Armbruster <armbru@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
* numa: make 'info numa' take into account hotplugged memory (Author: zhanghailiang; Age: 2014-11-11; Files: 1; Lines: -0/+38)
  When memory is hotplugged and NUMA nodes are configured, the size of the hotplugged memory should be added to the memory size of the corresponding node. This affects the result of the HMP command "info numa".

  Reviewed-by: Igor Mammedov <imammedo@redhat.com>
  Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
  Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* memory: add parameter errp to memory_region_init_ram (Author: Hu Tao; Age: 2014-09-09; Files: 1; Lines: -2/+2)
  Add parameter errp to memory_region_init_ram and update all call sites to pass in &error_abort.

  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* hmp: fix MemdevList memory leak (Author: Chen Fan; Age: 2014-09-02; Files: 1; Lines: -7/+2)
  The memdev_list in hmp_info_memdev() is never freed, so use the existing qapi_free_MemdevList() to free it. Also use qapi_free_MemdevList() to replace the list loops that clean up the memdev list in the error path.

  Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
  Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
  Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
  Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* query-memdev: fix potential memory leaks (Author: Chen Fan; Age: 2014-09-02; Files: 1; Lines: -1/+5)
  Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
  Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
  Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
  Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* numa: show hex number in error message for consistency and prefix them with 0x (Author: Hu Tao; Age: 2014-08-14; Files: 1; Lines: -2/+2)
  The error messages before and after patch are:

  before:

    qemu-system-x86_64: total memory for NUMA nodes (134217728) should equal RAM size (20000000)

  after:

    qemu-system-x86_64: total memory for NUMA nodes (0x8000000) should equal RAM size (0x20000000)

  Cc: qemu-stable@nongnu.org
  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
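A minimal sketch of the size check and the "after" message format follows. The message text is taken from the commit message above; everything else is an assumption made for illustration, including the function name `check_numa_total()`, its signature, and writing the message into a caller-supplied buffer instead of reporting it through QEMU's error functions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Sum the per-node memory sizes and compare the total with ram_size.
 * Returns 0 on a match. On a mismatch, writes a message with both values
 * in 0x-prefixed hexadecimal into msg and returns -1.
 */
int check_numa_total(const uint64_t *node_mem, int nb_nodes,
                     uint64_t ram_size, char *msg, size_t msg_len)
{
    uint64_t total = 0;

    assert(msg != NULL && msg_len > 0);
    for (int n = 0; n < nb_nodes; n++) {
        total += node_mem[n];
    }
    if (total == ram_size) {
        msg[0] = '\0';
        return 0;
    }
    snprintf(msg, msg_len,
             "total memory for NUMA nodes (0x%" PRIx64
             ") should equal RAM size (0x%" PRIx64 ")",
             total, ram_size);
    return -1;
}
```

Printing both numbers in the same base with an explicit prefix avoids the ambiguity of the "before" message, where one value was decimal and the other was unprefixed hexadecimal.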
* numa: check for busy memory backend (Author: Hu Tao; Age: 2014-07-06; Files: 1; Lines: -0/+8)
  Specifying the same memory backend twice leads to an assert:

    ./x86_64-softmmu/qemu-system-x86_64 -m 512M -enable-kvm -object memory-backend-ram,size=256M,id=ram0 -numa node,nodeid=0,memdev=ram0 -numa node,nodeid=1,memdev=ram0
    qemu-system-x86_64: /scm/qemu/memory.c:1506: memory_region_add_subregion_common: Assertion `!subregion->container' failed.
    Aborted (core dumped)

  Detect and exit with an error message instead.

  Reviewed-by: Igor Mammedov <imammedo@redhat.com>
  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* numa: Reject configuration if not all node IDs are present (Author: Eduardo Habkost; Age: 2014-06-29; Files: 1; Lines: -1/+16)
  We don't support sparse NUMA node IDs yet, so this changes QEMU to reject configs where not all nodes are present.

  Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
* numa: Reject duplicate node IDs (Author: Eduardo Habkost; Age: 2014-06-29; Files: 1; Lines: -0/+5)
  The same nodeid shouldn't appear multiple times in the command-line. In addition to detecting command-line mistakes, this will fix a bug where nb_numa_nodes may become larger than MAX_NODES (and cause out-of-bounds access on the numa_info array).

  Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
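The node-ID checks in this part of the log (duplicate IDs, and gaps in the ID range) can both be built on a per-node "present" flag. The following is a sketch under assumptions, not the code from numa.c: `struct numa_sketch`, `add_node()`, `first_missing_node()` and the limit `SKETCH_MAX_NODES` are hypothetical names and values.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_NODES 8  /* value chosen for illustration */

struct numa_sketch {
    bool present[SKETCH_MAX_NODES]; /* node ID seen on the command line */
    unsigned nb_nodes;              /* number of nodes accepted so far */
    unsigned max_nodeid;            /* highest accepted node ID, plus one */
};

/*
 * Record a node ID. Returns 0 on success, -1 if the ID is out of range
 * or was already given. Rejecting duplicates keeps nb_nodes from growing
 * past SKETCH_MAX_NODES.
 */
int add_node(struct numa_sketch *s, unsigned nodeid)
{
    assert(s);
    if (nodeid >= SKETCH_MAX_NODES || s->present[nodeid]) {
        return -1;
    }
    s->present[nodeid] = true;
    s->nb_nodes++;
    if (nodeid + 1 > s->max_nodeid) {
        s->max_nodeid = nodeid + 1;
    }
    return 0;
}

/* Return the lowest missing node ID below max_nodeid, or -1 if none. */
int first_missing_node(const struct numa_sketch *s)
{
    assert(s);
    for (unsigned i = 0; i < s->max_nodeid; i++) {
        if (!s->present[i]) {
            return (int)i;
        }
    }
    return -1;
}
```

In this model, giving node IDs 0 and 2 leaves a gap at 1, so a configuration like that would be rejected until sparse node IDs are supported.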
* numa: Keep track of NUMA nodes present on the command-line (Author: Eduardo Habkost; Age: 2014-06-29; Files: 1; Lines: -0/+2)
  Based on "enable sparse node numbering" patch from Nishanth Aravamudan, but without the code to actually support sparse node IDs. This just adds the code to keep track of present/non-present nodes on the command-line, without changing any behavior.

  Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
  [Rename max_numa_node to max_numa_nodeid -Eduardo]
  [Initialize max_numa_nodeid to 0 -Eduardo]
  [Use MAX() macro when setting max_numa_nodeid -Eduardo]
  Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
* numa: fix comment (Author: Michael S. Tsirkin; Age: 2014-06-29; Files: 1; Lines: -1/+1)
  s/if given for/is given for/;

  Reported-by: Hu Tao <hutao@cn.fujitsu.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* numa: fix comment (Author: Michael S. Tsirkin; Age: 2014-06-29; Files: 1; Lines: -1/+1)
  Fix up English in comments: s/the each/each/

  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  Reviewed-by: Igor Mammedov <imammedo@redhat.com>
* numa: handle mmaped memory allocation failure correctly (Author: Igor Mammedov; Age: 2014-06-19; Files: 1; Lines: -1/+1)
  When memory_region_init_ram_from_file() fails, memory_region_size() will still return the size that was provided at region init time. Instead, use errp to properly detect the error condition.

  Signed-off-by: Igor Mammedov <imammedo@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* qmp: add query-memdev (Author: Hu Tao; Age: 2014-06-19; Files: 1; Lines: -0/+84)
  Add qmp command query-memdev to query for information of memory devices

  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hostmem: add property to map memory with MAP_SHARED (Author: Paolo Bonzini; Age: 2014-06-19; Files: 1; Lines: -1/+1)
  A new "share" property can be used with the "memory-file" backend to map memory with MAP_SHARED instead of MAP_PRIVATE.

  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* memory: add error propagation to file-based RAM allocation (Author: Paolo Bonzini; Age: 2014-06-19; Files: 1; Lines: -1/+12)
  Right now, -mem-path will fall back to RAM-based allocation in some cases. This should never happen with "-object memory-file"; prepare the code by adding correct error propagation.

  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

  MST: drop \n at end of error messages
* memory: move mem_path handling to memory_region_allocate_system_memory (Author: Paolo Bonzini; Age: 2014-06-19; Files: 1; Lines: -1/+10)
  Like the previous patch did in exec.c, split memory_region_init_ram and memory_region_init_ram_from_file, and push mem_path one step further up. Other RAM regions than system memory will now be backed by regular RAM.

  Also, boards that do not use memory_region_allocate_system_memory will not support -mem-path anymore. This can be changed before the patches are merged by migrating boards to use the function.

  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* numa: add -numa node,memdev= option (Author: Paolo Bonzini; Age: 2014-06-19; Files: 1; Lines: -2/+63)
  This option provides the infrastructure for binding guest NUMA nodes to host NUMA nodes. For example:

    -object memory-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 \
    -numa node,nodeid=0,cpus=0,memdev=ram-node0 \
    -object memory-ram,size=1024M,policy=interleave,host-nodes=1-3,id=ram-node1 \
    -numa node,nodeid=1,cpus=1,memdev=ram-node1

  The option replaces "-numa node,mem=".

  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>

  MST: conflict resolution
* numa: introduce memory_region_allocate_system_memory (Author: Paolo Bonzini; Age: 2014-06-19; Files: 1; Lines: -0/+9)
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>
  Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

  MST: resolve conflicts
* NUMA: convert -numa option to use OptsVisitor (Author: Wanlong Gao; Age: 2014-06-19; Files: 1; Lines: -75/+70)
  Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
  Signed-off-by: Igor Mammedov <imammedo@redhat.com>
  Tested-by: Eduardo Habkost <ehabkost@redhat.com>
  Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* NUMA: Add numa_info structure to contain numa nodes info (Author: Wanlong Gao; Age: 2014-06-19; Files: 1; Lines: -11/+12)
  Add the numa_info structure to hold each NUMA node's memory and VCPU information, and the host memory policies for NUMA nodes that will be added in the future.

  Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
  Signed-off-by: Andre Przywara <andre.przywara@amd.com>
  Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
  [Fix hw/ppc/spapr.c - Paolo]
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* NUMA: check if the total numa memory size is equal to ram_size (Author: Wanlong Gao; Age: 2014-06-19; Files: 1; Lines: -0/+14)
  If the total memory assigned to the NUMA nodes is not equal to the assigned RAM size, wrong data is written to the ACPI table; the guest then ignores the wrong ACPI table and treats all memory as belonging to one node. This is a bug: check the sizes to ensure that the right data is written to the ACPI table.

  Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
  Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>

  MST: error message reworded
* NUMA: move numa related code to new file numa.c (Author: Wanlong Gao; Age: 2014-06-19; Files: 1; Lines: -0/+185)
  Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
  Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
  Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
  Signed-off-by: Andre Przywara <andre.przywara@amd.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>

  MST: comment tweaks