summaryrefslogtreecommitdiffstats
path: root/arch/powerpc
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'akpm' (patches from Andrew)Linus Torvalds2016-03-189-42/+34
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge second patch-bomb from Andrew Morton: - a couple of hotfixes - the rest of MM - a new timer slack control in procfs - a couple of procfs fixes - a few misc things - some printk tweaks - lib/ updates, notably to radix-tree. - add my and Nick Piggin's old userspace radix-tree test harness to tools/testing/radix-tree/. Matthew said it was a godsend during the radix-tree work he did. - a few code-size improvements, switching to __always_inline where gcc screwed up. - partially implement character sets in sscanf * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits) sscanf: implement basic character sets lib/bug.c: use common WARN helper param: convert some "on"/"off" users to strtobool lib: add "on"/"off" support to kstrtobool lib: update single-char callers of strtobool() lib: move strtobool() to kstrtobool() include/linux/unaligned: force inlining of byteswap operations include/uapi/linux/byteorder, swab: force inlining of some byteswap operations include/asm-generic/atomic-long.h: force inlining of some atomic_long operations usb: common: convert to use match_string() helper ide: hpt366: convert to use match_string() helper ata: hpt366: convert to use match_string() helper power: ab8500: convert to use match_string() helper power: charger_manager: convert to use match_string() helper drm/edid: convert to use match_string() helper pinctrl: convert to use match_string() helper device property: convert to use match_string() helper lib/string: introduce match_string() helper radix-tree tests: add test for radix_tree_iter_next radix-tree tests: add regression3 test ...
| * param: convert some "on"/"off" users to strtoboolKees Cook2016-03-172-15/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This changes several users of manual "on"/"off" parsing to use strtobool. Some side-effects: - these uses will now parse y/n/1/0 meaningfully too - the early_param uses will now bubble up parse errors Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Amitkumar Karwar <akarwar@marvell.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Joe Perches <joe@perches.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nishant Sarmukadam <nishants@marvell.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Steve French <sfrench@samba.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * powerpc/mm: enable page parallel initialisationLi Zhang2016-03-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Parallel initialisation has been enabled for X86, boot time is improved greatly. On Power8, it is improved greatly for small memory. Here is the result from my test on Power8 platform: For 4GB of memory, boot time is improved by 59%, from 24.5s to 10s. For 50GB memory, boot time is improved by 22%, from 56.8s to 43.8s. Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * mm: introduce page reference manipulation functionsJoonsoo Kim2016-03-173-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The success of CMA allocation largely depends on the success of migration and key factor of it is page reference count. Until now, page reference is manipulated by direct calling atomic functions so we cannot follow up who and where manipulate it. Then, it is hard to find actual reason of CMA allocation failure. CMA allocation should be guaranteed to succeed so finding offending place is really important. In this patch, call sites where page reference is manipulated are converted to introduced wrapper function. This is preparation step to add tracepoint to each page reference manipulation function. With this facility, we can easily find reason of CMA allocation failure. There is no functional change in this patch. In addition, this patch also converts reference read sites. It will help a second step that renames page._count to something else and prevents later attempt to direct access to it (Suggested by Andrew). Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * powerpc: query dynamic DEBUG_PAGEALLOC settingJoonsoo Kim2016-03-173-23/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We can disable debug_pagealloc processing even if the code is compiled with CONFIG_DEBUG_PAGEALLOC. This patch changes the code to query whether it is enabled or not in runtime. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'gpio-v4.6-1' of ↵Linus Torvalds2016-03-173-6/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO updates from Linus Walleij: "This is the bulk of GPIO changes for kernel v4.6. There is quite a lot of interesting stuff going on. The patches to other subsystems and arch-wide are ACKed as far as possible, though I consider things like per-arch <asm/gpio.h> as essentially a part of the GPIO subsystem so it should not be needed. Core changes: - The gpio_chip is now a *real device*. Until now the gpio chips were just piggybacking the parent device or (gasp) floating in space outside of the device model. We now finally make GPIO chips devices. The gpio_chip will create a gpio_device which contains a struct device, and this gpio_device struct is kept private. Anything that needs to be kept private from the rest of the kernel will gradually be moved over to the gpio_device. - As a result of making the gpio_device a real device, we have added resource management, so devm_gpiochip_add_data() will cut down on overhead and reduce code lines. A huge slew of patches convert almost all drivers in the subsystem to use this. - Building on making the GPIO a real device, we add the first step of a new userspace ABI: the GPIO character device. We take small steps here, so we first add a pure *information* ABI and the tool "lsgpio" that will list all GPIO devices on the system and all lines on these devices. We can now discover GPIOs properly from userspace. We still have not come up with a way to actually *use* GPIOs from userspace. - To encourage people to use the character device for the future, we have it always-enabled when using GPIO. The old sysfs ABI is still opt-in (and can be used in parallel), but is marked as deprecated. We will keep it around for the foreseeable future, but it will not be extended to cover ever more use cases. Cleanup: - Bjorn Helgaas removed a whole slew of per-architecture <asm/gpio.h> includes. This dates back to when GPIO was an opt-in feature and no shared library even existed: just a header file with proper prototypes was provided and all semantics were up to the arch to implement. These patches make the GPIO chip even more a proper device and cleans out leftovers of the old in-kernel API here and there. Still some cruft is left but it's very little now. - There is still some clamping of return values for .get() going on, but we now return sane values in the vast majority of drivers and the errorpath is sanitized. Some patches for powerpc, blackfin and unicore still drop in. - We continue to switch the ARM, MIPS, blackfin, m68k local GPIO implementations to use gpiochip_add_data() and cut down on code lines. - MPC8xxx is converted to use the generic GPIO helpers. - ATH79 is converted to use the generic GPIO helpers. New drivers: - WinSystems WS16C48 - Acces 104-DIO-48E - F81866 (a F7188x variant) - Qoric (a MPC8xxx variant) - TS-4800 - SPI serializers (pisosr): simple 74xx shift registers connected to SPI to obtain a dirt-cheap output-only GPIO expander. - Texas Instruments TPIC2810 - Texas Instruments TPS65218 - Texas Instruments TPS65912 - X-Gene (ARM64) standby GPIO controller" * tag 'gpio-v4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (194 commits) Revert "Share upstreaming patches" gpio: mcp23s08: Fix clearing of interrupt. gpiolib: Fix comment referring to gpio_*() in gpiod_*() gpio: pca953x: Fix pca953x_gpio_set_multiple() on 64-bit gpio: xgene: Fix kconfig for standby GIPO contoller gpio: Add generic serializer DT binding gpio: uapi: use 0xB4 as ioctl() major gpio: tps65912: fix bad merge Revert "gpio: lp3943: Drop pin_used and lp3943_gpio_request/lp3943_gpio_free" gpio: omap: drop dev field from gpio_bank structure gpio: mpc8xxx: Slightly update the code for better readability gpio: mpc8xxx: Remove *read_reg and *write_reg from struct mpc8xxx_gpio_chip gpio: mpc8xxx: Fixup setting gpio direction output gpio: mcp23s08: Add support for mcp23s18 dt-bindings: gpio: altera: Fix altr,interrupt-type property gpio: add driver for MEN 16Z127 GPIO controller gpio: lp3943: Drop pin_used and lp3943_gpio_request/lp3943_gpio_free gpio: timberdale: Switch to devm_ioremap_resource() gpio: ts4800: Add IMX51 dependency gpiolib: rewrite gpiodev_add_to_list ...
| * \ Merge branch 'ib-mfd-regulator-gpio-4.6' of ↵Linus Walleij2016-03-0915-71/+65
| |\ \ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into devel
| * | | gpio: Remove unused asm/gpio.h filesBjorn Helgaas2016-02-161-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | asm/gpio.h is included only by linux/gpio.h, and then only when the arch selects ARCH_HAVE_CUSTOM_GPIO_H. Only the following arches select it: arm avr32 blackfin m68k (COLDFIRE only) sh unicore32. Remove the unused asm/gpio.h files for the arches that do not select ARCH_HAVE_CUSTOM_GPIO_H. This is a follow-on to 7563bbf89d06 ("gpiolib/arches: Centralise bolierplate asm/gpio.h"). Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| * | | powerpc: simple_gpio: Be sure to clamp return valueLinus Walleij2016-01-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we want gpio_chip .get() calls to be able to return negative error codes and propagate to drivers, we need to go over all drivers and make sure their return values are clamped to [0,1]. We do this by using the ret = !!(val) design pattern. Cc: Anton Vorontsov <anton@enomsg.org> Cc: Anatolij Gustschin <agust@denx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| * | | powerpc: ppc4cc/gpio: Be sure to clamp return valueLinus Walleij2016-01-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we want gpio_chip .get() calls to be able to return negative error codes and propagate to drivers, we need to go over all drivers and make sure their return values are clamped to [0,1]. We do this by using the ret = !!(val) design pattern. Cc: Anatolij Gustschin <agust@denx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
* | | | Merge branch 'next' of ↵Linus Torvalds2016-03-172-2/+0
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security layer updates from James Morris: "There are a bunch of fixes to the TPM, IMA, and Keys code, with minor fixes scattered across the subsystem. IMA now requires signed policy, and that policy is also now measured and appraised" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (67 commits) X.509: Make algo identifiers text instead of enum akcipher: Move the RSA DER encoding check to the crypto layer crypto: Add hash param to pkcs1pad sign-file: fix build with CMS support disabled MAINTAINERS: update tpmdd urls MODSIGN: linux/string.h should be #included to get memcpy() certs: Fix misaligned data in extra certificate list X.509: Handle midnight alternative notation in GeneralizedTime X.509: Support leap seconds Handle ISO 8601 leap seconds and encodings of midnight in mktime64() X.509: Fix leap year handling again PKCS#7: fix unitialized boolean 'want' firmware: change kernel read fail to dev_dbg() KEYS: Use the symbol value for list size, updated by scripts/insert-sys-cert KEYS: Reserve an extra certificate symbol for inserting without recompiling modsign: hide openssl output in silent builds tpm_tis: fix build warning with tpm_tis_resume ima: require signed IMA policy ima: measure and appraise the IMA policy itself ima: load policy using path ...
| * | | | KEYS: CONFIG_KEYS_DEBUG_PROC_KEYS is no longer an optionDavid Howells2016-02-102-2/+0
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_KEYS_DEBUG_PROC_KEYS is no longer an option as /proc/keys is now mandatory if the keyrings facility is enabled (it's used by libkeyutils in userspace). The defconfig references were removed with: perl -p -i -e 's/CONFIG_KEYS_DEBUG_PROC_KEYS=y\n//' \ `git grep -l CONFIG_KEYS_DEBUG_PROC_KEYS=y` and the integrity Kconfig fixed by hand. Signed-off-by: David Howells <dhowells@redhat.com> cc: Andreas Ziegler <andreas.ziegler@fau.de> cc: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
* | | | Merge branch 'linus' of ↵Linus Torvalds2016-03-171-0/+6
|\ \ \ \ | |_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "Here is the crypto update for 4.6: API: - Convert remaining crypto_hash users to shash or ahash, also convert blkcipher/ablkcipher users to skcipher. - Remove crypto_hash interface. - Remove crypto_pcomp interface. - Add crypto engine for async cipher drivers. - Add akcipher documentation. - Add skcipher documentation. Algorithms: - Rename crypto/crc32 to avoid name clash with lib/crc32. - Fix bug in keywrap where we zero the wrong pointer. Drivers: - Support T5/M5, T7/M7 SPARC CPUs in n2 hwrng driver. - Add PIC32 hwrng driver. - Support BCM6368 in bcm63xx hwrng driver. - Pack structs for 32-bit compat users in qat. - Use crypto engine in omap-aes. - Add support for sama5d2x SoCs in atmel-sha. - Make atmel-sha available again. - Make sahara hashing available again. - Make ccp hashing available again. - Make sha1-mb available again. - Add support for multiple devices in ccp. - Improve DMA performance in caam. - Add hashing support to rockchip" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits) crypto: qat - remove redundant arbiter configuration crypto: ux500 - fix checks of error code returned by devm_ioremap_resource() crypto: atmel - fix checks of error code returned by devm_ioremap_resource() crypto: qat - Change the definition of icp_qat_uof_regtype hwrng: exynos - use __maybe_unused to hide pm functions crypto: ccp - Add abstraction for device-specific calls crypto: ccp - CCP versioning support crypto: ccp - Support for multiple CCPs crypto: ccp - Remove check for x86 family and model crypto: ccp - memset request context to zero during import lib/mpi: use "static inline" instead of "extern inline" lib/mpi: avoid assembler warning hwrng: bcm63xx - fix non device tree compatibility crypto: testmgr - allow rfc3686 aes-ctr variants in fips mode. crypto: qat - The AE id should be less than the maximal AE number lib/mpi: Endianness fix crypto: rockchip - add hash support for crypto engine in rk3288 crypto: xts - fix compile errors crypto: doc - add skcipher API documentation crypto: doc - update AEAD AD handling ...
| * | | crypto: xts - fix compile errorsStephan Mueller2016-02-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 28856a9e52c7 missed the addition of the crypto/xts.h include file for different architecture-specific AES implementations. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| * | | crypto: xts - consolidate sanity check for keysStephan Mueller2016-02-171-0/+5
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch centralizes the XTS key check logic into the service function xts_check_key which is invoked from the different XTS implementations. With this, the XTS implementations in ARM, ARM64, PPC and S390 have now a sanity check for the XTS keys similar to the other arches. In addition, this service function received a check to ensure that the key != the tweak key which is mandated by FIPS 140-2 IG A.9. As the check is not present in the standards defining XTS, it is only enforced in FIPS mode of the kernel. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* | | Merge tag 'pci-v4.6-changes' of ↵Linus Torvalds2016-03-163-7/+0
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI changes for v4.6: Enumeration: - Disable IO/MEM decoding for devices with non-compliant BARs (Bjorn Helgaas) - Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs (Bjorn Helgaas Resource management: - Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED (Bjorn Helgaas) - Don't assign or reassign immutable resources (Bjorn Helgaas) - Don't enable/disable ROM BAR if we're using a RAM shadow copy (Bjorn Helgaas) - Set ROM shadow location in arch code, not in PCI core (Bjorn Helgaas) - Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs (Bjorn Helgaas) - ia64: Use ioremap() instead of open-coded equivalent (Bjorn Helgaas) - ia64: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas) - MIPS: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas) - Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY (Bjorn Helgaas) - Don't leak memory if sysfs_create_bin_file() fails (Bjorn Helgaas) - rcar: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi) - designware: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi) Virtualization: - Wait for up to 1000ms after FLR reset (Alex Williamson) - Support SR-IOV on any function type (Kelly Zytaruk) - Add ACS quirk for all Cavium devices (Manish Jaggi) AER: - Rename pci_ops_aer to aer_inj_pci_ops (Bjorn Helgaas) - Restore pci_ops pointer while calling original pci_ops (David Daney) - Fix aer_inject error codes (Jean Delvare) - Use dev_warn() in aer_inject (Jean Delvare) - Log actual error causes in aer_inject (Jean Delvare) - Log aer_inject error injections (Jean Delvare) VPD: - Prevent VPD access for buggy devices (Babu Moger) - Move pci_read_vpd() and pci_write_vpd() close to other VPD code (Bjorn Helgaas) - Move pci_vpd_release() from header file to pci/access.c (Bjorn Helgaas) - Remove struct pci_vpd_ops.release function pointer (Bjorn Helgaas) - Rename VPD symbols to remove unnecessary "pci22" (Bjorn Helgaas) - Fold struct pci_vpd_pci22 into struct pci_vpd (Bjorn Helgaas) - Sleep rather than busy-wait for VPD access completion (Bjorn Helgaas) - Update VPD definitions (Hannes Reinecke) - Allow access to VPD attributes with size 0 (Hannes Reinecke) - Determine actual VPD size on first access (Hannes Reinecke) Generic host bridge driver: - Move structure definitions to separate header file (David Daney) - Add pci_host_common_probe(), based on gen_pci_probe() (David Daney) - Expose pci_host_common_probe() for use by other drivers (David Daney) Altera host bridge driver: - Fix altera_pcie_link_is_up() (Ley Foon Tan) Cavium ThunderX host bridge driver: - Add PCIe host driver for ThunderX processors (David Daney) - Add driver for ThunderX-pass{1,2} on-chip devices (David Daney) Freescale i.MX6 host bridge driver: - Add DT bindings to configure PHY Tx driver settings (Justin Waters) - Move imx6_pcie_reset_phy() near other PHY handling functions (Lucas Stach) - Move PHY reset into imx6_pcie_establish_link() (Lucas Stach) - Remove broken Gen2 workaround (Lucas Stach) - Move link up check into imx6_pcie_wait_for_link() (Lucas Stach) Freescale Layerscape host bridge driver: - Add "fsl,ls2085a-pcie" compatible ID (Yang Shi) Intel VMD host bridge driver: - Attach VMD resources to parent domain's resource tree (Jon Derrick) - Set bus resource start to 0 (Keith Busch) Microsoft Hyper-V host bridge driver: - Add fwnode_handle to x86 pci_sysdata (Jake Oshins) - Look up IRQ domain by fwnode_handle (Jake Oshins) - Add paravirtual PCI front-end for Microsoft Hyper-V VMs (Jake Oshins) NVIDIA Tegra host bridge driver: - Add pci_ops.{add,remove}_bus() callbacks (Thierry Reding) - Implement ->{add,remove}_bus() callbacks (Thierry Reding) - Remove unused struct tegra_pcie.num_ports field (Thierry Reding) - Track bus -> CPU mapping (Thierry Reding) - Remove misleading PHYS_OFFSET (Thierry Reding) Renesas R-Car host bridge driver: - Depend on ARCH_RENESAS, not ARCH_SHMOBILE (Simon Horman) Synopsys DesignWare host bridge driver: - ARC: Add PCI support (Joao Pinto) - Add generic dw_pcie_wait_for_link() (Joao Pinto) - Add default link up check if sub-driver doesn't override (Joao Pinto) - Add driver for prototyping kits based on ARC SDP (Joao Pinto) TI Keystone host bridge driver: - Defer probing if devm_phy_get() returns -EPROBE_DEFER (Shawn Lin) Xilinx AXI host bridge driver: - Use of_pci_get_host_bridge_resources() to parse DT (Bharat Kumar Gogada) - Remove dependency on ARM-specific struct hw_pci (Bharat Kumar Gogada) - Don't call pci_fixup_irqs() on Microblaze (Bharat Kumar Gogada) - Update Zynq binding with Microblaze node (Bharat Kumar Gogada) - microblaze: Support generic Xilinx AXI PCIe Host Bridge IP driver (Bharat Kumar Gogada) Xilinx NWL host bridge driver: - Add support for Xilinx NWL PCIe Host Controller (Bharat Kumar Gogada) Miscellaneous: - Check device_attach() return value always (Bjorn Helgaas) - Move pci_set_flags() from asm-generic/pci-bridge.h to linux/pci.h (Bjorn Helgaas) - Remove includes of empty asm-generic/pci-bridge.h (Bjorn Helgaas) - ARM64: Remove generated include of asm-generic/pci-bridge.h (Bjorn Helgaas) - Remove empty asm-generic/pci-bridge.h (Bjorn Helgaas) - Remove includes of asm/pci-bridge.h (Bjorn Helgaas) - Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h (Bjorn Helgaas) - unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition (Bjorn Helgaas) - Cleanup pci/pcie/Kconfig whitespace (Andreas Ziegler) - Include pci/hotplug Kconfig directly from pci/Kconfig (Bjorn Helgaas) - Include pci/pcie/Kconfig directly from pci/Kconfig (Bogicevic Sasa) - frv: Remove stray pci_{alloc,free}_consistent() declaration (Christoph Hellwig) - Move pci_dma_* helpers to common code (Christoph Hellwig) - Add PCI_CLASS_SERIAL_USB_DEVICE definition (Heikki Krogerus) - Add QEMU top-level IDs for (sub)vendor & device (Robin H. Johnson) - Fix broken URL for Dell biosdevname (Naga Venkata Sai Indubhaskar Jupudi)" * tag 'pci-v4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (94 commits) PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition PCI: designware: Add driver for prototyping kits based on ARC SDP PCI: designware: Add default link up check if sub-driver doesn't override PCI: designware: Add generic dw_pcie_wait_for_link() PCI: Cleanup pci/pcie/Kconfig whitespace PCI: Simplify pci_create_attr() control flow PCI: Don't leak memory if sysfs_create_bin_file() fails PCI: Simplify sysfs ROM cleanup PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource MIPS: Loongson 3: Use temporary struct resource * to avoid repetition ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource ia64/PCI: Use ioremap() instead of open-coded equivalent ia64/PCI: Use temporary struct resource * to avoid repetition PCI: Clean up pci_map_rom() whitespace PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs PCI: thunder: Add driver for ThunderX-pass{1,2} on-chip devices PCI: thunder: Add PCIe host driver for ThunderX processors PCI: generic: Expose pci_host_common_probe() for use by other drivers PCI: generic: Add pci_host_common_probe(), based on gen_pci_probe() ...
| | \ \
| | \ \
| | \ \
| | \ \
| *---. \ \ Merge branches 'pci/aer', 'pci/enumeration', 'pci/kconfig', 'pci/misc', ↵Bjorn Helgaas2016-03-152-6/+0
| |\ \ \ \ \ | | | |_|/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'pci/virtualization' and 'pci/vpd' into next * pci/aer: PCI/AER: Log aer_inject error injections PCI/AER: Log actual error causes in aer_inject PCI/AER: Use dev_warn() in aer_inject PCI/AER: Fix aer_inject error codes * pci/enumeration: PCI: Fix broken URL for Dell biosdevname * pci/kconfig: PCI: Cleanup pci/pcie/Kconfig whitespace PCI: Include pci/hotplug Kconfig directly from pci/Kconfig PCI: Include pci/pcie/Kconfig directly from pci/Kconfig * pci/misc: PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition PCI: Add QEMU top-level IDs for (sub)vendor & device unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition PCI: Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h PCI: Move pci_dma_* helpers to common code frv/PCI: Remove stray pci_{alloc,free}_consistent() declaration * pci/virtualization: PCI: Wait for up to 1000ms after FLR reset PCI: Support SR-IOV on any function type * pci/vpd: PCI: Prevent VPD access for buggy devices PCI: Sleep rather than busy-wait for VPD access completion PCI: Fold struct pci_vpd_pci22 into struct pci_vpd PCI: Rename VPD symbols to remove unnecessary "pci22" PCI: Remove struct pci_vpd_ops.release function pointer PCI: Move pci_vpd_release() from header file to pci/access.c PCI: Move pci_read_vpd() and pci_write_vpd() close to other VPD code PCI: Determine actual VPD size on first access PCI: Use bitfield instead of bool for struct pci_vpd_pci22.busy PCI: Allow access to VPD attributes with size 0 PCI: Update VPD definitions
| | | | * | PCI: Move pci_dma_* helpers to common codeChristoph Hellwig2016-03-071-2/+0
| | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For a long time all architectures implement the pci_dma_* functions using the generic DMA API, and they all use the same header to do so. Move this header, pci-dma-compat.h, to include/linux and include it from the generic pci.h instead of having each arch duplicate this include. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
| | | * | PCI: Include pci/hotplug Kconfig directly from pci/KconfigBjorn Helgaas2016-03-081-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Include pci/hotplug/Kconfig directly from pci/Kconfig, so arches don't have to source both pci/Kconfig and pci/hotplug/Kconfig. Note that this effectively adds pci/hotplug/Kconfig to the following arches, because they already sourced drivers/pci/Kconfig but they previously did not source drivers/pci/hotplug/Kconfig: alpha arm avr32 frv m68k microblaze mn10300 sparc unicore32 Inspired-by-patch-from: Bogicevic Sasa <brutallesale@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
| | | * | PCI: Include pci/pcie/Kconfig directly from pci/KconfigBogicevic Sasa2016-03-081-2/+0
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Include pci/pcie/Kconfig directly from pci/Kconfig, so arches don't have to source both pci/Kconfig and pci/pcie/Kconfig. Note that this effectively adds pci/pcie/Kconfig to the following arches, because they already sourced drivers/pci/Kconfig but they previously did not source drivers/pci/pcie/Kconfig: alpha avr32 blackfin frv m32r m68k microblaze mn10300 parisc sparc unicore32 xtensa [bhelgaas: changelog, source pci/pcie/Kconfig at top of pci/Kconfig, whitespace] Signed-off-by: Sasa Bogicevic <brutallesale@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
| * | | PCI: Remove includes of empty asm-generic/pci-bridge.hBjorn Helgaas2016-02-051-1/+0
| |/ / | | | | | | | | | | | | | | | | | | include/asm-generic/pci-bridge.h is now empty, so remove every #include of it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Will Deacon <will.deacon@arm.com> (arm64)
* | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2016-03-1621-88/+945
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Paolo Bonzini: "One of the largest releases for KVM... Hardly any generic changes, but lots of architecture-specific updates. ARM: - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - various optimizations to the vgic save/restore code. PPC: - enabled KVM-VFIO integration ("VFIO device") - optimizations to speed up IPIs between vcpus - in-kernel handling of IOMMU hypercalls - support for dynamic DMA windows (DDW). s390: - provide the floating point registers via sync regs; - separated instruction vs. data accesses - dirty log improvements for huge guests - bugfixes and documentation improvements. x86: - Hyper-V VMBus hypercall userspace exit - alternative implementation of lowest-priority interrupts using vector hashing (for better VT-d posted interrupt support) - fixed guest debugging with nested virtualizations - improved interrupt tracking in the in-kernel IOAPIC - generic infrastructure for tracking writes to guest memory - currently its only use is to speedup the legacy shadow paging (pre-EPT) case, but in the future it will be used for virtual GPUs as well - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits) KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch KVM: x86: disable MPX if host did not enable MPX XSAVE features arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit arm64: KVM: vgic-v3: Reset LRs at boot time arm64: KVM: vgic-v3: Do not save an LR known to be empty arm64: KVM: vgic-v3: Save maintenance interrupt state only if required arm64: KVM: vgic-v3: Avoid accessing ICH registers KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit KVM: arm/arm64: vgic-v2: Reset LRs at boot time KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers KVM: s390: allocate only one DMA page per VM KVM: s390: enable STFLE interpretation only if enabled for the guest KVM: s390: wake up when the VCPU cpu timer expires KVM: s390: step the VCPU timer while in enabled wait KVM: s390: protect VCPU cpu timer with a seqcount KVM: s390: step VCPU cpu timer during kvm_run ioctl ...
| * \ \ Merge tag 'kvm-arm-for-4.6' of ↵Paolo Bonzini2016-03-0916-15/+105
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for 4.6 - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - Various optimizations to the vgic save/restore code Conflicts: include/uapi/linux/kvm.h
| * | | | KVM: PPC: Add support for 64bit TCE windowsAlexey Kardashevskiy2016-03-024-5/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing KVM_CREATE_SPAPR_TCE only supports 32bit windows which is not enough for directly mapped windows as the guest can get more than 4GB. This adds KVM_CREATE_SPAPR_TCE_64 ioctl and advertises it via KVM_CAP_SPAPR_TCE_64 capability. The table size is checked against the locked memory limit. Since 64bit windows are to support Dynamic DMA windows (DDW), let's add @bus_offset and @page_shift which are also required by DDW. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Add @offset to kvmppc_spapr_tce_tableAlexey Kardashevskiy2016-03-022-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This enables userspace view of TCE tables to start from non-zero offset on a bus. This will be used for huge DMA windows. This only changes the internal structure, the user interface needs to change in order to use an offset. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Add @page_shift to kvmppc_spapr_tce_tableAlexey Kardashevskiy2016-03-023-23/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the moment the kvmppc_spapr_tce_table struct can only describe 4GB windows and handle fixed size (4K) pages. Dynamic DMA windows support more so these limits need to be extended. This replaces window_size (in bytes, 4GB max) with page_shift (32bit) and size (64bit, in pages). This should cause no behavioural change as this is changing the internal structures only - the user interface still only allows one to create a 32-bit table with 4KiB pages at this stage. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Book3S HV: Add tunable to control H_IPI redirectionSuresh E. Warrier2016-02-293-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Redirecting the wakeup of a VCPU from the H_IPI hypercall to a core running in the host is usually a good idea, most workloads seemed to benefit. However, in one heavily interrupt-driven SMT1 workload, some regression was observed. This patch adds a kvm_hv module parameter called h_ipi_redirect to control this feature. The default value for this tunable is 1 - that is enable the feature. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Book3S HV: Send IPI to host core to wake VCPUSuresh E. Warrier2016-02-293-3/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support to real-mode KVM to search for a core running in the host partition and send it an IPI message with VCPU to be woken. This avoids having to switch to the host partition to complete an H_IPI hypercall when the VCPU which is the target of the the H_IPI is not loaded (is not running in the guest). The patch also includes the support in the IPI handler running in the host to do the wakeup by calling kvmppc_xics_ipi_action for the PPC_MSG_RM_HOST_ACTION message. When a guest is being destroyed, we need to ensure that there are no pending IPIs waiting to wake up a VCPU before we free the VCPUs of the guest. This is accomplished by: - Forces a PPC_MSG_CALL_FUNCTION IPI to be completed by all CPUs before freeing any VCPUs in kvm_arch_destroy_vm(). - Any PPC_MSG_RM_HOST_ACTION messages must be executed first before any other PPC_MSG_CALL_FUNCTION messages. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Book3S HV: Host side kick VCPU when poked by real-mode KVMSuresh Warrier2016-02-293-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the support for the kick VCPU operation for kvmppc_host_rm_ops. The kvmppc_xics_ipi_action() function provides the function to be invoked for a host side operation when poked by the real mode KVM. This is initiated by KVM by sending an IPI to any free host core. KVM real mode must set the rm_action to XICS_RM_KICK_VCPU and rm_data to point to the VCPU to be woken up before sending the IPI. Note that we have allocated one kvmppc_host_rm_core structure per core. The above values need to be set in the structure corresponding to the core to which the IPI will be sent. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Book3S HV: kvmppc_host_rm_ops - handle offlining CPUsSuresh Warrier2016-02-291-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kvmppc_host_rm_ops structure keeps track of which cores are are in the host by maintaining a bitmask of active/runnable online CPUs that have not entered the guest. This patch adds support to manage the bitmask when a CPU is offlined or onlined in the host. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Book3S HV: Manage core host stateSuresh Warrier2016-02-291-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the core host state in kvmppc_host_rm_ops whenever the primary thread of the core enters the guest or returns back. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Book3S HV: Host-side RM data structuresSuresh Warrier2016-02-293-0/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch defines the data structures to support the setting up of host side operations while running in real mode in the guest, and also the functions to allocate and free it. The operations are for now limited to virtual XICS operations. Currently, we have only defined one operation in the data structure: - Wake up a VCPU sleeping in the host when it receives a virtual interrupt The operations are assigned at the core level because PowerKVM requires that the host run in SMT off mode. For each core, we will need to manage its state atomically - where the state is defined by: 1. Is the core running in the host? 2. Is there a Real Mode (RM) operation pending on the host? Currently, core state is only managed at the whole-core level even when the system is in split-core mode. This just limits the number of free or "available" cores in the host to perform any host-side operations. The kvmppc_host_rm_core.rm_data allows any data to be passed by KVM in real mode to the host core along with the operation to be performed. The kvmppc_host_rm_ops structure is allocated the very first time a guest VM is started. Initial core state is also set - all online cores are in the host. This structure is never deleted, not even when there are no active guests. However, it needs to be freed when the module is unloaded because the kvmppc_host_rm_ops_hv can contain function pointers to kvm-hv.ko functions for the different supported host operations. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | powerpc/xics: Add icp_native_cause_ipi_rmSuresh Warrier2016-02-292-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Function to cause an IPI by directly updating the MFFR register in the XICS. The function is meant for real-mode callers since they cannot use the smp_ops->cause_ipi function which uses an ioremapped address. Normal usage is for the the KVM real mode code to set the IPI message using smp_muxed_ipi_message_pass and then invoke icp_native_cause_ipi_rm to cause the actual IPI. The function requires kvm_hstate.xics_phys to have been initialized with the physical address of XICS. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | powerpc/smp: Add smp_muxed_ipi_set_messageSuresh Warrier2016-02-292-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | smp_muxed_ipi_message_pass() invokes smp_ops->cause_ipi, which uses an ioremapped address to access registers on the XICS interrupt controller to cause the IPI. Because of this real mode callers cannot call smp_muxed_ipi_message_pass() for IPI messaging. This patch creates a separate function smp_muxed_ipi_set_message just to set the IPI message without the cause_ipi routine. After calling this function to set the IPI message, real mode callers must cause the IPI by writing to the XICS registers directly. As part of this, we also change smp_muxed_ipi_message_pass to call smp_muxed_ipi_set_message to set the message instead of doing it directly inside the routine. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | powerpc/smp: Support more IPI messagesSuresh Warrier2016-02-292-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch increases the number of demuxed messages for a controller with a single ipi to 8 for 64-bit systems. This is required because we want to use the IPI mechanism to send messages from a CPU running in KVM real mode in a guest to a CPU in the host to take some action. Currently, we only support 4 messages and all 4 are already taken. Define a fifth message PPC_MSG_RM_HOST_ACTION for this purpose. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Add support for multiple-TCE hcallsAlexey Kardashevskiy2016-02-167-9/+281
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds real and virtual mode handlers for the H_PUT_TCE_INDIRECT and H_STUFF_TCE hypercalls for user space emulated devices such as IBMVIO devices or emulated PCI. These calls allow adding multiple entries (up to 512) into the TCE table in one call which saves time on transition between kernel and user space. The current implementation of kvmppc_h_stuff_tce() allows it to be executed in both real and virtual modes so there is one helper. The kvmppc_rm_h_put_tce_indirect() needs to translate the guest address to the host address and since the translation is different, there are 2 helpers - one for each mode. This implements the KVM_CAP_PPC_MULTITCE capability. When present, the kernel will try handling H_PUT_TCE_INDIRECT and H_STUFF_TCE if these are enabled by the userspace via KVM_CAP_PPC_ENABLE_HCALL. If they can not be handled by the kernel, they are passed on to the user space. The user space still has to have an implementation for these. Both HV and PR-syle KVM are supported. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Move reusable bits of H_PUT_TCE handler to helpersAlexey Kardashevskiy2016-02-162-10/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upcoming multi-tce support (H_PUT_TCE_INDIRECT/H_STUFF_TCE hypercalls) will validate TCE (not to have unexpected bits) and IO address (to be within the DMA window boundaries). This introduces helpers to validate TCE and IO address. The helpers are exported as they compile into vmlinux (to work in realmode) and will be used later by KVM kernel module in virtual mode. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Replace SPAPR_TCE_SHIFT with IOMMU_PAGE_SHIFT_4KAlexey Kardashevskiy2016-02-163-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SPAPR_TCE_SHIFT is used in few places only and since IOMMU_PAGE_SHIFT_4K can be easily used instead, remove SPAPR_TCE_SHIFT. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Account TCE-containing pages in locked_vmAlexey Kardashevskiy2016-02-161-5/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the moment pages used for TCE tables (in addition to pages addressed by TCEs) are not counted in locked_vm counter so a malicious userspace tool can call ioctl(KVM_CREATE_SPAPR_TCE) as many times as RLIMIT_NOFILE and lock a lot of memory. This adds counting for pages used for TCE tables. This counts the number of pages required for a table plus pages for the kvmppc_spapr_tce_table struct (TCE table descriptor) itself. This changes release_spapr_tce_table() to store @npages on stack to avoid calling kvmppc_stt_npages() in the loop (tiny optimization, probably). This does not change the amount of used memory. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Use RCU for arch.spapr_tce_tablesAlexey Kardashevskiy2016-02-164-11/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the moment only spapr_tce_tables updates are protected against races but not lookups. This fixes missing protection by using RCU for the list. As lookups also happen in real mode, this uses list_for_each_entry_lockless() (which is expected not to access any vmalloc'd memory). This converts release_spapr_tce_table() to a RCU scheduled handler. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | KVM: PPC: Rework H_PUT_TCE/H_GET_TCE handlersAlexey Kardashevskiy2016-02-161-42/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reworks the existing H_PUT_TCE/H_GET_TCE handlers to have following patches applied nicer. This moves the ioba boundaries check to a helper and adds a check for least bits which have to be zeros. The patch is pretty mechanical (only check for least ioba bits is added) so no change in behaviour is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | powerpc: Make vmalloc_to_phys() publicAlexey Kardashevskiy2016-02-163-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes vmalloc_to_phys() public as there will be another user (KVM in-kernel VFIO acceleration) for it soon. As this new user can be compiled as a module, this exports the symbol. As a little optimization, this changes the helper to call vmalloc_to_pfn() instead of vmalloc_to_page() as the size of the struct page may not be power-of-two aligned which will make gcc use multiply instructions instead of shifts. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | vfio: Enable VFIO device for powerpcDavid Gibson2016-02-151-1/+1
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ec53500f "kvm: Add VFIO device" added a special KVM pseudo-device which is used to handle any necessary interactions between KVM and VFIO. Currently that device is built on x86 and ARM, but not powerpc, although powerpc does support both KVM and VFIO. This makes things awkward in userspace Currently qemu prints an alarming error message if you attempt to use VFIO and it can't initialize the KVM VFIO device. We don't want to remove the warning, because lack of the KVM VFIO device could mean coherency problems on x86. On powerpc, however, the error is harmless but looks disturbing, and a test based on host architecture in qemu would be ugly, and break if we do need the KVM VFIO device for something important in future. There's nothing preventing the KVM VFIO device from being built for powerpc, so this patch turns it on. It won't actually do anything, since we don't define any of the arch_*() hooks, but it will make qemu happy and we can extend it in future if we need to. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | | | Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds2016-03-151-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull cpu hotplug updates from Thomas Gleixner: "This is the first part of the ongoing cpu hotplug rework: - Initial implementation of the state machine - Runs all online and prepare down callbacks on the plugged cpu and not on some random processor - Replaces busy loop waiting with completions - Adds tracepoints so the states can be followed" More detailed commentary on this work from an earlier email: "What's wrong with the current cpu hotplug infrastructure? - Asymmetry The hotplug notifier mechanism is asymmetric versus the bringup and teardown. This is mostly caused by the notifier mechanism. - Largely undocumented dependencies While some notifiers use explicitely defined notifier priorities, we have quite some notifiers which use numerical priorities to express dependencies without any documentation why. - Control processor driven Most of the bringup/teardown of a cpu is driven by a control processor. While it is understandable, that preperatory steps, like idle thread creation, memory allocation for and initialization of essential facilities needs to be done before a cpu can boot, there is no reason why everything else must run on a control processor. Before this patch series, bringup looks like this: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring the rest up - All or nothing approach There is no way to do partial bringups. That's something which is really desired because we waste e.g. at boot substantial amount of time just busy waiting that the cpu comes to life. That's stupid as we could very well do preparatory steps and the initial IPI for other cpus and then go back and do the necessary low level synchronization with the freshly booted cpu. - Minimal debuggability Due to the notifier based design, it's impossible to switch between two stages of the bringup/teardown back and forth in order to test the correctness. So in many hotplug notifiers the cancel mechanisms are either not existant or completely untested. - Notifier [un]registering is tedious To [un]register notifiers we need to protect against hotplug at every callsite. There is no mechanism that bringup/teardown callbacks are issued on the online cpus, so every caller needs to do it itself. That also includes error rollback. What's the new design? The base of the new design is a symmetric state machine, where both the control processor and the booting/dying cpu execute a well defined set of states. Each state is symmetric in the end, except for some well defined exceptions, and the bringup/teardown can be stopped and reversed at almost all states. So the bringup of a cpu will look like this in the future: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring itself up The synchronization step does not require the control cpu to wait. That mechanism can be done asynchronously via a worker or some other mechanism. The teardown can be made very similar, so that the dying cpu cleans up and brings itself down. Cleanups which need to be done after the cpu is gone, can be scheduled asynchronously as well. There is a long way to this, as we need to refactor the notion when a cpu is available. Today we set the cpu online right after it comes out of the low level bringup, which is not really correct. The proper mechanism is to set it to available, i.e. cpu local threads, like softirqd, hotplug thread etc. can be scheduled on that cpu, and once it finished all booting steps, it's set to online, so general workloads can be scheduled on it. The reverse happens on teardown. First thing to do is to forbid scheduling of general workloads, then teardown all the per cpu resources and finally shut it off completely. This patch series implements the basic infrastructure for this at the core level. This includes the following: - Basic state machine implementation with well defined states, so ordering and prioritization can be expressed. - Interfaces to [un]register state callbacks This invokes the bringup/teardown callback on all online cpus with the proper protection in place and [un]installs the callbacks in the state machine array. For callbacks which have no particular ordering requirement we have a dynamic state space, so that drivers don't have to register an explicit hotplug state. If a callback fails, the code automatically does a rollback to the previous state. - Sysfs interface to drive the state machine to a particular step. This is only partially functional today. Full functionality and therefor testability will be achieved once we converted all existing hotplug notifiers over to the new scheme. - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying processor: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu wait for boot bring itself up Signal completion to control cpu In a previous step of this work we've done a full tree mechanical conversion of all hotplug notifiers to the new scheme. The balance is a net removal of about 4000 lines of code. This is not included in this series, as we decided to take a different approach. Instead of mechanically converting everything over, we will do a proper overhaul of the usage sites one by one so they nicely fit into the symmetric callback scheme. I decided to do that after I looked at the ugliness of some of the converted sites and figured out that their hotplug mechanism is completely buggered anyway. So there is no point to do a mechanical conversion first as we need to go through the usage sites one by one again in order to achieve a full symmetric and testable behaviour" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) cpu/hotplug: Document states better cpu/hotplug: Fix smpboot thread ordering cpu/hotplug: Remove redundant state check cpu/hotplug: Plug death reporting race rcu: Make CPU_DYING_IDLE an explicit call cpu/hotplug: Make wait for dead cpu completion based cpu/hotplug: Let upcoming cpu bring itself fully up arch/hotplug: Call into idle with a proper state cpu/hotplug: Move online calls to hotplugged cpu cpu/hotplug: Create hotplug threads cpu/hotplug: Split out the state walk into functions cpu/hotplug: Unpark smpboot threads from the state machine cpu/hotplug: Move scheduler cpu_online notifier to hotplug core cpu/hotplug: Implement setup/removal interface cpu/hotplug: Make target state writeable cpu/hotplug: Add sysfs state interface cpu/hotplug: Hand in target state to _cpu_up/down cpu/hotplug: Convert the hotplugged cpu work to a state machine cpu/hotplug: Convert to a state machine for the control processor cpu/hotplug: Add tracepoints ...
| * | | | arch/hotplug: Call into idle with a proper stateThomas Gleixner2016-03-011-1/+1
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let the non boot cpus call into idle with the corresponding hotplug state, so the hotplug core can handle the further bringup. That's a first step to convert the boot side of the hotplugged cpus to do all the synchronization with the other side through the state machine. For now it'll only start the hotplug thread and kick the full bringup of the cpu. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Rik van Riel <riel@redhat.com> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com> Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: http://lkml.kernel.org/r/20160226182341.614102639@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2016-03-142-14/+13
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle are: - Make schedstats a runtime tunable (disabled by default) and optimize it via static keys. As most distributions enable CONFIG_SCHEDSTATS=y due to its instrumentation value, this is a nice performance enhancement. (Mel Gorman) - Implement 'simple waitqueues' (swait): these are just pure waitqueues without any of the more complex features of full-blown waitqueues (callbacks, wake flags, wake keys, etc.). Simple waitqueues have less memory overhead and are faster. Use simple waitqueues in the RCU code (in 4 different places) and for handling KVM vCPU wakeups. (Peter Zijlstra, Daniel Wagner, Thomas Gleixner, Paul Gortmaker, Marcelo Tosatti) - sched/numa enhancements (Rik van Riel) - NOHZ performance enhancements (Rik van Riel) - Various sched/deadline enhancements (Steven Rostedt) - Various fixes (Peter Zijlstra) - ... and a number of other fixes, cleanups and smaller enhancements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits) sched/cputime: Fix steal_account_process_tick() to always return jiffies sched/deadline: Remove dl_new from struct sched_dl_entity Revert "kbuild: Add option to turn incompatible pointer check into error" sched/deadline: Remove superfluous call to switched_to_dl() sched/debug: Fix preempt_disable_ip recording for preempt_disable() sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity time, acct: Drop irq save & restore from __acct_update_integrals() acct, time: Change indentation in __acct_update_integrals() sched, time: Remove non-power-of-two divides from __acct_update_integrals() sched/rt: Kick RT bandwidth timer immediately on start up sched/debug: Add deadline scheduler bandwidth ratio to /proc/sched_debug sched/debug: Move sched_domain_sysctl to debug.c sched/debug: Move the /sys/kernel/debug/sched_features file setup into debug.c sched/rt: Fix PI handling vs. sched_setscheduler() sched/core: Remove duplicated sched_group_set_shares() prototype sched/fair: Consolidate nohz CPU load update code sched/fair: Avoid using decay_load_missed() with a negative value sched/deadline: Always calculate end of period on sched_yield() sched/cgroup: Fix cgroup entity load tracking tear-down rcu: Use simple wait queues where possible in rcutree ...
| * \ \ \ Merge branch 'sched/urgent' into sched/core, to pick up fixes before ↵Ingo Molnar2016-02-2916-15/+105
| |\ \ \ \ | | |/ / / | | | | | | | | | | | | | | | | | | | | applying new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | KVM: Use simple waitqueue for vcpu->wqMarcelo Tosatti2016-02-252-14/+13
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem: On -rt, an emulated LAPIC timer instances has the following path: 1) hard interrupt 2) ksoftirqd is scheduled 3) ksoftirqd wakes up vcpu thread 4) vcpu thread is scheduled This extra context switch introduces unnecessary latency in the LAPIC path for a KVM guest. The solution: Allow waking up vcpu thread from hardirq context, thus avoiding the need for ksoftirqd to be scheduled. Normal waitqueues make use of spinlocks, which on -RT are sleepable locks. Therefore, waking up a waitqueue waiter involves locking a sleeping lock, which is not allowed from hard interrupt context. cyclictest command line: This patch reduces the average latency in my tests from 14us to 11us. Daniel writes: Paolo asked for numbers from kvm-unit-tests/tscdeadline_latency benchmark on mainline. The test was run 1000 times on tip/sched/core 4.4.0-rc8-01134-g0905f04: ./x86-run x86/tscdeadline_latency.flat -cpu host with idle=poll. The test seems not to deliver really stable numbers though most of them are smaller. Paolo write: "Anything above ~10000 cycles means that the host went to C1 or lower---the number means more or less nothing in that case. The mean shows an improvement indeed." Before: min max mean std count 1000.000000 1000.000000 1000.000000 1000.000000 mean 5162.596000 2019270.084000 5824.491541 20681.645558 std 75.431231 622607.723969 89.575700 6492.272062 min 4466.000000 23928.000000 5537.926500 585.864966 25% 5163.000000 1613252.750000 5790.132275 16683.745433 50% 5175.000000 2281919.000000 5834.654000 23151.990026 75% 5190.000000 2382865.750000 5861.412950 24148.206168 max 5228.000000 4175158.000000 6254.827300 46481.048691 After min max mean std count 1000.000000 1000.00000 1000.000000 1000.000000 mean 5143.511000 2076886.10300 5813.312474 21207.357565 std 77.668322 610413.09583 86.541500 6331.915127 min 4427.000000 25103.00000 5529.756600 559.187707 25% 5148.000000 1691272.75000 5784.889825 17473.518244 50% 5160.000000 2308328.50000 5832.025000 23464.837068 75% 5172.000000 2393037.75000 5853.177675 24223.969976 max 5222.000000 3922458.00000 6186.720500 42520.379830 [Patch was originaly based on the swait implementation found in the -rt tree. Daniel ported it to mainline's version and gathered the benchmark numbers for tscdeadline_latency test.] Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-rt-users@vger.kernel.org Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1455871601-27484-4-git-send-email-wagi@monom.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | | Merge branch 'locking-core-for-linus' of ↵Linus Torvalds2016-03-142-5/+0
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking changes from Ingo Molnar: "Various updates: - Futex scalability improvements: remove page lock use for shared futex get_futex_key(), which speeds up 'perf bench futex hash' benchmarks by over 40% on a 60-core Westmere. This makes anon-mem shared futexes perform close to private futexes. (Mel Gorman) - lockdep hash collision detection and fix (Alfredo Alvarez Fernandez) - lockdep testing enhancements (Alfredo Alvarez Fernandez) - robustify lockdep init by using hlists (Andrew Morton, Andrey Ryabinin) - mutex and csd_lock micro-optimizations (Davidlohr Bueso) - small x86 barriers tweaks (Michael S Tsirkin) - qspinlock updates (Waiman Long)" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) locking/csd_lock: Use smp_cond_acquire() in csd_lock_wait() locking/csd_lock: Explicitly inline csd_lock*() helpers futex: Replace barrier() in unqueue_me() with READ_ONCE() locking/lockdep: Detect chain_key collisions locking/lockdep: Prevent chain_key collisions tools/lib/lockdep: Fix link creation warning tools/lib/lockdep: Add tests for AA and ABBA locking tools/lib/lockdep: Add userspace version of READ_ONCE() tools/lib/lockdep: Fix the build on recent kernels locking/qspinlock: Move __ARCH_SPIN_LOCK_UNLOCKED to qspinlock_types.h locking/mutex: Allow next waiter lockless wakeup locking/pvqspinlock: Enable slowpath locking count tracking locking/qspinlock: Use smp_cond_acquire() in pending code locking/pvqspinlock: Move lock stealing count tracking code into pv_queued_spin_steal_lock() locking/mcs: Fix mcs_spin_lock() ordering futex: Remove requirement for lock_page() in get_futex_key() futex: Rename barrier references in ordering guarantees locking/atomics: Update comment about READ_ONCE() and structures locking/lockdep: Eliminate lockdep_init() locking/lockdep: Convert hash tables to hlists ...
| * \ \ \ Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar2016-03-102-1/+15
| |\ \ \ \ | | | | | | | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
OpenPOWER on IntegriCloud