summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-linus' of ↵Linus Torvalds2016-01-114-120/+146
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs compat_ioctl fixes from Al Viro: "This is basically Jann's patches from last week. I have _not_ included the stuff like switching i2c to ->compat_ioctl() into this one - those need more testing. Ideally I would like fs/compat_ioctl.c shrunk a lot, but that's a separate story" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: compat_ioctl: don't call do_ioctl under set_fs(KERNEL_DS) compat_ioctl: don't pass fd around when not needed compat_ioctl: don't look up the fd twice
| * compat_ioctl: don't call do_ioctl under set_fs(KERNEL_DS)Jann Horn2016-01-081-62/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This replaces all code in fs/compat_ioctl.c that translated ioctl arguments into a in-kernel structure, then performed do_ioctl under set_fs(KERNEL_DS), with code that allocates data on the user stack and can call the VFS ioctl handler under USER_DS. This is done as a hardening measure because the caller does not know what kind of ioctl handler will be invoked, only that no corresponding compat_ioctl handler exists and what the ioctl command number is. The accidental invocation of an unlocked_ioctl handler that unexpectedly calls copy_to_user could be a severe security issue. Signed-off-by: Jann Horn <jann@thejh.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * compat_ioctl: don't pass fd around when not neededAl Viro2016-01-084-55/+61
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * compat_ioctl: don't look up the fd twiceJann Horn2016-01-081-54/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In code in fs/compat_ioctl.c that translates ioctl arguments into a in-kernel structure, then performs sys_ioctl, possibly under set_fs(KERNEL_DS), this commit changes the sys_ioctl calls to do_ioctl calls. do_ioctl is a new function that does the same thing as sys_ioctl, but doesn't look up the fd again. This change is made to avoid (potential) security issues because of ioctl handlers that accept one of the ioctl commands I2C_FUNCS, VIDEO_GET_EVENT, MTIOCPOS, MTIOCGET, TIOCGSERIAL, TIOCSSERIAL, RTC_IRQP_READ, RTC_EPOCH_READ. This can happen for multiple reasons: - The ioctl command number could be reused. - The ioctl handler might not check the full ioctl command. This is e.g. true for drm_ioctl. - The ioctl handler is very special, e.g. cuse_file_ioctl The real issue is that set_fs(KERNEL_DS) is used here, but that's fixed in a separate commit "compat_ioctl: don't call do_ioctl under set_fs(KERNEL_DS)". This change mitigates potential security issues by preventing a race that permits invocation of unlocked_ioctl handlers under KERNEL_DS through compat code even if a corresponding compat_ioctl handler exists. So far, no way has been identified to use this to damage kernel memory without having CAP_SYS_ADMIN in the init ns (with the capability, doing reads/writes at arbitrary kernel addresses should be easy through CUSE's ioctl handler with FUSE_IOCTL_UNRESTRICTED set). [AV: two missed sys_ioctl() taken care of] Signed-off-by: Jann Horn <jann@thejh.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Linux 4.4v4.4Linus Torvalds2016-01-101-1/+1
| |
* | Merge tag 'scsi-fixes' of ↵Linus Torvalds2016-01-091-3/+6
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "A single fix for machines with pages > 4k (PPC mostly). There's a bug in our optimal transfer size code where we don't account for pages > 4k and can set the transfer size to be less than the page size causing nasty failures" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: sd: Reject optimal transfer length smaller than page size
| * \ Merge remote-tracking branch 'mkp-scsi/4.4/scsi-fixes' into fixesJames Bottomley2015-12-281-3/+6
| |\ \
| | * | sd: Reject optimal transfer length smaller than page sizeMartin K. Petersen2015-12-211-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eryu Guan reported that loading scsi_debug would fail. This turned out to be caused by scsi_debug reporting an optimal I/O size of 32KB which is smaller than the 64KB page size on the PowerPC system in question. Add a check to ensure that we only use the device-reported OPTIMAL TRANSFER LENGTH if it is bigger than or equal to the page cache size. Reported-by: Eryu Guan <guaneryu@gmail.com> Reported-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | | Merge tag 'pci-v4.4-fixes-4' of ↵Linus Torvalds2016-01-091-0/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixlet from Bjorn Helgaas: "This marks the TI DRA7xx host bridge driver as broken. Apparently it has never worked without some additional out-of-tree code, so I'm going to mark it broken now and remove it completely next cycle unless it's fixed" * tag 'pci-v4.4-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: dra7xx: Mark driver as broken
| * | | | PCI: dra7xx: Mark driver as brokenRichard Cochran2016-01-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark the dra7xx PCI host driver as broken. This driver was first merged in v3.17 and has never worked. Although the driver compiles just fine, it is missing an essential device reset. If the driver is included, the kernel locks up hard shortly after booting, before any console output appears. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* | | | | vmstat: allocate vmstat_wq before it is usedMichal Hocko2016-01-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernel test robot has reported the following crash: BUG: unable to handle kernel NULL pointer dereference at 00000100 IP: [<c1074df6>] __queue_work+0x26/0x390 *pdpt = 0000000000000000 *pde = f000ff53f000ff53 *pde = f000ff53f000ff53 Oops: 0000 [#1] PREEMPT PREEMPT SMP SMP CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.4.0-rc4-00139-g373ccbe #1 Workqueue: events vmstat_shepherd task: cb684600 ti: cb7ba000 task.ti: cb7ba000 EIP: 0060:[<c1074df6>] EFLAGS: 00010046 CPU: 0 EIP is at __queue_work+0x26/0x390 EAX: 00000046 EBX: cbb37800 ECX: cbb37800 EDX: 00000000 ESI: 00000000 EDI: 00000000 EBP: cb7bbe68 ESP: cb7bbe38 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 00000100 CR3: 01fd5000 CR4: 000006b0 Stack: Call Trace: __queue_delayed_work+0xa1/0x160 queue_delayed_work_on+0x36/0x60 vmstat_shepherd+0xad/0xf0 process_one_work+0x1aa/0x4c0 worker_thread+0x41/0x440 kthread+0xb0/0xd0 ret_from_kernel_thread+0x21/0x40 The reason is that start_shepherd_timer schedules the shepherd work item which uses vmstat_wq (vmstat_shepherd) before setup_vmstat allocates that workqueue so if the further initialization takes more than HZ we might end up scheduling on a NULL vmstat_wq. This is really unlikely but not impossible. Fixes: 373ccbe59270 ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress") Reported-by: kernel test robot <ying.huang@linux.intel.com> Signed-off-by: Michal Hocko <mhocko@suse.com> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: stable@vger.kernel.org Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | Merge tag 'fixes-for-linus' of ↵Linus Torvalds2016-01-087-12/+49
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "This is the final small set of ARM SoC bug fixes for linux-4.4, almost all regressions: OMAP: - data corruption on the Nokia N900 flash Allwinner: - Two defconfig change to get USB working again ARM Versatile: - Interrupt numbers gone bad after an older bug fix Nomadik: - Crashes from incorrect L2 cache settings VIA vt8500: - SD/MMC support on WM8650 never worked" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: dts: vt8500: Add SDHC node to DTS file for WM8650 ARM: Fix broken USB support in multi_v7_defconfig for sunxi devices ARM: versatile: fix MMC/SD interrupt assignment ARM: nomadik: set latencies to 8 cycles ARM: OMAP2+: Fix onenand rate detection to avoid filesystem corruption ARM: Fix broken USB support in sunxi_defconfig
| * \ \ \ \ Merge tag 'omap-for-v4.4/onenand-corruption' of ↵Arnd Bergmann2016-01-081-5/+9
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Pull "urgent onenand file system corruption fix for n900" from Tony Lindgren: Last minute urgent pull request to prevent file system corruption on Nokia N900. Looks like we have a GPMC bus timing bug that has gone unnoticed because of bootloader configured registers until few days ago. We are not detecting the onenand clock rate properly unless we have CONFIG_OMAP_GPMC_DEBUG set and this causes onenand corruption that can be easily be reproduced. There seems to be also an additional bug still lurking around for onenand corruption. But that is still being investigated and it does not seem to be GPMC timings related. Meanwhile, it would be good to get this fix into v4.4 to prevent wrong timings from corrupting onenand. * tag 'omap-for-v4.4/onenand-corruption' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP2+: Fix onenand rate detection to avoid filesystem corruption
| | * | | | | ARM: OMAP2+: Fix onenand rate detection to avoid filesystem corruptionTony Lindgren2016-01-061-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 63aa945b1013 ("memory: omap-gpmc: Add Kconfig option for debug") unified the GPMC debug for the SoCs with GPMC. The commit also left out the option for HWMOD_INIT_NO_RESET as we now require proper timings for GPMC to be able to remap GPMC devices out of address 0. Unfortunately on Nokia N900, onenand now only partially works with the device tree provided timings. It works enough to get detected but the clock rate supported by the onenand chip gets misdetected. This in turn causes the GPMC timings to be miscalculated and this leads into file system corruption on N900. Looks like onenand needs CS_CONFIG1 bit 27 WRITETYPE set for for sync write. This is needed also for async timings when we write to onenand with omap2_onenand_set_async_mode(). Without sync write bit set, the async read for the onenand ONENAND_REG_VERSION_ID will return 0xfff. Let's exit with an error if onenand rate is not detected. And let's remove the extra call to omap2_onenand_set_async_mode() as we only need to do this once at the end of omap2_onenand_setup_async(). Fixes: 63aa945b1013 ("memory: omap-gpmc: Add Kconfig option for debug") Cc: stable@vger.kernel.org # v4.2+ Reported-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Tested-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Tony Lindgren <tony@atomide.com>
| * | | | | | dts: vt8500: Add SDHC node to DTS file for WM8650Roman Volkov2016-01-071-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since WM8650 has the same 'WMT' SDHC controller as WM8505, and the driver is already in the kernel, this node enables the controller support for WM8650 Signed-off-by: Roman Volkov <rvolkov@v1ros.org> Reviewed-by: Alexey Charkov <alchark@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| * | | | | | ARM: Fix broken USB support in multi_v7_defconfig for sunxi devicesTimo Sigurdsson2016-01-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 69fb4dcada77 ("power: Add an axp20x-usb-power driver") introduced a new driver for the USB power supply used on various Allwinner based SBCs. However, the driver was not added to multi_v7_defconfig which breaks USB support for some boards (e.g. LeMaker BananaPi) as the kernel will now turn off the USB power supply during boot by default if the driver isn't present. (This was not the case in linux 4.3 or lower where the USB power was always left on.) Hence, add the driver to multi_v7_defconfig in order to keep USB support working on those boards that require it. Signed-off-by: Timo Sigurdsson <public_timo.s@silentcreek.de> Tested-by: Timo Sigurdsson <public_timo.s@silentcreek.de> Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| * | | | | | ARM: versatile: fix MMC/SD interrupt assignmentLinus Walleij2016-01-072-4/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0976c946a610d06e907335b7a3afa6db046f8e1b "arm/versatile: Fix versatile irq specifications" has an off-by-one error on the Versatile AB that has been regressing the Versatile AB hardware for some time. However it seems like the interrupt assignments have never been correct and I have now adjusted them according to the specification. The masks for the valid interrupts made it impossible to assign the right SIC interrupt for the MMCI, so I went in and fixed these to correspond to the specifications, and added references if anyone wants to double-check. Due to the Versatile PB including the Versatile AB as a base DTS file, we need to override and correct some values to correspond to the actual changes in the hardware. For the Versatile PB I don't think the IRQ line assignment for MMCI has ever been correct for either of the two MMCI blocks. It would be nice if someone with the physical PB board could test this. Patch tested on the Versatile AB, QEMU for Versatile AB and QEMU for Versatile PB. Cc: Rob Herring <robh@kernel.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: stable@vger.kernel.org Fixes: 0976c946a610 ("arm/versatile: Fix versatile irq specifications") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
| * | | | | | ARM: nomadik: set latencies to 8 cyclesLinus Walleij2016-01-071-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Nomadik has sporadic crashes because of these latencies, setting them to max makes the platform work nicely, so use this values for now. These latencies were set to 2 since the Nomadik platform was merged, but I suspect they never took effect until the right size and associativity for the cache was specified in the device tree and that is why the crash comes now. Cc: stable@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
| * | | | | | ARM: Fix broken USB support in sunxi_defconfigTimo Sigurdsson2015-12-311-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 69fb4dcada77 ("power: Add an axp20x-usb-power driver") introduced a new driver for the USB power supply used on various Allwinner based SBCs. However, the driver was not added to sunxi_defconfig which breaks USB support for some boards (e.g. LeMaker BananaPi) as the kernel will now turn off the USB power supply during boot by default if the driver isn't present. (This was not the case in linux 4.3 or lower where the USB power was always left on.) Hence, add the driver to sunxi_defconfig in order to keep USB support working on those boards that require it. Signed-off-by: Timo Sigurdsson <public_timo.s@silentcreek.de> Reported-by: David Tulloh <david@tulloh.id.au> Tested-by: David Tulloh <david@tulloh.id.au> Tested-by: Timo Sigurdsson <public_timo.s@silentcreek.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
* | | | | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2016-01-082-1/+3
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM fix from Paolo Bonzini: "A simple fix. I'm sending it before the merge window, because it refines a patch found in your master branch but not yet in the kvm/next branch that is destined for 4.5" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: x86: only channel 0 of the i8254 is linked to the HPET
| * | | | | | | kvm: x86: only channel 0 of the i8254 is linked to the HPETPaolo Bonzini2016-01-072-1/+3
| | |_|_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While setting the KVM PIT counters in 'kvm_pit_load_count', if 'hpet_legacy_start' is set, the function disables the timer on channel[0], instead of the respective index 'channel'. This is because channels 1-3 are not linked to the HPET. Fix the caller to only activate the special HPET processing for channel 0. Reported-by: P J P <pjp@fedoraproject.org> Fixes: 0185604c2d82c560dab2f2933a18f797e74ab5a8 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | | | | | Merge tag 'pm+acpi-4.4-final' of ↵Linus Torvalds2016-01-081-1/+1
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Just one obvious fix that adds a missing function argument in ACPI code introduced recently (Kees Cook)" * tag 'pm+acpi-4.4-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / property: avoid leaking format string into kobject name
| * | | | | | | ACPI / property: avoid leaking format string into kobject nameKees Cook2016-01-081-1/+1
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The dn->name is expected to be used as a literal, so add the missing "%s". Fixes: 263b4c1a64bc (ACPI / property: Expose data-only subnodes via sysfs) Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | | | | | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2016-01-0813-32/+96
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "A handful of x86 fixes: - a syscall ABI fix, fixing an Android breakage - a Xen PV guest fix relating to the RTC device, causing a non-working console - a Xen guest syscall stack frame fix - an MCE hotplug CPU crash fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/numachip: Fix NumaConnect2 MMCFG PCI access x86/entry: Restore traditional SYSENTER calling convention x86/entry: Fix some comments x86/paravirt: Prevent rtc_cmos platform device init on PV guests x86/xen: Avoid fast syscall path for Xen PV guests x86/mce: Ensure offline CPUs don't participate in rendezvous process
| * | | | | | | x86/numachip: Fix NumaConnect2 MMCFG PCI accessDaniel J Blueman2015-12-301-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The MMCFG PCI accessors weren't being setup for NumacConnect2 correctly due to over-early assignment; this would create the potential for the wrong PCI domain to be accessed. Fix this by using the correct arch-specific PCI init function. Signed-off-by: Daniel J Blueman <daniel@numascale.com> Acked-by: Steffen Persvold <sp@numascale.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1451498807-15920-1-git-send-email-daniel@numascale.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86/entry: Restore traditional SYSENTER calling conventionAndy Lutomirski2015-12-214-19/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It turns out that some Android versions hardcode the SYSENTER calling convention. This is buggy and will cause problems no matter what the kernel does. Nonetheless, we should try to support it. Credit goes to Linus for pointing out a clean way to handle the SYSENTER/SYSCALL clobber differences while preserving straightforward DWARF annotations. I believe that the original offending Android commit was: https://android.googlesource.com/platform%2Fbionic/+/7dc3684d7a2587e43e6d2a8e0e3f39bf759bd535 Reported-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-and-tested-by: Borislav Petkov <bp@alien8.de> Cc: <mark.gross@intel.com> Cc: Su Tao <tao.su@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: <frank.wang@intel.com> Cc: <borun.fu@intel.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Mingwei Shi <mingwei.shi@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86/entry: Fix some commentsAndy Lutomirski2015-12-212-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-and-tested-by: Borislav Petkov <bp@alien8.de> Cc: <mark.gross@intel.com> Cc: Su Tao <tao.su@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: <qiuxu.zhuo@intel.com> Cc: <frank.wang@intel.com> Cc: <borun.fu@intel.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Mingwei Shi <mingwei.shi@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86/paravirt: Prevent rtc_cmos platform device init on PV guestsDavid Vrabel2015-12-196-1/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding the rtc platform device in non-privileged Xen PV guests causes an IRQ conflict because these guests do not have legacy PIC and may allocate irqs in the legacy range. In a single VCPU Xen PV guest we should have: /proc/interrupts: CPU0 0: 4934 xen-percpu-virq timer0 1: 0 xen-percpu-ipi spinlock0 2: 0 xen-percpu-ipi resched0 3: 0 xen-percpu-ipi callfunc0 4: 0 xen-percpu-virq debug0 5: 0 xen-percpu-ipi callfuncsingle0 6: 0 xen-percpu-ipi irqwork0 7: 321 xen-dyn-event xenbus 8: 90 xen-dyn-event hvc_console ... But hvc_console cannot get its interrupt because it is already in use by rtc0 and the console does not work. genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0) We can avoid this problem by realizing that unprivileged PV guests (both Xen and lguests) are not supposed to have rtc_cmos device and so adding it is not necessary. Privileged guests (i.e. Xen's dom0) do use it but they should not have irq conflicts since they allocate irqs above legacy range (above gsi_top, in fact). Instead of explicitly testing whether the guest is privileged we can extend pv_info structure to include information about guest's RTC support. Reported-and-tested-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: vkuznets@redhat.com Cc: xen-devel@lists.xenproject.org Cc: konrad.wilk@oracle.com Cc: stable@vger.kernel.org # 4.2+ Link: http://lkml.kernel.org/r/1449842873-2613-1-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86/xen: Avoid fast syscall path for Xen PV guestsBoris Ostrovsky2015-12-194-7/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After 32-bit syscall rewrite, and specifically after commit: 5f310f739b4c ("x86/entry/32: Re-implement SYSENTER using the new C path") ... the stack frame that is passed to xen_sysexit is no longer a "standard" one (i.e. it's not pt_regs). Since we end up calling xen_iret from xen_sysexit we don't need to fix up the stack and instead follow entry_SYSENTER_32's IRET path directly to xen_iret. We can do the same thing for compat mode even though stack does not need to be fixed. This will allow us to drop usergs_sysret32 paravirt op (in the subsequent patch) Suggested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: david.vrabel@citrix.com Cc: konrad.wilk@oracle.com Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1447970147-1733-2-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86/mce: Ensure offline CPUs don't participate in rendezvous processAshok Raj2015-12-191-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intel's MCA implementation broadcasts MCEs to all CPUs on the node. This poses a problem for offlined CPUs which cannot participate in the rendezvous process: Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler Kernel Offset: disabled Rebooting in 100 seconds.. More specifically, Linux does a soft offline of a CPU when writing a 0 to /sys/devices/system/cpu/cpuX/online, which doesn't prevent the #MC exception from being broadcasted to that CPU. Ensure that offline CPUs don't participate in the MCE rendezvous and clear the RIP valid status bit so that a second MCE won't cause a shutdown. Without the patch, mce_start() will increment mce_callin and wait for all CPUs. Offlined CPUs should avoid participating in the rendezvous process altogether. Signed-off-by: Ashok Raj <ashok.raj@intel.com> [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: <stable@vger.kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1449742346-21470-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | | | | | | Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds2016-01-083-8/+11
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Misc scheduler fixes" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Reset task's lockless wake-queues on fork() sched/core: Fix unserialized r-m-w scribbling stuff sched/core: Check tgid in is_global_init() sched/fair: Fix multiplication overflow on 32-bit systems
| * | | | | | | | sched/core: Reset task's lockless wake-queues on fork()Sebastian Andrzej Siewior2016-01-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the following commit: 7675104990ed ("sched: Implement lockless wake-queues") we gained lockless wake-queues. The -RT kernel managed to lockup itself with those. There could be multiple attempts for task X to enqueue it for a wakeup _even_ if task X is already running. The reason is that task X could be runnable but not yet on CPU. The the task performing the wakeup did not leave the CPU it could performe multiple wakeups. With the proper timming task X could be running and enqueued for a wakeup. If this happens while X is performing a fork() then its its child will have a !NULL `wake_q` member copied. This is not a problem as long as the child task does not participate in lockless wakeups :) Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 7675104990ed ("sched: Implement lockless wake-queues") Link: http://lkml.kernel.org/r/20151221171710.GA5499@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | sched/core: Fix unserialized r-m-w scribbling stuffPeter Zijlstra2016-01-061-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some of the sched bitfieds (notably sched_reset_on_fork) can be set on other than current, this can cause the r-m-w to race with other updates. Since all the sched bits are serialized by scheduler locks, pull them in a separate word. Reported-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: akpm@linux-foundation.org Cc: hannes@cmpxchg.org Cc: mhocko@kernel.org Cc: vdavydov@parallels.com Link: http://lkml.kernel.org/r/20151125150207.GM11639@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | sched/core: Check tgid in is_global_init()Sergey Senozhatsky2016-01-061-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our global init task can have sub-threads, so ->pid check is not reliable enough for is_global_init(), we need to check tgid instead. This has been spotted by Oleg and a fix was proposed by Richard a long time ago (see the link below). Oleg wrote: : Because is_global_init() is only true for the main thread of /sbin/init. : : Just look at oom_unkillable_task(). It tries to not kill init. But, say, : select_bad_process() can happily find a sub-thread of is_global_init() : and still kill it. I recently hit the problem in question; re-sending the patch (to the best of my knowledge it has never been submitted) with updated function comment. Credit goes to Oleg and Richard. Suggested-by: Richard Guy Briggs <rgb@redhat.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Eric W . Biederman <ebiederm@xmission.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Serge E . Hallyn <serge.hallyn@ubuntu.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://www.redhat.com/archives/linux-audit/2013-December/msg00086.html Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | sched/fair: Fix multiplication overflow on 32-bit systemsAndrey Ryabinin2016-01-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make 'r' 64-bit type to avoid overflow in 'r * LOAD_AVG_MAX' on 32-bit systems: UBSAN: Undefined behaviour in kernel/sched/fair.c:2785:18 signed integer overflow: 87950 * 47742 cannot be represented in type 'int' The most likely effect of this bug are bad load average numbers resulting in weird scheduling. It's also likely that this can persist for a longer time - until the system goes idle for a long time so that all load avg numbers get reset. [ This is the CFS load average metric, not the procfs output, which is separate. ] Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking") Link: http://lkml.kernel.org/r/1450097243-30137-1-git-send-email-aryabinin@virtuozzo.com [ Improved the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | | | | | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2016-01-085-32/+21
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two core subsystem fixes, plus a handful of tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix race in swevent hash perf: Fix race in perf_event_exec() perf list: Robustify event printing routine perf list: Add support for PERF_COUNT_SW_BPF_OUT perf hists browser: Fix segfault if use symbol filter in cmdline perf hists browser: Reset selection when refresh perf hists browser: Add NULL pointer check to prevent crash perf buildid-list: Fix return value of perf buildid-list -k perf buildid-list: Show running kernel build id fix
| * | | | | | | | | perf: Fix race in swevent hashPeter Zijlstra2016-01-061-19/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a race on CPU unplug where we free the swevent hash array while it can still have events on. This will result in a use-after-free which is BAD. Simply do not free the hash array on unplug. This leaves the thing around and no use-after-free takes place. When the last swevent dies, we do a for_each_possible_cpu() iteration anyway to clean these up, at which time we'll free it, so no leakage will occur. Reported-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | | perf: Fix race in perf_event_exec()Peter Zijlstra2016-01-061-10/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I managed to tickle this warning: [ 2338.884942] ------------[ cut here ]------------ [ 2338.890112] WARNING: CPU: 13 PID: 35162 at ../kernel/events/core.c:2702 task_ctx_sched_out+0x6b/0x80() [ 2338.900504] Modules linked in: [ 2338.903933] CPU: 13 PID: 35162 Comm: bash Not tainted 4.4.0-rc4-dirty #244 [ 2338.911610] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013 [ 2338.923071] ffffffff81f1468e ffff8807c6457cb8 ffffffff815c680c 0000000000000000 [ 2338.931382] ffff8807c6457cf0 ffffffff810c8a56 ffffe8ffff8c1bd0 ffff8808132ed400 [ 2338.939678] 0000000000000286 ffff880813170380 ffff8808132ed400 ffff8807c6457d00 [ 2338.947987] Call Trace: [ 2338.950726] [<ffffffff815c680c>] dump_stack+0x4e/0x82 [ 2338.956474] [<ffffffff810c8a56>] warn_slowpath_common+0x86/0xc0 [ 2338.963195] [<ffffffff810c8b4a>] warn_slowpath_null+0x1a/0x20 [ 2338.969720] [<ffffffff811a49cb>] task_ctx_sched_out+0x6b/0x80 [ 2338.976244] [<ffffffff811a62d2>] perf_event_exec+0xe2/0x180 [ 2338.982575] [<ffffffff8121fb6f>] setup_new_exec+0x6f/0x1b0 [ 2338.988810] [<ffffffff8126de83>] load_elf_binary+0x393/0x1660 [ 2338.995339] [<ffffffff811dc772>] ? get_user_pages+0x52/0x60 [ 2339.001669] [<ffffffff8121e297>] search_binary_handler+0x97/0x200 [ 2339.008581] [<ffffffff8121f8b3>] do_execveat_common.isra.33+0x543/0x6e0 [ 2339.016072] [<ffffffff8121fcea>] SyS_execve+0x3a/0x50 [ 2339.021819] [<ffffffff819fc165>] stub_execve+0x5/0x5 [ 2339.027469] [<ffffffff819fbeb2>] ? entry_SYSCALL_64_fastpath+0x12/0x71 [ 2339.034860] ---[ end trace ee1337c59a0ddeac ]--- Which is a WARN_ON_ONCE() indicating that cpuctx->task_ctx is not what we expected it to be. This is because context switches can swap the task_struct::perf_event_ctxp[] pointer around. Therefore you have to either disable preemption when looking at current, or hold ctx->lock. Fix perf_event_enable_on_exec(), it loads current->perf_event_ctxp[] before disabling interrupts, therefore a preemption in the right place can swap contexts around and we're using the wrong one. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Potapenko <glider@google.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: syzkaller <syzkaller@googlegroups.com> Link: http://lkml.kernel.org/r/20151210195740.GG6357@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | | Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar2015-12-181-1/+5
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent tooling fix from Arnaldo Carvalho de Melo: User visible changes: - Fix 'perf list' segfault due to lack of support for PERF_CONF_SW_BPF_OUTPUT in an array used just for printing available events, robustify the code involved (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | | | | | | | perf list: Robustify event printing routineArnaldo Carvalho de Melo2015-12-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") added PERF_COUNT_SW_BPF_OUTPUT we ended up with a new entry in the event_symbols_sw array that wasn't initialized, thus set to NULL, fix print_symbol_events() to check for that case so that we don't crash if this happens again. (gdb) bt #0 __match_glob (ignore_space=false, pat=<optimized out>, str=<optimized out>) at util/string.c:198 #1 strglobmatch (str=<optimized out>, pat=pat@entry=0x7fffffffe61d "stall") at util/string.c:252 #2 0x00000000004993a5 in print_symbol_events (type=1, syms=0x872880 <event_symbols_sw+160>, max=11, name_only=false, event_glob=0x7fffffffe61d "stall") at util/parse-events.c:1615 #3 print_events (event_glob=event_glob@entry=0x7fffffffe61d "stall", name_only=false) at util/parse-events.c:1675 #4 0x000000000042c79e in cmd_list (argc=1, argv=0x7fffffffe390, prefix=<optimized out>) at builtin-list.c:68 #5 0x00000000004788a5 in run_builtin (p=p@entry=0x871758 <commands+120>, argc=argc@entry=2, argv=argv@entry=0x7fffffffe390) at perf.c:370 #6 0x0000000000420ab0 in handle_internal_command (argv=0x7fffffffe390, argc=2) at perf.c:429 #7 run_argv (argv=0x7fffffffe110, argcp=0x7fffffffe11c) at perf.c:473 #8 main (argc=2, argv=0x7fffffffe390) at perf.c:588 (gdb) p event_symbols_sw[PERF_COUNT_SW_BPF_OUTPUT] $4 = {symbol = 0x0, alias = 0x0} (gdb) A patch to robustify perf to not segfault when the next counter gets added in the kernel will follow this one. Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-57wysblcjfrseb0zg5u7ek10@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | | | | | | perf list: Add support for PERF_COUNT_SW_BPF_OUTArnaldo Carvalho de Melo2015-12-161-0/+4
| |/ / / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When PERF_COUNT_SW_BPF_OUTPUT was added to the kernel we should've added it to tools/perf, where it is used just to list events. This ended up causing a segfault in commands like "perf list stall". Fix it by adding that new software counter. A patch to robustify perf to not segfault when the next counter gets added in the kernel will follow this one. Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-uya354upi3eprsey6mi5962d@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | | | | | | Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar2015-12-083-2/+10
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: User visible fixes: - Fix showing the running kernel build id using: (Michael Petlan) $ perf buildid-list -k 03c2a89c595616188f02f0282762a75b47069bc0 - hists browser (report, top) symbol filter segfault fixes (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | | | | | | | perf hists browser: Fix segfault if use symbol filter in cmdlineWang Nan2015-12-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If feed perf a symbol filter in cmdline and the result is empty, pressing 'Enter' in the hist browser causes crash: # ./perf report perf.data <-- Common mistake for beginners Then press 'Enter': perf: Segmentation fault -------- backtrace -------- /home/wangnan/perf[0x53e578] /lib64/libc.so.6(+0x3545f)[0x7f76bafe045f] /home/wangnan/perf[0x539dd4] /home/wangnan/perf(perf_evlist__tui_browse_hists+0x96)[0x53d216] /home/wangnan/perf(cmd_report+0x1b9f)[0x442c7f] /home/wangnan/perf[0x47efa2] /home/wangnan/perf(main+0x5f5)[0x432fa5] /lib64/libc.so.6(__libc_start_main+0xf4)[0x7f76bafccbd4] /home/wangnan/perf[0x4330d4] This is because 'perf.data' is interpreted as a symbol filter, and the result is empty, so selection is empty. However, hist_browser__toggle_fold() forgets to check it. This patch simply return false when selection is NULL. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1449455746-41952-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | | | | | | perf hists browser: Reset selection when refreshWang Nan2015-12-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the following steps: Step 1: perf report Step 2: Use UP/DOWN to select an entry, don't press 'ENTER' Step 3: Use '/' to filter symbols, use a filter which returns empty result Step 4: Press 'ENTER' We see that, even if we have filtered all the symbols (and the main interface is empty), pressing 'ENTER' still selects one symbol. This behavior surprises the user. This patch resets browser->{he_,}selection in hist_browser__refresh() and lets it choose default selection. In this case browser->{he_,}selection keeps NULL so user won't see annotation item in menu. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1449455746-41952-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | | | | | | perf hists browser: Add NULL pointer check to prevent crashWang Nan2015-12-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch we can trigger a segfault by following steps: Step 0: Use 'perf record' to generate a perf.data without callchain Step 1: perf report Step 2: Use UP/DOWN to select an entry, don't press 'ENTER' Step 3: Use '/' to filter symbols, use a filter which returns empty result Step 4: Press 'ENTER' (notice here that the old selection is still there. This is another problem) Step 5: Press 'ENTER' to annotate that symbol Step 6: Press 'LEFT' to go out. Result: segfault: perf: Segmentation fault -------- backtrace -------- /home/wangnan/perf[0x53e568] /lib64/libc.so.6(+0x3545f)[0x7fba75d3245f] /home/wangnan/perf[0x537516] /home/wangnan/perf[0x533fef] /home/wangnan/perf[0x53b347] /home/wangnan/perf(perf_evlist__tui_browse_hists+0x96)[0x53d206] /home/wangnan/perf(cmd_report+0x1b9f)[0x442c7f] /home/wangnan/perf[0x47efa2] /home/wangnan/perf(main+0x5f5)[0x432fa5] /lib64/libc.so.6(__libc_start_main+0xf4)[0x7fba75d1ebd4] /home/wangnan/perf[0x4330d4] This is because in this case 'nd' could be NULL in ui_browser__hists_seek(), but that function never checks it. This patch adds checker for potential NULL pointer in that function. After this patch the above steps won't segfault. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1449455746-41952-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | | | | | | perf buildid-list: Fix return value of perf buildid-list -kMichael Petlan2015-12-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The buildid string length is returned by perf buildid-list -k command. Since a non-zero return value means an error, perf buildid-list -k cmd should return 0 when successful instead. Before: # perf buildid-list -k 39356d74e96e02346fe0ec1f3f162b6c522bac62 # echo $? 41 After: # perf buildid-list -k 39356d74e96e02346fe0ec1f3f162b6c522bac62 # echo $? 0 Signed-off-by: Michael Petlan <mpetlan@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Fixes: 0b5a7935f3b5 ("perf buildid: Introduce sysfs/filename__sprintf_build_id") LPU-Reference: 1449080871.24573.145.camel@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | | | | | | perf buildid-list: Show running kernel build id fixMichael Petlan2015-12-071-1/+1
| |/ / / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The --kernel option of perf buildid-list tool should show the running kernel buildid. The functionality has been lost during other changes of the related code. The build_id__sprintf() function should return length of the build-id string, but it was the length of the build-id raw data instead. Due to that, some return value checking caused that the final string was not printed out. With this patch the build_id__sprintf() returns the correct value, so the --kernel option works again. Before: # perf buildid-list --kernel # After: # perf buildid-list --kernel 972c1edab5bdc06cc224af45d510af662a3c6972 # Signed-off-by: Michael Petlan <mpetlan@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> LPU-Reference: 1448632089.24573.114.camel@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | | | | | | | | Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds2016-01-081-3/+3
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Ingo Molnar: "Fixes a core IRQ subsystem deadlock" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Prevent chip buslock deadlock
| * | | | | | | | | | genirq: Prevent chip buslock deadlockThomas Gleixner2015-12-141-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a interrupt chip utilizes chip->buslock then free_irq() can deadlock in the following way: CPU0 CPU1 interrupt(X) (Shared or spurious) free_irq(X) interrupt_thread(X) chip_bus_lock(X) irq_finalize_oneshot(X) chip_bus_lock(X) synchronize_irq(X) synchronize_irq() waits for the interrupt thread to complete, i.e. forever. Solution is simple: Drop chip_bus_lock() before calling synchronize_irq() as we do with the irq_desc lock. There is nothing to be protected after the point where irq_desc lock has been released. This adds chip_bus_lock/unlock() to the remove_irq() code path, but that's actually correct in the case where remove_irq() is called on such an interrupt. The current users of remove_irq() are not affected as none of those interrupts is on a chip which requires buslock. Reported-by: Fredrik Markström <fredrik.markstrom@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
* | | | | | | | | | | Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds2016-01-081-1/+1
|\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block revert from Jens Axboe: "The previous pull request had a split fix for NVMe, however there are corner cases where that ends up blowing up. So let's revert it for 4.4. The regression isn't introduced in this cycle, and it's "just" a performance regression, not a stability/integrity issue" * 'for-linus' of git://git.kernel.dk/linux-block: Revert "block: Split bios on chunk boundaries"
OpenPOWER on IntegriCloud