summaryrefslogtreecommitdiffstats
path: root/arch/x86/xen
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'for-linus-4.17-rc4-tag' of ↵Linus Torvalds2018-05-041-55/+31
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen cleanup from Juergen Gross: "One cleanup to remove VLAs from the kernel" * tag 'for-linus-4.17-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: Remove use of VLAs
| * x86/xen: Remove use of VLAsLaura Abbott2018-04-191-55/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's an ongoing effort to remove VLAs[1] from the kernel to eventually turn on -Wvla. It turns out, the few VLAs in use in Xen produce only a single entry array that is always bounded by GDT_SIZE. Clean up the code to get rid of the VLA and the loop. [1] https://lkml.org/lkml/2018/3/7/621 Signed-off-by: Laura Abbott <labbott@redhat.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> [boris: Use BUG_ON(size>PAGE_SIZE) instead of GDT_SIZE] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2018-04-151-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of fixes and updates for x86: - Address a swiotlb regression which was caused by the recent DMA rework and made driver fail because dma_direct_supported() returned false - Fix a signedness bug in the APIC ID validation which caused invalid APIC IDs to be detected as valid thereby bloating the CPU possible space. - Fix inconsisten config dependcy/select magic for the MFD_CS5535 driver. - Fix a corruption of the physical address space bits when encryption has reduced the address space and late cpuinfo updates overwrite the reduced bit information with the original value. - Dominiks syscall rework which consolidates the architecture specific syscall functions so all syscalls can be wrapped with the same macros. This allows to switch x86/64 to struct pt_regs based syscalls. Extend the clearing of user space controlled registers in the entry patch to the lower registers" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Fix signedness bug in APIC ID validity checks x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption x86/olpc: Fix inconsistent MFD_CS5535 configuration swiotlb: Use dma_direct_supported() for swiotlb_ops syscalls/x86: Adapt syscall_wrapper.h to the new syscall stub naming convention syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*() syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention syscalls/core, syscalls/x86: Clean up syscall stub naming convention syscalls/x86: Extend register clearing on syscall entry to lower registers syscalls/x86: Unconditionally enable 'struct pt_regs' based syscalls on x86_64 syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32 syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y for compat syscalls syscalls/x86: Use 'struct pt_regs' based syscall calling convention for 64-bit syscalls syscalls/core: Introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y x86/syscalls: Don't pointlessly reload the system call number x86/mm: Fix documentation of module mapping range with 4-level paging x86/cpuid: Switch to 'static const' specifier
| * \ Merge branch 'WIP.x86/asm' into x86/urgent, because the topic is readyIngo Molnar2018-04-124-9/+33
| |\ \ | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | x86/apic: Fix signedness bug in APIC ID validity checksLi RongQing2018-04-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The APIC ID as parsed from ACPI MADT is validity checked with the apic->apic_id_valid() callback, which depends on the selected APIC type. For non X2APIC types APIC IDs >= 0xFF are invalid, but values > 0x7FFFFFFF are detected as valid. This happens because the 'apicid' argument of the apic_id_valid() callback is type 'int'. So the resulting comparison apicid < 0xFF evaluates to true for all unsigned int values > 0x7FFFFFFF which are handed to default_apic_id_valid(). As a consequence, invalid APIC IDs in !X2APIC mode are considered valid and accounted as possible CPUs. Change the apicid argument type of the apic_id_valid() callback to u32 so the evaluation is unsigned and returns the correct result. [ tglx: Massaged changelog ] Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Cc: jgross@suse.com Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/1523322966-10296-1-git-send-email-lirongqing@baidu.com
* | | | Merge tag 'for-linus-4.17-rc1-tag' of ↵Linus Torvalds2018-04-122-5/+7
|\ \ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "A few fixes of Xen related core code and drivers" * tag 'for-linus-4.17-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvh: Indicate XENFEAT_linux_rsdp_unrestricted to Xen xen/acpi: off by one in read_acpi_id() xen/acpi: upload _PSD info for non Dom0 CPUs too x86/xen: Delay get_cpu_cap until stack canary is established xen: xenbus_dev_frontend: Verify body of XS_TRANSACTION_END xen: xenbus: Catch closing of non existent transactions xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handling
| * | | xen/pvh: Indicate XENFEAT_linux_rsdp_unrestricted to XenBoris Ostrovsky2018-04-101-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pre-4.17 kernels ignored start_info's rsdp_paddr pointer and instead relied on finding RSDP in standard location in BIOS RO memory. This has worked since that's where Xen used to place it. However, with recent Xen change (commit 4a5733771e6f ("libxl: put RSDP for PVH guest near 4GB")) it prefers to keep RSDP at a "non-standard" address. Even though as of commit b17d9d1df3c3 ("x86/xen: Add pvh specific rsdp address retrieval function") Linux is able to find RSDP, for back-compatibility reasons we need to indicate to Xen that we can handle this, an we do so by setting XENFEAT_linux_rsdp_unrestricted flag in ELF notes. (Also take this opportunity and sync features.h header file with Xen) Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
| * | | x86/xen: Delay get_cpu_cap until stack canary is establishedJason Andryuk2018-03-211-4/+4
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 2cc42bac1c79 ("x86-64/Xen: eliminate W+X mappings") introduced a call to get_cpu_cap, which is fstack-protected. This is works on x86-64 as commit 4f277295e54c ("x86/xen: init %gs very early to avoid page faults with stack protector") ensures the stack protector is configured, but it it did not cover x86-32. Delay calling get_cpu_cap until after xen_setup_gdt has initialized the stack canary. Without this, a 32bit PV machine crashes early in boot. (XEN) Domain 0 (vcpu#0) crashed on cpu#0: (XEN) ----[ Xen-4.6.6-xc x86_64 debug=n Tainted: C ]---- (XEN) CPU: 0 (XEN) RIP: e019:[<00000000c10362f8>] And the PV kernel IP corresponds to init_scattered_cpuid_features 0xc10362f8 <+24>: mov %gs:0x14,%eax Fixes 2cc42bac1c79 ("x86-64/Xen: eliminate W+X mappings") Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | | Merge tag 'pm-4.17-rc1-2' of ↵Linus Torvalds2018-04-111-0/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These include one big-ticket item which is the rework of the idle loop in order to prevent CPUs from spending too much time in shallow idle states. It reduces idle power on some systems by 10% or more and may improve performance of workloads in which the idle loop overhead matters. This has been in the works for several weeks and it has been tested and reviewed quite thoroughly. Also included are changes that finalize the cpufreq cleanup moving frequency table validation from drivers to the core, a few fixes and cleanups of cpufreq drivers, a cpuidle documentation update and a PM QoS core update to mark the expected switch fall-throughs in it. Specifics: - Rework the idle loop in order to prevent CPUs from spending too much time in shallow idle states by making it stop the scheduler tick before putting the CPU into an idle state only if the idle duration predicted by the idle governor is long enough. That required the code to be reordered to invoke the idle governor before stopping the tick, among other things (Rafael Wysocki, Frederic Weisbecker, Arnd Bergmann). - Add the missing description of the residency sysfs attribute to the cpuidle documentation (Prashanth Prakash). - Finalize the cpufreq cleanup moving frequency table validation from drivers to the core (Viresh Kumar). - Fix a clock leak regression in the armada-37xx cpufreq driver (Gregory Clement). - Fix the initialization of the CPU performance data structures for shared policies in the CPPC cpufreq driver (Shunyong Yang). - Clean up the ti-cpufreq, intel_pstate and CPPC cpufreq drivers a bit (Viresh Kumar, Rafael Wysocki). - Mark the expected switch fall-throughs in the PM QoS core (Gustavo Silva)" * tag 'pm-4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits) tick-sched: avoid a maybe-uninitialized warning cpufreq: Drop cpufreq_table_validate_and_show() cpufreq: SCMI: Don't validate the frequency table twice cpufreq: CPPC: Initialize shared perf capabilities of CPUs cpufreq: armada-37xx: Fix clock leak cpufreq: CPPC: Don't set transition_latency cpufreq: ti-cpufreq: Use builtin_platform_driver() cpufreq: intel_pstate: Do not include debugfs.h PM / QoS: mark expected switch fall-throughs cpuidle: Add definition of residency to sysfs documentation time: hrtimer: Use timerqueue_iterate_next() to get to the next timer nohz: Avoid duplication of code related to got_idle_tick nohz: Gather tick_sched booleans under a common flag field cpuidle: menu: Avoid selecting shallow states with stopped tick cpuidle: menu: Refine idle state selection for running tick sched: idle: Select idle state before stopping the tick time: hrtimer: Introduce hrtimer_next_event_without() time: tick-sched: Split tick_nohz_stop_sched_tick() cpuidle: Return nohz hint from cpuidle_select() jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC ...
| * | | time: tick-sched: Reorganize idle tick management codeRafael J. Wysocki2018-04-051-0/+1
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare the scheduler tick code for reworking the idle loop to avoid stopping the tick in some cases. The idea is to split the nohz idle entry call to decouple the idle time stats accounting and preparatory work from the actual tick stop code, in order to later be able to delay the tick stop once we reach more power-knowledgeable callers. Move away the tick_nohz_start_idle() invocation from __tick_nohz_idle_enter(), rename the latter to __tick_nohz_idle_stop_tick() and define tick_nohz_idle_stop_tick() as a wrapper around it for calling it from the outside. Make tick_nohz_idle_enter() only call tick_nohz_start_idle() instead of calling the entire __tick_nohz_idle_enter(), add another wrapper disabling and enabling interrupts around tick_nohz_idle_stop_tick() and make the current callers of tick_nohz_idle_enter() call it too to retain their current functionality. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
* | | xen, mm: allow deferred page initialization for xen pv domainsPavel Tatashin2018-04-111-12/+26
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Juergen Gross noticed that commit f7f99100d8d ("mm: stop zeroing memory during allocation in vmemmap") broke XEN PV domains when deferred struct page initialization is enabled. This is because the xen's PagePinned() flag is getting erased from struct pages when they are initialized later in boot. Juergen fixed this problem by disabling deferred pages on xen pv domains. It is desirable, however, to have this feature available as it reduces boot time. This fix re-enables the feature for pv-dmains, and fixes the problem the following way: The fix is to delay setting PagePinned flag until struct pages for all allocated memory are initialized, i.e. until after free_all_bootmem(). A new x86_init.hyper op init_after_bootmem() is called to let xen know that boot allocator is done, and hence struct pages for all the allocated memory are now initialized. If deferred page initialization is enabled, the rest of struct pages are going to be initialized later in boot once page_alloc_init_late() is called. xen_after_bootmem() walks page table's pages and marks them pinned. Link: http://lkml.kernel.org/r/20180226160112.24724-2-pasha.tatashin@oracle.com Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Juergen Gross <jgross@suse.com> Tested-by: Juergen Gross <jgross@suse.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andy Lutomirski <luto@kernel.org> Cc: Laura Abbott <labbott@redhat.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Mathias Krause <minipli@googlemail.com> Cc: Jinbum Park <jinb.park7@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Baoquan He <bhe@redhat.com> Cc: Jia Zhang <zhang.jia@linux.alibaba.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2018-04-023-8/+32
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: - Extend the memmap= boot parameter syntax to allow the redeclaration and dropping of existing ranges, and to support all e820 range types (Jan H. Schönherr) - Improve the W+X boot time security checks to remove false positive warnings on Xen (Jan Beulich) - Support booting as Xen PVH guest (Juergen Gross) - Improved 5-level paging (LA57) support, in particular it's possible now to have a single kernel image for both 4-level and 5-level hardware (Kirill A. Shutemov) - AMD hardware RAM encryption support (SME/SEV) fixes (Tom Lendacky) - Preparatory commits for hardware-encrypted RAM support on Intel CPUs. (Kirill A. Shutemov) - Improved Intel-MID support (Andy Shevchenko) - Show EFI page tables in page_tables debug files (Andy Lutomirski) - ... plus misc fixes and smaller cleanups * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits) x86/cpu/tme: Fix spelling: "configuation" -> "configuration" x86/boot: Fix SEV boot failure from change to __PHYSICAL_MASK_SHIFT x86/mm: Update comment in detect_tme() regarding x86_phys_bits x86/mm/32: Remove unused node_memmap_size_bytes() & CONFIG_NEED_NODE_MEMMAP_SIZE logic x86/mm: Remove pointless checks in vmalloc_fault x86/platform/intel-mid: Add special handling for ACPI HW reduced platforms ACPI, x86/boot: Introduce the ->reduced_hw_early_init() ACPI callback ACPI, x86/boot: Split out acpi_generic_reduce_hw_init() and export x86/pconfig: Provide defines and helper to run MKTME_KEY_PROG leaf x86/pconfig: Detect PCONFIG targets x86/tme: Detect if TME and MKTME is activated by BIOS x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G x86/boot/compressed/64: Use page table in trampoline memory x86/boot/compressed/64: Use stack from trampoline memory x86/boot/compressed/64: Make sure we have a 32-bit code segment x86/mm: Do not use paravirtualized calls in native_set_p4d() kdump, vmcoreinfo: Export pgtable_l5_enabled value x86/boot/compressed/64: Prepare new top-level page table for trampoline x86/boot/compressed/64: Set up trampoline memory x86/boot/compressed/64: Save and restore trampoline memory ...
| * \ Merge branch 'x86/urgent' into x86/mm to pick up dependenciesThomas Gleixner2018-03-141-2/+4
| |\ \ | | |/
| * | Merge branch 'x86/pti' into x86/mm, to pick up dependenciesIngo Molnar2018-03-121-0/+16
| |\ \ | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | x86/xen: Add pvh specific rsdp address retrieval functionJuergen Gross2018-02-261-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add pvh_get_root_pointer() for Xen PVH guests to communicate the address of the RSDP table given to the kernel via Xen start info. This makes the kernel boot again in PVH mode after on recent Xen the RSDP was moved to higher addresses. So up to that change it was pure luck that the legacy method to locate the RSDP was working when running as PVH mode. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Eric Biederman <ebiederm@xmission.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: boris.ostrovsky@oracle.com Cc: lenb@kernel.org Cc: linux-acpi@vger.kernel.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20180219100906.14265-4-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | Merge tag 'v4.16-rc3' into x86/mm, to pick up fixesIngo Molnar2018-02-262-3/+5
| |\ \ \ | | | | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | x86/xen: Allow XEN_PV and XEN_PVH to be enabled with X86_5LEVELKirill A. Shutemov2018-02-212-5/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With boot-time switching between paging modes, XEN_PV and XEN_PVH can be boot into 4-level paging mode. Tested-by: Juergen Gross <jgross@suse.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180216114948.68868-2-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds2018-04-021-1/+1
|\ \ \ \ \ | |_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Ingo Molnar: "The main x86 APIC/IOAPIC changes in this cycle were: - Robustify kexec support to more carefully restore IRQ hardware state before calling into kexec/kdump kernels. (Baoquan He) - Clean up the local APIC code a bit (Dou Liyang) - Remove unused callbacks (David Rientjes)" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Finish removing unused callbacks x86/apic: Drop logical_smp_processor_id() inline x86/apic: Modernize the pending interrupt code x86/apic: Move pending interrupt check code into it's own function x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified x86/apic: Rename variables and functions related to x86_io_apic_ops x86/apic: Remove the (now) unused disable_IO_APIC() function x86/apic: Fix restoring boot IRQ mode in reboot and kexec/kdump x86/apic: Split disable_IO_APIC() into two functions to fix CONFIG_KEXEC_JUMP=y x86/apic: Split out restore_boot_irq_mode() from disable_IO_APIC() x86/apic: Make setup_local_APIC() static x86/apic: Simplify init_bsp_APIC() usage x86/x2apic: Mark set_x2apic_phys_mode() as __init
| * | | | x86/apic: Rename variables and functions related to x86_io_apic_opsBaoquan He2018-02-171-1/+1
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The names of x86_io_apic_ops and its two member variables are misleading: The ->read() member is to read IO_APIC reg, while ->disable() which is called by native_disable_io_apic()/irq_remapping_disable_io_apic() is actually used to restore boot IRQ mode, not to disable the IO-APIC. So rename x86_io_apic_ops to 'x86_apic_ops' since it doesn't only handle the IO-APIC, but also the local APIC. Also rename its member variables and the related callbacks. Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: douly.fnst@cn.fujitsu.com Cc: joro@8bytes.org Cc: prarit@redhat.com Cc: uobergfe@redhat.com Link: http://lkml.kernel.org/r/20180214054656.3780-6-bhe@redhat.com [ Rewrote the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-03-041-0/+16
|\ \ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/pti fixes from Thomas Gleixner: "Three fixes related to melted spectrum: - Sync the cpu_entry_area page table to initial_page_table on 32 bit. Otherwise suspend/resume fails because resume uses initial_page_table and triggers a triple fault when accessing the cpu entry area. - Zero the SPEC_CTL MRS on XEN before suspend to address a shortcoming in the hypervisor. - Fix another switch table detection issue in objtool" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu_entry_area: Sync cpu_entry_area to initial_page_table objtool: Fix another switch table detection issue x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
| * | | x86/xen: Zero MSR_IA32_SPEC_CTRL before suspendJuergen Gross2018-02-281-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Older Xen versions (4.5 and before) might have problems migrating pv guests with MSR_IA32_SPEC_CTRL having a non-zero value. So before suspending zero that MSR and restore it after being resumed. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Cc: xen-devel@lists.xenproject.org Cc: boris.ostrovsky@oracle.com Link: https://lkml.kernel.org/r/20180226140818.4849-1-jgross@suse.com
* | | | x86/xen: add tty0 and hvc0 as preferred consoles for dom0Juergen Gross2018-02-281-2/+4
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | Today the tty0 and hvc0 consoles are added as a preferred consoles for pv domUs only. As this requires a boot parameter for getting dom0 messages per default, add them for dom0, too. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
* | | x86/xen: Calculate __max_logical_packages on PV domainsPrarit Bhargava2018-02-171-0/+2
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel panics on PV domains because native_smp_cpus_done() is only called for HVM domains. Calculate __max_logical_packages for PV domains. Fixes: b4c0a7326f5d ("x86/smpboot: Fix __max_logical_packages estimate") Signed-off-by: Prarit Bhargava <prarit@redhat.com> Tested-and-reported-by: Simon Gaiser <simon@invisiblethingslab.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: xen-devel@lists.xenproject.org Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
* | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-02-141-3/+3
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PTI and Spectre related fixes and updates from Ingo Molnar: "Here's the latest set of Spectre and PTI related fixes and updates: Spectre: - Add entry code register clearing to reduce the Spectre attack surface - Update the Spectre microcode blacklist - Inline the KVM Spectre helpers to get close to v4.14 performance again. - Fix indirect_branch_prediction_barrier() - Fix/improve Spectre related kernel messages - Fix array_index_nospec_mask() asm constraint - KVM: fix two MSR handling bugs PTI: - Fix a paranoid entry PTI CR3 handling bug - Fix comments objtool: - Fix paranoid_entry() frame pointer warning - Annotate WARN()-related UD2 as reachable - Various fixes - Add Add Peter Zijlstra as objtool co-maintainer Misc: - Various x86 entry code self-test fixes - Improve/simplify entry code stack frame generation and handling after recent heavy-handed PTI and Spectre changes. (There's two more WIP improvements expected here.) - Type fix for cache entries There's also some low risk non-fix changes I've included in this branch to reduce backporting conflicts: - rename a confusing x86_cpu field name - de-obfuscate the naming of single-TLB flushing primitives" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) x86/entry/64: Fix CR3 restore in paranoid_exit() x86/cpu: Change type of x86_cache_size variable to unsigned int x86/spectre: Fix an error message x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping selftests/x86/mpx: Fix incorrect bounds with old _sigfault x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]() x86/speculation: Add <asm/msr-index.h> dependency nospec: Move array_index_nospec() parameter checking into separate macro x86/speculation: Fix up array_index_nospec_mask() asm constraint x86/debug: Use UD2 for WARN() x86/debug, objtool: Annotate WARN()-related UD2 as reachable objtool: Fix segfault in ignore_unreachable_insn() selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c selftests/x86: Fix build bug caused by the 5lvl test which has been moved to the VM directory selftests/x86/pkeys: Remove unused functions selftests/x86: Clean up and document sscanf() usage selftests/x86: Fix vDSO selftest segfault for vsyscall=none x86/entry/64: Remove the unused 'icebp' macro ...
| * x86/mm: Rename flush_tlb_single() and flush_tlb_one() to ↵Andy Lutomirski2018-02-151-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __flush_tlb_one_[user|kernel]() flush_tlb_single() and flush_tlb_one() sound almost identical, but they really mean "flush one user translation" and "flush one kernel translation". Rename them to flush_tlb_one_user() and flush_tlb_one_kernel() to make the semantics more obvious. [ I was looking at some PTI-related code, and the flush-one-address code is unnecessarily hard to understand because the names of the helpers are uninformative. This came up during PTI review, but no one got around to doing it. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Hugh Dickins <hughd@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linux-MM <linux-mm@kvack.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/3303b02e3c3d049dc5235d5651e0ae6d29a34354.1517414378.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge tag 'for-linus-4.16-rc1-tag' of ↵Linus Torvalds2018-02-092-0/+22
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Only five small fixes for issues when running under Xen" * tag 'for-linus-4.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Fix {set,clear}_foreign_p2m_mapping on autotranslating guests pvcalls-back: do not return error on inet_accept EAGAIN xen-netfront: Fix race between device setup and open xen/grant-table: Use put_page instead of free_page x86/xen: init %gs very early to avoid page faults with stack protector
| * | xen: Fix {set,clear}_foreign_p2m_mapping on autotranslating guestsSimon Gaiser2018-02-081-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 82616f9599a7 ("xen: remove tests for pvh mode in pure pv paths") removed the check for autotranslation from {set,clear}_foreign_p2m_mapping but those are called by grant-table.c also on PVH/HVM guests. Cc: <stable@vger.kernel.org> # 4.14 Fixes: 82616f9599a7 ("xen: remove tests for pvh mode in pure pv paths") Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
| * | x86/xen: init %gs very early to avoid page faults with stack protectorJuergen Gross2018-02-061-0/+16
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | When running as Xen pv guest %gs is initialized some time after C code is started. Depending on stack protector usage this might be too late, resulting in page faults. So setup %gs and MSR_GS_BASE in assembly code already. Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Chris Patterson <cjp256@gmail.com> Signed-off-by: Juergen Gross <jgross@suse.com>
* | Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds2018-01-301-2/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Misc cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Remove unused IOMMU_STRESS Kconfig x86/extable: Mark exception handler functions visible x86/timer: Don't inline __const_udelay x86/headers: Remove duplicate #includes
| * | x86/headers: Remove duplicate #includesPravin Shedge2017-12-121-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: ard.biesheuvel@linaro.org Cc: boris.ostrovsky@oracle.com Cc: geert@linux-m68k.org Cc: jgross@suse.com Cc: linux-efi@vger.kernel.org Cc: luto@kernel.org Cc: matt@codeblueprint.co.uk Cc: thomas.lendacky@amd.com Cc: tim.c.chen@linux.intel.com Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1513024951-9221-1-git-send-email-pravin.shedge4linux@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2018-01-291-1/+1
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm update from Thomas Gleixner: "A single patch which excludes the GART aperture from vmcore as accessing that area from a dump kernel can crash the kernel. Not necessarily the nicest way to fix this, but curing this from ground up requires a more thorough rewrite of the whole kexec/kdump magic" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/gart: Exclude GART aperture from vmcore
| * | x86/gart: Exclude GART aperture from vmcoreJiri Bohac2018-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On machines where the GART aperture is mapped over physical RAM /proc/vmcore contains the remapped range and reading it may cause hangs or reboots. In the past, the GART region was added into the resource map, implemented by commit 56dd669a138c ("[PATCH] Insert GART region into resource map") However, inserting the iomem_resource from the early GART code caused resource conflicts with some AGP drivers (bko#72201), which got avoided by reverting the patch in commit 707d4eefbdb3 ("Revert [PATCH] Insert GART region into resource map"). This revert introduced the /proc/vmcore bug. The vmcore ELF header is either prepared by the kernel (when using the kexec_file_load syscall) or by the kexec userspace (when using the kexec_load syscall). Since we no longer have the GART iomem resource, the userspace kexec has no way of knowing which region to exclude from the ELF header. Changes from v1 of this patch: Instead of excluding the aperture from the ELF header, this patch makes /proc/vmcore return zeroes in the second kernel when attempting to read the aperture region. This is done by reusing the gart_oldmem_pfn_is_ram infrastructure originally intended to exclude XEN balooned memory. This works for both, the kexec_file_load and kexec_load syscalls. [Note that the GART region is the same in the first and second kernels: regardless whether the first kernel fixed up the northbridge/bios setting and mapped the aperture over physical memory, the second kernel finds the northbridge properly configured by the first kernel and the aperture never overlaps with e820 memory because the second kernel has a fake e820 map created from the crashkernel memory regions. Thus, the second kernel keeps the aperture address/size as configured by the first kernel.] register_oldmem_pfn_is_ram can only register one callback and returns an error if the callback has been registered already. Since XEN used to be the only user of this function, it never checks the return value. Now that we have more than one user, I added a WARN_ON just in case agp, XEN, or any other future user of register_oldmem_pfn_is_ram were to step on each other's toes. Fixes: 707d4eefbdb3 ("Revert [PATCH] Insert GART region into resource map") Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Baoquan He <bhe@redhat.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: David Airlie <airlied@linux.ie> Cc: yinghai@kernel.org Cc: joro@8bytes.org Cc: kexec@lists.infradead.org Cc: Borislav Petkov <bp@alien8.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Link: https://lkml.kernel.org/r/20180106010013.73suskgxm7lox7g6@dwarf.suse.cz
* | | Merge tag 'for-linus-4.15-rc8-tag' of ↵Linus Torvalds2018-01-122-6/+4
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "This contains two build fixes for clang and two fixes for rather unlikely situations in the Xen gntdev driver" * tag 'for-linus-4.15-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/gntdev: Fix partial gntdev_mmap() cleanup xen/gntdev: Fix off-by-one error when unmapping with holes x86: xen: remove the use of VLAIS x86/xen/time: fix section mismatch for xen_init_time_ops()
| * | x86: xen: remove the use of VLAISNick Desaulniers2018-01-081-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Variable Length Arrays In Structs (VLAIS) is not supported by Clang, and frowned upon by others. https://lkml.org/lkml/2013/9/23/500 Here, the VLAIS was used because the size of the bitmap returned from xen_mc_entry() depended on possibly (based on kernel configuration) runtime sized data. Rather than declaring args as a VLAIS then calling sizeof on *args, we calculate the appropriate sizeof args manually. Further, we can get rid of the #ifdef's and rely on num_possible_cpus() (thanks to a helpful checkpatch warning from an earlier version of this patch). Suggested-by: Juergen Gross <jgross@suse.com> Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
| * | x86/xen/time: fix section mismatch for xen_init_time_ops()Nick Desaulniers2018-01-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | The header declares this function as __init but is defined in __ref section. Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2017-12-231-2/+0
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PTI preparatory patches from Thomas Gleixner: "Todays Advent calendar window contains twentyfour easy to digest patches. The original plan was to have twenty three matching the date, but a late fixup made that moot. - Move the cpu_entry_area mapping out of the fixmap into a separate address space. That's necessary because the fixmap becomes too big with NRCPUS=8192 and this caused already subtle and hard to diagnose failures. The top most patch is fresh from today and cures a brain slip of that tall grumpy german greybeard, who ignored the intricacies of 32bit wraparounds. - Limit the number of CPUs on 32bit to 64. That's insane big already, but at least it's small enough to prevent address space issues with the cpu_entry_area map, which have been observed and debugged with the fixmap code - A few TLB flush fixes in various places plus documentation which of the TLB functions should be used for what. - Rename the SYSENTER stack to CPU_ENTRY_AREA stack as it is used for more than sysenter now and keeping the name makes backtraces confusing. - Prevent LDT inheritance on exec() by moving it to arch_dup_mmap(), which is only invoked on fork(). - Make vysycall more robust. - A few fixes and cleanups of the debug_pagetables code. Check PAGE_PRESENT instead of checking the PTE for 0 and a cleanup of the C89 initialization of the address hint array which already was out of sync with the index enums. - Move the ESPFIX init to a different place to prepare for PTI. - Several code moves with no functional change to make PTI integration simpler and header files less convoluted. - Documentation fixes and clarifications" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/cpu_entry_area: Prevent wraparound in setup_cpu_entry_area_ptes() on 32bit init: Invoke init_espfix_bsp() from mm_init() x86/cpu_entry_area: Move it out of the fixmap x86/cpu_entry_area: Move it to a separate unit x86/mm: Create asm/invpcid.h x86/mm: Put MMU to hardware ASID translation in one place x86/mm: Remove hard-coded ASID limit checks x86/mm: Move the CR3 construction functions to tlbflush.h x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what x86/mm: Remove superfluous barriers x86/mm: Use __flush_tlb_one() for kernel memory x86/microcode: Dont abuse the TLB-flush interface x86/uv: Use the right TLB-flush API x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack x86/doc: Remove obvious weirdnesses from the x86 MM layout documentation x86/mm/64: Improve the memory map documentation x86/ldt: Prevent LDT inheritance on exec x86/ldt: Rework locking arch, mm: Allow arch_dup_mmap() to fail x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode ...
| * | | x86/cpu_entry_area: Move it out of the fixmapThomas Gleixner2017-12-221-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Put the cpu_entry_area into a separate P4D entry. The fixmap gets too big and 0-day already hit a case where the fixmap PTEs were cleared by cleanup_highmap(). Aside of that the fixmap API is a pain as it's all backwards. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | Merge tag 'for-linus-4.15-rc5-tag' of ↵Linus Torvalds2017-12-224-4/+98
|\ \ \ \ | | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "This contains two fixes for running under Xen: - a fix avoiding resource conflicts between adding mmio areas and memory hotplug - a fix setting NX bits in page table entries copied from Xen when running a PV guest" * tag 'for-linus-4.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/balloon: Mark unallocated host memory as UNUSABLE x86-64/Xen: eliminate W+X mappings
| * | | xen/balloon: Mark unallocated host memory as UNUSABLEBoris Ostrovsky2017-12-202-4/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit f5775e0b6116 ("x86/xen: discard RAM regions above the maximum reservation") left host memory not assigned to dom0 as available for memory hotplug. Unfortunately this also meant that those regions could be used by others. Specifically, commit fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") may try to map those addresses as MMIO. To prevent this mark unallocated host memory as E820_TYPE_UNUSABLE (thus effectively reverting f5775e0b6116) and keep track of that region as a hostmem resource that can be used for the hotplug. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com>
| * | | x86-64/Xen: eliminate W+X mappingsJan Beulich2017-12-192-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A few thousand such pages are usually left around due to the re-use of L1 tables having been provided by the hypervisor (Dom0) or tool stack (DomU). Set NX in the direct map variant, which needs to be done in L2 due to the dual use of the re-used L1s. For x86_configure_nx() to actually do what it is supposed to do, call get_cpu_cap() first. This was broken by commit 4763ed4d45 ("x86, mm: Clean up and simplify NX enablement") when switching away from the direct EFER read. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | | | Merge branch 'WIP.x86-pti.entry-for-linus' of ↵Linus Torvalds2017-12-182-2/+2
|\ \ \ \ | | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 syscall entry code changes for PTI from Ingo Molnar: "The main changes here are Andy Lutomirski's changes to switch the x86-64 entry code to use the 'per CPU entry trampoline stack'. This, besides helping fix KASLR leaks (the pending Page Table Isolation (PTI) work), also robustifies the x86 entry code" * 'WIP.x86-pti.entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) x86/cpufeatures: Make CPU bugs sticky x86/paravirt: Provide a way to check for hypervisors x86/paravirt: Dont patch flush_tlb_single x86/entry/64: Make cpu_entry_area.tss read-only x86/entry: Clean up the SYSENTER_stack code x86/entry/64: Remove the SYSENTER stack canary x86/entry/64: Move the IST stacks into struct cpu_entry_area x86/entry/64: Create a per-CPU SYSCALL entry trampoline x86/entry/64: Return to userspace from the trampoline stack x86/entry/64: Use a per-CPU trampoline stack for IDT entries x86/espfix/64: Stop assuming that pt_regs is on the entry stack x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0 x86/entry: Remap the TSS into the CPU entry area x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct x86/dumpstack: Handle stack overflow on all stacks x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss x86/kasan/64: Teach KASAN about the cpu_entry_area x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area x86/entry/gdt: Put per-CPU GDT remaps in ascending order x86/dumpstack: Add get_stack_info() support for the SYSENTER stack ...
| * | | x86/entry/64: Make cpu_entry_area.tss read-onlyAndy Lutomirski2017-12-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The TSS is a fairly juicy target for exploits, and, now that the TSS is in the cpu_entry_area, it's no longer protected by kASLR. Make it read-only on x86_64. On x86_32, it can't be RO because it's written by the CPU during task switches, and we use a task gate for double faults. I'd also be nervous about errata if we tried to make it RO even on configurations without double fault handling. [ tglx: AMD confirmed that there is no problem on 64-bit with TSS RO. So it's probably safe to assume that it's a non issue, though Intel might have been creative in that area. Still waiting for confirmation. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bpetkov@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.733700132@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct ↵Andy Lutomirski2017-12-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cpu_entry_area Currently, the GDT is an ad-hoc array of pages, one per CPU, in the fixmap. Generalize it to be an array of a new 'struct cpu_entry_area' so that we can cleanly add new things to it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.563271721@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | Merge commit 'upstream-x86-virt' into WIP.x86/mmIngo Molnar2017-12-172-9/+9
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | Merge a minimal set of virt cleanups, for a base for the MM isolation patches. Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | \ \ \ Merge tag 'for-linus-4.15-rc4-tag' of ↵Linus Torvalds2017-12-151-1/+1
|\ \ \ \ \ | | |_|/ / | |/| | / | |_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Two minor fixes for running as Xen dom0: - when built as 32 bit kernel on large machines the Xen LAPIC emulation should report a rather modern LAPIC in order to support enough APIC-Ids - The Xen LAPIC emulation is needed for dom0 only, so build it only for kernels supporting to run as Xen dom0" * tag 'for-linus-4.15-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: XEN_ACPI_PROCESSOR is Dom0-only x86/Xen: don't report ancient LAPIC version
| * | | x86/Xen: don't report ancient LAPIC versionJan Beulich2017-12-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unconditionally reporting a value seen on the P4 or older invokes functionality like io_apic_get_unique_id() on 32-bit builds, resulting in a panic() with sufficiently many CPUs and/or IO-APICs. Doing what that function does would be the hypervisor's responsibility anyway, so makes no sense to be used when running on Xen. Uniformly report a more modern version; this shouldn't matter much as both LAPIC and IO-APIC are being managed entirely / mostly by the hypervisor. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2017-12-062-13/+38
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Ingo Molnar: - make CR4 handling irq-safe, which bug vmware guests ran into - don't crash on early IRQs in Xen guests - don't crash secondary CPU bringup if #UD assisted WARN()ings are triggered - make X86_BUG_FXSAVE_LEAK optional on newer AMD CPUs that have the fix - fix AMD Fam17h microcode loading - fix broadcom_postcore_init() if ACPI is disabled - fix resume regression in __restore_processor_context() - fix Sparse warnings - fix a GCC-8 warning * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso: Change time() prototype to match __vdso_time() x86: Fix Sparse warnings about non-static functions x86/power: Fix some ordering bugs in __restore_processor_context() x86/PCI: Make broadcom_postcore_init() check acpi_disabled x86/microcode/AMD: Add support for fam17h microcode loading x86/cpufeatures: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD x86/idt: Load idt early in start_secondary x86/xen: Support early interrupts in xen pv guests x86/tlb: Disable interrupts when changing CR4 x86/tlb: Refactor CR4 setting and shadow write
| * | | | x86/xen: Support early interrupts in xen pv guestsJuergen Gross2017-11-282-13/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add early interrupt handlers activated by idt_setup_early_handler() to the handlers supported by Xen pv guests. This will allow for early WARN() calls not crashing the guest. Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Cc: boris.ostrovsky@oracle.com Link: https://lkml.kernel.org/r/20171124084221.30172-1-jgross@suse.com
* | | | | Merge tag 'for-linus-4.15-rc1-tag' of ↵Linus Torvalds2017-11-166-10/+173
|\ \ \ \ \ | |/ / / / |/| / / / | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "Xen features and fixes for v4.15-rc1 Apart from several small fixes it contains the following features: - a series by Joao Martins to add vdso support of the pv clock interface - a series by Juergen Gross to add support for Xen pv guests to be able to run on 5 level paging hosts - a series by Stefano Stabellini adding the Xen pvcalls frontend driver using a paravirtualized socket interface" * tag 'for-linus-4.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (34 commits) xen/pvcalls: fix potential endless loop in pvcalls-front.c xen/pvcalls: Add MODULE_LICENSE() MAINTAINERS: xen, kvm: track pvclock-abi.h changes x86/xen/time: setup vcpu 0 time info page x86/xen/time: set pvclock flags on xen_time_init() x86/pvclock: add setter for pvclock_pvti_cpu0_va ptp_kvm: probe for kvm guest availability xen/privcmd: remove unused variable pageidx xen: select grant interface version xen: update arch/x86/include/asm/xen/cpuid.h xen: add grant interface version dependent constants to gnttab_ops xen: limit grant v2 interface to the v1 functionality xen: re-introduce support for grant v2 interface xen: support priv-mapping in an HVM tools domain xen/pvcalls: remove redundant check for irq >= 0 xen/pvcalls: fix unsigned less than zero error check xen/time: Return -ENODEV from xen_get_wallclock() xen/pvcalls-front: mark expected switch fall-through xen: xenbus_probe_frontend: mark expected switch fall-throughs xen/time: do not decrease steal time after live migration on xen ...
| * | | x86/xen/time: setup vcpu 0 time info pageJoao Martins2017-11-083-1/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to support pvclock vdso on xen we need to setup the time info page for vcpu 0 and register the page with Xen using the VCPUOP_register_vcpu_time_memory_area hypercall. This hypercall will also forcefully update the pvti which will set some of the necessary flags for vdso. Afterwards we check if it supports the PVCLOCK_TSC_STABLE_BIT flag which is mandatory for having vdso/vsyscall support. And if so, it will set the cpu 0 pvti that will be later on used when mapping the vdso image. The xen headers are also updated to include the new hypercall for registering the secondary vcpu_time_info struct. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
OpenPOWER on IntegriCloud