summaryrefslogtreecommitdiffstats
path: root/arch/x86/lib
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds2018-04-021-15/+28
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups and msr updates from Ingo Molnar: "The main change is a performance/latency improvement to /dev/msr access. The rest are misc cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/msr: Make rdmsrl_safe_on_cpu() scheduling safe as well x86/cpuid: Allow cpuid_read() to schedule x86/msr: Allow rdmsr_safe_on_cpu() to schedule x86/rtc: Stop using deprecated functions x86/dumpstack: Unify show_regs() x86/fault: Do not print IP in show_fault_oops() x86/MSR: Move native_* variants to msr.h
| * x86/msr: Make rdmsrl_safe_on_cpu() scheduling safe as wellEric Dumazet2018-03-281-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When changing rdmsr_safe_on_cpu() to schedule, it was missed that __rdmsr_safe_on_cpu() was also used by rdmsrl_safe_on_cpu() Make rdmsrl_safe_on_cpu() a wrapper instead of copy/pasting the code which was added for the completion handling. Fixes: 07cde313b2d2 ("x86/msr: Allow rdmsr_safe_on_cpu() to schedule") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/20180328032233.153055-1-edumazet@google.com
| * x86/msr: Allow rdmsr_safe_on_cpu() to scheduleEric Dumazet2018-03-271-8/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | High latencies can be observed caused by a daemon periodically reading various MSR on all cpus. On KASAN enabled kernels ~10ms latencies can be observed simply reading one MSR. Even without KASAN, sending an IPI to a CPU, which is in a deep sleep state or in a long hard IRQ disabled section, waiting for the answer can consume hundreds of microseconds. All usage sites are in preemptible context, convert rdmsr_safe_on_cpu() to use a completion instead of busy polling. Overall daemon cpu usage was reduced by 35 %, and latencies caused by msr_read() disappeared. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Eric Dumazet <eric.dumazet@gmail.com> Link: https://lkml.kernel.org/r/20180323215818.127774-1-edumazet@google.com
* | Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds2018-04-021-2/+0
|\ \ | |/ |/| | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm fixlets from Ingo Molnar: "A clobber list fix and cleanups" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Trim clear_page.S includes x86/asm: Clobber flags in clear_page()
| * x86/asm: Trim clear_page.S includesAlexey Dobriyan2018-02-131-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | After alternatives were shifted to the call site, only 2 headers are necessary. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180113190648.GB23111@avx2 Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-02-262-57/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "Yet another pile of melted spectrum related changes: - sanitize the array_index_nospec protection mechanism: Remove the overengineered array_index_nospec_mask_check() magic and allow const-qualified types as index to avoid temporary storage in a non-const local variable. - make the microcode loader more robust by properly propagating error codes. Provide information about new feature bits after micro code was updated so administrators can act upon. - optimizations of the entry ASM code which reduce code footprint and make the code simpler and faster. - fix the {pmd,pud}_{set,clear}_flags() implementations to work properly on paravirt kernels by removing the address translation operations. - revert the harmful vmexit_fill_RSB() optimization - use IBRS around firmware calls - teach objtool about retpolines and add annotations for indirect jumps and calls. - explicitly disable jumplabel patching in __init code and handle patching failures properly instead of silently ignoring them. - remove indirect paravirt calls for writing the speculation control MSR as these calls are obviously proving the same attack vector which is tried to be mitigated. - a few small fixes which address build issues with recent compiler and assembler versions" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits) KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely() KVM/x86: Remove indirect MSR op calls from SPEC_CTRL objtool, retpolines: Integrate objtool with retpoline support more closely x86/entry/64: Simplify ENCODE_FRAME_POINTER extable: Make init_kernel_text() global jump_label: Warn on failed jump_label patching attempt jump_label: Explicitly disable jump labels in __init code x86/entry/64: Open-code switch_to_thread_stack() x86/entry/64: Move ASM_CLAC to interrupt_entry() x86/entry/64: Remove 'interrupt' macro x86/entry/64: Move the switch_to_thread_stack() call to interrupt_entry() x86/entry/64: Move ENTER_IRQ_STACK from interrupt macro to interrupt_entry x86/entry/64: Move PUSH_AND_CLEAR_REGS from interrupt macro to helper function x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPP objtool: Add module specific retpoline rules objtool: Add retpoline validation objtool: Use existing global variables for options x86/mm/sme, objtool: Annotate indirect call in sme_encrypt_execute() x86/boot, objtool: Annotate indirect jump in secondary_startup_64() x86/paravirt, objtool: Annotate indirect calls ...
| * | Revert "x86/retpoline: Simplify vmexit_fill_RSB()"David Woodhouse2018-02-202-57/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 1dde7415e99933bb7293d6b2843752cbdb43ec11. By putting the RSB filling out of line and calling it, we waste one RSB slot for returning from the function itself, which means one fewer actual function call we can make if we're doing the Skylake abomination of call-depth counting. It also changed the number of RSB stuffings we do on vmexit from 32, which was correct, to 16. Let's just stop with the bikeshedding; it didn't actually *fix* anything anyway. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: arjan.van.de.ven@intel.com Cc: bp@alien8.de Cc: dave.hansen@intel.com Cc: jmattson@google.com Cc: karahmed@amazon.de Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Link: http://lkml.kernel.org/r/1519037457-7643-4-git-send-email-dwmw@amazon.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2018-02-141-0/+1
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes all across the map: - /proc/kcore vsyscall related fixes - LTO fix - build warning fix - CPU hotplug fix - Kconfig NR_CPUS cleanups - cpu_has() cleanups/robustification - .gitignore fix - memory-failure unmapping fix - UV platform fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages x86/error_inject: Make just_return_func() globally visible x86/platform/UV: Fix GAM Range Table entries less than 1GB x86/build: Add arch/x86/tools/insn_decoder_test to .gitignore x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU x86/mm/kcore: Add vsyscall page to /proc/kcore conditionally vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page x86/Kconfig: Further simplify the NR_CPUS config x86/Kconfig: Simplify NR_CPUS config x86/MCE: Fix build warning introduced by "x86: do not use print_symbol()" x86/cpufeature: Update _static_cpu_has() to use all named variables x86/cpufeature: Reindent _static_cpu_has()
| * | x86/error_inject: Make just_return_func() globally visibleArnd Bergmann2018-02-131-0/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With link time optimizations enabled, I get a link failure: ./ccLbOEHX.ltrans19.ltrans.o: In function `override_function_with_return': <artificial>:(.text+0x7f3): undefined reference to `just_return_func' Marking the symbol .globl makes it work as expected. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Josef Bacik <jbacik@fb.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nicolas Pitre <nico@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 540adea3809f ("error-injection: Separate error-injection from kprobe") Link: http://lkml.kernel.org/r/20180202145634.200291-3-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-02-141-1/+1
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PTI and Spectre related fixes and updates from Ingo Molnar: "Here's the latest set of Spectre and PTI related fixes and updates: Spectre: - Add entry code register clearing to reduce the Spectre attack surface - Update the Spectre microcode blacklist - Inline the KVM Spectre helpers to get close to v4.14 performance again. - Fix indirect_branch_prediction_barrier() - Fix/improve Spectre related kernel messages - Fix array_index_nospec_mask() asm constraint - KVM: fix two MSR handling bugs PTI: - Fix a paranoid entry PTI CR3 handling bug - Fix comments objtool: - Fix paranoid_entry() frame pointer warning - Annotate WARN()-related UD2 as reachable - Various fixes - Add Add Peter Zijlstra as objtool co-maintainer Misc: - Various x86 entry code self-test fixes - Improve/simplify entry code stack frame generation and handling after recent heavy-handed PTI and Spectre changes. (There's two more WIP improvements expected here.) - Type fix for cache entries There's also some low risk non-fix changes I've included in this branch to reduce backporting conflicts: - rename a confusing x86_cpu field name - de-obfuscate the naming of single-TLB flushing primitives" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) x86/entry/64: Fix CR3 restore in paranoid_exit() x86/cpu: Change type of x86_cache_size variable to unsigned int x86/spectre: Fix an error message x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping selftests/x86/mpx: Fix incorrect bounds with old _sigfault x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]() x86/speculation: Add <asm/msr-index.h> dependency nospec: Move array_index_nospec() parameter checking into separate macro x86/speculation: Fix up array_index_nospec_mask() asm constraint x86/debug: Use UD2 for WARN() x86/debug, objtool: Annotate WARN()-related UD2 as reachable objtool: Fix segfault in ignore_unreachable_insn() selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c selftests/x86: Fix build bug caused by the 5lvl test which has been moved to the VM directory selftests/x86/pkeys: Remove unused functions selftests/x86: Clean up and document sscanf() usage selftests/x86: Fix vDSO selftest segfault for vsyscall=none x86/entry/64: Remove the unused 'icebp' macro ...
| * x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_steppingJia Zhang2018-02-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | x86_mask is a confusing name which is hard to associate with the processor's stepping. Additionally, correct an indent issue in lib/cpu.c. Signed-off-by: Jia Zhang <qianyue.zj@alibaba-inc.com> [ Updated it to more recent kernels. ] Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: tony.luck@intel.com Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-02-042-4/+14
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull spectre/meltdown updates from Thomas Gleixner: "The next round of updates related to melted spectrum: - The initial set of spectre V1 mitigations: - Array index speculation blocker and its usage for syscall, fdtable and the n180211 driver. - Speculation barrier and its usage in user access functions - Make indirect calls in KVM speculation safe - Blacklisting of known to be broken microcodes so IPBP/IBSR are not touched. - The initial IBPB support and its usage in context switch - The exposure of the new speculation MSRs to KVM guests. - A fix for a regression in x86/32 related to the cpu entry area - Proper whitelisting for known to be safe CPUs from the mitigations. - objtool fixes to deal proper with retpolines and alternatives - Exclude __init functions from retpolines which speeds up the boot process. - Removal of the syscall64 fast path and related cleanups and simplifications - Removal of the unpatched paravirt mode which is yet another source of indirect unproteced calls. - A new and undisputed version of the module mismatch warning - A couple of cleanup and correctness fixes all over the place Yet another step towards full mitigation. There are a few things still missing like the RBS underflow mitigation for Skylake and other small details, but that's being worked on. That said, I'm taking a belated christmas vacation for a week and hope that everything is magically solved when I'm back on Feb 12th" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES KVM/x86: Add IBPB support KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL x86/pti: Mark constant arrays as __initconst x86/spectre: Simplify spectre_v2 command line parsing x86/retpoline: Avoid retpolines for built-in __init functions x86/kvm: Update spectre-v1 mitigation KVM: VMX: make MSR bitmaps per-VCPU x86/paravirt: Remove 'noreplace-paravirt' cmdline option x86/speculation: Use Indirect Branch Prediction Barrier in context switch x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable" x86/spectre: Report get_user mitigation for spectre_v1 nl80211: Sanitize array index in parse_txq_params vfs, fdtable: Prevent bounds-check bypass via speculative execution x86/syscall: Sanitize syscall table de-references under speculation x86/get_user: Use pointer masking to limit speculation ...
| * x86/get_user: Use pointer masking to limit speculationDan Williams2018-01-301-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quoting Linus: I do think that it would be a good idea to very expressly document the fact that it's not that the user access itself is unsafe. I do agree that things like "get_user()" want to be protected, but not because of any direct bugs or problems with get_user() and friends, but simply because get_user() is an excellent source of a pointer that is obviously controlled from a potentially attacking user space. So it's a prime candidate for then finding _subsequent_ accesses that can then be used to perturb the cache. Unlike the __get_user() case get_user() includes the address limit check near the pointer de-reference. With that locality the speculation can be mitigated with pointer narrowing rather than a barrier, i.e. array_index_nospec(). Where the narrowing is performed by: cmp %limit, %ptr sbb %mask, %mask and %mask, %ptr With respect to speculation the value of %ptr is either less than %limit or NULL. Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Kees Cook <keescook@chromium.org> Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727417469.33451.11804043010080838495.stgit@dwillia2-desk3.amr.corp.intel.com
| * x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospecDan Williams2018-01-301-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quoting Linus: I do think that it would be a good idea to very expressly document the fact that it's not that the user access itself is unsafe. I do agree that things like "get_user()" want to be protected, but not because of any direct bugs or problems with get_user() and friends, but simply because get_user() is an excellent source of a pointer that is obviously controlled from a potentially attacking user space. So it's a prime candidate for then finding _subsequent_ accesses that can then be used to perturb the cache. __uaccess_begin_nospec() covers __get_user() and copy_from_iter() where the limit check is far away from the user pointer de-reference. In those cases a barrier_nospec() prevents speculation with a potential pointer to privileged memory. uaccess_try_nospec covers get_user_try. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Kees Cook <keescook@chromium.org> Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727416953.33451.10508284228526170604.stgit@dwillia2-desk3.amr.corp.intel.com
| * x86/usercopy: Replace open coded stac/clac with __uaccess_{begin, end}Dan Williams2018-01-301-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for converting some __uaccess_begin() instances to __uacess_begin_nospec(), make sure all 'from user' uaccess paths are using the _begin(), _end() helpers rather than open-coded stac() and clac(). No functional changes. Suggested-by: Ingo Molnar <mingo@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Kees Cook <keescook@chromium.org> Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727416438.33451.17309465232057176966.stgit@dwillia2-desk3.amr.corp.intel.com
| * Merge tag 'v4.15' into x86/pti, to be able to merge dependent changesIngo Molnar2018-01-304-4/+1389
| |\ | | | | | | | | | | | | | | | | | | Time has come to switch PTI development over to a v4.15 base - we'll still try to make sure that all PTI fixes backport cleanly to v4.14 and earlier. Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2018-01-312-0/+20
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) Significantly shrink the core networking routing structures. Result of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf 2) Add netdevsim driver for testing various offloads, from Jakub Kicinski. 3) Support cross-chip FDB operations in DSA, from Vivien Didelot. 4) Add a 2nd listener hash table for TCP, similar to what was done for UDP. From Martin KaFai Lau. 5) Add eBPF based queue selection to tun, from Jason Wang. 6) Lockless qdisc support, from John Fastabend. 7) SCTP stream interleave support, from Xin Long. 8) Smoother TCP receive autotuning, from Eric Dumazet. 9) Lots of erspan tunneling enhancements, from William Tu. 10) Add true function call support to BPF, from Alexei Starovoitov. 11) Add explicit support for GRO HW offloading, from Michael Chan. 12) Support extack generation in more netlink subsystems. From Alexander Aring, Quentin Monnet, and Jakub Kicinski. 13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From Russell King. 14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso. 15) Many improvements and simplifications to the NFP driver bpf JIT, from Jakub Kicinski. 16) Support for ipv6 non-equal cost multipath routing, from Ido Schimmel. 17) Add resource abstration to devlink, from Arkadi Sharshevsky. 18) Packet scheduler classifier shared filter block support, from Jiri Pirko. 19) Avoid locking in act_csum, from Davide Caratti. 20) devinet_ioctl() simplifications from Al viro. 21) More TCP bpf improvements from Lawrence Brakmo. 22) Add support for onlink ipv6 route flag, similar to ipv4, from David Ahern. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits) tls: Add support for encryption using async offload accelerator ip6mr: fix stale iterator net/sched: kconfig: Remove blank help texts openvswitch: meter: Use 64-bit arithmetic instead of 32-bit tcp_nv: fix potential integer overflow in tcpnv_acked r8169: fix RTL8168EP take too long to complete driver initialization. qmi_wwan: Add support for Quectel EP06 rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK ipmr: Fix ptrdiff_t print formatting ibmvnic: Wait for device response when changing MAC qlcnic: fix deadlock bug tcp: release sk_frag.page in tcp_disconnect ipv4: Get the address of interface correctly. net_sched: gen_estimator: fix lockdep splat net: macb: Handle HRESP error net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring ipv6: addrconf: break critical section in addrconf_verify_rtnl() ipv6: change route cache aging logic i40e/i40evf: Update DESC_NEEDED value to reflect larger value bnxt_en: cleanup DIM work on device shutdown ...
| * \ \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-01-231-2/+3
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | en_rx_am.c was deleted in 'net-next' but had a bug fixed in it in 'net'. The esp{4,6}_offload.c conflicts were overlapping changes. The 'out' label is removed so we just return ERR_PTR(-EINVAL) directly. Signed-off-by: David S. Miller <davem@davemloft.net>
| * \ \ \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-01-173-3/+53
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Overlapping changes all over. The mini-qdisc bits were a little bit tricky, however. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | error-injection: Separate error-injection from kprobeMasami Hiramatsu2018-01-122-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since error-injection framework is not limited to be used by kprobes, nor bpf. Other kernel subsystems can use it freely for checking safeness of error-injection, e.g. livepatch, ftrace etc. So this separate error-injection framework from kprobes. Some differences has been made: - "kprobe" word is removed from any APIs/structures. - BPF_ALLOW_ERROR_INJECTION() is renamed to ALLOW_ERROR_INJECTION() since it is not limited for BPF too. - CONFIG_FUNCTION_ERROR_INJECTION is the config item of this feature. It is automatically enabled if the arch supports error injection feature for kprobe or ftrace etc. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* | | | | | Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds2018-01-301-1/+1
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Misc cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Remove unused IOMMU_STRESS Kconfig x86/extable: Mark exception handler functions visible x86/timer: Don't inline __const_udelay x86/headers: Remove duplicate #includes
| * | | | | | x86/timer: Don't inline __const_udelayAndi Kleen2018-01-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __const_udelay is marked inline, and LTO will happily inline it everywhere Dropping the inline saves ~44k text in a LTO build. 13999560 1740864 1499136 17239560 1070e08 vmlinux-with-udelay-inline 13954764 1736768 1499136 17190668 1064f0c vmlinux-wo-udelay-inline Inlining it has no advantage in general, so its the right thing to do. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20171222001821.2157-2-andi@firstfloor.org
* | | | | | | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-01-292-0/+57
|\ \ \ \ \ \ \ | | |_|_|_|/ / | |/| | | | / | |_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/pti updates from Thomas Gleixner: "Another set of melted spectrum related changes: - Code simplifications and cleanups for RSB and retpolines. - Make the indirect calls in KVM speculation safe. - Whitelist CPUs which are known not to speculate from Meltdown and prepare for the new CPUID flag which tells the kernel that a CPU is not affected. - A less rigorous variant of the module retpoline check which merily warns when a non-retpoline protected module is loaded and reflects that fact in the sysfs file. - Prepare for Indirect Branch Prediction Barrier support. - Prepare for exposure of the Speculation Control MSRs to guests, so guest OSes which depend on those "features" can use them. Includes a blacklist of the broken microcodes. The actual exposure of the MSRs through KVM is still being worked on" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation: Simplify indirect_branch_prediction_barrier() x86/retpoline: Simplify vmexit_fill_RSB() x86/cpufeatures: Clean up Spectre v2 related CPUID flags x86/cpu/bugs: Make retpoline module warning conditional x86/bugs: Drop one "mitigation" from dmesg x86/nospec: Fix header guards names x86/alternative: Print unadorned pointers x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown x86/msr: Add definitions for new speculation control MSRs x86/cpufeatures: Add AMD feature bits for Speculation Control x86/cpufeatures: Add Intel feature bits for Speculation Control x86/cpufeatures: Add CPUID_7_EDX CPUID leaf module/retpoline: Warn about missing retpoline in module KVM: VMX: Make indirect call speculation safe KVM: x86: Make indirect calls in emulator speculation safe
| * | | | | x86/retpoline: Simplify vmexit_fill_RSB()Borislav Petkov2018-01-272-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simplify it to call an asm-function instead of pasting 41 insn bytes at every call site. Also, add alignment to the macro as suggested here: https://support.google.com/faqs/answer/7625886 [dwmw2: Clean up comments, let it clobber %ebx and just tell the compiler] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ak@linux.intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1517070274-12128-3-git-send-email-dwmw@amazon.co.uk
* | | | | | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-01-281-1/+0
|\ \ \ \ \ \ | |/ / / / / | | | | | / | |_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 retpoline fixlet from Thomas Gleixner: "Remove the ESP/RSP thunks for retpoline as they cannot ever work. Get rid of them before they show up in a release" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/retpoline: Remove the esp/rsp thunk
| * | | | x86/retpoline: Remove the esp/rsp thunkWaiman Long2018-01-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It doesn't make sense to have an indirect call thunk with esp/rsp as retpoline code won't work correctly with the stack pointer register. Removing it will help compiler writers to catch error in case such a thunk call is emitted incorrectly. Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support") Suggested-by: Jeff Law <law@redhat.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Kees Cook <keescook@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1516658974-27852-1-git-send-email-longman@redhat.com
* | | | | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-01-211-2/+3
|\ \ \ \ \ | |/ / / / | | | | / | |_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 pti fixes from Thomas Gleixner: "A small set of fixes for the meltdown/spectre mitigations: - Make kprobes aware of retpolines to prevent probes in the retpoline thunks. - Make the machine check exception speculation protected. MCE used to issue an indirect call directly from the ASM entry code. Convert that to a direct call into a C-function and issue the indirect call from there so the compiler can add the retpoline protection, - Make the vmexit_fill_RSB() assembly less stupid - Fix a typo in the PTI documentation" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/retpoline: Optimize inline assembler for vmexit_fill_RSB x86/pti: Document fix wrong index kprobes/x86: Disable optimizing on the function jumps to indirect thunk kprobes/x86: Blacklist indirect thunk functions for kprobes retpoline: Introduce start/end markers of indirect thunk x86/mce: Make machine check speculation protected
| * | | kprobes/x86: Blacklist indirect thunk functions for kprobesMasami Hiramatsu2018-01-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark __x86_indirect_thunk_* functions as blacklist for kprobes because those functions can be called from anywhere in the kernel including blacklist functions of kprobes. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/151629209111.10241.5444852823378068683.stgit@devbox
| * | | retpoline: Introduce start/end markers of indirect thunkMasami Hiramatsu2018-01-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce start/end markers of __x86_indirect_thunk_* functions. To make it easy, consolidate .text.__x86.indirect_thunk.* sections to one .text.__x86.indirect_thunk section and put it in the end of kernel text section and adds __indirect_thunk_start/end so that other subsystem (e.g. kprobes) can identify it. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/151629206178.10241.6828804696410044771.stgit@devbox
* | | | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-01-143-3/+53
|\ \ \ \ | |/ / / | | | / | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 pti updates from Thomas Gleixner: "This contains: - a PTI bugfix to avoid setting reserved CR3 bits when PCID is disabled. This seems to cause issues on a virtual machine at least and is incorrect according to the AMD manual. - a PTI bugfix which disables the perf BTS facility if PTI is enabled. The BTS AUX buffer is not globally visible and causes the CPU to fault when the mapping disappears on switching CR3 to user space. A full fix which restores BTS on PTI is non trivial and will be worked on. - PTI bugfixes for EFI and trusted boot which make sure that the user space visible page table entries have the NX bit cleared - removal of dead code in the PTI pagetable setup functions - add PTI documentation - add a selftest for vsyscall to verify that the kernel actually implements what it advertises. - a sysfs interface to expose vulnerability and mitigation information so there is a coherent way for users to retrieve the status. - the initial spectre_v2 mitigations, aka retpoline: + The necessary ASM thunk and compiler support + The ASM variants of retpoline and the conversion of affected ASM code + Make LFENCE serializing on AMD so it can be used as speculation trap + The RSB fill after vmexit - initial objtool support for retpoline As I said in the status mail this is the most of the set of patches which should go into 4.15 except two straight forward patches still on hold: - the retpoline add on of LFENCE which waits for ACKs - the RSB fill after context switch Both should be ready to go early next week and with that we'll have covered the major holes of spectre_v2 and go back to normality" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits) x86,perf: Disable intel_bts when PTI security/Kconfig: Correct the Documentation reference for PTI x86/pti: Fix !PCID and sanitize defines selftests/x86: Add test_vsyscall x86/retpoline: Fill return stack buffer on vmexit x86/retpoline/irq32: Convert assembler indirect jumps x86/retpoline/checksum32: Convert assembler indirect jumps x86/retpoline/xen: Convert Xen hypercall indirect jumps x86/retpoline/hyperv: Convert assembler indirect jumps x86/retpoline/ftrace: Convert ftrace assembler indirect jumps x86/retpoline/entry: Convert entry assembler indirect jumps x86/retpoline/crypto: Convert crypto assembler indirect jumps x86/spectre: Add boot time option to select Spectre v2 mitigation x86/retpoline: Add initial retpoline support objtool: Allow alternatives to be ignored objtool: Detect jumps to retpoline thunks x86/pti: Make unpoison of pgd for trusted boot work for real x86/alternatives: Fix optimize_nops() checking sysfs/cpu: Fix typos in vulnerability documentation x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC ...
| * | x86/retpoline/checksum32: Convert assembler indirect jumpsDavid Woodhouse2018-01-121-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert all indirect jumps in 32bit checksum assembler code to use non-speculative sequences when CONFIG_RETPOLINE is enabled. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-11-git-send-email-dwmw@amazon.co.uk
| * | x86/retpoline: Add initial retpoline supportDavid Woodhouse2018-01-122-0/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide the corresponding thunks. Provide assembler macros for invoking the thunks in the same way that GCC does, from native and inline assembler. This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In some circumstances, IBRS microcode features may be used instead, and the retpoline can be disabled. On AMD CPUs if lfence is serialising, the retpoline can be dramatically simplified to a simple "lfence; jmp *\reg". A future patch, after it has been verified that lfence really is serialising in all circumstances, can enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition to X86_FEATURE_RETPOLINE. Do not align the retpoline in the altinstr section, because there is no guarantee that it stays aligned when it's copied over the oldinstr during alternative patching. [ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks] [ tglx: Put actual function CALL/JMP in front of the macros, convert to symbolic labels ] [ dwmw2: Convert back to numeric labels, merge objtool fixes ] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.uk
* | | Merge branch 'WIP.x86-pti.entry-for-linus' of ↵Linus Torvalds2017-12-181-2/+2
|\ \ \ | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 syscall entry code changes for PTI from Ingo Molnar: "The main changes here are Andy Lutomirski's changes to switch the x86-64 entry code to use the 'per CPU entry trampoline stack'. This, besides helping fix KASLR leaks (the pending Page Table Isolation (PTI) work), also robustifies the x86 entry code" * 'WIP.x86-pti.entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) x86/cpufeatures: Make CPU bugs sticky x86/paravirt: Provide a way to check for hypervisors x86/paravirt: Dont patch flush_tlb_single x86/entry/64: Make cpu_entry_area.tss read-only x86/entry: Clean up the SYSENTER_stack code x86/entry/64: Remove the SYSENTER stack canary x86/entry/64: Move the IST stacks into struct cpu_entry_area x86/entry/64: Create a per-CPU SYSCALL entry trampoline x86/entry/64: Return to userspace from the trampoline stack x86/entry/64: Use a per-CPU trampoline stack for IDT entries x86/espfix/64: Stop assuming that pt_regs is on the entry stack x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0 x86/entry: Remap the TSS into the CPU entry area x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct x86/dumpstack: Handle stack overflow on all stacks x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss x86/kasan/64: Teach KASAN about the cpu_entry_area x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area x86/entry/gdt: Put per-CPU GDT remaps in ascending order x86/dumpstack: Add get_stack_info() support for the SYSENTER stack ...
| * | x86/entry/64: Make cpu_entry_area.tss read-onlyAndy Lutomirski2017-12-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The TSS is a fairly juicy target for exploits, and, now that the TSS is in the cpu_entry_area, it's no longer protected by kASLR. Make it read-only on x86_64. On x86_32, it can't be RO because it's written by the CPU during task switches, and we use a task gate for double faults. I'd also be nervous about errata if we tried to make it RO even on configurations without double fault handling. [ tglx: AMD confirmed that there is no problem on 64-bit with TSS RO. So it's probably safe to assume that it's a non issue, though Intel might have been creative in that area. Still waiting for confirmation. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bpetkov@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.733700132@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | x86/decoder: Fix and update the opcodes mapRandy Dunlap2017-12-151-2/+11
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update x86-opcode-map.txt based on the October 2017 Intel SDM publication. Fix INVPID to INVVPID. Add UD0 and UD1 instruction opcodes. Also sync the objtool and perf tooling copies of this file. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/aac062d7-c0f6-96e3-5c92-ed299e2bd3da@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | x86/decoder: Add new TEST instruction patternMasami Hiramatsu2017-11-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kbuild test robot reported this build warning: Warning: arch/x86/tools/test_get_len found difference at <jump_table>:ffffffff8103dd2c Warning: ffffffff8103dd82: f6 09 d8 testb $0xd8,(%rcx) Warning: objdump says 3 bytes, but insn_get_length() says 2 Warning: decoded and checked 1569014 instructions with 1 warnings This sequence seems to be a new instruction not in the opcode map in the Intel SDM. The instruction sequence is "F6 09 d8", means Group3(F6), MOD(00)REG(001)RM(001), and 0xd8. Intel SDM vol2 A.4 Table A-6 said the table index in the group is "Encoding of Bits 5,4,3 of the ModR/M Byte (bits 2,1,0 in parenthesis)" In that table, opcodes listed by the index REG bits as: 000 001 010 011 100 101 110 111 TEST Ib/Iz,(undefined),NOT,NEG,MUL AL/rAX,IMUL AL/rAX,DIV AL/rAX,IDIV AL/rAX So, it seems TEST Ib is assigned to 001. Add the new pattern. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | x86/umip: Fix insn_get_code_seg_params()'s return valueBorislav Petkov2017-11-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to save on redundant structs definitions insn_get_code_seg_params() was made to return two 4-bit values in a char but clang complains: arch/x86/lib/insn-eval.c:780:10: warning: implicit conversion from 'int' to 'char' changes value from 132 to -124 [-Wconstant-conversion] return INSN_CODE_SEG_PARAMS(4, 8); ~~~~~~ ^~~~~~~~~~~~~~~~~~~~~~~~~~ ./arch/x86/include/asm/insn-eval.h:16:57: note: expanded from macro 'INSN_CODE_SEG_PARAMS' #define INSN_CODE_SEG_PARAMS(oper_sz, addr_sz) (oper_sz | (addr_sz << 4)) Those two values do get picked apart afterwards the opposite way of how they were ORed so wrt to the LSByte, the return value is the same. But this function returns -EINVAL in the error case, which is an int. So make it return an int which is the native word size anyway and thus fix the clang warning. Reported-by: Kees Cook <keescook@google.com> Reported-by: Nick Desaulniers <nick.desaulniers@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ricardo.neri-calderon@linux.intel.com Link: https://lkml.kernel.org/r/20171123091951.1462-1-bp@alien8.de
* | Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds2017-11-132-1/+1365
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Ingo Molnar: "Note that in this cycle most of the x86 topics interacted at a level that caused them to be merged into tip:x86/asm - but this should be a temporary phenomenon, hopefully we'll back to the usual patterns in the next merge window. The main changes in this cycle were: Hardware enablement: - Add support for the Intel UMIP (User Mode Instruction Prevention) CPU feature. This is a security feature that disables certain instructions such as SGDT, SLDT, SIDT, SMSW and STR. (Ricardo Neri) [ Note that this is disabled by default for now, there are some smaller enhancements in the pipeline that I'll follow up with in the next 1-2 days, which allows this to be enabled by default.] - Add support for the AMD SEV (Secure Encrypted Virtualization) CPU feature, on top of SME (Secure Memory Encryption) support that was added in v4.14. (Tom Lendacky, Brijesh Singh) - Enable new SSE/AVX/AVX512 CPU features: AVX512_VBMI2, GFNI, VAES, VPCLMULQDQ, AVX512_VNNI, AVX512_BITALG. (Gayatri Kammela) Other changes: - A big series of entry code simplifications and enhancements (Andy Lutomirski) - Make the ORC unwinder default on x86 and various objtool enhancements. (Josh Poimboeuf) - 5-level paging enhancements (Kirill A. Shutemov) - Micro-optimize the entry code a bit (Borislav Petkov) - Improve the handling of interdependent CPU features in the early FPU init code (Andi Kleen) - Build system enhancements (Changbin Du, Masahiro Yamada) - ... plus misc enhancements, fixes and cleanups" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (118 commits) x86/build: Make the boot image generation less verbose selftests/x86: Add tests for the STR and SLDT instructions selftests/x86: Add tests for User-Mode Instruction Prevention x86/traps: Fix up general protection faults caused by UMIP x86/umip: Enable User-Mode Instruction Prevention at runtime x86/umip: Force a page fault when unable to copy emulated result to user x86/umip: Add emulation code for UMIP instructions x86/cpufeature: Add User-Mode Instruction Prevention definitions x86/insn-eval: Add support to resolve 16-bit address encodings x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode x86/insn-eval: Add wrapper function for 32 and 64-bit addresses x86/insn-eval: Add support to resolve 32-bit address encodings x86/insn-eval: Compute linear address in several utility functions resource: Fix resource_size.cocci warnings X86/KVM: Clear encryption attribute when SEV is active X86/KVM: Decrypt shared per-cpu variables when SEV is active percpu: Introduce DEFINE_PER_CPU_DECRYPTED x86: Add support for changing memory encryption attribute in early boot x86/io: Unroll string I/O when SEV is active x86/boot: Add early boot support when running with SEV active ...
| * | x86/insn-eval: Add support to resolve 16-bit address encodingsRicardo Neri2017-11-081-1/+212
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tasks running in virtual-8086 mode, in protected mode with code segment descriptors that specify 16-bit default address sizes via the D bit, or via an address override prefix will use 16-bit addressing form encodings as described in the Intel 64 and IA-32 Architecture Software Developer's Manual Volume 2A Section 2.1.5, Table 2-1. 16-bit addressing encodings differ in several ways from the 32-bit/64-bit addressing form encodings: ModRM.rm points to different registers and, in some cases, effective addresses are indicated by the addition of the value of two registers. Also, there is no support for SIB bytes. Thus, a separate function is needed to parse this form of addressing. Three functions are introduced. get_reg_offset_16() obtains the offset from the base of pt_regs of the registers indicated by the ModRM byte of the address encoding. get_eff_addr_modrm_16() computes the effective address from the value of the register operands. get_addr_ref_16() computes the linear address using the obtained effective address and the base address of the segment. Segment limits are enforced when running in protected mode. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: ricardo.neri@intel.com Link: http://lkml.kernel.org/r/1509935277-22138-6-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/insn-eval: Handle 32-bit address encodings in virtual-8086 modeRicardo Neri2017-11-081-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible to utilize 32-bit address encodings in virtual-8086 mode via an address override instruction prefix. However, the range of the effective address is still limited to [0x-0xffff]. In such a case, return error. Also, linear addresses in virtual-8086 mode are limited to 20 bits. Enforce such limit by truncating the most significant bytes of the computed linear address. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: ricardo.neri@intel.com Link: http://lkml.kernel.org/r/1509935277-22138-5-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/insn-eval: Add wrapper function for 32 and 64-bit addressesRicardo Neri2017-11-081-5/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function insn_get_addr_ref() is capable of handling only 64-bit addresses. A previous commit introduced a function to handle 32-bit addresses. Invoke these two functions from a third wrapper function that calls the appropriate routine based on the address size specified in the instruction structure (obtained by looking at the code segment default address size and the address override prefix, if present). While doing this, rename the original function insn_get_addr_ref() with the more appropriate name get_addr_ref_64(), ensure it is only used for 64-bit addresses. Also, since 64-bit addresses are not possible in 32-bit builds, provide a dummy function such case. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: ricardo.neri@intel.com Link: http://lkml.kernel.org/r/1509935277-22138-4-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/insn-eval: Add support to resolve 32-bit address encodingsRicardo Neri2017-11-081-6/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 32-bit and 64-bit address encodings are identical. Thus, the same logic could be used to resolve the effective address. However, there are two key differences: address size and enforcement of segment limits. If running a 32-bit process on a 64-bit kernel, it is best to perform the address calculation using 32-bit data types. In this manner hardware is used for the arithmetic, including handling of signs and overflows. 32-bit addresses are generally used in protected mode; segment limits are enforced in this mode. This implementation obtains the limit of the segment associated with the instruction operands and prefixes. If the computed address is outside the segment limits, an error is returned. It is also possible to use 32-bit address in long mode and virtual-8086 mode by using an address override prefix. In such cases, segment limits are not enforced. Support to use 32-bit arithmetic is added to the utility functions that compute effective addresses. However, the end result is stored in a variable of type long (which has a width of 8 bytes in 64-bit builds). Hence, once a 32-bit effective address is computed, the 4 most significant bytes are masked out to avoid sign extension. The newly added function get_addr_ref_32() is almost identical to the existing function insn_get_addr_ref() (used for 64-bit addresses). The only difference is that it verifies that the effective address is within the limits of the segment. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: ricardo.neri@intel.com Link: http://lkml.kernel.org/r/1509935277-22138-3-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/insn-eval: Compute linear address in several utility functionsRicardo Neri2017-11-081-58/+185
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Computing a linear address involves several steps. The first step is to compute the effective address. This requires determining the addressing mode in use and perform arithmetic operations on the operands. Plus, each addressing mode has special cases that must be handled. Once the effective address is known, the base address of the applicable segment is added to obtain the linear address. Clearly, this is too much work for a single function. Instead, handle each addressing mode in a separate utility function. This improves readability and gives us the opportunity to handler errors better. At the moment, arithmetic to compute the effective address uses 64-byte variables. Thus, limit support to 64-bit addresses. While reworking the function insn_get_addr_ref(), the variable addr_offset is renamed as regoff to reflect its actual use (i.e., offset, from the base of pt_regs, of the register used as operand). Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: ricardo.neri@intel.com Link: http://lkml.kernel.org/r/1509935277-22138-2-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | Merge branch 'x86/mpx' into x86/asm, to pick up dependent commitsIngo Molnar2017-11-082-1/+866
| |\ \ | | |/ | |/| | | | | | | | | | | | | The UMIP series is based on top of changes already queued up in the x86/mpx branch, so merge it. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * x86/insn-eval: Extend get_seg_base_addr() to also obtain segment limitRicardo Neri2017-11-021-8/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In protected mode, it is common to want to obtain the limit of a segment along with its base address. This is useful, for instance, to verify that an effective address lies within a segment before computing a linear address. Up to this point, this library only computes linear addresses in long mode. Subsequent patches will include support for protected mode. Support to verify the segment limit will be needed. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: ricardo.neri@intel.com Link: http://lkml.kernel.org/r/1509148310-30862-2-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * x86/insn-eval: Incorporate segment base in linear address computationRicardo Neri2017-11-011-3/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | insn_get_addr_ref() returns the effective address as defined by the section 3.7.5.1 Vol 1 of the Intel 64 and IA-32 Architectures Software Developer's Manual. In order to compute the linear address, we must add to the effective address the segment base address as set in the segment descriptor. The segment descriptor to use depends on the register used as operand and segment override prefixes, if any. In most cases, the segment base address will be 0 if the USER_DS/USER32_DS segment is used or if segmentation is not used. However, the base address is not necessarily zero if a user programs defines its own segments. This is possible by using a local descriptor table. Since the effective address is a signed quantity, the unsigned segment base address is saved in a separate variable and added to the final, unsigned, effective address. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: ricardo.neri@intel.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Colin Ian King <colin.king@canonical.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Garnier <thgarnie@google.com> Link: https://lkml.kernel.org/r/1509135945-13762-19-git-send-email-ricardo.neri-calderon@linux.intel.com
| | * x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm ↵Ricardo Neri2017-11-011-3/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | is 101b Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software Developer's Manual volume 2A states that when ModRM.mod is zero and ModRM.rm is 101b, a 32-bit displacement follows the ModRM byte. This means that none of the registers are used in the computation of the effective address. A return value of -EDOM indicates callers that they should not use the value of registers when computing the effective address for the instruction. In long mode, the effective address is given by the 32-bit displacement plus the location of the next instruction. In protected mode, only the displacement is used. The instruction decoder takes care of obtaining the displacement. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: ricardo.neri@intel.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Colin Ian King <colin.king@canonical.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Garnier <thgarnie@google.com> Link: https://lkml.kernel.org/r/1509135945-13762-18-git-send-email-ricardo.neri-calderon@linux.intel.com
| | * x86/insn-eval: Add function to get default params of code segmentRicardo Neri2017-11-011-0/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Obtain the default values of the address and operand sizes as specified in the D and L bits of the the segment descriptor selected by the register CS. The function can be used for both protected and long modes. For virtual-8086 mode, the default address and operand sizes are always 2 bytes. The returned parameters are encoded in a signed 8-bit data type. Auxiliar macros are provided to encode and decode such values. Improvements-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: ricardo.neri@intel.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Colin Ian King <colin.king@canonical.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Garnier <thgarnie@google.com> Link: https://lkml.kernel.org/r/1509135945-13762-17-git-send-email-ricardo.neri-calderon@linux.intel.com
| | * x86/insn-eval: Add utility functions to get segment descriptor base address ↵Ricardo Neri2017-11-011-0/+114
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and limit With segmentation, the base address of the segment is needed to compute a linear address. This base address is obtained from the applicable segment descriptor. Such segment descriptor is referenced from a segment selector. These new functions obtain the segment base and limit of the segment selector indicated by segment register index given as argument. This index is any of the INAT_SEG_REG_* family of #define's. The logic to obtain the segment selector is wrapped in the function get_segment_selector() with the inputs described above. Once the selector is known, the base address is determined. In protected mode, the selector is used to obtain the segment descriptor and then its base address. In long mode, the segment base address is zero except when FS or GS are used. In virtual-8086 mode, the base address is computed as the value of the segment selector shifted 4 positions to the left. In protected mode, segment limits are enforced. Thus, a function to determine the limit of the segment is added. Segment limits are not enforced in long or virtual-8086. For the latter, addresses are limited to 20 bits; address size will be handled when computing the linear address. Improvements-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: ricardo.neri@intel.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Colin Ian King <colin.king@canonical.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Garnier <thgarnie@google.com> Link: https://lkml.kernel.org/r/1509135945-13762-16-git-send-email-ricardo.neri-calderon@linux.intel.com
| | * x86/insn-eval: Add utility function to get segment descriptorRicardo Neri2017-11-011-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The segment descriptor contains information that is relevant to how linear addresses need to be computed. It contains the default size of addresses as well as the base address of the segment. Thus, given a segment selector, we ought to look at segment descriptor to correctly calculate the linear address. In protected mode, the segment selector might indicate a segment descriptor from either the global descriptor table or a local descriptor table. Both cases are considered in this function. This function is a prerequisite for functions in subsequent commits that will obtain the aforementioned attributes of the segment descriptor. Improvements-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: ricardo.neri@intel.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Colin Ian King <colin.king@canonical.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Garnier <thgarnie@google.com> Link: https://lkml.kernel.org/r/1509135945-13762-15-git-send-email-ricardo.neri-calderon@linux.intel.com
OpenPOWER on IntegriCloud