summaryrefslogtreecommitdiffstats
path: root/arch/x86
Commit message (Collapse)AuthorAgeFilesLines
* acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggableTang Chen2014-01-211-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At very early time, the kernel have to use some memory such as loading the kernel image. We cannot prevent this anyway. So any node the kernel resides in should be un-hotpluggable. Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* acpi, numa, mem_hotplug: mark hotpluggable memory in memblockTang Chen2014-01-212-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When parsing SRAT, we know that which memory area is hotpluggable. So we invoke function memblock_mark_hotplug() introduced by previous patch to mark hotpluggable memory in memblock. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* memblock: make memblock_set_node() support different memblock_typeTang Chen2014-01-213-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [sfr@canb.auug.org.au: fix powerpc build] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'driver-core-3.14-rc1' of ↵Linus Torvalds2014-01-202-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core / sysfs patches from Greg KH: "Here's the big driver core and sysfs patch set for 3.14-rc1. There's a lot of work here moving sysfs logic out into a "kernfs" to allow other subsystems to also have a virtual filesystem with the same attributes of sysfs (handle device disconnect, dynamic creation / removal as needed / unneeded, etc) This is primarily being done for the cgroups filesystem, but the goal is to also move debugfs to it when it is ready, solving all of the known issues in that filesystem as well. The code isn't completed yet, but all should be stable now (there is a big section that was reverted due to problems found when testing) There's also some other smaller fixes, and a driver core addition that allows for a "collection" of objects, that the DRM people will be using soon (it's in this tree to make merges after -rc1 easier) All of this has been in linux-next with no reported issues" * tag 'driver-core-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (113 commits) kernfs: associate a new kernfs_node with its parent on creation kernfs: add struct dentry declaration in kernfs.h kernfs: fix get_active failure handling in kernfs_seq_*() Revert "kernfs: fix get_active failure handling in kernfs_seq_*()" Revert "kernfs: replace kernfs_node->u.completion with kernfs_root->deactivate_waitq" Revert "kernfs: remove KERNFS_ACTIVE_REF and add kernfs_lockdep()" Revert "kernfs: remove KERNFS_REMOVED" Revert "kernfs: restructure removal path to fix possible premature return" Revert "kernfs: invoke kernfs_unmap_bin_file() directly from __kernfs_remove()" Revert "kernfs: remove kernfs_addrm_cxt" Revert "kernfs: make kernfs_get_active() block if the node is deactivated but not removed" Revert "kernfs: implement kernfs_{de|re}activate[_self]()" Revert "kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers" Revert "pci: use device_remove_file_self() instead of device_schedule_callback()" Revert "scsi: use device_remove_file_self() instead of device_schedule_callback()" Revert "s390: use device_remove_file_self() instead of device_schedule_callback()" Revert "sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()" Revert "kernfs: remove unnecessary NULL check in __kernfs_remove()" kernfs: remove unnecessary NULL check in __kernfs_remove() drivers/base: provide an infrastructure for componentised subsystems ...
| * Merge 3.13-rc5 into staging-nextGreg Kroah-Hartman2013-12-2414-77/+83
| |\ | | | | | | | | | | | | | | | We want these fixes here to handle some merge issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * \ Merge branch 'driver-core-linus' into driver-core-nextTejun Heo2013-12-109-15/+43
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a8b14744429f ("sysfs: give different locking key to regular and bin files") in driver-core-linus modifies sysfs_open_file() so that it gives out different locking classes to sysfs_open_files depending on whether the file is bin or not. Due to the massive kernfs reorganization in driver-core-next, this naturally causes merge conflict in fs/sysfs/file.c. Due to the way things are split between kernfs and sysfs in driver-core-next, the same fix can't easily be applied to driver-core-next. This merge simply ignores the offending commit. A following patch will implement a separate fix for the issue. Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | microcode: Use request_firmware_direct()Takashi Iwai2013-12-082-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the new helper, request_firmware_direct(), for avoiding the lengthy timeout of non-existing firmware loads. Especially the Intel microcode driver suffers from this problem because each CPU triggers the f/w loading, thus it ends up taking (literally) hours with many cores. Tested-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | Merge branch 'x86-x32-for-linus' of ↵Linus Torvalds2014-01-201-20/+22
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x32 uapi changes from Peter Anvin: "This is the first few of a set of patches by H.J. Lu to make the kernel uapi headers usable for x32, as required by some non-glibc libcs. These particular patches make the stat and statfs structures usable" * 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, x32: Use __kernel_long_t for __statfs_word x86, x32: Use __kernel_long_t/__kernel_ulong_t in x86-64 stat.h
| * | | | x86, x32: Use __kernel_long_t/__kernel_ulong_t in x86-64 stat.hH.J. Lu2013-12-201-20/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both x32 and x86-64 use the same stat system call interface. But x32 long is 32-bit. This patch changes x86 uapi <asm/stat.h> to use __kernel_long_t/__kernel_ulong_t in x86-64 stat. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Link: http://lkml.kernel.org/r/CAMe9rOquPtWEro0GQ=Z95pZJ=c7GGkSHynjN4FbiB4p445x-Ng@mail.gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | | | Merge branch 'x86/mpx' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2014-01-207-26/+133
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull x86 cpufeature and mpx updates from Peter Anvin: "This includes the basic infrastructure for MPX (Memory Protection Extensions) support, but does not include MPX support itself. It is, however, a prerequisite for KVM support for MPX, which I believe will be pushed later this merge window by the KVM team. This includes moving the functionality in futex_atomic_cmpxchg_inatomic() into a new function in uaccess.h so it can be reused - this will be used by the final MPX patches. The actual MPX functionality (map management and so on) will be pushed in a future merge window, when ready" * 'x86/mpx' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/intel/mpx: Remove unused LWP structure x86, mpx: Add MPX related opcodes to the x86 opcode map x86: replace futex_atomic_cmpxchg_inatomic() with user_atomic_cmpxchg_inatomic x86: add user_atomic_cmpxchg_inatomic at uaccess.h x86, xsave: Support eager-only xsave features, add MPX support x86, cpufeature: Define the Intel MPX feature flag
| * | | | | x86/intel/mpx: Remove unused LWP structureIngo Molnar2014-01-201-8/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't support LWP yet, don't give the impression that we do: represent the LWP state as opaque 128 bytes, the way Linux sees it currently. Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-ecarmjtfKpanpAapfck6dj6g@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86, mpx: Add MPX related opcodes to the x86 opcode mapQiaowei Ren2014-01-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds all the MPX instructions to x86 opcode map, so the x86 instruction decoder can decode MPX instructions. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Link: http://lkml.kernel.org/r/1389518403-7715-4-git-send-email-qiaowei.ren@intel.com Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86: replace futex_atomic_cmpxchg_inatomic() with user_atomic_cmpxchg_inatomicQiaowei Ren2013-12-161-20/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | futex_atomic_cmpxchg_inatomic() is simply the 32-bit implementation of user_atomic_cmpxchg_inatomic(), which in turn is simply a generalization of the original code in futex_atomic_cmpxchg_inatomic(). Use the newly generalized user_atomic_cmpxchg_inatomic() as the futex implementation, too. [ hpa: retain the inline in futex.h rather than changing it to a macro ] Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Link: http://lkml.kernel.org/r/1387002303-6620-2-git-send-email-qiaowei.ren@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
| * | | | | x86: add user_atomic_cmpxchg_inatomic at uaccess.hQiaowei Ren2013-12-161-0/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds user_atomic_cmpxchg_inatomic() to use CMPXCHG instruction against a user space address. This generalizes the already existing futex_atomic_cmpxchg_inatomic() so it can be used in other contexts. This will be used in the upcoming support for Intel MPX (Memory Protection Extensions.) [ hpa: replaced #ifdef inside a macro with IS_ENABLED() ] Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Link: http://lkml.kernel.org/r/1387002303-6620-1-git-send-email-qiaowei.ren@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
| * | | | | x86, xsave: Support eager-only xsave features, add MPX supportQiaowei Ren2013-12-063-4/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some features, like Intel MPX, work only if the kernel uses eagerfpu model. So we should force eagerfpu on unless the user has explicitly disabled it. Add definitions for Intel MPX and add it to the supported list. [ hpa: renamed XSTATE_FLEXIBLE to XSTATE_LAZY and added comments ] Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Link: http://lkml.kernel.org/r/9E0BE1322F2F2246BD820DA9FC397ADE014A6115@SHSMSX102.ccr.corp.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, cpufeature: Define the Intel MPX feature flagQiaowei Ren2013-12-061-0/+1
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define the Intel MPX (Memory Protection Extensions) CPU feature flag in the cpufeature list. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Link: http://lkml.kernel.org/r/1386375658-2191-2-git-send-email-qiaowei.ren@intel.com Signed-off-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | | | Merge branch 'x86-kaslr-for-linus' of ↵Linus Torvalds2014-01-2021-158/+650
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 kernel address space randomization support from Peter Anvin: "This enables kernel address space randomization for x86" * 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, kaslr: Clarify RANDOMIZE_BASE_MAX_OFFSET x86, kaslr: Remove unused including <linux/version.h> x86, kaslr: Use char array to gain sizeof sanity x86, kaslr: Add a circular multiply for better bit diffusion x86, kaslr: Mix entropy sources together as needed x86/relocs: Add percpu fixup for GNU ld 2.23 x86, boot: Rename get_flags() and check_flags() to *_cpuflags() x86, kaslr: Raise the maximum virtual address to -1 GiB on x86_64 x86, kaslr: Report kernel offset on panic x86, kaslr: Select random position from e820 maps x86, kaslr: Provide randomness functions x86, kaslr: Return location from decompress_kernel x86, boot: Move CPU flags out of cpucheck x86, relocs: Add more per-cpu gold special cases
| * | | | | x86, kaslr: Clarify RANDOMIZE_BASE_MAX_OFFSETKees Cook2014-01-141-11/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The help text for RANDOMIZE_BASE_MAX_OFFSET was confusing. This has been clarified, and updated to be an export-only tunable. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20131210202745.GA2961@www.outflux.net Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, kaslr: Remove unused including <linux/version.h>Wei Yongjun2014-01-141-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove including <linux/version.h> that don't need it. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Link: http://lkml.kernel.org/r/CAPgLHd-Fjx1RybjWFAu1vHRfTvhWwMLL3x46BouC5uNxHPjy1A@mail.gmail.com Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, kaslr: Use char array to gain sizeof sanityKees Cook2013-11-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The build_str needs to be char [] not char * for the sizeof() to report the string length. Reported-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20131112165607.GA5921@www.outflux.net Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | x86, kaslr: Add a circular multiply for better bit diffusionH. Peter Anvin2013-11-111-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we don't have RDRAND (in which case nothing else *should* matter), most sources have a highly biased entropy distribution. Use a circular multiply to diffuse the entropic bits. A circular multiply is a good operation for this: it is cheap on standard hardware and because it is symmetric (unlike an ordinary multiply) it doesn't introduce its own bias. Cc: Kees Cook <keescook@chromium.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/20131111222839.GA28616@www.outflux.net
| * | | | | x86, kaslr: Mix entropy sources together as neededKees Cook2013-11-112-22/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Depending on availability, mix the RDRAND and RDTSC entropy together with XOR. Only when neither is available should the i8254 be used. Update the Kconfig documentation to reflect this. Additionally, since bits used for entropy is masked elsewhere, drop the needless masking in the get_random_long(). Similarly, use the entire TSC, not just the low 32 bits. Finally, to improve the starting entropy, do a simple hashing of a build-time versions string and the boot-time boot_params structure for some additional level of unpredictability. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20131111222839.GA28616@www.outflux.net Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | x86/relocs: Add percpu fixup for GNU ld 2.23Kees Cook2013-10-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GNU linker tries to put __per_cpu_load into the percpu area, resulting in a lack of its relocation. Force this symbol to be relocated. Seen starting with GNU ld 2.23 and later. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Michael Davidson <md@google.com> Cc: Cong Ding <dinggnu@gmail.com> Link: http://lkml.kernel.org/r/20131016064314.GA2739@www.outflux.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86, boot: Rename get_flags() and check_flags() to *_cpuflags()H. Peter Anvin2013-10-134-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a function is used in more than one file it may not be possible to immediately tell from context what the intended meaning is. As such, it is more important that the naming be self-evident. Thus, change get_flags() to get_cpuflags(). For consistency, change check_flags() to check_cpuflags() even though it is only used in cpucheck.c. Link: http://lkml.kernel.org/r/1381450698-28710-2-git-send-email-keescook@chromium.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, kaslr: Raise the maximum virtual address to -1 GiB on x86_64Kees Cook2013-10-134-7/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 64-bit, this raises the maximum location to -1 GiB (from -1.5 GiB), the upper limit currently, since the kernel fixmap page mappings need to be moved to use the other 1 GiB (which would be the theoretical limit when building with -mcmodel=kernel). Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/1381450698-28710-7-git-send-email-keescook@chromium.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, kaslr: Report kernel offset on panicKees Cook2013-10-131-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the system panics, include the kernel offset in the report to assist in debugging. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/1381450698-28710-6-git-send-email-keescook@chromium.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, kaslr: Select random position from e820 mapsKees Cook2013-10-133-9/+202
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Counts available alignment positions across all e820 maps, and chooses one randomly for the new kernel base address, making sure not to collide with unsafe memory areas. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/1381450698-28710-5-git-send-email-keescook@chromium.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, kaslr: Provide randomness functionsKees Cook2013-10-134-14/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds potential sources of randomness: RDRAND, RDTSC, or the i8254. This moves the pre-alternatives inline rdrand function into the header so both pieces of code can use it. Availability of RDRAND is then controlled by CONFIG_ARCH_RANDOM, if someone wants to disable it even for kASLR. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/1381450698-28710-4-git-send-email-keescook@chromium.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, kaslr: Return location from decompress_kernelKees Cook2013-10-138-24/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows decompress_kernel to return a new location for the kernel to be relocated to. Additionally, enforces CONFIG_PHYSICAL_START as the minimum relocation position when building with CONFIG_RELOCATABLE. With CONFIG_RANDOMIZE_BASE set, the choose_kernel_location routine will select a new location to decompress the kernel, though here it is presently a no-op. The kernel command line option "nokaslr" is introduced to bypass these routines. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/1381450698-28710-3-git-send-email-keescook@chromium.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, boot: Move CPU flags out of cpucheckKees Cook2013-10-137-97/+138
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor the CPU flags handling out of the cpucheck routines so that they can be reused by the future ASLR routines (in order to detect CPU features like RDRAND and RDTSC). This reworks has_eflag() and has_fpu() to be used on both 32-bit and 64-bit, and refactors the calls to cpuid to make them PIC-safe on 32-bit. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/1381450698-28710-2-git-send-email-keescook@chromium.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, relocs: Add more per-cpu gold special casesMichael Davidson2013-10-131-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "gold" linker doesn't seem to put some additional per-cpu cases in the right place. Add these to the per-cpu check. Without this, the kASLR patch series fails to correctly apply relocations, and fails to boot. Signed-off-by: Michael Davidson <md@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20131011013954.GA28902@www.outflux.net Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | | | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2014-01-205-0/+88
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull leftover x86 fixes from Ingo Molnar: "Two leftover fixes that did not make it into v3.13" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Add check for number of available vectors before CPU down x86, cpu, amd: Add workaround for family 16h, erratum 793
| * | | | | | x86: Add check for number of available vectors before CPU downPrarit Bhargava2014-01-153-0/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791 When a cpu is downed on a system, the irqs on the cpu are assigned to other cpus. It is possible, however, that when a cpu is downed there aren't enough free vectors on the remaining cpus to account for the vectors from the cpu that is being downed. This results in an interesting "overflow" condition where irqs are "assigned" to a CPU but are not handled. For example, when downing cpus on a 1-64 logical processor system: <snip> [ 232.021745] smpboot: CPU 61 is now offline [ 238.480275] smpboot: CPU 62 is now offline [ 245.991080] ------------[ cut here ]------------ [ 245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250() [ 246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out [ 246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas [ 246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14 [ 246.044451] Hardware name: Intel Corporation S4600LH ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 [ 246.057371] 0000000000000009 ffff88081fa03d40 ffffffff8164fbf6 ffff88081fa0ee48 [ 246.065728] ffff88081fa03d90 ffff88081fa03d80 ffffffff81054ecc ffff88081fa13040 [ 246.074073] 0000000000000000 ffff88200cce0000 0000000000000040 0000000000000000 [ 246.082430] Call Trace: [ 246.085174] <IRQ> [<ffffffff8164fbf6>] dump_stack+0x46/0x58 [ 246.091633] [<ffffffff81054ecc>] warn_slowpath_common+0x8c/0xc0 [ 246.098352] [<ffffffff81054fb6>] warn_slowpath_fmt+0x46/0x50 [ 246.104786] [<ffffffff815710d6>] dev_watchdog+0x246/0x250 [ 246.110923] [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.119097] [<ffffffff8106092a>] call_timer_fn+0x3a/0x110 [ 246.125224] [<ffffffff8106280f>] ? update_process_times+0x6f/0x80 [ 246.132137] [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.140308] [<ffffffff81061db0>] run_timer_softirq+0x1f0/0x2a0 [ 246.146933] [<ffffffff81059a80>] __do_softirq+0xe0/0x220 [ 246.152976] [<ffffffff8165fedc>] call_softirq+0x1c/0x30 [ 246.158920] [<ffffffff810045f5>] do_softirq+0x55/0x90 [ 246.164670] [<ffffffff81059d35>] irq_exit+0xa5/0xb0 [ 246.170227] [<ffffffff8166062a>] smp_apic_timer_interrupt+0x4a/0x60 [ 246.177324] [<ffffffff8165f40a>] apic_timer_interrupt+0x6a/0x70 [ 246.184041] <EOI> [<ffffffff81505a1b>] ? cpuidle_enter_state+0x5b/0xe0 [ 246.191559] [<ffffffff81505a17>] ? cpuidle_enter_state+0x57/0xe0 [ 246.198374] [<ffffffff81505b5d>] cpuidle_idle_call+0xbd/0x200 [ 246.204900] [<ffffffff8100b7ae>] arch_cpu_idle+0xe/0x30 [ 246.210846] [<ffffffff810a47b0>] cpu_startup_entry+0xd0/0x250 [ 246.217371] [<ffffffff81646b47>] rest_init+0x77/0x80 [ 246.223028] [<ffffffff81d09e8e>] start_kernel+0x3ee/0x3fb [ 246.229165] [<ffffffff81d0989f>] ? repair_env_string+0x5e/0x5e [ 246.235787] [<ffffffff81d095a5>] x86_64_start_reservations+0x2a/0x2c [ 246.242990] [<ffffffff81d0969f>] x86_64_start_kernel+0xf8/0xfc [ 246.249610] ---[ end trace fb74fdef54d79039 ]--- [ 246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx timeout [ 246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter Last login: Mon Nov 11 08:35:14 from 10.18.17.119 [root@(none) ~]# [ 246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX [ 246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX (last lines keep repeating. ixgbe driver is dead until module reload.) If the downed cpu has more vectors than are free on the remaining cpus on the system, it is possible that some vectors are "orphaned" even though they are assigned to a cpu. In this case, since the ixgbe driver had a watchdog, the watchdog fired and notified that something was wrong. This patch adds a function, check_vectors(), to compare the number of vectors on the CPU going down and compares it to the number of vectors available on the system. If there aren't enough vectors for the CPU to go down, an error is returned and propogated back to userspace. v2: Do not need to look at percpu irqs v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest Priority Mode v4: Additional changes suggested by Gong Chen. v5/v6/v7/v8: Updated comment text Signed-off-by: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1389613861-3853-1-git-send-email-prarit@redhat.com Reviewed-by: Gong Chen <gong.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michel Lespinasse <walken@google.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Yang Zhang <yang.z.zhang@Intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Janet Morgan <janet.morgan@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Ruiv Wang <ruiv.wang@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>
| * | | | | | x86, cpu, amd: Add workaround for family 16h, erratum 793Borislav Petkov2014-01-142-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the workaround for erratum 793 as a precaution in case not every BIOS implements it. This addresses CVE-2013-6885. Erratum text: [Revision Guide for AMD Family 16h Models 00h-0Fh Processors, document 51810 Rev. 3.04 November 2013] 793 Specific Combination of Writes to Write Combined Memory Types and Locked Instructions May Cause Core Hang Description Under a highly specific and detailed set of internal timing conditions, a locked instruction may trigger a timing sequence whereby the write to a write combined memory type is not flushed, causing the locked instruction to stall indefinitely. Potential Effect on System Processor core hang. Suggested Workaround BIOS should set MSR C001_1020[15] = 1b. Fix Planned No fix planned [ hpa: updated description, fixed typo in MSR name ] Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20140114230711.GS29865@pd.tnic Tested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | | | | | Merge branch 'x86-ras-for-linus' of ↵Linus Torvalds2014-01-202-9/+17
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS changes from Ingo Molnar: - SCI reporting for other error types not only correctable ones - GHES cleanups - Add the functionality to override error reporting agents as some machines are sporting a new extended error logging capability which, if done properly in the BIOS, makes a corresponding EDAC module redundant - PCIe AER tracepoint severity levels fix - Error path correction for the mce device init - MCE timer fix - Add more flexibility to the error injection (EINJ) debugfs interface * 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, mce: Fix mce_start_timer semantics ACPI, APEI, GHES: Cleanup ghes memory error handling ACPI, APEI: Cleanup alignment-aware accesses ACPI, APEI, GHES: Do not report only correctable errors with SCI ACPI, APEI, EINJ: Changes to the ACPI/APEI/EINJ debugfs interface ACPI, eMCA: Combine eMCA/EDAC event reporting priority EDAC, sb_edac: Modify H/W event reporting policy EDAC: Add an edac_report parameter to EDAC PCI, AER: Fix severity usage in aer trace event x86, mce: Call put_device on device_register failure
| * \ \ \ \ \ \ Merge tag 'ras_for_3.14_p2' of ↵Ingo Molnar2014-01-122-8/+14
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/ras Pull RAS updates from Borislav Petkov: " SCI reporting for other error types not only correctable ones + APEI GHES cleanups + mce timer fix " Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | | | | | x86, mce: Fix mce_start_timer semanticsBorislav Petkov2014-01-121-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So mce_start_timer() has a 'cpu' argument which is supposed to mean to start a timer on that cpu. However, the code currently starts a timer on the *current* cpu the function runs on and causes the sanity-check in mce_timer_fn to fire: WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/mcheck/mce.c:1286 mce_timer_fn because it is running on the wrong cpu. This was triggered by Prarit Bhargava <prarit@redhat.com> by offlining all the cpus in succession. Then, we were fiddling with the CMCI storm settings when starting the timer whereas there's no need for that - if there's storm happening on this newly restarted cpu, we're going to be in normal CMCI mode initially and then when the CMCI interrupt starts firing, we're going to go to the polling mode with the timer real soon. Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Prarit Bhargava <prarit@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Reviewed-by: Chen, Gong <gong.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1387722156-5511-1-git-send-email-prarit@redhat.com
| | * | | | | | | ACPI, APEI, GHES: Do not report only correctable errors with SCIChen, Gong2013-12-211-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently SCI is employed to handle corrected errors - memory corrected errors, more specifically but in fact SCI still can be used to handle any errors, e.g. uncorrected or even fatal ones if enabled by the BIOS. Enable logging for those kinds of errors too. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1385363701-12387-1-git-send-email-gong.chen@linux.intel.com [ Boris: massage commit message, rename function arg. ] Signed-off-by: Borislav Petkov <bp@suse.de>
| * | | | | | | | Merge tag 'v3.13-rc8' into x86/ras, to pick up fixes.Ingo Molnar2014-01-128-12/+53
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * \ \ \ \ \ \ \ \ Merge tag 'ras_for_3.14' of ↵Ingo Molnar2013-12-161-1/+3
| |\ \ \ \ \ \ \ \ \ | | | |/ / / / / / / | | |/| | | | / / / | | |_|_|_|_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/ras Pull RAS updates from Borislav Petkov: * Add the functionality to override error reporting agents as some machines are sporting a new extended error logging capability which, if done properly in the BIOS, makes a corresponding EDAC module redundant, from Gong Chen. * PCIe AER tracepoint severity levels fix, from Rui Wang. * Error path correction for the mce device init, from Levente Kurusa. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | | | | | x86, mce: Call put_device on device_register failureLevente Kurusa2013-11-301-1/+3
| | | |_|_|/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a call to put_device() when the device_register() call has failed. This is required so that the last reference to the device is given up. Signed-off-by: Levente Kurusa <levex@linux.com> Link: http://lkml.kernel.org/r/5298F900.9000208@linux.com Signed-off-by: Borislav Petkov <bp@suse.de>
* | | | | | | | Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds2014-01-207-1/+466
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull Intel SoC changes from Ingo Molnar: "Improved Intel SoC platform support" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, tsc, apic: Unbreak static (MSR) calibration when CONFIG_X86_LOCAL_APIC=n x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs arch: x86: New MailBox support driver for Intel SOC's
| * | | | | | | | x86, tsc, apic: Unbreak static (MSR) calibration when CONFIG_X86_LOCAL_APIC=nH. Peter Anvin2014-01-161-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we aren't going to use the local APIC anyway, we obviously don't care about its timer frequency. Link: http://lkml.kernel.org/r/tip-rgm7xmg7k6qnjlw3ynkcjsmh@git.kernel.org Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Bin Gao <bin.gao@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | | x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCsBin Gao2014-01-154-1/+139
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On SoCs that have the calibration MSRs available, either there is no PIT, HPET or PMTIMER to calibrate against, or the PIT/HPET/PMTIMER is driven from the same clock as the TSC, so calibration is redundant and just slows down the boot. TSC rate is caculated by this formula: <maximum core-clock to bus-clock ratio> * <maximum resolved frequency> The ratio and the resolved frequency ID can be obtained from MSR. See Intel 64 and IA-32 System Programming Guid section 16.12 and 30.11.5 for details. Signed-off-by: Bin Gao <bin.gao@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/n/tip-rgm7xmg7k6qnjlw3ynkcjsmh@git.kernel.org
| * | | | | | | | arch: x86: New MailBox support driver for Intel SOC'sDavid E. Box2014-01-084-0/+325
| | |_|/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current Intel SOC cores use a MailBox Interface (MBI) to provide access to configuration registers on devices (called units) connected to the system fabric. This is a support driver that implements access to this interface on those platforms that can enumerate the device using PCI. Initial support is for BayTrail, for which port definitons are provided. This is a requirement for implementing platform specific features (e.g. RAPL driver requires this to perform platform specific power management using the registers in PUNIT). Dependant modules should select IOSF_MBI in their respective Kconfig configuraiton. Serialized access is handled by all exported routines with spinlocks. The API includes 3 functions for access to unit registers: int iosf_mbi_read(u8 port, u8 opcode, u32 offset, u32 *mdr) int iosf_mbi_write(u8 port, u8 opcode, u32 offset, u32 mdr) int iosf_mbi_modify(u8 port, u8 opcode, u32 offset, u32 mdr, u32 mask) port: indicating the unit being accessed opcode: the read or write port specific opcode offset: the register offset within the port mdr: the register data to be read, written, or modified mask: bit locations in mdr to change Returns nonzero on error Note: GPU code handles access to the GFX unit. Therefore access to that unit with this driver is disallowed to avoid conflicts. Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: http://lkml.kernel.org/r/1389216471-734-1-git-send-email-david.e.box@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Matthew Garrett <mjg59@srcf.ucam.org>
* | | | | | | | Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2014-01-205-54/+70
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm changes from Ingo Molnar: "A cleanup, a fix and ASLR support for hugetlb mappings" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/numa: Fix 32-bit kernel NUMA boot x86/mm: Implement ASLR for hugetlb mappings x86/mm: Unify pte_to_pgoff() and pgoff_to_pte() helpers
| * | | | | | | | x86/mm/numa: Fix 32-bit kernel NUMA bootLans Zhang2013-12-191-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When booting a 32-bit x86 kernel on a NUMA machine, node data cannot be allocated from local node if the account of memory for node 0 covers the low memory space entirely: [ 0.000000] Initmem setup node 0 [mem 0x00000000-0x83fffffff] [ 0.000000] NODE_DATA [mem 0x367ed000-0x367edfff] [ 0.000000] Initmem setup node 1 [mem 0x840000000-0xfffffffff] [ 0.000000] Cannot find 4096 bytes in node 1 [ 0.000000] 64664MB HIGHMEM available. [ 0.000000] 871MB LOWMEM available. To fix this issue, node data is allowed to be allocated from other nodes if the memory of local node is still not mapped. The expected result looks like this: [ 0.000000] Initmem setup node 0 [mem 0x00000000-0x83fffffff] [ 0.000000] NODE_DATA [mem 0x367ed000-0x367edfff] [ 0.000000] Initmem setup node 1 [mem 0x840000000-0xfffffffff] [ 0.000000] NODE_DATA [mem 0x367ec000-0x367ecfff] [ 0.000000] NODE_DATA(1) on node 0 [ 0.000000] 64664MB HIGHMEM available. [ 0.000000] 871MB LOWMEM available. Signed-off-by: Lans Zhang <jia.zhang@windriver.com> Cc: <andi@firstfloor.org> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1386303510-18574-1-git-send-email-jia.zhang@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | Merge tag 'v3.13-rc4' into x86/mmIngo Molnar2013-12-1945-358/+131
| |\ \ \ \ \ \ \ \ | | | |/ / / / / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pick up the latest fixes before applying new patches. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | x86/mm: Implement ASLR for hugetlb mappingsKirill A. Shutemov2013-11-193-10/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Matthew noticed that hugetlb mappings don't participate in ASLR on x86-64: % for i in `seq 3`; do > tools/testing/selftests/vm/map_hugetlb | grep address > done Returned address is 0x2aaaaac00000 Returned address is 0x2aaaaac00000 Returned address is 0x2aaaaac00000 /proc/PID/maps entries for the mapping are always the same (except inode number): 2aaaaac00000-2aaabac00000 rw-p 00000000 00:0c 8200 /anon_hugepage (deleted) 2aaaaac00000-2aaabac00000 rw-p 00000000 00:0c 256 /anon_hugepage (deleted) 2aaaaac00000-2aaabac00000 rw-p 00000000 00:0c 7180 /anon_hugepage (deleted) The reason is the generic hugetlb_get_unmapped_area() function which is used on x86-64. It doesn't support randomization and use bottom-up unmapped area lookup, instead of usual top-down on x86-64. x86 has arch-specific hugetlb_get_unmapped_area(), but it's used only on x86-32. Let's use arch-specific hugetlb_get_unmapped_area() on x86-64 too. That adds ASLR and switches hugetlb mappings to use top-down unmapped area lookup: % for i in `seq 3`; do > tools/testing/selftests/vm/map_hugetlb | grep address > done Returned address is 0x7f4f08a00000 Returned address is 0x7fdda4200000 Returned address is 0x7febe0000000 /proc/PID/maps entries: 7f4f08a00000-7f4f18a00000 rw-p 00000000 00:0c 1168 /anon_hugepage (deleted) 7fdda4200000-7fddb4200000 rw-p 00000000 00:0c 7092 /anon_hugepage (deleted) 7febe0000000-7febf0000000 rw-p 00000000 00:0c 7183 /anon_hugepage (deleted) Unmapped area lookup policy for hugetlb mappings is consistent with normal mappings now -- the only difference is alignment requirements for huge pages. libhugetlbfs test-suite didn't detect any regressions with the patch applied (although it shows few failures on my machine regardless the patch). Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mel Gorman <mgorman@suse.de> Link: http://lkml.kernel.org/r/20131119131750.EA45CE0090@blue.fi.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | | x86/mm: Unify pte_to_pgoff() and pgoff_to_pte() helpersCyrill Gorcunov2013-11-191-41/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use unified pte_bitop() helper to manipulate bits in pte/pgoff bitfields and convert pte_to_pgoff()/pgoff_to_pte() to inlines. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@kernel.org>
OpenPOWER on IntegriCloud