summaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/cpu/mcheck/mce_amd.c
Commit message (Collapse)AuthorAgeFilesLines
* x86/mce/amd: Zap changelogBorislav Petkov2015-05-071-10/+2
| | | | | | | | It is useless and git history has it all detailed anyway. Update copyright while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
* x86/mce/amd: Rename setup_APIC_mceAravind Gopalakrishnan2015-05-071-2/+2
| | | | | | | | | | | | | | | 'setup_APIC_mce' doesn't give us an indication of why we are going to program LVT. Make that explicit by renaming it to setup_APIC_mce_threshold so we know. No functional change is introduced. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1430913538-1415-7-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
* x86/mce/amd: Introduce deferred error interrupt handlerAravind Gopalakrishnan2015-05-071-0/+93
| | | | | | | | | | | | | | | | | | | | | Deferred errors indicate error conditions that were not corrected, but require no action from S/W (or action is optional).These errors provide info about a latent UC MCE that can occur when a poisoned data is consumed by the processor. Processors that report these errors can be configured to generate APIC interrupts to notify OS about the error. Provide an interrupt handler in this patch so that OS can catch these errors as and when they happen. Currently, we simply log the errors and exit the handler as S/W action is not mandated. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1430913538-1415-5-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
* x86/mce/amd: Collect valid address before logging an errorAravind Gopalakrishnan2015-05-061-1/+4
| | | | | | | | | | | | | | | | | | | | amd_decode_mce() needs value in m->addr so it can report the error address correctly. This should be setup in __log_error() before we call mce_log(). We do this because the error address is an important bit of information which should be conveyed to userspace. The correct output then reports proper address, like this: [Hardware Error]: Corrected error, no action required. [Hardware Error]: CPU:0 (15:60:0) MC0_STATUS [-|CE|-|-|AddrV|-|-|CECC]: 0x840041000028017b [Hardware Error]: MC0 Error Address: 0x00001f808f0ff040 Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1430913538-1415-3-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
* x86/mce/amd: Factor out logging mechanismAravind Gopalakrishnan2015-05-061-10/+23
| | | | | | | | | | | | | | | | Refactor the code here to setup struct mce and call mce_log() to log the error. We're going to reuse this in a later patch as part of the deferred error interrupt enablement. No functional change is introduced. Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1430913538-1415-2-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
* x86/MCE/AMD: Enable thresholding interrupts by default if supportedAravind Gopalakrishnan2015-02-191-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We setup APIC vectors for threshold errors if interrupt_capable. However, we don't set interrupt_enable by default. Rework threshold_restart_bank() so that when we set up lvt_offset, we also set IntType to APIC and also enable thresholding interrupts for banks which support it by default. User is still allowed to disable interrupts through sysfs. While at it, check if status is valid before we proceed to log error using mce_log. This is because, in multi-node platforms, only the NBC (Node Base Core, i.e. the first core in the node) has valid status info in its MCA registers. So, the decoding of status values on the non-NBC leads to noise on kernel logs like so: EDAC DEBUG: amd64_inject_write_store: section=0x80000000 word_bits=0x10020001 [Hardware Error]: Corrected error, no action required. [Hardware Error]: CPU:25 (15:2:0) MC4_STATUS[-|CE|-|-|- [Hardware Error]: Corrected error, no action required. [Hardware Error]: CPU:26 (15:2:0) MC4_STATUS[-|CE|-|-|- <...> WARNING: CPU: 25 PID: 0 at drivers/edac/amd64_edac.c:2147 decode_bus_error+0x1ba/0x2a0() WARNING: CPU: 26 PID: 0 at drivers/edac/amd64_edac.c:2147 decode_bus_error+0x1ba/0x2a0() Something is rotten in the state of Denmark. Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1422896561-7695-1-git-send-email-aravind.gopalakrishnan@amd.com [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>
* x86/MCE/AMD: Drop bogus const modifier from AMD's bank4_names()Jan Beulich2015-02-191-1/+1
| | | | | | | | The compiler validly warns about it being ignored. Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/54C21511020000780005890E@mail.emea.novell.com Signed-off-by: Borislav Petkov <bp@suse.de>
* x86, MCE, AMD: Assign interrupt handler only when bank supports itChen Yucong2014-11-011-7/+10
| | | | | | | | | | | | There are some AMD CPU models which have thresholding banks but which cannot generate a thresholding interrupt. This is denoted by the bit MCi_MISC[IntP]. Make sure to check that bit before assigning the thresholding interrupt handler. Signed-off-by: Chen Yucong <slaoub@gmail.com> [ Boris: save an indentation level and rewrite commit message. ] Link: http://lkml.kernel.org/r/1412662128.28440.18.camel@debian Signed-off-by: Borislav Petkov <bp@suse.de>
* x86, MCE, AMD: Drop software-defined bank in error thresholdingBorislav Petkov2014-10-211-3/+2
| | | | | | | | | | | | | | | | | | | | | | Aravind had the good question about why we're assigning a software-defined bank when reporting error thresholding errors instead of simply using the bank which reports the last error causing the overflow. Digging through git history, it pointed to 95268664390b ("[PATCH] x86_64: mce_amd support for family 0x10 processors") which added that functionality. The problem with this, however, is that tools don't know about software-defined banks and get puzzled. So drop that K8_MCE_THRESHOLD_BASE and simply use the hw bank reporting the thresholding interrupt. Save us a couple of MSR reads while at it. Reported-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Link: https://lkml.kernel.org/r/5435B206.60402@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
* x86, MCE, AMD: Move invariant code out from loop bodyChen Yucong2014-10-211-1/+3
| | | | | | | | | | Assigning to mce_threshold_vector is loop-invariant code in mce_amd_feature_init(). So do it only once, out of loop body. Signed-off-by: Chen Yucong <slaoub@gmail.com> Link: http://lkml.kernel.org/r/1412263212.8085.6.camel@debian [ Boris: commit message corrections. ] Signed-off-by: Borislav Petkov <bp@suse.de>
* x86, MCE, AMD: Correct thresholding error loggingChen Yucong2014-10-211-15/+15
| | | | | | | | | | | | | mce_setup() does not gather the content of IA32_MCG_STATUS, so it should be read explicitly. Moreover, we need to clear IA32_MCx_STATUS to avoid that mce_log() logs the processed threshold event again at next time. But we do the logging ourselves and machine_check_poll() is completely useless there. So kill it. Signed-off-by: Chen Yucong <slaoub@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de>
* x86, MCE, AMD: Use macros to compute bank MSRsChen Yucong2014-10-211-6/+4
| | | | | | | | | | | Avoid open coded calculations for bank MSRs by hiding the index of higher bank MSRs in well-defined macros. No semantic changes. Signed-off-by: Chen Yucong <slaoub@gmail.com> Link: http://lkml.kernel.org/r/1411438561-24319-1-git-send-email-slaoub@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
* x86: Replace __get_cpu_var usesChristoph Lameter2014-08-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : #define __get_cpu_var(var) (*this_cpu_ptr(&(var))) __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Acked-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* arch/x86: replace strict_strto callsDaniel Walter2014-08-081-2/+2
| | | | | | | | | | | | Replace obsolete strict_strto calls with appropriate kstrto calls Signed-off-by: Daniel Walter <dwalter@google.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: delete __cpuinit usage from all x86 filesPaul Gortmaker2013-07-141-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. Note that some harmless section mismatch warnings may result, since notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c) are flagged as __cpuinit -- so if we remove the __cpuinit from arch specific callers, we will also get section mismatch warnings. As an intermediate step, we intend to turn the linux/init.h cpuinit content into no-ops as early as possible, since that will get rid of these warnings. In any case, they are temporary and harmless. This removes all the arch/x86 uses of the __cpuinit macros from all C files. x86 only had the one __CPUINIT used in assembly files, and it wasn't paired off with a .previous or a __FINIT, so we can delete it directly w/o any corresponding additional change there. [1] https://lkml.org/lkml/2013/5/20/589 Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* x86, MCE, AMD: Use MCG_CAP MSR to find out number of banks on AMDBoris Ostrovsky2013-03-221-7/+15
| | | | | | | | | | | | | | | Currently number of error reporting register banks is hardcoded to 6 on AMD processors. This may break in virtualized scenarios when a hypervisor prefers to report fewer banks than what the physical HW provides. Since number of supported banks is reported in MSR_IA32_MCG_CAP[7:0] that's what we should use. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: http://lkml.kernel.org/r/1363295441-1859-3-git-send-email-boris.ostrovsky@oracle.com [ reverse NULL ptr test logic ] Signed-off-by: Borislav Petkov <bp@suse.de>
* x86, MCE, AMD: Replace shared_bank array with is_shared_bank() helperBoris Ostrovsky2013-03-221-8/+9
| | | | | | | | | Use helper function instead of an array to report whether register bank is shared. Currently only bank 4 (northbridge) is shared. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: http://lkml.kernel.org/r/1363295441-1859-2-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Borislav Petkov <bp@suse.de>
* x86, AMD: Change Boris' email addressBorislav Petkov2012-10-301-1/+1
| | | | | | | | Move to private email and put in maintained status. Signed-off-by: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1351532410-4887-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86, amd, mce: Avoid NULL pointer reference on CPU northbridge lookupDaniel J Blueman2012-10-171-5/+5
| | | | | | | | | | | | | | | | | | | When booting on a federated multi-server system (NumaScale), the processor Northbridge lookup returns NULL; add guards to prevent this causing an oops. On those systems, the northbridge is accessed through MMIO and the "normal" northbridge enumeration in amd_nb.c doesn't work since we're generating the northbridge ID from the initial APIC ID and the last is not unique on those systems. Long story short, we end up without northbridge descriptors. Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com> Cc: stable@vger.kernel.org # 3.6 Link: http://lkml.kernel.org/r/1349073725-14093-1-git-send-email-daniel@numascale-asia.com [ Boris: beef up commit message ] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* Merge tag 'stable/for-linus-3.6-rc0-tag' of ↵Linus Torvalds2012-07-241-1/+21
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen update from Konrad Rzeszutek Wilk: "Features: * Performance improvement to lower the amount of traps the hypervisor has to do 32-bit guests. Mainly for setting PTE entries and updating TLS descriptors. * MCE polling driver to collect hypervisor MCE buffer and present them to /dev/mcelog. * Physical CPU online/offline support. When an privileged guest is booted it is present with virtual CPUs, which might have an 1:1 to physical CPUs but usually don't. This provides mechanism to offline/online physical CPUs. Bug-fixes for: * Coverity found fixes in the console and ACPI processor driver. * PVonHVM kexec fixes along with some cleanups. * Pages that fall within E820 gaps and non-RAM regions (and had been released to hypervisor) would be populated back, but potentially in non-RAM regions." * tag 'stable/for-linus-3.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: populate correct number of pages when across mem boundary (v2) xen PVonHVM: move shared_info to MMIO before kexec xen: simplify init_hvm_pv_info xen: remove cast from HYPERVISOR_shared_info assignment xen: enable platform-pci only in a Xen guest xen/pv-on-hvm kexec: shutdown watches from old kernel xen/x86: avoid updating TLS descriptors if they haven't changed xen/x86: add desc_equal() to compare GDT descriptors xen/mm: zero PTEs for non-present MFNs in the initial page table xen/mm: do direct hypercall in xen_set_pte() if batching is unavailable xen/hvc: Fix up checks when the info is allocated. xen/acpi: Fix potential memory leak. xen/mce: add .poll method for mcelog device driver xen/mce: schedule a workqueue to avoid sleep in atomic context xen/pcpu: Xen physical cpus online/offline sys interface xen/mce: Register native mce handler as vMCE bounce back point x86, MCE, AMD: Adjust initcall sequence for xen xen/mce: Add mcelog support for Xen platform
| * x86, MCE, AMD: Adjust initcall sequence for xenLiu, Jinsong2012-07-191-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | there are 3 funcs which need to be _initcalled in a logic sequence: 1. xen_late_init_mcelog 2. mcheck_init_device 3. threshold_init_device xen_late_init_mcelog must register xen_mce_chrdev_device before native mce_chrdev_device registration if running under xen platform; mcheck_init_device should be inited before threshold_init_device to initialize mce_device, otherwise a a NULL ptr dereference will cause panic. so we use following _initcalls 1. device_initcall(xen_late_init_mcelog); 2. device_initcall_sync(mcheck_init_device); 3. late_initcall(threshold_init_device); when running under xen, the initcall order is 1,2,3; on baremetal, we skip 1 and we do only 2 and 3. Acked-and-tested-by: Borislav Petkov <bp@amd64.org> Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | x86, MCE, AMD: Update copyrights and boilerplateBorislav Petkov2012-06-071-2/+4
| | | | | | | | | | | | | | Jacob is doing something else now so add myself as the loser who provides support. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | x86, MCE, AMD: Give proper names to the thresholding banksBorislav Petkov2012-06-071-4/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Having the banks numbered is ok but having real names which mean something to the user makes a lot more sense: /sys/devices/system/machinecheck/machinecheck0/ |-- bank0 |-- bank1 |-- bank2 |-- bank3 |-- bank4 |-- bank5 |-- bank6 |-- check_interval |-- cmci_disabled |-- combined_unit | |-- combined_unit | |-- error_count | |-- threshold_limit |-- dont_log_ce |-- execution_unit | |-- execution_unit | |-- error_count | |-- threshold_limit |-- ignore_ce |-- insn_fetch | |-- insn_fetch | |-- error_count | |-- threshold_limit |-- load_store | |-- load_store | |-- error_count | |-- threshold_limit |-- monarch_timeout |-- northbridge | |-- dram | | |-- error_count | | |-- interrupt_enable | | |-- threshold_limit | |-- ht_links | | |-- error_count | | |-- interrupt_enable | | |-- threshold_limit | |-- l3_cache | |-- error_count | |-- interrupt_enable | |-- threshold_limit ... Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | x86, MCE, AMD: Make error_count read onlyBorislav Petkov2012-06-071-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now, writing to error count caused the code to reset the thresholding bank to the current thresholding limit and start counting errors from the beginning. This is misleading and unclear, and can be accomplished by writing the old thresholding limit into ->threshold_limit. Make error_count read-only with the functionality to show the current error count. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | x86, MCE, AMD: Cleanup reading of error_countBorislav Petkov2012-06-071-18/+5
| | | | | | | | | | | | | | | | | | We have rdmsr_on_cpu() now so remove locally defined solution in favor of the generic one. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | x86, MCE, AMD: Print decimal thresholding valuesBorislav Petkov2012-06-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If one sets the threshold limit, say to 25: $ echo 25 > machinecheck0/threshold_bank4/misc0/threshold_limit and then reads it back again, it gives $ cat machinecheck0/threshold_bank4/misc0/threshold_limit 19 which is actually 0x19 but we don't know that. Make all output decimal. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | x86, MCE, AMD: Move shared bank to node descriptorBorislav Petkov2012-06-071-20/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Well, instead of having a real bank 4 on the BSP of each node and symlinks on the remaining cores, we push it up into the amd_northbridge descriptor which now contains a pointer to the northbridge bank 4 because the bank is one per northbridge and, as such, belongs in the NB descriptor anyway. Each time we hotplug CPUs, we use the northbridge pointer to copy the shared bank into the per-CPU array of threshold_banks pointers, or destroy it when the last CPU on the node goes offline, or create it when the first comes online. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | x86, MCE, AMD: Remove local_allocate_... wrapperBorislav Petkov2012-06-071-8/+3
| | | | | | | | | | | | It is unneeded now so drop it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | x86, MCE, AMD: Remove shared banks sysfs linkingBorislav Petkov2012-06-071-98/+7
|/ | | | | | | | | | | | The code used to create a symlink on all non-BSP cores of a node when the MCi_MISCj bank is present once per node. (This is generally the case with bank 4 on AMD). However, these sysfs links cause a bunch of problems with cpu off-/onlining testing and are, as such, a bit overengineered. IOW, there's nothing wrong with having normal sysfs files for the shared banks since the corresponding MSRs are replicated across each core anyway. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* x86, MCE, AMD: Hide interrupt_enable sysfs nodeBorislav Petkov2012-04-301-2/+7
| | | | | | | | | Depending on whether the box supports the APIC LVT interrupt for thresholding, we want to show the 'interrupt_enable' sysfs node or not. Make that the case by adding it to the default sysfs attributes only if it is supported. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* x86, MCE, AMD: Make APIC LVT thresholding interrupt optionalBorislav Petkov2012-04-301-12/+44
| | | | | | | | | | | | | Currently, the APIC LVT interrupt for error thresholding is implicitly enabled. However, there are models in the F15h range which do not enable it. Make the code machinery which sets up the APIC interrupt support an optional setting and add an ->interrupt_capable member to the bank representation mirroring that capability and enable the interrupt offset programming only if it is true. Simplify code and fixup comment style while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* Merge tag 'mce-for-tip' of ↵Ingo Molnar2012-03-141-4/+5
|\ | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce Apply two miscellaneous MCE fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86/mce: Convert static array of pointers to per-cpu variablesGreg Kroah-Hartman2012-02-221-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When I previously fixed up the mce_device code, I used a static array of the pointers. It was (rightfully) pointed out to me that I should be using the per_cpu code instead. This patch converts the code over to that structure, moving the variable back into the per_cpu area, like it used to be for 3.2 and earlier. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Link: https://lkml.org/lkml/2012/1/27/165 Signed-off-by: Tony Luck <tony.luck@intel.com>
* | x86/mce/AMD: Fix UP build errorBorislav Petkov2012-02-221-0/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'") removed a bunch of CONFIG_SMP ifdefs around code touching struct cpuinfo_x86 members but also caused the following build error with Randy's randconfigs: mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map' Restore the #ifdef in threshold_create_bank() which creates symlinks on the non-BSP CPUs. There's a better patch series being worked on by Kevin Winchester which will solve this in a cleaner fashion, but that series is too ambitious for v3.3 merging - so we first queue up this trivial fix and then do the rest for v3.4. Signed-off-by: Borislav Petkov <bp@alien8.de> Acked-by: Kevin Winchester <kjwinchester@gmail.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Nick Bowler <nbowler@elliptictech.com> Link: http://lkml.kernel.org/r/20120203191801.GA2846@x1.osrc.amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* mce: fix warning messages about static struct mce_deviceGreg Kroah-Hartman2012-01-161-7/+11
| | | | | | | | | | | | | | | | | | | When suspending, there was a large list of warnings going something like: Device 'machinecheck1' does not have a release() function, it is broken and must be fixed This patch turns the static mce_devices into dynamically allocated, and properly frees them when they are removed from the system. It solves the warning messages on my laptop here. Reported-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Djalal Harouni <tixxdz@opendz.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Borislav Petkov <bp@amd64.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'driver-core-next' of ↵Linus Torvalds2012-01-071-6/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (73 commits) arm: fix up some samsung merge sysdev conversion problems firmware: Fix an oops on reading fw_priv->fw in sysfs loading file Drivers:hv: Fix a bug in vmbus_driver_unregister() driver core: remove __must_check from device_create_file debugfs: add missing #ifdef HAS_IOMEM arm: time.h: remove device.h #include driver-core: remove sysdev.h usage. clockevents: remove sysdev.h arm: convert sysdev_class to a regular subsystem arm: leds: convert sysdev_class to a regular subsystem kobject: remove kset_find_obj_hinted() m86k: gpio - convert sysdev_class to a regular subsystem mips: txx9_sram - convert sysdev_class to a regular subsystem mips: 7segled - convert sysdev_class to a regular subsystem sh: dma - convert sysdev_class to a regular subsystem sh: intc - convert sysdev_class to a regular subsystem power: suspend - convert sysdev_class to a regular subsystem power: qe_ic - convert sysdev_class to a regular subsystem power: cmm - convert sysdev_class to a regular subsystem s390: time - convert sysdev_class to a regular subsystem ... Fix up conflicts with 'struct sysdev' removal from various platform drivers that got changed: - arch/arm/mach-exynos/cpu.c - arch/arm/mach-exynos/irq-eint.c - arch/arm/mach-s3c64xx/common.c - arch/arm/mach-s3c64xx/cpu.c - arch/arm/mach-s5p64x0/cpu.c - arch/arm/mach-s5pv210/common.c - arch/arm/plat-samsung/include/plat/cpu.h - arch/powerpc/kernel/sysfs.c and fix up cpu_is_hotpluggable() as per Greg in include/linux/cpu.h
| * cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystemKay Sievers2011-12-211-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem and converts the devices to regular devices. The sysdev drivers are implemented as subsystem interfaces now. After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Userspace relies on events and generic sysfs subsystem infrastructure from sysdev devices, which are made available with this conversion. Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@amd64.org> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Len Brown <lenb@kernel.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'Kevin Winchester2011-12-211-6/+1
|/ | | | | | | | | | | | | | | | | | | | | | | Several fields in struct cpuinfo_x86 were not defined for the !SMP case, likely to save space. However, those fields still have some meaning for UP, and keeping them allows some #ifdef removal from other files. The additional size of the UP kernel from this change is not significant enough to worry about keeping up the distinction: text data bss dec hex filename 4737168 506459 972040 6215667 5ed7f3 vmlinux.o.before 4737444 506459 972040 6215943 5ed907 vmlinux.o.after for a difference of 276 bytes for an example UP config. If someone wants those 276 bytes back badly then it should be implemented in a cleaner way. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Cc: Steffen Persvold <sp@numascale.com> Link: http://lkml.kernel.org/r/1324428742-12498-1-git-send-email-kjwinchester@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, mce: Use mce_sysdev_ prefix to group functionsHidetoshi Seto2011-06-161-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | There are many functions named mce_* so use a new prefix for the subset of functions related to sysfs support. And since f3c6ea1b06c71b43f751b36bd99345369fe911af introduces syscore_ops, use the prefix mce_syscore for some functions related to power management which were in sysdev_class before. Before: After: mce_device mce_sysdev mce_sysclass mce_sysdev_class mce_attrs mce_sysdev_attrs mce_dev_initialized mce_sysdev_initialized mce_create_device mce_sysdev_create mce_remove_device mce_sysdev_remove mce_suspend mce_syscore_suspend mce_shutdown mce_syscore_shutdown mce_resume mce_syscore_resume Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED81B.8020506@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* x86, mce, AMD: Fix leaving freed data in a listJulia Lawall2011-05-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | b may be added to a list, but is not removed before being freed in the case of an error. This is done in the corresponding deallocation function, so the code here has been changed to follow that. The sematic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression E,E1,E2; identifier l; @@ *list_add(&E->l,E1); ... when != E1 when != list_del(&E->l) when != list_del_init(&E->l) when != E = E2 *kfree(E);// </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1305294731-12127-1-git-send-email-julia@diku.dk Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: Move llc_shared_map out of cpu_infoYinghai Lu2011-01-261-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | cpu_info is already with per_cpu, We can take llc_shared_map out of cpu_info, and declare it as per_cpu variable directly. So later referencing could be simple and directly instead of diving to find cpu_info at first. Also could make smp_store_cpu_info() much simple to avoid to do save and restore trick. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Alok N Kataria <akataria@vmware.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Hans J. Koch <hjk@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <4D3A16E8.5020608@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* mce, amd: Remove goto in threshold_create_device()Robert Richter2010-10-251-2/+2
| | | | | | | | | Removing the goto in threshold_create_device(). Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1288015419-29543-5-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* mce, amd: Add helper functions to setup APICRobert Richter2010-10-251-29/+38
| | | | | | | | | | | This patch reworks and cleans up mce_amd_feature_init() by introducing helper functions to setup and check the LVT offset. It also fixes line endings in pr_err() calls. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1288015419-29543-4-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* mce, amd: Shorten local variables mci_misc_{hi,lo}Robert Richter2010-10-251-13/+13
| | | | | | | | | Shorten this variables to make later changes more readable. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1288015419-29543-3-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* mce, amd: Implement mce_threshold_block_init() helper functionRobert Richter2010-10-251-19/+29
| | | | | | | | | | | | | | | This patch adds a helper function for the initial setup of an mce threshold block. The LVT offset is passed as argument. Also making variable threshold_defaults local as it is only used in function mce_amd_feature_init(). Function threshold_restart_bank() is extended to setup the LVT offset, the change is backward compatible. Thus, now there is only a single wrmsrl() to setup the block. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1288015419-29543-2-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* apic, x86: Use BIOS settings for IBS and MCE threshold interrupt LVT offsetsRobert Richter2010-10-201-3/+24
| | | | | | | | | | | | | | | | We want the BIOS to setup the EILVT APIC registers. The offsets were hardcoded and BIOS settings were overwritten by the OS. Now, the subsystems for MCE threshold and IBS determine the LVT offset from the registers the BIOS has setup. If the BIOS setup is buggy on a family 10h system, a workaround enables IBS. If the OS determines an invalid register setup, a "[Firmware Bug]: " error message is reported. We need this change also for upcomming cpu families. Signed-off-by: Robert Richter <robert.richter@amd.com> LKML-Reference: <1286360874-1471-3-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, AMD, MCE thresholding: Fix the MCi_MISCj iteration orderBorislav Petkov2010-10-111-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | This fixes possible cases of not collecting valid error info in the MCE error thresholding groups on F10h hardware. The current code contains a subtle problem of checking only the Valid bit of MSR0000_0413 (which is MC4_MISC0 - DRAM thresholding group) in its first iteration and breaking out if the bit is cleared. But (!), this MSR contains an offset value, BlkPtr[31:24], which points to the remaining MSRs in this thresholding group which might contain valid information too. But if we bail out only after we checked the valid bit in the first MSR and not the block pointer too, we miss that other information. The thing is, MC4_MISC0[BlkPtr] is not predicated on MCi_STATUS[MiscV] or MC4_MISC0[Valid] and should be checked prior to iterating over the MCI_MISCj thresholding group, irrespective of the MC4_MISC0[Valid] setting. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, mcheck: Avoid duplicate sysfs links/files for thresholding banksAndreas Herrmann2010-09-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | kobject_add_internal failed for threshold_bank2 with -EEXIST, don't try to register things with the same name in the same directory: Pid: 1, comm: swapper Tainted: G W 2.6.31 #1 Call Trace: [<ffffffff81161b07>] ? kobject_add_internal+0x156/0x180 [<ffffffff81161cc0>] ? kobject_add+0x66/0x6b [<ffffffff81161793>] ? kobject_init+0x42/0x82 [<ffffffff81161cf9>] ? kobject_create_and_add+0x34/0x63 [<ffffffff81393963>] ? threshold_create_bank+0x14f/0x259 [<ffffffff8139310a>] ? mce_create_device+0x8d/0x1b8 [<ffffffff81646497>] ? threshold_init_device+0x3f/0x80 [<ffffffff81646458>] ? threshold_init_device+0x0/0x80 [<ffffffff81009050>] ? do_one_initcall+0x4f/0x143 [<ffffffff816413a0>] ? kernel_init+0x14c/0x1a2 [<ffffffff8100c8da>] ? child_rip+0xa/0x20 [<ffffffff81641254>] ? kernel_init+0x0/0x1a2 [<ffffffff8100c8d0>] ? child_rip+0x0/0x20 kobject_create_and_add: kobject_add error: -17 (Probably the for_each_cpu loop should be entirely removed.) Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100827092006.GB5348@loge.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* Driver core: Constify struct sysfs_ops in struct kobj_typeEmese Revfy2010-03-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Constify struct sysfs_ops. This is part of the ops structure constification effort started by Arjan van de Ven et al. Benefits of this constification: * prevents modification of data that is shared (referenced) by many other structure instances at runtime * detects/prevents accidental (but not intentional) modification attempts on archs that enforce read-only kernel data at runtime * potentially better optimized code as the compiler can assume that the const data cannot be changed * the compiler/linker move const data into .rodata and therefore exclude them from false sharing Signed-off-by: Emese Revfy <re.emese@gmail.com> Acked-by: David Teigland <teigland@redhat.com> Acked-by: Matt Domsch <Matt_Domsch@dell.com> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Hans J. Koch <hjk@linutronix.de> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
OpenPOWER on IntegriCloud