op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	watchdog: Convert BookE watchdog driver to watchdog infrastructure	Guenter Roeck	2013-03-01	2	-120/+66
\| \| \| \| \|	Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	watchdog: s3c2410_wdt: Use devm_* functions	Jingoo Han	2013-03-01	1	-33/+9
	Use devm_* functions to make cleanup paths simpler.

	Signed-off-by: Jingoo Han <jg1.han@samsung.com>
	Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	watchdog: remove old STMP3xxx driver	Wolfram Sang	2013-03-01	3	-298/+0
\| \| \| \| \| \| \|	Now that the new driver is in place, we can remove the old one. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	watchdog: add new driver for STMP3xxx and i.MX23/28	Wolfram Sang	2013-03-01	3	-0/+122
	Replace the existing STMP3xxx driver because it has enough drawbacks that a rewrite is appropriate. The new driver is designed to use the watchdog framework, which makes it a lot smaller and avoids open coding the watchdog API again. It also now uses an explicitly exported function from the RTC driver to set up its registers (the old driver silently reused the hopefully(!) already remapped RTC registers). Also, this driver is mach independent, while the old one depends on a mach replaced by another one a year ago. Since the user interface is still the standard watchdog API, users don't need to adapt.

	Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
	Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	rtc: stmp3xxx: add wdt-accessor function	Wolfram Sang	2013-03-01	2	-0/+79
\| \| \| \| \| \| \| \| \| \| \|	This RTC also includes a watchdog timer. Provide an accessor function for setting the watchdog timeout value which will be picked up by a watchdog driver. Also register the platform_device for the watchdog here to get the boot-time dependencies right. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	watchdog: introduce retu_wdt driver	Aaro Koskinen	2013-03-01	3	-0/+191
\| \| \| \| \| \| \| \| \| \|	Introduce Retu watchdog driver. Cc: linux-watchdog@vger.kernel.org Acked-by: Felipe Balbi <balbi@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	watchdog: intel_scu_watchdog: fix Kconfig dependency	Wim Van Sebroeck	2013-03-01	1	-1/+1
	Kernel symbol X86_MRST has been removed from the kernel. The INTEL_SCU_WATCHDOG driver can never be compiled because of its dependency on X86_MRST, which remained in drivers/watchdog/Kconfig.

	Reported-by: Alexander Shiyan <shc_work@mail.ru>
	Cc: Alan Cox <alan@linux.intel.com>
	Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	watchdog: orion_wdt: Add platform alias	Lubomir Rintel	2013-03-01	1	-0/+1
\| \| \| \| \| \| \| \|	...so that it's automatically picked up on relevant platforms. Tested on Kirkwood-based GuruPlug. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	watchdog: ath79_wdt: add device tree matching	Gabor Juhos	2013-03-01	2	-0/+24
\| \| \| \| \| \| \|	Cc: Grant Likely <grant.likely@secretlab.ca> Cc: devicetree-discuss@lists.ozlabs.org Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	watchdog: ath79_wdt: get register base from platform device's resources	Gabor Juhos	2013-03-01	2	-11/+47
	The ath79_wdt driver currently uses a fixed memory address. Although this works with every currently supported SoC, this may change in the future. Additionally, the driver includes platform specific header files in order to be able to get the memory base of the watchdog device.

	The patch adds a memory resource to the platform device, and converts the driver to get the base address of the watchdog device from that.

	Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
	Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	MIPS: ath79: use dynamically allocated watchdog device	Gabor Juhos	2013-03-01	1	-6/+1
\| \| \| \| \| \| \| \| \| \| \|	Remove the static watchdog device variable and use the 'platform_device_register_simple' helper to allocate and register the device in one step. This allows us to save a few bytes in the kernel image. Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	watchdog: ath79_wdt: convert to use devm_clk_get	Gabor Juhos	2013-03-01	1	-5/+2
	Use the managed version of clk_get. This allows the probe/remove functions to be simplified a bit.

	Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
	Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	watchdog: sp5100_tco: Write back the original value to reserved bits, instead of zero	Takahisa Tanaka	2013-03-01	1	-20/+8
	In case of SP5100 or SB7x0 chipsets, the sp5100_tco module writes zero to reserved bits. The module, however, shouldn't depend on a specific default value, and should perform a read-merge-write operation for the reserved bits.

	This patch makes the sp5100_tco module perform a read-merge-write operation on all chipsets (sp5100, sb7x0, sb8x0 or later).

	Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43176
	Signed-off-by: Takahisa Tanaka <mc74hc00@gmail.com>
	Tested-by: Paul Menzel <paulepanter@users.sourceforge.net>
	Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
	Cc: stable <stable@vger.kernel.org>
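The read-merge-write idea described in this commit can be illustrated with a small userspace sketch. This is not the sp5100_tco code: the helper name `merge_reserved` and the mask values are hypothetical, chosen only to show how bits the driver does not own are carried over from the value that was read.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from sp5100_tco): combine a freshly read
 * register value with the bits the driver wants to program, leaving every
 * bit in reserved_mask exactly as the hardware reported it.
 */
static uint32_t merge_reserved(uint32_t current, uint32_t wanted,
                               uint32_t reserved_mask)
{
    /* The caller must not try to program reserved bits. */
    assert((wanted & reserved_mask) == 0);

    /* Keep reserved bits from the read, take everything else from wanted. */
    return (current & reserved_mask) | (wanted & ~reserved_mask);
}
```

Writing a constant zero instead of `current & reserved_mask` is the behaviour the commit moves away from; the merge makes the result independent of whatever default the firmware left in those bits.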
*	watchdog: sp5100_tco: Fix wrong indirect I/O access for getting value of reserved bits	Takahisa Tanaka	2013-03-01	1	-2/+3
	In case of an SB800 or later chipset and re-programming of the MMIO address, the sp5100_tco module may read an incorrect value for the reserved bits, because the module reads the value from an incorrect I/O address. However, this bug doesn't cause a problem, because when re-programming the MMIO address, by chance the module writes zero (this is the BIOS's default value) to the low three bits of the register.

	In most cases, a PC with an SB8x0 or later chipset doesn't need to re-program the MMIO address, because such a PC can enable AcpiMmio and can use 0xfed80b00 as the watchdog register base address.

	This patch fixes this bug.

	Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43176
	Signed-off-by: Takahisa Tanaka <mc74hc00@gmail.com>
	Tested-by: Paul Menzel <paulepanter@users.sourceforge.net>
	Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
	Cc: stable <stable@vger.kernel.org>
*	watchdog: gef_wdt.c: add missing remove callback	Devendra Naga	2013-03-01	1	-0/+1
	This module lacked a remove callback in its platform ops.

	Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
	Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
*	watchdog: at91sam9: at91_wdt_dt_ids cannot be __init	Arnd Bergmann	2013-03-01	1	-1/+1
	The device IDs are referenced by the driver and potentially used beyond the init time, as kbuild correctly warns about. Remove the __initconst annotation.

	Without this patch, building at91_dt_defconfig results in:

	WARNING: drivers/watchdog/built-in.o(.data+0x28): Section mismatch in reference from the variable at91wdt_driver to the (unknown reference) .init.rodata:(unknown)
	The variable at91wdt_driver references
	the (unknown reference) __initconst (unknown)

	Signed-off-by: Arnd Bergmann <arnd@arndb.de>
	Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
	Tested-by: Fabio Porcedda <fabio.porcedda@gmail.com>
	Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
	Cc: linux-watchdog@vger.kernel.org
*	watchdog: da9055_wdt needs to select WATCHDOG_CORE	Randy Dunlap	2013-03-01	1	-0/+1
	DA9055_WATCHDOG (introduced in v3.8) needs to select WATCHDOG_CORE so that it will build cleanly. Fixes these build errors:

	da9055_wdt.c:(.text+0xe9bc7): undefined reference to `watchdog_unregister_device'
	da9055_wdt.c:(.text+0xe9f4b): undefined reference to `watchdog_register_device'

	Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
	Cc: David Dajun Chen <dchen@diasemi.com>
	Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
	Cc: linux-watchdog@vger.kernel.org
	Cc: stable <stable@vger.kernel.org>
*	Merge tag 'lzo-update-signature-20130226' of git://github.com/markus-oberhumer/linux	Linus Torvalds	2013-02-28	8	-434/+488
	Pull LZO compression update from Markus Oberhumer:

	"Summary:
	========

	Update the Linux kernel LZO compression and decompression code to the current upstream version which features significant performance improvements on modern machines.

	Some synthetic benchmarks:
	==========================

	x86_64 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

	              compression speed   decompression speed
	LZO-2005    :     150 MB/sec           468 MB/sec
	LZO-2012    :     434 MB/sec          1210 MB/sec

	i386 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

	              compression speed   decompression speed
	LZO-2005    :     143 MB/sec           409 MB/sec
	LZO-2012    :     372 MB/sec          1121 MB/sec

	armv7 (Cortex-A9), Linaro gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

	              compression speed   decompression speed
	LZO-2005    :      27 MB/sec            84 MB/sec
	LZO-2012    :      44 MB/sec           117 MB/sec
	LZO-2013-UA :      47 MB/sec           167 MB/sec

	Legend:

	LZO-2005    : LZO version in current 3.8 kernel (which is based on the LZO 2.02 release from 2005)
	LZO-2012    : updated LZO version available in linux-next
	LZO-2013-UA : updated LZO version available in linux-next plus experimental ARM Unaligned Access patch. This needs approval from an ARM maintainer and is NOT YET INCLUDED."

	Andrew Morton <akpm@linux-foundation.org> acks it and says:

	"There's a new LZ4 on the block which is even faster than the sped-up LZO, but various filesystems and things use LZO"

	* tag 'lzo-update-signature-20130226' of git://github.com/markus-oberhumer/linux:
	  crypto: testmgr - update LZO compression test vectors
	  lib/lzo: Update LZO compression to current upstream version
	  lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c
\| *	crypto: testmgr - update LZO compression test vectors	Markus F.X.J. Oberhumer	2013-02-20	1	-18/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update the LZO compression test vectors according to the latest compressor version. Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
\| *	lib/lzo: Update LZO compression to current upstream version	Markus F.X.J. Oberhumer	2013-02-20	4	-343/+395
	This commit updates the kernel LZO code to the current upstream version, which features a significant speed improvement: benchmarking the Calgary and Silesia test corpora typically shows a doubled performance in both compression and decompression on modern i386/x86_64/powerpc machines.

	Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
\| *	lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c	Markus F.X.J. Oberhumer	2013-02-20	3	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rename the source file to match the function name and thereby also make room for a possible future even slightly faster "non-safe" decompressor version. Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
* \|	Merge git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	2013-02-28	1	-1/+1
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull one kvm bugfix from Gleb Natapov. * git://git.kernel.org/pub/scm/virt/kvm/kvm: x86/kvm: Fix pvclock vsyscall fixmap
\| * \|	x86/kvm: Fix pvclock vsyscall fixmap	Peter Hurley	2013-02-28	1	-1/+1
	The physical memory fixmapped for the pvclock clock_gettime vsyscall was allocated, and thus is not a kernel symbol. __pa() is the proper method to use in this case.

	Fixes the crash below when booting a next-20130204+ smp guest on a 3.8-rc5+ KVM host.

	[ 0.666410] udevd[97]: starting version 175
	[ 0.674043] udevd[97]: udevd:[97]: segfault at ffffffffff5fd020 ip 00007fff069e277f sp 00007fff068c9ef8 error d

	Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
	Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
	Signed-off-by: Gleb Natapov <gleb@redhat.com>
* \| \|	Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac	Linus Torvalds	2013-02-28	18	-140/+1077
	Pull EDAC fixes and ghes-edac from Mauro Carvalho Chehab:

	"For:
	 - Some fixes at edac drivers (i7core_edac, sb_edac, i3200_edac);
	 - error injection support for i5100, when EDAC debug is enabled;
	 - fix edac when it is loaded builtin (early init for the subsystem);
	 - a "Firmware First" EDAC driver, allowing ghes to report errors via EDAC (ghes-edac).

	With regards to ghes-edac, this fixes a longstanding BZ at Red Hat that happens with Nehalem and Sandy Bridge CPUs: when both GHES and i7core_edac or sb_edac are running, the error reports are unpredictable, as both BIOS and OS race to access the registers. With ghes-edac, the EDAC core will refuse to register any other concurrent memory error driver.

	This patchset moves the ghes struct definitions to a separate header file (include/acpi/ghes.h) and adds 3 hooks at apei/ghes.c to register/unregister and to report errors via ghes-edac. Those changes were acked by ghes driver maintainer (Huang)."

	* 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (30 commits)
	  i5100_edac: convert to use simple_open()
	  ghes_edac: fix to use list_for_each_entry_safe() when delete list items
	  ghes_edac: Fix RAS tracing
	  ghes_edac: Make it compliant with UEFI spec 2.3.1
	  ghes_edac: Improve driver's printk messages
	  ghes_edac: Don't credit the same memory dimm twice
	  ghes_edac: do a better job of filling EDAC DIMM info
	  ghes_edac: add support for reporting errors via EDAC
	  ghes_edac: Register at EDAC core the BIOS report
	  ghes: add the needed hooks for EDAC error report
	  ghes: move structures/enum to a header file
	  edac: add support for error type "Info"
	  edac: add support for raw error reports
	  edac: reduce stack pressure by using a pre-allocated buffer
	  edac: lock module owner to avoid error report conflicts
	  edac: remove proc_name from mci structure
	  edac: add a new memory layer type
	  edac: initialize the core earlier
	  edac: better report error conditions in debug mode
	  i5100_edac: Remove two checkpatch warnings
	  ...
\| * \| \|	i5100_edac: convert to use simple_open()	Wei Yongjun	2013-02-26	1	-7/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This removes an open coded simple_open() function and replaces file operations references to the function with simple_open() instead. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	ghes_edac: fix to use list_for_each_entry_safe() when delete list items	Wei Yongjun	2013-02-26	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since we will remove items off the list using list_del() we need to use a safe version of the list_for_each_entry() macro aptly named list_for_each_entry_safe(). Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
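The reason a "safe" iterator is needed when deleting can be shown with a minimal userspace sketch. It does not use the kernel's list.h; the `struct node` type and the helpers `push`, `remove_all` and `length` are hypothetical stand-ins. The point it illustrates is the one the commit relies on: the successor is saved before the current entry is freed, so the loop never reads through a deleted element.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked list node, standing in for a kernel list entry. */
struct node {
    int value;
    struct node *next;
};

/* Push a new node at the head and return the new head. */
static struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof(*n));

    assert(n != NULL);
    n->value = value;
    n->next = head;
    return n;
}

/* Remove every node holding `value`; returns how many were removed. */
static int remove_all(struct node **head, int value)
{
    struct node **link = head;   /* pointer that currently points at cur */
    struct node *cur, *next;
    int removed = 0;

    for (cur = *head; cur != NULL; cur = next) {
        next = cur->next;        /* saved before cur may be freed */
        if (cur->value == value) {
            *link = next;        /* unlink cur */
            free(cur);
            removed++;
        } else {
            link = &cur->next;
        }
    }
    return removed;
}

/* Count the nodes in the list. */
static int length(const struct node *head)
{
    int n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

A loop that advanced with `cur = cur->next` after `free(cur)` would read freed memory, which is the class of bug the non-safe iterator invites when entries are deleted inside the loop.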
\| * \| \|	ghes_edac: Fix RAS tracing	Mauro Carvalho Chehab	2013-02-25	1	-0/+13
	With the current version of CPER, there's no way to associate an error with the memory error. So, the error location in EDAC layers is unused.

	As CPER has its own idea about memory architectural layers, just output whatever is there inside the driver's detail at the RAS tracepoint.

	The EDAC location is kept untouched, in case, at some point in the future, we can actually map the error into the dimm labels.

	Now, the error message:

	[ 72.396625] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
	[ 72.396627] {1}[Hardware Error]: APEI generic hardware error status
	[ 72.396628] {1}[Hardware Error]: severity: 2, corrected
	[ 72.396630] {1}[Hardware Error]: section: 0, severity: 2, corrected
	[ 72.396632] {1}[Hardware Error]: flags: 0x01
	[ 72.396634] {1}[Hardware Error]: primary
	[ 72.396635] {1}[Hardware Error]: section_type: memory error
	[ 72.396637] {1}[Hardware Error]: error_status: 0x0000000000000400
	[ 72.396638] {1}[Hardware Error]: node: 3
	[ 72.396639] {1}[Hardware Error]: card: 0
	[ 72.396640] {1}[Hardware Error]: module: 0
	[ 72.396641] {1}[Hardware Error]: device: 0
	[ 72.396643] {1}[Hardware Error]: error_type: 18, unknown
	[ 72.396666] EDAC MC0: 1 CE reserved error (18) on unknown label (node:3 card:0 module:0 page:0x0 offset:0x0 grain:0 syndrome:0x0 - status(0x0000000000000400): Storage error in DRAM memory)

	Is properly represented on the trace event:

	kworker/0:2-584 [000] .... 72.396657: mc_event: 1 Corrected error: reserved error (18) on unknown label (mc:0 location:-1:-1:-1 address:0x00000000 grain:1 syndrome:0x00000000 APEI location: node:3 card:0 module:0 status(0x0000000000000400): Storage error in DRAM memory)

	Tested on a 4 sockets E5-4650 Sandy Bridge machine.

	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	ghes_edac: Make it compliant with UEFI spec 2.3.1	Mauro Carvalho Chehab	2013-02-25	1	-15/+180
	The UEFI spec defines the memory error types and the bits that validate each field on the memory error record, at Appendix N in items N.2.5 (Memory Error Section) and N.2.11 (Error Status).

	Make the error description compliant with it, only showing the valid fields.

	The EDAC error log is now properly reporting the error:

	[ 281.556854] mce: [Hardware Error]: Machine check events logged
	[ 281.557042] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
	[ 281.557044] {2}[Hardware Error]: APEI generic hardware error status
	[ 281.557046] {2}[Hardware Error]: severity: 2, corrected
	[ 281.557048] {2}[Hardware Error]: section: 0, severity: 2, corrected
	[ 281.557050] {2}[Hardware Error]: flags: 0x01
	[ 281.557052] {2}[Hardware Error]: primary
	[ 281.557053] {2}[Hardware Error]: section_type: memory error
	[ 281.557055] {2}[Hardware Error]: error_status: 0x0000000000000400
	[ 281.557056] {2}[Hardware Error]: node: 3
	[ 281.557057] {2}[Hardware Error]: card: 0
	[ 281.557058] {2}[Hardware Error]: module: 1
	[ 281.557059] {2}[Hardware Error]: device: 0
	[ 281.557061] {2}[Hardware Error]: error_type: 18, unknown
	[ 281.557067] EDAC DEBUG: ghes_edac_report_mem_error: error validation_bits: 0x000040b9
	[ 281.557084] EDAC MC0: 1 CE reserved error (18) on unknown label (node:3 card:0 module:1 page:0x0 offset:0x0 grain:0 syndrome:0x0 - status(0x0000000000000400): Storage error in DRAM memory)

	Tested on a 4 CPUs E5-4650 Sandy Bridge machine.

	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	ghes_edac: Improve driver's printk messages	Mauro Carvalho Chehab	2013-02-25	1	-10/+26
	Provide a better infrastructure for printk's inside the driver:
	 - use edac_dbg() for debug messages;
	 - standardize the usage of pr_info();
	 - provide warning about the risk of relying on this driver.

	While here, changes the size of a fake memory to 1 page. This is as good or as bad as 1000 pages, but it is easier for userspace to detect, as I don't expect that any machine implementing GHES would provide just 1 page available ;)

	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

	Conflicts:
	  drivers/edac/ghes_edac.c
\| * \| \|	ghes_edac: Don't credit the same memory dimm twice	Mauro Carvalho Chehab	2013-02-25	1	-4/+13
	In my tests on a system with 4 E5-4650 CPUs, the GHES EDAC driver is called twice.

	As the SMBIOS DMI enumeration call will look for all the DIMM sockets in the system, on this machine, equipped with 128 GB of RAM, the memory is displayed twice:

	          +-----------------------+
	          |    mc0    |    mc1    |
	----------+-----------------------+
	memory45: |  8192 MB  |  8192 MB  |
	memory44: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory43: |     0 MB  |     0 MB  |
	memory42: |  8192 MB  |  8192 MB  |
	----------+-----------------------+
	memory41: |     0 MB  |     0 MB  |
	memory40: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory39: |  8192 MB  |  8192 MB  |
	memory38: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory37: |     0 MB  |     0 MB  |
	memory36: |  8192 MB  |  8192 MB  |
	----------+-----------------------+
	memory35: |     0 MB  |     0 MB  |
	memory34: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory33: |  8192 MB  |  8192 MB  |
	memory32: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory31: |     0 MB  |     0 MB  |
	memory30: |  8192 MB  |  8192 MB  |
	----------+-----------------------+
	memory29: |     0 MB  |     0 MB  |
	memory28: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory27: |  8192 MB  |  8192 MB  |
	memory26: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory25: |     0 MB  |     0 MB  |
	memory24: |  8192 MB  |  8192 MB  |
	----------+-----------------------+
	memory23: |     0 MB  |     0 MB  |
	memory22: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory21: |  8192 MB  |  8192 MB  |
	memory20: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory19: |     0 MB  |     0 MB  |
	memory18: |  8192 MB  |  8192 MB  |
	----------+-----------------------+
	memory17: |     0 MB  |     0 MB  |
	memory16: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory15: |  8192 MB  |  8192 MB  |
	memory14: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory13: |     0 MB  |     0 MB  |
	memory12: |  8192 MB  |  8192 MB  |
	----------+-----------------------+
	memory11: |     0 MB  |     0 MB  |
	memory10: |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory9:  |  8192 MB  |  8192 MB  |
	memory8:  |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory7:  |     0 MB  |     0 MB  |
	memory6:  |  8192 MB  |  8192 MB  |
	----------+-----------------------+
	memory5:  |     0 MB  |     0 MB  |
	memory4:  |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory3:  |  8192 MB  |  8192 MB  |
	memory2:  |     0 MB  |     0 MB  |
	----------+-----------------------+
	memory1:  |     0 MB  |     0 MB  |
	memory0:  |  8192 MB  |  8192 MB  |
	----------+-----------------------+

	Total sum of 256 GB.

	As there's no reliable way to credit DIMMs to the right memory controller, just put everything on memory controller 0 (which should always exist).

	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	ghes_edac: do a better job of filling EDAC DIMM info	Mauro Carvalho Chehab	2013-02-25	1	-12/+180
	Instead of just faking a random value for the DIMM data, get the information that is available via the DMI table.

	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	ghes_edac: add support for reporting errors via EDAC	Mauro Carvalho Chehab	2013-02-25	1	-2/+54
	Now that the EDAC core is capable of just forwarding the errors via the userspace API, add a report mechanism for the GHES errors.

	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	ghes_edac: Register at EDAC core the BIOS report	Mauro Carvalho Chehab	2013-02-25	4	-0/+145
	Register GHES at the EDAC MC core, in order to prevent other drivers from also handling errors and mangling the error data.

	The EDAC core will guarantee that just one driver is used, so the first one to register (BIOS first) will be the one that reports the hardware errors.

	For now, the EDAC driver does nothing but register at the EDAC core, preventing the hardware-driven mechanism from interfering with GHES.

	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	ghes: add the needed hooks for EDAC error report	Mauro Carvalho Chehab	2013-02-25	2	-5/+46
	In order to allow reporting errors via EDAC, add hooks for:

	1) register an EDAC driver;
	2) unregister an EDAC driver;
	3) report errors via EDAC.

	As the EDAC driver will need to access the ghes structure, add it as one of the parameters for ghes_do_proc.

	Acked-by: Huang Ying <ying.huang@intel.com>
	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	ghes: move structures/enum to a header file	Mauro Carvalho Chehab	2013-02-21	2	-45/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As a ghes_edac driver will need to access ghes structures, in order to properly handle the errors, move those structures to a separate header file. No functional changes. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	edac: add support for error type "Info"	Mauro Carvalho Chehab	2013-02-21	2	-3/+17
	The CPER spec defines a fourth type of error: informational logs. Add support for it at the edac API and at the trace event interface.

	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	edac: add support for raw error reports	Mauro Carvalho Chehab	2013-02-21	2	-22/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	That allows APEI GHES driver to report errors directly, using the EDAC error report API. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	edac: reduce stack pressure by using a pre-allocated buffer	Mauro Carvalho Chehab	2013-02-21	2	-33/+104
	The number of variables on the stack is too big. Reduce the stack usage by using a pre-allocated error buffer.

	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	edac: lock module owner to avoid error report conflicts	Mauro Carvalho Chehab	2013-02-21	1	-4/+21
	APEI GHES and i7core_edac/sb_edac currently can be loaded at the same time, but those are Highlander modules: "There can be only one".

	There are two reasons for that:

	1) Each driver assumes that it is the only one registering at the EDAC core, as it is the driver's responsibility to number the memory controllers, and all of them start from 0;

	2) If BIOS is handling the memory errors, the OS can't also be doing it, as one will interfere with the other.

	So, we need to add a module owner's lock at the EDAC core, in order to avoid having two different modules handling memory errors at the same time. The best way of implementing this lock seems to be to use the driver's name, as this is unique, and won't require changes in every driver.

	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
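The name-based owner lock described above can be sketched in userspace C. This is only an illustration of the idea, not the EDAC core's implementation: `claim_owner`, `release_owner` and the static `owner` variable are hypothetical names, there is no locking against concurrent callers, and releasing on the first unregister is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Name of the module that currently owns error reporting; NULL if none. */
static const char *owner;

/*
 * Hypothetical claim routine: returns 1 if mod_name now owns (or already
 * owned) the core, 0 if a different module holds it.
 */
static int claim_owner(const char *mod_name)
{
    assert(mod_name != NULL);

    if (owner != NULL && strcmp(owner, mod_name) != 0)
        return 0;            /* another module got there first */

    owner = mod_name;        /* same module may register several controllers */
    return 1;
}

/* Drop ownership, but only if mod_name is the current owner. */
static void release_owner(const char *mod_name)
{
    assert(mod_name != NULL);

    if (owner != NULL && strcmp(owner, mod_name) == 0)
        owner = NULL;
}
```

Comparing names rather than storing a per-driver flag matches the rationale in the commit message: the name is already unique per driver, so no individual driver needs to change.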
\| * \| \|	edac: remove proc_name from mci structure	Mauro Carvalho Chehab	2013-02-21	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	proc_name isn't used anywhere. Remove it. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	edac: add a new memory layer type	Mauro Carvalho Chehab	2013-02-21	2	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are some cases where the memory controller layout is completely hidden. This is the case of firmware-driven error code, like the one provided by GHES. Add a new layer to be used on such memory error report mechanisms. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	edac: initialize the core earlier	Mauro Carvalho Chehab	2013-02-21	1	-1/+1
	In order for it to work when built in, the EDAC core should be initialized earlier, otherwise the ghes_edac driver initializes before edac_mc_sysfs_init() is called:

	...
	[ 4.998373] EDAC MC0: Giving out device to 'ghes_edac.c' 'ghes_edac': DEV ghes
	...
	[ 4.998373] EDAC MC1: Giving out device to 'ghes_edac.c' 'ghes_edac': DEV ghes
	[ 6.519495] EDAC MC: Ver: 3.0.0
	[ 6.523749] EDAC DEBUG: edac_mc_sysfs_init: device mc created

	The net result is that no EDAC sysfs nodes will appear.

	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	edac: better report error conditions in debug mode	Mauro Carvalho Chehab	2013-02-21	1	-1/+6
\| \| \| \|	It is hard to find what's wrong without a proper error report. Improve it, in debug mode. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	i5100_edac: Remove two checkpatch warnings	Mauro Carvalho Chehab	2013-02-21	1	-4/+2
\| \| \| \|	The last changeset introduced a few checkpatch warnings:
	WARNING: debugfs_remove_recursive(NULL) is safe this check is probably not required
	261: FILE: drivers/edac/i5100_edac.c:1207:
	+ if (priv->debugfs)
	+ debugfs_remove_recursive(priv->debugfs);
	WARNING: debugfs_remove(NULL) is safe this check is probably not required
	290: FILE: drivers/edac/i5100_edac.c:1250:
	+ if (i5100_debugfs)
	+ debugfs_remove(i5100_debugfs);
	Get rid of them.
	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	i5100_edac: connect fault injection to debugfs node	Niklas Söderlund	2013-02-21	1	-1/+70
\| \| \| \|	Create a debugfs directory i5100_edac/mcX for each memory controller and add nodes to control how fault injection is performed. After configuring an injection using inject_channel, inject_deviceptr1, inject_deviceptr2, inject_eccmask1, inject_eccmask2 and inject_hlinesel, trigger the injection by writing anything to inject_enable.
	Example of a CE injection:
	echo 0 > /sys/kernel/debug/i5100_edac/mc0/inject_channel
	echo 1 > /sys/kernel/debug/i5100_edac/mc0/inject_hlinesel
	echo 61440 > /sys/kernel/debug/i5100_edac/mc0/inject_eccmask1
	echo 1 > /sys/kernel/debug/i5100_edac/mc0/inject_enable
	Example of a UE injection:
	echo 0 > /sys/kernel/debug/i5100_edac/mc0/inject_channel
	echo 2 > /sys/kernel/debug/i5100_edac/mc0/inject_hlinesel
	echo 65535 > /sys/kernel/debug/i5100_edac/mc0/inject_eccmask1
	echo 65535 > /sys/kernel/debug/i5100_edac/mc0/inject_eccmask2
	echo 17 > /sys/kernel/debug/i5100_edac/mc0/inject_deviceptr1
	echo 0 > /sys/kernel/debug/i5100_edac/mc0/inject_deviceptr2
	echo 1 > /sys/kernel/debug/i5100_edac/mc0/inject_enable
	Sometimes the injection needs to be enabled more than once (echo to the inject_enable node) for it to happen; I am not sure why.
	Signed-off-by: Niklas Söderlund <niklas.soderlund@ericsson.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
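The CE injection sequence from this message could also be driven from a small user-space program instead of a shell. The following is only a sketch under assumptions: `write_node()` and `inject_ce()` are hypothetical helpers, not part of the driver, and the directory passed in is expected to be the controller's debugfs directory (for example /sys/kernel/debug/i5100_edac/mc0), which requires debugfs to be mounted and sufficient privileges.

```c
#include <stdio.h>

/* Hypothetical helper: write one decimal value to <dir>/<node>.
 * Returns 0 on success, -1 on any error. */
static int write_node(const char *dir, const char *node, unsigned long value)
{
	char path[512];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/%s", dir, node);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%lu\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Mirrors the CE example from the commit message: configure the
 * injection first, then trigger it by writing to inject_enable. */
static int inject_ce(const char *dir)
{
	if (write_node(dir, "inject_channel", 0) ||
	    write_node(dir, "inject_hlinesel", 1) ||
	    write_node(dir, "inject_eccmask1", 61440) ||
	    write_node(dir, "inject_enable", 1))
		return -1;
	return 0;
}
```

Since the message notes that the injection sometimes has to be enabled more than once, a caller may want to repeat the final `inject_enable` write if no error is reported.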
\| * \| \|	i5100_edac: add fault injection code	Niklas Söderlund	2013-02-21	1	-0/+87
\| \| \| \|	Add fault injection based on information from the i5100 datasheet, see [1]. In addition to the i5100 datasheet, some missing information on injection functions was found through experimentation and in the i7300 datasheet, see [2].
	[1] Intel 5100 Memory Controller Hub Chipset Doc.Nr: 318378 http://www.intel.com/content/dam/doc/datasheet/5100-memory-controller-hub-chipset-datasheet.pdf
	[2] Intel 7300 Chipset Memory Controller Hub (MCH) Doc.Nr: 318082 http://www.intel.com/assets/pdf/datasheet/318082.pdf
	Signed-off-by: Niklas Söderlund <niklas.soderlund@ericsson.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	i5100_edac: probe for device 19 function 0	Niklas Söderlund	2013-02-21	2	-1/+28
\| \| \| \|	Probe and store the device handle for the device 19 function 0 during driver initialization. The device is used during fault injection. Signed-off-by: Niklas Söderlund <niklas.soderlund@ericsson.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	edac: only create sdram_scrub_rate where supported	Mauro Carvalho Chehab	2013-02-21	1	-10/+19
\| \| \| \|	Currently, the sdram_scrub_rate sysfs node is created even if the device doesn't support getting/setting the scrub rate. Change the logic to only create this device node when the operation is supported. Reported-by: Felipe Balbi <balbi@ti.com> Acked-by: Borislav Petkov <bp@suse.de> Reviewed-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
\| * \| \|	i3200_edac: Fix the logic that detects filled memories	Mauro Carvalho Chehab	2013-02-21	1	-9/+12
\| \| \| \|	After running a series of tests on an HP DL320 filled with different memory sizes, it was noticed that, when filled with just one DIMM, the driver wrongly detects twice the memory and thinks that both channels 0 and 1 are filled. It seems to be caused partly by the BIOS and partly by the driver. The current i3200_edac logic would work fine if the BIOS disabled the unused second channel when just one DIMM is connected, in order to save power, as recommended in this chipset's datasheet. However, the BIOS on this particular machine doesn't do it:
	[ 16.741421] EDAC DEBUG: how_many_channels: In dual channel mode
	[ 16.741424] EDAC DEBUG: how_many_channels: 2 DIMMS per channel enabled
	So, the driver was assuming that 2 channels are enabled (well, they are, but the second is unused). Combined with that, I found two issues in the logic that creates the EDAC data, which failed when the two channels are not equally filled (AFAICT, that happens only when just 1 DIMM is plugged in). The first one is that a 0 in DRB means that nothing is filled. The driver's logic, however, does some calculation with that. The second one is that the logic that fills the DIMM data currently assumes that both channels are equally filled. I have already tested the system with the current configuration and my patch, and it is now working fine.
	So, for a 2R single DIMM 2Gb memory at dimm slot 01 (channel 0), it is now displaying:
	[ 16.741406] EDAC DEBUG: i3200_get_drbs: drb[0][0] = 16, drb[1][0] = 0
	[ 16.741410] EDAC DEBUG: i3200_get_drbs: drb[0][1] = 32, drb[1][1] = 0
	[ 16.741413] EDAC DEBUG: i3200_get_drbs: drb[0][2] = 32, drb[1][2] = 0
	[ 16.741416] EDAC DEBUG: i3200_get_drbs: drb[0][3] = 32, drb[1][3] = 0
	...
	[ 16.741896] EDAC DEBUG: i3200_probe1: csrow 0, channel 0, size = 1024 Mb
	[ 16.741899] EDAC DEBUG: i3200_probe1: csrow 1, channel 0, size = 1024 Mb
	and the corresponding sysfs nodes are now properly filled.
	Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
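To make the relation between the logged DRB values and the reported csrow sizes concrete, here is a small sketch of the per-rank calculation. It is not the driver's code: `rank_size_mb()` is a hypothetical helper, it treats the DRB values as cumulative rank boundaries within one channel, and the 64 MB unit is an assumption inferred from the log above (a DRB of 16 matching a 1024 Mb csrow).

```c
#include <stddef.h>

#define NUM_RANKS   4
#define DRB_UNIT_MB 64	/* assumption: one DRB unit is 64 MB */

/* Size in MB of one rank, given the cumulative DRB values of its channel. */
static unsigned long rank_size_mb(const unsigned int drb[NUM_RANKS], size_t rank)
{
	unsigned int n = drb[rank];

	if (n == 0)			/* a DRB of 0 means nothing is filled */
		return 0;
	if (rank > 0)
		n -= drb[rank - 1];	/* boundaries are cumulative */
	return (unsigned long)n * DRB_UNIT_MB;
}
```

With the channel 0 values from the log (16, 32, 32, 32) this gives 1024 MB for ranks 0 and 1 and nothing for ranks 2 and 3, while the all-zero channel 1 contributes no memory at all, matching the two 1024 Mb csrows reported on channel 0.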
\| * \| \|	i3200_edac: Add more debug to the driver	Mauro Carvalho Chehab	2013-02-21	1	-2/+14
\| \| \| \|	Currently, it is not possible to know, when debug is enabled, whether the driver is using 2 DIMMS per channel mode or not. Nor is it possible to know the values of the drbs registers, used to identify the memory rank sizes. Add debug for both, as it helps to track issues in the driver. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>