summaryrefslogtreecommitdiffstats
path: root/drivers/edac
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'mce-for-linus' of ↵Linus Torvalds2011-01-074-118/+359
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, MCE: Fix NB error formatting EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit EDAC, MCE: Enable MCE decoding on F15h EDAC, MCE: Allow F15h bank 6 MCE injection EDAC, MCE: Shorten error report formatting EDAC, MCE: Overhaul error fields extraction macros EDAC, MCE: Add F15h FP MCE decoder EDAC, MCE: Add F15 EX MCE decoder EDAC, MCE: Add an F15h NB MCE decoder EDAC, MCE: No F15h LS MCE decoder EDAC, MCE: Add F15h CU MCE decoder EDAC, MCE: Add F15h IC MCE decoder EDAC, MCE: Add F15h DC MCE decoder EDAC, MCE: Select extended error code mask
| * EDAC, MCE: Fix NB error formattingBorislav Petkov2011-01-071-7/+10
| | | | | | | | | | | | | | Minor formatting fixup since the information which core was associated with the MCE is not always valid. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bitRandy Dunlap2011-01-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Building for X86_32 produces shift count warnings, so use BIT_64() to eliminate the warnings. drivers/edac/mce_amd.c:778: warning: left shift count >= width of type drivers/edac/mce_amd.c:778: warning: left shift count >= width of type Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: bluesmoke-devel@lists.sourceforge.net Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Enable MCE decoding on F15hBorislav Petkov2011-01-071-6/+8
| | | | | | | | | | | | | | Now that everything is inplace, enable MCE decoding on F15h. Make initcall routine a bit more readable. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Allow F15h bank 6 MCE injectionBorislav Petkov2011-01-071-4/+5
| | | | | | | | | | | | | | F15h adds a sixth MCE bank: adjust bank number check in the injection code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Shorten error report formattingBorislav Petkov2011-01-071-22/+32
| | | | | | | | | | | | | | Shorten up MCi_STATUS flags and add BD's new deferred and poison types. Also, simplify formatting. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Overhaul error fields extraction macrosBorislav Petkov2011-01-073-54/+43
| | | | | | | | | | | | Make macro names shorter thus making code shorter and more clear. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Add F15h FP MCE decoderBorislav Petkov2011-01-071-0/+44
| | | | | | | | | | | | Add decoder for FP MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Add F15 EX MCE decoderBorislav Petkov2011-01-071-7/+34
| | | | | | | | | | | | | | Integrate the single FIROB signature into an expanded table along with the new BD MCE types. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Add an F15h NB MCE decoderBorislav Petkov2011-01-071-0/+10
| | | | | | | | | | | | by (almost) reusing the F10h one since the signatures are the same. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: No F15h LS MCE decoderBorislav Petkov2011-01-071-1/+1
| | | | | | | | | | | | F15h BD doesn't generate LS MCEs so warn about it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Add F15h CU MCE decoderBorislav Petkov2011-01-071-1/+61
| | | | | | | | | | | | | | MCE bank 2 is redefined from a BU to a CU (Combined Unit) bank on F15h. Add a decoder function for CU MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Add F15h IC MCE decoderBorislav Petkov2011-01-072-4/+51
| | | | | | | | | | | | Add support for decoding F15h IC MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Add F15h DC MCE decoderBorislav Petkov2011-01-072-19/+62
| | | | | | | | | | | | | | Add a decoder for F15h DC MCEs to support the new types of DC MCEs introduced by the BD microarchitecture. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * EDAC, MCE: Select extended error code maskBorislav Petkov2011-01-071-4/+9
| | | | | | | | | | | | | | | | F15h enlarges the extended error code of an MCE to a 5-bit field (MCi_STATUS[20:16]). Add a mask variable which default 0xf is overridden on F15h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Disable DRAM ECC injection on K8Borislav Petkov2011-01-071-2/+3
| | | | | | | | | | | | | | K8 does not allow for an atomic RMW to a cacheline as F10h does so disable the error injection interface for it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | EDAC: Fixup scrubrate manipulationBorislav Petkov2011-01-077-63/+52
| | | | | | | | | | | | | | | | | | Make the ->{get|set}_sdram_scrub_rate return the actual scrub rate bandwidth it succeeded setting and remove superfluous arg pointer used for that. A negative value returned still means that an error occurred while setting the scrubrate. Document this for future reference. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Remove two-stage initializationBorislav Petkov2011-01-071-100/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that all prerequisites are in place, drop the two-stage driver instances initialization in favor of the following simple init sequence: 1. Probe PCI device: we only test ECC capabilities here and if none exit early. 2. If the hw supports ECC and it is/can be enabled, we init the per-node instance. Remove "amd64_" prefix from static functions touched, while at it. There actually should be no visible functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Check ECC capabilities initiallyBorislav Petkov2011-01-071-66/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rework the code to check the hardware ECC capabilities at PCI probing time. We do all further initialization only if we actually can/have ECC enabled. While at it: 0. Fix function naming. 1. Simplify/clarify debug output. 2. Remove amd64_ prefix from the static functions 3. Reorganize code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Carve out ECC-related hw settingsBorislav Petkov2011-01-072-24/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is in preparation for the init path reorganization where we want only to 1) test whether a particular node supports ECC 2) can it be enabled and only then do the necessary allocation/initialization. For that, we need to decouple the ECC settings of the node from the instance's descriptor. The should be no functional change introduced by this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Remove PCI ECS enabling functionsBorislav Petkov2011-01-072-59/+0
| | | | | | | | | | | | | | PCI ECS is being enabled by default since 2.6.26 on AMD so this code is just superfluous now, remove it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Remove explicit Kconfig PCI dependencyBorislav Petkov2011-01-071-4/+4
| | | | | | | | | | | | AMD_NB pulls in the dependency on PCI. Clarify/fix help text while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Allocate driver instances dynamicallyBorislav Petkov2011-01-072-18/+29
| | | | | | | | | | | | | | | | | | Remove static allocation in favor of dynamically allocating space for as many driver instances as northbridges present on the system. There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Rework printk macrosBorislav Petkov2011-01-075-107/+87
| | | | | | | | | | | | | | Add a macro per printk level, shorten up error messages. Add relevant information to KERN_INFO level. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Rename CPU PCI devicesBorislav Petkov2011-01-073-95/+77
| | | | | | | | | | | | | | Rename variables representing PCI devices to their BKDG names for faster search and shorter, clearer code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Concentrate per-family init even moreBorislav Petkov2011-01-072-29/+17
| | | | | | | | | | | | | | | | Move the remaining per-family init code into the proper place and simplify the rest of the initialization. Reorganize error handling in amd64_init_one_instance(). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Cleanup the CPU PCI device reservationBorislav Petkov2011-01-071-30/+12
| | | | | | | | | | | | Shorten code and clarify comments, return proper -E* values on error. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Simplify CPU family detectionBorislav Petkov2011-01-072-32/+32
| | | | | | | | | | | | Concentrate CPU family detection in the per-family init function. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Add per-family init functionBorislav Petkov2011-01-072-19/+28
| | | | | | | | | | | | | | | | Run a per-family init function which does all the settings based on the family this driver instance is running on. Move the scrubrate calculation in it and simplify code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Use cached extended CPU modelBorislav Petkov2011-01-071-3/+2
| | | | | | | | | | | | ... instead of computing it needlessly again. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Remove F11h supportBorislav Petkov2011-01-072-49/+3
| | | | | | | | | | | | F11h doesn't support DRAM ECC so whack it away. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | Merge branch 'x86-amd-nb-for-linus' of ↵Linus Torvalds2011-01-061-2/+2
|\ \ | |/ |/| | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cacheinfo: Cleanup L3 cache index disable support x86, amd-nb: Cleanup AMD northbridge caching code x86, amd-nb: Complete the rename of AMD NB and related code
| * x86, amd-nb: Cleanup AMD northbridge caching codeHans Rosenfeld2010-11-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Support more than just the "Misc Control" part of the northbridges. Support more flags by turning "gart_supported" into a single bit flag that is stored in a flags member. Clean up related code by using a set of functions (amd_nb_num(), amd_nb_has_feature() and node_to_amd_nb()) instead of accessing the NB data structures directly. Reorder the initialization code and put the GART flush words caching in a separate function. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
| * x86, amd-nb: Complete the rename of AMD NB and related codeHans Rosenfeld2010-11-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Not only the naming of the files was confusing, it was even more so for the function and variable names. Renamed the K8 NB and NUMA stuff that is also used on other AMD platforms. This also renames the CONFIG_K8_NUMA option to CONFIG_AMD_NUMA and the related file k8topology_64.c to amdtopology_64.c. No functional changes intended. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Fix interleaving checkBorislav Petkov2010-12-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When matching error address to the range contained by one memory node, we're in valid range when node interleaving 1. is disabled, or 2. enabled and when the address bits we interleave on match the interleave selector on this node (see the "Node Interleaving" section in the BKDG for an enlightening example). Thus, when we early-exit, we need to reverse the compound logic statement properly. Cc: <stable@kernel.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | EDAC: Correct MiB_TO_PAGES() macroAndrei Konovalov2010-12-081-3/+3
| | | | | | | | | | | | | | | | | | | | This corrects the misprint introduced when moving '#if PAGE_SHIFT' from i7core_edac.c to edac_core.h (commit e9144601d364d5b81f3e63949337f8507eb58dca) Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Andrei Konovalov <akonovalov@mvista.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | EDAC: Fix workqueue-related crashesBorislav Petkov2010-12-081-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00740c58541b6087d78418cebca1fcb86dc6077d changed edac_core to un-/register a workqueue item only if a lowlevel driver supplies a polling routine. Normally, when we remove a polling low-level driver, we go and cancel all the queued work. However, the workqueue unreg happens based on the ->op_state setting, and edac_mc_del_mc() sets this to OP_OFFLINE _before_ we cancel the work item, leading to NULL ptr oops on the workqueue list. Fix it by putting the unreg stuff in proper order. Cc: <stable@kernel.org> #36.x Reported-and-tested-by: Tobias Karnat <tobias.karnat@googlemail.com> LKML-Reference: <1291201307.3029.21.camel@Tobias-Karnat> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | EDAC, MCE: Fix edac_init_mce_inject error handlingAxel Lin2010-11-221-1/+1
| | | | | | | | | | | | | | | | Otherwise, variable i will be -1 inside the latest iteration of the while loop. Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | EDAC: Remove deprecated kbuild goal definitionsTracey Dent2010-11-221-4/+4
|/ | | | | | | | | | | | Change EDAC's Makefile to use <modules>-y instead of <modules>-objs because -objs is deprecated and not mentioned in Documentation/kbuild/makefiles.txt. [bp: Fixup commit message] [bp: Fixup indentation] Signed-off-by: Tracey Dent <tdent48227@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* Merge branch 'linux_next' of ↵Linus Torvalds2010-10-264-224/+313
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: (34 commits) i7core_edac: return -ENODEV when devices were already probed i7core_edac: properly terminate pci_dev_table i7core_edac: Avoid PCI refcount to reach zero on successive load/reload i7core_edac: Fix refcount error at PCI devices i7core_edac: it is safe to i7core_unregister_mci() when mci=NULL i7core_edac: Fix an oops at i7core probe i7core_edac: Remove unused member channels in i7core_pvt i7core_edac: Remove unused arg csrow from get_dimm_config i7core_edac: Reduce args of i7core_register_mci i7core_edac: Introduce i7core_unregister_mci i7core_edac: Use saved pointers i7core_edac: Check probe counter in i7core_remove i7core_edac: Call pci_dev_put() when alloc_i7core_dev() failed i7core_edac: Fix error path of i7core_register_mci i7core_edac: Fix order of lines in i7core_register_mci i7core_edac: Always do get/put for all devices i7core_edac: Introduce i7core_pci_ctl_create/release i7core_edac: Introduce free_i7core_dev i7core_edac: Introduce alloc_i7core_dev i7core_edac: Reduce args of i7core_get_onedevice ...
| * i7core_edac: return -ENODEV when devices were already probedMauro Carvalho Chehab2010-10-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Due to the nature of i7core, we need to probe and attach all PCI devices used by this driver during the first time probe is called. However, PCI core will call the probe routine one time for each CPU socket. If we return -EINVAL to those calls, it would seem that the driver fails, when, in fact, there's no more devices left to initialize. Changing the return code to -ENODEV solves this issue. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * i7core_edac: properly terminate pci_dev_tableMauro Carvalho Chehab2010-10-241-4/+5
| | | | | | | | | | | | | | | | | | At pci_xeon_fixup(), it waits for a null-terminated table, while at i7core_get_all_devices, it just do a for 0..ARRAY_SIZE. As other tables are zero-terminated, change it to be terminate with 0 as well, and fixes a bug where it may be running out of the table elements. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * i7core_edac: Avoid PCI refcount to reach zero on successive load/reloadMauro Carvalho Chehab2010-10-241-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | That's a nasty bug that took me a lot of time to track, and whose solution took just one line to solve. The best fragrances and the worse poisons are shipped on the smalest bottles. The drivers/pci/quick.c implements the pci_get_device function. The normal behavior is that you call it, the function returns you a pdev pointer and increment pdev->kobj.kref.refcount of the pci device. However, if you want to keep searching an object, you need to pass the previous pdev function to the search. When you use a not null pointer to pdev "from" field, pci_get_device will decrement pdev->kobj.kref.refcount, assuming that the driver won't be using the previous pdev. The solution is simple: we just need to call pci_dev_get() manually, for the pdev's that the driver will actually use. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * i7core_edac: Fix refcount error at PCI devicesMauro Carvalho Chehab2010-10-241-35/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Probably due to a bug or some testing logic at PCI level, device refcount for <bus>:00.0 device is decremented at the end of the pci_get_device, made by i7core_get_all_devices(). The fact is that the first versions of the driver relied on those devices to probe for Nehalem, but the current versions don't use it at all. So, let's just remove those devices from the driver, making it simpler and fixing the bug. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * i7core_edac: it is safe to i7core_unregister_mci() when mci=NULLMauro Carvalho Chehab2010-10-241-8/+5
| | | | | | | | | | | | | | i7core_unregister_mci() checks internally when mci=NULL. There's no need to test it outside. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * i7core_edac: Fix an oops at i7core probeMauro Carvalho Chehab2010-10-241-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | changeset c91d57ba9ce5b5c93a7077e2f72510eb1f9131c4 moved the init of the priv pointer to the end of the probe routine. However, we need them before that, otherwise, we hit an OOPS: [ 67.743453] EDAC DEBUG: mci_bind_devs: Associated fn 0.0, dev = ffff88011b46e000, socket 0 [ 67.751861] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 [ 67.759685] IP: [<ffffffffa017e484>] i7core_probe+0x979/0x130c [i7core_edac] [ 67.766721] PGD 10bd38067 PUD 10bd37067 PMD 0 [ 67.771178] Oops: 0000 [#1] SMP [ 67.774414] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map [ 67.782213] CPU 1 [ 67.784042] Modules linked in: i7core_edac(+) edac_core cpufreq_ondemand binfmt_misc dm_multipath video output pci_slot snd_hda_codd Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * i7core_edac: Remove unused member channels in i7core_pvtHidetoshi Seto2010-10-241-2/+0
| | | | | | | | | | Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * i7core_edac: Remove unused arg csrow from get_dimm_configHidetoshi Seto2010-10-241-7/+7
| | | | | | | | | | | | | | A local is enough. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * i7core_edac: Reduce args of i7core_register_mciHidetoshi Seto2010-10-241-15/+9
| | | | | | | | | | | | | | We can check the number of channels in i7core_register_mci. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * i7core_edac: Introduce i7core_unregister_mciHidetoshi Seto2010-10-241-32/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In i7core_probe, when setup of mci for 2nd or later socket failed, we should cleanup prepared mci for 1st socket or so before "put" of all devices. So let have i7core_unregister_mci that can be shared between here and i7core_remove. While here fix a typo "hanler". Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
OpenPOWER on IntegriCloud