summaryrefslogtreecommitdiffstats
path: root/arch/x86
Commit message (Collapse)AuthorAgeFilesLines
* x86/mid: Remove Intel MoorestownAlan Cox2012-01-266-1115/+26
| | | | | | | | | | | | All production devices operate in the Oaktrail configuration with legacy PC elements present and an ACPI BIOS. Continue stripping out the Moorestown elements from the tree leaving Medfield. Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: jacob.jun.pan@linux.intel.com Link: http://lkml.kernel.org/n/tip-fvm1hgpq99jln6l0fbek68ik@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86/mrst: Set ISA bus type for fake MP IRQsJacob Pan2012-01-261-2/+2
| | | | | | | | | | | We use MP IRQs for SFI presented timer interrupts, we should also set mp_bus_not_pci for MP_ISA_BUS so that pin_2_irq mapping is correct. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Dirk Brandewie <dirk.brandewie@gmail.com> Link: http://lkml.kernel.org/n/tip-8h3rc1igpp8ir94aas69qmhk@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86/ioapic: Use legacy_pic to set correct gsi-irq mappingJacob Pan2012-01-261-2/+2
| | | | | | | | | | | | | Using compile time NR_LEGACY_IRQS causes the wrong gsi-irq mapping on non-PC platforms, such as Moorestown. This patch uses legacy_pic abstraction to set the correct number of legacy interrupts at runtime. For Moorestown, nr_legacy_irqs = 0. We have 1:1 mapping for gsi-irq even within the legacy irq range. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Dirk Brandewie <dirk.brandewie@gmail.com> Link: http://lkml.kernel.org/n/tip-kzvj4xp9tmicuoqoh2w05iay@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2012-01-241-14/+22
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Davem says: 1) Fix JIT code generation on x86-64 for divide by zero, from Eric Dumazet. 2) tg3 header length computation correction from Eric Dumazet. 3) More build and reference counting fixes for socket memory cgroup code from Glauber Costa. 4) module.h snuck back into a core header after all the hard work we did to remove that, from Paul Gortmaker and Jesper Dangaard Brouer. 5) Fix PHY naming regression and add some new PCI IDs in stmmac, from Alessandro Rubini. 6) Netlink message generation fix in new team driver, should only advertise the entries that changed during events, from Jiri Pirko. 7) SRIOV VF registration and unregistration fixes, and also add a missing PCI ID, from Roopa Prabhu. 8) Fix infinite loop in tx queue flush code of brcmsmac, from Stanislaw Gruszka. 9) ftgmac100/ftmac100 build fix, missing interrupt.h include. 10) Memory leak fix in net/hyperv do_set_mutlicast() handling, from Wei Yongjun. 11) Off by one fix in netem packet scheduler, from Vijay Subramanian. 12) TCP loss detection fix from Yuchung Cheng. 13) TCP reset packet MD5 calculation uses wrong address, fix from Shawn Lu. 14) skge carrier assertion and DMA mapping fixes from Stephen Hemminger. 15) Congestion recovery undo performed at the wrong spot in BIC and CUBIC congestion control modules, fix from Neal Cardwell. 16) Ethtool ETHTOOL_GSSET_INFO is unnecessarily restrictive, from Michał Mirosław. 17) Fix triggerable race in ipv6 sysctl handling, from Francesco Ruggeri. 18) Statistics bug fixes in mlx4 from Eugenia Emantayev. 19) rds locking bug fix during info dumps, from your's truly. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits) rds: Make rds_sock_lock BH rather than IRQ safe. netprio_cgroup.h: dont include module.h from other includes net: flow_dissector.c missing include linux/export.h team: send only changed options/ports via netlink net/hyperv: fix possible memory leak in do_set_multicast() drivers/net: dsa/mv88e6xxx.c files need linux/module.h stmmac: added PCI identifiers llc: Fix race condition in llc_ui_recvmsg stmmac: fix phy naming inconsistency dsa: Add reporting of silicon revision for Marvell 88E6123/88E6161/88E6165 switches. tg3: fix ipv6 header length computation skge: add byte queue limit support mv643xx_eth: Add Rx Discard and Rx Overrun statistics bnx2x: fix compilation error with SOE in fw_dump bnx2x: handle CHIP_REVISION during init_one bnx2x: allow user to change ring size in ISCSI SD mode bnx2x: fix Big-Endianess in ethtool -t bnx2x: fixed ethtool statistics for MF modes bnx2x: credit-leakage fixup on vlan_mac_del_all macvlan: fix a possible use after free ...
| * net: bpf_jit: fix divide by 0 generationEric Dumazet2012-01-181-14/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several problems fixed in this patch : 1) Target of the conditional jump in case a divide by 0 is performed by a bpf is wrong. 2) Must 'generate' the full function prologue/epilogue at pass=0, or else we can stop too early in pass=1 if the proglen doesnt change. (if the increase of prologue/epilogue equals decrease of all instructions length because some jumps are converted to near jumps) 3) Change the wrong length detection at the end of code generation to issue a more explicit message, no need for a full stack trace. Reported-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| |
| \
| \
| \
*---. \ Merge branches 'sched-urgent-for-linus', 'perf-urgent-for-linus' and ↵Linus Torvalds2012-01-197-101/+438
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/accounting, proc: Fix /proc/stat interrupts sum * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tracepoints/module: Fix disabling tracepoints with taint CRAP or OOT x86/kprobes: Add arch/x86/tools/insn_sanity to .gitignore x86/kprobes: Fix typo transferred from Intel manual * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, syscall: Need __ARCH_WANT_SYS_IPC for 32 bits x86, tsc: Fix SMI induced variation in quick_pit_calibrate() x86, opcode: ANDN and Group 17 in x86-opcode-map.txt x86/kconfig: Move the ZONE_DMA entry under a menu x86/UV2: Add accounting for BAU strong nacks x86/UV2: Ack BAU interrupt earlier x86/UV2: Remove stale no-resources test for UV2 BAU x86/UV2: Work around BAU bug x86/UV2: Fix BAU destination timeout initialization x86/UV2: Fix new UV2 hardware by using native UV2 broadcast mode x86: Get rid of dubious one-bit signed bitfield
| | | * | x86, syscall: Need __ARCH_WANT_SYS_IPC for 32 bitsH. Peter Anvin2012-01-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In checkin 303395ac3bf3 x86: Generate system call tables and unistd_*.h from tables the feature macros in <asm/unistd.h> were unified between 32 and 64 bits. Unfortunately 32 bits requires __ARCH_WANT_SYS_IPC and this was inadvertently dropped. Reported-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/CALLzPKbeXN5gdngo8uYYU8mAow=XhrwBFBhKfG811f37BubQOg@mail.gmail.com
| | | * | Merge remote-tracking branch 'linus/master' into x86/urgentH. Peter Anvin2012-01-19153-4145/+8721
| | | |\ \ | |_|_|/ / |/| | | |
| | | * | x86, tsc: Fix SMI induced variation in quick_pit_calibrate()Linus Torvalds2012-01-171-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pit_expect_msb() returns success wrongly in the below SMI scenario: a. pit_verify_msb() has not yet seen the MSB transition. b. we are close to the MSB transition though and got a SMI immediately after returning from pit_verify_msb() which didn't see the MSB transition. PIT MSB transition has happened somewhere during SMI execution. c. returned from SMI and we noted down the 'tsc', saw the pit MSB change now and exited the loop to calculate 'deltatsc'. Instead of noting the TSC at the MSB transition, we are way off because of the SMI. And as the SMI happened between the pit_verify_msb() and before the 'tsc' is recorded in the for loop, 'delattsc' (d1/d2 in quick_pit_calibrate()) will be small and quick_pit_calibrate() will not notice this error. Depending on whether SMI disturbance happens while computing d1 or d2, we will see the TSC calibrated value smaller or bigger than the expected value. As a result, in a cluster we were seeing a variation of approximately +/- 20MHz in the calibrated values, resulting in NTP failures. [ As far as the SMI source is concerned, this is a periodic SMI that gets disabled after ACPI is enabled by the OS. But the TSC calibration happens before the ACPI is enabled. ] To address this, change pit_expect_msb() so that - the 'tsc' is the TSC in between the two reads that read the MSB change from the PIT (same as before) - the 'delta' is the difference in TSC from *before* the MSB changed to *after* the MSB changed. Now the delta is twice as big as before (it covers four PIT accesses, roughly 4us) and quick_pit_calibrate() will loop a bit longer to get the calibrated value with in the 500ppm precision. As the delta (d1/d2) covers four PIT accesses, actual calibrated result might be closer to 250ppm precision. As the loop now takes longer to stabilize, double MAX_QUICK_PIT_MS to 50. SMI disturbance will showup as much larger delta's and the loop will take longer than usual for the result to be with in the accepted precision. Or will fallback to slow PIT calibration if it takes more than 50msec. Also while we are at this, remove the calibration correction that aims to get the result to the middle of the error bars. We really don't know which direction to correct into, so remove it. Reported-and-tested-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1326843337.5291.4.camel@sbsiddha-mobl2 Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | | * | x86, opcode: ANDN and Group 17 in x86-opcode-map.txtUlrich Drepper2012-01-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Intel documentation at http://software.intel.com/file/36945 shows the ANDN opcode and Group 17 with encoding f2 and f3 encoding respectively. The current version of x86-opcode-map.txt shows them with f3 and f4. Unless someone can point to documentation which shows the currently used encoding the following patch be applied. Signed-off-by: Ulrich Drepper <drepper@gmail.com> Link: http://lkml.kernel.org/r/CAOPLpQdq5SuVo9=023CYhbFLAX9rONyjmYq7jJkqc5xwctW5eA@mail.gmail.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | | * | x86/kconfig: Move the ZONE_DMA entry under a menuRandy Dunlap2012-01-171-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the ZONE_DMA kconfig symbol under a menu item instead of having it listed before everything else in "make {xconfig | gconfig | nconfig | menuconfig}". This drops the first line of the top-level kernel config menu (in 3.2) below and moves it under "Processor type and features". [*] DMA memory allocation support General setup ---> [*] Enable loadable module support ---> [*] Enable the block layer ---> Processor type and features ---> Power management and ACPI options ---> Bus options (PCI etc.) ---> Executable file formats / Emulations ---> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: David Rientjes <rientjes@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-mm@kvack.org <linux-mm@kvack.org> Link: http://lkml.kernel.org/r/4F14811E.6090107@xenotime.net Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: David Rientjes <rientjes@google.com>
| | | * | x86/UV2: Add accounting for BAU strong nacksCliff Wickman2012-01-172-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds separate accounting of UV2 message "strong nack's" in the BAU statistics. Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/20120116212238.GF5767@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * | x86/UV2: Ack BAU interrupt earlierCliff Wickman2012-01-171-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves the ack of the BAU interrupt to the beginning of the interrupt handler so that there is less possibility of a lost interrupt and slower response to a shootdown message. Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/20120116212146.GE5767@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * | x86/UV2: Remove stale no-resources test for UV2 BAUCliff Wickman2012-01-171-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes an unnecessary test for a no-destination-resources-available condition that looks like a destination timeout in UV1, but is separately distinguishable in UV2. Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/20120116212050.GD5767@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * | x86/UV2: Work around BAU bugCliff Wickman2012-01-172-33/+254
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements a workaround for a UV2 hardware bug. The bug is a non-atomic update of a memory-mapped register. When hardware message delivery and software message acknowledge occur simultaneously the pending message acknowledge for the arriving message may be lost. This causes the sender's message status to stay busy. Part of the workaround is to not acknowledge a completed message until it is verified that no other message is actually using the resource that is mistakenly recorded in the completed message. Part of the workaround is to test for long elapsed time in such a busy condition, then handle it by using a spare sending descriptor. The stay-busy condition is eventually timed out by hardware, and then the original sending descriptor can be re-used. Most of that logic change is in keeping track of the current descriptor and the state of the spares. The occurrences of the workaround are added to the BAU statistics. Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/20120116211947.GC5767@sgi.com Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * | x86/UV2: Fix BAU destination timeout initializationCliff Wickman2012-01-171-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the call to enable_timeouts() forward so that BAU_MISC_CONTROL is initialized before using it in calculate_destination_timeout(). Fix the calculation of a BAU destination timeout for UV2 (in calculate_destination_timeout()). Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/20120116211848.GB5767@sgi.com Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * | x86/UV2: Fix new UV2 hardware by using native UV2 broadcast modeCliff Wickman2012-01-172-30/+151
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the use of the Broadcast Assist Unit on SGI Altix UV2 to the use of native UV2 mode on new hardware (not the legacy mode). UV2 native mode has a different format for a broadcast message. We also need quick differentiaton between UV1 and UV2. Signed-off-by: Cliff Wickman <cpw@sgi.com> Link: http://lkml.kernel.org/r/20120116211750.GA5767@sgi.com Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * | x86: Get rid of dubious one-bit signed bitfieldAnton Vorontsov2012-01-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This very noisy sparse warning appears on almost every file in the kernel: CHECK init/main.c arch/x86/include/asm/thread_info.h:43:55: error: dubious one-bit signed bitfield arch/x86/include/asm/thread_info.h:44:46: error: dubious one-bit signed bitfield Sparse is right and this patch changes sig_on_uaccess_error and uaccess_err flags to unsigned type and thus fixes the warning. Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com> Acked-by: Andy Lutomirski <luto@mit.edu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Dan Carpenter <error27@gmail.com> Link: http://lkml.kernel.org/r/20120111011146.GA30428@oksana.dev.rtsoft.ru Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | Merge branch 'tip/perf/urgent-2' of ↵Ingo Molnar2012-01-173-15/+18
| | |\ \ \ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent
| | * | | | x86/kprobes: Add arch/x86/tools/insn_sanity to .gitignorexiyou.wangcong@gmail.com2012-01-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After compiling the kernel, I got: % git status # On branch master # Untracked files: # (use "git add <file>..." to include in what will be committed) # # arch/x86/tools/insn_sanity nothing added to commit but untracked files present (use "git add" to track) it should be added to .gitignore. Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/1326628937-27609-1-git-send-email-xiyou.wangcong@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86/kprobes: Fix typo transferred from Intel manualUlrich Drepper2012-01-161-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The arch/x86/lib/x86-opcode-map.txt file [used by the kprobes instruction decoder] contains the line: af: SCAS/W/D/Q rAX,Xv This is what the Intel manuals show, but it's not correct. The 'X' stands for: Memory addressed by the DS:rSI register pair (for example, MOVS, CMPS, OUTS, or LODS). On the other hand 'Y' means (also see the ae byte entry for SCASB): Memory addressed by the ES:rDI register pair (for example, MOVS, CMPS, INS, STOS, or SCAS). Signed-off-by: Ulrich Drepper <drepper@gmail.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: yrl.pp-manager.tt@hitachi.com Link: http://lkml.kernel.org/r/CAOPLpQfytPyDEBF1Hbkpo7ovUerEsstVGxBr%3DEpDL-BKEMaqLA@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | uml: fix compile for x86-64Linus Torvalds2012-01-181-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Randy Dunlap reports that we get arch/x86/um/shared/sysdep/ptrace.h:7:20: error: redefinition of 'regs_return_value' arch/x86/um/shared/sysdep/ptrace.h:7:20: note: previous definition of 'regs_return_value' was here when compiling UML for x86-64. Stephen Rothwell root-caused it and says: "Caused by commit d7e7528bcd45 ("Audit: push audit success and retcode into arch ptrace.h") (another patch that was never in linux-next :-(). This file now needs protection against double inclusion." so let's do as the man says. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Analyzed-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | Merge branch 'release' of ↵Linus Torvalds2012-01-182-2/+6
|\ \ \ \ \ \ | |_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux This includes initial support for the recently published ACPI 5.0 spec. In particular, support for the "hardware-reduced" bit that eliminates the dependency on legacy hardware. APEI has patches resulting from testing on real hardware. Plus other random fixes. * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (52 commits) acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec intel_idle: Split up and provide per CPU initialization func ACPI processor: Remove unneeded variable passed by acpi_processor_hotadd_init V2 ACPI processor: Remove unneeded cpuidle_unregister_driver call intel idle: Make idle driver more robust intel_idle: Fix a cast to pointer from integer of different size warning in intel_idle ACPI: kernel-parameters.txt : Add intel_idle.max_cstate intel_idle: remove redundant local_irq_disable() call ACPI processor: Fix error path, also remove sysdev link ACPI: processor: fix acpi_get_cpuid for UP processor intel_idle: fix API misuse ACPI APEI: Convert atomicio routines ACPI: Export interfaces for ioremapping/iounmapping ACPI registers ACPI: Fix possible alignment issues with GAS 'address' references ACPI, ia64: Use SRAT table rev to use 8bit or 16/32bit PXM fields (ia64) ACPI, x86: Use SRAT table rev to use 8bit or 32bit PXM fields (x86/x86-64) ACPI: Store SRAT table revision ACPI, APEI, Resolve false conflict between ACPI NVS and APEI ACPI, Record ACPI NVS regions ACPI, APEI, EINJ, Refine the fix of resource conflict ...
| | | | | |
| | \ \ \ \
| | \ \ \ \
| | \ \ \ \
| *---. \ \ \ \ Merge branches 'einj', 'intel_idle', 'misc', 'srat' and 'turbostat-ivb' into ↵Len Brown2012-01-181-0/+4
| |\ \ \ \ \ \ \ | | | |_|_|_|/ / | | |/| | | | | | | | | | | | | release
| | | * | | | | ACPI, x86: Use SRAT table rev to use 8bit or 32bit PXM fields (x86/x86-64)Kurt Garloff2012-01-171-0/+4
| | |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In SRAT v1, we had 8bit proximity domain (PXM) fields; SRAT v2 provides 32bits for these. The new fields were reserved before. According to the ACPI spec, the OS must disregrard reserved fields. x86/x86-64 was rather inconsistent prior to this patch; it used 8 bits for the pxm field in cpu_affinity, but 32 bits in mem_affinity. This patch makes it consistent: Either use 8 bits consistently (SRAT rev 1 or lower) or 32 bits (SRAT rev 2 or higher). cc: x86@kernel.org Signed-off-by: Kurt Garloff <kurt@garloff.de> Signed-off-by: Len Brown <len.brown@intel.com>
| * | | | | | ACPI, Record ACPI NVS regionsHuang Ying2012-01-171-2/+2
| |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some firmware will access memory in ACPI NVS region via APEI. That is, instructions in APEI ERST/EINJ table will read/write ACPI NVS region. The original resource conflict checking in APEI code will check memory/ioport accessed by APEI via general resource management mechanism. But ACPI NVS region is marked as busy already, so that the false resource conflict will prevent APEI ERST/EINJ to work. To fix this, this patch record ACPI NVS regions, so that we can avoid request resources for memory region inside it. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
* | | | | | x86-32: Fix build failure with AUDIT=y, AUDITSYSCALL=nAl Viro2012-01-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | JONGMAN HEO reports: With current linus git (commit a25a2b84), I got following build error, arch/x86/kernel/vm86_32.c: In function 'do_sys_vm86': arch/x86/kernel/vm86_32.c:340: error: implicit declaration of function '__audit_syscall_exit' make[3]: *** [arch/x86/kernel/vm86_32.o] Error 1 OK, I can reproduce it (32bit allmodconfig with AUDIT=y, AUDITSYSCALL=n) It's due to commit d7e7528bcd45: "Audit: push audit success and retcode into arch ptrace.h". Reported-by: JONGMAN HEO <jongman.heo@samsung.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2012-01-176-34/+38
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit: (29 commits) audit: no leading space in audit_log_d_path prefix audit: treat s_id as an untrusted string audit: fix signedness bug in audit_log_execve_info() audit: comparison on interprocess fields audit: implement all object interfield comparisons audit: allow interfield comparison between gid and ogid audit: complex interfield comparison helper audit: allow interfield comparison in audit rules Kernel: Audit Support For The ARM Platform audit: do not call audit_getname on error audit: only allow tasks to set their loginuid if it is -1 audit: remove task argument to audit_set_loginuid audit: allow audit matching on inode gid audit: allow matching on obj_uid audit: remove audit_finish_fork as it can't be called audit: reject entry,always rules audit: inline audit_free to simplify the look of generic code audit: drop audit_set_macxattr as it doesn't do anything audit: inline checks for not needing to collect aux records audit: drop some potentially inadvisable likely notations ... Use evil merge to fix up grammar mistakes in Kconfig file. Bad speling and horrible grammar (and copious swearing) is to be expected, but let's keep it to commit messages and comments, rather than expose it to users in config help texts or printouts.
| * | | | | | audit: inline audit_syscall_entry to reduce burden on archsEric Paris2012-01-174-16/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Every arch calls: if (unlikely(current->audit_context)) audit_syscall_entry() which requires knowledge about audit (the existance of audit_context) in the arch code. Just do it all in static inline in audit.h so that arch's can remain blissfully ignorant. Signed-off-by: Eric Paris <eparis@redhat.com>
| * | | | | | audit: ia32entry.S sign extend error codes when calling 64 bit codeEric Paris2012-01-171-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the ia32entry syscall exit audit fastpath we have assembly code which calls __audit_syscall_exit directly. This code was, however, zeroes the upper 32 bits of the return code. It then proceeded to call code which expects longs to be 64bits long. In order to handle code which expects longs to be 64bit we sign extend the return code if that code is an error. Thus the __audit_syscall_exit function can correctly handle using the values in snprintf("%ld"). This fixes the regression introduced in 5cbf1565f29eb57a86a. Old record: type=SYSCALL msg=audit(1306197182.256:281): arch=40000003 syscall=192 success=no exit=4294967283 New record: type=SYSCALL msg=audit(1306197182.256:281): arch=40000003 syscall=192 success=no exit=-13 Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | Audit: push audit success and retcode into arch ptrace.hEric Paris2012-01-176-18/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The audit system previously expected arches calling to audit_syscall_exit to supply as arguments if the syscall was a success and what the return code was. Audit also provides a helper AUDITSC_RESULT which was supposed to simplify things by converting from negative retcodes to an audit internal magic value stating success or failure. This helper was wrong and could indicate that a valid pointer returned to userspace was a failed syscall. The fix is to fix the layering foolishness. We now pass audit_syscall_exit a struct pt_reg and it in turns calls back into arch code to collect the return value and to determine if the syscall was a success or failure. We also define a generic is_syscall_success() macro which determines success/failure based on if the value is < -MAX_ERRNO. This works for arches like x86 which do not use a separate mechanism to indicate syscall failure. We make both the is_syscall_success() and regs_return_value() static inlines instead of macros. The reason is because the audit function must take a void* for the regs. (uml calls theirs struct uml_pt_regs instead of just struct pt_regs so audit_syscall_exit can't take a struct pt_regs). Since the audit function takes a void* we need to use static inlines to cast it back to the arch correct structure to dereference it. The other major change is that on some arches, like ia64, MIPS and ppc, we change regs_return_value() to give us the negative value on syscall failure. THE only other user of this macro, kretprobe_example.c, won't notice and it makes the value signed consistently for the audit functions across all archs. In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old audit code as the return value. But the ptrace_64.h code defined the macro regs_return_value() as regs[3]. I have no idea which one is correct, but this patch now uses the regs_return_value() function, so it now uses regs[3]. For powerpc we previously used regs->result but now use the regs_return_value() function which uses regs->gprs[3]. regs->gprs[3] is always positive so the regs_return_value(), much like ia64 makes it negative before calling the audit code when appropriate. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion] Acked-by: Tony Luck <tony.luck@intel.com> [for ia64] Acked-by: Richard Weinberger <richard@nod.at> [for uml] Acked-by: David S. Miller <davem@davemloft.net> [for sparc] Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips] Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
* | | | | | | Merge branch 'x86-syscall-for-linus' of ↵Linus Torvalds2012-01-1628-1948/+1012
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-syscall-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Move <asm/asm-offsets.h> from trace_syscalls.c to asm/syscall.h x86, um: Fix typo in 32-bit system call modifications um: Use $(srctree) not $(KBUILD_SRC) x86, um: Mark system call tables readonly x86, um: Use the same style generated syscall tables as native um: Generate headers before generating user-offsets.s um: Run host archheaders, allow use of host generated headers kbuild, headers.sh: Don't make archheaders explicitly x86, syscall: Allow syscall offset to be symbolic x86, syscall: Re-fix typo in comment x86: Simplify syscallhdr.sh x86: Generate system call tables and unistd_*.h from tables checksyscalls: Use arch/x86/syscalls/syscall_32.tbl as source x86: Machine-readable syscall tables and scripts to process them trace: Include <asm/asm-offsets.h> in trace_syscalls.c x86-64, ia32: Move compat_ni_syscall into C and its own file x86-64, syscall: Adjust comment spacing and remove typo kbuild: Add support for an "archheaders" target kbuild: Add support for installing generated asm headers
| * | | | | | | x86: Move <asm/asm-offsets.h> from trace_syscalls.c to asm/syscall.hH. Peter Anvin2012-01-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit d5e553d6e0a4bdea43adae7373e3fa144b9a1aaa, which caused large numbers of build warnings on PowerPC. This moves the #include <asm/asm-offsets.h> to <asm/syscall.h>, which makes some kind of sense since NR_syscalls is syscalls related. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/20111214181545.6e13bc954cb7ddce9086e861@canb.auug.org.au
| * | | | | | | x86, um: Fix typo in 32-bit system call modificationsH. Peter Anvin2011-12-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We override sys_iopl(), not stub_iopl(); the latter is a 64-bitism that doesn't apply to i386 in the first place. Reported-by: Richard Weinberger <richard@nod.at> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | | x86, um: Mark system call tables readonlyH. Peter Anvin2011-12-052-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark the system call tables readonly, as they already are on native, and the 32-bit UM version was in the previous assembly version. The 32-bit version lost it due to copy and paste from the 64-bit version, which was missing the const. Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Link: http://lkml.kernel.org/r/tip-45db1c6176c8171d9ae6fa6d82e07d115a5950ca@git.kernel.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | | x86, um: Use the same style generated syscall tables as nativeH. Peter Anvin2011-12-055-46/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now when the native kernel uses a single style of generated system call table, follow suite for UML and implement the same style, all in C. This requires __NR_syscall_max and NR_syscalls to be generated; on native this is done in asm-headers.h but that file is common to all UML architectures; therefore put it in user-headers.h instead which already have accommodations for architecture-specific values. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | x86, syscall: Allow syscall offset to be symbolicH. Peter Anvin2011-11-181-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow the specified syscall offset to be symbolic, e.g. a macro. For offset system calls, this if nothing else makes the generated code easier to read. Suggested-by: H. J. Lu <hjl.tools@gmail.com> Link: http://lkml.kernel.org/r/1321569446-20433-7-git-send-email-hpa@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | x86, syscall: Re-fix typo in commentH. Peter Anvin2011-11-182-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the same typo as was fixed in: b7641d2c x86-64, syscall: Adjust comment spacing and remove typo ... for the new versions of this file (32-bit and IA32 compat). Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1321569446-20433-4-git-send-email-hpa@linux.intel.com
| * | | | | | | x86: Simplify syscallhdr.shH. Peter Anvin2011-11-181-16/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simplify syscallhdr.sh by letting grep sort out the ABIs that we want, rather than relying on manual list matching. This is safe since the ABI strings already have to consist only of characters which are valid in C macro names. Suggested-by: Matt Helsley <matthltc@us.ibm.com> Link: http://lkml.kernel.org/r/20111118221558.GA6408@count0.beaverton.ibm.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | x86: Generate system call tables and unistd_*.h from tablesH. Peter Anvin2011-11-1716-1896/+154
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generate system call tables and unistd_*.h automatically from the tables in arch/x86/syscalls. All other information, like NR_syscalls, is auto-generated, some of which is in asm-offsets_*.c. This allows us to keep all the system call information in one place, and allows for kernel space and user space to see different information; this is currently used for the ia32 system call numbers when building the 64-bit kernel, but will be used by the x32 ABI in the near future. This also removes some gratuitious differences between i386, x86-64 and ia32; in particular, now all system call tables are generated with the same mechanism. Cc: H. J. Lu <hjl.tools@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | x86: Machine-readable syscall tables and scripts to process themH. Peter Anvin2011-11-175-0/+771
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a simple set of syscall tables and scripts to turn them into both header files (unistd_*.h) and macros for generating the system call tables. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | x86-64, ia32: Move compat_ni_syscall into C and its own fileH. Peter Anvin2011-11-173-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move compat_ni_syscall out of ia32entry.S and into its own .c file. Although this is a trivial function, it is not performance-critical, and this will simplify further cleanups. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | x86-64, syscall: Adjust comment spacing and remove typoH. Peter Anvin2011-11-171-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adjust spacing for comment so that it matches the multiline comment style used in the rest of the kernel, and remove word duplication. It is not really clear what version of gcc this refers to, but the extra & doesn't cause any harm, so there is no reason to remove it. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | | | | | | mce: fix warning messages about static struct mce_deviceGreg Kroah-Hartman2012-01-163-12/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When suspending, there was a large list of warnings going something like: Device 'machinecheck1' does not have a release() function, it is broken and must be fixed This patch turns the static mce_devices into dynamically allocated, and properly frees them when they are removed from the system. It solves the warning messages on my laptop here. Reported-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Djalal Harouni <tixxdz@opendz.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Borislav Petkov <bp@amd64.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | Merge branch 'perf-core-for-linus' of ↵Linus Torvalds2012-01-157-33/+369
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) perf tools: Fix compile error on x86_64 Ubuntu perf report: Fix --stdio output alignment when --showcpuutilization used perf annotate: Get rid of field_sep check perf annotate: Fix usage string perf kmem: Fix a memory leak perf kmem: Add missing closedir() calls perf top: Add error message for EMFILE perf test: Change type of '-v' option to INCR perf script: Add missing closedir() calls tracing: Fix compile error when static ftrace is enabled recordmcount: Fix handling of elf64 big-endian objects. perf tools: Add const.h to MANIFEST to make perf-tar-src-pkg work again perf tools: Add support for guest/host-only profiling perf kvm: Do guest-only counting by default perf top: Don't update total_period on process_sample perf hists: Stop using 'self' for struct hist_entry perf hists: Rename total_session to total_period x86: Add counter when debug stack is used with interrupts enabled x86: Allow NMIs to hit breakpoints in i386 x86: Keep current stack in NMI breakpoints ...
| * \ \ \ \ \ \ \ Merge branch 'tip/perf/urgent' of ↵Ingo Molnar2012-01-086-24/+27
| |\ \ \ \ \ \ \ \ | | | |_|/ / / / / | | |/| | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core
| * | | | | | | | Merge branch 'tip/x86/core-3' of ↵Ingo Molnar2012-01-077-33/+369
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core
| | * | | | | | | | x86: Add counter when debug stack is used with interrupts enabledSteven Rostedt2011-12-214-8/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mathieu Desnoyers pointed out a case that can cause issues with NMIs running on the debug stack: int3 -> interrupt -> NMI -> int3 Because the interrupt changes the stack, the NMI will not see that it preempted the debug stack. Looking deeper at this case, interrupts only happen when the int3 is from userspace or in an a location in the exception table (fixup). userspace -> int3 -> interurpt -> NMI -> int3 All other int3s that happen in the kernel should be processed without ever enabling interrupts, as the do_trap() call will panic the kernel if it is called to process any other location within the kernel. Adding a counter around the sections that enable interrupts while using the debug stack allows the NMI to also check that case. If the NMI sees that it either interrupted a task using the debug stack or the debug counter is non-zero, then it will have to change the IDT table to make the int3 not change stacks (which will corrupt the stack if it does). Note, I had to move the debug_usage functions out of processor.h and into debugreg.h because of the static inlined functions to inc and dec the debug_usage counter. __get_cpu_var() requires smp.h which includes processor.h, and would fail to build. Link: http://lkml.kernel.org/r/1323976535.23971.112.camel@gandalf.stny.rr.com Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul Turner <pjt@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | x86: Allow NMIs to hit breakpoints in i386Steven Rostedt2011-12-211-7/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With i386, NMIs and breakpoints use the current stack and they do not reset the stack pointer to a fix point that might corrupt a previous NMI or breakpoint (as it does in x86_64). But NMIs are still not made to be re-entrant, and need to prevent the case that an NMI hitting a breakpoint (which does an iret), doesn't allow another NMI to run. The fix is to let the NMI be in 3 different states: 1) not running 2) executing 3) latched When no NMI is executing on a given CPU, the state is "not running". When the first NMI comes in, the state is switched to "executing". On exit of that NMI, a cmpxchg is performed to switch the state back to "not running" and if that fails, the NMI is restarted. If a breakpoint is hit and does an iret, which re-enables NMIs, and another NMI comes in before the first NMI finished, it will detect that the state is not in the "not running" state and the current NMI is nested. In this case, the state is switched to "latched" to let the interrupted NMI know to restart the NMI handler, and the nested NMI exits without doing anything. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul Turner <pjt@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | x86: Keep current stack in NMI breakpointsSteven Rostedt2011-12-216-0/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to allow NMI handlers to have breakpoints to be able to remove stop_machine from ftrace, kprobes and jump_labels. But if an NMI interrupts a current breakpoint, and then it triggers a breakpoint itself, it will switch to the breakpoint stack and corrupt the data on it for the breakpoint processing that it interrupted. Instead, have the NMI check if it interrupted breakpoint processing by checking if the stack that is currently used is a breakpoint stack. If it is, then load a special IDT that changes the IST for the debug exception to keep the same stack in kernel context. When the NMI is done, it puts it back. This way, if the NMI does trigger a breakpoint, it will keep using the same stack and not stomp on the breakpoint data for the breakpoint it interrupted. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
OpenPOWER on IntegriCloud