perf, x86: Fix accidentally ack'ing a second event on intel perf counter

During testing of a patch to stop the perf subsystem from swallowing NMIs, it was uncovered that Nehalem boxes were randomly getting unknown NMIs when using the perf tool.

Moving the ack'ing of the PMI closer to when we get the status allows the hardware to properly re-set the PMU bit signaling that another PMI was triggered during the processing of the first PMI. This allows the new logic for dealing with the shortcomings of multiple PMIs to handle the extra NMI by 'eat'ing it later.

Now one can wonder why we are getting a second PMI when we disable all the PMUs at the beginning of the NMI handler to prevent such a case; for that I do not know. But I know the fix below helps deal with this quirk.

Tested on multiple Nehalems where the problem was occurring. With the patch, the code now loops a second time to handle the second PMI (whereas before it did not).

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: robert.richter@amd.com
Cc: gorcunov@gmail.com
Cc: fweisbec@gmail.com
Cc: ying.huang@intel.com
Cc: ming.m.lin@intel.com
Cc: eranian@google.com
LKML-Reference: <1283454469-1909-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
author: Don Zickus <dzickus@redhat.com> 2010-09-02 15:07:47 -0400
committer: Ingo Molnar <mingo@elte.hu> 2010-09-03 08:05:17 +0200
commit: 2e556b5b320838fde98480a1f6cf220a5af200fc (patch)
tree: 13154588f289bdc31a2150036727c7fb826c0bb7 /arch
parent: b4c69d45c4c0d7480e9df183ebda62148984af25 (diff)
download: op-kernel-dev-2e556b5b320838fde98480a1f6cf220a5af200fc.zip
op-kernel-dev-2e556b5b320838fde98480a1f6cf220a5af200fc.tar.gz
1 files changed, 2 insertions, 4 deletions
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index d8d86d0..1297bf1 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -712,7 +712,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
 	struct perf_sample_data data;
 	struct cpu_hw_events *cpuc;
 	int bit, loops;
-	u64 ack, status;
+	u64 status;
 
 	perf_sample_data_init(&data, 0);
 
@@ -728,6 +728,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
 
 	loops = 0;
 again:
+	intel_pmu_ack_status(status);
 	if (++loops > 100) {
 		WARN_ONCE(1, "perfevents: irq loop stuck!\n");
 		perf_event_print_debug();
@@ -736,7 +737,6 @@ again:
 	}
 
 	inc_irq_stat(apic_perf_irqs);
-	ack = status;
 
 	intel_pmu_lbr_read();
 
@@ -761,8 +761,6 @@ again:
 			x86_pmu_stop(event);
 	}
 
-	intel_pmu_ack_status(ack);
-
 	/*
 	 * Repeat if there is more work to be done:
 	 */