diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2013-07-02 16:15:23 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2013-07-02 16:15:23 -0700 |
commit | f0bb4c0ab064a8aeeffbda1cee380151a594eaab (patch) | |
tree | 14d55a89c5db455aa10ff9a96ca14c474a9c4d55 /Documentation | |
parent | a4883ef6af5e513a1e8c2ab9aab721604aa3a4f5 (diff) | |
parent | 983433b5812c5cf33a9008fa38c6f9b407fedb76 (diff) | |
download | op-kernel-dev-f0bb4c0ab064a8aeeffbda1cee380151a594eaab.zip op-kernel-dev-f0bb4c0ab064a8aeeffbda1cee380151a594eaab.tar.gz |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Kernel improvements:
- watchdog driver improvements by Li Zefan
- Power7 CPI stack events related improvements by Sukadev Bhattiprolu
- event multiplexing via hrtimers and other improvements by Stephane
Eranian
- kernel stack use optimization by Andrew Hunter
- AMD IOMMU uncore PMU support by Suravee Suthikulpanit
- NMI handling rate-limits by Dave Hansen
- various hw_breakpoint fixes by Oleg Nesterov
- hw_breakpoint overflow period sampling and related signal handling
fixes by Jiri Olsa
- Intel Haswell PMU support by Andi Kleen
Tooling improvements:
- Reset SIGTERM handler in workload child process, fix from David
Ahern.
- Makefile reorganization, prep work for Kconfig patches, from Jiri
Olsa.
- Add automated make test suite, from Jiri Olsa.
- Add --percent-limit option to 'top' and 'report', from Namhyung
Kim.
- Sorting improvements, from Namhyung Kim.
- Expand definition of sysfs format attribute, from Michael Ellerman.
Tooling fixes:
- 'perf tests' fixes from Jiri Olsa.
- Make Power7 CPI stack events available in sysfs, from Sukadev
Bhattiprolu.
- Handle death by SIGTERM in 'perf record', fix from David Ahern.
- Fix printing of perf_event_paranoid message, from David Ahern.
- Handle realloc failures in 'perf kvm', from David Ahern.
- Fix divide by 0 in variance, from David Ahern.
- Save parent pid in thread struct, from David Ahern.
- Handle JITed code in shared memory, from Andi Kleen.
- Fixes for 'perf diff', from Jiri Olsa.
- Remove some unused struct members, from Jiri Olsa.
- Add missing liblk.a dependency for python/perf.so, fix from Jiri
Olsa.
- Respect CROSS_COMPILE in liblk.a, from Rabin Vincent.
- No need to do locking when adding hists in perf report, only 'top'
needs that, from Namhyung Kim.
- Fix alignment of symbol column in in the hists browser (top,
report) when -v is given, from NAmhyung Kim.
- Fix 'perf top' -E option behavior, from Namhyung Kim.
- Fix bug in isupper() and islower(), from Sukadev Bhattiprolu.
- Fix compile errors in bp_signal 'perf test', from Sukadev
Bhattiprolu.
... and more things"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (102 commits)
perf/x86: Disable PEBS-LL in intel_pmu_pebs_disable()
perf/x86: Fix shared register mutual exclusion enforcement
perf/x86/intel: Support full width counting
x86: Add NMI duration tracepoints
perf: Drop sample rate when sampling is too slow
x86: Warn when NMI handlers take large amounts of time
hw_breakpoint: Introduce "struct bp_cpuinfo"
hw_breakpoint: Simplify *register_wide_hw_breakpoint()
hw_breakpoint: Introduce cpumask_of_bp()
hw_breakpoint: Simplify the "weight" usage in toggle_bp_slot() paths
hw_breakpoint: Simplify list/idx mess in toggle_bp_slot() paths
perf/x86/intel: Add mem-loads/stores support for Haswell
perf/x86/intel: Support Haswell/v4 LBR format
perf/x86/intel: Move NMI clearing to end of PMI handler
perf/x86/intel: Add Haswell PEBS support
perf/x86/intel: Add simple Haswell PMU support
perf/x86/intel: Add Haswell PEBS record support
perf/x86/intel: Fix sparse warning
perf/x86/amd: AMD IOMMU Performance Counter PERF uncore PMU implementation
perf/x86/amd: Add IOMMU Performance Counter resource management
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/ABI/testing/sysfs-bus-event_source-devices-events | 32 | ||||
-rw-r--r-- | Documentation/ABI/testing/sysfs-bus-event_source-devices-format | 6 | ||||
-rw-r--r-- | Documentation/sysctl/kernel.txt | 50 | ||||
-rw-r--r-- | Documentation/trace/events-nmi.txt | 43 |
4 files changed, 116 insertions, 15 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events index 0adeb52..8b25ffb 100644 --- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events +++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events @@ -27,14 +27,36 @@ Description: Generic performance monitoring events "basename". -What: /sys/devices/cpu/events/PM_LD_MISS_L1 - /sys/devices/cpu/events/PM_LD_REF_L1 - /sys/devices/cpu/events/PM_CYC +What: /sys/devices/cpu/events/PM_1PLUS_PPC_CMPL /sys/devices/cpu/events/PM_BRU_FIN - /sys/devices/cpu/events/PM_GCT_NOSLOT_CYC /sys/devices/cpu/events/PM_BRU_MPRED - /sys/devices/cpu/events/PM_INST_CMPL /sys/devices/cpu/events/PM_CMPLU_STALL + /sys/devices/cpu/events/PM_CMPLU_STALL_BRU + /sys/devices/cpu/events/PM_CMPLU_STALL_DCACHE_MISS + /sys/devices/cpu/events/PM_CMPLU_STALL_DFU + /sys/devices/cpu/events/PM_CMPLU_STALL_DIV + /sys/devices/cpu/events/PM_CMPLU_STALL_ERAT_MISS + /sys/devices/cpu/events/PM_CMPLU_STALL_FXU + /sys/devices/cpu/events/PM_CMPLU_STALL_IFU + /sys/devices/cpu/events/PM_CMPLU_STALL_LSU + /sys/devices/cpu/events/PM_CMPLU_STALL_REJECT + /sys/devices/cpu/events/PM_CMPLU_STALL_SCALAR + /sys/devices/cpu/events/PM_CMPLU_STALL_SCALAR_LONG + /sys/devices/cpu/events/PM_CMPLU_STALL_STORE + /sys/devices/cpu/events/PM_CMPLU_STALL_THRD + /sys/devices/cpu/events/PM_CMPLU_STALL_VECTOR + /sys/devices/cpu/events/PM_CMPLU_STALL_VECTOR_LONG + /sys/devices/cpu/events/PM_CYC + /sys/devices/cpu/events/PM_GCT_NOSLOT_BR_MPRED + /sys/devices/cpu/events/PM_GCT_NOSLOT_BR_MPRED_IC_MISS + /sys/devices/cpu/events/PM_GCT_NOSLOT_CYC + /sys/devices/cpu/events/PM_GCT_NOSLOT_IC_MISS + /sys/devices/cpu/events/PM_GRP_CMPL + /sys/devices/cpu/events/PM_INST_CMPL + /sys/devices/cpu/events/PM_LD_MISS_L1 + /sys/devices/cpu/events/PM_LD_REF_L1 + /sys/devices/cpu/events/PM_RUN_CYC + /sys/devices/cpu/events/PM_RUN_INST_CMPL Date: 2013/01/08 diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-format b/Documentation/ABI/testing/sysfs-bus-event_source-devices-format index 079afc7..77f47ff 100644 --- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-format +++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-format @@ -9,6 +9,12 @@ Description: we want to export, so that userspace can deal with sane name/value pairs. + Userspace must be prepared for the possibility that attributes + define overlapping bit ranges. For example: + attr1 = 'config:0-23' + attr2 = 'config:0-7' + attr3 = 'config:12-35' + Example: 'config1:1,6-10,44' Defines contents of attribute that occupies bits 1,6-10,44 of perf_event_attr::config1. diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index ccd4258..ab7d16e 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -70,12 +70,12 @@ show up in /proc/sys/kernel: - shmall - shmmax [ sysv ipc ] - shmmni -- softlockup_thresh - stop-a [ SPARC only ] - sysrq ==> Documentation/sysrq.txt - tainted - threads-max - unknown_nmi_panic +- watchdog_thresh - version ============================================================== @@ -427,6 +427,32 @@ This file shows up if CONFIG_DEBUG_STACKOVERFLOW is enabled. ============================================================== +perf_cpu_time_max_percent: + +Hints to the kernel how much CPU time it should be allowed to +use to handle perf sampling events. If the perf subsystem +is informed that its samples are exceeding this limit, it +will drop its sampling frequency to attempt to reduce its CPU +usage. + +Some perf sampling happens in NMIs. If these samples +unexpectedly take too long to execute, the NMIs can become +stacked up next to each other so much that nothing else is +allowed to execute. + +0: disable the mechanism. Do not monitor or correct perf's + sampling rate no matter how CPU time it takes. + +1-100: attempt to throttle perf's sample rate to this + percentage of CPU. Note: the kernel calculates an + "expected" length of each sample event. 100 here means + 100% of that expected length. Even if this is set to + 100, you may still see sample throttling if this + length is exceeded. Set to 0 if you truly do not care + how much CPU is consumed. + +============================================================== + pid_max: @@ -604,15 +630,6 @@ without users and with a dead originative process will be destroyed. ============================================================== -softlockup_thresh: - -This value can be used to lower the softlockup tolerance threshold. The -default threshold is 60 seconds. If a cpu is locked up for 60 seconds, -the kernel complains. Valid values are 1-60 seconds. Setting this -tunable to zero will disable the softlockup detection altogether. - -============================================================== - tainted: Non-zero if the kernel has been tainted. Numeric values, which @@ -648,3 +665,16 @@ that time, kernel debugging information is displayed on console. NMI switch that most IA32 servers have fires unknown NMI up, for example. If a system hangs up, try pressing the NMI switch. + +============================================================== + +watchdog_thresh: + +This value can be used to control the frequency of hrtimer and NMI +events and the soft and hard lockup thresholds. The default threshold +is 10 seconds. + +The softlockup threshold is (2 * watchdog_thresh). Setting this +tunable to zero will disable lockup detection altogether. + +============================================================== diff --git a/Documentation/trace/events-nmi.txt b/Documentation/trace/events-nmi.txt new file mode 100644 index 0000000..c03c8c8 --- /dev/null +++ b/Documentation/trace/events-nmi.txt @@ -0,0 +1,43 @@ +NMI Trace Events + +These events normally show up here: + + /sys/kernel/debug/tracing/events/nmi + +-- + +nmi_handler: + +You might want to use this tracepoint if you suspect that your +NMI handlers are hogging large amounts of CPU time. The kernel +will warn if it sees long-running handlers: + + INFO: NMI handler took too long to run: 9.207 msecs + +and this tracepoint will allow you to drill down and get some +more details. + +Let's say you suspect that perf_event_nmi_handler() is causing +you some problems and you only want to trace that handler +specifically. You need to find its address: + + $ grep perf_event_nmi_handler /proc/kallsyms + ffffffff81625600 t perf_event_nmi_handler + +Let's also say you are only interested in when that function is +really hogging a lot of CPU time, like a millisecond at a time. +Note that the kernel's output is in milliseconds, but the input +to the filter is in nanoseconds! You can filter on 'delta_ns': + +cd /sys/kernel/debug/tracing/events/nmi/nmi_handler +echo 'handler==0xffffffff81625600 && delta_ns>1000000' > filter +echo 1 > enable + +Your output would then look like: + +$ cat /sys/kernel/debug/tracing/trace_pipe +<idle>-0 [000] d.h3 505.397558: nmi_handler: perf_event_nmi_handler() delta_ns: 3236765 handled: 1 +<idle>-0 [000] d.h3 505.805893: nmi_handler: perf_event_nmi_handler() delta_ns: 3174234 handled: 1 +<idle>-0 [000] d.h3 506.158206: nmi_handler: perf_event_nmi_handler() delta_ns: 3084642 handled: 1 +<idle>-0 [000] d.h3 506.334346: nmi_handler: perf_event_nmi_handler() delta_ns: 3080351 handled: 1 + |