summaryrefslogtreecommitdiffstats
path: root/tools/perf/util/sort.c
Commit message (Collapse)AuthorAgeFilesLines
* perf tools: Prevent condition that all sort keys are elidedNamhyung Kim2013-11-111-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If given sort keys are all elided there'll be no output except for the overhead column - actually the TUI shows a noisy output. In this case it'd be better to show up the sort keys rather than elide. Before: $ perf report -s comm -c perf (...) # Overhead # ........ # 100.00% After: $ perf report -s comm -c perf (...) # Overhead Command # ........ ....... # 100.00% perf Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383900822-14609-1-git-send-email-namhyung@kernel.org [ Us curly braces around multi-line statements, as requested by Ingo Molnar ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Get current comm instead of last oneNamhyung Kim2013-11-041-9/+5
| | | | | | | | | | | | | At insert time, a hist entry should reference comm at the time otherwise it'll get the last comm anyway. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-n6pykiiymtgmcjs834go2t8x@git.kernel.org [ Fixed up const pointer issues ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Compare hists comm by addressesFrederic Weisbecker2013-11-041-1/+2
| | | | | | | | | | | | | | | | | Now that comm strings are allocated only once and refcounted to be shared among threads, these can now be safely compared by addresses. This should remove most hists collapses on post processing. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1381468543-25334-8-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* perf tools: Use an accessor to read thread commFrederic Weisbecker2013-11-041-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | As the thread comm is going to be implemented by way of a more complicated data structure than just a pointer to a string from the thread struct, convert the readers of comm to use an accessor instead of accessing it directly. The accessor will be later overriden to support an enhanced comm implementation. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-wr683zwy94hmj4ibogmnv9ce@git.kernel.org [ Rename thread__comm_curr() to thread__comm_str() ] Signed-off-by: Namhyung Kim <namhyung@kernel.org> [ Fixed up some minor const pointer issues ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Stop using 'self' in some more placesArnaldo Carvalho de Melo2013-10-231-62/+62
| | | | | | | | | | | | | | | | As suggested by tglx, 'self' should be replaced by something that is more useful. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-fmblhc6tbb99tk1q8vowtsbj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Compare dso's also when comparing symbolsNamhyung Kim2013-10-211-0/+10
| | | | | | | | | | | | | | | | | | | | | | Linus reported that sometimes 'perf report -s symbol' exits without any message on TUI. David and Jiri found that it's because it failed to add a hist entry due to an invalid symbol length. It turns out that sorting by symbol (address) was broken since it only compares symbol addresses. The symbol address is a relative address within a dso thus just checking its address can result in merging unrelated symbols together. Fix it by checking dso before comparing symbol address. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381802517-18812-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix srcline sort key behaviorNamhyung Kim2013-10-091-21/+20
| | | | | | | | | | | | | | | | | | | Currently the srcline sort key compares ip rather than srcline info. I guess this was due to a performance reason to run external addr2line utility. Now we have implemented the functionality inside, use the srcline info when comparing hist entries. Also constantly print "??:0" string for unknown srcline rather than printing ip. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1378876173-13363-10-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Pass dso instead of dso_name to get_srcline()Namhyung Kim2013-10-091-1/+1
| | | | | | | | | | | | | | This is a preparation of next change. No functional changes are intended. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1378876173-13363-7-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Do not try to call addr2line on non-binary filesNamhyung Kim2013-10-091-3/+0
| | | | | | | | | | | | | No need to call addr2line since they don't have such information. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1378876173-13363-6-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Factor out get/free_srcline()Namhyung Kim2013-10-091-14/+3
| | | | | | | | | | | | | | | Currently external addr2line tool is used for srcline sort key and annotate with srcline info. Separate the common code to prepare upcoming enhancements. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1378876173-13363-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Fix a memory leak on srclineNamhyung Kim2013-10-091-4/+1
| | | | | | | | | | | | | | In the hist_entry__srcline_snprintf(), path and self->srcline are pointing the same memory region, but they are doubly allocated. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1378876173-13363-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* tools/perf: Add support for record transaction flagsAndi Kleen2013-10-041-0/+73
| | | | | | | | | | | | Add support for recording and displaying the transaction flags. They are essentially a new sort key. Also display them in a nice way to the user. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-6-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* tools/perf: Support sorting by in_tx or abort branch flagsAndi Kleen2013-10-041-0/+51
| | | | | | | | | | | | | | | | Extend the perf branch sorting code to support sorting by in_tx or abort_tx qualifiers. Also print out those qualifiers. This also fixes up some of the existing sort key documentation. We do not support no_tx here, because it's simply not showing the in_tx flag. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-4-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* perf tools: Move weight back to common sort keysAndi Kleen2013-07-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | This is a partial revert of Namhyung's patch afab87b91f3f331d55664172dad8e476e6ffca9d perf sort: Separate out memory-specific sort keys He wrote For global/local weights, I'm not entirely sure to place them into the memory dimension. But it's the only user at this time. Well TSX is another (in fact the original) user of the flags, and it needs them to be common. So move local/global weight back to the common sort keys. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Link: http://lkml.kernel.org/r/1374188333-17899-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report/top: Add option to collapse undesired parts of call graphGreg Price2013-07-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For example, in an application with an expensive function implemented with deeply nested recursive calls, the default call-graph presentation is dominated by the different callchains within that function. By ignoring these callees, we can collect the callchains leading into the function and compactly identify what to blame for expensive calls. For example, in this report the callers of garbage_collect() are scattered across the tree: $ perf report -d ruby 2>- | grep -m10 ^[^#]*[a-z] 22.03% ruby [.] gc_mark --- gc_mark |--59.40%-- mark_keyvalue | st_foreach | gc_mark_children | |--99.75%-- rb_gc_mark | | rb_vm_mark | | gc_mark_children | | gc_marks | | |--99.00%-- garbage_collect If we ignore the callees of garbage_collect(), its callers are coalesced: $ perf report --ignore-callees garbage_collect -d ruby 2>- | grep -m10 ^[^#]*[a-z] 72.92% ruby [.] garbage_collect --- garbage_collect vm_xmalloc |--47.08%-- ruby_xmalloc | st_insert2 | rb_hash_aset | |--98.45%-- features_index_add | | rb_provide_feature | | rb_require_safe | | vm_call_method Signed-off-by: Greg Price <price@mit.edu> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20130623031720.GW22203@biohazard-cafe.mit.edu Link: http://lkml.kernel.org/r/20130708115746.GO22203@biohazard-cafe.mit.edu Cc: Fengguang Wu <fengguang.wu@intel.com> [ remove spaces at beginning of line, reported by Fengguang Wu ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: struct thread has a tid not a pidAdrian Hunter2013-07-121-3/+3
| | | | | | | | | | | | | | | | | | | | | | As evident from 'machine__process_fork_event()' and 'machine__process_exit_event()' the 'pid' member of struct thread is actually the tid. Rename 'pid' to 'tid' in struct thread accordingly. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1372944040-32690-13-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Cleanup sort__has_sym settingNamhyung Kim2013-05-281-4/+1
| | | | | | | | | | | | | | | | | The sort__has_sym variable is set only if a symbol-related sort key was added. Since branch stack and memory sort dimensions are separated, it doesn't need to be checked from common dimension. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1365125198-8334-7-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Consolidate sort_entry__setup_elide()Namhyung Kim2013-05-281-2/+43
| | | | | | | | | | | | | | | | The same code was duplicate to places, factor them out to common sort__setup_elide(). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1364991979-3008-11-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Separate out memory-specific sort keysNamhyung Kim2013-05-281-8/+31
| | | | | | | | | | | | | | | | | | | Since they're used only for perf mem, separate out them to a different dimension so that normal user cannot access them by any chance. For global/local weights, I'm not entirely sure to place them into the memory dimension. But it's the only user at this time. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1364991979-3008-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Factor out common code in sort_dimension__add()Namhyung Kim2013-05-281-24/+17
| | | | | | | | | | | | | | | | Let's remove duplicate code. Suggested-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1364991979-3008-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Introduce sort__mode variableNamhyung Kim2013-05-281-2/+2
| | | | | | | | | | | | | | | | It's used for determining current sort mode which can be one of NORMAL, BRANCH and new MEMORY. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1364816125-12212-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Fix alignment of symbol column when -v is givenNamhyung Kim2013-05-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When -v option is given, the symbol sort key prints its address also but it wasn't properly aligned since hists__calc_col_len() misses the additional part. Also it missed 2 spaces for 0x prefix when printing. $ perf report --stdio -v -s sym # Samples: 133 of event 'cycles' # Event count (approx.): 50536717 # # Overhead Symbol # ........ .............................. # 12.20% 0xffffffff81384c50 v [k] intel_idle 7.62% 0xffffffff8170976a v [k] ftrace_caller 7.02% 0x2d986d B [.] 0x00000000002d986d Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1364816125-12212-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix output of symbol_daddr offsetNamhyung Kim2013-04-011-1/+1
| | | | | | | | | | | | | | | | The symbol addresses in a dso have relative offsets from the start of a mapping. So in order to ouput correct offset value from @ip, one of them should be converted. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1359040242-8269-19-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add mem access sampling core supportStephane Eranian2013-04-011-6/+363
| | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the sorting and histogram support functions to enable profiling of memory accesses. The following sorting orders are added: - symbol_daddr: data address symbol (or raw address) - dso_daddr: data address shared object - locked: access uses locked transaction - tlb : TLB access - mem : memory level of the access (L1, L2, L3, RAM, ...) - snoop: access snoop mode Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1359040242-8269-12-git-send-email-eranian@google.com [ committer note: changed to cope with fc5871ed, the move of methods to machine.[ch], and the rename of dsrc to data_src, to match the change made in the PERF_SAMPLE_DSRC in a previous patch. ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add support for weight v7 (modified)Andi Kleen2013-04-011-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf record has a new option -W that enables weightened sampling. Add sorting support in top/report for the average weight per sample and the total weight sum. This allows to both compare relative cost per event and the total cost over the measurement period. Add the necessary glue to perf report, record and the library. v2: Merge with new hist refactoring. v3: Fix manpage. Remove value check. Rename global_weight to weight and weight to local_weight. v4: Readd sort keys to manpage v5: Move weight to end v6: Move weight to template v7: Rename weight key. Original patch from Andi modified by Stephane Eranian <eranian@google.com> to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com [ committer note: changed to cope with fc5871ed and the hists_link perf test entry ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Check return value of strdup()Namhyung Kim2013-02-061-0/+5
| | | | | | | | | | | | | | | | When setup_sorting() is called, 'str' is passed to strtok_r() but it's not checked to have a valid pointer. As strtok_r() accepts NULL pointer on a first argument and use the third argument in that case, it can cause a trouble since our third argument, tmp, is not initialized. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1360130237-9963-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Make setup_sorting returns an error codeNamhyung Kim2013-02-061-4/+6
| | | | | | | | | | | | | | | | | Currently the setup_sorting() is called for parsing sort keys and exits if it failed to add the sort key. As it's included in libperf it'd be better returning an error code rather than exiting application inside of the library. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1360130237-9963-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Drop ip_[lr] arguments from _sort__sym_cmp()Namhyung Kim2013-02-061-17/+6
| | | | | | | | | | | | | | | | | | | | | | | Current _sort__sym_cmp() function is used for comparing symbols between two hist entries on symbol, symbol_from and symbol_to sort keys. Those functions pass addresses of symbols but it's meaningless since it gets over-written inside of the _sort__sym_cmp function to a start address of the symbol. So just get rid of them. This might cause a difference than prior output for branch stacks since it seems not using start address of the symbol but branch address. However AFAICS it'd be same as it gets overwritten anyway. Also remove redundant part of code in sort__sym_cmp(). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1360130237-9963-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Use pclose() instead of fclose() on pipe streamThomas Jarosch2013-01-301-2/+5
| | | | | | | | | | | | cppcheck message: [tools/perf/util/sort.c:277]: (error) Mismatching allocation and deallocation: fp Also fix descriptor leak on error and always initialize the "fp" variable. Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com> Link: http://lkml.kernel.org/r/1359112354.yZcisNZ4k0@storm Link: http://lkml.kernel.org/r/2266358.qvDXKLvJ67@storm Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Separate out branch stack specific sort keysNamhyung Kim2013-01-241-12/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Current perf report gets segmentation fault when a branch stack specific sort key is provided by --sort option to a perf.data file which contains no branch infomation. It's because those sort keys reference branch info of a hist entry unconditionally. Maybe we can change it checks whether such branch info is valid or not. But if the branch stacks are not recorded, it'd be nop. Thus it'd be better to make those keys are unselectable. This patch separates those keys to a different dimension array, so that if user passes such a key to a file which has no branch stack will get following message rather than a segfault. Error: Invalid --sort key: `symbol_from' Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reported-by: Stefan Beller <stefanbeller@googlemail.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1356599507-14226-10-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Clean up sort__first_dimension settingNamhyung Kim2013-01-241-24/+2
| | | | | | | | | | | | | | | | It doesn't need to compare to every sort key names since the index already has the required information. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1356599507-14226-9-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Align cpu column to rightNamhyung Kim2013-01-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since cpu number is a natural number, it'd be more appropriate aligning it to right. Before: # Overhead CPU Command: Pid Shared Object # ........ ... ................. ..................... # 8.91% 8 gnome-shell: 1497 perf-1497.map 8.90% 7 gnome-shell: 1497 perf-1497.map 8.86% 9 gnome-shell: 1497 perf-1497.map 8.83% 6 gnome-shell: 1497 perf-1497.map 8.81% 10 gnome-shell: 1497 perf-1497.map 7.44% 5 gnome-shell: 1497 perf-1497.map 6.20% 3 gnome-shell: 1497 perf-1497.map 5.10% 0 gnome-shell: 1497 perf-1497.map After: # Overhead CPU Command: Pid Shared Object # ........ ... ................. ..................... # 8.91% 8 gnome-shell: 1497 perf-1497.map 8.90% 7 gnome-shell: 1497 perf-1497.map 8.86% 9 gnome-shell: 1497 perf-1497.map 8.83% 6 gnome-shell: 1497 perf-1497.map 8.81% 10 gnome-shell: 1497 perf-1497.map 7.44% 5 gnome-shell: 1497 perf-1497.map 6.20% 3 gnome-shell: 1497 perf-1497.map 5.10% 0 gnome-shell: 1497 perf-1497.map Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1356599507-14226-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Fix --sort pid outputNamhyung Kim2013-01-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | The "pid" sort key prints "Command: Pid" output but it's misaligned. It's because of the offset of 6 was added to the column length during the calculation in order to reserve an space for Pid part but it isn't honored when printed. The output before this patch was like this: # Overhead Command: Pid Shared Object # ........ ............. ................. # 99.70% noploop:17814 noploop 0.29% noploop:17814 [kernel.kallsyms] 0.01% noploop:17814 ld-2.15.so Fix it by subtracting 6 for printing comm part. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1356599507-14226-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Get rid of unnecessary __maybe_unusedNamhyung Kim2013-01-241-7/+4
| | | | | | | | | | | | | | | | Some functions have set __maybe_unused on its arguments that are used actually. Remove them. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1356599507-14226-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Move misplaced sort entry functionsNamhyung Kim2013-01-241-59/+60
| | | | | | | | | | | | | | | | | | Some functions are misplaced along with other entries. Move them to a right place so that it can be found together with related functions. No functional change intended. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1356599507-14226-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: remove redundant checks from _sort__sym_cmpSasha Levin2013-01-241-4/+2
| | | | | | | | | | | | | | | We already check that sym_l and sum_r are non-NULLs, no need to do it twice. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1356030701-16284-12-git-send-email-sasha.levin@oracle.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Remove warnings on JIT samples for srcline sort keyNamhyung Kim2012-10-161-0/+3
| | | | | | | | | | | | | | | | | | | When using the srcline sort key with perf report, I see many lines of warning related to JIT samples like below: addr2line: '/tmp/perf-1397.map': No such file Since it's not a ELF binary and doesn't provide such information, just use the raw ip address. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Irina Tirdea <irina.tirdea@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1350272383-7016-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix segfault when using srcline sort keyNamhyung Kim2012-10-161-0/+3
| | | | | | | | | | | | | | | The srcline sort key is for grouping samples based on their source file and line number. It use addr2line tool to get the information but it requires dso name. It caused a segfault when a sample does not have the name by dereferencing a NULL pointer. Fix it by using raw ip addresses for those samples. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1350272383-7016-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add sort__has_symNamhyung Kim2012-09-171-0/+5
| | | | | | | | | | | | The sort__has_sym variable is for checking whether the sort_list includes 'symbol' as a sort key. It will be used for later patch. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1347611729-16994-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Use __maybe_used for unused variablesIrina Tirdea2012-09-111-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf defines both __used and __unused variables to use for marking unused variables. The variable __used is defined to __attribute__((__unused__)), which contradicts the kernel definition to __attribute__((__used__)) for new gcc versions. On Android, __used is also defined in system headers and this leads to warnings like: warning: '__used__' attribute ignored __unused is not defined in the kernel and is not a standard definition. If __unused is included everywhere instead of __used, this leads to conflicts with glibc headers, since glibc has a variables with this name in its headers. The best approach is to use __maybe_unused, the definition used in the kernel for __attribute__((unused)). In this way there is only one definition in perf sources (instead of 2 definitions that point to the same thing: __used and __unused) and it works on both Linux and Android. This patch simply replaces all instances of __used and __unused with __maybe_unused. Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com [ committer note: fixed up conflict with a116e05 in builtin-sched.c ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Replace sort's standalone field_sep with symbol_conf.field_sepJiri Olsa2012-09-071-4/+2
| | | | | | | | | | | | | | | | | | | | The repsep_snprintf function was still using standalone field_sep, which not even set anymore. Replacing it with 'symbol_conf.field_sep'. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1346946426-13496-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add sort by src line/numberArnaldo Carvalho de Melo2012-06-191-0/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using addr2line for now, requires debuginfo, needs more work to support detached debuginfo, aka foo-debuginfo packages. Example: [root@sandy ~]# perf record -a sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.555 MB perf.data (~24236 samples) ] [root@sandy ~]# perf report -s dso,srcline 2>&1 | grep -v ^# | head -5 22.41% [kernel.kallsyms] /home/git/linux/drivers/idle/intel_idle.c:280 4.79% [kernel.kallsyms] /home/git/linux/drivers/cpuidle/cpuidle.c:148 4.78% [kernel.kallsyms] /home/git/linux/arch/x86/include/asm/atomic64_64.h:121 4.49% [kernel.kallsyms] /home/git/linux/kernel/sched/core.c:1690 4.30% [kernel.kallsyms] /home/git/linux/include/linux/seqlock.h:90 [root@sandy ~]# [root@sandy ~]# perf top -U -s dso,symbol,srcline Samples: 1K of event 'cycles', Event count (approx.): 589617389 18.66% [kernel] [k] copy_user_generic_unrolled /home/git/linux/arch/x86/lib/copy_user_64.S:143 7.83% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:39 6.59% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:38 3.66% [kernel] [k] page_fault /home/git/linux/arch/x86/kernel/entry_64.S:1379 3.25% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:40 3.12% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:37 2.74% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:36 2.39% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:43 2.12% [kernel] [k] ioread32 /home/git/linux/lib/iomap.c:90 1.51% [kernel] [k] copy_user_generic_unrolled /home/git/linux/arch/x86/lib/copy_user_64.S:144 1.19% [kernel] [k] copy_user_generic_unrolled /home/git/linux/arch/x86/lib/copy_user_64.S:154 Suggested-by: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-pdmqbng9twz06jzkbgtuwbp8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge branch 'perf-core-for-linus' of ↵Linus Torvalds2012-03-201-51/+236
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events changes for v3.4 from Ingo Molnar: - New "hardware based branch profiling" feature both on the kernel and the tooling side, on CPUs that support it. (modern x86 Intel CPUs with the 'LBR' hardware feature currently.) This new feature is basically a sophisticated 'magnifying glass' for branch execution - something that is pretty difficult to extract from regular, function histogram centric profiles. The simplest mode is activated via 'perf record -b', and the result looks like this in perf report: $ perf record -b any_call,u -e cycles:u branchy $ perf report -b --sort=symbol 52.34% [.] main [.] f1 24.04% [.] f1 [.] f3 23.60% [.] f1 [.] f2 0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow 0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn 0.01% [k] _IO_vfprintf_internal [k] strchrnul 0.01% [k] __printf [k] _IO_vfprintf_internal 0.01% [k] main [k] __printf This output shows from/to branch columns and shows the highest percentage (from,to) jump combinations - i.e. the most likely taken branches in the system. "branches" can also include function calls and any other synchronous and asynchronous transitions of the instruction pointer that are not 'next instruction' - such as system calls, traps, interrupts, etc. This feature comes with (hopefully intuitive) flat ascii and TUI support in perf report. - Various 'perf annotate' visual improvements for us assembly junkies. It will now recognize function calls in the TUI and by hitting enter you can follow the call (recursively) and back, amongst other improvements. - Multiple threads/processes recording support in perf record, perf stat, perf top - which is activated via a comma-list of PIDs: perf top -p 21483,21485 perf stat -p 21483,21485 -ddd perf record -p 21483,21485 - Support for per UID views, via the --uid paramter to perf top, perf report, etc. For example 'perf top --uid mingo' will only show the tasks that I am running, excluding other users, root, etc. - Jump label restructurings and improvements - this includes the factoring out of the (hopefully much clearer) include/linux/static_key.h generic facility: struct static_key key = STATIC_KEY_INIT_FALSE; ... if (static_key_false(&key)) do unlikely code else do likely code ... static_key_slow_inc(); ... static_key_slow_inc(); ... The static_key_false() branch will be generated into the code with as little impact to the likely code path as possible. the static_key_slow_*() APIs flip the branch via live kernel code patching. This facility can now be used more widely within the kernel to micro-optimize hot branches whose likelihood matches the static-key usage and fast/slow cost patterns. - SW function tracer improvements: perf support and filtering support. - Various hardenings of the perf.data ABI, to make older perf.data's smoother on newer tool versions, to make new features integrate more smoothly, to support cross-endian recording/analyzing workflows better, etc. - Restructuring of the kprobes code, the splitting out of 'optprobes', and a corner case bugfix. - Allow the tracing of kernel console output (printk). - Improvements/fixes to user-space RDPMC support, allowing user-space self-profiling code to extract PMU counts without performing any system calls, while playing nice with the kernel side. - 'perf bench' improvements - ... and lots of internal restructurings, cleanups and fixes that made these features possible. And, as usual this list is incomplete as there were also lots of other improvements * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits) perf report: Fix annotate double quit issue in branch view mode perf report: Remove duplicate annotate choice in branch view mode perf/x86: Prettify pmu config literals perf report: Enable TUI in branch view mode perf report: Auto-detect branch stack sampling mode perf record: Add HEADER_BRANCH_STACK tag perf record: Provide default branch stack sampling mode option perf tools: Make perf able to read files from older ABIs perf tools: Fix ABI compatibility bug in print_event_desc() perf tools: Enable reading of perf.data files from different ABI rev perf: Add ABI reference sizes perf report: Add support for taken branch sampling perf record: Add support for sampling taken branch perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK x86/kprobes: Split out optprobe related code to kprobes-opt.c x86/kprobes: Fix a bug which can modify kernel code permanently x86/kprobes: Fix instruction recovery on optimized path perf: Add callback to flush branch_stack on context switch perf: Disable PERF_SAMPLE_BRANCH_* when not supported perf/x86: Add LBR software filter support for Intel CPUs ...
| * perf report: Auto-detect branch stack sampling modeStephane Eranian2012-03-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enhances perf report to auto-detect when the perf.data file contains samples with branch stacks. That way it is not necessary to use the -b option. To force branch view mode to off, simply use --no-branch-stack. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: acme@redhat.com Cc: asharma@fb.com Cc: ravitillo@lbl.gov Cc: vweaver1@eecs.utk.edu Cc: khandual@linux.vnet.ibm.com Cc: dsahern@gmail.com Link: http://lkml.kernel.org/r/1331246868-19905-4-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf tools: Add code to support PERF_SAMPLE_BRANCH_STACKRoberto Agostino Vitillo2012-03-091-51/+236
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds: - ability to parse samples with PERF_SAMPLE_BRANCH_STACK - sort on branches (dso_from, symbol_from, dso_to, symbol_to, mispredict) - build histograms on branches Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov> Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: acme@redhat.com Cc: robert.richter@amd.com Cc: ming.m.lin@intel.com Cc: andi@firstfloor.org Cc: asharma@fb.com Cc: vweaver1@eecs.utk.edu Cc: khandual@linux.vnet.ibm.com Cc: dsahern@gmail.com Link: http://lkml.kernel.org/r/1328826068-11713-12-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf tools: Incorrect use of snprintf results in SEGVAnton Blanchard2012-03-141-0/+3
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I have a workload where perf top scribbles over the stack and we SEGV. What makes it interesting is that an snprintf is causing this. The workload is a c++ gem that has method names over 3000 characters long, but snprintf is designed to avoid overrunning buffers. So what went wrong? The problem is we assume snprintf returns the number of characters written: ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", self->level); ... ret += repsep_snprintf(bf + ret, size - ret, "%s", self->ms.sym->name); Unfortunately this is not how snprintf works. snprintf returns the number of characters that would have been written if there was enough space. In the above case, if the first snprintf returns a value larger than size, we pass a negative size into the second snprintf and happily scribble over the stack. If you have 3000 character c++ methods thats a lot of stack to trample. This patch fixes repsep_snprintf by clamping the value at size - 1 which is the maximum snprintf can write before adding the NULL terminator. I get the sinking feeling that there are a lot of other uses of snprintf that have this same bug, we should audit them all. Cc: David Ahern <dsahern@gmail.com> Cc: Eric B Munson <emunson@mgebm.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/20120307114249.44275ca3@kryten Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists browser: Elide DSO column when it is set to just one DSO, ditto ↵Arnaldo Carvalho de Melo2011-10-201-1/+3
| | | | | | | | | | | | | | | | for threads And also no leed to show the [.] (level: k, . for userspace) when showing just one DSO. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-4h3f6ro5o7ebepjbssxf0dd3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Fix symbol sort output by separating unresolved samples by typeAnton Blanchard2011-09-231-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I took a profile that suggested 60% of total CPU time was in the hypervisor: ... 60.20% [H] 0x33d43c 4.43% [k] ._spin_lock_irqsave 1.07% [k] ._spin_lock Using perf stat to get the user/kernel/hypervisor breakdown contradicted this. The problem is we merge all unresolved samples into the one unknown bucket. If add a comparison by sample type to sort__sym_cmp we get the real picture: ... 57.11% [.] 0x80fbf63c 4.43% [k] ._spin_lock_irqsave 1.07% [k] ._spin_lock 0.65% [H] 0x33d43c So it was almost all userspace, not hypervisor as the initial profile suggested. I found another issue while adding this. Symbol sorting sometimes shows multiple entries for the unknown bucket: ... 16.65% [.] 0x6cd3a8 7.25% [.] 0x422460 5.37% [.] yylex 4.79% [.] malloc 4.78% [.] _int_malloc 4.03% [.] _int_free 3.95% [.] hash_source_code_string 2.82% [.] 0x532908 2.64% [.] 0x36b538 0.94% [H] 0x8000000000e132a4 0.82% [H] 0x800000000000e8b0 This happens because we aren't consistent with our sorting. On one hand we check to see if both symbols match and for two unresolved samples sym is NULL so we match: if (left->ms.sym == right->ms.sym) return 0; On the other hand we use sample IP for unresolved samples when comparing against a symbol: ip_l = left->ms.sym ? left->ms.sym->start : left->ip; ip_r = right->ms.sym ? right->ms.sym->start : right->ip; This means unresolved samples end up spread across the rbtree and we can't merge them all. If we use cmp_null all unresolved samples will end up in the one bucket and the output makes more sense: ... 39.12% [.] 0x36b538 5.37% [.] yylex 4.79% [.] malloc 4.78% [.] _int_malloc 4.03% [.] _int_free 3.95% [.] hash_source_code_string 2.26% [H] 0x800000000000e8b0 Acked-by: Eric B Munson <emunson@mgebm.net> Cc: Eric B Munson <emunson@mgebm.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ian Munsie <imunsie@au1.ibm.com> Link: http://lkml.kernel.org/r/20110831115145.4f598ab2@kryten Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Allow sort dimensions to be registered more than onceFrederic Weisbecker2011-06-301-6/+6
| | | | | | | | | | | | | | | | | | | | So that the parent sort dimension can be registered twice: once if we add it as an explicit sort dimension (-s parent) and twice if we request a parent filter (-p foo). We'll have only one parent sort dimension in the end but this allows to override the default parent filter with we gave in "-p" option. The goal of this is to prepare to allow the use of "-s parent" and "-p foo" at the same time, ie: sort by filtered parent. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: David Ahern <dsahern@gmail.com> Cc: Sam Liao <phyomh@gmail.com>
* perf tools: Make sort operations staticFrederic Weisbecker2011-06-301-112/+99
| | | | | | | | | | | | These don't need to be globally visible. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: David Ahern <dsahern@gmail.com> Cc: Sam Liao <phyomh@gmail.com>
OpenPOWER on IntegriCloud