summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* perf callchain: Start moving away from global per thread cursorsArnaldo Carvalho de Melo2016-04-149-22/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The recent perf_evsel__fprintf_callchain() move to evsel.c added several new symbol requirements to the python binding, for instance: # perf test -v python 16: Try 'import perf' in python, checking link problems : --- start --- test child forked, pid 18030 Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf/python/perf.so: undefined symbol: callchain_cursor test child finished with -1 ---- end ---- Try 'import perf' in python, checking link problems: FAILED! # This would require linking against callchain.c to access to the global callchain_cursor variables. Since lots of functions already receive as a parameter a callchain_cursor struct pointer, make that be the case for some more function so that we can start phasing out usage of yet another global variable. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-djko3097eyg2rn66v2qcqfvn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Move socket_type beautifier to tools/perf/trace/beauty/Arnaldo Carvalho de Melo2016-04-142-59/+61
| | | | | | | | | | | | To reduce the size of builtin-trace.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ao91htwxdqwlwxr47gbluou1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge tag 'perf-core-for-mingo-20160414' of ↵Ingo Molnar2016-04-1412-221/+611
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements from Arnaldo Carvalho de Melo: User visible changes: - Introduce 'perf record --timestamp-filename', to add a timestamp at the end of the 'perf data' file. Will get added value when the patch to make 'perf.data' file snapshots gets merged (Wang Nan) - Fix display of variables present in both --config and --user in 'perf list' (Taeung Song) Build fixes: - Add seccomp and getradom beautifier related defines to fix the build in older systems where those definitions are not available (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf config: Make show_config() use perf_config_setTaeung Song2016-04-141-7/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently show_config() has a problem when user and system config files have the same config variables i.e.: # cat ~/.perfconfig [top] children = false When $(sysconfdir) is /usr/local/etc # cat /usr/local/etc/perfconfig [top] children = true Before: # perf config --user --list top.children=false # perf config --system --list top.children=true # perf config --list top.children=true top.children=false Because perf_config() can call show_config() each the config file (user and system). Fix it. After: # perf config --user --list top.children=false # perf config --system --list top.children=true # perf config --list top.children=false Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1460620401-23430-3-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf config: Introduce perf_config_set classTaeung Song2016-04-142-0/+199
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This infrastructure code was designed for upcoming features of 'perf config'. That collect config key-value pairs from user and system config files (i.e. user wide ~/.perfconfig and system wide $(sysconfdir)/perfconfig) to manage perf's configs. Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1460620401-23430-2-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf record: Add '--timestamp-filename' option to append timestamp to output ↵Wang Nan2016-04-141-4/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | file name This option appends current timestamp to the output file name. For example: # perf record -a --timestamp-filename ^C[ perf record: Woken up 1 times to write data ] [ perf record: Dump perf.data.2015122622265847 ] [ perf record: Captured and wrote 0.742 MB perf.data.<timestamp> (90 samples) ] # ls perf.data.201512262226584 The timestamp will be useful for identifying each perf.data after the 'perf record' support for generating multiple output files gets introduced. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460535673-159866-5-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf record: Turns auxtrace_snapshot_enable into 3 statesWang Nan2016-04-141-10/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | auxtrace_snapshot_enable has only two states (0/1). Turns it into a triple states enum so SIGUSR2 handler can safely do other works without triggering auxtrace snapshot. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460535673-159866-4-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf data: Add perf_data_file__switch() helperWang Nan2016-04-142-1/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf_data_file__switch() closes current output file, renames it, then open a new one to continue recording. It will be used by 'perf record' to split output into multiple perf.data files. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460535673-159866-3-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf session: Make ordered_events reusableWang Nan2016-04-141-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ordered_events__free() leaves linked lists and timestamps not cleared, so unable to be reused after ordered_events__free(). Which is inconvenient after 'perf record' supports generating multiple perf.data output and process build-ids for each of them. Use ordered_events__reinit() for this. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460535673-159866-2-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Split from larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf ordered_events: Introduce reinit()Wang Nan2016-04-142-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'perf record' will use this when outputting multiple perf.data files. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460535673-159866-2-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Split from larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf trace: Move eventfd beautifiers to trace/beauty/ directoryArnaldo Carvalho de Melo2016-04-142-40/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | To better organize all these beautifiers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-zrw5zz7cnrs44o5osouyutvt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf trace: Move mmap beautifiers to trace/beauty/ directoryArnaldo Carvalho de Melo2016-04-142-158/+159
| | | | | | | | | | | | | | | | | | | | | | | | | | To better organize all these beautifiers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-zbr27mdy9ssdhux3ib2nfa7j@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf trace: Add getrandom beautifier related defines for older systemsArnaldo Carvalho de Melo2016-04-141-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Were the detached tarball (make perf-tar-src-pkg) build was failing because those definitions aren't available in the system headers. On RHEL7, for instance: builtin-trace.c: In function ‘syscall_arg__scnprintf_getrandom_flags’: builtin-trace.c:1113:14: error: ‘GRND_RANDOM’ undeclared (first use in this function) P_FLAG(RANDOM); ^ builtin-trace.c:1114:14: error: ‘GRND_NONBLOCK’ undeclared (first use in this function) P_FLAG(NONBLOCK); ^ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-r8496g24a3kbqynvk6617b0e@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf trace: Add seccomp beautifier related defines for older systemsArnaldo Carvalho de Melo2016-04-141-0/+11
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Were the detached tarball (make perf-tar-src-pkg) build was failing because those definitions aren't available in the system headers. On RHEL7, for instance: builtin-trace.c: In function ‘syscall_arg__scnprintf_seccomp_op’: builtin-trace.c:1069:7: error: ‘SECCOMP_SET_MODE_STRICT’ undeclared (first use in this function) P_SECCOMP_SET_MODE_OP(STRICT); ^ builtin-trace.c:1069:7: note: each undeclared identifier is reported only once for each function it appears in builtin-trace.c:1070:7: error: ‘SECCOMP_SET_MODE_FILTER’ undeclared (first use in this function) P_SECCOMP_SET_MODE_OP(FILTER); ^ builtin-trace.c: In function ‘syscall_arg__scnprintf_seccomp_flags’: builtin-trace.c:1091:14: error: ‘SECCOMP_FILTER_FLAG_TSYNC’ undeclared (first use in this function) P_FLAG(TSYNC); ^ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-4f8dzzwd7g6l5dzz693u7kul@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge tag 'perf-core-for-mingo-20160413' of ↵Ingo Molnar2016-04-1312-178/+416
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Print callchains asked for events requested via 'perf trace --event' too: (Arnaldo Carvalho de Melo) # trace -e nanosleep --call dwarf --event sched:sched_switch/call-graph=fp/ usleep 1 0.346 (0.005 ms): usleep/24428 nanosleep(rqtp: 0x7fffa15a0540) ... 0.346 ( ): sched:sched_switch:usleep:24428 [120] S ==> swapper/3:0 [120]) __schedule+0xfe200402 ([kernel.kallsyms]) schedule+0xfe200035 ([kernel.kallsyms]) do_nanosleep+0xfe20006f ([kernel.kallsyms]) hrtimer_nanosleep+0xfe2000dc ([kernel.kallsyms]) sys_nanosleep+0xfe20007a ([kernel.kallsyms]) do_syscall_64+0xfe200062 ([kernel.kallsyms]) return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms]) __nanosleep+0xffff005b8d602010 (/usr/lib64/libc-2.22.so) 0.400 (0.059 ms): usleep/24428 ... [continued]: nanosleep()) = 0 __nanosleep+0x10 (/usr/lib64/libc-2.22.so) usleep+0x34 (/usr/lib64/libc-2.22.so) main+0x1eb (/usr/bin/usleep) __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so) _start+0x29 (/usr/bin/usleep) - Allow requesting that some CPUs or PIDs be highlighted in 'perf sched map' (Jiri Olsa) - Compact 'perf sched map' to show just CPUs with activity, improving the output in high core count systems (Jiri Olsa) - Fix segfault with 'perf trace --no-syscalls -e syscall-names' by bailing out such request, doesn't make sense to ask for no syscalls and then specify which ones should be printed (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf trace: Do not accept --no-syscalls together with -eArnaldo Carvalho de Melo2016-04-131-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Doesn't make sense and was causing a segfault, fix it. # trace -e clone --no-syscalls --event sched:*exec firefox The -e option can't be used with --no-syscalls. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ccrahezikdk2uebptzr1eyyi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evsel: Move some methods from session.[ch] to evsel.[ch]Arnaldo Carvalho de Melo2016-04-136-153/+154
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Those were converted to be evsel methods long ago, move the source to where it belongs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vja8rjmkw3gd5ungaeyb5s2j@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf sched map: Display only given cpusJiri Olsa2016-04-132-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introducing --cpus option that will display only given cpus. Could be used together with color-cpus option. $ perf sched map --cpus 0,1 *A0 309999.786924 secs A0 => rcu_sched:7 *. 309999.786930 secs *B0 . 309999.786931 secs B0 => rcuos/2:25 B0 *A0 309999.786947 secs Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-9-git-send-email-jolsa@kernel.org [ Added entry to man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf sched map: Color given cpusJiri Olsa2016-04-132-3/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding --color-cpus option to display selected cpus with background color (red by default). It helps on navigating through the perf sched map output. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-8-git-send-email-jolsa@kernel.org [ Added entry to man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf sched map: Color given pidsJiri Olsa2016-04-132-4/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding --color-pids option to display selected pids in color (blue by default). It helps on navigating through the 'perf sched map' output. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-7-git-send-email-jolsa@kernel.org [ Added entry to man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf thread_map: Make new_by_tid_str constructor publicJiri Olsa2016-04-132-1/+3
| | | | | | | | | | | | | | | | | | | | | | It will be used in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf sched: Use color_fprintf for outputJiri Olsa2016-04-131-8/+10
| | | | | | | | | | | | | | | | | | | | | | As preparation for next patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf sched: Add compact display optionJiri Olsa2016-04-132-6/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add compact map display that does not output the whole cpu matrix, only cpus that got event. $ perf sched map --compact *A0 1082427.094098 secs A0 => perf:19404 (CPU 2) A0 *. 1082427.094127 secs . => swapper:0 (CPU 1) A0 . *B0 1082427.094174 secs B0 => rcuos/2:25 (CPU 3) A0 . *. 1082427.094177 secs *C0 . . 1082427.094187 secs C0 => migration/2:21 C0 *A0 . 1082427.094193 secs *. A0 . 1082427.094195 secs *D0 A0 . 1082427.094402 secs D0 => rngd:968 *. A0 . 1082427.094406 secs . *E0 . 1082427.095221 secs E0 => kworker/1:1:5333 . E0 *F0 1082427.095227 secs F0 => xterm:3342 It helps to display sane output for small thread loads on big cpu servers. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-4-git-send-email-jolsa@kernel.org [ Add entry in 'perf sched' man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf cpu_map: Add has() methodJiri Olsa2016-04-132-0/+14
| | | | | | | | | | | | | | | | | | | | | | Adding cpu_map__has() to return bool of cpu presence in cpus map. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf thread_map: Add has() methodJiri Olsa2016-04-132-0/+13
| | | | | | | | | | | | | | | | | | | | | | Adding thread_map__has() to return bool of pid presence in threads map. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf trace: Support callchains for --event tooArnaldo Carvalho de Melo2016-04-131-15/+26
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already were able to ask for callchains for a specific event: # trace -e nanosleep --call dwarf --event sched:sched_switch/call-graph=fp/ usleep 1 This would enable tracing just the "nanosleep" syscall, with callchains at syscall exit and would ask the kernel for frame pointer callchains to be enabled for the "sched:sched_switch" tracepoint event, its just that we were not resolving the callchain and printing it in 'perf trace', do it: # trace -e nanosleep --call dwarf --event sched:sched_switch/call-graph=fp/ usleep 1 0.425 ( 0.013 ms): usleep/6718 nanosleep(rqtp: 0x7ffcc1d16e20) ... 0.425 ( ): sched:sched_switch:usleep:6718 [120] S ==> swapper/2:0 [120]) __schedule+0xfe200402 ([kernel.kallsyms]) schedule+0xfe200035 ([kernel.kallsyms]) do_nanosleep+0xfe20006f ([kernel.kallsyms]) hrtimer_nanosleep+0xfe2000dc ([kernel.kallsyms]) sys_nanosleep+0xfe20007a ([kernel.kallsyms]) do_syscall_64+0xfe200062 ([kernel.kallsyms]) return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms]) __nanosleep+0xffff008b8cbe2010 (/usr/lib64/libc-2.22.so) 0.486 ( 0.073 ms): usleep/6718 ... [continued]: nanosleep()) = 0 __nanosleep+0x10 (/usr/lib64/libc-2.22.so) usleep+0x34 (/usr/lib64/libc-2.22.so) main+0x1eb (/usr/bin/usleep) __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so) _start+0x29 (/usr/bin/usleep) # Pretty compact, huh? DWARF callchains for raw_syscalls:sys_exit + frame pointer callchains for a tracepoint, if your hardware supports LBR, go wild with /call-graph=lbr/, guess the next step is to lift this from 'perf script': -F, --fields <str> comma separated output fields prepend with 'type:'. Valid types: hw,sw,trace,raw. Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,period,iregs,brstack,brstacksym,flags Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-2e7yiv5hqdm8jywlmfivvx2v@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf/x86/amd/uncore: Do not register a task ctx for uncore PMUsPeter Zijlstra2016-04-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | The new sanity check introduced by: 26657848502b ("perf/core: Verify we have a single perf_hw_context PMU") ... triggered on the AMD uncore driver. Uncore PMUs are per node, they cannot have per-task counters. Fix it. Reported-by: Borislav Petkov <bp@suse.de> Reported-by: Ingo Molnar <mingo@kernel.org> Tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@redhat.com Cc: alexander.shishkin@linux.intel.com Cc: eranian@google.com Cc: jolsa@redhat.com Cc: linux-tip-commits@vger.kernel.org Cc: vincent.weaver@maine.edu Link: http://lkml.kernel.org/r/20160404140208.GA3448@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
* perf/x86/intel/pt: Use boot_cpu_has() because it's thereAlexander Shishkin2016-04-131-1/+1
| | | | | | | | | | | | | | | | | | | At the moment, initialization path is using test_cpu_cap(&boot_cpu_data), to detect PT, which is just open coding boot_cpu_has(). Use the latter instead. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: eranian@google.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/1459953307-14372-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* uprobes/x86: Constify uprobe_xol_ops structuresJulia Lawall2016-04-131-2/+2
| | | | | | | | | | | | | | | | | The uprobe_xol_ops structures are never modified, so declare them as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Link: http://lkml.kernel.org/r/1460200649-32526-1-git-send-email-Julia.Lawall@lip6.fr Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Merge tag 'perf-core-for-mingo-20160411' of ↵Ingo Molnar2016-04-1328-146/+487
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements from Arnaldo Carvalho de Melo: User visible changes: - Automagically create a 'bpf-output' event, easing the setup of BPF C "scripts" that produce output via the perf ring buffer. Now it is just a matter of calling any perf tool, such as 'trace', with a C source file that references the __bpf_stdout__ output channel and that channel will be created and connected to the script: # trace -e nanosleep --event test_bpf_stdout.c usleep 1 0.013 ( 0.013 ms): usleep/2818 nanosleep(rqtp: 0x7ffcead45f40 ) ... 0.013 ( ): __bpf_stdout__:Raise a BPF event!..) 0.015 ( ): perf_bpf_probe:func_begin:(ffffffff81112460)) 0.261 ( ): __bpf_stdout__:Raise a BPF event!..) 0.262 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92)) 0.264 ( 0.264 ms): usleep/2818 ... [continued]: nanosleep()) = 0 # Further work is needed to reduce the number of lines in a perf bpf C source file, this being the part where we greatly reduce the command line setup (Wang Nan) - 'perf trace' now supports callchains, with 'trace --call-graph dwarf' using libunwind, just like 'perf top', to ask the kernel for stack dumps for CFI processing. This reduces the overhead by asking just for userspace callchains and also only for the syscall exit tracepoint (raw_syscalls:sys_exit) (Milian Wolff, Arnaldo Carvalho de Melo) Try it with, for instance: # perf trace --call dwarf ping 127.0.0.1 An excerpt of a system wide 'perf trace --call dwarf" session is at: https://fedorapeople.org/~acme/perf/perf-trace--call-graph-dwarf--all-cpus.txt You may need to bump the number of mmap pages, using -m/--mmap-pages, but on a Broadwell machine the defaults allowed system wide tracing to work without losing that many records, experiment with just some syscalls, like: # perf trace --call dwarf -e nanosleep,futex All the targets available for 'perf record', 'perf top' (--pid, --tid, --cpu, etc) should work. Also --duration may be interesting to try. To get filenames from in various syscalls pointer args (open, ettc), add this to the mix: # perf probe 'vfs_getname=getname_flags:72 pathname=filename:string' Making this work is next in line: # trace --call dwarf --ev sched:sched_switch/call-graph=fp/ usleep 1 I.e. honouring per-tracepoint callchains in 'perf trace' in addition to in raw_syscalls:sys_exit. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf trace: Print unresolved symbol names as addressesArnaldo Carvalho de Melo2016-04-111-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of having "[unknown]" as the name used for unresolved symbols, use the address in the callchain, in hexadecimal form: 28.801 ( 0.007 ms): qemu-system-x8/10065 ppoll(ufds: 0x55c98b39e400, nfds: 72, tsp: 0x7fffe4e4fe60, sigsetsize: 8) = 0 Timeout ppoll+0x91 (/usr/lib64/libc-2.22.so) [0x337309] (/usr/bin/qemu-system-x86_64) [0x336ab4] (/usr/bin/qemu-system-x86_64) main+0x1724 (/usr/bin/qemu-system-x86_64) __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so) [0xc59a9] (/usr/bin/qemu-system-x86_64) 35.265 (14.805 ms): gnome-shell/2287 ... [continued]: poll()) = 1 [0xf6fdd] (/usr/lib64/libc-2.22.so) g_main_context_iterate.isra.29+0x17c (/usr/lib64/libglib-2.0.so.0.4600.2) g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2) meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0) main+0x3f7 (/usr/bin/gnome-shell) __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so) [0x2909] (/usr/bin/gnome-shell) Suggested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evsel: Allow unresolved symbol names to be printed as addressesArnaldo Carvalho de Melo2016-04-114-13/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fprintf_sym() and fprintf_callchain() methods now allow users to change the existing behaviour of showing "[unknown]" as the name of unresolved symbols to instead show "[0x123456]", i.e. its address. The current patch doesn't change tools to use this facility, the results from 'perf trace' and 'perf script' cotinue like: 70.109 ( 0.001 ms): qemu-system-x8/10153 poll(ufds: 0x7f2d93ffe870, nfds: 1) = 0 Timeout [unknown] (/usr/lib64/libc-2.22.so) [unknown] (/usr/lib64/libspice-server.so.1.10.0) [unknown] (/usr/lib64/libspice-server.so.1.10.0) [unknown] (/usr/lib64/libspice-server.so.1.10.0) start_thread+0xca (/usr/lib64/libpthread-2.22.so) __clone+0x6d (/usr/lib64/libc-2.22.so) The next patch will make 'perf trace' use the new formatting. Suggested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf trace: Make "--call-graph" affect just "raw_syscalls:sys_exit"Arnaldo Carvalho de Melo2016-04-111-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't need the callchains at the syscall enter tracepoint, just when finishing it at syscall exit, so reduce the overhead by asking for callchains just at syscall exit. Suggested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evsel: Rename config_callgraph() to config_callchain() and make it publicArnaldo Carvalho de Melo2016-04-112-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rename is for consistency with the parameter name. Make it public for fine grained control of which evsels should have callchains enabled, like, for instance, will be done in the next changesets in 'perf trace', to enable callchains just on the "raw_syscalls:sys_exit" tracepoint. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-og8vup111rn357g4yagus3ao@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evlist: Add (reset,set)_sample_bit methodsArnaldo Carvalho de Melo2016-04-112-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | For fiddling with sample_type fields in all evsels in an evlist. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-dg6yavctt0hzl2tsgfb43qsr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evsel: Do not use globals in config()Arnaldo Carvalho de Melo2016-04-1115-18/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead receive a callchain_param pointer to configure callchain aspects, not doing so if NULL is passed. This will allow fine grained control over which evsels in an evlist gets callchains enabled. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-2mupip6khc92mh5x4nw9to82@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf trace: Exclude the kernel part of the callchain leading to a syscallArnaldo Carvalho de Melo2016-04-112-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel parts are not that useful: # trace -m 512 -e nanosleep --call dwarf usleep 1 0.065 ( 0.065 ms): usleep/18732 nanosleep(rqtp: 0x7ffc4ee4e200) = 0 syscall_slow_exit_work ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) __nanosleep (/usr/lib64/libc-2.22.so) usleep (/usr/lib64/libc-2.22.so) main (/usr/bin/usleep) __libc_start_main (/usr/lib64/libc-2.22.so) _start (/usr/bin/usleep) # So lets just use perf_event_attr.exclude_callchain_kernel to avoid collecting it in the ring buffer: # trace -m 512 -e nanosleep --call dwarf usleep 1 0.063 ( 0.063 ms): usleep/19212 nanosleep(rqtp: 0x7ffc3df10fb0) = 0 __nanosleep (/usr/lib64/libc-2.22.so) usleep (/usr/lib64/libc-2.22.so) main (/usr/bin/usleep) __libc_start_main (/usr/lib64/libc-2.22.so) _start (/usr/bin/usleep) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-qctu3gqhpim0dfbcp9d86c91@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evsel: Introduce fprintf_callchain() method out of fprintf_sym()Arnaldo Carvalho de Melo2016-04-113-7/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In 'perf trace' we're just interested in printing callchains, and we don't want to use the symbol_conf.use_callchain, so move the callchain part to a new method. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-kcn3romzivcpxb3u75s9nz33@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evsel: Rename print_ip() to fprintf_sym()Arnaldo Carvalho de Melo2016-04-114-43/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As it receives a FILE, and its more than just the IP, which can even be requested not to be printed. For consistency with other similar methods in tools/perf/, name it as perf_evsel__fprintf_sym() and make it return the number of bytes printed, just like 'fprintf(3)' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-84gawlqa3lhk63nf0t9vnqnn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf trace: Add support for printing call chains on sys_exit events.Milian Wolff2016-04-112-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now, one can print the call chain for every encountered sys_exit event, e.g.: $ perf trace -e nanosleep --call-graph dwarf path/to/ex_sleep 1005.757 (1000.090 ms): ex_sleep/13167 nanosleep(...) = 0 syscall_slow_exit_work ([kernel.kallsyms]) syscall_return_slowpath ([kernel.kallsyms]) int_ret_from_sys_call ([kernel.kallsyms]) __nanosleep (/usr/lib/libc-2.23.so) [unknown] (/usr/lib/libQt5Core.so.5.6.0) QThread::sleep (/usr/lib/libQt5Core.so.5.6.0) main (path/to/ex_sleep) __libc_start_main (/usr/lib/libc-2.23.so) _start (path/to/ex_sleep) Note that it is advised to increase the number of mmap pages to prevent event losses when using this new feature. Often, adding `-m 10M` to the `perf trace` invocation is enough. This feature is also available in strace when built with libunwind via `strace -k`. Performance wise, this solution is much better: $ time find path/to/linux &> /dev/null real 0m0.051s user 0m0.013s sys 0m0.037s $ time perf trace -m 800M --call-graph dwarf find path/to/linux &> /dev/null real 0m2.624s user 0m1.203s sys 0m1.333s $ time strace -k find path/to/linux &> /dev/null real 0m35.398s user 0m10.403s sys 0m23.173s Note that it is currently not possible to configure the print output. Adding such a feature, similar to what is available in `perf script` via its `--fields` knob can be added later on. Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> LPU-Reference: 1460115255-17648-1-git-send-email-milian.wolff@kdab.com [ Split from a larger patch, do not print the IP, left align, remove dup call symbol__init(), added man page entry ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evsel: Allow passing a left alignment when printing a symbolArnaldo Carvalho de Melo2016-04-113-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | For callchains, etc where we want it to align just below the syscall name, for instance, in 'perf trace' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-uk9ekchd67651c625ltaur5y@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evsel: Allow specifying a file to output in perf_evsel__print_ipMilian Wolff2016-04-113-21/+25
| | | | | | | | | | | | | | | | | | As this function will be used in 'perf trace'. Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/n/tip-8x297v9utnxq77onikevvlse@git.kernel.org [ Split from a larger patch ] Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
| * perf bpf: Automatically create bpf-output event __bpf_stdout__Wang Nan2016-04-111-9/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the need to set a bpf-output event in cmdline. By referencing a map named '__bpf_stdout__', perf automatically creates an event for it. For example: # perf record -e ./test_bpf_trace.c usleep 100000 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ] # perf script usleep 4639 [000] 261895.307826: 0 __bpf_stdout__: ffffffff810eb9a1 ... BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" usleep 4639 [000] 261895.407883: 0 __bpf_stdout__: ffffffff8105d609 ... BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" perf record -e ./test_bpf_trace.c usleep 100000 equals to: perf record -e bpf-output/no-inherit=1,name=__bpf_stdout__/ \ -e ./test_bpf_trace.c/map:__bpf_stdout__.event=__bpf_stdout__/ \ usleep 100000 Where test_bpf_trace.c is: /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; #define SEC(NAME) __attribute__((section(NAME), used)) static u64 (*ktime_get_ns)(void) = (void *)BPF_FUNC_ktime_get_ns; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; static int (*get_smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) = (void *)BPF_FUNC_perf_event_output; struct bpf_map_def SEC("maps") __bpf_stdout__ = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(u32), .max_entries = __NR_CPUS__, }; static inline int __attribute__((always_inline)) func(void *ctx, int type) { char output_str[] = "Raise a BPF event!"; char err_str[] = "BAD %d\n"; int err; err = perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), &output_str, sizeof(output_str)); if (err) trace_printk(err_str, sizeof(err_str), err); return 1; } SEC("func_begin=sys_nanosleep") int func_begin(void *ctx) {return func(ctx, 1);} SEC("func_end=sys_nanosleep%return") int func_end(void *ctx) { return func(ctx, 2);} char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ Committer note: Testing with 'perf trace': # trace -e nanosleep --ev test_bpf_stdout.c usleep 1 0.007 ( 0.007 ms): usleep/729 nanosleep(rqtp: 0x7ffc5bbc5fe0) ... 0.007 ( ): __bpf_stdout__:Raise a BPF event!..) 0.008 ( ): perf_bpf_probe:func_begin:(ffffffff81112460)) 0.069 ( ): __bpf_stdout__:Raise a BPF event!..) 0.070 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92)) 0.072 ( 0.072 ms): usleep/729 ... [continued]: nanosleep()) = 0 # Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460128045-97310-5-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf bpf: Clone bpf stdout events in multiple bpf scriptsWang Nan2016-04-114-0/+158
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows cloning bpf-output event configuration among multiple bpf scripts. If there exist a map named '__bpf_output__' and not configured using 'map:__bpf_output__.event=', this patch clones the configuration of another '__bpf_stdout__' map. For example, following command: # perf trace --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \ --ev ./test_bpf_trace2.c usleep 100000 equals to: # perf trace --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \ --ev ./test_bpf_trace2.c/map:__bpf_stdout__.event=evt/ \ usleep 100000 Signed-off-by: Wang Nan <wangnan0@huawei.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460128045-97310-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf dwarf: Guard !x86_64 definitions under #ifdef else clauseArnaldo Carvalho de Melo2016-04-081-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To fix the build on Fedora Rawhide (gcc 6.0.0 20160311 (Red Hat 6.0.0-0.17): CC /tmp/build/perf/arch/x86/util/dwarf-regs.o arch/x86/util/dwarf-regs.c:66:36: error: 'x86_32_regoffset_table' defined but not used [-Werror=unused-const-variable=] static const struct pt_regs_offset x86_32_regoffset_table[] = { ^~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-fghuksc1u8ln82bof4lwcj0o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo2016-04-081-30/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The readdir() function is thread safe as long as just one thread uses a DIR, which is the case when parsing tracepoint event definitions, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wddn49r6bz6wq4ee3dxbl7lo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo2016-04-081-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The readdir() function is thread safe as long as just one thread uses a DIR, which is the case when synthesizing events for pre-existing threads by traversing /proc, so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. CC /tmp/build/perf/util/event.o util/event.c: In function '__event__synthesize_thread': util/event.c:466:2: error: 'readdir_r' is deprecated [-Werror=deprecated-declarations] while (!readdir_r(tasks, &dirent, &next) && next) { ^~~~~ In file included from /usr/include/features.h:368:0, from /usr/include/stdint.h:25, from /usr/lib/gcc/x86_64-redhat-linux/6.0.0/include/stdint.h:9, from /git/linux/tools/include/linux/types.h:6, from util/event.c:1: /usr/include/dirent.h:189:12: note: declared here Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i1vj7nyjp2p750rirxgrfd3c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf thread_map: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo2016-04-081-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The readdir() function is thread safe as long as just one thread uses a DIR, which is the case in thread_map, so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-del8h2a0f40z75j4r42l96l0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf script: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo2016-04-081-36/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The readdir() function is thread safe as long as just one thread uses a DIR, which is the case in 'perf script', so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-mt3xz7n2hl49ni2vx7kuq74g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | Merge tag 'perf-core-for-mingo-20160408' of ↵Ingo Molnar2016-04-1329-117/+1061
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Beautify more syscall arguments in 'perf trace', using the type column in tracepoint /format fields to attach, for instance, a pid_t resolver to the thread COMM, also attach a mode_t beautifier in the same fashion (Arnaldo Carvalho de Melo) - Build the syscall table id <-> name resolver using the same .tbl file used in the kernel to generate headers, to avoid the delay in getting new syscalls supported in the audit-libs external dependency, done so far only for x86_64 (Arnaldo Carvalho de Melo) - Improve the documentation of event specifications (Andi Kleen) - Process update events in 'perf script', fixing up this use case: # perf stat -a -I 1000 -e cycles record | perf script -s script.py - Shared object symbol adjustment fixes, fixing symbol resolution in Android (Wang Nan) Infrastructure changes: - Add dedicated unwind addr_space member into thread struct, to allow tools to use thread->priv, noticed while working on having callchains in 'perf trace' (Jiri Olsa) Build fixes: - Fix the build in Ubuntu 12.04 (Arnaldo Carvalho de Melo, Vinson Lee) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
OpenPOWER on IntegriCloud