From 67540759151aefafddade3e27c4671ab7b3d230f Mon Sep 17 00:00:00 2001 From: Milian Wolff Date: Tue, 16 Aug 2016 17:39:26 +0200 Subject: perf unwind: Use addr_location::addr instead of ip for entries This fixes the srcline translation for call chains of user space applications. Before we got: perf report --stdio --no-children -s sym,srcline -g address 8.92% [.] main mandelbrot.h:41 | |--3.70%--main +8390240 | __libc_start_main +139950056726769 | _start +8388650 | |--2.74%--main +8390189 | --2.08%--main +8390296 __libc_start_main +139950056726769 _start +8388650 7.59% [.] main complex:1326 | |--4.79%--main +8390203 | __libc_start_main +139950056726769 | _start +8388650 | --2.80%--main +8390219 7.12% [.] __muldc3 libgcc2.c:1945 | |--3.76%--__muldc3 +139950060519490 | main +8390224 | __libc_start_main +139950056726769 | _start +8388650 | --3.32%--__muldc3 +139950060519512 main +8390224 With this patch applied, we instead get: perf report --stdio --no-children -s sym,srcline -g address 8.92% [.] main mandelbrot.h:41 | |--3.70%--main mandelbrot.h:41 | __libc_start_main +241 | _start +4194346 | |--2.74%--main mandelbrot.h:41 | --2.08%--main mandelbrot.h:41 __libc_start_main +241 _start +4194346 7.59% [.] main complex:1326 | |--4.79%--main complex:1326 | __libc_start_main +241 | _start +4194346 | --2.80%--main complex:1326 7.12% [.] __muldc3 libgcc2.c:1945 | |--3.76%--__muldc3 libgcc2.c:1945 | main mandelbrot.h:39 | __libc_start_main +241 | _start +4194346 | --3.32%--__muldc3 libgcc2.c:1945 main mandelbrot.h:39 Suggested-and-Acked-by: Namhyung Kim Signed-off-by: Milian Wolff Tested-by: Arnaldo Carvalho de Melo LPU-Reference: 20160816153926.11288-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/unwind-libdw.c | 2 +- tools/perf/util/unwind-libunwind-local.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c index cf5e250..783a53f 100644 --- a/tools/perf/util/unwind-libdw.c +++ b/tools/perf/util/unwind-libdw.c @@ -66,7 +66,7 @@ static int entry(u64 ip, struct unwind_info *ui) if (__report_module(&al, ip, ui)) return -1; - e->ip = ip; + e->ip = al.addr; e->map = al.map; e->sym = al.sym; diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c index 97c0f8f..20c2e57 100644 --- a/tools/perf/util/unwind-libunwind-local.c +++ b/tools/perf/util/unwind-libunwind-local.c @@ -542,7 +542,7 @@ static int entry(u64 ip, struct thread *thread, thread__find_addr_location(thread, PERF_RECORD_MISC_USER, MAP__FUNCTION, ip, &al); - e.ip = ip; + e.ip = al.addr; e.map = al.map; e.sym = al.sym; -- cgit v1.1 From 0215d59b154ab90c56c4fe49bc1deefe8bca18f1 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Thu, 18 Aug 2016 09:28:23 -0700 Subject: tools lib: Reinstate strlcpy() header guard with __UCLIBC__ MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit perf tools build in recent kernels spews splat when cross compiling with uClibc: | CC util/alias.o | In file included from tools/perf/util/../ui/../util/cache.h:8:0, | from tools/perf/util/../ui/helpline.h:7, | from tools/perf/util/debug.h:8, | from arch/../util/cpumap.h:9, | from arch/../util/env.h:5, | from arch/common.h:4, | from arch/common.c:3: | tools/include/linux/string.h:12:15: warning: redundant redeclaration of ‘strlcpy’ [-Wredundant-decls] | extern size_t strlcpy(char *dest, const char *src, size_t size); ^ This is after commit 61a6445e463a31 ("tools lib: Guard the strlcpy() header with __GLIBC__"). The problem is uClibc also defines __GLIBC__ for exported headers for applications. So add that specific check to not trip for uClibc. Signed-off-by: Vineet Gupta Cc: Adrian Hunter Cc: Alexey Brodkin Cc: David Ahern Cc: Jiri Olsa Cc: Josh Poimboeuf Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Petri Gynther Cc: Wang Nan Cc: linux-snps-arc@lists.infradead.org Link: http://lkml.kernel.org/r/1471537703-16439-1-git-send-email-vgupta@synopsys.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/include/linux/string.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/tools/include/linux/string.h b/tools/include/linux/string.h index b968794..f436d24 100644 --- a/tools/include/linux/string.h +++ b/tools/include/linux/string.h @@ -8,7 +8,11 @@ void *memdup(const void *src, size_t len); int strtobool(const char *s, bool *res); -#ifdef __GLIBC__ +/* + * glibc based builds needs the extern while uClibc doesn't. + * However uClibc headers also define __GLIBC__ hence the hack below + */ +#if defined(__GLIBC__) && !defined(__UCLIBC__) extern size_t strlcpy(char *dest, const char *src, size_t size); #endif -- cgit v1.1 From c53412ee8c7ec31373a4176ff7f3a6b79296c05c Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Thu, 18 Aug 2016 16:30:28 -0300 Subject: perf evsel: Do not access outside hw cache name arrays We have to check if the values are >= *_MAX, not just >, fix it. From the bugzilla report: ''In file /tools/perf/util/evsel.c function __perf_evsel__hw_cache_name it appears that there is a bug that reads beyond the end of the buffer. The statement "if (type > PERF_COUNT_HW_CACHE_MAX)" allows type to be equal to the maximum value. Later, when statement "if (!perf_evsel__is_cache_op_valid(type, op))" is executed, the function can access array perf_evsel__hw_cache_stat[type] beyond the end of the buffer. It appears to me that the statement "if (type > PERF_COUNT_HW_CACHE_MAX)" should be "if (type >= PERF_COUNT_HW_CACHE_MAX)" Bug found with Coverity and manual code review. No attempts were made to execute the code with a maximum type value.'' Committer note: Testing it: $ perf record -e $(echo $(perf list cache | cut -d \[ -f1) | sed 's/ /,/g') usleep 1 [ perf record: Woken up 16 times to write data ] [ perf record: Captured and wrote 0.023 MB perf.data (34 samples) ] $ perf evlist L1-dcache-load-misses L1-dcache-loads L1-dcache-stores L1-icache-load-misses LLC-load-misses LLC-loads LLC-store-misses LLC-stores branch-load-misses branch-loads dTLB-load-misses dTLB-loads dTLB-store-misses dTLB-stores iTLB-load-misses iTLB-loads node-load-misses node-loads node-store-misses node-stores $ perf list cache List of pre-defined events (to be used in -e): L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-stores [Hardware cache event] L1-icache-load-misses [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-stores [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-stores [Hardware cache event] iTLB-load-misses [Hardware cache event] iTLB-loads [Hardware cache event] node-load-misses [Hardware cache event] node-loads [Hardware cache event] node-store-misses [Hardware cache event] node-stores [Hardware cache event] $ Reported-by: Brian Sweeney Tested-by: Arnaldo Carvalho de Melo Cc: Jiri Olsa Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=153351 Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/evsel.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index d9b80ef..21fd573 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -507,17 +507,17 @@ static int __perf_evsel__hw_cache_name(u64 config, char *bf, size_t size) u8 op, result, type = (config >> 0) & 0xff; const char *err = "unknown-ext-hardware-cache-type"; - if (type > PERF_COUNT_HW_CACHE_MAX) + if (type >= PERF_COUNT_HW_CACHE_MAX) goto out_err; op = (config >> 8) & 0xff; err = "unknown-ext-hardware-cache-op"; - if (op > PERF_COUNT_HW_CACHE_OP_MAX) + if (op >= PERF_COUNT_HW_CACHE_OP_MAX) goto out_err; result = (config >> 16) & 0xff; err = "unknown-ext-hardware-cache-result"; - if (result > PERF_COUNT_HW_CACHE_RESULT_MAX) + if (result >= PERF_COUNT_HW_CACHE_RESULT_MAX) goto out_err; err = "invalid-cache"; -- cgit v1.1 From 8b6a3fe8fab97716990a3abde1a01fb5a34552a3 Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Wed, 24 Aug 2016 10:07:14 +0100 Subject: perf/core: Use this_cpu_ptr() when stopping AUX events When tearing down an AUX buf for an event via perf_mmap_close(), __perf_event_output_stop() is called on the event's CPU to ensure that trace generation is halted before the process of unmapping and freeing the buffer pages begins. The callback is performed via cpu_function_call(), which ensures that it runs with interrupts disabled and is therefore not preemptible. Unfortunately, the current code grabs the per-cpu context pointer using get_cpu_ptr(), which unnecessarily disables preemption and doesn't pair the call with put_cpu_ptr(), leading to a preempt_count() imbalance and a BUG when freeing the AUX buffer later on: WARNING: CPU: 1 PID: 2249 at kernel/events/ring_buffer.c:539 __rb_free_aux+0x10c/0x120 Modules linked in: [...] Call Trace: [] dump_stack+0x4f/0x72 [] __warn+0xc6/0xe0 [] warn_slowpath_null+0x18/0x20 [] __rb_free_aux+0x10c/0x120 [] rb_free_aux+0x13/0x20 [] perf_mmap_close+0x29e/0x2f0 [] ? perf_iterate_ctx+0xe0/0xe0 [] remove_vma+0x25/0x60 [] exit_mmap+0x106/0x140 [] mmput+0x1c/0xd0 [] do_exit+0x253/0xbf0 [] do_group_exit+0x3e/0xb0 [] get_signal+0x249/0x640 [] do_signal+0x23/0x640 [] ? _raw_write_unlock_irq+0x12/0x30 [] ? _raw_spin_unlock_irq+0x9/0x10 [] ? __schedule+0x2c6/0x710 [] exit_to_usermode_loop+0x74/0x90 [] prepare_exit_to_usermode+0x26/0x30 [] retint_user+0x8/0x10 This patch uses this_cpu_ptr() instead of get_cpu_ptr(), since preemption is already disabled by the caller. Signed-off-by: Will Deacon Reviewed-by: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Vince Weaver Fixes: 95ff4ca26c49 ("perf/core: Free AUX pages in unmap path") Link: http://lkml.kernel.org/r/20160824091905.GA16944@arm.com Signed-off-by: Ingo Molnar --- kernel/events/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 5650f53..3cfabdf 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -6166,7 +6166,7 @@ static int __perf_pmu_output_stop(void *info) { struct perf_event *event = info; struct pmu *pmu = event->pmu; - struct perf_cpu_context *cpuctx = get_cpu_ptr(pmu->pmu_cpu_context); + struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context); struct remote_output ro = { .rb = event->rb, }; -- cgit v1.1