From 78a5a93c1eeb4e6933d1f62b33e5496d53b46c5a Mon Sep 17 00:00:00 2001
From: Daniel Borkmann
Date: Thu, 8 Jun 2017 19:06:25 +0200
Subject: bpf, tests: fix endianness selection

I noticed that test_l4lb was failing in selftests:

  # ./test_progs
  test_pkt_access:PASS:ipv4 77 nsec
  test_pkt_access:PASS:ipv6 44 nsec
  test_xdp:PASS:ipv4 2933 nsec
  test_xdp:PASS:ipv6 1500 nsec
  test_l4lb:PASS:ipv4 377 nsec
  test_l4lb:PASS:ipv6 544 nsec
  test_l4lb:FAIL:stats 6297600000 200000
  test_tcp_estats:PASS: 0 nsec
  Summary: 7 PASSED, 1 FAILED

Tracking down the issue actually revealed that endianness selection
in bpf_endian.h is broken when compiled with clang with bpf target.
test_pkt_access.c, test_l4lb.c is compiled with __BYTE_ORDER as
__BIG_ENDIAN, test_xdp.c as __LITTLE_ENDIAN! test_l4lb noticeably
fails, because the test accounts bytes via bpf_ntohs(ip6h->payload_len)
and bpf_ntohs(iph->tot_len), and compares them against a defined
value and given a wrong endianness, the test outcome is different,
of course.

Turns out that there are actually two bugs:

  i) when we do __BYTE_ORDER comparison with __LITTLE_ENDIAN/__BIG_ENDIAN,
     then depending on the include order we see different outcomes.
     Reason is that __BYTE_ORDER is undefined due to missing endian.h
     include. Before we include the asm/byteorder.h (e.g. through
     linux/in.h), then __BYTE_ORDER equals __LITTLE_ENDIAN since both
     are undefined, after the include which correctly pulls in
     linux/byteorder/little_endian.h, __LITTLE_ENDIAN is defined, but
     given __BYTE_ORDER is still undefined, we match on __BYTE_ORDER
     equals to __BIG_ENDIAN since __BIG_ENDIAN is also undefined at
     that point, sigh.

 ii) But even that would be wrong, since when compiling the test cases
     with clang, one can select between bpfeb and bpfel targets for
     cross compilation. Hence, we can also not rely on what the system's
     endian.h provides, but we need to look at the compiler's defined
     endianness.
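
A minimal C sketch of bug i), not part of the patch: in an #if expression
the preprocessor replaces every identifier that is not a defined macro
with 0, so two undefined byte-order macros always compare equal. The
HYP_* names are hypothetical stand-ins for __BYTE_ORDER, __LITTLE_ENDIAN
and __BIG_ENDIAN when endian.h was never included. __BYTE_ORDER__ and the
__ORDER_*__ macros are the compiler-provided ones (GCC and clang), which
the sketch cross-checks against the in-memory layout at run time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* None of the HYP_* byte-order macros is ever defined (hypothetical names).
 * In #if, each undefined identifier evaluates to 0, so this is "0 == 0". */
#if HYP_BYTE_ORDER == HYP_LITTLE_ENDIAN
# define HYP_FIRST_BRANCH_TAKEN 1	/* taken regardless of the real CPU */
#elif HYP_BYTE_ORDER == HYP_BIG_ENDIAN
# define HYP_FIRST_BRANCH_TAKEN 0
#endif

/* The compiler-provided macros are always defined and follow the target. */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define HYP_COMPILER_SAYS_LITTLE 1
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define HYP_COMPILER_SAYS_LITTLE 0
#else
# error "unexpected __BYTE_ORDER__"
#endif

/* Inspect the first byte of a 16-bit value as it is laid out in memory. */
static int hyp_runtime_is_little_endian(void)
{
	uint16_t probe = 0x0102;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 0x02;
}

/* The undefined-macro comparison "matches" little endian on any machine,
 * while the compiler's view agrees with the actual memory layout. */
static void hyp_endian_selftest(void)
{
	assert(HYP_FIRST_BRANCH_TAKEN == 1);
	assert(HYP_COMPILER_SAYS_LITTLE == hyp_runtime_is_little_endian());
}
```

Calling hyp_endian_selftest() passes on both little and big endian hosts,
which is the point: the first check says nothing about the machine.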

The compiler defines __BYTE_ORDER__, and we can match __ORDER_LITTLE_ENDIAN__
and __ORDER_BIG_ENDIAN__, which also reflects targets bpf (native), bpfel,
bpfeb correctly, thus really only rely on that.

After patch:

  # ./test_progs
  test_pkt_access:PASS:ipv4 74 nsec
  test_pkt_access:PASS:ipv6 42 nsec
  test_xdp:PASS:ipv4 2340 nsec
  test_xdp:PASS:ipv6 1461 nsec
  test_l4lb:PASS:ipv4 400 nsec
  test_l4lb:PASS:ipv6 530 nsec
  test_tcp_estats:PASS: 0 nsec
  Summary: 7 PASSED, 0 FAILED

Fixes: 43bcf707ccdc ("bpf: fix _htons occurences in test_progs")
Signed-off-by: Daniel Borkmann
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller
---
 tools/testing/selftests/bpf/bpf_endian.h | 41 +++++++++++++++++++++++---------
 1 file changed, 30 insertions(+), 11 deletions(-)
 (limited to 'tools')

diff --git a/tools/testing/selftests/bpf/bpf_endian.h b/tools/testing/selftests/bpf/bpf_endian.h
index 19d0604..487cbfb 100644
--- a/tools/testing/selftests/bpf/bpf_endian.h
+++ b/tools/testing/selftests/bpf/bpf_endian.h
@@ -1,23 +1,42 @@
 #ifndef __BPF_ENDIAN__
 #define __BPF_ENDIAN__
-#include
+#include
-#if __BYTE_ORDER == __LITTLE_ENDIAN
-# define __bpf_ntohs(x) __builtin_bswap16(x)
-# define __bpf_htons(x) __builtin_bswap16(x)
-#elif __BYTE_ORDER == __BIG_ENDIAN
-# define __bpf_ntohs(x) (x)
-# define __bpf_htons(x) (x)
+/* LLVM's BPF target selects the endianness of the CPU
+ * it compiles on, or the user specifies (bpfel/bpfeb),
+ * respectively. The used __BYTE_ORDER__ is defined by
+ * the compiler, we cannot rely on __BYTE_ORDER from
+ * libc headers, since it doesn't reflect the actual
+ * requested byte order.
+ *
+ * Note, LLVM's BPF target has different __builtin_bswapX()
+ * semantics. It does map to BPF_ALU | BPF_END | BPF_TO_BE
+ * in bpfel and bpfeb case, which means below, that we map
+ * to cpu_to_be16(). We could use it unconditionally in BPF
+ * case, but better not rely on it, so that this header here
+ * can be used from application and BPF program side, which
+ * use different targets.
+ */
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+# define __bpf_ntohs(x) __builtin_bswap16(x)
+# define __bpf_htons(x) __builtin_bswap16(x)
+# define __bpf_constant_ntohs(x) ___constant_swab16(x)
+# define __bpf_constant_htons(x) ___constant_swab16(x)
+#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+# define __bpf_ntohs(x) (x)
+# define __bpf_htons(x) (x)
+# define __bpf_constant_ntohs(x) (x)
+# define __bpf_constant_htons(x) (x)
 #else
-# error "Fix your __BYTE_ORDER?!"
+# error "Fix your compiler's __BYTE_ORDER__?!"
 #endif
 #define bpf_htons(x) \
 	(__builtin_constant_p(x) ? \
-	 __constant_htons(x) : __bpf_htons(x))
+	 __bpf_constant_htons(x) : __bpf_htons(x))
 #define bpf_ntohs(x) \
 	(__builtin_constant_p(x) ? \
-	 __constant_ntohs(x) : __bpf_ntohs(x))
+	 __bpf_constant_ntohs(x) : __bpf_ntohs(x))
-#endif
+#endif /* __BPF_ENDIAN__ */
--
cgit v1.1

From 7a1ac110c22eb726684c837544a2d42c33e07be7 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo
Date: Fri, 9 Jun 2017 16:54:28 -0300
Subject: perf evsel: Fix probing of precise_ip level for default cycles event

Since commit 18e7a45af91a ("perf/x86: Reject non sampling events with
precise_ip") returns -EINVAL for sys_perf_event_open() with an attribute
with (attr.precise_ip > 0 && attr.sample_period == 0), just like is done
in the routine used to probe the max precise level when no events were
passed to 'perf record' or 'perf top', i.e.:

	perf_evsel__new_cycles()
		perf_event_attr__set_max_precise_ip()

The x86 code, in x86_pmu_hw_config(), which is called all the way from
sys_perf_event_open() did, starting with the aforementioned commit:

	/* There's no sense in having PEBS for non sampling events: */
	if (!is_sampling_event(event))
		return -EINVAL;

Which makes it fail for cycles:ppp, cycles:pp and cycles:p, always using
just the non precise
cycles variant.

To make sure that this is the case, I tested it, before this patch,
with:

  # perf probe -L x86_pmu_hw_config
        0  int x86_pmu_hw_config(struct perf_event *event)
        1  {
        2          if (event->attr.precise_ip) {

       17                  if (event->attr.precise_ip > precise)
       18                          return -EOPNOTSUPP;

                           /* There's no sense in having PEBS for non sampling events: */
       21                  if (!is_sampling_event(event))
       22                          return -EINVAL;
                   }

  # perf probe x86_pmu_hw_config:22
  Added new events:
    probe:x86_pmu_hw_config (on x86_pmu_hw_config:22)
    probe:x86_pmu_hw_config_1 (on x86_pmu_hw_config:22)

  You can now use it in all perf tools, such as:

	perf record -e probe:x86_pmu_hw_config_1 -aR sleep 1

  # perf trace -e perf_event_open,probe:x86_pmu_hwconfig*/max-stack=16/ perf record usleep 1
  0.000 ( 0.015 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
  0.015 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
          x86_pmu_hw_config ([kernel.kallsyms])
          hsw_hw_config ([kernel.kallsyms])
          x86_pmu_event_init ([kernel.kallsyms])
          perf_try_init_event ([kernel.kallsyms])
          perf_event_alloc ([kernel.kallsyms])
          SYSC_perf_event_open ([kernel.kallsyms])
          sys_perf_event_open ([kernel.kallsyms])
          do_syscall_64 ([kernel.kallsyms])
          return_from_SYSCALL_64 ([kernel.kallsyms])
          syscall (/usr/lib64/libc-2.24.so)
          perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
          perf_evsel__new_cycles (/home/acme/bin/perf)
          perf_evlist__add_default (/home/acme/bin/perf)
          cmd_record (/home/acme/bin/perf)
          run_builtin (/home/acme/bin/perf)
          handle_internal_command (/home/acme/bin/perf)
  0.000 ( 0.021 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
  0.023 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
  0.025 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
          x86_pmu_hw_config ([kernel.kallsyms])
          hsw_hw_config ([kernel.kallsyms])
          x86_pmu_event_init ([kernel.kallsyms])
          perf_try_init_event ([kernel.kallsyms])
          perf_event_alloc ([kernel.kallsyms])
          SYSC_perf_event_open ([kernel.kallsyms])
          sys_perf_event_open ([kernel.kallsyms])
          do_syscall_64 ([kernel.kallsyms])
          return_from_SYSCALL_64 ([kernel.kallsyms])
          syscall (/usr/lib64/libc-2.24.so)
          perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
          perf_evsel__new_cycles (/home/acme/bin/perf)
          perf_evlist__add_default (/home/acme/bin/perf)
          cmd_record (/home/acme/bin/perf)
          run_builtin (/home/acme/bin/perf)
          handle_internal_command (/home/acme/bin/perf)
  0.023 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
  0.028 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
  0.030 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
          x86_pmu_hw_config ([kernel.kallsyms])
          hsw_hw_config ([kernel.kallsyms])
          x86_pmu_event_init ([kernel.kallsyms])
          perf_try_init_event ([kernel.kallsyms])
          perf_event_alloc ([kernel.kallsyms])
          SYSC_perf_event_open ([kernel.kallsyms])
          sys_perf_event_open ([kernel.kallsyms])
          do_syscall_64 ([kernel.kallsyms])
          return_from_SYSCALL_64 ([kernel.kallsyms])
          syscall (/usr/lib64/libc-2.24.so)
          perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
          perf_evsel__new_cycles (/home/acme/bin/perf)
          perf_evlist__add_default (/home/acme/bin/perf)
          cmd_record (/home/acme/bin/perf)
          run_builtin (/home/acme/bin/perf)
          handle_internal_command (/home/acme/bin/perf)
  0.028 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
  41.018 ( 0.012 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8b5dd0, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
  41.065 ( 0.011 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
  41.080 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
  41.103 ( 0.010 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
  41.115 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
  41.122 ( 0.004 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
  41.128 ( 0.008 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ]
  #

I.e. that return -EINVAL in x86_pmu_hw_config() is hit three times.

So fix it by just setting attr.sample_period.

Now, after this patch:

  # perf trace --max-stack=2 -e perf_event_open,probe:x86_pmu_hw_config* perf record usleep 1
  [ perf record: Woken up 1 times to write data ]
  0.000 ( 0.017 ms): perf/8469 perf_event_open(attr_uptr: 0x7ffe36c27d10, pid: -1, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 4
          syscall (/usr/lib64/libc-2.24.so)
          perf_event_open_cloexec_flag (/home/acme/bin/perf)
  0.050 ( 0.031 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
          syscall (/usr/lib64/libc-2.24.so)
          perf_evlist__config (/home/acme/bin/perf)
  0.092 ( 0.040 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
          syscall (/usr/lib64/libc-2.24.so)
          perf_evlist__config (/home/acme/bin/perf)
  0.143 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, cpu: -1, group_fd: -1 ) = 4
          syscall (/usr/lib64/libc-2.24.so)
          perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
  0.161 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
          syscall (/usr/lib64/libc-2.24.so)
          perf_evsel__open (/home/acme/bin/perf)
  0.171 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
          syscall (/usr/lib64/libc-2.24.so)
          perf_evsel__open (/home/acme/bin/perf)
  0.180 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
          syscall (/usr/lib64/libc-2.24.so)
          perf_evsel__open (/home/acme/bin/perf)
  0.190 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
          syscall (/usr/lib64/libc-2.24.so)
          perf_evsel__open (/home/acme/bin/perf)
  [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
  #

The probe one called from perf_event_attr__set_max_precise_ip() works
the first time, with attr.precise_ip = 3, with the next ones being the
per cpu ones for the cycles:ppp event.

And here is the text from a report and alternative proposed patch by
Thomas-Mich Richter:

---

On s390 the counter and sampling facility do not support a precise IP
skid level and sometimes return EOPNOTSUPP when structure member
precise_ip in struct perf_event_attr is not set to zero.

On s390 command 'perf record -- true' fails with error EOPNOTSUPP. This
happens only when no events are specified on command line.

The functions called are

	...
	--> perf_evlist__add_default
	    --> perf_evsel__new_cycles
	        --> perf_event_attr__set_max_precise_ip

The last function determines the value of structure member precise_ip by
invoking the perf_event_open() system call and checking the return code.
The first successful open is the value for precise_ip.

However the value is determined without setting member sample_period and
indicates no sampling.

On s390 the counter facility and sampling facility are different. The
above procedure determines a precise_ip value of 3 using the counter
facility. Later it uses the sampling facility with a value of 3 and
fails with EOPNOTSUPP.

---

v2: Older compilers (e.g. gcc 4.4.7) don't support referencing members of
    unnamed union members in the container struct initialization, so
    move from:

	struct perf_event_attr attr = {
		...
		.sample_period = 1,
	};

    to right after it as:

	struct perf_event_attr attr = {
		...
	};

	attr.sample_period = 1;

v3: We need to reset .sample_period to 0 to let the users of
    perf_evsel__new_cycles() to properly setup attr.sample_period or
    attr.sample_freq. Reported by Ingo Molnar.
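
The probing described above can be sketched roughly as follows. This is
an illustration under stated assumptions, not the code in
tools/perf/util/evsel.c: hyp_probe_max_precise_ip() is a hypothetical
name, exclude_kernel is set only so that the sketch needs fewer
privileges, and the raw syscall() wrapper is used because glibc does not
provide a perf_event_open() function.

```c
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: return the highest precise_ip level (3..1) that the
 * kernel accepts for a sampling cycles event, or 0 if none is accepted
 * (which also covers systems where perf events are unavailable). */
static int hyp_probe_max_precise_ip(void)
{
	struct perf_event_attr attr;
	int level;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.exclude_kernel = 1;	/* assumption: keeps the sketch unprivileged */

	/* Probe as a sampling event; with sample_period == 0 the kernel
	 * may reject any precise_ip > 0 with -EINVAL. */
	attr.sample_period = 1;

	for (level = 3; level > 0; level--) {
		long fd;

		attr.precise_ip = level;
		/* pid 0 = calling thread, cpu -1 = any, no group, no flags */
		fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
		if (fd >= 0) {
			close((int)fd);
			return level;
		}
	}
	return 0;
}
```

A caller that later configures its own sampling settings would, as in v3
above, reset attr.sample_period to 0 after probing.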

Reported-and-Acked-by: Thomas-Mich Richter
Acked-by: Hendrik Brueckner
Acked-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Wang Nan
Fixes: 18e7a45af91a ("perf/x86: Reject non sampling events with precise_ip")
Link: http://lkml.kernel.org/n/tip-yv6nnkl7tzqocrm0hl3x7vf1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/tests/task-exit.c |  2 +-
 tools/perf/util/evsel.c      | 12 ++++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)
 (limited to 'tools')

diff --git a/tools/perf/tests/task-exit.c b/tools/perf/tests/task-exit.c
index 32873ec..cf00eba 100644
--- a/tools/perf/tests/task-exit.c
+++ b/tools/perf/tests/task-exit.c
@@ -83,7 +83,7 @@ int test__task_exit(int subtest __maybe_unused)
 	evsel = perf_evlist__first(evlist);
 	evsel->attr.task = 1;
-	evsel->attr.sample_freq = 0;
+	evsel->attr.sample_freq = 1;
 	evsel->attr.inherit = 0;
 	evsel->attr.watermark = 0;
 	evsel->attr.wakeup_events = 1;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index e4f7902..cda44b0 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -273,8 +273,20 @@ struct perf_evsel *perf_evsel__new_cycles(void)
 	struct perf_evsel *evsel;
 	event_attr_init(&attr);
+	/*
+	 * Unnamed union member, not supported as struct member named
+	 * initializer in older compilers such as gcc 4.4.7
+	 *
+	 * Just for probing the precise_ip:
+	 */
+	attr.sample_period = 1;
 	perf_event_attr__set_max_precise_ip(&attr);
+	/*
+	 * Now let the usual logic to set up the perf_event_attr defaults
+	 * to kick in when we return and before perf_evsel__open() is called.
+	 */
+	attr.sample_period = 0;
 	evsel = perf_evsel__new(&attr);
 	if (evsel == NULL)
--
cgit v1.1

From 7a759cd8e8272ee18922838ee711219c7c796a31 Mon Sep 17 00:00:00 2001
From: Jiada Wang
Date: Sun, 9 Apr 2017 20:02:37 -0700
Subject: perf tools: Fix build with ARCH=x86_64

With commit: 0a943cb10ce78 (tools build: Add HOSTARCH Makefile variable)
when building for ARCH=x86_64, ARCH=x86_64 is passed to perf instead of
ARCH=x86, so the perf build process searches header files from
tools/arch/x86_64/include, which doesn't exist.

The following build failure is seen:

  In file included from util/event.c:2:0:
  tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h: No such file or directory
  compilation terminated.

Fix this issue by using SRCARCH instead of ARCH in perf, just like the
main kernel Makefile and tools/objtool's.

Signed-off-by: Jiada Wang
Tested-by: Arnaldo Carvalho de Melo
Acked-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Eugeniu Rosca
Cc: Jan Stancek
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Ravi Bangoria
Cc: Rui Teng
Cc: Sukadev Bhattiprolu
Cc: Wang Nan
Fixes: 0a943cb10ce7 ("tools build: Add HOSTARCH Makefile variable")
Link: http://lkml.kernel.org/r/1491793357-14977-2-git-send-email-jiada_wang@mentor.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/Makefile.config  | 38 +++++++++++++++++++-------------------
 tools/perf/Makefile.perf    |  2 +-
 tools/perf/arch/Build       |  2 +-
 tools/perf/pmu-events/Build |  4 ++--
 tools/perf/tests/Build      |  2 +-
 tools/perf/util/header.c    |  2 +-
 6 files changed, 25 insertions(+), 25 deletions(-)
 (limited to 'tools')

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 8354d04..1f4fbc9 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -19,18 +19,18 @@ CFLAGS := $(EXTRA_CFLAGS) $(EXTRA_WARNINGS)
 include $(srctree)/tools/scripts/Makefile.arch
-$(call detected_var,ARCH)
+$(call detected_var,SRCARCH)
 NO_PERF_REGS := 1
 # Additional ARCH settings for ppc
-ifeq ($(ARCH),powerpc)
+ifeq ($(SRCARCH),powerpc)
   NO_PERF_REGS := 0
   LIBUNWIND_LIBS := -lunwind -lunwind-ppc64
 endif
 # Additional ARCH settings for x86
-ifeq ($(ARCH),x86)
+ifeq ($(SRCARCH),x86)
   $(call detected,CONFIG_X86)
   ifeq (${IS_64_BIT}, 1)
     CFLAGS += -DHAVE_ARCH_X86_64_SUPPORT -DHAVE_SYSCALL_TABLE -I$(OUTPUT)arch/x86/include/generated
@@ -43,12 +43,12 @@ ifeq ($(ARCH),x86)
   NO_PERF_REGS := 0
 endif
-ifeq ($(ARCH),arm)
+ifeq ($(SRCARCH),arm)
   NO_PERF_REGS := 0
   LIBUNWIND_LIBS = -lunwind -lunwind-arm
 endif
-ifeq ($(ARCH),arm64)
+ifeq ($(SRCARCH),arm64)
   NO_PERF_REGS := 0
   LIBUNWIND_LIBS = -lunwind -lunwind-aarch64
 endif
@@ -61,7 +61,7 @@ endif
 # Disable it on all other architectures in case libdw unwind
 # support is detected in system. Add supported architectures
 # to the check.
-ifneq ($(ARCH),$(filter $(ARCH),x86 arm))
+ifneq ($(SRCARCH),$(filter $(SRCARCH),x86 arm))
   NO_LIBDW_DWARF_UNWIND := 1
 endif
@@ -115,9 +115,9 @@ endif
 FEATURE_CHECK_CFLAGS-libbabeltrace := $(LIBBABELTRACE_CFLAGS)
 FEATURE_CHECK_LDFLAGS-libbabeltrace := $(LIBBABELTRACE_LDFLAGS) -lbabeltrace-ctf
-FEATURE_CHECK_CFLAGS-bpf = -I. -I$(srctree)/tools/include -I$(srctree)/tools/arch/$(ARCH)/include/uapi -I$(srctree)/tools/include/uapi
+FEATURE_CHECK_CFLAGS-bpf = -I. -I$(srctree)/tools/include -I$(srctree)/tools/arch/$(SRCARCH)/include/uapi -I$(srctree)/tools/include/uapi
 # include ARCH specific config
--include $(src-perf)/arch/$(ARCH)/Makefile
+-include $(src-perf)/arch/$(SRCARCH)/Makefile
 ifdef PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
   CFLAGS += -DHAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
@@ -228,12 +228,12 @@ ifeq ($(DEBUG),0)
 endif
 INC_FLAGS += -I$(src-perf)/util/include
-INC_FLAGS += -I$(src-perf)/arch/$(ARCH)/include
+INC_FLAGS += -I$(src-perf)/arch/$(SRCARCH)/include
 INC_FLAGS += -I$(srctree)/tools/include/uapi
 INC_FLAGS += -I$(srctree)/tools/include/
-INC_FLAGS += -I$(srctree)/tools/arch/$(ARCH)/include/uapi
-INC_FLAGS += -I$(srctree)/tools/arch/$(ARCH)/include/
-INC_FLAGS += -I$(srctree)/tools/arch/$(ARCH)/
+INC_FLAGS += -I$(srctree)/tools/arch/$(SRCARCH)/include/uapi
+INC_FLAGS += -I$(srctree)/tools/arch/$(SRCARCH)/include/
+INC_FLAGS += -I$(srctree)/tools/arch/$(SRCARCH)/
 # $(obj-perf) for generated common-cmds.h
 # $(obj-perf)/util for generated bison/flex headers
@@ -355,7 +355,7 @@ ifndef NO_LIBELF
   ifndef NO_DWARF
     ifeq ($(origin PERF_HAVE_DWARF_REGS), undefined)
-      msg := $(warning DWARF register mappings have not been defined for architecture $(ARCH), DWARF support disabled);
+      msg := $(warning DWARF register mappings have not been defined for architecture $(SRCARCH), DWARF support disabled);
       NO_DWARF := 1
     else
       CFLAGS += -DHAVE_DWARF_SUPPORT $(LIBDW_CFLAGS)
@@ -380,7 +380,7 @@ ifndef NO_LIBELF
         CFLAGS += -DHAVE_BPF_PROLOGUE
         $(call detected,CONFIG_BPF_PROLOGUE)
       else
-        msg := $(warning BPF prologue is not supported by architecture $(ARCH), missing regs_query_register_offset());
+        msg := $(warning BPF prologue is not supported by architecture $(SRCARCH), missing regs_query_register_offset());
       endif
     else
       msg := $(warning DWARF support is off, BPF prologue is disabled);
@@ -406,7 +406,7 @@ ifdef PERF_HAVE_JITDUMP
   endif
 endif
-ifeq ($(ARCH),powerpc)
+ifeq ($(SRCARCH),powerpc)
   ifndef NO_DWARF
     CFLAGS += -DHAVE_SKIP_CALLCHAIN_IDX
   endif
@@ -487,7 +487,7 @@ else
 endif
 ifndef NO_LOCAL_LIBUNWIND
-  ifeq ($(ARCH),$(filter $(ARCH),arm arm64))
+  ifeq ($(SRCARCH),$(filter $(SRCARCH),arm arm64))
     $(call feature_check,libunwind-debug-frame)
     ifneq ($(feature-libunwind-debug-frame), 1)
       msg := $(warning No debug_frame support found in libunwind);
@@ -740,7 +740,7 @@ ifeq (${IS_64_BIT}, 1)
       NO_PERF_READ_VDSO32 := 1
     endif
   endif
-  ifneq ($(ARCH), x86)
+  ifneq ($(SRCARCH), x86)
     NO_PERF_READ_VDSOX32 := 1
   endif
   ifndef NO_PERF_READ_VDSOX32
@@ -769,7 +769,7 @@ ifdef LIBBABELTRACE
 endif
 ifndef NO_AUXTRACE
-  ifeq ($(ARCH),x86)
+  ifeq ($(SRCARCH),x86)
     ifeq ($(feature-get_cpuid), 0)
       msg := $(warning Your gcc lacks the __get_cpuid() builtin, disables support for auxtrace/Intel PT, please install a newer gcc);
       NO_AUXTRACE := 1
@@ -872,7 +872,7 @@ sysconfdir = $(prefix)/etc
 ETC_PERFCONFIG = etc/perfconfig
 endif
 ifndef lib
-ifeq ($(ARCH)$(IS_64_BIT), x861)
+ifeq ($(SRCARCH)$(IS_64_BIT), x861)
 lib = lib64
 else
 lib = lib
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 79fe31f..5008f51 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -226,7 +226,7 @@ endif
 ifeq ($(config),0)
 include $(srctree)/tools/scripts/Makefile.arch
--include arch/$(ARCH)/Makefile
+-include arch/$(SRCARCH)/Makefile
 endif
 # The FEATURE_DUMP_EXPORT holds location of the actual
diff --git a/tools/perf/arch/Build b/tools/perf/arch/Build
index 109eb75..d9b6af8 100644
--- a/tools/perf/arch/Build
+++ b/tools/perf/arch/Build
@@ -1,2 +1,2 @@
 libperf-y += common.o
-libperf-y += $(ARCH)/
+libperf-y += $(SRCARCH)/
diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index 9213a12..999a4e8 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -2,7 +2,7 @@ hostprogs := jevents
 jevents-y += json.o jsmn.o jevents.o
 pmu-events-y += pmu-events.o
-JDIR = pmu-events/arch/$(ARCH)
+JDIR = pmu-events/arch/$(SRCARCH)
 JSON = $(shell [ -d $(JDIR) ] && \
 	find $(JDIR) -name '*.json' -o -name 'mapfile.csv')
 #
@@ -10,4 +10,4 @@ JSON = $(shell [ -d $(JDIR) ] && \
 # directory and create tables in pmu-events.c.
 #
 $(OUTPUT)pmu-events/pmu-events.c: $(JSON) $(JEVENTS)
-	$(Q)$(call echo-cmd,gen)$(JEVENTS) $(ARCH) pmu-events/arch $(OUTPUT)pmu-events/pmu-events.c $(V)
+	$(Q)$(call echo-cmd,gen)$(JEVENTS) $(SRCARCH) pmu-events/arch $(OUTPUT)pmu-events/pmu-events.c $(V)
diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index af58ebc..84222bd 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -75,7 +75,7 @@ $(OUTPUT)tests/llvm-src-relocation.c: tests/bpf-script-test-relocation.c tests/B
 	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
 	$(Q)echo ';' >> $@
-ifeq ($(ARCH),$(filter $(ARCH),x86 arm arm64 powerpc))
+ifeq ($(SRCARCH),$(filter $(SRCARCH),x86 arm arm64 powerpc))
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 endif
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 5cac8d5..b5baff3 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -841,7 +841,7 @@ static int write_group_desc(int fd, struct perf_header *h __maybe_unused,
 /*
  * default get_cpuid(): nothing gets recorded
- * actual implementation must be in arch/$(ARCH)/util/header.c
+ * actual implementation must be in arch/$(SRCARCH)/util/header.c
  */
 int __weak get_cpuid(char *buffer __maybe_unused, size_t sz __maybe_unused)
 {
--
cgit v1.1

From 92b0a1416be587b87c8ff489b6a74fd929048ca7 Mon Sep 17 00:00:00 2001
From: Kees Cook
Date: Thu, 15 Jun 2017 08:20:35 -0500
Subject: objtool: Add fortify_panic as __noreturn function

CONFIG_FORTIFY_SOURCE=y implements fortify_panic() as a __noreturn
function, so objtool needs to know about it too.

Suggested-by: Daniel Micay
Tested-by: Stephen Rothwell
Signed-off-by: Kees Cook
Signed-off-by: Josh Poimboeuf
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1497532835-32704-1-git-send-email-jpoimboe@redhat.com
Signed-off-by: Ingo Molnar
---
 tools/objtool/builtin-check.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
 (limited to 'tools')

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 282a603..5f66697f 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -192,7 +192,8 @@ static int __dead_end_function(struct objtool_file *file, struct symbol *func,
 		"complete_and_exit",
 		"kvm_spurious_fault",
 		"__reiserfs_panic",
-		"lbug_with_loc"
+		"lbug_with_loc",
+		"fortify_panic",
 	};
 	if (func->bind == STB_WEAK)
--
cgit v1.1

From 9126cbbacecb8917bd0418809ef1d26616b2061e Mon Sep 17 00:00:00 2001
From: Milian Wolff
Date: Fri, 2 Jun 2017 16:37:53 +0200
Subject: perf unwind: Report module before querying isactivation in dwfl unwind

The PC returned by dwfl_frame_pc() may map into a not-yet-reported
module. We have to report it before we continue unwinding. But when we
query for the isactivation flag in dwfl_frame_pc, libdw will actually do
one more unwinding step internally which can then break and lead to
missed frames or broken stacks.

With libunwind we get e.g.:

~~~~~
heaptrack_gui 2228 135073.400474: 613969 cycles:
	108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
	1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
	109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
	10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
	1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
	92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
	2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
	f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
	1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
	78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
	20439 __libc_start_main (/usr/lib/libc-2.25.so)
	78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)

heaptrack_gui 2228 135073.401156: 569521 cycles:
	131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
	1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
	21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
	2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
	279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
	e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
	f5a1c QGuiApplicationPrivate::createPlatformIntegration (/usr/lib/libQt5Gui.so.5.8.0)
	f650c QGuiApplicationPrivate::createEventDispatcher (/usr/lib/libQt5Gui.so.5.8.0)
	298524 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
	f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
	1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
	78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
	20439 __libc_start_main (/usr/lib/libc-2.25.so)
	78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
~~~~~

Note the two frames 1589e8 and 78622 in the first sample. These are
missing when unwinding with libdw. The second sample's breakage is more
obvious:

~~~~~
heaptrack_gui 2228 135073.400474: 613969 cycles:
	108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
	1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
	109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
	10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
	1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
	92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
	2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
	f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
	20439 __libc_start_main (/usr/lib/libc-2.25.so)
	78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)

heaptrack_gui 2228 135073.401156: 569521 cycles:
	131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
	1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
	21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
	1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
	2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
	279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
	e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
	723dbf [unknown] ([unknown])
~~~~~

This patch fixes this issue and the libdw unwinder mimics the libunwind
behavior more closely.

Signed-off-by: Milian Wolff
Acked-by: Jan Kratochvil
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/20170602143753.16907-2-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/unwind-libdw.c | 8 ++++++++
 1 file changed, 8 insertions(+)
 (limited to 'tools')

diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c
index da45c4b..7755a5e0 100644
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -178,6 +178,14 @@ frame_callback(Dwfl_Frame *state, void *arg)
 	Dwarf_Addr pc;
 	bool isactivation;
+	if (!dwfl_frame_pc(state, &pc, NULL)) {
+		pr_err("%s", dwfl_errmsg(-1));
+		return DWARF_CB_ABORT;
+	}
+
+	// report the module before we query for isactivation
+	report_module(pc, ui);
+
 	if (!dwfl_frame_pc(state, &pc, &isactivation)) {
 		pr_err("%s", dwfl_errmsg(-1));
 		return DWARF_CB_ABORT;
--
cgit v1.1