summaryrefslogtreecommitdiffstats
path: root/tools/perf
Commit message (Collapse)AuthorAgeFilesLines
* perf top: Don't stop if no kernel symtab is foundArnaldo Carvalho de Melo2011-05-271-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | We now just warn the user about the fact and go on providing just userspace samples. This fixes a problem when no vmlinux is explicetely passed by the user, thus symbol_conf.vmlinux_name is NULL, no suitable vmlinux is found, and then we get: aldebaran:~> perf top -p 7557 [kernel.kallsyms] with build id 44d9a989eabbd79e486bc079d6b743d397c204e0 not found, continuing without symbols The (null) file can't be used Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-cj2g81hn64wv2bipmqk4fy2m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf top: Handle kptr_restrictArnaldo Carvalho de Melo2011-05-271-0/+15
| | | | | | | | | | | | Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-cyl5zmi1nu35vyu7l5im2pyv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf top: Remove unused macroArnaldo Carvalho de Melo2011-05-271-2/+0
| | | | | | | | | | | | Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-weqbs0tkk2u0qp1xxdxxosfg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf events: initialize fd array to -1 instead of 0David Ahern2011-05-271-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | perf_evsel__alloc_fd allocates an array of file descriptors with the memory initialized to 0. The array has dimensions for cpus and threads. Later, __perf_evsel__open calls sys_perf_event_open for each cpu and thread dimensions. If the open fails for any of the cpus or threads then the fd's for this event are closed and the fd entry in the array is set to -1. Now, if the first attempt fails for the event (e.g., the event is not supported) the remaining dimensions (cpu > 0 and thread > 0) are not touched and left at the initialized value of 0. builtin-stat catches ENOENT and ENOSYS failures and allows the command to continue. The end result is that stat attempts to read from an fd of 0 which of course is stdin and so the command hangs until you type ctrl-D. Resolve by initializing the array to -1 since an fd < 0 is already handled. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1306511914-8016-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Make sure kptr_restrict warnings fit 80 col termsArnaldo Carvalho de Melo2011-05-272-21/+15
| | | | | | | | | | | | Suggested-by: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-i1p8vrhq7xveyui6t1sc914e@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix build on older systemsArnaldo Carvalho de Melo2011-05-262-0/+3
| | | | | | | | | | | | | | | | Where /usr/include/linux/const.h is not present, e.g. RHEL5. Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-ypcw2mu0w7dl1rrc6ncz3pee@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf symbols: Handle /proc/sys/kernel/kptr_restrictArnaldo Carvalho de Melo2011-05-266-6/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Perf uses /proc/modules to figure out where kernel modules are loaded. With the advent of kptr_restrict, non root users get zeroes for all module start addresses. So check if kptr_restrict is non zero and don't generate the syntethic PERF_RECORD_MMAP events for them. Warn the user about it in perf record and in perf report. In perf report the reference relocation symbol being zero means that kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't use it to fixup symbol addresses when using a valid kallsyms (in the buildid cache) or vmlinux (in the vmlinux path) build-id located automatically or specified by the user. Provide an explanation about it in 'perf report' if kernel samples were taken, checking if a suitable vmlinux or kallsyms was found/specified. Restricted /proc/kallsyms don't go to the buildid cache anymore. Example: [acme@emilia ~]$ perf record -F 100000 sleep 1 WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check /proc/sys/kernel/kptr_restrict. Samples in kernel functions may not be resolved if a suitable vmlinux file is not found in the buildid cache or in the vmlinux path. Samples in kernel modules won't be resolved at all. If some relocation was applied (e.g. kexec) symbols may be misresolved even with a suitable vmlinux or kallsyms file. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ] [acme@emilia ~]$ [acme@emilia ~]$ perf report --stdio Kernel address maps (/proc/{kallsyms,modules}) were restricted, check /proc/sys/kernel/kptr_restrict before running 'perf record'. If some relocation was applied (e.g. kexec) symbols may be misresolved. Samples in kernel modules can't be resolved as well. # Events: 13 cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ..................... # 20.24% sleep [kernel.kallsyms] [k] page_fault 20.04% sleep [kernel.kallsyms] [k] filemap_fault 19.78% sleep [kernel.kallsyms] [k] __lru_cache_add 19.69% sleep ld-2.12.so [.] memcpy 14.71% sleep [kernel.kallsyms] [k] dput 4.70% sleep [kernel.kallsyms] [k] flush_signal_handlers 0.73% sleep [kernel.kallsyms] [k] perf_event_comm 0.11% sleep [kernel.kallsyms] [k] native_write_msr_safe # # (For a higher level overview, try: perf report --sort comm,dso) # [acme@emilia ~]$ This is because it found a suitable vmlinux (build-id checked) in /lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long file name). If we remove that file from the vmlinux path: [root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \ /lib/modules/2.6.39-rc7+/build/vmlinux.OFF [acme@emilia ~]$ perf report --stdio [kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562 not found, continuing without symbols Kernel address maps (/proc/{kallsyms,modules}) were restricted, check /proc/sys/kernel/kptr_restrict before running 'perf record'. As no suitable kallsyms nor vmlinux was found, kernel samples can't be resolved. Samples in kernel modules can't be resolved as well. # Events: 13 cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ...... # 80.31% sleep [kernel.kallsyms] [k] 0xffffffff8103425a 19.69% sleep ld-2.12.so [.] memcpy # # (For a higher level overview, try: perf report --sort comm,dso) # [acme@emilia ~]$ Reported-by: Stephane Eranian <eranian@google.com> Suggested-by: David Miller <davem@davemloft.net> Cc: Dave Jones <davej@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kees Cook <kees.cook@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf: Remove duplicate headersJesper Juhl2011-05-262-3/+0
| | | | | | | | | | | Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: trivial@kernel.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1105261011290.17400@swampdragon.chaosbits.net Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2011-05-231-1/+1
|\ | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf tools: Fix sample type size calculation in 32 bits archs profile: Use vzalloc() rather than vmalloc() & memset()
| * perf tools: Fix sample type size calculation in 32 bits archsFrederic Weisbecker2011-05-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The shift used here to count the number of bits set in the mask doesn't work above the low part for archs that are not 64 bits. Fix the constant used for the shift. This fixes a 32-bit perf top failure reported by Eric Dumazet: Can't parse sample, err = -14 Can't parse sample, err = -14 ... Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com Link: http://lkml.kernel.org/r/1306200686-17317-1-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2011-05-2313-68/+172
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf tools: Fix sample size bit operations perf tools: Fix ommitted mmap data update on remap watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh watchdog: Disable watchdog when thresh is zero watchdog: Only disable/enable watchdog if neccessary watchdog: Fix rounding bug in get_sample_period() perf tools: Propagate event parse error handling perf tools: Robustify dynamic sample content fetch perf tools: Pre-check sample size before parsing perf tools: Move evlist sample helpers to evlist area perf tools: Remove junk code in mmap size handling perf tools: Check we are able to read the event size on mmap
| * perf tools: Fix sample size bit operationsFrederic Weisbecker2011-05-231-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | What we want is to count the number of bits in the mask, not some other random operation written in the middle of the night. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1306148788-6179-2-git-send-email-fweisbec@gmail.com [ Fixed perf_event__names[] alignment which was nearby and hurting my eyes ... ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf tools: Fix ommitted mmap data update on remapFrederic Weisbecker2011-05-231-13/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit eac9eacee16 "perf tools: Check we are able to read the event size on mmap" brought a check to ensure we can read the size of the event before dereferencing it, and do a remap otherwise to move the buffer forward. However that remap was ommitting all the necessary work to update the new page offset, head, and to unmap previous pages, etc... To fix this, gather all the code that fetches the event in a seperate helper which does all the necessary checks about the header/event size and tells us anytime a remap is needed. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1306148788-6179-3-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * Merge branch 'perf/core' of ↵Ingo Molnar2011-05-2213-47/+138
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent Conflicts: tools/perf/builtin-top.c Semantic conflict: util/include/linux/list.h # fix prefetch.h removal fallout Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * perf tools: Propagate event parse error handlingFrederic Weisbecker2011-05-224-11/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Better handle event parsing error by propagating the details in upper layers or by dumping some failure message. So that the user knows he has some crazy events in the batch. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com>
| | * perf tools: Robustify dynamic sample content fetchFrederic Weisbecker2011-05-221-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure the size of the dynamic fields such as callchains or raw events don't overlap the whole event boundaries. This prevents from dereferencing junk if the given size of the callchain goes too eager. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com>
| | * perf tools: Pre-check sample size before parsingFrederic Weisbecker2011-05-227-5/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check that the total size of the sample fields having a fixed size do not exceed the one of the whole event. This robustifies the sample parsing. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com>
| | * perf tools: Move evlist sample helpers to evlist areaFrederic Weisbecker2011-05-224-33/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These APIs should belong to evlist.c as they may not be exclusively tied to the headers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com
| | * perf tools: Remove junk code in mmap size handlingFrederic Weisbecker2011-05-221-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | size is overriden later and used only then. Those lines are only junk, probably a leftover. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com>
| | * perf tools: Check we are able to read the event size on mmapFrederic Weisbecker2011-05-221-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check we have enough mmaped space to read the current event size from its headers, otherwise we may dereference some hell there. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2011-05-231-1/+0
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) b43: fix comment typo reqest -> request Haavard Skinnemoen has left Atmel cris: typo in mach-fs Makefile Kconfig: fix copy/paste-ism for dell-wmi-aio driver doc: timers-howto: fix a typo ("unsgined") perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course'). treewide: fix a few typos in comments regulator: change debug statement be consistent with the style of the rest Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations" audit: acquire creds selectively to reduce atomic op overhead rtlwifi: don't touch with treewide double semicolon removal treewide: cleanup continuations and remove logging message whitespace ath9k_hw: don't touch with treewide double semicolon removal include/linux/leds-regulator.h: fix syntax in example code tty: fix typo in descripton of tty_termios_encode_baud_rate xtensa: remove obsolete BKL kernel option from defconfig m68k: fix comment typo 'occcured' arch:Kconfig.locks Remove unused config option. treewide: remove extra semicolons ...
| * | | perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.cJesper Juhl2011-05-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Including "../../annotate.h" once in tools/perf/util/ui/browsers/annotate.c is enough. No need to do it twice. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | | | sanitize <linux/prefetch.h> usageLinus Torvalds2011-05-201-1/+1
| |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e66eed651fd1 ("list: remove prefetching from regular list iterators") removed the include of prefetch.h from list.h, which uncovered several cases that had apparently relied on that rather obscure header file dependency. So this fixes things up a bit, using grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]') grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]') to guide us in finding files that either need <linux/prefetch.h> inclusion, or have it despite not needing it. There are more of them around (mostly network drivers), but this gets many core ones. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds2011-05-191-1/+0
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits) Revert "rcu: Decrease memory-barrier usage based on semi-formal proof" net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree() batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu() net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu() net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu() perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu() perf,rcu: convert call_rcu(free_ctx) to kfree_rcu() net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(net_generic_release) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu() security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu() net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu() net,rcu: convert call_rcu(xps_map_release) to kfree_rcu() net,rcu: convert call_rcu(rps_map_release) to kfree_rcu() ...
| * | | rcu: move TREE_RCU from softirq to kthreadPaul E. McKenney2011-05-051-1/+0
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If RCU priority boosting is to be meaningful, callback invocation must be boosted in addition to preempted RCU readers. Otherwise, in presence of CPU real-time threads, the grace period ends, but the callbacks don't get invoked. If the callbacks don't get invoked, the associated memory doesn't get freed, so the system is still subject to OOM. But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit moves the callback invocations to a kthread, which can be boosted easily. Also add comments and properly synchronized all accesses to rcu_cpu_kthread_task, as suggested by Lai Jiangshan. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
| | |
| \ \
*-. \ \ Merge branches 'sched-core-for-linus' and 'sched-urgent-for-linus' of ↵Linus Torvalds2011-05-192-2/+0
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits) sched: Fix and optimise calculation of the weight-inverse sched: Avoid going ahead if ->cpus_allowed is not changed sched, rt: Update rq clock when unthrottling of an otherwise idle CPU sched: Remove unused parameters from sched_fork() and wake_up_new_task() sched: Shorten the construction of the span cpu mask of sched domain sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG sched: Remove unused 'this_best_prio arg' from balance_tasks() sched: Remove noop in alloc_rt_sched_group() sched: Get rid of lock_depth sched: Remove obsolete comment from scheduler_tick() sched: Fix sched_domain iterations vs. RCU sched: Next buddy hint on sleep and preempt path sched: Make set_*_buddy() work on non-task entities sched: Remove need_migrate_task() sched: Move the second half of ttwu() to the remote cpu sched: Restructure ttwu() some more sched: Rename ttwu_post_activation() to ttwu_do_wakeup() sched: Remove rq argument from ttwu_stat() sched: Remove rq->lock from the first half of ttwu() sched: Drop rq->lock from sched_exec() ... * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix rt_rq runtime leakage bug
| * \ \ \ Merge commit 'v2.6.39-rc7' into sched/coreIngo Molnar2011-05-1211-51/+56
| |\ \ \ \ | | |/ / /
| * | | | sched: Get rid of lock_depthJonathan Corbet2011-04-242-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Neil Brown pointed out that lock_depth somehow escaped the BKL removal work. Let's get rid of it now. Note that the perf scripting utilities still have a bunch of code for dealing with common_lock_depth in tracepoints; I have left that in place in case anybody wants to use that code with older kernels. Suggested-by: Neil Brown <neilb@suse.de> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20110422111910.456c0e84@bike.lwn.net Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | Merge branch 'perf-core-for-linus' of ↵Linus Torvalds2011-05-1915-564/+1599
|\ \ \ \ \ | | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (107 commits) perf stat: Add more cache-miss percentage printouts perf stat: Add -d -d and -d -d -d options to show more CPU events ftrace/kbuild: Add recordmcount files to force full build ftrace: Add self-tests for multiple function trace users ftrace: Modify ftrace_set_filter/notrace to take ops ftrace: Allow dynamically allocated function tracers ftrace: Implement separate user function filtering ftrace: Free hash with call_rcu_sched() ftrace: Have global_ops store the functions that are to be traced ftrace: Add ops parameter to ftrace_startup/shutdown functions ftrace: Add enabled_functions file ftrace: Use counters to enable functions to trace ftrace: Separate hash allocation and assignment ftrace: Create a global_ops to hold the filter and notrace hashes ftrace: Use hash instead for FTRACE_FL_FILTER ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions perf bench, x86: Add alternatives-asm.h wrapper x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB x86, mem: memmove_64.S: Optimize memmove by enhanced REP MOVSB/STOSB ...
| * | | | perf stat: Add more cache-miss percentage printoutsIngo Molnar2011-05-191-2/+136
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Print out the cache-miss percentage as well if the cache refs were collected, for all the generic cache event types. Before: 11,103,723,230 dTLB-loads # 622.471 M/sec ( +- 0.30% ) 87,065,337 dTLB-load-misses # 4.881 M/sec ( +- 0.90% ) After: 11,353,713,242 dTLB-loads # 626.020 M/sec ( +- 0.35% ) 113,393,472 dTLB-load-misses # 1.00% of all dTLB cache hits ( +- 0.49% ) Also ASCII color highlight too high percentages, them when it's executed on the console. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/n/tip-lkhwxsevdbd9a8nymx0vxc3y@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Add -d -d and -d -d -d options to show more CPU eventsIngo Molnar2011-05-191-55/+154
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Print even more detailed statistics if requested via perf stat -d: -d: detailed events, L1 and LLC data cache -d -d: more detailed events, dTLB and iTLB events -d -d -d: very detailed events, adding prefetch events Full output looks like this now: Performance counter stats for '/home/mingo/hackbench 10' (5 runs): 1703.674707 task-clock # 8.709 CPUs utilized ( +- 4.19% ) 49,068 context-switches # 0.029 M/sec ( +- 16.66% ) 8,303 CPU-migrations # 0.005 M/sec ( +- 24.90% ) 17,397 page-faults # 0.010 M/sec ( +- 0.46% ) 2,345,389,239 cycles # 1.377 GHz ( +- 4.61% ) [55.90%] 1,884,503,527 stalled-cycles-frontend # 80.35% frontend cycles idle ( +- 5.67% ) [50.39%] 743,919,737 stalled-cycles-backend # 31.72% backend cycles idle ( +- 8.75% ) [49.91%] 1,314,416,379 instructions # 0.56 insns per cycle # 1.43 stalled cycles per insn ( +- 2.53% ) [60.87%] 272,592,567 branches # 160.003 M/sec ( +- 1.74% ) [56.56%] 3,794,846 branch-misses # 1.39% of all branches ( +- 6.59% ) [58.50%] 449,982,778 L1-dcache-loads # 264.125 M/sec ( +- 2.47% ) [49.88%] 22,404,961 L1-dcache-load-misses # 4.98% of all L1-dcache hits ( +- 6.08% ) [55.05%] 6,204,750 LLC-loads # 3.642 M/sec ( +- 8.91% ) [43.75%] 1,837,411 LLC-load-misses # 1.078 M/sec ( +- 7.27% ) [12.07%] 411,440,421 L1-icache-loads # 241.502 M/sec ( +- 5.60% ) [36.52%] 27,556,832 L1-icache-load-misses # 16.175 M/sec ( +- 7.46% ) [46.72%] 464,067,627 dTLB-loads # 272.392 M/sec ( +- 4.46% ) [54.17%] 10,765,648 dTLB-load-misses # 6.319 M/sec ( +- 3.18% ) [48.68%] 1,273,080,386 iTLB-loads # 747.256 M/sec ( +- 3.38% ) [47.53%] 117,481 iTLB-load-misses # 0.069 M/sec ( +- 14.99% ) [47.01%] 4,590,653 L1-dcache-prefetches # 2.695 M/sec ( +- 4.49% ) [46.19%] 1,712,660 L1-dcache-prefetch-misses # 1.005 M/sec ( +- 3.75% ) [44.82%] 0.195622057 seconds time elapsed ( +- 6.84% ) Also clean up the attribute construction code to be appending, and factor it out into add_default_attributes(). Tweak the coverage percentage printout a bit, so that it's easier to view it alongside the +- sttddev colum. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/n/tip-to3kgu04449s64062val8b62@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf bench, x86: Add alternatives-asm.h wrapperIngo Molnar2011-05-181-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf bench needs this to build the kernel's memcpy routine: In file included from bench/mem-memcpy-x86-64-asm.S:2:0: bench/../../../arch/x86/lib/memcpy_64.S:7:33: fatal error: asm/alternative-asm.h: No such file or directory Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/n/tip-c5d41xibgullk8h2280q4gv0@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf: Fix multi-event parsing bugStephane Eranian2011-05-171-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes an issue with event parsing. The following commit appears to have broken the ability to specify a comma separated list of events: commit ceb53fbf6dbb1df26d38379a262c6981fe73dd36 Author: Ingo Molnar <mingo@elte.hu> Date: Wed Apr 27 04:06:33 2011 +0200 perf stat: Fail more clearly when an invalid modifier is specified This patch fixes this while preserving the desired effect: $ perf stat -e instructions:u,instructions:k ls /dev/null /dev/null Performance counter stats for 'ls /dev/null': 365956 instructions:u # 0.00 insns per cycle 731806 instructions:k # 0.00 insns per cycle 0.001108862 seconds time elapsed $ perf stat -e task-clock-msecs true invalid event modifier: '-msecs' Run 'perf list' for a list of valid events and modifiers Signed-off-by: Stephane Eranian <eranian@google.com> Cc: acme@redhat.com Cc: peterz@infradead.org Cc: fweisbec@gmail.com Link: http://lkml.kernel.org/r/20110517133619.GA6999@quad Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf probe: Fix the missed parameter initializationLin Ming2011-05-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pubname_callback_param::found should be initialized to 0 in fastpath lookup, the structure is on the stack and uninitialized otherwise. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Link: http://lkml.kernel.org/r/1304066518-30420-2-git-send-email-ming.m.lin@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | Merge commit 'v2.6.39-rc7' into perf/coreIngo Molnar2011-05-101-6/+10
| |\ \ \ \ | | | |/ / | | |/| | | | | | | | | | | | | | | | | Merge reason: pull in the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Tell user about unsupported events in the listDavid Ahern2011-04-301-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to perf-record, tell user about unsupported events that will not be counted if invoked in verbose mode. e.g., $ perf stat -e dTLB-prefetch-misses -v -- sleep 1 dTLB-prefetch-misses event is not supported by the kernel. dTLB-prefetch-misses: 0 0 0 Performance counter stats for 'sleep 1': <not counted> dTLB-prefetch-misses 1.001884783 seconds time elapsed Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1304114655-10600-1-git-send-email-dsahern@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf list: Fix max event string sizeIngo Molnar2011-04-291-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent stalled-cycles event names were larger than the 40 chars printout used by perf list. Extend that, make it robust for future extensions and also adjust alignments in face of wider event names. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n009io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Fail softly on unsupported eventsIngo Molnar2011-04-291-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Ahern reported this perf stat failure: > # /tmp/build-perf/perf stat -- sleep 1 > Error: stalled-cycles-frontend event is not supported. > Fatal: Not all events could be opened. > > This is a Dell R410 with an E5620 processor. Fail in a softer fashion on unknown/unsupported events. Reported-by: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n006io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Leave more room for percentagesIngo Molnar2011-04-291-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Triple digit percentages do not fit otherwise. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n005io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Adjust stall cycles warning percentagesIngo Molnar2011-04-291-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adjust to color thresholds to better match the percentages seen in real workloads. Both are now a bit more sensitive. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n004io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Analyze front-end and back-end stall countsIngo Molnar2011-04-292-9/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sample output: Performance counter stats for './loop_1b': 873.691065 task-clock # 1.000 CPUs utilized 1 context-switches # 0.000 M/sec 1 CPU-migrations # 0.000 M/sec 96 page-faults # 0.000 M/sec 2,012,637,222 cycles # 2.304 GHz (66.58%) 1,001,397,911 stalled-cycles-frontend # 49.76% frontend cycles idle (66.58%) 7,523,398 stalled-cycles-backend # 0.37% backend cycles idle (66.76%) 2,004,551,046 instructions # 1.00 insns per cycle # 0.50 stalled cycles per insn (66.80%) 1,001,304,992 branches # 1146.063 M/sec (66.76%) 39,453 branch-misses # 0.00% of all branches (66.64%) 0.874046121 seconds time elapsed Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n003io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf tools: Add front-end and back-end stalled cycles supportIngo Molnar2011-04-293-23/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update perf tooling to deal with front-end and back-end stalled cycles events. Add both the default 'perf stat' output. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n002io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Fix compatibility behaviorIngo Molnar2011-04-281-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of failing on an unknown event, when new perf stat is run on older kernels: $ ./perf stat true Error: open_counter returned with 22 (Invalid argument). /bin/dmesg may provide additional information. Fatal: Not all events could be opened. Just ignore EINVAL and ENOSYS, we'll print the results as not counted: Performance counter stats for 'true': 0.239483 task-clock # 0.493 CPUs utilized 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 86 page-faults # 0.359 M/sec 704,766 cycles # 2.943 GHz <not counted> stalled-cycles 381,961 instructions # 0.54 insns per cycle 69,626 branches # 290.735 M/sec 4,594 branch-misses # 6.60% of all branches 0.000485883 seconds time elapsed Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio5hjpn3dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Add --sync/-S optionIngo Molnar2011-04-281-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | --sync will tell perf stat to run sync() before starting a command. This allows IO-heavy tests to be used with --repeat, without one iteration impacting the other. Elapsed time will stabilize for example: before: 3.971525714 seconds time elapsed ( +- 8.56% ) after: 3.211098537 seconds time elapsed ( +- 1.52% ) So measurements will be more accurate. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Fix printout vertical alignmentIngo Molnar2011-04-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before: | | Performance counter stats for '/home/mingo/hackbench 20' (5 runs): | | 71,321,607 instructions:u # 0.42 insns per cycle ( +- 0.00% ) | 168,040,009 cycles:u # 0.000 GHz ( +- 0.81% ) | | 1.468002368 seconds time elapsed ( +- 1.33% ) | After: | | Performance counter stats for '/home/mingo/hackbench 20' (5 runs): | | 71,321,607 instructions:u # 0.42 insns per cycle ( +- 0.00% ) | 168,040,009 cycles:u # 0.000 GHz ( +- 0.81% ) | | 1.468002368 seconds time elapsed ( +- 1.33% ) | The last column (stddev noise) is properly aligned, vertically. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio7hjpn0dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Add -d/--detailed flag to run with a lot of eventsIngo Molnar2011-04-261-8/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the new -d/--detailed flag, which generates a pretty detailed event list: Performance counter stats for './hackbench 10' (10 runs): 1514.287888 task-clock # 10.897 CPUs utilized ( +- 3.05% ) 39,698 context-switches # 0.026 M/sec ( +- 12.19% ) 8,147 CPU-migrations # 0.005 M/sec ( +- 16.55% ) 17,918 page-faults # 0.012 M/sec ( +- 0.37% ) 2,944,504,050 cycles # 1.944 GHz ( +- 3.89% ) (32.60%) 1,043,971,283 stalled-cycles # 35.45% of all cycles are idle ( +- 5.22% ) (44.48%) 1,655,906,768 instructions # 0.56 insns per cycle # 0.63 stalled cycles per insn ( +- 1.95% ) (55.09%) 338,832,373 branches # 223.757 M/sec ( +- 1.96% ) (64.47%) 3,892,416 branch-misses # 1.15% of all branches ( +- 5.49% ) (73.12%) 606,410,482 L1-dcache-loads # 400.459 M/sec ( +- 1.29% ) (71.21%) 31,204,395 L1-dcache-load-misses # 5.15% of all L1-dcache hits ( +- 3.04% ) (60.43%) 3,922,751 LLC-loads # 2.590 M/sec ( +- 6.80% ) (46.87%) 5,037,288 LLC-load-misses # 3.327 M/sec ( +- 3.56% ) (13.00%) 0.138966828 seconds time elapsed ( +- 4.11% ) This can be used "at a glance" for narrower analysis. -d can also be used in addition to other -e events, to further expand an event list. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-cxs98quixs3qyvdqx3goojc4@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Print out miss/hit ratio for L1 data-cache eventsIngo Molnar2011-04-261-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Print out this kind of l1-dcache-misses percentage: Performance counter stats for './bw_tcp localhost': 29,956,262,201 cycles # 3.002 GHz (scaled from 85.14%) 8,255,209,558 stalled-cycles # 27.56% of all cycles are idle (scaled from 86.56%) 1,206,130,308 l1-dcache-misses # 40.49% of all L1-dcache hits (scaled from 86.30%) 2,978,756,779 l1-dcache-refs # 298.512 M/sec (scaled from 70.02%) 8,861,956,159 instructions # 0.30 insns per cycle # 0.93 stalled cycles per insn (scaled from 84.27%) 1,644,306,068 branches # 164.782 M/sec (scaled from 86.43%) 74,778,443 branch-misses # 4.55% of all branches (scaled from 70.69%) 9978.695711 task-clock # 0.693 CPUs utilized 14.404347983 seconds time elapsed And color the result depending on the severity of cache-trashing. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-54gmz0zymaid84zcs7joq02p@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Print branch misses warning colorsIngo Molnar2011-04-261-7/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Print the missed-branches percentage with different warning level ASCII colors, as the percentage passes the 5%/10%/20% thresholds. These thresholds are set to relatively low levels, because on most CPUs even a moderate percentage of branch-misses already shows up as a slowdown. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-ybqukg7p86leiup7gl03ecgk@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Print stalled cycles warning colorsIngo Molnar2011-04-261-6/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Print the stalled-cycles percentage with different warning level ASCII colors, as the percentage passes the 25%/50%/75% thresholds. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-e25zz44rcms7mu9az4fu5zp0@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf stat: Fix -nan% output in perf stat noise printoutsIngo Molnar2011-04-261-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before: 0 CPU-migrations # 0.000 M/sec ( +- -nan% ) After: 0 CPU-migrations # 0.000 M/sec ( +- 0.00% ) Also factor out the noise printing function. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-z89h2v1bk1mikcbsf7e6v34q@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
OpenPOWER on IntegriCloud