summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* perf: Swap inclusion order of util.h and string.h in util/string.cHitoshi Mitake2010-04-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Currently util/string.c includes headers in this order: string.h, util.h But this causes a build error because __USE_GNU definition is needed for strndup() definition: % make -j touch .perf.dev.null CC util/string.o cc1: warnings being treated as errors util/string.c: In function ‘argv_split’: util/string.c:171: error: implicit declaration of function ‘strndup’ util/string.c:171: error: incompatible implicit declaration of built-in function ‘strndup’ So this patch swaps the headers inclusion order. util.h defines _GNU_SOURCE, and /usr/include/features.h defines __USE_GNU as 1 if _GNU_SOURCE is defined. Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1270368798-27232-1-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
* perf_event: Make perf fd non seekableArnd Bergmann2010-04-041-0/+1
| | | | | | | | | | | | | Perf_event does not need seeking, so prevent it in order to get rid of default_llseek, which uses the BKL. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> [drop the nonseekable_open, not needed for anon inodes] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
* perf: Fetch hot regs from the template callerFrederic Weisbecker2010-04-041-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trace events can be defined from a template using DECLARE_EVENT_CLASS/DEFINE_EVENT or directly with TRACE_EVENT. In both cases we have a template tracepoint handler, used to record the trace, to which we pass our ftrace event instance. In the function level, if the class is named "foo" and the event is named "blah", we have the following chain of calls: perf_trace_blah() -> perf_trace_templ_foo() In the case we have several events sharing the class "blah", we'll have multiple users of perf_trace_templ_foo(), and it won't be inlined by the compiler. This is usually what happens with the DECLARE_EVENT_CLASS/DEFINE_EVENT based definition. But if perf_trace_blah() is the only caller of perf_trace_templ_foo() there are fair chances that it will be inlined. The problem is that we fetch the regs from perf_trace_templ_foo() after we rewinded the frame pointer to the second caller, we want to reach the caller of perf_trace_blah() to get the right source of the event. And we do this by always assuming that perf_trace_templ_foo() is not inlined. But as shown above this is not always true. And if it is inlined we miss the first caller, losing the most important level of precision. We get: 61.31% ls [kernel.kallsyms] [k] do_softirq | --- do_softirq irq_exit do_IRQ common_interrupt | |--25.00%-- tty_buffer_request_room Instead of: 61.31% ls [kernel.kallsyms] [k] __do_softirq | --- __do_softirq do_softirq irq_exit do_IRQ common_interrupt | |--25.00%-- tty_buffer_request_room To fix this, we fetch the regs from perf_trace_blah() rather than perf_trace_templ_foo() so that we don't have to deal with inlining surprises. That also bring us the advantage of having the true source of the event even if we don't have frame pointers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu>
* perf: Drop the frame reliablity checkFrederic Weisbecker2010-04-041-2/+1
| | | | | | | | | | | | It is useless now that we have a pure stack frame walker, as given addr are always reliable. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu>
* perf TUI: Add a "Zoom into COMM(PID) thread" and reverse operationsArnaldo Carvalho de Melo2010-04-032-10/+81
| | | | | | | | | | | | | | | | | Now one can press the right arrow key and in addition to being able to filter by DSO, filter out by thread too, or a combination of both filters. With this one can start collecting events for the whole system, then focus on a subset of the collected data quickly. Cc: Avi Kivity <avi@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf newt: Add a "Zoom into foo.so DSO" and reverse operationsArnaldo Carvalho de Melo2010-04-032-32/+90
| | | | | | | | | | | | | | | | | | | | | | Clicking on -> will bring as one of the popup menu options a "Zoom into CURRENT DSO", i.e. CURRENT will be replaced by the name of the DSO in the current line. Choosing this option will filter out all samples that didn't took place in a symbol in this DSO. After that the option reverts to "Zoom out of CURRENT DSO", to allow going back to the more compreensive view, not filtered by DSO. Future similar operations will include zooming into a particular thread, COMM, CPU, "last minute", "last N usecs", etc. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge branch 'perf/urgent' into perf/coreIngo Molnar2010-04-034-14/+18
|\ | | | | | | | | | | | | | | | | Conflicts: tools/perf/Makefile Merge reason: resolve the conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf: Always build the powerpc perf_arch_fetch_caller_regs versionFrederic Weisbecker2010-04-031-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that software events use perf_arch_fetch_caller_regs() too, we need the powerpc version to be always built. Fixes the following build error: (.text+0x3210): undefined reference to `perf_arch_fetch_caller_regs' (.text+0x3324): undefined reference to `perf_arch_fetch_caller_regs' (.text+0x33bc): undefined reference to `perf_arch_fetch_caller_regs' (.text+0x33ec): undefined reference to `perf_arch_fetch_caller_regs' (.text+0xd4a0): undefined reference to `perf_arch_fetch_caller_regs' arch/powerpc/kernel/built-in.o:(.text+0xd528): more undefined references to `perf_arch_fetch_caller_regs' follow make[1]: *** [.tmp_vmlinux1] Error 1 make: *** [sub-make] Error 2 Reported-by: Michael Ellerman <michael@ellerman.id.au> Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org>
| * perf: Always build the stub perf_arch_fetch_caller_regs versionFrederic Weisbecker2010-04-031-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that software events use perf_arch_fetch_caller_regs() too, we need the stub version to be always built in for archs that don't implement it. Fixes the following build error in PARISC: kernel/built-in.o: In function `perf_event_task_sched_out': (.text.perf_event_task_sched_out+0x54): undefined reference to `perf_arch_fetch_caller_regs' Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org>
| * perf, probe-finder: Build fix on DebianBorislav Petkov2010-04-021-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building chokes with: In file included from /usr/include/gelf.h:53, from /usr/include/elfutils/libdw.h:53, from util/probe-finder.h:61, from util/probe-finder.c:39: /usr/include/libelf.h:98: error: expected specifier-qualifier-list before 'off64_t' [...] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <20100329164755.GA16034@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf/scripts: Tuple was set from long in both branches in python_process_event()Tom Zanussi2010-04-021-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a fix to the signed/unsigned field handling in the Python scripting engine, based on a patch from Roel Kluin. Basically, Python wants to use a PyInt (which is internally a long) if it can i.e. if the value will fit into that type. If not, it stores it into a PyLong, which isn't actually a long, but an arbitrary-precision integer variable. The code below is similar to to what Python does internally, and it seems to work as expected on the x86 and x86_64 sytems I tested it on. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Roel Kluin <roel.kluin@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: rostedt@goodmis.org LKML-Reference: <1270184305.6422.10.camel@tropicana> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'perf' of ↵Ingo Molnar2010-04-0325-587/+806
|\ \ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core
| * | perf newt: Pass the input_name to perf_session__browse_histsArnaldo Carvalho de Melo2010-04-033-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So that it can use it in the 'perf annotate' command line, otherwise it'll use the default and not the specified -i filename passed to 'perf report'. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf newt: Move the hist browser population bits to separare functionArnaldo Carvalho de Melo2010-04-031-45/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Next patches will use that when applying filtes to then repopulate the browser with the narrowed vision. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf newt: Remove useless column width calculationArnaldo Carvalho de Melo2010-04-031-20/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Not used in the TUI interface. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf symbols: Fill in pgoff in mmap synthesized eventsAnton Blanchard2010-04-031-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we synthesize mmap events we need to fill in the pgoff field. I wasn't able to test this completely since I couldn't find an executable region with a non 0 offset. We will see it when we start doing data profiling. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: David Miller <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20100403115331.GK5594@kryten> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf tools: Move the prototypes in util/string.h to util.hArnaldo Carvalho de Melo2010-04-039-23/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So that we avoid conflict with libc's string.h header. Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Suggested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf tools: sort_dimension__add shouldn't dieArnaldo Carvalho de Melo2010-04-022-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Propagate error instead. LKML-Reference: <new-submission> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf session: Remove one more exit() call from library codeArnaldo Carvalho de Melo2010-04-022-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return NULL instead and make the caller propagate the error. LKML-Reference: <new-submission> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf hist: Only allocate callchain_node if processing callchainsArnaldo Carvalho de Melo2010-04-023-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The struct callchain_node size is 120 bytes, that are never used when there are no callchains or '-g none' is specified, so conditionally allocate it, reducing sizeof(struct hist_entry) from 210 bytes to only 96, greatly speeding the non-callchain processing. LKML-Reference: <new-submission> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf kmem: Fixup the symbol address before using itArnaldo Carvalho de Melo2010-04-021-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We get absolute addresses in the events, but relative ones from the symbol subsystem, so calculate the absolute address by asking for the map where the symbol was found, that has the place where the DSO was actually loaded. For the core kernel this poses no problems if the kernel is not relocated by things like kexec, or if we use /proc/kallsyms, but for modules we were getting really large, negative offsets. LKML-Reference: <new-submission> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf kmem: Resolve kernel symbols againArnaldo Carvalho de Melo2010-04-023-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to the assumption in perf_session__new that the kernel maps would be created using the fake PERF_RECORD_MMAP event in a perf.data file 'perf kmem --stat caller', that doesn't have such event, ends up not being able to resolve the kernel addresses. Fix it by calling perf_session__create_kernel_maps() in __cmd_kmem(). LKML-Reference: <new-submission> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf hist: Replace ->print() routines by ->snprintf() equivalentsArnaldo Carvalho de Melo2010-04-027-114/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Then hist_entry__fprintf will just us the newly introduced hist_entry__snprintf, add the newline and fprintf it to the supplied FILE descriptor. This allows us to remove the use_browser checking in the color_printf routines, that now got color_snprintf variants too. The newt TUI browser (and other GUIs that may come in the future) don't have to worry about stdio specific stuff in the strings they get from the se->snprintf routines and instead use whatever means to do the equivalent. Also the newt TUI browser don't have to use the fmemopen() hack, instead it can use the se->snprintf routines directly. For now tho use the hist_entry__snprintf routine to reduce the patch size. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf record: Add a fallback to the reference relocation symbolArnaldo Carvalho de Melo2010-04-021-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Usually "_text" is enough, but I received reports that its not always available, so fallback to "_stext" for the symbol we use to check if we need to apply any relocation to all the symbols in the kernel symtab, for when, for instance, kexec is being used. Reported-by: Darren Hart <dvhltc@us.ibm.com> Cc: Darren Hart <dvhltc@us.ibm.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf tools: Allow specifying O= to build files in a separate directoryArnaldo Carvalho de Melo2010-04-022-156/+172
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoiding polluting the source tree with build files. Reported-by: Steven Rostedt <rostedt@goodmis.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf tools: Use -o $(BITBUCKET) in one more caseArnaldo Carvalho de Melo2010-04-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As described in 1703f2c some gcc versions has issues using /dev/null, so use the mechanism used elsewhere. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf report: Add progress barsArnaldo Carvalho de Melo2010-04-027-32/+119
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For when we are processing the events and inserting the entries in the browser. Experimentation here: naming "ui_something" we may be treading into creating a TUI/GUI set of routines that can then be implemented in terms of multiple backends. Also the time it takes for adding things to the "browser" takes, visually (I guess I should do some profiling here ;-) ), more time than for processing the events... That means we probably need to create a custom hist_entry browser, so that we reuse the structures we have in place instead of duplicating them in newt. But progress was made and at least we can see something while long files are being loaded, that must be one of UI 101 bullet points :-) Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf symbols: map_groups__find_symbol must return the map tooArnaldo Carvalho de Melo2010-04-023-6/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tools need to know from which map in the map_group a symbol was resolved to, so that, for isntance, we can annotate kernel modules symbols by getting its precise name, etc. Also add the _by_name variants for completeness. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf symbols: Move more map_groups methods to map.cArnaldo Carvalho de Melo2010-04-024-170/+180
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While writing a standalone test app that uses the symbol system to find kernel space symbols I noticed these also need to be moved. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | perf, x86: Add Nehalem programming quirk to WestmerePeter Zijlstra2010-04-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | According to the Xeon-5600 errata the Westmere suffers the same PMU programming bug as the original Nehalem did. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | perf, x86: Fix __initconst vs constPeter Zijlstra2010-04-024-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | All variables that have __initconst should also be const. Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | perf, x86: Fix up the ANY flag stuffPeter Zijlstra2010-04-025-51/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stephane noticed that the ANY flag was in generic arch code, and Cyrill reported that it broke the P4 code. Solve this by merging x86_pmu::raw_event into x86_pmu::hw_config and provide intel_pmu and amd_pmu specific versions of this callback. The intel_pmu one deals with the ANY flag, the amd_pmu adds the few extra event bits AMD64 has. Reported-by: Stephane Eranian <eranian@google.com> Reported-by: Cyrill Gorcunov <gorcunov@gmail.com> Acked-by: Robert Richter <robert.richter@amd.com> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com> Acked-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1269968113.5258.442.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | perf, x86: implement ARCH_PERFMON_EVENTSEL bit masksRobert Richter2010-04-025-89/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ARCH_PERFMON_EVENTSEL bit masks are often used in the kernel. This patch adds macros for the bit masks and removes local defines. The function intel_pmu_raw_event() becomes x86_pmu_raw_event() which is generic for x86 models and same also for p6. Duplicate code is removed. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100330092821.GH11907@erda.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | perf, x86: Undo some some *_counter* -> *_event* renamesRobert Richter2010-04-027-65/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The big rename: cdd6c48 perf: Do the big rename: Performance Counters -> Performance Events accidentally renamed some members of stucts that were named after registers in the spec. To avoid confusion this patch reverts some changes. The related specs are MSR descriptions in AMD's BKDGs and the ARCHITECTURAL PERFORMANCE MONITORING section in the Intel 64 and IA-32 Architectures Software Developer's Manuals. This patch does: $ sed -i -e 's:num_events:num_counters:g' \ arch/x86/include/asm/perf_event.h \ arch/x86/kernel/cpu/perf_event_amd.c \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_intel.c \ arch/x86/kernel/cpu/perf_event_p6.c \ arch/x86/kernel/cpu/perf_event_p4.c \ arch/x86/oprofile/op_model_ppro.c $ sed -i -e 's:event_bits:cntval_bits:g' -e 's:event_mask:cntval_mask:g' \ arch/x86/kernel/cpu/perf_event_amd.c \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_intel.c \ arch/x86/kernel/cpu/perf_event_p6.c \ arch/x86/kernel/cpu/perf_event_p4.c Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1269880612-25800-2-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | Merge branch 'perf/urgent' into perf/coreIngo Molnar2010-04-02460-4153/+6917
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/kernel/cpu/perf_event.c Merge reason: Resolve the conflict, pick up fixes Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | perf: Fix 'perf sched record' deadlockMike Galbraith2010-04-021-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf sched record can deadlock a box should the holder of handle->data->lock take an interrupt, and then attempt to acquire an rq lock held by a CPU trying to acquire the same lock. Disable interrupts. CPU0 CPU1 sched event with rq->lock held grab handle->data->lock spin on handle->data->lock interrupt try to grab rq->lock Reported-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Tested-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1269598293.6174.8.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | perf, x86: Fix callgraphs of 32-bit processes on 64-bit kernelsTorok Edwin2010-04-022-5/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When profiling a 32-bit process on a 64-bit kernel, callgraph tracing stopped after the first function, because it has seen a garbage memory address (tried to interpret the frame pointer, and return address as a 64-bit pointer). Fix this by using a struct stack_frame with 32-bit pointers when the TIF_IA32 flag is set. Note that TIF_IA32 flag must be used, and not is_compat_task(), because the latter is only set when the 32-bit process is executing a syscall, which may not always be the case (when tracing page fault events for example). Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paul Mackerras <paulus@samba.org> Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org LKML-Reference: <1268820436-13145-1-git-send-email-edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | perf, x86: Fix AMD hotplug & constraint initializationPeter Zijlstra2010-04-022-36/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3f6da39 ("perf: Rework and fix the arch CPU-hotplug hooks") moved the amd northbridge allocation from CPUS_ONLINE to CPUS_PREPARE_UP however amd_nb_id() doesn't work yet on prepare so it would simply bail basically reverting to a state where we do not properly track node wide constraints - causing weird perf results. Fix up the AMD NorthBridge initialization code by allocating from CPU_UP_PREPARE and installing it from CPU_STARTING once we have the proper nb_id. It also properly deals with the allocation failing. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> [ robustify using amd_has_nb() ] Signed-off-by: Stephane Eranian <eranian@google.com> LKML-Reference: <1269353485.5109.48.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86: Move notify_cpu_starting() callback to a later stagePeter Zijlstra2010-04-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because we need to have cpu identification things done by the time we run CPU_STARTING notifiers. ( This init ordering will be relied on by the next fix. ) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1269353485.5109.48.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | Merge branch 'perf/urgent' of ↵Ingo Molnar2010-04-025-15/+25
| |\ \ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent
| | * | x86,kgdb: Always initialize the hw breakpoint attributeJason Wessel2010-04-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is required to call hw_breakpoint_init() on an attr before using it in any other calls. This fixes the problem where kgdb will sometimes fail to initialize on x86_64. Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: 2.6.33 <stable@kernel.org> LKML-Reference: <1269975907-27602-1-git-send-email-jason.wessel@windriver.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
| | * | perf: Use hot regs with software sched switch/migrate eventsFrederic Weisbecker2010-04-013-12/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Scheduler's task migration events don't work because they always pass NULL regs perf_sw_event(). The event hence gets filtered in perf_swevent_add(). Scheduler's context switches events use task_pt_regs() to get the context when the event occured which is a wrong thing to do as this won't give us the place in the kernel where we went to sleep but the place where we left userspace. The result is even more wrong if we switch from a kernel thread. Use the hot regs snapshot for both events as they belong to the non-interrupt/exception based events family. Unlike page faults or so that provide the regs matching the exact origin of the event, we need to save the current context. This makes the task migration event working and fix the context switch callchains and origin ip. Example: perf record -a -e cs Before: 10.91% ksoftirqd/0 0 [k] 0000000000000000 | --- (nil) perf_callchain perf_prepare_sample __perf_event_overflow perf_swevent_overflow perf_swevent_add perf_swevent_ctx_event do_perf_sw_event __perf_sw_event perf_event_task_sched_out schedule run_ksoftirqd kthread kernel_thread_helper After: 23.77% hald-addon-stor [kernel.kallsyms] [k] schedule | --- schedule | |--60.00%-- schedule_timeout | wait_for_common | wait_for_completion | blk_execute_rq | scsi_execute | scsi_execute_req | sr_test_unit_ready | | | |--66.67%-- sr_media_change | | media_changed | | cdrom_media_changed | | sr_block_media_changed | | check_disk_change | | cdrom_open v2: Always build perf_arch_fetch_caller_regs() now that software events need that too. They don't need it from modules, unlike trace events, so we keep the EXPORT_SYMBOL in trace_event_perf.c Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net>
| | * | perf: Correctly align perf event tracing bufferFrederic Weisbecker2010-04-011-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The trace event buffer used by perf to record raw sample events is typed as an array of char and may then not be aligned to 8 by alloc_percpu(). But we need it to be aligned to 8 in sparc64 because we cast this buffer into a random structure type built by the TRACE_EVENT() macro to store the traces. So if a random 64 bits field is accessed inside, it may be not under an expected good alignment. Use an array of long instead to force the appropriate alignment, and perform a compile time check to ensure the size in byte of the buffer is a multiple of sizeof(long) so that its actual size doesn't get shrinked under us. This fixes unaligned accesses reported while using perf lock in sparc 64. Suggested-by: David Miller <davem@davemloft.net> Suggested-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Steven Rostedt <rostedt@goodmis.org>
| * | | Merge branch 'drm-linus' of ↵Linus Torvalds2010-04-0176-1676/+2904
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (76 commits) drm/radeon/kms: enable ACPI powermanagement mode on radeon gpus. drm/radeon/kms: rs400/480 should set common registers. drm/radeon/kms: add sanity check to wptr. drm/radeon/kms/evergreen: get DP working drm/radeon/kms: add hw_i2c module option drm/radeon/kms: use new pre/post_xfer i2c bit algo hooks drm/radeon/kms: disable MSI on IGP chips drm/radeon/kms: display watermark updates (v2) drm/radeon/kms/dp: disable training pattern on the sink at the end of link training drm/radeon/kms: minor fixes for eDP with LCD* device tags (v2) drm/radeon/kms/dp: remove extraneous training complete call drm/radeon/kms/atom: minor fixes to transmitter setup drm/radeon/kms: Only restrict BO to visible VRAM size when pinning to VRAM. drm: fix build error when SYSRQ is disabled drm/radeon/kms: fix macbookpro connector quirk drm/radeon/r6xx/r7xx: further safe reg clean up drm/radeon: bump the UMS driver version for r6xx/r7xx const buffer support drm/radeon/kms: bump the version for r6xx/r7xx const buffer support drm/radeon/r6xx/r7xx: CS parser fixes drm/radeon/kms: fix some typos in r6xx/r7xx hpd setup ... Fix up MSI-related conflicts in drivers/gpu/drm/radeon/radeon_irq_kms.c
| | * | | drm/radeon/kms: enable ACPI powermanagement mode on radeon gpus.Dave Airlie2010-04-012-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some GPUs have an APM/ACPI PM mode selection switch and some BIOSes set this to APM. We really want this in ACPI mode for Linux. Signed-off-by: Dave Airlie <airlied@redhat.com>
| | * | | drm/radeon/kms: rs400/480 should set common registers.Dave Airlie2010-04-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These GPUs should be setting these registers up also. Signed-off-by: Dave Airlie <airlied@redhat.com>
| | * | | drm/radeon/kms: add sanity check to wptr.Dave Airlie2010-04-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we resume in a bad way, we'll get 0xffffffff in wptr, and then oops with no console. This just adds a sanity check so that we can avoid the oops and hopefully get more details out of people's systems. Signed-off-by: Dave Airlie <airlied@redhat.com>
| | * | | drm/radeon/kms/evergreen: get DP workingAlex Deucher2010-04-011-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Need to enable the VID stream after link training Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
| | * | | drm/radeon/kms: add hw_i2c module optionAlex Deucher2010-03-313-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Turn off hw i2c by default except for mm i2c which is hw only until we sort out the remaining prescale issues on older chips. hw i2c can be enabled with hw_i2c=1. Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
| | * | | drm/radeon/kms: use new pre/post_xfer i2c bit algo hooksAlex Deucher2010-03-313-79/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows us to remove the internal bit algo bus used by the radeon i2c algo. We now register a radeon algo adapter if the gpio line is hw capable and the hw inplementation is available, otherwise we register a bit algo adapter. Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
OpenPOWER on IntegriCloud