summaryrefslogtreecommitdiffstats
path: root/kernel/trace
Commit message (Collapse)AuthorAgeFilesLines
* uprobes/tracing: Fix dentry/mount leak in create_trace_uprobe()Oleg Nesterov2013-02-081-4/+6
| | | | | | | | create_trace_uprobe() does kern_path() to find ->d_inode, but forgets to do path_put(). We can do this right after igrab(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Change handle_swbp() to expose bp_vaddr to handler_chain()Oleg Nesterov2013-02-081-2/+2
| | | | | | | | | | | | | Change handle_swbp() to set regs->ip = bp_vaddr in advance, this is what consumer->handler() needs but uprobe_get_swbp_addr() is not exported. This also simplifies the code and makes it more consistent across the supported architectures. handle_swbp() becomes the only caller of uprobe_get_swbp_addr(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
* uprobes: Kill uprobe_consumer->filter()Oleg Nesterov2013-02-081-1/+0
| | | | | | | | | | | uprobe_consumer->filter() is pointless in its current form, kill it. We will add it back, but with the different signature/semantics. Perhaps we will even re-introduce the callsite in handler_chain(), but not to just skip uc->handler(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* tracing: Init current_trace to nop_trace and remove NULL checksSteven Rostedt (Red Hat)2013-02-011-18/+12
| | | | | | | | | | | | | | | | | | | | | | On early boot up, when the ftrace ring buffer is initialized, the static variable current_trace is initialized to &nop_trace. Before this initialization, current_trace is NULL and will never become NULL again. It is always reassigned to a ftrace tracer. Several places check if current_trace is NULL before it uses it, and this check is frivolous, because at the point in time when the checks are made the only way current_trace could be NULL is if ftrace failed its allocations at boot up, and the paths to these locations would probably not be possible. By initializing current_trace to &nop_trace where it is declared, current_trace will never be NULL, and we can remove all these checks of current_trace being NULL which never needed to be checked in the first place. Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Make a snapshot feature available from userspaceHiraku Toyooka2013-01-303-26/+151
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ftrace has a snapshot feature available from kernel space and latency tracers (e.g. irqsoff) are using it. This patch enables user applictions to take a snapshot via debugfs. Add "snapshot" debugfs file in "tracing" directory. snapshot: This is used to take a snapshot and to read the output of the snapshot. # echo 1 > snapshot This will allocate the spare buffer for snapshot (if it is not allocated), and take a snapshot. # cat snapshot This will show contents of the snapshot. # echo 0 > snapshot This will free the snapshot if it is allocated. Any other positive values will clear the snapshot contents if the snapshot is allocated, or return EINVAL if it is not allocated. Link: http://lkml.kernel.org/r/20121226025300.3252.86850.stgit@liselsia Cc: Jiri Olsa <jolsa@redhat.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> [ Fixed irqsoff selftest and also a conflict with a change that fixes the update_max_tr. ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Replace static old_tracer check of tracer nameHiraku Toyooka2013-01-301-13/+9
| | | | | | | | | | | | | | | | | | | | Currently the trace buffer read functions use a static variable "old_tracer" for detecting if the current tracer changes. This was suitable for a single trace file ("trace"), but to add a snapshot feature that will use the same function for its file, a check against a static variable is not sufficient. To use the output functions for two different files, instead of storing the current tracer in a static variable, as the trace iterator descriptor contains a pointer to the original current tracer's name, that pointer can now be used to check if the current tracer has changed between different reads of the trace file. Link: http://lkml.kernel.org/r/20121226025252.3252.9276.stgit@liselsia Signed-off-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Use sched_clock_cpu for trace_clock_globalNamhyung Kim2013-01-301-1/+1
| | | | | | | | | | | | | | For systems with an unstable sched_clock, all cpu_clock() does is enable/ disable local irq during the call to sched_clock_cpu(). And for stable systems they are same. trace_clock_global() already disables interrupts, so it can call sched_clock_cpu() directly. Link: http://lkml.kernel.org/r/1356576585-28782-2-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* ring-buffer: Add stats field for amount read from trace ring bufferSteven Rostedt (Red Hat)2013-01-302-0/+21
| | | | | | | | | | | | | | | | Add a stat about the number of events read from the ring buffer: # cat /debug/tracing/per_cpu/cpu0/stats entries: 39869 overrun: 870512 commit overrun: 0 bytes: 1449912 oldest event ts: 6561.368690 now ts: 6565.246426 dropped events: 0 read events: 112 <-- Added Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing/fgraph: Adjust fgraph depth before calling trace return callbackSteven Rostedt (Red Hat)2013-01-291-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While debugging the virtual cputime with the function graph tracer with a max_depth of 1 (most common use of the max_depth so far), I found that I was missing kernel execution because of a race condition. The code for the return side of the function has a slight race: ftrace_pop_return_trace(&trace, &ret, frame_pointer); trace.rettime = trace_clock_local(); ftrace_graph_return(&trace); barrier(); current->curr_ret_stack--; The ftrace_pop_return_trace() initializes the trace structure for the callback. The ftrace_graph_return() uses the trace structure for its own use as that structure is on the stack and is local to this function. Then the curr_ret_stack is decremented which is what the trace.depth is set to. If an interrupt comes in after the ftrace_graph_return() but before the curr_ret_stack, then the called function will get a depth of 2. If max_depth is set to 1 this function will be ignored. The problem is that the trace has already been called, and the timestamp for that trace will not reflect the time the function was about to re-enter userspace. Calls to the interrupt will not be traced because the max_depth has prevented this. To solve this issue, the ftrace_graph_return() can safely be moved after the current->curr_ret_stack has been updated. This way the timestamp for the return callback will reflect the actual time. If an interrupt comes in after the curr_ret_stack update and ftrace_graph_return(), it will be traced. It may look a little confusing to see it within the other function, but at least it will not be lost. Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Remove second iterator initializerJovi Zhang2013-01-291-4/+1
| | | | | | | | | | The trace iterator is already initialized by trace_init_global_iter(), so there is no need to initialize it again. Link: http://lkml.kernel.org/r/CACV3sb+G1YnO6168JhY3dEadmJi58pA5-2cSZT8E0WVHJNFt9Q@mail.gmail.com Signed-off-by: Jovi Zhang <bookjovi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Use __this_cpu_inc/dec operation instead of __get_cpu_varShan Wei2013-01-251-2/+2
| | | | | | | | | | | __this_cpu_inc_return() or __this_cpu_dec generates a single instruction, which is faster than __get_cpu_var operation. Link: http://lkml.kernel.org/r/50A9C1BD.1060308@gmail.com Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Shan Wei <davidshan@tencent.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Mark tracing_dentry_percpu() staticJosh Triplett2013-01-241-1/+1
| | | | | | | | | Nothing outside of kernel/trace/trace.c references tracing_dentry_percpu(). Link: http://lkml.kernel.org/r/1353302917-13995-7-git-send-email-josh@joshtriplett.org Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Fix unsigned int compare of zero in recursion checkSteven Rostedt2013-01-241-1/+1
| | | | | | | | | | | | | | | Dan's smatch found a compare bug with the result of the trace_test_and_set_recursion() and comparing to less than zero. If the function fails, it returns -1, but was saved in an unsigned int, which will never be less than zero and will ignore the result of the test if a recursion did happen. Luckily this is the last of the recursion tests, as the infrastructure of ftrace would catch recursions before it got here, except for some few exceptions. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* ring-buffer: Remove trace.h from ring_buffer.cSteven Rostedt2013-01-221-1/+2
| | | | | | | | | | | | | | | | ring_buffer.c use to require declarations from trace.h, but these have moved to the generic header files. There's nothing in trace.h that ring_buffer.c requires. There's some headers that trace.h included that ring_buffer.c needs, but it's best that it includes them directly, and not include trace.h. Also, some things may use ring_buffer.c without having tracing configured. This removes the dependency that may come in the future. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* ring-buffer: User context bit recursion checkingSteven Rostedt2013-01-222-31/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using context bit recursion checking, we can help increase the performance of the ring buffer. Before this patch: # echo function > /debug/tracing/current_tracer # for i in `seq 10`; do ./hackbench 50; done Time: 10.285 Time: 10.407 Time: 10.243 Time: 10.372 Time: 10.380 Time: 10.198 Time: 10.272 Time: 10.354 Time: 10.248 Time: 10.253 (average: 10.3012) Now we have: # echo function > /debug/tracing/current_tracer # for i in `seq 10`; do ./hackbench 50; done Time: 9.712 Time: 9.824 Time: 9.861 Time: 9.827 Time: 9.962 Time: 9.905 Time: 9.886 Time: 10.088 Time: 9.861 Time: 9.834 (average: 9.876) a 4% savings! Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* ftrace: Use only the preempt version of function tracingSteven Rostedt2013-01-221-47/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function tracer had two different versions of function tracing. The disabling of irqs version and the preempt disable version. As function tracing in very intrusive and can cause nasty recursion issues, it has its own recursion protection. But the old method to do this was a flat layer. If it detected that a recursion was happening then it would just return without recording. This made the preempt version (much faster than the irq disabling one) not very useful, because if an interrupt were to occur after the recursion flag was set, the interrupt would not be traced at all, because every function that was traced would think it recursed on itself (due to the context it preempted setting the recursive flag). Now that we have a recursion flag for every context level, we no longer need to worry about that. We can disable preemption, set the current context recursion check bit, and go on. If an interrupt were to come along, it would check its own context bit and happily continue to trace. As the preempt version is faster than the irq disable version, there's no more reason to keep the preempt version around. And the irq disable version still had an issue with missing out on tracing NMI code. Remove the irq disable function tracer version and have the preempt disable version be the default (and only version). Before this patch we had from running: # echo function > /debug/tracing/current_tracer # for i in `seq 10`; do ./hackbench 50; done Time: 12.028 Time: 11.945 Time: 11.925 Time: 11.964 Time: 12.002 Time: 11.910 Time: 11.944 Time: 11.929 Time: 11.941 Time: 11.924 (average: 11.9512) Now we have: # echo function > /debug/tracing/current_tracer # for i in `seq 10`; do ./hackbench 50; done Time: 10.285 Time: 10.407 Time: 10.243 Time: 10.372 Time: 10.380 Time: 10.198 Time: 10.272 Time: 10.354 Time: 10.248 Time: 10.253 (average: 10.3012) a 13.8% savings! Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Avoid unnecessary multiple recursion checksSteven Rostedt2013-01-222-36/+110
| | | | | | | | | | | | | | | | | | | | | | | | | When function tracing occurs, the following steps are made: If arch does not support a ftrace feature: call internal function (uses INTERNAL bits) which calls... If callback is registered to the "global" list, the list function is called and recursion checks the GLOBAL bits. then this function calls... The function callback, which can use the FTRACE bits to check for recursion. Now if the arch does not suppport a feature, and it calls the global list function which calls the ftrace callback all three of these steps will do a recursion protection. There's no reason to do one if the previous caller already did. The recursion that we are protecting against will go through the same steps again. To prevent the multiple recursion checks, if a recursion bit is set that is higher than the MAX bit of the current check, then we know that the check was made by the previous caller, and we can skip the current check. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Make the trace recursion bits into enumsSteven Rostedt2013-01-221-13/+17
| | | | | | | Convert the bits into enums which makes the code a little easier to maintain. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* ftrace: Add context level recursion bit checkingSteven Rostedt2013-01-222-10/+42
| | | | | | | | | | | | | | | | | | | | Currently for recursion checking in the function tracer, ftrace tests a task_struct bit to determine if the function tracer had recursed or not. If it has, then it will will return without going further. But this leads to races. If an interrupt came in after the bit was set, the functions being traced would see that bit set and think that the function tracer recursed on itself, and would return. Instead add a bit for each context (normal, softirq, irq and nmi). A check of which context the task is in is made before testing the associated bit. Now if an interrupt preempts the function tracer after the previous context has been set, the interrupt functions can still be traced. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* ftrace: Optimize the function tracer list loopSteven Rostedt2013-01-221-22/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is lots of places that perform: op = rcu_dereference_raw(ftrace_control_list); while (op != &ftrace_list_end) { Add a helper macro to do this, and also optimize for a single entity. That is, gcc will optimize a loop for either no iterations or more than one iteration. But usually only a single callback is registered to the function tracer, thus the optimized case should be a single pass. to do this we now do: op = rcu_dereference_raw(list); do { [...] } while (likely(op = rcu_dereference_raw((op)->next)) && unlikely((op) != &ftrace_list_end)); An op is always registered (ftrace_list_end when no callbacks is registered), thus when a single callback is registered, the link list looks like: top => callback => ftrace_list_end => NULL. The likely(op = op->next) still must be performed due to the race of removing the callback, where the first op assignment could equal ftrace_list_end. In that case, the op->next would be NULL. But this is unlikely (only happens in a race condition when removing the callback). But it is very likely that the next op would be ftrace_list_end, unless more than one callback has been registered. This tells gcc what the most common case is and makes the fast path with the least amount of branches. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* ftrace: Fix function tracing recursion self testSteven Rostedt2013-01-221-1/+2
| | | | | | | | | The function tracing recursion self test should not crash the machine if the resursion test fails. If it detects that the function tracing is recursing when it should not be, then bail, don't go into an infinite recursive loop. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* ftrace: Fix global function tracers that are not recursion safeSteven Rostedt2013-01-221-2/+16
| | | | | | | | If one of the function tracers set by the global ops is not recursion safe, it can still be called directly without the added recursion supplied by the ftrace infrastructure. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Fix selftest function recursion accountingSteven Rostedt2013-01-221-13/+3
| | | | | | | | | | | | The test that checks function recursion does things differently if the arch does not support all ftrace features. But that really doesn't make a difference with how the test runs, and either way the count variable should be 2 at the end. Currently the test wrongly fails for archs that don't support all the ftrace features. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Fix race with max_tr and changing tracersSteven Rostedt2013-01-221-7/+22
| | | | | | | | | | | | | | | | | | | | | | There's a race condition between the setting of a new tracer and the update of the max trace buffers (the swap). When a new tracer is added, it sets current_trace to nop_trace before disabling the old tracer. At this moment, if the old tracer uses update_max_tr(), the update may trigger the warning against !current_trace->use_max-tr, as nop_trace doesn't have that set. As update_max_tr() requires that interrupts be disabled, we can add a check to see if current_trace == nop_trace and bail if it does. Then when disabling the current_trace, set it to nop_trace and run synchronize_sched(). This will make sure all calls to update_max_tr() have completed (it was called with interrupts disabled). As a clean up, this commit also removes shrinking and recreating the max_tr buffer if the old and new tracers both have use_max_tr set. The old way use to always shrink the buffer, and then expand it for the next tracer. This is a waste of time. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Remove trace.h header from trace_clock.cSteven Rostedt2013-01-221-2/+0
| | | | | | | | As trace_clock is used by other things besides tracing, and it does not require anything from trace.h, it is best not to include the header file in trace_clock.c. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Remove the extra 4 bytes of padding in eventsSteven Rostedt2013-01-212-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to a userspace issue with PowerTop v2beta, which hardcoded the offset of event fields that it was using, it broke when we removed the Big Kernel Lock counter from the event header. (commit e6e1e2593 "tracing: Remove lock_depth from event entry") Because this broke userspace, it was determined that we must keep those 4 bytes around. (commit a3a4a5acd "Regression: partial revert "tracing: Remove lock_depth from event entry"") This unfortunately wastes space in the ring buffer. 4 bytes per event, where a lot of events are just 24 bytes. That's 16% of the buffer wasted. A million events will add 4 megs of white space into the buffer. It was later noticed that PowerTop v2beta could not work on systems where the kernel was 64 bit but the userspace was 32 bits. The reason was because the offsets are different between the two and the hard coded offset of one would not work with the other. With PowerTop v2 final, it implemented the same interface that both perf and trace-cmd use. That is, it reads the format file of the event to find the offsets of the fields it needs. This fixes the problem with running powertop on a 32 bit userspace running on a 64 bit kernel. It also no longer requires the 4 byte padding. As PowerTop v2 has been out for a while, and is included in all major distributions, it is time that we can safely remove the 4 bytes of padding. Users of PowerTop v2beta should upgrade to PowerTop v2 final. Cc: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* ftrace: Move ARCH_SUPPORTS_FTRACE_SAVE_REGS in KconfigMasami Hiramatsu2013-01-213-4/+12
| | | | | | | | | | | | | | | | | | | | | | Move SAVE_REGS support flag into Kconfig and rename it to CONFIG_DYNAMIC_FTRACE_WITH_REGS. This also introduces CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS which indicates the architecture depending part of ftrace has a code that saves full registers. On the other hand, CONFIG_DYNAMIC_FTRACE_WITH_REGS indicates the code is enabled. Link: http://lkml.kernel.org/r/20120928081516.3560.72534.stgit@ltc138.sdl.hitachi.co.jp Cc: Ingo Molnar <mingo@elte.hu> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing/fgraph: Add max_graph_depth to limit function_graph depthSteven Rostedt2013-01-211-2/+58
| | | | | | | | | | | | | | | | | | | | | Add the file max_graph_depth to the debug tracing directory that lets the user define the depth of the function graph. A very useful operation is to set the depth to 1. Then it traces only the first function that is called when entering the kernel. This can be used to determine what system operations interrupt a process. For example, to work on NOHZ processes (single tasks running without a timer tick), if any interrupt goes off and preempts that task, this code will show it happening. # cd /sys/kernel/debug/tracing # echo 1 > max_graph_depth # echo function_graph > current_tracer # cat per_cpu/cpu/<cpu-of-process>/trace Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Remove unneeded check of max_tr->buffer before tracing_resetSteven Rostedt2013-01-211-2/+1
| | | | | | | There's now a check in tracing_reset_online_cpus() if the buffer is allocated or NULL. No need to do a check before calling it with max_tr. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Add checks if tr->buffer is NULL in tracing_reset{_online_cpus}Hiraku Toyooka2013-01-211-0/+6
| | | | | | | | | | | | | | | max_tr->buffer could be NULL in the tracing_reset{_online_cpus}. In this case, a NULL pointer dereference happens, so we should return immediately from these functions. Note, the current code does not call tracing_reset*() with max_tr when its buffer is NULL, but future code will. This patch is needed to prevent the future code from crashing. Link: http://lkml.kernel.org/r/20121219070234.31200.93863.stgit@liselsia Signed-off-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing/syscalls: Make local functions staticFengguang Wu2013-01-211-9/+9
| | | | | | | | Some functions in the syscall tracing is used only locally to the file, but they are labeled global. Convert them to static functions. Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Verify target file before registering a uprobe eventJovi Zhang2013-01-211-1/+5
| | | | | | | | | | | | | | | | | | | | Without this patch, we can register a uprobe event for a directory. Enabling such a uprobe event would anyway fail. Example: $ echo 'p /bin:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events However dirctories cannot be valid targets for uprobe. Hence verify if the target is a regular file during the probe registration. Link: http://lkml.kernel.org/r/20130103004212.690763002@goodmis.org Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Jovi Zhang <bookjovi@gmail.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> [ cleaned up whitespace and removed redundant IS_DIR() check ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Use this_cpu_ptr per-cpu helperShan Wei2013-01-212-5/+2
| | | | | | | | | | | | | typeof(&buffer) is a pointer to array of 1024 char, or char (*)[1024]. But, typeof(&buffer[0]) is a pointer to char which match the return type of get_trace_buf(). As well-known, the value of &buffer is equal to &buffer[0]. so return this_cpu_ptr(&percpu_buffer->buffer[0]) can avoid type cast. Link: http://lkml.kernel.org/r/50A1A800.3020102@gmail.com Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Shan Wei <davidshan@tencent.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* ring-buffer: Remove unnecessary recusive call in rb_advance_iter()Steven Rostedt2013-01-211-1/+1
| | | | | | | | | | | | | | | The original ring-buffer code had special checks at the start of rb_advance_iter() and instead of repeating them again at the end of the function if a certain condition existed, I just did a recursive call to rb_advance_iter() because the special condition would cause rb_advance_iter() to return early (after the checks). But as things have changed, the special checks no longer exist and the only thing done for the special_condition is to call rb_inc_iter() and return. Instead of doing a confusing recursive call, just call rb_inc_iter instead. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* ftrace: Be first to run code modification on modulesSteven Rostedt2013-01-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If some other kernel subsystem has a module notifier, and adds a kprobe to a ftrace mcount point (now that kprobes work on ftrace points), when the ftrace notifier runs it will fail and disable ftrace, as well as kprobes that are attached to ftrace points. Here's the error: WARNING: at kernel/trace/ftrace.c:1618 ftrace_bug+0x239/0x280() Hardware name: Bochs Modules linked in: fat(+) stap_56d28a51b3fe546293ca0700b10bcb29__8059(F) nfsv4 auth_rpcgss nfs dns_resolver fscache xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack lockd sunrpc ppdev parport_pc parport microcode virtio_net i2c_piix4 drm_kms_helper ttm drm i2c_core [last unloaded: bid_shared] Pid: 8068, comm: modprobe Tainted: GF 3.7.0-0.rc8.git0.1.fc19.x86_64 #1 Call Trace: [<ffffffff8105e70f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff81134106>] ? __probe_kernel_read+0x46/0x70 [<ffffffffa0180000>] ? 0xffffffffa017ffff [<ffffffffa0180000>] ? 0xffffffffa017ffff [<ffffffff8105e76a>] warn_slowpath_null+0x1a/0x20 [<ffffffff810fd189>] ftrace_bug+0x239/0x280 [<ffffffff810fd626>] ftrace_process_locs+0x376/0x520 [<ffffffff810fefb7>] ftrace_module_notify+0x47/0x50 [<ffffffff8163912d>] notifier_call_chain+0x4d/0x70 [<ffffffff810882f8>] __blocking_notifier_call_chain+0x58/0x80 [<ffffffff81088336>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff810c2a23>] sys_init_module+0x73/0x220 [<ffffffff8163d719>] system_call_fastpath+0x16/0x1b ---[ end trace 9ef46351e53bbf80 ]--- ftrace failed to modify [<ffffffffa0180000>] init_once+0x0/0x20 [fat] actual: cc:bb:d2:4b:e1 A kprobe was added to the init_once() function in the fat module on load. But this happened before ftrace could have touched the code. As ftrace didn't run yet, the kprobe system had no idea it was a ftrace point and simply added a breakpoint to the code (0xcc in the cc:bb:d2:4b:e1). Then when ftrace went to modify the location from a call to mcount/fentry into a nop, it didn't see a call op, but instead it saw the breakpoint op and not knowing what to do with it, ftrace shut itself down. The solution is to simply give the ftrace module notifier the max priority. This should have been done regardless, as the core code ftrace modification also happens very early on in boot up. This makes the module modification closer to core modification. Link: http://lkml.kernel.org/r/20130107140333.593683061@goodmis.org Cc: stable@vger.kernel.org Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reported-by: Frank Ch. Eigler <fche@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Fix regression of trace_pipeLiu Bo2013-01-141-2/+2
| | | | | | | | | | | | | | | Commit 0fb9656d "tracing: Make tracing_enabled be equal to tracing_on" changes the behaviour of trace_pipe, ie. it makes trace_pipe return if we've read something and tracing is enabled, and this means that we have to 'cat trace_pipe' again and again while running tests. IMO the right way is if tracing is enabled, we always block and wait for ring buffer, or we may lose what we want since ring buffer's size is limited. Link: http://lkml.kernel.org/r/1358132051-5410-1-git-send-email-bo.li.liu@oracle.com Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Fix regression with irqsoff tracer and tracing_on fileSteven Rostedt2013-01-111-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 02404baf1b47 "tracing: Remove deprecated tracing_enabled file" removed the tracing_enabled file as it never worked properly and the tracing_on file should be used instead. But the tracing_on file didn't call into the tracers start/stop routines like the tracing_enabled file did. This caused trace-cmd to break when it enabled the irqsoff tracer. If you just did "echo irqsoff > current_tracer" then it would work properly. But the tool trace-cmd disables tracing first by writing "0" into the tracing_on file. Then it writes "irqsoff" into current_tracer and then writes "1" into tracing_on. Unfortunately, the above commit changed the irqsoff tracer to check the tracing_on status instead of the tracing_enabled status. If it's disabled then it does not start the tracer internals. The problem is that writing "1" into tracing_on does not call the tracers "start" routine like writing "1" into tracing_enabled did. This makes the irqsoff tracer not start when using the trace-cmd tool, and is a regression for userspace. Simple fix is to have the tracing_on file call the tracers start() method when being enabled (and the stop() method when disabled). Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Fix regression of trace_options file settingSteven Rostedt2013-01-091-0/+2
| | | | | | | | | | | | | | | | The latest change to allow trace options to be set on the command line also broke the trace_options file. The zeroing of the last byte of the option name that is echoed into the trace_option file was removed with the consolidation of some of the code. The compare between the option and what was written to the trace_options file fails because the string holding the data written doesn't terminate with a null character. A zero needs to be added to the end of the string copied from user space. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* Merge branch 'tip/perf/core-2' of ↵Linus Torvalds2012-12-182-33/+31
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull minor tracing updates and fixes from Steven Rostedt: "It seems that one of my old pull requests have slipped through. The changes are contained to just the files that I maintain, and are changes from others that I told I would get into this merge window. They have already been in linux-next for several weeks, and should be well tested." * 'tip/perf/core-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Remove unnecessary WARN_ONCE's from tracing_buffers_splice_read tracing: Remove unneeded checks from the stack tracer tracing: Add a resize function to make one buffer equivalent to another buffer
| * tracing: Remove unnecessary WARN_ONCE's from tracing_buffers_splice_readDave Jones2012-11-191-2/+0
| | | | | | | | | | | | | | | | | | WARN shouldn't be used as a means of communicating failure to a userspace programmer. Link: http://lkml.kernel.org/r/20120725153908.GA25203@redhat.com Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Remove unneeded checks from the stack tracerAnton Vorontsov2012-11-191-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems that 'ftrace_enabled' flag should not be used inside the tracer functions. The ftrace core is using this flag for internal purposes, and the flag wasn't meant to be used in tracers' runtime checks. stack tracer is the only tracer that abusing the flag. So stop it from serving as a bad example. Also, there is a local 'stack_trace_disabled' flag in the stack tracer, which is never updated; so it can be removed as well. Link: http://lkml.kernel.org/r/1342637761-9655-1-git-send-email-anton.vorontsov@linaro.org Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Add a resize function to make one buffer equivalent to another bufferHiraku Toyooka2012-11-151-27/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trace buffer size is now per-cpu, so that there are the following two patterns in resizing of buffers. (1) resize per-cpu buffers to same given size (2) resize per-cpu buffers to another trace_array's buffer size for each CPU (such as preparing the max_tr which is equivalent to the global_trace's size) __tracing_resize_ring_buffer() can be used for (1), and had implemented (2) inside it for resetting the global_trace to the original size. (2) was also implemented in another place. So this patch assembles them in a new function - resize_buffer_duplicate_size(). Link: http://lkml.kernel.org/r/20121017025616.2627.91226.stgit@falsita Signed-off-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | trace: use kbasename()Andy Shevchenko2012-12-171-4/+4
| | | | | | | | | | | | | | | | Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | lseek: the "whence" argument is called "whence"Andrew Morton2012-12-171-2/+2
| | | | | | | | | | | | | | | | | | But the kernel decided to call it "origin" instead. Fix most of the sites. Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'for-linus' of ↵Linus Torvalds2012-12-135-5/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial branch from Jiri Kosina: "Usual stuff -- comment/printk typo fixes, documentation updates, dead code elimination." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) HOWTO: fix double words typo x86 mtrr: fix comment typo in mtrr_bp_init propagate name change to comments in kernel source doc: Update the name of profiling based on sysfs treewide: Fix typos in various drivers treewide: Fix typos in various Kconfig wireless: mwifiex: Fix typo in wireless/mwifiex driver messages: i2o: Fix typo in messages/i2o scripts/kernel-doc: check that non-void fcts describe their return value Kernel-doc: Convention: Use a "Return" section to describe return values radeon: Fix typo and copy/paste error in comments doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c various: Fix spelling of "asynchronous" in comments. Fix misspellings of "whether" in comments. eisa: Fix spelling of "asynchronous". various: Fix spelling of "registered" in comments. doc: fix quite a few typos within Documentation target: iscsi: fix comment typos in target/iscsi drivers treewide: fix typo of "suport" in various comments and Kconfig treewide: fix typo of "suppport" in various comments ...
| * | propagate name change to comments in kernel sourceNadia Yvette Chambers2012-12-065-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | I've legally changed my name with New York State, the US Social Security Administration, et al. This patch propagates the name change and change in initials and login to comments in the kernel source as well. Signed-off-by: Nadia Yvette Chambers <nyc@holomorphy.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2012-12-112-4/+14
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "These are late-v3.7 pending fixes for tracing." Fix up trivial conflict in kernel/trace/ring_buffer.c: the NULL pointer fix clashed with the change of type of the 'ret' variable. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ring-buffer: Fix race between integrity check and readers ring-buffer: Fix NULL pointer if rb_set_head_page() fails ftrace: Clear bits properly in reset_iter_read()
| * | | ring-buffer: Fix race between integrity check and readersSteven Rostedt2012-11-301-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function rb_check_pages() was added to make sure the ring buffer's pages were sane. This check is done when the ring buffer size is modified as well as when the iterator is released (closing the "trace" file), as that was considered a non fast path and a good place to do a sanity check. The problem is that the check does not have any locks around it. If one process were to read the trace file, and another were to read the raw binary file, the check could happen while the reader is reading the file. The issues with this is that the check requires to clear the HEAD page before doing the full check and it restores it afterward. But readers require the HEAD page to exist before it can read the buffer, otherwise it gives a nasty warning and disables the buffer. By adding the reader lock around the check, this keeps the race from happening. Cc: stable@vger.kernel.org # 3.6 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | ring-buffer: Fix NULL pointer if rb_set_head_page() failsSteven Rostedt2012-11-301-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function rb_set_head_page() searches the list of ring buffer pages for a the page that has the HEAD page flag set. If it does not find it, it will do a WARN_ON(), disable the ring buffer and return NULL, as this should never happen. But if this bug happens to happen, not all callers of this function can handle a NULL pointer being returned from it. That needs to be fixed. Cc: stable@vger.kernel.org # 3.0+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | ftrace: Clear bits properly in reset_iter_read()Dan Carpenter2012-11-151-1/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a typo here where '&' is used instead of '|' and it turns the statement into a noop. The original code is equivalent to: iter->flags &= ~((1 << 2) & (1 << 4)); Link: http://lkml.kernel.org/r/20120609161027.GD6488@elgon.mountain Cc: stable@vger.kernel.org # all of them Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
OpenPOWER on IntegriCloud