summaryrefslogtreecommitdiffstats
path: root/kernel
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'powerpc-4.12-1' of ↵Linus Torvalds2017-05-051-14/+18
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights include: - Larger virtual address space on 64-bit server CPUs. By default we use a 128TB virtual address space, but a process can request access to the full 512TB by passing a hint to mmap(). - Support for the new Power9 "XIVE" interrupt controller. - TLB flushing optimisations for the radix MMU on Power9. - Support for CAPI cards on Power9, using the "Coherent Accelerator Interface Architecture 2.0". - The ability to configure the mmap randomisation limits at build and runtime. - Several small fixes and cleanups to the kprobes code, as well as support for KPROBES_ON_FTRACE. - Major improvements to handling of system reset interrupts, correctly treating them as NMIs, giving them a dedicated stack and using a new hypervisor call to trigger them, all of which should aid debugging and robustness. - Many fixes and other minor enhancements. Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual, Anton Blanchard, Balbir Singh, Ben Hutchings, Benjamin Herrenschmidt, Bhupesh Sharma, Chris Packham, Christian Zigotzky, Christophe Leroy, Christophe Lombard, Daniel Axtens, David Gibson, Gautham R. Shenoy, Gavin Shan, Geert Uytterhoeven, Guilherme G. Piccoli, Hamish Martin, Hari Bathini, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mahesh J Salgaonkar, Mahesh Salgaonkar, Masami Hiramatsu, Matt Brown, Matthew R. Ochs, Michael Neuling, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Pan Xinhui, Paul Mackerras, Rashmica Gupta, Russell Currey, Sukadev Bhattiprolu, Thadeu Lima de Souza Cascardo, Tobin C. Harding, Tyrel Datwyler, Uma Krishnan, Vaibhav Jain, Vipin K Parashar, Yang Shi" * tag 'powerpc-4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits) powerpc/64s: Power9 has no LPCR[VRMASD] field so don't set it powerpc/powernv: Fix TCE kill on NVLink2 powerpc/mm/radix: Drop support for CPUs without lockless tlbie powerpc/book3s/mce: Move add_taint() later in virtual mode powerpc/sysfs: Move #ifdef CONFIG_HOTPLUG_CPU out of the function body powerpc/smp: Document irq enable/disable after migrating IRQs powerpc/mpc52xx: Don't select user-visible RTAS_PROC powerpc/powernv: Document cxl dependency on special case in pnv_eeh_reset() powerpc/eeh: Clean up and document event handling functions powerpc/eeh: Avoid use after free in eeh_handle_special_event() cxl: Mask slice error interrupts after first occurrence cxl: Route eeh events to all drivers in cxl_pci_error_detected() cxl: Force context lock during EEH flow powerpc/64: Allow CONFIG_RELOCATABLE if COMPILE_TEST powerpc/xmon: Teach xmon oops about radix vectors powerpc/mm/hash: Fix off-by-one in comment about kernel contexts ids powerpc/pseries: Enable VFIO powerpc/powernv: Fix iommu table size calculation hook for small tables powerpc/powernv: Check kzalloc() return value in pnv_pci_table_alloc powerpc: Add arch/powerpc/tools directory ...
| * powerpc/kprobes: Fix handling of function offsets on ABIv2Naveen N. Rao2017-04-201-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 239aeba76409 ("perf powerpc: Fix kprobe and kretprobe handling with kallsyms on ppc64le") changed how we use the offset field in struct kprobe on ABIv2. perf now offsets from the global entry point if an offset is specified and otherwise chooses the local entry point. Fix the same in kernel for kprobe API users. We do this by extending kprobe_lookup_name() to accept an additional parameter to indicate the offset specified with the kprobe registration. If offset is 0, we return the local function entry and return the global entry point otherwise. With: # cd /sys/kernel/debug/tracing/ # echo "p _do_fork" >> kprobe_events # echo "p _do_fork+0x10" >> kprobe_events before this patch: # cat ../kprobes/list c0000000000d0748 k _do_fork+0x8 [DISABLED] c0000000000d0758 k _do_fork+0x18 [DISABLED] c0000000000412b0 k kretprobe_trampoline+0x0 [OPTIMIZED] and after: # cat ../kprobes/list c0000000000d04c8 k _do_fork+0x8 [DISABLED] c0000000000d04d0 k _do_fork+0x10 [DISABLED] c0000000000412b0 k kretprobe_trampoline+0x0 [OPTIMIZED] Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * kprobes: Convert kprobe_lookup_name() to a functionNaveen N. Rao2017-04-201-12/+8
| | | | | | | | | | | | | | | | | | | | | | The macro is now pretty long and ugly on powerpc. In the light of further changes needed here, convert it to a __weak variant to be over-ridden with a nicer looking function. Suggested-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * kprobes: Skip preparing optprobe if the probe is ftrace-basedMasami Hiramatsu2017-04-201-2/+9
| | | | | | | | | | | | | | | | | | Skip preparing optprobe if the probe is ftrace-based, since anyway, it must not be optimized (or already optimized by ftrace). Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | Merge branch 'for-linus' of ↵Linus Torvalds2017-05-053-4/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "This is a set of small fixes that were mostly stumbled over during more significant development. This proc fix and the fix to posix-timers are the most significant of the lot. There is a lot of good development going on but unfortunately it didn't quite make the merge window" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc: Fix unbalanced hard link numbers signal: Make kill_proc_info static rlimit: Properly call security_task_setrlimit signal: Remove unused definition of sig_user_definied ia64: Remove unused IA64_TASK_SIGHAND_OFFSET and IA64_SIGHAND_SIGLOCK_OFFSET ipc: Remove unused declaration of recompute_msgmni posix-timers: Correct sanity check in posix_cpu_nsleep sysctl: Remove dead register_sysctl_root
| * | signal: Make kill_proc_info staticEric W. Biederman2017-04-211-1/+1
| | | | | | | | | | | | | | | | | | | | | There are no users outside of signal.c so make the function static so the compiler and other developers have that information. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | rlimit: Properly call security_task_setrlimitEric W. Biederman2017-04-211-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Modify do_prlimit to call security_task_setrlimit passing the task whose rlimit we are changing not the tsk->group_leader. In general this should not matter as the lsms implementing security_task_setrlimit apparmor and selinux both examine the task->cred to see what should be allowed on the destination task. That task->cred is shared between tasks created with CLONE_THREAD unless thread keyrings are in play, in which case both apparmor and selinux create duplicate security contexts. So the only time when it will matter which thread is passed to security_task_setrlimit is if one of the threads of a process performs an operation that changes only it's credentials. At which point if a thread has done that we don't want to hide that information from the lsms. So fix the call of security_task_setrlimit. With the removal of tsk->group_leader this makes the code slightly faster, more comprehensible and maintainable. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | posix-timers: Correct sanity check in posix_cpu_nsleepEric W. Biederman2017-04-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | CPUCLOCK_PID(which_clock) is a pid value from userspace so compare it against task_pid_vnr, not current->pid. As task_pid_vnr is in the tasks pid value in the tasks pid namespace, and current->pid is in the initial pid namespace. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | | Merge tag 'modules-for-v4.12' of ↵Linus Torvalds2017-05-031-1/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull modules updates from Jessica Yu: - Minor code cleanups - Fix section alignment for .init_array * tag 'modules-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: kallsyms: Use bounded strnchr() when parsing string module: Unify the return value type of try_module_get module: set .init_array alignment to 8
| * | | kallsyms: Use bounded strnchr() when parsing stringNaveen N. Rao2017-04-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When parsing for the <module:name> format, we use strchr() to look for the separator, when we know that the module name can't be longer than MODULE_NAME_LEN. Enforce the same using strnchr(). Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Jessica Yu <jeyu@redhat.com>
* | | | Merge tag 'trace-v4.12' of ↵Linus Torvalds2017-05-0312-585/+1303
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "New features for this release: - Pretty much a full rewrite of the processing of function plugins. i.e. echo do_IRQ:stacktrace > set_ftrace_filter - The rewrite was needed to add plugins to be unique to tracing instances. i.e. mkdir instance/foo; cd instances/foo; echo do_IRQ:stacktrace > set_ftrace_filter The old way was written very hacky. This removes a lot of those hacks. - New "function-fork" tracing option. When set, pids in the set_ftrace_pid will have their children added when the processes with their pids listed in the set_ftrace_pid file forks. - Exposure of "maxactive" for kretprobe in kprobe_events - Allow for builtin init functions to be traced by the function tracer (via the kernel command line). Module init function tracing will come in the next release. - Added more selftests, and have selftests also test in an instance" * tag 'trace-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (60 commits) ring-buffer: Return reader page back into existing ring buffer selftests: ftrace: Allow some event trigger tests to run in an instance selftests: ftrace: Have some basic tests run in a tracing instance too selftests: ftrace: Have event tests also run in an tracing instance selftests: ftrace: Make func_event_triggers and func_traceonoff_triggers tests do instances selftests: ftrace: Allow some tests to be run in a tracing instance tracing/ftrace: Allow for instances to trigger their own stacktrace probes tracing/ftrace: Allow for the traceonoff probe be unique to instances tracing/ftrace: Enable snapshot function trigger to work with instances tracing/ftrace: Allow instances to have their own function probes tracing/ftrace: Add a better way to pass data via the probe functions ftrace: Dynamically create the probe ftrace_ops for the trace_array tracing: Pass the trace_array into ftrace_probe_ops functions tracing: Have the trace_array hold the list of registered func probes ftrace: If the hash for a probe fails to update then free what was initialized ftrace: Have the function probes call their own function ftrace: Have each function probe use its own ftrace_ops ftrace: Have unregister_ftrace_function_probe_func() return a value ftrace: Add helper function ftrace_hash_move_and_update_ops() ftrace: Remove data field from ftrace_func_probe structure ...
| * | | | ring-buffer: Return reader page back into existing ring bufferSteven Rostedt (VMware)2017-05-013-9/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading the ring buffer for consuming, it is optimized for splice, where a page is taken out of the ring buffer (zero copy) and sent to the reading consumer. When the read is finished with the page, it calls ring_buffer_free_read_page(), which simply frees the page. The next time the reader needs to get a page from the ring buffer, it must call ring_buffer_alloc_read_page() which allocates and initializes a reader page for the ring buffer to be swapped into the ring buffer for a new filled page for the reader. The problem is that there's no reason to actually free the page when it is passed back to the ring buffer. It can hold it off and reuse it for the next iteration. This completely removes the interaction with the page_alloc mechanism. Using the trace-cmd utility to record all events (causing trace-cmd to require reading lots of pages from the ring buffer, and calling ring_buffer_alloc/free_read_page() several times), and also assigning a stack trace trigger to the mm_page_alloc event, we can see how many times the ring_buffer_alloc_read_page() needed to allocate a page for the ring buffer. Before this change: # trace-cmd record -e all -e mem_page_alloc -R stacktrace sleep 1 # trace-cmd report |grep ring_buffer_alloc_read_page | wc -l 9968 After this change: # trace-cmd record -e all -e mem_page_alloc -R stacktrace sleep 1 # trace-cmd report |grep ring_buffer_alloc_read_page | wc -l 4 Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing/ftrace: Allow for instances to trigger their own stacktrace probesSteven Rostedt (VMware)2017-04-201-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Have the stacktrace function trigger probe trigger stack traces within the instance that they were added to in the set_ftrace_filter. ># cd /sys/kernel/debug/tracing ># mkdir instances/foo ># cd instances/foo ># echo schedule:stacktrace:1 > set_ftrace_filter ># cat trace # tracer: nop # # entries-in-buffer/entries-written: 1/1 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <idle>-0 [001] .N.2 202.585010: <stack trace> => => schedule => schedule_preempt_disabled => do_idle => cpu_startup_entry => start_secondary => verify_cpu Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing/ftrace: Allow for the traceonoff probe be unique to instancesSteven Rostedt (VMware)2017-04-203-12/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Have the traceon/off function probe triggers affect only the instance they are set in. This required making the trace_on/off accessible for other files in the tracing directory. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing/ftrace: Enable snapshot function trigger to work with instancesSteven Rostedt (VMware)2017-04-201-19/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Modify the snapshot probe trigger to work with instances. This way the snapshot function trigger will only affect the instance that it is added to in the set_ftrace_filter file. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing/ftrace: Allow instances to have their own function probesSteven Rostedt (VMware)2017-04-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass around the local trace_array that is the descriptor for tracing instances, when enabling and disabling probes. This by default sets the enable/disable of event probe triggers to work with instances. The other probes will need some more work to get them working with instances. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing/ftrace: Add a better way to pass data via the probe functionsSteven Rostedt (VMware)2017-04-205-107/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the redesign of the registration and execution of the function probes (triggers), data can now be passed from the setup of the probe to the probe callers that are specific to the trace_array it is on. Although, all probes still only affect the toplevel trace array, this change will allow for instances to have their own probes separated from other instances and the top array. That is, something like the stacktrace probe can be set to trace only in an instance and not the toplevel trace array. This isn't implement yet, but this change sets the ground work for the change. When a probe callback is triggered (someone writes the probe format into set_ftrace_filter), it calls register_ftrace_function_probe() passing in init_data that will be used to initialize the probe. Then for every matching function, register_ftrace_function_probe() will call the probe_ops->init() function with the init data that was passed to it, as well as an address to a place holder that is associated with the probe and the instance. The first occurrence will have a NULL in the pointer. The init() function will then initialize it. If other probes are added, or more functions are part of the probe, the place holder will be passed to the init() function with the place holder data that it was initialized to the last time. Then this place_holder is passed to each of the other probe_ops functions, where it can be used in the function callback. When the probe_ops free() function is called, it can be called either with the rip of the function that is being removed from the probe, or zero, indicating that there are no more functions attached to the probe, and the place holder is about to be freed. This gives the probe_ops a way to free the data it assigned to the place holder if it was allocade during the first init call. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Dynamically create the probe ftrace_ops for the trace_arraySteven Rostedt (VMware)2017-04-205-57/+146
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to eventually have each trace_array instance have its own unique set of function probes (triggers), the trace array needs to hold the ops and the filters for the probes. This is the first step to accomplish this. Instead of having the private data of the probe ops point to the trace_array, create a separate list that the trace_array holds. There's only one private_data for a probe, we need one per trace_array. The probe ftrace_ops will be dynamically created for each instance, instead of being static. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing: Pass the trace_array into ftrace_probe_ops functionsSteven Rostedt (VMware)2017-04-205-30/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass the trace_array associated to a ftrace_probe_ops into the probe_ops func(), init() and free() functions. The trace_array is the descriptor that describes a tracing instance. This will help create the infrastructure that will allow having function probes unique to tracing instances. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing: Have the trace_array hold the list of registered func probesSteven Rostedt (VMware)2017-04-205-30/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a link list to the trace_array to hold func probes that are registered. Currently, all function probes are the same for all instances as it was before, that is, only the top level trace_array holds the function probes. But this lays the ground work to have function probes be attached to individual instances, and having the event trigger only affect events in the given instance. But that work is still to be done. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: If the hash for a probe fails to update then free what was initializedSteven Rostedt (VMware)2017-04-201-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the ftrace_hash_move_and_update_ops() fails, and an ops->free() function exists, then it needs to be called on all the ops that were added by this registration. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Have the function probes call their own functionSteven Rostedt (VMware)2017-04-202-127/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the function probes have their own ftrace_ops, there's no reason to continue using the ftrace_func_hash to find which probe to call in the function callback. The ops that is passed in to the function callback is part of the probe_ops to call. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Have each function probe use its own ftrace_opsSteven Rostedt (VMware)2017-04-202-148/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Have the function probes have their own ftrace_ops, and remove the trace_probe_ops. This simplifies some of the ftrace infrastructure code. Individual entries for each function is still allocated for the use of the output for set_ftrace_filter, but they will be removed soon too. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Have unregister_ftrace_function_probe_func() return a valueSteven Rostedt (VMware)2017-04-205-14/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently unregister_ftrace_function_probe_func() is a void function. It does not give any feedback if an error occurred or no item was found to remove and nothing was done. Change it to return status and success if it removed something. Also update the callers to return that feedback to the user. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Add helper function ftrace_hash_move_and_update_ops()Steven Rostedt (VMware)2017-04-201-52/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The processes of updating a ops filter_hash is a bit complex, and requires setting up an old hash to perform the update. This is done exactly the same in two locations for the same reasons. Create a helper function that does it in one place. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Remove data field from ftrace_func_probe structureSteven Rostedt (VMware)2017-04-205-15/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No users of the function probes uses the data field anymore. Remove it, and change the init function to take a void *data parameter instead of a void **data, because the init will just get the data that the registering function was received, and there's no state after it is called. The other functions for ftrace_probe_ops still take the data parameter, but it will currently only be passed NULL. It will stay as a parameter for future data to be passed to these functions. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Remove printing of data in showing of a function probeSteven Rostedt (VMware)2017-04-201-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | None of the probe users uses the data field anymore of the entry. They all have their own print() function. Remove showing the data field in the generic function as the data field will be going away. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Remove unused unregister_ftrace_function_probe_all() functionSteven Rostedt (VMware)2017-04-202-20/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are no users of unregister_ftrace_function_probe_all(). The only probe function that is used is unregister_ftrace_function_probe_func(). Rename the internal static function __unregister_ftrace_function_probe() to unregister_ftrace_function_probe_func() and make it global. Also remove the PROBE_TEST_FUNC as it would be always set. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Remove unused unregister_ftrace_function_probe() functionSteven Rostedt (VMware)2017-04-202-18/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nothing calls unregister_ftrace_function_probe(). Remove it as well as the flag PROBE_TEST_DATA, as this function was the only one to set it. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Convert the rest of the function trigger over to the mapping functionsSteven Rostedt (VMware)2017-04-201-38/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the data pointer for individual ips will soon be removed and no longer passed to the callback function probe handlers, convert the rest of the function trigger counters over to the new ftrace_func_mapper helper functions. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing: Have the snapshot trigger use the mapping helper functionsSteven Rostedt (VMware)2017-04-201-8/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the data pointer for individual ips will soon be removed and no longer passed to the callback function probe handlers, convert the snapshot trigger counter over to the new ftrace_func_mapper helper functions. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Added ftrace_func_mapper for function probe triggersSteven Rostedt (VMware)2017-04-203-15/+210
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to move the ops to the function probes directly, they need a way to map function ips to their own data without depending on the infrastructure of the function probes, as the data field will be going away. New helper functions are added that are based on the ftrace_hash code. ftrace_func_mapper functions are there to let the probes map ips to their data. These can be allocated by the probe ops, and referenced in the function callbacks. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Pass probe ops to probe functionSteven Rostedt (VMware)2017-04-205-14/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation to cleaning up the probe function registration code, the "data" parameter will eventually be removed from the probe->func() call. Instead it will receive its own "ops" function, in which it can set up its own data that it needs to map. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Remove unused "flags" field from struct ftrace_func_probeSteven Rostedt (VMware)2017-04-201-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nothing uses "flags" in the ftrace_func_probe descriptor. Remove it. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Move the function commands into the tracing directorySteven Rostedt (VMware)2017-04-201-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As nothing outside the tracing directory uses the function command mechanism, I'm moving the prototypes out of the include/linux/ftrace.h and into the local kernel/trace/trace.h header. I plan on making them hook to the trace_array structure which is local to kernel/trace, and I do not want to expose it to the rest of the kernel. This requires that the command functions must also be local to tracing. But luckily nothing else uses them. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Move the probe function into the tracing directorySteven Rostedt (VMware)2017-04-181-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As nothing outside the tracing directory uses the function probes mechanism, I'm moving the prototypes out of the include/linux/ftrace.h and into the local kernel/trace/trace.h header. I plan on making them hook to the trace_array structure which is local to kernel/trace, and I do not want to expose it to the rest of the kernel. This requires that the probe functions must also be local to tracing. But luckily nothing else uses them. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Add 'function-fork' trace optionNamhyung Kim2017-04-173-2/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function-fork option is same as event-fork that it tracks task fork/exit and set the pid filter properly. This can be useful if user wants to trace selected tasks including their children only. Link: http://lkml.kernel.org/r/20170417024430.21194-3-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing: Have the trace_event benchmark thread call cond_resched_rcu_qs()Steven Rostedt (VMware)2017-04-171-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The trace_event benchmark thread runs in kernel space in an infinite loop while also calling cond_resched() in case anything else wants to schedule in. Unfortunately, on a PREEMPT kernel, that makes it a nop, in which case, this will never voluntarily schedule. That will cause synchronize_rcu_tasks() to forever block on this thread, while it is running. This is exactly what cond_resched_rcu_qs() is for. Use that instead. Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Fix indexing of t_hash_start() from t_next()Steven Rostedt (VMware)2017-04-171-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | t_hash_start() does not increment *pos, where as t_next() must. But when t_next() does increment *pos, it must still pass in the original *pos to t_hash_start() otherwise it will skip the first instance: # cd /sys/kernel/debug/tracing # echo schedule:traceoff > set_ftrace_filter # echo do_IRQ:traceoff > set_ftrace_filter # echo call_rcu > set_ftrace_filter # cat set_ftrace_filter call_rcu schedule:traceoff:unlimited do_IRQ:traceoff:unlimited The above called t_hash_start() from t_start() as there was only one function (call_rcu), but if we add another function: # echo xfrm_policy_destroy_rcu >> set_ftrace_filter # cat set_ftrace_filter call_rcu xfrm_policy_destroy_rcu do_IRQ:traceoff:unlimited The "schedule:traceoff" disappears. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Fix removing of second function probeSteven Rostedt (VMware)2017-04-151-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When two function probes are added to set_ftrace_filter, and then one of them is removed, the update to the function locations is not performed, and the record keeping of the function states are corrupted, and causes an ftrace_bug() to occur. This is easily reproducable by adding two probes, removing one, and then adding it back again. # cd /sys/kernel/debug/tracing # echo schedule:traceoff > set_ftrace_filter # echo do_IRQ:traceoff > set_ftrace_filter # echo \!do_IRQ:traceoff > /debug/tracing/set_ftrace_filter # echo do_IRQ:traceoff > set_ftrace_filter Causes: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 1098 at kernel/trace/ftrace.c:2369 ftrace_get_addr_curr+0x143/0x220 Modules linked in: [...] CPU: 2 PID: 1098 Comm: bash Not tainted 4.10.0-test+ #405 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012 Call Trace: dump_stack+0x68/0x9f __warn+0x111/0x130 ? trace_irq_work_interrupt+0xa0/0xa0 warn_slowpath_null+0x1d/0x20 ftrace_get_addr_curr+0x143/0x220 ? __fentry__+0x10/0x10 ftrace_replace_code+0xe3/0x4f0 ? ftrace_int3_handler+0x90/0x90 ? printk+0x99/0xb5 ? 0xffffffff81000000 ftrace_modify_all_code+0x97/0x110 arch_ftrace_update_code+0x10/0x20 ftrace_run_update_code+0x1c/0x60 ftrace_run_modify_code.isra.48.constprop.62+0x8e/0xd0 register_ftrace_function_probe+0x4b6/0x590 ? ftrace_startup+0x310/0x310 ? debug_lockdep_rcu_enabled.part.4+0x1a/0x30 ? update_stack_state+0x88/0x110 ? ftrace_regex_write.isra.43.part.44+0x1d3/0x320 ? preempt_count_sub+0x18/0xd0 ? mutex_lock_nested+0x104/0x800 ? ftrace_regex_write.isra.43.part.44+0x1d3/0x320 ? __unwind_start+0x1c0/0x1c0 ? _mutex_lock_nest_lock+0x800/0x800 ftrace_trace_probe_callback.isra.3+0xc0/0x130 ? func_set_flag+0xe0/0xe0 ? __lock_acquire+0x642/0x1790 ? __might_fault+0x1e/0x20 ? trace_get_user+0x398/0x470 ? strcmp+0x35/0x60 ftrace_trace_onoff_callback+0x48/0x70 ftrace_regex_write.isra.43.part.44+0x251/0x320 ? match_records+0x420/0x420 ftrace_filter_write+0x2b/0x30 __vfs_write+0xd7/0x330 ? do_loop_readv_writev+0x120/0x120 ? locks_remove_posix+0x90/0x2f0 ? do_lock_file_wait+0x160/0x160 ? __lock_is_held+0x93/0x100 ? rcu_read_lock_sched_held+0x5c/0xb0 ? preempt_count_sub+0x18/0xd0 ? __sb_start_write+0x10a/0x230 ? vfs_write+0x222/0x240 vfs_write+0xef/0x240 SyS_write+0xab/0x130 ? SyS_read+0x130/0x130 ? trace_hardirqs_on_caller+0x182/0x280 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL_64_fastpath+0x18/0xad RIP: 0033:0x7fe61c157c30 RSP: 002b:00007ffe87890258 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: ffffffff8114a410 RCX: 00007fe61c157c30 RDX: 0000000000000010 RSI: 000055814798f5e0 RDI: 0000000000000001 RBP: ffff8800c9027f98 R08: 00007fe61c422740 R09: 00007fe61ca53700 R10: 0000000000000073 R11: 0000000000000246 R12: 0000558147a36400 R13: 00007ffe8788f160 R14: 0000000000000024 R15: 00007ffe8788f15c ? trace_hardirqs_off_caller+0xc0/0x110 ---[ end trace 99fa09b3d9869c2c ]--- Bad trampoline accounting at: ffffffff81cc3b00 (do_IRQ+0x0/0x150) Cc: stable@vger.kernel.org Fixes: 59df055f1991 ("ftrace: trace different functions with a different tracer") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not workSteven Rostedt (VMware)2017-04-102-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tracing uses rcu_irq_enter() as a way to make sure that RCU is watching when it needs to use rcu_read_lock() and friends. This is because tracing can happen as RCU is about to enter user space, or about to go idle, and RCU does not watch for RCU read side critical sections as it makes the transition. There is a small location within the RCU infrastructure that rcu_irq_enter() itself will not work. If tracing were to occur in that section it will break if it tries to use rcu_irq_enter(). Originally, this happens with the stack_tracer, because it will call save_stack_trace when it encounters stack usage that is greater than any stack usage it had encountered previously. There was a case where that happened in the RCU section where rcu_irq_enter() did not work, and lockdep complained loudly about it. To fix it, stack tracing added a call to be disabled and RCU would disable stack tracing during the critical section that rcu_irq_enter() was inoperable. This solution worked, but there are other cases that use rcu_irq_enter() and it would be a good idea to let RCU give a way to let others know that rcu_irq_enter() will not work. For example, in trace events. Another helpful aspect of this change is that it also moves the per cpu variable called in the RCU critical section into a cache locale along with other RCU per cpu variables used in that same location. I'm keeping the stack_trace_disable() code, as that still could be used in the future by places that really need to disable it. And since it's only a static inline, it wont take up any kernel text if it is not used. Link: http://lkml.kernel.org/r/20170405093207.404f8deb@gandalf.local.home Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | rcu: Fix dyntick-idle tracingPaul E. McKenney2017-04-101-25/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tracing subsystem started using rcu_irq_entry() and rcu_irq_exit() (with my blessing) to allow the current _rcuidle alternative tracepoint name to be dispensed with while still maintaining good performance. Unfortunately, this causes RCU's dyntick-idle entry code's tracing to appear to RCU like an interrupt that occurs where RCU is not designed to handle interrupts. This commit fixes this problem by moving the zeroing of ->dynticks_nesting after the offending trace_rcu_dyntick() statement, which narrows the window of vulnerability to a pair of adjacent statements that are now marked with comments to that effect. Link: http://lkml.kernel.org/r/20170405093207.404f8deb@gandalf.local.home Link: http://lkml.kernel.org/r/20170405193928.GM1600@linux.vnet.ibm.com Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing: Rename trace_active to disable_stack_tracer and inline its modificationSteven Rostedt (VMware)2017-04-101-41/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to eliminate a function call, make "trace_active" into "disable_stack_tracer" and convert stack_tracer_disable() and friends into static inline functions. Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing: Add stack_tracer_disable/enable() functionsSteven Rostedt (VMware)2017-04-101-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are certain parts of the kernel that cannot let stack tracing proceed (namely in RCU), because the stack tracer uses RCU, and parts of RCU internals cannot handle having RCU read side locks taken. Add stack_tracer_disable() and stack_tracer_enable() functions to let RCU stop stack tracing on the current CPU when it is in those critical sections. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing: Replace the per_cpu() with __this_cpu*() in trace_stack.cSteven Rostedt (VMware)2017-04-101-16/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The updates to the trace_active per cpu variable can be updated with the __this_cpu_*() functions as it only gets updated on the CPU that the variable is on. Thanks to Paul McKenney for suggesting __this_cpu_* instead of this_cpu_*. Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolinesSteven Rostedt (VMware)2017-04-072-25/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function tracer needs to be more careful than other subsystems when it comes to freeing data. Especially if that data is actually executable code. When a single function is traced, a trampoline can be dynamically allocated which is called to jump to the function trace callback. When the callback is no longer needed, the dynamic allocated trampoline needs to be freed. This is where the issues arise. The dynamically allocated trampoline must not be used again. As function tracing can trace all subsystems, including subsystems that are used to serialize aspects of freeing (namely RCU), it must take extra care when doing the freeing. Before synchronize_rcu_tasks() was around, there was no way for the function tracer to know that nothing was using the dynamically allocated trampoline when CONFIG_PREEMPT was enabled. That's because a task could be indefinitely preempted while sitting on the trampoline. Now with synchronize_rcu_tasks(), it will wait till all tasks have either voluntarily scheduled (not on the trampoline) or goes into userspace (not on the trampoline). Then it is safe to free the trampoline even with CONFIG_PREEMPT set. Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | tracing/kprobes: expose maxactive for kretprobe in kprobe_eventsAlban Crequy2017-04-041-6/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a kretprobe is installed on a kernel function, there is a maximum limit of how many calls in parallel it can catch (aka "maxactive"). A kernel module could call register_kretprobe() and initialize maxactive (see example in samples/kprobes/kretprobe_example.c). But that is not exposed to userspace and it is currently not possible to choose maxactive when writing to /sys/kernel/debug/tracing/kprobe_events The default maxactive can be as low as 1 on single-core with a non-preemptive kernel. This is too low and we need to increase it not only for recursive functions, but for functions that sleep or resched. This patch updates the format of the command that can be written to kprobe_events so that maxactive can be optionally specified. I need this for a bpf program attached to the kretprobe of inet_csk_accept, which can sleep for a long time. This patch includes a basic selftest: > # ./ftracetest -v test.d/kprobe/ > === Ftrace unit tests === > [1] Kprobe dynamic event - adding and removing [PASS] > [2] Kprobe dynamic event - busy event check [PASS] > [3] Kprobe dynamic event with arguments [PASS] > [4] Kprobes event arguments with types [PASS] > [5] Kprobe dynamic event with function tracer [PASS] > [6] Kretprobe dynamic event with arguments [PASS] > [7] Kretprobe dynamic event with maxactive [PASS] > > # of passed: 7 > # of failed: 0 > # of unresolved: 0 > # of untested: 0 > # of unsupported: 0 > # of xfailed: 0 > # of undefined(test bug): 0 BugLink: https://github.com/iovisor/bcc/issues/1072 Link: http://lkml.kernel.org/r/1491215782-15490-1-git-send-email-alban@kinvolk.io Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Alban Crequy <alban@kinvolk.io> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Have init/main.c call ftrace directly to free init memorySteven Rostedt (VMware)2017-04-031-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Relying on free_reserved_area() to call ftrace to free init memory proved to not be sufficient. The issue is that on x86, when debug_pagealloc is enabled, the init memory is not freed, but simply set as not present. Since ftrace was uninformed of this, starting function tracing still tries to update pages that are not present according to the page tables, causing ftrace to bug, as well as killing the kernel itself. Instead of relying on free_reserved_area(), have init/main.c call ftrace directly just before it frees the init memory. Then it needs to use __init_begin and __init_end to know where the init memory location is. Looking at all archs (and testing what I can), it appears that this should work for each of them. Reported-by: kernel test robot <xiaolong.ye@intel.com> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Create separate t_func_next() to simplify the function / hash logicSteven Rostedt (VMware)2017-03-311-14/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I noticed that if I use dd to read the set_ftrace_filter file that the first hash command is repeated. # cd /sys/kernel/debug/tracing # echo schedule > set_ftrace_filter # echo do_IRQ >> set_ftrace_filter # echo schedule:traceoff >> set_ftrace_filter # echo do_IRQ:traceoff >> set_ftrace_filter # cat set_ftrace_filter schedule do_IRQ schedule:traceoff:unlimited do_IRQ:traceoff:unlimited # dd if=set_ftrace_filter bs=1 schedule do_IRQ schedule:traceoff:unlimited schedule:traceoff:unlimited do_IRQ:traceoff:unlimited 98+0 records in 98+0 records out 98 bytes copied, 0.00265011 s, 37.0 kB/s This is due to the way t_start() calls t_next() as well as the seq_file calls t_next() and the state is slightly different between the two. Namely, t_start() will call t_next() with a local "pos" variable. By separating out the function listing from t_next() into its own function, we can have better control of outputting the functions and the hash of triggers. This simplifies the code. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | | | ftrace: Update func_pos in t_start() when all functions are enabledSteven Rostedt (VMware)2017-03-311-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If all functions are enabled, there's a comment displayed in the file to denote that: # cd /sys/kernel/debug/tracing # cat set_ftrace_filter #### all functions enabled #### If a function trigger is set, those are displayed as well: # echo schedule:traceoff >> /debug/tracing/set_ftrace_filter # cat set_ftrace_filter #### all functions enabled #### schedule:traceoff:unlimited But if you read that file with dd, the output can change: # dd if=/debug/tracing/set_ftrace_filter bs=1 #### all functions enabled #### 32+0 records in 32+0 records out 32 bytes copied, 7.0237e-05 s, 456 kB/s This is because the "pos" variable is updated for the comment, but func_pos is not. "func_pos" is used by the triggers (or hashes) to know how many functions were printed and it bases its index from the pos - func_pos. func_pos should be 1 to count for the comment printed. But since it is not, t_hash_start() thinks that one trigger was already printed. The cat gets to t_hash_start() via t_next() and not t_start() which updates both pos and func_pos. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
OpenPOWER on IntegriCloud