From 2b144498350860b6ee9dc57ff27a93ad488de5dc Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Thu, 9 Feb 2012 14:56:42 +0530 Subject: uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints Add uprobes support to the core kernel, with x86 support. This commit adds the kernel facilities, the actual uprobes user-space ABI and perf probe support comes in later commits. General design: Uprobes are maintained in an rb-tree indexed by inode and offset (the offset here is from the start of the mapping). For a unique (inode, offset) tuple, there can be at most one uprobe in the rb-tree. Since the (inode, offset) tuple identifies a unique uprobe, more than one user may be interested in the same uprobe. This provides the ability to connect multiple 'consumers' to the same uprobe. Each consumer defines a handler and a filter (optional). The 'handler' is run every time the uprobe is hit, if it matches the 'filter' criteria. The first consumer of a uprobe causes the breakpoint to be inserted at the specified address and subsequent consumers are appended to this list. On subsequent probes, the consumer gets appended to the existing list of consumers. The breakpoint is removed when the last consumer unregisters. For all other unregisterations, the consumer is removed from the list of consumers. Given a inode, we get a list of the mms that have mapped the inode. Do the actual registration if mm maps the page where a probe needs to be inserted/removed. We use a temporary list to walk through the vmas that map the inode. - The number of maps that map the inode, is not known before we walk the rmap and keeps changing. - extending vm_area_struct wasn't recommended, it's a size-critical data structure. - There can be more than one maps of the inode in the same mm. We add callbacks to the mmap methods to keep an eye on text vmas that are of interest to uprobes. When a vma of interest is mapped, we insert the breakpoint at the right address. Uprobe works by replacing the instruction at the address defined by (inode, offset) with the arch specific breakpoint instruction. We save a copy of the original instruction at the uprobed address. This is needed for: a. executing the instruction out-of-line (xol). b. instruction analysis for any subsequent fixups. c. restoring the instruction back when the uprobe is unregistered. We insert or delete a breakpoint instruction, and this breakpoint instruction is assumed to be the smallest instruction available on the platform. For fixed size instruction platforms this is trivially true, for variable size instruction platforms the breakpoint instruction is typically the smallest (often a single byte). Writing the instruction is done by COWing the page and changing the instruction during the copy, this even though most platforms allow atomic writes of the breakpoint instruction. This also mirrors the behaviour of a ptrace() memory write to a PRIVATE file map. The core worker is derived from KSM's replace_page() logic. In essence, similar to KSM: a. allocate a new page and copy over contents of the page that has the uprobed vaddr b. modify the copy and insert the breakpoint at the required address c. switch the original page with the copy containing the breakpoint d. flush page tables. replace_page() is being replicated here because of some minor changes in the type of pages and also because Hugh Dickins had plans to improve replace_page() for KSM specific work. Instruction analysis on x86 is based on instruction decoder and determines if an instruction can be probed and determines the necessary fixups after singlestep. Instruction analysis is done at probe insertion time so that we avoid having to repeat the same analysis every time a probe is hit. A lot of code here is due to the improvement/suggestions/inputs from Peter Zijlstra. Changelog: (v10): - Add code to clear REX.B prefix as suggested by Denys Vlasenko and Masami Hiramatsu. (v9): - Use insn_offset_modrm as suggested by Masami Hiramatsu. (v7): Handle comments from Peter Zijlstra: - Dont take reference to inode. (expect inode to uprobe_register to be sane). - Use PTR_ERR to set the return value. - No need to take reference to inode. - use PTR_ERR to return error value. - register and uprobe_unregister share code. (v5): - Modified del_consumer as per comments from Peter. - Drop reference to inode before dropping reference to uprobe. - Use i_size_read(inode) instead of inode->i_size. - Ensure uprobe->consumers is NULL, before __uprobe_unregister() is called. - Includes errno.h as recommended by Stephen Rothwell to fix a build issue on sparc defconfig - Remove restrictions while unregistering. - Earlier code leaked inode references under some conditions while registering/unregistering. - Continue the vma-rmap walk even if the intermediate vma doesnt meet the requirements. - Validate the vma found by find_vma before inserting/removing the breakpoint - Call del_consumer under mutex_lock. - Use hash locks. - Handle mremap. - Introduce find_least_offset_node() instead of close match logic in find_uprobe - Uprobes no more depends on MM_OWNER; No reference to task_structs while inserting/removing a probe. - Uses read_mapping_page instead of grab_cache_page so that the pages have valid content. - pass NULL to get_user_pages for the task parameter. - call SetPageUptodate on the new page allocated in write_opcode. - fix leaking a reference to the new page under certain conditions. - Include Instruction Decoder if Uprobes gets defined. - Remove const attributes for instruction prefix arrays. - Uses mm_context to know if the application is 32 bit. Signed-off-by: Srikar Dronamraju Also-written-by: Jim Keniston Reviewed-by: Peter Zijlstra Cc: Oleg Nesterov Cc: Andi Kleen Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Roland McGrath Cc: Masami Hiramatsu Cc: Arnaldo Carvalho de Melo Cc: Anton Arapov Cc: Ananth N Mavinakayanahalli Cc: Stephen Rothwell Cc: Denys Vlasenko Cc: Peter Zijlstra Cc: Linus Torvalds Cc: Andrew Morton Cc: Linux-mm Link: http://lkml.kernel.org/r/20120209092642.GE16600@linux.vnet.ibm.com [ Made various small edits to the commit log ] Signed-off-by: Ingo Molnar --- arch/Kconfig | 11 +++++++++++ 1 file changed, 11 insertions(+) (limited to 'arch/Kconfig') diff --git a/arch/Kconfig b/arch/Kconfig index 4f55c73..284f589 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -65,6 +65,17 @@ config OPTPROBES depends on KPROBES && HAVE_OPTPROBES depends on !PREEMPT +config UPROBES + bool "User-space probes (EXPERIMENTAL)" + depends on ARCH_SUPPORTS_UPROBES + default n + help + Uprobes enables kernel subsystems to establish probepoints + in user applications and execute handler functions when + the probepoints are hit. + + If in doubt, say "N". + config HAVE_EFFICIENT_UNALIGNED_ACCESS bool help -- cgit v1.1 From 7b2d81d48a2d8e37efb6ce7b4d5ef58822b30d89 Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Fri, 17 Feb 2012 09:27:41 +0100 Subject: uprobes/core: Clean up, refactor and improve the code Make the uprobes code readable to me: - improve the Kconfig text so that a mere mortal gets some idea what CONFIG_UPROBES=y is really about - do trivial renames to standardize around the uprobes_*() namespace - clean up and simplify various code flow details - separate basic blocks of functionality - line break artifact and white space related removal - use standard local varible definition blocks - use vertical spacing to make things more readable - remove unnecessary volatile - restructure comment blocks to make them more uniform and more readable in general Cc: Srikar Dronamraju Cc: Jim Keniston Cc: Peter Zijlstra Cc: Oleg Nesterov Cc: Masami Hiramatsu Cc: Arnaldo Carvalho de Melo Cc: Anton Arapov Cc: Ananth N Mavinakayanahalli Link: http://lkml.kernel.org/n/tip-ewbwhb8o6navvllsauu7k07p@git.kernel.org Signed-off-by: Ingo Molnar --- arch/Kconfig | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) (limited to 'arch/Kconfig') diff --git a/arch/Kconfig b/arch/Kconfig index 284f589..cca5b54 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -66,13 +66,19 @@ config OPTPROBES depends on !PREEMPT config UPROBES - bool "User-space probes (EXPERIMENTAL)" + bool "Transparent user-space probes (EXPERIMENTAL)" depends on ARCH_SUPPORTS_UPROBES default n help - Uprobes enables kernel subsystems to establish probepoints - in user applications and execute handler functions when - the probepoints are hit. + Uprobes is the user-space counterpart to kprobes: they + enable instrumentation applications (such as 'perf probe') + to establish unintrusive probes in user-space binaries and + libraries, by executing handler functions when the probes + are hit by user-space applications. + + ( These probes come in the form of single-byte breakpoints, + managed by the kernel and kept transparent to the probed + application. ) If in doubt, say "N". -- cgit v1.1 From a5f4374a9610fd7286c2164d4e680436727eff71 Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Wed, 22 Feb 2012 11:01:49 +0100 Subject: uprobes: Move to kernel/events/ Consolidate the uprobes code under kernel/events/, where the various core kernel event handling routines live. Acked-by: Peter Zijlstra Cc: Srikar Dronamraju Cc: Jim Keniston Cc: Oleg Nesterov Cc: Masami Hiramatsu Cc: Arnaldo Carvalho de Melo Cc: Anton Arapov Cc: Ananth N Mavinakayanahalli Link: http://lkml.kernel.org/n/tip-biuyhhwohxgbp2vzbap5yr8o@git.kernel.org Signed-off-by: Ingo Molnar --- arch/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch/Kconfig') diff --git a/arch/Kconfig b/arch/Kconfig index cca5b54..d0e37c9 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -67,7 +67,7 @@ config OPTPROBES config UPROBES bool "Transparent user-space probes (EXPERIMENTAL)" - depends on ARCH_SUPPORTS_UPROBES + depends on ARCH_SUPPORTS_UPROBES && PERF_EVENTS default n help Uprobes is the user-space counterpart to kprobes: they -- cgit v1.1 From f3f096cfedf8113380c56fc855275cc75cd8cf55 Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Wed, 11 Apr 2012 16:00:43 +0530 Subject: tracing: Provide trace events interface for uprobes Implements trace_event support for uprobes. In its current form it can be used to put probes at a specified offset in a file and dump the required registers when the code flow reaches the probed address. The following example shows how to dump the instruction pointer and %ax a register at the probed text address. Here we are trying to probe zfree in /bin/zsh: # cd /sys/kernel/debug/tracing/ # cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp 00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh # objdump -T /bin/zsh | grep -w zfree 0000000000446420 g DF .text 0000000000000012 Base zfree # echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events # cat uprobe_events p:uprobes/p_zsh_0x46420 /bin/zsh:0x0000000000046420 # echo 1 > events/uprobes/enable # sleep 20 # echo 0 > events/uprobes/enable # cat trace # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | zsh-24842 [006] 258544.995456: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 zsh-24842 [007] 258545.000270: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 zsh-24842 [002] 258545.043929: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 zsh-24842 [004] 258547.046129: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 Signed-off-by: Srikar Dronamraju Acked-by: Steven Rostedt Acked-by: Masami Hiramatsu Cc: Linus Torvalds Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Linux-mm Cc: Oleg Nesterov Cc: Andi Kleen Cc: Christoph Hellwig Cc: Arnaldo Carvalho de Melo Cc: Anton Arapov Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20120411103043.GB29437@linux.vnet.ibm.com Signed-off-by: Ingo Molnar --- arch/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch/Kconfig') diff --git a/arch/Kconfig b/arch/Kconfig index e5d3778..0f8f968 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -78,7 +78,7 @@ config OPTPROBES config UPROBES bool "Transparent user-space probes (EXPERIMENTAL)" - depends on ARCH_SUPPORTS_UPROBES && PERF_EVENTS + depends on UPROBE_EVENTS && PERF_EVENTS default n help Uprobes is the user-space counterpart to kprobes: they -- cgit v1.1 From ec83db0f78cd44c3b586ec1c3a348d1a8a389797 Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Tue, 8 May 2012 16:41:26 +0530 Subject: tracing: Fix kconfig warning due to a typo Commit f3f096cfe ("tracing: Provide trace events interface for uprobes") throws a warning about unmet dependencies. The exact warning message is: warning: (UPROBE_EVENT) selects UPROBES which has unmet direct dependencies (UPROBE_EVENTS && PERF_EVENTS) This is due to a typo in arch/Kconfig file. Fix similar typos in the uprobetracer documentation. Also add sample format of a uprobe event in the uprobetracer documentation as suggested by Masami Hiramatsu. Reported-by: Stephen Boyd Reported-by: Ingo Molnar Signed-off-by: Srikar Dronamraju Cc: Linus Torvalds Cc: Ananth N Mavinakayanahalli Cc: Oleg Nesterov Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Arnaldo Carvalho de Melo Cc: Masami Hiramatsu Cc: Anton Arapov Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20120508111126.21004.38285.sendpatchset@srdronam.in.ibm.com Signed-off-by: Ingo Molnar --- arch/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch/Kconfig') diff --git a/arch/Kconfig b/arch/Kconfig index 0f8f968..2880abf 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -78,7 +78,7 @@ config OPTPROBES config UPROBES bool "Transparent user-space probes (EXPERIMENTAL)" - depends on UPROBE_EVENTS && PERF_EVENTS + depends on UPROBE_EVENT && PERF_EVENTS default n help Uprobes is the user-space counterpart to kprobes: they -- cgit v1.1