path: root/sys/kern/sys_process.c
Commit message | Author | Age | Files | Lines
* MFC r304440, r304487: | markj | 2016-08-22 | 1 | -4/+5
  Fix some handling of P2_PTRACE_FSTP.
* MFC 303001: Add PTRACE_VFORK to trace vfork events. | jhb | 2016-08-19 | 1 | -2/+6
  First, PL_FLAG_FORKED events now also set a PL_FLAG_VFORKED flag when the new child was created via vfork() rather than fork().
  Second, a new PL_FLAG_VFORK_DONE event can now be enabled via the PTRACE_VFORK event mask. This new stop is reported after the vfork parent resumes due to the child calling exit or exec. Debuggers can use this stop to reinsert breakpoints in the vfork parent process before it resumes.
* MFC r303423: | kib | 2016-08-15 | 1 | -7/+25
  Force SIGSTOP to be the first signal reported after the attach.
* MFC 302900,302902,302921,303461,304009: | jhb | 2016-08-15 | 1 | -17/+60
  Add a mask of optional ptrace() events.
  302900: Add a test for user signal delivery. This test verifies we get the correct ptrace event details when a signal is posted to a traced process from userland.
  302902: Add a mask of optional ptrace() events. ptrace() now stores a mask of optional events in p_ptevents. Currently this mask is a single integer, but it can be expanded into an array of integers in the future. Two new ptrace requests can be used to manipulate the event mask: PT_GET_EVENT_MASK fetches the current event mask and PT_SET_EVENT_MASK sets the current event mask. The current set of events includes:
  - PTRACE_EXEC: trace calls to execve().
  - PTRACE_SCE: trace system call entries.
  - PTRACE_SCX: trace system call exits.
  - PTRACE_FORK: trace forks and auto-attach to new child processes.
  - PTRACE_LWP: trace LWP events.
  The S_PT_SCX and S_PT_SCE events in the procfs p_stops flags have been replaced by PTRACE_SCE and PTRACE_SCX. PTRACE_FORK replaces P_FOLLOW_FORK and PTRACE_LWP replaces P2_LWP_EVENTS. The PT_FOLLOW_FORK and PT_LWP_EVENTS ptrace requests remain for compatibility but now simply toggle corresponding flags in the event mask. While here, document that PT_SYSCALL, PT_TO_SCE, and PT_TO_SCX both modify the event mask and continue the traced process.
  302921: Rename PTRACE_SYSCALL to LINUX_PTRACE_SYSCALL.
  303461: Note that not all optional ptrace events use SIGTRAP. New child processes attached due to PTRACE_FORK use SIGSTOP instead of SIGTRAP. All other ptrace events use SIGTRAP.
  304009: Remove description of P_FOLLOWFORK as this flag was removed.
* MFC 292894,292896: Add ptrace(2) reporting for LWP events. | jhb | 2016-08-12 | 1 | -0/+15
  292894: Add ptrace(2) reporting for LWP events. Add two new LWPINFO flags: PL_FLAG_BORN and PL_FLAG_EXITED for reporting thread creation and destruction. Newly created threads will stop to report PL_FLAG_BORN before returning to userland and exiting threads will stop to report PL_FLAG_EXITED before exiting completely. Both of these events are only enabled and reported if PT_LWP_EVENTS is enabled on a process.
  292896: Document the recently added support for ptrace(2) LWP events.
* MFC r302919: | kib | 2016-07-22 | 1 | -1/+1
  In ptrace_vm_entry(), do not call vmspace_free() while owning a vm object lock.
* MFC 289636: | jhb | 2015-11-06 | 1 | -1/+1
  Switch pl_child_pid from int to pid_t.
* MFC 288902: | jhb | 2015-11-06 | 1 | -18/+23
  Include additional info in ptrace(2) KTR traces:
  - The new PC value and signal passed to PT_CONTINUE, PT_DETACH, PT_SYSCALL, and PT_TO_SC[EX].
  - The system call code returned via PT_LWPINFO.
* MFC r289660,r289664: | kib | 2015-11-03 | 1 | -2/+13
  Do not allow ptrace(PT_TRACE_ME) to execute when the process is already traced or when there is no parent which can trace the process.
* MFC r289658: | kib | 2015-11-03 | 1 | -1/+1
  No need to dereference struct proc to pids when comparing processes for equality.
* MFC 287386,288949,288993: | jhb | 2015-10-23 | 1 | -0/+11
  Export current system call code and argument count for system call entry and exit events. To preserve the ABI, the new fields are moved to the end of struct thread in these branches (unlike HEAD) and explicitly copied when new threads are created. In addition, the new tests are only added in 10.
  r287386: Export current system call code and argument count for system call entry and exit events. procfs stop events for system call tracing report these values (argument count for system call entry and code for system call exit), but ptrace() does not provide this information. (Note that while the system call code can be determined in an ABI-specific manner during system call entry, it is not generally available during system call exit.) The values are exported via new fields at the end of struct ptrace_lwpinfo available via PT_LWPINFO.
  r288949: Fix various edge cases related to system call tracing.
  - Always set td_dbg_sc_* when P_TRACED is set on system call entry even if the debugger is not tracing system call entries. This ensures the fields are valid when reporting other stops that occur at system call boundaries such as for PT_FOLLOW_FORKS or when only tracing system call exits.
  - Set TDB_SCX when reporting the stop for a new child process in fork_return(). This causes the event to be reported as a system call exit.
  - Report a system call exit event in fork_return() for new threads in a traced process.
  - Copy td_dbg_sc_* to new threads instead of zeroing. This ensures that td_dbg_sc_code in particular will report the system call that created the new thread or process when it reports a system call exit event in fork_return().
  - Add new ptrace tests to verify that new child processes and threads report system call exit events with a valid pl_syscall_code via PT_LWPINFO.
  r288993: Document the recently added pl_syscall_* fields in struct ptrace_lwpinfo.
* MFC r283924 | vangyzen | 2015-10-02 | 1 | -1/+1
  Provide vnode in memory map info for files on tmpfs
  When providing memory map information to userland, populate the vnode pointer for tmpfs files. Set the memory mapping to appear as a vnode type, to match FreeBSD 9 behavior. This fixes the use of tmpfs files with the dtrace pid provider, procstat -v, procfs, linprocfs, pmc (pmcstat), and ptrace (PT_VM_ENTRY).
  Submitted by: Eric Badger <eric@badgerio.us> (initial revision)
  Obtained from: Dell Inc.
  PR: 198431
* MFC 283281,283282,283562,283647,283836,284000,286158: | jhb | 2015-09-09 | 1 | -2/+9
  Various fixes to orphan handling which also fix issues with following forks.
  283281: Always set p_oppid when attaching to an existing process via procfs tracing. This matches the behavior of ptrace(PT_ATTACH). Also, the procfs detach request assumes p_oppid is always set.
  283282: Only reparent a traced process to its old parent if the tracing process is not the old parent. Otherwise, proc_reap() will leave the zombie in place resulting in the process' status being returned twice to its parent. Add test cases for PT_TRACE_ME and PT_ATTACH which are fixed by this change.
  283562: Do not allow a process to reap an orphan (a child currently being traced by another process such as a debugger). The parent process does need to check for matching orphan pids to avoid returning ECHILD if an orphan has exited, but it should not return the exited status for the child until after the debugger has detached from the orphan process either explicitly or implicitly via wait(). Add two tests for this case: one where the debugger is the direct child (thus the parent has a non-empty children list) and one where the debugger is not a direct child (so the only "child" of the parent is the orphan).
  283647: Tweak the description of when waitpid() doesn't return any status for a non-blocking wait to avoid the word "empty".
  283836: Consistently only use one end of the pipe in the parent and debugger processes and do not rely on EOF due to a close() in the debugger.
  284000: Add a CHILD_REQUIRE macro similar to ATF_REQUIRE for use in child processes of the main test process.
  286158: Clear P_TRACED before reparenting a detached process back to its original parent. Otherwise the debuggee will be set as an orphan of the debugger. Add tests for tracing forks via PT_FOLLOW_FORK.
* MFC r283889,r283891: | delphij | 2015-06-15 | 1 | -0/+1
  Clear p_stops when doing PT_DETACH and PROCFS_CTL_DETACH. Without this, if a process was being traced by truss(1), which uses different p_stops bits than gdb(1), the latter would misbehave because of the unexpected bits.
  Reported by: jceel
  Submitted by: sef
  Sponsored by: iXsystems, Inc.
* MFC 283546: | jhb | 2015-06-13 | 1 | -1/+74
  Add KTR tracing for some MI ptrace events.
* Merge reaper facility. | kib | 2015-01-05 | 1 | -194/+0
  MFC r270443 (by mjg): Properly reparent traced processes when the tracer dies.
  MFC r273452 (by mjg): Plug unnecessary PRS_NEW check in kern_procctl.
  MFC 275800: Add a facility for non-init process to declare itself the reaper of the orphaned descendants.
  MFC r275821: Add missed break.
  MFC r275846 (by mckusick): Add some additional clarification and fix a few grammar nits.
  MFC r275847 (by bdrewery): Bump Dd for r275846.
* MFC 272449: | jhb | 2014-10-17 | 1 | -1/+1
  Require p_cansched() for changing a process' protection status via procctl() rather than p_cansee().
* MFC r269656: | kib | 2014-08-21 | 1 | -9/+1
  Implement and use proc_realparent(9).
  MFC r270024 (by markj): Correct the order of arguments passed to LIST_INSERT_AFTER().
  For merge, the p_treeflag member of struct proc was moved to the end of the structure, to keep KBI intact.
* Extend the support for exempting processes from being killed when swap is exhausted. | jhb | 2013-09-19 | 1 | -0/+195
  - Add a new protect(1) command that can be used to set or revoke protection from arbitrary processes. Similar to ktrace it can apply a change to all existing descendants of a process as well as future descendants.
  - Add a new procctl(2) system call that provides a generic interface for control operations on processes (as opposed to the debugger-specific operations provided by ptrace(2)). procctl(2) uses a combination of idtype_t and an id to identify the set of processes on which to operate similar to wait6().
  - Add a PROC_SPROTECT control operation to manage the protection status of a set of processes. MADV_PROTECT still works for backwards compatibility.
  - Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc) the first bit of which is used to track if P_PROTECT should be inherited by new child processes.
  Reviewed by: kib, jilles (earlier version)
  Approved by: re (delphij)
  MFC after: 1 month
* Revert r253939: | attilio | 2013-08-05 | 1 | -6/+5
  We cannot busy a page before doing pagefaults. In fact, it can deadlock against vnode lock, as it tries to vget(). Other functions, right now, have an opposite lock ordering, like vm_object_sync(), which acquires the vnode lock first and then sleeps on the busy mechanism. Before this patch is reinserted we need to break this ordering.
  Sponsored by: EMC / Isilon storage division
  Reported by: kib
* The page hold mechanism is fast but it has a couple of fallouts: | attilio | 2013-08-04 | 1 | -5/+6
  - It does not let pages respect the LRU policy
  - It bloats the active/inactive queues of few pages
  Try to avoid it as much as possible with the long-term target to completely remove it. Use the soft-busy mechanism to protect page content accesses during short-term operations (like uiomove_fromphys()).
  After this change only vm_fault_quick_hold_pages() is still using the hold mechanism for page content access. There is an additional complexity there as the quick path cannot immediately access the page object to busy the page and the slow path cannot busy more than one page at a time (to avoid deadlocks). Fixing this primitive could lead to complete removal of the page hold mechanism.
  Sponsored by: EMC / Isilon storage division
  Discussed with: alc
  Reviewed by: jeff
  Tested by: pho
* Switch some "low-hanging fruit" to acquire read lock on vmobjects rather than write locks. | attilio | 2013-04-08 | 1 | -5/+5
  Sponsored by: EMC / Isilon storage division
  Reviewed by: alc
  Tested by: pho
* Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to their "write" versions. | attilio | 2013-02-20 | 1 | -5/+5
  Sponsored by: EMC / Isilon storage division
* Switch vm_object lock to be a rwlock. | attilio | 2013-02-20 | 1 | -1/+1
  * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations
  * VM_OBJECT_SLEEP() is introduced as a general purpose primitive to get a sleep operation using a VM_OBJECT_LOCK() as protection
  * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h
* When vforked child is traced, the debugging events are not generated until child performs exec(). | kib | 2013-02-07 | 1 | -0/+2
  The behaviour is reasonable when a debugger is the real parent, because the parent is stopped until exec(), and sending a debugging event to the debugger would deadlock both parent and child.
  On the other hand, when debugger is not the parent of the vforked child, not sending debugging signals makes it impossible to debug across vfork.
  Fix the issue by declining generating debug signals only when vfork() was done and child called ptrace(PT_TRACEME). Set a new process flag P_PPTRACE from the attach code for PT_TRACEME, if P_PPWAIT flag is set, which indicates that the process was created with vfork() and still did not execed. Check P_PPTRACE from issignal(), instead of refusing the trace outright for the P_PPWAIT case. The scope of P_PPTRACE is exactly contained in the scope of P_PPWAIT.
  Found and tested by: zont
  Reviewed by: pluknet
  MFC after: 2 weeks
* Remove the support for using non-mpsafe filesystem modules. | kib | 2012-10-22 | 1 | -3/+1
  In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems.
  The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes.
  Conducted and reviewed by: attilio
  Tested by: pho
* Always initialize pl_event. | kib | 2012-08-08 | 1 | -0/+1
  Submitted by: Andrey Zonov <andrey@zonov.org>
  MFC after: 3 days
* If you have pressed CTRL+Z and a process is suspended, then you use gdb to attach to the process, it is surprising that the process is resumed without inputting any gdb commands. | davidxu | 2012-07-09 | 1 | -4/+4
  However, the ptrace manual says: "The tracing process will see the newly-traced process stop and may then control it as if it had been traced all along."
  The current code does not work this way: unless the traced process receives a signal later, it continues to run as a background task. To fix this problem, send SIGSTOP to the traced process after resuming it; this works as if you were attaching to a running process. It is not perfect but better than nothing.
* Allow the parent to gather the exit status of the children reparented to the debugger. | kib | 2012-02-23 | 1 | -3/+0
  When reparenting for debugging, keep the child in the new orphan list of old parent. When looping over the children in kern_wait(), iterate over both children list and orphan list to search for the process by pid.
  Submitted by: Dmitry Mikulin <dmitrym juniper.net>
  MFC after: 2 weeks
* Mark the automatically attached child with PL_FLAG_CHILD in struct lwpinfo flags, for PT_FOLLOWFORK auto-attachment. | kib | 2012-02-10 | 1 | -0/+2
  In collaboration with: Dmitry Mikulin <dmitrym juniper net>
  MFC after: 1 week
* In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. | kmacy | 2011-09-16 | 1 | -2/+2
  It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
  Reviewed by: rwatson
  Approved by: re (bz)
* Add comment from CSRG rev 7.27 (1992/06/23 19:56:55; author: mckusick) | obrien | 2011-06-17 | 1 | -0/+9
* We should not return ECHILD when debugging a child and the parent does a "wait4(-1, ..., WNOHANG, ...)". | obrien | 2011-06-14 | 1 | -2/+6
  Instead wait(2) should behave as if the child does not wish to report status at this time.
  Reviewed by: jhb
* Add macro to test the sv_flags of any process. | dchagin | 2011-01-26 | 1 | -1/+1
  Change some places to test the flags instead of explicitly comparing with the address of known sysentvec structures.
  MFC after: 1 month
* Allow debugger to specify that children of the traced process should be automatically traced. | kib | 2011-01-25 | 1 | -1/+15
  Extend ptrace(PT_LWPINFO) to report that the child just forked.
  Reviewed by: davidxu, jhb
  MFC after: 2 weeks
* Introduce vm_fault_hold() and use it to (1) eliminate a long-standing race condition in proc_rwmem() and to (2) simplify the implementation of the cxgb driver's vm_fault_hold_user_pages(). | alc | 2010-12-20 | 1 | -63/+17
  Specifically, in proc_rwmem() the requested read or write could fail because the targeted page could be reclaimed between the calls to vm_fault() and vm_page_hold().
  In collaboration with: kib@
  MFC after: 6 weeks
* Add the ability for GDB to print out the thread name along with other thread-specific information. | attilio | 2010-11-22 | 1 | -0/+3
  In order to do that, and in order to avoid KBI breakage with existing infrastructure, the following semantic is implemented:
  - For live programs, a new member to the PT_LWPINFO is added (pl_tdname)
  - For cores, a new ELF note is added (NT_THRMISC) that can be used for storing thread-specific, miscellaneous information. Right now it is just populated with a thread name.
  GDB, then, retrieves the correct information from the corefile via the BFD interface, as it groks the ELF notes and creates appropriate pseudo-sections.
  Sponsored by: Sandvine Incorporated
  Tested by: gianni
  Discussed with: dim, kan, kib
  MFC after: 2 weeks
* Create a global thread hash table to speed up thread lookup, use rwlock to protect the table. | davidxu | 2010-10-09 | 1 | -14/+3
  In old code, thread lookup is done with process lock held, to find a thread, kernel has to iterate through process and thread list, this is quite inefficient. With this change, test shows in extreme case performance is dramatically improved.
  Earlier patch was reviewed by: jhb, julian
* Extend ptrace(PT_LWPINFO) to report siginfo for the signal that caused the debuggee to stop. | kib | 2010-07-04 | 1 | -3/+62
  The change should keep the ABI. Take care of compat32.
  Discussed with: davidxu, jhb
  MFC after: 2 weeks
* Use ISO C99 integer types in sys/kern where possible. | ed | 2010-06-21 | 1 | -3/+3
  There are only about 100 occurrences of the BSD-specific u_int*_t datatypes in sys/kern. The ISO C99 integer types are used here more often.
* Ignore the 'addr' argument passed to PT_STEP (it is required to be '1' for PT_STEP which means "ignore") and PT_DETACH. | jhb | 2010-05-25 | 1 | -14/+20
  PR: kern/146167
  MFC after: 1 week
* Reorganize syscall entry and leave handling. | kib | 2010-05-23 | 1 | -2/+6
  Extend struct sysvec with three new elements:
  - sv_fetch_syscall_args: the method to fetch syscall arguments from usermode into struct syscall_args. The structure is machine-dependent (this might be reconsidered after all architectures are converted).
  - sv_set_syscall_retval: the method to set a return value for usermode from the syscall. It is a generalization of cpu_set_syscall_retval(9) to allow ABIs to override the way to set a return value.
  - sv_syscallnames: the table of syscall names.
  Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding the call to cpu_set_syscall_retval().
  The new functions syscallenter(9) and syscallret(9) are provided that use sv_*syscall* pointers and contain the common repeated code from the syscall() implementations for the architecture-specific syscall trap handlers. Syscallenter() fetches arguments, calls syscall implementation from ABI sysent table, and sets up return frame. The end of syscall bookkeeping is done by syscallret().
  Take advantage of single place for MI syscall handling code and implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the thread is stopped at syscall entry or return point respectively. The EXEC flag augments SCX and notifies debugger that the process address space was changed by one of exec(2)-family syscalls.
  The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are changed to use syscallenter()/syscallret(). MIPS and arm are not converted and use the mostly unchanged syscall() implementation.
  Reviewed by: jhb, marcel, marius, nwhitehorn, stas
  Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc), stas (mips)
  MFC after: 1 month
* On Alan's advice, rather than do a wholesale conversion on a single architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. | kmacy | 2010-04-30 | 1 | -4/+4
  This changes pmap_extract_and_hold on all pmaps.
  Supported by: Bitgravity Inc.
  Discussed with: alc, jeffr, and kib
* Provide groundwork for 32-bit binary compatibility on non-x86 platforms, for upcoming 64-bit PowerPC and MIPS support. | nwhitehorn | 2010-03-11 | 1 | -16/+14
  This renames the COMPAT_IA32 option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts of the kernel and enhances the freebsd32 compatibility code to support big-endian platforms.
  Reviewed by: kib, jhb
* Initialize pve_fsid and pve_fileid to VNOVAL. | marcel | 2010-02-11 | 1 | -0/+3
* o Add support for COMPAT_IA32. | marcel | 2010-02-11 | 1 | -69/+123
  o Incorporate review comments:
    - Properly reference and lock the map
    - Take into account that the VM map can change in between requests
    - Add the fileid and fsid attributes
  Credits: kib@
  Reviewed by: kib@
* Unbreak building kernels with COMPAT_32 enabled. | marcel | 2010-02-09 | 1 | -0/+19
  The actual support for the PT_VM_ENTRY request from 32-bit processes will follow.
  Pointy hat: marcel
* Add PT_VM_TIMESTAMP and PT_VM_ENTRY so that the tracing process can obtain the memory map of the traced process. | marcel | 2010-02-09 | 1 | -0/+103
  PT_VM_TIMESTAMP can be used to check if the memory map changed since the last time to avoid iterating over all the VM entries unnecessarily.
  MFC after: 1 month
* For PT_TO_SCE stop that stops the ptraced process upon syscall entry, syscall arguments are collected before ptracestop() is called. | kib | 2010-01-23 | 1 | -0/+5
  As a consequence, debugger cannot modify syscall or its arguments.
  For i386, amd64 and ia32 on amd64 MD syscall(), reread syscall number and arguments after ptracestop(), if debugger modified anything in the process environment. Since procfs stopevent requires number of syscall arguments in p_xstat, this cannot be solved by moving stop/trace point before argument fetching.
  Move the code to read arguments into separate function fetch_syscall_args() to avoid code duplication. Note that ktrace point for modified syscall is intentionally recorded twice, once with original arguments, and second time with the arguments set by debugger.
  PT_TO_SCX stop is executed after cpu_syscall_set_retval() already.
  Reported by: Ali Polatel <alip exherbo org>
  Briefly discussed with: jhb
  MFC after: 3 weeks
* Replace VM_PROT_OVERRIDE_WRITE by VM_PROT_COPY. | alc | 2009-11-26 | 1 | -9/+12
  VM_PROT_OVERRIDE_WRITE has represented a write access that is allowed to override write protection. Until now, VM_PROT_OVERRIDE_WRITE has been used to write breakpoints into text pages. Text pages are not just write protected but they are also copy-on-write. VM_PROT_OVERRIDE_WRITE overrides the write protection on the text page and triggers the replication of the page so that the breakpoint will be written to a private copy.
  However, here is where things become confused. It is the debugger, not the process being debugged that requires write access to the copied page. Nonetheless, the copied page is being mapped into the process with write access enabled. In other words, once the debugger sets a breakpoint within a text page, the program can write to its private copy of that text page. Whereas prior to setting the breakpoint, a SIGSEGV would have occurred upon a write access.
  VM_PROT_COPY addresses this problem. The combination of VM_PROT_READ and VM_PROT_COPY forces the replication of a copy-on-write page even though the access is only for read. Moreover, the replicated page is only mapped into the process with read access, and not write access.
  Reviewed by: kib
  MFC after: 4 weeks