| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Rewrite subr_sleepqueue.c use of callouts to not depend on the
specifics of callout KPI.
|
|
|
|
| |
Fix some handling of P2_PTRACE_FSTP.
|
|
|
|
|
|
|
|
|
| |
First, PL_FLAG_FORKED events now also set a PL_FLAG_VFORKED flag when
the new child was created via vfork() rather than fork(). Second, a
new PL_FLAG_VFORK_DONE event can now be enabled via the PTRACE_VFORK
event mask. This new stop is reported after the vfork parent resumes
due to the child calling exit or exec. Debuggers can use this stop to
reinsert breakpoints in the vfork parent process before it resumes.
|
|
|
|
| |
Force SIGSTOP to be the first signal reported after the attach.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a mask of optional ptrace() events.
302900:
Add a test for user signal delivery.
This test verifies we get the correct ptrace event details when a signal
is posted to a traced process from userland.
302902:
Add a mask of optional ptrace() events.
ptrace() now stores a mask of optional events in p_ptevents. Currently
this mask is a single integer, but it can be expanded into an array of
integers in the future.
Two new ptrace requests can be used to manipulate the event mask:
PT_GET_EVENT_MASK fetches the current event mask and PT_SET_EVENT_MASK
sets the current event mask.
The current set of events include:
- PTRACE_EXEC: trace calls to execve().
- PTRACE_SCE: trace system call entries.
- PTRACE_SCX: trace syscam call exits.
- PTRACE_FORK: trace forks and auto-attach to new child processes.
- PTRACE_LWP: trace LWP events.
The S_PT_SCX and S_PT_SCE events in the procfs p_stops flags have
been replaced by PTRACE_SCE and PTRACE_SCX. PTRACE_FORK replaces
P_FOLLOW_FORK and PTRACE_LWP replaces P2_LWP_EVENTS.
The PT_FOLLOW_FORK and PT_LWP_EVENTS ptrace requests remain for
compatibility but now simply toggle corresponding flags in the
event mask.
While here, document that PT_SYSCALL, PT_TO_SCE, and PT_TO_SCX both
modify the event mask and continue the traced process.
302921:
Rename PTRACE_SYSCALL to LINUX_PTRACE_SYSCALL.
303461:
Note that not all optional ptrace events use SIGTRAP.
New child processes attached due to PTRACE_FORK use SIGSTOP instead of
SIGTRAP. All other ptrace events use SIGTRAP.
304009:
Remove description of P_FOLLOWFORK as this flag was removed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
292894:
Add ptrace(2) reporting for LWP events.
Add two new LWPINFO flags: PL_FLAG_BORN and PL_FLAG_EXITED for reporting
thread creation and destruction. Newly created threads will stop to report
PL_FLAG_BORN before returning to userland and exiting threads will stop to
report PL_FLAG_EXIT before exiting completely. Both of these events are
only enabled and reported if PT_LWP_EVENTS is enabled on a process.
292896:
Document the recently added support for ptrace(2) LWP events.
|
|
|
|
|
|
| |
cred: add proc_set_cred_init helper
PR: D7431
|
|
|
|
|
|
|
|
|
| |
r280330:
fork: assign refed credentials earlier
r282567:
Fix up panics when fork fails due to hitting proc limit
PR: D7431
|
|
|
|
| |
Remove mention of Giant from the fork_return() description.
|
|
|
|
|
|
| |
Fix style issues around existing SDT probes.
** Changes to sys/netinet/in_kdtrace.c and sys/netinet/in_kdtrace.h skipped.
|
|
|
|
| |
sys/kern: spelling fixes in comments.
|
|
|
|
| |
cred: add proc_set_cred helper
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
r297203,r297256:
r297156:
Track filemon usage via a proc.p_filemon pointer rather than its own lists.
r297157:
Stop tracking stat(2).
r297158:
Consolidate open(2) and openat(2) code.
r297159:
Use curthread for vn_fullpath.
r297161:
Attempt to use the namecache for openat(2) path resolution.
r297172:
Consolidate common link(2) logic.
r297200:
Follow-up r297156: Close the log in filemon_dtr rather than in the last
reference.
r297201:
Return any log write failure encountered when closing the filemon fd.
r297202:
Remove unused done argument to copyinstr(9).
r297203:
Handle copyin failures.
r297256:
Remove unneeded return left from refactoring.
Relnotes: yes (filemon stability/performance updates)
Sponsored by: EMC / Isilon Storage Division
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix hangs or panics when misbehaved kernel threads return from their
main function.
295418:
Mark proc0 as a kernel process via the P_KTHREAD flag.
All other kernel processes have this flag set and all threads in proc0
(including thread0) have the similar TDP_KTHREAD flag set.
295419:
Call kthread_exit() rather than kproc_exit() for a premature kthread exit.
Kernel threads (and processes) are supposed to call kthread_exit() (or
kproc_exit()) to terminate. However, the kernel includes a fallback in
fork_exit() to force a kthread exit if a kernel thread's "main" routine
returns. This fallback was added back when the kernel only had processes
and was not updated to call kthread_exit() instead of kproc_exit() when
threads were added to the kernel.
This mistake was particularly exciting when the errant thread belonged to
proc0. Due to the missing P_KTHREAD flag the fallback did not kick in
and instead tried to return to userland via whatever garbage was in the
trapframe. With P_KTHREAD set it tried to terminate proc0 resulting in
other amusements.
PR: 204999
Approved by: re (glebius)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Export current system call code and argument count for system call entry
and exit events. To preserve the ABI, the new fields are moved to the
end of struct thread in these branches (unlike HEAD) and explicitly copied
when new threads are created. In addition, the new tests are only added
in 10.
r287386:
Export current system call code and argument count for system call entry
and exit events. procfs stop events for system call tracing report these
values (argument count for system call entry and code for system call exit),
but ptrace() does not provide this information. (Note that while the system
call code can be determined in an ABI-specific manner during system call
entry, it is not generally available during system call exit.)
The values are exported via new fields at the end of struct ptrace_lwpinfo
available via PT_LWPINFO.
r288949:
Fix various edge cases related to system call tracing.
- Always set td_dbg_sc_* when P_TRACED is set on system call entry
even if the debugger is not tracing system call entries. This
ensures the fields are valid when reporting other stops that
occur at system call boundaries such as for PT_FOLLOW_FORKS or
when only tracing system call exits.
- Set TDB_SCX when reporting the stop for a new child process in
fork_return(). This causes the event to be reported as a system
call exit.
- Report a system call exit event in fork_return() for new threads in
a traced process.
- Copy td_dbg_sc_* to new threads instead of zeroing. This ensures
that td_dbg_sc_code in particular will report the system call that
created the new thread or process when it reports a system call
exit event in fork_return().
- Add new ptrace tests to verify that new child processes and threads
report system call exit events with a valid pl_syscall_code via
PT_LWPINFO.
r288993:
Document the recently added pl_syscall_* fields in struct ptrace_lwpinfo.
|
|
|
|
|
|
| |
Enforce the maxproc limitation before allocating struct proc.
In collaboration with: pho
|
|
|
|
| |
Add KTR tracing for some MI ptrace events.
|
| |
|
|
|
|
|
| |
Add procctl(2) PROC_TRACE_CTL command to enable or disable debugger
attachment to the process.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MFC r270443 (by mjg):
Properly reparent traced processes when the tracer dies.
MFC r273452 (by mjg):
Plug unnecessary PRS_NEW check in kern_procctl.
MFC 275800:
Add a facility for non-init process to declare itself the reaper of
the orphaned descendants.
MFC r275821:
Add missed break.
MFC r275846 (by mckusick):
Add some additional clarification and fix a few gammer nits.
MFC r275847 (by bdrewery):
Bump Dd for r275846.
|
|
|
|
|
|
|
|
|
|
| |
Add facility to stop all userspace processes.
MFC r275753:
Fix gcc build.
MFC r275820:
Add missed break.
|
| |
|
|
|
|
|
|
|
| |
Make fdunshare accept only td parameter.
Proc had to match the thread anyway and 2 parameters were inconsistent
with the rest.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fasttrap fork handler is responsible for removing tracepoints in the
child process that were inherited from its parent. However, this should
not be done in the case of a vfork, since the fork handler ends up removing
the tracepoints from the shared vm space, and userland DTrace probes in the
parent will no longer fire as a result.
Now the child of a vfork may trigger userland DTrace probes enabled in its
parent, so modify the fasttrap probe handler to handle this case and handle
the child process in the same way that it would handle the traced process.
In particular, if once traces function foo() in a process that vforks, and
the child calls foo(), fasttrap will treat this call as having come from the
parent. This is the behaviour of the upstream code.
While here, add #ifdef guards to some code that isn't present upstream.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
exhausted.
- Add a new protect(1) command that can be used to set or revoke protection
from arbitrary processes. Similar to ktrace it can apply a change to all
existing descendants of a process as well as future descendants.
- Add a new procctl(2) system call that provides a generic interface for
control operations on processes (as opposed to the debugger-specific
operations provided by ptrace(2)). procctl(2) uses a combination of
idtype_t and an id to identify the set of processes on which to operate
similar to wait6().
- Add a PROC_SPROTECT control operation to manage the protection status
of a set of processes. MADV_PROTECT still works for backwards
compatability.
- Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc)
the first bit of which is used to track if P_PROTECT should be inherited
by new child processes.
Reviewed by: kib, jilles (earlier version)
Approved by: re (delphij)
MFC after: 1 month
|
|
|
|
|
|
|
|
|
| |
using SDT_PROBE_ARGTYPE(). This will make it easy to extend the SDT(9) API
to allow probes with dynamically-translated types.
There is no functional change.
MFC after: 2 weeks
|
|
|
|
|
|
|
| |
is exceeded. Improve formatting of the message while here.
PR: kern/60550
Submitted by: Lowell Gilbert, bde
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Capability is no longer separate descriptor type. Now every descriptor
has set of its own capability rights.
- The cap_new(2) system call is left, but it is no longer documented and
should not be used in new code.
- The new syscall cap_rights_limit(2) should be used instead of
cap_new(2), which limits capability rights of the given descriptor
without creating a new one.
- The cap_getrights(2) syscall is renamed to cap_rights_get(2).
- If CAP_IOCTL capability right is present we can further reduce allowed
ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed
ioctls can be retrived with cap_ioctls_get(2) syscall.
- If CAP_FCNTL capability right is present we can further reduce fcntls
that can be used with the new cap_fcntls_limit(2) syscall and retrive
them with cap_fcntls_get(2).
- To support ioctl and fcntl white-listing the filedesc structure was
heavly modified.
- The audit subsystem, kdump and procstat tools were updated to
recognize new syscalls.
- Capability rights were revised and eventhough I tried hard to provide
backward API and ABI compatibility there are some incompatible changes
that are described in detail below:
CAP_CREATE old behaviour:
- Allow for openat(2)+O_CREAT.
- Allow for linkat(2).
- Allow for symlinkat(2).
CAP_CREATE new behaviour:
- Allow for openat(2)+O_CREAT.
Added CAP_LINKAT:
- Allow for linkat(2). ABI: Reuses CAP_RMDIR bit.
- Allow to be target for renameat(2).
Added CAP_SYMLINKAT:
- Allow for symlinkat(2).
Removed CAP_DELETE. Old behaviour:
- Allow for unlinkat(2) when removing non-directory object.
- Allow to be source for renameat(2).
Removed CAP_RMDIR. Old behaviour:
- Allow for unlinkat(2) when removing directory.
Added CAP_RENAMEAT:
- Required for source directory for the renameat(2) syscall.
Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR):
- Allow for unlinkat(2) on any object.
- Required if target of renameat(2) exists and will be removed by this
call.
Removed CAP_MAPEXEC.
CAP_MMAP old behaviour:
- Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and
PROT_WRITE.
CAP_MMAP new behaviour:
- Allow for mmap(2)+PROT_NONE.
Added CAP_MMAP_R:
- Allow for mmap(PROT_READ).
Added CAP_MMAP_W:
- Allow for mmap(PROT_WRITE).
Added CAP_MMAP_X:
- Allow for mmap(PROT_EXEC).
Added CAP_MMAP_RW:
- Allow for mmap(PROT_READ | PROT_WRITE).
Added CAP_MMAP_RX:
- Allow for mmap(PROT_READ | PROT_EXEC).
Added CAP_MMAP_WX:
- Allow for mmap(PROT_WRITE | PROT_EXEC).
Added CAP_MMAP_RWX:
- Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC).
Renamed CAP_MKDIR to CAP_MKDIRAT.
Renamed CAP_MKFIFO to CAP_MKFIFOAT.
Renamed CAP_MKNODE to CAP_MKNODEAT.
CAP_READ old behaviour:
- Allow pread(2).
- Disallow read(2), readv(2) (if there is no CAP_SEEK).
CAP_READ new behaviour:
- Allow read(2), readv(2).
- Disallow pread(2) (CAP_SEEK was also required).
CAP_WRITE old behaviour:
- Allow pwrite(2).
- Disallow write(2), writev(2) (if there is no CAP_SEEK).
CAP_WRITE new behaviour:
- Allow write(2), writev(2).
- Disallow pwrite(2) (CAP_SEEK was also required).
Added convinient defines:
#define CAP_PREAD (CAP_SEEK | CAP_READ)
#define CAP_PWRITE (CAP_SEEK | CAP_WRITE)
#define CAP_MMAP_R (CAP_MMAP | CAP_SEEK | CAP_READ)
#define CAP_MMAP_W (CAP_MMAP | CAP_SEEK | CAP_WRITE)
#define CAP_MMAP_X (CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL)
#define CAP_MMAP_RW (CAP_MMAP_R | CAP_MMAP_W)
#define CAP_MMAP_RX (CAP_MMAP_R | CAP_MMAP_X)
#define CAP_MMAP_WX (CAP_MMAP_W | CAP_MMAP_X)
#define CAP_MMAP_RWX (CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X)
#define CAP_RECV CAP_READ
#define CAP_SEND CAP_WRITE
#define CAP_SOCK_CLIENT \
(CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \
CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN)
#define CAP_SOCK_SERVER \
(CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \
CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \
CAP_SETSOCKOPT | CAP_SHUTDOWN)
Added defines for backward API compatibility:
#define CAP_MAPEXEC CAP_MMAP_X
#define CAP_DELETE CAP_UNLINKAT
#define CAP_MKDIR CAP_MKDIRAT
#define CAP_RMDIR CAP_UNLINKAT
#define CAP_MKFIFO CAP_MKFIFOAT
#define CAP_MKNOD CAP_MKNODAT
#define CAP_SOCK_ALL (CAP_SOCK_CLIENT | CAP_SOCK_SERVER)
Sponsored by: The FreeBSD Foundation
Reviewed by: Christoph Mallon <christoph.mallon@gmx.de>
Many aspects discussed with: rwatson, benl, jonathan
ABI compatibility discussed with: kib
|
| |
|
|
|
|
|
|
|
| |
behaviour to differ from the documented, only on XEN. If there are
any issues with XEN pmap left, they should be fixed in pmap.
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
callout is started before kern_setitimer() acquires process mutex, but
looses a race and kern_setitimer() gets the process mutex before the
callout. Then, assuming that new specified struct itimerval has
it_interval zero, but it_value non-zero, the callout, after it starts
executing again, clears p->p_realtimer.it_value, but kern_setitimer()
already rescheduled the callout.
As the result of the race, both p_realtimer is zero, and the callout
is rescheduled. Then, in the exit1(), the exit code sees that it_value
is zero and does not even try to stop the callout. This allows the
struct proc to be reused and eventually the armed callout is
re-initialized. The consequence is the corrupted callwheel tailq.
Use process mutex to interlock the callout start, which fixes the race.
Reported and tested by: pho
Reviewed by: jhb
MFC after: 2 weeks
|
|
|
|
|
|
|
| |
there is no need to check if Giant is acquired after it.
Reviewed by: kib
MFC after: 1 week
|
|
|
|
|
|
|
| |
allowed to allocate, and corresponding tunable with the same
name. Note that existing processes with higher pids are left intact.
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
| |
On success we have to drop one after procdesc_finit() and on failure
we have to close allocated slot with fdclose(), which also drops one
reference for us and drop the remaining reference with fdrop().
Without this change closing process descriptor didn't result in killing
pdfork(2)ed child.
Reviewed by: rwatson
MFC after: 1 month
|
|
|
|
|
|
|
|
|
| |
creation. Move it into the copied region of the struct thread.
Update some comments.
Requested by: bde
X-MFC after: never
|
|
|
|
|
|
|
|
| |
situations, due to fork1() calling racct_proc_exit() without calling
racct_proc_fork() first.
Submitted by: Mateusz Guzik <mjguzik at gmail dot com> (earlier version)
Reviewed by: Mateusz Guzik <mjguzik at gmail dot com>
|
|
|
|
|
|
|
|
|
|
| |
not get syscall exit notification until the child performed exec of
exit. Swap the order of doing ptracestop() and waiting for P_PPWAIT
clearing, by postponing the wait into syscallret after ptracestop()
notification is done.
Reported, tested and reviewed by: Dmitry Mikulin <dmitrym juniper net>
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
| |
to the debugger. When reparenting for debugging, keep the child in
the new orphan list of old parent. When looping over the children in
kern_wait(), iterate over both children list and orphan list to search
for the process by pid.
Submitted by: Dmitry Mikulin <dmitrym juniper.net>
MFC after: 2 weeks
|
|
|
|
|
|
|
| |
lwpinfo flags, for PT_FOLLOWFORK auto-attachment.
In collaboration with: Dmitry Mikulin <dmitrym juniper net>
MFC after: 1 week
|
|
|
|
|
|
| |
and it's more logical this way.
MFC after: 3 days
|
|
|
|
|
|
| |
fields in 'struct proc' before they got initialized in do_fork().
MFC after: 3 days
|
|
|
|
|
|
| |
freed properly if resource limit got exceeded.
Approved by: re (kib)
|
|
|
|
|
|
|
|
|
|
| |
we were accounting the newly created process to its parent instead
of the child itself. This caused problems later, when the child
changed its credentials - the per-uid, per-jail etc counters were
not properly updated, because the maxproc counter in the child
process was 0.
Approved by: re (kib)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.
Reviewed by: rwatson
Approved by: re (bz)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A "process descriptor" file descriptor is used to manage processes
without using the PID namespace. This is required for Capsicum's
Capability Mode, where the PID namespace is unavailable.
New system calls pdfork(2) and pdkill(2) offer the functional equivalents
of fork(2) and kill(2). pdgetpid(2) allows querying the PID of the remote
process for debugging purposes. The currently-unimplemented pdwait(2) will,
in the future, allow querying rusage/exit status. In the interim, poll(2)
may be used to check (and wait for) process termination.
When a process is referenced by a process descriptor, it does not issue
SIGCHLD to the parent, making it suitable for use in libraries---a common
scenario when using library compartmentalisation from within large
applications (such as web browsers). Some observers may note a similarity
to Mach task ports; process descriptors provide a subset of this behaviour,
but in a UNIX style.
This feature is enabled by "options PROCDESC", but as with several other
Capsicum kernel features, is not enabled by default in GENERIC 9.0.
Reviewed by: jhb, kib
Approved by: re (kib), mentor (rwatson)
Sponsored by: Google Inc
|
|
|
|
|
|
|
|
| |
delivered to parent when the child exists.
Submitted by: Petr Salinger <Petr.Salinger seznam cz> (Debian/kFreeBSD)
MFC after: 1 week
X-MFC-note: bump __FreeBSD_version
|
|
|
|
|
|
| |
won't happen before 9.0. This commit adds "#ifdef RACCT" around all the
"PROC_LOCK(p); racct_whatever(p, ...); PROC_UNLOCK(p)" instances, in order
to avoid useless locking/unlocking in kernels built without "options RACCT".
|
|
|
|
|
| |
Sponsored by: The FreeBSD Foundation
Reviewed by: kib (earlier version)
|