summaryrefslogtreecommitdiffstats
path: root/sys/kern/kern_exit.c
Commit message (Collapse)AuthorAgeFilesLines
* Protect the p->p_pgrp dereference with the process lock.kib2013-01-061-0/+2
| | | | MFC after: 3 days
* Restore the proper handling of the pid 0 for waitpid(2).kib2012-11-161-4/+9
| | | | | | | Fix the style around. Reported and reviewed by: bde (previous version) MFC after: 28 days
* Style fixes for r242958.kib2012-11-161-8/+6
| | | | | Reported and reviewed by: bde MFC after: 28 days
* Add the wait6(2) system call. It takes POSIX waitid()-like processkib2012-11-131-37/+268
| | | | | | | | | | | | | | | | | | | | | designator to select a process which is waited for. The system call optionally returns siginfo_t which would be otherwise provided to SIGCHLD handler, as well as extended structure accounting for child and cumulative grandchild resource usage. Allow to get the current rusage information for non-exited processes as well, similar to Solaris. The explicit WEXITED flag is required to wait for exited processes, allowing for more fine-grained control of the events the waiter is interested in. Fix the handling of siginfo for WNOWAIT option for all wait*(2) family, by not removing the queued signal state. PR: standards/170346 Submitted by: "Jukka A. Ukkonen" <jau@iki.fi> MFC after: 1 month
* Remove the support for using non-mpsafe filesystem modules.kib2012-10-221-3/+0
| | | | | | | | | | | | In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho
* Ignore stop and continue signals sent to an exiting process. Stop signalsjhb2012-09-131-0/+8
| | | | | | | | | | | set p_xstat to the signal that triggered the stop, but p_xstat is also used to hold the exit status of an exiting process. Without this change, a stop signal that arrived after a process was marked P_WEXIT but before it was marked a zombie would overwrite the exit status with the stop signal number. Reviewed by: kib MFC after: 1 week
* A few whitespace and comment fixes.jhb2012-09-071-3/+3
|
* When process exists, not only the children shall be reparented tokib2012-04-021-0/+16
| | | | | | | | init, but also the orphans shall be removed from the orphan list, because the list header is destroyed. Reported and tested by: pho MFC after: 3 days
* Add helper function to remove the process from the orphans list andkib2012-04-021-8/+14
| | | | | | | use it instead of inlined code. Tested by: pho MFC after: 3 days
* Add an assert for proctree_lock to proc_to_reap().jh2012-03-141-0/+2
| | | | | Discussed with: kib MFC after: 1 week
* Lock the process around manipulations with p_flag.kib2012-03-131-0/+2
| | | | | Reported and reviewed by: jh MFC after: 3 days
* Restore the return statement erronously removed in the r232048.kib2012-02-241-0/+1
| | | | | | Submitted by: cognet Pointy hat to: kib (reuse the one I already got today) MFC after: 13 days
* Allow the parent to gather the exit status of the children reparentedkib2012-02-231-35/+86
| | | | | | | | | | to the debugger. When reparenting for debugging, keep the child in the new orphan list of old parent. When looping over the children in kern_wait(), iterate over both children list and orphan list to search for the process by pid. Submitted by: Dmitry Mikulin <dmitrym juniper.net> MFC after: 2 weeks
* Fix long-standing thinko regarding maxproc accounting. Basically,trasz2011-09-171-16/+4
| | | | | | | | | | we were accounting the newly created process to its parent instead of the child itself. This caused problems later, when the child changed its credentials - the per-uid, per-jail etc counters were not properly updated, because the maxproc counter in the child process was 0. Approved by: re (kib)
* In order to maximize the re-usability of kernel code in user space thiskmacy2011-09-161-7/+7
| | | | | | | | | | | | | patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)
* Add experimental support for process descriptorsjonathan2011-08-181-30/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | A "process descriptor" file descriptor is used to manage processes without using the PID namespace. This is required for Capsicum's Capability Mode, where the PID namespace is unavailable. New system calls pdfork(2) and pdkill(2) offer the functional equivalents of fork(2) and kill(2). pdgetpid(2) allows querying the PID of the remote process for debugging purposes. The currently-unimplemented pdwait(2) will, in the future, allow querying rusage/exit status. In the interim, poll(2) may be used to check (and wait for) process termination. When a process is referenced by a process descriptor, it does not issue SIGCHLD to the parent, making it suitable for use in libraries---a common scenario when using library compartmentalisation from within large applications (such as web browsers). Some observers may note a similarity to Mach task ports; process descriptors provide a subset of this behaviour, but in a UNIX style. This feature is enabled by "options PROCDESC", but as with several other Capsicum kernel features, is not enabled by default in GENERIC 9.0. Reviewed by: jhb, kib Approved by: re (kib), mentor (rwatson) Sponsored by: Google Inc
* All the racct_*() calls need to happen with the proc locked. Fixing thistrasz2011-07-061-0/+6
| | | | | | won't happen before 9.0. This commit adds "#ifdef RACCT" around all the "PROC_LOCK(p); racct_whatever(p, ...); PROC_UNLOCK(p)" instances, in order to avoid useless locking/unlocking in kernels built without "options RACCT".
* We should not return ECHILD when debugging a child and the parent does aobrien2011-06-141-3/+8
| | | | | | | "wait4(-1, ..., WNOHANG, ...)". Instead wait(2) should behave as if the child does not wish to report status at this time. Reviewed by: jhb
* Remove stale M_ZOMBIE malloc type.pluknet2011-04-141-3/+0
| | | | | | This type is unused since embedding p_ru into struct proc. MFC after: 1 week
* Some callers of proc_reparent() already have the parent process locked.kib2011-04-101-2/+6
| | | | | | Detect the situation and avoid process lock recursion. Reported by: Fabian Keil <freebsd-listen fabiankeil de>
* Enable accounting for RACCT_NPROC and RACCT_NTHR.trasz2011-03-311-0/+8
| | | | | Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)
* Add racct. It's an API to keep per-process, per-jail, per-loginclasstrasz2011-03-291-0/+6
| | | | | | | | | and per-loginclass resource accounting information, to be used by the new resource limits code. It's connected to the build, but the code that actually calls the new functions will come later. Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)
* By using the 32-bit Linux version of Sun's Java Development Kit 1.6netchild2010-11-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | on FreeBSD (amd64), invocations of "javac" (or "java") eventually end with the output of "Killed" and exit code 137. This is caused by: 1. After calling exec() in multithreaded linux program threads are not destroyed and continue running. They get killed after program being executed finishes. 2. linux_exit_group doesn't return correct exit code when called not from group leader. Which happens regularly using sun jvm. The submitters fix this in a similar way to how NetBSD handles this. I took the PRs away from dchagin, who seems to be out of touch of this since a while (no response from him). The patches committed here are from [2], with some little modifications from me to the style. PR: 141439 [1], 144194 [2] Submitted by: Stefan Schmidt <stefan.schmidt@stadtbuch.de>, gk Reviewed by: rdivacky (in april 2010) MFC after: 5 days
* - When disabling ktracing on a process, free any pending requests thatjhb2010-10-211-31/+1
| | | | | | | | | | | | | | | may be left. This fixes a memory leak that can occur when tracing is disabled on a process via disabling tracing of a specific file (or if an I/O error occurs with the tracefile) if the process's next system call is exit(). The trace disabling code clears p_traceflag, so exit1() doesn't do any KTRACE-related cleanup leading to the leak. I chose to make the free'ing of pending records synchronous rather than patching exit1(). - Move KTRACE-specific logic out of kern_(exec|exit|fork).c and into kern_ktrace.c instead. Make ktrace_mtx private to kern_ktrace.c as a result. MFC after: 1 month
* Create a global thread hash table to speed up thread lookup, usedavidxu2010-10-091-0/+2
| | | | | | | | | | rwlock to protect the table. In old code, thread lookup is done with process lock held, to find a thread, kernel has to iterate through process and thread list, this is quite inefficient. With this change, test shows in extreme case performance is dramatically improved. Earlier patch was reviewed by: jhb, julian
* Add an extra comment to the SDT probes definition. This allows us to getrpaulo2010-08-221-1/+1
| | | | | | | | | use '-' in probe names, matching the probe names in Solaris.[1] Add userland SDT probes definitions to sys/sdt.h. Sponsored by: The FreeBSD Foundation Discussed with: rwaston [1]
* Tweak the in-kernel API for sending signals to threads:jhb2010-06-291-1/+1
| | | | | | | | | | - Rename tdsignal() to tdsendsignal() and make it private to kern_sig.c. - Add tdsignal() and tdksignal() routines that mirror psignal() and pksignal() except that they accept a thread as an argument instead of a process. They send a signal to a specific thread rather than to an individual process. Reviewed by: kib
* Let access overriding to TTYs depend on the cdev_priv, not the vnode.ed2009-12-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Basically this commit changes two things, which improves access to TTYs in exceptional conditions. Basically the problem was that when you ran jexec(8) to attach to a jail, you couldn't use /dev/tty (well, also the node of the actual TTY, e.g. /dev/pts/X). This is very inconvenient if you want to attach to screens quickly, use ssh(1), etc. The fixes: - Cache the cdev_priv of the controlling TTY in struct session. Change devfs_access() to compare against the cdev_priv instead of the vnode. This allows you to bypass UNIX permissions, even across different mounts of devfs. - Extend devfs_prison_check() to unconditionally expose the device node of the controlling TTY, even if normal prison nesting rules normally don't allow this. This actually allows you to interact with this device node. To be honest, I'm not really happy with this solution. We now have to store three pointers to a controlling TTY (s_ttyp, s_ttyvp, s_ttydp). In an ideal world, we should just get rid of the latter two and only use s_ttyp, but this makes certian pieces of code very impractical (e.g. devfs, kern_exit.c). Reported by: Many people
* Refine r195509, instead of checking that vnode type is VBAD, that iskib2009-10-101-3/+3
| | | | | | | | set quite late in the revocation path, properly verify that vnode is not doomed before calling VOP. Reported and tested by: Harald Schmalzbauer <h.schmalzbauer omnilan de> MFC after: 3 days
* Add a temporary workaround which just lets init die instead ofmarius2009-08-261-1/+6
| | | | | | | | | causing a panic if it is killed due to a unsolved stack overflow seen very late during shutdown on sparc64 when the gmirror worker process exists, which is a regression introduced in 8.0. Reviewed by: kib MFC after: 3 days
* Remove the interim vimage containers, struct vimage and struct procg,jamie2009-07-171-5/+0
| | | | | | and the ioctl-based interface that supported them. Approved by: re (kib), bz (mentor)
* The control terminal revocation at the session leader exit does notkib2009-07-091-3/+4
| | | | | | | | | | correctly checks for reclaimed vnode, possibly calling VOP_REVOKE for such vnode. If the terminal is already revoked, or devfs mount was forcibly unmounted, the revocation of doomed ctty vnode causes panic. Reported and tested by: lstewart Approved by: re (kensmith) MFC after: 2 weeks
* udit the 'options' argument to wait4(2).rwatson2009-07-011-0/+1
| | | | | Approved by: re (kib) MFC after: 3 days
* Replace AUDIT_ARG() with variable argument macros with a set more morerwatson2009-06-271-2/+2
| | | | | | | | | | | | | | specific macros for each audit argument type. This makes it easier to follow call-graphs, especially for automated analysis tools (such as fxr). In MFC, we should leave the existing AUDIT_ARG() macros as they may be used by third-party kernel modules. Suggested by: brooks Approved by: re (kib) Obtained from: TrustedBSD Project MFC after: 1 week
* Perform some more cleanups to in-kernel session handling.ed2009-06-151-38/+33
| | | | | | | | | | | | | | | | | | | | | The code that was in place in exit1() was mainly based on code from the old TTY layer. The main reason behind this, was because at one moment I ran a system that had two TTY layers in place at the same time. It is now sufficient to do the following: - Remove references from the session structure to the TTY vnode and the session leader. - If we have a controlling TTY and the session used by the TTY is equal to our session, send the SIGHUP. - If we have a vnode to the controlling TTY which has not been revoked, revoke it. While there, change sys/kern/tty.c to use s_ttyp in the comparison instead of s_ttyvp. It should not make any difference, because s_ttyvp can only become null when the session leader already left, but it's nicer to compare against the proper value.
* Make tcsetsid(3) work on revoked TTYs.ed2009-06-151-1/+1
| | | | | | | | | | | Right now the only way to make tcsetsid(3)/TIOCSCTTY work, is by ensuring the session leader is dead. This means that an application that catches SIGHUPs and performs a sleep prevents us from assigning a new session leader. Change the code to make it work on revoked TTYs as well. This allows us to change init(8) to make the shutdown script run in a more clean environment.
* Move zombie-reaping code out of kern_wait() and into its own function,rwatson2009-06-081-106/+121
| | | | | | | proc_reap(). Reviewed by: jhb MFC after: 3 days Sponsored by: Google, Inc.
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERICrwatson2009-06-051-1/+0
| | | | | | | | and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd
* Add hierarchical jails. A jail may further virtualize its environmentjamie2009-05-271-3/+2
| | | | | | | | | | | | | | | | | | | | | | by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)
* Introduce a new virtualization container, provisionally named vprocg, to holdzec2009-05-081-0/+5
| | | | | | | | | | | | | | | | | | | | | | virtualized instances of hostname and domainname, as well as a new top-level virtualization struct vimage, which holds pointers to struct vnet and struct vprocg. Struct vprocg is likely to become replaced in the near future with a new jail management API import. As a consequence of this change, change struct ucred to point to a struct vimage, instead of directly pointing to a vnet. Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage branch. Permit kldload / kldunload operations to be executed only from the default vimage context. This change should have no functional impact on nooptions VIMAGE kernel builds. Reviewed by: bz Approved by: julian (mentor)
* Fix typo.kib2009-04-201-1/+1
| | | | | Noted by: jhb MFC after: 2 weeks
* On the exit of the child process which parent either set SA_NOCLDWAITkib2009-04-201-4/+4
| | | | | | | | | | | | | | | or ignored SIGCHLD, unconditionally wake up the parent instead of doing this only when the child is a last child. This brings us in line with other U**xes that support SA_NOCLDWAIT. If the parent called waitpid(childpid), then exit of the child should wake up the parent immediately instead of forcing it to wait for all children to exit. Reported by: Alan Ferrency <alan pair com> Submitted by: Jilles Tjoelker <jilles stack nl> PR: 108390 MFC after: 2 weeks
* Remove even more unneeded variable assignments.ed2009-02-261-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kern_time.c: - Unused variable `p'. kern_thr.c: - Variable `error' is always caught immediately, so no reason to initialize it. There is no way that error != 0 at the end of create_thread(). kern_sig.c: - Unused variable `code'. kern_synch.c: - `rval' is always assigned in all different cases. kern_rwlock.c: - `v' is always overwritten with RW_UNLOCKED further on. kern_malloc.c: - `size' is always initialized with the proper value before being used. kern_exit.c: - `error' is always caught and returned immediately. abort2() never returns a non-zero value. kern_exec.c: - `len' is always assigned inside the if-statement right below it. tty_info.c: - `td' is always overwritten by FOREACH_THREAD_IN_PROC(). Found by: LLVM's scan-build
* Several threads in a process may do vfork() simultaneously. Then, allkib2008-12-051-0/+2
| | | | | | | | | | | | | | | | | | | parent threads sleep on the parent' struct proc until corresponding child releases the vmspace. Each sleep is interlocked with proc mutex of the child, that triggers assertion in the sleepq_add(). The assertion requires that at any time, all simultaneous sleepers for the channel use the same interlock. Silent the assertion by using conditional variable allocated in the child. Broadcast the variable event on exec() and exit(). Since struct proc * sleep wait channel is overloaded for several unrelated events, I was unable to remove wakeups from the places where cv_broadcast() is added, except exec(). Reported and tested by: ganbold Suggested and reviewed by: jhb MFC after: 2 week
* MFp4:bz2008-11-291-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bring in updated jail support from bz_jail branch. This enhances the current jail implementation to permit multiple addresses per jail. In addtion to IPv4, IPv6 is supported as well. Due to updated checks it is even possible to have jails without an IP address at all, which basically gives one a chroot with restricted process view, no networking,.. SCTP support was updated and supports IPv6 in jails as well. Cpuset support permits jails to be bound to specific processor sets after creation. Jails can have an unrestricted (no duplicate protection, etc.) name in addition to the hostname. The jail name cannot be changed from within a jail and is considered to be used for management purposes or as audit-token in the future. DDB 'show jails' command was added to aid debugging. Proper compat support permits 32bit jail binaries to be used on 64bit systems to manage jails. Also backward compatibility was preserved where possible: for jail v1 syscalls, as well as with user space management utilities. Both jail as well as prison version were updated for the new features. A gap was intentionally left as the intermediate versions had been used by various patches floating around the last years. Bump __FreeBSD_version for the afore mentioned and in kernel changes. Special thanks to: - Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches and Olivier Houchard (cognet) for initial single-IPv6 patches. - Jeff Roberson (jeff) and Randall Stewart (rrs) for their help, ideas and review on cpuset and SCTP support. - Robert Watson (rwatson) for lots and lots of help, discussions, suggestions and review of most of the patch at various stages. - John Baldwin (jhb) for his help. - Simon L. Nielsen (simon) as early adopter testing changes on cluster machines as well as all the testers and people who provided feedback the last months on freebsd-jail and other channels. - My employer, CK Software GmbH, for the support so I could work on this. Reviewed by: (see above) MFC after: 3 months (this is just so that I get the mail) X-MFC Before: 7.2-RELEASE if possible
* Move per-thread userland debugging flags into seperated field,davidxu2008-10-151-0/+4
| | | | | | this eliminates some problems of locking, e.g, a thread lock is needed but can not be used at that time. Only the process lock is needed now for new field.
* Don't remove queued SIGCHLD if options contain WNOWAIT, so otherdavidxu2008-08-291-6/+6
| | | | threads still can be notified by the signal.
* Implement WNOWAIT flag for wait4(2). It specifies that process whose statuskib2008-08-261-2/+14
| | | | | | | | | is returned shall be kept in the waitable state. Add WSTOPPED as an alias for WUNTRACED. Submitted by: Jukka Ukkonen <jau at iki fi> PR: standards/116221 MFC after: 2 weeks
* Integrate the new MPSAFE TTY layer to the FreeBSD operating system.ed2008-08-201-34/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following: - Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver. - Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly. - Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters. Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING. Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan
* Add DTrace 'proc' provider probes using the Statically Defined Tracejb2008-05-241-0/+30
| | | | (sdt) mechanism.
OpenPOWER on IntegriCloud