path: root/sys/kern/kern_sig.c
Commit log (each entry ends with: author, date, files changed, -deleted/+added lines)
* When OOM searches for a process to kill, ignore the processes already
  killed by OOM. When a killed process waits for a page allocation, try to
  satisfy the request as fast as possible. This removes the often
  encountered deadlock, where OOM continuously selects the same victim
  process that sleeps uninterruptibly waiting for a page. The killed
  process may still sleep if a page cannot be obtained immediately, but
  testing has shown that the system has a much higher chance of surviving
  an OOM situation with the patch.
  In collaboration with: pho
  Reviewed by: alc
  MFC after: 4 weeks
  (kib, 2010-04-06, 1 file, -0/+1)
* Merge projects/enhanced_coredumps (r204346) into HEAD:
  Enhanced process coredump routines. This brings in the following features:
  1) Limit the number of cores per process via the %I coredump formatter.
     Example: if corefilename is set to %N.%I.core and num_cores = 3, then
     when a process "rpd" cores the corefile will be named "rpd.0.core";
     if it cores again, the kernel will generate "rpd.1.core", and so on
     until the num_cores limit is hit. This is useful for keeping several
     corefiles while preventing the machine from filling up with them.
  2) Encode the machine hostname in the core dump name via %H.
  3) Compress coredumps, useful for embedded platforms with limited space.
     A sysctl kern.compress_user_cores is made available if turned on.
     To enable compressed coredumps, the following config options need to
     be set:
       options COMPRESS_USER_CORES
       device zlib   # brings in the zlib requirements.
       device gzio   # brings in the kernel vnode gzip output module.
  4) Event handlers are fired to indicate coredumps in progress.
  5) The imgact sv_coredump routine has grown a flag to pass in more state;
     currently this is used only for passing down whether to compress the
     coredump or not.
  Note that the gzio facility can be used for generic output of gzip'd
  streams via vnodes.
  Obtained from: Juniper Networks
  Reviewed by: kan
  (alfred, 2010-03-02, 1 file, -9/+121)
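  A minimal userland sketch of the %N.%I.core naming policy described
  above (an illustration, not the kernel's implementation): probe indices
  0..num_cores-1 and take the first name that can be created exclusively,
  so repeated dumps rotate through a bounded set of files.

      #include <fcntl.h>
      #include <stdio.h>
      #include <unistd.h>

      /*
       * Return an fd for "<name>.<index>.core", or -1 if num_cores names
       * already exist; O_EXCL makes the probe race-free.
       */
      static int
      open_indexed_core(const char *name, int num_cores, char *path,
          size_t pathlen)
      {
              int fd, i;

              for (i = 0; i < num_cores; i++) {
                      snprintf(path, pathlen, "%s.%d.core", name, i);
                      fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
                      if (fd >= 0)
                              return (fd);    /* first free index wins */
              }
              return (-1);                    /* limit reached */
      }

      int
      main(void)
      {
              char path[64];
              int fd = open_indexed_core("rpd", 3, path, sizeof(path));

              if (fd >= 0) {
                      printf("would dump to %s\n", path);
                      close(fd);
              } else
                      puts("num_cores limit reached");
              return (0);
      }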
* Staticise sigqueue manipulation functions used only in kern_sig.c.
  MFC after: 1 week
  (kib, 2010-01-23, 1 file, -8/+8)
* When a traced process is about to receive a signal, the process is
  stopped and the debugger may modify or drop the signal. After the changes
  to keep process-targeted signals on the process sigqueue, another thread
  may notice the old signal on the queue and act on it before this thread
  removes the changed or dropped signal from the process queue. Since the
  process is traced, it usually gets stopped. Or, if the same signal is
  delivered while the process was stopped, the thread may erroneously
  remove it, intending to remove the original signal.
  Remove the signal from the queue before notifying the debugger. Restore
  the siginfo to the head of the sigqueue when the signal is allowed to be
  delivered to the debuggee, using the newly introduced KSI_HEAD
  ksiginfo_t flag. This preserves the required order of delivery. Always
  restore the unchanged signal on the curthread sigqueue, not on the
  process queue, since the thread is about to get it anyway, because the
  sigmask cannot be changed. Handle failure to reinsert the siginfo into
  the queue by falling back to the sq_kill method, calling sigqueue_add()
  with a NULL ksi. If the debugger changed the signal to be delivered, use
  sigqueue_add() with a NULL ksi instead of only setting the sq_signals bit.
  Reported by: Gardner Bell <gbell72 rogers com>
  Analyzed and first version of fix by: Tijl Coosemans <tijl coosemans org>
  PR: 142757
  Reviewed by: davidxu
  MFC after: 2 weeks
  (kib, 2010-01-20, 1 file, -15/+30)
* Remove wrong assertion. A debuggee is allowed to lose a signal.
  Reported and tested by: jh
  MFC after: 2 weeks
  (kib, 2009-12-03, 1 file, -3/+2)
* Among signal generation syscalls, only sigqueue(2) is allowed by POSIX
  to fail due to lack of resources to queue siginfo. Add a KSI_SIGQ flag
  that allows sigqueue_add() to fail while trying to allocate memory for
  new siginfo. When the flag is not set, behaviour is the same as for
  KSI_TRAP: if memory cannot be allocated, set the bit in sq_kill. KSI_TRAP
  is kept to preserve the KBI.
  Add an SI_KERNEL si_code, to be used in siginfo.si_code when the signal
  is generated by the kernel. Deliver siginfo when the signal is generated
  by the kill(2) family of syscalls (SI_USER with properly filled si_uid
  and si_pid), or by the kernel (SI_KERNEL, mostly job control or SIGIO).
  Since the KSI_SIGQ flag is not set for the ksi, a low memory condition
  causes the old behaviour.
  Keep the psignal(9) KBI intact, but modify it to generate the SI_KERNEL
  si_code. Pgsignal(9) and gsignal(9) now take a ksi explicitly. Add
  pksignal(9) that behaves like psignal() but takes a ksi, and the ddb
  kill command is implemented as pksignal(..., ksi = NULL) to avoid doing
  allocation while in the debugger.
  While there, remove some register specifiers and use ANSI C prototypes.
  Reviewed by: davidxu
  MFC after: 1 month
  (kib, 2009-11-17, 1 file, -29/+53)
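  The effect of the filled-in siginfo is visible from userland through a
  SA_SIGINFO handler; a small self-contained illustration (the printed
  layout is arbitrary, and only SI_USER from kill(2) is exercised):

      #include <signal.h>
      #include <stdio.h>
      #include <unistd.h>

      static siginfo_t last;
      static volatile sig_atomic_t got;

      static void
      handler(int sig, siginfo_t *si, void *ctx)
      {
              last = *si;     /* sender info filled in by the kernel */
              got = 1;
      }

      int
      main(void)
      {
              struct sigaction sa;

              sa.sa_sigaction = handler;
              sa.sa_flags = SA_SIGINFO;
              sigemptyset(&sa.sa_mask);
              sigaction(SIGUSR1, &sa, NULL);

              /* kill(2): si_code is SI_USER, si_pid/si_uid are ours. */
              kill(getpid(), SIGUSR1);
              while (!got)
                      ;
              printf("si_code=%d si_pid=%d si_uid=%d\n", last.si_code,
                  (int)last.si_pid, (int)last.si_uid);
              return (0);
      }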
* In r198506, kern_sigsuspend() started doing a cursig/postsig loop to make
  sure that a signal was delivered to the thread before returning from the
  syscall. Signal delivery puts a new return frame on the user stack and
  modifies the trap frame to enter the signal handler. As a consequence,
  the syscall return code sets EINTR as the error return for the signal
  frame, instead of the syscall return. Also, on ia64, due to the different
  register layout for those two kinds of frames, usermode received SIGSEGV
  when returning from the signal handler. Use the newly introduced
  cpu_set_syscall_retval(9) to set the syscall result, and return
  EJUSTRETURN from kern_sigsuspend() to prevent the syscall return code
  from modifying this frame [1].
  Another issue is that a pending SIGCONT might be cancelled by SIGSTOP,
  causing postsig() not to deliver any caught signal [2]. Modify postsig()
  to return 1 if a signal was posted, and 0 otherwise, and use this in the
  kern_sigsuspend() loop.
  Proposed by: marcel [1]
  Noted by: davidxu [2]
  Reviewed by: marcel, davidxu
  MFC after: 1 month
  (kib, 2009-11-10, 1 file, -8/+7)
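  The userland contract preserved by the fix is that sigsuspend(2) returns
  only after a caught signal's handler has run, and then fails with EINTR;
  a brief demonstration of that contract (not of the kernel change itself):

      #include <errno.h>
      #include <signal.h>
      #include <stdio.h>
      #include <unistd.h>

      static void
      handler(int sig)
      {
      }

      int
      main(void)
      {
              struct sigaction sa;
              sigset_t block, waitmask;

              sa.sa_handler = handler;
              sa.sa_flags = 0;
              sigemptyset(&sa.sa_mask);
              sigaction(SIGUSR1, &sa, NULL);

              /* Block SIGUSR1 and make it pending ... */
              sigemptyset(&block);
              sigaddset(&block, SIGUSR1);
              sigprocmask(SIG_BLOCK, &block, &waitmask);
              kill(getpid(), SIGUSR1);

              /* ... then atomically unblock it inside sigsuspend(). */
              sigdelset(&waitmask, SIGUSR1);
              if (sigsuspend(&waitmask) == -1 && errno == EINTR)
                      puts("handler ran, sigsuspend returned EINTR");
              return (0);
      }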
* Trapsignal() and postsig() call kern_sigprocmask() with both the process
  lock and curproc->p_sigacts->ps_mtx locked. Reschedule_signals() may need
  to have ps_mtx locked to decide on and wake up a thread, causing
  recursion on the mutex.
  Inform kern_sigprocmask() and reschedule_signals() about the lock state
  of ps_mtx with a new flag, SIGPROCMASK_PS_LOCKED, to avoid recursion.
  Reported and tested by: keramida
  MFC after: 1 month
  (kib, 2009-10-30, 1 file, -20/+16)
* Trapsignal() calls kern_sigprocmask() when delivering a caught signal
  with the proc lock held.
  Reported and tested by: Mykola Dzham freebsd at levsha org ua
  MFC after: 1 month
  (kib, 2009-10-29, 1 file, -1/+2)
* In r197963, a race between a thread being selected for signal delivery
  while in kernel mode and later changing its signal mask to block the
  signal was fixed for sigprocmask(2) and pthread_exit(3). The same race
  exists for the sigreturn(2), setcontext(2) and swapcontext(2) syscalls.
  Use kern_sigprocmask() instead of direct manipulation of td_sigmask to
  reschedule newly blocked signals, closing the race.
  Reviewed by: davidxu
  Tested by: pho
  MFC after: 1 month
  (kib, 2009-10-27, 1 file, -21/+22)
* In kern_sigsuspend(), better manipulate the thread signal mask using
  kern_sigprocmask() to properly notify other possible candidate threads
  for signal delivery. Since sigsuspend() shall only return to usermode
  after a signal was delivered, do the cursig/postsig loop immediately
  after waiting for a signal, repeating the wait if the wakeup was spurious
  due to a race with another thread fetching the signal from the process
  queue before us. Add a thread_suspend_check() call to allow the thread
  to be stopped or killed while in the loop.
  Modify the last argument of kern_sigprocmask() from a boolean to flags,
  allowing the function to be called with the proc locked. Conversion of
  the callers that supplied 1 for the old argument will be done in the
  next commit; since the SIGPROCMASK_OLD value is equal to 1, the code is
  formally correct in between.
  Reviewed by: davidxu
  Tested by: pho
  MFC after: 1 month
  (kib, 2009-10-27, 1 file, -22/+29)
* Improve the description of sysctl "kern.sugid_coredump".
  Submitted by: Mel Flynn <mel.flynn+fbsd.hackers at mailing.thruhere.net> on -hackers
  (jkoshy, 2009-10-12, 1 file, -1/+1)
* Fix typo.
  Submitted by: rdivacky
  MFC after: 1 month
  (kib, 2009-10-12, 1 file, -1/+1)
* Currently, when a signal is delivered to the process and there is a
  thread not blocking the signal, the signal is placed on the thread
  sigqueue. If the selected thread is in the kernel executing the
  thr_exit() or sigprocmask() syscalls, then the signal might not be
  delivered to usermode for an arbitrary amount of time, and for an
  exiting thread it is lost.
  Put process-directed signals on the process queue unconditionally,
  selecting the thread to deliver the signal only when the thread returns
  to usermode, since only then can the thread handle delivery of the
  signal reliably. For an exiting thread or a thread that has blocked some
  signals, check whether the newly blocked signal is queued for the
  process, and try to find a thread to wake up for delivery, in
  reschedule_signals(). For an exiting thread, assume that all signals are
  blocked.
  Change cursig() and postsig() to look into both the thread and process
  signal queues. When there is a signal that a thread returning to
  usermode could consume, the TDF_NEEDSIGCHK flag is not necessarily set
  now. Do an unlocked read of p_siglist and p_pendingcnt to check for
  queued signals.
  Note that a thread that has a signal unblocked might now get a spurious
  wakeup and EINTR from an interruptible system call, due to the
  possibility of being selected by reschedule_signals() while another
  thread returned to usermode earlier and removed the signal from the
  process queue. This should not cause compliance issues, since the thread
  has not blocked the signal and thus should be ready to receive it anyway.
  Reported by: Justin Teller <justin.teller gmail com>
  Reviewed by: davidxu, jilles
  MFC after: 1 month
  (kib, 2009-10-11, 1 file, -29/+102)
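  The user-visible consequence is at most an extra EINTR from an
  interruptible wait; portable code already copes with that by retrying,
  as in the classic loop below (shown for read(2), purely as an
  illustration):

      #include <errno.h>
      #include <unistd.h>

      /* Retry a read that was interrupted by a signal handler. */
      ssize_t
      read_retry(int fd, void *buf, size_t len)
      {
              ssize_t n;

              do {
                      n = read(fd, buf, len);
              } while (n == -1 && errno == EINTR);
              return (n);
      }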
* Fix typo.
  MFC after: 3 days
  (kib, 2009-10-01, 1 file, -1/+1)
* Use C99 initialization for struct filterops.
  Obtained from: Mac OS X
  Sponsored by: Apple Inc.
  MFC after: 3 weeks
  (rwatson, 2009-09-12, 1 file, -2/+6)
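  The pattern, shown on a self-contained stand-in rather than the kernel's
  struct filterops (whose definition lives in <sys/event.h> under _KERNEL):
  designated initializers bind each handler to a named field, so adding or
  reordering members cannot silently misassign function pointers.

      #include <stdio.h>

      struct ops {
              int     is_fd;
              int     (*attach)(void);
              void    (*detach)(void);
              int     (*event)(long hint);
      };

      static int  my_attach(void)     { puts("attach"); return (0); }
      static void my_detach(void)     { puts("detach"); }
      static int  my_event(long hint) { return (hint != 0); }

      /* C99 designated initialization, in the style now used for the
       * signal filter ops. */
      static const struct ops example_ops = {
              .is_fd  = 0,
              .attach = my_attach,
              .detach = my_detach,
              .event  = my_event,
      };

      int
      main(void)
      {
              example_ops.attach();
              printf("event -> %d\n", example_ops.event(1));
              example_ops.detach();
              return (0);
      }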
* Add a new msleep(9) flag, PBDY, that shall be specified together with
  PCATCH to indicate that the thread shall not be stopped upon receipt of
  SIGSTOP until it reaches the kernel->usermode boundary. Also change
  thread_single(SINGLE_NO_EXIT) to only stop threads at the user boundary
  unconditionally.
  Tested by: pho
  Reviewed by: jhb
  Approved by: re (kensmith)
  (kib, 2009-07-14, 1 file, -10/+27)
* Replace the variable-argument AUDIT_ARG() macro with a set of more
  specific macros for each audit argument type. This makes it easier to
  follow call-graphs, especially for automated analysis tools (such as
  fxr). In the MFC, we should leave the existing AUDIT_ARG() macros as
  they may be used by third-party kernel modules.
  Suggested by: brooks
  Approved by: re (kib)
  Obtained from: TrustedBSD Project
  MFC after: 1 week
  (rwatson, 2009-06-27, 1 file, -5/+5)
* vn_open_cred() needs a non-NULL ucred pointer.
  Reviewed by: kib
  (pho, 2009-06-23, 1 file, -1/+1)
* Add another flags argument to vn_open_cred(). Use it to specify that
  some vn_open_cred() invocations shall not audit the namei path.
  In particular, specify VN_OPEN_NOAUDIT for the dotdot lookup performed
  by the default implementation of vop_vptocnp, and for the open done for
  the core file. vn_fullpath() is called from the audit code, and vn_open()
  there needs to disable audit to avoid infinite recursion. The core file
  is created on return to user mode, which, in particular, happens during
  syscall return. The creation of the core file is audited by direct
  calls, and we do not want to overwrite audit information for the syscall.
  Reported, reviewed and tested by: rwatson
  (kib, 2009-06-21, 1 file, -1/+2)
* Remove VOP_LEASE and supporting functions. This hasn't been used since
  the removal of NQNFS, but was left in in case it was required for NFSv4.
  Since our new NFSv4 client and server can't use it for their
  requirements, GC the old mechanism, as well as other unused lease-related
  code and interfaces.
  Due to its impact on kernel programming and binary interfaces, this
  change should not be MFC'd.
  Proposed by: jeff
  Reviewed by: jeff
  Discussed with: rmacklem, zach loafman @ isilon
  (rwatson, 2009-04-10, 1 file, -1/+0)
* Remove even more unneeded variable assignments.
  kern_time.c:
  - Unused variable `p'.
  kern_thr.c:
  - Variable `error' is always caught immediately, so no reason to
    initialize it. There is no way that error != 0 at the end of
    create_thread().
  kern_sig.c:
  - Unused variable `code'.
  kern_synch.c:
  - `rval' is always assigned in all different cases.
  kern_rwlock.c:
  - `v' is always overwritten with RW_UNLOCKED further on.
  kern_malloc.c:
  - `size' is always initialized with the proper value before being used.
  kern_exit.c:
  - `error' is always caught and returned immediately. abort2() never
    returns a non-zero value.
  kern_exec.c:
  - `len' is always assigned inside the if-statement right below it.
  tty_info.c:
  - `td' is always overwritten by FOREACH_THREAD_IN_PROC().
  Found by: LLVM's scan-build
  (ed, 2009-02-26, 1 file, -5/+1)
* Revert rev 184216 and 184199; due to the way the thread_lock works,
  it may cause a lockup.
  Noticed by: peter, jhb
  (davidxu, 2008-11-05, 1 file, -1/+28)
* Actually, for signal and thread suspension, the extra process spin lock
  is unnecessary; the normal process lock and thread lock are enough. The
  spin lock is still needed for process and thread exiting to mimic the
  single sched_lock.
  (davidxu, 2008-10-23, 1 file, -28/+1)
* Move per-thread userland debugging flags into a separate field; this
  eliminates some locking problems, e.g. when a thread lock is needed but
  can not be used at that time. Only the process lock is needed now for
  the new field.
  (davidxu, 2008-10-15, 1 file, -8/+4)
* Decontextualize the couplet VOP_GETATTR / VOP_SETATTR, as the passed
  thread was always curthread and thus useless.
  Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
  (attilio, 2008-08-28, 1 file, -2/+2)
* If a thread that is swapped out is made runnable, then the setrunnable()
  routine wakes up proc0 so that proc0 can swap the thread back in.
  Historically, this has been done by waking up proc0 directly from
  setrunnable() itself via a wakeup(). When waking up a sleeping thread
  that was swapped out (the usual case when waking proc0, since only
  sleeping threads are eligible to be swapped out), this resulted in a bit
  of recursion (e.g. wakeup() -> setrunnable() -> wakeup()).
  With sleep queues having separate locks in 6.x and later, this caused a
  spin lock LOR (sleepq lock -> sched_lock/thread lock -> sleepq lock).
  An attempt was made to fix this in 7.0 by making the proc0 wakeup use
  the ithread mechanism for doing the wakeup. However, this required
  grabbing proc0's thread lock to perform the wakeup. If proc0 was asleep
  elsewhere in the kernel (e.g. waiting for disk I/O), then this
  degenerated into the same LOR since the thread lock would be some other
  sleepq lock.
  Fix this by deferring the wakeup of the swapper until after the sleepq
  lock held by the upper layer has been released. The setrunnable()
  routine now returns a boolean value to indicate whether or not proc0
  needs to be woken up. The end result is that consumers of the sleepq API
  such as *sleep/wakeup, condition variables, sx locks, and lockmgr have
  to wake up proc0 if they get a non-zero return value from
  sleepq_abort(), sleepq_broadcast(), or sleepq_signal().
  Discussed with: jeff
  Glanced at by: sam
  Tested by: Jurgen Weber jurgen - ish com au
  MFC after: 2 weeks
  (jhb, 2008-08-05, 1 file, -2/+10)
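  The same "report, then wake after dropping the lock" shape in a
  self-contained userland form (POSIX threads standing in for the
  sleepqueue and swapper machinery; this is a sketch of the pattern, not
  the kernel code):

      #include <pthread.h>
      #include <stdbool.h>

      static pthread_mutex_t  q_lock = PTHREAD_MUTEX_INITIALIZER;
      static pthread_mutex_t  helper_lock = PTHREAD_MUTEX_INITIALIZER;
      static pthread_cond_t   helper_cv = PTHREAD_COND_INITIALIZER;
      static bool             helper_work;

      /* Called with q_lock held; must not take helper_lock here. */
      static bool
      make_runnable_locked(void)
      {
              /* ... move the waiter onto the run queue ... */
              return (true);          /* caller must kick the helper */
      }

      static void
      wake_helper(void)
      {
              pthread_mutex_lock(&helper_lock);
              helper_work = true;
              pthread_cond_signal(&helper_cv);
              pthread_mutex_unlock(&helper_lock);
      }

      void
      wakeup_waiters(void)
      {
              bool kick;

              pthread_mutex_lock(&q_lock);
              kick = make_runnable_locked();
              pthread_mutex_unlock(&q_lock);
              if (kick)
                      wake_helper();  /* deferred past q_lock */
      }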
* Add DTrace 'proc' provider probes using the Statically Defined Trace
  (sdt) mechanism.
  (jb, 2008-05-24, 1 file, -0/+22)
* - Add a new td flag TDF_NEEDSUSPCHK that is set whenever a thread needs
    to enter thread_suspend_check().
  - Set TDF_ASTPENDING along with TDF_NEEDSUSPCHK so we can move the
    thread_suspend_check() to ast() rather than userret().
  - Check TDF_NEEDSUSPCHK in the sleepq_catch_signals() optimization so
    that we don't miss a suspend request. If this is set use the expensive
    signal path.
  - Set NEEDSUSPCHK when creating a new thread in thr in case the creating
    thread is due to be suspended as well but has not yet.
  Reviewed by: davidxu (Authored original patch)
  (jeff, 2008-03-21, 1 file, -0/+1)
* - Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice
    from requiring the per-process spinlock to only requiring the process
    lock.
  - Reflect these changes in the proc.h documentation and consumers
    throughout the kernel.
  This is a substantial reduction in locking cost for these fields and was
  made possible by recent changes to threading support.
  (jeff, 2008-03-19, 1 file, -33/+13)
* Remove kernel support for M:N threading.
  While the KSE project was quite successful in bringing threading to
  FreeBSD, the M:N approach taken by the kse library was never developed
  to its full potential. Backwards compatibility will be provided via
  libmap.conf for dynamically linked binaries, and static binaries will be
  broken.
  (jeff, 2008-03-12, 1 file, -157/+0)
* Use sbuf routines to construct core dump filenames rather than custom
  string buffer handling, making the code both easier to read and more
  robust against string-handling bugs.
  MFC after: 1 week
  (rwatson, 2008-03-08, 1 file, -27/+22)
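  For reference, the sbuf API is also available to userland programs on
  FreeBSD via libsbuf; a minimal sketch of building a name with it (the
  format here is simplified and is not the kernel's expand_name()):

      /* cc name.c -lsbuf */
      #include <sys/sbuf.h>
      #include <stdio.h>
      #include <unistd.h>

      int
      main(void)
      {
              struct sbuf *sb;
              char host[64];

              gethostname(host, sizeof(host));
              sb = sbuf_new_auto();           /* auto-extending buffer */
              sbuf_printf(sb, "%s.%s.%ld.core", "rpd", host,
                  (long)getpid());
              if (sbuf_finish(sb) != 0) {     /* overflow/error check */
                      sbuf_delete(sb);
                      return (1);
              }
              printf("%s\n", sbuf_data(sb));
              sbuf_delete(sb);
              return (0);
      }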
* Unlock the process lock when expand_name() fails, or we may leak the
  process lock, leading to a hang. This bug was introduced in
  kern_sig.c:1.351, when the call to expand_name() was moved earlier but
  this particular error case was not updated.
  (rwatson, 2008-03-08, 1 file, -0/+1)
* VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
  conjunction with a 'thread' argument which is always curthread. Remove
  the useless extra argument and pass curthread explicitly to lower layer
  functions when necessary.
  The KPI is broken by this change, which should affect several ports, so
  version bumping and a manpage update will be committed later.
  Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
  (attilio, 2008-01-13, 1 file, -3/+3)
* vn_lock() is currently only used with 'curthread' passed as the argument.
  Remove this argument and pass curthread directly to the underlying
  VOP_LOCK1() VFS method. This makes the code cleaner and, in particular,
  removes an annoying dependency, helping the next lockmgr() cleanup.
  The KPI, obviously, changes. The manpage and FreeBSD_version will be
  updated through further commits.
  As a side note, it is worth saying that the next commits will address a
  similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock.
  Tested by: Diego Sardina <siarodx at gmail dot com>,
             Andrea Di Pasquale <whyx dot it at gmail dot com>
  (attilio, 2008-01-10, 1 file, -1/+1)
* Be more exact with sigaction SA_SIGINFO handling.
  Reviewed by: marcel
  (obrien, 2007-12-18, 1 file, -2/+5)
* Fix the panic("vm_thread_new: kstack allocation failed") and a silent
  NULL pointer dereference in the i386 and sparc64 pmap_pinit() when
  kmem_alloc_nofault() failed to allocate address space. Both functions
  now return an error instead of panicking or dereferencing NULL.
  As a consequence, vmspace_exec() and vmspace_unshare() return an errno
  int. A struct vmspace arg was added to vm_forkproc() to avoid dealing
  with a failed allocation when most of the fork1() job is already done.
  The kernel stack for the thread is now set up in thread_alloc(), which
  itself may return NULL. Also, allocation of the first process thread is
  performed in fork1() to properly deal with stack allocation failure.
  proc_linkup() is separated into proc_linkup(), called from fork1(), and
  proc_linkup0(), which is used to set up the kernel process (formerly
  known as swapper).
  In collaboration with: Peter Holm
  Reviewed by: jhb
  (kib, 2007-11-05, 1 file, -1/+1)
* Implement AUE_CORE, which adds process core dump auditing support to the
  kernel. This change introduces audit_proc_coredump(), which is called by
  coredump(9) to create an audit record for the coredump event. When a
  process dumps core, it could be security relevant: it could be an
  indicator that a stack within the process has been overflowed with an
  incorrectly constructed malicious payload, or a number of other events.
  The record that is generated looks like this:
    header,111,10,process dumped core,0,Thu Oct 25 19:36:29 2007, + 179 msec
    argument,0,0xb,signal
    path,/usr/home/csjp/test.core
    subject,csjp,csjp,staff,csjp,staff,1101,1095,50457,10.37.129.2
    return,success,1
    trailer,111
  - We allocate a completely new record to make sure we aren't clobbering
    the audit data associated with the syscall that produced the core
    (assuming the core is being generated in response to SIGABRT and not
    an invalid memory access).
  - Shuffle around expand_name() so we can use the coredump name at the
    very beginning of the coredump call. Make sure we free the storage
    referenced by "name" if we need to bail out early.
  - Audit both successful and failed coredump creation efforts.
  Obtained from: TrustedBSD Project
  Reviewed by: rwatson
  MFC after: 1 month
  (csjp, 2007-10-26, 1 file, -6/+27)
* Move where we audit the PID argument such that we unconditionally audit
  it at the beginning of the syscall. This fixes a problem where the user
  supplies an invalid process ID which is > 0, which results in the PID
  argument not being audited.
  Obtained from: TrustedBSD Project
  MFC after: 1 week
  (csjp, 2007-10-24, 1 file, -1/+1)
* - Calling sched_nice() in tdsigwakeup() is no longer required by ULE and
    actually causes LORs and other panics.
  Reported by: mlaier
  Approved by: re
  (jeff, 2007-07-19, 1 file, -6/+2)
* - Add a missing PROC_SUNLOCK() in tdsignal()
  (jeff, 2007-06-11, 1 file, -1/+3)
* Initialize ets to zero. This is arguably a gcc bug, in that ets is
  always set to rts when timeout is non-NULL (and then timevalid is set),
  and ets is only checked later when timevalid is set.
  (mjacob, 2007-06-10, 1 file, -0/+2)
* Commit 4/14 of sched_lock decomposition.
  - Use thread_lock() rather than sched_lock for per-thread scheduling
    synchronization.
  - Use the per-process spinlock rather than the sched_lock for
    per-process scheduling synchronization.
  - Move some common code into thread_suspend_switch() to handle the
    mechanics of suspending a thread. The locking here is incredibly
    convoluted and should be simplified.
  Tested by: kris, current@
  Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
  Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
  (jeff, 2007-06-04, 1 file, -56/+66)
* - Move rusage from being per-process in struct pstats to per-thread in
    td_ru. This removes the requirement for per-process synchronization in
    statclock() and mi_switch(). This was previously supported by
    sched_lock, which is going away. All modifications to rusage are now
    done in the context of the owning thread. Reads proceed without locks.
  - Aggregate exiting threads' rusage in thread_exit() such that the
    exiting thread's rusage is not lost.
  - Provide a new routine, rufetch(), to fetch an aggregate of all rusage
    structures from all threads in a process. This routine must be used in
    any place requiring a rusage from a process prior to its exit. The
    exited process's rusage is still available via p_ru.
  - Aggregate tick statistics only on demand via rufetch() or when a
    thread exits. Tick statistics are kept in the thread and protected by
    sched_lock until it exits.
  Initial patch by: attilio
  Reviewed by: attilio, bde (some objections), arch (mostly silent)
  (jeff, 2007-06-01, 1 file, -2/+2)
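  Per-thread accounting of this kind is also what later per-thread query
  interfaces build on; assuming a system that provides RUSAGE_THREAD
  (present in later FreeBSD releases and on Linux), it can be observed
  from userland like this:

      #include <sys/resource.h>
      #include <stdio.h>

      int
      main(void)
      {
              struct rusage ru;
              volatile unsigned long i, x = 0;

              for (i = 0; i < 50000000UL; i++)        /* burn user time */
                      x += i;
              if (getrusage(RUSAGE_THREAD, &ru) == 0)
                      printf("thread user time: %ld.%06lds\n",
                          (long)ru.ru_utime.tv_sec,
                          (long)ru.ru_utime.tv_usec);
              return (0);
      }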
* Revert the UF_OPENING workaround for CURRENT.
  Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev
  operation argument from being a file descriptor index to a pointer to
  struct file.
  Proposed and reviewed by: jhb
  Reviewed by: daichi (unionfs)
  Approved by: re (kensmith)
  (kib, 2007-05-31, 1 file, -1/+1)
* Comment that tdsignal() may be entered from the debugger.
  (rwatson, 2007-05-23, 1 file, -0/+4)
* Rename the 'mtx_object', 'rw_object', and 'sx_object' members of
  mutexes, rwlocks, and sx locks to 'lock_object'.
  (jhb, 2007-03-21, 1 file, -2/+2)
* Further system call comment cleanup:
  - Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde)
  - Remove extra blank lines in some cases.
  - Add extra blank lines in some cases.
  - Remove no-op comments consisting solely of the function name, the word
    "syscall", or the system call name.
  - Add punctuation.
  - Re-wrap some comments.
  (rwatson, 2007-03-05, 1 file, -12/+2)
* Remove 'MPSAFE' annotations from the comments above most system calls:
  all system calls now enter without Giant held, and then in some cases
  acquire Giant explicitly.
  Remove a number of other MPSAFE annotations in the credential code and
  tweak one or two other adjacent comments.
  (rwatson, 2007-03-04, 1 file, -75/+8)
* Report which signal the caller attempted to deliver when panicking.
  (delphij, 2007-02-09, 1 file, -2/+2)