path: root/sys/kern/subr_trap.c
Commit message  (Author, Date, Files, Lines -/+)
* Fixup r240246: hwpmc needs to retain the pinning until ASTs are not executed.  (attilio, 2012-10-30, 1 file, -1/+6)
  This means past the point where userret() is generally executed. Skip the td_pinned check if callchain tracing is currently happening, and add a more robust check to pmc_capture_user_callchain() in order to catch a td_pinned leak past ast() in the hwpmc case.
  Reported and tested by: fabient
  MFC after: 1 week
  X-MFC: r240246
* Add CPU percentage limit enforcement to RCTL. The resource name is "pcpu".  (trasz, 2012-10-26, 1 file, -0/+13)
  It was implemented by Rudolf Tomori during Google Summer of Code 2012.
* Add a KPI to allow reserving some amount of space in the numvnodes counter without actually allocating the vnodes.  (kib, 2012-10-14, 1 file, -0/+2)
  The intended use of getnewvnode_reserve(9) is to reclaim enough free vnodes while the code does not yet hold any resources that might be needed during the reclamation, and to consume the slack later for getnewvnode() calls made from the innards. After the critical block is finished, the caller shall free any remaining reserve with getnewvnode_drop_reserve(9).
  Reviewed by: avg
  Tested by: pho
  MFC after: 1 week
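  A minimal sketch of the reserve/drop pattern this entry describes. The surrounding function, the count of 4, and the comments are hypothetical; only getnewvnode_reserve(9) and getnewvnode_drop_reserve(9) come from the commit, and their exact prototypes have varied across FreeBSD versions.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/vnode.h>

    /*
     * Hypothetical filesystem path that must not reclaim vnodes while it
     * holds its own resources: book the vnodes up front, let the inner
     * getnewvnode() calls consume the reservation, then return the slack.
     */
    static int
    example_create_nodes(void)
    {
        int error;

        getnewvnode_reserve(4);        /* prebook up to 4 vnodes in numvnodes */

        /*
         * ... acquire locks/buffers and call helpers that end up in
         * getnewvnode(); those calls draw on the reservation instead of
         * reclaiming free vnodes while we hold resources ...
         */
        error = 0;

        getnewvnode_drop_reserve();    /* free whatever reserve is left */
        return (error);
    }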
* Move the checks for td_pinned, td_critnest, TDP_NOFAULTING and TDP_NOSLEEPING leaking from syscallret() to userret(), so that trap handling is also covered.  (attilio, 2012-09-08, 1 file, -1/+14)
  Also, the td_locks check is no longer duplicated between the two functions.
  Reported by: avg
  Reviewed by: kib
  MFC after: 1 week
* Move PT_UPDATED_FLUSH() before the td_locks check in order to also get coverage in the XEN case.  (attilio, 2012-09-08, 1 file, -3/+3)
  Reviewed by: kib
  MFC after: 1 week
* userret() already checks td_locks when INVARIANTS is enabled, so there is no need to check whether Giant is still held after it.  (attilio, 2012-09-08, 1 file, -1/+0)
  Reviewed by: kib
  MFC after: 1 week
* Remove redundant include.  (pjd, 2012-06-10, 1 file, -1/+0)
  MFC after: 1 month
* Include the associated wait channel message for context switch ktrace records.  (jhb, 2012-04-20, 1 file, -2/+2)
  kdump supports both the old and new messages.
  Submitted by: Andrey Zonov andrey zonov org
  MFC after: 1 week
* Add software PMC support.  (fabient, 2012-03-28, 1 file, -0/+10)
  New kernel events can be added at various locations for sampling or counting. This will, for example, allow easy system profiling with well-known tools like pmcstat(8), whatever the processor is. Simultaneous use of software PMCs and hardware PMCs is possible, for example looking at lock acquire failures or page faults while sampling on instructions.
  Sponsored by: NETASQ
  MFC after: 1 month
* Assert that an exiting process does not return to usermode.  (kib, 2011-10-03, 1 file, -0/+2)
  Reviewed by: avg, jhb
  MFC after: 1 week
* In order to maximize the re-usability of kernel code in user space, this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them.  (kmacy, 2011-09-16, 1 file, -2/+2)
  It also fixes an additional namespace collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal() to kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
  Reviewed by: rwatson
  Approved by: re (bz)
* Inline the syscallenter() and syscallret().  (kib, 2011-09-11, 1 file, -162/+0)
  This reduces the time measured by the syscall entry speed microbenchmarks by ~10% on amd64.
  Submitted by: jhb
  Approved by: re (bz)
  MFC after: 2 weeks
* We may split today's CAPABILITIES into CAPABILITY_MODE (which has to do with global namespaces) and CAPABILITIES (which has to do with constraining file descriptors).  (jonathan, 2011-06-29, 1 file, -2/+2)
  Just in case, and because it's a better name anyway, let's move CAPABILITIES out of the way. Also, change opt_capabilities.h to opt_capsicum.h; for now, this will only hold CAPABILITY_MODE, but it will probably also hold the new CAPABILITIES (implying constrained file descriptors) in the future.
  Approved by: rwatson
  Sponsored by: Google UK Ltd
* Continue introducing Capsicum capability mode support:  (rwatson, 2011-03-01, 1 file, -0/+15)
  If a system call wasn't listed in capabilities.conf, return ECAPMODE at syscall entry.
  Reviewed by: anderson
  Discussed with: benl, kris, pjd
  Sponsored by: Google, Inc.
  Obtained from: Capsicum Project
  MFC after: 3 months
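  The userland effect of this check can be sketched as follows: after cap_enter(2), a syscall that capability mode does not permit (such as open(2) on a global path) fails at syscall entry with ECAPMODE. The header name depends on the FreeBSD release (<sys/capability.h> originally, <sys/capsicum.h> later), so treat this as an illustrative sketch rather than version-exact code.

    #include <sys/capsicum.h>    /* <sys/capability.h> on older releases */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
        if (cap_enter() == -1) {    /* enter capability mode */
            perror("cap_enter");
            return (1);
        }
        /*
         * open(2) of a global path is not permitted in capability mode,
         * so the kernel rejects it before the syscall handler runs.
         */
        if (open("/etc/passwd", O_RDONLY) == -1 && errno == ECAPMODE)
            printf("open() rejected with ECAPMODE, as expected\n");
        return (0);
    }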
* Mfp4 CH=177256:  (bz, 2011-02-14, 1 file, -0/+11)
  Catch a set vnet upon return to user space. This usually means return paths with CURVNET_RESTORE() missing. If VNET_DEBUG is turned on we can even tell which function did the CURVNET_SET(), which is really helpful; otherwise we print "N/A".
  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by: jhb
  MFC after: 11 days
* Allow the debugger to specify that children of the traced process should be automatically traced.  (kib, 2011-01-25, 1 file, -2/+2)
  Extend ptrace(PT_LWPINFO) to report that the child just forked.
  Reviewed by: davidxu, jhb
  MFC after: 2 weeks
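  A hedged sketch of how a debugger might use this facility. The request and flag names (PT_FOLLOW_FORK, PT_LWPINFO, PL_FLAG_FORKED, pl_child_pid) are recalled from the FreeBSD ptrace(2) interface of this era and should be checked against the target release; error handling is omitted.

    #include <sys/types.h>
    #include <sys/ptrace.h>
    #include <sys/wait.h>
    #include <stdio.h>

    /*
     * Ask the kernel to automatically trace children of an already
     * attached process, then watch for the fork report via PT_LWPINFO.
     */
    static void
    watch_forks(pid_t pid)
    {
        struct ptrace_lwpinfo pl;
        int status;

        ptrace(PT_FOLLOW_FORK, pid, (caddr_t)0, 1);    /* trace children too */
        ptrace(PT_CONTINUE, pid, (caddr_t)1, 0);

        while (waitpid(pid, &status, 0) != -1 && WIFSTOPPED(status)) {
            ptrace(PT_LWPINFO, pid, (caddr_t)&pl, sizeof(pl));
            if (pl.pl_flags & PL_FLAG_FORKED)
                printf("traced process forked child %d\n",
                    (int)pl.pl_child_pid);
            ptrace(PT_CONTINUE, pid, (caddr_t)1, 0);
        }
    }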
* Remove extra braces for style(9) (found while cleaning up an old work tree).  (emaste, 2010-09-28, 1 file, -2/+1)
* Call the systrace_probe_func() when the error value.  (rpaulo, 2010-08-22, 1 file, -2/+2)
  Sponsored by: The FreeBSD Foundation
* Retire td_syscalls now that it is no longer needed.  (jhb, 2010-07-15, 1 file, -1/+0)
* Obey sv_syscallnames bounds in syscallname().  (kib, 2010-07-04, 1 file, -2/+4)
  Reported and tested by: pho
* Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to <sys/syscallsubr.h>, where all other kern_<syscall> prototypes live.  (jhb, 2010-06-30, 1 file, -0/+1)
* Count the number of threads that enter and leave dynamically registered syscalls.  (kib, 2010-06-28, 1 file, -0/+4)
  On dynamic syscall deregistration, wait until all threads leave the syscall code. This somewhat increases the safety of unloading loadable modules.
  Reviewed by: jhb
  Tested by: pho
  MFC after: 1 month
* Remove the support for int13 FPU exception reporting on i386.  (kib, 2010-06-23, 1 file, -21/+0)
  It is believed that all 486-class CPUs FreeBSD is capable of running on either have no FPU and cannot use an external coprocessor, or have the FPU on the package and can use #MF.
  Reviewed by: bde
  Tested by: pho (previous version)
* Make DTrace syscall provider work again by including opt_kdtrace.h here.  (rpaulo, 2010-06-17, 1 file, -0/+1)
* Allow syscallname(9) to be used outside subr_trap.c.  (kib, 2010-05-26, 1 file, -2/+1)
  MFC after: 1 month
* Reorganize syscall entry and leave handling.  (kib, 2010-05-23, 1 file, -0/+162)
  Extend struct sysvec with three new elements:
    sv_fetch_syscall_args - the method to fetch syscall arguments from usermode into struct syscall_args. The structure is machine-dependent (this might be reconsidered after all architectures are converted).
    sv_set_syscall_retval - the method to set a return value for usermode from the syscall. It is a generalization of cpu_set_syscall_retval(9) that allows ABIs to override the way a return value is set.
    sv_syscallnames - the table of syscall names.
  Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding the call to cpu_set_syscall_retval().
  The new functions syscallenter(9) and syscallret(9) are provided; they use the sv_*syscall* pointers and contain the common repeated code from the syscall() implementations of the architecture-specific syscall trap handlers. syscallenter() fetches arguments, calls the syscall implementation from the ABI sysent table, and sets up the return frame. The end-of-syscall bookkeeping is done by syscallret().
  Take advantage of the single place for MI syscall handling code and implement the ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the thread is stopped at the syscall entry or return point respectively. The EXEC flag augments SCX and notifies the debugger that the process address space was changed by one of the exec(2)-family syscalls.
  The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are changed to use syscallenter()/syscallret(). MIPS and arm are not converted and use the mostly unchanged syscall() implementation.
  Reviewed by: jhb, marcel, marius, nwhitehorn, stas
  Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc), stas (mips)
  MFC after: 1 month
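  A hedged sketch (not actual FreeBSD source) of what an architecture's trap-level syscall handler reduces to after this reorganization: the MD code supplies the trapframe, while the MI syscallenter()/syscallret() pair fetches arguments through sv_fetch_syscall_args, dispatches via the ABI's sysent table and finishes the bookkeeping. The exact prototypes changed in later releases; the shapes below only illustrate the interface as described above.

    #include <sys/param.h>
    #include <sys/proc.h>
    #include <sys/sysent.h>
    #include <machine/frame.h>

    /* Hypothetical MD trap handler for an architecture converted to the
     * MI helpers introduced by the commit above. */
    void
    example_md_syscall(struct trapframe *frame)
    {
        struct thread *td = curthread;
        struct syscall_args sa;        /* layout is machine-dependent */
        int error;

        td->td_frame = frame;

        /* Fetch arguments via sv_fetch_syscall_args, look the syscall up
         * in the ABI's sysent table, run it, and set the usermode return
         * value through sv_set_syscall_retval. */
        error = syscallenter(td, &sa);

        /* End-of-syscall bookkeeping: ktrace return record, PL_FLAG_SCX
         * ptrace stop, userret() and AST handling. */
        syscallret(td, error, &sa);
    }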
* Current pselect(3) is implemented in usermode and is thus vulnerable to the well-known race condition whose elimination was the reason for the function's appearance in the first place.  (kib, 2009-10-27, 1 file, -0/+5)
  If the sigmask supplied as an argument to pselect() enables a signal, the signal might be delivered before the thread calls select(2), causing a lost wakeup. Reimplement pselect() in the kernel, making the change of sigmask and the sleep atomic. Since the signal shall be delivered to usermode but with the sigmask restored, set TDP_OLDMASK and save the old mask in td_oldsigmask. TDP_OLDMASK should be cleared by ast() in case the signal was not delivered during syscall execution.
  Reviewed by: davidxu
  Tested by: pho
  MFC after: 1 month
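  For context, a small userland sketch of the race this commit closes; the helper names are hypothetical. With separate sigprocmask(2) and select(2) calls, a signal that arrives between them runs its handler but never interrupts the sleep, while pselect(2) performs the mask change, the sleep and the mask restore atomically in the kernel.

    #include <sys/select.h>
    #include <signal.h>
    #include <stddef.h>

    /* Racy variant: a signal delivered between sigprocmask() and
     * select() is handled, but select() still blocks -- a lost wakeup. */
    static int
    wait_racy(int fd, fd_set *rs, const sigset_t *unblocked)
    {
        sigset_t omask;
        int n;

        sigprocmask(SIG_SETMASK, unblocked, &omask);
        /* <-- a signal delivered here does not abort the sleep below */
        n = select(fd + 1, rs, NULL, NULL, NULL);
        sigprocmask(SIG_SETMASK, &omask, NULL);
        return (n);
    }

    /* Safe variant: the kernel installs the mask, sleeps, and restores
     * the old mask as one atomic operation. */
    static int
    wait_safe(int fd, fd_set *rs, const sigset_t *unblocked)
    {
        return (pselect(fd + 1, rs, NULL, NULL, NULL, unblocked));
    }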
* Currently, when a signal is delivered to the process and there is a thread not blocking the signal, the signal is placed on the thread's sigqueue.  (kib, 2009-10-11, 1 file, -1/+10)
  If the selected thread is in the kernel executing the thr_exit() or sigprocmask() syscalls, then the signal might not be delivered to usermode for an arbitrary amount of time, and for an exiting thread it is lost.
  Put process-directed signals on the process queue unconditionally, and select the thread to deliver the signal to only when a thread returns to usermode, since only then can the thread handle delivery of the signal reliably. For an exiting thread, or a thread that has blocked some signals, check whether the newly blocked signal is queued for the process and try to find a thread to wake up for delivery, in reschedule_signals(). For an exiting thread, assume that all signals are blocked.
  Change cursig() and postsig() to look into both the thread and process signal queues. When there is a signal that a thread returning to usermode could consume, the TDF_NEEDSIGCHK flag is not necessarily set now. Do unlocked reads of p_siglist and p_pendingcnt to check for queued signals.
  Note that a thread that has a signal unblocked might get a spurious wakeup and EINTR from an interruptible system call now, due to the possibility of being selected by reschedule_signals() while another thread returned to usermode earlier and removed the signal from the process queue. This should not cause compliance issues, since the thread has not blocked the signal and thus should be ready to receive it anyway.
  Reported by: Justin Teller <justin.teller gmail com>
  Reviewed by: davidxu, jilles
  MFC after: 1 month
* Add a new msleep(9) flag, PBDY, that shall be specified together with PCATCH to indicate that the thread shall not be stopped upon receipt of SIGSTOP until it reaches the kernel->usermode boundary.  (kib, 2009-07-14, 1 file, -1/+1)
  Also change thread_single(SINGLE_NO_EXIT) to only stop threads at the user boundary unconditionally.
  Tested by: pho
  Reviewed by: jhb
  Approved by: re (kensmith)
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERICrwatson2009-06-051-1/+0
| | | | | | | | and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd
* - Bug fix: prevent a thread from migrating between CPUs between the time it is marked for user space callchain capture in the NMI handler and the time the callchain capture callback runs.  (jkoshy, 2008-12-13, 1 file, -8/+0)
  - Improve code and control flow clarity by invoking hwpmc(4)'s user space callchain capture callback directly from low-level code.
  Reviewed by: jhb (kern/subr_trap.c)
  Testing (various patch revisions): gnn, Fabien Thomas <fabien dot thomas at netasq dot com>, Artem Belevich <artemb at gmail dot com>
* - Forward port flush of page table updates on context switch or userret.  (kmacy, 2008-10-19, 1 file, -0/+9)
  - Forward port vfork XEN hack.
* - Make SCHED_STATS more generic by adding a wrapper to create the variables and sysctl nodes.  (jeff, 2008-04-17, 1 file, -2/+1)
  - In reset, walk the children of kern_sched_stats and reset the counters via the oid_arg1 pointer. This allows us to add arbitrary counters to the tree and still reset them properly.
  - Define a set of switch types to be passed with flags to mi_switch(). These types are named SWT_*. These types correspond to SCHED_STATS counters and are automatically handled in this way.
  - Make the new SWT_ types more specific than the older switch stats. There are now stats for idle switches, remote idle wakeups, remote preemption ithreads idling, etc.
  - Add switch statistics for ULE's pickcpu algorithm. These stats include how much migration there is, how often affinity was successful, how often threads were migrated to the local cpu on wakeup, etc.
  Sponsored by: Nokia
* - Add a new td flag TDF_NEEDSUSPCHK that is set whenever a thread needs to enter thread_suspend_check().  (jeff, 2008-03-21, 1 file, -17/+11)
  - Set TDF_ASTPENDING along with TDF_NEEDSUSPCHK so we can move the thread_suspend_check() to ast() rather than userret().
  - Check TDF_NEEDSUSPCHK in the sleepq_catch_signals() optimization so that we don't miss a suspend request. If this is set, use the expensive signal path.
  - Set NEEDSUSPCHK when creating a new thread in thr in case the creating thread is due to be suspended as well but has not yet.
  Reviewed by: davidxu (Authored original patch)
* Remove kernel support for M:N threading.  (jeff, 2008-03-12, 1 file, -23/+1)
  While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries, and static binaries will be broken.
* Kernel and hwpmc(4) support for callchain capture.  (jkoshy, 2007-12-07, 1 file, -0/+13)
  Sponsored by: FreeBSD Foundation and Google Inc.
* A bunch more files that should probably print out a thread name instead of a process name.  (julian, 2007-11-14, 1 file, -1/+1)
* - Move all of the PS_ flags into either p_flag or td_flags.  (jeff, 2007-09-17, 1 file, -16/+7)
  - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some time.
  - Allow swapout to try each thread in a process individually and then swapin the whole process if any of these fail. This allows us to move most scheduler related swap flags into td_flags.
  - Keep ki_sflag for backwards compat but change all in-source tools to use the new and more correct location of P_INMEM.
  Reported by: pho
  Reviewed by: attilio, kib
  Approved by: re (kensmith)
* - Include opt_sched.h for SCHED_STATS.  (jeff, 2007-06-12, 1 file, -0/+1)
* Commit 14/14 of sched_lock decomposition.  (jeff, 2007-06-05, 1 file, -8/+11)
  - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization.
  - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
  Tested by: kris, current@
  Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
  Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
* Do proper "locking" for missing vmmeters part.attilio2007-06-041-1/+1
| | | | | | | | Now, we assume no more sched_lock protection for some of them and use the distribuited loads method for vmmeter (distribuited through CPUs). Reviewed by: alc, bde Approved by: jeff (mentor)
* - Move rusage from being per-process in struct pstats to per-thread in td_ru. This removes the requirement for per-process synchronization in statclock() and mi_switch(), which was previously supported by sched_lock, which is going away. All modifications to rusage are now done in the context of the owning thread; reads proceed without locks.  (jeff, 2007-06-01, 1 file, -18/+2)
  - Aggregate exiting threads' rusage in thread_exit() such that the exiting thread's rusage is not lost.
  - Provide a new routine, rufetch(), to fetch an aggregate of all rusage structures from all threads in a process. This routine must be used in any place requiring a rusage from a process prior to its exit. The exited process's rusage is still available via p_ru.
  - Aggregate tick statistics only on demand via rufetch() or when a thread exits. Tick statistics are kept in the thread and protected by sched_lock until it exits.
  Initial patch by: attilio
  Reviewed by: attilio, bde (some objections), arch (mostly silent)
* Revert VMCNT_* operations introduction.  (attilio, 2007-05-31, 1 file, -1/+1)
  Probably, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately.
  Requested by: alc
  Approved by: jeff (mentor)
* - Define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts.  (jeff, 2007-05-18, 1 file, -1/+1)
  This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines.
  Contributed by: Attilio Rao <attilio@FreeBSD.org>
* Remove 'MPSAFE' annotations from the comments above most system calls: all system calls now enter without Giant held, and then in some cases, acquire Giant explicitly.  (rwatson, 2007-03-04, 1 file, -4/+2)
  Remove a number of other MPSAFE annotations in the credential code and tweak one or two other adjacent comments.
* Threading cleanup, part 2 of several.  (julian, 2006-12-06, 1 file, -11/+1)
  Make part of John Birrell's KSE patch permanent. Specifically, remove:
  - Any reference to the ksegrp structure. This feature was never fully utilised and made things overly complicated.
  - All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it.
  Also: since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable.
  The ULE scheduler compiles again but I have no idea if it works. The 4bsd scheduler still requires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit.
  Tested by: David Xu and Dan Eischen, using libthr and libpthread.
* Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE).  (jb, 2006-10-26, 1 file, -0/+12)
  Reviewed by: davidxu@
* Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h, begun with a repo-copy of mac.h to mac_framework.h.  (rwatson, 2006-10-22, 1 file, -1/+2)
  sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead.
  This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd.
  Obtained from: TrustedBSD Project
  Sponsored by: SPARTA
* kern_intr.c:  (bde, 2006-10-18, 1 file, -1/+1)
  - Count (scheduling of) software interrupts (SWIs) as SWIs, not as hardware interrupts.
  - Don't count (scheduling of) delayed SWIs as interrupts at all, since in the delayed case it is expected that there are many more scheduling calls than handling calls. Perhaps all interrupts should be counted only when they are handled, but it is only counts of delayed SWIs that should never be combined with the other counts.
  subr_trap.c:
  - Count (handling of) Asynchronous System Traps (ASTs) as traps, not as software interrupts.
  Before these changes, the counter for SWIs only counted ASTs, and SWIs weren't counted separately, but a subcounter for ASTs alone is less needed than for most other exception sources. 4.4BSD-Lite uses the counters for similar things (actually matching their names) on its main arches (hp300, ..., !i386) where more of the exceptions are in hardware.
* Test before modifying p_sflag to avoid unconditional cache line ping-pong on SMP.  (davidxu, 2006-02-10, 1 file, -2/+4)