path: root/sys/cddl/dev
Commit message [Author, Age, Files, Lines]
* Correct the types of the arguments to return probes of the syscall [rstone, 2011-11-11, 1 file, -2/+7]
| provider. Previously we were erroneously supplying the argument types of the corresponding entry probe.
|
| Reviewed by: rpaulo
| MFC after: 1 week
* On i386, fbt probes are implemented by writing an invalid opcode over [rstone, 2011-11-10, 1 file, -3/+3]
| certain instructions in a function prologue or epilogue. DTrace has a hook into the invalid opcode fault handler that checks whether the fault was due to a probe and if so, runs the DTrace magic. Upon returning from an invalid opcode fault caused by a probe, DTrace must emulate the instruction that was replaced with the invalid opcode and then return control to the instruction following the invalid opcode.
|
| There was a pair of related bugs in the emulation for the leave instruction. The leave instruction is used to pop off a stack frame prior to returning from a function. The emulation for this instruction must move the trap frame for the invalid opcode fault down the stack to the bottom of the stack frame that is being removed, and then execute an iret.
|
| At two points in this process, the emulation code was storing values above the current value of the stack pointer. This opened up a window in which, if we were to take an interrupt, the trap frame for the interrupt would overwrite the values stored on the stack, causing the system to panic later.
|
| The first bug was that at one point the emulation code saved the new value for $esp above the current stack pointer value. The fix is to save this value inside the original trap frame instead. At this point we do not need the original trap frame, so this is safe.
|
| The second bug was that when the emulation code loaded $esp from the stack, it pointed part-way through the new trap frame instead of at its beginning. The emulation code adjusted the stack pointer to the correct value immediately afterwards, but this still left a one-instruction window in which an interrupt would corrupt this trap frame. Fix this by adjusting the stack frame value before loading it into $esp.
|
| This fixes panics in invop_leave on i386 when using fbt return probes.
|
| Reviewed by: rpaulo, attilio
| MFC after: 1 week
* Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. [ed, 2011-11-07, 2 files, -2/+2]
| This means that their use is restricted to a single C file.
* Define dtrace_cmpset_long in terms of atomic_cmpset_long [marcel, 2011-10-16, 1 file, -43/+3]
| and not by virtue of inline assembly. Now this file compiles on all supported architectures.
* With retirement of cpumask_t and usage of cpuset_t for representing a [attilio, 2011-07-04, 2 files, -4/+4]
| mask of CPUs, pc_other_cpus and pc_cpumask become highly inefficient.
|
| Remove them and replace their usage with custom pc_cpuid magic (as, atm, pc_cpumask can be easily represented by (1 << pc_cpuid) and pc_other_cpus by (all_cpus & ~(1 << pc_cpuid))).
|
| This change is not targeted for MFC because of the removal of struct pcpu members and its dependency on the cpumask_t retirement.
|
| MD review by: marcel, marius, alc
| Tested by: pluknet
| MD testing by: marcel, marius, gonzo, andreast
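The two mask identities quoted in the message above can be checked with a small standalone C sketch. The type and function names (sketch_cpumask_t, sketch_self_mask, sketch_other_mask) are hypothetical and are not the kernel's; a 32-bit mask is assumed, matching the old cpumask_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the retired 32-bit cpumask_t. */
typedef uint32_t sketch_cpumask_t;

/* Mask with only this CPU's bit set: (1 << pc_cpuid). */
static sketch_cpumask_t
sketch_self_mask(unsigned int cpuid)
{
	assert(cpuid < 32);	/* a 32-bit mask cannot name CPU 32 or above */
	return ((sketch_cpumask_t)1 << cpuid);
}

/* Every CPU except this one: all_cpus & ~(1 << pc_cpuid). */
static sketch_cpumask_t
sketch_other_mask(sketch_cpumask_t all_cpus, unsigned int cpuid)
{
	return (all_cpus & ~sketch_self_mask(cpuid));
}
```

Deriving both masks from the CPU id on demand means nothing mask-sized has to be stored per CPU, which is presumably why the stored members became a liability once the mask type could grow.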
* MFC [attilio, 2011-05-16, 2 files, -20/+32]
|
* MFC [attilio, 2011-05-10, 2 files, -42/+2]
|\
| * dtrace: remove unused code [avg, 2011-05-10, 2 files, -42/+2]
| | Which is also useless, IMO.
| |
| | MFC after: 5 days
* | Commit the support for removing cpumask_t and replacing it directly with [attilio, 2011-05-05, 3 files, -9/+14]
|/  cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for more than 32 CPUs (the limit today).
|
| Right now, cpumask_t is an int, 32 bits on all our supported architectures. cpuset_t, on the other hand, is implemented as an array of longs and is easily extensible by definition.
|
| The architectures touched by this commit are the following:
|   - amd64
|   - i386
|   - pc98
|   - arm
|   - ia64
|   - XEN
| while the others are still missing. Userland is believed to be fully converted with the changes contained here.
|
| Some technical notes:
|   - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future)
|   - per-cpu members, which are now converted to cpuset_t, need to be accessed avoiding migration, because the size of cpuset_t should be considered unknown
|   - size of cpuset_t objects is different between kernel and userland (this is primarily done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from userland, please refer to the examples in this patch on how to do that correctly (kgdb may be a good source, for example).
|   - Support for other architectures is going to be added soon
|   - Only MAXCPU for amd64 is bumped now
|
| The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to come soon. pluknet tested the patch with his 8-ways on both amd64 and i386.
|
| Tested by: pluknet, sbruno, gianni, Nicholas Esborn
| Reviewed by: jeff, jhb, sbruno
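The "array of longs" representation described above can be illustrated with a minimal C sketch. This is not the kernel's cpuset_t or its macros; the names (sketch_cpuset_t, sketch_zero, sketch_set, sketch_isset) and the limit SKETCH_MAXCPU = 64 are assumptions chosen only to show why such a set is not capped at 32 CPUs.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define SKETCH_MAXCPU		64	/* assumed limit, for illustration only */
#define SKETCH_WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS \
	((SKETCH_MAXCPU + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)

/* Hypothetical CPU set: one bit per CPU, spread over an array of longs. */
typedef struct {
	unsigned long bits[SKETCH_WORDS];
} sketch_cpuset_t;

/* Clear every CPU in the set. */
static void
sketch_zero(sketch_cpuset_t *set)
{
	memset(set, 0, sizeof(*set));
}

/* Add one CPU: pick the word, then the bit inside that word. */
static void
sketch_set(sketch_cpuset_t *set, unsigned int cpu)
{
	assert(cpu < SKETCH_MAXCPU);
	set->bits[cpu / SKETCH_WORD_BITS] |= 1UL << (cpu % SKETCH_WORD_BITS);
}

/* Return nonzero if the CPU is a member of the set. */
static int
sketch_isset(const sketch_cpuset_t *set, unsigned int cpu)
{
	assert(cpu < SKETCH_MAXCPU);
	return (((set->bits[cpu / SKETCH_WORD_BITS] >>
	    (cpu % SKETCH_WORD_BITS)) & 1UL) != 0);
}
```

Raising the limit in this sketch only changes the array length, not the access code, which mirrors the "simple bump of MAXCPU" the message refers to.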
* Stripped '32' suffix from linux systrace module name on i386. [art, 2011-04-08, 1 file, -2/+3]
| Approved by: avg
* Use atomic load & store for TSC frequency. It may be overkill for amd64 but [jkim, 2011-04-07, 2 files, -2/+2]
| safer for i386 because it can be easily over 4 GHz now. Worse, it can easily be changed by the user with the 'machdep.tsc_freq' tunable (directly) or cpufreq(4) (indirectly). Note it is intentionally not used in performance-critical paths to avoid performance regression (but we should, in theory). Alternatively, we may add a "virtual TSC" with lower frequency if the maximum frequency overflows 32 bits (and ignore possible incoherency as we do now).
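A frequency above 4 GHz no longer fits in 32 bits, so on a 32-bit target a plain 64-bit read may be split into two accesses and observe a half-updated value. The sketch below shows the load/store pattern using C11 atomics as a stand-in; the kernel uses its own atomic primitives, and the names sketch_tsc_freq, sketch_set_tsc_freq, and sketch_get_tsc_freq are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit frequency variable, in Hz. */
static _Atomic uint64_t sketch_tsc_freq;

/* Publish a new frequency as one indivisible 64-bit store. */
static void
sketch_set_tsc_freq(uint64_t hz)
{
	assert(hz != 0);	/* a zero frequency would be meaningless */
	atomic_store(&sketch_tsc_freq, hz);
}

/* Read the frequency as one indivisible 64-bit load. */
static uint64_t
sketch_get_tsc_freq(void)
{
	return (atomic_load(&sketch_tsc_freq));
}
```

On a 64-bit target an aligned 64-bit access is typically already indivisible, which fits the message's remark that the change may be overkill for amd64.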
* add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls [avg, 2011-03-12, 1 file, -18/+45]
| Add systrace_linux32 and systrace_freebsd32 modules which provide support for tracing compat system calls in addition to native system call tracing provided by systrace module.
|
| Provided that all the systrace modules are loaded, you can now select which syscalls to trace in the following manner:
|   syscall::xxx:yyy - work on all system calls that match the specification
|   syscall:freebsd:xxx:yyy - only native system calls
|   syscall:linux32:xxx:yyy - linux32 compat system calls
|   syscall:freebsd32:xxx:yyy - freebsd32 compat system calls on amd64
|
| PR: kern/152822
| Submitted by: Artem Belevich <fbsdlist@src.cx>
| Reviewed by: jhb (earlier version)
| MFC after: 3 weeks
* Fix typos - remove duplicate "the". [brucec, 2011-02-21, 2 files, -2/+2]
| PR: bin/154928
| Submitted by: Eitan Adler <lists at eitanadler.com>
| MFC after: 3 days
* cyclic xcall: use smp_no_rendevous_barrier as setup function parameter [avg, 2010-12-17, 1 file, -2/+2]
| In this case we call the target function only on a single CPU and do not need any synchronization at the setup stage.
|
| It's a bit non-obvious, but a setup function of NULL means that smp_rendezvous_cpus waits for all CPUs to arrive at the rendezvous point, without doing any actual setup. Using smp_no_rendevous_barrier, by contrast, means that each CPU proceeds on its own schedule without any synchronization whatsoever.
|
| MFC after: 3 weeks
* opensolaris cyclic: fix deadlock and make a little bit closer to upstream [avg, 2010-12-07, 1 file, -175/+164]
| The deadlock was caused in the following way:
|   - thread T1 on CPU C1 holds a spin mutex, IPIs CPU C2 and waits for the IPI to be handled
|   - C2 executes timer interrupt filter, thus has interrupts disabled, and gets blocked on the spin mutex held by T1
|
| The problem seems to have been introduced by simplifications made to OpenSolaris code during porting. The problem is fixed by reorganizing the code to more closely resemble the upstream version.
|
| Interrupt filter (cyclic_fire) now doesn't acquire any locks; all per-CPU data accesses are performed on a target CPU with preemption and interrupts disabled, thus precluding concurrent access to the data.
|
| cyp_mtx spin mutex is used to disable preemption and interrupts; it's not used for classical mutual exclusion, because xcall already serializes calls to a CPU. It's an emulation of the OpenSolaris cyb_set_level(CY_HIGH_LEVEL) call; the spin mutexes could probably be reduced to just a spinlock_enter()/_exit() pair.
|
| Diff with upstream version is now reduced by ~500 lines, however it still remains quite large - many things that are not needed (at the moment) or are irrelevant on FreeBSD were simply ripped out during porting. Examples of such things:
|   - support for CPU onlining/offlining
|   - support for suspend/resume
|   - support for running callouts at soft interrupt levels
|   - support for callout rebinding from CPU to CPU
|   - support for CPU partitions
|
| Tested by: Artem Belevich <fbsdlist@src.cx>
| MFC after: 3 weeks
| X-MFC with: r216252
* opensolaris cyclic xcall: no need for special handling of curcpu [avg, 2010-12-07, 1 file, -9/+3]
| smp_rendezvous_cpus already properly handles current CPU case and non-SMP case.
|
| MFC after: 3 weeks
* dtrace_xcall: no need for special handling of curcpu [avg, 2010-12-07, 2 files, -32/+6]
| smp_rendezvous_cpus already does the right thing in a very similar fashion, so the code was kind of duplicating that.
|
| MFC after: 3 weeks
* dtrace_gethrtime_init: pin to master while examining other CPUs [avg, 2010-12-07, 2 files, -8/+10]
| Also use pc_cpumask to be future-friendly.
|
| Reviewed by: jhb
| MFC after: 2 weeks
* Make the /dev/dtrace/helper node have the mode 0660. This allows [rpaulo, 2010-09-01, 1 file, -1/+1]
| programs that refuse to run as root (pgsql) to install probes when their user is part of the wheel group.
|
| Sponsored by: The FreeBSD Foundation
* Destroy the helper device when unloading. [rpaulo, 2010-08-22, 1 file, -0/+1]
| Sponsored by: The FreeBSD Foundation
* Add more compatibility structure members needed by the upcoming fasttrap [rpaulo, 2010-08-22, 1 file, -3/+33]
| DTrace device.
|
| Sponsored by: The FreeBSD Foundation
* Kernel DTrace support for: [rpaulo, 2010-08-22, 7 files, -221/+585]
|   o uregs (sson@)
|   o ustack (sson@)
|   o /dev/dtrace/helper device (needed for USDT probes)
|
| The work done by me was:
| Sponsored by: The FreeBSD Foundation
* Add a compatibility function dtrace_instr_size_isa() that on [rpaulo, 2010-08-22, 2 files, -0/+14]
| FreeBSD does the same as dtrace_dis_isize().
|
| Sponsored by: The FreeBSD Foundation
* Update several places that iterate over CPUs to use CPU_FOREACH(). [jhb, 2010-06-11, 5 files, -22/+7]
|
* Reorganize syscall entry and leave handling. [kib, 2010-05-23, 1 file, -1/+0]
| Extend struct sysvec with three new elements:
|   sv_fetch_syscall_args - the method to fetch syscall arguments from usermode into struct syscall_args. The structure is machine-dependent (this might be reconsidered after all architectures are converted).
|   sv_set_syscall_retval - the method to set a return value for usermode from the syscall. It is a generalization of cpu_set_syscall_retval(9) to allow ABIs to override the way to set a return value.
|   sv_syscallnames - the table of syscall names.
|
| Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding the call to cpu_set_syscall_retval().
|
| The new functions syscallenter(9) and syscallret(9) are provided that use sv_*syscall* pointers and contain the common repeated code from the syscall() implementations for the architecture-specific syscall trap handlers.
|
| Syscallenter() fetches arguments, calls the syscall implementation from the ABI sysent table, and sets up the return frame. The end-of-syscall bookkeeping is done by syscallret().
|
| Take advantage of the single place for MI syscall handling code and implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the thread is stopped at the syscall entry or return point respectively. The EXEC flag augments SCX and notifies the debugger that the process address space was changed by one of the exec(2)-family syscalls.
|
| The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are changed to use syscallenter()/syscallret(). MIPS and arm are not converted and use the mostly unchanged syscall() implementation.
|
| Reviewed by: jhb, marcel, marius, nwhitehorn, stas
| Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc), stas (mips)
| MFC after: 1 month
* Rename the cyclic global variable lapic_cyclic_clock_func to just [rpaulo, 2010-04-20, 1 file, -3/+3]
| cyclic_clock_func. This will make more sense when we start developing a non-x86 cyclic version.
* The amd64 version of the cyclic dtrace module is a verbatim copy of the [rpaulo, 2010-04-20, 1 file, -133/+0]
| i386 version, so instead of having a copy of the same file, use Makefile foo to include the i386 version on amd64.
* dtrace_gethrtime: improve scaling of TSC ticks to nanoseconds [avg, 2009-07-15, 2 files, -4/+95]
| Currently dtrace_gethrtime uses a formula similar to the following for converting TSC ticks to nanoseconds:
|   rdtsc() * 10^9 / tsc_freq
| The dividend overflows a 64-bit type and wraps around every 2^64/10^9 = 18446744073 ticks, which is just a few seconds on modern machines.
|
| Now we instead use a precalculated scaling factor of 10^9*2^N/tsc_freq < 2^32 and perform the TSC value multiplication separately for each 32-bit half. This avoids the overflow of the dividend described above. The idea is taken from OpenSolaris.
|
| This has the added feature of always scaling the TSC with an invariant value regardless of TSC frequency changes. Thus the timestamps will not be accurate if the TSC frequency actually changes, but they are always proportional to TSC ticks and thus monotonic. This should be much better than the current formula, which produces wildly different non-monotonic results when tsc_freq changes.
|
| Also drop the write-only 'cp' variable from amd64 dtrace_gethrtime_init() to make it identical to the i386 twin.
|
| PR: kern/127441
| Tested by: Thomas Backman <serenity@exscape.org>
| Reviewed by: jhb
| Discussed with: current@, bde, gnn
| Silence from: jb
| Approved by: re (gnn)
| MFC after: 1 week
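The scaling scheme described above can be sketched in standalone C as follows. This is an illustration under stated assumptions, not the committed code: the shift N = 28 (SKETCH_SHIFT) and all function names are hypothetical, and the naive formula is included only to show the wrap-around.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NANOSEC	1000000000ULL
#define SKETCH_SHIFT	28	/* assumed value of N; not taken from the log */

/* Naive conversion: tsc * 10^9 wraps once tsc exceeds 2^64 / 10^9 ticks. */
static uint64_t
sketch_naive_nsec(uint64_t tsc, uint64_t tsc_freq)
{
	return (tsc * SKETCH_NANOSEC / tsc_freq);
}

/* Precompute 10^9 * 2^N / tsc_freq; the scheme needs it to fit in 32 bits. */
static uint64_t
sketch_scale_init(uint64_t tsc_freq)
{
	uint64_t scale;

	assert(tsc_freq != 0);
	scale = (SKETCH_NANOSEC << SKETCH_SHIFT) / tsc_freq;
	assert(scale < (1ULL << 32));
	return (scale);
}

/*
 * Multiply each 32-bit half of the tick count by the scale separately,
 * so that neither intermediate product can exceed 64 bits.
 */
static uint64_t
sketch_scaled_nsec(uint64_t tsc, uint64_t scale)
{
	uint64_t lo = tsc & 0xffffffffULL;
	uint64_t hi = tsc >> 32;

	return (((lo * scale) >> SKETCH_SHIFT) +
	    ((hi * scale) << (32 - SKETCH_SHIFT)));
}
```

With a 1 GHz counter the scale is exactly 2^28, so sketch_scaled_nsec() returns the tick count unchanged, while sketch_naive_nsec() already gives a wrong answer for 2 * 10^10 ticks (20 seconds). With N = 28 the 32-bit bound on the scale holds only for frequencies above 62.5 MHz, which is one reason the shift is left as a tunable assumption here.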
* dtrace/amd64: fix virtual address checks [avg, 2009-06-24, 2 files, -9/+6]
| On amd64 KERNBASE/kernbase does not mean start of kernel memory. This should fix a KASSERT panic in dtrace_copycheck when copyin*() is used in a D program.
|
| Also make checks for user memory a bit stricter.
|
| Reported by: Thomas Backman <serenity@exscape.org>
| Submitted by: wxs (kaddr part)
| Tested by: Thomas Backman (prototype), wxs
| Reviewed by: alc (concept), jhb, current@
| Approved by: jb (concept)
| MFC after: 2 weeks
| PR: kern/134408
* Add the OpenSolaris dtrace lockstat provider. The lockstat provider [sson, 2009-05-26, 1 file, -0/+327]
| adds probes for mutexes, reader/writer and shared/exclusive locks to gather contention statistics and other locking information for dtrace scripts, the lockstat(1M) command and other potential consumers.
|
| Reviewed by: attilio jhb jb
| Approved by: gnn (mentor)
* Move dtnfsclient.c in the cddl tree to nfs_kdtrace.c in the nfsclient [rwatson, 2009-03-25, 1 file, -545/+0]
| directory, since it's under a BSD license, and this keeps NFS internals-aware tracing parts close to NFS.
|
| MFC after: 1 month
| Suggested by: jhb
* Add DTrace probes to the NFS access and attribute caches. Access cache [rwatson, 2009-03-24, 1 file, -54/+256]
| events are:
|   nfsclient:accesscache:flush:done
|   nfsclient:accesscache:get:hit
|   nfsclient:accesscache:get:miss
|   nfsclient:accesscache:load:done
| They pass the vnode, uid, and requested or loaded access mode (if any); the load event may also report a load error if the RPC fails.
|
| The attribute cache events are:
|   nfsclient:attrcache:flush:done
|   nfsclient:attrcache:get:hit
|   nfsclient:attrcache:get:miss
|   nfsclient:attrcache:load:done
| They pass the vnode, optionally the vattr if one is present (hit or load), and in the case of a load event, also a possible RPC error.
|
| MFC after: 1 month
| Sponsored by: Google, Inc.
* Add dtnfsclient, a first cut at an NFSv2/v3 client request DTrace [rwatson, 2009-03-22, 1 file, -0/+343]
| provider. The NFS client exposes 'start' and 'done' probes for NFSv2 and NFSv3 RPCs when using the new RPC implementation, passing in the vnode, mbuf chain, credential, and NFSv2 or NFSv3 procedure number. For 'done' probes, the error number is also available.
|
| Probes are named in the following way:
|   ...
|   nfsclient:nfs2:write:start
|   nfsclient:nfs2:write:done
|   ...
|   nfsclient:nfs3:access:start
|   nfsclient:nfs3:access:done
|   ...
|
| Access to the unmarshalled arguments is not easily available at this point in the stack, but the passed probe arguments are sufficient to do a lot of interesting things in practice. Technically, these probes may cover multiple RPC retransmits, and even transactions if the transaction ID changes as a result of authentication failure or a jukebox error from the server, but usefully capture the intent of a single NFS request, such as access, getattr, write, etc.
|
| Typical use might involve profiling RPC latency by system call, number of RPCs, how often a getattr leads to a call to access, when failed access control checks occur, etc. More detailed RPC information might best be provided by adding a krpc provider. It would also be useful to add NFS client probes for events such as the access cache or attribute cache satisfying requests without an RPC.
|
| Sponsored by: Google, Inc.
| MFC after: 1 month
* Remove unused variable. [ganbold, 2008-11-25, 2 files, -4/+2]
| Found with: Coverity Prevent(tm)
| CID: 3669,3671
| Approved by: jb
* Merge latest DTrace changes from Perforce. [rodrigc, 2008-11-05, 4 files, -26/+88]
|
* Remove unit2minor() use from kernel code. [ed, 2008-09-26, 1 file, -1/+1]
| When I changed kern_conf.c three months ago I made device unit numbers equal to (unneeded) device minor numbers. We used to require bitshifting, because there were eight bits in the middle that were reserved for a device major number. Not very long after, I turned dev2unit(), minor(), unit2minor() and minor2unit() into macros. The unit2minor() and minor2unit() macros were no-ops.
|
| We'd better not remove these four macros from the kernel, because there is a lot of (external) code that may still depend on them. For now it's harmless to remove all invocations of unit2minor() and minor2unit().
|
| Reviewed by: kib
* The cyclic timer device. This is a cut down version of the one in [jb, 2008-05-23, 4 files, -0/+1988]
| OpenSolaris. We don't have the lock levels that they do, so this is just hooked into clock interrupts.
* Custom DTrace kernel module files plus FreeBSD-specific DTrace providers. [jb, 2008-05-23, 30 files, -0/+15340]