summaryrefslogtreecommitdiffstats
path: root/sys/dev/hwpmc/hwpmc_mod.c
Commit message (Collapse)AuthorAgeFilesLines
* MFC r308480: pmc_process_csw_out: ignore deleted countersavg2016-12-141-2/+2
|
* MFC r298931, r298981, r299375:pfg2016-05-171-7/+7
| | | | | | | Minor spelling fixes in: sys/dev, sys/sys Many of these have user-visible strings.
* MFC r295352:kib2016-03-121-4/+4
| | | | | | | Do not call vn_fullpath(9) (through the pmc_getfilename() wrapper) when its result is immediately ignored, i.e. for kernel processes forked from the user process. Do not test for non-null before freeing string.
* MFC r290811:jtl2016-01-141-44/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix hwpmc "stalled" behavior Currently, there is a single pm_stalled flag that tracks whether a performance monitor was "stalled" due to insufficent ring buffer space for samples. However, because the same performance monitor can run on multiple processes or threads at the same time, a single pm_stalled flag that impacts them all seems insufficient. In particular, you can hit corner cases where the code fails to stop performance monitors during a context switch out, because it thinks the performance monitor is already stopped. However, in reality, it may be that only the monitor running on a different CPU was stalled. This patch attempts to fix that behavior by tracking on a per-CPU basis whether a PM desires to run and whether it is "stalled". This lets the code make better decisions about when to stop PMs and when to try to restart them. Ideally, we should avoid the case where the code fails to stop a PM during a context switch out. MFC r290813: Optimizations to the way hwpmc gathers user callchains Changes to the code to gather user stacks: * Delay setting pmc_cpumask until we actually have the stack. * When recording user stack traces, only walk the portion of the ring that should have samples for us. MFC r290929: Change the driver stats to what they really are: unsigned values. When pmcstat exits after some samples were dropped, give the user an idea of how many were lost. (Granted, these are global numbers, but they may still help quantify the scope of the loss.) MFC r290930: Improve accuracy of PMC sampling frequency The code tracks a counter which is the number of events until the next sample. On context switch in, it loads the saved counter. On context switch out, it tries to calculate a new saved counter. Problems: 1. The saved counter was shared by all threads in a process. However, this means that all threads would be initially loaded with the same saved counter. However, that could result in sampling more often than once every X number of events. 2. The calculation to determine a new saved counter was backwards. It added when it should have subtracted, and subtracted when it should have added. Assume a single-threaded process with a reload count of 1000 events. Assuming the counter on context switch in was 100 and the counter on context switch out was 50 (meaning the thread has "consumed" 50 more events), the code would calculate a new saved counter of 150 (instead of the proper 50). Fix: 1. As soon as the saved counter is used to initialize a monitor for a thread on context switch in, set the saved counter to the reload count. That way, subsequent threads to use the saved counter will get the full reload count, assuring we sample at least once every X number of events (across all threads). 2. Change the calculation of the saved counter. Due to the change to the saved counter in #1, we simply need to add (modulo the reload count) the remaining counter time we retrieve from the CPU when a thread is context switched out. MFC r291016: Support a wider history counter in pmcstat(8) gmon output pmcstat(8) contains an option to output sampling data in a gmon format compatible with gprof(1). Currently, it uses the default histcounter, which is an (unsigned short). With large sets of sampling data, it is possible to overflow the maximum value provided by an (unsigned short). This change adds the -e argument to pmcstat. If -e and -g are both specified, pmcstat will use a histcounter type of uint64_t. MFC r291017: Fix the date on the pmcstat(8) man page from r291016.
* MFC r283924vangyzen2015-10-021-3/+3
| | | | | | | | | | | | | | | Provide vnode in memory map info for files on tmpfs When providing memory map information to userland, populate the vnode pointer for tmpfs files. Set the memory mapping to appear as a vnode type, to match FreeBSD 9 behavior. This fixes the use of tmpfs files with the dtrace pid provider, procstat -v, procfs, linprocfs, pmc (pmcstat), and ptrace (PT_VM_ENTRY). Submitted by: Eric Badger <eric@badgerio.us> (initial revision) Obtained from: Dell Inc. PR: 198431
* MFC 283123:jhb2015-06-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fix two bugs that could result in PMC sampling effectively stopping. In both cases, the the effect of the bug was that a very small positive number was written to the counter. This means that a large number of events needed to occur before the next sampling interrupt would trigger. Even with very frequently occurring events like clock cycles wrapping all the way around could take a long time. Both bugs occurred when updating the saved reload count for an outgoing thread on a context switch. First, the counter-independent code compares the current reload count against the count set when the thread switched in and generates a delta to apply to the saved count. If this delta causes the reload counter to go negative, it would add a full reload interval to wrap it around to a positive value. The fix is to add the full reload interval if the resulting counter is zero. Second, occasionally the raw counter value read during a context switch has actually wrapped, but an interrupt has not yet triggered. In this case the existing logic would return a very large reload count (e.g. 2^48 - 2 if the counter had overflowed by a count of 2). This was seen both for fixed-function and programmable counters on an E5-2643. Workaround this case by returning a reload count of zero. PR: 198149 Sponsored by: Norse Corp, Inc.
* MFC 282641,282658:jhb2015-06-011-83/+83
| | | | | | | | - Move hwpmc(4) debugging code under a new HWPMC_DEBUG option instead of the broader DEBUG option. - Convert hwpmc(4) debug printfs over to KTR. Sponsored by: Norse Corp, Inc.
* MFC of r277177 and r279894 with the fixes for the PMC for Haswell.rrs2015-03-241-0/+5
| | | | Sponsored by: Netflix Inc.
* Clamp too large hwpmc callchaindepth to maximumemaste2015-02-151-2/+3
| | | | | | | | If the depth requested by the user is too large, it's better to provide the maximum than the smaller default. MFC of: r274766 Sponsored by: The FreeBSD Foundation
* MFC r273236:markj2014-11-061-13/+9
| | | | | | Use pmc_destroy_pmc_descriptor() to actually free the pmc, which is consistent with pmc_destroy_owner_descriptor(). Also be sure to destroy PMCs if a process exits or execs without explicitly releasing them.
* Remove local change leftover, this should never have been part ofdavide2013-09-201-2/+0
| | | | | | | r255745. Pointy-hat to: davide Approved by: re (implicit)
* Fix lc_lock/lc_unlock() support for rmlocks held in shared mode. Withdavide2013-09-201-0/+2
| | | | | | | | | | | | | | | current lock classes KPI it was really difficult because there was no way to pass an rmtracker object to the lock/unlock routines. In order to accomplish the task, modify the aforementioned functions so that they can return (or pass as argument) an uinptr_t, which is in the rm case used to hold a pointer to struct rm_priotracker for current thread. As an added bonus, this fixes rm_sleep() in the rm shared case, which right now can communicate priotracker structure between lc_unlock()/lc_lock(). Suggested by: jhb Reviewed by: jhb Approved by: re (delphij)
* Complete r250105. Do not zero fields if M_ZERO flag is specified todavide2013-09-011-6/+0
| | | | | | malloc(9). Reported by: pluknet, glebius
* Rename the kld_unload event handler to kld_unload_try, and add a newmarkj2013-08-241-58/+53
| | | | | | | | | | | | | | kld_unload event handler which gets invoked after a linker file has been successfully unloaded. The kld_unload and kld_load event handlers are now invoked with the shared linker lock held, while kld_unload_try is invoked with the lock exclusively held. Convert hwpmc(4) to use these event handlers instead of having kern_kldload() and kern_kldunload() invoke hwpmc(4) hooks whenever files are loaded or unloaded. This has no functional effect, but simplifes the linker code somewhat. Reviewed by: jhb
* Relax the vm object locking. Use a read lock.alc2013-06-051-10/+10
| | | | Sponsored by: EMC / Isilon Storage Division
* malloc(9) cannot return NULL if M_WAITOK flag is specified.davide2013-04-301-14/+5
|
* Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() toattilio2013-02-201-10/+10
| | | | | | their "write" versions. Sponsored by: EMC / Isilon storage division
* Switch vm_object lock to be a rwlock.attilio2013-02-201-0/+1
| | | | | | | | * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h
* Quiesce a couple of clang warningssbruno2013-01-121-1/+1
| | | | | Submitted by: hiren panchasara <hiren.panchasara@gmail.com> Obtained from: Yahoo! Inc
* Fixup r240246: hwpmc needs to retain the pinning until ASTs are notattilio2012-10-301-1/+1
| | | | | | | | | | | | | executed. This means past the point where userret() is generally executed. Skip the td_pinned check if a callchain tracing is currently happening and add a more robust check to pmc_capture_user_callchain() in order to catch td_pinned leak past ast() in hwpmc case. Reported and tested by: fabient MFC after: 1 week X-MFC: r240246
* Remove the support for using non-mpsafe filesystem modules.kib2012-10-221-3/+0
| | | | | | | | | | | | In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho
* Add software PMC support.fabient2012-03-281-38/+213
| | | | | | | | | | | | | New kernel events can be added at various location for sampling or counting. This will for example allow easy system profiling whatever the processor is with known tools like pmcstat(8). Simultaneous usage of software PMC and hardware PMC is possible, for example looking at the lock acquire failure, page fault while sampling on instructions. Sponsored by: NETASQ MFC after: 1 month
* Add a flush of the current PMC log buffer before displaying the next top.fabient2011-10-181-2/+20
| | | | | | | | As the underlying block is 4KB if the PMC throughput is low the measurement will be reported on the next tick. pmcstat(8) use the modified flush API to reclaim current buffer before displaying next top. MFC after: 1 month
* In order to maximize the re-usability of kernel code in user space thiskmacy2011-09-161-2/+2
| | | | | | | | | | | | | patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)
* Commit the support for removing cpumask_t and replacing it directly withattilio2011-05-051-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architecture. cpumask_t on the other side is implemented as an array of longs, and easilly extendible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, needs to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primirally done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to came soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno
* Fix a typo/error.attilio2011-04-301-1/+1
|
* Remove unnecessary usage of memory barriers when dealing withattilio2011-04-301-3/+3
| | | | | | pmc_cpumask. Discussed with: fabient
* Convert pm_runcount to int to correctly check for negative value.fabient2010-06-051-8/+5
| | | | | | | Remove uncessary check for error. Found with: Coverity Prevent(tm) MFC after: 1 month
* When configuring a system-wide couting PMC, hwpmc was incorrectly logging ↵rstone2010-05-011-9/+9
| | | | | | | | | process mappings for that PMC. Nothing ever reads pmc logs out of a counting PMC, so the log buffers were leaked when the PMC was deconfigured. The process mappings are only useful for sampling PMCs anyway, so only log the mappings if the PMC is a sampling PMC. This bug would cause allocating sample-mode PMCs to fail with ENOMEM after allocating several counting-mode PMCs. Approved by: jkoshy (mentor) MFC after: 2 weeks
* If there is multiple PMCs for the same interrupt ignore new post.fabient2010-03-311-3/+5
| | | | | | | This will indirectly fix a bug where the thread will be pinned forever if the assert is not compiled. MFC after: 3days
* Use VFS_{LOCK,UNLOCK}_GIANT() around the call to vrele().jkoshy2009-12-291-0/+6
| | | | Reviewed by: kib
* Log process mappings for existing processes at PMC start time.jkoshy2009-12-261-3/+161
| | | | | Submitted by: Marc Unangst <mju at panasas dot com> [original patch] Tested by: fabient
* Use switch out (SWO) instead of switch in (SWI) debug log mask in csw_out.emaste2009-11-301-1/+1
|
* Handle the case where there is only one PMC in the system.fabient2009-10-211-4/+4
| | | | | Approved by: jkoshy (mentor) MFC after: 3 days
* Fix KASSERT string to include the real module name.rpaulo2009-10-181-1/+1
|
* Fix a LOR between pmc_sx and proctree/allproc when creating a new threadattilio2009-06-251-7/+12
| | | | | | | | for the pmclog. Reported by: Ryan Stone <rstone at sandvine dot com> Tested by: Ryan Stone <rstone at sandvine dot com> Sponsored by: Sandvine Incorporated
* - Bug fix: prevent a thread from migrating between CPUs between thejkoshy2008-12-131-16/+60
| | | | | | | | | | | | | time it is marked for user space callchain capture in the NMI handler and the time the callchain capture callback runs. - Improve code and control flow clarity by invoking hwpmc(4)'s user space callchain capture callback directly from low-level code. Reviewed by: jhb (kern/subr_trap.c) Testing (various patch revisions): gnn, Fabien Thomas <fabien dot thomas at netasq dot com>, Artem Belevich <artemb at gmail dot com>
* - Add support for PMCs in Intel CPUs of Family 6, model 0xE (Core Solojkoshy2008-11-271-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and Core Duo), models 0xF (Core2), model 0x17 (Core2Extreme) and model 0x1C (Atom). In these CPUs, the actual numbers, kinds and widths of PMCs present need to queried at run time. Support for specific "architectural" events also needs to be queried at run time. Model 0xE CPUs support programmable PMCs, subsequent CPUs additionally support "fixed-function" counters. - Use event names that are close to vendor documentation, taking in account that: - events with identical semantics on two or more CPUs in this family can have differing names in vendor documentation, - identical vendor event names may map to differing events across CPUs, - each type of CPU supports a different subset of measurable events. Fixed-function and programmable counters both use the same vendor names for events. The use of a class name prefix ("iaf-" or "iap-" respectively) permits these to be distinguished. - In libpmc, refactor pmc_name_of_event() into a public interface and an internal helper function, for use by log handling code. - Minor code tweaks: staticize a global, freshen a few comments. Tested by: gnn
* Print PMC widths in the initialization announcement.jkoshy2008-11-161-1/+2
|
* Correct an oversight: call the MD finalize hook at module unloadjkoshy2008-11-151-0/+3
| | | | time.
* - Separate PMC class dependent code from other kinds of machinejkoshy2008-11-091-122/+230
| | | | | | | | | | | | | | | | | dependencies. A 'struct pmc_classdep' structure describes operations on PMCs; 'struct pmc_mdep' contains one or more 'struct pmc_classdep' structures depending on the CPU in question. Inside PMC class dependent code, row indices are relative to the PMCs supported by the PMC class; MI code in "hwpmc_mod.c" translates global row indices before invoking class dependent operations. - Augment the OP_GETCPUINFO request with the number of PMCs present in a PMC class. - Move code common to Intel CPUs to file "hwpmc_intel.c". - Move TSC handling to file "hwpmc_tsc.c".
* Remove unnecessary locking around vn_fullpath(). The vnode lock for thejhb2008-11-041-2/+0
| | | | | | | | | | | | | | | | vnode in question does not need to be held. All the data structures used during the name lookup are protected by the global name cache lock. Instead, the caller merely needs to ensure a reference is held on the vnode (such as vhold()) to keep it from being freed. In the case of procfs' <pid>/file entry, grab the process lock while we gain a new reference (via vhold()) on p_textvp to fully close races with execve(2). For the kern.proc.vmmap sysctl handler, use a shared vnode lock around the call to VOP_GETATTR() rather than an exclusive lock. MFC after: 1 month
* Fix a number of style issues in the MALLOC / FREE commit. I've tried todes2008-10-231-12/+10
| | | | | be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.
* Retire the MALLOC and FREE macros. They are an abomination unto style(9).des2008-10-231-41/+37
| | | | MFC after: 3 months
* Support sparsely numbered CPUs.jkoshy2008-09-221-38/+43
| | | | Requested by: obrien, alfred (long ago)
* - Provide kernelname as the name for process with P_KTHREAD set asjeff2008-07-251-1/+5
| | | | | | | otherwise their textvp is NULL. Reviewed by: jkoshy Sponsored by: Nokia
* VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used inattilio2008-01-131-4/+2
| | | | | | | | | | | conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
* vn_lock() is currently only used with the 'curthread' passed as argument.attilio2008-01-101-1/+1
| | | | | | | | | | | | | | | | Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
* Kernel and hwpmc(4) support for callchain capture.jkoshy2007-12-071-63/+284
| | | | Sponsored by: FreeBSD Foundation and Google Inc.
* Commit 14/14 of sched_lock decomposition.jeff2007-06-051-6/+6
| | | | | | | | | | | - Use thread_lock() rather than sched_lock for per-thread scheduling sychronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
OpenPOWER on IntegriCloud