path: root/sys/dev/hwpmc
Commit message (Author, Age, Files, Lines)
* MFC r290811: (jtl, 2016-01-14, 1 file, -44/+108)
  Fix hwpmc "stalled" behavior

  Currently, there is a single pm_stalled flag that tracks whether a performance monitor was "stalled" due to insufficient ring buffer space for samples. However, because the same performance monitor can run on multiple processes or threads at the same time, a single pm_stalled flag that impacts them all seems insufficient.

  In particular, you can hit corner cases where the code fails to stop performance monitors during a context switch out, because it thinks the performance monitor is already stopped. However, in reality, it may be that only the monitor running on a different CPU was stalled.

  This patch attempts to fix that behavior by tracking on a per-CPU basis whether a PM desires to run and whether it is "stalled". This lets the code make better decisions about when to stop PMs and when to try to restart them. Ideally, we should avoid the case where the code fails to stop a PM during a context switch out.

  MFC r290813:
  Optimizations to the way hwpmc gathers user callchains

  Changes to the code to gather user stacks:
  * Delay setting pmc_cpumask until we actually have the stack.
  * When recording user stack traces, only walk the portion of the ring that should have samples for us.

  MFC r290929:
  Change the driver stats to what they really are: unsigned values.

  When pmcstat exits after some samples were dropped, give the user an idea of how many were lost. (Granted, these are global numbers, but they may still help quantify the scope of the loss.)

  MFC r290930:
  Improve accuracy of PMC sampling frequency

  The code tracks a counter which is the number of events until the next sample. On context switch in, it loads the saved counter. On context switch out, it tries to calculate a new saved counter.

  Problems:
  1. The saved counter was shared by all threads in a process. This means that all threads would be initially loaded with the same saved counter, which could result in sampling more often than once every X number of events.
  2. The calculation to determine a new saved counter was backwards. It added when it should have subtracted, and subtracted when it should have added. Assume a single-threaded process with a reload count of 1000 events. Assuming the counter on context switch in was 100 and the counter on context switch out was 50 (meaning the thread has "consumed" 50 more events), the code would calculate a new saved counter of 150 (instead of the proper 50).

  Fix:
  1. As soon as the saved counter is used to initialize a monitor for a thread on context switch in, set the saved counter to the reload count. That way, subsequent threads to use the saved counter will get the full reload count, assuring we sample at least once every X number of events (across all threads).
  2. Change the calculation of the saved counter. Due to the change to the saved counter in #1, we simply need to add (modulo the reload count) the remaining counter time we retrieve from the CPU when a thread is context switched out.

  MFC r291016:
  Support a wider history counter in pmcstat(8) gmon output

  pmcstat(8) contains an option to output sampling data in a gmon format compatible with gprof(1). Currently, it uses the default histcounter, which is an (unsigned short). With large sets of sampling data, it is possible to overflow the maximum value provided by an (unsigned short).

  This change adds the -e argument to pmcstat. If -e and -g are both specified, pmcstat will use a histcounter type of uint64_t.

  MFC r291017:
  Fix the date on the pmcstat(8) man page from r291016.
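The saved-counter fix from r290930 can be illustrated with a small userland model. This is only a sketch of the arithmetic described in the message, not the driver's code: the struct, its field names, and both function names are hypothetical, and "modulo the reload count" is modelled as a single subtraction, which assumes the remaining count never exceeds the reload count.

```c
#include <stdint.h>

/* Hypothetical per-process sampling state; names are illustrative only. */
struct sample_state {
	uint64_t reload;	/* events between samples (the reload count) */
	uint64_t saved;		/* counter handed to the next thread switched in */
};

/*
 * Context switch in: hand out the saved counter, then reset it to the
 * full reload count so later threads do not reuse the same short value.
 */
uint64_t
switch_in(struct sample_state *s)
{
	uint64_t v = s->saved;

	s->saved = s->reload;
	return (v);
}

/*
 * Context switch out: add the events still remaining on the CPU counter,
 * wrapping by the reload count (assumes remaining <= reload).
 */
void
switch_out(struct sample_state *s, uint64_t remaining)
{
	s->saved += remaining;
	if (s->saved > s->reload)
		s->saved -= s->reload;
}
```

With the numbers from the message (reload count 1000, counter 100 on switch in, 50 remaining on switch out), the model hands out 100, resets the saved value to 1000, and then stores 1000 + 50 - 1000 = 50 on switch out.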
* MFC r283924 (vangyzen, 2015-10-02, 1 file, -3/+3)
  Provide vnode in memory map info for files on tmpfs

  When providing memory map information to userland, populate the vnode pointer for tmpfs files. Set the memory mapping to appear as a vnode type, to match FreeBSD 9 behavior.

  This fixes the use of tmpfs files with the dtrace pid provider, procstat -v, procfs, linprocfs, pmc (pmcstat), and ptrace (PT_VM_ENTRY).

  Submitted by: Eric Badger <eric@badgerio.us> (initial revision)
  Obtained from: Dell Inc.
  PR: 198431
* MFC 283121: (jhb, 2015-10-01, 1 file, -1/+1)
  Use the proper mask when reloading sampling PMCs for Core CPUs.
* MFC 283123: (jhb, 2015-06-01, 2 files, -1/+9)
  Fix two bugs that could result in PMC sampling effectively stopping. In both cases, the effect of the bug was that a very small positive number was written to the counter. This means that a large number of events needed to occur before the next sampling interrupt would trigger. Even with very frequently occurring events like clock cycles, wrapping all the way around could take a long time. Both bugs occurred when updating the saved reload count for an outgoing thread on a context switch.

  First, the counter-independent code compares the current reload count against the count set when the thread switched in and generates a delta to apply to the saved count. If this delta causes the reload counter to go negative, it would add a full reload interval to wrap it around to a positive value. The fix is to also add the full reload interval if the resulting counter is zero.

  Second, occasionally the raw counter value read during a context switch has actually wrapped, but an interrupt has not yet triggered. In this case the existing logic would return a very large reload count (e.g. 2^48 - 2 if the counter had overflowed by a count of 2). This was seen both for fixed-function and programmable counters on an E5-2643. Work around this case by returning a reload count of zero.

  PR: 198149
  Sponsored by: Norse Corp, Inc.
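The two fixes above might look roughly like the following userland sketch. It is written under stated assumptions rather than taken from the driver: a 48-bit counter that is programmed to 2^48 minus the reload count and counts upward, a reload count small enough that the top bit stays set until overflow, and hypothetical function and macro names.

```c
#include <stdint.h>

#define COUNTER_WIDTH	48	/* assumed counter width in bits */

/* Fix 1: wrap a saved count that is zero or negative after the delta. */
int64_t
apply_delta(int64_t saved, int64_t delta, int64_t reload)
{
	saved += delta;
	if (saved <= 0)		/* the earlier test only caught saved < 0 */
		saved += reload;
	return (saved);
}

/*
 * Fix 2: convert a raw counter value to the number of events remaining.
 * The top bit stays set until the counter overflows, so a clear top bit
 * means the counter has wrapped and the interrupt is still pending.
 */
uint64_t
raw_to_reload_count(uint64_t raw)
{
	const uint64_t top = 1ULL << (COUNTER_WIDTH - 1);
	const uint64_t mask = (1ULL << COUNTER_WIDTH) - 1;

	raw &= mask;
	if ((raw & top) == 0)
		return (0);
	return ((1ULL << COUNTER_WIDTH) - raw);
}
```

For a counter that has overflowed by 2, the raw value is 2; the naive conversion would yield 2^48 - 2, while this sketch returns 0.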
* MFC 282641,282658: (jhb, 2015-06-01, 16 files, -254/+256)
  - Move hwpmc(4) debugging code under a new HWPMC_DEBUG option instead of the broader DEBUG option.
  - Convert hwpmc(4) debug printfs over to KTR.
  Sponsored by: Norse Corp, Inc.
* MFC of r277177 and r279894 with the fixes for the PMC for Haswell. (rrs, 2015-03-24, 8 files, -262/+610)
  Sponsored by: Netflix Inc.
* Merge r263233 from HEAD to stable/10: (rwatson, 2015-03-19, 1 file, -1/+1)
  Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h.
  Sponsored by: Google, Inc.
* Clamp too large hwpmc callchaindepth to maximum (emaste, 2015-02-15, 1 file, -2/+3)
  If the depth requested by the user is too large, it's better to provide the maximum than the smaller default.
  MFC of: r274766
  Sponsored by: The FreeBSD Foundation
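The clamping behaviour can be sketched as below. The constant names and values are hypothetical stand-ins, and falling back to the default for a non-positive request is an assumption of this sketch; the commit message only covers the too-large case.

```c
/* Hypothetical limits; the real values live in the hwpmc headers. */
#define CALLCHAIN_DEPTH_DEFAULT	8
#define CALLCHAIN_DEPTH_MAX	32

/* Return a usable callchain depth for a user-requested value. */
int
clamp_callchain_depth(int requested)
{
	if (requested <= 0)
		return (CALLCHAIN_DEPTH_DEFAULT);	/* unusable: assumed fallback */
	if (requested > CALLCHAIN_DEPTH_MAX)
		return (CALLCHAIN_DEPTH_MAX);		/* too large: clamp to maximum */
	return (requested);
}
```

A request above the maximum now yields the maximum instead of silently dropping to the smaller default.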
* MFC r273236: (markj, 2014-11-06, 1 file, -13/+9)
  Use pmc_destroy_pmc_descriptor() to actually free the pmc, which is consistent with pmc_destroy_owner_descriptor(). Also be sure to destroy PMCs if a process exits or execs without explicitly releasing them.
* MFC r272713: (bz, 2014-10-10, 1 file, -1/+1)
  Since introducing the extra mapping in r250103 (head) for architectural performance events we have actually counted 'Branch Instruction Retired' when people asked for 'Unhalted core cycles' using the 'unhalted-core-cycles' event mask mnemonic.
  Reviewed by: jimharris
  Discussed with: gnn, rwatson
  Sponsored by: DARPA/AFRL
* MFC r267062: (kib, 2014-06-18, 2 files, -2/+17)
  Disable existing uncore hwpmc code for Nehalem and Westmere EX.
* MFC r263446 (hiren, 2014-05-31, 3 files, -37/+202)
  Update hwpmc to support core events for Atom Silvermont microarchitecture. (Model 0x4D as per Intel document 330061-001 01/2014)
* MFC r266195: (markj, 2014-05-19, 2 files, -4/+0)
  Remove some prototypes for undefined functions.
* MFC r263080: (kib, 2014-03-19, 1 file, -2/+1)
  Use correct types for sizeof() in the calculations for the malloc(9) sizes.
* MFC r262547 (jhibbits, 2014-03-14, 1 file, -4/+24)
  Fix callchain capture for hwpmc(4). While here, some style(9) fixes, too.
* MFC r261342 (jhibbits, 2014-03-14, 5 files, -17/+771)
  Add hwpmc(4) support for the PowerPC 970 class processors, direct events. This also fixes asserts on removal of the module for the mpc74xx.
  The PowerPC 970 processors have two different types of events: direct events and indirect events. Thus far only direct events are supported. I included some documentation in the driver on how indirect events work, but support is for the future.
* MFC r261173 (jhibbits, 2014-03-02, 1 file, -0/+1)
  MPC74xx should not fall through, to the error case.
* MFC r258779,r258780,r258787,r258822: (eadler, 2014-02-04, 1 file, -1/+1)
  Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this shifts into the sign bit. Instead use (1U << 31) which gets the expected result.
  Similar to the (1 << 31) case it is not defined to do (2 << 30).
  This fix is not ideal as it assumes a 32 bit int, but does fix the issue for most cases.
  A similar change was made in OpenBSD.
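A minimal illustration of the shift fix, assuming (as the message itself does) a 32-bit int. The macro and function names are made up for the example.

```c
#include <stdint.h>

/*
 * Undefined with a 32-bit int: the literal 1 is a signed int, and
 * shifting it by 31 moves a bit into the sign position.
 *
 *	#define FLAG_BAD	(1 << 31)
 */

/* Well defined: the U suffix makes the shift happen in unsigned arithmetic. */
#define FLAG_TOP	(1U << 31)
#define FIELD_TOP	(2U << 30)	/* same bit, written as a field value */

/* Return bit 31 as a 32-bit mask. */
uint32_t
top_flag(void)
{
	return (FLAG_TOP);
}
```

Both macros evaluate to 0x80000000 when unsigned int is 32 bits wide.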
* MFC r259394,r259395,r259699 (jhibbits, 2014-01-15, 1 file, -7/+18)
  r259394: Rebase the PMC indices at 1, since PMC_SOFT is at 0.
  r259395,r259699: Add userland PMC backtracing, and use the PMC trapframe macros for kernel backtraces.
* Remove local change leftover, this should never have been part of (davide, 2013-09-20, 1 file, -2/+0)
  r255745.
  Pointy-hat to: davide
  Approved by: re (implicit)
* Fix lc_lock/lc_unlock() support for rmlocks held in shared mode. With (davide, 2013-09-20, 1 file, -0/+2)
  current lock classes KPI it was really difficult because there was no way to pass an rmtracker object to the lock/unlock routines. In order to accomplish the task, modify the aforementioned functions so that they can return (or pass as argument) a uintptr_t, which is in the rm case used to hold a pointer to struct rm_priotracker for current thread. As an added bonus, this fixes rm_sleep() in the rm shared case, which right now can communicate priotracker structure between lc_unlock()/lc_lock().
  Suggested by: jhb
  Reviewed by: jhb
  Approved by: re (delphij)
* Fix the build. (jhibbits, 2013-09-05, 1 file, -0/+2)
* Change the cap_rights_t type from uint64_t to a structure that we can extend (pjd, 2013-09-05, 1 file, -1/+3)
  in the future in a backward compatible (API and ABI) way.

  The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough.

  The structure definition looks like this:

      struct cap_rights {
              uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
      };

  The initial CAP_RIGHTS_VERSION is 0.

  The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements.

  The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future.

  To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, e.g.

      #define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)

  We still support aliases that combine few rights, but the rights have to belong to the same array element, e.g.:

      #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
      #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
      #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)

  There is new API to manage the new cap_rights_t structure:

      cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
      void cap_rights_set(cap_rights_t *rights, ...);
      void cap_rights_clear(cap_rights_t *rights, ...);
      bool cap_rights_is_set(const cap_rights_t *rights, ...);
      bool cap_rights_is_valid(const cap_rights_t *rights);
      void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
      void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
      bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

  Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, e.g.:

      cap_rights_t rights;
      cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

  There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, e.g.:

      #define cap_rights_set(rights, ...) \
              __cap_rights_set((rights), __VA_ARGS__, 0ULL)
      void __cap_rights_set(cap_rights_t *rights, ...);

  Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1:

      cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

  Providing several rights that belong to the same array element this way is correct, but is not advised. It should only be used for alias definitions.

  This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x.

  Sponsored by: The FreeBSD Foundation
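The encoding described above can be modelled in a few lines of standalone C. This is a simplified sketch, not the Capsicum implementation: every `model_`/`MODEL_` name is hypothetical, the index is assumed to sit in bits 57 to 61 (the five bits below the top two), the version bits are ignored, and a mixed-element right is reported with a false return value where the message says the real functions assert.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NELEMS	2	/* CAP_RIGHTS_VERSION (0) + 2 */

/* Bits 57..61 carry a one-hot array index; lower bits carry the right. */
#define MODEL_RIGHT(idx, bit)	((1ULL << (57 + (idx))) | (bit))
#define MODEL_LOOKUP	MODEL_RIGHT(0, 0x0000000000000400ULL)
#define MODEL_FCHMOD	MODEL_RIGHT(0, 0x0000000000002000ULL)
#define MODEL_PDKILL	MODEL_RIGHT(1, 0x0000000000000800ULL)

struct model_rights {
	uint64_t r[MODEL_NELEMS];
};

/* Return the encoded array index, or -1 unless exactly one index bit is set. */
int
model_index(unsigned long long right)
{
	unsigned long long idxbits = (right >> 57) & 0x1f;
	int i = 0;

	if (idxbits == 0 || (idxbits & (idxbits - 1)) != 0)
		return (-1);
	while ((idxbits >>= 1) != 0)
		i++;
	return (i);
}

/* Set each listed right; the list ends at the 0ULL appended by the macro. */
bool
model_set_impl(struct model_rights *rights, ...)
{
	va_list ap;
	unsigned long long right;
	bool ok = true;

	va_start(ap, rights);
	while ((right = va_arg(ap, unsigned long long)) != 0) {
		int i = model_index(right);

		if (i < 0 || i >= MODEL_NELEMS) {
			ok = false;	/* rights from different elements mixed */
			break;
		}
		rights->r[i] |= right;
	}
	va_end(ap);
	return (ok);
}

/* Callers never write the terminator themselves. */
#define model_set(rights, ...)	model_set_impl((rights), __VA_ARGS__, 0ULL)
```

Calling `model_set(&r, MODEL_LOOKUP, MODEL_FCHMOD)` succeeds because both rights carry index 0, while `model_set(&r, MODEL_LOOKUP | MODEL_PDKILL)` fails because the combined value has two index bits set.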
* Fix hwpmc(4) for 32-bit PowerPC. (jhibbits, 2013-09-04, 2 files, -8/+2)
* Refactor PowerPC hwpmc(4) driver into generic and specific. More refactoring (jhibbits, 2013-09-03, 3 files, -714/+852)
  will likely be done as more drivers are added, since AIM-compatible processors have similar PMC configuration logic.
* Complete r250105. Do not zero fields if M_ZERO flag is specified to (davide, 2013-09-01, 1 file, -6/+0)
  malloc(9).
  Reported by: pluknet, glebius
* Remove the duplicate LLC_MISS event and put it in the right order. (adrian, 2013-08-29, 1 file, -3/+2)
* Update the mis-predicted branch PMC names (for sandy bridge) to not clash. (adrian, 2013-08-25, 1 file, -2/+2)
  The SDM (June 2013) tables on these are rather confusing. Yes, they assign the same name (BR_MISP_RETIRED.ALL_BRANCHES) to two codes (C5H/00H and C5H/04H.) The latter however is the PEBS version.
  So, to make it easier to see the difference - and yes, we can use both without having to actually enable the PEBS specific bits! - just rename the PEBS one to _PS so there's no clashing.
  Tested:
  * Sandy bridge
* Fix a >80 character long line, introduced in my previous commit. (adrian, 2013-08-25, 1 file, -1/+2)
  Noticed by: hiren
* Update the MEM_UOP_RETIRED PMC operation for sandy bridge and sandy (adrian, 2013-08-25, 2 files, -24/+35)
  bridge Xeon.
  Summary:
  These are PEBS events but they're also available as normal counter/sample events. The source table (Table 19-2) lists the base versions (LOAD, STLB_MISS, SPLIT, ALL) but it says they must be qualified with other values. This particular commit fleshes out those umask values.
  Source:
  * Linux; SDM June 2013, Volume 3B, Table 19-2 and 18-21.
  Tested:
  * Sandy Bridge (non-Xeon)
* Rename the kld_unload event handler to kld_unload_try, and add a new (markj, 2013-08-24, 1 file, -58/+53)
  kld_unload event handler which gets invoked after a linker file has been successfully unloaded. The kld_unload and kld_load event handlers are now invoked with the shared linker lock held, while kld_unload_try is invoked with the lock exclusively held.
  Convert hwpmc(4) to use these event handlers instead of having kern_kldload() and kern_kldunload() invoke hwpmc(4) hooks whenever files are loaded or unloaded. This has no functional effect, but simplifies the linker code somewhat.
  Reviewed by: jhb
* Change the name of this particular event to reflect the name used in (adrian, 2013-08-21, 1 file, -2/+2)
  Linux and Intel examples.
  Sourced:
  * https://github.com/andikleen/pmu-tools/blob/master/snb-client.csv
  * http://software.intel.com/en-us/comment/1747932#comment-1747932
  Note:
  * It's not currently in the Intel SDM; I need to chase down what's going on.
  Tested:
  * Sandy Bridge
* Correct a typo in the event mask mnemonic. (bz, 2013-08-20, 1 file, -1/+1)
  Reviewed by: gnn
  MFC after: 3 days
* Add in missing events for Sandy Bridge Xeon. (adrian, 2013-08-18, 2 files, -5/+27)
  * Add in MEM_LOAD_UOPS_LLC_HIT_RETIRED for both sandy bridge and sandy bridge Xeon. Right now it only is enabled for Sandy Bridge.
  * D2/0F is actually a combination rather than a separate counter, so just flip that on for the CPU types that support it.
  There's an errata for using this on SB Xeon hardware - I've documented it in kern/181346.
  Tested:
  * Sandy Bridge
  * Sandy Bridge Xeon
  Sponsored by: Netflix, Inc.
* Relax the vm object locking. Use a read lock. (alc, 2013-06-05, 1 file, -10/+10)
  Sponsored by: EMC / Isilon Storage Division
* Suppress a GCC warning. This warning is actually bogus and newer GCC (davide, 2013-05-02, 1 file, -1/+1)
  versions than the one in base (dim@ mentioned he tried on 4.7.3 and 4.8.1) do not whine about it, so, at some point this workaround will be reverted.
  Reported by: ache
  Discussed with: dim
* malloc(9) cannot return NULL if M_WAITOK flag is specified. (davide, 2013-04-30, 2 files, -17/+5)
* The Intel PMC architectural events have encodings which are identical to (davide, 2013-04-30, 2 files, -25/+49)
  those of some non-architectural core events. This is not a problem in the general case as long as there's a 1:1 mapping between the two, but there are few exceptions. For example, 3CH_01H on Nehalem/Westmere represents both unhalted-reference-cycles and CPU_CLK_UNHALTED.REF_P. CPU_CLK_UNHALTED.REF_P on the aforementioned architectures does not measure reference (i.e. bus) but TSC, so there's the need to disambiguate.
  In order to avoid the namespace collision rename all the architectural events in a way they cannot be ambiguous and refactor the architectural events handling function to reflect this change.
  While here, per Jim Harris request, rename iap_architectural_event_is_unsupported() to iap_event_is_architectural().
  Discussed with: jimharris
  Reviewed by: jimharris, gnn
* Complete r250097: (davide, 2013-04-30, 1 file, -3/+6)
  Do not change the initialization order in pmc_intel_initialize().
* When hwpmc(4) module is unloaded it reports a double leakage. This happens (davide, 2013-04-30, 1 file, -7/+3)
  at least if FreeBSD is run under VirtualBox. In order to avoid the leakage, properly deallocate structures in case CPU claims that hw performance monitoring counters are not supported.
  Reported by: hiren
* Fixup Westmere hwpmc(4) support: add missing CPU flag so that (davide, 2013-04-30, 1 file, -3/+3)
  instruction-retired, llc-misses and llc-reference events can now be allocated.
  Reviewed by: jimharris, gnn
* Improve/correct a comment. We now support a lot more cpu types. (hiren, 2013-04-14, 1 file, -1/+1)
  PR: kern/177496
  Approved by: sbruno (mentor)
* Cosmetic change: make a comment reference Sandy Bridge *Xeon* (rstone, 2013-04-12, 1 file, -1/+1)
  Reviewed by: sbruno
  MFC after: 1 week
* Trailing whitespace cleanup along with 80 column enforcement. (sbruno, 2013-04-03, 4 files, -1682/+1803)
  Submitted by: hiren.panchasara@gmail.com
  Reviewed by: sbruno@freebsd.org
  Obtained from: Yahoo! Inc.
  MFC after: 2 weeks
* Update hwpmc to support Haswell class processors. (sbruno, 2013-03-28, 4 files, -187/+532)
  0x3C: /* Per Intel document 325462-045US 01/2013. */
  Add manpage to document all the goodness that is available in this processor model.
  Submitted by: hiren panchasara <hiren.panchasara@gmail.com>
  Reviewed by: jimharris, sbruno
  Obtained from: Yahoo! Inc.
  MFC after: 2 weeks
* MFC (attilio, 2013-03-08, 1 file, -6/+15)
|\
| * Add a generic way to call per event allocate / release function. (fabient, 2013-03-05, 1 file, -6/+15)
| |   Reviewed by: mav
| |   MFC after: 1 month
| * Add support for good old 8192Hz profiling clock to software PMC. (mav, 2013-02-26, 1 file, -3/+6)
| |   Reviewed by: fabient
| * Change the way how software PMC updates counters. (mav, 2013-02-26, 1 file, -2/+6)
| |   This at least fixes -n option of pmcstat.
| |   Reviewed by: fabient
* | MFC (attilio, 2013-02-26, 1 file, -3/+6)
| |