path: root/sys/dev/hwpmc
Commit message [Author, Age, Files, Lines]
...
* Fix various bugs in Haswell counter definitions [rstone, 2015-03-10, 1 file, -6/+6]
| 1) The "WALK_COMPLETED_2M_4M" event incorrectly referenced 4K pages.
| 2) The umask for RING0 and RING123 events was reversed.
|
| Differential Revision: https://reviews.freebsd.org/D1585
| MFC after: 1 month
| Sponsored by: Sandvine Inc
* The cpu_id macro was renamed in r278529; catch up with the new name. [andrew, 2015-02-11, 1 file, -1/+1]
|
* Add ARMv7 performance monitoring counters. [br, 2015-01-28, 4 files, -3/+766]
| Differential Revision: https://reviews.freebsd.org/D1687
| Reviewed by: rpaulo
| Sponsored by: DARPA, AFRL
* style(9) cleanup [rstone, 2015-01-22, 2 files, -15/+26]
|
* Update the hwpmc driver to have the new type HASWELL_XEON. Also [rrs, 2015-01-14, 8 files, -263/+609]
| go back through HASWELL, IVY_BRIDGE, IVY_BRIDGE_XEON and SANDY_BRIDGE
| to straighten out all the missing PMCs. We also add a new pmc tool,
| pmcstudy, which allows one to run the various formulas from the
| documents "Using Intel Vtune Amplifier XE on XXX Generation platforms"
| for IB/SB and Haswell. The tool also allows one to postulate one's own
| formulas with any of the various PMCs. At some point I will enhance
| this to work with Brendan Gregg's flame-graphs so we can flamegraph
| various PMC interactions. Note the manual page also needs some work
| (lots of work) but gnn has committed to help me with that ;-)
|
| Reviewed by: gnn
| MFC after: 1 month
| Sponsored by: Netflix Inc.
* Fix hwpmc sampling for ppc970 (G5-class) processors. [jhibbits, 2014-11-27, 1 file, -17/+11]
| With this, hwpmc sampling now works on these processors.
|
| MFC after: 3 weeks
| Relnotes: yes
* Fix hwpmc sampling for MPC74xx (G4) processors. [jhibbits, 2014-11-27, 1 file, -13/+11]
| With this, hwpmc sampling now works correctly on these processors.
|
| MFC after: 3 weeks
| Relnotes: yes
* Clamp too-large hwpmc callchaindepth to the maximum [emaste, 2014-11-20, 1 file, -2/+3]
| If the depth requested by the user is too large, it's better to
| provide the maximum than the smaller default.
|
| Sponsored by: The FreeBSD Foundation
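The clamping behavior described in this commit can be sketched as follows. This is a minimal illustration and not the driver's code: the function name `clamp_callchain_depth`, its parameters, and the choice to fall back to the default for non-positive requests are all assumptions made for the example.

```c
#include <assert.h>

/*
 * Hypothetical helper: pick the call chain depth to use.
 *   requested - depth asked for by the user
 *   dflt      - default depth (assumed to be <= max)
 *   max       - largest depth supported
 */
static int
clamp_callchain_depth(int requested, int dflt, int max)
{
	assert(dflt <= max);

	/* Assumption for this sketch: nonsensical requests get the default. */
	if (requested <= 0)
		return (dflt);
	/* Too large: provide the maximum rather than the smaller default. */
	if (requested > max)
		return (max);
	return (requested);
}
```

With a default of 8 and a maximum of 32, a request for 1000 yields 32 instead of silently dropping back to 8.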
* Fix up module unload for syscall_module_handler consumers. [mjg, 2014-11-01, 1 file, -1/+2]
| After r273707 it was registering syscalls as static.
|
| This fixes hwpmc module unload.
|
| Reported by: markj
* Use pmc_destroy_pmc_descriptor() to actually free the pmc, which is [markj, 2014-10-17, 1 file, -13/+9]
| consistent with pmc_destroy_owner_descriptor(). Also be sure to
| destroy PMCs if a process exits or execs without explicitly
| releasing them.
|
| Reviewed by: bz, gnn
| MFC after: 2 weeks
| Sponsored by: EMC / Isilon Storage Division
| Differential Revision: https://reviews.freebsd.org/D958
* Since introducing the extra mapping in r250103 for architectural performance [bz, 2014-10-07, 1 file, -1/+1]
| events we have actually counted 'Branch Instruction Retired' when
| people asked for 'Unhalted core cycles' using the
| 'unhalted-core-cycles' event mask mnemonic.
|
| Reviewed by: jimharris
| Discussed with: gnn, rwatson
| MFC after: 3 days
| Sponsored by: DARPA/AFRL
* Fix PowerPC backtraces. Since kernel and user have completely separate address [jhibbits, 2014-09-14, 1 file, -13/+27]
| spaces, rather than a split address, we actually can't check for
| being within the kernel's address range. Instead, do what other
| backtraces do, and use trapexit()/asttrapexit() as the stack sentinel.
|
| MFC after: 3 weeks
* Remove ia64. [marcel, 2014-07-07, 1 file, -66/+0]
| This includes:
| o All directories named *ia64*
| o All files named *ia64*
| o All ia64-specific code guarded by __ia64__
| o All ia64-specific makefile logic
| o Mention of ia64 in comments and documentation
|
| This excludes:
| o Everything under contrib/
| o Everything under crypto/
| o sys/xen/interface
| o sys/sys/elf_common.h
|
| Discussed at: BSDcan
* Fix a bug in hwpmc(4) callchain retrieval, for both user and kernel. [jhibbits, 2014-07-03, 1 file, -9/+13]
| The array index for the callchain is getting double-incremented --
| both in the loop and the storing. It should only be incremented in
| one location. Also, constrain the stack pointer range check.
|
| MFC after: 2 weeks
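The double-increment pattern this commit describes can be illustrated with a small, self-contained sketch. It is not the driver's code: the function names, the `frames` input array, and the buffer layout are hypothetical and exist only to show the shape of the bug and of the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Buggy shape: the output index n is advanced both in the store
 * (cc[n++]) and in the loop increment, so every other slot is skipped.
 */
static size_t
fill_callchain_buggy(uintptr_t *cc, size_t max, const uintptr_t *frames,
    size_t nframes)
{
	size_t i, n = 0;

	assert(cc != NULL && frames != NULL);
	for (i = 0; i < nframes && n < max; i++, n++)
		cc[n++] = frames[i];
	return (n);
}

/* Fixed shape: the output index is advanced in exactly one place. */
static size_t
fill_callchain_fixed(uintptr_t *cc, size_t max, const uintptr_t *frames,
    size_t nframes)
{
	size_t i, n = 0;

	assert(cc != NULL && frames != NULL);
	for (i = 0; i < nframes && n < max; i++)
		cc[n++] = frames[i];
	return (n);
}
```

In the buggy variant the returned count is twice the number of frames actually stored and the buffer contains unwritten gaps, which is the kind of symptom a consumer of the call chain would see.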
* Pull in r267961 and r267973 again. Fix for issues reported will follow. [hselasky, 2014-06-28, 2 files, -15/+8]
|
* Revert r267961, r267973: [gjb, 2014-06-27, 2 files, -8/+15]
| These changes prevent sysctl(8) from returning proper output, such as:
| 1) no output from sysctl(8)
| 2) erroneously returning ENOMEM with tools like truss(1) or uname(1)
|      truss: can not get etype: Cannot allocate memory
* Extend the meaning of the CTLFLAG_TUN flag to automatically check if [hselasky, 2014-06-27, 2 files, -15/+8]
| there is an environment variable which shall initialize the SYSCTL
| during early boot. This works for all SYSCTL types, both statically
| and dynamically created ones, except for the SYSCTL NODE type and
| SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been
| added to be used in the case a tunable sysctl has a custom
| initialisation function, allowing the sysctl to still be marked as a
| tunable. The kernel SYSCTL API is mostly the same, with a few
| exceptions for some special operations like iterating children of a
| static/extern SYSCTL node. This operation should probably be made
| into a factored out common macro, since some device drivers use it.
| The reason for changing the SYSCTL API was the need for a SYSCTL
| parent OID pointer and not only the SYSCTL parent OID list pointer in
| order to quickly generate the sysctl path. The motivation behind this
| patch is to avoid parameter loading kludges inside the OFED driver
| subsystem. Instead of adding special code to the OFED driver
| subsystem to post-load tunables into dynamically created sysctls, we
| generalize this in the kernel.
|
| Other changes:
| - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
|   to "hw.pcic.intr_mask".
| - Removed redundant TUNABLE statements throughout the kernel.
| - Some minor code rewrites in connection with removing unneeded
|   TUNABLE statements.
| - Added a missing SYSCTL_DECL().
| - Wrapped two very long lines.
| - Avoid malloc()/free() inside sysctl string handling, in case it is
|   called to initialize a sysctl from a tunable, because malloc()/free()
|   is not ready when sysctls from the sysctl dataset are registered.
| - Bumped FreeBSD version to indicate SYSCTL API change.
|
| MFC after: 2 weeks
| Sponsored by: Mellanox Technologies
* For Xeon 7500 and 48XX (Nehalem EX and Westmere EX) variants of the [kib, 2014-06-04, 2 files, -2/+17]
| Core i7 and Westmere processors, the uncore PMC subsystem is
| completely different from the uncore PMC on smaller versions of
| CPUs. Disable existing uncore hwpmc code for EX, otherwise
| non-existing MSRs are accessed. The core PMCs seem to be identical
| for non-EX and EX, according to the SDM.
|
| Reviewed by: davide, fabient
| Sponsored by: The FreeBSD Foundation
| MFC after: 2 weeks
* Add missing Ivy Bridge and Haswell events. [gnn, 2014-06-02, 1 file, -0/+3]
| Submitted by: Anton Rang <rang@mac.com>
| MFC: 2 weeks
* Remove some prototypes for undefined functions. [markj, 2014-05-15, 2 files, -4/+0]
| MFC after: 3 days
* Enable and disable the PMC unit at load/unload time, respectively. [jhibbits, 2014-04-18, 2 files, -0/+8]
| MFC after: 3 weeks
* Update hwpmc to support core events for Atom Silvermont microarchitecture. [hiren, 2014-03-20, 3 files, -37/+202]
| (Model 0x4D as per Intel document 330061-001 01/2014)
|
| Tested by: Olivier Cochard-Labbe <olivier@cochatrd.me>
| MFC after: 4 weeks
* Update kernel inclusions of capability.h to use capsicum.h instead; some [rwatson, 2014-03-16, 1 file, -1/+1]
| further refinement is required as some device drivers intended to be
| portable over FreeBSD versions rely on __FreeBSD_version to decide
| whether to include capability.h.
|
| MFC after: 3 weeks
* Fix pointer type in call to malloc [eadler, 2014-03-13, 1 file, -1/+1]
| Submitted by: Meyer, Conrad conrad.meyer@isilon.com
* Fix pointer type in call to malloc [eadler, 2014-03-13, 1 file, -1/+1]
| Submitted by: Meyer, Conrad conrad.meyer@isilon.com
* Use correct types for sizeof() in the calculations for the malloc(9) sizes [1]. [kib, 2014-03-12, 1 file, -2/+1]
| While there, remove unneeded checks for failed allocations made with
| the M_WAITOK flag.
|
| Submitted by: Conrad Meyer <cemeyer@uw.edu> [1]
| MFC after: 1 week
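One common way to avoid the class of bug fixed here is to take the element size from the pointer being assigned, so the size cannot drift from the pointee type. The sketch below is a userland illustration under stated assumptions: `struct sample_buf` and `alloc_samples()` are hypothetical names, and calloc(3) stands in for the kernel's malloc(9), whose interface differs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record type used only for this example. */
struct sample_buf {
	unsigned long	sb_pc;
	int		sb_cpu;
};

/*
 * Allocate an array of n zeroed records. sizeof(*sb) follows the
 * declared type of sb, so changing that type later keeps the
 * allocation size correct. Returns NULL on failure.
 */
static struct sample_buf *
alloc_samples(size_t n)
{
	struct sample_buf *sb;

	assert(n > 0);
	sb = calloc(n, sizeof(*sb));
	return (sb);
}
```

The userland caller must still check for NULL. The commit's second point is about the kernel allocator: per the message, allocations made with M_WAITOK do not need a failure check.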
* Fix callchain capture for hwpmc(4). While here, some style(9) fixes, too. [jhibbits, 2014-02-27, 1 file, -4/+24]
| MFC after: 2 weeks
* Add hwpmc(4) support for the PowerPC 970 class processors, direct events. [jhibbits, 2014-02-01, 5 files, -17/+771]
| This also fixes asserts on removal of the module for the mpc74xx.
|
| The PowerPC 970 processors have two different types of events: direct
| events and indirect events. Thus far only direct events are
| supported. I included some documentation in the driver on how
| indirect events work, but support is for the future.
|
| MFC after: 1 month
* MPC74xx should not fall through to the error case. [jhibbits, 2014-01-25, 1 file, -0/+1]
| MFC after: 1 week
* Move <machine/apicvar.h> to <x86/apicvar.h>. [jhb, 2014-01-23, 5 files, -5/+5]
|
* Add another Haswell model (0x45) to the set of supported chips. [gnn, 2013-12-20, 1 file, -0/+1]
| Model 0x45 appears, for example, in late 2013 MacBook Pro models
| and is properly emulated by VMware.
* o Remove assertions on ipa_version as sometimes the version detection [attilio, 2013-12-20, 3 files, -18/+21]
|   using cpuid can be quirky (this is the case of VMware without the
|   vPMC support) but fail to probe hwpmc.
| o Apply the fix for XEON family of processors as established by
|   315338-020 document (bug AJ85).
|
| Sponsored by: EMC / Isilon storage division
| Reviewed by: fabient
* Add userland PMC backtracing, and use the PMC trapframe macros for kernel [jhibbits, 2013-12-14, 1 file, -7/+18]
| backtraces.
|
| MFC after: 1 week
* Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this [eadler, 2013-11-30, 1 file, -1/+1]
| shifts into the sign bit. Instead use (1U << 31) which gets the
| expected result.
|
| This fix is not ideal as it assumes a 32-bit int, but does fix the
| issue for most cases.
|
| A similar change was made in OpenBSD.
|
| Discussed with: -arch, rdivacky
| Reviewed by: cperciva
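The difference between the two spellings can be shown in a few lines. This is a sketch, not the changed driver line: the macro name `EXAMPLE_TOP_BIT` and the helper `bit32()` are hypothetical. The helper shows one possible way to sidestep the "assumes a 32-bit int" caveat by shifting a fixed-width unsigned value; it is an alternative offered for illustration, not what the commit did.

```c
#include <assert.h>
#include <stdint.h>

/*
 * (1 << 31) shifts a signed int into its sign bit, which is undefined
 * behavior when int is 32 bits wide. The unsigned form below is well
 * defined for a 32-bit unsigned int.
 */
#define	EXAMPLE_TOP_BIT	(1U << 31)

/* Hypothetical helper: bit n of a 32-bit mask, valid for n in 0..31. */
static uint32_t
bit32(unsigned int n)
{
	assert(n < 32);
	return ((uint32_t)(UINT32_C(1) << n));
}
```

Both forms produce a mask with only the top bit of a 32-bit word set; the helper additionally documents and checks the valid shift range.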
* Remove local change leftover, this should never have been part of [davide, 2013-09-20, 1 file, -2/+0]
| r255745.
|
| Pointy-hat to: davide
| Approved by: re (implicit)
* Fix lc_lock/lc_unlock() support for rmlocks held in shared mode. With [davide, 2013-09-20, 1 file, -0/+2]
| the current lock classes KPI it was really difficult because there
| was no way to pass an rmtracker object to the lock/unlock routines.
| In order to accomplish the task, modify the aforementioned functions
| so that they can return (or pass as argument) a uintptr_t, which is
| in the rm case used to hold a pointer to struct rm_priotracker for
| the current thread. As an added bonus, this fixes rm_sleep() in the
| rm shared case, which can now communicate the priotracker structure
| between lc_unlock()/lc_lock().
|
| Suggested by: jhb
| Reviewed by: jhb
| Approved by: re (delphij)
* Fix the build. [jhibbits, 2013-09-05, 1 file, -0/+2]
|
* Change the cap_rights_t type from uint64_t to a structure that we can extend [pjd, 2013-09-05, 1 file, -1/+3]
| in the future in a backward compatible (API and ABI) way.
|
| The cap_rights_t represents capability rights. We used to use one bit
| to represent one right, but we are running out of spare bits.
| Currently the new structure provides place for 114 rights (so 50 more
| than the previous cap_rights_t), but it is possible to grow the
| structure to hold at least 285 rights, although we can make it even
| larger if 285 rights won't be enough.
|
| The structure definition looks like this:
|
|     struct cap_rights {
|             uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
|     };
|
| The initial CAP_RIGHTS_VERSION is 0.
|
| The top two bits in the first element of the cr_rights[] array contain
| the total number of elements in the array minus 2. This means if
| those two bits are equal to 0, we have 2 array elements.
|
| The top two bits in all remaining array elements should be 0.
| The next five bits in all array elements contain the array index.
| Only one bit is used and the bit position in this five-bit range
| defines the array index. This means there can be at most five array
| elements in the future.
|
| To define a new right the CAPRIGHT() macro must be used. The macro
| takes two arguments - an array index and a bit to set, e.g.
|
|     #define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)
|
| We still support aliases that combine a few rights, but the rights
| have to belong to the same array element, e.g.:
|
|     #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
|     #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
|
|     #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)
|
| There is a new API to manage the new cap_rights_t structure:
|
|     cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
|     void cap_rights_set(cap_rights_t *rights, ...);
|     void cap_rights_clear(cap_rights_t *rights, ...);
|     bool cap_rights_is_set(const cap_rights_t *rights, ...);
|
|     bool cap_rights_is_valid(const cap_rights_t *rights);
|     void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
|     void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
|     bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);
|
| Capability rights to the cap_rights_init(), cap_rights_set(),
| cap_rights_clear() and cap_rights_is_set() functions are provided by
| separating them with commas, e.g.:
|
|     cap_rights_t rights;
|
|     cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);
|
| There is no need to terminate the list of rights, as those functions
| are actually macros that take care of the termination, e.g.:
|
|     #define cap_rights_set(rights, ...) \
|             __cap_rights_set((rights), __VA_ARGS__, 0ULL)
|     void __cap_rights_set(cap_rights_t *rights, ...);
|
| Thanks to using one bit as an array index we can assert in those
| functions that there are no two rights belonging to different array
| elements provided together. For example this is illegal and will be
| detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to
| element 1:
|
|     cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);
|
| Providing several rights that belong to the same array element this
| way is correct, but is not advised. It should only be used for alias
| definitions.
|
| This commit also breaks compatibility with some existing Capsicum
| system calls, but I see no other way to do that. This should be fine
| as Capsicum is still experimental and this change is not going to 9.x.
|
| Sponsored by: The FreeBSD Foundation
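The encoding and the sentinel-terminated macro pattern described in this commit message can be tried out in userland with a toy model. Everything prefixed `toy_`/`TOY_` below is hypothetical and written for this example; it is not the FreeBSD implementation. It assumes, following the message, that the five bits below the top two (bits 57 to 61) hold a one-hot array index, uses a fixed two-element array, and reports misuse through a return value instead of an assertion.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy right: one-hot index tag in bits 57..61, plus the right's own bit. */
#define	TOY_RIGHT(idx, bit)	((1ULL << (57 + (idx))) | (bit))
#define	TOY_LOOKUP	TOY_RIGHT(0, 0x0000000000000400ULL)
#define	TOY_FCHMOD	TOY_RIGHT(0, 0x0000000000002000ULL)
#define	TOY_PDKILL	TOY_RIGHT(1, 0x0000000000000800ULL)

struct toy_rights {
	uint64_t tr_rights[2];
};

/* Decode the array index from the tag; -1 if the tag is not one-hot. */
static int
toy_right_index(unsigned long long right)
{
	unsigned long long tag = (right >> 57) & 0x1f;
	int idx;

	if (tag == 0 || (tag & (tag - 1)) != 0)
		return (-1);
	for (idx = 0; (tag & 1) == 0; idx++)
		tag >>= 1;
	return (idx);
}

/* Variadic worker; the argument list must end with 0ULL. */
static bool
toy_rights_set_impl(struct toy_rights *r, ...)
{
	unsigned long long right;
	va_list ap;
	bool ok = true;
	int idx;

	assert(r != NULL);
	va_start(ap, r);
	while ((right = va_arg(ap, unsigned long long)) != 0) {
		idx = toy_right_index(right);
		if (idx < 0 || idx > 1) {
			/* Rights from different elements were OR-ed together. */
			ok = false;
			break;
		}
		r->tr_rights[idx] |= right;
	}
	va_end(ap);
	return (ok);
}

/* The macro appends the terminator so callers never have to. */
#define	toy_rights_set(r, ...)	toy_rights_set_impl((r), __VA_ARGS__, 0ULL)

static bool
toy_rights_is_set(const struct toy_rights *r, unsigned long long right)
{
	int idx = toy_right_index(right);

	return (idx >= 0 && idx <= 1 && (r->tr_rights[idx] & right) == right);
}
```

Calling `toy_rights_set(&r, TOY_LOOKUP, TOY_PDKILL)` stores each right in its own array element, while `toy_rights_set(&r, TOY_LOOKUP | TOY_PDKILL)` is rejected because the combined value carries two index bits. That rejection is the detection property the message attributes to the one-hot index.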
* Fix hwpmc(4) for 32-bit PowerPC. [jhibbits, 2013-09-04, 2 files, -8/+2]
|
* Refactor PowerPC hwpmc(4) driver into generic and specific. More refactoring [jhibbits, 2013-09-03, 3 files, -714/+852]
| will likely be done as more drivers are added, since AIM-compatible
| processors have similar PMC configuration logic.
* Complete r250105. Do not zero fields if M_ZERO flag is specified to [davide, 2013-09-01, 1 file, -6/+0]
| malloc(9).
|
| Reported by: pluknet, glebius
* Remove the duplicate LLC_MISS event and put it in the right order. [adrian, 2013-08-29, 1 file, -3/+2]
|
* Update the mis-predicted branch PMC names (for Sandy Bridge) to not clash. [adrian, 2013-08-25, 1 file, -2/+2]
| The SDM (June 2013) tables on these are rather confusing. Yes, they
| assign the same name (BR_MISP_RETIRED.ALL_BRANCHES) to two codes
| (C5H/00H and C5H/04H). The latter, however, is the PEBS version.
|
| So, to make it easier to see the difference - and yes, we can use
| both without having to actually enable the PEBS specific bits! -
| just rename the PEBS one to _PS so there's no clashing.
|
| Tested:
| * Sandy Bridge
* Fix a >80 character long line, introduced in my previous commit. [adrian, 2013-08-25, 1 file, -1/+2]
| Noticed by: hiren
* Update the MEM_UOP_RETIRED PMC operation for Sandy Bridge and Sandy [adrian, 2013-08-25, 2 files, -24/+35]
| Bridge Xeon.
|
| Summary:
| These are PEBS events but they're also available as normal
| counter/sample events. The source table (Table 19-2) lists the base
| versions (LOAD, STLB_MISS, SPLIT, ALL) but it says they must be
| qualified with other values. This particular commit fleshes out
| those umask values.
|
| Source:
| * Linux; SDM June 2013, Volume 3B, Table 19-2 and 18-21.
|
| Tested:
| * Sandy Bridge (non-Xeon)
* Rename the kld_unload event handler to kld_unload_try, and add a new [markj, 2013-08-24, 1 file, -58/+53]
| kld_unload event handler which gets invoked after a linker file has
| been successfully unloaded. The kld_unload and kld_load event
| handlers are now invoked with the shared linker lock held, while
| kld_unload_try is invoked with the lock exclusively held.
|
| Convert hwpmc(4) to use these event handlers instead of having
| kern_kldload() and kern_kldunload() invoke hwpmc(4) hooks whenever
| files are loaded or unloaded. This has no functional effect, but
| simplifies the linker code somewhat.
|
| Reviewed by: jhb
* Change the name of this particular event to reflect the name used in [adrian, 2013-08-21, 1 file, -2/+2]
| Linux and Intel examples.
|
| Sourced:
| * https://github.com/andikleen/pmu-tools/blob/master/snb-client.csv
| * http://software.intel.com/en-us/comment/1747932#comment-1747932
|
| Note:
| * It's not currently in the Intel SDM; I need to chase down what's
|   going on.
|
| Tested:
| * Sandy Bridge
* Correct a typo in the event mask mnemonic. [bz, 2013-08-20, 1 file, -1/+1]
| Reviewed by: gnn
| MFC after: 3 days
* Add in missing events for Sandy Bridge Xeon. [adrian, 2013-08-18, 2 files, -5/+27]
| * Add in MEM_LOAD_UOPS_LLC_HIT_RETIRED for both Sandy Bridge and
|   Sandy Bridge Xeon. Right now it is only enabled for Sandy Bridge.
| * D2/0F is actually a combination rather than a separate counter, so
|   just flip that on for the CPU types that support it.
|
| There's an erratum for using this on SB Xeon hardware - I've
| documented it in kern/181346.
|
| Tested:
| * Sandy Bridge
| * Sandy Bridge Xeon
|
| Sponsored by: Netflix, Inc.
* Relax the vm object locking. Use a read lock. [alc, 2013-06-05, 1 file, -10/+10]
| Sponsored by: EMC / Isilon Storage Division