| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
o Remove assertions on ipa_version as sometimes the version detection
using cpuid can be quirky (this is the case of VMWare without the
vPMC support) but fail to probe hwpmc.
o Apply the fix for XEON family of processors as established by
315338-020 document (bug AJ85).
Sponsored by: EMC / Isilon storage division
Reviewed by: fabient
MFC courtesy of panzura.
|
| |
|
|
|
|
|
|
|
| |
Minor spelling fixes in:
sys/dev, sys/sys
Many of these have user-visible strings.
|
|
|
|
|
|
|
| |
Do not call vn_fullpath(9) (through the pmc_getfilename() wrapper)
when its result is immediately ignored, i.e. for kernel processes
forked from the user process. Do not test for non-null before freeing
string.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix hwpmc "stalled" behavior
Currently, there is a single pm_stalled flag that tracks whether a
performance monitor was "stalled" due to insufficent ring buffer
space for samples. However, because the same performance monitor
can run on multiple processes or threads at the same time, a single
pm_stalled flag that impacts them all seems insufficient.
In particular, you can hit corner cases where the code fails to stop
performance monitors during a context switch out, because it thinks
the performance monitor is already stopped. However, in reality,
it may be that only the monitor running on a different CPU was stalled.
This patch attempts to fix that behavior by tracking on a per-CPU basis
whether a PM desires to run and whether it is "stalled". This lets the
code make better decisions about when to stop PMs and when to try to
restart them. Ideally, we should avoid the case where the code fails
to stop a PM during a context switch out.
MFC r290813:
Optimizations to the way hwpmc gathers user callchains
Changes to the code to gather user stacks:
* Delay setting pmc_cpumask until we actually have the stack.
* When recording user stack traces, only walk the portion of the ring
that should have samples for us.
MFC r290929:
Change the driver stats to what they really are: unsigned values.
When pmcstat exits after some samples were dropped, give the user an
idea of how many were lost. (Granted, these are global numbers, but
they may still help quantify the scope of the loss.)
MFC r290930:
Improve accuracy of PMC sampling frequency
The code tracks a counter which is the number of events until the next
sample. On context switch in, it loads the saved counter. On context
switch out, it tries to calculate a new saved counter.
Problems:
1. The saved counter was shared by all threads in a process. However, this
means that all threads would be initially loaded with the same saved
counter. However, that could result in sampling more often than once every
X number of events.
2. The calculation to determine a new saved counter was backwards. It
added when it should have subtracted, and subtracted when it should have
added. Assume a single-threaded process with a reload count of 1000
events. Assuming the counter on context switch in was 100 and the counter
on context switch out was 50 (meaning the thread has "consumed" 50 more
events), the code would calculate a new saved counter of 150 (instead of
the proper 50).
Fix:
1. As soon as the saved counter is used to initialize a monitor for a
thread on context switch in, set the saved counter to the reload count.
That way, subsequent threads to use the saved counter will get the full
reload count, assuring we sample at least once every X number of events
(across all threads).
2. Change the calculation of the saved counter. Due to the change to the
saved counter in #1, we simply need to add (modulo the reload count) the
remaining counter time we retrieve from the CPU when a thread is context
switched out.
MFC r291016:
Support a wider history counter in pmcstat(8) gmon output
pmcstat(8) contains an option to output sampling data in a gmon format
compatible with gprof(1). Currently, it uses the default histcounter,
which is an (unsigned short). With large sets of sampling data, it
is possible to overflow the maximum value provided by an (unsigned
short).
This change adds the -e argument to pmcstat. If -e and -g are both
specified, pmcstat will use a histcounter type of uint64_t.
MFC r291017:
Fix the date on the pmcstat(8) man page from r291016.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Provide vnode in memory map info for files on tmpfs
When providing memory map information to userland, populate the vnode pointer
for tmpfs files. Set the memory mapping to appear as a vnode type, to match
FreeBSD 9 behavior.
This fixes the use of tmpfs files with the dtrace pid provider,
procstat -v, procfs, linprocfs, pmc (pmcstat), and ptrace (PT_VM_ENTRY).
Submitted by: Eric Badger <eric@badgerio.us> (initial revision)
Obtained from: Dell Inc.
PR: 198431
|
|
|
|
| |
Use the proper mask when reloading sampling PMCs for Core CPUs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix two bugs that could result in PMC sampling effectively stopping.
In both cases, the the effect of the bug was that a very small positive
number was written to the counter. This means that a large number of
events needed to occur before the next sampling interrupt would trigger.
Even with very frequently occurring events like clock cycles wrapping all
the way around could take a long time. Both bugs occurred when updating
the saved reload count for an outgoing thread on a context switch.
First, the counter-independent code compares the current reload count
against the count set when the thread switched in and generates a delta
to apply to the saved count. If this delta causes the reload counter
to go negative, it would add a full reload interval to wrap it around to
a positive value. The fix is to add the full reload interval if the
resulting counter is zero.
Second, occasionally the raw counter value read during a context switch
has actually wrapped, but an interrupt has not yet triggered. In this
case the existing logic would return a very large reload count (e.g.
2^48 - 2 if the counter had overflowed by a count of 2). This was seen
both for fixed-function and programmable counters on an E5-2643.
Workaround this case by returning a reload count of zero.
PR: 198149
Sponsored by: Norse Corp, Inc.
|
|
|
|
|
|
|
|
| |
- Move hwpmc(4) debugging code under a new HWPMC_DEBUG option instead of
the broader DEBUG option.
- Convert hwpmc(4) debug printfs over to KTR.
Sponsored by: Norse Corp, Inc.
|
|
|
|
| |
Sponsored by: Netflix Inc.
|
|
|
|
|
|
|
|
|
| |
Update kernel inclusions of capability.h to use capsicum.h instead; some
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.
Sponsored by: Google, Inc.
|
|
|
|
|
|
|
|
| |
If the depth requested by the user is too large, it's better to provide
the maximum than the smaller default.
MFC of: r274766
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
| |
Use pmc_destroy_pmc_descriptor() to actually free the pmc, which is
consistent with pmc_destroy_owner_descriptor(). Also be sure to destroy
PMCs if a process exits or execs without explicitly releasing them.
|
|
|
|
|
|
|
|
|
|
|
| |
Since introducing the extra mapping in r250103 (head) for architectural performance
events we have actually counted 'Branch Instruction Retired' when people
asked for 'Unhalted core cycles' using the 'unhalted-core-cycles' event mask
mnemonic.
Reviewed by: jimharris
Discussed with: gnn, rwatson
Sponsored by: DARPA/AFRL
|
|
|
|
| |
Disable existing uncore hwpmc code for Nehalem and Westmere EX.
|
|
|
|
|
| |
Update hwpmc to support core events for Atom Silvermont microarchitecture.
(Model 0x4D as per Intel document 330061-001 01/2014)
|
|
|
|
| |
Remove some prototypes for undefined functions.
|
|
|
|
| |
Use correct types for sizeof() in the calculations for the malloc(9) sizes.
|
|
|
|
| |
Fix callchain capture for hwpmc(4). While here, some style(9) fixes, too.
|
|
|
|
|
|
|
|
|
|
| |
Add hwpmc(4) support for the PowerPC 970 class processors, direct events.
This also fixes asserts on removal of the module for the mpc74xx.
The PowerPC 970 processors have two different types of events: direct events
and indirect events. Thus far only direct events are supported. I included
some documentation in the driver on how indirect events work, but support is
for the future.
|
|
|
|
| |
MPC74xx should not fall through, to the error case.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this
shifts into the sign bit. Instead use (1U << 31) which gets the
expected result.
Similar to the (1 << 31) case it is not defined to do (2 << 30).
This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.
A similar change was made in OpenBSD.
|
|
|
|
|
|
|
|
|
| |
r259394:
Rebase the PMC indices at 1, since PMC_SOFT is at 0.
r259395,r259699:
Add userland PMC backtracing, and use the PMC trapframe macros for kernel
backtraces.
|
|
|
|
|
|
|
| |
r255745.
Pointy-hat to: davide
Approved by: re (implicit)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
current lock classes KPI it was really difficult because there was no
way to pass an rmtracker object to the lock/unlock routines. In order
to accomplish the task, modify the aforementioned functions so that
they can return (or pass as argument) an uinptr_t, which is in the rm
case used to hold a pointer to struct rm_priotracker for current
thread. As an added bonus, this fixes rm_sleep() in the rm shared
case, which right now can communicate priotracker structure between
lc_unlock()/lc_lock().
Suggested by: jhb
Reviewed by: jhb
Approved by: re (delphij)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in the future in a backward compatible (API and ABI) way.
The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.
The structure definition looks like this:
struct cap_rights {
uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
};
The initial CAP_RIGHTS_VERSION is 0.
The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.
The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.
To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)
We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:
#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)
There is new API to manage the new cap_rights_t structure:
cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
void cap_rights_set(cap_rights_t *rights, ...);
void cap_rights_clear(cap_rights_t *rights, ...);
bool cap_rights_is_set(const cap_rights_t *rights, ...);
bool cap_rights_is_valid(const cap_rights_t *rights);
void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);
Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:
cap_rights_t rights;
cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);
There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:
#define cap_rights_set(rights, ...) \
__cap_rights_set((rights), __VA_ARGS__, 0ULL)
void __cap_rights_set(cap_rights_t *rights, ...);
Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:
cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);
Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.
This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.
Sponsored by: The FreeBSD Foundation
|
| |
|
|
|
|
|
| |
will likely be done as more drivers are added, since AIM-compatible processors
have similar PMC configuration logic.
|
|
|
|
|
|
| |
malloc(9).
Reported by: pluknet, glebius
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SDM (June 2013) tables on these are rather confusing. Yes, they
assign the same name (BR_MISP_RETIRED.ALL_BRANCHES) to two codes
(C5H/00H and C5H/04H.) The latter however is the PEBS version.
So, to make it easier to see the difference - and yes, we can use
both without having to actually enable the PEBS specific bits! -
just rename the PEBS one to _PS so there's no clashing.
Tested:
* Sandy bridge
|
|
|
|
| |
Noticed by: hiren
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bridge Xeon.
Summary: These are PEBS events but they're also available as normal
counter/sample events. The source table (Table 19-2) lists the
base versions (LOAD, STLB_MISS, SPLIT, ALL) but it says they must
be qualified with other values. This particular commit fleshes
out those umask values.
Source:
* Linux; SDM June 2013, Volume 3B, Table 19-2 and 18-21.
Tested:
* Sandy Bridge (non-Xeon)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
kld_unload event handler which gets invoked after a linker file has been
successfully unloaded. The kld_unload and kld_load event handlers are now
invoked with the shared linker lock held, while kld_unload_try is invoked
with the lock exclusively held.
Convert hwpmc(4) to use these event handlers instead of having
kern_kldload() and kern_kldunload() invoke hwpmc(4) hooks whenever files are
loaded or unloaded. This has no functional effect, but simplifes the linker
code somewhat.
Reviewed by: jhb
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Linux and Intel examples.
Sourced:
* https://github.com/andikleen/pmu-tools/blob/master/snb-client.csv
* http://software.intel.com/en-us/comment/1747932#comment-1747932
Note:
* It's not currently in the Intel SDM; I need to chase down what's
going on.
Tested:
* Sandy Bridge
|
|
|
|
|
| |
Reviewed by: gnn
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add in MEM_LOAD_UOPS_LLC_HIT_RETIRED for both sandy bridge and sandy
bridge Xeon. Right now it only is enabled for Sandy Bridge.
* D2/0F is actually a combination rather than a separate counter, so
just flip that on for the CPU types that support it.
There's an errata for using this on SB Xeon hardware - I've documented
it in kern/181346.
Tested:
* Sandy Bridge
* Sandy Bridge Xeon
Sponsored by: Netflix, Inc.
|
|
|
|
| |
Sponsored by: EMC / Isilon Storage Division
|
|
|
|
|
|
|
|
| |
versions than the one in base (dim@ mentioned he tried on 4.7.3 and 4.8.1)
do not whine about it, so, at some point this workaround will be reverted.
Reported by: ache
Discussed with: dim
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
those of some non-architectural core events. This is not a problem in the
general case as long as there's an 1:1 mapping between the two, but there
are few exceptions. For example, 3CH_01H on Nehalem/Westmere represents
both unhalted-reference-cycles and CPU_CLK_UNHALTED.REF_P.
CPU_CLK_UNHALTED.REF_P on the aforementioned architectures does not measure
reference (i.e. bus) but TSC, so there's the need to disambiguate.
In order to avoid the namespace collision rename all the architectural
events in a way they cannot be ambigous and refactor the architectural
events handling function to reflect this change.
While here, per Jim Harris request, rename
iap_architectural_event_is_unsupported() to iap_event_is_architectural().
Discussed with: jimharris
Reviewed by: jimharris, gnn
|
|
|
|
| |
Do not change the initialization order in pmc_intel_initialize().
|
|
|
|
|
|
|
|
| |
at least if FreeBSD is ran under VirtualBox. In order to avoid the leakage,
properly deallocate structures in case CPU claims that hw performance
monitoring counters are not supported.
Reported by: hiren
|
|
|
|
|
|
|
| |
intrucion-retired, llc-misses and llc-reference events can now be
allocated.
Reviewed by: jimharris, gnn
|
|
|
|
|
| |
PR: kern/177496
Approved by: sbruno (mentor)
|
|
|
|
|
| |
Reviewed by: sbruno
MFC after: 1 week
|
|
|
|
|
|
|
| |
Submitted by: hiren.panchasara@gmail.com
Reviewed by: sbruno@freebsd.org
Obtained from: Yahoo! Inc.
MFC after: 2 weeks
|