| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix two bugs that could result in PMC sampling effectively stopping.
In both cases, the the effect of the bug was that a very small positive
number was written to the counter. This means that a large number of
events needed to occur before the next sampling interrupt would trigger.
Even with very frequently occurring events like clock cycles wrapping all
the way around could take a long time. Both bugs occurred when updating
the saved reload count for an outgoing thread on a context switch.
First, the counter-independent code compares the current reload count
against the count set when the thread switched in and generates a delta
to apply to the saved count. If this delta causes the reload counter
to go negative, it would add a full reload interval to wrap it around to
a positive value. The fix is to add the full reload interval if the
resulting counter is zero.
Second, occasionally the raw counter value read during a context switch
has actually wrapped, but an interrupt has not yet triggered. In this
case the existing logic would return a very large reload count (e.g.
2^48 - 2 if the counter had overflowed by a count of 2). This was seen
both for fixed-function and programmable counters on an E5-2643.
Workaround this case by returning a reload count of zero.
PR: 198149
Sponsored by: Norse Corp, Inc.
|
|
|
|
|
|
|
|
| |
- Move hwpmc(4) debugging code under a new HWPMC_DEBUG option instead of
the broader DEBUG option.
- Convert hwpmc(4) debug printfs over to KTR.
Sponsored by: Norse Corp, Inc.
|
|
|
|
| |
Sponsored by: Netflix Inc.
|
|
|
|
|
|
|
|
|
| |
Update kernel inclusions of capability.h to use capsicum.h instead; some
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.
Sponsored by: Google, Inc.
|
|
|
|
|
|
|
|
| |
If the depth requested by the user is too large, it's better to provide
the maximum than the smaller default.
MFC of: r274766
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
| |
Use pmc_destroy_pmc_descriptor() to actually free the pmc, which is
consistent with pmc_destroy_owner_descriptor(). Also be sure to destroy
PMCs if a process exits or execs without explicitly releasing them.
|
|
|
|
|
|
|
|
|
|
|
| |
Since introducing the extra mapping in r250103 (head) for architectural performance
events we have actually counted 'Branch Instruction Retired' when people
asked for 'Unhalted core cycles' using the 'unhalted-core-cycles' event mask
mnemonic.
Reviewed by: jimharris
Discussed with: gnn, rwatson
Sponsored by: DARPA/AFRL
|
|
|
|
| |
Disable existing uncore hwpmc code for Nehalem and Westmere EX.
|
|
|
|
|
| |
Update hwpmc to support core events for Atom Silvermont microarchitecture.
(Model 0x4D as per Intel document 330061-001 01/2014)
|
|
|
|
| |
Remove some prototypes for undefined functions.
|
|
|
|
| |
Use correct types for sizeof() in the calculations for the malloc(9) sizes.
|
|
|
|
| |
Fix callchain capture for hwpmc(4). While here, some style(9) fixes, too.
|
|
|
|
|
|
|
|
|
|
| |
Add hwpmc(4) support for the PowerPC 970 class processors, direct events.
This also fixes asserts on removal of the module for the mpc74xx.
The PowerPC 970 processors have two different types of events: direct events
and indirect events. Thus far only direct events are supported. I included
some documentation in the driver on how indirect events work, but support is
for the future.
|
|
|
|
| |
MPC74xx should not fall through, to the error case.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this
shifts into the sign bit. Instead use (1U << 31) which gets the
expected result.
Similar to the (1 << 31) case it is not defined to do (2 << 30).
This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.
A similar change was made in OpenBSD.
|
|
|
|
|
|
|
|
|
| |
r259394:
Rebase the PMC indices at 1, since PMC_SOFT is at 0.
r259395,r259699:
Add userland PMC backtracing, and use the PMC trapframe macros for kernel
backtraces.
|
|
|
|
|
|
|
| |
r255745.
Pointy-hat to: davide
Approved by: re (implicit)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
current lock classes KPI it was really difficult because there was no
way to pass an rmtracker object to the lock/unlock routines. In order
to accomplish the task, modify the aforementioned functions so that
they can return (or pass as argument) an uinptr_t, which is in the rm
case used to hold a pointer to struct rm_priotracker for current
thread. As an added bonus, this fixes rm_sleep() in the rm shared
case, which right now can communicate priotracker structure between
lc_unlock()/lc_lock().
Suggested by: jhb
Reviewed by: jhb
Approved by: re (delphij)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in the future in a backward compatible (API and ABI) way.
The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.
The structure definition looks like this:
struct cap_rights {
uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
};
The initial CAP_RIGHTS_VERSION is 0.
The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.
The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.
To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)
We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:
#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)
There is new API to manage the new cap_rights_t structure:
cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
void cap_rights_set(cap_rights_t *rights, ...);
void cap_rights_clear(cap_rights_t *rights, ...);
bool cap_rights_is_set(const cap_rights_t *rights, ...);
bool cap_rights_is_valid(const cap_rights_t *rights);
void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);
Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:
cap_rights_t rights;
cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);
There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:
#define cap_rights_set(rights, ...) \
__cap_rights_set((rights), __VA_ARGS__, 0ULL)
void __cap_rights_set(cap_rights_t *rights, ...);
Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:
cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);
Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.
This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.
Sponsored by: The FreeBSD Foundation
|
| |
|
|
|
|
|
| |
will likely be done as more drivers are added, since AIM-compatible processors
have similar PMC configuration logic.
|
|
|
|
|
|
| |
malloc(9).
Reported by: pluknet, glebius
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SDM (June 2013) tables on these are rather confusing. Yes, they
assign the same name (BR_MISP_RETIRED.ALL_BRANCHES) to two codes
(C5H/00H and C5H/04H.) The latter however is the PEBS version.
So, to make it easier to see the difference - and yes, we can use
both without having to actually enable the PEBS specific bits! -
just rename the PEBS one to _PS so there's no clashing.
Tested:
* Sandy bridge
|
|
|
|
| |
Noticed by: hiren
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bridge Xeon.
Summary: These are PEBS events but they're also available as normal
counter/sample events. The source table (Table 19-2) lists the
base versions (LOAD, STLB_MISS, SPLIT, ALL) but it says they must
be qualified with other values. This particular commit fleshes
out those umask values.
Source:
* Linux; SDM June 2013, Volume 3B, Table 19-2 and 18-21.
Tested:
* Sandy Bridge (non-Xeon)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
kld_unload event handler which gets invoked after a linker file has been
successfully unloaded. The kld_unload and kld_load event handlers are now
invoked with the shared linker lock held, while kld_unload_try is invoked
with the lock exclusively held.
Convert hwpmc(4) to use these event handlers instead of having
kern_kldload() and kern_kldunload() invoke hwpmc(4) hooks whenever files are
loaded or unloaded. This has no functional effect, but simplifes the linker
code somewhat.
Reviewed by: jhb
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Linux and Intel examples.
Sourced:
* https://github.com/andikleen/pmu-tools/blob/master/snb-client.csv
* http://software.intel.com/en-us/comment/1747932#comment-1747932
Note:
* It's not currently in the Intel SDM; I need to chase down what's
going on.
Tested:
* Sandy Bridge
|
|
|
|
|
| |
Reviewed by: gnn
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add in MEM_LOAD_UOPS_LLC_HIT_RETIRED for both sandy bridge and sandy
bridge Xeon. Right now it only is enabled for Sandy Bridge.
* D2/0F is actually a combination rather than a separate counter, so
just flip that on for the CPU types that support it.
There's an errata for using this on SB Xeon hardware - I've documented
it in kern/181346.
Tested:
* Sandy Bridge
* Sandy Bridge Xeon
Sponsored by: Netflix, Inc.
|
|
|
|
| |
Sponsored by: EMC / Isilon Storage Division
|
|
|
|
|
|
|
|
| |
versions than the one in base (dim@ mentioned he tried on 4.7.3 and 4.8.1)
do not whine about it, so, at some point this workaround will be reverted.
Reported by: ache
Discussed with: dim
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
those of some non-architectural core events. This is not a problem in the
general case as long as there's an 1:1 mapping between the two, but there
are few exceptions. For example, 3CH_01H on Nehalem/Westmere represents
both unhalted-reference-cycles and CPU_CLK_UNHALTED.REF_P.
CPU_CLK_UNHALTED.REF_P on the aforementioned architectures does not measure
reference (i.e. bus) but TSC, so there's the need to disambiguate.
In order to avoid the namespace collision rename all the architectural
events in a way they cannot be ambigous and refactor the architectural
events handling function to reflect this change.
While here, per Jim Harris request, rename
iap_architectural_event_is_unsupported() to iap_event_is_architectural().
Discussed with: jimharris
Reviewed by: jimharris, gnn
|
|
|
|
| |
Do not change the initialization order in pmc_intel_initialize().
|
|
|
|
|
|
|
|
| |
at least if FreeBSD is ran under VirtualBox. In order to avoid the leakage,
properly deallocate structures in case CPU claims that hw performance
monitoring counters are not supported.
Reported by: hiren
|
|
|
|
|
|
|
| |
intrucion-retired, llc-misses and llc-reference events can now be
allocated.
Reviewed by: jimharris, gnn
|
|
|
|
|
| |
PR: kern/177496
Approved by: sbruno (mentor)
|
|
|
|
|
| |
Reviewed by: sbruno
MFC after: 1 week
|
|
|
|
|
|
|
| |
Submitted by: hiren.panchasara@gmail.com
Reviewed by: sbruno@freebsd.org
Obtained from: Yahoo! Inc.
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
|
| |
0x3C: /* Per Intel document 325462-045US 01/2013. */
Add manpage to document all the goodness that is available in this
processor model.
Submitted by: hiren panchasara <hiren.panchasara@gmail.com>
Reviewed by: jimharris, sbruno
Obtained from: Yahoo! Inc.
MFC after: 2 weeks
|
|\ |
|
| |
| |
| |
| |
| | |
Reviewed by: mav
MFC after: 1 month
|
| |
| |
| |
| | |
Reviewed by: fabient
|
| |
| |
| |
| |
| |
| | |
This at least fixes -n option of pmcstat.
Reviewed by: fabient
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| | |
their "write" versions.
Sponsored by: EMC / Isilon storage division
|
|/
|
|
|
|
|
|
| |
* VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations
* VM_OBJECT_SLEEP() is introduced as a general purpose primitve to
get a sleep operation using a VM_OBJECT_LOCK() as protection
* The approach must bear with vm_pager.h namespace pollution so many
files require including directly rwlock.h
|