path: root/usr.sbin/pmcstat
Commit message | Author | Age | Files | Lines
* MFC r290811: | jtl | 2016-01-14 | 4 | -18/+78
    Fix hwpmc "stalled" behavior

    Currently, there is a single pm_stalled flag that tracks whether a performance monitor was "stalled" due to insufficient ring buffer space for samples. However, because the same performance monitor can run on multiple processes or threads at the same time, a single pm_stalled flag that impacts them all seems insufficient.

    In particular, you can hit corner cases where the code fails to stop performance monitors during a context switch out, because it thinks the performance monitor is already stopped. However, in reality, it may be that only the monitor running on a different CPU was stalled.

    This patch attempts to fix that behavior by tracking on a per-CPU basis whether a PM desires to run and whether it is "stalled". This lets the code make better decisions about when to stop PMs and when to try to restart them. Ideally, we should avoid the case where the code fails to stop a PM during a context switch out.

    MFC r290813:
    Optimizations to the way hwpmc gathers user callchains

    Changes to the code to gather user stacks:
    * Delay setting pmc_cpumask until we actually have the stack.
    * When recording user stack traces, only walk the portion of the ring that should have samples for us.

    MFC r290929:
    Change the driver stats to what they really are: unsigned values.

    When pmcstat exits after some samples were dropped, give the user an idea of how many were lost. (Granted, these are global numbers, but they may still help quantify the scope of the loss.)

    MFC r290930:
    Improve accuracy of PMC sampling frequency

    The code tracks a counter which is the number of events until the next sample. On context switch in, it loads the saved counter. On context switch out, it tries to calculate a new saved counter.

    Problems:
    1. The saved counter was shared by all threads in a process. This means that all threads would initially be loaded with the same saved counter, which could result in sampling more often than once every X number of events.
    2. The calculation to determine a new saved counter was backwards. It added when it should have subtracted, and subtracted when it should have added. Assume a single-threaded process with a reload count of 1000 events. Assuming the counter on context switch in was 100 and the counter on context switch out was 50 (meaning the thread has "consumed" 50 more events), the code would calculate a new saved counter of 150 (instead of the proper 50).

    Fix:
    1. As soon as the saved counter is used to initialize a monitor for a thread on context switch in, set the saved counter to the reload count. That way, subsequent threads to use the saved counter will get the full reload count, assuring we sample at least once every X number of events (across all threads).
    2. Change the calculation of the saved counter. Due to the change to the saved counter in #1, we simply need to add (modulo the reload count) the remaining counter time we retrieve from the CPU when a thread is context switched out.

    MFC r291016:
    Support a wider history counter in pmcstat(8) gmon output

    pmcstat(8) contains an option to output sampling data in a gmon format compatible with gprof(1). Currently, it uses the default histcounter, which is an (unsigned short). With large sets of sampling data, it is possible to overflow the maximum value provided by an (unsigned short).

    This change adds the -e argument to pmcstat. If -e and -g are both specified, pmcstat will use a histcounter type of uint64_t.

    MFC r291017:
    Fix the date on the pmcstat(8) man page from r291016.
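The saved-counter fix described under r290930 can be illustrated with a small sketch. This is not the hwpmc code: the structure and function names below are hypothetical, and the wrap-around is written as a conditional subtraction, which is one way to read "add (modulo the reload count)". It reproduces the worked example from the message (reload count 1000, counter 100 on switch in, 50 remaining on switch out).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-process sampling state; illustrative names only,
 * not the real hwpmc structures. */
struct sample_state {
	uint64_t reload_count;	/* events between samples (the "X" above) */
	uint64_t saved;		/* events left until the next sample */
};

/* Context switch in: hand the saved counter to the incoming thread, then
 * reset the shared value to the full reload count so that any other thread
 * switching in before this one switches out starts a full interval. */
static uint64_t
switch_in(struct sample_state *st)
{
	uint64_t load;

	assert(st->reload_count > 0);
	load = st->saved;
	st->saved = st->reload_count;
	return (load);
}

/* Context switch out: fold the events still remaining on the CPU counter
 * back into the saved counter, wrapping at the reload count. */
static void
switch_out(struct sample_state *st, uint64_t remaining)
{
	st->saved += remaining;
	if (st->saved > st->reload_count)
		st->saved -= st->reload_count;
}
```

With a reload count of 1000 and a saved counter of 100, switch_in() returns 100 and leaves 1000 behind; a later switch_out() with 50 events remaining yields 1050, which wraps to the expected 50 rather than the 150 the old calculation produced.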
* MFC r280793 | vangyzen | 2015-10-02 | 2 | -2/+8
    pmcstat.8: fix -a flag description; improve -m flag to match

    The -a flag reads a file saved by -O, not -o.
    The -m flag requires the -R flag. Copy that paragraph from -a.

    Sponsored by: Dell Inc.
* MFC 283613,287374: | jhb | 2015-09-29 | 2 | -30/+21
    Use the cpuset API more consistently:
    - Fetch the root set from cpuset_getaffinity() instead of assuming all CPUs from 0 to hw.ncpu are the root set.
    - Use CPU_SETSIZE and CPU_FFS.
    - The original notion of halted CPUs that the manpage and code refer to is gone. Use the term "available" instead.
* MFC: r282866 | hiren | 2015-06-02 | 1 | -1/+2
    Fix pmcstat symbol resolution for userland processes.

    When examining existing processes, pmcstat fails to correctly determine the locations of executable sections of the process due to a miscalculated virtual load address. This does not affect newly launched processes, as the same value is passed as a "start address" to pmcstat_image_link(), thus nullifying the effect of it. The issue manifests itself in processes not being reported in the pmcstat(8) output and "dubious frames" being reported.

    Fix it for now by ignoring all the sections except the executable one. This won't fix the issue for objects with multiple executable sections but helps in the majority of real-world use cases.

    The real solution would be to modify the MAP-IN event to include the appropriate load address so pmcstat(8) won't have to manually parse object files to try to determine it.

    PR: 198147, 198148
    Submitted by: stas
* MFC 282643: | jhb | 2015-06-01 | 2 | -30/+36
    Use the kern.bootfile sysctl to set the default kernel path rather than hardcoding /boot/kernel. This allows pmcstat(8) to work without -k when using nextboot -k or 'boot foo' at the loader to boot alternate kernels.

    Sponsored by: Norse Corp, Inc.
* MFC r273737, 273739 | bapt | 2014-11-03 | 1 | -1/+7
    Clarify the documentation of pmcstat: the -d argument should be passed before -p, -s, -P or -S to be taken into account.

    Differential Revision: https://reviews.freebsd.org/D1011
    Reviewed by: adrian, gnn
* MFC r266903: Update default callchain depth to 16 to match kernel | emaste | 2014-06-06 | 1 | -1/+1
* MFC: 266209 | gnn | 2014-05-30 | 3 | -3/+51
    Add a command line argument (-l) to end event collection after some number of seconds. The number of seconds may be a fraction.

    Submitted by: Julien Charbon <jcharbon@versign.com>
    Relnotes: yes
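As an illustration of accepting a fractional number of seconds on the command line, here is a minimal sketch. It is an assumption about how such an argument could be validated, not the code pmcstat uses for -l; the struct duration type and the parse_duration() name are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical holder for a parsed duration (whole seconds plus
 * nanoseconds); not a pmcstat type. */
struct duration {
	long sec;
	long nsec;
};

/* Parse a possibly fractional, strictly positive number of seconds such
 * as "2.5". Returns 0 on success and -1 on malformed or out-of-range
 * input. */
static int
parse_duration(const char *arg, struct duration *out)
{
	char *end;
	double secs;

	assert(arg != NULL && out != NULL);
	errno = 0;
	secs = strtod(arg, &end);
	/* Reject empty input, trailing junk, range errors, NaN, zero and
	 * negative values, and values too large to convert safely. */
	if (end == arg || *end != '\0' || errno == ERANGE ||
	    !(secs > 0.0) || secs > 1e9)
		return (-1);
	out->sec = (long)secs;
	out->nsec = (long)((secs - (double)out->sec) * 1e9);
	return (0);
}
```

The resulting seconds/nanoseconds pair could then be handed to whatever timer mechanism the tool uses to end collection.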
* MFC r266208: Speed up pmcstat by improving string hash | emaste | 2014-05-23 | 1 | -3/+3
    In one case generating callgraph output from a 24MB system-wide sampling data file took 17.4 seconds on average. Profiling showed pmcstat spending a lot of time in strcmp, due to hash collisions. Replacing the XOR-only hash with FNV-1a reduces the run time for my test by 40%.
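For reference, a 32-bit FNV-1a string hash looks roughly like the sketch below. The constants are the standard FNV offset basis and prime; the function names and the bucket helper are hypothetical and are not taken from the pmcstat source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FNV1A_32_OFFSET_BASIS	2166136261u
#define FNV1A_32_PRIME		16777619u

/* 32-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
 * The multiply spreads every input byte across the whole word, which an
 * XOR-only hash does not do. */
static uint32_t
fnv1a_32(const char *s)
{
	uint32_t h = FNV1A_32_OFFSET_BASIS;

	assert(s != NULL);
	for (; *s != '\0'; s++) {
		h ^= (uint8_t)*s;
		h *= FNV1A_32_PRIME;
	}
	return (h);
}

/* Hypothetical helper: map a string to one of nbuckets hash chains. */
static size_t
bucket_for(const char *s, size_t nbuckets)
{
	assert(nbuckets > 0);
	return ((size_t)(fnv1a_32(s) % nbuckets));
}
```

Fewer collisions mean shorter hash chains, and therefore fewer strcmp calls per lookup, which is where the profile in the commit message showed the time going.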
* Merged r262424-262425,265085 | scottl | 2014-05-07 | 7 | -4/+205
    Add the -a option to pmcstat. This produces a full stack trace on the sampled points. See the man page for details on how this works.

    Obtained from: Netflix, Inc.
* More -Wmissing-variable-declarations fixes. | ed | 2012-10-19 | 4 | -16/+15
    In addition to adding `static' where possible:

    - bin/date: Move `retval' into extern.h to make it visible to date.c.
    - bin/ed: Move globally used variables into ed.h.
    - sbin/camcontrol: Move `verbose' into camcontrol.h and fix shadow warnings.
    - usr.bin/calendar: Remove unneeded variables.
    - usr.bin/chat: Make `line' local instead of global.
    - usr.bin/elfdump: Comment out unneeded function.
    - usr.bin/rlogin: Use _Noreturn instead of __dead2.
    - usr.bin/tset: Pull `Ospeed' into extern.h.
    - usr.sbin/mfiutil: Put global variables in mfiutil.h.
    - usr.sbin/pkg: Remove unused `os_corres'.
    - usr.sbin/quotaon, usr.sbin/repquota: Remove unused `qfname'.
* Add -m option (for printing sampled PCs to a file) to pmcstat usage | jimharris | 2012-08-22 | 1 | -0/+1
    message.

    Sponsored by: Intel
    MFC after: 3 days
* Remove spurious ARM symbols from lookup table. | fabient | 2012-06-06 | 1 | -0/+8
    MFC after: 3 days
* Don't crash trying to load symbols from stripped file. | glebius | 2012-06-05 | 1 | -0/+2
    PR: bin/167361
    Submitted by: Slawa Olhovchenkov <slw zxy.spb.ru>
    Silence from: jkoshy
* Minor spelling fixes. | joel | 2012-06-03 | 1 | -1/+1
* Add software PMC support. | fabient | 2012-03-28 | 1 | -0/+15
    New kernel events can be added at various locations for sampling or counting. This will, for example, allow easy system profiling, whatever the processor is, with known tools like pmcstat(8).

    Simultaneous usage of software PMC and hardware PMC is possible, for example looking at lock acquire failures and page faults while sampling on instructions.

    Sponsored by: NETASQ
    MFC after: 1 month
* - Support inlined location in calltree output. | fabient | 2012-03-28 | 2 | -241/+402
    In case of multiple levels of inlining all the locations are flattened. Requires recent binutils/addr2line (head works or binutils from ports with the right $PATH order).
    - Multiple fixes in the calltree output (recursion case, ...)
    - Fix the calltree top view that previously hid some shared nodes.

    Tested with Kcachegrind(kdesdk4)/qcachegrind(head).

    Sponsored by: NETASQ
* mdoc: correct .Bd/.Bl arguments. | joel | 2012-03-26 | 1 | -1/+1
    Reviewed by: brueffer
* Fix base vaddr detection for ELF binaries. PT_LOAD with offset 0 is not | gonzo | 2012-03-22 | 1 | -2/+2
    mandatory for ELF binaries so we'll use the segment with offset less than the alignment and align it appropriately (which covers the pt_offset == 0 case)
* Fix warning when compiling with gcc46: | eadler | 2012-01-20 | 1 | -2/+1
    error: variable 'current_cpu' set but not used

    Approved by: dim, cperciva (mentor, blanket for pre-mentorship already-approved commits)
    MFC after: 3 days
* Spelling fixes for usr.sbin/ | uqs | 2011-12-30 | 4 | -4/+4
* KNF | obrien | 2011-11-15 | 1 | -16/+17
* Improve the chances of matching an outputted string with the line of code. | obrien | 2011-11-15 | 4 | -122/+151
* - fix duplicate "a a" in some comments | eadler | 2011-11-13 | 1 | -1/+1
    Submitted by: eadler
    Approved by: simon
    MFC after: 3 days
* Two bugs fixed: | fabient | 2011-11-01 | 1 | -2/+4
    - Do not close stdout or stderr when redirecting to file.
    - Correctly handle error code to detect when no buffer available.

    MFC after: 1 month
* Add a flush of the current PMC log buffer before displaying the next top. | fabient | 2011-10-18 | 2 | -9/+11
    As the underlying block is 4KB, if the PMC throughput is low the measurement will be reported on the next tick. pmcstat(8) uses the modified flush API to reclaim the current buffer before displaying the next top.

    MFC after: 1 month
* Convert pmcstat about using cpuset_t rather than relying on plain 32 bit | attilio | 2011-08-07 | 3 | -74/+81
    ints. That fixes a first bug where pmcstat wasn't using the old cpumask_t interface and now also brings the full support for more than 32 cpus.

    While here, make the functions pmcstat_clone_event_descriptor() and pmcstat_get_cpumask() private to pmcstat.

    The problem of assuming cpu dense masks still persists and should be eventually fixed, as reported by avg.

    Tested by: pluknet
    Reviewed by: gnn
    Approved by: re (kib)
* pmcstat, pmccontrol: catch up with removal of machdep.hlt_cpus sysctl | avg | 2011-07-15 | 1 | -10/+2
    Reported by: Pan Tsu <inyaoo@gmail.com>
    Reviewed by: attilio
    No objections: gnn
* Remove duplicated header files | kevlo | 2011-06-24 | 1 | -1/+0
* When an asm location cannot be resolved to a function the cost | fabient | 2010-09-03 | 4 | -0/+6
    will be spread as a small value and then filtered by the threshold. As a first-step solution, display the number of events that cannot be resolved to a valid function location.

    MFC after: 1 week
* - Do not use the runtime mask when logfile is specified. | fabient | 2010-08-03 | 3 | -4/+12
    - Revert the fix on rtld path that is not necessary.

    MFC after: 1 week
* Allow file as a top source, it works with socket now. | fabient | 2010-08-03 | 2 | -72/+90
    This will allow top monitoring using socket/ssh tunnelling of a system without local symbols.

    client: pmcstat -R <ip>:<port> -T -r <symbolspath>
    monitored device: pmcstat -Sinstructions -O <ip>:<port>

    - Move the file read in the event loop
    - Initialize and clean log in all cases
    - Preserve global stats value during top refresh
    - Fix the rtld/line resolver that ignored the '-r' prefix
    - Support socket for '-R' (server mode)
    - Display the statistics when exiting top mode
* Fix the calltree top view that incorrectly filtered out some nodes. | fabient | 2010-08-02 | 1 | -2/+8
    MFC after: 1 week
* Fix warnings found by Coverity. | fabient | 2010-06-05 | 3 | -4/+8
    Found with: Coverity Prevent(tm)
    MFC after: 1 month
* Rework the calltree top view by critical callchain. | fabient | 2010-05-07 | 1 | -86/+104
    The percentage shown is the sum of the cost for the codepath.

    MFC after: 1 week
* Exclude undefined symbol from ELF file when doing function resolve. | fabient | 2010-05-06 | 1 | -0/+2
    MFC after: 3 days
* Apply threshold filter to root node in calltree view. | fabient | 2010-04-21 | 1 | -5/+10
    MFC after: 3 days
* Move fatal error at the right place. | fabient | 2010-04-14 | 3 | -2/+4
    Fix exit from top mode when checking if PMC is available.

    MFC after: 3 days
* mdoc: order prologue macros consistently by Dd/Dt/Os | uqs | 2010-04-14 | 1 | -1/+1
    Although groff_mdoc(7) gives another impression, this is the ordering most widely used and also required by mdocml/mandoc.

    Reviewed by: ru
    Approved by: philip, ed (mentors)
* Improve "top" header by: | fabient | 2010-04-02 | 4 | -13/+59
    - Display sample received per PMCs (or merged PMCs).
    - Display percentage vs all samples
* Wait for pmc name in the log before displaying data. | fabient | 2010-03-28 | 1 | -5/+8
    This will solve an abort in case of low throughput PMCs.

    MFC after: 3 days
* Do not overflow the term in the case of multi-line display. | fabient | 2010-03-26 | 1 | -4/+7
    MFC after: 3 days
* Change the way shutdown is handled for log file. | fabient | 2010-03-08 | 1 | -3/+1
    pmc_flush_logfile is now non-blocking and just asks the kernel to shut down the file. From that point, no more data is accepted by the log thread, and when the last buffer is flushed the file is closed.

    This will remove a deadlock between pmcstat asking for flush while it cannot flush the pipe itself.

    MFC after: 3 days
* Bug fixed: | fabient | 2010-03-05 | 3 | -2/+5
    - no display on serial terminal in top mode.
    - display alignment for continuation string.
    - correct invalid value used for display limit.

    MFC after: 3 days
* Fixed dependencies (make checkdpadd). | ru | 2010-02-25 | 1 | -1/+1
* - Reorganize code in 'plugin' to share log processing. | fabient | 2010-02-11 | 15 | -1220/+3670
    - Kcachegrind (calltree) support with assembly/source code mapping and call count estimator (-F).
    - Top mode for calltree and callgraph plugin (-T).

    MFC after: 1 month
* The last big commit: let usr.sbin/ use WARNS=6 by default. | ed | 2010-01-02 | 1 | -2/+0
* (S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument. | antoine | 2009-12-28 | 1 | -1/+1
    Fix some wrong usages.
    Note: this does not affect generated binaries as this argument is not used.

    PR: 137213
    Submitted by: Eygene Ryabinkin (initial version)
    MFC after: 1 month
* Catch up with the times: "mozilla" -> "firefox". | jkoshy | 2009-06-02 | 1 | -2/+2
* Close the read side of the pipe to self when exiting. | jkoshy | 2008-12-23 | 1 | -0/+3