path: root/sys/kern/subr_prof.c
Commit log, newest first. Each entry: date, author (files changed, -deleted/+added lines), followed by the commit message.
* 2003-02-08 julian (1 file, -21/+46): A little infrastructure, preceding some upcoming changes
    to the profiling and statistics code.
    Submitted by: DavidXu@
    Reviewed by: peter@
* 2003-02-01 julian (1 file, -47/+22): Reversion of commit by Davidxu plus fixes since applied.
    I'm not convinced there is anything major wrong with the patch but
    them's the rules.. I am using my "David's mentor" hat to revert this
    as he's offline for a while.
* 2003-01-26 davidxu (1 file, -22/+47): Move UPCALL related data structure out of kse, introduce a new
    data structure called kse_upcall to manage UPCALLs. All KSE binding
    and loaning code is gone.

    A thread that owns an upcall can collect all completed syscall
    contexts in its ksegrp, turn itself into UPCALL mode, and take those
    contexts back to userland. Any thread without an upcall structure has
    to export its context and exit at the user boundary.

    Any thread running in user mode owns an upcall structure. When it
    enters the kernel and the kse mailbox's current thread pointer is not
    NULL, then when the thread blocks in the kernel, a new UPCALL thread
    is created and the upcall structure is transferred to it. If the kse
    mailbox's current thread pointer is NULL, no UPCALL thread is created
    when a thread blocks in the kernel.

    Each upcall always has an owner thread. Userland can remove an upcall
    by calling kse_exit; when all upcalls in a ksegrp are removed, the
    group is automatically shut down. An upcall owner thread also exits
    when the process is in exiting state; when an owner thread exits, the
    upcall it owns is removed as well.

    KSE is a pure scheduler entity: it represents a virtual cpu. A running
    thread always has a KSE associated with it. The scheduler is free to
    assign a KSE to a thread according to thread priority; if a thread's
    priority changes, its KSE can be moved to another thread.

    When a ksegrp is created, N KSEs are created in the group, where N is
    the number of physical cpus in the current system. This makes it
    possible that, even if a userland UTS is only single-CPU safe, threads
    in the kernel can still execute on different cpus in parallel.
    Userland calls kse_create to add more upcall structures to a ksegrp
    to increase its own concurrency; the kernel is not restricted by the
    number of upcalls userland provides.

    The code hasn't been tested under SMP by the author due to lack of
    hardware.
    Reviewed by: julian
* 2003-01-21 alfred (1 file, -2/+2): Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
    Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
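    For context, the caller-visible effect, sketched from the message
    (kernel malloc(9) of that era; M_TEMP and the variables are
    illustrative):

        /* Before: several overlapping flags. */
        buf = malloc(size, M_TEMP, M_WAITOK);   /* may sleep */
        buf = malloc(size, M_TEMP, M_NOWAIT);   /* may fail; M_DONTWAIT was
                                                 * the mbuf-side spelling */

        /* After: 0 means "may sleep"; M_NOWAIT is the one "may fail" flag. */
        buf = malloc(size, M_TEMP, 0);
        buf = malloc(size, M_TEMP, M_NOWAIT);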
* 2003-01-07 phk (1 file, -2/+1): Fix warnings & errors caused by my last commit.
* 2003-01-06 phk (1 file, -0/+47): This is all "#if defined(__i386__) && __GNUC__ >= 2":
    Add support for GCC's --test-coverage --profile-arcs options.
    Add code to call the functions listed in the .ctors section; these are
    used to string the per-.o-file counter blocks into a linked list.
    Add an empty __bb_fork_func() to cope with GCC's magic handling of
    exec*()-named functions.
    Adding support for other platforms should be trivial, but involves
    determining the exact data types gcc uses on that platform.
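    A rough sketch of the .ctors walk the message describes (not the
    committed code; __CTOR_LIST__ and __CTOR_END__ are the traditional
    GCC/linker symbol names and are assumptions here):

        typedef void (*ctor_t)(void);

        /* GCC emits one constructor pointer per object file into .ctors;
         * the linker brackets the section with these symbols. */
        extern ctor_t __CTOR_LIST__[], __CTOR_END__[];

        static void
        run_ctors(void)
        {
                ctor_t *f;

                for (f = __CTOR_LIST__; f < __CTOR_END__; f++)
                        if (*f != (ctor_t)0 && *f != (ctor_t)-1)
                                (**f)();  /* chains this .o's counters in */
        }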
* 2002-10-01 phk (1 file, -0/+2): Don't #error if we are lint.
* 2002-06-29 alfred (1 file, -2/+2): more caddr_t removal.
* 2002-03-19 alfred (1 file, -1/+1): Remove __P.
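    For readers who have not seen it: __P() wrapped prototype argument
    lists so headers could serve both K&R and ANSI compilers. A generic
    before/after (the function shown is hypothetical):

        /* Before: expands to (args) under ANSI C, to () under K&R C. */
        int     foo __P((struct proc *p, int flags));

        /* After: a plain ANSI prototype. */
        int     foo(struct proc *p, int flags);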
* 2001-12-18 jhb (1 file, -2/+1): - Change all callers of addupc_task() to check PS_PROFIL explicitly and
    remove the check from addupc_task(). It would need sched_lock while
    testing the flag anyways.
    - Always read sticks while holding sched_lock, using a temporary
      variable where needed.
    - Always init prticks to 0 in ast() to quiet a warning.
* 2001-12-18 jhb (1 file, -6/+4): Modify the critical section API as follows:
    - The MD functions critical_enter/exit are renamed to start with a
      cpu_ prefix.
    - MI wrapper functions critical_enter/exit maintain a per-thread
      nesting count and a per-thread critical section saved state, set
      when entering a critical section at nesting level 0 and restored
      when exiting to nesting level 0. This moves the saved state out of
      spin mutexes so that interlocking spin mutexes works properly.
    - Most low-level MD code that used critical_enter/exit now uses
      cpu_critical_enter/exit. MI code such as device drivers and spin
      mutexes use the MI wrappers. Note that since the MI wrappers store
      the state in the current thread, they do not have any return values
      or arguments.
    - mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
      assigned to curthread->td_savecrit during fork_exit().
    Tested on: i386, alpha
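    A minimal sketch of the MI wrapper pattern described above, under the
    field names the message uses (td_critnest is assumed as the nesting
    counter; cpu_critical_enter/exit are the renamed MD primitives):

        void
        critical_enter(void)
        {
                struct thread *td = curthread;

                if (td->td_critnest == 0)
                        /* entering at level 0: save MD interrupt state */
                        td->td_savecrit = cpu_critical_enter();
                td->td_critnest++;
        }

        void
        critical_exit(void)
        {
                struct thread *td = curthread;

                if (td->td_critnest == 1) {
                        td->td_critnest = 0;
                        /* leaving the outermost section: restore state */
                        cpu_critical_exit(td->td_savecrit);
                } else
                        td->td_critnest--;
        }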
* 2001-10-30 green (1 file, -5/+64): Add kmupetext(), a function that expands the range of memory covered
    by the profiler on a running system. This is not done sparsely, as
    memory is cheaper than processor speed and each gprof mcount() and
    mexitcount() operation is already very expensive.
    Obtained from: NAI Labs CBOSS project
    Funded by: DARPA
* 2001-09-12 julian (1 file, -11/+12): KSE Milestone 2
    Note: ALL MODULES MUST BE RECOMPILED.
    Make the kernel aware that there are smaller units of scheduling than
    the process (but only allow one thread per process at this time).
    This is functionally equivalent to the previous -current except that
    there is a thread associated with each process.
    Sorry john! (your next MFC will be a doosie!)
    Reviewed by: peter@freebsd.org, dillon@freebsd.org
    X-MFC after: ha ha ha ha
* 2001-09-01 dillon (1 file, -4/+14): Pushdown Giant for: profil(), ntp_adjtime(), ogethostname(),
    osethostname(), ogethostid(), osethostid()
* 2001-08-10 jhb (1 file, -1/+3): - Close races with signals and other AST's being triggered while we are in
    the process of exiting the kernel. The ast() function now loops as
    long as the PS_ASTPENDING or PS_NEEDRESCHED flags are set. It returns
    with preemption disabled so that any further AST's that arrive via an
    interrupt will be delayed until the low-level MD code returns to user
    mode.
    - Use u_int's to store the tick counts for profiling purposes so that
      we do not need sched_lock just to read p_sticks. This also closes a
      problem where the call to addupc_task() could screw up the
      arithmetic due to non-atomic reads of p_sticks.
    - Axe need_proftick(), aston(), astoff(), astpending(),
      need_resched(), clear_resched(), and resched_wanted() in favor of
      direct bit operations on p_sflag.
    - Fix up locking with sched_lock some. In addupc_intr(), use
      sched_lock to ensure pr_addr and pr_ticks are updated atomically
      with setting PS_OWEUPC. In ast() we clear pr_ticks atomically with
      clearing PS_OWEUPC. We also do not grab the lock just to test a
      flag.
    - Simplify the handling of Giant in ast() slightly.
    Reviewed by: bde (mostly)
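    The first item's loop, reduced to its shape (a pseudostructural sketch
    assembled from the message, not the committed function; the real ast()
    also arranges to return with preemption disabled):

        mtx_lock_spin(&sched_lock);
        while (p->p_sflag & (PS_ASTPENDING | PS_NEEDRESCHED)) {
                p->p_sflag &= ~PS_ASTPENDING;
                mtx_unlock_spin(&sched_lock);

                /* ... deliver profiling ticks, signals, reschedule ... */

                mtx_lock_spin(&sched_lock);
        }
        mtx_unlock_spin(&sched_lock);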
* 2001-06-06 jhb (1 file, -5/+1): We don't need to hold a lock just to test a flag.
* 2001-05-15 jhb (1 file, -1/+0): Remove unneeded includes of sys/ipl.h and machine/ipl.h.
* 2001-05-01 markm (1 file, -0/+2): Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
    other "system" header files. Also help the deprecation of lockmgr.h
    by making it a sub-include of sys/lock.h and removing sys/lockmgr.h
    from kernel .c files.
    Sort sys/*.h includes where possible in affected files.
    OK'ed by: bde (with reservations)
* 2001-03-28 jhb (1 file, -4/+3): Switch from save/disable/restore_intr() to critical_enter/exit().
* 2001-02-22 jhb (1 file, -2/+2): Since the PC is a pointer to a code address, change the second parameter of
    addupc_task() and addupc_intr() to be a uintptr_t instead of a u_long.
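    In prototype form (reconstructed from the message; the surrounding
    argument list follows the profiling code of that era but is an
    assumption here):

        /* Before */
        void    addupc_task(struct proc *p, u_long pc, u_int ticks);
        void    addupc_intr(struct proc *p, u_long pc, u_int ticks);

        /* After: the pc is a code address, hence pointer-sized. */
        void    addupc_task(struct proc *p, uintptr_t pc, u_int ticks);
        void    addupc_intr(struct proc *p, uintptr_t pc, u_int ticks);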
* 2001-02-09 bmilekic (1 file, -3/+3): Change and clean the mutex lock interface.
    mtx_enter(lock, type) becomes:
    mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
    mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
    Similarly, for releasing a lock, we now have:
    mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.

    We change the caller interface for the two different types of locks
    because the semantics are entirely different for each case, and this
    makes it explicitly clear and, at the same time, it rids us of the
    extra `type' argument.
    The enter->lock and exit->unlock change has been made with the idea
    that we're "locking data" and not "entering locked code" in mind.

    Further, remove all additional "flags" previously passed to the lock
    acquire/release routines with the exception of two: MTX_QUIET and
    MTX_NOSWITCH. The functionality of these flags is preserved and they
    can be passed to the lock/unlock routines by calling the corresponding
    wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and
    mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
    locks, respectively.

    Re-inline some lock acq/rel code; in the sleep lock case, we only
    inline the _obtain_lock()s in order to ensure that the inlined code
    fits into a cache line. In the spin lock case, we inline recursion and
    actually only perform a function call if we need to spin. This change
    has been made with the idea that we generally tend to avoid spin locks
    and that also the spin locks that we do have and are heavily used
    (i.e. sched_lock) do recurse, and therefore in an effort to reduce
    function call overhead for some architectures (such as alpha), we
    inline recursion for this case.

    Create a new malloc type for the witness code and retire from using
    the M_DEV type. The new type is called M_WITNESS and is only declared
    if WITNESS is enabled.

    Begin cleaning up some machdep/mutex.h code - specifically updated the
    "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and
    MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need
    those.

    Finally, caught up to the interface changes in all sys code.
    Contributors: jake, jhb, jasone (in no particular order)
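    The caller-visible change in before/after form (a sketch assembled
    from the message; sched_lock and Giant are just example locks):

        /* Before: one entry point; the lock type was an argument. */
        mtx_enter(&sched_lock, MTX_SPIN);
        mtx_exit(&sched_lock, MTX_SPIN);

        /* After: the lock type is encoded in the function name. */
        mtx_lock_spin(&sched_lock);
        mtx_unlock_spin(&sched_lock);

        mtx_lock(&Giant);               /* sleep (MTX_DEF) lock */
        mtx_unlock(&Giant);

        /* Flags survive only via the explicit wrappers: */
        mtx_lock_flags(&Giant, MTX_QUIET);
        mtx_unlock_flags(&Giant, MTX_QUIET);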
* 2001-01-24 jhb (1 file, -2/+6): - Catch up to proc flag changes.
* 2000-12-08 dwmalone (1 file, -2/+1): Convert more malloc+bzero to malloc+M_ZERO.
    Submitted by: josh@zipperup.org
    Submitted by: Robert Drehmel <robd@gmx.net>
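    The conversion pattern, sketched (kernel malloc(9); the variable and
    the M_DEVBUF type are illustrative):

        /* Before: allocate, then zero by hand. */
        sc = malloc(sizeof(*sc), M_DEVBUF, M_WAITOK);
        bzero(sc, sizeof(*sc));

        /* After: one call, with malloc(9) doing the zeroing. */
        sc = malloc(sizeof(*sc), M_DEVBUF, M_WAITOK | M_ZERO);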
* 2000-12-07 phk (1 file, -1/+1): Hide intrstate in the #ifdef where it belongs.
* 2000-10-20 jhb (1 file, -1/+1): Catch up to moving headers:
    - machine/ipl.h -> sys/ipl.h
    - machine/mutex.h -> sys/mutex.h
* 2000-09-07 jasone (1 file, -1/+3): Major update to the way synchronization is done in the kernel. Highlights
    include:
    * Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The
      alpha port is still in transition and currently uses both.)
    * Per-CPU idle processes.
    * Interrupts are run in their own separate kernel threads and can be
      preempted (i386 only).
    Partially contributed by: BSDi (BSD/OS)
    Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
* 2000-07-04 phk (1 file, -1/+1): Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
    Pointed out by: bde
* 2000-07-03 phk (1 file, -1/+1): Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
    Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grok our
    sources:
    -sysctl_vm_zone SYSCTL_HANDLER_ARGS
    +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
* 2000-03-28 dillon (1 file, -0/+1): Commit major SMP cleanups and move the BGL (big giant lock) in the
    syscall path inward. A system call may select whether it needs the MP
    lock or not (the default being that it does need it).

    A great deal of conditional SMP code for various deadended experiments
    has been removed. 'cil' and 'cml' have been removed entirely, and the
    locking around the cpl has been removed.

    The conditional separately-locked fast-interrupt code has been
    removed, meaning that interrupts must hold the CPL now (but they
    pretty much had to anyway). Another reason for doing this is that the
    original separate-lock for interrupts just doesn't apply to the
    interrupt thread mechanism being contemplated.

    Modifications to the cpl may now ONLY occur while holding the MP lock.
    For example, if an otherwise MP safe syscall needs to mess with the
    cpl, it must hold the MP lock for the duration and must (as usual)
    save/restore the cpl in a nested fashion.

    This is precursor work for the real meat coming later: avoiding having
    to hold the MP lock for common syscalls and I/O's and interrupt
    threads. It is expected that the spl mechanisms and new interrupt
    threading mechanisms will be able to run in tandem, allowing a slow
    piecemeal transition to occur.

    This patch should result in a moderate performance improvement due to
    the considerable amount of code that has been removed from the
    critical path, especially the simplification of the spl*() calls. The
    real performance gains will come later.
    Approved by: jkh
    Reviewed by: current, bde (exception.s)
    Some work taken from: luoqi's patch
* 1999-10-12 bde (1 file, -0/+2): Unremove used includes.
    Bugs in test coverage should be fixed before removing any includes.
    LINT should be configured for full profiling support.
* 1999-10-11 peter (1 file, -2/+0): Trim unused options (or #ifdef for undoc options).
    Submitted by: phk
* 1999-08-28 peter (1 file, -1/+1): $Id$ -> $FreeBSD$
* 1999-05-06 bde (1 file, -2/+5): Fixed profiling of elf kernels. Made high resolution profiling compile
    for elf kernels (it is broken for all kernels due to lack of egcs
    support).
    Renaming of many assembler labels is avoided by declaring the labels
    that need to be visible to gprof as having type "function", and by
    depending on the elf version of gprof being zealous about discarding
    the others. A few type declarations are still missing, mainly for SMP.
    PR: 9413
    Submitted by: Assar Westerlund <assar@sics.se> (initial parts)
* 1998-09-05 bde (1 file, -3/+3): Fixed bogotification of pseudocode for syscall args by rev.1.53 of
    syscalls.master.
* 1998-07-14 bde (1 file, -4/+4): Changed to the C9x draft spelling of the (unsigned) integral type
    suitable for holding object pointers (ptrint_t -> uintptr_t). Added
    corresponding signed type (intptr_t). Changed/added corresponding
    non-C9x types for function pointers to match. Don't use nonstandard
    types to implement these types, and don't comment on them in
    <machine/types.h>.
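    A small standalone example of the round-trip property these types
    exist for (not from the commit):

        #include <stdint.h>
        #include <stdio.h>

        int
        main(void)
        {
                int x = 42;
                uintptr_t u = (uintptr_t)&x;    /* holds any object pointer */
                int *p = (int *)u;              /* converts back intact */

                printf("%d\n", *p);             /* prints 42 */
                return (0);
        }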
* 1998-05-01 bde (1 file, -3/+3): Oops, the previous commit should have changed `i386' to `__i386__',
    not `__i386'.
* 1998-04-15 bde (1 file, -10/+10): Support compiling with `gcc -ansi'.
* 1997-11-06 phk (1 file, -3/+2): Move the "retval" (3rd) parameter from all syscall functions and put
    it in struct proc instead.
    This fixes a boatload of compiler warnings, and removes a lot of cruft
    from the sources.
    I have not removed the /*ARGSUSED*/; they will require some looking
    at.
    libkvm, ps and other userland struct proc frobbing programs will need
    to be recompiled.
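    The shape of the change, sketched with profil() (which lives in this
    file) as the example; the uap argument follows the usual syscall-glue
    convention, and p_retval names the struct proc slot this commit
    introduced:

        /* Before: every syscall took a retval output parameter. */
        int     profil(struct proc *p, struct profil_args *uap, int *retval);

        /* After: the slot moved into struct proc (p->p_retval). */
        int     profil(struct proc *p, struct profil_args *uap);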
* 1997-10-27 bde (1 file, -4/+1): Moved declaration of etext from <machine/md_var.h> to <machine/cpu.h>
    and fixed everything that depended on it being declared in the old
    place. It is used in "machine-independent" code in subr_prof.c.
    Moved declaration of btext from subr_prof.c to <machine/cpu.h>. It is
    machine-dependent.
* 1997-10-12 phk (1 file, -2/+2): Last major round (unless Bruce thinks of something :-) of malloc changes.
    Distribute all but the most fundamental malloc types. This time I also
    remembered the trick to making things static: Put "static" in front of
    them.
    A couple of finer points by: bde
* 1997-10-11 phk (1 file, -1/+3): Distribute and staticize a lot of the malloc M_* types.
    Substantial input from: bde
* 1997-02-22 peter (1 file, -1/+1): Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
    ready for it yet.
* 1997-01-14 jkh (1 file, -1/+1): Make the long-awaited change from $Id$ to $FreeBSD$
    This will make a number of things easier in the future, as well as
    (finally!) avoiding the Id-smashing problem which has plagued
    developers for so long.
    Boy, I'm glad we're not using sup anymore. This update would have been
    insane otherwise.
* 1996-12-13 bde (1 file, -13/+10): Fixed magic and wrong numbers in calibration of nullfunc_loop_profiled()
    and removed related debugging code. Now this part of the calibration
    is almost as machine-independent as gprof generally.
* 1996-10-17 bde (1 file, -37/+149): Improved non-statistical (GUPROF) profiling:
    - use a more accurate and more efficient method of compensating for
      overheads. The old method counted too much time against leaf
      functions.
    - normally use the Pentium timestamp counter if available. On
      Pentiums, the times are now accurate to within a couple of cpu clock
      cycles per function call in the (unlikely) event that there are no
      cache misses in or caused by the profiling code.
    - optionally use an arbitrary Pentium event counter if available.
    - optionally regress to using the i8254 counter.
    - scaled the i8254 counter by a factor of 128. Now the i8254 counters
      overflow slightly faster than the TSC counters for a 150MHz Pentium
      :-) (after about 16 seconds). This is to avoid fractional overheads.

    files.i386: perfmon.c temporarily has to be classified as a
    profiling-routine because a couple of functions in it may be called
    from profiling code.
    options.i386:
    - I586_CTR_GUPROF is currently unused (oops).
    - I586_PMC_GUPROF should be something like 0x70000 to enable (but not
      use unless prof_machdep.c is changed) support for Pentium event
      counters. 7 is a control mode and the counter number 0 is somewhere
      in the 0000 bits (see perfmon.h for the encoding).
    profile.h:
    - added declarations.
    - cleaned up separation of user mode declarations.
    prof_machdep.c: Mostly clock-select changes. The default clock can be
    changed by editing kmem. There should be a sysctl for this.
    subr_prof.c:
    - added copyright.
    - calibrate overheads for the new method.
    - documented new method.
    - fixed races and machine dependencies in start/stop code.
    mcount.c: Use the new overhead compensation method.
    gmon.h:
    - changed GPROF4 counter type from unsigned to int. Oops, this should
      be machine-dependent and/or int32_t.
    - reorganized overhead counters.
    Submitted by: Pentium event counter changes mostly by wollman
* 1995-12-29 bde (1 file, -5/+90): Implemented non-statistical kernel profiling. This is based on
    looking at a high resolution clock for each of the following events:
    function call, function return, interrupt entry, interrupt exit, and
    interesting branches. The differences between the times of these
    events are added at appropriate places in an ordinary histogram (as if
    very fast statistical profiling sampled the pc at those places) so
    that ordinary gprof can be used to analyze the times.

    gmon.h: Histogram counters need to be 4 bytes for microsecond
    resolutions. They will need to be larger for the 586 clock. The
    comments were vax-centric and wrong even on vaxes. Does anyone
    disagree?
    gprof4.c: The standard gprof should support counters of all integral
    sizes and the size of the counter should be in the gmon header. This
    hack will do until then. (Use gprof4 -u to examine the results of
    non-statistical profiling.)
    config/*: Non-statistical profiling is configured with `config -pp'.
    `config -p' still gives ordinary profiling.
    kgmon/*: Non-statistical profiling is enabled with `kgmon -B'.
    `kgmon -b' still enables ordinary profiling (and disables
    non-statistical profiling) if non-statistical profiling is configured.
* 1995-12-26 bde (1 file, -2/+2): Unstaticized addupc_task(). It is supposed to be called from trap().
    See the comments for addupc_intr() and the NetBSD implementation. We
    use dummy versions of fuswintr() and susiwintr(), so addupc_intr()
    always pushes the work to trap() (this is inefficient), and trap()
    calls the special i386 function addupc() instead of addupc_task().
    addupc() is more efficient than addupc_intr(), so some of the lost
    efficiency is recovered. However, addupc() may be broken on plain
    i386's since it doesn't check for write permission like copyout().
* 1995-12-14 phk (1 file, -2/+2): A Major staticize sweep. Generates a couple of warnings that I'll deal
    with later. A number of unused vars removed. A number of unused procs
    removed or #ifdefed.
* 1995-12-06 bde (1 file, -3/+2): Removed unnecessary #includes of vm stuff. Most of them were once
    prerequisites for <sys/sysctl.h>.
    subr_prof.c: Also replaced #include of <sys/user.h> by #include of
    <sys/resourcevar.h>.
* 1995-12-02 bde (1 file, -3/+3): Finished (?) cleaning up sysinit stuff.