path: root/sys/kern/kern_tc.c
Commit message [author, date, files changed, lines removed/added]
* MFC r288216: Use per-cpu values for base and last in tc_cpu_ticks().
  [kib, 2015-10-02, 1 file, -7/+14]
* MFC r286701: If a specific timecounter has been chosen via sysctl, and a new timecounter with higher quality registers (presumably in a module that has just been loaded), do not undo the user's choice by switching to the new timecounter. Document that behavior, and also the fact that there is no way to unregister a timecounter (and thus no way to unload a module containing one).
  [ian, 2015-08-23, 1 file, -5/+13]
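The selection policy described in this entry can be sketched in a few lines of user-space C. This is only an illustration under stated assumptions: `struct tc_sketch`, `tc_register()`, `tc_choose()`, and the `tc_user_chosen` flag are hypothetical names invented here, not the kernel's actual identifiers, and the tie-break on frequency is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the kernel's timecounter record. */
struct tc_sketch {
    const char *name;
    int         quality;    /* negative: never selected automatically */
    uint64_t    frequency;  /* counter frequency in Hz */
};

static struct tc_sketch *tc_active;      /* currently selected timecounter */
static bool              tc_user_chosen; /* true once the user picked one */

/* Models an explicit user choice (for example via sysctl). */
static void tc_choose(struct tc_sketch *tc)
{
    tc_active = tc;
    tc_user_chosen = true;
}

/* Registers a new timecounter and switches to it only if policy allows. */
static void tc_register(struct tc_sketch *tc)
{
    if (tc_active == NULL) {            /* first counter: nothing to compare */
        tc_active = tc;
        return;
    }
    if (tc_user_chosen)                 /* never override an explicit choice */
        return;
    if (tc->quality < 0)                /* must be selected by hand */
        return;
    if (tc->quality < tc_active->quality)
        return;
    if (tc->quality == tc_active->quality &&
        tc->frequency < tc_active->frequency)
        return;
    tc_active = tc;                     /* better counter: switch to it */
}
```

The early return on `tc_user_chosen` is the whole point of the change: a later, higher-quality registration no longer displaces the counter the user asked for.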
* MFC r286423, r286429: RFC 2783 requires a status of ETIMEDOUT, not EWOULDBLOCK, on a timeout. Only process the PPS event types currently enabled in pps_params.mode.
  [ian, 2015-08-23, 1 file, -2/+9]
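Both fixes in this entry are small enough to sketch directly. The helper names and the `EV_ASSERT`/`EV_CLEAR` bit values below are hypothetical and exist only for this illustration; the real code works on the kernel's PPS state and mode constants.

```c
#include <errno.h>

/* Hypothetical event bits, defined only for this sketch. */
#define EV_ASSERT 0x01
#define EV_CLEAR  0x02

/*
 * Map the result of a timed sleep to the status a caller should see:
 * a timeout is reported as ETIMEDOUT, everything else passes through.
 */
static int pps_timeout_status(int sleep_error)
{
    return (sleep_error == EWOULDBLOCK) ? ETIMEDOUT : sleep_error;
}

/* An event is processed only if its bit is set in the configured mode. */
static int pps_event_enabled(int mode, int event)
{
    return (mode & event) != 0;
}
```

With a mode of `EV_ASSERT` only, a clear-edge event is ignored rather than timestamped.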
* Revert r284178 and r284256.
  Approved by: re (gjb)
  [kib, 2015-07-21, 1 file, -68/+41]
* MFC r284178: Add barriers when updating and reading th_generation.
  MFC r284256: Tweaks for r284178.
  [kib, 2015-06-18, 1 file, -41/+68]
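As background for this entry (which the log shows was reverted on this branch a month later), a generation-count protocol with explicit barriers can be sketched with C11 atomics. This is a generic single-writer pattern, not the committed kernel code: `struct th_sketch` and its functions are hypothetical, and the kernel uses its own atomic primitives rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one timehands slot. */
struct th_sketch {
    _Atomic uint32_t gen;     /* 0 means "update in progress" */
    _Atomic uint64_t offset;  /* stand-in for the protected time data */
};

static void th_init(struct th_sketch *th, uint64_t offset)
{
    atomic_init(&th->offset, offset);
    atomic_init(&th->gen, 1);
}

/* Single writer: invalidate, publish new data, then publish a new generation. */
static void th_update(struct th_sketch *th, uint64_t offset)
{
    uint32_t next = atomic_load_explicit(&th->gen, memory_order_relaxed) + 1;

    if (next == 0)                      /* skip the reserved value on wrap */
        next = 1;
    atomic_store_explicit(&th->gen, 0, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);  /* gen = 0 before data */
    atomic_store_explicit(&th->offset, offset, memory_order_relaxed);
    atomic_store_explicit(&th->gen, next, memory_order_release);
}

/* Lockless reader: retry until the generation is stable and nonzero. */
static uint64_t th_read(struct th_sketch *th)
{
    uint32_t gen;
    uint64_t offset;

    do {
        gen = atomic_load_explicit(&th->gen, memory_order_acquire);
        offset = atomic_load_explicit(&th->offset, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);  /* data before recheck */
    } while (gen == 0 ||
        gen != atomic_load_explicit(&th->gen, memory_order_relaxed));
    return offset;
}
```

Without the fences, a reader on a weakly ordered CPU could observe a matching generation while still seeing stale or half-updated data.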
* MFC r279728, r279729, r279756, r279773, r282424, r281367: Add mutex support to the pps_ioctl() API in the kernel. Add PPS support to USB serial drivers. Use correct mode variable for PPS support. Switch polarity of USB serial PPS events. The ftdi "get latency" and "get bitmode" device commands are read operations, not writes. Implement a mechanism for making changes in the kernel<->driver PPS interface without breaking ABI or API compatibility with existing drivers. Bump version number to indicate the new PPS ABI version changes in the pps_state structure.
  [ian, 2015-05-24, 1 file, -4/+36]
* MFC 276724: On some Intel CPUs with a P-state but not C-state invariant TSC the TSC may also halt in C2 and not just C3 (it seems that in some cases the BIOS advertises its C3 state as a C2 state in _CST). Just play it safe and disable both C2 and C3 states if a user forces the use of the TSC as the timecounter on such CPUs.
  PR: 192316
  [jhb, 2015-04-02, 1 file, -4/+4]
* - Make callout(9) tickless, relying on eventtimers(4) as backend for precise time event generation. This greatly improves granularity of callouts, which are no longer constrained to wait for the next tick to be scheduled.
  - Extend the callout KPI introducing a set of callout_reset_sbt* functions, which take a sbintime_t as timeout argument. The new KPI also offers a way for consumers to specify the precision tolerance they allow, so that callout can coalesce events and reduce the number of interrupts as well as potentially avoid scheduling a SWI thread.
  - Introduce support for dispatching callouts directly from hardware interrupt context, specifying an additional flag. This feature should be used carefully, since interrupt context has some limitations (e.g. no sleeping locks can be held).
  - Enhance mechanisms to gather information about callwheel, introducing a new sysctl to obtain stats.
  This change breaks the KBI. struct callout fields have been changed, in particular 'int ticks' (4 bytes) has been replaced with 'sbintime_t' (8 bytes) and another 'sbintime_t' field was added for precision.
  Together with: mav
  Reviewed by: attilio, bde, luigi, phk
  Sponsored by: Google Summer of Code 2012, iXsystems inc.
  Tested by: flo (amd64, sparc64), marius (sparc64), ian (arm), markj (amd64), mav, Fabian Keil
  [davide, 2013-03-04, 1 file, -0/+59]
* Add PPS_CANWAIT support for time_pps_fetch(). This adds support for all three blocking modes described in section 3.4.3 of RFC 2783, allowing the caller to retrieve the most recent values without blocking, to block for a specified time, or to block forever.
  Reviewed by: discussion on hackers@
  [ian, 2013-02-15, 1 file, -8/+49]
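The three blocking modes named in this entry can be sketched as a small classifier. The sketch assumes the user-level convention that a NULL timeout means "block forever" and an all-zero timeout means "do not block"; the enum and function names are hypothetical and not part of the PPS API.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical labels for the three modes the commit message lists. */
enum fetch_mode { FETCH_NOWAIT, FETCH_TIMED, FETCH_FOREVER };

/* Decide how a fetch call should behave for a given timeout argument. */
static enum fetch_mode classify_timeout(const struct timespec *timeout)
{
    if (timeout == NULL)
        return FETCH_FOREVER;           /* assumed: no timeout given */
    if (timeout->tv_sec == 0 && timeout->tv_nsec == 0)
        return FETCH_NOWAIT;            /* return the most recent values */
    return FETCH_TIMED;                 /* wait at most this long */
}
```

A caller that wants at most half a second of blocking would pass a `struct timespec` of `{0, 500000000}` and land in the timed mode.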
* Mark 'ticks', 'time_second', and 'time_uptime' as volatile to prevent the compiler from caching their values in tight loops.
  Reviewed by: bde
  MFC after: 1 week
  [jhb, 2013-01-28, 1 file, -2/+2]
* Add support for walltimestamp in DTrace.
  Submitted by: Fabian Keil
  MFC after: 2 weeks
  [gnn, 2012-07-16, 1 file, -0/+20]
* Stop updating the struct vdso_timehands from the event handler executed in the scheduled task from tc_windup(). Do it directly from tc_windup in interrupt context [1].
  Establish the permanent mapping of the shared page into the kernel address space, avoiding the potential need to sleep waiting for allocation of sf buffer during vdso_timehands update. As a consequence, shared_page_write_start() and shared_page_write_end() functions are not needed anymore.
  Guess and memorize the pointers to native host and compat32 sysentvec during initialization, to avoid the need to get shared_page_alloc_sx lock during the update.
  In tc_fill_vdso_timehands(), do not loop waiting for timehands generation to stabilize, since vdso_timehands is written in the same interrupt context which wrote timehands.
  Requested by: mav [1]
  MFC after: 29 days
  [kib, 2012-06-23, 1 file, -41/+21]
* Implement mechanism to export some kernel timekeeping data to usermode, using shared page. The structures and functions have vdso prefix, to indicate the intended location of the code in some future.
  The versioned per-algorithm data is exported in the format of struct vdso_timehands, which mostly repeats the content of in-kernel struct timehands. Usermode reading of the structure can be lockless. Compatibility export for 32bit processes on 64bit host is also provided. Kernel also provides usermode with indication about currently used timecounter, so that libc can fall back to syscall if configured timecounter is unknown to usermode code.
  The shared data updates are initiated both from the tc_windup(), where a fast task is queued to do the update, and from sysctl handlers which change timecounter. A manual override switch kern.timecounter.fast_gettime allows turning off the mechanism.
  Only x86 architectures export the real algorithm data, and there, only for tsc timecounter. HPET counters page could be exported as well, but I prefer to not further glue the kernel and libc ABI there until proper vdso-based solution is developed. Minimal stubs necessary for non-x86 architectures to still compile are provided.
  Discussed with: bde
  Reviewed by: jhb
  Tested by: flo
  MFC after: 1 month
  [kib, 2012-06-22, 1 file, -0/+84]
* o) Add COMPAT_FREEBSD32 support for MIPS kernels using the n64 ABI with userlands using the o32 ABI. This mostly follows nwhitehorn's lead in implementing COMPAT_FREEBSD32 on powerpc64.
  o) Add a new type to the freebsd32 compat layer, time32_t, which is time_t in the 32-bit ABI being used. Since the MIPS port is relatively-new, even the 32-bit ABIs use a 64-bit time_t.
  o) Because time{spec,val}32 has the same size and layout as time{spec,val} on MIPS with 32-bit compatibility, then, disable some code which assumes otherwise wrongly when built for MIPS. A more general macro to check in this case would seem like a good idea eventually. If someone adds support for using n32 userland with n64 kernels on MIPS, then they will have to add a variety of flags related to each piece of the ABI that can vary. That's probably the right time to generalize further.
  o) Add MIPS to the list of architectures which use PAD64_REQUIRED in the freebsd32 compat code. Probably this should be generalized at some point.
  Reviewed by: gonzo
  [jmallett, 2012-03-03, 1 file, -0/+2]
* Add a missing break. This bug was introduced in r228856.
  [kevlo, 2012-02-10, 1 file, -0/+1]
* Introduce the sysclock_getsnapshot() and sysclock_snap2bintime() KPIs. The sysclock_getsnapshot() function allows the caller to obtain a snapshot of all the system clock and timecounter state required to create time stamps at a later point. The sysclock_snap2bintime() function converts a previously obtained snapshot into a bintime time stamp according to the specified flags e.g. which system clock, uptime vs absolute time, etc.
  These KPIs enable useful functionality, including direct comparison of the feedback and feed-forward system clocks and generation of multiple time stamps with different formats from a single timecounter read.
  Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/
  In collaboration with: Julien Ridoux (jridoux at unimelb edu au)
  [lstewart, 2011-12-24, 1 file, -4/+140]
* Do away with the somewhat clunky sysclock_ops structure and associated code, reimplementing the [get]{bin,nano,micro}[up]time() wrapper functions in terms of the new "fromclock" API instead.
  Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/
  Discussed with: Julien Ridoux (jridoux at unimelb edu au)
  Submitted by: Julien Ridoux (jridoux at unimelb edu au)
  [lstewart, 2011-11-29, 1 file, -90/+12]
* Make the fbclock_[get]{bin,nano,micro}[up]time() function prototypes public so that new APIs with some performance sensitivity can be built on top of them. These functions should not be called directly except in special circumstances.
  Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/
  Discussed with: Julien Ridoux (jridoux at unimelb edu au)
  Submitted by: Julien Ridoux (jridoux at unimelb edu au)
  [lstewart, 2011-11-29, 1 file, -12/+12]
* Fix an oversight in r227747 by calling fbclock_bin{up}time() directly from the fbclock_{nanouptime|microuptime|bintime|nanotime|microtime}() functions to avoid indirecting through a sysclock_ops wrapper function.
  Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/
  Submitted by: Julien Ridoux (jridoux at unimelb edu au)
  [lstewart, 2011-11-29, 1 file, -5/+5]
* - Add Pulse-Per-Second timestamping using raw ffcounter and corresponding ffclock time in seconds.
  - Add IOCTL to retrieve ffclock timestamps from userland.
  Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/
  Submitted by: Julien Ridoux (jridoux at unimelb edu au)
  [lstewart, 2011-11-21, 1 file, -0/+65]
* - Provide a sysctl interface to change the active system clock at runtime.
  - Wrap [get]{bin,nano,micro}[up]time() functions of sys/time.h to allow requesting time from either the feedback or the feed-forward clock. If a feedback (e.g. ntpd) and feed-forward (e.g. radclock) daemon are both running on the system, both kernel clocks are updated but only one serves time.
  - Add similar wrappers for the feed-forward difference clock.
  Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/
  Submitted by: Julien Ridoux (jridoux at unimelb edu au)
  [lstewart, 2011-11-20, 1 file, -2/+318]
* Core structure and functions to support a feed-forward clock within the kernel. Implement ffcounter, a monotonically increasing cumulative counter on top of the active timecounter. Provide low-level functions to read the ffcounter and convert it to absolute time or a time interval in seconds using the current ffclock estimates, which track the drift of the oscillator. Add a ring of fftimehands to track passing of time on each kernel tick and pick up updates of ffclock estimates.
  Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/
  Submitted by: Julien Ridoux (jridoux at unimelb edu au)
  [lstewart, 2011-11-19, 1 file, -0/+442]
* Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
  The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
  [ed, 2011-11-07, 1 file, -1/+1]
* If TSC stops ticking in C3, disable deep sleep when the user forcefully select TSC as timecounter hardware.
  Tested by: Fabian Keil (freebsd-listen at fabiankeil dot de)
  [jkim, 2011-07-14, 1 file, -0/+6]
* Introduce signed and unsigned version of CTLTYPE_QUAD, renaming existing uses. Rename sysctl_handle_quad() to sysctl_handle_64().
  [mdf, 2011-01-19, 1 file, -2/+2]
* Add parentheses for clarity. The parentheses around the two terms of the && are unnecessary but I'm leaving them in for the sake of avoiding confusion (I confuse easily).
  Submitted by: bde
  [cperciva, 2010-11-23, 1 file, -1/+1]
* In tc_windup, handle the case where the previous call to tc_windup was more than 1s earlier. Prior to this commit, the computation of th_scale * delta (which produces a 64-bit value equal to the time since the last tc_windup call in units of 2^(-64) seconds) would overflow and any complete seconds would be lost.
  We fix this by repeatedly converting tc_frequency units of timecounter to one second; this is not exactly correct, since it loses the NTP adjustment, but if we find ourselves going more than 1s at a time between clock interrupts, losing a few seconds worth of NTP adjustments is the least of our problems...
  [cperciva, 2010-11-22, 1 file, -0/+10]
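The overflow and its fix can be illustrated outside the kernel. The sketch below is a simplification under stated assumptions: `struct bt_sketch` and `advance()` are hypothetical names, the scale is computed without any NTP adjustment, and the loop uses `>=` so that the remaining product always fits in 64 bits.

```c
#include <stdint.h>

/* Hypothetical 64.64 fixed-point time: whole seconds plus a binary fraction. */
struct bt_sketch {
    int64_t  sec;
    uint64_t frac;   /* units of 2^-64 seconds */
};

/*
 * Advance "now" by "delta" counter ticks of a counter running at "freq" Hz.
 * scale is roughly 2^64 / freq, so scale * delta overflows 64 bits as soon
 * as delta covers one second or more. Whole seconds are therefore peeled
 * off first, and only the sub-second remainder is scaled.
 */
static void advance(struct bt_sketch *now, uint64_t delta, uint64_t freq)
{
    uint64_t scale = (((uint64_t)1 << 63) / freq) * 2;
    uint64_t old;

    while (delta >= freq) {         /* eat complete, unadjusted seconds */
        delta -= freq;
        now->sec++;
    }
    old = now->frac;
    now->frac += scale * delta;     /* delta < freq, so no overflow here */
    if (now->frac < old)            /* carry out of the fraction */
        now->sec++;
}
```

For a 1000 Hz counter and a delta of 3500 ticks, the loop accounts for three seconds and the remaining 500 ticks become just under half a second of fraction.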
* Fix some more style(9) issues.
  [brucec, 2010-11-14, 1 file, -1/+1]
* Fix style(9) issues from r215281 and r215282.
  MFC after: 1 week
  [brucec, 2010-11-14, 1 file, -2/+4]
* Add some descriptions to sys/kern sysctls.
  PR: kern/148710
  Tested by: Chip Camden <sterling at camdensoftware.com>
  MFC after: 1 week
  [brucec, 2010-11-14, 1 file, -4/+4]
* Until hardclock() and, in turn, tc_windup() have been called for the first time, the system runs on the "dummy" time counter. But to function properly in one-shot mode, the event timer management code requires a working time counter. The slow-moving "dummy" time counter delays the first hardclock() call by a few seconds on my systems, even though timer interrupts were correctly kicking the kernel. That causes a delay of a few seconds during boot with one-shot mode enabled. To break this loop, explicitly call tc_windup() once during the initialization process to let it switch to some real time counter.
  [mav, 2010-09-21, 1 file, -0/+1]
* Make kern_tc.c provide the minimum frequency of tc_ticktock() calls required to handle wraps of the current timecounter. Make kern_clocksource.c honor that requirement, scheduling sleeps on the first CPU for no more than the specified period. Allow other CPUs to sleep up to 1/4 second (in any case).
  [mav, 2010-09-14, 1 file, -2/+7]
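The requirement in this entry follows from simple arithmetic: a hardware counter with a given width and frequency wraps after a fixed time, and the timekeeping update must run comfortably more often than that. The sketch below assumes a safety factor of three updates per counter period; the function name and that factor are assumptions of this illustration, not values taken from the commit.

```c
#include <stdint.h>

/*
 * Minimum update rate (in Hz) needed so that a counter with the given
 * mask and frequency is sampled several times per wrap period.
 */
static uint64_t min_windup_hz(uint32_t counter_mask, uint64_t frequency)
{
    /* Ticks allowed between updates: one third of the counter range. */
    uint64_t span = ((uint64_t)counter_mask + 1) / 3;
    uint64_t hz;

    if (span == 0)                  /* degenerate one- or two-value counter */
        span = 1;
    hz = frequency / span;
    return hz > 1 ? hz : 1;         /* never report less than 1 Hz */
}
```

Under these assumptions a 16-bit counter at 1193182 Hz needs at least 54 updates per second, while a 24-bit counter at 3579545 Hz is satisfied by one update per second.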
* Refactor timer management code with priority to one-shot operation mode.
  The main goal of this is to generate timer interrupts only when there is some work to do. When the CPU is busy, interrupts are generated at the full rate of hz + stathz to fulfill scheduler and timekeeping requirements. But when the CPU is idle, only the minimum set of interrupts (down to 8 interrupts per second per CPU now) needed to handle scheduled callouts is executed. This allows a significant increase in idle CPU sleep time, increasing the effect of static power-saving technologies. It should also reduce host CPU load on virtualized systems when the guest system is idle.
  There is a set of tunables, also available as writable sysctls, for controlling the event timer subsystem behavior:
  kern.eventtimer.timer - allows choosing the event timer hardware to use. On x86 there are up to 4 different kinds of timers. Depending on whether the chosen timer is per-CPU, the behavior of other options differs slightly.
  kern.eventtimer.periodic - allows choosing between periodic and one-shot operation mode. In periodic mode, the current timer hardware is taken as the only source of time for time events. This mode is quite similar to the previous kernel behavior. One-shot mode instead uses the currently selected time counter hardware to schedule all needed events one by one and programs the timer to generate an interrupt exactly at the specified time. The default value depends on the chosen timer's capabilities, but one-shot mode is preferred unless another is forced by the user or hardware.
  kern.eventtimer.singlemul - in periodic mode, specifies how many times higher the timer frequency should be, to not strictly alias hardclock() and statclock() events. Default values are 2 and 4, but could be reduced to 1 if extra interrupts are unwanted.
  kern.eventtimer.idletick - makes each CPU receive every timer interrupt regardless of whether it is busy or not. By default this option is disabled. If the chosen timer is per-CPU and runs in periodic mode, this option has no effect - all interrupts are generated.
  Since this patch modifies cpu_idle() on some platforms, I have also refactored the one on x86. Now it makes use of MONITOR/MWAIT instructions (if supported) under high sleep/wakeup rate, as a fast alternative to other methods. It allows the SMP scheduler to wake up sleeping CPUs much faster without using IPI, significantly increasing performance on some highly task-switching loads.
  Tested by: many (on i386, amd64, sparc64 and powerpc)
  H/W donated by: Gheorghe Ardelean
  Sponsored by: iXsystems, Inc.
  [mav, 2010-09-13, 1 file, -6/+12]
* Revert r210225 - turns out I was wrong; the "/*-" is not license-only thing; it's also used to indicate that the comment should not be automatically rewrapped.
  Explained by: cperciva@
  [trasz, 2010-07-18, 1 file, -1/+1]
* The "/*-" comment marker is supposed to denote copyrights. Remove non-copyright occurrences from sys/sys/ and sys/kern/.
  [trasz, 2010-07-18, 1 file, -1/+1]
* Remove interval validation from cpu_tick_calibrate(). As I found, the check was needed in a preliminary version of the patch, where the number of CPU ticks was divided strictly by 16 seconds. The final code instead uses the real interval duration, so a precise interval should not be important. At the same time, aliasing issues around the second boundary cause false positives, periodically logging useless "t_delta ... too long/short" messages when HZ is set below 256.
  [mav, 2010-07-11, 1 file, -36/+14]
* Use ISO C99 integer types in sys/kern where possible.
  There are only about 100 occurrences of the BSD-specific u_int*_t datatypes in sys/kern. The ISO C99 integer types are used here more often.
  [ed, 2010-06-21, 1 file, -7/+7]
* Implement flexible BPF timestamping framework.
  - Allow setting format, resolution and accuracy of BPF time stamps per listener. Previously, we were only able to use microtime(9). Now we can set various resolutions and accuracies with ioctl(2) BIOCSTSTAMP command. Similarly, we can get the current resolution and accuracy with BIOCGTSTAMP command. Document all supported options in bpf(4) and their uses.
  - Introduce new time stamp 'struct bpf_ts' and header 'struct bpf_xhdr'. The new time stamp has both 64-bit second and fractional parts. bpf_xhdr has this time stamp instead of 'struct timeval' for bh_tstamp. The new structures let us use bh_tstamp of same size on both 32-bit and 64-bit platforms without adding additional shims for 32-bit binaries. On 64-bit platforms, size of BPF header does not change compared to bpf_hdr as its members are already all 64-bit long. On 32-bit platforms, the size may increase by 8 bytes. For backward compatibility, struct bpf_hdr with struct timeval is still the default header unless new time stamp format is explicitly requested. However, the behaviour may change in the future and all relevant code is wrapped around "#ifdef BURN_BRIDGES" for now.
  - Add experimental support for tagging mbufs with time stamps from a lower layer, e.g., device driver. Currently, mbuf_tags(9) is used to tag mbufs. The time stamps must be uptime in 'struct bintime' format as binuptime(9) and getbinuptime(9) do.
  Reviewed by: net@
  [jkim, 2010-06-15, 1 file, -1/+1]
* Remove conditionally compiled time counter statistics; tools like DTrace, kernel profiling, etc, can provide this information without the overhead.
  MFC after: 3 days
  Suggested by: bde
  [rwatson, 2009-04-11, 1 file, -31/+0]
* By default, don't compile in counters of calls to various time query functions in the kernel, as these effectively serialize parallel calls to the gettimeofday(2) system call, as well as other kernel services that use timestamps.
  Use the NetBSD version of the fix (kern_tc.c:1.32 by ad@) as they have picked up our timecounter code and also ran into the same problem.
  Reported by: kris
  Obtained from: NetBSD
  MFC after: 3 days
  [rwatson, 2009-03-08, 1 file, -13/+18]
* In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr.
  MFC after: 1 month
  Discussed with: imp, rink
  [rwatson, 2008-03-16, 1 file, -1/+1]
* Fix typo in comment.
  [imp, 2008-02-17, 1 file, -1/+1]
* Note what is too {short,long}.
  [obrien, 2008-01-02, 1 file, -2/+2]
* Despite several examples in the kernel, the third argument of sysctl_handle_int is not sizeof the int type you want to export. The type must always be an int or an unsigned int.
  Remove the instances where a sizeof(variable) is passed to stop people accidentally cut and pasting these examples.
  In a few places sysctl_handle_int was being used on 64 bit types, which would truncate the value to be exported. In these cases use sysctl_handle_quad to export them and change the format to Q so that sysctl(1) can still print them.
  [dwmalone, 2007-06-04, 1 file, -3/+3]
* Commit the results of the typo hunt by Darren Pilgrim. This change affects documentation and comments only, no real code involved.
  PR: misc/101245
  Submitted by: Darren Pilgrim <darren pilgrim bitfreak org>
  Tested by: md5(1)
  MFC after: 1 week
  [yar, 2006-08-04, 1 file, -1/+1]
* Add a kern.timecounter.tc sysctl tree that contains the mask, frequency, quality and current value of each available time counter. At the moment all of these are read-only, but it might make sense to make some of these read-write in the future.
  MFC after: 3 months
  [dwmalone, 2006-06-16, 1 file, -0/+40]
* Disable the "cputick increased..." message now that the dust has settled.
  [phk, 2006-03-15, 1 file, -1/+1]
* Oops, forgot newline.
  [phk, 2006-03-09, 1 file, -1/+1]
* silence cpu_tick calibration and notice only (under bootverbose) when the frequency increases.
  [phk, 2006-03-09, 1 file, -15/+8]
* Style nit.
  [jhb, 2006-03-07, 1 file, -2/+1]