path: root/sys/kern/kern_tc.c
Commit message [Author, Age, Files, Lines]
* Add parentheses for clarity. The parentheses around the two terms of the && [cperciva, 2010-11-23, 1 file, -1/+1]
  are unnecessary but I'm leaving them in for the sake of avoiding confusion (I confuse easily).
  Submitted by: bde
* In tc_windup, handle the case where the previous call to tc_windup was [cperciva, 2010-11-22, 1 file, -0/+10]
  more than 1s earlier. Prior to this commit, the computation of th_scale * delta (which produces a 64-bit value equal to the time since the last tc_windup call in units of 2^(-64) seconds) would overflow and any complete seconds would be lost. We fix this by repeatedly converting tc_frequency units of timecounter to one second; this is not exactly correct, since it loses the NTP adjustment, but if we find ourselves going more than 1s at a time between clock interrupts, losing a few seconds worth of NTP adjustments is the least of our problems...
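  The fix described above can be sketched in userland C. This is a minimal sketch under stated assumptions, not the kernel code: struct bt_sketch, bt_addx() and windup_advance() are hypothetical names, the scale is taken as roughly 2^64 / frequency with no NTP adjustment, and the loop condition is inferred from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a binary time: seconds plus a 2^-64 s fraction. */
struct bt_sketch {
	int64_t  sec;
	uint64_t frac;
};

/* Add x units of 2^-64 seconds, carrying into sec when frac wraps around. */
static void
bt_addx(struct bt_sketch *bt, uint64_t x)
{
	uint64_t old = bt->frac;

	bt->frac += x;
	if (bt->frac < old)
		bt->sec++;
}

/*
 * Advance 'off' by 'delta' counter ticks at 'freq' ticks per second.
 * scale * delta only fits in 64 bits while delta <= freq, so complete
 * seconds are peeled off first (assumption: no NTP adjustment applied).
 */
static void
windup_advance(struct bt_sketch *off, uint64_t delta, uint64_t freq)
{
	uint64_t scale;

	assert(freq != 0);
	scale = UINT64_MAX / freq;	/* about 2^64 / freq */
	while (delta > freq) {
		delta -= freq;		/* eat one complete second */
		off->sec++;
	}
	bt_addx(off, scale * delta);	/* remaining fraction of a second */
}
```

  With a 1 MHz counter and a gap of 3,500,000 ticks, the loop accounts for three whole seconds and the multiplication only has to cover the remaining half second, which cannot overflow.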
* Fix some more style(9) issues. [brucec, 2010-11-14, 1 file, -1/+1]
* Fix style(9) issues from r215281 and r215282. [brucec, 2010-11-14, 1 file, -2/+4]
  MFC after: 1 week
* Add some descriptions to sys/kern sysctls. [brucec, 2010-11-14, 1 file, -4/+4]
  PR: kern/148710
  Tested by: Chip Camden <sterling at camdensoftware.com>
  MFC after: 1 week
* Until hardclock(), and with it tc_windup(), is called for the first time, the system [mav, 2010-09-21, 1 file, -0/+1]
  is running on the "dummy" time counter. But to function properly in one-shot mode, the event timer management code requires a working time counter. The slow-moving "dummy" time counter delays the first hardclock() call by a few seconds on my systems, even though timer interrupts were correctly kicking the kernel. That causes a delay of a few seconds during boot with one-shot mode enabled. To break this loop, explicitly call tc_windup() for the first time during the initialization process to let it switch to some real time counter.
* Make kern_tc.c provide the minimum frequency of tc_ticktock() calls required [mav, 2010-09-14, 1 file, -2/+7]
  to handle wraps of the current timecounter. Make kern_clocksource.c honor that requirement, scheduling sleeps on the first CPU for no more than the specified period. Allow other CPUs to sleep up to 1/4 second (just in case).
* Refactor timer management code with priority to one-shot operation mode. [mav, 2010-09-13, 1 file, -6/+12]
  The main goal of this is to generate timer interrupts only when there is some work to do. When the CPU is busy, interrupts are generated at the full rate of hz + stathz to fulfill scheduler and timekeeping requirements. But when the CPU is idle, only the minimum set of interrupts needed to handle scheduled callouts (down to 8 interrupts per second per CPU now) is executed. This significantly increases idle CPU sleep time, increasing the effect of static power-saving technologies. It should also reduce host CPU load on virtualized systems when the guest system is idle.
  There is a set of tunables, also available as writable sysctls, that control the wanted event timer subsystem behavior:
  kern.eventtimer.timer - chooses the event timer hardware to use. On x86 there are up to 4 different kinds of timers. Depending on whether the chosen timer is per-CPU, the behavior of the other options differs slightly.
  kern.eventtimer.periodic - chooses between periodic and one-shot operation mode. In periodic mode, the current timer hardware is taken as the only source of time for time events. This mode is quite similar to the previous kernel behavior. One-shot mode instead uses the currently selected time counter hardware to schedule all needed events one by one and programs the timer to generate an interrupt exactly at the specified time. The default value depends on the chosen timer's capabilities, but one-shot mode is preferred unless another mode is forced by the user or the hardware.
  kern.eventtimer.singlemul - in periodic mode, specifies how many times higher the timer frequency should be, so as not to strictly alias hardclock() and statclock() events. Default values are 2 and 4, but could be reduced to 1 if extra interrupts are unwanted.
  kern.eventtimer.idletick - makes each CPU receive every timer interrupt regardless of whether it is busy or not. By default this option is disabled. If the chosen timer is per-CPU and runs in periodic mode, this option has no effect: all interrupts are generated.
  Since this patch modifies cpu_idle() on some platforms, I have also refactored the one on x86. It now makes use of MONITOR/MWAIT instructions (if supported) under a high sleep/wakeup rate, as a fast alternative to other methods. It allows the SMP scheduler to wake up sleeping CPUs much faster without using IPI, significantly increasing performance on some highly task-switching loads.
  Tested by: many (on i386, amd64, sparc64 and powerpc)
  H/W donated by: Gheorghe Ardelean
  Sponsored by: iXsystems, Inc.
* Revert r210225 - turns out I was wrong; the "/*-" is not a license-only [trasz, 2010-07-18, 1 file, -1/+1]
  thing; it's also used to indicate that the comment should not be automatically rewrapped.
  Explained by: cperciva@
* The "/*-" comment marker is supposed to denote copyrights. Remove non-copyright [trasz, 2010-07-18, 1 file, -1/+1]
  occurrences from sys/sys/ and sys/kern/.
* Remove interval validation from cpu_tick_calibrate(). As I found, the check [mav, 2010-07-11, 1 file, -36/+14]
  was needed in a preliminary version of the patch, where the number of CPU ticks was divided by exactly 16 seconds. The final code instead uses the real interval duration, so a precise interval should not be important. At the same time, aliasing issues around the second boundary cause false positives, periodically logging useless "t_delta ... too long/short" messages when HZ is set below 256.
* Use ISO C99 integer types in sys/kern where possible. [ed, 2010-06-21, 1 file, -7/+7]
  There are only about 100 occurrences of the BSD-specific u_int*_t datatypes in sys/kern. The ISO C99 integer types are used here more often.
* Implement flexible BPF timestamping framework. [jkim, 2010-06-15, 1 file, -1/+1]
  - Allow setting format, resolution and accuracy of BPF time stamps per listener. Previously, we were only able to use microtime(9). Now we can set various resolutions and accuracies with ioctl(2) BIOCSTSTAMP command. Similarly, we can get the current resolution and accuracy with BIOCGTSTAMP command. Document all supported options in bpf(4) and their uses.
  - Introduce new time stamp 'struct bpf_ts' and header 'struct bpf_xhdr'. The new time stamp has both 64-bit second and fractional parts. bpf_xhdr has this time stamp instead of 'struct timeval' for bh_tstamp. The new structures let us use bh_tstamp of same size on both 32-bit and 64-bit platforms without adding additional shims for 32-bit binaries. On 64-bit platforms, size of BPF header does not change compared to bpf_hdr as its members are already all 64-bit long. On 32-bit platforms, the size may increase by 8 bytes. For backward compatibility, struct bpf_hdr with struct timeval is still the default header unless new time stamp format is explicitly requested. However, the behaviour may change in the future and all relevant code is wrapped around "#ifdef BURN_BRIDGES" for now.
  - Add experimental support for tagging mbufs with time stamps from a lower layer, e.g., device driver. Currently, mbuf_tags(9) is used to tag mbufs. The time stamps must be uptime in 'struct bintime' format as binuptime(9) and getbinuptime(9) do.
  Reviewed by: net@
* Remove conditionally compiled time counter statistics; tools like [rwatson, 2009-04-11, 1 file, -31/+0]
  DTrace, kernel profiling, etc, can provide this information without the overhead.
  MFC after: 3 days
  Suggested by: bde
* By default, don't compile in counters of calls to various time [rwatson, 2009-03-08, 1 file, -13/+18]
  query functions in the kernel, as these effectively serialize parallel calls to the gettimeofday(2) system call, as well as other kernel services that use timestamps. Use the NetBSD version of the fix (kern_tc.c:1.32 by ad@) as they have picked up our timecounter code and also ran into the same problem.
  Reported by: kris
  Obtained from: NetBSD
  MFC after: 3 days
* In keeping with style(9)'s recommendations on macros, use a ';' [rwatson, 2008-03-16, 1 file, -1/+1]
  after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr.
  MFC after: 1 month
  Discussed with: imp, rink
* Fix typo in comment. [imp, 2008-02-17, 1 file, -1/+1]
* Note what is too {short,long}. [obrien, 2008-01-02, 1 file, -2/+2]
* Despite several examples in the kernel, the third argument of [dwmalone, 2007-06-04, 1 file, -3/+3]
  sysctl_handle_int is not sizeof the int type you want to export. The type must always be an int or an unsigned int. Remove the instances where a sizeof(variable) is passed to stop people accidentally cutting and pasting these examples. In a few places sysctl_handle_int was being used on 64 bit types, which would truncate the value to be exported. In these cases use sysctl_handle_quad to export them and change the format to Q so that sysctl(1) can still print them.
* Commit the results of the typo hunt by Darren Pilgrim. [yar, 2006-08-04, 1 file, -1/+1]
  This change affects documentation and comments only, no real code involved.
  PR: misc/101245
  Submitted by: Darren Pilgrim <darren pilgrim bitfreak org>
  Tested by: md5(1)
  MFC after: 1 week
* Add a kern.timecounter.tc sysctl tree that contains the mask, [dwmalone, 2006-06-16, 1 file, -0/+40]
  frequency, quality and current value of each available time counter. At the moment all of these are read-only, but it might make sense to make some of these read-write in the future.
  MFC after: 3 months
* Disable the "cputick increased..." message now that the dust has settled. [phk, 2006-03-15, 1 file, -1/+1]
* Oops, forgot newline. [phk, 2006-03-09, 1 file, -1/+1]
* silence cpu_tick calibration and notice only (under bootverbose) [phk, 2006-03-09, 1 file, -15/+8]
  when the frequency increases.
* Style nit. [jhb, 2006-03-07, 1 file, -2/+1]
* Add missing cast. [phk, 2006-03-04, 1 file, -1/+1]
* More detailed logging if timestepwarnings are enabled. [phk, 2006-03-04, 1 file, -5/+8]
* Suffer a little bit of math every 16 seconds and tighten calibration of [phk, 2006-03-02, 1 file, -12/+24]
  cpu_ticks to the low side of PPM.
* CPU time accounting speedup (step 2) [phk, 2006-02-11, 1 file, -5/+133]
  Keep accounting time (in per-cpu) cputicks and the statistics counts in the thread and summarize into struct proc when at context switch.
  Don't reach across CPUs in calcru().
  Add code to calibrate the top speed of cpu_tickrate() for variable cpu_tick hardware (like TSC on power managed machines).
  Don't enforce monotonicity (at least for now) in calcru. While the calibrated cpu_tickrate ramps up it may not be true.
  Use 27MHz counter on i386/Geode.
  Use TSC on amd64 & i386 if present.
  Use tick counter on sparc64
* Modify the way we account for CPU time spent (step 1) [phk, 2006-02-07, 1 file, -0/+21]
  Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers.
  For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.)
  On slower machines, the avoided multiplications to normalize timestamps at every context switch come out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.
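  The deferred scaling described above might look like the following. This is a minimal sketch, assuming a known and constant tick rate in Hz; cputicks_to_usec() is a hypothetical name, and splitting the value into whole seconds and a remainder is just one way to avoid 64-bit overflow, not necessarily what the kernel does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert raw cputicks to microseconds only when a reader asks for them.
 * Assumes rate * 1000000 fits in 64 bits, which holds for any plausible
 * tick rate.
 */
static uint64_t
cputicks_to_usec(uint64_t ticks, uint64_t rate)
{
	assert(rate != 0);
	/* Whole seconds first, then the sub-second remainder. */
	return ((ticks / rate) * 1000000 +
	    ((ticks % rate) * 1000000) / rate);
}
```

  The context-switch path then only adds raw tick deltas; the division and multiplication above run once per inspection instead of once per switch.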
* Start time_uptime with 1 instead of 0. [andre, 2005-09-19, 1 file, -1/+1]
  Discussed with: phk
* Forward declaring static variables as extern is invalid ISO-C. Now that [obrien, 2005-09-07, 1 file, -1/+1]
  GCC can properly handle forward static declarations, do this properly.
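  For illustration, a forward static declaration in ISO C looks like the sketch below. The names tick_count and tick_count_ptr() are hypothetical and not taken from kern_tc.c. The invalid pattern would be writing "extern int tick_count;" first and "static int tick_count = 42;" later, which gives the same identifier both external and internal linkage.

```c
/* Tentative definition: acts as the forward declaration, internal linkage. */
static int tick_count;

/* Code placed before the real definition can already refer to the variable. */
static int *
tick_count_ptr(void)
{
	return (&tick_count);
}

/* The actual definition with an initializer, still static. */
static int tick_count = 42;
```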
* s/ENOTTY/ENOIOCTL/ [phk, 2005-03-26, 1 file, -1/+1]
* Put on my peril sensitive sunglasses and add a flags field to the internal [peter, 2004-10-11, 1 file, -2/+17]
  sysctl routines and state. Add some code to use it for signalling the need to downconvert a data structure to 32 bits on a 64 bit OS when requested by a 32 bit app.
  I tried to do this in a generic abi wrapper that intercepted the sysctl oid's, or looked up the format string etc, but it was a real can of worms that turned into a fragile mess before I even got it partially working.
  With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have it not abort. Things like netstat, ps, etc have a long way to go.
  This also fixes a bug in the kern.ps_strings and kern.usrstack hacks. These do matter very much because they are used by libc_r and other things.
* Add some KASSERTS. [phk, 2004-08-14, 1 file, -0/+3]
* Just because the timecounter reads the same value on two samples [phk, 2004-03-04, 1 file, -4/+0]
  after each other doesn't mean that nothing happened.
* Write 100 times for tomorrow: [phk, 2004-01-22, 1 file, -2/+3]
  "Always print time_t as %jd, you never know what width it has"
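  The rule quoted above amounts to casting to intmax_t before printing. A minimal sketch; format_time() is a hypothetical helper name.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/*
 * Format a time_t without assuming its width: cast to intmax_t and use
 * the %jd conversion, which always matches intmax_t.
 */
static int
format_time(char *buf, size_t len, time_t t)
{
	return (snprintf(buf, len, "%jd", (intmax_t)t));
}
```

  The same cast works with printf() and with the kernel's log functions, whatever the width of time_t on the platform.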
* Add a sysctl (default: off) which enables a log(LOG_INFO...) warning [phk, 2004-01-21, 1 file, -10/+19]
  if the clock is stepped.
* Various minor details: [phk, 2003-11-13, 1 file, -8/+17]
  Give the HZ/overflow check a 10% margin.
  Eliminate bogus newline.
  If timecounters have equal quality, prefer higher frequency.
  Some inspiration from: bde
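  One way to read the HZ/overflow check with its 10% margin is sketched below. This is an assumption based on the commit message rather than a copy of the kernel code: hz_too_low() is a hypothetical name, and the integer arithmetic is only one plausible way to apply the margin.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if a counter that wraps after about 'mask' ticks could wrap
 * between two clock interrupts arriving 'hz' times per second.
 */
static int
hz_too_low(uint64_t freq, uint64_t mask, unsigned int hz)
{
	uint64_t wraps;

	assert(mask != 0);
	wraps = freq / mask;		/* approximate wraps per second */
	wraps = wraps * 11 / 10;	/* add the 10% safety margin */
	return (wraps > hz);
}
```

  For example, a 16-bit counter running at about 1.19 MHz wraps roughly 18 times per second, so it passes the check at hz = 100 but fails it at hz = 10.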
* Use the quality to disable timecounters for which we deem Hz too low. [phk, 2003-09-03, 1 file, -6/+10]
* bde made a number of suggested improvements to the code. This commit [imp, 2003-08-20, 1 file, -20/+17]
  represents the purely stylistic changes and should have no net impact on the rest of the code. bde's more substantive changes will follow in a separate commit once we've come to closure on them.
  Submitted by: bde
* Fix an extreme edge case in leap second handling. We need to call [imp, 2003-08-20, 1 file, -4/+6]
  ntp_update_second twice when we have a large step in case that step goes across a scheduled leap second. The only way this could happen would be if we didn't call tc_windup over the end of day on the day of a leap second, which would only happen if timeouts were delayed for seconds. While it is an edge case, it is an important one to get right for my employer.
  Sponsored by: Timing Solutions Corporation
* Give timecounters a numeric quality field. [phk, 2003-08-16, 1 file, -8/+35]
  A timecounter will be selected when registered if its quality is not negative and no less than the current timecounters.
  Add a sysctl to report all available timecounters and their qualities.
  Give the dummy timecounter a solid negative quality of minus a million.
  Give the i8254 zero and the ACPI 1000.
  The TSC gets 800, unless APM or SMP forces it negative.
  Other timecounters default to zero quality and thereby retain current selection behaviour.
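  The selection rule stated above can be sketched as follows. This is a minimal sketch under stated assumptions: struct tc_sketch and tc_select() are hypothetical names, and only the quality comparison from this commit message is modelled.

```c
/* Hypothetical, reduced view of a timecounter: just a name and a quality. */
struct tc_sketch {
	const char *name;
	int quality;
};

/* Return the counter that should be active after 'cand' registers. */
static const struct tc_sketch *
tc_select(const struct tc_sketch *cur, const struct tc_sketch *cand)
{
	if (cand->quality < 0)
		return (cur);		/* negative quality is never selected */
	if (cand->quality < cur->quality)
		return (cur);		/* worse than what is already in use */
	return (cand);			/* equal or better quality wins */
}
```

  Starting from the dummy counter at minus a million, registering the i8254 (0) and then the ACPI counter (1000) switches each time, while a TSC at 800 registered afterwards leaves the ACPI counter in place.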
* Remove extra space. [mux, 2003-08-12, 1 file, -1/+1]
* typo fix in comment. [phk, 2003-07-02, 1 file, -1/+1]
* Fix leap second processing by the kernel time keeping routines. [imp, 2003-06-25, 1 file, -6/+27]
  Before, we would add/subtract the leap second when the system had been up for an even multiple of days, rather than at the end of the day, as a leap second is defined (at least wrt ntp).
  We do this by calculating the notion of UTC earlier in the loop, and passing that to get it adjusted. Any adjustments that ntp_update_second makes to this time are then transferred to boot time. We can't pass it either the boot time or the uptime because their sum is what determines when a leap second is needed. This code adds an extra assignment and two extra compares in the typical case, which is as cheap as I could make it.
  I have confirmed with this code the kernel time does the correct thing for both positive and negative leap seconds. Since the ntp interface doesn't allow for +2 or -2, those cases can't be tested (and the folks in the know here say there will never be a +2s or -2s leap event, but rather two +1s or -1s leap events).
  There will very likely be no leap seconds for a while, given how the earth is speeding up and slowing down, so there will be plenty of time for this fix to propagate. UT1-UTC is currently at "about -0.4s" and decrementing by .1s every 8 months or so. 6 * 8 is 48 months, or 4 years.
  -stable has different code, but a similar bug that was introduced about the time of the last leap second, which is why nobody has noticed until now.
  MFC After: 3 weeks
  Reviewed by: phk
  "Furthermore, leap seconds must die." -- Cato the Elder
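  The difference between the old and the fixed behaviour can be illustrated with a small sketch. The function names are hypothetical, and the exact second at which the kernel applies the adjustment is an assumption; only the idea that the day boundary must be tested on UTC (boot time plus uptime) comes from the commit message.

```c
#include <stdint.h>

#define SECS_PER_DAY 86400

/* A second count sits on a day boundary when it is a multiple of 86400. */
static int
on_day_boundary(int64_t sec)
{
	return (sec % SECS_PER_DAY == 0);
}

/* Old behaviour per the commit message: test the uptime alone. */
static int
leap_due_by_uptime(int64_t uptime)
{
	return (on_day_boundary(uptime));
}

/* Fixed behaviour: test UTC, that is, boot time plus uptime. */
static int
leap_due_by_utc(int64_t boottime, int64_t uptime)
{
	return (on_day_boundary(boottime + uptime));
}
```

  For a machine booted 1000 seconds after midnight UTC, the uptime test fires after exactly 86400 seconds of uptime, which is 1000 seconds past the real end of day, while the UTC test fires at 85400 seconds of uptime.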
* Use UTC rather than GMT to describe time scale. latter is obsolete. [imp, 2003-06-23, 1 file, -2/+2]
* Use __FBSDID(). [obrien, 2003-06-11, 1 file, -2/+3]
* Including <sys/stdint.h> is (almost?) universally only to be able to use [phk, 2003-03-18, 1 file, -1/+0]
  %j in printfs, so put a nested include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.
* Move timecounters notion of frequency to 64 bits. [phk, 2003-01-29, 1 file, -3/+4]
  [WARNING: CPUs in the distant future may be closer than they appear!]