author ru <ru@FreeBSD.org> 2004-03-06 21:20:47 +0000
committer ru <ru@FreeBSD.org> 2004-03-06 21:20:47 +0000
commit 2a3a2e007c3f0b2a255cef853c0205c3f7ed700f
tree 623df442554abdaf9a7666ee292efb3d76a835ca /share
parent ce8bd08be4a8bb0e70663a4aa4f6e79f43d29867
Luigi was polled for additional documentation about polling(4).
Diffstat (limited to 'share')
-rw-r--r-- share/man/man4/polling.4 172
1 files changed, 134 insertions, 38 deletions
diff --git a/share/man/man4/polling.4 b/share/man/man4/polling.4
index 33cdd06..59b62bb 100644
--- a/share/man/man4/polling.4
+++ b/share/man/man4/polling.4
@@ -1,7 +1,30 @@
+.\" Copyright (c) 2002 Luigi Rizzo
+.\" All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\" notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\" notice, this list of conditions and the following disclaimer in the
+.\" documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
-.Dd February 15, 2002
+.Dd March 6, 2004
.Dt POLLING 4
.Os
.Sh NAME
@@ -11,8 +34,9 @@
.Cd "options DEVICE_POLLING"
.Cd "options HZ=1000"
.Sh DESCRIPTION
-.Dq "Device polling"
-(polling for brevity) refers to a technique to
+Device polling
+.Nm (
+for brevity) refers to a technique to
handle devices that does not rely on the latter to generate
interrupts when they need attention, but rather lets the CPU poll
devices to service their needs.
@@ -21,7 +45,7 @@ properly,
.Nm
gives more control to the operating system on
when and how to handle devices, with a number of advantages in terms
-of system responsivity and performance.
+of system responsiveness and performance.
.Pp
In particular,
.Nm
@@ -30,7 +54,7 @@ switches which is incurred when servicing interrupts, and
gives more control on the scheduling of the CPU between various
tasks (user processes, software interrupts, device handling)
which ultimately reduces the chances of livelock in the system.
-.Sh PRINCIPLES OF OPERATION
+.Ss Principles of Operation
In the normal, interrupt-based mode, devices generate an interrupt
whenever they need attention.
This in turn causes a
@@ -41,12 +65,11 @@ unless the device driver has been programmed with real-time
concerns in mind (which is generally not the case for
.Fx
drivers).
-Furthermore, under heavy traffic, the system might be
+Furthermore, under heavy traffic load, the system might be
persistently processing interrupts without being able to
complete other work, either in the kernel or in userland.
.Pp
-.Nm Polling
-disables interrupts by polling devices at appropriate
+Device polling disables interrupts by polling devices at appropriate
times, i.e., on clock interrupts, system calls and within the idle loop.
This way, the context switch overhead is removed.
Furthermore,
@@ -54,39 +77,107 @@ the operating system can control accurately how much work to spend
in handling device events, and thus prevent livelock by reserving
some amount of CPU to other tasks.
.Pp
-.Nm Polling
-is enabled with a
-.Xr sysctl 8
-variable
-.Va kern.polling.enable
-whereas the percentage of CPU cycles reserved to userland processes is
-controlled by the
+Enabling
+.Nm
+also changes the way software network interrupts
+are scheduled, so there is never the risk of livelock because
+packets are not processed to completion.
+.Ss MIB Variables
+The operation of
+.Nm
+is controlled by the following
.Xr sysctl 8
-variable
-.Va kern.polling.user_frac
-whose range is 0 to 100 (50 is the default value).
+MIB variables:
+.Pp
+.Bl -tag -width indent -compact
+.It Va kern.polling.enable
+If set to non-zero,
+.Nm
+is enabled.
+Default is disabled.
.Pp
+.It Va kern.polling.user_frac
When
.Nm
-is enabled, and provided that there is work to do,
-up to
-.Va kern.polling.user_frac
-percent of the CPU cycles is reserved to userland tasks, the
-remaining fraction being available for device processing.
+is enabled, and provided that there is some work to do,
+up to this percent of the CPU cycles is reserved to userland tasks,
+the remaining fraction being available for
+.Nm
+processing.
+Default is 50.
.Pp
-Enabling
+.It Va kern.polling.burst
+Maximum number of packets grabbed from each network interface in
+each timer tick.
+This number is dynamically adjusted by the kernel,
+according to the programmed
+.Va user_frac , burst_max ,
+CPU speed, and system load.
+.Pp
+.It Va kern.polling.each_burst
+The burst above is split into smaller chunks of this number of
+packets, going round-robin among all interfaces registered for
+.Nm .
+This prevents the case that a large burst from a single interface
+can saturate the IP interrupt queue
+.Pq Va net.inet.ip.intr_queue_maxlen .
+Default is 5.
+.Pp
+.It Va kern.polling.burst_max
+Upper bound for
+.Va kern.polling.burst .
+Note that when
.Nm
-also changes the way network software interrupts
-are scheduled, so there is never the risk of livelock because
-packets are not processed to completion.
+is enabled, each interface can receive at most
+.Pq Va HZ No * Va burst_max
+packets per second unless there are spare CPU cycles available for
+.Nm
+in the idle loop.
+This number should be tuned to match the expected load
+(which can be quite high with GigE cards).
+Default is 150 which is adequate for 100Mbit network and HZ=1000.
+.Pp
+.It Va kern.polling.idle_poll
+Controls if
+.Nm
+is enabled in the idle loop.
+There are no reasons (other than power saving or bugs in the scheduler's
+handling of idle priority kernel threads) to disable this.
+Note that -CURRENT apparently has some problems in this respect now,
+so default is disabled.
+.Pp
+.It Va kern.polling.poll_in_trap
+Controls if
+.Nm
+is enabled during hardware traps.
+Enabling this can be useful to improve the network responsiveness
+of boxes with 100% CPU usage.
+Default is disabled.
.Pp
-There are other variables which control or monitor the behaviour
-of devices operating in polling mode, but they are unlikely to
-require modifications, and are documented in the source file
-.Pa sys/kern/kern_poll.c .
+.It Va kern.polling.reg_frac
+Controls how often (every
+.Va reg_frac No / Va HZ
+seconds) the status registers of the device are checked for error
+conditions and the like.
+Increasing this value reduces the load on the bus, but also delays
+the error detection.
+Default is 20.
+.Pp
+.It Va kern.polling.handlers
+How many active devices have registered for
+.Nm .
+.Pp
+.It Va kern.polling.short_ticks
+.It Va kern.polling.lost_polls
+.It Va kern.polling.pending_polls
+.It Va kern.polling.residual_burst
+.It Va kern.polling.phase
+.It Va kern.polling.suspect
+.It Va kern.polling.stalled
+Debugging variables.
+.El
.Sh SUPPORTED DEVICES
-.Nm Polling
-requires explicit modifications to the device drivers.
+Device polling requires explicit modifications to the device drivers.
As of this writing, the
.Xr dc 4 ,
.Xr em 4 ,
@@ -97,21 +188,26 @@ As of this writing, the
.Xr rl 4 ,
and
.Xr sis 4
-devices are supported, with other in the works.
+devices are supported, with others in the works.
The modifications are rather straightforward, consisting in
the extraction of the inner part of the interrupt service routine
and writing a callback function,
.Fn *_poll ,
which is invoked
to probe the device for events and process them.
-See the
+(See the
conditionally compiled sections of the devices mentioned above
-for more details.
+for more details.)
.Pp
-As in the worst case devices are only polled on
+As in the worst case the devices are only polled on
clock interrupts, in order to reduce the latency in processing
packets, it is advisable to increase the frequency of the clock
to at least 1000 HZ.
.Sh HISTORY
-Device polling was introduced in February 2002 by
+Device polling first appeared in
+.Fx 4.6
+and
+.Fx 5.0 .
+.Sh AUTHORS
+Device polling was written by
.An Luigi Rizzo Aq luigi@iet.unipi.it .
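
A rough check of the kern.polling.burst_max default documented in the diff above: the page states that each interface can receive at most HZ * burst_max packets per second, and that 150 is adequate for a 100Mbit network with HZ=1000. The sketch below compares that ceiling with the worst-case packet rate of a link carrying only minimum-size Ethernet frames. The function names are hypothetical and the framing constants are standard Ethernet figures, not taken from the commit; this is back-of-the-envelope arithmetic, not the kernel's own tuning logic.

```python
# Illustrative arithmetic only. Function names are hypothetical; the framing
# constants are standard Ethernet values and do not come from the commit.
MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including the FCS
PREAMBLE_BYTES = 8          # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames, in byte times


def polled_packet_ceiling(hz, burst_max):
    """Upper bound on packets per second per interface: HZ * burst_max."""
    return hz * burst_max


def line_rate_pps(link_bits_per_second):
    """Worst-case packets per second on a link full of minimum-size frames."""
    bits_on_wire = (MIN_FRAME_BYTES + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bits_per_second // bits_on_wire


def burst_max_needed(link_bits_per_second, hz):
    """Smallest burst_max whose ceiling covers the worst-case packet rate."""
    pps = line_rate_pps(link_bits_per_second)
    return -(-pps // hz)  # ceiling division


if __name__ == "__main__":
    print("ceiling at HZ=1000, burst_max=150:", polled_packet_ceiling(1000, 150))
    print("100Mbit worst case pps:", line_rate_pps(100_000_000))
    print("GigE worst case pps:", line_rate_pps(1_000_000_000))
    print("burst_max needed for GigE at HZ=1000:",
          burst_max_needed(1_000_000_000, 1000))
```

Under these assumptions the default ceiling of 150000 packets per second sits just above the 100Mbit worst case, while a GigE link would need a burst_max roughly ten times larger, which is consistent with the page's remark that the expected load can be quite high with GigE cards.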