Diffstat (limited to 'sys/geom/sched/README')
 sys/geom/sched/README | 162 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 162 insertions(+), 0 deletions(-)
diff --git a/sys/geom/sched/README b/sys/geom/sched/README
new file mode 100644
index 0000000..1b52d90
--- /dev/null
+++ b/sys/geom/sched/README
@@ -0,0 +1,162 @@
+
+ --- GEOM BASED DISK SCHEDULERS FOR FREEBSD ---
+
+This code contains a framework for GEOM-based disk schedulers and a
+couple of sample scheduling algorithms that use the framework and
+implement two forms of "anticipatory scheduling" (see below for more
+details).
+
+As a quick example of what this code can give you, try running "dd",
+"tar", or some other program with a highly SEQUENTIAL access pattern
+together with "cvs", "cvsup", "svn" or other programs with highly
+RANDOM access patterns (this is not a made-up example: it is pretty
+common for developers to have one or more apps doing random accesses
+while others do sequential ones, e.g. loading large binaries from
+disk, checking the integrity of tarballs, watching media streams,
+and so on).
+
+These are the results we get on a local machine (AMD BE2400 dual
+core CPU, SATA 250GB disk):
+
+ /mnt is a partition mounted on /dev/ad0s1f
+
+   cvs:       cvs -d /mnt/home/ncvs-local update -Pd /mnt/ports
+   dd-read:   dd bs=128k of=/dev/null if=/dev/ad0 (or ad0.sched.)
+   dd-write:  dd bs=128k if=/dev/zero of=/mnt/largefile
+
+                      NO SCHEDULER            RR SCHEDULER
+                      dd       cvs            dd       cvs
+
+   dd-read only    72 MB/s    ---          72 MB/s    ---
+   dd-write only   55 MB/s    ---          55 MB/s    ---
+   dd-read+cvs      6 MB/s    ok           30 MB/s    ok
+   dd-write+cvs    55 MB/s    slooow       14 MB/s    ok
+
+As you can see, when cvs runs concurrently with dd the performance
+drops dramatically, and depending on whether dd is reading or
+writing, one of the two processes is severely penalized.  Using the
+RR scheduler in this example makes the dd reader go much faster
+when competing with cvs, and lets cvs make progress when competing
+with a writer.
+
+To try it out:
+
+1. USERS OF FREEBSD 7, PLEASE READ THE FOLLOWING CAREFULLY:
+
+ On loading, this module patches one kernel function (g_io_request())
+ so that I/O requests ("bio's") carry a classification tag, useful
+ for scheduling purposes.
+
+   ON FREEBSD 7, the tag is stored in an existing (though rarely
+   used) field of "struct bio", a solution which makes this module
+   incompatible with other modules that use the same field, such as
+   ZFS and gjournal.  Additionally, g_io_request() is patched
+   in-memory to add a call to the function that initializes this
+   field (i386/amd64 only; on other architectures you need to patch
+   sys/geom/geom_io.c manually).  See the details in g_sched.c.
+
+   On FreeBSD 8.0 and above, this trick is not necessary: "struct
+   bio" contains dedicated fields for the classifier tag, and GEOM
+   provides hooks for request classifiers.
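+
+   As a minimal sketch of that FreeBSD 8 mechanism (not the actual
+   g_sched.c code, which you should read for the real thing; field
+   and hook names as in sys/sys/bio.h and sys/geom/geom.h), a
+   classifier is just a hook that tags each bio on its way into the
+   geom chain:
+
+	#include <sys/param.h>
+	#include <sys/bio.h>
+	#include <sys/proc.h>
+	#include <geom/geom.h>
+
+	/*
+	 * Tag each request with the issuing thread, so the scheduler
+	 * can tell the "flows" (e.g. dd vs. cvs) apart.
+	 */
+	static int
+	example_classify(void *arg, struct bio *bp)
+	{
+
+		if (bp->bio_classifier1 == NULL)
+			bp->bio_classifier1 = curthread;
+		return (1);	/* nonzero: this bio has been classified */
+	}
+
+	static struct g_classifier_hook example_hook = {
+		.func = example_classify,
+	};
+
+	/* Register on module load, unregister on unload: */
+	/*   g_register_classifier(&example_hook);   */
+	/*   g_unregister_classifier(&example_hook); */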
+
+ If you don't like the above, don't run this code.
+
+2. PLEASE MAKE SURE THAT THE DISK THAT YOU WILL BE USING FOR TESTS
+ DOES NOT CONTAIN PRECIOUS DATA.
+ This is experimental code, so we make no guarantees, though
+ I am routinely using it on my desktop and laptop.
+
+3. EXTRACT AND BUILD THE PROGRAMS
+   A 'make install' in the directory should work (with root
+   privileges), or you can also try the prebuilt binary modules.
+   If you want to build the modules yourself, look at the Makefile.
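+
+   For example, from the directory containing the sources:
+
+	# make
+	# make install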
+
+4. LOAD THE MODULE, CREATE A GEOM NODE, RUN TESTS
+
+ The scheduler's module must be loaded first:
+
+ # kldload gsched_rr
+
+   (substitute gsched_rr with gsched_as to test the AS scheduler).
+   Then, assuming you are using /dev/ad0 for testing, a scheduler
+   can be attached to it with:
+
+ # geom sched insert ad0
+
+ The scheduler is inserted transparently in the geom chain, so
+ mounted partitions and filesystems will keep working, but
+ now requests will go through the scheduler.
+
+ To change scheduler on-the-fly, you can reconfigure the geom:
+
+ # geom sched configure -a as ad0.sched.
+
+ assuming that gsched_as was loaded previously.
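+
+   For example, after an "insert" on ad0 the original device name
+   keeps working but is now scheduled, so you can re-run the dd
+   test from above and watch the requests flow through the new
+   node with gstat(8):
+
+	# dd bs=128k if=/dev/ad0 of=/dev/null &
+	# gstat -f ad0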
+
+5. SCHEDULER REMOVAL
+
+ In principle it is possible to remove the scheduler module
+ even on an active chain by doing
+
+ # geom sched destroy ad0.sched.
+
+   However, there is a race in the geom subsystem which makes the
+   removal unsafe while there are active requests on the chain.
+   So, to reduce the risk of data loss, make sure you do not remove
+   a scheduler from a chain with ongoing transactions.
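+
+   For example, quiesce the disk before destroying the node:
+
+	# umount /mnt
+	# geom sched destroy ad0.sched.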
+
+--- NOTES ON THE SCHEDULERS ---
+
+The important contribution of this code is the framework to experiment
+with different scheduling algorithms. 'Anticipatory scheduling'
+is a very powerful technique based on the following reasoning:
+
+    Disk throughput is much higher when serving sequential requests.
+    If we have a mix of sequential and random requests, and we see
+    a non-sequential one, do not serve it immediately; instead, wait
+    a little bit (2..5 ms) to see if another request arrives that
+    the disk can serve more efficiently.
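+
+In pseudo-C, the core of that decision can be sketched as follows
+(illustration only: the as_softc fields and function names below
+are made up for this note, this is not the actual gsched_as code):
+
+	#include <sys/param.h>
+	#include <sys/kernel.h>		/* hz */
+	#include <sys/bio.h>
+	#include <sys/callout.h>
+
+	struct as_softc {		/* hypothetical per-device state */
+		struct bio_queue_head sc_queue;	/* pending requests */
+		struct callout	sc_callout;	/* anticipation timer */
+		off_t		sc_last_end;	/* end of last dispatch */
+		int		sc_waiting;	/* are we idling? */
+	};
+
+	/*
+	 * Timer handler (not shown): calls as_next_request(sc, 1)
+	 * and pushes the returned bio down the geom chain.
+	 */
+	static void as_timeout(void *arg);
+
+	/* Called whenever the device can accept a new request. */
+	static struct bio *
+	as_next_request(struct as_softc *sc, int timed_out)
+	{
+		struct bio *bp = bioq_first(&sc->sc_queue);
+
+		if (bp == NULL)
+			return (NULL);
+		if (!timed_out && bp->bio_offset != sc->sc_last_end) {
+			/*
+			 * Non-sequential request: instead of serving
+			 * it, idle a few ms hoping that a request
+			 * contiguous to the last one arrives first.
+			 */
+			if (!sc->sc_waiting) {
+				sc->sc_waiting = 1;
+				callout_reset(&sc->sc_callout,
+				    4 * hz / 1000,  /* ~4 ms, hz=1000 */
+				    as_timeout, sc);
+			}
+			return (NULL);
+		}
+		/* Sequential request, or the wait expired: serve it. */
+		callout_stop(&sc->sc_callout);
+		sc->sc_waiting = 0;
+		bp = bioq_takefirst(&sc->sc_queue);
+		sc->sc_last_end = bp->bio_offset + bp->bio_length;
+		return (bp);
+	}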
+
+There are many details that must be handled to make the mechanism
+effective across different workloads and systems, to gain a few
+extra percent in performance, and to improve fairness and isolation
+among processes.  A discussion of the vast literature on the subject
+is beyond the scope of this short note.
+
+--------------------------------------------------------------------------
+
+TRANSPARENT INSERT/DELETE
+
+geom_sched is an ordinary geom module; however, it is convenient
+to plug it transparently into the geom graph, so that one can
+enable or disable scheduling on a mounted filesystem and the
+names in /etc/fstab do not depend on the presence of the scheduler.
+
+To understand how this works in practice, remember that GEOM has
+"provider" and "geom" objects.
+Say that we want to hook a scheduler onto provider "ad0",
+accessible through pointer 'pp'.  Originally, pp is attached to
+geom "ad0" (same name, different object), accessible through
+pointer old_gp:
+
+ BEFORE ---> [ pp --> old_gp ...]
+
+A normal "geom sched create ad0" call would create a new geom node
+on top of provider ad0/pp, and export a newly created provider
+("ad0.sched." accessible through pointer newpp).
+
+ AFTER create ---> [ newpp --> gp --> cp ] ---> [ pp --> old_gp ... ]
+
+On top of newpp, a whole tree is created automatically, so we can
+e.g. mount partitions on /dev/ad0.sched.s1d.  Requests to those
+partitions go through the scheduler, whereas partitions mounted on
+the pre-existing device entries bypass it.
+
+With the transparent insert mechanism, the original provider "ad0"/pp
+is hooked to the newly created geom, as follows:
+
+ AFTER insert ---> [ pp --> gp --> cp ] ---> [ newpp --> old_gp ... ]
+
+so anything that was previously using provider pp now has its
+requests routed through the scheduler node.
+
+A removal ("geom sched destroy ad0.sched.") will restore the original
+configuration.
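+
+In terms of commands (using the names from the example above):
+
+	# geom sched create ad0
+	# mount /dev/ad0.sched.s1d /mnt	  (new name, goes through the scheduler)
+
+versus, with the transparent insert:
+
+	# geom sched insert ad0
+	# mount /dev/ad0s1d /mnt	  (old name, now scheduled as well)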
+
+# $FreeBSD$