1 files changed, 378 insertions, 3 deletions
diff --git a/documentation/poky-ref-manual/technical-details.xml b/documentation/poky-ref-manual/technical-details.xml
index b341795..1657431 100644
--- a/documentation/poky-ref-manual/technical-details.xml
+++ b/documentation/poky-ref-manual/technical-details.xml
@@ -151,12 +151,386 @@
 
     <para>
         By design, the Yocto Project builds everything from scratch unless it can determine that
-        a given task's inputs have not changed.  
-        While building from scratch ensures that everything is current, it does also
-        mean that a lot of time could be spent rebuiding things that don't necessarily need built.
+        parts don't need to be rebuilt.
+        Fundamentally, building from scratch is an attraction as it means all parts are 
+        built fresh and there is no possibility of stale data causing problems. 
+        When developers hit problems, they typically default back to building from scratch
+        so they know the state of things from the start.
+    </para>
+
+    <para>  
+        Building an image from scratch is both an advantage and a disadvantage to the process. 
+        As mentioned in the previous paragraph, building from scratch ensures that 
+        everything is current and starts from a known state.
+        However, building from scratch also takes much longer as it generally means 
+        rebuilding things that do not necessarily need to be rebuilt.
+    </para>
+
+    <para>
+        The Yocto Project implements shared state code that supports incremental builds.
+        The implementation of the shared state code answers the following questions that
+        were fundamental roadblocks within the Yocto Project incremental build support system:
+        <itemizedlist>
+            <listitem>What pieces of the system have changed and what pieces have not changed?</listitem>
+            <listitem>How are changed pieces of software removed and replaced?</listitem>
+            <listitem>How are pre-built components that don't need to be rebuilt from scratch
+                used when they are available?</listitem>
+        </itemizedlist>
     </para>
 
     <para>
+        For the first question, the build system detects changes in the "inputs" to a given task by 
+        creating a checksum (or signature) of the task's inputs. 
+        If the checksum changes, the system assumes the inputs have changed and the task needs to be 
+        rerun.
+        For the second question, the shared state (sstate) code tracks which tasks add which output
+        to the build process. 
+        This means the output from a given task can be removed, upgraded or otherwise manipulated.
+        The third question is partly addressed by the solution for the second question
+        assuming the build system can fetch the sstate objects from remote locations and 
+        install them if they are deemed to be valid.
+    </para>
+
+    <para>
+        The rest of this section goes into detail about the overall incremental build
+        architecture, the checksums (signatures), shared state, and some tips and tricks.
+    </para>
+
+    <section id='overall-architecture'>
+        <title>Overall Architecture</title>
+
+        <para>
+            When determining what parts of the system need to be built, the Yocto Project
+            works on a per-task basis rather than a per-recipe basis.
+            You might wonder why a per-task basis is preferred over a per-recipe basis.
+            To help explain, consider having the IPK packaging backend enabled and then switching to DEB. 
+            In this case, <filename>do_install</filename> and <filename>do_package</filename>
+            output are still valid.
+            However, with a per-recipe approach, the build would not include the 
+            <filename>.deb</filename> files.        
+            Consequently, you would have to invalidate the whole build and rerun it. 
+            Rerunning everything is not the best situation.
+            Also, with a per-recipe approach, the core must be "taught" a great deal about specific tasks.
+            This methodology does not scale well and does not allow users to easily add new tasks 
+            in layers or as external recipes without touching the packaged-staging core.
+        </para>
+    </section>
+
+    <section id='checksums'>
+        <title>Checksums (Signatures)</title>
+
+        <para>
+            The Yocto Project uses a checksum, which is a unique signature of a task's 
+            inputs, to determine if a task needs to be run again. 
+            Because it is a change in a task's inputs that triggers a rerun, the process
+            needs to detect all the inputs to a given task. 
+            For shell tasks, this turns out to be fairly easy because
+            the build process generates a "run" shell script for each task and 
+            it is possible to create a checksum that gives you a good idea of when 
+            the task's data changes.
+        </para>
+
+        <para>
+            To complicate the problem, there are things that should not be included in 
+            the checksum. 
+            First, there is the actual specific build path of a given task - 
+            the <filename>WORKDIR</filename>. 
+            It does not matter if the working directory changes because it should not 
+            affect the output for target packages.
+            Also, the build process has the objective of making native/cross packages relocatable. 
+            The checksum therefore needs to exclude <filename>WORKDIR</filename>.
+            The simplistic approach for excluding the working directory is to set
+            <filename>WORKDIR</filename> to some fixed value and create the checksum
+            for the "run" script. 
+        </para>
+
+        <para>
+            Another problem results from the "run" scripts containing functions that 
+            might or might not get called.  
+            The Yocto Project contains code that figures out dependencies between shell 
+            functions.
+            This code is used to prune the "run" scripts down to the minimum set, 
+            thereby alleviating this problem and making the "run" scripts much more 
+            readable as a bonus.
+        </para>
+
+        <para>
+            So far we have solutions for shell scripts.
+            What about python tasks?
+            Handling these tasks is more difficult, but the same approach
+            applies.
+            The process needs to figure out what variables a python function accesses 
+            and what functions it calls.
+            Again, the Yocto Project contains code that first figures out the variable and function 
+            dependencies, and then creates a checksum for the data used as the input to 
+            the task.
+        </para>
+
+        <para>
+            Like the <filename>WORKDIR</filename> case, situations exist where dependencies 
+            should be ignored.
+            For these cases, you can instruct the build process to ignore a dependency
+            by using a line like the following:
+            <literallayout class='monospaced'>
+     PACKAGE_ARCHS[vardepsexclude] = "MACHINE"
+            </literallayout>
+            This example ensures that the <filename>PACKAGE_ARCHS</filename> variable does not 
+            depend on the value of <filename>MACHINE</filename>, even if it does reference it.
+        </para>
+           
+        <para>
+            Equally, there are cases where we need to add in dependencies
+            BitBake is not able to find.
+            You can accomplish this by using a line like the following:
+            <literallayout class='monospaced'>
+      PACKAGE_ARCHS[vardeps] = "MACHINE"
+            </literallayout>
+            This example explicitly adds the <filename>MACHINE</filename> variable as a 
+            dependency for <filename>PACKAGE_ARCHS</filename>.
+        </para>
+
+        <para> 
+            Consider a case with inline python, for example, where BitBake is not
+            able to figure out dependencies. 
+            When running in debug mode (i.e. using <filename>-DDD</filename>), BitBake 
+            produces output when it discovers something for which it cannot figure out
+            dependencies. 
+            The Yocto Project team has currently not managed to cover those dependencies 
+            in detail and is aware of the need to fix this situation.
+        </para>
+
+        <para>
+            Thus far, this section has limited discussion to the direct inputs into a 
+            task.
+            Information based on direct inputs is referred to as the "basehash" in the code.
+            However, there is still the question of a task's indirect inputs, the things that 
+            were already built and present in the build directory. 
+            The checksum (or signature) for a particular task needs to add the hashes of all the
+            tasks the particular task depends upon. 
+            Choosing which dependencies to add is a policy decision.
+            However, the effect is to generate a master checksum that combines the 
+            basehash and the hashes of the task's dependencies.
+        </para>
+
+        <para>
+            While figuring out the dependencies and creating these checksums is good,
+            what does the Yocto Project build system do with the checksum information? 
+            The build system uses a signature handler that is responsible for 
+            processing the checksum information.
+            By default, there is a dummy "noop" signature handler enabled in BitBake.
+            This means that behavior is unchanged from previous versions.
+            OECore uses the "basic" signature handler through this setting in the
+            <filename>bitbake.conf</filename> file:
+            <literallayout class='monospaced'>
+     BB_SIGNATURE_HANDLER ?= "basic"
+            </literallayout>
+            Also within the BitBake configuration file, we can give BitBake
+            some extra information to help it handle this information.
+            The following statements effectively result in a global
+            list of variable dependency excludes, that is, variables never included in
+            any checksum:
+            <literallayout class='monospaced'>
+     BB_HASHBASE_WHITELIST ?= "TMPDIR FILE PATH PWD BB_TASKHASH BBPATH"
+     BB_HASHBASE_WHITELIST += "DL_DIR SSTATE_DIR THISDIR FILESEXTRAPATHS"
+     BB_HASHBASE_WHITELIST += "FILE_DIRNAME HOME LOGNAME SHELL TERM USER"
+     BB_HASHBASE_WHITELIST += "FILESPATH USERNAME STAGING_DIR_HOST STAGING_DIR_TARGET"
+     BB_HASHTASK_WHITELIST += "(.*-cross$|.*-native$|.*-cross-initial$| \
+         .*-cross-intermediate$|^virtual:native:.*|^virtual:nativesdk:.*)"
+            </literallayout>
+            This example is actually where <filename>WORKDIR</filename>
+            is excluded since <filename>WORKDIR</filename> is constructed as a
+            path within <filename>TMPDIR</filename>, which is on the whitelist.
+        </para>
+
+        <para>
+            The <filename>BB_HASHTASK_WHITELIST</filename> covers dependent tasks and 
+            excludes certain kinds of tasks from the dependency chains. 
+            The effect of the previous example is to isolate the native, target,
+            and cross components.
+            So, for example, toolchain changes do not force a rebuild of the whole system.
+        </para>
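+
+        <para>
+            The same mechanism can be extended for variables of your own.
+            As a hedged sketch, assuming a hypothetical variable named
+            <filename>MY_LOCAL_CACHE_DIR</filename> (not part of the Yocto Project) that
+            only holds a host-specific path and should never affect task output, you could
+            keep it out of every checksum with a line like the following:
+            <literallayout class='monospaced'>
+     BB_HASHBASE_WHITELIST += "MY_LOCAL_CACHE_DIR"
+            </literallayout>
+        </para>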
+
+        <para>
+            The end result of the "basic" handler is to make some dependency and
+            hash information available to the build. 
+            This includes:
+            <literallayout class='monospaced'>
+     BB_BASEHASH_task-&lt;taskname&gt; - the base hashes for each task in the recipe
+     BB_BASEHASH_&lt;filename:taskname&gt; - the base hashes for each dependent task
+     BBHASHDEPS_&lt;filename:taskname&gt; - the task dependencies for each task
+     BB_TASKHASH - the hash of the currently running task
+            </literallayout>
+            There is also a "basichash" <filename>BB_SIGNATURE_HANDLER</filename>,
+            which is the same as the basic version but adds the task hash to the stamp files. 
+            As a result, any metadata change that changes the task hash
+            automatically causes the task to be run again.
+            This removes the need to bump <filename>PR</filename>
+            values, and changes to metadata automatically ripple across the build.
+            Currently, this behavior is not the default behavior.
+            However, it is likely that the Yocto Project team will go forward with this 
+            behavior in the future since all the functionality exists. 
+            The reason for the delay is the potential impact on distribution feed
+            creation, because feeds need increasing <filename>PR</filename> fields
+            and the Yocto Project currently lacks a mechanism to automate incrementing
+            this field.
+        </para>
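+
+        <para>
+            As a minimal sketch, assuming you want to try the "basichash" handler
+            described above in your own build, you could override the handler in a
+            configuration file such as <filename>local.conf</filename>
+            (the choice of file is an assumption made for illustration):
+            <literallayout class='monospaced'>
+     BB_SIGNATURE_HANDLER = "basichash"
+            </literallayout>
+            Because <filename>bitbake.conf</filename> assigns the default with
+            <filename>?=</filename>, a plain assignment like this one takes precedence.
+        </para>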
+    </section>
+
+    <section id='shared-state'>
+        <title>Shared State</title>
+
+        <para>
+            Checksums and dependencies, as discussed in the previous section, solve half the
+            problem.
+            The other part of the problem is being able to use checksum information during the build
+            and being able to reuse or rebuild specific components.
+        </para>
+
+        <para>
+            The shared state class (<filename>sstate.bbclass</filename>) 
+            is a relatively generic implementation of how to
+            "capture" a snapshot of a given task. 
+            The idea is that the build process does not care about the source of a
+            task's output.
+            Output could be freshly built or it could be downloaded and unpacked from
+            somewhere - the build process doesn't need to worry about its source.
+        </para>
+
+        <para>
+            There are two types of output. One type is simply a directory created
+            in <filename>WORKDIR</filename>.
+            A good example is the output of either <filename>do_install</filename> or 
+            <filename>do_package</filename>. 
+            The other type of output occurs when a set of data is merged into a shared directory 
+            tree such as the sysroot.
+        </para>
+
+        <para>
+            The Yocto Project team has tried to keep the details of the implementation hidden in 
+            <filename>sstate.bbclass</filename>. 
+            From a user's perspective, adding shared state wrapping to a task
+            is as simple as this <filename>do_deploy</filename> example taken from
+            <filename>deploy.bbclass</filename>:
+            <literallayout class='monospaced'>
+     DEPLOYDIR = "${WORKDIR}/deploy-${PN}"
+     SSTATETASKS += "do_deploy"
+     do_deploy[sstate-name] = "deploy"
+     do_deploy[sstate-inputdirs] = "${DEPLOYDIR}"
+     do_deploy[sstate-outputdirs] = "${DEPLOY_DIR_IMAGE}"
+
+     python do_deploy_setscene () {
+         sstate_setscene(d)
+     }
+     addtask do_deploy_setscene
+            </literallayout>
+            In the example, we add some extra flags to the task, a name field ("deploy"), an
+            input directory where the task sends data, and the output
+            directory where the data from the task should eventually be copied. 
+            We also add a <filename>_setscene</filename> variant of the task and add the task
+            name to the <filename>SSTATETASKS</filename> list.
+        </para>
+
+        <para>
+            If you have a directory whose contents you need to preserve, 
+            you can do this with a line like the following:
+            <literallayout class='monospaced'>
+     do_package[sstate-plaindirs] = "${PKGD} ${PKGDEST}"
+            </literallayout>
+            This method, as well as the following example, also works for multiple directories.
+            <literallayout class='monospaced'>
+     do_package[sstate-inputdirs] = "${PKGDESTWORK} ${SHLIBSWORKDIR}"
+     do_package[sstate-outputdirs] = "${PKGDATA_DIR} ${SHLIBSDIR}"
+     do_package[sstate-lockfile] = "${PACKAGELOCK}"
+            </literallayout>
+            These methods also include the ability to take a lockfile when manipulating
+            shared state directory structures since some cases are sensitive to file
+            additions or removals.
+        </para>
+
+        <para>
+            Behind the scenes, the shared state code works by looking in 
+            <filename>SSTATE_DIR</filename> and  
+            <filename>SSTATE_MIRRORS</filename> for shared state files. 
+            Here is an example:
+            <literallayout class='monospaced'>
+     SSTATE_MIRRORS ?= "\
+     file://.* http://someserver.tld/share/sstate/ \n \
+     file://.* file:///some/local/dir/sstate/"
+            </literallayout>
+        </para>
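+
+        <para>
+            As an illustration, and assuming a hypothetical directory
+            <filename>/some/local/dir/sstate-cache</filename> that several build
+            directories on the same machine can write to, you could point each build at it
+            so that shared state packages written by one build can be reused by the others:
+            <literallayout class='monospaced'>
+     SSTATE_DIR = "/some/local/dir/sstate-cache"
+            </literallayout>
+        </para>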
+
+        <para>
+            The shared state package validity can be detected just by looking at the
+            filename since the filename contains the task checksum (or signature) as
+            described earlier in this section. 
+            If a valid shared state package is found, the build process downloads it 
+            and uses it to accelerate the task.
+        </para>
+
+        <para>
+            The build process uses the <filename>*_setscene</filename> tasks
+            for the task acceleration phase.
+            BitBake goes through this phase before the main execution code and tries
+            to accelerate any tasks for which it can find shared state packages. 
+            If a shared state package for a task is available, the shared state
+            package is used.
+            This means the task and any tasks on which it is dependent are not 
+            executed.
+        </para>
+
+        <para>
+            As a real-world example, the aim is that when building an IPK-based image,
+            only the <filename>do_package_write_ipk</filename> tasks would have their 
+            shared state packages fetched and extracted. 
+            Since the sysroot is not used, it would never get extracted. 
+            This is another reason to prefer the task-based approach over a 
+            recipe-based approach, which would have to install the output from every task.
+        </para>
+    </section>
+
+    <section id='tips-and-tricks'>
+        <title>Tips and Tricks</title>
+
+        <para>
+            The code in the Yocto Project that supports incremental builds is not 
+            simple code. 
+            Consequently, when things go wrong, debugging needs to be straightforward. 
+            Because of this, the Yocto Project team included strong debugging
+            tools.
+        </para>
+
+        <para>
+            First, whenever a shared state package is written out, so is a
+            corresponding <filename>.siginfo</filename> file. 
+            This practice results in a pickled python database of all
+            the metadata that went into creating the hash for a given shared state
+            package.
+        </para>
+
+        <para>
+            Second, if BitBake is run with the <filename>--dump-signatures</filename>
+            (or <filename>-S</filename>) option, BitBake dumps out 
+            <filename>.siginfo</filename> files in
+            the stamp directory for every task it would have executed instead of
+            building the target package specified.
+        </para>
+
+        <para>
+            Finally, there is a <filename>bitbake-diffsigs</filename> command that
+            can process these <filename>.siginfo</filename> files. 
+            If one file is specified, it will dump out the dependency
+            information in the file. 
+            If two files are specified, it will compare the
+            two files and dump out the differences between the two.
+            This allows the question of "What changed between X and Y?" to be
+            answered easily.
+        </para>
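+
+        <para>
+            Taken together, a debugging session might look like the following sketch.
+            The target name and the <filename>.siginfo</filename> paths are placeholders,
+            and the exact location and naming of the dumped files depend on your build:
+            <literallayout class='monospaced'>
+     $ bitbake -S &lt;target&gt;
+     $ bitbake-diffsigs &lt;first&gt;.siginfo
+     $ bitbake-diffsigs &lt;first&gt;.siginfo &lt;second&gt;.siginfo
+            </literallayout>
+            The first command dumps the signature data instead of building, the second
+            prints the dependency information stored in a single file, and the third
+            reports the differences between two files.
+        </para>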
+    </section>
+</section>
+
+<!--
+
+    <para>
         The Yocto Project build process uses a shared state caching scheme to avoid having to 
         rebuild software when it is not necessary.  
         Because the build time for a Yocto image can be significant, it is helpful to try and 
@@ -222,6 +596,7 @@
         <ulink url='http://git.yoctoproject.org/cgit.cgi/poky/commit/meta/classes/package.bbclass?id=737f8bbb4f27b4837047cb9b4fbfe01dfde36d54'>commit</ulink>.
     </note>
 </section>
+-->
 
 </chapter>
 <!--