author    | dim <dim@FreeBSD.org> | 2012-08-15 19:34:23 +0000
committer | dim <dim@FreeBSD.org> | 2012-08-15 19:34:23 +0000
commit    | 721c201bd55ffb73cb2ba8d39e0570fa38c44e15 (patch)
tree      | eacfc83d988e4b9d11114387ae7dc41243f2a363 /docs
parent    | 2b2816e083a455f7a656ae88b0fd059d1688bb36 (diff)
download  | FreeBSD-src-721c201bd55ffb73cb2ba8d39e0570fa38c44e15.zip
          | FreeBSD-src-721c201bd55ffb73cb2ba8d39e0570fa38c44e15.tar.gz
Vendor import of llvm trunk r161861:
http://llvm.org/svn/llvm-project/llvm/trunk@161861
Diffstat (limited to 'docs')
156 files changed, 18869 insertions, 22196 deletions
diff --git a/docs/AliasAnalysis.html b/docs/AliasAnalysis.html deleted file mode 100644 index c59f60d..0000000 --- a/docs/AliasAnalysis.html +++ /dev/null @@ -1,1067 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>LLVM Alias Analysis Infrastructure</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> - -<h1> - LLVM Alias Analysis Infrastructure -</h1> - -<ol> - <li><a href="#introduction">Introduction</a></li> - - <li><a href="#overview"><tt>AliasAnalysis</tt> Class Overview</a> - <ul> - <li><a href="#pointers">Representation of Pointers</a></li> - <li><a href="#alias">The <tt>alias</tt> method</a></li> - <li><a href="#ModRefInfo">The <tt>getModRefInfo</tt> methods</a></li> - <li><a href="#OtherItfs">Other useful <tt>AliasAnalysis</tt> methods</a></li> - </ul> - </li> - - <li><a href="#writingnew">Writing a new <tt>AliasAnalysis</tt> Implementation</a> - <ul> - <li><a href="#passsubclasses">Different Pass styles</a></li> - <li><a href="#requiredcalls">Required initialization calls</a></li> - <li><a href="#interfaces">Interfaces which may be specified</a></li> - <li><a href="#chaining"><tt>AliasAnalysis</tt> chaining behavior</a></li> - <li><a href="#updating">Updating analysis results for transformations</a></li> - <li><a href="#implefficiency">Efficiency Issues</a></li> - <li><a href="#limitations">Limitations</a></li> - </ul> - </li> - - <li><a href="#using">Using alias analysis results</a> - <ul> - <li><a href="#memdep">Using the <tt>MemoryDependenceAnalysis</tt> Pass</a></li> - <li><a href="#ast">Using the <tt>AliasSetTracker</tt> class</a></li> - <li><a href="#direct">Using the <tt>AliasAnalysis</tt> interface directly</a></li> - </ul> - </li> - - <li><a href="#exist">Existing alias analysis implementations and clients</a> - <ul> - <li><a href="#impls">Available <tt>AliasAnalysis</tt> implementations</a></li> - <li><a href="#aliasanalysis-xforms">Alias analysis driven transformations</a></li> - <li><a href="#aliasanalysis-debug">Clients for debugging and evaluation of - implementations</a></li> - </ul> - </li> - <li><a href="#memdep">Memory Dependence Analysis</a></li> -</ol> - -<div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="introduction">Introduction</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Alias Analysis (aka Pointer Analysis) is a class of techniques which attempt -to determine whether or not two pointers ever can point to the same object in -memory. There are many different algorithms for alias analysis and many -different ways of classifying them: flow-sensitive vs flow-insensitive, -context-sensitive vs context-insensitive, field-sensitive vs field-insensitive, -unification-based vs subset-based, etc. Traditionally, alias analyses respond -to a query with a <a href="#MustMayNo">Must, May, or No</a> alias response, -indicating that two pointers always point to the same object, might point to the -same object, or are known to never point to the same object.</p> - -<p>The LLVM <a -href="http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html"><tt>AliasAnalysis</tt></a> -class is the primary interface used by clients and implementations of alias -analyses in the LLVM system. 
This class is the common interface between clients -of alias analysis information and the implementations providing it, and is -designed to support a wide range of implementations and clients (but currently -all clients are assumed to be flow-insensitive). In addition to simple alias -analysis information, this class exposes Mod/Ref information from those -implementations which can provide it, allowing for powerful analyses and -transformations to work well together.</p> - -<p>This document contains information necessary to successfully implement this -interface, use it, and to test both sides. It also explains some of the finer -points about what exactly results mean. If you feel that something is unclear -or should be added, please <a href="mailto:sabre@nondot.org">let me -know</a>.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="overview"><tt>AliasAnalysis</tt> Class Overview</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>The <a -href="http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html"><tt>AliasAnalysis</tt></a> -class defines the interface that the various alias analysis implementations -should support. This class exports two important enums: <tt>AliasResult</tt> -and <tt>ModRefResult</tt> which represent the result of an alias query or a -mod/ref query, respectively.</p> - -<p>The <tt>AliasAnalysis</tt> interface exposes information about memory, -represented in several different ways. In particular, memory objects are -represented as a starting address and size, and function calls are represented -as the actual <tt>call</tt> or <tt>invoke</tt> instructions that performs the -call. The <tt>AliasAnalysis</tt> interface also exposes some helper methods -which allow you to get mod/ref information for arbitrary instructions.</p> - -<p>All <tt>AliasAnalysis</tt> interfaces require that in queries involving -multiple values, values which are not -<a href="LangRef.html#constants">constants</a> are all defined within the -same function.</p> - -<!-- ======================================================================= --> -<h3> - <a name="pointers">Representation of Pointers</a> -</h3> - -<div> - -<p>Most importantly, the <tt>AliasAnalysis</tt> class provides several methods -which are used to query whether or not two memory objects alias, whether -function calls can modify or read a memory object, etc. For all of these -queries, memory objects are represented as a pair of their starting address (a -symbolic LLVM <tt>Value*</tt>) and a static size.</p> - -<p>Representing memory objects as a starting address and a size is critically -important for correct Alias Analyses. For example, consider this (silly, but -possible) C code:</p> - -<div class="doc_code"> -<pre> -int i; -char C[2]; -char A[10]; -/* ... */ -for (i = 0; i != 10; ++i) { - C[0] = A[i]; /* One byte store */ - C[1] = A[9-i]; /* One byte store */ -} -</pre> -</div> - -<p>In this case, the <tt>basicaa</tt> pass will disambiguate the stores to -<tt>C[0]</tt> and <tt>C[1]</tt> because they are accesses to two distinct -locations one byte apart, and the accesses are each one byte. In this case, the -LICM pass can use store motion to remove the stores from the loop. In -constrast, the following code:</p> - -<div class="doc_code"> -<pre> -int i; -char C[2]; -char A[10]; -/* ... */ -for (i = 0; i != 10; ++i) { - ((short*)C)[0] = A[i]; /* Two byte store! 
*/ - C[1] = A[9-i]; /* One byte store */ -} -</pre> -</div> - -<p>In this case, the two stores to C do alias each other, because the access to -the <tt>&C[0]</tt> element is a two byte access. If size information wasn't -available in the query, even the first case would have to conservatively assume -that the accesses alias.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="alias">The <tt>alias</tt> method</a> -</h3> - -<div> -<p>The <tt>alias</tt> method is the primary interface used to determine whether -or not two memory objects alias each other. It takes two memory objects as -input and returns MustAlias, PartialAlias, MayAlias, or NoAlias as -appropriate.</p> - -<p>Like all <tt>AliasAnalysis</tt> interfaces, the <tt>alias</tt> method requires -that either the two pointer values be defined within the same function, or at -least one of the values is a <a href="LangRef.html#constants">constant</a>.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="MustMayNo">Must, May, and No Alias Responses</a> -</h4> - -<div> -<p>The NoAlias response may be used when there is never an immediate dependence -between any memory reference <i>based</i> on one pointer and any memory -reference <i>based</i> the other. The most obvious example is when the two -pointers point to non-overlapping memory ranges. Another is when the two -pointers are only ever used for reading memory. Another is when the memory is -freed and reallocated between accesses through one pointer and accesses through -the other -- in this case, there is a dependence, but it's mediated by the free -and reallocation.</p> - -<p>As an exception to this is with the -<a href="LangRef.html#noalias"><tt>noalias</tt></a> keyword; the "irrelevant" -dependencies are ignored.</p> - -<p>The MayAlias response is used whenever the two pointers might refer to the -same object.</p> - -<p>The PartialAlias response is used when the two memory objects are known -to be overlapping in some way, but do not start at the same address.</p> - -<p>The MustAlias response may only be returned if the two memory objects are -guaranteed to always start at exactly the same location. A MustAlias response -implies that the pointers compare equal.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="ModRefInfo">The <tt>getModRefInfo</tt> methods</a> -</h3> - -<div> - -<p>The <tt>getModRefInfo</tt> methods return information about whether the -execution of an instruction can read or modify a memory location. Mod/Ref -information is always conservative: if an instruction <b>might</b> read or write -a location, ModRef is returned.</p> - -<p>The <tt>AliasAnalysis</tt> class also provides a <tt>getModRefInfo</tt> -method for testing dependencies between function calls. This method takes two -call sites (CS1 & CS2), returns NoModRef if neither call writes to memory -read or written by the other, Ref if CS1 reads memory written by CS2, Mod if CS1 -writes to memory read or written by CS2, or ModRef if CS1 might read or write -memory written to by CS2. 
Note that this relation is not commutative.</p> - -</div> - - -<!-- ======================================================================= --> -<h3> - <a name="OtherItfs">Other useful <tt>AliasAnalysis</tt> methods</a> -</h3> - -<div> - -<p> -Several other tidbits of information are often collected by various alias -analysis implementations and can be put to good use by various clients. -</p> - -<!-- _______________________________________________________________________ --> -<h4> - The <tt>pointsToConstantMemory</tt> method -</h4> - -<div> - -<p>The <tt>pointsToConstantMemory</tt> method returns true if and only if the -analysis can prove that the pointer only points to unchanging memory locations -(functions, constant global variables, and the null pointer). This information -can be used to refine mod/ref information: it is impossible for an unchanging -memory location to be modified.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="simplemodref">The <tt>doesNotAccessMemory</tt> and - <tt>onlyReadsMemory</tt> methods</a> -</h4> - -<div> - -<p>These methods are used to provide very simple mod/ref information for -function calls. The <tt>doesNotAccessMemory</tt> method returns true for a -function if the analysis can prove that the function never reads or writes to -memory, or if the function only reads from constant memory. Functions with this -property are side-effect free and only depend on their input arguments, allowing -them to be eliminated if they form common subexpressions or be hoisted out of -loops. Many common functions behave this way (e.g., <tt>sin</tt> and -<tt>cos</tt>) but many others do not (e.g., <tt>acos</tt>, which modifies the -<tt>errno</tt> variable).</p> - -<p>The <tt>onlyReadsMemory</tt> method returns true for a function if analysis -can prove that (at most) the function only reads from non-volatile memory. -Functions with this property are side-effect free, only depending on their input -arguments and the state of memory when they are called. This property allows -calls to these functions to be eliminated and moved around, as long as there is -no store instruction that changes the contents of memory. Note that all -functions that satisfy the <tt>doesNotAccessMemory</tt> method also satisfies -<tt>onlyReadsMemory</tt>.</p> - -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="writingnew">Writing a new <tt>AliasAnalysis</tt> Implementation</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Writing a new alias analysis implementation for LLVM is quite -straight-forward. There are already several implementations that you can use -for examples, and the following information should help fill in any details. -For a examples, take a look at the <a href="#impls">various alias analysis -implementations</a> included with LLVM.</p> - -<!-- ======================================================================= --> -<h3> - <a name="passsubclasses">Different Pass styles</a> -</h3> - -<div> - -<p>The first step to determining what type of <a -href="WritingAnLLVMPass.html">LLVM pass</a> you need to use for your Alias -Analysis. 
As is the case with most other analyses and transformations, the -answer should be fairly obvious from what type of problem you are trying to -solve:</p> - -<ol> - <li>If you require interprocedural analysis, it should be a - <tt>Pass</tt>.</li> - <li>If you are a function-local analysis, subclass <tt>FunctionPass</tt>.</li> - <li>If you don't need to look at the program at all, subclass - <tt>ImmutablePass</tt>.</li> -</ol> - -<p>In addition to the pass that you subclass, you should also inherit from the -<tt>AliasAnalysis</tt> interface, of course, and use the -<tt>RegisterAnalysisGroup</tt> template to register as an implementation of -<tt>AliasAnalysis</tt>.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="requiredcalls">Required initialization calls</a> -</h3> - -<div> - -<p>Your subclass of <tt>AliasAnalysis</tt> is required to invoke two methods on -the <tt>AliasAnalysis</tt> base class: <tt>getAnalysisUsage</tt> and -<tt>InitializeAliasAnalysis</tt>. In particular, your implementation of -<tt>getAnalysisUsage</tt> should explicitly call into the -<tt>AliasAnalysis::getAnalysisUsage</tt> method in addition to doing any -declaring any pass dependencies your pass has. Thus you should have something -like this:</p> - -<div class="doc_code"> -<pre> -void getAnalysisUsage(AnalysisUsage &AU) const { - AliasAnalysis::getAnalysisUsage(AU); - <i>// declare your dependencies here.</i> -} -</pre> -</div> - -<p>Additionally, your must invoke the <tt>InitializeAliasAnalysis</tt> method -from your analysis run method (<tt>run</tt> for a <tt>Pass</tt>, -<tt>runOnFunction</tt> for a <tt>FunctionPass</tt>, or <tt>InitializePass</tt> -for an <tt>ImmutablePass</tt>). For example (as part of a <tt>Pass</tt>):</p> - -<div class="doc_code"> -<pre> -bool run(Module &M) { - InitializeAliasAnalysis(this); - <i>// Perform analysis here...</i> - return false; -} -</pre> -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="interfaces">Interfaces which may be specified</a> -</h3> - -<div> - -<p>All of the <a -href="/doxygen/classllvm_1_1AliasAnalysis.html"><tt>AliasAnalysis</tt></a> -virtual methods default to providing <a href="#chaining">chaining</a> to another -alias analysis implementation, which ends up returning conservatively correct -information (returning "May" Alias and "Mod/Ref" for alias and mod/ref queries -respectively). Depending on the capabilities of the analysis you are -implementing, you just override the interfaces you can improve.</p> - -</div> - - - -<!-- ======================================================================= --> -<h3> - <a name="chaining"><tt>AliasAnalysis</tt> chaining behavior</a> -</h3> - -<div> - -<p>With only one special exception (the <a href="#no-aa"><tt>no-aa</tt></a> -pass) every alias analysis pass chains to another alias analysis -implementation (for example, the user can specify "<tt>-basicaa -ds-aa --licm</tt>" to get the maximum benefit from both alias -analyses). The alias analysis class automatically takes care of most of this -for methods that you don't override. For methods that you do override, in code -paths that return a conservative MayAlias or Mod/Ref result, simply return -whatever the superclass computes. For example:</p> - -<div class="doc_code"> -<pre> -AliasAnalysis::AliasResult alias(const Value *V1, unsigned V1Size, - const Value *V2, unsigned V2Size) { - if (...) - return NoAlias; - ... 
- - <i>// Couldn't determine a must or no-alias result.</i> - return AliasAnalysis::alias(V1, V1Size, V2, V2Size); -} -</pre> -</div> - -<p>In addition to analysis queries, you must make sure to unconditionally pass -LLVM <a href="#updating">update notification</a> methods to the superclass as -well if you override them, which allows all alias analyses in a change to be -updated.</p> - -</div> - - -<!-- ======================================================================= --> -<h3> - <a name="updating">Updating analysis results for transformations</a> -</h3> - -<div> -<p> -Alias analysis information is initially computed for a static snapshot of the -program, but clients will use this information to make transformations to the -code. All but the most trivial forms of alias analysis will need to have their -analysis results updated to reflect the changes made by these transformations. -</p> - -<p> -The <tt>AliasAnalysis</tt> interface exposes four methods which are used to -communicate program changes from the clients to the analysis implementations. -Various alias analysis implementations should use these methods to ensure that -their internal data structures are kept up-to-date as the program changes (for -example, when an instruction is deleted), and clients of alias analysis must be -sure to call these interfaces appropriately. -</p> - -<!-- _______________________________________________________________________ --> -<h4>The <tt>deleteValue</tt> method</h4> - -<div> -The <tt>deleteValue</tt> method is called by transformations when they remove an -instruction or any other value from the program (including values that do not -use pointers). Typically alias analyses keep data structures that have entries -for each value in the program. When this method is called, they should remove -any entries for the specified value, if they exist. -</div> - -<!-- _______________________________________________________________________ --> -<h4>The <tt>copyValue</tt> method</h4> - -<div> -The <tt>copyValue</tt> method is used when a new value is introduced into the -program. There is no way to introduce a value into the program that did not -exist before (this doesn't make sense for a safe compiler transformation), so -this is the only way to introduce a new value. This method indicates that the -new value has exactly the same properties as the value being copied. -</div> - -<!-- _______________________________________________________________________ --> -<h4>The <tt>replaceWithNewValue</tt> method</h4> - -<div> -This method is a simple helper method that is provided to make clients easier to -use. It is implemented by copying the old analysis information to the new -value, then deleting the old value. This method cannot be overridden by alias -analysis implementations. -</div> - -<!-- _______________________________________________________________________ --> -<h4>The <tt>addEscapingUse</tt> method</h4> - -<div> -<p>The <tt>addEscapingUse</tt> method is used when the uses of a pointer -value have changed in ways that may invalidate precomputed analysis information. 
-Implementations may either use this callback to provide conservative responses -for points whose uses have change since analysis time, or may recompute some -or all of their internal state to continue providing accurate responses.</p> - -<p>In general, any new use of a pointer value is considered an escaping use, -and must be reported through this callback, <em>except</em> for the -uses below:</p> - -<ul> - <li>A <tt>bitcast</tt> or <tt>getelementptr</tt> of the pointer</li> - <li>A <tt>store</tt> through the pointer (but not a <tt>store</tt> - <em>of</em> the pointer)</li> - <li>A <tt>load</tt> through the pointer</li> -</ul> -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="implefficiency">Efficiency Issues</a> -</h3> - -<div> - -<p>From the LLVM perspective, the only thing you need to do to provide an -efficient alias analysis is to make sure that alias analysis <b>queries</b> are -serviced quickly. The actual calculation of the alias analysis results (the -"run" method) is only performed once, but many (perhaps duplicate) queries may -be performed. Because of this, try to move as much computation to the run -method as possible (within reason).</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="limitations">Limitations</a> -</h3> - -<div> - -<p>The AliasAnalysis infrastructure has several limitations which make -writing a new <tt>AliasAnalysis</tt> implementation difficult.</p> - -<p>There is no way to override the default alias analysis. It would -be very useful to be able to do something like "opt -my-aa -O2" and -have it use -my-aa for all passes which need AliasAnalysis, but there -is currently no support for that, short of changing the source code -and recompiling. Similarly, there is also no way of setting a chain -of analyses as the default.</p> - -<p>There is no way for transform passes to declare that they preserve -<tt>AliasAnalysis</tt> implementations. The <tt>AliasAnalysis</tt> -interface includes <tt>deleteValue</tt> and <tt>copyValue</tt> methods -which are intended to allow a pass to keep an AliasAnalysis consistent, -however there's no way for a pass to declare in its -<tt>getAnalysisUsage</tt> that it does so. Some passes attempt to use -<tt>AU.addPreserved<AliasAnalysis></tt>, however this doesn't -actually have any effect.</p> - -<p><tt>AliasAnalysisCounter</tt> (<tt>-count-aa</tt>) and <tt>AliasDebugger</tt> -(<tt>-debug-aa</tt>) are implemented as <tt>ModulePass</tt> classes, so if your -alias analysis uses <tt>FunctionPass</tt>, it won't be able to use -these utilities. If you try to use them, the pass manager will -silently route alias analysis queries directly to -<tt>BasicAliasAnalysis</tt> instead.</p> - -<p>Similarly, the <tt>opt -p</tt> option introduces <tt>ModulePass</tt> -passes between each pass, which prevents the use of <tt>FunctionPass</tt> -alias analysis passes.</p> - -<p>The <tt>AliasAnalysis</tt> API does have functions for notifying -implementations when values are deleted or copied, however these -aren't sufficient. There are many other ways that LLVM IR can be -modified which could be relevant to <tt>AliasAnalysis</tt> -implementations which can not be expressed.</p> - -<p>The <tt>AliasAnalysisDebugger</tt> utility seems to suggest that -<tt>AliasAnalysis</tt> implementations can expect that they will be -informed of any relevant <tt>Value</tt> before it appears in an -alias query. 
However, popular clients such as <tt>GVN</tt> don't -support this, and are known to trigger errors when run with the -<tt>AliasAnalysisDebugger</tt>.</p> - -<p>Due to several of the above limitations, the most obvious use for -the <tt>AliasAnalysisCounter</tt> utility, collecting stats on all -alias queries in a compilation, doesn't work, even if the -<tt>AliasAnalysis</tt> implementations don't use <tt>FunctionPass</tt>. -There's no way to set a default, much less a default sequence, -and there's no way to preserve it.</p> - -<p>The <tt>AliasSetTracker</tt> class (which is used by <tt>LICM</tt> -makes a non-deterministic number of alias queries. This can cause stats -collected by <tt>AliasAnalysisCounter</tt> to have fluctuations among -identical runs, for example. Another consequence is that debugging -techniques involving pausing execution after a predetermined number -of queries can be unreliable.</p> - -<p>Many alias queries can be reformulated in terms of other alias -queries. When multiple <tt>AliasAnalysis</tt> queries are chained together, -it would make sense to start those queries from the beginning of the chain, -with care taken to avoid infinite looping, however currently an -implementation which wants to do this can only start such queries -from itself.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="using">Using alias analysis results</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>There are several different ways to use alias analysis results. In order of -preference, these are...</p> - -<!-- ======================================================================= --> -<h3> - <a name="memdep">Using the <tt>MemoryDependenceAnalysis</tt> Pass</a> -</h3> - -<div> - -<p>The <tt>memdep</tt> pass uses alias analysis to provide high-level dependence -information about memory-using instructions. This will tell you which store -feeds into a load, for example. It uses caching and other techniques to be -efficient, and is used by Dead Store Elimination, GVN, and memcpy optimizations. -</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="ast">Using the <tt>AliasSetTracker</tt> class</a> -</h3> - -<div> - -<p>Many transformations need information about alias <b>sets</b> that are active -in some scope, rather than information about pairwise aliasing. The <tt><a -href="/doxygen/classllvm_1_1AliasSetTracker.html">AliasSetTracker</a></tt> class -is used to efficiently build these Alias Sets from the pairwise alias analysis -information provided by the <tt>AliasAnalysis</tt> interface.</p> - -<p>First you initialize the AliasSetTracker by using the "<tt>add</tt>" methods -to add information about various potentially aliasing instructions in the scope -you are interested in. Once all of the alias sets are completed, your pass -should simply iterate through the constructed alias sets, using the -<tt>AliasSetTracker</tt> <tt>begin()</tt>/<tt>end()</tt> methods.</p> - -<p>The <tt>AliasSet</tt>s formed by the <tt>AliasSetTracker</tt> are guaranteed -to be disjoint, calculate mod/ref information and volatility for the set, and -keep track of whether or not all of the pointers in the set are Must aliases. 
-The AliasSetTracker also makes sure that sets are properly folded due to call -instructions, and can provide a list of pointers in each set.</p> - -<p>As an example user of this, the <a href="/doxygen/structLICM.html">Loop -Invariant Code Motion</a> pass uses <tt>AliasSetTracker</tt>s to calculate alias -sets for each loop nest. If an <tt>AliasSet</tt> in a loop is not modified, -then all load instructions from that set may be hoisted out of the loop. If any -alias sets are stored to <b>and</b> are must alias sets, then the stores may be -sunk to outside of the loop, promoting the memory location to a register for the -duration of the loop nest. Both of these transformations only apply if the -pointer argument is loop-invariant.</p> - -<!-- _______________________________________________________________________ --> -<h4> - The AliasSetTracker implementation -</h4> - -<div> - -<p>The AliasSetTracker class is implemented to be as efficient as possible. It -uses the union-find algorithm to efficiently merge AliasSets when a pointer is -inserted into the AliasSetTracker that aliases multiple sets. The primary data -structure is a hash table mapping pointers to the AliasSet they are in.</p> - -<p>The AliasSetTracker class must maintain a list of all of the LLVM Value*'s -that are in each AliasSet. Since the hash table already has entries for each -LLVM Value* of interest, the AliasesSets thread the linked list through these -hash-table nodes to avoid having to allocate memory unnecessarily, and to make -merging alias sets extremely efficient (the linked list merge is constant time). -</p> - -<p>You shouldn't need to understand these details if you are just a client of -the AliasSetTracker, but if you look at the code, hopefully this brief -description will help make sense of why things are designed the way they -are.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="direct">Using the <tt>AliasAnalysis</tt> interface directly</a> -</h3> - -<div> - -<p>If neither of these utility class are what your pass needs, you should use -the interfaces exposed by the <tt>AliasAnalysis</tt> class directly. Try to use -the higher-level methods when possible (e.g., use mod/ref information instead of -the <a href="#alias"><tt>alias</tt></a> method directly if possible) to get the -best precision and efficiency.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="exist">Existing alias analysis implementations and clients</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>If you're going to be working with the LLVM alias analysis infrastructure, -you should know what clients and implementations of alias analysis are -available. In particular, if you are implementing an alias analysis, you should -be aware of the <a href="#aliasanalysis-debug">the clients</a> that are useful -for monitoring and evaluating different implementations.</p> - -<!-- ======================================================================= --> -<h3> - <a name="impls">Available <tt>AliasAnalysis</tt> implementations</a> -</h3> - -<div> - -<p>This section lists the various implementations of the <tt>AliasAnalysis</tt> -interface. 
With the exception of the <a href="#no-aa"><tt>-no-aa</tt></a> -implementation, all of these <a href="#chaining">chain</a> to other alias -analysis implementations.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="no-aa">The <tt>-no-aa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-no-aa</tt> pass is just like what it sounds: an alias analysis that -never returns any useful information. This pass can be useful if you think that -alias analysis is doing something wrong and are trying to narrow down a -problem.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="basic-aa">The <tt>-basicaa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-basicaa</tt> pass is an aggressive local analysis that "knows" -many important facts:</p> - -<ul> -<li>Distinct globals, stack allocations, and heap allocations can never - alias.</li> -<li>Globals, stack allocations, and heap allocations never alias the null - pointer.</li> -<li>Different fields of a structure do not alias.</li> -<li>Indexes into arrays with statically differing subscripts cannot alias.</li> -<li>Many common standard C library functions <a - href="#simplemodref">never access memory or only read memory</a>.</li> -<li>Pointers that obviously point to constant globals - "<tt>pointToConstantMemory</tt>".</li> -<li>Function calls can not modify or references stack allocations if they never - escape from the function that allocates them (a common case for automatic - arrays).</li> -</ul> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="globalsmodref">The <tt>-globalsmodref-aa</tt> pass</a> -</h4> - -<div> - -<p>This pass implements a simple context-sensitive mod/ref and alias analysis -for internal global variables that don't "have their address taken". If a -global does not have its address taken, the pass knows that no pointers alias -the global. This pass also keeps track of functions that it knows never access -memory or never read memory. This allows certain optimizations (e.g. GVN) to -eliminate call instructions entirely. -</p> - -<p>The real power of this pass is that it provides context-sensitive mod/ref -information for call instructions. This allows the optimizer to know that -calls to a function do not clobber or read the value of the global, allowing -loads and stores to be eliminated.</p> - -<p>Note that this pass is somewhat limited in its scope (only support -non-address taken globals), but is very quick analysis.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="steens-aa">The <tt>-steens-aa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-steens-aa</tt> pass implements a variation on the well-known -"Steensgaard's algorithm" for interprocedural alias analysis. Steensgaard's -algorithm is a unification-based, flow-insensitive, context-insensitive, and -field-insensitive alias analysis that is also very scalable (effectively linear -time).</p> - -<p>The LLVM <tt>-steens-aa</tt> pass implements a "speculatively -field-<b>sensitive</b>" version of Steensgaard's algorithm using the Data -Structure Analysis framework. 
This gives it substantially more precision than -the standard algorithm while maintaining excellent analysis scalability.</p> - -<p>Note that <tt>-steens-aa</tt> is available in the optional "poolalloc" -module, it is not part of the LLVM core.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ds-aa">The <tt>-ds-aa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-ds-aa</tt> pass implements the full Data Structure Analysis -algorithm. Data Structure Analysis is a modular unification-based, -flow-insensitive, context-<b>sensitive</b>, and speculatively -field-<b>sensitive</b> alias analysis that is also quite scalable, usually at -O(n*log(n)).</p> - -<p>This algorithm is capable of responding to a full variety of alias analysis -queries, and can provide context-sensitive mod/ref information as well. The -only major facility not implemented so far is support for must-alias -information.</p> - -<p>Note that <tt>-ds-aa</tt> is available in the optional "poolalloc" -module, it is not part of the LLVM core.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="scev-aa">The <tt>-scev-aa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-scev-aa</tt> pass implements AliasAnalysis queries by -translating them into ScalarEvolution queries. This gives it a -more complete understanding of <tt>getelementptr</tt> instructions -and loop induction variables than other alias analyses have.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="aliasanalysis-xforms">Alias analysis driven transformations</a> -</h3> - -<div> -LLVM includes several alias-analysis driven transformations which can be used -with any of the implementations above. - -<!-- _______________________________________________________________________ --> -<h4> - <a name="adce">The <tt>-adce</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-adce</tt> pass, which implements Aggressive Dead Code Elimination -uses the <tt>AliasAnalysis</tt> interface to delete calls to functions that do -not have side-effects and are not used.</p> - -</div> - - -<!-- _______________________________________________________________________ --> -<h4> - <a name="licm">The <tt>-licm</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-licm</tt> pass implements various Loop Invariant Code Motion related -transformations. It uses the <tt>AliasAnalysis</tt> interface for several -different transformations:</p> - -<ul> -<li>It uses mod/ref information to hoist or sink load instructions out of loops -if there are no instructions in the loop that modifies the memory loaded.</li> - -<li>It uses mod/ref information to hoist function calls out of loops that do not -write to memory and are loop-invariant.</li> - -<li>If uses alias information to promote memory objects that are loaded and -stored to in loops to live in a register instead. It can do this if there are -no may aliases to the loaded/stored memory location.</li> -</ul> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="argpromotion">The <tt>-argpromotion</tt> pass</a> -</h4> - -<div> -<p> -The <tt>-argpromotion</tt> pass promotes by-reference arguments to be passed in -by-value instead. In particular, if pointer arguments are only loaded from it -passes in the value loaded instead of the address to the function. 
This pass -uses alias information to make sure that the value loaded from the argument -pointer is not modified between the entry of the function and any load of the -pointer.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="gvn">The <tt>-gvn</tt>, <tt>-memcpyopt</tt>, and <tt>-dse</tt> - passes</a> -</h4> - -<div> - -<p>These passes use AliasAnalysis information to reason about loads and stores. -</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="aliasanalysis-debug">Clients for debugging and evaluation of - implementations</a> -</h3> - -<div> - -<p>These passes are useful for evaluating the various alias analysis -implementations. You can use them with commands like '<tt>opt -ds-aa --aa-eval foo.bc -disable-output -stats</tt>'.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="print-alias-sets">The <tt>-print-alias-sets</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-print-alias-sets</tt> pass is exposed as part of the -<tt>opt</tt> tool to print out the Alias Sets formed by the <a -href="#ast"><tt>AliasSetTracker</tt></a> class. This is useful if you're using -the <tt>AliasSetTracker</tt> class. To use it, use something like:</p> - -<div class="doc_code"> -<pre> -% opt -ds-aa -print-alias-sets -disable-output -</pre> -</div> - -</div> - - -<!-- _______________________________________________________________________ --> -<h4> - <a name="count-aa">The <tt>-count-aa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-count-aa</tt> pass is useful to see how many queries a particular -pass is making and what responses are returned by the alias analysis. As an -example,</p> - -<div class="doc_code"> -<pre> -% opt -basicaa -count-aa -ds-aa -count-aa -licm -</pre> -</div> - -<p>will print out how many queries (and what responses are returned) by the -<tt>-licm</tt> pass (of the <tt>-ds-aa</tt> pass) and how many queries are made -of the <tt>-basicaa</tt> pass by the <tt>-ds-aa</tt> pass. This can be useful -when debugging a transformation or an alias analysis implementation.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="aa-eval">The <tt>-aa-eval</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-aa-eval</tt> pass simply iterates through all pairs of pointers in a -function and asks an alias analysis whether or not the pointers alias. This -gives an indication of the precision of the alias analysis. Statistics are -printed indicating the percent of no/may/must aliases found (a more precise -algorithm will have a lower number of may aliases).</p> - -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="memdep">Memory Dependence Analysis</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>If you're just looking to be a client of alias analysis information, consider -using the Memory Dependence Analysis interface instead. MemDep is a lazy, -caching layer on top of alias analysis that is able to answer the question of -what preceding memory operations a given instruction depends on, either at an -intra- or inter-block level. 
Because of its laziness and caching -policy, using MemDep can be a significant performance win over accessing alias -analysis directly.</p> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-01-31 00:05:41 +0100 (Tue, 31 Jan 2012) $ -</address> - -</body> -</html> diff --git a/docs/AliasAnalysis.rst b/docs/AliasAnalysis.rst new file mode 100644 index 0000000..2d4f291 --- /dev/null +++ b/docs/AliasAnalysis.rst @@ -0,0 +1,702 @@ +.. _alias_analysis: + +================================== +LLVM Alias Analysis Infrastructure +================================== + +.. contents:: + :local: + +Introduction +============ + +Alias Analysis (aka Pointer Analysis) is a class of techniques which attempt to +determine whether or not two pointers ever can point to the same object in +memory. There are many different algorithms for alias analysis and many +different ways of classifying them: flow-sensitive vs. flow-insensitive, +context-sensitive vs. context-insensitive, field-sensitive +vs. field-insensitive, unification-based vs. subset-based, etc. Traditionally, +alias analyses respond to a query with a `Must, May, or No`_ alias response, +indicating that two pointers always point to the same object, might point to the +same object, or are known to never point to the same object. + +The LLVM `AliasAnalysis +<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__ class is the +primary interface used by clients and implementations of alias analyses in the +LLVM system. This class is the common interface between clients of alias +analysis information and the implementations providing it, and is designed to +support a wide range of implementations and clients (but currently all clients +are assumed to be flow-insensitive). In addition to simple alias analysis +information, this class exposes Mod/Ref information from those implementations +which can provide it, allowing for powerful analyses and transformations to work +well together. + +This document contains information necessary to successfully implement this +interface, use it, and to test both sides. It also explains some of the finer +points about what exactly results mean. If you feel that something is unclear +or should be added, please `let me know <mailto:sabre@nondot.org>`_. + +``AliasAnalysis`` Class Overview +================================ + +The `AliasAnalysis <http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__ +class defines the interface that the various alias analysis implementations +should support. This class exports two important enums: ``AliasResult`` and +``ModRefResult`` which represent the result of an alias query or a mod/ref +query, respectively. + +The ``AliasAnalysis`` interface exposes information about memory, represented in +several different ways. In particular, memory objects are represented as a +starting address and size, and function calls are represented as the actual +``call`` or ``invoke`` instructions that performs the call. 
The +``AliasAnalysis`` interface also exposes some helper methods which allow you to +get mod/ref information for arbitrary instructions. + +All ``AliasAnalysis`` interfaces require that in queries involving multiple +values, values which are not `constants <LangRef.html#constants>`_ are all +defined within the same function. + +Representation of Pointers +-------------------------- + +Most importantly, the ``AliasAnalysis`` class provides several methods which are +used to query whether or not two memory objects alias, whether function calls +can modify or read a memory object, etc. For all of these queries, memory +objects are represented as a pair of their starting address (a symbolic LLVM +``Value*``) and a static size. + +Representing memory objects as a starting address and a size is critically +important for correct Alias Analyses. For example, consider this (silly, but +possible) C code: + +.. code-block:: c++ + + int i; + char C[2]; + char A[10]; + /* ... */ + for (i = 0; i != 10; ++i) { + C[0] = A[i]; /* One byte store */ + C[1] = A[9-i]; /* One byte store */ + } + +In this case, the ``basicaa`` pass will disambiguate the stores to ``C[0]`` and +``C[1]`` because they are accesses to two distinct locations one byte apart, and +the accesses are each one byte. In this case, the Loop Invariant Code Motion +(LICM) pass can use store motion to remove the stores from the loop. In +constrast, the following code: + +.. code-block:: c++ + + int i; + char C[2]; + char A[10]; + /* ... */ + for (i = 0; i != 10; ++i) { + ((short*)C)[0] = A[i]; /* Two byte store! */ + C[1] = A[9-i]; /* One byte store */ + } + +In this case, the two stores to C do alias each other, because the access to the +``&C[0]`` element is a two byte access. If size information wasn't available in +the query, even the first case would have to conservatively assume that the +accesses alias. + +.. _alias: + +The ``alias`` method +-------------------- + +The ``alias`` method is the primary interface used to determine whether or not +two memory objects alias each other. It takes two memory objects as input and +returns MustAlias, PartialAlias, MayAlias, or NoAlias as appropriate. + +Like all ``AliasAnalysis`` interfaces, the ``alias`` method requires that either +the two pointer values be defined within the same function, or at least one of +the values is a `constant <LangRef.html#constants>`_. + +.. _Must, May, or No: + +Must, May, and No Alias Responses +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``NoAlias`` response may be used when there is never an immediate dependence +between any memory reference *based* on one pointer and any memory reference +*based* the other. The most obvious example is when the two pointers point to +non-overlapping memory ranges. Another is when the two pointers are only ever +used for reading memory. Another is when the memory is freed and reallocated +between accesses through one pointer and accesses through the other --- in this +case, there is a dependence, but it's mediated by the free and reallocation. + +As an exception to this is with the `noalias <LangRef.html#noalias>`_ keyword; +the "irrelevant" dependencies are ignored. + +The ``MayAlias`` response is used whenever the two pointers might refer to the +same object. + +The ``PartialAlias`` response is used when the two memory objects are known to +be overlapping in some way, but do not start at the same address. 
+ +The ``MustAlias`` response may only be returned if the two memory objects are +guaranteed to always start at exactly the same location. A ``MustAlias`` +response implies that the pointers compare equal. + +The ``getModRefInfo`` methods +----------------------------- + +The ``getModRefInfo`` methods return information about whether the execution of +an instruction can read or modify a memory location. Mod/Ref information is +always conservative: if an instruction **might** read or write a location, +``ModRef`` is returned. + +The ``AliasAnalysis`` class also provides a ``getModRefInfo`` method for testing +dependencies between function calls. This method takes two call sites (``CS1`` +& ``CS2``), returns ``NoModRef`` if neither call writes to memory read or +written by the other, ``Ref`` if ``CS1`` reads memory written by ``CS2``, +``Mod`` if ``CS1`` writes to memory read or written by ``CS2``, or ``ModRef`` if +``CS1`` might read or write memory written to by ``CS2``. Note that this +relation is not commutative. + +Other useful ``AliasAnalysis`` methods +-------------------------------------- + +Several other tidbits of information are often collected by various alias +analysis implementations and can be put to good use by various clients. + +The ``pointsToConstantMemory`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``pointsToConstantMemory`` method returns true if and only if the analysis +can prove that the pointer only points to unchanging memory locations +(functions, constant global variables, and the null pointer). This information +can be used to refine mod/ref information: it is impossible for an unchanging +memory location to be modified. + +.. _never access memory or only read memory: + +The ``doesNotAccessMemory`` and ``onlyReadsMemory`` methods +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +These methods are used to provide very simple mod/ref information for function +calls. The ``doesNotAccessMemory`` method returns true for a function if the +analysis can prove that the function never reads or writes to memory, or if the +function only reads from constant memory. Functions with this property are +side-effect free and only depend on their input arguments, allowing them to be +eliminated if they form common subexpressions or be hoisted out of loops. Many +common functions behave this way (e.g., ``sin`` and ``cos``) but many others do +not (e.g., ``acos``, which modifies the ``errno`` variable). + +The ``onlyReadsMemory`` method returns true for a function if analysis can prove +that (at most) the function only reads from non-volatile memory. Functions with +this property are side-effect free, only depending on their input arguments and +the state of memory when they are called. This property allows calls to these +functions to be eliminated and moved around, as long as there is no store +instruction that changes the contents of memory. Note that all functions that +satisfy the ``doesNotAccessMemory`` method also satisfies ``onlyReadsMemory``. + +Writing a new ``AliasAnalysis`` Implementation +============================================== + +Writing a new alias analysis implementation for LLVM is quite straight-forward. +There are already several implementations that you can use for examples, and the +following information should help fill in any details. For a examples, take a +look at the `various alias analysis implementations`_ included with LLVM. 
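Pulling together the pieces described in the subsections that follow, a bare-bones skeleton of such an implementation might look roughly like this. This is only a sketch against the LLVM 3.x-era interfaces documented here: the pass name ``-my-aa``, the class name ``MyAA``, and the decision to answer no queries itself (everything chains to the next analysis) are illustrative placeholders, not part of LLVM or of this commit.

.. code-block:: c++

  #include "llvm/Pass.h"
  #include "llvm/Analysis/AliasAnalysis.h"
  using namespace llvm;

  namespace {
    // Hypothetical do-nothing implementation: every query falls through to
    // the next analysis in the chain, so it is conservatively correct.
    struct MyAA : public ImmutablePass, public AliasAnalysis {
      static char ID;
      MyAA() : ImmutablePass(ID) {}

      // Required initialization call for an ImmutablePass-based analysis.
      virtual void initializePass() {
        InitializeAliasAnalysis(this);
      }

      // Declare dependencies through the AliasAnalysis base class.
      virtual void getAnalysisUsage(AnalysisUsage &AU) const {
        AliasAnalysis::getAnalysisUsage(AU);
      }

      // Lets clients of the AliasAnalysis analysis group recover the
      // AliasAnalysis part of this multiply-inherited pass object.
      virtual void *getAdjustedAnalysisPointer(const void *ID) {
        if (ID == &AliasAnalysis::ID)
          return (AliasAnalysis*)this;
        return this;
      }

      // A real implementation would also override whichever query methods
      // (alias, getModRefInfo, pointsToConstantMemory, ...) it can answer
      // more precisely, returning the AliasAnalysis::... result otherwise.
    };
  }

  char MyAA::ID = 0;
  static RegisterPass<MyAA> X("my-aa", "Example alias analysis", false, true);
  static RegisterAnalysisGroup<AliasAnalysis> Y(X);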
+ +Different Pass styles +--------------------- + +The first step to determining what type of `LLVM pass <WritingAnLLVMPass.html>`_ +you need to use for your Alias Analysis. As is the case with most other +analyses and transformations, the answer should be fairly obvious from what type +of problem you are trying to solve: + +#. If you require interprocedural analysis, it should be a ``Pass``. +#. If you are a function-local analysis, subclass ``FunctionPass``. +#. If you don't need to look at the program at all, subclass ``ImmutablePass``. + +In addition to the pass that you subclass, you should also inherit from the +``AliasAnalysis`` interface, of course, and use the ``RegisterAnalysisGroup`` +template to register as an implementation of ``AliasAnalysis``. + +Required initialization calls +----------------------------- + +Your subclass of ``AliasAnalysis`` is required to invoke two methods on the +``AliasAnalysis`` base class: ``getAnalysisUsage`` and +``InitializeAliasAnalysis``. In particular, your implementation of +``getAnalysisUsage`` should explicitly call into the +``AliasAnalysis::getAnalysisUsage`` method in addition to doing any declaring +any pass dependencies your pass has. Thus you should have something like this: + +.. code-block:: c++ + + void getAnalysisUsage(AnalysisUsage &AU) const { + AliasAnalysis::getAnalysisUsage(AU); + // declare your dependencies here. + } + +Additionally, your must invoke the ``InitializeAliasAnalysis`` method from your +analysis run method (``run`` for a ``Pass``, ``runOnFunction`` for a +``FunctionPass``, or ``InitializePass`` for an ``ImmutablePass``). For example +(as part of a ``Pass``): + +.. code-block:: c++ + + bool run(Module &M) { + InitializeAliasAnalysis(this); + // Perform analysis here... + return false; + } + +Interfaces which may be specified +--------------------------------- + +All of the `AliasAnalysis +<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__ virtual methods +default to providing `chaining`_ to another alias analysis implementation, which +ends up returning conservatively correct information (returning "May" Alias and +"Mod/Ref" for alias and mod/ref queries respectively). Depending on the +capabilities of the analysis you are implementing, you just override the +interfaces you can improve. + +.. _chaining: +.. _chain: + +``AliasAnalysis`` chaining behavior +----------------------------------- + +With only one special exception (the `no-aa`_ pass) every alias analysis pass +chains to another alias analysis implementation (for example, the user can +specify "``-basicaa -ds-aa -licm``" to get the maximum benefit from both alias +analyses). The alias analysis class automatically takes care of most of this +for methods that you don't override. For methods that you do override, in code +paths that return a conservative MayAlias or Mod/Ref result, simply return +whatever the superclass computes. For example: + +.. code-block:: c++ + + AliasAnalysis::AliasResult alias(const Value *V1, unsigned V1Size, + const Value *V2, unsigned V2Size) { + if (...) + return NoAlias; + ... + + // Couldn't determine a must or no-alias result. + return AliasAnalysis::alias(V1, V1Size, V2, V2Size); + } + +In addition to analysis queries, you must make sure to unconditionally pass LLVM +`update notification`_ methods to the superclass as well if you override them, +which allows all alias analyses in a change to be updated. + +.. 
_update notification: + +Updating analysis results for transformations +--------------------------------------------- + +Alias analysis information is initially computed for a static snapshot of the +program, but clients will use this information to make transformations to the +code. All but the most trivial forms of alias analysis will need to have their +analysis results updated to reflect the changes made by these transformations. + +The ``AliasAnalysis`` interface exposes four methods which are used to +communicate program changes from the clients to the analysis implementations. +Various alias analysis implementations should use these methods to ensure that +their internal data structures are kept up-to-date as the program changes (for +example, when an instruction is deleted), and clients of alias analysis must be +sure to call these interfaces appropriately. + +The ``deleteValue`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``deleteValue`` method is called by transformations when they remove an +instruction or any other value from the program (including values that do not +use pointers). Typically alias analyses keep data structures that have entries +for each value in the program. When this method is called, they should remove +any entries for the specified value, if they exist. + +The ``copyValue`` method +^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``copyValue`` method is used when a new value is introduced into the +program. There is no way to introduce a value into the program that did not +exist before (this doesn't make sense for a safe compiler transformation), so +this is the only way to introduce a new value. This method indicates that the +new value has exactly the same properties as the value being copied. + +The ``replaceWithNewValue`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This method is a simple helper method that is provided to make clients easier to +use. It is implemented by copying the old analysis information to the new +value, then deleting the old value. This method cannot be overridden by alias +analysis implementations. + +The ``addEscapingUse`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``addEscapingUse`` method is used when the uses of a pointer value have +changed in ways that may invalidate precomputed analysis information. +Implementations may either use this callback to provide conservative responses +for points whose uses have change since analysis time, or may recompute some or +all of their internal state to continue providing accurate responses. + +In general, any new use of a pointer value is considered an escaping use, and +must be reported through this callback, *except* for the uses below: + +* A ``bitcast`` or ``getelementptr`` of the pointer +* A ``store`` through the pointer (but not a ``store`` *of* the pointer) +* A ``load`` through the pointer + +Efficiency Issues +----------------- + +From the LLVM perspective, the only thing you need to do to provide an efficient +alias analysis is to make sure that alias analysis **queries** are serviced +quickly. The actual calculation of the alias analysis results (the "run" +method) is only performed once, but many (perhaps duplicate) queries may be +performed. Because of this, try to move as much computation to the run method +as possible (within reason). + +Limitations +----------- + +The AliasAnalysis infrastructure has several limitations which make writing a +new ``AliasAnalysis`` implementation difficult. + +There is no way to override the default alias analysis. 
It would be very useful +to be able to do something like "``opt -my-aa -O2``" and have it use ``-my-aa`` +for all passes which need AliasAnalysis, but there is currently no support for +that, short of changing the source code and recompiling. Similarly, there is +also no way of setting a chain of analyses as the default. + +There is no way for transform passes to declare that they preserve +``AliasAnalysis`` implementations. The ``AliasAnalysis`` interface includes +``deleteValue`` and ``copyValue`` methods which are intended to allow a pass to +keep an AliasAnalysis consistent, however there's no way for a pass to declare +in its ``getAnalysisUsage`` that it does so. Some passes attempt to use +``AU.addPreserved<AliasAnalysis>``, however this doesn't actually have any +effect. + +``AliasAnalysisCounter`` (``-count-aa``) and ``AliasDebugger`` (``-debug-aa``) +are implemented as ``ModulePass`` classes, so if your alias analysis uses +``FunctionPass``, it won't be able to use these utilities. If you try to use +them, the pass manager will silently route alias analysis queries directly to +``BasicAliasAnalysis`` instead. + +Similarly, the ``opt -p`` option introduces ``ModulePass`` passes between each +pass, which prevents the use of ``FunctionPass`` alias analysis passes. + +The ``AliasAnalysis`` API does have functions for notifying implementations when +values are deleted or copied, however these aren't sufficient. There are many +other ways that LLVM IR can be modified which could be relevant to +``AliasAnalysis`` implementations which can not be expressed. + +The ``AliasAnalysisDebugger`` utility seems to suggest that ``AliasAnalysis`` +implementations can expect that they will be informed of any relevant ``Value`` +before it appears in an alias query. However, popular clients such as ``GVN`` +don't support this, and are known to trigger errors when run with the +``AliasAnalysisDebugger``. + +Due to several of the above limitations, the most obvious use for the +``AliasAnalysisCounter`` utility, collecting stats on all alias queries in a +compilation, doesn't work, even if the ``AliasAnalysis`` implementations don't +use ``FunctionPass``. There's no way to set a default, much less a default +sequence, and there's no way to preserve it. + +The ``AliasSetTracker`` class (which is used by ``LICM``) makes a +non-deterministic number of alias queries. This can cause stats collected by +``AliasAnalysisCounter`` to have fluctuations among identical runs, for +example. Another consequence is that debugging techniques involving pausing +execution after a predetermined number of queries can be unreliable. + +Many alias queries can be reformulated in terms of other alias queries. When +multiple ``AliasAnalysis`` queries are chained together, it would make sense to +start those queries from the beginning of the chain, with care taken to avoid +infinite looping, however currently an implementation which wants to do this can +only start such queries from itself. + +Using alias analysis results +============================ + +There are several different ways to use alias analysis results. In order of +preference, these are: + +Using the ``MemoryDependenceAnalysis`` Pass +------------------------------------------- + +The ``memdep`` pass uses alias analysis to provide high-level dependence +information about memory-using instructions. This will tell you which store +feeds into a load, for example. 
It uses caching and other techniques to be
+efficient, and is used by Dead Store Elimination, GVN, and memcpy optimizations.
+
+.. _AliasSetTracker:
+
+Using the ``AliasSetTracker`` class
+-----------------------------------
+
+Many transformations need information about alias **sets** that are active in
+some scope, rather than information about pairwise aliasing.  The
+`AliasSetTracker <http://llvm.org/doxygen/classllvm_1_1AliasSetTracker.html>`__
+class is used to efficiently build these Alias Sets from the pairwise alias
+analysis information provided by the ``AliasAnalysis`` interface.
+
+First you initialize the AliasSetTracker by using the "``add``" methods to add
+information about various potentially aliasing instructions in the scope you
+are interested in.  Once all of the alias sets are completed, your pass should
+simply iterate through the constructed alias sets, using the
+``AliasSetTracker`` ``begin()``/``end()`` methods.
+
+The ``AliasSet``\s formed by the ``AliasSetTracker`` are guaranteed to be
+disjoint, calculate mod/ref information and volatility for the set, and keep
+track of whether or not all of the pointers in the set are Must aliases.  The
+AliasSetTracker also makes sure that sets are properly folded due to call
+instructions, and can provide a list of pointers in each set.
+
+As an example user of this, the `Loop Invariant Code Motion
+<doxygen/structLICM.html>`_ pass uses ``AliasSetTracker``\s to calculate alias
+sets for each loop nest.  If an ``AliasSet`` in a loop is not modified, then all
+load instructions from that set may be hoisted out of the loop.  If any alias
+sets are stored to **and** are must alias sets, then the stores may be sunk out
+of the loop, promoting the memory location to a register for the duration of
+the loop nest.  Both of these transformations only apply if the pointer
+argument is loop-invariant.
+
+The AliasSetTracker implementation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The AliasSetTracker class is implemented to be as efficient as possible.  It
+uses the union-find algorithm to efficiently merge AliasSets when a pointer is
+inserted into the AliasSetTracker that aliases multiple sets.  The primary data
+structure is a hash table mapping pointers to the AliasSet they are in.
+
+The AliasSetTracker class must maintain a list of all of the LLVM ``Value*``\s
+that are in each AliasSet.  Since the hash table already has entries for each
+LLVM ``Value*`` of interest, the AliasSets thread the linked list through these
+hash-table nodes to avoid having to allocate memory unnecessarily, and to make
+merging alias sets extremely efficient (the linked list merge is constant
+time).
+
+You shouldn't need to understand these details if you are just a client of the
+AliasSetTracker, but if you look at the code, hopefully this brief description
+will help make sense of why things are designed the way they are.
+
+Using the ``AliasAnalysis`` interface directly
+----------------------------------------------
+
+If neither of these utility classes is what your pass needs, you should use the
+interfaces exposed by the ``AliasAnalysis`` class directly.  Try to use the
+higher-level methods when possible (e.g., use mod/ref information instead of
+the `alias`_ method directly if possible) to get the best precision and
+efficiency.
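+
+As a concrete illustration, here is a minimal sketch of a ``FunctionPass`` that
+requests whatever ``AliasAnalysis`` chain the user configured and issues
+``alias`` queries directly.  The pass name, the counting it does, and the exact
+header paths are illustrative assumptions, not an existing LLVM pass:
+
+.. code-block:: c++
+
+  #include "llvm/Pass.h"
+  #include "llvm/Function.h"
+  #include "llvm/Instructions.h"
+  #include "llvm/Analysis/AliasAnalysis.h"
+  #include "llvm/Support/InstIterator.h"
+  using namespace llvm;
+
+  namespace {
+    // Hypothetical pass: counts store/load pointer pairs proven NoAlias.
+    struct CountNoAlias : public FunctionPass {
+      static char ID;
+      CountNoAlias() : FunctionPass(ID) {}
+
+      virtual void getAnalysisUsage(AnalysisUsage &AU) const {
+        AU.addRequired<AliasAnalysis>();  // use whatever chain is configured
+        AU.setPreservesAll();             // this pass does not modify the IR
+      }
+
+      virtual bool runOnFunction(Function &F) {
+        AliasAnalysis &AA = getAnalysis<AliasAnalysis>();
+        unsigned NumNoAlias = 0;
+        for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
+          if (StoreInst *SI = dyn_cast<StoreInst>(&*I))
+            for (inst_iterator J = inst_begin(F); J != E; ++J)
+              if (LoadInst *LI = dyn_cast<LoadInst>(&*J))
+                if (AA.alias(SI->getPointerOperand(),
+                             LI->getPointerOperand()) == AliasAnalysis::NoAlias)
+                  ++NumNoAlias;
+        (void)NumNoAlias;  // a real pass would act on this information
+        return false;
+      }
+    };
+  }
+
+  char CountNoAlias::ID = 0;
+
+A real pass would additionally need to be registered (for example with
+``RegisterPass``) before it could be run from ``opt``.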
+ +Existing alias analysis implementations and clients +=================================================== + +If you're going to be working with the LLVM alias analysis infrastructure, you +should know what clients and implementations of alias analysis are available. +In particular, if you are implementing an alias analysis, you should be aware of +the `the clients`_ that are useful for monitoring and evaluating different +implementations. + +.. _various alias analysis implementations: + +Available ``AliasAnalysis`` implementations +------------------------------------------- + +This section lists the various implementations of the ``AliasAnalysis`` +interface. With the exception of the `-no-aa`_ implementation, all of these +`chain`_ to other alias analysis implementations. + +.. _no-aa: +.. _-no-aa: + +The ``-no-aa`` pass +^^^^^^^^^^^^^^^^^^^ + +The ``-no-aa`` pass is just like what it sounds: an alias analysis that never +returns any useful information. This pass can be useful if you think that alias +analysis is doing something wrong and are trying to narrow down a problem. + +The ``-basicaa`` pass +^^^^^^^^^^^^^^^^^^^^^ + +The ``-basicaa`` pass is an aggressive local analysis that *knows* many +important facts: + +* Distinct globals, stack allocations, and heap allocations can never alias. +* Globals, stack allocations, and heap allocations never alias the null pointer. +* Different fields of a structure do not alias. +* Indexes into arrays with statically differing subscripts cannot alias. +* Many common standard C library functions `never access memory or only read + memory`_. +* Pointers that obviously point to constant globals "``pointToConstantMemory``". +* Function calls can not modify or references stack allocations if they never + escape from the function that allocates them (a common case for automatic + arrays). + +The ``-globalsmodref-aa`` pass +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This pass implements a simple context-sensitive mod/ref and alias analysis for +internal global variables that don't "have their address taken". If a global +does not have its address taken, the pass knows that no pointers alias the +global. This pass also keeps track of functions that it knows never access +memory or never read memory. This allows certain optimizations (e.g. GVN) to +eliminate call instructions entirely. + +The real power of this pass is that it provides context-sensitive mod/ref +information for call instructions. This allows the optimizer to know that calls +to a function do not clobber or read the value of the global, allowing loads and +stores to be eliminated. + +.. note:: + + This pass is somewhat limited in its scope (only support non-address taken + globals), but is very quick analysis. + +The ``-steens-aa`` pass +^^^^^^^^^^^^^^^^^^^^^^^ + +The ``-steens-aa`` pass implements a variation on the well-known "Steensgaard's +algorithm" for interprocedural alias analysis. Steensgaard's algorithm is a +unification-based, flow-insensitive, context-insensitive, and field-insensitive +alias analysis that is also very scalable (effectively linear time). + +The LLVM ``-steens-aa`` pass implements a "speculatively field-**sensitive**" +version of Steensgaard's algorithm using the Data Structure Analysis framework. +This gives it substantially more precision than the standard algorithm while +maintaining excellent analysis scalability. + +.. note:: + + ``-steens-aa`` is available in the optional "poolalloc" module. It is not part + of the LLVM core. 
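+
+To make the ``-basicaa`` bullet list above a bit more concrete, the following
+fragment (an illustrative example, not taken from the LLVM sources) is annotated
+with conclusions ``-basicaa`` can typically reach on its own:
+
+.. code-block:: c++
+
+  int G1, G2;                         // distinct globals
+  const int Table[4] = {1, 2, 3, 4};  // constant global
+
+  int f(int *P) {
+    int Local = *P;                   // stack allocation
+    // &G1 vs. &G2:    NoAlias (distinct globals never alias)
+    // &G1 vs. &Local: NoAlias (globals never alias stack allocations)
+    // &G1 vs. NULL:   NoAlias (allocations never alias the null pointer)
+    // &Table[0]:      pointsToConstantMemory
+    // P vs. &G1:      MayAlias (nothing is known about the incoming pointer)
+    return Local + G1 + G2 + Table[1];
+  }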
+
+The ``-ds-aa`` pass
+^^^^^^^^^^^^^^^^^^^
+
+The ``-ds-aa`` pass implements the full Data Structure Analysis algorithm.  Data
+Structure Analysis is a modular unification-based, flow-insensitive,
+context-**sensitive**, and speculatively field-**sensitive** alias analysis
+that is also quite scalable, usually at ``O(n * log(n))``.
+
+This algorithm is capable of responding to a full variety of alias analysis
+queries, and can provide context-sensitive mod/ref information as well.  The
+only major facility not implemented so far is support for must-alias
+information.
+
+.. note::
+
+  ``-ds-aa`` is available in the optional "poolalloc" module.  It is not part of
+  the LLVM core.
+
+The ``-scev-aa`` pass
+^^^^^^^^^^^^^^^^^^^^^
+
+The ``-scev-aa`` pass implements AliasAnalysis queries by translating them into
+ScalarEvolution queries.  This gives it a more complete understanding of
+``getelementptr`` instructions and loop induction variables than other alias
+analyses have.
+
+Alias analysis driven transformations
+-------------------------------------
+
+LLVM includes several alias-analysis driven transformations which can be used
+with any of the implementations above.
+
+The ``-adce`` pass
+^^^^^^^^^^^^^^^^^^
+
+The ``-adce`` pass, which implements Aggressive Dead Code Elimination, uses the
+``AliasAnalysis`` interface to delete calls to functions that do not have
+side-effects and are not used.
+
+The ``-licm`` pass
+^^^^^^^^^^^^^^^^^^
+
+The ``-licm`` pass implements various Loop Invariant Code Motion related
+transformations.  It uses the ``AliasAnalysis`` interface for several different
+transformations:
+
+* It uses mod/ref information to hoist or sink load instructions out of loops if
+  there are no instructions in the loop that modify the memory loaded.
+
+* It uses mod/ref information to hoist function calls out of loops that do not
+  write to memory and are loop-invariant.
+
+* It uses alias information to promote memory objects that are loaded and stored
+  to in loops to live in a register instead.  It can do this if there are no may
+  aliases to the loaded/stored memory location.
+
+The ``-argpromotion`` pass
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``-argpromotion`` pass promotes by-reference arguments to be passed in
+by-value instead.  In particular, if a pointer argument is only loaded from, it
+passes the loaded value into the function instead of the address.  This pass
+uses alias information to make sure that the value loaded from the argument
+pointer is not modified between the entry of the function and any load of the
+pointer.
+
+The ``-gvn``, ``-memcpyopt``, and ``-dse`` passes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These passes use AliasAnalysis information to reason about loads and stores.
+
+.. _the clients:
+
+Clients for debugging and evaluation of implementations
+--------------------------------------------------------
+
+These passes are useful for evaluating the various alias analysis
+implementations.  You can use them with commands like:
+
+.. code-block:: bash
+
+  % opt -ds-aa -aa-eval foo.bc -disable-output -stats
+
+The ``-print-alias-sets`` pass
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``-print-alias-sets`` pass is exposed as part of the ``opt`` tool to print
+out the Alias Sets formed by the `AliasSetTracker`_ class.  This is useful if
+you're using the ``AliasSetTracker`` class.  To use it, use something like:
+
+..
code-block:: bash + + % opt -ds-aa -print-alias-sets -disable-output + +The ``-count-aa`` pass +^^^^^^^^^^^^^^^^^^^^^^ + +The ``-count-aa`` pass is useful to see how many queries a particular pass is +making and what responses are returned by the alias analysis. As an example: + +.. code-block:: bash + + % opt -basicaa -count-aa -ds-aa -count-aa -licm + +will print out how many queries (and what responses are returned) by the +``-licm`` pass (of the ``-ds-aa`` pass) and how many queries are made of the +``-basicaa`` pass by the ``-ds-aa`` pass. This can be useful when debugging a +transformation or an alias analysis implementation. + +The ``-aa-eval`` pass +^^^^^^^^^^^^^^^^^^^^^ + +The ``-aa-eval`` pass simply iterates through all pairs of pointers in a +function and asks an alias analysis whether or not the pointers alias. This +gives an indication of the precision of the alias analysis. Statistics are +printed indicating the percent of no/may/must aliases found (a more precise +algorithm will have a lower number of may aliases). + +Memory Dependence Analysis +========================== + +If you're just looking to be a client of alias analysis information, consider +using the Memory Dependence Analysis interface instead. MemDep is a lazy, +caching layer on top of alias analysis that is able to answer the question of +what preceding memory operations a given instruction depends on, either at an +intra- or inter-block level. Because of its laziness and caching policy, using +MemDep can be a significant performance win over accessing alias analysis +directly. diff --git a/docs/Atomics.html b/docs/Atomics.html deleted file mode 100644 index fc15e27..0000000 --- a/docs/Atomics.html +++ /dev/null @@ -1,569 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <title>LLVM Atomic Instructions and Concurrency Guide</title> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> - -<h1> - LLVM Atomic Instructions and Concurrency Guide -</h1> - -<ol> - <li><a href="#introduction">Introduction</a></li> - <li><a href="#outsideatomic">Optimization outside atomic</a></li> - <li><a href="#atomicinst">Atomic instructions</a></li> - <li><a href="#ordering">Atomic orderings</a></li> - <li><a href="#iropt">Atomics and IR optimization</a></li> - <li><a href="#codegen">Atomics and Codegen</a></li> -</ol> - -<div class="doc_author"> - <p>Written by Eli Friedman</p> -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="introduction">Introduction</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Historically, LLVM has not had very strong support for concurrency; some -minimal intrinsics were provided, and <code>volatile</code> was used in some -cases to achieve rough semantics in the presence of concurrency. However, this -is changing; there are now new instructions which are well-defined in the -presence of threads and asynchronous signals, and the model for existing -instructions has been clarified in the IR.</p> - -<p>The atomic instructions are designed specifically to provide readable IR and - optimized code generation for the following:</p> -<ul> - <li>The new C++0x <code><atomic></code> header. - (<a href="http://www.open-std.org/jtc1/sc22/wg21/">C++0x draft available here</a>.) 
- (<a href="http://www.open-std.org/jtc1/sc22/wg14/">C1x draft available here</a>)</li> - <li>Proper semantics for Java-style memory, for both <code>volatile</code> and - regular shared variables. - (<a href="http://java.sun.com/docs/books/jls/third_edition/html/memory.html">Java Specification</a>)</li> - <li>gcc-compatible <code>__sync_*</code> builtins. - (<a href="http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html">Description</a>)</li> - <li>Other scenarios with atomic semantics, including <code>static</code> - variables with non-trivial constructors in C++.</li> -</ul> - -<p>Atomic and volatile in the IR are orthogonal; "volatile" is the C/C++ - volatile, which ensures that every volatile load and store happens and is - performed in the stated order. A couple examples: if a - SequentiallyConsistent store is immediately followed by another - SequentiallyConsistent store to the same address, the first store can - be erased. This transformation is not allowed for a pair of volatile - stores. On the other hand, a non-volatile non-atomic load can be moved - across a volatile load freely, but not an Acquire load.</p> - -<p>This document is intended to provide a guide to anyone either writing a - frontend for LLVM or working on optimization passes for LLVM with a guide - for how to deal with instructions with special semantics in the presence of - concurrency. This is not intended to be a precise guide to the semantics; - the details can get extremely complicated and unreadable, and are not - usually necessary.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="outsideatomic">Optimization outside atomic</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>The basic <code>'load'</code> and <code>'store'</code> allow a variety of - optimizations, but can lead to undefined results in a concurrent environment; - see <a href="#o_nonatomic">NonAtomic</a>. This section specifically goes - into the one optimizer restriction which applies in concurrent environments, - which gets a bit more of an extended description because any optimization - dealing with stores needs to be aware of it.</p> - -<p>From the optimizer's point of view, the rule is that if there - are not any instructions with atomic ordering involved, concurrency does - not matter, with one exception: if a variable might be visible to another - thread or signal handler, a store cannot be inserted along a path where it - might not execute otherwise. Take the following example:</p> - -<pre> -/* C code, for readability; run through clang -O2 -S -emit-llvm to get - equivalent IR */ -int x; -void f(int* a) { - for (int i = 0; i < 100; i++) { - if (a[i]) - x += 1; - } -} -</pre> - -<p>The following is equivalent in non-concurrent situations:</p> - -<pre> -int x; -void f(int* a) { - int xtemp = x; - for (int i = 0; i < 100; i++) { - if (a[i]) - xtemp += 1; - } - x = xtemp; -} -</pre> - -<p>However, LLVM is not allowed to transform the former to the latter: it could - indirectly introduce undefined behavior if another thread can access x at - the same time. 
(This example is particularly of interest because before the - concurrency model was implemented, LLVM would perform this - transformation.)</p> - -<p>Note that speculative loads are allowed; a load which - is part of a race returns <code>undef</code>, but does not have undefined - behavior.</p> - - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="atomicinst">Atomic instructions</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>For cases where simple loads and stores are not sufficient, LLVM provides - various atomic instructions. The exact guarantees provided depend on the - ordering; see <a href="#ordering">Atomic orderings</a></p> - -<p><code>load atomic</code> and <code>store atomic</code> provide the same - basic functionality as non-atomic loads and stores, but provide additional - guarantees in situations where threads and signals are involved.</p> - -<p><code>cmpxchg</code> and <code>atomicrmw</code> are essentially like an - atomic load followed by an atomic store (where the store is conditional for - <code>cmpxchg</code>), but no other memory operation can happen on any thread - between the load and store. Note that LLVM's cmpxchg does not provide quite - as many options as the C++0x version.</p> - -<p>A <code>fence</code> provides Acquire and/or Release ordering which is not - part of another operation; it is normally used along with Monotonic memory - operations. A Monotonic load followed by an Acquire fence is roughly - equivalent to an Acquire load.</p> - -<p>Frontends generating atomic instructions generally need to be aware of the - target to some degree; atomic instructions are guaranteed to be lock-free, - and therefore an instruction which is wider than the target natively supports - can be impossible to generate.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="ordering">Atomic orderings</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>In order to achieve a balance between performance and necessary guarantees, - there are six levels of atomicity. They are listed in order of strength; - each level includes all the guarantees of the previous level except for - Acquire/Release. (See also <a href="LangRef.html#ordering">LangRef</a>.)</p> - -<!-- ======================================================================= --> -<h3> - <a name="o_notatomic">NotAtomic</a> -</h3> - -<div> - -<p>NotAtomic is the obvious, a load or store which is not atomic. (This isn't - really a level of atomicity, but is listed here for comparison.) This is - essentially a regular load or store. If there is a race on a given memory - location, loads from that location return undef.</p> - -<dl> - <dt>Relevant standard</dt> - <dd>This is intended to match shared variables in C/C++, and to be used - in any other context where memory access is necessary, and - a race is impossible. (The precise definition is in - <a href="LangRef.html#memmodel">LangRef</a>.) - <dt>Notes for frontends</dt> - <dd>The rule is essentially that all memory accessed with basic loads and - stores by multiple threads should be protected by a lock or other - synchronization; otherwise, you are likely to run into undefined - behavior. If your frontend is for a "safe" language like Java, - use Unordered to load and store any shared variable. 
Note that NotAtomic - volatile loads and stores are not properly atomic; do not try to use - them as a substitute. (Per the C/C++ standards, volatile does provide - some limited guarantees around asynchronous signals, but atomics are - generally a better solution.) - <dt>Notes for optimizers</dt> - <dd>Introducing loads to shared variables along a codepath where they would - not otherwise exist is allowed; introducing stores to shared variables - is not. See <a href="#outsideatomic">Optimization outside - atomic</a>.</dd> - <dt>Notes for code generation</dt> - <dd>The one interesting restriction here is that it is not allowed to write - to bytes outside of the bytes relevant to a store. This is mostly - relevant to unaligned stores: it is not allowed in general to convert - an unaligned store into two aligned stores of the same width as the - unaligned store. Backends are also expected to generate an i8 store - as an i8 store, and not an instruction which writes to surrounding - bytes. (If you are writing a backend for an architecture which cannot - satisfy these restrictions and cares about concurrency, please send an - email to llvmdev.)</dd> -</dl> - -</div> - - -<!-- ======================================================================= --> -<h3> - <a name="o_unordered">Unordered</a> -</h3> - -<div> - -<p>Unordered is the lowest level of atomicity. It essentially guarantees that - races produce somewhat sane results instead of having undefined behavior. - It also guarantees the operation to be lock-free, so it do not depend on - the data being part of a special atomic structure or depend on a separate - per-process global lock. Note that code generation will fail for - unsupported atomic operations; if you need such an operation, use explicit - locking.</p> - -<dl> - <dt>Relevant standard</dt> - <dd>This is intended to match the Java memory model for shared - variables.</dd> - <dt>Notes for frontends</dt> - <dd>This cannot be used for synchronization, but is useful for Java and - other "safe" languages which need to guarantee that the generated - code never exhibits undefined behavior. Note that this guarantee - is cheap on common platforms for loads of a native width, but can - be expensive or unavailable for wider loads, like a 64-bit store - on ARM. (A frontend for Java or other "safe" languages would normally - split a 64-bit store on ARM into two 32-bit unordered stores.) - <dt>Notes for optimizers</dt> - <dd>In terms of the optimizer, this prohibits any transformation that - transforms a single load into multiple loads, transforms a store - into multiple stores, narrows a store, or stores a value which - would not be stored otherwise. Some examples of unsafe optimizations - are narrowing an assignment into a bitfield, rematerializing - a load, and turning loads and stores into a memcpy call. Reordering - unordered operations is safe, though, and optimizers should take - advantage of that because unordered operations are common in - languages that need them.</dd> - <dt>Notes for code generation</dt> - <dd>These operations are required to be atomic in the sense that if you - use unordered loads and unordered stores, a load cannot see a value - which was never stored. 
A normal load or store instruction is usually - sufficient, but note that an unordered load or store cannot - be split into multiple instructions (or an instruction which - does multiple memory operations, like <code>LDRD</code> on ARM).</dd> -</dl> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="o_monotonic">Monotonic</a> -</h3> - -<div> - -<p>Monotonic is the weakest level of atomicity that can be used in - synchronization primitives, although it does not provide any general - synchronization. It essentially guarantees that if you take all the - operations affecting a specific address, a consistent ordering exists. - -<dl> - <dt>Relevant standard</dt> - <dd>This corresponds to the C++0x/C1x <code>memory_order_relaxed</code>; - see those standards for the exact definition. - <dt>Notes for frontends</dt> - <dd>If you are writing a frontend which uses this directly, use with caution. - The guarantees in terms of synchronization are very weak, so make - sure these are only used in a pattern which you know is correct. - Generally, these would either be used for atomic operations which - do not protect other memory (like an atomic counter), or along with - a <code>fence</code>.</dd> - <dt>Notes for optimizers</dt> - <dd>In terms of the optimizer, this can be treated as a read+write on the - relevant memory location (and alias analysis will take advantage of - that). In addition, it is legal to reorder non-atomic and Unordered - loads around Monotonic loads. CSE/DSE and a few other optimizations - are allowed, but Monotonic operations are unlikely to be used in ways - which would make those optimizations useful.</dd> - <dt>Notes for code generation</dt> - <dd>Code generation is essentially the same as that for unordered for loads - and stores. No fences are required. <code>cmpxchg</code> and - <code>atomicrmw</code> are required to appear as a single operation.</dd> -</dl> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="o_acquire">Acquire</a> -</h3> - -<div> - -<p>Acquire provides a barrier of the sort necessary to acquire a lock to access - other memory with normal loads and stores. - -<dl> - <dt>Relevant standard</dt> - <dd>This corresponds to the C++0x/C1x <code>memory_order_acquire</code>. It - should also be used for C++0x/C1x <code>memory_order_consume</code>. - <dt>Notes for frontends</dt> - <dd>If you are writing a frontend which uses this directly, use with caution. - Acquire only provides a semantic guarantee when paired with a Release - operation.</dd> - <dt>Notes for optimizers</dt> - <dd>Optimizers not aware of atomics can treat this like a nothrow call. - It is also possible to move stores from before an Acquire load - or read-modify-write operation to after it, and move non-Acquire - loads from before an Acquire operation to after it.</dd> - <dt>Notes for code generation</dt> - <dd>Architectures with weak memory ordering (essentially everything relevant - today except x86 and SPARC) require some sort of fence to maintain - the Acquire semantics. The precise fences required varies widely by - architecture, but for a simple implementation, most architectures provide - a barrier which is strong enough for everything (<code>dmb</code> on ARM, - <code>sync</code> on PowerPC, etc.). 
Putting such a fence after the - equivalent Monotonic operation is sufficient to maintain Acquire - semantics for a memory operation.</dd> -</dl> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="o_acquire">Release</a> -</h3> - -<div> - -<p>Release is similar to Acquire, but with a barrier of the sort necessary to - release a lock. - -<dl> - <dt>Relevant standard</dt> - <dd>This corresponds to the C++0x/C1x <code>memory_order_release</code>.</dd> - <dt>Notes for frontends</dt> - <dd>If you are writing a frontend which uses this directly, use with caution. - Release only provides a semantic guarantee when paired with a Acquire - operation.</dd> - <dt>Notes for optimizers</dt> - <dd>Optimizers not aware of atomics can treat this like a nothrow call. - It is also possible to move loads from after a Release store - or read-modify-write operation to before it, and move non-Release - stores from after an Release operation to before it.</dd> - <dt>Notes for code generation</dt> - <dd>See the section on Acquire; a fence before the relevant operation is - usually sufficient for Release. Note that a store-store fence is not - sufficient to implement Release semantics; store-store fences are - generally not exposed to IR because they are extremely difficult to - use correctly.</dd> -</dl> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="o_acqrel">AcquireRelease</a> -</h3> - -<div> - -<p>AcquireRelease (<code>acq_rel</code> in IR) provides both an Acquire and a - Release barrier (for fences and operations which both read and write memory). - -<dl> - <dt>Relevant standard</dt> - <dd>This corresponds to the C++0x/C1x <code>memory_order_acq_rel</code>. - <dt>Notes for frontends</dt> - <dd>If you are writing a frontend which uses this directly, use with caution. - Acquire only provides a semantic guarantee when paired with a Release - operation, and vice versa.</dd> - <dt>Notes for optimizers</dt> - <dd>In general, optimizers should treat this like a nothrow call; the - the possible optimizations are usually not interesting.</dd> - <dt>Notes for code generation</dt> - <dd>This operation has Acquire and Release semantics; see the sections on - Acquire and Release.</dd> -</dl> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="o_seqcst">SequentiallyConsistent</a> -</h3> - -<div> - -<p>SequentiallyConsistent (<code>seq_cst</code> in IR) provides - Acquire semantics for loads and Release semantics for - stores. Additionally, it guarantees that a total ordering exists - between all SequentiallyConsistent operations. - -<dl> - <dt>Relevant standard</dt> - <dd>This corresponds to the C++0x/C1x <code>memory_order_seq_cst</code>, - Java volatile, and the gcc-compatible <code>__sync_*</code> builtins - which do not specify otherwise. - <dt>Notes for frontends</dt> - <dd>If a frontend is exposing atomic operations, these are much easier to - reason about for the programmer than other kinds of operations, and using - them is generally a practical performance tradeoff.</dd> - <dt>Notes for optimizers</dt> - <dd>Optimizers not aware of atomics can treat this like a nothrow call. 
- For SequentiallyConsistent loads and stores, the same reorderings are - allowed as for Acquire loads and Release stores, except that - SequentiallyConsistent operations may not be reordered.</dd> - <dt>Notes for code generation</dt> - <dd>SequentiallyConsistent loads minimally require the same barriers - as Acquire operations and SequentiallyConsistent stores require - Release barriers. Additionally, the code generator must enforce - ordering between SequentiallyConsistent stores followed by - SequentiallyConsistent loads. This is usually done by emitting - either a full fence before the loads or a full fence after the - stores; which is preferred varies by architecture.</dd> -</dl> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="iropt">Atomics and IR optimization</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Predicates for optimizer writers to query: -<ul> - <li>isSimple(): A load or store which is not volatile or atomic. This is - what, for example, memcpyopt would check for operations it might - transform.</li> - <li>isUnordered(): A load or store which is not volatile and at most - Unordered. This would be checked, for example, by LICM before hoisting - an operation.</li> - <li>mayReadFromMemory()/mayWriteToMemory(): Existing predicate, but note - that they return true for any operation which is volatile or at least - Monotonic.</li> - <li>Alias analysis: Note that AA will return ModRef for anything Acquire or - Release, and for the address accessed by any Monotonic operation.</li> -</ul> - -<p>To support optimizing around atomic operations, make sure you are using - the right predicates; everything should work if that is done. If your - pass should optimize some atomic operations (Unordered operations in - particular), make sure it doesn't replace an atomic load or store with - a non-atomic operation.</p> - -<p>Some examples of how optimizations interact with various kinds of atomic - operations: -<ul> - <li>memcpyopt: An atomic operation cannot be optimized into part of a - memcpy/memset, including unordered loads/stores. It can pull operations - across some atomic operations. - <li>LICM: Unordered loads/stores can be moved out of a loop. It just treats - monotonic operations like a read+write to a memory location, and anything - stricter than that like a nothrow call. - <li>DSE: Unordered stores can be DSE'ed like normal stores. Monotonic stores - can be DSE'ed in some cases, but it's tricky to reason about, and not - especially important. - <li>Folding a load: Any atomic load from a constant global can be - constant-folded, because it cannot be observed. Similar reasoning allows - scalarrepl with atomic loads and stores. -</ul> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="codegen">Atomics and Codegen</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Atomic operations are represented in the SelectionDAG with - <code>ATOMIC_*</code> opcodes. On architectures which use barrier - instructions for all atomic ordering (like ARM), appropriate fences are - split out as the DAG is built.</p> - -<p>The MachineMemOperand for all atomic operations is currently marked as - volatile; this is not correct in the IR sense of volatile, but CodeGen - handles anything marked volatile very conservatively. 
This should get - fixed at some point.</p> - -<p>Common architectures have some way of representing at least a pointer-sized - lock-free <code>cmpxchg</code>; such an operation can be used to implement - all the other atomic operations which can be represented in IR up to that - size. Backends are expected to implement all those operations, but not - operations which cannot be implemented in a lock-free manner. It is - expected that backends will give an error when given an operation which - cannot be implemented. (The LLVM code generator is not very helpful here - at the moment, but hopefully that will change.)</p> - -<p>The implementation of atomics on LL/SC architectures (like ARM) is currently - a bit of a mess; there is a lot of copy-pasted code across targets, and - the representation is relatively unsuited to optimization (it would be nice - to be able to optimize loops involving cmpxchg etc.).</p> - -<p>On x86, all atomic loads generate a <code>MOV</code>. - SequentiallyConsistent stores generate an <code>XCHG</code>, other stores - generate a <code>MOV</code>. SequentiallyConsistent fences generate an - <code>MFENCE</code>, other fences do not cause any code to be generated. - cmpxchg uses the <code>LOCK CMPXCHG</code> instruction. - <code>atomicrmw xchg</code> uses <code>XCHG</code>, - <code>atomicrmw add</code> and <code>atomicrmw sub</code> use - <code>XADD</code>, and all other <code>atomicrmw</code> operations generate - a loop with <code>LOCK CMPXCHG</code>. Depending on the users of the - result, some <code>atomicrmw</code> operations can be translated into - operations like <code>LOCK AND</code>, but that does not work in - general.</p> - -<p>On ARM, MIPS, and many other RISC architectures, Acquire, Release, and - SequentiallyConsistent semantics require barrier instructions - for every such operation. Loads and stores generate normal instructions. - <code>cmpxchg</code> and <code>atomicrmw</code> can be represented using - a loop with LL/SC-style instructions which take some sort of exclusive - lock on a cache line (<code>LDREX</code> and <code>STREX</code> on - ARM, etc.). At the moment, the IR does not provide any way to represent a - weak <code>cmpxchg</code> which would not require a loop.</p> -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-08-09 02:07:00 -0700 (Tue, 09 Aug 2011) $ -</address> - -</body> -</html> diff --git a/docs/Atomics.rst b/docs/Atomics.rst new file mode 100644 index 0000000..1bca53e --- /dev/null +++ b/docs/Atomics.rst @@ -0,0 +1,441 @@ +.. _atomics: + +============================================== +LLVM Atomic Instructions and Concurrency Guide +============================================== + +.. contents:: + :local: + +Introduction +============ + +Historically, LLVM has not had very strong support for concurrency; some minimal +intrinsics were provided, and ``volatile`` was used in some cases to achieve +rough semantics in the presence of concurrency. 
However, this is changing; +there are now new instructions which are well-defined in the presence of threads +and asynchronous signals, and the model for existing instructions has been +clarified in the IR. + +The atomic instructions are designed specifically to provide readable IR and +optimized code generation for the following: + +* The new C++0x ``<atomic>`` header. (`C++0x draft available here + <http://www.open-std.org/jtc1/sc22/wg21/>`_.) (`C1x draft available here + <http://www.open-std.org/jtc1/sc22/wg14/>`_.) + +* Proper semantics for Java-style memory, for both ``volatile`` and regular + shared variables. (`Java Specification + <http://java.sun.com/docs/books/jls/third_edition/html/memory.html>`_) + +* gcc-compatible ``__sync_*`` builtins. (`Description + <http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html>`_) + +* Other scenarios with atomic semantics, including ``static`` variables with + non-trivial constructors in C++. + +Atomic and volatile in the IR are orthogonal; "volatile" is the C/C++ volatile, +which ensures that every volatile load and store happens and is performed in the +stated order. A couple examples: if a SequentiallyConsistent store is +immediately followed by another SequentiallyConsistent store to the same +address, the first store can be erased. This transformation is not allowed for a +pair of volatile stores. On the other hand, a non-volatile non-atomic load can +be moved across a volatile load freely, but not an Acquire load. + +This document is intended to provide a guide to anyone either writing a frontend +for LLVM or working on optimization passes for LLVM with a guide for how to deal +with instructions with special semantics in the presence of concurrency. This +is not intended to be a precise guide to the semantics; the details can get +extremely complicated and unreadable, and are not usually necessary. + +.. _Optimization outside atomic: + +Optimization outside atomic +=========================== + +The basic ``'load'`` and ``'store'`` allow a variety of optimizations, but can +lead to undefined results in a concurrent environment; see `NotAtomic`_. This +section specifically goes into the one optimizer restriction which applies in +concurrent environments, which gets a bit more of an extended description +because any optimization dealing with stores needs to be aware of it. + +From the optimizer's point of view, the rule is that if there are not any +instructions with atomic ordering involved, concurrency does not matter, with +one exception: if a variable might be visible to another thread or signal +handler, a store cannot be inserted along a path where it might not execute +otherwise. Take the following example: + +.. code-block:: c + + /* C code, for readability; run through clang -O2 -S -emit-llvm to get + equivalent IR */ + int x; + void f(int* a) { + for (int i = 0; i < 100; i++) { + if (a[i]) + x += 1; + } + } + +The following is equivalent in non-concurrent situations: + +.. code-block:: c + + int x; + void f(int* a) { + int xtemp = x; + for (int i = 0; i < 100; i++) { + if (a[i]) + xtemp += 1; + } + x = xtemp; + } + +However, LLVM is not allowed to transform the former to the latter: it could +indirectly introduce undefined behavior if another thread can access ``x`` at +the same time. (This example is particularly of interest because before the +concurrency model was implemented, LLVM would perform this transformation.) 
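+
+A variant of the transformation that *is* allowed under this rule is sketched
+below, purely as an illustration (whether it is profitable is a separate
+question): the store is kept conditional, so ``x`` is only written on executions
+in which the original code would also have written it.
+
+.. code-block:: c
+
+  int x;
+  void f(int* a) {
+    int xtemp = x;
+    int wrote = 0;
+    for (int i = 0; i < 100; i++) {
+      if (a[i]) {
+        xtemp += 1;
+        wrote = 1;
+      }
+    }
+    if (wrote)
+      x = xtemp;  /* store only on paths where the original code stored */
+  }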
+ +Note that speculative loads are allowed; a load which is part of a race returns +``undef``, but does not have undefined behavior. + +Atomic instructions +=================== + +For cases where simple loads and stores are not sufficient, LLVM provides +various atomic instructions. The exact guarantees provided depend on the +ordering; see `Atomic orderings`_. + +``load atomic`` and ``store atomic`` provide the same basic functionality as +non-atomic loads and stores, but provide additional guarantees in situations +where threads and signals are involved. + +``cmpxchg`` and ``atomicrmw`` are essentially like an atomic load followed by an +atomic store (where the store is conditional for ``cmpxchg``), but no other +memory operation can happen on any thread between the load and store. Note that +LLVM's cmpxchg does not provide quite as many options as the C++0x version. + +A ``fence`` provides Acquire and/or Release ordering which is not part of +another operation; it is normally used along with Monotonic memory operations. +A Monotonic load followed by an Acquire fence is roughly equivalent to an +Acquire load. + +Frontends generating atomic instructions generally need to be aware of the +target to some degree; atomic instructions are guaranteed to be lock-free, and +therefore an instruction which is wider than the target natively supports can be +impossible to generate. + +.. _Atomic orderings: + +Atomic orderings +================ + +In order to achieve a balance between performance and necessary guarantees, +there are six levels of atomicity. They are listed in order of strength; each +level includes all the guarantees of the previous level except for +Acquire/Release. (See also `LangRef Ordering <LangRef.html#ordering>`_.) + +.. _NotAtomic: + +NotAtomic +--------- + +NotAtomic is the obvious, a load or store which is not atomic. (This isn't +really a level of atomicity, but is listed here for comparison.) This is +essentially a regular load or store. If there is a race on a given memory +location, loads from that location return undef. + +Relevant standard + This is intended to match shared variables in C/C++, and to be used in any + other context where memory access is necessary, and a race is impossible. (The + precise definition is in `LangRef Memory Model <LangRef.html#memmodel>`_.) + +Notes for frontends + The rule is essentially that all memory accessed with basic loads and stores + by multiple threads should be protected by a lock or other synchronization; + otherwise, you are likely to run into undefined behavior. If your frontend is + for a "safe" language like Java, use Unordered to load and store any shared + variable. Note that NotAtomic volatile loads and stores are not properly + atomic; do not try to use them as a substitute. (Per the C/C++ standards, + volatile does provide some limited guarantees around asynchronous signals, but + atomics are generally a better solution.) + +Notes for optimizers + Introducing loads to shared variables along a codepath where they would not + otherwise exist is allowed; introducing stores to shared variables is not. See + `Optimization outside atomic`_. + +Notes for code generation + The one interesting restriction here is that it is not allowed to write to + bytes outside of the bytes relevant to a store. This is mostly relevant to + unaligned stores: it is not allowed in general to convert an unaligned store + into two aligned stores of the same width as the unaligned store. 
Backends are + also expected to generate an i8 store as an i8 store, and not an instruction + which writes to surrounding bytes. (If you are writing a backend for an + architecture which cannot satisfy these restrictions and cares about + concurrency, please send an email to llvmdev.) + +Unordered +--------- + +Unordered is the lowest level of atomicity. It essentially guarantees that races +produce somewhat sane results instead of having undefined behavior. It also +guarantees the operation to be lock-free, so it do not depend on the data being +part of a special atomic structure or depend on a separate per-process global +lock. Note that code generation will fail for unsupported atomic operations; if +you need such an operation, use explicit locking. + +Relevant standard + This is intended to match the Java memory model for shared variables. + +Notes for frontends + This cannot be used for synchronization, but is useful for Java and other + "safe" languages which need to guarantee that the generated code never + exhibits undefined behavior. Note that this guarantee is cheap on common + platforms for loads of a native width, but can be expensive or unavailable for + wider loads, like a 64-bit store on ARM. (A frontend for Java or other "safe" + languages would normally split a 64-bit store on ARM into two 32-bit unordered + stores.) + +Notes for optimizers + In terms of the optimizer, this prohibits any transformation that transforms a + single load into multiple loads, transforms a store into multiple stores, + narrows a store, or stores a value which would not be stored otherwise. Some + examples of unsafe optimizations are narrowing an assignment into a bitfield, + rematerializing a load, and turning loads and stores into a memcpy + call. Reordering unordered operations is safe, though, and optimizers should + take advantage of that because unordered operations are common in languages + that need them. + +Notes for code generation + These operations are required to be atomic in the sense that if you use + unordered loads and unordered stores, a load cannot see a value which was + never stored. A normal load or store instruction is usually sufficient, but + note that an unordered load or store cannot be split into multiple + instructions (or an instruction which does multiple memory operations, like + ``LDRD`` on ARM). + +Monotonic +--------- + +Monotonic is the weakest level of atomicity that can be used in synchronization +primitives, although it does not provide any general synchronization. It +essentially guarantees that if you take all the operations affecting a specific +address, a consistent ordering exists. + +Relevant standard + This corresponds to the C++0x/C1x ``memory_order_relaxed``; see those + standards for the exact definition. + +Notes for frontends + If you are writing a frontend which uses this directly, use with caution. The + guarantees in terms of synchronization are very weak, so make sure these are + only used in a pattern which you know is correct. Generally, these would + either be used for atomic operations which do not protect other memory (like + an atomic counter), or along with a ``fence``. + +Notes for optimizers + In terms of the optimizer, this can be treated as a read+write on the relevant + memory location (and alias analysis will take advantage of that). In addition, + it is legal to reorder non-atomic and Unordered loads around Monotonic + loads. 
CSE/DSE and a few other optimizations are allowed, but Monotonic + operations are unlikely to be used in ways which would make those + optimizations useful. + +Notes for code generation + Code generation is essentially the same as that for unordered for loads and + stores. No fences are required. ``cmpxchg`` and ``atomicrmw`` are required + to appear as a single operation. + +Acquire +------- + +Acquire provides a barrier of the sort necessary to acquire a lock to access +other memory with normal loads and stores. + +Relevant standard + This corresponds to the C++0x/C1x ``memory_order_acquire``. It should also be + used for C++0x/C1x ``memory_order_consume``. + +Notes for frontends + If you are writing a frontend which uses this directly, use with caution. + Acquire only provides a semantic guarantee when paired with a Release + operation. + +Notes for optimizers + Optimizers not aware of atomics can treat this like a nothrow call. It is + also possible to move stores from before an Acquire load or read-modify-write + operation to after it, and move non-Acquire loads from before an Acquire + operation to after it. + +Notes for code generation + Architectures with weak memory ordering (essentially everything relevant today + except x86 and SPARC) require some sort of fence to maintain the Acquire + semantics. The precise fences required varies widely by architecture, but for + a simple implementation, most architectures provide a barrier which is strong + enough for everything (``dmb`` on ARM, ``sync`` on PowerPC, etc.). Putting + such a fence after the equivalent Monotonic operation is sufficient to + maintain Acquire semantics for a memory operation. + +Release +------- + +Release is similar to Acquire, but with a barrier of the sort necessary to +release a lock. + +Relevant standard + This corresponds to the C++0x/C1x ``memory_order_release``. + +Notes for frontends + If you are writing a frontend which uses this directly, use with caution. + Release only provides a semantic guarantee when paired with a Acquire + operation. + +Notes for optimizers + Optimizers not aware of atomics can treat this like a nothrow call. It is + also possible to move loads from after a Release store or read-modify-write + operation to before it, and move non-Release stores from after an Release + operation to before it. + +Notes for code generation + See the section on Acquire; a fence before the relevant operation is usually + sufficient for Release. Note that a store-store fence is not sufficient to + implement Release semantics; store-store fences are generally not exposed to + IR because they are extremely difficult to use correctly. + +AcquireRelease +-------------- + +AcquireRelease (``acq_rel`` in IR) provides both an Acquire and a Release +barrier (for fences and operations which both read and write memory). + +Relevant standard + This corresponds to the C++0x/C1x ``memory_order_acq_rel``. + +Notes for frontends + If you are writing a frontend which uses this directly, use with caution. + Acquire only provides a semantic guarantee when paired with a Release + operation, and vice versa. + +Notes for optimizers + In general, optimizers should treat this like a nothrow call; the possible + optimizations are usually not interesting. + +Notes for code generation + This operation has Acquire and Release semantics; see the sections on Acquire + and Release. 
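+
+As a concrete illustration of the Acquire/Release pairing described above, here
+is a minimal C++11-style sketch (assuming the frontend maps
+``memory_order_acquire``/``memory_order_release`` onto the Acquire and Release
+orderings) of the message-passing idiom these orderings are designed to support:
+
+.. code-block:: c++
+
+  #include <atomic>
+
+  int data;                        // ordinary, NotAtomic data
+  std::atomic<bool> ready(false);  // lowered to atomic loads/stores in the IR
+
+  void producer() {
+    data = 42;                                      // plain store
+    ready.store(true, std::memory_order_release);   // Release store
+  }
+
+  void consumer() {
+    while (!ready.load(std::memory_order_acquire))  // Acquire load
+      ;                                             // spin until published
+    // The Acquire load that observes 'true' pairs with the Release store,
+    // so this read of 'data' is guaranteed to see 42.
+    int observed = data;
+    (void)observed;
+  }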
+ +SequentiallyConsistent +---------------------- + +SequentiallyConsistent (``seq_cst`` in IR) provides Acquire semantics for loads +and Release semantics for stores. Additionally, it guarantees that a total +ordering exists between all SequentiallyConsistent operations. + +Relevant standard + This corresponds to the C++0x/C1x ``memory_order_seq_cst``, Java volatile, and + the gcc-compatible ``__sync_*`` builtins which do not specify otherwise. + +Notes for frontends + If a frontend is exposing atomic operations, these are much easier to reason + about for the programmer than other kinds of operations, and using them is + generally a practical performance tradeoff. + +Notes for optimizers + Optimizers not aware of atomics can treat this like a nothrow call. For + SequentiallyConsistent loads and stores, the same reorderings are allowed as + for Acquire loads and Release stores, except that SequentiallyConsistent + operations may not be reordered. + +Notes for code generation + SequentiallyConsistent loads minimally require the same barriers as Acquire + operations and SequentiallyConsistent stores require Release + barriers. Additionally, the code generator must enforce ordering between + SequentiallyConsistent stores followed by SequentiallyConsistent loads. This + is usually done by emitting either a full fence before the loads or a full + fence after the stores; which is preferred varies by architecture. + +Atomics and IR optimization +=========================== + +Predicates for optimizer writers to query: + +* ``isSimple()``: A load or store which is not volatile or atomic. This is + what, for example, memcpyopt would check for operations it might transform. + +* ``isUnordered()``: A load or store which is not volatile and at most + Unordered. This would be checked, for example, by LICM before hoisting an + operation. + +* ``mayReadFromMemory()``/``mayWriteToMemory()``: Existing predicate, but note + that they return true for any operation which is volatile or at least + Monotonic. + +* Alias analysis: Note that AA will return ModRef for anything Acquire or + Release, and for the address accessed by any Monotonic operation. + +To support optimizing around atomic operations, make sure you are using the +right predicates; everything should work if that is done. If your pass should +optimize some atomic operations (Unordered operations in particular), make sure +it doesn't replace an atomic load or store with a non-atomic operation. + +Some examples of how optimizations interact with various kinds of atomic +operations: + +* ``memcpyopt``: An atomic operation cannot be optimized into part of a + memcpy/memset, including unordered loads/stores. It can pull operations + across some atomic operations. + +* LICM: Unordered loads/stores can be moved out of a loop. It just treats + monotonic operations like a read+write to a memory location, and anything + stricter than that like a nothrow call. + +* DSE: Unordered stores can be DSE'ed like normal stores. Monotonic stores can + be DSE'ed in some cases, but it's tricky to reason about, and not especially + important. + +* Folding a load: Any atomic load from a constant global can be constant-folded, + because it cannot be observed. Similar reasoning allows scalarrepl with + atomic loads and stores. + +Atomics and Codegen +=================== + +Atomic operations are represented in the SelectionDAG with ``ATOMIC_*`` opcodes. 
+On architectures which use barrier instructions for all atomic ordering (like +ARM), appropriate fences are split out as the DAG is built. + +The MachineMemOperand for all atomic operations is currently marked as volatile; +this is not correct in the IR sense of volatile, but CodeGen handles anything +marked volatile very conservatively. This should get fixed at some point. + +Common architectures have some way of representing at least a pointer-sized +lock-free ``cmpxchg``; such an operation can be used to implement all the other +atomic operations which can be represented in IR up to that size. Backends are +expected to implement all those operations, but not operations which cannot be +implemented in a lock-free manner. It is expected that backends will give an +error when given an operation which cannot be implemented. (The LLVM code +generator is not very helpful here at the moment, but hopefully that will +change.) + +The implementation of atomics on LL/SC architectures (like ARM) is currently a +bit of a mess; there is a lot of copy-pasted code across targets, and the +representation is relatively unsuited to optimization (it would be nice to be +able to optimize loops involving cmpxchg etc.). + +On x86, all atomic loads generate a ``MOV``. SequentiallyConsistent stores +generate an ``XCHG``, other stores generate a ``MOV``. SequentiallyConsistent +fences generate an ``MFENCE``, other fences do not cause any code to be +generated. cmpxchg uses the ``LOCK CMPXCHG`` instruction. ``atomicrmw xchg`` +uses ``XCHG``, ``atomicrmw add`` and ``atomicrmw sub`` use ``XADD``, and all +other ``atomicrmw`` operations generate a loop with ``LOCK CMPXCHG``. Depending +on the users of the result, some ``atomicrmw`` operations can be translated into +operations like ``LOCK AND``, but that does not work in general. + +On ARM, MIPS, and many other RISC architectures, Acquire, Release, and +SequentiallyConsistent semantics require barrier instructions for every such +operation. Loads and stores generate normal instructions. ``cmpxchg`` and +``atomicrmw`` can be represented using a loop with LL/SC-style instructions +which take some sort of exclusive lock on a cache line (``LDREX`` and ``STREX`` +on ARM, etc.). At the moment, the IR does not provide any way to represent a +weak ``cmpxchg`` which would not require a loop. 
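The compare-and-swap loop described above has a direct source-level analogue that may make the lowering easier to picture. The following C++11 sketch is editorial (it is not taken from the LLVM sources, and ``atomic_max`` is an invented example): an ``atomicrmw``-style operation with no dedicated instruction, such as an atomic maximum, is written as exactly the kind of retry loop a backend emits around ``LOCK CMPXCHG`` or ``LDREX``/``STREX``::

    #include <atomic>

    // Retry loop: read the current value, compute the desired new value, and
    // attempt a compare-and-swap until one attempt succeeds.
    int atomic_max(std::atomic<int> &loc, int value) {
      int old = loc.load(std::memory_order_relaxed);
      while (!loc.compare_exchange_weak(old, old < value ? value : old,
                                        std::memory_order_seq_cst,
                                        std::memory_order_relaxed)) {
        // On failure 'old' is refreshed with the current contents; try again.
      }
      return old;
    }

``compare_exchange_weak`` is permitted to fail spuriously; that is the "weak" compare-and-swap which, as noted above, the IR currently has no way to express without wrapping it in a loop.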
diff --git a/docs/BitCodeFormat.html b/docs/BitCodeFormat.html deleted file mode 100644 index 9a042a0..0000000 --- a/docs/BitCodeFormat.html +++ /dev/null @@ -1,1470 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>LLVM Bitcode File Format</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> -<h1> LLVM Bitcode File Format</h1> -<ol> - <li><a href="#abstract">Abstract</a></li> - <li><a href="#overview">Overview</a></li> - <li><a href="#bitstream">Bitstream Format</a> - <ol> - <li><a href="#magic">Magic Numbers</a></li> - <li><a href="#primitives">Primitives</a></li> - <li><a href="#abbrevid">Abbreviation IDs</a></li> - <li><a href="#blocks">Blocks</a></li> - <li><a href="#datarecord">Data Records</a></li> - <li><a href="#abbreviations">Abbreviations</a></li> - <li><a href="#stdblocks">Standard Blocks</a></li> - </ol> - </li> - <li><a href="#wrapper">Bitcode Wrapper Format</a> - </li> - <li><a href="#llvmir">LLVM IR Encoding</a> - <ol> - <li><a href="#basics">Basics</a></li> - <li><a href="#MODULE_BLOCK">MODULE_BLOCK Contents</a></li> - <li><a href="#PARAMATTR_BLOCK">PARAMATTR_BLOCK Contents</a></li> - <li><a href="#TYPE_BLOCK">TYPE_BLOCK Contents</a></li> - <li><a href="#CONSTANTS_BLOCK">CONSTANTS_BLOCK Contents</a></li> - <li><a href="#FUNCTION_BLOCK">FUNCTION_BLOCK Contents</a></li> - <li><a href="#TYPE_SYMTAB_BLOCK">TYPE_SYMTAB_BLOCK Contents</a></li> - <li><a href="#VALUE_SYMTAB_BLOCK">VALUE_SYMTAB_BLOCK Contents</a></li> - <li><a href="#METADATA_BLOCK">METADATA_BLOCK Contents</a></li> - <li><a href="#METADATA_ATTACHMENT">METADATA_ATTACHMENT Contents</a></li> - </ol> - </li> -</ol> -<div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a>, - <a href="http://www.reverberate.org">Joshua Haberman</a>, - and <a href="mailto:housel@acm.org">Peter S. Housel</a>. -</p> -</div> - -<!-- *********************************************************************** --> -<h2><a name="abstract">Abstract</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>This document describes the LLVM bitstream file format and the encoding of -the LLVM IR into it.</p> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="overview">Overview</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p> -What is commonly known as the LLVM bitcode file format (also, sometimes -anachronistically known as bytecode) is actually two things: a <a -href="#bitstream">bitstream container format</a> -and an <a href="#llvmir">encoding of LLVM IR</a> into the container format.</p> - -<p> -The bitstream format is an abstract encoding of structured data, very -similar to XML in some ways. Like XML, bitstream files contain tags, and nested -structures, and you can parse the file without having to understand the tags. 
-Unlike XML, the bitstream format is a binary encoding, and unlike XML it -provides a mechanism for the file to self-describe "abbreviations", which are -effectively size optimizations for the content.</p> - -<p>LLVM IR files may be optionally embedded into a <a -href="#wrapper">wrapper</a> structure that makes it easy to embed extra data -along with LLVM IR files.</p> - -<p>This document first describes the LLVM bitstream format, describes the -wrapper format, then describes the record structure used by LLVM IR files. -</p> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="bitstream">Bitstream Format</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p> -The bitstream format is literally a stream of bits, with a very simple -structure. This structure consists of the following concepts: -</p> - -<ul> -<li>A "<a href="#magic">magic number</a>" that identifies the contents of - the stream.</li> -<li>Encoding <a href="#primitives">primitives</a> like variable bit-rate - integers.</li> -<li><a href="#blocks">Blocks</a>, which define nested content.</li> -<li><a href="#datarecord">Data Records</a>, which describe entities within the - file.</li> -<li>Abbreviations, which specify compression optimizations for the file.</li> -</ul> - -<p>Note that the <a -href="CommandGuide/html/llvm-bcanalyzer.html">llvm-bcanalyzer</a> tool can be -used to dump and inspect arbitrary bitstreams, which is very useful for -understanding the encoding.</p> - -<!-- ======================================================================= --> -<h3> - <a name="magic">Magic Numbers</a> -</h3> - -<div> - -<p>The first two bytes of a bitcode file are 'BC' (0x42, 0x43). -The second two bytes are an application-specific magic number. Generic -bitcode tools can look at only the first two bytes to verify the file is -bitcode, while application-specific programs will want to look at all four.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="primitives">Primitives</a> -</h3> - -<div> - -<p> -A bitstream literally consists of a stream of bits, which are read in order -starting with the least significant bit of each byte. The stream is made up of a -number of primitive values that encode a stream of unsigned integer values. -These integers are encoded in two ways: either as <a href="#fixedwidth">Fixed -Width Integers</a> or as <a href="#variablewidth">Variable Width -Integers</a>. -</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="fixedwidth">Fixed Width Integers</a> -</h4> - -<div> - -<p>Fixed-width integer values have their low bits emitted directly to the file. - For example, a 3-bit integer value encodes 1 as 001. Fixed width integers - are used when there are a well-known number of options for a field. For - example, boolean values are usually encoded with a 1-bit wide integer. -</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="variablewidth">Variable Width Integers</a> -</h4> - -<div> - -<p>Variable-width integer (VBR) values encode values of arbitrary size, -optimizing for the case where the values are small. Given a 4-bit VBR field, -any 3-bit value (0 through 7) is encoded directly, with the high bit set to -zero. 
Values larger than N-1 bits emit their bits in a series of N-1 bit -chunks, where all but the last set the high bit.</p> - -<p>For example, the value 27 (0x1B) is encoded as 1011 0011 when emitted as a -vbr4 value. The first set of four bits indicates the value 3 (011) with a -continuation piece (indicated by a high bit of 1). The next word indicates a -value of 24 (011 << 3) with no continuation. The sum (3+24) yields the value -27. -</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="char6">6-bit characters</a></h4> - -<div> - -<p>6-bit characters encode common characters into a fixed 6-bit field. They -represent the following characters with the following 6-bit values:</p> - -<div class="doc_code"> -<pre> -'a' .. 'z' — 0 .. 25 -'A' .. 'Z' — 26 .. 51 -'0' .. '9' — 52 .. 61 - '.' — 62 - '_' — 63 -</pre> -</div> - -<p>This encoding is only suitable for encoding characters and strings that -consist only of the above characters. It is completely incapable of encoding -characters not in the set.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="wordalign">Word Alignment</a></h4> - -<div> - -<p>Occasionally, it is useful to emit zero bits until the bitstream is a -multiple of 32 bits. This ensures that the bit position in the stream can be -represented as a multiple of 32-bit words.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="abbrevid">Abbreviation IDs</a> -</h3> - -<div> - -<p> -A bitstream is a sequential series of <a href="#blocks">Blocks</a> and -<a href="#datarecord">Data Records</a>. Both of these start with an -abbreviation ID encoded as a fixed-bitwidth field. The width is specified by -the current block, as described below. The value of the abbreviation ID -specifies either a builtin ID (which have special meanings, defined below) or -one of the abbreviation IDs defined for the current block by the stream itself. -</p> - -<p> -The set of builtin abbrev IDs is: -</p> - -<ul> -<li><tt>0 - <a href="#END_BLOCK">END_BLOCK</a></tt> — This abbrev ID marks - the end of the current block.</li> -<li><tt>1 - <a href="#ENTER_SUBBLOCK">ENTER_SUBBLOCK</a></tt> — This - abbrev ID marks the beginning of a new block.</li> -<li><tt>2 - <a href="#DEFINE_ABBREV">DEFINE_ABBREV</a></tt> — This defines - a new abbreviation.</li> -<li><tt>3 - <a href="#UNABBREV_RECORD">UNABBREV_RECORD</a></tt> — This ID - specifies the definition of an unabbreviated record.</li> -</ul> - -<p>Abbreviation IDs 4 and above are defined by the stream itself, and specify -an <a href="#abbrev_records">abbreviated record encoding</a>.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="blocks">Blocks</a> -</h3> - -<div> - -<p> -Blocks in a bitstream denote nested regions of the stream, and are identified by -a content-specific id number (for example, LLVM IR uses an ID of 12 to represent -function bodies). Block IDs 0-7 are reserved for <a href="#stdblocks">standard blocks</a> -whose meaning is defined by Bitcode; block IDs 8 and greater are -application specific. Nested blocks capture the hierarchical structure of the data -encoded in it, and various properties are associated with blocks as the file is -parsed. 
Block definitions allow the reader to efficiently skip blocks -in constant time if the reader wants a summary of blocks, or if it wants to -efficiently skip data it does not understand. The LLVM IR reader uses this -mechanism to skip function bodies, lazily reading them on demand. -</p> - -<p> -When reading and encoding the stream, several properties are maintained for the -block. In particular, each block maintains: -</p> - -<ol> -<li>A current abbrev id width. This value starts at 2 at the beginning of - the stream, and is set every time a - block record is entered. The block entry specifies the abbrev id width for - the body of the block.</li> - -<li>A set of abbreviations. Abbreviations may be defined within a block, in - which case they are only defined in that block (neither subblocks nor - enclosing blocks see the abbreviation). Abbreviations can also be defined - inside a <tt><a href="#BLOCKINFO">BLOCKINFO</a></tt> block, in which case - they are defined in all blocks that match the ID that the BLOCKINFO block is - describing. -</li> -</ol> - -<p> -As sub blocks are entered, these properties are saved and the new sub-block has -its own set of abbreviations, and its own abbrev id width. When a sub-block is -popped, the saved values are restored. -</p> - -<!-- _______________________________________________________________________ --> -<h4><a name="ENTER_SUBBLOCK">ENTER_SUBBLOCK Encoding</a></h4> - -<div> - -<p><tt>[ENTER_SUBBLOCK, blockid<sub>vbr8</sub>, newabbrevlen<sub>vbr4</sub>, - <align32bits>, blocklen<sub>32</sub>]</tt></p> - -<p> -The <tt>ENTER_SUBBLOCK</tt> abbreviation ID specifies the start of a new block -record. The <tt>blockid</tt> value is encoded as an 8-bit VBR identifier, and -indicates the type of block being entered, which can be -a <a href="#stdblocks">standard block</a> or an application-specific block. -The <tt>newabbrevlen</tt> value is a 4-bit VBR, which specifies the abbrev id -width for the sub-block. The <tt>blocklen</tt> value is a 32-bit aligned value -that specifies the size of the subblock in 32-bit words. This value allows the -reader to skip over the entire block in one jump. -</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="END_BLOCK">END_BLOCK Encoding</a></h4> - -<div> - -<p><tt>[END_BLOCK, <align32bits>]</tt></p> - -<p> -The <tt>END_BLOCK</tt> abbreviation ID specifies the end of the current block -record. Its end is aligned to 32-bits to ensure that the size of the block is -an even multiple of 32-bits. -</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="datarecord">Data Records</a> -</h3> - -<div> -<p> -Data records consist of a record code and a number of (up to) 64-bit -integer values. The interpretation of the code and values is -application specific and may vary between different block types. -Records can be encoded either using an unabbrev record, or with an -abbreviation. In the LLVM IR format, for example, there is a record -which encodes the target triple of a module. The code is -<tt>MODULE_CODE_TRIPLE</tt>, and the values of the record are the -ASCII codes for the characters in the string. 
-</p> - -<!-- _______________________________________________________________________ --> -<h4><a name="UNABBREV_RECORD">UNABBREV_RECORD Encoding</a></h4> - -<div> - -<p><tt>[UNABBREV_RECORD, code<sub>vbr6</sub>, numops<sub>vbr6</sub>, - op0<sub>vbr6</sub>, op1<sub>vbr6</sub>, ...]</tt></p> - -<p> -An <tt>UNABBREV_RECORD</tt> provides a default fallback encoding, which is both -completely general and extremely inefficient. It can describe an arbitrary -record by emitting the code and operands as VBRs. -</p> - -<p> -For example, emitting an LLVM IR target triple as an unabbreviated record -requires emitting the <tt>UNABBREV_RECORD</tt> abbrevid, a vbr6 for the -<tt>MODULE_CODE_TRIPLE</tt> code, a vbr6 for the length of the string, which is -equal to the number of operands, and a vbr6 for each character. Because there -are no letters with values less than 32, each letter would need to be emitted as -at least a two-part VBR, which means that each letter would require at least 12 -bits. This is not an efficient encoding, but it is fully general. -</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="abbrev_records">Abbreviated Record Encoding</a></h4> - -<div> - -<p><tt>[<abbrevid>, fields...]</tt></p> - -<p> -An abbreviated record is a abbreviation id followed by a set of fields that are -encoded according to the <a href="#abbreviations">abbreviation definition</a>. -This allows records to be encoded significantly more densely than records -encoded with the <tt><a href="#UNABBREV_RECORD">UNABBREV_RECORD</a></tt> type, -and allows the abbreviation types to be specified in the stream itself, which -allows the files to be completely self describing. The actual encoding of -abbreviations is defined below. -</p> - -<p>The record code, which is the first field of an abbreviated record, -may be encoded in the abbreviation definition (as a literal -operand) or supplied in the abbreviated record (as a Fixed or VBR -operand value).</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="abbreviations">Abbreviations</a> -</h3> - -<div> -<p> -Abbreviations are an important form of compression for bitstreams. The idea is -to specify a dense encoding for a class of records once, then use that encoding -to emit many records. It takes space to emit the encoding into the file, but -the space is recouped (hopefully plus some) when the records that use it are -emitted. -</p> - -<p> -Abbreviations can be determined dynamically per client, per file. Because the -abbreviations are stored in the bitstream itself, different streams of the same -format can contain different sets of abbreviations according to the needs -of the specific stream. -As a concrete example, LLVM IR files usually emit an abbreviation -for binary operators. If a specific LLVM module contained no or few binary -operators, the abbreviation does not need to be emitted. -</p> - -<!-- _______________________________________________________________________ --> -<h4><a name="DEFINE_ABBREV">DEFINE_ABBREV Encoding</a></h4> - -<div> - -<p><tt>[DEFINE_ABBREV, numabbrevops<sub>vbr5</sub>, abbrevop0, abbrevop1, - ...]</tt></p> - -<p> -A <tt>DEFINE_ABBREV</tt> record adds an abbreviation to the list of currently -defined abbreviations in the scope of this block. This definition only exists -inside this immediate block — it is not visible in subblocks or enclosing -blocks. 
Abbreviations are implicitly assigned IDs sequentially starting from 4 -(the first application-defined abbreviation ID). Any abbreviations defined in a -<tt>BLOCKINFO</tt> record for the particular block type -receive IDs first, in order, followed by any -abbreviations defined within the block itself. Abbreviated data records -reference this ID to indicate what abbreviation they are invoking. -</p> - -<p> -An abbreviation definition consists of the <tt>DEFINE_ABBREV</tt> abbrevid -followed by a VBR that specifies the number of abbrev operands, then the abbrev -operands themselves. Abbreviation operands come in three forms. They all start -with a single bit that indicates whether the abbrev operand is a literal operand -(when the bit is 1) or an encoding operand (when the bit is 0). -</p> - -<ol> -<li>Literal operands — <tt>[1<sub>1</sub>, litvalue<sub>vbr8</sub>]</tt> -— Literal operands specify that the value in the result is always a single -specific value. This specific value is emitted as a vbr8 after the bit -indicating that it is a literal operand.</li> -<li>Encoding info without data — <tt>[0<sub>1</sub>, - encoding<sub>3</sub>]</tt> — Operand encodings that do not have extra - data are just emitted as their code. -</li> -<li>Encoding info with data — <tt>[0<sub>1</sub>, encoding<sub>3</sub>, -value<sub>vbr5</sub>]</tt> — Operand encodings that do have extra data are -emitted as their code, followed by the extra data. -</li> -</ol> - -<p>The possible operand encodings are:</p> - -<ul> -<li>Fixed (code 1): The field should be emitted as - a <a href="#fixedwidth">fixed-width value</a>, whose width is specified by - the operand's extra data.</li> -<li>VBR (code 2): The field should be emitted as - a <a href="#variablewidth">variable-width value</a>, whose width is - specified by the operand's extra data.</li> -<li>Array (code 3): This field is an array of values. The array operand - has no extra data, but expects another operand to follow it, indicating - the element type of the array. When reading an array in an abbreviated - record, the first integer is a vbr6 that indicates the array length, - followed by the encoded elements of the array. An array may only occur as - the last operand of an abbreviation (except for the one final operand that - gives the array's type).</li> -<li>Char6 (code 4): This field should be emitted as - a <a href="#char6">char6-encoded value</a>. This operand type takes no - extra data. Char6 encoding is normally used as an array element type. - </li> -<li>Blob (code 5): This field is emitted as a vbr6, followed by padding to a - 32-bit boundary (for alignment) and an array of 8-bit objects. The array of - bytes is further followed by tail padding to ensure that its total length is - a multiple of 4 bytes. This makes it very efficient for the reader to - decode the data without having to make a copy of it: it can use a pointer to - the data in the mapped in file and poke directly at it. A blob may only - occur as the last operand of an abbreviation.</li> -</ul> - -<p> -For example, target triples in LLVM modules are encoded as a record of the -form <tt>[TRIPLE, 'a', 'b', 'c', 'd']</tt>. 
Consider if the bitstream emitted -the following abbrev entry: -</p> - -<div class="doc_code"> -<pre> -[0, Fixed, 4] -[0, Array] -[0, Char6] -</pre> -</div> - -<p> -When emitting a record with this abbreviation, the above entry would be emitted -as: -</p> - -<div class="doc_code"> -<p> -<tt>[4<sub>abbrevwidth</sub>, 2<sub>4</sub>, 4<sub>vbr6</sub>, 0<sub>6</sub>, -1<sub>6</sub>, 2<sub>6</sub>, 3<sub>6</sub>]</tt> -</p> -</div> - -<p>These values are:</p> - -<ol> -<li>The first value, 4, is the abbreviation ID for this abbreviation.</li> -<li>The second value, 2, is the record code for <tt>TRIPLE</tt> records within LLVM IR file <tt>MODULE_BLOCK</tt> blocks.</li> -<li>The third value, 4, is the length of the array.</li> -<li>The rest of the values are the char6 encoded values - for <tt>"abcd"</tt>.</li> -</ol> - -<p> -With this abbreviation, the triple is emitted with only 37 bits (assuming a -abbrev id width of 3). Without the abbreviation, significantly more space would -be required to emit the target triple. Also, because the <tt>TRIPLE</tt> value -is not emitted as a literal in the abbreviation, the abbreviation can also be -used for any other string value. -</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="stdblocks">Standard Blocks</a> -</h3> - -<div> - -<p> -In addition to the basic block structure and record encodings, the bitstream -also defines specific built-in block types. These block types specify how the -stream is to be decoded or other metadata. In the future, new standard blocks -may be added. Block IDs 0-7 are reserved for standard blocks. -</p> - -<!-- _______________________________________________________________________ --> -<h4><a name="BLOCKINFO">#0 - BLOCKINFO Block</a></h4> - -<div> - -<p> -The <tt>BLOCKINFO</tt> block allows the description of metadata for other -blocks. The currently specified records are: -</p> - -<div class="doc_code"> -<pre> -[SETBID (#1), blockid] -[DEFINE_ABBREV, ...] -[BLOCKNAME, ...name...] -[SETRECORDNAME, RecordID, ...name...] -</pre> -</div> - -<p> -The <tt>SETBID</tt> record (code 1) indicates which block ID is being -described. <tt>SETBID</tt> records can occur multiple times throughout the -block to change which block ID is being described. There must be -a <tt>SETBID</tt> record prior to any other records. -</p> - -<p> -Standard <tt>DEFINE_ABBREV</tt> records can occur inside <tt>BLOCKINFO</tt> -blocks, but unlike their occurrence in normal blocks, the abbreviation is -defined for blocks matching the block ID we are describing, <i>not</i> the -<tt>BLOCKINFO</tt> block itself. The abbreviations defined -in <tt>BLOCKINFO</tt> blocks receive abbreviation IDs as described -in <tt><a href="#DEFINE_ABBREV">DEFINE_ABBREV</a></tt>. -</p> - -<p>The <tt>BLOCKNAME</tt> record (code 2) can optionally occur in this block. The elements of -the record are the bytes of the string name of the block. llvm-bcanalyzer can use -this to dump out bitcode files symbolically.</p> - -<p>The <tt>SETRECORDNAME</tt> record (code 3) can also optionally occur in this block. The -first operand value is a record ID number, and the rest of the elements of the record are -the bytes for the string name of the record. llvm-bcanalyzer can use -this to dump out bitcode files symbolically.</p> - -<p> -Note that although the data in <tt>BLOCKINFO</tt> blocks is described as -"metadata," the abbreviations they contain are essential for parsing records -from the corresponding blocks. 
It is not safe to skip them. -</p> - -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="wrapper">Bitcode Wrapper Format</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p> -Bitcode files for LLVM IR may optionally be wrapped in a simple wrapper -structure. This structure contains a simple header that indicates the offset -and size of the embedded BC file. This allows additional information to be -stored alongside the BC file. The structure of this file header is: -</p> - -<div class="doc_code"> -<p> -<tt>[Magic<sub>32</sub>, Version<sub>32</sub>, Offset<sub>32</sub>, -Size<sub>32</sub>, CPUType<sub>32</sub>]</tt> -</p> -</div> - -<p> -Each of the fields are 32-bit fields stored in little endian form (as with -the rest of the bitcode file fields). The Magic number is always -<tt>0x0B17C0DE</tt> and the version is currently always <tt>0</tt>. The Offset -field is the offset in bytes to the start of the bitcode stream in the file, and -the Size field is the size in bytes of the stream. CPUType is a target-specific -value that can be used to encode the CPU of the target. -</p> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="llvmir">LLVM IR Encoding</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p> -LLVM IR is encoded into a bitstream by defining blocks and records. It uses -blocks for things like constant pools, functions, symbol tables, etc. It uses -records for things like instructions, global variable descriptors, type -descriptions, etc. This document does not describe the set of abbreviations -that the writer uses, as these are fully self-described in the file, and the -reader is not allowed to build in any knowledge of this. -</p> - -<!-- ======================================================================= --> -<h3> - <a name="basics">Basics</a> -</h3> - -<div> - -<!-- _______________________________________________________________________ --> -<h4><a name="ir_magic">LLVM IR Magic Number</a></h4> - -<div> - -<p> -The magic number for LLVM IR files is: -</p> - -<div class="doc_code"> -<p> -<tt>[0x0<sub>4</sub>, 0xC<sub>4</sub>, 0xE<sub>4</sub>, 0xD<sub>4</sub>]</tt> -</p> -</div> - -<p> -When combined with the bitcode magic number and viewed as bytes, this is -<tt>"BC 0xC0DE"</tt>. -</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="ir_signed_vbr">Signed VBRs</a></h4> - -<div> - -<p> -<a href="#variablewidth">Variable Width Integer</a> encoding is an efficient way to -encode arbitrary sized unsigned values, but is an extremely inefficient for -encoding signed values, as signed values are otherwise treated as maximally large -unsigned values. -</p> - -<p> -As such, signed VBR values of a specific width are emitted as follows: -</p> - -<ul> -<li>Positive values are emitted as VBRs of the specified width, but with their - value shifted left by one.</li> -<li>Negative values are emitted as VBRs of the specified width, but the negated - value is shifted left by one, and the low bit is set.</li> -</ul> - -<p> -With this encoding, small positive and small negative values can both -be emitted efficiently. Signed VBR encoding is used in -<tt>CST_CODE_INTEGER</tt> and <tt>CST_CODE_WIDE_INTEGER</tt> records -within <tt>CONSTANTS_BLOCK</tt> blocks. 
-</p> - -</div> - - -<!-- _______________________________________________________________________ --> -<h4><a name="ir_blocks">LLVM IR Blocks</a></h4> - -<div> - -<p> -LLVM IR is defined with the following blocks: -</p> - -<ul> -<li>8 — <a href="#MODULE_BLOCK"><tt>MODULE_BLOCK</tt></a> — This is the top-level block that - contains the entire module, and describes a variety of per-module - information.</li> -<li>9 — <a href="#PARAMATTR_BLOCK"><tt>PARAMATTR_BLOCK</tt></a> — This enumerates the parameter - attributes.</li> -<li>10 — <a href="#TYPE_BLOCK"><tt>TYPE_BLOCK</tt></a> — This describes all of the types in - the module.</li> -<li>11 — <a href="#CONSTANTS_BLOCK"><tt>CONSTANTS_BLOCK</tt></a> — This describes constants for a - module or function.</li> -<li>12 — <a href="#FUNCTION_BLOCK"><tt>FUNCTION_BLOCK</tt></a> — This describes a function - body.</li> -<li>13 — <a href="#TYPE_SYMTAB_BLOCK"><tt>TYPE_SYMTAB_BLOCK</tt></a> — This describes the type symbol - table.</li> -<li>14 — <a href="#VALUE_SYMTAB_BLOCK"><tt>VALUE_SYMTAB_BLOCK</tt></a> — This describes a value symbol - table.</li> -<li>15 — <a href="#METADATA_BLOCK"><tt>METADATA_BLOCK</tt></a> — This describes metadata items.</li> -<li>16 — <a href="#METADATA_ATTACHMENT"><tt>METADATA_ATTACHMENT</tt></a> — This contains records associating metadata with function instruction values.</li> -</ul> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="MODULE_BLOCK">MODULE_BLOCK Contents</a> -</h3> - -<div> - -<p>The <tt>MODULE_BLOCK</tt> block (id 8) is the top-level block for LLVM -bitcode files, and each bitcode file must contain exactly one. In -addition to records (described below) containing information -about the module, a <tt>MODULE_BLOCK</tt> block may contain the -following sub-blocks: -</p> - -<ul> -<li><a href="#BLOCKINFO"><tt>BLOCKINFO</tt></a></li> -<li><a href="#PARAMATTR_BLOCK"><tt>PARAMATTR_BLOCK</tt></a></li> -<li><a href="#TYPE_BLOCK"><tt>TYPE_BLOCK</tt></a></li> -<li><a href="#TYPE_SYMTAB_BLOCK"><tt>TYPE_SYMTAB_BLOCK</tt></a></li> -<li><a href="#VALUE_SYMTAB_BLOCK"><tt>VALUE_SYMTAB_BLOCK</tt></a></li> -<li><a href="#CONSTANTS_BLOCK"><tt>CONSTANTS_BLOCK</tt></a></li> -<li><a href="#FUNCTION_BLOCK"><tt>FUNCTION_BLOCK</tt></a></li> -<li><a href="#METADATA_BLOCK"><tt>METADATA_BLOCK</tt></a></li> -</ul> - -<!-- _______________________________________________________________________ --> -<h4><a name="MODULE_CODE_VERSION">MODULE_CODE_VERSION Record</a></h4> - -<div> - -<p><tt>[VERSION, version#]</tt></p> - -<p>The <tt>VERSION</tt> record (code 1) contains a single value -indicating the format version. 
Only version 0 is supported at this -time.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="MODULE_CODE_TRIPLE">MODULE_CODE_TRIPLE Record</a></h4> - -<div> -<p><tt>[TRIPLE, ...string...]</tt></p> - -<p>The <tt>TRIPLE</tt> record (code 2) contains a variable number of -values representing the bytes of the <tt>target triple</tt> -specification string.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="MODULE_CODE_DATALAYOUT">MODULE_CODE_DATALAYOUT Record</a></h4> - -<div> -<p><tt>[DATALAYOUT, ...string...]</tt></p> - -<p>The <tt>DATALAYOUT</tt> record (code 3) contains a variable number of -values representing the bytes of the <tt>target datalayout</tt> -specification string.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="MODULE_CODE_ASM">MODULE_CODE_ASM Record</a></h4> - -<div> -<p><tt>[ASM, ...string...]</tt></p> - -<p>The <tt>ASM</tt> record (code 4) contains a variable number of -values representing the bytes of <tt>module asm</tt> strings, with -individual assembly blocks separated by newline (ASCII 10) characters.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="MODULE_CODE_SECTIONNAME">MODULE_CODE_SECTIONNAME Record</a></h4> - -<div> -<p><tt>[SECTIONNAME, ...string...]</tt></p> - -<p>The <tt>SECTIONNAME</tt> record (code 5) contains a variable number -of values representing the bytes of a single section name -string. There should be one <tt>SECTIONNAME</tt> record for each -section name referenced (e.g., in global variable or function -<tt>section</tt> attributes) within the module. These records can be -referenced by the 1-based index in the <i>section</i> fields of -<tt>GLOBALVAR</tt> or <tt>FUNCTION</tt> records.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="MODULE_CODE_DEPLIB">MODULE_CODE_DEPLIB Record</a></h4> - -<div> -<p><tt>[DEPLIB, ...string...]</tt></p> - -<p>The <tt>DEPLIB</tt> record (code 6) contains a variable number of -values representing the bytes of a single dependent library name -string, one of the libraries mentioned in a <tt>deplibs</tt> -declaration. There should be one <tt>DEPLIB</tt> record for each -library name referenced.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="MODULE_CODE_GLOBALVAR">MODULE_CODE_GLOBALVAR Record</a></h4> - -<div> -<p><tt>[GLOBALVAR, pointer type, isconst, initid, linkage, alignment, section, visibility, threadlocal]</tt></p> - -<p>The <tt>GLOBALVAR</tt> record (code 7) marks the declaration or -definition of a global variable. 
The operand fields are:</p> - -<ul> -<li><i>pointer type</i>: The type index of the pointer type used to point to -this global variable</li> - -<li><i>isconst</i>: Non-zero if the variable is treated as constant within -the module, or zero if it is not</li> - -<li><i>initid</i>: If non-zero, the value index of the initializer for this -variable, plus 1.</li> - -<li><a name="linkage"><i>linkage</i></a>: An encoding of the linkage -type for this variable: - <ul> - <li><tt>external</tt>: code 0</li> - <li><tt>weak</tt>: code 1</li> - <li><tt>appending</tt>: code 2</li> - <li><tt>internal</tt>: code 3</li> - <li><tt>linkonce</tt>: code 4</li> - <li><tt>dllimport</tt>: code 5</li> - <li><tt>dllexport</tt>: code 6</li> - <li><tt>extern_weak</tt>: code 7</li> - <li><tt>common</tt>: code 8</li> - <li><tt>private</tt>: code 9</li> - <li><tt>weak_odr</tt>: code 10</li> - <li><tt>linkonce_odr</tt>: code 11</li> - <li><tt>available_externally</tt>: code 12</li> - <li><tt>linker_private</tt>: code 13</li> - </ul> -</li> - -<li><i>alignment</i>: The logarithm base 2 of the variable's requested -alignment, plus 1</li> - -<li><i>section</i>: If non-zero, the 1-based section index in the -table of <a href="#MODULE_CODE_SECTIONNAME">MODULE_CODE_SECTIONNAME</a> -entries.</li> - -<li><a name="visibility"><i>visibility</i></a>: If present, an -encoding of the visibility of this variable: - <ul> - <li><tt>default</tt>: code 0</li> - <li><tt>hidden</tt>: code 1</li> - <li><tt>protected</tt>: code 2</li> - </ul> -</li> - -<li><i>threadlocal</i>: If present and non-zero, indicates that the variable -is <tt>thread_local</tt></li> - -<li><i>unnamed_addr</i>: If present and non-zero, indicates that the variable -has <tt>unnamed_addr</tt></li> - -</ul> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="MODULE_CODE_FUNCTION">MODULE_CODE_FUNCTION Record</a></h4> - -<div> - -<p><tt>[FUNCTION, type, callingconv, isproto, linkage, paramattr, alignment, section, visibility, gc]</tt></p> - -<p>The <tt>FUNCTION</tt> record (code 8) marks the declaration or -definition of a function. 
The operand fields are:</p> - -<ul> -<li><i>type</i>: The type index of the function type describing this function</li> - -<li><i>callingconv</i>: The calling convention number: - <ul> - <li><tt>ccc</tt>: code 0</li> - <li><tt>fastcc</tt>: code 8</li> - <li><tt>coldcc</tt>: code 9</li> - <li><tt>x86_stdcallcc</tt>: code 64</li> - <li><tt>x86_fastcallcc</tt>: code 65</li> - <li><tt>arm_apcscc</tt>: code 66</li> - <li><tt>arm_aapcscc</tt>: code 67</li> - <li><tt>arm_aapcs_vfpcc</tt>: code 68</li> - </ul> -</li> - -<li><i>isproto</i>: Non-zero if this entry represents a declaration -rather than a definition</li> - -<li><i>linkage</i>: An encoding of the <a href="#linkage">linkage type</a> -for this function</li> - -<li><i>paramattr</i>: If nonzero, the 1-based parameter attribute index -into the table of <a href="#PARAMATTR_CODE_ENTRY">PARAMATTR_CODE_ENTRY</a> -entries.</li> - -<li><i>alignment</i>: The logarithm base 2 of the function's requested -alignment, plus 1</li> - -<li><i>section</i>: If non-zero, the 1-based section index in the -table of <a href="#MODULE_CODE_SECTIONNAME">MODULE_CODE_SECTIONNAME</a> -entries.</li> - -<li><i>visibility</i>: An encoding of the <a href="#visibility">visibility</a> - of this function</li> - -<li><i>gc</i>: If present and nonzero, the 1-based garbage collector -index in the table of -<a href="#MODULE_CODE_GCNAME">MODULE_CODE_GCNAME</a> entries.</li> - -<li><i>unnamed_addr</i>: If present and non-zero, indicates that the function -has <tt>unnamed_addr</tt></li> - -</ul> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="MODULE_CODE_ALIAS">MODULE_CODE_ALIAS Record</a></h4> - -<div> - -<p><tt>[ALIAS, alias type, aliasee val#, linkage, visibility]</tt></p> - -<p>The <tt>ALIAS</tt> record (code 9) marks the definition of an -alias. The operand fields are</p> - -<ul> -<li><i>alias type</i>: The type index of the alias</li> - -<li><i>aliasee val#</i>: The value index of the aliased value</li> - -<li><i>linkage</i>: An encoding of the <a href="#linkage">linkage type</a> -for this alias</li> - -<li><i>visibility</i>: If present, an encoding of the -<a href="#visibility">visibility</a> of the alias</li> - -</ul> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="MODULE_CODE_PURGEVALS">MODULE_CODE_PURGEVALS Record</a></h4> - -<div> -<p><tt>[PURGEVALS, numvals]</tt></p> - -<p>The <tt>PURGEVALS</tt> record (code 10) resets the module-level -value list to the size given by the single operand value. Module-level -value list items are added by <tt>GLOBALVAR</tt>, <tt>FUNCTION</tt>, -and <tt>ALIAS</tt> records. After a <tt>PURGEVALS</tt> record is seen, -new value indices will start from the given <i>numvals</i> value.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="MODULE_CODE_GCNAME">MODULE_CODE_GCNAME Record</a></h4> - -<div> -<p><tt>[GCNAME, ...string...]</tt></p> - -<p>The <tt>GCNAME</tt> record (code 11) contains a variable number of -values representing the bytes of a single garbage collector name -string. There should be one <tt>GCNAME</tt> record for each garbage -collector name referenced in function <tt>gc</tt> attributes within -the module. 
These records can be referenced by 1-based index in the <i>gc</i> -fields of <tt>FUNCTION</tt> records.</p> -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="PARAMATTR_BLOCK">PARAMATTR_BLOCK Contents</a> -</h3> - -<div> - -<p>The <tt>PARAMATTR_BLOCK</tt> block (id 9) contains a table of -entries describing the attributes of function parameters. These -entries are referenced by 1-based index in the <i>paramattr</i> field -of module block <a name="MODULE_CODE_FUNCTION"><tt>FUNCTION</tt></a> -records, or within the <i>attr</i> field of function block <a -href="#FUNC_CODE_INST_INVOKE"><tt>INST_INVOKE</tt></a> and <a -href="#FUNC_CODE_INST_CALL"><tt>INST_CALL</tt></a> records.</p> - -<p>Entries within <tt>PARAMATTR_BLOCK</tt> are constructed to ensure -that each is unique (i.e., no two indicies represent equivalent -attribute lists). </p> - -<!-- _______________________________________________________________________ --> -<h4><a name="PARAMATTR_CODE_ENTRY">PARAMATTR_CODE_ENTRY Record</a></h4> - -<div> - -<p><tt>[ENTRY, paramidx0, attr0, paramidx1, attr1...]</tt></p> - -<p>The <tt>ENTRY</tt> record (code 1) contains an even number of -values describing a unique set of function parameter attributes. Each -<i>paramidx</i> value indicates which set of attributes is -represented, with 0 representing the return value attributes, -0xFFFFFFFF representing function attributes, and other values -representing 1-based function parameters. Each <i>attr</i> value is a -bitmap with the following interpretation: -</p> - -<ul> -<li>bit 0: <tt>zeroext</tt></li> -<li>bit 1: <tt>signext</tt></li> -<li>bit 2: <tt>noreturn</tt></li> -<li>bit 3: <tt>inreg</tt></li> -<li>bit 4: <tt>sret</tt></li> -<li>bit 5: <tt>nounwind</tt></li> -<li>bit 6: <tt>noalias</tt></li> -<li>bit 7: <tt>byval</tt></li> -<li>bit 8: <tt>nest</tt></li> -<li>bit 9: <tt>readnone</tt></li> -<li>bit 10: <tt>readonly</tt></li> -<li>bit 11: <tt>noinline</tt></li> -<li>bit 12: <tt>alwaysinline</tt></li> -<li>bit 13: <tt>optsize</tt></li> -<li>bit 14: <tt>ssp</tt></li> -<li>bit 15: <tt>sspreq</tt></li> -<li>bits 16–31: <tt>align <var>n</var></tt></li> -<li>bit 32: <tt>nocapture</tt></li> -<li>bit 33: <tt>noredzone</tt></li> -<li>bit 34: <tt>noimplicitfloat</tt></li> -<li>bit 35: <tt>naked</tt></li> -<li>bit 36: <tt>inlinehint</tt></li> -<li>bits 37–39: <tt>alignstack <var>n</var></tt>, represented as -the logarithm base 2 of the requested alignment, plus 1</li> -</ul> -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="TYPE_BLOCK">TYPE_BLOCK Contents</a> -</h3> - -<div> - -<p>The <tt>TYPE_BLOCK</tt> block (id 10) contains records which -constitute a table of type operator entries used to represent types -referenced within an LLVM module. Each record (with the exception of -<a href="#TYPE_CODE_NUMENTRY"><tt>NUMENTRY</tt></a>) generates a -single type table entry, which may be referenced by 0-based index from -instructions, constants, metadata, type symbol table entries, or other -type operator records. -</p> - -<p>Entries within <tt>TYPE_BLOCK</tt> are constructed to ensure that -each entry is unique (i.e., no two indicies represent structurally -equivalent types). 
</p> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_NUMENTRY">TYPE_CODE_NUMENTRY Record</a></h4> - -<div> - -<p><tt>[NUMENTRY, numentries]</tt></p> - -<p>The <tt>NUMENTRY</tt> record (code 1) contains a single value which -indicates the total number of type code entries in the type table of -the module. If present, <tt>NUMENTRY</tt> should be the first record -in the block. -</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_VOID">TYPE_CODE_VOID Record</a></h4> - -<div> - -<p><tt>[VOID]</tt></p> - -<p>The <tt>VOID</tt> record (code 2) adds a <tt>void</tt> type to the -type table. -</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_FLOAT">TYPE_CODE_FLOAT Record</a></h4> - -<div> - -<p><tt>[FLOAT]</tt></p> - -<p>The <tt>FLOAT</tt> record (code 3) adds a <tt>float</tt> (32-bit -floating point) type to the type table. -</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_DOUBLE">TYPE_CODE_DOUBLE Record</a></h4> - -<div> - -<p><tt>[DOUBLE]</tt></p> - -<p>The <tt>DOUBLE</tt> record (code 4) adds a <tt>double</tt> (64-bit -floating point) type to the type table. -</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_LABEL">TYPE_CODE_LABEL Record</a></h4> - -<div> - -<p><tt>[LABEL]</tt></p> - -<p>The <tt>LABEL</tt> record (code 5) adds a <tt>label</tt> type to -the type table. -</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_OPAQUE">TYPE_CODE_OPAQUE Record</a></h4> - -<div> - -<p><tt>[OPAQUE]</tt></p> - -<p>The <tt>OPAQUE</tt> record (code 6) adds an <tt>opaque</tt> type to -the type table. Note that distinct <tt>opaque</tt> types are not -unified. -</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_INTEGER">TYPE_CODE_INTEGER Record</a></h4> - -<div> - -<p><tt>[INTEGER, width]</tt></p> - -<p>The <tt>INTEGER</tt> record (code 7) adds an integer type to the -type table. The single <i>width</i> field indicates the width of the -integer type. -</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_POINTER">TYPE_CODE_POINTER Record</a></h4> - -<div> - -<p><tt>[POINTER, pointee type, address space]</tt></p> - -<p>The <tt>POINTER</tt> record (code 8) adds a pointer type to the -type table. The operand fields are</p> - -<ul> -<li><i>pointee type</i>: The type index of the pointed-to type</li> - -<li><i>address space</i>: If supplied, the target-specific numbered -address space where the pointed-to object resides. Otherwise, the -default address space is zero. -</li> -</ul> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_FUNCTION">TYPE_CODE_FUNCTION Record</a></h4> - -<div> - -<p><tt>[FUNCTION, vararg, ignored, retty, ...paramty... ]</tt></p> - -<p>The <tt>FUNCTION</tt> record (code 9) adds a function type to the -type table. 
The operand fields are</p> - -<ul> -<li><i>vararg</i>: Non-zero if the type represents a varargs function</li> - -<li><i>ignored</i>: This value field is present for backward -compatibility only, and is ignored</li> - -<li><i>retty</i>: The type index of the function's return type</li> - -<li><i>paramty</i>: Zero or more type indices representing the -parameter types of the function</li> -</ul> - -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_STRUCT">TYPE_CODE_STRUCT Record</a></h4> - -<div> - -<p><tt>[STRUCT, ispacked, ...eltty...]</tt></p> - -<p>The <tt>STRUCT </tt> record (code 10) adds a struct type to the -type table. The operand fields are</p> - -<ul> -<li><i>ispacked</i>: Non-zero if the type represents a packed structure</li> - -<li><i>eltty</i>: Zero or more type indices representing the element -types of the structure</li> -</ul> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_ARRAY">TYPE_CODE_ARRAY Record</a></h4> - -<div> - -<p><tt>[ARRAY, numelts, eltty]</tt></p> - -<p>The <tt>ARRAY</tt> record (code 11) adds an array type to the type -table. The operand fields are</p> - -<ul> -<li><i>numelts</i>: The number of elements in arrays of this type</li> - -<li><i>eltty</i>: The type index of the array element type</li> -</ul> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_VECTOR">TYPE_CODE_VECTOR Record</a></h4> - -<div> - -<p><tt>[VECTOR, numelts, eltty]</tt></p> - -<p>The <tt>VECTOR</tt> record (code 12) adds a vector type to the type -table. The operand fields are</p> - -<ul> -<li><i>numelts</i>: The number of elements in vectors of this type</li> - -<li><i>eltty</i>: The type index of the vector element type</li> -</ul> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_X86_FP80">TYPE_CODE_X86_FP80 Record</a></h4> - -<div> - -<p><tt>[X86_FP80]</tt></p> - -<p>The <tt>X86_FP80</tt> record (code 13) adds an <tt>x86_fp80</tt> (80-bit -floating point) type to the type table. -</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_FP128">TYPE_CODE_FP128 Record</a></h4> - -<div> - -<p><tt>[FP128]</tt></p> - -<p>The <tt>FP128</tt> record (code 14) adds an <tt>fp128</tt> (128-bit -floating point) type to the type table. -</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_PPC_FP128">TYPE_CODE_PPC_FP128 Record</a></h4> - -<div> - -<p><tt>[PPC_FP128]</tt></p> - -<p>The <tt>PPC_FP128</tt> record (code 15) adds a <tt>ppc_fp128</tt> -(128-bit floating point) type to the type table. -</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4><a name="TYPE_CODE_METADATA">TYPE_CODE_METADATA Record</a></h4> - -<div> - -<p><tt>[METADATA]</tt></p> - -<p>The <tt>METADATA</tt> record (code 16) adds a <tt>metadata</tt> -type to the type table. -</p> -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="CONSTANTS_BLOCK">CONSTANTS_BLOCK Contents</a> -</h3> - -<div> - -<p>The <tt>CONSTANTS_BLOCK</tt> block (id 11) ... 
-</p> - -</div> - - -<!-- ======================================================================= --> -<h3> - <a name="FUNCTION_BLOCK">FUNCTION_BLOCK Contents</a> -</h3> - -<div> - -<p>The <tt>FUNCTION_BLOCK</tt> block (id 12) ... -</p> - -<p>In addition to the record types described below, a -<tt>FUNCTION_BLOCK</tt> block may contain the following sub-blocks: -</p> - -<ul> -<li><a href="#CONSTANTS_BLOCK"><tt>CONSTANTS_BLOCK</tt></a></li> -<li><a href="#VALUE_SYMTAB_BLOCK"><tt>VALUE_SYMTAB_BLOCK</tt></a></li> -<li><a href="#METADATA_ATTACHMENT"><tt>METADATA_ATTACHMENT</tt></a></li> -</ul> - -</div> - - -<!-- ======================================================================= --> -<h3> - <a name="TYPE_SYMTAB_BLOCK">TYPE_SYMTAB_BLOCK Contents</a> -</h3> - -<div> - -<p>The <tt>TYPE_SYMTAB_BLOCK</tt> block (id 13) contains entries which -map between module-level named types and their corresponding type -indices. -</p> - -<!-- _______________________________________________________________________ --> -<h4><a name="TST_CODE_ENTRY">TST_CODE_ENTRY Record</a></h4> - -<div> - -<p><tt>[ENTRY, typeid, ...string...]</tt></p> - -<p>The <tt>ENTRY</tt> record (code 1) contains a variable number of -values, with the first giving the type index of the designated type, -and the remaining values giving the character codes of the type -name. Each entry corresponds to a single named type. -</p> -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="VALUE_SYMTAB_BLOCK">VALUE_SYMTAB_BLOCK Contents</a> -</h3> - -<div> - -<p>The <tt>VALUE_SYMTAB_BLOCK</tt> block (id 14) ... -</p> - -</div> - - -<!-- ======================================================================= --> -<h3> - <a name="METADATA_BLOCK">METADATA_BLOCK Contents</a> -</h3> - -<div> - -<p>The <tt>METADATA_BLOCK</tt> block (id 15) ... -</p> - -</div> - - -<!-- ======================================================================= --> -<h3> - <a name="METADATA_ATTACHMENT">METADATA_ATTACHMENT Contents</a> -</h3> - -<div> - -<p>The <tt>METADATA_ATTACHMENT</tt> block (id 16) ... -</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<hr> -<address> <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> -<a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> -<a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> -Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ -</address> -</body> -</html> diff --git a/docs/BitCodeFormat.rst b/docs/BitCodeFormat.rst new file mode 100644 index 0000000..d3995e7 --- /dev/null +++ b/docs/BitCodeFormat.rst @@ -0,0 +1,1045 @@ +.. _bitcode_format: + +.. role:: raw-html(raw) + :format: html + +======================== +LLVM Bitcode File Format +======================== + +.. contents:: + :local: + +Abstract +======== + +This document describes the LLVM bitstream file format and the encoding of the +LLVM IR into it. + +Overview +======== + +What is commonly known as the LLVM bitcode file format (also, sometimes +anachronistically known as bytecode) is actually two things: a `bitstream +container format`_ and an `encoding of LLVM IR`_ into the container format. 
+ +The bitstream format is an abstract encoding of structured data, very similar to +XML in some ways. Like XML, bitstream files contain tags, and nested +structures, and you can parse the file without having to understand the tags. +Unlike XML, the bitstream format is a binary encoding, and unlike XML it +provides a mechanism for the file to self-describe "abbreviations", which are +effectively size optimizations for the content. + +LLVM IR files may be optionally embedded into a `wrapper`_ structure that makes +it easy to embed extra data along with LLVM IR files. + +This document first describes the LLVM bitstream format, describes the wrapper +format, then describes the record structure used by LLVM IR files. + +.. _bitstream container format: + +Bitstream Format +================ + +The bitstream format is literally a stream of bits, with a very simple +structure. This structure consists of the following concepts: + +* A "`magic number`_" that identifies the contents of the stream. + +* Encoding `primitives`_ like variable bit-rate integers. + +* `Blocks`_, which define nested content. + +* `Data Records`_, which describe entities within the file. + +* Abbreviations, which specify compression optimizations for the file. + +Note that the `llvm-bcanalyzer <CommandGuide/html/llvm-bcanalyzer.html>`_ tool +can be used to dump and inspect arbitrary bitstreams, which is very useful for +understanding the encoding. + +.. _magic number: + +Magic Numbers +------------- + +The first two bytes of a bitcode file are 'BC' (``0x42``, ``0x43``). The second +two bytes are an application-specific magic number. Generic bitcode tools can +look at only the first two bytes to verify the file is bitcode, while +application-specific programs will want to look at all four. + +.. _primitives: + +Primitives +---------- + +A bitstream literally consists of a stream of bits, which are read in order +starting with the least significant bit of each byte. The stream is made up of +a number of primitive values that encode a stream of unsigned integer values. +These integers are encoded in two ways: either as `Fixed Width Integers`_ or as +`Variable Width Integers`_. + +.. _Fixed Width Integers: +.. _fixed-width value: + +Fixed Width Integers +^^^^^^^^^^^^^^^^^^^^ + +Fixed-width integer values have their low bits emitted directly to the file. +For example, a 3-bit integer value encodes 1 as 001. Fixed width integers are +used when there are a well-known number of options for a field. For example, +boolean values are usually encoded with a 1-bit wide integer. + +.. _Variable Width Integers: +.. _Variable Width Integer: +.. _variable-width value: + +Variable Width Integers +^^^^^^^^^^^^^^^^^^^^^^^ + +Variable-width integer (VBR) values encode values of arbitrary size, optimizing +for the case where the values are small. Given a 4-bit VBR field, any 3-bit +value (0 through 7) is encoded directly, with the high bit set to zero. Values +larger than N-1 bits emit their bits in a series of N-1 bit chunks, where all +but the last set the high bit. + +For example, the value 27 (0x1B) is encoded as 1011 0011 when emitted as a vbr4 +value. The first set of four bits indicates the value 3 (011) with a +continuation piece (indicated by a high bit of 1). The next word indicates a +value of 24 (011 << 3) with no continuation. The sum (3+24) yields the value +27. + +.. _char6-encoded value: + +6-bit characters +^^^^^^^^^^^^^^^^ + +6-bit characters encode common characters into a fixed 6-bit field. 
They +represent the following characters with the following 6-bit values: + +:: + + 'a' .. 'z' --- 0 .. 25 + 'A' .. 'Z' --- 26 .. 51 + '0' .. '9' --- 52 .. 61 + '.' --- 62 + '_' --- 63 + +This encoding is only suitable for encoding characters and strings that consist +only of the above characters. It is completely incapable of encoding characters +not in the set. + +Word Alignment +^^^^^^^^^^^^^^ + +Occasionally, it is useful to emit zero bits until the bitstream is a multiple +of 32 bits. This ensures that the bit position in the stream can be represented +as a multiple of 32-bit words. + +Abbreviation IDs +---------------- + +A bitstream is a sequential series of `Blocks`_ and `Data Records`_. Both of +these start with an abbreviation ID encoded as a fixed-bitwidth field. The +width is specified by the current block, as described below. The value of the +abbreviation ID specifies either a builtin ID (which have special meanings, +defined below) or one of the abbreviation IDs defined for the current block by +the stream itself. + +The set of builtin abbrev IDs is: + +* 0 - `END_BLOCK`_ --- This abbrev ID marks the end of the current block. + +* 1 - `ENTER_SUBBLOCK`_ --- This abbrev ID marks the beginning of a new + block. + +* 2 - `DEFINE_ABBREV`_ --- This defines a new abbreviation. + +* 3 - `UNABBREV_RECORD`_ --- This ID specifies the definition of an + unabbreviated record. + +Abbreviation IDs 4 and above are defined by the stream itself, and specify an +`abbreviated record encoding`_. + +.. _Blocks: + +Blocks +------ + +Blocks in a bitstream denote nested regions of the stream, and are identified by +a content-specific id number (for example, LLVM IR uses an ID of 12 to represent +function bodies). Block IDs 0-7 are reserved for `standard blocks`_ whose +meaning is defined by Bitcode; block IDs 8 and greater are application +specific. Nested blocks capture the hierarchical structure of the data encoded +in it, and various properties are associated with blocks as the file is parsed. +Block definitions allow the reader to efficiently skip blocks in constant time +if the reader wants a summary of blocks, or if it wants to efficiently skip data +it does not understand. The LLVM IR reader uses this mechanism to skip function +bodies, lazily reading them on demand. + +When reading and encoding the stream, several properties are maintained for the +block. In particular, each block maintains: + +#. A current abbrev id width. This value starts at 2 at the beginning of the + stream, and is set every time a block record is entered. The block entry + specifies the abbrev id width for the body of the block. + +#. A set of abbreviations. Abbreviations may be defined within a block, in + which case they are only defined in that block (neither subblocks nor + enclosing blocks see the abbreviation). Abbreviations can also be defined + inside a `BLOCKINFO`_ block, in which case they are defined in all blocks + that match the ID that the ``BLOCKINFO`` block is describing. + +As sub blocks are entered, these properties are saved and the new sub-block has +its own set of abbreviations, and its own abbrev id width. When a sub-block is +popped, the saved values are restored. + +.. _ENTER_SUBBLOCK: + +ENTER_SUBBLOCK Encoding +^^^^^^^^^^^^^^^^^^^^^^^ + +:raw-html:`<tt>` +[ENTER_SUBBLOCK, blockid\ :sub:`vbr8`, newabbrevlen\ :sub:`vbr4`, <align32bits>, blocklen_32] +:raw-html:`</tt>` + +The ``ENTER_SUBBLOCK`` abbreviation ID specifies the start of a new block +record. 
The ``blockid`` value is encoded as an 8-bit VBR identifier, and +indicates the type of block being entered, which can be a `standard block`_ or +an application-specific block. The ``newabbrevlen`` value is a 4-bit VBR, which +specifies the abbrev id width for the sub-block. The ``blocklen`` value is a +32-bit aligned value that specifies the size of the subblock in 32-bit +words. This value allows the reader to skip over the entire block in one jump. + +.. _END_BLOCK: + +END_BLOCK Encoding +^^^^^^^^^^^^^^^^^^ + +``[END_BLOCK, <align32bits>]`` + +The ``END_BLOCK`` abbreviation ID specifies the end of the current block record. +Its end is aligned to 32-bits to ensure that the size of the block is an even +multiple of 32-bits. + +.. _Data Records: + +Data Records +------------ + +Data records consist of a record code and a number of (up to) 64-bit integer +values. The interpretation of the code and values is application specific and +may vary between different block types. Records can be encoded either using an +unabbrev record, or with an abbreviation. In the LLVM IR format, for example, +there is a record which encodes the target triple of a module. The code is +``MODULE_CODE_TRIPLE``, and the values of the record are the ASCII codes for the +characters in the string. + +.. _UNABBREV_RECORD: + +UNABBREV_RECORD Encoding +^^^^^^^^^^^^^^^^^^^^^^^^ + +:raw-html:`<tt>` +[UNABBREV_RECORD, code\ :sub:`vbr6`, numops\ :sub:`vbr6`, op0\ :sub:`vbr6`, op1\ :sub:`vbr6`, ...] +:raw-html:`</tt>` + +An ``UNABBREV_RECORD`` provides a default fallback encoding, which is both +completely general and extremely inefficient. It can describe an arbitrary +record by emitting the code and operands as VBRs. + +For example, emitting an LLVM IR target triple as an unabbreviated record +requires emitting the ``UNABBREV_RECORD`` abbrevid, a vbr6 for the +``MODULE_CODE_TRIPLE`` code, a vbr6 for the length of the string, which is equal +to the number of operands, and a vbr6 for each character. Because there are no +letters with values less than 32, each letter would need to be emitted as at +least a two-part VBR, which means that each letter would require at least 12 +bits. This is not an efficient encoding, but it is fully general. + +.. _abbreviated record encoding: + +Abbreviated Record Encoding +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[<abbrevid>, fields...]`` + +An abbreviated record is a abbreviation id followed by a set of fields that are +encoded according to the `abbreviation definition`_. This allows records to be +encoded significantly more densely than records encoded with the +`UNABBREV_RECORD`_ type, and allows the abbreviation types to be specified in +the stream itself, which allows the files to be completely self describing. The +actual encoding of abbreviations is defined below. + +The record code, which is the first field of an abbreviated record, may be +encoded in the abbreviation definition (as a literal operand) or supplied in the +abbreviated record (as a Fixed or VBR operand value). + +.. _abbreviation definition: + +Abbreviations +------------- + +Abbreviations are an important form of compression for bitstreams. The idea is +to specify a dense encoding for a class of records once, then use that encoding +to emit many records. It takes space to emit the encoding into the file, but +the space is recouped (hopefully plus some) when the records that use it are +emitted. + +Abbreviations can be determined dynamically per client, per file. 
Because the +abbreviations are stored in the bitstream itself, different streams of the same +format can contain different sets of abbreviations according to the needs of the +specific stream. As a concrete example, LLVM IR files usually emit an +abbreviation for binary operators. If a specific LLVM module contained no or +few binary operators, the abbreviation does not need to be emitted. + +.. _DEFINE_ABBREV: + +DEFINE_ABBREV Encoding +^^^^^^^^^^^^^^^^^^^^^^ + +:raw-html:`<tt>` +[DEFINE_ABBREV, numabbrevops\ :sub:`vbr5`, abbrevop0, abbrevop1, ...] +:raw-html:`</tt>` + +A ``DEFINE_ABBREV`` record adds an abbreviation to the list of currently defined +abbreviations in the scope of this block. This definition only exists inside +this immediate block --- it is not visible in subblocks or enclosing blocks. +Abbreviations are implicitly assigned IDs sequentially starting from 4 (the +first application-defined abbreviation ID). Any abbreviations defined in a +``BLOCKINFO`` record for the particular block type receive IDs first, in order, +followed by any abbreviations defined within the block itself. Abbreviated data +records reference this ID to indicate what abbreviation they are invoking. + +An abbreviation definition consists of the ``DEFINE_ABBREV`` abbrevid followed +by a VBR that specifies the number of abbrev operands, then the abbrev operands +themselves. Abbreviation operands come in three forms. They all start with a +single bit that indicates whether the abbrev operand is a literal operand (when +the bit is 1) or an encoding operand (when the bit is 0). + +#. Literal operands --- :raw-html:`<tt>` [1\ :sub:`1`, litvalue\ + :sub:`vbr8`] :raw-html:`</tt>` --- Literal operands specify that the value in + the result is always a single specific value. This specific value is emitted + as a vbr8 after the bit indicating that it is a literal operand. + +#. Encoding info without data --- :raw-html:`<tt>` [0\ :sub:`1`, encoding\ + :sub:`3`] :raw-html:`</tt>` --- Operand encodings that do not have extra data + are just emitted as their code. + +#. Encoding info with data --- :raw-html:`<tt>` [0\ :sub:`1`, encoding\ + :sub:`3`, value\ :sub:`vbr5`] :raw-html:`</tt>` --- Operand encodings that do + have extra data are emitted as their code, followed by the extra data. + +The possible operand encodings are: + +* Fixed (code 1): The field should be emitted as a `fixed-width value`_, whose + width is specified by the operand's extra data. + +* VBR (code 2): The field should be emitted as a `variable-width value`_, whose + width is specified by the operand's extra data. + +* Array (code 3): This field is an array of values. The array operand has no + extra data, but expects another operand to follow it, indicating the element + type of the array. When reading an array in an abbreviated record, the first + integer is a vbr6 that indicates the array length, followed by the encoded + elements of the array. An array may only occur as the last operand of an + abbreviation (except for the one final operand that gives the array's + type). + +* Char6 (code 4): This field should be emitted as a `char6-encoded value`_. + This operand type takes no extra data. Char6 encoding is normally used as an + array element type. + +* Blob (code 5): This field is emitted as a vbr6, followed by padding to a + 32-bit boundary (for alignment) and an array of 8-bit objects. The array of + bytes is further followed by tail padding to ensure that its total length is a + multiple of 4 bytes. 
This makes it very efficient for the reader to decode + the data without having to make a copy of it: it can use a pointer to the data + in the mapped in file and poke directly at it. A blob may only occur as the + last operand of an abbreviation. + +For example, target triples in LLVM modules are encoded as a record of the form +``[TRIPLE, 'a', 'b', 'c', 'd']``. Consider if the bitstream emitted the +following abbrev entry: + +:: + + [0, Fixed, 4] + [0, Array] + [0, Char6] + +When emitting a record with this abbreviation, the above entry would be emitted +as: + +:raw-html:`<tt><blockquote>` +[4\ :sub:`abbrevwidth`, 2\ :sub:`4`, 4\ :sub:`vbr6`, 0\ :sub:`6`, 1\ :sub:`6`, 2\ :sub:`6`, 3\ :sub:`6`] +:raw-html:`</blockquote></tt>` + +These values are: + +#. The first value, 4, is the abbreviation ID for this abbreviation. + +#. The second value, 2, is the record code for ``TRIPLE`` records within LLVM IR + file ``MODULE_BLOCK`` blocks. + +#. The third value, 4, is the length of the array. + +#. The rest of the values are the char6 encoded values for ``"abcd"``. + +With this abbreviation, the triple is emitted with only 37 bits (assuming a +abbrev id width of 3). Without the abbreviation, significantly more space would +be required to emit the target triple. Also, because the ``TRIPLE`` value is +not emitted as a literal in the abbreviation, the abbreviation can also be used +for any other string value. + +.. _standard blocks: +.. _standard block: + +Standard Blocks +--------------- + +In addition to the basic block structure and record encodings, the bitstream +also defines specific built-in block types. These block types specify how the +stream is to be decoded or other metadata. In the future, new standard blocks +may be added. Block IDs 0-7 are reserved for standard blocks. + +.. _BLOCKINFO: + +#0 - BLOCKINFO Block +^^^^^^^^^^^^^^^^^^^^ + +The ``BLOCKINFO`` block allows the description of metadata for other blocks. +The currently specified records are: + +:: + + [SETBID (#1), blockid] + [DEFINE_ABBREV, ...] + [BLOCKNAME, ...name...] + [SETRECORDNAME, RecordID, ...name...] + +The ``SETBID`` record (code 1) indicates which block ID is being described. +``SETBID`` records can occur multiple times throughout the block to change which +block ID is being described. There must be a ``SETBID`` record prior to any +other records. + +Standard ``DEFINE_ABBREV`` records can occur inside ``BLOCKINFO`` blocks, but +unlike their occurrence in normal blocks, the abbreviation is defined for blocks +matching the block ID we are describing, *not* the ``BLOCKINFO`` block +itself. The abbreviations defined in ``BLOCKINFO`` blocks receive abbreviation +IDs as described in `DEFINE_ABBREV`_. + +The ``BLOCKNAME`` record (code 2) can optionally occur in this block. The +elements of the record are the bytes of the string name of the block. +llvm-bcanalyzer can use this to dump out bitcode files symbolically. + +The ``SETRECORDNAME`` record (code 3) can also optionally occur in this block. +The first operand value is a record ID number, and the rest of the elements of +the record are the bytes for the string name of the record. llvm-bcanalyzer can +use this to dump out bitcode files symbolically. + +Note that although the data in ``BLOCKINFO`` blocks is described as "metadata," +the abbreviations they contain are essential for parsing records from the +corresponding blocks. It is not safe to skip them. + +.. 
_wrapper: + +Bitcode Wrapper Format +====================== + +Bitcode files for LLVM IR may optionally be wrapped in a simple wrapper +structure. This structure contains a simple header that indicates the offset +and size of the embedded BC file. This allows additional information to be +stored alongside the BC file. The structure of this file header is: + +:raw-html:`<tt><blockquote>` +[Magic\ :sub:`32`, Version\ :sub:`32`, Offset\ :sub:`32`, Size\ :sub:`32`, CPUType\ :sub:`32`] +:raw-html:`</blockquote></tt>` + +Each of the fields are 32-bit fields stored in little endian form (as with the +rest of the bitcode file fields). The Magic number is always ``0x0B17C0DE`` and +the version is currently always ``0``. The Offset field is the offset in bytes +to the start of the bitcode stream in the file, and the Size field is the size +in bytes of the stream. CPUType is a target-specific value that can be used to +encode the CPU of the target. + +.. _encoding of LLVM IR: + +LLVM IR Encoding +================ + +LLVM IR is encoded into a bitstream by defining blocks and records. It uses +blocks for things like constant pools, functions, symbol tables, etc. It uses +records for things like instructions, global variable descriptors, type +descriptions, etc. This document does not describe the set of abbreviations +that the writer uses, as these are fully self-described in the file, and the +reader is not allowed to build in any knowledge of this. + +Basics +------ + +LLVM IR Magic Number +^^^^^^^^^^^^^^^^^^^^ + +The magic number for LLVM IR files is: + +:raw-html:`<tt><blockquote>` +[0x0\ :sub:`4`, 0xC\ :sub:`4`, 0xE\ :sub:`4`, 0xD\ :sub:`4`] +:raw-html:`</blockquote></tt>` + +When combined with the bitcode magic number and viewed as bytes, this is +``"BC 0xC0DE"``. + +Signed VBRs +^^^^^^^^^^^ + +`Variable Width Integer`_ encoding is an efficient way to encode arbitrary sized +unsigned values, but is an extremely inefficient for encoding signed values, as +signed values are otherwise treated as maximally large unsigned values. + +As such, signed VBR values of a specific width are emitted as follows: + +* Positive values are emitted as VBRs of the specified width, but with their + value shifted left by one. + +* Negative values are emitted as VBRs of the specified width, but the negated + value is shifted left by one, and the low bit is set. + +With this encoding, small positive and small negative values can both be emitted +efficiently. Signed VBR encoding is used in ``CST_CODE_INTEGER`` and +``CST_CODE_WIDE_INTEGER`` records within ``CONSTANTS_BLOCK`` blocks. + +LLVM IR Blocks +^^^^^^^^^^^^^^ + +LLVM IR is defined with the following blocks: + +* 8 --- `MODULE_BLOCK`_ --- This is the top-level block that contains the entire + module, and describes a variety of per-module information. + +* 9 --- `PARAMATTR_BLOCK`_ --- This enumerates the parameter attributes. + +* 10 --- `TYPE_BLOCK`_ --- This describes all of the types in the module. + +* 11 --- `CONSTANTS_BLOCK`_ --- This describes constants for a module or + function. + +* 12 --- `FUNCTION_BLOCK`_ --- This describes a function body. + +* 13 --- `TYPE_SYMTAB_BLOCK`_ --- This describes the type symbol table. + +* 14 --- `VALUE_SYMTAB_BLOCK`_ --- This describes a value symbol table. + +* 15 --- `METADATA_BLOCK`_ --- This describes metadata items. + +* 16 --- `METADATA_ATTACHMENT`_ --- This contains records associating metadata + with function instruction values. + +.. 
_MODULE_BLOCK: + +MODULE_BLOCK Contents +--------------------- + +The ``MODULE_BLOCK`` block (id 8) is the top-level block for LLVM bitcode files, +and each bitcode file must contain exactly one. In addition to records +(described below) containing information about the module, a ``MODULE_BLOCK`` +block may contain the following sub-blocks: + +* `BLOCKINFO`_ +* `PARAMATTR_BLOCK`_ +* `TYPE_BLOCK`_ +* `TYPE_SYMTAB_BLOCK`_ +* `VALUE_SYMTAB_BLOCK`_ +* `CONSTANTS_BLOCK`_ +* `FUNCTION_BLOCK`_ +* `METADATA_BLOCK`_ + +MODULE_CODE_VERSION Record +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[VERSION, version#]`` + +The ``VERSION`` record (code 1) contains a single value indicating the format +version. Only version 0 is supported at this time. + +MODULE_CODE_TRIPLE Record +^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[TRIPLE, ...string...]`` + +The ``TRIPLE`` record (code 2) contains a variable number of values representing +the bytes of the ``target triple`` specification string. + +MODULE_CODE_DATALAYOUT Record +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[DATALAYOUT, ...string...]`` + +The ``DATALAYOUT`` record (code 3) contains a variable number of values +representing the bytes of the ``target datalayout`` specification string. + +MODULE_CODE_ASM Record +^^^^^^^^^^^^^^^^^^^^^^ + +``[ASM, ...string...]`` + +The ``ASM`` record (code 4) contains a variable number of values representing +the bytes of ``module asm`` strings, with individual assembly blocks separated +by newline (ASCII 10) characters. + +.. _MODULE_CODE_SECTIONNAME: + +MODULE_CODE_SECTIONNAME Record +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[SECTIONNAME, ...string...]`` + +The ``SECTIONNAME`` record (code 5) contains a variable number of values +representing the bytes of a single section name string. There should be one +``SECTIONNAME`` record for each section name referenced (e.g., in global +variable or function ``section`` attributes) within the module. These records +can be referenced by the 1-based index in the *section* fields of ``GLOBALVAR`` +or ``FUNCTION`` records. + +MODULE_CODE_DEPLIB Record +^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[DEPLIB, ...string...]`` + +The ``DEPLIB`` record (code 6) contains a variable number of values representing +the bytes of a single dependent library name string, one of the libraries +mentioned in a ``deplibs`` declaration. There should be one ``DEPLIB`` record +for each library name referenced. + +MODULE_CODE_GLOBALVAR Record +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[GLOBALVAR, pointer type, isconst, initid, linkage, alignment, section, visibility, threadlocal, unnamed_addr]`` + +The ``GLOBALVAR`` record (code 7) marks the declaration or definition of a +global variable. The operand fields are: + +* *pointer type*: The type index of the pointer type used to point to this + global variable + +* *isconst*: Non-zero if the variable is treated as constant within the module, + or zero if it is not + +* *initid*: If non-zero, the value index of the initializer for this variable, + plus 1. + +.. 
_linkage type: + +* *linkage*: An encoding of the linkage type for this variable: + * ``external``: code 0 + * ``weak``: code 1 + * ``appending``: code 2 + * ``internal``: code 3 + * ``linkonce``: code 4 + * ``dllimport``: code 5 + * ``dllexport``: code 6 + * ``extern_weak``: code 7 + * ``common``: code 8 + * ``private``: code 9 + * ``weak_odr``: code 10 + * ``linkonce_odr``: code 11 + * ``available_externally``: code 12 + * ``linker_private``: code 13 + +* alignment*: The logarithm base 2 of the variable's requested alignment, plus 1 + +* *section*: If non-zero, the 1-based section index in the table of + `MODULE_CODE_SECTIONNAME`_ entries. + +.. _visibility: + +* *visibility*: If present, an encoding of the visibility of this variable: + * ``default``: code 0 + * ``hidden``: code 1 + * ``protected``: code 2 + +* *threadlocal*: If present, an encoding of the thread local storage mode of the + variable: + * ``not thread local``: code 0 + * ``thread local; default TLS model``: code 1 + * ``localdynamic``: code 2 + * ``initialexec``: code 3 + * ``localexec``: code 4 + +* *unnamed_addr*: If present and non-zero, indicates that the variable has + ``unnamed_addr`` + +.. _FUNCTION: + +MODULE_CODE_FUNCTION Record +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[FUNCTION, type, callingconv, isproto, linkage, paramattr, alignment, section, visibility, gc]`` + +The ``FUNCTION`` record (code 8) marks the declaration or definition of a +function. The operand fields are: + +* *type*: The type index of the function type describing this function + +* *callingconv*: The calling convention number: + * ``ccc``: code 0 + * ``fastcc``: code 8 + * ``coldcc``: code 9 + * ``x86_stdcallcc``: code 64 + * ``x86_fastcallcc``: code 65 + * ``arm_apcscc``: code 66 + * ``arm_aapcscc``: code 67 + * ``arm_aapcs_vfpcc``: code 68 + +* isproto*: Non-zero if this entry represents a declaration rather than a + definition + +* *linkage*: An encoding of the `linkage type`_ for this function + +* *paramattr*: If nonzero, the 1-based parameter attribute index into the table + of `PARAMATTR_CODE_ENTRY`_ entries. + +* *alignment*: The logarithm base 2 of the function's requested alignment, plus + 1 + +* *section*: If non-zero, the 1-based section index in the table of + `MODULE_CODE_SECTIONNAME`_ entries. + +* *visibility*: An encoding of the `visibility`_ of this function + +* *gc*: If present and nonzero, the 1-based garbage collector index in the table + of `MODULE_CODE_GCNAME`_ entries. + +* *unnamed_addr*: If present and non-zero, indicates that the function has + ``unnamed_addr`` + +MODULE_CODE_ALIAS Record +^^^^^^^^^^^^^^^^^^^^^^^^ + +``[ALIAS, alias type, aliasee val#, linkage, visibility]`` + +The ``ALIAS`` record (code 9) marks the definition of an alias. The operand +fields are + +* *alias type*: The type index of the alias + +* *aliasee val#*: The value index of the aliased value + +* *linkage*: An encoding of the `linkage type`_ for this alias + +* *visibility*: If present, an encoding of the `visibility`_ of the alias + +MODULE_CODE_PURGEVALS Record +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[PURGEVALS, numvals]`` + +The ``PURGEVALS`` record (code 10) resets the module-level value list to the +size given by the single operand value. Module-level value list items are added +by ``GLOBALVAR``, ``FUNCTION``, and ``ALIAS`` records. After a ``PURGEVALS`` +record is seen, new value indices will start from the given *numvals* value. + +.. 
_MODULE_CODE_GCNAME: + +MODULE_CODE_GCNAME Record +^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[GCNAME, ...string...]`` + +The ``GCNAME`` record (code 11) contains a variable number of values +representing the bytes of a single garbage collector name string. There should +be one ``GCNAME`` record for each garbage collector name referenced in function +``gc`` attributes within the module. These records can be referenced by 1-based +index in the *gc* fields of ``FUNCTION`` records. + +.. _PARAMATTR_BLOCK: + +PARAMATTR_BLOCK Contents +------------------------ + +The ``PARAMATTR_BLOCK`` block (id 9) contains a table of entries describing the +attributes of function parameters. These entries are referenced by 1-based index +in the *paramattr* field of module block `FUNCTION`_ records, or within the +*attr* field of function block ``INST_INVOKE`` and ``INST_CALL`` records. + +Entries within ``PARAMATTR_BLOCK`` are constructed to ensure that each is unique +(i.e., no two indicies represent equivalent attribute lists). + +.. _PARAMATTR_CODE_ENTRY: + +PARAMATTR_CODE_ENTRY Record +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[ENTRY, paramidx0, attr0, paramidx1, attr1...]`` + +The ``ENTRY`` record (code 1) contains an even number of values describing a +unique set of function parameter attributes. Each *paramidx* value indicates +which set of attributes is represented, with 0 representing the return value +attributes, 0xFFFFFFFF representing function attributes, and other values +representing 1-based function parameters. Each *attr* value is a bitmap with the +following interpretation: + +* bit 0: ``zeroext`` +* bit 1: ``signext`` +* bit 2: ``noreturn`` +* bit 3: ``inreg`` +* bit 4: ``sret`` +* bit 5: ``nounwind`` +* bit 6: ``noalias`` +* bit 7: ``byval`` +* bit 8: ``nest`` +* bit 9: ``readnone`` +* bit 10: ``readonly`` +* bit 11: ``noinline`` +* bit 12: ``alwaysinline`` +* bit 13: ``optsize`` +* bit 14: ``ssp`` +* bit 15: ``sspreq`` +* bits 16-31: ``align n`` +* bit 32: ``nocapture`` +* bit 33: ``noredzone`` +* bit 34: ``noimplicitfloat`` +* bit 35: ``naked`` +* bit 36: ``inlinehint`` +* bits 37-39: ``alignstack n``, represented as the logarithm + base 2 of the requested alignment, plus 1 + +.. _TYPE_BLOCK: + +TYPE_BLOCK Contents +------------------- + +The ``TYPE_BLOCK`` block (id 10) contains records which constitute a table of +type operator entries used to represent types referenced within an LLVM +module. Each record (with the exception of `NUMENTRY`_) generates a single type +table entry, which may be referenced by 0-based index from instructions, +constants, metadata, type symbol table entries, or other type operator records. + +Entries within ``TYPE_BLOCK`` are constructed to ensure that each entry is +unique (i.e., no two indicies represent structurally equivalent types). + +.. _TYPE_CODE_NUMENTRY: +.. _NUMENTRY: + +TYPE_CODE_NUMENTRY Record +^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[NUMENTRY, numentries]`` + +The ``NUMENTRY`` record (code 1) contains a single value which indicates the +total number of type code entries in the type table of the module. If present, +``NUMENTRY`` should be the first record in the block. + +TYPE_CODE_VOID Record +^^^^^^^^^^^^^^^^^^^^^ + +``[VOID]`` + +The ``VOID`` record (code 2) adds a ``void`` type to the type table. + +TYPE_CODE_HALF Record +^^^^^^^^^^^^^^^^^^^^^ + +``[HALF]`` + +The ``HALF`` record (code 10) adds a ``half`` (16-bit floating point) type to +the type table. 
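As a purely illustrative example (hypothetical, not taken from an actual bitcode
writer): a module whose only types were ``void`` and ``half`` could encode its
type table as

::

  [NUMENTRY, 2]
  [VOID]
  [HALF]

after which ``void`` would be referred to as type index 0 and ``half`` as type
index 1 by later records in the file.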
+ +TYPE_CODE_FLOAT Record +^^^^^^^^^^^^^^^^^^^^^^ + +``[FLOAT]`` + +The ``FLOAT`` record (code 3) adds a ``float`` (32-bit floating point) type to +the type table. + +TYPE_CODE_DOUBLE Record +^^^^^^^^^^^^^^^^^^^^^^^ + +``[DOUBLE]`` + +The ``DOUBLE`` record (code 4) adds a ``double`` (64-bit floating point) type to +the type table. + +TYPE_CODE_LABEL Record +^^^^^^^^^^^^^^^^^^^^^^ + +``[LABEL]`` + +The ``LABEL`` record (code 5) adds a ``label`` type to the type table. + +TYPE_CODE_OPAQUE Record +^^^^^^^^^^^^^^^^^^^^^^^ + +``[OPAQUE]`` + +The ``OPAQUE`` record (code 6) adds an ``opaque`` type to the type table. Note +that distinct ``opaque`` types are not unified. + +TYPE_CODE_INTEGER Record +^^^^^^^^^^^^^^^^^^^^^^^^ + +``[INTEGER, width]`` + +The ``INTEGER`` record (code 7) adds an integer type to the type table. The +single *width* field indicates the width of the integer type. + +TYPE_CODE_POINTER Record +^^^^^^^^^^^^^^^^^^^^^^^^ + +``[POINTER, pointee type, address space]`` + +The ``POINTER`` record (code 8) adds a pointer type to the type table. The +operand fields are + +* *pointee type*: The type index of the pointed-to type + +* *address space*: If supplied, the target-specific numbered address space where + the pointed-to object resides. Otherwise, the default address space is zero. + +TYPE_CODE_FUNCTION Record +^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[FUNCTION, vararg, ignored, retty, ...paramty... ]`` + +The ``FUNCTION`` record (code 9) adds a function type to the type table. The +operand fields are + +* *vararg*: Non-zero if the type represents a varargs function + +* *ignored*: This value field is present for backward compatibility only, and is + ignored + +* *retty*: The type index of the function's return type + +* *paramty*: Zero or more type indices representing the parameter types of the + function + +TYPE_CODE_STRUCT Record +^^^^^^^^^^^^^^^^^^^^^^^ + +``[STRUCT, ispacked, ...eltty...]`` + +The ``STRUCT`` record (code 10) adds a struct type to the type table. The +operand fields are + +* *ispacked*: Non-zero if the type represents a packed structure + +* *eltty*: Zero or more type indices representing the element types of the + structure + +TYPE_CODE_ARRAY Record +^^^^^^^^^^^^^^^^^^^^^^ + +``[ARRAY, numelts, eltty]`` + +The ``ARRAY`` record (code 11) adds an array type to the type table. The +operand fields are + +* *numelts*: The number of elements in arrays of this type + +* *eltty*: The type index of the array element type + +TYPE_CODE_VECTOR Record +^^^^^^^^^^^^^^^^^^^^^^^ + +``[VECTOR, numelts, eltty]`` + +The ``VECTOR`` record (code 12) adds a vector type to the type table. The +operand fields are + +* *numelts*: The number of elements in vectors of this type + +* *eltty*: The type index of the vector element type + +TYPE_CODE_X86_FP80 Record +^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[X86_FP80]`` + +The ``X86_FP80`` record (code 13) adds an ``x86_fp80`` (80-bit floating point) +type to the type table. + +TYPE_CODE_FP128 Record +^^^^^^^^^^^^^^^^^^^^^^ + +``[FP128]`` + +The ``FP128`` record (code 14) adds an ``fp128`` (128-bit floating point) type +to the type table. + +TYPE_CODE_PPC_FP128 Record +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[PPC_FP128]`` + +The ``PPC_FP128`` record (code 15) adds a ``ppc_fp128`` (128-bit floating point) +type to the type table. + +TYPE_CODE_METADATA Record +^^^^^^^^^^^^^^^^^^^^^^^^^ + +``[METADATA]`` + +The ``METADATA`` record (code 16) adds a ``metadata`` type to the type table. + +.. 
_CONSTANTS_BLOCK: + +CONSTANTS_BLOCK Contents +------------------------ + +The ``CONSTANTS_BLOCK`` block (id 11) ... + +.. _FUNCTION_BLOCK: + +FUNCTION_BLOCK Contents +----------------------- + +The ``FUNCTION_BLOCK`` block (id 12) ... + +In addition to the record types described below, a ``FUNCTION_BLOCK`` block may +contain the following sub-blocks: + +* `CONSTANTS_BLOCK`_ +* `VALUE_SYMTAB_BLOCK`_ +* `METADATA_ATTACHMENT`_ + +.. _TYPE_SYMTAB_BLOCK: + +TYPE_SYMTAB_BLOCK Contents +-------------------------- + +The ``TYPE_SYMTAB_BLOCK`` block (id 13) contains entries which map between +module-level named types and their corresponding type indices. + +.. _TST_CODE_ENTRY: + +TST_CODE_ENTRY Record +^^^^^^^^^^^^^^^^^^^^^ + +``[ENTRY, typeid, ...string...]`` + +The ``ENTRY`` record (code 1) contains a variable number of values, with the +first giving the type index of the designated type, and the remaining values +giving the character codes of the type name. Each entry corresponds to a single +named type. + +.. _VALUE_SYMTAB_BLOCK: + +VALUE_SYMTAB_BLOCK Contents +--------------------------- + +The ``VALUE_SYMTAB_BLOCK`` block (id 14) ... + +.. _METADATA_BLOCK: + +METADATA_BLOCK Contents +----------------------- + +The ``METADATA_BLOCK`` block (id 15) ... + +.. _METADATA_ATTACHMENT: + +METADATA_ATTACHMENT Contents +---------------------------- + +The ``METADATA_ATTACHMENT`` block (id 16) ... diff --git a/docs/BranchWeightMetadata.html b/docs/BranchWeightMetadata.html deleted file mode 100644 index 38b87ba..0000000 --- a/docs/BranchWeightMetadata.html +++ /dev/null @@ -1,164 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>LLVM Branch Weight Metadata</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> - -<h1> - LLVM Branch Weight Metadata -</h1> - -<ol> - <li><a href="#introduction">Introduction</a></li> - <li><a href="#supported_instructions">Supported Instructions</a></li> - <li><a href="#builtin_expect">Built-in "expect" Instruction </a></li> - <li><a href="#cfg_modifications">CFG Modifications</a></li> -</ol> - -<div class="doc_author"> - <p>Written by <a href="mailto:jstaszak@apple.com">Jakub Staszak</a></p> -</div> - -<h2> - <a name="introduction">Introduction</a> -</h2> -<div> -<p>Branch Weight Metadata represents branch weights as its likeliness to -be taken. Metadata is assigned to the <tt>TerminatorInst</tt> as a -<tt>MDNode</tt> of the <tt>MD_prof</tt> kind. The first operator is always a -<tt>MDString</tt> node with the string "branch_weights". Number of operators -depends on the terminator type.</p> - -<p>Branch weights might be fetch from the profiling file, or generated based on -<a href="#builtin_expect"><tt>__builtin_expect</tt></a> instruction. -</p> - -<p>All weights are represented as an unsigned 32-bit values, where higher value -indicates greater chance to be taken.</p> -</div> - -<h2> - <a name="supported_instructions">Supported Instructions</a> -</h2> - -<div> - <h4>BranchInst</h4> - <div> - <p>Metadata is only assign to the conditional branches. 
There are two extra - operarands, for the true and the false branch.</p> - </div> - <div class="doc_code"> - <pre> -!0 = metadata !{ - metadata !"branch_weights", - i32 <TRUE_BRANCH_WEIGHT>, - i32 <FALSE_BRANCH_WEIGHT> -} - </pre> - </div> - - <h4>SwitchInst</h4> - <div> - <p>Branch weights are assign to every case (including <tt>default</tt> case - which is always case #0).</p> - </div> - <div class="doc_code"> - <pre> -!0 = metadata !{ - metadata !"branch_weights", - i32 <DEFAULT_BRANCH_WEIGHT> - [ , i32 <CASE_BRANCH_WEIGHT> ... ] -} - </pre> - </div> - - <h4>IndirectBrInst</h4> - <div> - <p>Branch weights are assign to every destination.</p> - </div> - <div class="doc_code"> - <pre> -!0 = metadata !{ - metadata !"branch_weights", - i32 <LABEL_BRANCH_WEIGHT> - [ , i32 <LABEL_BRANCH_WEIGHT> ... ] -} - </pre> - </div> - - <h4>Other</h4> - <div> - <p>Other terminator instructions are not allowed to contain Branch Weight - Metadata.</p> - </div> -</div> - -<h2> - <a name="builtin_expect">Built-in "expect" Instructions</a> -</h2> -<div> - <p><tt>__builtin_expect(long exp, long c)</tt> instruction provides branch - prediction information. The return value is the value of <tt>exp</tt>.</p> - - <p>It is especially useful in conditional statements. Currently Clang supports - two conditional statements: - </p> - <h4><tt>if</tt> statement</h4> - <div> - <p>The <tt>exp</tt> parameter is the condition. The <tt>c</tt> parameter is - the expected comparision value. If it is equal to 1 (true), the condition is - likely to be true, in other case condition is likely to be false. For example: - </p> - </div> - <div class="doc_code"> - <pre> - if (__builtin_expect(x > 0, 1)) { - // This block is likely to be taken. - } - </pre> - </div> - - <h4><tt>switch</tt> statement</h4> - <div> - <p>The <tt>exp</tt> parameter is the value. The <tt>c</tt> parameter is the - expected value. If the expected value doesn't show on the cases list, the - <tt>default</tt> case is assumed to be likely taken.</p> - </div> - <div class="doc_code"> - <pre> - switch (__builtin_expect(x, 5)) { - default: break; - case 0: // ... - case 3: // ... - case 5: // This case is likely to be taken. - } - </pre> - </div> -</div> - -<h2> - <a name="cfg_modifications">CFG Modifications</a> -</h2> -<div> -<p>Branch Weight Metatada is not proof against CFG changes. If terminator -operands' are changed some action should be taken. In other case some -misoptimizations may occur due to incorrent branch prediction information.</p> -</div> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:jstaszak@apple.com">Jakub Staszak</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> -</address> - -</body> -</html> diff --git a/docs/BranchWeightMetadata.rst b/docs/BranchWeightMetadata.rst new file mode 100644 index 0000000..f0df971 --- /dev/null +++ b/docs/BranchWeightMetadata.rst @@ -0,0 +1,118 @@ +.. _branch_weight: + +=========================== +LLVM Branch Weight Metadata +=========================== + +.. contents:: + :local: + +Introduction +============ + +Branch Weight Metadata represents branch weights as its likeliness to be +taken. Metadata is assigned to the ``TerminatorInst`` as a ``MDNode`` of the +``MD_prof`` kind. 
The first operator is always a ``MDString`` node with the
+string "branch_weights". The number of operators depends on the terminator
+type.
+
+Branch weights might be fetched from the profiling file, or generated based on
+the `__builtin_expect`_ instruction.
+
+All weights are represented as unsigned 32-bit values, where a higher value
+indicates a greater chance of the branch being taken.
+
+Supported Instructions
+======================
+
+``BranchInst``
+^^^^^^^^^^^^^^
+
+Metadata is only assigned to conditional branches. There are two extra
+operands, for the true and the false branch.
+
+.. code-block:: llvm
+
+  !0 = metadata !{
+    metadata !"branch_weights",
+    i32 <TRUE_BRANCH_WEIGHT>,
+    i32 <FALSE_BRANCH_WEIGHT>
+  }
+
+``SwitchInst``
+^^^^^^^^^^^^^^
+
+Branch weights are assigned to every case (including the ``default`` case,
+which is always case #0).
+
+.. code-block:: llvm
+
+  !0 = metadata !{
+    metadata !"branch_weights",
+    i32 <DEFAULT_BRANCH_WEIGHT>
+    [ , i32 <CASE_BRANCH_WEIGHT> ... ]
+  }
+
+``IndirectBrInst``
+^^^^^^^^^^^^^^^^^^
+
+Branch weights are assigned to every destination.
+
+.. code-block:: llvm
+
+  !0 = metadata !{
+    metadata !"branch_weights",
+    i32 <LABEL_BRANCH_WEIGHT>
+    [ , i32 <LABEL_BRANCH_WEIGHT> ... ]
+  }
+
+Other
+^^^^^
+
+Other terminator instructions are not allowed to contain Branch Weight
+Metadata.
+
+.. _\__builtin_expect:
+
+Built-in ``expect`` Instructions
+================================
+
+The ``__builtin_expect(long exp, long c)`` instruction provides branch
+prediction information. The return value is the value of ``exp``.
+
+It is especially useful in conditional statements. Currently Clang supports two
+conditional statements:
+
+``if`` statement
+^^^^^^^^^^^^^^^^
+
+The ``exp`` parameter is the condition. The ``c`` parameter is the expected
+comparison value. If it is equal to 1 (true), the condition is likely to be
+true; otherwise, the condition is likely to be false. For example:
+
+.. code-block:: c++
+
+  if (__builtin_expect(x > 0, 1)) {
+    // This block is likely to be taken.
+  }
+
+``switch`` statement
+^^^^^^^^^^^^^^^^^^^^
+
+The ``exp`` parameter is the value. The ``c`` parameter is the expected
+value. If the expected value does not appear in the case list, the ``default``
+case is assumed to be the one likely taken.
+
+.. code-block:: c++
+
+  switch (__builtin_expect(x, 5)) {
+    default: break;
+    case 0:  // ...
+    case 3:  // ...
+    case 5:  // This case is likely to be taken.
+  }
+
+CFG Modifications
+=================
+
+Branch Weight Metadata is not proof against CFG changes. If a terminator's
+operands are changed, some action should be taken; otherwise, misoptimizations
+may occur due to incorrect branch prediction information.
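To make the caveat above concrete, a transformation that rewrites a terminator
and still wants prediction information to survive has to re-attach the
``MD_prof`` node itself. The following is only a rough sketch, assuming the
in-tree C++ API of this release (``MDString::get``, ``MDNode::get``,
``Instruction::setMetadata``, and the ``LLVMContext::MD_prof`` kind ID); the
function name and the 64:4 weights are made up for illustration:

.. code-block:: c++

  #include "llvm/Constants.h"
  #include "llvm/Instructions.h"
  #include "llvm/LLVMContext.h"
  #include "llvm/Metadata.h"
  #include "llvm/Type.h"

  // Sketch: attach "branch_weights" metadata to a conditional branch,
  // marking the true successor as much more likely than the false one.
  static void annotateBranch(llvm::BranchInst *BI) {
    if (!BI->isConditional())
      return;                                        // weights only apply to conditional branches
    llvm::LLVMContext &Ctx = BI->getContext();
    llvm::Type *Int32Ty = llvm::Type::getInt32Ty(Ctx);
    llvm::Value *Ops[] = {
      llvm::MDString::get(Ctx, "branch_weights"),    // first operand: the tag string
      llvm::ConstantInt::get(Int32Ty, 64),           // weight of the true edge (arbitrary)
      llvm::ConstantInt::get(Int32Ty, 4)             // weight of the false edge (arbitrary)
    };
    BI->setMetadata(llvm::LLVMContext::MD_prof, llvm::MDNode::get(Ctx, Ops));
  }

A pass that changes the number or order of a terminator's successors without
rebuilding such a node leaves behind exactly the kind of stale weights this
section warns about.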
diff --git a/docs/Bugpoint.html b/docs/Bugpoint.html deleted file mode 100644 index d9cce0b..0000000 --- a/docs/Bugpoint.html +++ /dev/null @@ -1,239 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>LLVM bugpoint tool: design and usage</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> - -<h1> - LLVM bugpoint tool: design and usage -</h1> - -<ul> - <li><a href="#desc">Description</a></li> - <li><a href="#design">Design Philosophy</a> - <ul> - <li><a href="#autoselect">Automatic Debugger Selection</a></li> - <li><a href="#crashdebug">Crash debugger</a></li> - <li><a href="#codegendebug">Code generator debugger</a></li> - <li><a href="#miscompilationdebug">Miscompilation debugger</a></li> - </ul></li> - <li><a href="#advice">Advice for using <tt>bugpoint</tt></a></li> -</ul> - -<div class="doc_author"> -<p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> -</div> - -<!-- *********************************************************************** --> -<h2> -<a name="desc">Description</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p><tt>bugpoint</tt> narrows down the source of problems in LLVM tools and -passes. It can be used to debug three types of failures: optimizer crashes, -miscompilations by optimizers, or bad native code generation (including problems -in the static and JIT compilers). It aims to reduce large test cases to small, -useful ones. For example, if <tt>opt</tt> crashes while optimizing a -file, it will identify the optimization (or combination of optimizations) that -causes the crash, and reduce the file down to a small example which triggers the -crash.</p> - -<p>For detailed case scenarios, such as debugging <tt>opt</tt>, -<tt>llvm-ld</tt>, or one of the LLVM code generators, see <a -href="HowToSubmitABug.html">How To Submit a Bug Report document</a>.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> -<a name="design">Design Philosophy</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p><tt>bugpoint</tt> is designed to be a useful tool without requiring any -hooks into the LLVM infrastructure at all. It works with any and all LLVM -passes and code generators, and does not need to "know" how they work. Because -of this, it may appear to do stupid things or miss obvious -simplifications. <tt>bugpoint</tt> is also designed to trade off programmer -time for computer time in the compiler-debugging process; consequently, it may -take a long period of (unattended) time to reduce a test case, but we feel it -is still worth it. Note that <tt>bugpoint</tt> is generally very quick unless -debugging a miscompilation where each test of the program (which requires -executing it) takes a long time.</p> - -<!-- ======================================================================= --> -<h3> - <a name="autoselect">Automatic Debugger Selection</a> -</h3> - -<div> - -<p><tt>bugpoint</tt> reads each <tt>.bc</tt> or <tt>.ll</tt> file specified on -the command line and links them together into a single module, called the test -program. If any LLVM passes are specified on the command line, it runs these -passes on the test program. 
If any of the passes crash, or if they produce -malformed output (which causes the verifier to abort), <tt>bugpoint</tt> starts -the <a href="#crashdebug">crash debugger</a>.</p> - -<p>Otherwise, if the <tt>-output</tt> option was not specified, -<tt>bugpoint</tt> runs the test program with the C backend (which is assumed to -generate good code) to generate a reference output. Once <tt>bugpoint</tt> has -a reference output for the test program, it tries executing it with the -selected code generator. If the selected code generator crashes, -<tt>bugpoint</tt> starts the <a href="#crashdebug">crash debugger</a> on the -code generator. Otherwise, if the resulting output differs from the reference -output, it assumes the difference resulted from a code generator failure, and -starts the <a href="#codegendebug">code generator debugger</a>.</p> - -<p>Finally, if the output of the selected code generator matches the reference -output, <tt>bugpoint</tt> runs the test program after all of the LLVM passes -have been applied to it. If its output differs from the reference output, it -assumes the difference resulted from a failure in one of the LLVM passes, and -enters the <a href="#miscompilationdebug">miscompilation debugger</a>. -Otherwise, there is no problem <tt>bugpoint</tt> can debug.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="crashdebug">Crash debugger</a> -</h3> - -<div> - -<p>If an optimizer or code generator crashes, <tt>bugpoint</tt> will try as hard -as it can to reduce the list of passes (for optimizer crashes) and the size of -the test program. First, <tt>bugpoint</tt> figures out which combination of -optimizer passes triggers the bug. This is useful when debugging a problem -exposed by <tt>opt</tt>, for example, because it runs over 38 passes.</p> - -<p>Next, <tt>bugpoint</tt> tries removing functions from the test program, to -reduce its size. Usually it is able to reduce a test program to a single -function, when debugging intraprocedural optimizations. Once the number of -functions has been reduced, it attempts to delete various edges in the control -flow graph, to reduce the size of the function as much as possible. Finally, -<tt>bugpoint</tt> deletes any individual LLVM instructions whose absence does -not eliminate the failure. At the end, <tt>bugpoint</tt> should tell you what -passes crash, give you a bitcode file, and give you instructions on how to -reproduce the failure with <tt>opt</tt> or <tt>llc</tt>.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="codegendebug">Code generator debugger</a> -</h3> - -<div> - -<p>The code generator debugger attempts to narrow down the amount of code that -is being miscompiled by the selected code generator. To do this, it takes the -test program and partitions it into two pieces: one piece which it compiles -with the C backend (into a shared object), and one piece which it runs with -either the JIT or the static LLC compiler. It uses several techniques to -reduce the amount of code pushed through the LLVM code generator, to reduce the -potential scope of the problem. After it is finished, it emits two bitcode -files (called "test" [to be compiled with the code generator] and "safe" [to be -compiled with the C backend], respectively), and instructions for reproducing -the problem. 
The code generator debugger assumes that the C backend produces -good code.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="miscompilationdebug">Miscompilation debugger</a> -</h3> - -<div> - -<p>The miscompilation debugger works similarly to the code generator debugger. -It works by splitting the test program into two pieces, running the -optimizations specified on one piece, linking the two pieces back together, and -then executing the result. It attempts to narrow down the list of passes to -the one (or few) which are causing the miscompilation, then reduce the portion -of the test program which is being miscompiled. The miscompilation debugger -assumes that the selected code generator is working properly.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="advice">Advice for using bugpoint</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<tt>bugpoint</tt> can be a remarkably useful tool, but it sometimes works in -non-obvious ways. Here are some hints and tips:<p> - -<ol> -<li>In the code generator and miscompilation debuggers, <tt>bugpoint</tt> only - works with programs that have deterministic output. Thus, if the program - outputs <tt>argv[0]</tt>, the date, time, or any other "random" data, - <tt>bugpoint</tt> may misinterpret differences in these data, when output, - as the result of a miscompilation. Programs should be temporarily modified - to disable outputs that are likely to vary from run to run. - -<li>In the code generator and miscompilation debuggers, debugging will go - faster if you manually modify the program or its inputs to reduce the - runtime, but still exhibit the problem. - -<li><tt>bugpoint</tt> is extremely useful when working on a new optimization: - it helps track down regressions quickly. To avoid having to relink - <tt>bugpoint</tt> every time you change your optimization however, have - <tt>bugpoint</tt> dynamically load your optimization with the - <tt>-load</tt> option. - -<li><p><tt>bugpoint</tt> can generate a lot of output and run for a long period - of time. It is often useful to capture the output of the program to file. - For example, in the C shell, you can run:</p> - -<div class="doc_code"> -<p><tt>bugpoint ... |& tee bugpoint.log</tt></p> -</div> - - <p>to get a copy of <tt>bugpoint</tt>'s output in the file - <tt>bugpoint.log</tt>, as well as on your terminal.</p> - -<li><tt>bugpoint</tt> cannot debug problems with the LLVM linker. If - <tt>bugpoint</tt> crashes before you see its "All input ok" message, - you might try <tt>llvm-link -v</tt> on the same set of input files. If - that also crashes, you may be experiencing a linker bug. - -<li><tt>bugpoint</tt> is useful for proactively finding bugs in LLVM. - Invoking <tt>bugpoint</tt> with the <tt>-find-bugs</tt> option will cause - the list of specified optimizations to be randomized and applied to the - program. This process will repeat until a bug is found or the user - kills <tt>bugpoint</tt>. 
-</ol> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-31 12:21:59 +0100 (Mon, 31 Oct 2011) $ -</address> - -</body> -</html> diff --git a/docs/Bugpoint.rst b/docs/Bugpoint.rst new file mode 100644 index 0000000..9ccf0cc --- /dev/null +++ b/docs/Bugpoint.rst @@ -0,0 +1,218 @@ +.. _bugpoint: + +==================================== +LLVM bugpoint tool: design and usage +==================================== + +.. contents:: + :local: + +Description +=========== + +``bugpoint`` narrows down the source of problems in LLVM tools and passes. It +can be used to debug three types of failures: optimizer crashes, miscompilations +by optimizers, or bad native code generation (including problems in the static +and JIT compilers). It aims to reduce large test cases to small, useful ones. +For example, if ``opt`` crashes while optimizing a file, it will identify the +optimization (or combination of optimizations) that causes the crash, and reduce +the file down to a small example which triggers the crash. + +For detailed case scenarios, such as debugging ``opt``, or one of the LLVM code +generators, see `How To Submit a Bug Report document <HowToSubmitABug.html>`_. + +Design Philosophy +================= + +``bugpoint`` is designed to be a useful tool without requiring any hooks into +the LLVM infrastructure at all. It works with any and all LLVM passes and code +generators, and does not need to "know" how they work. Because of this, it may +appear to do stupid things or miss obvious simplifications. ``bugpoint`` is +also designed to trade off programmer time for computer time in the +compiler-debugging process; consequently, it may take a long period of +(unattended) time to reduce a test case, but we feel it is still worth it. Note +that ``bugpoint`` is generally very quick unless debugging a miscompilation +where each test of the program (which requires executing it) takes a long time. + +Automatic Debugger Selection +---------------------------- + +``bugpoint`` reads each ``.bc`` or ``.ll`` file specified on the command line +and links them together into a single module, called the test program. If any +LLVM passes are specified on the command line, it runs these passes on the test +program. If any of the passes crash, or if they produce malformed output (which +causes the verifier to abort), ``bugpoint`` starts the `crash debugger`_. + +Otherwise, if the ``-output`` option was not specified, ``bugpoint`` runs the +test program with the "safe" backend (which is assumed to generate good code) to +generate a reference output. Once ``bugpoint`` has a reference output for the +test program, it tries executing it with the selected code generator. If the +selected code generator crashes, ``bugpoint`` starts the `crash debugger`_ on +the code generator. Otherwise, if the resulting output differs from the +reference output, it assumes the difference resulted from a code generator +failure, and starts the `code generator debugger`_. 
+ +Finally, if the output of the selected code generator matches the reference +output, ``bugpoint`` runs the test program after all of the LLVM passes have +been applied to it. If its output differs from the reference output, it assumes +the difference resulted from a failure in one of the LLVM passes, and enters the +`miscompilation debugger`_. Otherwise, there is no problem ``bugpoint`` can +debug. + +.. _crash debugger: + +Crash debugger +-------------- + +If an optimizer or code generator crashes, ``bugpoint`` will try as hard as it +can to reduce the list of passes (for optimizer crashes) and the size of the +test program. First, ``bugpoint`` figures out which combination of optimizer +passes triggers the bug. This is useful when debugging a problem exposed by +``opt``, for example, because it runs over 38 passes. + +Next, ``bugpoint`` tries removing functions from the test program, to reduce its +size. Usually it is able to reduce a test program to a single function, when +debugging intraprocedural optimizations. Once the number of functions has been +reduced, it attempts to delete various edges in the control flow graph, to +reduce the size of the function as much as possible. Finally, ``bugpoint`` +deletes any individual LLVM instructions whose absence does not eliminate the +failure. At the end, ``bugpoint`` should tell you what passes crash, give you a +bitcode file, and give you instructions on how to reproduce the failure with +``opt`` or ``llc``. + +.. _code generator debugger: + +Code generator debugger +----------------------- + +The code generator debugger attempts to narrow down the amount of code that is +being miscompiled by the selected code generator. To do this, it takes the test +program and partitions it into two pieces: one piece which it compiles with the +"safe" backend (into a shared object), and one piece which it runs with either +the JIT or the static LLC compiler. It uses several techniques to reduce the +amount of code pushed through the LLVM code generator, to reduce the potential +scope of the problem. After it is finished, it emits two bitcode files (called +"test" [to be compiled with the code generator] and "safe" [to be compiled with +the "safe" backend], respectively), and instructions for reproducing the +problem. The code generator debugger assumes that the "safe" backend produces +good code. + +.. _miscompilation debugger: + +Miscompilation debugger +----------------------- + +The miscompilation debugger works similarly to the code generator debugger. It +works by splitting the test program into two pieces, running the optimizations +specified on one piece, linking the two pieces back together, and then executing +the result. It attempts to narrow down the list of passes to the one (or few) +which are causing the miscompilation, then reduce the portion of the test +program which is being miscompiled. The miscompilation debugger assumes that +the selected code generator is working properly. + +Advice for using bugpoint +========================= + +``bugpoint`` can be a remarkably useful tool, but it sometimes works in +non-obvious ways. Here are some hints and tips: + +* In the code generator and miscompilation debuggers, ``bugpoint`` only works + with programs that have deterministic output. Thus, if the program outputs + ``argv[0]``, the date, time, or any other "random" data, ``bugpoint`` may + misinterpret differences in these data, when output, as the result of a + miscompilation. 
Programs should be temporarily modified to disable outputs + that are likely to vary from run to run. + +* In the code generator and miscompilation debuggers, debugging will go faster + if you manually modify the program or its inputs to reduce the runtime but + still exhibit the problem. + +* ``bugpoint`` is extremely useful when working on a new optimization: it helps + track down regressions quickly. To avoid having to relink ``bugpoint`` every + time you change your optimization, however, have ``bugpoint`` dynamically load + your optimization with the ``-load`` option. + +* ``bugpoint`` can generate a lot of output and run for a long period of time. + It is often useful to capture the output of the program to a file. For example, + in the C shell, you can run: + + .. code-block:: bash + + bugpoint ... |& tee bugpoint.log + + to get a copy of ``bugpoint``'s output in the file ``bugpoint.log``, as well + as on your terminal. + +* ``bugpoint`` cannot debug problems with the LLVM linker. If ``bugpoint`` + crashes before you see its "All input ok" message, you might try ``llvm-link + -v`` on the same set of input files. If that also crashes, you may be + experiencing a linker bug. + +* ``bugpoint`` is useful for proactively finding bugs in LLVM. Invoking + ``bugpoint`` with the ``-find-bugs`` option will cause the list of specified + optimizations to be randomized and applied to the program. This process will + repeat until a bug is found or the user kills ``bugpoint``. + +What to do when bugpoint isn't enough +===================================== + +Sometimes, ``bugpoint`` is not enough. In particular, InstCombine and +TargetLowering both have visitor structured code with lots of potential +transformations. If the process of using bugpoint has still left you with too +much code to figure out and the problem seems to be in instcombine, the +following steps may help. These same techniques are useful with TargetLowering +as well. + +Turn on ``-debug-only=instcombine`` and see which transformations within +instcombine are firing by picking out lines with "``IC``" in them. + +At this point, you have a decision to make. Is the number of transformations +small enough to step through them using a debugger? If so, then try that. + +If there are too many transformations, then a source modification approach may +be helpful. In this approach, you can modify the source code of instcombine to +disable just those transformations that are being performed on your test input +and perform a binary search over the set of transformations. One set of places +to modify is the "``visit*``" methods of ``InstCombiner`` (*e.g.* +``visitICmpInst``) by adding a "``return false``" as the first line of the +method. + +If that still doesn't remove enough, then change ``InstCombiner::runOnFunction``, +the caller of ``InstCombiner::DoOneIteration``, to limit the number of +iterations. + +You may also find it useful to use "``-stats``" now to see what parts of +instcombine are firing. This can guide where to put additional reporting code. + +At this point, if the number of transformations is still too large, then +inserting code to limit whether or not the body of the visit function is +executed can be helpful. Add a static counter which is incremented on every +invocation of the function. Then add code which simply returns false on +desired ranges. For example: + +..
code-block:: c++ + + + static int calledCount = 0; + calledCount++; + DEBUG(if (calledCount < 212) return false); + DEBUG(if (calledCount > 217) return false); + DEBUG(if (calledCount == 213) return false); + DEBUG(if (calledCount == 214) return false); + DEBUG(if (calledCount == 215) return false); + DEBUG(if (calledCount == 216) return false); + DEBUG(dbgs() << "visitXOR calledCount: " << calledCount << "\n"); + DEBUG(dbgs() << "I: "; I->dump()); + +could be added to ``visitXOR`` to limit ``visitXOR`` to being applied only to +calls 212 and 217. This is from an actual test case and raises an important +point---a simple binary search may not be sufficient, as transformations that +interact may require isolating more than one call. In TargetLowering, use +``return SDNode();`` instead of ``return false;``. + +Now that the number of transformations is down to a manageable level, try +examining the output to see if you can figure out which transformations are +being done. If that can be figured out, then do the usual debugging. If it is +not obvious which code corresponds to the transformation being performed, set a +breakpoint after the call count based disabling and step through the code. +Alternatively, you can use "``printf``" style debugging to report waypoints. diff --git a/docs/CMake.html b/docs/CMake.html deleted file mode 100644 index acc7fe9..0000000 --- a/docs/CMake.html +++ /dev/null @@ -1,584 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>Building LLVM with CMake</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> - -<h1> - Building LLVM with CMake -</h1> - -<ul> - <li><a href="#intro">Introduction</a></li> - <li><a href="#quickstart">Quick start</a></li> - <li><a href="#usage">Basic CMake usage</a> - <li><a href="#options">Options and variables</a> - <ul> - <li><a href="#freccmake">Frequently-used CMake variables</a></li> - <li><a href="#llvmvars">LLVM-specific variables</a></li> - </ul></li> - <li><a href="#testing">Executing the test suite</a> - <li><a href="#cross">Cross compiling</a> - <li><a href="#embedding">Embedding LLVM in your project</a> - <ul> - <li><a href="#passdev">Developing LLVM pass out of source</a></li> - </ul></li> - <li><a href="#specifics">Compiler/Platform specific topics</a> - <ul> - <li><a href="#msvc">Microsoft Visual C++</a></li> - </ul></li> -</ul> - -<div class="doc_author"> -<p>Written by <a href="mailto:ofv@wanadoo.es">Oscar Fuentes</a></p> -</div> - -<!-- *********************************************************************** --> -<h2> -<a name="intro">Introduction</a> -</h2> -<!-- *********************************************************************** --> - -<div> - - <p><a href="http://www.cmake.org/">CMake</a> is a cross-platform - build-generator tool. CMake does not build the project, it generates - the files needed by your build tool (GNU make, Visual Studio, etc) for - building LLVM.</p> - - <p>If you are really anxious about getting a functional LLVM build, - go to the <a href="#quickstart">Quick start</a> section. If you - are a CMake novice, start on <a href="#usage">Basic CMake - usage</a> and then go back to the <a href="#quickstart">Quick - start</a> once you know what you are - doing. The <a href="#options">Options and variables</a> section - is a reference for customizing your build.
If you already have - experience with CMake, this is the recommended starting point. -</div> - -<!-- *********************************************************************** --> -<h2> -<a name="quickstart">Quick start</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p> We use here the command-line, non-interactive CMake interface </p> - -<ol> - - <li><p><a href="http://www.cmake.org/cmake/resources/software.html">Download</a> - and install CMake. Version 2.8 is the minimum required.</p> - - <li><p>Open a shell. Your development tools must be reachable from this - shell through the PATH environment variable.</p> - - <li><p>Create a directory for containing the build. It is not - supported to build LLVM on the source directory. cd to this - directory:</p> - <div class="doc_code"> - <p><tt>mkdir mybuilddir</tt></p> - <p><tt>cd mybuilddir</tt></p> - </div> - - <li><p>Execute this command on the shell - replacing <i>path/to/llvm/source/root</i> with the path to the - root of your LLVM source tree:</p> - <div class="doc_code"> - <p><tt>cmake path/to/llvm/source/root</tt></p> - </div> - - <p>CMake will detect your development environment, perform a - series of test and generate the files required for building - LLVM. CMake will use default values for all build - parameters. See the <a href="#options">Options and variables</a> - section for fine-tuning your build</p> - - <p>This can fail if CMake can't detect your toolset, or if it - thinks that the environment is not sane enough. On this case - make sure that the toolset that you intend to use is the only - one reachable from the shell and that the shell itself is the - correct one for you development environment. CMake will refuse - to build MinGW makefiles if you have a POSIX shell reachable - through the PATH environment variable, for instance. You can - force CMake to use a given build tool, see - the <a href="#usage">Usage</a> section.</p> - -</ol> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="usage">Basic CMake usage</a> -</h2> -<!-- *********************************************************************** --> - -<div> - - <p>This section explains basic aspects of CMake, mostly for - explaining those options which you may need on your day-to-day - usage.</p> - - <p>CMake comes with extensive documentation in the form of html - files and on the cmake executable itself. Execute <i>cmake - --help</i> for further help options.</p> - - <p>CMake requires to know for which build tool it shall generate - files (GNU make, Visual Studio, Xcode, etc). If not specified on - the command line, it tries to guess it based on you - environment. Once identified the build tool, CMake uses the - corresponding <i>Generator</i> for creating files for your build - tool. You can explicitly specify the generator with the command - line option <i>-G "Name of the generator"</i>. For knowing the - available generators on your platform, execute</p> - - <div class="doc_code"> - <p><tt>cmake --help</tt></p> - </div> - - <p>This will list the generator's names at the end of the help - text. Generator's names are case-sensitive. Example:</p> - - <div class="doc_code"> - <p><tt>cmake -G "Visual Studio 9 2008" path/to/llvm/source/root</tt></p> - </div> - - <p>For a given development platform there can be more than one - adequate generator. If you use Visual Studio "NMake Makefiles" - is a generator you can use for building with NMake. 
By default, - CMake chooses the more specific generator supported by your - development environment. If you want an alternative generator, - you must tell this to CMake with the <i>-G</i> option.</p> - - <p>TODO: explain variables and cache. Move explanation here from - #options section.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="options">Options and variables</a> -</h2> -<!-- *********************************************************************** --> - -<div> - - <p>Variables customize how the build will be generated. Options are - boolean variables, with possible values ON/OFF. Options and - variables are defined on the CMake command line like this:</p> - - <div class="doc_code"> - <p><tt>cmake -DVARIABLE=value path/to/llvm/source</tt></p> - </div> - - <p>You can set a variable after the initial CMake invocation for - changing its value. You can also undefine a variable:</p> - - <div class="doc_code"> - <p><tt>cmake -UVARIABLE path/to/llvm/source</tt></p> - </div> - - <p>Variables are stored on the CMake cache. This is a file - named <tt>CMakeCache.txt</tt> on the root of the build - directory. Do not hand-edit it.</p> - - <p>Variables are listed here appending its type after a colon. It is - correct to write the variable and the type on the CMake command - line:</p> - - <div class="doc_code"> - <p><tt>cmake -DVARIABLE:TYPE=value path/to/llvm/source</tt></p> - </div> - -<!-- ======================================================================= --> -<h3> - <a name="freccmake">Frequently-used CMake variables</a> -</h3> - -<div> - -<p>Here are listed some of the CMake variables that are used often, - along with a brief explanation and LLVM-specific notes. For full - documentation, check the CMake docs or execute <i>cmake - --help-variable VARIABLE_NAME</i>.</p> - -<dl> - <dt><b>CMAKE_BUILD_TYPE</b>:STRING</dt> - - <dd>Sets the build type for <i>make</i> based generators. Possible - values are Release, Debug, RelWithDebInfo and MinSizeRel. On - systems like Visual Studio the user sets the build type with the IDE - settings.</dd> - - <dt><b>CMAKE_INSTALL_PREFIX</b>:PATH</dt> - <dd>Path where LLVM will be installed if "make install" is invoked - or the "INSTALL" target is built.</dd> - - <dt><b>LLVM_LIBDIR_SUFFIX</b>:STRING</dt> - <dd>Extra suffix to append to the directory where libraries are to - be installed. On a 64-bit architecture, one could use - -DLLVM_LIBDIR_SUFFIX=64 to install libraries to /usr/lib64.</dd> - - <dt><b>CMAKE_C_FLAGS</b>:STRING</dt> - <dd>Extra flags to use when compiling C source files.</dd> - - <dt><b>CMAKE_CXX_FLAGS</b>:STRING</dt> - <dd>Extra flags to use when compiling C++ source files.</dd> - - <dt><b>BUILD_SHARED_LIBS</b>:BOOL</dt> - <dd>Flag indicating is shared libraries will be built. Its default - value is OFF. Shared libraries are not supported on Windows and - not recommended in the other OSes.</dd> -</dl> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="llvmvars">LLVM-specific variables</a> -</h3> - -<div> - -<dl> - <dt><b>LLVM_TARGETS_TO_BUILD</b>:STRING</dt> - <dd>Semicolon-separated list of targets to build, or <i>all</i> for - building all targets. Case-sensitive. For Visual C++ defaults - to <i>X86</i>. On the other cases defaults to <i>all</i>. Example: - <i>-DLLVM_TARGETS_TO_BUILD="X86;PowerPC"</i>.</dd> - - <dt><b>LLVM_BUILD_TOOLS</b>:BOOL</dt> - <dd>Build LLVM tools. Defaults to ON. 
Targets for building each tool - are generated in any case. You can build an tool separately by - invoking its target. For example, you can build <i>llvm-as</i> - with a makefile-based system executing <i>make llvm-as</i> on the - root of your build directory.</dd> - - <dt><b>LLVM_INCLUDE_TOOLS</b>:BOOL</dt> - <dd>Generate build targets for the LLVM tools. Defaults to - ON. You can use that option for disabling the generation of build - targets for the LLVM tools.</dd> - - <dt><b>LLVM_BUILD_EXAMPLES</b>:BOOL</dt> - <dd>Build LLVM examples. Defaults to OFF. Targets for building each - example are generated in any case. See documentation - for <i>LLVM_BUILD_TOOLS</i> above for more details.</dd> - - <dt><b>LLVM_INCLUDE_EXAMPLES</b>:BOOL</dt> - <dd>Generate build targets for the LLVM examples. Defaults to - ON. You can use that option for disabling the generation of build - targets for the LLVM examples.</dd> - - <dt><b>LLVM_BUILD_TESTS</b>:BOOL</dt> - <dd>Build LLVM unit tests. Defaults to OFF. Targets for building - each unit test are generated in any case. You can build a specific - unit test with the target <i>UnitTestNameTests</i> (where at this - time <i>UnitTestName</i> can be ADT, Analysis, ExecutionEngine, - JIT, Support, Transform, VMCore; see the subdirectories - of <i>unittests</i> for an updated list.) It is possible to build - all unit tests with the target <i>UnitTests</i>.</dd> - - <dt><b>LLVM_INCLUDE_TESTS</b>:BOOL</dt> - <dd>Generate build targets for the LLVM unit tests. Defaults to - ON. You can use that option for disabling the generation of build - targets for the LLVM unit tests.</dd> - - <dt><b>LLVM_APPEND_VC_REV</b>:BOOL</dt> - <dd>Append version control revision info (svn revision number or git - revision id) to LLVM version string (stored in the PACKAGE_VERSION - macro). For this to work cmake must be invoked before the - build. Defaults to OFF.</dd> - - <dt><b>LLVM_ENABLE_THREADS</b>:BOOL</dt> - <dd>Build with threads support, if available. Defaults to ON.</dd> - - <dt><b>LLVM_ENABLE_ASSERTIONS</b>:BOOL</dt> - <dd>Enables code assertions. Defaults to OFF if and only if - CMAKE_BUILD_TYPE is <i>Release</i>.</dd> - - <dt><b>LLVM_ENABLE_PIC</b>:BOOL</dt> - <dd>Add the <i>-fPIC</i> flag for the compiler command-line, if the - compiler supports this flag. Some systems, like Windows, do not - need this flag. Defaults to ON.</dd> - - <dt><b>LLVM_ENABLE_WARNINGS</b>:BOOL</dt> - <dd>Enable all compiler warnings. Defaults to ON.</dd> - - <dt><b>LLVM_ENABLE_PEDANTIC</b>:BOOL</dt> - <dd>Enable pedantic mode. This disable compiler specific extensions, is - possible. Defaults to ON.</dd> - - <dt><b>LLVM_ENABLE_WERROR</b>:BOOL</dt> - <dd>Stop and fail build, if a compiler warning is - triggered. Defaults to OFF.</dd> - - <dt><b>LLVM_BUILD_32_BITS</b>:BOOL</dt> - <dd>Build 32-bits executables and libraries on 64-bits systems. This - option is available only on some 64-bits unix systems. Defaults to - OFF.</dd> - - <dt><b>LLVM_TARGET_ARCH</b>:STRING</dt> - <dd>LLVM target to use for native code generation. This is required - for JIT generation. It defaults to "host", meaning that it shall - pick the architecture of the machine where LLVM is being built. If - you are cross-compiling, set it to the target architecture - name.</dd> - - <dt><b>LLVM_TABLEGEN</b>:STRING</dt> - <dd>Full path to a native TableGen executable (usually - named <i>tblgen</i>). 
This is intented for cross-compiling: if the - user sets this variable, no native TableGen will be created.</dd> - - <dt><b>LLVM_LIT_ARGS</b>:STRING</dt> - <dd>Arguments given to lit. - <tt>make check</tt> and <tt>make clang-test</tt> are affected. - By default, <tt>"-sv --no-progress-bar"</tt> - on Visual C++ and Xcode, - <tt>"-sv"</tt> on others.</dd> - - <dt><b>LLVM_LIT_TOOLS_DIR</b>:PATH</dt> - <dd>The path to GnuWin32 tools for tests. Valid on Windows host. - Defaults to "", then Lit seeks tools according to %PATH%. - Lit can find tools(eg. grep, sort, &c) on LLVM_LIT_TOOLS_DIR at first, - without specifying GnuWin32 to %PATH%.</dd> - - <dt><b>LLVM_ENABLE_FFI</b>:BOOL</dt> - <dd>Indicates whether LLVM Interpreter will be linked with Foreign - Function Interface library. If the library or its headers are - installed on a custom location, you can set the variables - FFI_INCLUDE_DIR and FFI_LIBRARY_DIR. Defaults to OFF.</dd> - - <dt><b>LLVM_CLANG_SOURCE_DIR</b>:PATH</dt> - <dd>Path to Clang's source directory. Defaults to tools/clang. - Clang will not be built when it is empty or it does not point valid - path.</dd> - - <dt><b>LLVM_USE_OPROFILE</b>:BOOL</dt> - <dd> Enable building OProfile JIT support. Defaults to OFF</dd> - - <dt><b>LLVM_USE_INTEL_JITEVENTS</b>:BOOL</dt> - <dd> Enable building support for Intel JIT Events API. Defaults to OFF</dd> - - <dt><b>LLVM_INTEL_JITEVENTS_DIR</b>:PATH</dt> - <dd> Path to installation of Intel(R) VTune(TM) Amplifier XE 2011, - used to locate the <tt>jitprofiling</tt> library. Default = - <tt>%VTUNE_AMPLIFIER_XE_2011_DIR%</tt> (Windows) - | <tt>/opt/intel/vtune_amplifier_xe_2011</tt> (Linux) </dd> - -</dl> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="testing">Executing the test suite</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Testing is performed when the <i>check</i> target is built. For - instance, if you are using makefiles, execute this command while on - the top level of your build directory:</p> - -<div class="doc_code"> - <p><tt>make check</tt></p> -</div> - -<p>On Visual Studio, you may run tests to build the project "check".</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="cross">Cross compiling</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>See <a href="http://www.vtk.org/Wiki/CMake_Cross_Compiling">this - wiki page</a> for generic instructions on how to cross-compile - with CMake. It goes into detailed explanations and may seem - daunting, but it is not. On the wiki page there are several - examples including toolchain files. Go directly to - <a href="http://www.vtk.org/Wiki/CMake_Cross_Compiling#Information_how_to_set_up_various_cross_compiling_toolchains">this - section</a> for a quick solution.</p> - -<p>Also see the <a href="#llvmvars">LLVM-specific variables</a> - section for variables used when cross-compiling.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="embedding">Embedding LLVM in your project</a> -</h2> -<!-- *********************************************************************** --> - -<div> - - <p>The most difficult part of adding LLVM to the build of a project - is to determine the set of LLVM libraries corresponding to the set - of required LLVM features. 
What follows is an example of how to - obtain this information:</p> - - <div class="doc_code"> - <pre> - <b># A convenience variable:</b> - set(LLVM_ROOT "" CACHE PATH "Root of LLVM install.") - <b># A bit of a sanity check:</b> - if( NOT EXISTS ${LLVM_ROOT}/include/llvm ) - message(FATAL_ERROR "LLVM_ROOT (${LLVM_ROOT}) is not a valid LLVM install") - endif() - <b># We incorporate the CMake features provided by LLVM:</b> - set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${LLVM_ROOT}/share/llvm/cmake") - include(LLVMConfig) - <b># Now set the header and library paths:</b> - include_directories( ${LLVM_INCLUDE_DIRS} ) - link_directories( ${LLVM_LIBRARY_DIRS} ) - add_definitions( ${LLVM_DEFINITIONS} ) - <b># Let's suppose we want to build a JIT compiler with support for - # binary code (no interpreter):</b> - llvm_map_components_to_libraries(REQ_LLVM_LIBRARIES jit native) - <b># Finally, we link the LLVM libraries to our executable:</b> - target_link_libraries(mycompiler ${REQ_LLVM_LIBRARIES}) - </pre> - </div> - - <p>This assumes that LLVM_ROOT points to an install of LLVM. The - procedure works too for uninstalled builds although we need to take - care to add an <i>include_directories</i> for the location of the - headers on the LLVM source directory (if we are building - out-of-source.)</p> - - <p>Alternativaly, you can utilize CMake's <i>find_package</i> - functionality. Here is an equivalent variant of snippet shown above:</p> - - <div class="doc_code"> - <pre> - find_package(LLVM) - - if( NOT LLVM_FOUND ) - message(FATAL_ERROR "LLVM package can't be found. Set CMAKE_PREFIX_PATH variable to LLVM's installation prefix.") - endif() - - include_directories( ${LLVM_INCLUDE_DIRS} ) - link_directories( ${LLVM_LIBRARY_DIRS} ) - - llvm_map_components_to_libraries(REQ_LLVM_LIBRARIES jit native) - - target_link_libraries(mycompiler ${REQ_LLVM_LIBRARIES}) - </pre> - </div> - -<!-- ======================================================================= --> -<h3> - <a name="passdev">Developing LLVM pass out of source</a> -</h3> - -<div> - - <p>It is possible to develop LLVM passes against installed LLVM. - An example of project layout provided below:</p> - - <div class="doc_code"> - <pre> - <project dir>/ - | - CMakeLists.txt - <pass name>/ - | - CMakeLists.txt - Pass.cpp - ... - </pre> - </div> - - <p>Contents of <project dir>/CMakeLists.txt:</p> - - <div class="doc_code"> - <pre> - find_package(LLVM) - - <b># Define add_llvm_* macro's.</b> - include(AddLLVM) - - add_definitions(${LLVM_DEFINITIONS}) - include_directories(${LLVM_INCLUDE_DIRS}) - link_directories(${LLVM_LIBRARY_DIRS}) - - add_subdirectory(<pass name>) - </pre> - </div> - - <p>Contents of <project dir>/<pass name>/CMakeLists.txt:</p> - - <div class="doc_code"> - <pre> - add_llvm_loadable_module(LLVMPassname - Pass.cpp - ) - </pre> - </div> - - <p>When you are done developing your pass, you may wish to integrate it - into LLVM source tree. You can achieve it in two easy steps:<br> - 1. Copying <pass name> folder into <LLVM root>/lib/Transform directory.<br> - 2. 
Adding "add_subdirectory(<pass name>)" line into <LLVM root>/lib/Transform/CMakeLists.txt</p> -</div> -<!-- *********************************************************************** --> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="specifics">Compiler/Platform specific topics</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Notes for specific compilers and/or platforms.</p> - -<h3> - <a name="msvc">Microsoft Visual C++</a> -</h3> - -<div> - -<dl> - <dt><b>LLVM_COMPILER_JOBS</b>:STRING</dt> - <dd>Specifies the maximum number of parallell compiler jobs to use - per project when building with msbuild or Visual Studio. Only supported for - Visual Studio 2008 and Visual Studio 2010 CMake generators. 0 means use all - processors. Default is 0.</dd> -</dl> - -</div> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:ofv@wanadoo.es">Oscar Fuentes</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2010-08-09 03:59:36 +0100 (Mon, 9 Aug 2010) $ -</address> - -</body> -</html> diff --git a/docs/CMake.rst b/docs/CMake.rst new file mode 100644 index 0000000..e1761c5 --- /dev/null +++ b/docs/CMake.rst @@ -0,0 +1,423 @@ +.. _building-with-cmake: + +======================== +Building LLVM with CMake +======================== + +.. contents:: + :local: + +Introduction +============ + +`CMake <http://www.cmake.org/>`_ is a cross-platform build-generator tool. CMake +does not build the project, it generates the files needed by your build tool +(GNU make, Visual Studio, etc) for building LLVM. + +If you are really anxious about getting a functional LLVM build, go to the +`Quick start`_ section. If you are a CMake novice, start on `Basic CMake usage`_ +and then go back to the `Quick start`_ once you know what you are doing. The +`Options and variables`_ section is a reference for customizing your build. If +you already have experience with CMake, this is the recommended starting point. + +.. _Quick start: + +Quick start +=========== + +We use here the command-line, non-interactive CMake interface. + +#. `Download <http://www.cmake.org/cmake/resources/software.html>`_ and install + CMake. Version 2.8 is the minimum required. + +#. Open a shell. Your development tools must be reachable from this shell + through the PATH environment variable. + +#. Create a directory for containing the build. It is not supported to build + LLVM on the source directory. cd to this directory: + + .. code-block:: bash + + $ mkdir mybuilddir + $ cd mybuilddir + +#. Execute this command on the shell replacing `path/to/llvm/source/root` with + the path to the root of your LLVM source tree: + + .. code-block:: bash + + $ cmake path/to/llvm/source/root + + CMake will detect your development environment, perform a series of test and + generate the files required for building LLVM. CMake will use default values + for all build parameters. See the `Options and variables`_ section for + fine-tuning your build + + This can fail if CMake can't detect your toolset, or if it thinks that the + environment is not sane enough. 
In this case, make sure that the toolset + you intend to use is the only one reachable from the shell and that the shell + itself is the correct one for your development environment. CMake will refuse + to build MinGW makefiles if you have a POSIX shell reachable through the PATH + environment variable, for instance. You can force CMake to use a given build + tool; see the `Usage`_ section. + +.. _Basic CMake usage: +.. _Usage: + +Basic CMake usage +================= + +This section explains basic aspects of CMake, mostly for explaining those +options which you may need in your day-to-day usage. + +CMake comes with extensive documentation in the form of html files and in the +cmake executable itself. Execute ``cmake --help`` for further help options. + +CMake needs to know for which build tool it shall generate files (GNU make, +Visual Studio, Xcode, etc). If not specified on the command line, it tries to +guess it based on your environment. Once the build tool is identified, CMake uses +the corresponding *Generator* to create files for your build tool. You can +explicitly specify the generator with the command line option ``-G "Name of the +generator"``. To see the available generators on your platform, execute + +.. code-block:: bash + + $ cmake --help + +This will list the generators' names at the end of the help text. Generator +names are case-sensitive. Example: + +.. code-block:: bash + + $ cmake -G "Visual Studio 9 2008" path/to/llvm/source/root + +For a given development platform there can be more than one adequate +generator. If you use Visual Studio, "NMake Makefiles" is a generator you can use +for building with NMake. By default, CMake chooses the most specific generator +supported by your development environment. If you want an alternative generator, +you must tell this to CMake with the ``-G`` option. + +.. todo:: + + Explain variables and cache. Move explanation here from #options section. + +.. _Options and variables: + +Options and variables +===================== + +Variables customize how the build will be generated. Options are boolean +variables, with possible values ON/OFF. Options and variables are defined on the +CMake command line like this: + +.. code-block:: bash + + $ cmake -DVARIABLE=value path/to/llvm/source + +You can set a variable after the initial CMake invocation to change its +value. You can also undefine a variable: + +.. code-block:: bash + + $ cmake -UVARIABLE path/to/llvm/source + +Variables are stored in the CMake cache. This is a file named ``CMakeCache.txt`` +at the root of the build directory. Do not hand-edit it. + +Variables are listed here with their type appended after a colon. It is correct to +write the variable and the type on the CMake command line: + +.. code-block:: bash + + $ cmake -DVARIABLE:TYPE=value path/to/llvm/source + +Frequently-used CMake variables +------------------------------- + +Here are some of the CMake variables that are used often, along with a +brief explanation and LLVM-specific notes. For full documentation, check the +CMake docs or execute ``cmake --help-variable VARIABLE_NAME``. + +**CMAKE_BUILD_TYPE**:STRING + Sets the build type for ``make``-based generators. Possible values are + Release, Debug, RelWithDebInfo and MinSizeRel. On systems like Visual Studio + the user sets the build type with the IDE settings. + +**CMAKE_INSTALL_PREFIX**:PATH + Path where LLVM will be installed if "make install" is invoked or the + "INSTALL" target is built.
+ +**LLVM_LIBDIR_SUFFIX**:STRING + Extra suffix to append to the directory where libraries are to be + installed. On a 64-bit architecture, one could use ``-DLLVM_LIBDIR_SUFFIX=64`` + to install libraries to ``/usr/lib64``. + +**CMAKE_C_FLAGS**:STRING + Extra flags to use when compiling C source files. + +**CMAKE_CXX_FLAGS**:STRING + Extra flags to use when compiling C++ source files. + +**BUILD_SHARED_LIBS**:BOOL + Flag indicating if shared libraries will be built. Its default value is + OFF. Shared libraries are not supported on Windows and not recommended on the + other OSes. + +.. _LLVM-specific variables: + +LLVM-specific variables +----------------------- + +**LLVM_TARGETS_TO_BUILD**:STRING + Semicolon-separated list of targets to build, or *all* for building all + targets. Case-sensitive. For Visual C++ it defaults to *X86*. In the other cases + it defaults to *all*. Example: ``-DLLVM_TARGETS_TO_BUILD="X86;PowerPC"``. + +**LLVM_BUILD_TOOLS**:BOOL + Build LLVM tools. Defaults to ON. Targets for building each tool are generated + in any case. You can build a tool separately by invoking its target. For + example, you can build *llvm-as* with a makefile-based system by executing *make + llvm-as* at the root of your build directory. + +**LLVM_INCLUDE_TOOLS**:BOOL + Generate build targets for the LLVM tools. Defaults to ON. You can use this + option to disable the generation of build targets for the LLVM tools. + +**LLVM_BUILD_EXAMPLES**:BOOL + Build LLVM examples. Defaults to OFF. Targets for building each example are + generated in any case. See documentation for *LLVM_BUILD_TOOLS* above for more + details. + +**LLVM_INCLUDE_EXAMPLES**:BOOL + Generate build targets for the LLVM examples. Defaults to ON. You can use this + option to disable the generation of build targets for the LLVM examples. + +**LLVM_BUILD_TESTS**:BOOL + Build LLVM unit tests. Defaults to OFF. Targets for building each unit test + are generated in any case. You can build a specific unit test with the target + *UnitTestNameTests* (where at this time *UnitTestName* can be ADT, Analysis, + ExecutionEngine, JIT, Support, Transform, VMCore; see the subdirectories of + *unittests* for an updated list). It is possible to build all unit tests with + the target *UnitTests*. + +**LLVM_INCLUDE_TESTS**:BOOL + Generate build targets for the LLVM unit tests. Defaults to ON. You can use + this option to disable the generation of build targets for the LLVM unit + tests. + +**LLVM_APPEND_VC_REV**:BOOL + Append version control revision info (svn revision number or git revision id) + to the LLVM version string (stored in the PACKAGE_VERSION macro). For this to work + cmake must be invoked before the build. Defaults to OFF. + +**LLVM_ENABLE_THREADS**:BOOL + Build with thread support, if available. Defaults to ON. + +**LLVM_ENABLE_ASSERTIONS**:BOOL + Enables code assertions. Defaults to OFF if and only if ``CMAKE_BUILD_TYPE`` + is *Release*. + +**LLVM_ENABLE_PIC**:BOOL + Add the ``-fPIC`` flag to the compiler command line, if the compiler supports + this flag. Some systems, like Windows, do not need this flag. Defaults to ON. + +**LLVM_ENABLE_WARNINGS**:BOOL + Enable all compiler warnings. Defaults to ON. + +**LLVM_ENABLE_PEDANTIC**:BOOL + Enable pedantic mode. This disables compiler-specific extensions, if + possible. Defaults to ON. + +**LLVM_ENABLE_WERROR**:BOOL + Stop and fail the build if a compiler warning is triggered. Defaults to OFF. + +**LLVM_BUILD_32_BITS**:BOOL + Build 32-bit executables and libraries on 64-bit systems.
This option is + available only on some 64-bit Unix systems. Defaults to OFF. + +**LLVM_TARGET_ARCH**:STRING + LLVM target to use for native code generation. This is required for JIT + generation. It defaults to "host", meaning that it shall pick the architecture + of the machine where LLVM is being built. If you are cross-compiling, set it + to the target architecture name. + +**LLVM_TABLEGEN**:STRING + Full path to a native TableGen executable (usually named ``tblgen``). This is + intended for cross-compiling: if the user sets this variable, no native + TableGen will be created. + +**LLVM_LIT_ARGS**:STRING + Arguments given to lit. ``make check`` and ``make clang-test`` are affected. + By default, ``'-sv --no-progress-bar'`` on Visual C++ and Xcode, ``'-sv'`` on + others. + +**LLVM_LIT_TOOLS_DIR**:PATH + The path to GnuWin32 tools for tests. Valid on a Windows host. Defaults to "", + in which case Lit seeks tools according to %PATH%. Lit can find tools (e.g. grep, + sort, etc.) in LLVM_LIT_TOOLS_DIR first, without adding GnuWin32 to %PATH%. + +**LLVM_ENABLE_FFI**:BOOL + Indicates whether the LLVM Interpreter will be linked with the Foreign Function + Interface library. If the library or its headers are installed in a custom + location, you can set the variables FFI_INCLUDE_DIR and + FFI_LIBRARY_DIR. Defaults to OFF. + +**LLVM_EXTERNAL_{CLANG,LLD,POLLY}_SOURCE_DIR**:PATH + Path to ``{Clang,lld,Polly}``\'s source directory. Defaults to + ``tools/{clang,lld,polly}``. ``{Clang,lld,Polly}`` will not be built when it + is empty or does not point to a valid path. + +**LLVM_USE_OPROFILE**:BOOL + Enable building OProfile JIT support. Defaults to OFF. + +**LLVM_USE_INTEL_JITEVENTS**:BOOL + Enable building support for the Intel JIT Events API. Defaults to OFF. + +**LLVM_INTEL_JITEVENTS_DIR**:PATH + Path to installation of Intel(R) VTune(TM) Amplifier XE 2011, used to locate + the ``jitprofiling`` library. Default = ``%VTUNE_AMPLIFIER_XE_2011_DIR%`` + (Windows) | ``/opt/intel/vtune_amplifier_xe_2011`` (Linux) + +Executing the test suite +======================== + +Testing is performed when the *check* target is built. For instance, if you are +using makefiles, execute this command while at the top level of your build +directory: + +.. code-block:: bash + + $ make check + +On Visual Studio, you can run the tests by building the project "check". + +Cross compiling +=============== + +See `this wiki page <http://www.vtk.org/Wiki/CMake_Cross_Compiling>`_ for +generic instructions on how to cross-compile with CMake. It goes into detailed +explanations and may seem daunting, but it is not. On the wiki page there are +several examples including toolchain files. Go directly to `this section +<http://www.vtk.org/Wiki/CMake_Cross_Compiling#Information_how_to_set_up_various_cross_compiling_toolchains>`_ +for a quick solution. + +Also see the `LLVM-specific variables`_ section for variables used when +cross-compiling. + +Embedding LLVM in your project +============================== + +The most difficult part of adding LLVM to the build of a project is to determine +the set of LLVM libraries corresponding to the set of required LLVM +features. What follows is an example of how to obtain this information: + +..
code-block:: cmake + + # A convenience variable: + set(LLVM_ROOT "" CACHE PATH "Root of LLVM install.") + + # A bit of a sanity check: + if( NOT EXISTS ${LLVM_ROOT}/include/llvm ) + message(FATAL_ERROR "LLVM_ROOT (${LLVM_ROOT}) is not a valid LLVM install") + endif() + + # We incorporate the CMake features provided by LLVM: + set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${LLVM_ROOT}/share/llvm/cmake") + include(LLVMConfig) + + # Now set the header and library paths: + include_directories( ${LLVM_INCLUDE_DIRS} ) + link_directories( ${LLVM_LIBRARY_DIRS} ) + add_definitions( ${LLVM_DEFINITIONS} ) + + # Let's suppose we want to build a JIT compiler with support for + # binary code (no interpreter): + llvm_map_components_to_libraries(REQ_LLVM_LIBRARIES jit native) + + # Finally, we link the LLVM libraries to our executable: + target_link_libraries(mycompiler ${REQ_LLVM_LIBRARIES}) + +This assumes that LLVM_ROOT points to an install of LLVM. The procedure also works +for uninstalled builds, although we need to take care to add an +`include_directories` for the location of the headers in the LLVM source +directory (if we are building out-of-source). + +Alternatively, you can use CMake's ``find_package`` functionality. Here is +an equivalent variant of the snippet shown above: + +.. code-block:: cmake + + find_package(LLVM) + + if( NOT LLVM_FOUND ) + message(FATAL_ERROR "LLVM package can't be found. Set CMAKE_PREFIX_PATH variable to LLVM's installation prefix.") + endif() + + include_directories( ${LLVM_INCLUDE_DIRS} ) + link_directories( ${LLVM_LIBRARY_DIRS} ) + + llvm_map_components_to_libraries(REQ_LLVM_LIBRARIES jit native) + + target_link_libraries(mycompiler ${REQ_LLVM_LIBRARIES}) + +Developing LLVM pass out of source +---------------------------------- + +It is possible to develop LLVM passes against an installed LLVM. An example +project layout is provided below: + +.. code-block:: bash + + <project dir>/ + | + CMakeLists.txt + <pass name>/ + | + CMakeLists.txt + Pass.cpp + ... + +Contents of ``<project dir>/CMakeLists.txt``: + +.. code-block:: cmake + + find_package(LLVM) + + # Define the add_llvm_* macros. + include(AddLLVM) + + add_definitions(${LLVM_DEFINITIONS}) + include_directories(${LLVM_INCLUDE_DIRS}) + link_directories(${LLVM_LIBRARY_DIRS}) + + add_subdirectory(<pass name>) + +Contents of ``<project dir>/<pass name>/CMakeLists.txt``: + +.. code-block:: cmake + + add_llvm_loadable_module(LLVMPassname + Pass.cpp + ) + +When you are done developing your pass, you may wish to integrate it +into the LLVM source tree. You can achieve this in two easy steps: + +#. Copy the ``<pass name>`` folder into the ``<LLVM root>/lib/Transform`` directory. + +#. Add an ``add_subdirectory(<pass name>)`` line to + ``<LLVM root>/lib/Transform/CMakeLists.txt``. + +Compiler/Platform specific topics +================================= + +Notes for specific compilers and/or platforms. + +Microsoft Visual C++ +-------------------- + +**LLVM_COMPILER_JOBS**:STRING + Specifies the maximum number of parallel compiler jobs to use per project + when building with msbuild or Visual Studio. Only supported for the Visual Studio + 2008 and Visual Studio 2010 CMake generators. 0 means use all + processors. Default is 0.
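+As a brief recap, and only as an illustrative sketch (the chosen values and target + list are examples, not recommendations), a configure-and-test cycle that combines + several of the variables described above might look like this with a makefile + generator: + + .. code-block:: bash + + $ cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_TARGETS_TO_BUILD="X86;PowerPC" path/to/llvm/source/root + $ make check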
diff --git a/docs/CodeGenerator.html b/docs/CodeGenerator.html deleted file mode 100644 index 2dc22c1..0000000 --- a/docs/CodeGenerator.html +++ /dev/null @@ -1,3189 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="content-type" content="text/html; charset=utf-8"> - <title>The LLVM Target-Independent Code Generator</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> - - <style type="text/css"> - .unknown { background-color: #C0C0C0; text-align: center; } - .unknown:before { content: "?" } - .no { background-color: #C11B17 } - .no:before { content: "N" } - .partial { background-color: #F88017 } - .yes { background-color: #0F0; } - .yes:before { content: "Y" } - </style> - -</head> -<body> - -<h1> - The LLVM Target-Independent Code Generator -</h1> - -<ol> - <li><a href="#introduction">Introduction</a> - <ul> - <li><a href="#required">Required components in the code generator</a></li> - <li><a href="#high-level-design">The high-level design of the code - generator</a></li> - <li><a href="#tablegen">Using TableGen for target description</a></li> - </ul> - </li> - <li><a href="#targetdesc">Target description classes</a> - <ul> - <li><a href="#targetmachine">The <tt>TargetMachine</tt> class</a></li> - <li><a href="#targetdata">The <tt>TargetData</tt> class</a></li> - <li><a href="#targetlowering">The <tt>TargetLowering</tt> class</a></li> - <li><a href="#targetregisterinfo">The <tt>TargetRegisterInfo</tt> class</a></li> - <li><a href="#targetinstrinfo">The <tt>TargetInstrInfo</tt> class</a></li> - <li><a href="#targetframeinfo">The <tt>TargetFrameInfo</tt> class</a></li> - <li><a href="#targetsubtarget">The <tt>TargetSubtarget</tt> class</a></li> - <li><a href="#targetjitinfo">The <tt>TargetJITInfo</tt> class</a></li> - </ul> - </li> - <li><a href="#codegendesc">The "Machine" Code Generator classes</a> - <ul> - <li><a href="#machineinstr">The <tt>MachineInstr</tt> class</a></li> - <li><a href="#machinebasicblock">The <tt>MachineBasicBlock</tt> - class</a></li> - <li><a href="#machinefunction">The <tt>MachineFunction</tt> class</a></li> - <li><a href="#machineinstrbundle"><tt>MachineInstr Bundles</tt></a></li> - </ul> - </li> - <li><a href="#mc">The "MC" Layer</a> - <ul> - <li><a href="#mcstreamer">The <tt>MCStreamer</tt> API</a></li> - <li><a href="#mccontext">The <tt>MCContext</tt> class</a> - <li><a href="#mcsymbol">The <tt>MCSymbol</tt> class</a></li> - <li><a href="#mcsection">The <tt>MCSection</tt> class</a></li> - <li><a href="#mcinst">The <tt>MCInst</tt> class</a></li> - </ul> - </li> - <li><a href="#codegenalgs">Target-independent code generation algorithms</a> - <ul> - <li><a href="#instselect">Instruction Selection</a> - <ul> - <li><a href="#selectiondag_intro">Introduction to SelectionDAGs</a></li> - <li><a href="#selectiondag_process">SelectionDAG Code Generation - Process</a></li> - <li><a href="#selectiondag_build">Initial SelectionDAG - Construction</a></li> - <li><a href="#selectiondag_legalize_types">SelectionDAG LegalizeTypes Phase</a></li> - <li><a href="#selectiondag_legalize">SelectionDAG Legalize Phase</a></li> - <li><a href="#selectiondag_optimize">SelectionDAG Optimization - Phase: the DAG Combiner</a></li> - <li><a href="#selectiondag_select">SelectionDAG Select Phase</a></li> - <li><a href="#selectiondag_sched">SelectionDAG Scheduling and Formation - Phase</a></li> - <li><a href="#selectiondag_future">Future directions for the - SelectionDAG</a></li> - </ul></li> - 
<li><a href="#liveintervals">Live Intervals</a> - <ul> - <li><a href="#livevariable_analysis">Live Variable Analysis</a></li> - <li><a href="#liveintervals_analysis">Live Intervals Analysis</a></li> - </ul></li> - <li><a href="#regalloc">Register Allocation</a> - <ul> - <li><a href="#regAlloc_represent">How registers are represented in - LLVM</a></li> - <li><a href="#regAlloc_howTo">Mapping virtual registers to physical - registers</a></li> - <li><a href="#regAlloc_twoAddr">Handling two address instructions</a></li> - <li><a href="#regAlloc_ssaDecon">The SSA deconstruction phase</a></li> - <li><a href="#regAlloc_fold">Instruction folding</a></li> - <li><a href="#regAlloc_builtIn">Built in register allocators</a></li> - </ul></li> - <li><a href="#codeemit">Code Emission</a></li> - <li><a href="#vliw_packetizer">VLIW Packetizer</a> - <ul> - <li><a href="#vliw_mapping">Mapping from instructions to functional - units</a></li> - <li><a href="#vliw_repr">How the packetization tables are - generated and used</a></li> - </ul> - </li> - </ul> - </li> - <li><a href="#nativeassembler">Implementing a Native Assembler</a></li> - - <li><a href="#targetimpls">Target-specific Implementation Notes</a> - <ul> - <li><a href="#targetfeatures">Target Feature Matrix</a></li> - <li><a href="#tailcallopt">Tail call optimization</a></li> - <li><a href="#sibcallopt">Sibling call optimization</a></li> - <li><a href="#x86">The X86 backend</a></li> - <li><a href="#ppc">The PowerPC backend</a> - <ul> - <li><a href="#ppc_abi">LLVM PowerPC ABI</a></li> - <li><a href="#ppc_frame">Frame Layout</a></li> - <li><a href="#ppc_prolog">Prolog/Epilog</a></li> - <li><a href="#ppc_dynamic">Dynamic Allocation</a></li> - </ul></li> - <li><a href="#ptx">The PTX backend</a></li> - </ul></li> - -</ol> - -<div class="doc_author"> - <p>Written by the LLVM Team.</p> -</div> - -<div class="doc_warning"> - <p>Warning: This is a work in progress.</p> -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="introduction">Introduction</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>The LLVM target-independent code generator is a framework that provides a - suite of reusable components for translating the LLVM internal representation - to the machine code for a specified target—either in assembly form - (suitable for a static compiler) or in binary machine code format (usable for - a JIT compiler). The LLVM target-independent code generator consists of six - main components:</p> - -<ol> - <li><a href="#targetdesc">Abstract target description</a> interfaces which - capture important properties about various aspects of the machine, - independently of how they will be used. These interfaces are defined in - <tt>include/llvm/Target/</tt>.</li> - - <li>Classes used to represent the <a href="#codegendesc">code being - generated</a> for a target. These classes are intended to be abstract - enough to represent the machine code for <i>any</i> target machine. These - classes are defined in <tt>include/llvm/CodeGen/</tt>. At this level, - concepts like "constant pool entries" and "jump tables" are explicitly - exposed.</li> - - <li>Classes and algorithms used to represent code as the object file level, - the <a href="#mc">MC Layer</a>. These classes represent assembly level - constructs like labels, sections, and instructions. 
At this level, - concepts like "constant pool entries" and "jump tables" don't exist.</li> - - <li><a href="#codegenalgs">Target-independent algorithms</a> used to implement - various phases of native code generation (register allocation, scheduling, - stack frame representation, etc). This code lives - in <tt>lib/CodeGen/</tt>.</li> - - <li><a href="#targetimpls">Implementations of the abstract target description - interfaces</a> for particular targets. These machine descriptions make - use of the components provided by LLVM, and can optionally provide custom - target-specific passes, to build complete code generators for a specific - target. Target descriptions live in <tt>lib/Target/</tt>.</li> - - <li><a href="#jit">The target-independent JIT components</a>. The LLVM JIT is - completely target independent (it uses the <tt>TargetJITInfo</tt> - structure to interface for target-specific issues. The code for the - target-independent JIT lives in <tt>lib/ExecutionEngine/JIT</tt>.</li> -</ol> - -<p>Depending on which part of the code generator you are interested in working - on, different pieces of this will be useful to you. In any case, you should - be familiar with the <a href="#targetdesc">target description</a> - and <a href="#codegendesc">machine code representation</a> classes. If you - want to add a backend for a new target, you will need - to <a href="#targetimpls">implement the target description</a> classes for - your new target and understand the <a href="LangRef.html">LLVM code - representation</a>. If you are interested in implementing a - new <a href="#codegenalgs">code generation algorithm</a>, it should only - depend on the target-description and machine code representation classes, - ensuring that it is portable.</p> - -<!-- ======================================================================= --> -<h3> - <a name="required">Required components in the code generator</a> -</h3> - -<div> - -<p>The two pieces of the LLVM code generator are the high-level interface to the - code generator and the set of reusable components that can be used to build - target-specific backends. The two most important interfaces - (<a href="#targetmachine"><tt>TargetMachine</tt></a> - and <a href="#targetdata"><tt>TargetData</tt></a>) are the only ones that are - required to be defined for a backend to fit into the LLVM system, but the - others must be defined if the reusable code generator components are going to - be used.</p> - -<p>This design has two important implications. The first is that LLVM can - support completely non-traditional code generation targets. For example, the - C backend does not require register allocation, instruction selection, or any - of the other standard components provided by the system. As such, it only - implements these two interfaces, and does its own thing. Another example of - a code generator like this is a (purely hypothetical) backend that converts - LLVM to the GCC RTL form and uses GCC to emit machine code for a target.</p> - -<p>This design also implies that it is possible to design and implement - radically different code generators in the LLVM system that do not make use - of any of the built-in components. 
Doing so is not recommended at all, but - could be required for radically different targets that do not fit into the - LLVM machine description model: FPGAs for example.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="high-level-design">The high-level design of the code generator</a> -</h3> - -<div> - -<p>The LLVM target-independent code generator is designed to support efficient - and quality code generation for standard register-based microprocessors. - Code generation in this model is divided into the following stages:</p> - -<ol> - <li><b><a href="#instselect">Instruction Selection</a></b> — This phase - determines an efficient way to express the input LLVM code in the target - instruction set. This stage produces the initial code for the program in - the target instruction set, then makes use of virtual registers in SSA - form and physical registers that represent any required register - assignments due to target constraints or calling conventions. This step - turns the LLVM code into a DAG of target instructions.</li> - - <li><b><a href="#selectiondag_sched">Scheduling and Formation</a></b> — - This phase takes the DAG of target instructions produced by the - instruction selection phase, determines an ordering of the instructions, - then emits the instructions - as <tt><a href="#machineinstr">MachineInstr</a></tt>s with that ordering. - Note that we describe this in the <a href="#instselect">instruction - selection section</a> because it operates on - a <a href="#selectiondag_intro">SelectionDAG</a>.</li> - - <li><b><a href="#ssamco">SSA-based Machine Code Optimizations</a></b> — - This optional stage consists of a series of machine-code optimizations - that operate on the SSA-form produced by the instruction selector. - Optimizations like modulo-scheduling or peephole optimization work - here.</li> - - <li><b><a href="#regalloc">Register Allocation</a></b> — The target code - is transformed from an infinite virtual register file in SSA form to the - concrete register file used by the target. This phase introduces spill - code and eliminates all virtual register references from the program.</li> - - <li><b><a href="#proepicode">Prolog/Epilog Code Insertion</a></b> — Once - the machine code has been generated for the function and the amount of - stack space required is known (used for LLVM alloca's and spill slots), - the prolog and epilog code for the function can be inserted and "abstract - stack location references" can be eliminated. This stage is responsible - for implementing optimizations like frame-pointer elimination and stack - packing.</li> - - <li><b><a href="#latemco">Late Machine Code Optimizations</a></b> — - Optimizations that operate on "final" machine code can go here, such as - spill code scheduling and peephole optimizations.</li> - - <li><b><a href="#codeemit">Code Emission</a></b> — The final stage - actually puts out the code for the current function, either in the target - assembler format or in machine code.</li> -</ol> - -<p>The code generator is based on the assumption that the instruction selector - will use an optimal pattern matching selector to create high-quality - sequences of native instructions. Alternative code generator designs based - on pattern expansion and aggressive iterative peephole optimization are much - slower. 
This design permits efficient compilation (important for JIT - environments) and aggressive optimization (used when generating code offline) - by allowing components of varying levels of sophistication to be used for any - step of compilation.</p> - -<p>In addition to these stages, target implementations can insert arbitrary - target-specific passes into the flow. For example, the X86 target uses a - special pass to handle the 80x87 floating point stack architecture. Other - targets with unusual requirements can be supported with custom passes as - needed.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="tablegen">Using TableGen for target description</a> -</h3> - -<div> - -<p>The target description classes require a detailed description of the target - architecture. These target descriptions often have a large amount of common - information (e.g., an <tt>add</tt> instruction is almost identical to a - <tt>sub</tt> instruction). In order to allow the maximum amount of - commonality to be factored out, the LLVM code generator uses - the <a href="TableGenFundamentals.html">TableGen</a> tool to describe big - chunks of the target machine, which allows the use of domain-specific and - target-specific abstractions to reduce the amount of repetition.</p> - -<p>As LLVM continues to be developed and refined, we plan to move more and more - of the target description to the <tt>.td</tt> form. Doing so gives us a - number of advantages. The most important is that it makes it easier to port - LLVM because it reduces the amount of C++ code that has to be written, and - the surface area of the code generator that needs to be understood before - someone can get something working. Second, it makes it easier to change - things. In particular, if tables and other things are all emitted - by <tt>tblgen</tt>, we only need a change in one place (<tt>tblgen</tt>) to - update all of the targets to a new interface.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="targetdesc">Target description classes</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>The LLVM target description classes (located in the - <tt>include/llvm/Target</tt> directory) provide an abstract description of - the target machine independent of any particular client. These classes are - designed to capture the <i>abstract</i> properties of the target (such as the - instructions and registers it has), and do not incorporate any particular - pieces of code generation algorithms.</p> - -<p>All of the target description classes (except the - <tt><a href="#targetdata">TargetData</a></tt> class) are designed to be - subclassed by the concrete target implementation, and have virtual methods - implemented. To get to these implementations, the - <tt><a href="#targetmachine">TargetMachine</a></tt> class provides accessors - that should be implemented by the target.</p> - -<!-- ======================================================================= --> -<h3> - <a name="targetmachine">The <tt>TargetMachine</tt> class</a> -</h3> - -<div> - -<p>The <tt>TargetMachine</tt> class provides virtual methods that are used to - access the target-specific implementations of the various target description - classes via the <tt>get*Info</tt> methods (<tt>getInstrInfo</tt>, - <tt>getRegisterInfo</tt>, <tt>getFrameInfo</tt>, etc.). 
This class is designed to be specialized by a concrete target implementation
-   (e.g., <tt>X86TargetMachine</tt>) which implements the various virtual
-   methods.  The only required target description class is
-   the <a href="#targetdata"><tt>TargetData</tt></a> class, but if the code
-   generator components are to be used, the other interfaces should be
-   implemented as well.</p>
-
-</div>
-
-<!-- ======================================================================= -->
-<h3>
-  <a name="targetdata">The <tt>TargetData</tt> class</a>
-</h3>
-
-<div>
-
-<p>The <tt>TargetData</tt> class is the only required target description class,
-   and it is the only class that is not extensible (you cannot derive a new
-   class from it).  <tt>TargetData</tt> specifies information about how the
-   target lays out memory for structures, the alignment requirements for various
-   data types, the size of pointers in the target, and whether the target is
-   little-endian or big-endian.</p>
-
-</div>
-
-<!-- ======================================================================= -->
-<h3>
-  <a name="targetlowering">The <tt>TargetLowering</tt> class</a>
-</h3>
-
-<div>
-
-<p>The <tt>TargetLowering</tt> class is used by SelectionDAG based instruction
-   selectors primarily to describe how LLVM code should be lowered to
-   SelectionDAG operations.  Among other things, this class indicates:</p>
-
-<ul>
-  <li>an initial register class to use for various <tt>ValueType</tt>s,</li>
-
-  <li>which operations are natively supported by the target machine,</li>
-
-  <li>the return type of <tt>setcc</tt> operations,</li>
-
-  <li>the type to use for shift amounts, and</li>
-
-  <li>various high-level characteristics, like whether it is profitable to turn
-      division by a constant into a multiplication sequence</li>
-</ul>
-
-</div>
-
-<!-- ======================================================================= -->
-<h3>
-  <a name="targetregisterinfo">The <tt>TargetRegisterInfo</tt> class</a>
-</h3>
-
-<div>
-
-<p>The <tt>TargetRegisterInfo</tt> class is used to describe the register file
-   of the target and any interactions between the registers.</p>
-
-<p>Registers in the code generator are represented by unsigned integers.
-   Physical registers (those that actually exist in the target description) are
-   unique small numbers, and virtual registers are generally large.  Note that
-   register #0 is reserved as a flag value.</p>
-
-<p>Each register in the processor description has an associated
-   <tt>TargetRegisterDesc</tt> entry, which provides a textual name for the
-   register (used for assembly output and debugging dumps) and a set of aliases
-   (used to indicate whether one register overlaps with another).</p>
-
-<p>In addition to the per-register description, the <tt>TargetRegisterInfo</tt>
-   class exposes a set of processor-specific register classes (instances of the
-   <tt>TargetRegisterClass</tt> class).  Each register class contains sets of
-   registers that have the same properties (for example, they are all 32-bit
-   integer registers).  Each SSA virtual register created by the instruction
-   selector has an associated register class.
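-</p>
-
-<p>As an illustration of this association, a pass might inspect the register
-   class of a virtual register roughly as follows (a hypothetical sketch;
-   <tt>MRI</tt> and <tt>TRI</tt> are assumed to be the
-   <tt>MachineRegisterInfo</tt> and <tt>TargetRegisterInfo</tt> of the current
-   function, and <tt>VirtReg</tt> is some virtual register number):</p>
-
-<div class="doc_code">
-<pre>
-// Hypothetical sketch: walk the physical registers that a virtual register
-// could be assigned to, based on its register class.
-const TargetRegisterClass *RC = MRI.getRegClass(VirtReg);
-
-for (TargetRegisterClass::iterator I = RC->begin(), E = RC->end(); I != E; ++I) {
-  unsigned PhysReg = *I;
-  // getName() returns the textual name from the register's description entry;
-  // dbgs() is LLVM's debug output stream.
-  dbgs() << "candidate physical register: " << TRI.getName(PhysReg) << "\n";
-}
-</pre>
-</div>
-
-<p>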
When the register allocator runs, - it replaces virtual registers with a physical register in the set.</p> - -<p>The target-specific implementations of these classes is auto-generated from - a <a href="TableGenFundamentals.html">TableGen</a> description of the - register file.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="targetinstrinfo">The <tt>TargetInstrInfo</tt> class</a> -</h3> - -<div> - -<p>The <tt>TargetInstrInfo</tt> class is used to describe the machine - instructions supported by the target. It is essentially an array of - <tt>TargetInstrDescriptor</tt> objects, each of which describes one - instruction the target supports. Descriptors define things like the mnemonic - for the opcode, the number of operands, the list of implicit register uses - and defs, whether the instruction has certain target-independent properties - (accesses memory, is commutable, etc), and holds any target-specific - flags.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="targetframeinfo">The <tt>TargetFrameInfo</tt> class</a> -</h3> - -<div> - -<p>The <tt>TargetFrameInfo</tt> class is used to provide information about the - stack frame layout of the target. It holds the direction of stack growth, the - known stack alignment on entry to each function, and the offset to the local - area. The offset to the local area is the offset from the stack pointer on - function entry to the first location where function data (local variables, - spill locations) can be stored.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="targetsubtarget">The <tt>TargetSubtarget</tt> class</a> -</h3> - -<div> - -<p>The <tt>TargetSubtarget</tt> class is used to provide information about the - specific chip set being targeted. A sub-target informs code generation of - which instructions are supported, instruction latencies and instruction - execution itinerary; i.e., which processing units are used, in what order, - and for how long.</p> - -</div> - - -<!-- ======================================================================= --> -<h3> - <a name="targetjitinfo">The <tt>TargetJITInfo</tt> class</a> -</h3> - -<div> - -<p>The <tt>TargetJITInfo</tt> class exposes an abstract interface used by the - Just-In-Time code generator to perform target-specific activities, such as - emitting stubs. If a <tt>TargetMachine</tt> supports JIT code generation, it - should provide one of these objects through the <tt>getJITInfo</tt> - method.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="codegendesc">Machine code description classes</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>At the high-level, LLVM code is translated to a machine specific - representation formed out of - <a href="#machinefunction"><tt>MachineFunction</tt></a>, - <a href="#machinebasicblock"><tt>MachineBasicBlock</tt></a>, - and <a href="#machineinstr"><tt>MachineInstr</tt></a> instances (defined - in <tt>include/llvm/CodeGen</tt>). This representation is completely target - agnostic, representing instructions in their most abstract form: an opcode - and a series of operands. 
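-</p>
-
-<p>For illustration, the nesting of these classes can be seen in a short,
-   hypothetical sketch that walks a function and prints the opcode name and
-   operand count of every instruction (not code from the LLVM sources;
-   <tt>TII</tt> is assumed to be the target's <tt>TargetInstrInfo</tt>):</p>
-
-<div class="doc_code">
-<pre>
-// Hypothetical sketch: MachineFunction -> MachineBasicBlock -> MachineInstr.
-void dumpOpcodes(MachineFunction &MF, const TargetInstrInfo &TII) {
-  for (MachineFunction::iterator MBB = MF.begin(), MBBE = MF.end();
-       MBB != MBBE; ++MBB) {
-    for (MachineBasicBlock::iterator MI = MBB->begin(), MIE = MBB->end();
-         MI != MIE; ++MI) {
-      // Each MachineInstr is just an opcode plus a list of MachineOperands.
-      dbgs() << TII.getName(MI->getOpcode()) << " has "
-             << MI->getNumOperands() << " operands\n";
-    }
-  }
-}
-</pre>
-</div>
-
-<p>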
This representation is designed to support both an - SSA representation for machine code, as well as a register allocated, non-SSA - form.</p> - -<!-- ======================================================================= --> -<h3> - <a name="machineinstr">The <tt>MachineInstr</tt> class</a> -</h3> - -<div> - -<p>Target machine instructions are represented as instances of the - <tt>MachineInstr</tt> class. This class is an extremely abstract way of - representing machine instructions. In particular, it only keeps track of an - opcode number and a set of operands.</p> - -<p>The opcode number is a simple unsigned integer that only has meaning to a - specific backend. All of the instructions for a target should be defined in - the <tt>*InstrInfo.td</tt> file for the target. The opcode enum values are - auto-generated from this description. The <tt>MachineInstr</tt> class does - not have any information about how to interpret the instruction (i.e., what - the semantics of the instruction are); for that you must refer to the - <tt><a href="#targetinstrinfo">TargetInstrInfo</a></tt> class.</p> - -<p>The operands of a machine instruction can be of several different types: a - register reference, a constant integer, a basic block reference, etc. In - addition, a machine operand should be marked as a def or a use of the value - (though only registers are allowed to be defs).</p> - -<p>By convention, the LLVM code generator orders instruction operands so that - all register definitions come before the register uses, even on architectures - that are normally printed in other orders. For example, the SPARC add - instruction: "<tt>add %i1, %i2, %i3</tt>" adds the "%i1", and "%i2" registers - and stores the result into the "%i3" register. In the LLVM code generator, - the operands should be stored as "<tt>%i3, %i1, %i2</tt>": with the - destination first.</p> - -<p>Keeping destination (definition) operands at the beginning of the operand - list has several advantages. In particular, the debugging printer will print - the instruction like this:</p> - -<div class="doc_code"> -<pre> -%r3 = add %i1, %i2 -</pre> -</div> - -<p>Also if the first operand is a def, it is easier to <a href="#buildmi">create - instructions</a> whose only def is the first operand.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="buildmi">Using the <tt>MachineInstrBuilder.h</tt> functions</a> -</h4> - -<div> - -<p>Machine instructions are created by using the <tt>BuildMI</tt> functions, - located in the <tt>include/llvm/CodeGen/MachineInstrBuilder.h</tt> file. The - <tt>BuildMI</tt> functions make it easy to build arbitrary machine - instructions. Usage of the <tt>BuildMI</tt> functions look like this:</p> - -<div class="doc_code"> -<pre> -// Create a 'DestReg = mov 42' (rendered in X86 assembly as 'mov DestReg, 42') -// instruction. The '1' specifies how many operands will be added. -MachineInstr *MI = BuildMI(X86::MOV32ri, 1, DestReg).addImm(42); - -// Create the same instr, but insert it at the end of a basic block. -MachineBasicBlock &MBB = ... -BuildMI(MBB, X86::MOV32ri, 1, DestReg).addImm(42); - -// Create the same instr, but insert it before a specified iterator point. -MachineBasicBlock::iterator MBBI = ... -BuildMI(MBB, MBBI, X86::MOV32ri, 1, DestReg).addImm(42); - -// Create a 'cmp Reg, 0' instruction, no destination reg. -MI = BuildMI(X86::CMP32ri, 2).addReg(Reg).addImm(0); -// Create an 'sahf' instruction which takes no operands and stores nothing. 
-MI = BuildMI(X86::SAHF, 0); - -// Create a self looping branch instruction. -BuildMI(MBB, X86::JNE, 1).addMBB(&MBB); -</pre> -</div> - -<p>The key thing to remember with the <tt>BuildMI</tt> functions is that you - have to specify the number of operands that the machine instruction will - take. This allows for efficient memory allocation. You also need to specify - if operands default to be uses of values, not definitions. If you need to - add a definition operand (other than the optional destination register), you - must explicitly mark it as such:</p> - -<div class="doc_code"> -<pre> -MI.addReg(Reg, RegState::Define); -</pre> -</div> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="fixedregs">Fixed (preassigned) registers</a> -</h4> - -<div> - -<p>One important issue that the code generator needs to be aware of is the - presence of fixed registers. In particular, there are often places in the - instruction stream where the register allocator <em>must</em> arrange for a - particular value to be in a particular register. This can occur due to - limitations of the instruction set (e.g., the X86 can only do a 32-bit divide - with the <tt>EAX</tt>/<tt>EDX</tt> registers), or external factors like - calling conventions. In any case, the instruction selector should emit code - that copies a virtual register into or out of a physical register when - needed.</p> - -<p>For example, consider this simple LLVM example:</p> - -<div class="doc_code"> -<pre> -define i32 @test(i32 %X, i32 %Y) { - %Z = udiv i32 %X, %Y - ret i32 %Z -} -</pre> -</div> - -<p>The X86 instruction selector produces this machine code for the <tt>div</tt> - and <tt>ret</tt> (use "<tt>llc X.bc -march=x86 -print-machineinstrs</tt>" to - get this):</p> - -<div class="doc_code"> -<pre> -;; Start of div -%EAX = mov %reg1024 ;; Copy X (in reg1024) into EAX -%reg1027 = sar %reg1024, 31 -%EDX = mov %reg1027 ;; Sign extend X into EDX -idiv %reg1025 ;; Divide by Y (in reg1025) -%reg1026 = mov %EAX ;; Read the result (Z) out of EAX - -;; Start of ret -%EAX = mov %reg1026 ;; 32-bit return value goes in EAX -ret -</pre> -</div> - -<p>By the end of code generation, the register allocator has coalesced the - registers and deleted the resultant identity moves producing the following - code:</p> - -<div class="doc_code"> -<pre> -;; X is in EAX, Y is in ECX -mov %EAX, %EDX -sar %EDX, 31 -idiv %ECX -ret -</pre> -</div> - -<p>This approach is extremely general (if it can handle the X86 architecture, it - can handle anything!) and allows all of the target specific knowledge about - the instruction stream to be isolated in the instruction selector. Note that - physical registers should have a short lifetime for good code generation, and - all physical registers are assumed dead on entry to and exit from basic - blocks (before register allocation). Thus, if you need a value to be live - across basic block boundaries, it <em>must</em> live in a virtual - register.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="callclobber">Call-clobbered registers</a> -</h4> - -<div> - -<p>Some machine instructions, like calls, clobber a large number of physical - registers. Rather than adding <code><def,dead></code> operands for - all of them, it is possible to use an <code>MO_RegisterMask</code> operand - instead. 
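-</p>
-
-<p>For illustration, a target's call lowering code might attach such an operand
-   roughly as follows (a hedged sketch only; the opcode, callee and surrounding
-   variables are placeholders, and the exact <tt>BuildMI</tt> overloads differ
-   between LLVM versions):</p>
-
-<div class="doc_code">
-<pre>
-// Hypothetical sketch: build a call and mark every register not preserved by
-// the C calling convention as clobbered, using one register mask operand.
-MachineInstrBuilder MIB =
-  BuildMI(MBB, InsertPt, DL, TII->get(CallOpcode)).addGlobalAddress(Callee);
-MIB.addRegMask(TRI->getCallPreservedMask(CallingConv::C));
-</pre>
-</div>
-
-<p>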
The register mask operand holds a bit mask of preserved registers, - and everything else is considered to be clobbered by the instruction. </p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ssa">Machine code in SSA form</a> -</h4> - -<div> - -<p><tt>MachineInstr</tt>'s are initially selected in SSA-form, and are - maintained in SSA-form until register allocation happens. For the most part, - this is trivially simple since LLVM is already in SSA form; LLVM PHI nodes - become machine code PHI nodes, and virtual registers are only allowed to have - a single definition.</p> - -<p>After register allocation, machine code is no longer in SSA-form because - there are no virtual registers left in the code.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="machinebasicblock">The <tt>MachineBasicBlock</tt> class</a> -</h3> - -<div> - -<p>The <tt>MachineBasicBlock</tt> class contains a list of machine instructions - (<tt><a href="#machineinstr">MachineInstr</a></tt> instances). It roughly - corresponds to the LLVM code input to the instruction selector, but there can - be a one-to-many mapping (i.e. one LLVM basic block can map to multiple - machine basic blocks). The <tt>MachineBasicBlock</tt> class has a - "<tt>getBasicBlock</tt>" method, which returns the LLVM basic block that it - comes from.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="machinefunction">The <tt>MachineFunction</tt> class</a> -</h3> - -<div> - -<p>The <tt>MachineFunction</tt> class contains a list of machine basic blocks - (<tt><a href="#machinebasicblock">MachineBasicBlock</a></tt> instances). It - corresponds one-to-one with the LLVM function input to the instruction - selector. In addition to a list of basic blocks, - the <tt>MachineFunction</tt> contains a a <tt>MachineConstantPool</tt>, - a <tt>MachineFrameInfo</tt>, a <tt>MachineFunctionInfo</tt>, and a - <tt>MachineRegisterInfo</tt>. See - <tt>include/llvm/CodeGen/MachineFunction.h</tt> for more information.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="machineinstrbundle"><tt>MachineInstr Bundles</tt></a> -</h3> - -<div> - -<p>LLVM code generator can model sequences of instructions as MachineInstr - bundles. A MI bundle can model a VLIW group / pack which contains an - arbitrary number of parallel instructions. It can also be used to model - a sequential list of instructions (potentially with data dependencies) that - cannot be legally separated (e.g. ARM Thumb2 IT blocks).</p> - -<p>Conceptually a MI bundle is a MI with a number of other MIs nested within: -</p> - -<div class="doc_code"> -<pre> --------------- -| Bundle | --------- --------------- \ - | ---------------- - | | MI | - | ---------------- - | | - | ---------------- - | | MI | - | ---------------- - | | - | ---------------- - | | MI | - | ---------------- - | --------------- -| Bundle | -------- --------------- \ - | ---------------- - | | MI | - | ---------------- - | | - | ---------------- - | | MI | - | ---------------- - | | - | ... - | --------------- -| Bundle | -------- --------------- \ - | - ... -</pre> -</div> - -<p> MI bundle support does not change the physical representations of - MachineBasicBlock and MachineInstr. All the MIs (including top level and - nested ones) are stored as sequential list of MIs. 
The "bundled" MIs are
-   marked with the 'InsideBundle' flag. A top level MI with the special BUNDLE
-   opcode is used to represent the start of a bundle. It's legal to mix BUNDLE
-   MIs with individual MIs that are not inside bundles and do not themselves
-   represent bundles.
-</p>
-
-<p> MachineInstr passes should operate on an MI bundle as a single unit. Member
-    methods have been taught to correctly handle bundles and MIs inside bundles.
-    The MachineBasicBlock iterator has been modified to skip over bundled MIs to
-    enforce the bundle-as-a-single-unit concept. An alternative iterator
-    instr_iterator has been added to MachineBasicBlock to allow passes to
-    iterate over all of the MIs in a MachineBasicBlock, including those which
-    are nested inside bundles. The top level BUNDLE instruction must have the
-    correct set of register MachineOperands that represent the cumulative
-    inputs and outputs of the bundled MIs.</p>
-
-<p> Packing / bundling of MachineInstrs should be done as part of the register
-    allocation super-pass. More specifically, the pass which determines what
-    MIs should be bundled together must be run after the code generator exits
-    SSA form (i.e. after the two-address pass, PHI elimination, and copy
-    coalescing).  Bundles should only be finalized (i.e. adding BUNDLE MIs and
-    input and output register MachineOperands) after virtual registers have been
-    rewritten into physical registers. This requirement eliminates the need to
-    add virtual register operands to BUNDLE instructions which would effectively
-    double the virtual register def and use lists.</p>
-
-</div>
-
-</div>
-
-<!-- *********************************************************************** -->
-<h2>
-  <a name="mc">The "MC" Layer</a>
-</h2>
-<!-- *********************************************************************** -->
-
-<div>
-
-<p>
-The MC Layer is used to represent and process code at the raw machine code
-level, devoid of "high level" information like "constant pools", "jump tables",
-"global variables" or anything like that.  At this level, LLVM handles things
-like label names, machine instructions, and sections in the object file.  The
-code in this layer is used for a number of important purposes: the tail end of
-the code generator uses it to write a .s or .o file, and it is also used by the
-llvm-mc tool to implement standalone machine code assemblers and disassemblers.
-</p>
-
-<p>
-This section describes some of the important classes.  There are also a number
-of important subsystems that interact at this layer; they are described later
-in this manual.
-</p>
-
-<!-- ======================================================================= -->
-<h3>
-  <a name="mcstreamer">The <tt>MCStreamer</tt> API</a>
-</h3>
-
-<div>
-
-<p>
-MCStreamer is best thought of as an assembler API.  It is an abstract API which
-is <em>implemented</em> in different ways (e.g. to output a .s file, output an
-ELF .o file, etc) but whose API corresponds directly to what you see in a .s
-file.  MCStreamer has one method per directive, such as EmitLabel,
-EmitSymbolAttribute, SwitchSection, EmitValue (for .byte, .word), etc, which
-directly correspond to assembly level directives.  It also has an
-EmitInstruction method, which is used to output an MCInst to the streamer.
-</p>
-
-<p>
-This API is most important for two clients: the llvm-mc stand-alone assembler is
-effectively a parser that parses a line, then invokes a method on MCStreamer.
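-</p>
-
-<p>As a rough illustration of that correspondence, the following hypothetical
-   snippet shows how a few assembly lines might be emitted through the streamer
-   (a sketch only; <tt>Streamer</tt>, <tt>Ctx</tt> and <tt>TextSection</tt> are
-   assumed to have been obtained elsewhere):</p>
-
-<div class="doc_code">
-<pre>
-// Hypothetical sketch: emit the equivalent of
-//     .section .text
-//   foo:
-//     .byte 4
-MCSymbol *Foo = Ctx.GetOrCreateSymbol(StringRef("foo"));
-Streamer.SwitchSection(TextSection);   // ".section" directive
-Streamer.EmitLabel(Foo);               // "foo:" label definition
-Streamer.EmitIntValue(4, 1);           // ".byte 4"
-</pre>
-</div>
-
-<p>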
In the code generator, the <a href="#codeemit">Code Emission</a> phase lowers
-higher level LLVM IR and Machine* constructs down to the MC layer, emitting
-directives through MCStreamer.</p>
-
-<p>
-On the implementation side of MCStreamer, there are two major implementations:
-one for writing out a .s file (MCAsmStreamer), and one for writing out a .o
-file (MCObjectStreamer).  MCAsmStreamer is a straightforward implementation
-that prints out a directive for each method (e.g. EmitValue -> .byte), but
-MCObjectStreamer implements a full assembler.
-</p>
-
-</div>
-
-<!-- ======================================================================= -->
-<h3>
-  <a name="mccontext">The <tt>MCContext</tt> class</a>
-</h3>
-
-<div>
-
-<p>
-The MCContext class is the owner of a variety of uniqued data structures at the
-MC layer, including symbols, sections, etc.  As such, this is the class that you
-interact with to create symbols and sections.  This class cannot be subclassed.
-</p>
-
-</div>
-
-<!-- ======================================================================= -->
-<h3>
-  <a name="mcsymbol">The <tt>MCSymbol</tt> class</a>
-</h3>
-
-<div>
-
-<p>
-The MCSymbol class represents a symbol (aka label) in the assembly file.  There
-are two interesting kinds of symbols: assembler temporary symbols and normal
-symbols.  Assembler temporary symbols are used and processed by the assembler
-but are discarded when the object file is produced.  The distinction is usually
-represented by adding a prefix to the label, for example "L" labels are
-assembler temporary labels in MachO.
-</p>
-
-<p>MCSymbols are created by MCContext and uniqued there.  This means that
-MCSymbols can be compared for pointer equivalence to find out if they are the
-same symbol.  Note that pointer inequality does not guarantee the labels will
-end up at different addresses though.  It's perfectly legal to output something
-like this to the .s file:</p>
-
-<pre>
-  foo:
-  bar:
-    .byte 4
-</pre>
-
-<p>In this case, both the foo and bar symbols will have the same address.</p>
-
-</div>
-
-<!-- ======================================================================= -->
-<h3>
-  <a name="mcsection">The <tt>MCSection</tt> class</a>
-</h3>
-
-<div>
-
-<p>
-The MCSection class represents an object-file specific section.  It is subclassed
-by object file specific implementations (e.g. <tt>MCSectionMachO</tt>,
-<tt>MCSectionCOFF</tt>, <tt>MCSectionELF</tt>) and these are created and uniqued
-by MCContext.  The MCStreamer has a notion of the current section, which can be
-changed with the SwitchSection method (which corresponds to a ".section"
-directive in a .s file).
-</p>
-
-</div>
-
-<!-- ======================================================================= -->
-<h3>
-  <a name="mcinst">The <tt>MCInst</tt> class</a>
-</h3>
-
-<div>
-
-<p>
-The MCInst class is a target-independent representation of an instruction.  It
-is a simple class (much more so than <a href="#machineinstr">MachineInstr</a>)
-that holds a target-specific opcode and a vector of MCOperands.  MCOperand, in
-turn, is a simple discriminated union of three cases: 1) a simple immediate,
-2) a target register ID, 3) a symbolic expression (e.g. "Lfoo-Lbar+42") as an
-MCExpr.
-</p>
-
-<p>MCInst is the common currency used to represent machine instructions at the
-MC layer.  It is the type used by the instruction encoder and the instruction
-printer, and the type generated by the assembly parser and disassembler.
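-</p>
-
-<p>For illustration, constructing an MCInst by hand might look roughly like this
-   (a hypothetical sketch; the X86 opcode and operand values are placeholders):</p>
-
-<div class="doc_code">
-<pre>
-// Hypothetical sketch: an MCInst is just an opcode plus a list of MCOperands.
-MCInst Inst;
-Inst.setOpcode(X86::ADD32ri);
-Inst.addOperand(MCOperand::CreateReg(X86::EAX));  // destination register
-Inst.addOperand(MCOperand::CreateReg(X86::EAX));  // source register
-Inst.addOperand(MCOperand::CreateImm(42));        // immediate operand
-</pre>
-</div>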
-</div>
-
-</div>
-
-<!-- *********************************************************************** -->
-<h2>
-  <a name="codegenalgs">Target-independent code generation algorithms</a>
-</h2>
-<!-- *********************************************************************** -->
-
-<div>
-
-<p>This section documents the phases described in the
-   <a href="#high-level-design">high-level design of the code generator</a>.
-   It explains how they work and some of the rationale behind their design.</p>
-
-<!-- ======================================================================= -->
-<h3>
-  <a name="instselect">Instruction Selection</a>
-</h3>
-
-<div>
-
-<p>Instruction Selection is the process of translating LLVM code presented to
-   the code generator into target-specific machine instructions.  There are
-   several well-known ways to do this in the literature.  LLVM uses a
-   SelectionDAG based instruction selector.</p>
-
-<p>Portions of the DAG instruction selector are generated from the target
-   description (<tt>*.td</tt>) files.  Our goal is for the entire instruction
-   selector to be generated from these <tt>.td</tt> files, though currently
-   there are still things that require custom C++ code.</p>
-
-<!-- _______________________________________________________________________ -->
-<h4>
-  <a name="selectiondag_intro">Introduction to SelectionDAGs</a>
-</h4>
-
-<div>
-
-<p>The SelectionDAG provides an abstraction for code representation in a way
-   that is amenable to instruction selection using automatic techniques
-   (e.g. dynamic-programming based optimal pattern matching selectors).  It is
-   also well-suited to other phases of code generation; in particular,
-   instruction scheduling (SelectionDAGs are very close to scheduling DAGs
-   post-selection).  Additionally, the SelectionDAG provides a host
-   representation where a large variety of very-low-level (but
-   target-independent) <a href="#selectiondag_optimize">optimizations</a> may be
-   performed; ones which require extensive information about the instructions
-   efficiently supported by the target.</p>
-
-<p>The SelectionDAG is a directed acyclic graph whose nodes are instances of the
-   <tt>SDNode</tt> class.  The primary payload of the <tt>SDNode</tt> is its
-   operation code (Opcode), which indicates what operation the node performs,
-   and its operands.  The various operation node types are
-   described at the top of the <tt>include/llvm/CodeGen/SelectionDAGNodes.h</tt>
-   file.</p>
-
-<p>Although most operations define a single value, each node in the graph may
-   define multiple values.  For example, a combined div/rem operation will
-   define both the quotient and the remainder.  Many other situations require
-   multiple values as well.  Each node also has some number of operands, which
-   are edges to the node defining the used value.  Because nodes may define
-   multiple values, edges are represented by instances of the <tt>SDValue</tt>
-   class, which is a <tt><SDNode, unsigned></tt> pair, indicating the node
-   and result value being used, respectively.  Each value produced by
-   an <tt>SDNode</tt> has an associated <tt>MVT</tt> (Machine Value Type)
-   indicating what the type of the value is.</p>
-
-<p>SelectionDAGs contain two different kinds of values: those that represent
-   data flow and those that represent control flow dependencies.  Data values
-   are simple edges with an integer or floating point value type.  Control edges
-   are represented as "chain" edges which are of type <tt>MVT::Other</tt>.
- These edges provide an ordering between nodes that have side effects (such as - loads, stores, calls, returns, etc). All nodes that have side effects should - take a token chain as input and produce a new one as output. By convention, - token chain inputs are always operand #0, and chain results are always the - last value produced by an operation.</p> - -<p>A SelectionDAG has designated "Entry" and "Root" nodes. The Entry node is - always a marker node with an Opcode of <tt>ISD::EntryToken</tt>. The Root - node is the final side-effecting node in the token chain. For example, in a - single basic block function it would be the return node.</p> - -<p>One important concept for SelectionDAGs is the notion of a "legal" vs. - "illegal" DAG. A legal DAG for a target is one that only uses supported - operations and supported types. On a 32-bit PowerPC, for example, a DAG with - a value of type i1, i8, i16, or i64 would be illegal, as would a DAG that - uses a SREM or UREM operation. The - <a href="#selectinodag_legalize_types">legalize types</a> and - <a href="#selectiondag_legalize">legalize operations</a> phases are - responsible for turning an illegal DAG into a legal DAG.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="selectiondag_process">SelectionDAG Instruction Selection Process</a> -</h4> - -<div> - -<p>SelectionDAG-based instruction selection consists of the following steps:</p> - -<ol> - <li><a href="#selectiondag_build">Build initial DAG</a> — This stage - performs a simple translation from the input LLVM code to an illegal - SelectionDAG.</li> - - <li><a href="#selectiondag_optimize">Optimize SelectionDAG</a> — This - stage performs simple optimizations on the SelectionDAG to simplify it, - and recognize meta instructions (like rotates - and <tt>div</tt>/<tt>rem</tt> pairs) for targets that support these meta - operations. This makes the resultant code more efficient and - the <a href="#selectiondag_select">select instructions from DAG</a> phase - (below) simpler.</li> - - <li><a href="#selectiondag_legalize_types">Legalize SelectionDAG Types</a> - — This stage transforms SelectionDAG nodes to eliminate any types - that are unsupported on the target.</li> - - <li><a href="#selectiondag_optimize">Optimize SelectionDAG</a> — The - SelectionDAG optimizer is run to clean up redundancies exposed by type - legalization.</li> - - <li><a href="#selectiondag_legalize">Legalize SelectionDAG Ops</a> — - This stage transforms SelectionDAG nodes to eliminate any operations - that are unsupported on the target.</li> - - <li><a href="#selectiondag_optimize">Optimize SelectionDAG</a> — The - SelectionDAG optimizer is run to eliminate inefficiencies introduced by - operation legalization.</li> - - <li><a href="#selectiondag_select">Select instructions from DAG</a> — - Finally, the target instruction selector matches the DAG operations to - target instructions. This process translates the target-independent input - DAG into another DAG of target instructions.</li> - - <li><a href="#selectiondag_sched">SelectionDAG Scheduling and Formation</a> - — The last phase assigns a linear order to the instructions in the - target-instruction DAG and emits them into the MachineFunction being - compiled. 
This step uses traditional prepass scheduling techniques.</li> -</ol> - -<p>After all of these steps are complete, the SelectionDAG is destroyed and the - rest of the code generation passes are run.</p> - -<p>One great way to visualize what is going on here is to take advantage of a - few LLC command line options. The following options pop up a window - displaying the SelectionDAG at specific times (if you only get errors printed - to the console while using this, you probably - <a href="ProgrammersManual.html#ViewGraph">need to configure your system</a> - to add support for it).</p> - -<ul> - <li><tt>-view-dag-combine1-dags</tt> displays the DAG after being built, - before the first optimization pass.</li> - - <li><tt>-view-legalize-dags</tt> displays the DAG before Legalization.</li> - - <li><tt>-view-dag-combine2-dags</tt> displays the DAG before the second - optimization pass.</li> - - <li><tt>-view-isel-dags</tt> displays the DAG before the Select phase.</li> - - <li><tt>-view-sched-dags</tt> displays the DAG before Scheduling.</li> -</ul> - -<p>The <tt>-view-sunit-dags</tt> displays the Scheduler's dependency graph. - This graph is based on the final SelectionDAG, with nodes that must be - scheduled together bundled into a single scheduling-unit node, and with - immediate operands and other nodes that aren't relevant for scheduling - omitted.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="selectiondag_build">Initial SelectionDAG Construction</a> -</h4> - -<div> - -<p>The initial SelectionDAG is naïvely peephole expanded from the LLVM - input by the <tt>SelectionDAGLowering</tt> class in the - <tt>lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp</tt> file. The intent of - this pass is to expose as much low-level, target-specific details to the - SelectionDAG as possible. This pass is mostly hard-coded (e.g. an - LLVM <tt>add</tt> turns into an <tt>SDNode add</tt> while a - <tt>getelementptr</tt> is expanded into the obvious arithmetic). This pass - requires target-specific hooks to lower calls, returns, varargs, etc. For - these features, the <tt><a href="#targetlowering">TargetLowering</a></tt> - interface is used.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="selectiondag_legalize_types">SelectionDAG LegalizeTypes Phase</a> -</h4> - -<div> - -<p>The Legalize phase is in charge of converting a DAG to only use the types - that are natively supported by the target.</p> - -<p>There are two main ways of converting values of unsupported scalar types to - values of supported types: converting small types to larger types - ("promoting"), and breaking up large integer types into smaller ones - ("expanding"). For example, a target might require that all f32 values are - promoted to f64 and that all i1/i8/i16 values are promoted to i32. The same - target might require that all i64 values be expanded into pairs of i32 - values. These changes can insert sign and zero extensions as needed to make - sure that the final code has the same behavior as the input.</p> - -<p>There are two main ways of converting values of unsupported vector types to - value of supported types: splitting vector types, multiple times if - necessary, until a legal type is found, and extending vector types by adding - elements to the end to round them out to legal types ("widening"). 
If a - vector gets split all the way down to single-element parts with no supported - vector type being found, the elements are converted to scalars - ("scalarizing").</p> - -<p>A target implementation tells the legalizer which types are supported (and - which register class to use for them) by calling the - <tt>addRegisterClass</tt> method in its TargetLowering constructor.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="selectiondag_legalize">SelectionDAG Legalize Phase</a> -</h4> - -<div> - -<p>The Legalize phase is in charge of converting a DAG to only use the - operations that are natively supported by the target.</p> - -<p>Targets often have weird constraints, such as not supporting every operation - on every supported datatype (e.g. X86 does not support byte conditional moves - and PowerPC does not support sign-extending loads from a 16-bit memory - location). Legalize takes care of this by open-coding another sequence of - operations to emulate the operation ("expansion"), by promoting one type to a - larger type that supports the operation ("promotion"), or by using a - target-specific hook to implement the legalization ("custom").</p> - -<p>A target implementation tells the legalizer which operations are not - supported (and which of the above three actions to take) by calling the - <tt>setOperationAction</tt> method in its <tt>TargetLowering</tt> - constructor.</p> - -<p>Prior to the existence of the Legalize passes, we required that every target - <a href="#selectiondag_optimize">selector</a> supported and handled every - operator and type even if they are not natively supported. The introduction - of the Legalize phases allows all of the canonicalization patterns to be - shared across targets, and makes it very easy to optimize the canonicalized - code because it is still in the form of a DAG.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="selectiondag_optimize"> - SelectionDAG Optimization Phase: the DAG Combiner - </a> -</h4> - -<div> - -<p>The SelectionDAG optimization phase is run multiple times for code - generation, immediately after the DAG is built and once after each - legalization. The first run of the pass allows the initial code to be - cleaned up (e.g. performing optimizations that depend on knowing that the - operators have restricted type inputs). Subsequent runs of the pass clean up - the messy code generated by the Legalize passes, which allows Legalize to be - very simple (it can focus on making code legal instead of focusing on - generating <em>good</em> and legal code).</p> - -<p>One important class of optimizations performed is optimizing inserted sign - and zero extension instructions. We currently use ad-hoc techniques, but - could move to more rigorous techniques in the future. 
Here are some good - papers on the subject:</p> - -<p>"<a href="http://www.eecs.harvard.edu/~nr/pubs/widen-abstract.html">Widening - integer arithmetic</a>"<br> - Kevin Redwine and Norman Ramsey<br> - International Conference on Compiler Construction (CC) 2004</p> - -<p>"<a href="http://portal.acm.org/citation.cfm?doid=512529.512552">Effective - sign extension elimination</a>"<br> - Motohiro Kawahito, Hideaki Komatsu, and Toshio Nakatani<br> - Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design - and Implementation.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="selectiondag_select">SelectionDAG Select Phase</a> -</h4> - -<div> - -<p>The Select phase is the bulk of the target-specific code for instruction - selection. This phase takes a legal SelectionDAG as input, pattern matches - the instructions supported by the target to this DAG, and produces a new DAG - of target code. For example, consider the following LLVM fragment:</p> - -<div class="doc_code"> -<pre> -%t1 = fadd float %W, %X -%t2 = fmul float %t1, %Y -%t3 = fadd float %t2, %Z -</pre> -</div> - -<p>This LLVM code corresponds to a SelectionDAG that looks basically like - this:</p> - -<div class="doc_code"> -<pre> -(fadd:f32 (fmul:f32 (fadd:f32 W, X), Y), Z) -</pre> -</div> - -<p>If a target supports floating point multiply-and-add (FMA) operations, one of - the adds can be merged with the multiply. On the PowerPC, for example, the - output of the instruction selector might look like this DAG:</p> - -<div class="doc_code"> -<pre> -(FMADDS (FADDS W, X), Y, Z) -</pre> -</div> - -<p>The <tt>FMADDS</tt> instruction is a ternary instruction that multiplies its -first two operands and adds the third (as single-precision floating-point -numbers). The <tt>FADDS</tt> instruction is a simple binary single-precision -add instruction. To perform this pattern match, the PowerPC backend includes -the following instruction definitions:</p> - -<div class="doc_code"> -<pre> -def FMADDS : AForm_1<59, 29, - (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRC, F4RC:$FRB), - "fmadds $FRT, $FRA, $FRC, $FRB", - [<b>(set F4RC:$FRT, (fadd (fmul F4RC:$FRA, F4RC:$FRC), - F4RC:$FRB))</b>]>; -def FADDS : AForm_2<59, 21, - (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRB), - "fadds $FRT, $FRA, $FRB", - [<b>(set F4RC:$FRT, (fadd F4RC:$FRA, F4RC:$FRB))</b>]>; -</pre> -</div> - -<p>The portion of the instruction definition in bold indicates the pattern used - to match the instruction. The DAG operators - (like <tt>fmul</tt>/<tt>fadd</tt>) are defined in - the <tt>include/llvm/Target/TargetSelectionDAG.td</tt> file. " - <tt>F4RC</tt>" is the register class of the input and result values.</p> - -<p>The TableGen DAG instruction selector generator reads the instruction - patterns in the <tt>.td</tt> file and automatically builds parts of the - pattern matching code for your target. It has the following strengths:</p> - -<ul> - <li>At compiler-compiler time, it analyzes your instruction patterns and tells - you if your patterns make sense or not.</li> - - <li>It can handle arbitrary constraints on operands for the pattern match. In - particular, it is straight-forward to say things like "match any immediate - that is a 13-bit sign-extended value". For examples, see the - <tt>immSExt16</tt> and related <tt>tblgen</tt> classes in the PowerPC - backend.</li> - - <li>It knows several important identities for the patterns defined. 
For - example, it knows that addition is commutative, so it allows the - <tt>FMADDS</tt> pattern above to match "<tt>(fadd X, (fmul Y, Z))</tt>" as - well as "<tt>(fadd (fmul X, Y), Z)</tt>", without the target author having - to specially handle this case.</li> - - <li>It has a full-featured type-inferencing system. In particular, you should - rarely have to explicitly tell the system what type parts of your patterns - are. In the <tt>FMADDS</tt> case above, we didn't have to tell - <tt>tblgen</tt> that all of the nodes in the pattern are of type 'f32'. - It was able to infer and propagate this knowledge from the fact that - <tt>F4RC</tt> has type 'f32'.</li> - - <li>Targets can define their own (and rely on built-in) "pattern fragments". - Pattern fragments are chunks of reusable patterns that get inlined into - your patterns during compiler-compiler time. For example, the integer - "<tt>(not x)</tt>" operation is actually defined as a pattern fragment - that expands as "<tt>(xor x, -1)</tt>", since the SelectionDAG does not - have a native '<tt>not</tt>' operation. Targets can define their own - short-hand fragments as they see fit. See the definition of - '<tt>not</tt>' and '<tt>ineg</tt>' for examples.</li> - - <li>In addition to instructions, targets can specify arbitrary patterns that - map to one or more instructions using the 'Pat' class. For example, the - PowerPC has no way to load an arbitrary integer immediate into a register - in one instruction. To tell tblgen how to do this, it defines: - <br> - <br> -<div class="doc_code"> -<pre> -// Arbitrary immediate support. Implement in terms of LIS/ORI. -def : Pat<(i32 imm:$imm), - (ORI (LIS (HI16 imm:$imm)), (LO16 imm:$imm))>; -</pre> -</div> - <br> - If none of the single-instruction patterns for loading an immediate into a - register match, this will be used. This rule says "match an arbitrary i32 - immediate, turning it into an <tt>ORI</tt> ('or a 16-bit immediate') and - an <tt>LIS</tt> ('load 16-bit immediate, where the immediate is shifted to - the left 16 bits') instruction". To make this work, the - <tt>LO16</tt>/<tt>HI16</tt> node transformations are used to manipulate - the input immediate (in this case, take the high or low 16-bits of the - immediate).</li> - - <li>While the system does automate a lot, it still allows you to write custom - C++ code to match special cases if there is something that is hard to - express.</li> -</ul> - -<p>While it has many strengths, the system currently has some limitations, - primarily because it is a work in progress and is not yet finished:</p> - -<ul> - <li>Overall, there is no way to define or match SelectionDAG nodes that define - multiple values (e.g. <tt>SMUL_LOHI</tt>, <tt>LOAD</tt>, <tt>CALL</tt>, - etc). This is the biggest reason that you currently still <em>have - to</em> write custom C++ code for your instruction selector.</li> - - <li>There is no great way to support matching complex addressing modes yet. - In the future, we will extend pattern fragments to allow them to define - multiple values (e.g. the four operands of the <a href="#x86_memory">X86 - addressing mode</a>, which are currently matched with custom C++ code). 
- In addition, we'll extend fragments so that a fragment can match multiple - different patterns.</li> - - <li>We don't automatically infer flags like isStore/isLoad yet.</li> - - <li>We don't automatically generate the set of supported registers and - operations for the <a href="#selectiondag_legalize">Legalizer</a> - yet.</li> - - <li>We don't have a way of tying in custom legalized nodes yet.</li> -</ul> - -<p>Despite these limitations, the instruction selector generator is still quite - useful for most of the binary and logical operations in typical instruction - sets. If you run into any problems or can't figure out how to do something, - please let Chris know!</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="selectiondag_sched">SelectionDAG Scheduling and Formation Phase</a> -</h4> - -<div> - -<p>The scheduling phase takes the DAG of target instructions from the selection - phase and assigns an order. The scheduler can pick an order depending on - various constraints of the machines (i.e. order for minimal register pressure - or try to cover instruction latencies). Once an order is established, the - DAG is converted to a list - of <tt><a href="#machineinstr">MachineInstr</a></tt>s and the SelectionDAG is - destroyed.</p> - -<p>Note that this phase is logically separate from the instruction selection - phase, but is tied to it closely in the code because it operates on - SelectionDAGs.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="selectiondag_future">Future directions for the SelectionDAG</a> -</h4> - -<div> - -<ol> - <li>Optional function-at-a-time selection.</li> - - <li>Auto-generate entire selector from <tt>.td</tt> file.</li> -</ol> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="ssamco">SSA-based Machine Code Optimizations</a> -</h3> -<div><p>To Be Written</p></div> - -<!-- ======================================================================= --> -<h3> - <a name="liveintervals">Live Intervals</a> -</h3> - -<div> - -<p>Live Intervals are the ranges (intervals) where a variable is <i>live</i>. - They are used by some <a href="#regalloc">register allocator</a> passes to - determine if two or more virtual registers which require the same physical - register are live at the same point in the program (i.e., they conflict). - When this situation occurs, one virtual register must be <i>spilled</i>.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="livevariable_analysis">Live Variable Analysis</a> -</h4> - -<div> - -<p>The first step in determining the live intervals of variables is to calculate - the set of registers that are immediately dead after the instruction (i.e., - the instruction calculates the value, but it is never used) and the set of - registers that are used by the instruction, but are never used after the - instruction (i.e., they are killed). Live variable information is computed - for each <i>virtual</i> register and <i>register allocatable</i> physical - register in the function. This is done in a very efficient manner because it - uses SSA to sparsely compute lifetime information for virtual registers - (which are in SSA form) and only has to track physical registers within a - block. Before register allocation, LLVM can assume that physical registers - are only live within a single basic block. 
This allows it to do a single, - local analysis to resolve physical register lifetimes within each basic - block. If a physical register is not register allocatable (e.g., a stack - pointer or condition codes), it is not tracked.</p> - -<p>Physical registers may be live in to or out of a function. Live in values are - typically arguments in registers. Live out values are typically return values - in registers. Live in values are marked as such, and are given a dummy - "defining" instruction during live intervals analysis. If the last basic - block of a function is a <tt>return</tt>, then it's marked as using all live - out values in the function.</p> - -<p><tt>PHI</tt> nodes need to be handled specially, because the calculation of - the live variable information from a depth first traversal of the CFG of the - function won't guarantee that a virtual register used by the <tt>PHI</tt> - node is defined before it's used. When a <tt>PHI</tt> node is encountered, - only the definition is handled, because the uses will be handled in other - basic blocks.</p> - -<p>For each <tt>PHI</tt> node of the current basic block, we simulate an - assignment at the end of the current basic block and traverse the successor - basic blocks. If a successor basic block has a <tt>PHI</tt> node and one of - the <tt>PHI</tt> node's operands is coming from the current basic block, then - the variable is marked as <i>alive</i> within the current basic block and all - of its predecessor basic blocks, until the basic block with the defining - instruction is encountered.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="liveintervals_analysis">Live Intervals Analysis</a> -</h4> - -<div> - -<p>We now have the information available to perform the live intervals analysis - and build the live intervals themselves. We start off by numbering the basic - blocks and machine instructions. We then handle the "live-in" values. These - are in physical registers, so the physical register is assumed to be killed - by the end of the basic block. Live intervals for virtual registers are - computed for some ordering of the machine instructions <tt>[1, N]</tt>. A - live interval is an interval <tt>[i, j)</tt>, where <tt>1 <= i <= j - < N</tt>, for which a variable is live.</p> - -<p><i><b>More to come...</b></i></p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="regalloc">Register Allocation</a> -</h3> - -<div> - -<p>The <i>Register Allocation problem</i> consists in mapping a program - <i>P<sub>v</sub></i>, that can use an unbounded number of virtual registers, - to a program <i>P<sub>p</sub></i> that contains a finite (possibly small) - number of physical registers. Each target architecture has a different number - of physical registers. If the number of physical registers is not enough to - accommodate all the virtual registers, some of them will have to be mapped - into memory. These virtuals are called <i>spilled virtuals</i>.</p> - -<!-- _______________________________________________________________________ --> - -<h4> - <a name="regAlloc_represent">How registers are represented in LLVM</a> -</h4> - -<div> - -<p>In LLVM, physical registers are denoted by integer numbers that normally - range from 1 to 1023. To see how this numbering is defined for a particular - architecture, you can read the <tt>GenRegisterNames.inc</tt> file for that - architecture. 
For instance, by - inspecting <tt>lib/Target/X86/X86GenRegisterInfo.inc</tt> we see that the - 32-bit register <tt>EAX</tt> is denoted by 43, and the MMX register - <tt>MM0</tt> is mapped to 65.</p> - -<p>Some architectures contain registers that share the same physical location. A - notable example is the X86 platform. For instance, in the X86 architecture, - the registers <tt>EAX</tt>, <tt>AX</tt> and <tt>AL</tt> share the first eight - bits. These physical registers are marked as <i>aliased</i> in LLVM. Given a - particular architecture, you can check which registers are aliased by - inspecting its <tt>RegisterInfo.td</tt> file. Moreover, the method - <tt>MCRegisterInfo::getAliasSet(p_reg)</tt> returns an array containing - all the physical registers aliased to the register <tt>p_reg</tt>.</p> - -<p>Physical registers, in LLVM, are grouped in <i>Register Classes</i>. - Elements in the same register class are functionally equivalent, and can be - interchangeably used. Each virtual register can only be mapped to physical - registers of a particular class. For instance, in the X86 architecture, some - virtuals can only be allocated to 8 bit registers. A register class is - described by <tt>TargetRegisterClass</tt> objects. To discover if a virtual - register is compatible with a given physical, this code can be used:</p> - -<div class="doc_code"> -<pre> -bool RegMapping_Fer::compatible_class(MachineFunction &mf, - unsigned v_reg, - unsigned p_reg) { - assert(TargetRegisterInfo::isPhysicalRegister(p_reg) && - "Target register must be physical"); - const TargetRegisterClass *trc = mf.getRegInfo().getRegClass(v_reg); - return trc->contains(p_reg); -} -</pre> -</div> - -<p>Sometimes, mostly for debugging purposes, it is useful to change the number - of physical registers available in the target architecture. This must be done - statically, inside the <tt>TargetRegsterInfo.td</tt> file. Just <tt>grep</tt> - for <tt>RegisterClass</tt>, the last parameter of which is a list of - registers. Just commenting some out is one simple way to avoid them being - used. A more polite way is to explicitly exclude some registers from - the <i>allocation order</i>. See the definition of the <tt>GR8</tt> register - class in <tt>lib/Target/X86/X86RegisterInfo.td</tt> for an example of this. - </p> - -<p>Virtual registers are also denoted by integer numbers. Contrary to physical - registers, different virtual registers never share the same number. Whereas - physical registers are statically defined in a <tt>TargetRegisterInfo.td</tt> - file and cannot be created by the application developer, that is not the case - with virtual registers. In order to create new virtual registers, use the - method <tt>MachineRegisterInfo::createVirtualRegister()</tt>. This method - will return a new virtual register. Use an <tt>IndexedMap<Foo, - VirtReg2IndexFunctor></tt> to hold information per virtual register. If you - need to enumerate all virtual registers, use the function - <tt>TargetRegisterInfo::index2VirtReg()</tt> to find the virtual register - numbers:</p> - -<div class="doc_code"> -<pre> - for (unsigned i = 0, e = MRI->getNumVirtRegs(); i != e; ++i) { - unsigned VirtReg = TargetRegisterInfo::index2VirtReg(i); - stuff(VirtReg); - } -</pre> -</div> - -<p>Before register allocation, the operands of an instruction are mostly virtual - registers, although physical registers may also be used. In order to check if - a given machine operand is a register, use the boolean - function <tt>MachineOperand::isRegister()</tt>. 
To obtain the integer code of
-   a register, use <tt>MachineOperand::getReg()</tt>.  An instruction may define
-   or use a register.  For instance, <tt>ADD reg:1026 := reg:1025 reg:1024</tt>
-   defines register 1026, and uses registers 1025 and 1024.  Given a
-   register operand, the method <tt>MachineOperand::isUse()</tt> indicates whether
-   that register is being used by the instruction.  The
-   method <tt>MachineOperand::isDef()</tt> indicates whether that register is being
-   defined.</p>
-
-<p>We will call physical registers present in the LLVM bitcode before register
-   allocation <i>pre-colored registers</i>.  Pre-colored registers are used in
-   many different situations, for instance, to pass parameters of function
-   calls, and to store results of particular instructions.  There are two types
-   of pre-colored registers: the ones <i>implicitly</i> defined, and
-   those <i>explicitly</i> defined.  Explicitly defined registers are normal
-   operands, and can be accessed
-   with <tt>MachineInstr::getOperand(int)::getReg()</tt>.  In order to check
-   which registers are implicitly defined by an instruction, use
-   the <tt>TargetInstrInfo::get(opcode)::ImplicitDefs</tt>,
-   where <tt>opcode</tt> is the opcode of the target instruction.  One important
-   difference between explicit and implicit physical registers is that the
-   latter are defined statically for each instruction, whereas the former may
-   vary depending on the program being compiled.  For example, an instruction
-   that represents a function call will always implicitly define or use the same
-   set of physical registers.  To read the registers implicitly used by an
-   instruction,
-   use <tt>TargetInstrInfo::get(opcode)::ImplicitUses</tt>.  Pre-colored
-   registers impose constraints on any register allocation algorithm.  The
-   register allocator must make sure that none of them are overwritten by
-   the values of virtual registers while they are still alive.</p>
-
-</div>
-
-<!-- _______________________________________________________________________ -->
-
-<h4>
-  <a name="regAlloc_howTo">Mapping virtual registers to physical registers</a>
-</h4>
-
-<div>
-
-<p>There are two ways to map virtual registers to physical registers (or to
-   memory slots).  The first way, which we will call <i>direct mapping</i>, is
-   based on the use of methods of the classes <tt>TargetRegisterInfo</tt>
-   and <tt>MachineOperand</tt>.  The second way, which we will call <i>indirect
-   mapping</i>, relies on the <tt>VirtRegMap</tt> class to insert loads
-   and stores that move values to and from memory.</p>
-
-<p>The direct mapping provides more flexibility to the developer of the register
-   allocator; however, it is more error prone, and demands more implementation
-   work.  Basically, the programmer will have to specify where load and store
-   instructions should be inserted in the target function being compiled in
-   order to get and store values in memory.  To assign a physical register to a
-   virtual register present in a given operand,
-   use <tt>MachineOperand::setReg(p_reg)</tt>.  To insert a store instruction,
-   use <tt>TargetInstrInfo::storeRegToStackSlot(...)</tt>, and to insert a
-   load instruction, use <tt>TargetInstrInfo::loadRegFromStackSlot</tt>.</p>
-
-<p>The indirect mapping shields the application developer from the complexities
-   of inserting load and store instructions.  In order to map a virtual register
-   to a physical one, use <tt>VirtRegMap::assignVirt2Phys(vreg, preg)</tt>.
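-</p>
-
-<p>For illustration, an allocator using the indirect mapping might record its
-   decisions roughly as follows (a hedged sketch; <tt>VRM</tt> is the
-   <tt>VirtRegMap</tt> and the register numbers are placeholders, with the
-   stack-slot call described next):</p>
-
-<div class="doc_code">
-<pre>
-// Hypothetical sketch of the indirect mapping interface.
-VRM.assignVirt2Phys(VirtReg, PhysReg);           // VirtReg will live in PhysReg
-int Slot = VRM.assignVirt2StackSlot(SpilledReg); // SpilledReg gets a stack slot
-</pre>
-</div>
-
-<p>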
In - order to map a certain virtual register to memory, - use <tt>VirtRegMap::assignVirt2StackSlot(vreg)</tt>. This method will return - the stack slot where <tt>vreg</tt>'s value will be located. If it is - necessary to map another virtual register to the same stack slot, - use <tt>VirtRegMap::assignVirt2StackSlot(vreg, stack_location)</tt>. One - important point to consider when using the indirect mapping, is that even if - a virtual register is mapped to memory, it still needs to be mapped to a - physical register. This physical register is the location where the virtual - register is supposed to be found before being stored or after being - reloaded.</p> - -<p>If the indirect strategy is used, after all the virtual registers have been - mapped to physical registers or stack slots, it is necessary to use a spiller - object to place load and store instructions in the code. Every virtual that - has been mapped to a stack slot will be stored to memory after been defined - and will be loaded before being used. The implementation of the spiller tries - to recycle load/store instructions, avoiding unnecessary instructions. For an - example of how to invoke the spiller, - see <tt>RegAllocLinearScan::runOnMachineFunction</tt> - in <tt>lib/CodeGen/RegAllocLinearScan.cpp</tt>.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="regAlloc_twoAddr">Handling two address instructions</a> -</h4> - -<div> - -<p>With very rare exceptions (e.g., function calls), the LLVM machine code - instructions are three address instructions. That is, each instruction is - expected to define at most one register, and to use at most two registers. - However, some architectures use two address instructions. In this case, the - defined register is also one of the used register. For instance, an - instruction such as <tt>ADD %EAX, %EBX</tt>, in X86 is actually equivalent - to <tt>%EAX = %EAX + %EBX</tt>.</p> - -<p>In order to produce correct code, LLVM must convert three address - instructions that represent two address instructions into true two address - instructions. LLVM provides the pass <tt>TwoAddressInstructionPass</tt> for - this specific purpose. It must be run before register allocation takes - place. After its execution, the resulting code may no longer be in SSA - form. This happens, for instance, in situations where an instruction such - as <tt>%a = ADD %b %c</tt> is converted to two instructions such as:</p> - -<div class="doc_code"> -<pre> -%a = MOVE %b -%a = ADD %a %c -</pre> -</div> - -<p>Notice that, internally, the second instruction is represented as - <tt>ADD %a[def/use] %c</tt>. I.e., the register operand <tt>%a</tt> is both - used and defined by the instruction.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="regAlloc_ssaDecon">The SSA deconstruction phase</a> -</h4> - -<div> - -<p>An important transformation that happens during register allocation is called - the <i>SSA Deconstruction Phase</i>. The SSA form simplifies many analyses - that are performed on the control flow graph of programs. However, - traditional instruction sets do not implement PHI instructions. Thus, in - order to generate executable code, compilers must replace PHI instructions - with other instructions that preserve their semantics.</p> - -<p>There are many ways in which PHI instructions can safely be removed from the - target code. 
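- For example, using the same illustrative notation as the two-address example
-  above, a PHI that selects between <tt>%b</tt> (reaching from <tt>bb0</tt>)
-  and <tt>%c</tt> (reaching from <tt>bb1</tt>):
-
-<div class="doc_code">
-<pre>
-bb2: %a = PHI [%b, bb0], [%c, bb1]
-</pre>
-</div>
-
- can be removed by placing a copy at the end of each predecessor block and
-  deleting the PHI itself:
-
-<div class="doc_code">
-<pre>
-bb0: ...
-     %a = MOVE %b
-bb1: ...
-     %a = MOVE %c
-</pre>
-</div>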
The most traditional PHI deconstruction algorithm replaces PHI - instructions with copy instructions. That is the strategy adopted by - LLVM. The SSA deconstruction algorithm is implemented - in <tt>lib/CodeGen/PHIElimination.cpp</tt>. In order to invoke this pass, the - identifier <tt>PHIEliminationID</tt> must be marked as required in the code - of the register allocator.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="regAlloc_fold">Instruction folding</a> -</h4> - -<div> - -<p><i>Instruction folding</i> is an optimization performed during register - allocation that removes unnecessary copy instructions. For instance, a - sequence of instructions such as:</p> - -<div class="doc_code"> -<pre> -%EBX = LOAD %mem_address -%EAX = COPY %EBX -</pre> -</div> - -<p>can be safely substituted by the single instruction:</p> - -<div class="doc_code"> -<pre> -%EAX = LOAD %mem_address -</pre> -</div> - -<p>Instructions can be folded with - the <tt>TargetRegisterInfo::foldMemoryOperand(...)</tt> method. Care must be - taken when folding instructions; a folded instruction can be quite different - from the original - instruction. See <tt>LiveIntervals::addIntervalsForSpills</tt> - in <tt>lib/CodeGen/LiveIntervalAnalysis.cpp</tt> for an example of its - use.</p> - -</div> - -<!-- _______________________________________________________________________ --> - -<h4> - <a name="regAlloc_builtIn">Built in register allocators</a> -</h4> - -<div> - -<p>The LLVM infrastructure provides the application developer with three - different register allocators:</p> - -<ul> - <li><i>Fast</i> — This register allocator is the default for debug - builds. It allocates registers on a basic block level, attempting to keep - values in registers and reusing registers as appropriate.</li> - - <li><i>Basic</i> — This is an incremental approach to register - allocation. Live ranges are assigned to registers one at a time in - an order that is driven by heuristics. Since code can be rewritten - on-the-fly during allocation, this framework allows interesting - allocators to be developed as extensions. It is not itself a - production register allocator but is a potentially useful - stand-alone mode for triaging bugs and as a performance baseline. - - <li><i>Greedy</i> — <i>The default allocator</i>. This is a - highly tuned implementation of the <i>Basic</i> allocator that - incorporates global live range splitting. This allocator works hard - to minimize the cost of spill code. - - <li><i>PBQP</i> — A Partitioned Boolean Quadratic Programming (PBQP) - based register allocator. 
This allocator works by constructing a PBQP - problem representing the register allocation problem under consideration, - solving this using a PBQP solver, and mapping the solution back to a - register assignment.</li> -</ul> - -<p>The type of register allocator used in <tt>llc</tt> can be chosen with the - command line option <tt>-regalloc=...</tt>:</p> - -<div class="doc_code"> -<pre> -$ llc -regalloc=linearscan file.bc -o ln.s; -$ llc -regalloc=fast file.bc -o fa.s; -$ llc -regalloc=pbqp file.bc -o pbqp.s; -</pre> -</div> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="proepicode">Prolog/Epilog Code Insertion</a> -</h3> - -<div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="compact_unwind">Compact Unwind</a> -</h4> - -<div> - -<p>Throwing an exception requires <em>unwinding</em> out of a function. The - information on how to unwind a given function is traditionally expressed in - DWARF unwind (a.k.a. frame) info. But that format was originally developed - for debuggers to backtrace, and each Frame Description Entry (FDE) requires - ~20-30 bytes per function. There is also the cost of mapping from an address - in a function to the corresponding FDE at runtime. An alternative unwind - encoding is called <em>compact unwind</em> and requires just 4-bytes per - function.</p> - -<p>The compact unwind encoding is a 32-bit value, which is encoded in an - architecture-specific way. It specifies which registers to restore and from - where, and how to unwind out of the function. When the linker creates a final - linked image, it will create a <code>__TEXT,__unwind_info</code> - section. This section is a small and fast way for the runtime to access - unwind info for any given function. If we emit compact unwind info for the - function, that compact unwind info will be encoded in - the <code>__TEXT,__unwind_info</code> section. If we emit DWARF unwind info, - the <code>__TEXT,__unwind_info</code> section will contain the offset of the - FDE in the <code>__TEXT,__eh_frame</code> section in the final linked - image.</p> - -<p>For X86, there are three modes for the compact unwind encoding:</p> - -<dl> - <dt><i>Function with a Frame Pointer (<code>EBP</code> or <code>RBP</code>)</i></dt> - <dd><p><code>EBP/RBP</code>-based frame, where <code>EBP/RBP</code> is pushed - onto the stack immediately after the return address, - then <code>ESP/RSP</code> is moved to <code>EBP/RBP</code>. Thus to - unwind, <code>ESP/RSP</code> is restored with the - current <code>EBP/RBP</code> value, then <code>EBP/RBP</code> is restored - by popping the stack, and the return is done by popping the stack once - more into the PC. All non-volatile registers that need to be restored must - have been saved in a small range on the stack that - starts <code>EBP-4</code> to <code>EBP-1020</code> (<code>RBP-8</code> - to <code>RBP-1020</code>). The offset (divided by 4 in 32-bit mode and 8 - in 64-bit mode) is encoded in bits 16-23 (mask: <code>0x00FF0000</code>). 
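- For instance, a (hypothetical) consumer of the encoding would recover the
-  byte offset of that save area roughly as follows, where <code>Encoding</code>
-  is the 32-bit compact unwind value:
-
-<div class="doc_code">
-<pre>
-uint32_t Scaled = (Encoding & 0x00FF0000) >> 16; // bits 16-23
-uint32_t SaveAreaOffset = Scaled * 4;            // multiply by 8 in 64-bit mode
-</pre>
-</div>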
- The registers saved are encoded in bits 0-14 - (mask: <code>0x00007FFF</code>) as five 3-bit entries from the following - table:</p> -<table border="1" cellspacing="0"> - <tr> - <th>Compact Number</th> - <th>i386 Register</th> - <th>x86-64 Regiser</th> - </tr> - <tr> - <td>1</td> - <td><code>EBX</code></td> - <td><code>RBX</code></td> - </tr> - <tr> - <td>2</td> - <td><code>ECX</code></td> - <td><code>R12</code></td> - </tr> - <tr> - <td>3</td> - <td><code>EDX</code></td> - <td><code>R13</code></td> - </tr> - <tr> - <td>4</td> - <td><code>EDI</code></td> - <td><code>R14</code></td> - </tr> - <tr> - <td>5</td> - <td><code>ESI</code></td> - <td><code>R15</code></td> - </tr> - <tr> - <td>6</td> - <td><code>EBP</code></td> - <td><code>RBP</code></td> - </tr> -</table> - -</dd> - - <dt><i>Frameless with a Small Constant Stack Size (<code>EBP</code> - or <code>RBP</code> is not used as a frame pointer)</i></dt> - <dd><p>To return, a constant (encoded in the compact unwind encoding) is added - to the <code>ESP/RSP</code>. Then the return is done by popping the stack - into the PC. All non-volatile registers that need to be restored must have - been saved on the stack immediately after the return address. The stack - size (divided by 4 in 32-bit mode and 8 in 64-bit mode) is encoded in bits - 16-23 (mask: <code>0x00FF0000</code>). There is a maximum stack size of - 1024 bytes in 32-bit mode and 2048 in 64-bit mode. The number of registers - saved is encoded in bits 9-12 (mask: <code>0x00001C00</code>). Bits 0-9 - (mask: <code>0x000003FF</code>) contain which registers were saved and - their order. (See - the <code>encodeCompactUnwindRegistersWithoutFrame()</code> function - in <code>lib/Target/X86FrameLowering.cpp</code> for the encoding - algorithm.)</p></dd> - - <dt><i>Frameless with a Large Constant Stack Size (<code>EBP</code> - or <code>RBP</code> is not used as a frame pointer)</i></dt> - <dd><p>This case is like the "Frameless with a Small Constant Stack Size" - case, but the stack size is too large to encode in the compact unwind - encoding. Instead it requires that the function contains "<code>subl - $nnnnnn, %esp</code>" in its prolog. The compact encoding contains the - offset to the <code>$nnnnnn</code> value in the function in bits 9-12 - (mask: <code>0x00001C00</code>).</p></dd> -</dl> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="latemco">Late Machine Code Optimizations</a> -</h3> -<div><p>To Be Written</p></div> - -<!-- ======================================================================= --> -<h3> - <a name="codeemit">Code Emission</a> -</h3> - -<div> - -<p>The code emission step of code generation is responsible for lowering from -the code generator abstractions (like <a -href="#machinefunction">MachineFunction</a>, <a -href="#machineinstr">MachineInstr</a>, etc) down -to the abstractions used by the MC layer (<a href="#mcinst">MCInst</a>, -<a href="#mcstreamer">MCStreamer</a>, etc). This is -done with a combination of several different classes: the (misnamed) -target-independent AsmPrinter class, target-specific subclasses of AsmPrinter -(such as SparcAsmPrinter), and the TargetLoweringObjectFile class.</p> - -<p>Since the MC layer works at the level of abstraction of object files, it -doesn't have a notion of functions, global variables etc. Instead, it thinks -about labels, directives, and instructions. A key class used at this time is -the MCStreamer class. 
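- For example, the sketch below (the function and its arguments are
-  illustrative, not an existing LLVM interface) emits a global symbol and its
-  label purely in terms of MCStreamer calls; the same calls produce either
-  textual assembly or object-file bytes, depending on which MCStreamer
-  implementation is live:
-
-<div class="doc_code">
-<pre>
-void emitEntry(MCStreamer &Out, MCSymbol *Sym, const MCSection *Text) {
-  Out.SwitchSection(Text);                    // .text
-  Out.EmitSymbolAttribute(Sym, MCSA_Global);  // .globl Sym
-  Out.EmitLabel(Sym);                         // Sym:
-}
-</pre>
-</div>
-
- Every directive in the example goes through the MCStreamer interface.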
This is an abstract API that is implemented in different -ways (e.g. to output a .s file, output an ELF .o file, etc) that is effectively -an "assembler API". MCStreamer has one method per directive, such as EmitLabel, -EmitSymbolAttribute, SwitchSection, etc, which directly correspond to assembly -level directives. -</p> - -<p>If you are interested in implementing a code generator for a target, there -are three important things that you have to implement for your target:</p> - -<ol> -<li>First, you need a subclass of AsmPrinter for your target. This class -implements the general lowering process converting MachineFunction's into MC -label constructs. The AsmPrinter base class provides a number of useful methods -and routines, and also allows you to override the lowering process in some -important ways. You should get much of the lowering for free if you are -implementing an ELF, COFF, or MachO target, because the TargetLoweringObjectFile -class implements much of the common logic.</li> - -<li>Second, you need to implement an instruction printer for your target. The -instruction printer takes an <a href="#mcinst">MCInst</a> and renders it to a -raw_ostream as text. Most of this is automatically generated from the .td file -(when you specify something like "<tt>add $dst, $src1, $src2</tt>" in the -instructions), but you need to implement routines to print operands.</li> - -<li>Third, you need to implement code that lowers a <a -href="#machineinstr">MachineInstr</a> to an MCInst, usually implemented in -"<target>MCInstLower.cpp". This lowering process is often target -specific, and is responsible for turning jump table entries, constant pool -indices, global variable addresses, etc into MCLabels as appropriate. This -translation layer is also responsible for expanding pseudo ops used by the code -generator into the actual machine instructions they correspond to. The MCInsts -that are generated by this are fed into the instruction printer or the encoder. -</li> - -</ol> - -<p>Finally, at your choosing, you can also implement an subclass of -MCCodeEmitter which lowers MCInst's into machine code bytes and relocations. -This is important if you want to support direct .o file emission, or would like -to implement an assembler for your target.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="vliw_packetizer">VLIW Packetizer</a> -</h3> - -<div> - -<p>In a Very Long Instruction Word (VLIW) architecture, the compiler is - responsible for mapping instructions to functional-units available on - the architecture. To that end, the compiler creates groups of instructions - called <i>packets</i> or <i>bundles</i>. The VLIW packetizer in LLVM is - a target-independent mechanism to enable the packetization of machine - instructions.</p> - -<!-- _______________________________________________________________________ --> - -<h4> - <a name="vliw_mapping">Mapping from instructions to functional units</a> -</h4> - -<div> - -<p>Instructions in a VLIW target can typically be mapped to multiple functional -units. During the process of packetizing, the compiler must be able to reason -about whether an instruction can be added to a packet. This decision can be -complex since the compiler has to examine all possible mappings of instructions -to functional units. Therefore to alleviate compilation-time complexity, the -VLIW packetizer parses the instruction classes of a target and generates tables -at compiler build time. 
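- A target packetizer built on top of these tables might drive them roughly as
-  in the sketch below; the loop structure, the <tt>emitPacket()</tt> helper and
-  the <tt>Packetizer</tt> object (a <tt>DFAPacketizer</tt>) are hypothetical,
-  while the <tt>DFAPacketizer</tt> calls are the exported API summarized later
-  in this section:
-
-<div class="doc_code">
-<pre>
-for (MachineBasicBlock::iterator I = MBB.begin(), E = MBB.end(); I != E; ++I) {
-  MachineInstr *MI = &*I;
-  if (!Packetizer->canReserveResources(MI)) { // MI does not fit in this packet
-    emitPacket();                             // hypothetical: finalize the packet
-    Packetizer->clearResources();             // start a new, empty packet
-  }
-  Packetizer->reserveResources(MI);           // add MI to the current packet
-}
-</pre>
-</div>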
These tables can then be queried by the provided -machine-independent API to determine if an instruction can be accommodated in a -packet.</p> -</div> - -<!-- ======================================================================= --> -<h4> - <a name="vliw_repr"> - How the packetization tables are generated and used - </a> -</h4> - -<div> - -<p>The packetizer reads instruction classes from a target's itineraries and -creates a deterministic finite automaton (DFA) to represent the state of a -packet. A DFA consists of three major elements: inputs, states, and -transitions. The set of inputs for the generated DFA represents the instruction -being added to a packet. The states represent the possible consumption -of functional units by instructions in a packet. In the DFA, transitions from -one state to another occur on the addition of an instruction to an existing -packet. If there is a legal mapping of functional units to instructions, then -the DFA contains a corresponding transition. The absence of a transition -indicates that a legal mapping does not exist and that the instruction cannot -be added to the packet.</p> - -<p>To generate tables for a VLIW target, add <i>Target</i>GenDFAPacketizer.inc -as a target to the Makefile in the target directory. The exported API provides -three functions: <tt>DFAPacketizer::clearResources()</tt>, -<tt>DFAPacketizer::reserveResources(MachineInstr *MI)</tt>, and -<tt>DFAPacketizer::canReserveResources(MachineInstr *MI)</tt>. These functions -allow a target packetizer to add an instruction to an existing packet and to -check whether an instruction can be added to a packet. See -<tt>llvm/CodeGen/DFAPacketizer.h</tt> for more information.</p> - -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="nativeassembler">Implementing a Native Assembler</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Though you're probably reading this because you want to write or maintain a -compiler backend, LLVM also fully supports building a native assemblers too. -We've tried hard to automate the generation of the assembler from the .td files -(in particular the instruction syntax and encodings), which means that a large -part of the manual and repetitive data entry can be factored and shared with the -compiler.</p> - -<!-- ======================================================================= --> -<h3 id="na_instparsing">Instruction Parsing</h3> - -<div><p>To Be Written</p></div> - - -<!-- ======================================================================= --> -<h3 id="na_instaliases"> - Instruction Alias Processing -</h3> - -<div> -<p>Once the instruction is parsed, it enters the MatchInstructionImpl function. -The MatchInstructionImpl function performs alias processing and then does -actual matching.</p> - -<p>Alias processing is the phase that canonicalizes different lexical forms of -the same instructions down to one representation. There are several different -kinds of alias that are possible to implement and they are listed below in the -order that they are processed (which is in order from simplest/weakest to most -complex/powerful). 
Generally you want to use the first alias mechanism that -meets the needs of your instruction, because it will allow a more concise -description.</p> - -<!-- _______________________________________________________________________ --> -<h4>Mnemonic Aliases</h4> - -<div> - -<p>The first phase of alias processing is simple instruction mnemonic -remapping for classes of instructions which are allowed with two different -mnemonics. This phase is a simple and unconditionally remapping from one input -mnemonic to one output mnemonic. It isn't possible for this form of alias to -look at the operands at all, so the remapping must apply for all forms of a -given mnemonic. Mnemonic aliases are defined simply, for example X86 has: -</p> - -<div class="doc_code"> -<pre> -def : MnemonicAlias<"cbw", "cbtw">; -def : MnemonicAlias<"smovq", "movsq">; -def : MnemonicAlias<"fldcww", "fldcw">; -def : MnemonicAlias<"fucompi", "fucomip">; -def : MnemonicAlias<"ud2a", "ud2">; -</pre> -</div> - -<p>... and many others. With a MnemonicAlias definition, the mnemonic is -remapped simply and directly. Though MnemonicAlias's can't look at any aspect -of the instruction (such as the operands) they can depend on global modes (the -same ones supported by the matcher), through a Requires clause:</p> - -<div class="doc_code"> -<pre> -def : MnemonicAlias<"pushf", "pushfq">, Requires<[In64BitMode]>; -def : MnemonicAlias<"pushf", "pushfl">, Requires<[In32BitMode]>; -</pre> -</div> - -<p>In this example, the mnemonic gets mapped into different a new one depending -on the current instruction set.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4>Instruction Aliases</h4> - -<div> - -<p>The most general phase of alias processing occurs while matching is -happening: it provides new forms for the matcher to match along with a specific -instruction to generate. An instruction alias has two parts: the string to -match and the instruction to generate. For example: -</p> - -<div class="doc_code"> -<pre> -def : InstAlias<"movsx $src, $dst", (MOVSX16rr8W GR16:$dst, GR8 :$src)>; -def : InstAlias<"movsx $src, $dst", (MOVSX16rm8W GR16:$dst, i8mem:$src)>; -def : InstAlias<"movsx $src, $dst", (MOVSX32rr8 GR32:$dst, GR8 :$src)>; -def : InstAlias<"movsx $src, $dst", (MOVSX32rr16 GR32:$dst, GR16 :$src)>; -def : InstAlias<"movsx $src, $dst", (MOVSX64rr8 GR64:$dst, GR8 :$src)>; -def : InstAlias<"movsx $src, $dst", (MOVSX64rr16 GR64:$dst, GR16 :$src)>; -def : InstAlias<"movsx $src, $dst", (MOVSX64rr32 GR64:$dst, GR32 :$src)>; -</pre> -</div> - -<p>This shows a powerful example of the instruction aliases, matching the -same mnemonic in multiple different ways depending on what operands are present -in the assembly. The result of instruction aliases can include operands in a -different order than the destination instruction, and can use an input -multiple times, for example:</p> - -<div class="doc_code"> -<pre> -def : InstAlias<"clrb $reg", (XOR8rr GR8 :$reg, GR8 :$reg)>; -def : InstAlias<"clrw $reg", (XOR16rr GR16:$reg, GR16:$reg)>; -def : InstAlias<"clrl $reg", (XOR32rr GR32:$reg, GR32:$reg)>; -def : InstAlias<"clrq $reg", (XOR64rr GR64:$reg, GR64:$reg)>; -</pre> -</div> - -<p>This example also shows that tied operands are only listed once. In the X86 -backend, XOR8rr has two input GR8's and one output GR8 (where an input is tied -to the output). InstAliases take a flattened operand list without duplicates -for tied operands. 
The result of an instruction alias can also use immediates -and fixed physical registers which are added as simple immediate operands in the -result, for example:</p> - -<div class="doc_code"> -<pre> -// Fixed Immediate operand. -def : InstAlias<"aad", (AAD8i8 10)>; - -// Fixed register operand. -def : InstAlias<"fcomi", (COM_FIr ST1)>; - -// Simple alias. -def : InstAlias<"fcomi $reg", (COM_FIr RST:$reg)>; -</pre> -</div> - - -<p>Instruction aliases can also have a Requires clause to make them -subtarget specific.</p> - -<p>If the back-end supports it, the instruction printer can automatically emit - the alias rather than what's being aliased. It typically leads to better, - more readable code. If it's better to print out what's being aliased, then - pass a '0' as the third parameter to the InstAlias definition.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3 id="na_matching">Instruction Matching</h3> - -<div><p>To Be Written</p></div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="targetimpls">Target-specific Implementation Notes</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>This section of the document explains features or design decisions that are - specific to the code generator for a particular target. First we start - with a table that summarizes what features are supported by each target.</p> - -<!-- ======================================================================= --> -<h3> - <a name="targetfeatures">Target Feature Matrix</a> -</h3> - -<div> - -<p>Note that this table does not include the C backend or Cpp backends, since -they do not use the target independent code generator infrastructure. It also -doesn't list features that are not supported fully by any target yet. It -considers a feature to be supported if at least one subtarget supports it. A -feature being supported means that it is useful and works for most cases, it -does not indicate that there are zero known bugs in the implementation. 
Here -is the key:</p> - - -<table border="1" cellspacing="0"> - <tr> - <th>Unknown</th> - <th>No support</th> - <th>Partial Support</th> - <th>Complete Support</th> - </tr> - <tr> - <td class="unknown"></td> - <td class="no"></td> - <td class="partial"></td> - <td class="yes"></td> - </tr> -</table> - -<p>Here is the table:</p> - -<table width="689" border="1" cellspacing="0"> -<tr><td></td> -<td colspan="13" align="center" style="background-color:#ffc">Target</td> -</tr> - <tr> - <th>Feature</th> - <th>ARM</th> - <th>CellSPU</th> - <th>Hexagon</th> - <th>MBlaze</th> - <th>MSP430</th> - <th>Mips</th> - <th>PTX</th> - <th>PowerPC</th> - <th>Sparc</th> - <th>X86</th> - <th>XCore</th> - </tr> - -<tr> - <td><a href="#feat_reliable">is generally reliable</a></td> - <td class="yes"></td> <!-- ARM --> - <td class="no"></td> <!-- CellSPU --> - <td class="yes"></td> <!-- Hexagon --> - <td class="no"></td> <!-- MBlaze --> - <td class="unknown"></td> <!-- MSP430 --> - <td class="yes"></td> <!-- Mips --> - <td class="no"></td> <!-- PTX --> - <td class="yes"></td> <!-- PowerPC --> - <td class="yes"></td> <!-- Sparc --> - <td class="yes"></td> <!-- X86 --> - <td class="unknown"></td> <!-- XCore --> -</tr> - -<tr> - <td><a href="#feat_asmparser">assembly parser</a></td> - <td class="no"></td> <!-- ARM --> - <td class="no"></td> <!-- CellSPU --> - <td class="no"></td> <!-- Hexagon --> - <td class="yes"></td> <!-- MBlaze --> - <td class="no"></td> <!-- MSP430 --> - <td class="no"></td> <!-- Mips --> - <td class="no"></td> <!-- PTX --> - <td class="no"></td> <!-- PowerPC --> - <td class="no"></td> <!-- Sparc --> - <td class="yes"></td> <!-- X86 --> - <td class="no"></td> <!-- XCore --> -</tr> - -<tr> - <td><a href="#feat_disassembler">disassembler</a></td> - <td class="yes"></td> <!-- ARM --> - <td class="no"></td> <!-- CellSPU --> - <td class="no"></td> <!-- Hexagon --> - <td class="yes"></td> <!-- MBlaze --> - <td class="no"></td> <!-- MSP430 --> - <td class="no"></td> <!-- Mips --> - <td class="no"></td> <!-- PTX --> - <td class="no"></td> <!-- PowerPC --> - <td class="no"></td> <!-- Sparc --> - <td class="yes"></td> <!-- X86 --> - <td class="no"></td> <!-- XCore --> -</tr> - -<tr> - <td><a href="#feat_inlineasm">inline asm</a></td> - <td class="yes"></td> <!-- ARM --> - <td class="no"></td> <!-- CellSPU --> - <td class="yes"></td> <!-- Hexagon --> - <td class="yes"></td> <!-- MBlaze --> - <td class="unknown"></td> <!-- MSP430 --> - <td class="no"></td> <!-- Mips --> - <td class="unknown"></td> <!-- PTX --> - <td class="yes"></td> <!-- PowerPC --> - <td class="unknown"></td> <!-- Sparc --> - <td class="yes"></td> <!-- X86 --> - <td class="unknown"></td> <!-- XCore --> -</tr> - -<tr> - <td><a href="#feat_jit">jit</a></td> - <td class="partial"><a href="#feat_jit_arm">*</a></td> <!-- ARM --> - <td class="no"></td> <!-- CellSPU --> - <td class="no"></td> <!-- Hexagon --> - <td class="no"></td> <!-- MBlaze --> - <td class="unknown"></td> <!-- MSP430 --> - <td class="yes"></td> <!-- Mips --> - <td class="unknown"></td> <!-- PTX --> - <td class="yes"></td> <!-- PowerPC --> - <td class="unknown"></td> <!-- Sparc --> - <td class="yes"></td> <!-- X86 --> - <td class="unknown"></td> <!-- XCore --> -</tr> - -<tr> - <td><a href="#feat_objectwrite">.o file writing</a></td> - <td class="no"></td> <!-- ARM --> - <td class="no"></td> <!-- CellSPU --> - <td class="no"></td> <!-- Hexagon --> - <td class="yes"></td> <!-- MBlaze --> - <td class="no"></td> <!-- MSP430 --> - <td class="no"></td> <!-- Mips --> - <td 
class="no"></td> <!-- PTX --> - <td class="no"></td> <!-- PowerPC --> - <td class="no"></td> <!-- Sparc --> - <td class="yes"></td> <!-- X86 --> - <td class="no"></td> <!-- XCore --> -</tr> - -<tr> - <td><a href="#feat_tailcall">tail calls</a></td> - <td class="yes"></td> <!-- ARM --> - <td class="no"></td> <!-- CellSPU --> - <td class="yes"></td> <!-- Hexagon --> - <td class="no"></td> <!-- MBlaze --> - <td class="unknown"></td> <!-- MSP430 --> - <td class="no"></td> <!-- Mips --> - <td class="unknown"></td> <!-- PTX --> - <td class="yes"></td> <!-- PowerPC --> - <td class="unknown"></td> <!-- Sparc --> - <td class="yes"></td> <!-- X86 --> - <td class="unknown"></td> <!-- XCore --> -</tr> - -<tr> - <td><a href="#feat_segstacks">segmented stacks</a></td> - <td class="no"></td> <!-- ARM --> - <td class="no"></td> <!-- CellSPU --> - <td class="no"></td> <!-- Hexagon --> - <td class="no"></td> <!-- MBlaze --> - <td class="no"></td> <!-- MSP430 --> - <td class="no"></td> <!-- Mips --> - <td class="no"></td> <!-- PTX --> - <td class="no"></td> <!-- PowerPC --> - <td class="no"></td> <!-- Sparc --> - <td class="partial"><a href="#feat_segstacks_x86">*</a></td> <!-- X86 --> - <td class="no"></td> <!-- XCore --> -</tr> - - -</table> - -<!-- _______________________________________________________________________ --> -<h4 id="feat_reliable">Is Generally Reliable</h4> - -<div> -<p>This box indicates whether the target is considered to be production quality. -This indicates that the target has been used as a static compiler to -compile large amounts of code by a variety of different people and is in -continuous use.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4 id="feat_asmparser">Assembly Parser</h4> - -<div> -<p>This box indicates whether the target supports parsing target specific .s -files by implementing the MCAsmParser interface. This is required for llvm-mc -to be able to act as a native assembler and is required for inline assembly -support in the native .o file writer.</p> - -</div> - - -<!-- _______________________________________________________________________ --> -<h4 id="feat_disassembler">Disassembler</h4> - -<div> -<p>This box indicates whether the target supports the MCDisassembler API for -disassembling machine opcode bytes into MCInst's.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4 id="feat_inlineasm">Inline Asm</h4> - -<div> -<p>This box indicates whether the target supports most popular inline assembly -constraints and modifiers.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4 id="feat_jit">JIT Support</h4> - -<div> -<p>This box indicates whether the target supports the JIT compiler through -the ExecutionEngine interface.</p> - -<p id="feat_jit_arm">The ARM backend has basic support for integer code -in ARM codegen mode, but lacks NEON and full Thumb support.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4 id="feat_objectwrite">.o File Writing</h4> - -<div> - -<p>This box indicates whether the target supports writing .o files (e.g. MachO, -ELF, and/or COFF) files directly from the target. 
Note that the target also -must include an assembly parser and general inline assembly support for full -inline assembly support in the .o writer.</p> - -<p>Targets that don't support this feature can obviously still write out .o -files, they just rely on having an external assembler to translate from a .s -file to a .o file (as is the case for many C compilers).</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4 id="feat_tailcall">Tail Calls</h4> - -<div> - -<p>This box indicates whether the target supports guaranteed tail calls. These -are calls marked "<a href="LangRef.html#i_call">tail</a>" and use the fastcc -calling convention. Please see the <a href="#tailcallopt">tail call section -more more details</a>.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4 id="feat_segstacks">Segmented Stacks</h4> - -<div> - -<p>This box indicates whether the target supports segmented stacks. This -replaces the traditional large C stack with many linked segments. It -is compatible with the <a href="http://gcc.gnu.org/wiki/SplitStacks">gcc -implementation</a> used by the Go front end.</p> - -<p id="feat_segstacks_x86">Basic support exists on the X86 backend. Currently -vararg doesn't work and the object files are not marked the way the gold -linker expects, but simple Go programs can be built by dragonegg.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="tailcallopt">Tail call optimization</a> -</h3> - -<div> - -<p>Tail call optimization, callee reusing the stack of the caller, is currently - supported on x86/x86-64 and PowerPC. It is performed if:</p> - -<ul> - <li>Caller and callee have the calling convention <tt>fastcc</tt> or - <tt>cc 10</tt> (GHC call convention).</li> - - <li>The call is a tail call - in tail position (ret immediately follows call - and ret uses value of call or is void).</li> - - <li>Option <tt>-tailcallopt</tt> is enabled.</li> - - <li>Platform specific constraints are met.</li> -</ul> - -<p>x86/x86-64 constraints:</p> - -<ul> - <li>No variable argument lists are used.</li> - - <li>On x86-64 when generating GOT/PIC code only module-local calls (visibility - = hidden or protected) are supported.</li> -</ul> - -<p>PowerPC constraints:</p> - -<ul> - <li>No variable argument lists are used.</li> - - <li>No byval parameters are used.</li> - - <li>On ppc32/64 GOT/PIC only module-local calls (visibility = hidden or protected) are supported.</li> -</ul> - -<p>Example:</p> - -<p>Call as <tt>llc -tailcallopt test.ll</tt>.</p> - -<div class="doc_code"> -<pre> -declare fastcc i32 @tailcallee(i32 inreg %a1, i32 inreg %a2, i32 %a3, i32 %a4) - -define fastcc i32 @tailcaller(i32 %in1, i32 %in2) { - %l1 = add i32 %in1, %in2 - %tmp = tail call fastcc i32 @tailcallee(i32 %in1 inreg, i32 %in2 inreg, i32 %in1, i32 %l1) - ret i32 %tmp -} -</pre> -</div> - -<p>Implications of <tt>-tailcallopt</tt>:</p> - -<p>To support tail call optimization in situations where the callee has more - arguments than the caller a 'callee pops arguments' convention is used. This - currently causes each <tt>fastcc</tt> call that is not tail call optimized - (because one or more of above constraints are not met) to be followed by a - readjustment of the stack. 
So performance might be worse in such cases.</p> - -</div> -<!-- ======================================================================= --> -<h3> - <a name="sibcallopt">Sibling call optimization</a> -</h3> - -<div> - -<p>Sibling call optimization is a restricted form of tail call optimization. - Unlike tail call optimization described in the previous section, it can be - performed automatically on any tail calls when <tt>-tailcallopt</tt> option - is not specified.</p> - -<p>Sibling call optimization is currently performed on x86/x86-64 when the - following constraints are met:</p> - -<ul> - <li>Caller and callee have the same calling convention. It can be either - <tt>c</tt> or <tt>fastcc</tt>. - - <li>The call is a tail call - in tail position (ret immediately follows call - and ret uses value of call or is void).</li> - - <li>Caller and callee have matching return type or the callee result is not - used. - - <li>If any of the callee arguments are being passed in stack, they must be - available in caller's own incoming argument stack and the frame offsets - must be the same. -</ul> - -<p>Example:</p> -<div class="doc_code"> -<pre> -declare i32 @bar(i32, i32) - -define i32 @foo(i32 %a, i32 %b, i32 %c) { -entry: - %0 = tail call i32 @bar(i32 %a, i32 %b) - ret i32 %0 -} -</pre> -</div> - -</div> -<!-- ======================================================================= --> -<h3> - <a name="x86">The X86 backend</a> -</h3> - -<div> - -<p>The X86 code generator lives in the <tt>lib/Target/X86</tt> directory. This - code generator is capable of targeting a variety of x86-32 and x86-64 - processors, and includes support for ISA extensions such as MMX and SSE.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="x86_tt">X86 Target Triples supported</a> -</h4> - -<div> - -<p>The following are the known target triples that are supported by the X86 - backend. This is not an exhaustive list, and it would be useful to add those - that people test.</p> - -<ul> - <li><b>i686-pc-linux-gnu</b> — Linux</li> - - <li><b>i386-unknown-freebsd5.3</b> — FreeBSD 5.3</li> - - <li><b>i686-pc-cygwin</b> — Cygwin on Win32</li> - - <li><b>i686-pc-mingw32</b> — MingW on Win32</li> - - <li><b>i386-pc-mingw32msvc</b> — MingW crosscompiler on Linux</li> - - <li><b>i686-apple-darwin*</b> — Apple Darwin on X86</li> - - <li><b>x86_64-unknown-linux-gnu</b> — Linux</li> -</ul> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="x86_cc">X86 Calling Conventions supported</a> -</h4> - - -<div> - -<p>The following target-specific calling conventions are known to backend:</p> - -<ul> -<li><b>x86_StdCall</b> — stdcall calling convention seen on Microsoft - Windows platform (CC ID = 64).</li> -<li><b>x86_FastCall</b> — fastcall calling convention seen on Microsoft - Windows platform (CC ID = 65).</li> -<li><b>x86_ThisCall</b> — Similar to X86_StdCall. Passes first argument - in ECX, others via stack. Callee is responsible for stack cleaning. This - convention is used by MSVC by default for methods in its ABI - (CC ID = 70).</li> -</ul> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="x86_memory">Representing X86 addressing modes in MachineInstrs</a> -</h4> - -<div> - -<p>The x86 has a very flexible way of accessing memory. 
It is capable of - forming memory addresses of the following expression directly in integer - instructions (which use ModR/M addressing):</p> - -<div class="doc_code"> -<pre> -SegmentReg: Base + [1,2,4,8] * IndexReg + Disp32 -</pre> -</div> - -<p>In order to represent this, LLVM tracks no less than 5 operands for each - memory operand of this form. This means that the "load" form of - '<tt>mov</tt>' has the following <tt>MachineOperand</tt>s in this order:</p> - -<div class="doc_code"> -<pre> -Index: 0 | 1 2 3 4 5 -Meaning: DestReg, | BaseReg, Scale, IndexReg, Displacement Segment -OperandTy: VirtReg, | VirtReg, UnsImm, VirtReg, SignExtImm PhysReg -</pre> -</div> - -<p>Stores, and all other instructions, treat the four memory operands in the - same way and in the same order. If the segment register is unspecified - (regno = 0), then no segment override is generated. "Lea" operations do not - have a segment register specified, so they only have 4 operands for their - memory reference.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="x86_memory">X86 address spaces supported</a> -</h4> - -<div> - -<p>x86 has a feature which provides - the ability to perform loads and stores to different address spaces - via the x86 segment registers. A segment override prefix byte on an - instruction causes the instruction's memory access to go to the specified - segment. LLVM address space 0 is the default address space, which includes - the stack, and any unqualified memory accesses in a program. Address spaces - 1-255 are currently reserved for user-defined code. The GS-segment is - represented by address space 256, while the FS-segment is represented by - address space 257. Other x86 segments have yet to be allocated address space - numbers.</p> - -<p>While these address spaces may seem similar to TLS via the - <tt>thread_local</tt> keyword, and often use the same underlying hardware, - there are some fundamental differences.</p> - -<p>The <tt>thread_local</tt> keyword applies to global variables and - specifies that they are to be allocated in thread-local memory. There are - no type qualifiers involved, and these variables can be pointed to with - normal pointers and accessed with normal loads and stores. - The <tt>thread_local</tt> keyword is target-independent at the LLVM IR - level (though LLVM doesn't yet have implementations of it for some - configurations).<p> - -<p>Special address spaces, in contrast, apply to static types. Every - load and store has a particular address space in its address operand type, - and this is what determines which address space is accessed. - LLVM ignores these special address space qualifiers on global variables, - and does not provide a way to directly allocate storage in them. - At the LLVM IR level, the behavior of these special address spaces depends - in part on the underlying OS or runtime environment, and they are specific - to x86 (and LLVM doesn't yet handle them correctly in some cases).</p> - -<p>Some operating systems and runtime environments use (or may in the future - use) the FS/GS-segment registers for various low-level purposes, so care - should be taken when considering them.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="x86_names">Instruction naming</a> -</h4> - -<div> - -<p>An instruction name consists of the base name, a default operand size, and a - a character per operand with an optional special size. 
For example:</p> - -<div class="doc_code"> -<pre> -ADD8rr -> add, 8-bit register, 8-bit register -IMUL16rmi -> imul, 16-bit register, 16-bit memory, 16-bit immediate -IMUL16rmi8 -> imul, 16-bit register, 16-bit memory, 8-bit immediate -MOVSX32rm16 -> movsx, 32-bit register, 16-bit memory -</pre> -</div> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="ppc">The PowerPC backend</a> -</h3> - -<div> - -<p>The PowerPC code generator lives in the lib/Target/PowerPC directory. The - code generation is retargetable to several variations or <i>subtargets</i> of - the PowerPC ISA; including ppc32, ppc64 and altivec.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ppc_abi">LLVM PowerPC ABI</a> -</h4> - -<div> - -<p>LLVM follows the AIX PowerPC ABI, with two deviations. LLVM uses a PC - relative (PIC) or static addressing for accessing global values, so no TOC - (r2) is used. Second, r31 is used as a frame pointer to allow dynamic growth - of a stack frame. LLVM takes advantage of having no TOC to provide space to - save the frame pointer in the PowerPC linkage area of the caller frame. - Other details of PowerPC ABI can be found at <a href= - "http://developer.apple.com/documentation/DeveloperTools/Conceptual/LowLevelABI/Articles/32bitPowerPC.html" - >PowerPC ABI.</a> Note: This link describes the 32 bit ABI. The 64 bit ABI - is similar except space for GPRs are 8 bytes wide (not 4) and r13 is reserved - for system use.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ppc_frame">Frame Layout</a> -</h4> - -<div> - -<p>The size of a PowerPC frame is usually fixed for the duration of a - function's invocation. Since the frame is fixed size, all references - into the frame can be accessed via fixed offsets from the stack pointer. The - exception to this is when dynamic alloca or variable sized arrays are - present, then a base pointer (r31) is used as a proxy for the stack pointer - and stack pointer is free to grow or shrink. A base pointer is also used if - llvm-gcc is not passed the -fomit-frame-pointer flag. The stack pointer is - always aligned to 16 bytes, so that space allocated for altivec vectors will - be properly aligned.</p> - -<p>An invocation frame is laid out as follows (low memory at top);</p> - -<table class="layout"> - <tr> - <td>Linkage<br><br></td> - </tr> - <tr> - <td>Parameter area<br><br></td> - </tr> - <tr> - <td>Dynamic area<br><br></td> - </tr> - <tr> - <td>Locals area<br><br></td> - </tr> - <tr> - <td>Saved registers area<br><br></td> - </tr> - <tr style="border-style: none hidden none hidden;"> - <td><br></td> - </tr> - <tr> - <td>Previous Frame<br><br></td> - </tr> -</table> - -<p>The <i>linkage</i> area is used by a callee to save special registers prior - to allocating its own frame. Only three entries are relevant to LLVM. The - first entry is the previous stack pointer (sp), aka link. This allows - probing tools like gdb or exception handlers to quickly scan the frames in - the stack. A function epilog can also use the link to pop the frame from the - stack. The third entry in the linkage area is used to save the return - address from the lr register. Finally, as mentioned above, the last entry is - used to save the previous frame pointer (r31.) 
The entries in the linkage - area are the size of a GPR, thus the linkage area is 24 bytes long in 32 bit - mode and 48 bytes in 64 bit mode.</p> - -<p>32 bit linkage area</p> - -<table class="layout"> - <tr> - <td>0</td> - <td>Saved SP (r1)</td> - </tr> - <tr> - <td>4</td> - <td>Saved CR</td> - </tr> - <tr> - <td>8</td> - <td>Saved LR</td> - </tr> - <tr> - <td>12</td> - <td>Reserved</td> - </tr> - <tr> - <td>16</td> - <td>Reserved</td> - </tr> - <tr> - <td>20</td> - <td>Saved FP (r31)</td> - </tr> -</table> - -<p>64 bit linkage area</p> - -<table class="layout"> - <tr> - <td>0</td> - <td>Saved SP (r1)</td> - </tr> - <tr> - <td>8</td> - <td>Saved CR</td> - </tr> - <tr> - <td>16</td> - <td>Saved LR</td> - </tr> - <tr> - <td>24</td> - <td>Reserved</td> - </tr> - <tr> - <td>32</td> - <td>Reserved</td> - </tr> - <tr> - <td>40</td> - <td>Saved FP (r31)</td> - </tr> -</table> - -<p>The <i>parameter area</i> is used to store arguments being passed to a callee - function. Following the PowerPC ABI, the first few arguments are actually - passed in registers, with the space in the parameter area unused. However, - if there are not enough registers or the callee is a thunk or vararg - function, these register arguments can be spilled into the parameter area. - Thus, the parameter area must be large enough to store all the parameters for - the largest call sequence made by the caller. The size must also be - minimally large enough to spill registers r3-r10. This allows callees blind - to the call signature, such as thunks and vararg functions, enough space to - cache the argument registers. Therefore, the parameter area is minimally 32 - bytes (64 bytes in 64 bit mode.) Also note that since the parameter area is - a fixed offset from the top of the frame, that a callee can access its spilt - arguments using fixed offsets from the stack pointer (or base pointer.)</p> - -<p>Combining the information about the linkage, parameter areas and alignment. A - stack frame is minimally 64 bytes in 32 bit mode and 128 bytes in 64 bit - mode.</p> - -<p>The <i>dynamic area</i> starts out as size zero. If a function uses dynamic - alloca then space is added to the stack, the linkage and parameter areas are - shifted to top of stack, and the new space is available immediately below the - linkage and parameter areas. The cost of shifting the linkage and parameter - areas is minor since only the link value needs to be copied. The link value - can be easily fetched by adding the original frame size to the base pointer. - Note that allocations in the dynamic space need to observe 16 byte - alignment.</p> - -<p>The <i>locals area</i> is where the llvm compiler reserves space for local - variables.</p> - -<p>The <i>saved registers area</i> is where the llvm compiler spills callee - saved registers on entry to the callee.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ppc_prolog">Prolog/Epilog</a> -</h4> - -<div> - -<p>The llvm prolog and epilog are the same as described in the PowerPC ABI, with - the following exceptions. Callee saved registers are spilled after the frame - is created. This allows the llvm epilog/prolog support to be common with - other targets. The base pointer callee saved register r31 is saved in the - TOC slot of linkage area. 
This simplifies allocation of space for the base - pointer and makes it convenient to locate programatically and during - debugging.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ppc_dynamic">Dynamic Allocation</a> -</h4> - -<div> - -<p><i>TODO - More to come.</i></p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="ptx">The PTX backend</a> -</h3> - -<div> - -<p>The PTX code generator lives in the lib/Target/PTX directory. It is - currently a work-in-progress, but already supports most of the code - generation functionality needed to generate correct PTX kernels for - CUDA devices.</p> - -<p>The code generator can target PTX 2.0+, and shader model 1.0+. The - PTX ISA Reference Manual is used as the primary source of ISA - information, though an effort is made to make the output of the code - generator match the output of the NVidia nvcc compiler, whenever - possible.</p> - -<p>Code Generator Options:</p> -<table border="1" cellspacing="0"> - <tr> - <th>Option</th> - <th>Description</th> - </tr> - <tr> - <td><code>double</code></td> - <td align="left">If enabled, the map_f64_to_f32 directive is - disabled in the PTX output, allowing native double-precision - arithmetic</td> - </tr> - <tr> - <td><code>no-fma</code></td> - <td align="left">Disable generation of Fused-Multiply Add - instructions, which may be beneficial for some devices</td> - </tr> - <tr> - <td><code>smxy / computexy</code></td> - <td align="left">Set shader model/compute capability to x.y, - e.g. sm20 or compute13</td> - </tr> -</table> - -<p>Working:</p> -<ul> - <li>Arithmetic instruction selection (including combo FMA)</li> - <li>Bitwise instruction selection</li> - <li>Control-flow instruction selection</li> - <li>Function calls (only on SM 2.0+ and no return arguments)</li> - <li>Addresses spaces (0 = global, 1 = constant, 2 = local, 4 = - shared)</li> - <li>Thread synchronization (bar.sync)</li> - <li>Special register reads ([N]TID, [N]CTAID, PMx, CLOCK, etc.)</li> -</ul> - -<p>In Progress:</p> -<ul> - <li>Robust call instruction selection</li> - <li>Stack frame allocation</li> - <li>Device-specific instruction scheduling optimizations</li> -</ul> - - -</div> - -</div> - -<!-- *********************************************************************** --> -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> - <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-04-15 22:22:36 +0200 (Sun, 15 Apr 2012) $ -</address> - -</body> -</html> diff --git a/docs/CodeGenerator.rst b/docs/CodeGenerator.rst new file mode 100644 index 0000000..d1d0231 --- /dev/null +++ b/docs/CodeGenerator.rst @@ -0,0 +1,2428 @@ +.. _code_generator: + +========================================== +The LLVM Target-Independent Code Generator +========================================== + +.. role:: raw-html(raw) + :format: html + +.. raw:: html + + <style> + .unknown { background-color: #C0C0C0; text-align: center; } + .unknown:before { content: "?" 
} + .no { background-color: #C11B17 } + .no:before { content: "N" } + .partial { background-color: #F88017 } + .yes { background-color: #0F0; } + .yes:before { content: "Y" } + </style> + +.. contents:: + :local: + +.. warning:: + This is a work in progress. + +Introduction +============ + +The LLVM target-independent code generator is a framework that provides a suite +of reusable components for translating the LLVM internal representation to the +machine code for a specified target---either in assembly form (suitable for a +static compiler) or in binary machine code format (usable for a JIT +compiler). The LLVM target-independent code generator consists of six main +components: + +1. `Abstract target description`_ interfaces which capture important properties + about various aspects of the machine, independently of how they will be used. + These interfaces are defined in ``include/llvm/Target/``. + +2. Classes used to represent the `code being generated`_ for a target. These + classes are intended to be abstract enough to represent the machine code for + *any* target machine. These classes are defined in + ``include/llvm/CodeGen/``. At this level, concepts like "constant pool + entries" and "jump tables" are explicitly exposed. + +3. Classes and algorithms used to represent code as the object file level, the + `MC Layer`_. These classes represent assembly level constructs like labels, + sections, and instructions. At this level, concepts like "constant pool + entries" and "jump tables" don't exist. + +4. `Target-independent algorithms`_ used to implement various phases of native + code generation (register allocation, scheduling, stack frame representation, + etc). This code lives in ``lib/CodeGen/``. + +5. `Implementations of the abstract target description interfaces`_ for + particular targets. These machine descriptions make use of the components + provided by LLVM, and can optionally provide custom target-specific passes, + to build complete code generators for a specific target. Target descriptions + live in ``lib/Target/``. + +6. The target-independent JIT components. The LLVM JIT is completely target + independent (it uses the ``TargetJITInfo`` structure to interface for + target-specific issues. The code for the target-independent JIT lives in + ``lib/ExecutionEngine/JIT``. + +Depending on which part of the code generator you are interested in working on, +different pieces of this will be useful to you. In any case, you should be +familiar with the `target description`_ and `machine code representation`_ +classes. If you want to add a backend for a new target, you will need to +`implement the target description`_ classes for your new target and understand +the `LLVM code representation <LangRef.html>`_. If you are interested in +implementing a new `code generation algorithm`_, it should only depend on the +target-description and machine code representation classes, ensuring that it is +portable. + +Required components in the code generator +----------------------------------------- + +The two pieces of the LLVM code generator are the high-level interface to the +code generator and the set of reusable components that can be used to build +target-specific backends. 
The two most important interfaces (:raw-html:`<tt>` +`TargetMachine`_ :raw-html:`</tt>` and :raw-html:`<tt>` `TargetData`_ +:raw-html:`</tt>`) are the only ones that are required to be defined for a +backend to fit into the LLVM system, but the others must be defined if the +reusable code generator components are going to be used. + +This design has two important implications. The first is that LLVM can support +completely non-traditional code generation targets. For example, the C backend +does not require register allocation, instruction selection, or any of the other +standard components provided by the system. As such, it only implements these +two interfaces, and does its own thing. Note that C backend was removed from the +trunk since LLVM 3.1 release. Another example of a code generator like this is a +(purely hypothetical) backend that converts LLVM to the GCC RTL form and uses +GCC to emit machine code for a target. + +This design also implies that it is possible to design and implement radically +different code generators in the LLVM system that do not make use of any of the +built-in components. Doing so is not recommended at all, but could be required +for radically different targets that do not fit into the LLVM machine +description model: FPGAs for example. + +.. _high-level design of the code generator: + +The high-level design of the code generator +------------------------------------------- + +The LLVM target-independent code generator is designed to support efficient and +quality code generation for standard register-based microprocessors. Code +generation in this model is divided into the following stages: + +1. `Instruction Selection`_ --- This phase determines an efficient way to + express the input LLVM code in the target instruction set. This stage + produces the initial code for the program in the target instruction set, then + makes use of virtual registers in SSA form and physical registers that + represent any required register assignments due to target constraints or + calling conventions. This step turns the LLVM code into a DAG of target + instructions. + +2. `Scheduling and Formation`_ --- This phase takes the DAG of target + instructions produced by the instruction selection phase, determines an + ordering of the instructions, then emits the instructions as :raw-html:`<tt>` + `MachineInstr`_\s :raw-html:`</tt>` with that ordering. Note that we + describe this in the `instruction selection section`_ because it operates on + a `SelectionDAG`_. + +3. `SSA-based Machine Code Optimizations`_ --- This optional stage consists of a + series of machine-code optimizations that operate on the SSA-form produced by + the instruction selector. Optimizations like modulo-scheduling or peephole + optimization work here. + +4. `Register Allocation`_ --- The target code is transformed from an infinite + virtual register file in SSA form to the concrete register file used by the + target. This phase introduces spill code and eliminates all virtual register + references from the program. + +5. `Prolog/Epilog Code Insertion`_ --- Once the machine code has been generated + for the function and the amount of stack space required is known (used for + LLVM alloca's and spill slots), the prolog and epilog code for the function + can be inserted and "abstract stack location references" can be eliminated. + This stage is responsible for implementing optimizations like frame-pointer + elimination and stack packing. + +6. 
`Late Machine Code Optimizations`_ --- Optimizations that operate on "final" + machine code can go here, such as spill code scheduling and peephole + optimizations. + +7. `Code Emission`_ --- The final stage actually puts out the code for the + current function, either in the target assembler format or in machine + code. + +The code generator is based on the assumption that the instruction selector will +use an optimal pattern matching selector to create high-quality sequences of +native instructions. Alternative code generator designs based on pattern +expansion and aggressive iterative peephole optimization are much slower. This +design permits efficient compilation (important for JIT environments) and +aggressive optimization (used when generating code offline) by allowing +components of varying levels of sophistication to be used for any step of +compilation. + +In addition to these stages, target implementations can insert arbitrary +target-specific passes into the flow. For example, the X86 target uses a +special pass to handle the 80x87 floating point stack architecture. Other +targets with unusual requirements can be supported with custom passes as needed. + +Using TableGen for target description +------------------------------------- + +The target description classes require a detailed description of the target +architecture. These target descriptions often have a large amount of common +information (e.g., an ``add`` instruction is almost identical to a ``sub`` +instruction). In order to allow the maximum amount of commonality to be +factored out, the LLVM code generator uses the +`TableGen <TableGenFundamentals.html>`_ tool to describe big chunks of the +target machine, which allows the use of domain-specific and target-specific +abstractions to reduce the amount of repetition. + +As LLVM continues to be developed and refined, we plan to move more and more of +the target description to the ``.td`` form. Doing so gives us a number of +advantages. The most important is that it makes it easier to port LLVM because +it reduces the amount of C++ code that has to be written, and the surface area +of the code generator that needs to be understood before someone can get +something working. Second, it makes it easier to change things. In particular, +if tables and other things are all emitted by ``tblgen``, we only need a change +in one place (``tblgen``) to update all of the targets to a new interface. + +.. _Abstract target description: +.. _target description: + +Target description classes +========================== + +The LLVM target description classes (located in the ``include/llvm/Target`` +directory) provide an abstract description of the target machine independent of +any particular client. These classes are designed to capture the *abstract* +properties of the target (such as the instructions and registers it has), and do +not incorporate any particular pieces of code generation algorithms. + +All of the target description classes (except the :raw-html:`<tt>` `TargetData`_ +:raw-html:`</tt>` class) are designed to be subclassed by the concrete target +implementation, and have virtual methods implemented. To get to these +implementations, the :raw-html:`<tt>` `TargetMachine`_ :raw-html:`</tt>` class +provides accessors that should be implemented by the target. + +.. 
_TargetMachine:
+
+The ``TargetMachine`` class
+---------------------------
+
+The ``TargetMachine`` class provides virtual methods that are used to access the
+target-specific implementations of the various target description classes via
+the ``get*Info`` methods (``getInstrInfo``, ``getRegisterInfo``,
+``getFrameInfo``, etc.). This class is designed to be specialized by a concrete
+target implementation (e.g., ``X86TargetMachine``) which implements the various
+virtual methods. The only required target description class is the
+:raw-html:`<tt>` `TargetData`_ :raw-html:`</tt>` class, but if the code
+generator components are to be used, the other interfaces should be implemented
+as well.
+
+.. _TargetData:
+
+The ``TargetData`` class
+------------------------
+
+The ``TargetData`` class is the only required target description class, and it
+is the only class that is not extensible (you cannot derive a new class from
+it). ``TargetData`` specifies information about how the target lays out memory
+for structures, the alignment requirements for various data types, the size of
+pointers in the target, and whether the target is little-endian or
+big-endian.
+
+.. _targetlowering:
+
+The ``TargetLowering`` class
+----------------------------
+
+The ``TargetLowering`` class is used by SelectionDAG based instruction selectors
+primarily to describe how LLVM code should be lowered to SelectionDAG
+operations. Among other things, this class indicates:
+
+* an initial register class to use for various ``ValueType``\s,
+
+* which operations are natively supported by the target machine,
+
+* the return type of ``setcc`` operations,
+
+* the type to use for shift amounts, and
+
+* various high-level characteristics, like whether it is profitable to turn
+  division by a constant into a multiplication sequence.
+
+The ``TargetRegisterInfo`` class
+--------------------------------
+
+The ``TargetRegisterInfo`` class is used to describe the register file of the
+target and any interactions between the registers.
+
+Registers in the code generator are represented by unsigned integers. Physical
+registers (those that actually exist in the target description) are unique
+small numbers, and virtual registers are generally large. Note that register
+``#0`` is reserved as a flag value.
+
+Each register in the processor description has an associated
+``TargetRegisterDesc`` entry, which provides a textual name for the register
+(used for assembly output and debugging dumps) and a set of aliases (used to
+indicate whether one register overlaps with another).
+
+In addition to the per-register description, the ``TargetRegisterInfo`` class
+exposes a set of processor specific register classes (instances of the
+``TargetRegisterClass`` class). Each register class contains sets of registers
+that have the same properties (for example, they are all 32-bit integer
+registers). Each SSA virtual register created by the instruction selector has
+an associated register class. When the register allocator runs, it replaces
+virtual registers with a physical register in the set.
+
+The target-specific implementations of these classes are auto-generated from a
+`TableGen <TableGenFundamentals.html>`_ description of the register file.
+
+.. _TargetInstrInfo:
+
+The ``TargetInstrInfo`` class
+-----------------------------
+
+The ``TargetInstrInfo`` class is used to describe the machine instructions
+supported by the target.
It is essentially an array of ``TargetInstrDescriptor`` +objects, each of which describes one instruction the target +supports. Descriptors define things like the mnemonic for the opcode, the number +of operands, the list of implicit register uses and defs, whether the +instruction has certain target-independent properties (accesses memory, is +commutable, etc), and holds any target-specific flags. + +The ``TargetFrameInfo`` class +----------------------------- + +The ``TargetFrameInfo`` class is used to provide information about the stack +frame layout of the target. It holds the direction of stack growth, the known +stack alignment on entry to each function, and the offset to the local area. +The offset to the local area is the offset from the stack pointer on function +entry to the first location where function data (local variables, spill +locations) can be stored. + +The ``TargetSubtarget`` class +----------------------------- + +The ``TargetSubtarget`` class is used to provide information about the specific +chip set being targeted. A sub-target informs code generation of which +instructions are supported, instruction latencies and instruction execution +itinerary; i.e., which processing units are used, in what order, and for how +long. + +The ``TargetJITInfo`` class +--------------------------- + +The ``TargetJITInfo`` class exposes an abstract interface used by the +Just-In-Time code generator to perform target-specific activities, such as +emitting stubs. If a ``TargetMachine`` supports JIT code generation, it should +provide one of these objects through the ``getJITInfo`` method. + +.. _code being generated: +.. _machine code representation: + +Machine code description classes +================================ + +At the high-level, LLVM code is translated to a machine specific representation +formed out of :raw-html:`<tt>` `MachineFunction`_ :raw-html:`</tt>`, +:raw-html:`<tt>` `MachineBasicBlock`_ :raw-html:`</tt>`, and :raw-html:`<tt>` +`MachineInstr`_ :raw-html:`</tt>` instances (defined in +``include/llvm/CodeGen``). This representation is completely target agnostic, +representing instructions in their most abstract form: an opcode and a series of +operands. This representation is designed to support both an SSA representation +for machine code, as well as a register allocated, non-SSA form. + +.. _MachineInstr: + +The ``MachineInstr`` class +-------------------------- + +Target machine instructions are represented as instances of the ``MachineInstr`` +class. This class is an extremely abstract way of representing machine +instructions. In particular, it only keeps track of an opcode number and a set +of operands. + +The opcode number is a simple unsigned integer that only has meaning to a +specific backend. All of the instructions for a target should be defined in the +``*InstrInfo.td`` file for the target. The opcode enum values are auto-generated +from this description. The ``MachineInstr`` class does not have any information +about how to interpret the instruction (i.e., what the semantics of the +instruction are); for that you must refer to the :raw-html:`<tt>` +`TargetInstrInfo`_ :raw-html:`</tt>` class. + +The operands of a machine instruction can be of several different types: a +register reference, a constant integer, a basic block reference, etc. In +addition, a machine operand should be marked as a def or a use of the value +(though only registers are allowed to be defs). 
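+
+To make the operand model concrete, here is a minimal sketch (not code from the
+LLVM tree) that walks a ``MachineInstr`` and classifies each operand, noting
+whether register operands are defs or uses. The helper name is invented for
+illustration; the ``MachineInstr`` and ``MachineOperand`` accessors are the
+standard ones:
+
+.. code-block:: c++
+
+  #include "llvm/CodeGen/MachineInstr.h"
+  #include "llvm/CodeGen/MachineOperand.h"
+  #include "llvm/Support/raw_ostream.h"
+
+  using namespace llvm;
+
+  // Hypothetical helper: print a coarse classification of MI's operands.
+  static void describeOperands(const MachineInstr &MI) {
+    for (unsigned i = 0, e = MI.getNumOperands(); i != e; ++i) {
+      const MachineOperand &MO = MI.getOperand(i);
+      if (MO.isReg())
+        errs() << "  operand " << i << ": register " << MO.getReg()
+               << (MO.isDef() ? " (def)\n" : " (use)\n");
+      else if (MO.isImm())
+        errs() << "  operand " << i << ": immediate " << MO.getImm() << "\n";
+      else if (MO.isMBB())
+        errs() << "  operand " << i << ": basic block operand\n";
+      else
+        errs() << "  operand " << i << ": other (FP immediate, global, ...)\n";
+    }
+  }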
+ +By convention, the LLVM code generator orders instruction operands so that all +register definitions come before the register uses, even on architectures that +are normally printed in other orders. For example, the SPARC add instruction: +"``add %i1, %i2, %i3``" adds the "%i1", and "%i2" registers and stores the +result into the "%i3" register. In the LLVM code generator, the operands should +be stored as "``%i3, %i1, %i2``": with the destination first. + +Keeping destination (definition) operands at the beginning of the operand list +has several advantages. In particular, the debugging printer will print the +instruction like this: + +.. code-block:: llvm + + %r3 = add %i1, %i2 + +Also if the first operand is a def, it is easier to `create instructions`_ whose +only def is the first operand. + +.. _create instructions: + +Using the ``MachineInstrBuilder.h`` functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Machine instructions are created by using the ``BuildMI`` functions, located in +the ``include/llvm/CodeGen/MachineInstrBuilder.h`` file. The ``BuildMI`` +functions make it easy to build arbitrary machine instructions. Usage of the +``BuildMI`` functions look like this: + +.. code-block:: c++ + + // Create a 'DestReg = mov 42' (rendered in X86 assembly as 'mov DestReg, 42') + // instruction. The '1' specifies how many operands will be added. + MachineInstr *MI = BuildMI(X86::MOV32ri, 1, DestReg).addImm(42); + + // Create the same instr, but insert it at the end of a basic block. + MachineBasicBlock &MBB = ... + BuildMI(MBB, X86::MOV32ri, 1, DestReg).addImm(42); + + // Create the same instr, but insert it before a specified iterator point. + MachineBasicBlock::iterator MBBI = ... + BuildMI(MBB, MBBI, X86::MOV32ri, 1, DestReg).addImm(42); + + // Create a 'cmp Reg, 0' instruction, no destination reg. + MI = BuildMI(X86::CMP32ri, 2).addReg(Reg).addImm(0); + + // Create an 'sahf' instruction which takes no operands and stores nothing. + MI = BuildMI(X86::SAHF, 0); + + // Create a self looping branch instruction. + BuildMI(MBB, X86::JNE, 1).addMBB(&MBB); + +The key thing to remember with the ``BuildMI`` functions is that you have to +specify the number of operands that the machine instruction will take. This +allows for efficient memory allocation. You also need to specify if operands +default to be uses of values, not definitions. If you need to add a definition +operand (other than the optional destination register), you must explicitly mark +it as such: + +.. code-block:: c++ + + MI.addReg(Reg, RegState::Define); + +Fixed (preassigned) registers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +One important issue that the code generator needs to be aware of is the presence +of fixed registers. In particular, there are often places in the instruction +stream where the register allocator *must* arrange for a particular value to be +in a particular register. This can occur due to limitations of the instruction +set (e.g., the X86 can only do a 32-bit divide with the ``EAX``/``EDX`` +registers), or external factors like calling conventions. In any case, the +instruction selector should emit code that copies a virtual register into or out +of a physical register when needed. + +For example, consider this simple LLVM example: + +.. code-block:: llvm + + define i32 @test(i32 %X, i32 %Y) { + %Z = udiv i32 %X, %Y + ret i32 %Z + } + +The X86 instruction selector produces this machine code for the ``div`` and +``ret`` (use "``llc X.bc -march=x86 -print-machineinstrs``" to get this): + +.. 
code-block:: llvm + + ;; Start of div + %EAX = mov %reg1024 ;; Copy X (in reg1024) into EAX + %reg1027 = sar %reg1024, 31 + %EDX = mov %reg1027 ;; Sign extend X into EDX + idiv %reg1025 ;; Divide by Y (in reg1025) + %reg1026 = mov %EAX ;; Read the result (Z) out of EAX + + ;; Start of ret + %EAX = mov %reg1026 ;; 32-bit return value goes in EAX + ret + +By the end of code generation, the register allocator has coalesced the +registers and deleted the resultant identity moves producing the following +code: + +.. code-block:: llvm + + ;; X is in EAX, Y is in ECX + mov %EAX, %EDX + sar %EDX, 31 + idiv %ECX + ret + +This approach is extremely general (if it can handle the X86 architecture, it +can handle anything!) and allows all of the target specific knowledge about the +instruction stream to be isolated in the instruction selector. Note that +physical registers should have a short lifetime for good code generation, and +all physical registers are assumed dead on entry to and exit from basic blocks +(before register allocation). Thus, if you need a value to be live across basic +block boundaries, it *must* live in a virtual register. + +Call-clobbered registers +^^^^^^^^^^^^^^^^^^^^^^^^ + +Some machine instructions, like calls, clobber a large number of physical +registers. Rather than adding ``<def,dead>`` operands for all of them, it is +possible to use an ``MO_RegisterMask`` operand instead. The register mask +operand holds a bit mask of preserved registers, and everything else is +considered to be clobbered by the instruction. + +Machine code in SSA form +^^^^^^^^^^^^^^^^^^^^^^^^ + +``MachineInstr``'s are initially selected in SSA-form, and are maintained in +SSA-form until register allocation happens. For the most part, this is +trivially simple since LLVM is already in SSA form; LLVM PHI nodes become +machine code PHI nodes, and virtual registers are only allowed to have a single +definition. + +After register allocation, machine code is no longer in SSA-form because there +are no virtual registers left in the code. + +.. _MachineBasicBlock: + +The ``MachineBasicBlock`` class +------------------------------- + +The ``MachineBasicBlock`` class contains a list of machine instructions +(:raw-html:`<tt>` `MachineInstr`_ :raw-html:`</tt>` instances). It roughly +corresponds to the LLVM code input to the instruction selector, but there can be +a one-to-many mapping (i.e. one LLVM basic block can map to multiple machine +basic blocks). The ``MachineBasicBlock`` class has a "``getBasicBlock``" method, +which returns the LLVM basic block that it comes from. + +.. _MachineFunction: + +The ``MachineFunction`` class +----------------------------- + +The ``MachineFunction`` class contains a list of machine basic blocks +(:raw-html:`<tt>` `MachineBasicBlock`_ :raw-html:`</tt>` instances). It +corresponds one-to-one with the LLVM function input to the instruction selector. +In addition to a list of basic blocks, the ``MachineFunction`` contains a a +``MachineConstantPool``, a ``MachineFrameInfo``, a ``MachineFunctionInfo``, and +a ``MachineRegisterInfo``. See ``include/llvm/CodeGen/MachineFunction.h`` for +more information. + +``MachineInstr Bundles`` +------------------------ + +LLVM code generator can model sequences of instructions as MachineInstr +bundles. A MI bundle can model a VLIW group / pack which contains an arbitrary +number of parallel instructions. It can also be used to model a sequential list +of instructions (potentially with data dependencies) that cannot be legally +separated (e.g. 
ARM Thumb2 IT blocks). + +Conceptually a MI bundle is a MI with a number of other MIs nested within: + +:: + + -------------- + | Bundle | --------- + -------------- \ + | ---------------- + | | MI | + | ---------------- + | | + | ---------------- + | | MI | + | ---------------- + | | + | ---------------- + | | MI | + | ---------------- + | + -------------- + | Bundle | -------- + -------------- \ + | ---------------- + | | MI | + | ---------------- + | | + | ---------------- + | | MI | + | ---------------- + | | + | ... + | + -------------- + | Bundle | -------- + -------------- \ + | + ... + +MI bundle support does not change the physical representations of +MachineBasicBlock and MachineInstr. All the MIs (including top level and nested +ones) are stored as sequential list of MIs. The "bundled" MIs are marked with +the 'InsideBundle' flag. A top level MI with the special BUNDLE opcode is used +to represent the start of a bundle. It's legal to mix BUNDLE MIs with indiviual +MIs that are not inside bundles nor represent bundles. + +MachineInstr passes should operate on a MI bundle as a single unit. Member +methods have been taught to correctly handle bundles and MIs inside bundles. +The MachineBasicBlock iterator has been modified to skip over bundled MIs to +enforce the bundle-as-a-single-unit concept. An alternative iterator +instr_iterator has been added to MachineBasicBlock to allow passes to iterate +over all of the MIs in a MachineBasicBlock, including those which are nested +inside bundles. The top level BUNDLE instruction must have the correct set of +register MachineOperand's that represent the cumulative inputs and outputs of +the bundled MIs. + +Packing / bundling of MachineInstr's should be done as part of the register +allocation super-pass. More specifically, the pass which determines what MIs +should be bundled together must be done after code generator exits SSA form +(i.e. after two-address pass, PHI elimination, and copy coalescing). Bundles +should only be finalized (i.e. adding BUNDLE MIs and input and output register +MachineOperands) after virtual registers have been rewritten into physical +registers. This requirement eliminates the need to add virtual register operands +to BUNDLE instructions which would effectively double the virtual register def +and use lists. + +.. _MC Layer: + +The "MC" Layer +============== + +The MC Layer is used to represent and process code at the raw machine code +level, devoid of "high level" information like "constant pools", "jump tables", +"global variables" or anything like that. At this level, LLVM handles things +like label names, machine instructions, and sections in the object file. The +code in this layer is used for a number of important purposes: the tail end of +the code generator uses it to write a .s or .o file, and it is also used by the +llvm-mc tool to implement standalone machine code assemblers and disassemblers. + +This section describes some of the important classes. There are also a number +of important subsystems that interact at this layer, they are described later in +this manual. + +.. _MCStreamer: + +The ``MCStreamer`` API +---------------------- + +MCStreamer is best thought of as an assembler API. It is an abstract API which +is *implemented* in different ways (e.g. to output a .s file, output an ELF .o +file, etc) but whose API correspond directly to what you see in a .s file. 
+MCStreamer has one method per directive, such as EmitLabel, EmitSymbolAttribute, +SwitchSection, EmitValue (for .byte, .word), etc, which directly correspond to +assembly level directives. It also has an EmitInstruction method, which is used +to output an MCInst to the streamer. + +This API is most important for two clients: the llvm-mc stand-alone assembler is +effectively a parser that parses a line, then invokes a method on MCStreamer. In +the code generator, the `Code Emission`_ phase of the code generator lowers +higher level LLVM IR and Machine* constructs down to the MC layer, emitting +directives through MCStreamer. + +On the implementation side of MCStreamer, there are two major implementations: +one for writing out a .s file (MCAsmStreamer), and one for writing out a .o +file (MCObjectStreamer). MCAsmStreamer is a straight-forward implementation +that prints out a directive for each method (e.g. ``EmitValue -> .byte``), but +MCObjectStreamer implements a full assembler. + +The ``MCContext`` class +----------------------- + +The MCContext class is the owner of a variety of uniqued data structures at the +MC layer, including symbols, sections, etc. As such, this is the class that you +interact with to create symbols and sections. This class can not be subclassed. + +The ``MCSymbol`` class +---------------------- + +The MCSymbol class represents a symbol (aka label) in the assembly file. There +are two interesting kinds of symbols: assembler temporary symbols, and normal +symbols. Assembler temporary symbols are used and processed by the assembler +but are discarded when the object file is produced. The distinction is usually +represented by adding a prefix to the label, for example "L" labels are +assembler temporary labels in MachO. + +MCSymbols are created by MCContext and uniqued there. This means that MCSymbols +can be compared for pointer equivalence to find out if they are the same symbol. +Note that pointer inequality does not guarantee the labels will end up at +different addresses though. It's perfectly legal to output something like this +to the .s file: + +:: + + foo: + bar: + .byte 4 + +In this case, both the foo and bar symbols will have the same address. + +The ``MCSection`` class +----------------------- + +The ``MCSection`` class represents an object-file specific section. It is +subclassed by object file specific implementations (e.g. ``MCSectionMachO``, +``MCSectionCOFF``, ``MCSectionELF``) and these are created and uniqued by +MCContext. The MCStreamer has a notion of the current section, which can be +changed with the SwitchToSection method (which corresponds to a ".section" +directive in a .s file). + +.. _MCInst: + +The ``MCInst`` class +-------------------- + +The ``MCInst`` class is a target-independent representation of an instruction. +It is a simple class (much more so than `MachineInstr`_) that holds a +target-specific opcode and a vector of MCOperands. MCOperand, in turn, is a +simple discriminated union of three cases: 1) a simple immediate, 2) a target +register ID, 3) a symbolic expression (e.g. "``Lfoo-Lbar+42``") as an MCExpr. + +MCInst is the common currency used to represent machine instructions at the MC +layer. It is the type used by the instruction encoder, the instruction printer, +and the type generated by the assembly parser and disassembler. + +.. _Target-independent algorithms: +.. 
_code generation algorithm: + +Target-independent code generation algorithms +============================================= + +This section documents the phases described in the `high-level design of the +code generator`_. It explains how they work and some of the rationale behind +their design. + +.. _Instruction Selection: +.. _instruction selection section: + +Instruction Selection +--------------------- + +Instruction Selection is the process of translating LLVM code presented to the +code generator into target-specific machine instructions. There are several +well-known ways to do this in the literature. LLVM uses a SelectionDAG based +instruction selector. + +Portions of the DAG instruction selector are generated from the target +description (``*.td``) files. Our goal is for the entire instruction selector +to be generated from these ``.td`` files, though currently there are still +things that require custom C++ code. + +.. _SelectionDAG: + +Introduction to SelectionDAGs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The SelectionDAG provides an abstraction for code representation in a way that +is amenable to instruction selection using automatic techniques +(e.g. dynamic-programming based optimal pattern matching selectors). It is also +well-suited to other phases of code generation; in particular, instruction +scheduling (SelectionDAG's are very close to scheduling DAGs post-selection). +Additionally, the SelectionDAG provides a host representation where a large +variety of very-low-level (but target-independent) `optimizations`_ may be +performed; ones which require extensive information about the instructions +efficiently supported by the target. + +The SelectionDAG is a Directed-Acyclic-Graph whose nodes are instances of the +``SDNode`` class. The primary payload of the ``SDNode`` is its operation code +(Opcode) that indicates what operation the node performs and the operands to the +operation. The various operation node types are described at the top of the +``include/llvm/CodeGen/SelectionDAGNodes.h`` file. + +Although most operations define a single value, each node in the graph may +define multiple values. For example, a combined div/rem operation will define +both the dividend and the remainder. Many other situations require multiple +values as well. Each node also has some number of operands, which are edges to +the node defining the used value. Because nodes may define multiple values, +edges are represented by instances of the ``SDValue`` class, which is a +``<SDNode, unsigned>`` pair, indicating the node and result value being used, +respectively. Each value produced by an ``SDNode`` has an associated ``MVT`` +(Machine Value Type) indicating what the type of the value is. + +SelectionDAGs contain two different kinds of values: those that represent data +flow and those that represent control flow dependencies. Data values are simple +edges with an integer or floating point value type. Control edges are +represented as "chain" edges which are of type ``MVT::Other``. These edges +provide an ordering between nodes that have side effects (such as loads, stores, +calls, returns, etc). All nodes that have side effects should take a token +chain as input and produce a new one as output. By convention, token chain +inputs are always operand #0, and chain results are always the last value +produced by an operation. + +A SelectionDAG has designated "Entry" and "Root" nodes. The Entry node is +always a marker node with an Opcode of ``ISD::EntryToken``. 
The Root node is +the final side-effecting node in the token chain. For example, in a single basic +block function it would be the return node. + +One important concept for SelectionDAGs is the notion of a "legal" vs. +"illegal" DAG. A legal DAG for a target is one that only uses supported +operations and supported types. On a 32-bit PowerPC, for example, a DAG with a +value of type i1, i8, i16, or i64 would be illegal, as would a DAG that uses a +SREM or UREM operation. The `legalize types`_ and `legalize operations`_ phases +are responsible for turning an illegal DAG into a legal DAG. + +SelectionDAG Instruction Selection Process +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +SelectionDAG-based instruction selection consists of the following steps: + +#. `Build initial DAG`_ --- This stage performs a simple translation from the + input LLVM code to an illegal SelectionDAG. + +#. `Optimize SelectionDAG`_ --- This stage performs simple optimizations on the + SelectionDAG to simplify it, and recognize meta instructions (like rotates + and ``div``/``rem`` pairs) for targets that support these meta operations. + This makes the resultant code more efficient and the `select instructions + from DAG`_ phase (below) simpler. + +#. `Legalize SelectionDAG Types`_ --- This stage transforms SelectionDAG nodes + to eliminate any types that are unsupported on the target. + +#. `Optimize SelectionDAG`_ --- The SelectionDAG optimizer is run to clean up + redundancies exposed by type legalization. + +#. `Legalize SelectionDAG Ops`_ --- This stage transforms SelectionDAG nodes to + eliminate any operations that are unsupported on the target. + +#. `Optimize SelectionDAG`_ --- The SelectionDAG optimizer is run to eliminate + inefficiencies introduced by operation legalization. + +#. `Select instructions from DAG`_ --- Finally, the target instruction selector + matches the DAG operations to target instructions. This process translates + the target-independent input DAG into another DAG of target instructions. + +#. `SelectionDAG Scheduling and Formation`_ --- The last phase assigns a linear + order to the instructions in the target-instruction DAG and emits them into + the MachineFunction being compiled. This step uses traditional prepass + scheduling techniques. + +After all of these steps are complete, the SelectionDAG is destroyed and the +rest of the code generation passes are run. + +One great way to visualize what is going on here is to take advantage of a few +LLC command line options. The following options pop up a window displaying the +SelectionDAG at specific times (if you only get errors printed to the console +while using this, you probably `need to configure your +system <ProgrammersManual.html#ViewGraph>`_ to add support for it). + +* ``-view-dag-combine1-dags`` displays the DAG after being built, before the + first optimization pass. + +* ``-view-legalize-dags`` displays the DAG before Legalization. + +* ``-view-dag-combine2-dags`` displays the DAG before the second optimization + pass. + +* ``-view-isel-dags`` displays the DAG before the Select phase. + +* ``-view-sched-dags`` displays the DAG before Scheduling. + +The ``-view-sunit-dags`` displays the Scheduler's dependency graph. This graph +is based on the final SelectionDAG, with nodes that must be scheduled together +bundled into a single scheduling-unit node, and with immediate operands and +other nodes that aren't relevant for scheduling omitted. + +.. 
_Build initial DAG: + +Initial SelectionDAG Construction +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The initial SelectionDAG is na\ :raw-html:`ï`\ vely peephole expanded from +the LLVM input by the ``SelectionDAGLowering`` class in the +``lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp`` file. The intent of this pass +is to expose as much low-level, target-specific details to the SelectionDAG as +possible. This pass is mostly hard-coded (e.g. an LLVM ``add`` turns into an +``SDNode add`` while a ``getelementptr`` is expanded into the obvious +arithmetic). This pass requires target-specific hooks to lower calls, returns, +varargs, etc. For these features, the :raw-html:`<tt>` `TargetLowering`_ +:raw-html:`</tt>` interface is used. + +.. _legalize types: +.. _Legalize SelectionDAG Types: +.. _Legalize SelectionDAG Ops: + +SelectionDAG LegalizeTypes Phase +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The Legalize phase is in charge of converting a DAG to only use the types that +are natively supported by the target. + +There are two main ways of converting values of unsupported scalar types to +values of supported types: converting small types to larger types ("promoting"), +and breaking up large integer types into smaller ones ("expanding"). For +example, a target might require that all f32 values are promoted to f64 and that +all i1/i8/i16 values are promoted to i32. The same target might require that +all i64 values be expanded into pairs of i32 values. These changes can insert +sign and zero extensions as needed to make sure that the final code has the same +behavior as the input. + +There are two main ways of converting values of unsupported vector types to +value of supported types: splitting vector types, multiple times if necessary, +until a legal type is found, and extending vector types by adding elements to +the end to round them out to legal types ("widening"). If a vector gets split +all the way down to single-element parts with no supported vector type being +found, the elements are converted to scalars ("scalarizing"). + +A target implementation tells the legalizer which types are supported (and which +register class to use for them) by calling the ``addRegisterClass`` method in +its TargetLowering constructor. + +.. _legalize operations: +.. _Legalizer: + +SelectionDAG Legalize Phase +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The Legalize phase is in charge of converting a DAG to only use the operations +that are natively supported by the target. + +Targets often have weird constraints, such as not supporting every operation on +every supported datatype (e.g. X86 does not support byte conditional moves and +PowerPC does not support sign-extending loads from a 16-bit memory location). +Legalize takes care of this by open-coding another sequence of operations to +emulate the operation ("expansion"), by promoting one type to a larger type that +supports the operation ("promotion"), or by using a target-specific hook to +implement the legalization ("custom"). + +A target implementation tells the legalizer which operations are not supported +(and which of the above three actions to take) by calling the +``setOperationAction`` method in its ``TargetLowering`` constructor. + +Prior to the existence of the Legalize passes, we required that every target +`selector`_ supported and handled every operator and type even if they are not +natively supported. 
The introduction of the Legalize phases allows all of the +canonicalization patterns to be shared across targets, and makes it very easy to +optimize the canonicalized code because it is still in the form of a DAG. + +.. _optimizations: +.. _Optimize SelectionDAG: +.. _selector: + +SelectionDAG Optimization Phase: the DAG Combiner +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The SelectionDAG optimization phase is run multiple times for code generation, +immediately after the DAG is built and once after each legalization. The first +run of the pass allows the initial code to be cleaned up (e.g. performing +optimizations that depend on knowing that the operators have restricted type +inputs). Subsequent runs of the pass clean up the messy code generated by the +Legalize passes, which allows Legalize to be very simple (it can focus on making +code legal instead of focusing on generating *good* and legal code). + +One important class of optimizations performed is optimizing inserted sign and +zero extension instructions. We currently use ad-hoc techniques, but could move +to more rigorous techniques in the future. Here are some good papers on the +subject: + +"`Widening integer arithmetic <http://www.eecs.harvard.edu/~nr/pubs/widen-abstract.html>`_" :raw-html:`<br>` +Kevin Redwine and Norman Ramsey :raw-html:`<br>` +International Conference on Compiler Construction (CC) 2004 + +"`Effective sign extension elimination <http://portal.acm.org/citation.cfm?doid=512529.512552>`_" :raw-html:`<br>` +Motohiro Kawahito, Hideaki Komatsu, and Toshio Nakatani :raw-html:`<br>` +Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design +and Implementation. + +.. _Select instructions from DAG: + +SelectionDAG Select Phase +^^^^^^^^^^^^^^^^^^^^^^^^^ + +The Select phase is the bulk of the target-specific code for instruction +selection. This phase takes a legal SelectionDAG as input, pattern matches the +instructions supported by the target to this DAG, and produces a new DAG of +target code. For example, consider the following LLVM fragment: + +.. code-block:: llvm + + %t1 = fadd float %W, %X + %t2 = fmul float %t1, %Y + %t3 = fadd float %t2, %Z + +This LLVM code corresponds to a SelectionDAG that looks basically like this: + +.. code-block:: llvm + + (fadd:f32 (fmul:f32 (fadd:f32 W, X), Y), Z) + +If a target supports floating point multiply-and-add (FMA) operations, one of +the adds can be merged with the multiply. On the PowerPC, for example, the +output of the instruction selector might look like this DAG: + +:: + + (FMADDS (FADDS W, X), Y, Z) + +The ``FMADDS`` instruction is a ternary instruction that multiplies its first +two operands and adds the third (as single-precision floating-point numbers). +The ``FADDS`` instruction is a simple binary single-precision add instruction. +To perform this pattern match, the PowerPC backend includes the following +instruction definitions: + +:: + + def FMADDS : AForm_1<59, 29, + (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRC, F4RC:$FRB), + "fmadds $FRT, $FRA, $FRC, $FRB", + [(set F4RC:$FRT, (fadd (fmul F4RC:$FRA, F4RC:$FRC), + F4RC:$FRB))]>; + def FADDS : AForm_2<59, 21, + (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRB), + "fadds $FRT, $FRA, $FRB", + [(set F4RC:$FRT, (fadd F4RC:$FRA, F4RC:$FRB))]>; + +The portion of the instruction definition in bold indicates the pattern used to +match the instruction. The DAG operators (like ``fmul``/``fadd``) are defined +in the ``include/llvm/Target/TargetSelectionDAG.td`` file. 
" ``F4RC``" is the +register class of the input and result values. + +The TableGen DAG instruction selector generator reads the instruction patterns +in the ``.td`` file and automatically builds parts of the pattern matching code +for your target. It has the following strengths: + +* At compiler-compiler time, it analyzes your instruction patterns and tells you + if your patterns make sense or not. + +* It can handle arbitrary constraints on operands for the pattern match. In + particular, it is straight-forward to say things like "match any immediate + that is a 13-bit sign-extended value". For examples, see the ``immSExt16`` + and related ``tblgen`` classes in the PowerPC backend. + +* It knows several important identities for the patterns defined. For example, + it knows that addition is commutative, so it allows the ``FMADDS`` pattern + above to match "``(fadd X, (fmul Y, Z))``" as well as "``(fadd (fmul X, Y), + Z)``", without the target author having to specially handle this case. + +* It has a full-featured type-inferencing system. In particular, you should + rarely have to explicitly tell the system what type parts of your patterns + are. In the ``FMADDS`` case above, we didn't have to tell ``tblgen`` that all + of the nodes in the pattern are of type 'f32'. It was able to infer and + propagate this knowledge from the fact that ``F4RC`` has type 'f32'. + +* Targets can define their own (and rely on built-in) "pattern fragments". + Pattern fragments are chunks of reusable patterns that get inlined into your + patterns during compiler-compiler time. For example, the integer "``(not + x)``" operation is actually defined as a pattern fragment that expands as + "``(xor x, -1)``", since the SelectionDAG does not have a native '``not``' + operation. Targets can define their own short-hand fragments as they see fit. + See the definition of '``not``' and '``ineg``' for examples. + +* In addition to instructions, targets can specify arbitrary patterns that map + to one or more instructions using the 'Pat' class. For example, the PowerPC + has no way to load an arbitrary integer immediate into a register in one + instruction. To tell tblgen how to do this, it defines: + + :: + + // Arbitrary immediate support. Implement in terms of LIS/ORI. + def : Pat<(i32 imm:$imm), + (ORI (LIS (HI16 imm:$imm)), (LO16 imm:$imm))>; + + If none of the single-instruction patterns for loading an immediate into a + register match, this will be used. This rule says "match an arbitrary i32 + immediate, turning it into an ``ORI`` ('or a 16-bit immediate') and an ``LIS`` + ('load 16-bit immediate, where the immediate is shifted to the left 16 bits') + instruction". To make this work, the ``LO16``/``HI16`` node transformations + are used to manipulate the input immediate (in this case, take the high or low + 16-bits of the immediate). + +* While the system does automate a lot, it still allows you to write custom C++ + code to match special cases if there is something that is hard to + express. + +While it has many strengths, the system currently has some limitations, +primarily because it is a work in progress and is not yet finished: + +* Overall, there is no way to define or match SelectionDAG nodes that define + multiple values (e.g. ``SMUL_LOHI``, ``LOAD``, ``CALL``, etc). This is the + biggest reason that you currently still *have to* write custom C++ code + for your instruction selector. + +* There is no great way to support matching complex addressing modes yet. 
In + the future, we will extend pattern fragments to allow them to define multiple + values (e.g. the four operands of the `X86 addressing mode`_, which are + currently matched with custom C++ code). In addition, we'll extend fragments + so that a fragment can match multiple different patterns. + +* We don't automatically infer flags like ``isStore``/``isLoad`` yet. + +* We don't automatically generate the set of supported registers and operations + for the `Legalizer`_ yet. + +* We don't have a way of tying in custom legalized nodes yet. + +Despite these limitations, the instruction selector generator is still quite +useful for most of the binary and logical operations in typical instruction +sets. If you run into any problems or can't figure out how to do something, +please let Chris know! + +.. _Scheduling and Formation: +.. _SelectionDAG Scheduling and Formation: + +SelectionDAG Scheduling and Formation Phase +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The scheduling phase takes the DAG of target instructions from the selection +phase and assigns an order. The scheduler can pick an order depending on +various constraints of the machines (i.e. order for minimal register pressure or +try to cover instruction latencies). Once an order is established, the DAG is +converted to a list of :raw-html:`<tt>` `MachineInstr`_\s :raw-html:`</tt>` and +the SelectionDAG is destroyed. + +Note that this phase is logically separate from the instruction selection phase, +but is tied to it closely in the code because it operates on SelectionDAGs. + +Future directions for the SelectionDAG +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +#. Optional function-at-a-time selection. + +#. Auto-generate entire selector from ``.td`` file. + +.. _SSA-based Machine Code Optimizations: + +SSA-based Machine Code Optimizations +------------------------------------ + +To Be Written + +Live Intervals +-------------- + +Live Intervals are the ranges (intervals) where a variable is *live*. They are +used by some `register allocator`_ passes to determine if two or more virtual +registers which require the same physical register are live at the same point in +the program (i.e., they conflict). When this situation occurs, one virtual +register must be *spilled*. + +Live Variable Analysis +^^^^^^^^^^^^^^^^^^^^^^ + +The first step in determining the live intervals of variables is to calculate +the set of registers that are immediately dead after the instruction (i.e., the +instruction calculates the value, but it is never used) and the set of registers +that are used by the instruction, but are never used after the instruction +(i.e., they are killed). Live variable information is computed for +each *virtual* register and *register allocatable* physical register +in the function. This is done in a very efficient manner because it uses SSA to +sparsely compute lifetime information for virtual registers (which are in SSA +form) and only has to track physical registers within a block. Before register +allocation, LLVM can assume that physical registers are only live within a +single basic block. This allows it to do a single, local analysis to resolve +physical register lifetimes within each basic block. If a physical register is +not register allocatable (e.g., a stack pointer or condition codes), it is not +tracked. + +Physical registers may be live in to or out of a function. Live in values are +typically arguments in registers. Live out values are typically return values in +registers. 
Live in values are marked as such, and are given a dummy "defining"
+instruction during live intervals analysis. If the last basic block of a
+function is a ``return``, then it's marked as using all live out values in the
+function.
+
+``PHI`` nodes need to be handled specially, because the calculation of the live
+variable information from a depth first traversal of the CFG of the function
+won't guarantee that a virtual register used by the ``PHI`` node is defined
+before it's used. When a ``PHI`` node is encountered, only the definition is
+handled, because the uses will be handled in other basic blocks.
+
+For each ``PHI`` node of the current basic block, we simulate an assignment at
+the end of the current basic block and traverse the successor basic blocks. If a
+successor basic block has a ``PHI`` node and one of the ``PHI`` node's operands
+is coming from the current basic block, then the variable is marked as *alive*
+within the current basic block and all of its predecessor basic blocks, until
+the basic block with the defining instruction is encountered.
+
+Live Intervals Analysis
+^^^^^^^^^^^^^^^^^^^^^^^
+
+We now have the information available to perform the live intervals analysis and
+build the live intervals themselves. We start off by numbering the basic blocks
+and machine instructions. We then handle the "live-in" values. These are in
+physical registers, so the physical register is assumed to be killed by the end
+of the basic block. Live intervals for virtual registers are computed for some
+ordering of the machine instructions ``[1, N]``. A live interval is an interval
+``[i, j)``, where ``1 <= i <= j <= N``, for which a variable is live.
+
+.. note::
+   More to come...
+
+.. _Register Allocation:
+.. _register allocator:
+
+Register Allocation
+-------------------
+
+The *Register Allocation problem* consists of mapping a program
+:raw-html:`<b><tt>` P\ :sub:`v`\ :raw-html:`</tt></b>`, which can use an unbounded
+number of virtual registers, to a program :raw-html:`<b><tt>` P\ :sub:`p`\
+:raw-html:`</tt></b>` that contains a finite (possibly small) number of physical
+registers. Each target architecture has a different number of physical
+registers. If the number of physical registers is not enough to accommodate all
+the virtual registers, some of them will have to be mapped into memory. These
+virtuals are called *spilled virtuals*.
+
+How registers are represented in LLVM
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In LLVM, physical registers are denoted by integer numbers that normally range
+from 1 to 1023. To see how this numbering is defined for a particular
+architecture, you can read the ``GenRegisterNames.inc`` file for that
+architecture. For instance, by inspecting
+``lib/Target/X86/X86GenRegisterInfo.inc`` we see that the 32-bit register
+``EAX`` is denoted by 43, and the MMX register ``MM0`` is mapped to 65.
+
+Some architectures contain registers that share the same physical location. A
+notable example is the X86 platform. For instance, in the X86 architecture, the
+registers ``EAX``, ``AX`` and ``AL`` share the first eight bits. These physical
+registers are marked as *aliased* in LLVM. Given a particular architecture, you
+can check which registers are aliased by inspecting its ``RegisterInfo.td``
+file. Moreover, the class ``MCRegAliasIterator`` enumerates all the physical
+registers aliased to a register.
+
+Physical registers, in LLVM, are grouped in *Register Classes*.
Elements in the same register class are functionally equivalent, and can be
+interchangeably used. Each virtual register can only be mapped to physical
+registers of a particular class. For instance, in the X86 architecture, some
+virtuals can only be allocated to 8 bit registers. A register class is described
+by ``TargetRegisterClass`` objects. To discover if a virtual register is
+compatible with a given physical register, this code can be used:
+
+.. code-block:: c++
+
+  bool RegMapping_Fer::compatible_class(MachineFunction &mf,
+                                        unsigned v_reg,
+                                        unsigned p_reg) {
+    assert(TargetRegisterInfo::isPhysicalRegister(p_reg) &&
+           "Target register must be physical");
+    const TargetRegisterClass *trc = mf.getRegInfo().getRegClass(v_reg);
+    return trc->contains(p_reg);
+  }
+
+Sometimes, mostly for debugging purposes, it is useful to change the number of
+physical registers available in the target architecture. This must be done
+statically, inside the ``TargetRegisterInfo.td`` file. ``grep`` for
+``RegisterClass``, the last parameter of which is a list of registers;
+commenting some out is one simple way to avoid them being used. A more polite
+way is to explicitly exclude some registers from the *allocation order*. See the
+definition of the ``GR8`` register class in
+``lib/Target/X86/X86RegisterInfo.td`` for an example of this.
+
+Virtual registers are also denoted by integer numbers. Contrary to physical
+registers, different virtual registers never share the same number. Whereas
+physical registers are statically defined in a ``TargetRegisterInfo.td`` file
+and cannot be created by the application developer, that is not the case with
+virtual registers. In order to create new virtual registers, use the method
+``MachineRegisterInfo::createVirtualRegister()``. This method will return a new
+virtual register. Use an ``IndexedMap<Foo, VirtReg2IndexFunctor>`` to hold
+information per virtual register. If you need to enumerate all virtual
+registers, use the function ``TargetRegisterInfo::index2VirtReg()`` to find the
+virtual register numbers:
+
+.. code-block:: c++
+
+  for (unsigned i = 0, e = MRI->getNumVirtRegs(); i != e; ++i) {
+    unsigned VirtReg = TargetRegisterInfo::index2VirtReg(i);
+    stuff(VirtReg);
+  }
+
+Before register allocation, the operands of an instruction are mostly virtual
+registers, although physical registers may also be used. In order to check if a
+given machine operand is a register, use the boolean function
+``MachineOperand::isRegister()``. To obtain the integer code of a register, use
+``MachineOperand::getReg()``. An instruction may define or use a register. For
+instance, ``ADD reg:1026 := reg:1025 reg:1024`` defines register 1026, and
+uses registers 1024 and 1025. Given a register operand, the method
+``MachineOperand::isUse()`` indicates whether that register is being used by the
+instruction. The method ``MachineOperand::isDef()`` indicates whether that
+register is being defined.
+
+We will call physical registers present in the LLVM bitcode before register
+allocation *pre-colored registers*. Pre-colored registers are used in many
+different situations, for instance, to pass parameters of function calls, and
+to store results of particular instructions. There are two types of pre-colored
+registers: the ones *implicitly* defined, and those *explicitly*
+defined. Explicitly defined registers are normal operands, and can be accessed
+with ``MachineInstr::getOperand(int)::getReg()``.
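+
+As a purely illustrative follow-up, the sketch below scans the explicit
+operands of an instruction for pre-colored (physical) register definitions.
+The helper name is hypothetical; ``getNumExplicitOperands`` and
+``isPhysicalRegister`` are the standard accessors:
+
+.. code-block:: c++
+
+  #include "llvm/CodeGen/MachineInstr.h"
+  #include "llvm/CodeGen/MachineOperand.h"
+  #include "llvm/Target/TargetRegisterInfo.h"
+
+  using namespace llvm;
+
+  // Hypothetical helper: does MI explicitly define a physical register?
+  static bool definesPhysRegExplicitly(const MachineInstr &MI) {
+    for (unsigned i = 0, e = MI.getNumExplicitOperands(); i != e; ++i) {
+      const MachineOperand &MO = MI.getOperand(i);
+      if (MO.isReg() && MO.isDef() &&
+          TargetRegisterInfo::isPhysicalRegister(MO.getReg()))
+        return true;
+    }
+    return false;
+  }
+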
In order to check which +registers are implicitly defined by an instruction, use the +``TargetInstrInfo::get(opcode)::ImplicitDefs``, where ``opcode`` is the opcode +of the target instruction. One important difference between explicit and +implicit physical registers is that the latter are defined statically for each +instruction, whereas the former may vary depending on the program being +compiled. For example, an instruction that represents a function call will +always implicitly define or use the same set of physical registers. To read the +registers implicitly used by an instruction, use +``TargetInstrInfo::get(opcode)::ImplicitUses``. Pre-colored registers impose +constraints on any register allocation algorithm. The register allocator must +make sure that none of them are overwritten by the values of virtual registers +while still alive. + +Mapping virtual registers to physical registers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There are two ways to map virtual registers to physical registers (or to memory +slots). The first way, that we will call *direct mapping*, is based on the use +of methods of the classes ``TargetRegisterInfo``, and ``MachineOperand``. The +second way, that we will call *indirect mapping*, relies on the ``VirtRegMap`` +class in order to insert loads and stores sending and getting values to and from +memory. + +The direct mapping provides more flexibility to the developer of the register +allocator; however, it is more error prone, and demands more implementation +work. Basically, the programmer will have to specify where load and store +instructions should be inserted in the target function being compiled in order +to get and store values in memory. To assign a physical register to a virtual +register present in a given operand, use ``MachineOperand::setReg(p_reg)``. To +insert a store instruction, use ``TargetInstrInfo::storeRegToStackSlot(...)``, +and to insert a load instruction, use ``TargetInstrInfo::loadRegFromStackSlot``. + +The indirect mapping shields the application developer from the complexities of +inserting load and store instructions. In order to map a virtual register to a +physical one, use ``VirtRegMap::assignVirt2Phys(vreg, preg)``. In order to map +a certain virtual register to memory, use +``VirtRegMap::assignVirt2StackSlot(vreg)``. This method will return the stack +slot where ``vreg``'s value will be located. If it is necessary to map another +virtual register to the same stack slot, use +``VirtRegMap::assignVirt2StackSlot(vreg, stack_location)``. One important point +to consider when using the indirect mapping, is that even if a virtual register +is mapped to memory, it still needs to be mapped to a physical register. This +physical register is the location where the virtual register is supposed to be +found before being stored or after being reloaded. + +If the indirect strategy is used, after all the virtual registers have been +mapped to physical registers or stack slots, it is necessary to use a spiller +object to place load and store instructions in the code. Every virtual that has +been mapped to a stack slot will be stored to memory after been defined and will +be loaded before being used. The implementation of the spiller tries to recycle +load/store instructions, avoiding unnecessary instructions. For an example of +how to invoke the spiller, see ``RegAllocLinearScan::runOnMachineFunction`` in +``lib/CodeGen/RegAllocLinearScan.cpp``. 
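+
+To make the indirect mapping concrete, here is a small, hypothetical sketch of
+recording the decision for one virtual register with ``VirtRegMap``; the helper
+and its ``MustSpill`` flag are invented for illustration, while the
+``assignVirt2Phys`` and ``assignVirt2StackSlot`` calls are the ones described
+above:
+
+.. code-block:: c++
+
+  #include "VirtRegMap.h"  // internal header in lib/CodeGen/
+
+  using namespace llvm;
+
+  // Hypothetical allocator step: record where one virtual register will live.
+  static void recordAssignment(VirtRegMap &VRM, unsigned VirtReg,
+                               unsigned PhysReg, bool MustSpill) {
+    if (!MustSpill) {
+      // Keep the value in a register for its whole live range.
+      VRM.assignVirt2Phys(VirtReg, PhysReg);
+    } else {
+      // The value lives in memory; the spiller later inserts the actual
+      // load and store instructions around its definitions and uses.
+      int FrameIndex = VRM.assignVirt2StackSlot(VirtReg);
+      (void)FrameIndex;  // could be reused for another spilled virtual
+    }
+  }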
+ +Handling two address instructions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +With very rare exceptions (e.g., function calls), the LLVM machine code +instructions are three address instructions. That is, each instruction is +expected to define at most one register, and to use at most two registers. +However, some architectures use two address instructions. In this case, the +defined register is also one of the used register. For instance, an instruction +such as ``ADD %EAX, %EBX``, in X86 is actually equivalent to ``%EAX = %EAX + +%EBX``. + +In order to produce correct code, LLVM must convert three address instructions +that represent two address instructions into true two address instructions. LLVM +provides the pass ``TwoAddressInstructionPass`` for this specific purpose. It +must be run before register allocation takes place. After its execution, the +resulting code may no longer be in SSA form. This happens, for instance, in +situations where an instruction such as ``%a = ADD %b %c`` is converted to two +instructions such as: + +:: + + %a = MOVE %b + %a = ADD %a %c + +Notice that, internally, the second instruction is represented as ``ADD +%a[def/use] %c``. I.e., the register operand ``%a`` is both used and defined by +the instruction. + +The SSA deconstruction phase +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +An important transformation that happens during register allocation is called +the *SSA Deconstruction Phase*. The SSA form simplifies many analyses that are +performed on the control flow graph of programs. However, traditional +instruction sets do not implement PHI instructions. Thus, in order to generate +executable code, compilers must replace PHI instructions with other instructions +that preserve their semantics. + +There are many ways in which PHI instructions can safely be removed from the +target code. The most traditional PHI deconstruction algorithm replaces PHI +instructions with copy instructions. That is the strategy adopted by LLVM. The +SSA deconstruction algorithm is implemented in +``lib/CodeGen/PHIElimination.cpp``. In order to invoke this pass, the identifier +``PHIEliminationID`` must be marked as required in the code of the register +allocator. + +Instruction folding +^^^^^^^^^^^^^^^^^^^ + +*Instruction folding* is an optimization performed during register allocation +that removes unnecessary copy instructions. For instance, a sequence of +instructions such as: + +:: + + %EBX = LOAD %mem_address + %EAX = COPY %EBX + +can be safely substituted by the single instruction: + +:: + + %EAX = LOAD %mem_address + +Instructions can be folded with the +``TargetRegisterInfo::foldMemoryOperand(...)`` method. Care must be taken when +folding instructions; a folded instruction can be quite different from the +original instruction. See ``LiveIntervals::addIntervalsForSpills`` in +``lib/CodeGen/LiveIntervalAnalysis.cpp`` for an example of its use. + +Built in register allocators +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The LLVM infrastructure provides the application developer with three different +register allocators: + +* *Fast* --- This register allocator is the default for debug builds. It + allocates registers on a basic block level, attempting to keep values in + registers and reusing registers as appropriate. + +* *Basic* --- This is an incremental approach to register allocation. Live + ranges are assigned to registers one at a time in an order that is driven by + heuristics. 
Since code can be rewritten on-the-fly during allocation, this + framework allows interesting allocators to be developed as extensions. It is + not itself a production register allocator but is a potentially useful + stand-alone mode for triaging bugs and as a performance baseline. + +* *Greedy* --- *The default allocator*. This is a highly tuned implementation of + the *Basic* allocator that incorporates global live range splitting. This + allocator works hard to minimize the cost of spill code. + +* *PBQP* --- A Partitioned Boolean Quadratic Programming (PBQP) based register + allocator. This allocator works by constructing a PBQP problem representing + the register allocation problem under consideration, solving this using a PBQP + solver, and mapping the solution back to a register assignment. + +The type of register allocator used in ``llc`` can be chosen with the command +line option ``-regalloc=...``: + +.. code-block:: bash + + $ llc -regalloc=linearscan file.bc -o ln.s + $ llc -regalloc=fast file.bc -o fa.s + $ llc -regalloc=pbqp file.bc -o pbqp.s + +.. _Prolog/Epilog Code Insertion: + +Prolog/Epilog Code Insertion +---------------------------- + +Compact Unwind + +Throwing an exception requires *unwinding* out of a function. The information on +how to unwind a given function is traditionally expressed in DWARF unwind +(a.k.a. frame) info. But that format was originally developed for debuggers to +backtrace, and each Frame Description Entry (FDE) requires ~20-30 bytes per +function. There is also the cost of mapping from an address in a function to the +corresponding FDE at runtime. An alternative unwind encoding is called *compact +unwind* and requires just 4-bytes per function. + +The compact unwind encoding is a 32-bit value, which is encoded in an +architecture-specific way. It specifies which registers to restore and from +where, and how to unwind out of the function. When the linker creates a final +linked image, it will create a ``__TEXT,__unwind_info`` section. This section is +a small and fast way for the runtime to access unwind info for any given +function. If we emit compact unwind info for the function, that compact unwind +info will be encoded in the ``__TEXT,__unwind_info`` section. If we emit DWARF +unwind info, the ``__TEXT,__unwind_info`` section will contain the offset of the +FDE in the ``__TEXT,__eh_frame`` section in the final linked image. + +For X86, there are three modes for the compact unwind encoding: + +*Function with a Frame Pointer (``EBP`` or ``RBP``)* + ``EBP/RBP``-based frame, where ``EBP/RBP`` is pushed onto the stack + immediately after the return address, then ``ESP/RSP`` is moved to + ``EBP/RBP``. Thus to unwind, ``ESP/RSP`` is restored with the current + ``EBP/RBP`` value, then ``EBP/RBP`` is restored by popping the stack, and the + return is done by popping the stack once more into the PC. All non-volatile + registers that need to be restored must have been saved in a small range on + the stack that starts ``EBP-4`` to ``EBP-1020`` (``RBP-8`` to + ``RBP-1020``). The offset (divided by 4 in 32-bit mode and 8 in 64-bit mode) + is encoded in bits 16-23 (mask: ``0x00FF0000``). 
The registers saved are + encoded in bits 0-14 (mask: ``0x00007FFF``) as five 3-bit entries from the + following table: + + ============== ============= =============== + Compact Number i386 Register x86-64 Register + ============== ============= =============== + 1 ``EBX`` ``RBX`` + 2 ``ECX`` ``R12`` + 3 ``EDX`` ``R13`` + 4 ``EDI`` ``R14`` + 5 ``ESI`` ``R15`` + 6 ``EBP`` ``RBP`` + ============== ============= =============== + +*Frameless with a Small Constant Stack Size (``EBP`` or ``RBP`` is not used as a frame pointer)* + To return, a constant (encoded in the compact unwind encoding) is added to the + ``ESP/RSP``. Then the return is done by popping the stack into the PC. All + non-volatile registers that need to be restored must have been saved on the + stack immediately after the return address. The stack size (divided by 4 in + 32-bit mode and 8 in 64-bit mode) is encoded in bits 16-23 (mask: + ``0x00FF0000``). There is a maximum stack size of 1024 bytes in 32-bit mode + and 2048 in 64-bit mode. The number of registers saved is encoded in bits 9-12 + (mask: ``0x00001C00``). Bits 0-9 (mask: ``0x000003FF``) contain which + registers were saved and their order. (See the + ``encodeCompactUnwindRegistersWithoutFrame()`` function in + ``lib/Target/X86FrameLowering.cpp`` for the encoding algorithm.) + +*Frameless with a Large Constant Stack Size (``EBP`` or ``RBP`` is not used as a frame pointer)* + This case is like the "Frameless with a Small Constant Stack Size" case, but + the stack size is too large to encode in the compact unwind encoding. Instead + it requires that the function contains "``subl $nnnnnn, %esp``" in its + prolog. The compact encoding contains the offset to the ``$nnnnnn`` value in + the function in bits 9-12 (mask: ``0x00001C00``). + +.. _Late Machine Code Optimizations: + +Late Machine Code Optimizations +------------------------------- + +.. note:: + + To Be Written + +.. _Code Emission: + +Code Emission +------------- + +The code emission step of code generation is responsible for lowering from the +code generator abstractions (like `MachineFunction`_, `MachineInstr`_, etc) down +to the abstractions used by the MC layer (`MCInst`_, `MCStreamer`_, etc). This +is done with a combination of several different classes: the (misnamed) +target-independent AsmPrinter class, target-specific subclasses of AsmPrinter +(such as SparcAsmPrinter), and the TargetLoweringObjectFile class. + +Since the MC layer works at the level of abstraction of object files, it doesn't +have a notion of functions, global variables etc. Instead, it thinks about +labels, directives, and instructions. A key class used at this time is the +MCStreamer class. This is an abstract API that is implemented in different ways +(e.g. to output a .s file, output an ELF .o file, etc) that is effectively an +"assembler API". MCStreamer has one method per directive, such as EmitLabel, +EmitSymbolAttribute, SwitchSection, etc, which directly correspond to assembly +level directives. + +If you are interested in implementing a code generator for a target, there are +three important things that you have to implement for your target: + +#. First, you need a subclass of AsmPrinter for your target. This class + implements the general lowering process converting MachineFunction's into MC + label constructs. The AsmPrinter base class provides a number of useful + methods and routines, and also allows you to override the lowering process in + some important ways. 
You should get much of the lowering for free if you are + implementing an ELF, COFF, or MachO target, because the + TargetLoweringObjectFile class implements much of the common logic. + +#. Second, you need to implement an instruction printer for your target. The + instruction printer takes an `MCInst`_ and renders it to a raw_ostream as + text. Most of this is automatically generated from the .td file (when you + specify something like "``add $dst, $src1, $src2``" in the instructions), but + you need to implement routines to print operands. + +#. Third, you need to implement code that lowers a `MachineInstr`_ to an MCInst, + usually implemented in "<target>MCInstLower.cpp". This lowering process is + often target specific, and is responsible for turning jump table entries, + constant pool indices, global variable addresses, etc into MCLabels as + appropriate. This translation layer is also responsible for expanding pseudo + ops used by the code generator into the actual machine instructions they + correspond to. The MCInsts that are generated by this are fed into the + instruction printer or the encoder. + +Finally, at your choosing, you can also implement an subclass of MCCodeEmitter +which lowers MCInst's into machine code bytes and relocations. This is +important if you want to support direct .o file emission, or would like to +implement an assembler for your target. + +VLIW Packetizer +--------------- + +In a Very Long Instruction Word (VLIW) architecture, the compiler is responsible +for mapping instructions to functional-units available on the architecture. To +that end, the compiler creates groups of instructions called *packets* or +*bundles*. The VLIW packetizer in LLVM is a target-independent mechanism to +enable the packetization of machine instructions. + +Mapping from instructions to functional units +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Instructions in a VLIW target can typically be mapped to multiple functional +units. During the process of packetizing, the compiler must be able to reason +about whether an instruction can be added to a packet. This decision can be +complex since the compiler has to examine all possible mappings of instructions +to functional units. Therefore to alleviate compilation-time complexity, the +VLIW packetizer parses the instruction classes of a target and generates tables +at compiler build time. These tables can then be queried by the provided +machine-independent API to determine if an instruction can be accommodated in a +packet. + +How the packetization tables are generated and used +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The packetizer reads instruction classes from a target's itineraries and creates +a deterministic finite automaton (DFA) to represent the state of a packet. A DFA +consists of three major elements: inputs, states, and transitions. The set of +inputs for the generated DFA represents the instruction being added to a +packet. The states represent the possible consumption of functional units by +instructions in a packet. In the DFA, transitions from one state to another +occur on the addition of an instruction to an existing packet. If there is a +legal mapping of functional units to instructions, then the DFA contains a +corresponding transition. The absence of a transition indicates that a legal +mapping does not exist and that the instruction cannot be added to the packet. 
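To make this querying concrete, here is a rough sketch of a greedy
packetization loop built on the three exported ``DFAPacketizer`` functions
described in the next paragraph. The ``formPackets`` function and the way
instructions are supplied to it are assumptions for illustration; a real
packetizer must also honor data and control dependences, not just
functional-unit legality.

.. code-block:: c++

  #include "llvm/CodeGen/DFAPacketizer.h"
  #include <vector>
  using namespace llvm;

  // Greedily group a sequence of machine instructions into packets.  A packet
  // is closed as soon as the DFA has no legal transition for the next
  // instruction, i.e. no functional-unit mapping can accommodate it.
  std::vector<std::vector<MachineInstr*> >
  formPackets(DFAPacketizer &DFA, const std::vector<MachineInstr*> &Instrs) {
    std::vector<std::vector<MachineInstr*> > Packets(1);
    DFA.clearResources();                     // start state: empty packet
    for (unsigned i = 0, e = Instrs.size(); i != e; ++i) {
      MachineInstr *MI = Instrs[i];
      if (!DFA.canReserveResources(MI)) {     // no transition: close the packet
        Packets.push_back(std::vector<MachineInstr*>());
        DFA.clearResources();
      }
      DFA.reserveResources(MI);               // take the transition
      Packets.back().push_back(MI);
    }
    return Packets;
  }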
To generate tables for a VLIW target, add *Target*\ GenDFAPacketizer.inc as a
target to the Makefile in the target directory. The exported API provides three
functions: ``DFAPacketizer::clearResources()``,
``DFAPacketizer::reserveResources(MachineInstr *MI)``, and
``DFAPacketizer::canReserveResources(MachineInstr *MI)``. These functions allow
a target packetizer to add an instruction to an existing packet and to check
whether an instruction can be added to a packet. See
``llvm/CodeGen/DFAPacketizer.h`` for more information.

Implementing a Native Assembler
===============================

Though you're probably reading this because you want to write or maintain a
compiler backend, LLVM also fully supports building a native assembler.
We've tried hard to automate the generation of the assembler from the .td files
(in particular the instruction syntax and encodings), which means that a large
part of the manual and repetitive data entry can be factored and shared with the
compiler.

Instruction Parsing
-------------------

.. note::

   To Be Written


Instruction Alias Processing
----------------------------

Once the instruction is parsed, it enters the MatchInstructionImpl function.
The MatchInstructionImpl function performs alias processing and then does actual
matching.

Alias processing is the phase that canonicalizes different lexical forms of the
same instruction down to one representation. There are several different kinds
of alias that can be implemented, and they are listed below in the order that
they are processed (which is in order from simplest/weakest to most
complex/powerful). Generally you want to use the first alias mechanism that
meets the needs of your instruction, because it will allow a more concise
description.

Mnemonic Aliases
^^^^^^^^^^^^^^^^

The first phase of alias processing is simple instruction mnemonic remapping for
classes of instructions which are allowed with two different mnemonics. This
phase is a simple and unconditional remapping from one input mnemonic to one
output mnemonic. It isn't possible for this form of alias to look at the
operands at all, so the remapping must apply for all forms of a given mnemonic.
Mnemonic aliases are defined simply; for example, X86 has:

::

  def : MnemonicAlias<"cbw", "cbtw">;
  def : MnemonicAlias<"smovq", "movsq">;
  def : MnemonicAlias<"fldcww", "fldcw">;
  def : MnemonicAlias<"fucompi", "fucomip">;
  def : MnemonicAlias<"ud2a", "ud2">;

... and many others. With a MnemonicAlias definition, the mnemonic is remapped
simply and directly. Though MnemonicAlias's can't look at any aspect of the
instruction (such as the operands), they can depend on global modes (the same
ones supported by the matcher), through a Requires clause:

::

  def : MnemonicAlias<"pushf", "pushfq">, Requires<[In64BitMode]>;
  def : MnemonicAlias<"pushf", "pushfl">, Requires<[In32BitMode]>;

In this example, the mnemonic gets mapped to a different one depending on the
current instruction set.

Instruction Aliases
^^^^^^^^^^^^^^^^^^^

The most general phase of alias processing occurs while matching is happening:
it provides new forms for the matcher to match along with a specific instruction
to generate. An instruction alias has two parts: the string to match and the
instruction to generate.
For example: + +:: + + def : InstAlias<"movsx $src, $dst", (MOVSX16rr8W GR16:$dst, GR8 :$src)>; + def : InstAlias<"movsx $src, $dst", (MOVSX16rm8W GR16:$dst, i8mem:$src)>; + def : InstAlias<"movsx $src, $dst", (MOVSX32rr8 GR32:$dst, GR8 :$src)>; + def : InstAlias<"movsx $src, $dst", (MOVSX32rr16 GR32:$dst, GR16 :$src)>; + def : InstAlias<"movsx $src, $dst", (MOVSX64rr8 GR64:$dst, GR8 :$src)>; + def : InstAlias<"movsx $src, $dst", (MOVSX64rr16 GR64:$dst, GR16 :$src)>; + def : InstAlias<"movsx $src, $dst", (MOVSX64rr32 GR64:$dst, GR32 :$src)>; + +This shows a powerful example of the instruction aliases, matching the same +mnemonic in multiple different ways depending on what operands are present in +the assembly. The result of instruction aliases can include operands in a +different order than the destination instruction, and can use an input multiple +times, for example: + +:: + + def : InstAlias<"clrb $reg", (XOR8rr GR8 :$reg, GR8 :$reg)>; + def : InstAlias<"clrw $reg", (XOR16rr GR16:$reg, GR16:$reg)>; + def : InstAlias<"clrl $reg", (XOR32rr GR32:$reg, GR32:$reg)>; + def : InstAlias<"clrq $reg", (XOR64rr GR64:$reg, GR64:$reg)>; + +This example also shows that tied operands are only listed once. In the X86 +backend, XOR8rr has two input GR8's and one output GR8 (where an input is tied +to the output). InstAliases take a flattened operand list without duplicates +for tied operands. The result of an instruction alias can also use immediates +and fixed physical registers which are added as simple immediate operands in the +result, for example: + +:: + + // Fixed Immediate operand. + def : InstAlias<"aad", (AAD8i8 10)>; + + // Fixed register operand. + def : InstAlias<"fcomi", (COM_FIr ST1)>; + + // Simple alias. + def : InstAlias<"fcomi $reg", (COM_FIr RST:$reg)>; + +Instruction aliases can also have a Requires clause to make them subtarget +specific. + +If the back-end supports it, the instruction printer can automatically emit the +alias rather than what's being aliased. It typically leads to better, more +readable code. If it's better to print out what's being aliased, then pass a '0' +as the third parameter to the InstAlias definition. + +Instruction Matching +-------------------- + +.. note:: + + To Be Written + +.. _Implementations of the abstract target description interfaces: +.. _implement the target description: + +Target-specific Implementation Notes +==================================== + +This section of the document explains features or design decisions that are +specific to the code generator for a particular target. First we start with a +table that summarizes what features are supported by each target. + +Target Feature Matrix +--------------------- + +Note that this table does not include the C backend or Cpp backends, since they +do not use the target independent code generator infrastructure. It also +doesn't list features that are not supported fully by any target yet. It +considers a feature to be supported if at least one subtarget supports it. A +feature being supported means that it is useful and works for most cases, it +does not indicate that there are zero known bugs in the implementation. 
Here is +the key: + +:raw-html:`<table border="1" cellspacing="0">` +:raw-html:`<tr>` +:raw-html:`<th>Unknown</th>` +:raw-html:`<th>No support</th>` +:raw-html:`<th>Partial Support</th>` +:raw-html:`<th>Complete Support</th>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td class="unknown"></td>` +:raw-html:`<td class="no"></td>` +:raw-html:`<td class="partial"></td>` +:raw-html:`<td class="yes"></td>` +:raw-html:`</tr>` +:raw-html:`</table>` + +Here is the table: + +:raw-html:`<table width="689" border="1" cellspacing="0">` +:raw-html:`<tr><td></td>` +:raw-html:`<td colspan="13" align="center" style="background-color:#ffc">Target</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<th>Feature</th>` +:raw-html:`<th>ARM</th>` +:raw-html:`<th>CellSPU</th>` +:raw-html:`<th>Hexagon</th>` +:raw-html:`<th>MBlaze</th>` +:raw-html:`<th>MSP430</th>` +:raw-html:`<th>Mips</th>` +:raw-html:`<th>PTX</th>` +:raw-html:`<th>PowerPC</th>` +:raw-html:`<th>Sparc</th>` +:raw-html:`<th>X86</th>` +:raw-html:`<th>XCore</th>` +:raw-html:`</tr>` + +:raw-html:`<tr>` +:raw-html:`<td><a href="#feat_reliable">is generally reliable</a></td>` +:raw-html:`<td class="yes"></td> <!-- ARM -->` +:raw-html:`<td class="no"></td> <!-- CellSPU -->` +:raw-html:`<td class="yes"></td> <!-- Hexagon -->` +:raw-html:`<td class="no"></td> <!-- MBlaze -->` +:raw-html:`<td class="unknown"></td> <!-- MSP430 -->` +:raw-html:`<td class="yes"></td> <!-- Mips -->` +:raw-html:`<td class="no"></td> <!-- PTX -->` +:raw-html:`<td class="yes"></td> <!-- PowerPC -->` +:raw-html:`<td class="yes"></td> <!-- Sparc -->` +:raw-html:`<td class="yes"></td> <!-- X86 -->` +:raw-html:`<td class="unknown"></td> <!-- XCore -->` +:raw-html:`</tr>` + +:raw-html:`<tr>` +:raw-html:`<td><a href="#feat_asmparser">assembly parser</a></td>` +:raw-html:`<td class="no"></td> <!-- ARM -->` +:raw-html:`<td class="no"></td> <!-- CellSPU -->` +:raw-html:`<td class="no"></td> <!-- Hexagon -->` +:raw-html:`<td class="yes"></td> <!-- MBlaze -->` +:raw-html:`<td class="no"></td> <!-- MSP430 -->` +:raw-html:`<td class="no"></td> <!-- Mips -->` +:raw-html:`<td class="no"></td> <!-- PTX -->` +:raw-html:`<td class="no"></td> <!-- PowerPC -->` +:raw-html:`<td class="no"></td> <!-- Sparc -->` +:raw-html:`<td class="yes"></td> <!-- X86 -->` +:raw-html:`<td class="no"></td> <!-- XCore -->` +:raw-html:`</tr>` + +:raw-html:`<tr>` +:raw-html:`<td><a href="#feat_disassembler">disassembler</a></td>` +:raw-html:`<td class="yes"></td> <!-- ARM -->` +:raw-html:`<td class="no"></td> <!-- CellSPU -->` +:raw-html:`<td class="no"></td> <!-- Hexagon -->` +:raw-html:`<td class="yes"></td> <!-- MBlaze -->` +:raw-html:`<td class="no"></td> <!-- MSP430 -->` +:raw-html:`<td class="no"></td> <!-- Mips -->` +:raw-html:`<td class="no"></td> <!-- PTX -->` +:raw-html:`<td class="no"></td> <!-- PowerPC -->` +:raw-html:`<td class="no"></td> <!-- Sparc -->` +:raw-html:`<td class="yes"></td> <!-- X86 -->` +:raw-html:`<td class="no"></td> <!-- XCore -->` +:raw-html:`</tr>` + +:raw-html:`<tr>` +:raw-html:`<td><a href="#feat_inlineasm">inline asm</a></td>` +:raw-html:`<td class="yes"></td> <!-- ARM -->` +:raw-html:`<td class="no"></td> <!-- CellSPU -->` +:raw-html:`<td class="yes"></td> <!-- Hexagon -->` +:raw-html:`<td class="yes"></td> <!-- MBlaze -->` +:raw-html:`<td class="unknown"></td> <!-- MSP430 -->` +:raw-html:`<td class="no"></td> <!-- Mips -->` +:raw-html:`<td class="unknown"></td> <!-- PTX -->` +:raw-html:`<td class="yes"></td> <!-- PowerPC -->` +:raw-html:`<td class="unknown"></td> <!-- Sparc 
-->` +:raw-html:`<td class="yes"></td> <!-- X86 -->` +:raw-html:`<td class="unknown"></td> <!-- XCore -->` +:raw-html:`</tr>` + +:raw-html:`<tr>` +:raw-html:`<td><a href="#feat_jit">jit</a></td>` +:raw-html:`<td class="partial"><a href="#feat_jit_arm">*</a></td> <!-- ARM -->` +:raw-html:`<td class="no"></td> <!-- CellSPU -->` +:raw-html:`<td class="no"></td> <!-- Hexagon -->` +:raw-html:`<td class="no"></td> <!-- MBlaze -->` +:raw-html:`<td class="unknown"></td> <!-- MSP430 -->` +:raw-html:`<td class="yes"></td> <!-- Mips -->` +:raw-html:`<td class="unknown"></td> <!-- PTX -->` +:raw-html:`<td class="yes"></td> <!-- PowerPC -->` +:raw-html:`<td class="unknown"></td> <!-- Sparc -->` +:raw-html:`<td class="yes"></td> <!-- X86 -->` +:raw-html:`<td class="unknown"></td> <!-- XCore -->` +:raw-html:`</tr>` + +:raw-html:`<tr>` +:raw-html:`<td><a href="#feat_objectwrite">.o file writing</a></td>` +:raw-html:`<td class="no"></td> <!-- ARM -->` +:raw-html:`<td class="no"></td> <!-- CellSPU -->` +:raw-html:`<td class="no"></td> <!-- Hexagon -->` +:raw-html:`<td class="yes"></td> <!-- MBlaze -->` +:raw-html:`<td class="no"></td> <!-- MSP430 -->` +:raw-html:`<td class="no"></td> <!-- Mips -->` +:raw-html:`<td class="no"></td> <!-- PTX -->` +:raw-html:`<td class="no"></td> <!-- PowerPC -->` +:raw-html:`<td class="no"></td> <!-- Sparc -->` +:raw-html:`<td class="yes"></td> <!-- X86 -->` +:raw-html:`<td class="no"></td> <!-- XCore -->` +:raw-html:`</tr>` + +:raw-html:`<tr>` +:raw-html:`<td><a hr:raw-html:`ef="#feat_tailcall">tail calls</a></td>` +:raw-html:`<td class="yes"></td> <!-- ARM -->` +:raw-html:`<td class="no"></td> <!-- CellSPU -->` +:raw-html:`<td class="yes"></td> <!-- Hexagon -->` +:raw-html:`<td class="no"></td> <!-- MBlaze -->` +:raw-html:`<td class="unknown"></td> <!-- MSP430 -->` +:raw-html:`<td class="no"></td> <!-- Mips -->` +:raw-html:`<td class="unknown"></td> <!-- PTX -->` +:raw-html:`<td class="yes"></td> <!-- PowerPC -->` +:raw-html:`<td class="unknown"></td> <!-- Sparc -->` +:raw-html:`<td class="yes"></td> <!-- X86 -->` +:raw-html:`<td class="unknown"></td> <!-- XCore -->` +:raw-html:`</tr>` + +:raw-html:`<tr>` +:raw-html:`<td><a href="#feat_segstacks">segmented stacks</a></td>` +:raw-html:`<td class="no"></td> <!-- ARM -->` +:raw-html:`<td class="no"></td> <!-- CellSPU -->` +:raw-html:`<td class="no"></td> <!-- Hexagon -->` +:raw-html:`<td class="no"></td> <!-- MBlaze -->` +:raw-html:`<td class="no"></td> <!-- MSP430 -->` +:raw-html:`<td class="no"></td> <!-- Mips -->` +:raw-html:`<td class="no"></td> <!-- PTX -->` +:raw-html:`<td class="no"></td> <!-- PowerPC -->` +:raw-html:`<td class="no"></td> <!-- Sparc -->` +:raw-html:`<td class="partial"><a href="#feat_segstacks_x86">*</a></td> <!-- X86 -->` +:raw-html:`<td class="no"></td> <!-- XCore -->` +:raw-html:`</tr>` + +:raw-html:`</table>` + +.. _feat_reliable: + +Is Generally Reliable +^^^^^^^^^^^^^^^^^^^^^ + +This box indicates whether the target is considered to be production quality. +This indicates that the target has been used as a static compiler to compile +large amounts of code by a variety of different people and is in continuous use. + +.. _feat_asmparser: + +Assembly Parser +^^^^^^^^^^^^^^^ + +This box indicates whether the target supports parsing target specific .s files +by implementing the MCAsmParser interface. This is required for llvm-mc to be +able to act as a native assembler and is required for inline assembly support in +the native .o file writer. + +.. 
_feat_disassembler: + +Disassembler +^^^^^^^^^^^^ + +This box indicates whether the target supports the MCDisassembler API for +disassembling machine opcode bytes into MCInst's. + +.. _feat_inlineasm: + +Inline Asm +^^^^^^^^^^ + +This box indicates whether the target supports most popular inline assembly +constraints and modifiers. + +.. _feat_jit: + +JIT Support +^^^^^^^^^^^ + +This box indicates whether the target supports the JIT compiler through the +ExecutionEngine interface. + +.. _feat_jit_arm: + +The ARM backend has basic support for integer code in ARM codegen mode, but +lacks NEON and full Thumb support. + +.. _feat_objectwrite: + +.o File Writing +^^^^^^^^^^^^^^^ + +This box indicates whether the target supports writing .o files (e.g. MachO, +ELF, and/or COFF) files directly from the target. Note that the target also +must include an assembly parser and general inline assembly support for full +inline assembly support in the .o writer. + +Targets that don't support this feature can obviously still write out .o files, +they just rely on having an external assembler to translate from a .s file to a +.o file (as is the case for many C compilers). + +.. _feat_tailcall: + +Tail Calls +^^^^^^^^^^ + +This box indicates whether the target supports guaranteed tail calls. These are +calls marked "`tail <LangRef.html#i_call>`_" and use the fastcc calling +convention. Please see the `tail call section more more details`_. + +.. _feat_segstacks: + +Segmented Stacks +^^^^^^^^^^^^^^^^ + +This box indicates whether the target supports segmented stacks. This replaces +the traditional large C stack with many linked segments. It is compatible with +the `gcc implementation <http://gcc.gnu.org/wiki/SplitStacks>`_ used by the Go +front end. + +.. _feat_segstacks_x86: + +Basic support exists on the X86 backend. Currently vararg doesn't work and the +object files are not marked the way the gold linker expects, but simple Go +programs can be built by dragonegg. + +.. _tail call section more more details: + +Tail call optimization +---------------------- + +Tail call optimization, callee reusing the stack of the caller, is currently +supported on x86/x86-64 and PowerPC. It is performed if: + +* Caller and callee have the calling convention ``fastcc`` or ``cc 10`` (GHC + call convention). + +* The call is a tail call - in tail position (ret immediately follows call and + ret uses value of call or is void). + +* Option ``-tailcallopt`` is enabled. + +* Platform specific constraints are met. + +x86/x86-64 constraints: + +* No variable argument lists are used. + +* On x86-64 when generating GOT/PIC code only module-local calls (visibility = + hidden or protected) are supported. + +PowerPC constraints: + +* No variable argument lists are used. + +* No byval parameters are used. + +* On ppc32/64 GOT/PIC only module-local calls (visibility = hidden or protected) + are supported. + +Example: + +Call as ``llc -tailcallopt test.ll``. + +.. code-block:: llvm + + declare fastcc i32 @tailcallee(i32 inreg %a1, i32 inreg %a2, i32 %a3, i32 %a4) + + define fastcc i32 @tailcaller(i32 %in1, i32 %in2) { + %l1 = add i32 %in1, %in2 + %tmp = tail call fastcc i32 @tailcallee(i32 %in1 inreg, i32 %in2 inreg, i32 %in1, i32 %l1) + ret i32 %tmp + } + +Implications of ``-tailcallopt``: + +To support tail call optimization in situations where the callee has more +arguments than the caller a 'callee pops arguments' convention is used. 
This +currently causes each ``fastcc`` call that is not tail call optimized (because +one or more of above constraints are not met) to be followed by a readjustment +of the stack. So performance might be worse in such cases. + +Sibling call optimization +------------------------- + +Sibling call optimization is a restricted form of tail call optimization. +Unlike tail call optimization described in the previous section, it can be +performed automatically on any tail calls when ``-tailcallopt`` option is not +specified. + +Sibling call optimization is currently performed on x86/x86-64 when the +following constraints are met: + +* Caller and callee have the same calling convention. It can be either ``c`` or + ``fastcc``. + +* The call is a tail call - in tail position (ret immediately follows call and + ret uses value of call or is void). + +* Caller and callee have matching return type or the callee result is not used. + +* If any of the callee arguments are being passed in stack, they must be + available in caller's own incoming argument stack and the frame offsets must + be the same. + +Example: + +.. code-block:: llvm + + declare i32 @bar(i32, i32) + + define i32 @foo(i32 %a, i32 %b, i32 %c) { + entry: + %0 = tail call i32 @bar(i32 %a, i32 %b) + ret i32 %0 + } + +The X86 backend +--------------- + +The X86 code generator lives in the ``lib/Target/X86`` directory. This code +generator is capable of targeting a variety of x86-32 and x86-64 processors, and +includes support for ISA extensions such as MMX and SSE. + +X86 Target Triples supported +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following are the known target triples that are supported by the X86 +backend. This is not an exhaustive list, and it would be useful to add those +that people test. + +* **i686-pc-linux-gnu** --- Linux + +* **i386-unknown-freebsd5.3** --- FreeBSD 5.3 + +* **i686-pc-cygwin** --- Cygwin on Win32 + +* **i686-pc-mingw32** --- MingW on Win32 + +* **i386-pc-mingw32msvc** --- MingW crosscompiler on Linux + +* **i686-apple-darwin*** --- Apple Darwin on X86 + +* **x86_64-unknown-linux-gnu** --- Linux + +X86 Calling Conventions supported +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following target-specific calling conventions are known to backend: + +* **x86_StdCall** --- stdcall calling convention seen on Microsoft Windows + platform (CC ID = 64). + +* **x86_FastCall** --- fastcall calling convention seen on Microsoft Windows + platform (CC ID = 65). + +* **x86_ThisCall** --- Similar to X86_StdCall. Passes first argument in ECX, + others via stack. Callee is responsible for stack cleaning. This convention is + used by MSVC by default for methods in its ABI (CC ID = 70). + +.. _X86 addressing mode: + +Representing X86 addressing modes in MachineInstrs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The x86 has a very flexible way of accessing memory. It is capable of forming +memory addresses of the following expression directly in integer instructions +(which use ModR/M addressing): + +:: + + SegmentReg: Base + [1,2,4,8] * IndexReg + Disp32 + +In order to represent this, LLVM tracks no less than 5 operands for each memory +operand of this form. This means that the "load" form of '``mov``' has the +following ``MachineOperand``\s in this order: + +:: + + Index: 0 | 1 2 3 4 5 + Meaning: DestReg, | BaseReg, Scale, IndexReg, Displacement Segment + OperandTy: VirtReg, | VirtReg, UnsImm, VirtReg, SignExtImm PhysReg + +Stores, and all other instructions, treat the four memory operands in the same +way and in the same order. 
If the segment register is unspecified (regno = 0), +then no segment override is generated. "Lea" operations do not have a segment +register specified, so they only have 4 operands for their memory reference. + +X86 address spaces supported +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +x86 has a feature which provides the ability to perform loads and stores to +different address spaces via the x86 segment registers. A segment override +prefix byte on an instruction causes the instruction's memory access to go to +the specified segment. LLVM address space 0 is the default address space, which +includes the stack, and any unqualified memory accesses in a program. Address +spaces 1-255 are currently reserved for user-defined code. The GS-segment is +represented by address space 256, while the FS-segment is represented by address +space 257. Other x86 segments have yet to be allocated address space +numbers. + +While these address spaces may seem similar to TLS via the ``thread_local`` +keyword, and often use the same underlying hardware, there are some fundamental +differences. + +The ``thread_local`` keyword applies to global variables and specifies that they +are to be allocated in thread-local memory. There are no type qualifiers +involved, and these variables can be pointed to with normal pointers and +accessed with normal loads and stores. The ``thread_local`` keyword is +target-independent at the LLVM IR level (though LLVM doesn't yet have +implementations of it for some configurations) + +Special address spaces, in contrast, apply to static types. Every load and store +has a particular address space in its address operand type, and this is what +determines which address space is accessed. LLVM ignores these special address +space qualifiers on global variables, and does not provide a way to directly +allocate storage in them. At the LLVM IR level, the behavior of these special +address spaces depends in part on the underlying OS or runtime environment, and +they are specific to x86 (and LLVM doesn't yet handle them correctly in some +cases). + +Some operating systems and runtime environments use (or may in the future use) +the FS/GS-segment registers for various low-level purposes, so care should be +taken when considering them. + +Instruction naming +^^^^^^^^^^^^^^^^^^ + +An instruction name consists of the base name, a default operand size, and a a +character per operand with an optional special size. For example: + +:: + + ADD8rr -> add, 8-bit register, 8-bit register + IMUL16rmi -> imul, 16-bit register, 16-bit memory, 16-bit immediate + IMUL16rmi8 -> imul, 16-bit register, 16-bit memory, 8-bit immediate + MOVSX32rm16 -> movsx, 32-bit register, 16-bit memory + +The PowerPC backend +------------------- + +The PowerPC code generator lives in the lib/Target/PowerPC directory. The code +generation is retargetable to several variations or *subtargets* of the PowerPC +ISA; including ppc32, ppc64 and altivec. + +LLVM PowerPC ABI +^^^^^^^^^^^^^^^^ + +LLVM follows the AIX PowerPC ABI, with two deviations. LLVM uses a PC relative +(PIC) or static addressing for accessing global values, so no TOC (r2) is +used. Second, r31 is used as a frame pointer to allow dynamic growth of a stack +frame. LLVM takes advantage of having no TOC to provide space to save the frame +pointer in the PowerPC linkage area of the caller frame. Other details of +PowerPC ABI can be found at `PowerPC ABI +<http://developer.apple.com/documentation/DeveloperTools/Conceptual/LowLevelABI/Articles/32bitPowerPC.html>`_\ +. 
Note: This link describes the 32 bit ABI. The 64 bit ABI is similar except +space for GPRs are 8 bytes wide (not 4) and r13 is reserved for system use. + +Frame Layout +^^^^^^^^^^^^ + +The size of a PowerPC frame is usually fixed for the duration of a function's +invocation. Since the frame is fixed size, all references into the frame can be +accessed via fixed offsets from the stack pointer. The exception to this is +when dynamic alloca or variable sized arrays are present, then a base pointer +(r31) is used as a proxy for the stack pointer and stack pointer is free to grow +or shrink. A base pointer is also used if llvm-gcc is not passed the +-fomit-frame-pointer flag. The stack pointer is always aligned to 16 bytes, so +that space allocated for altivec vectors will be properly aligned. + +An invocation frame is laid out as follows (low memory at top): + +:raw-html:`<table border="1" cellspacing="0">` +:raw-html:`<tr>` +:raw-html:`<td>Linkage<br><br></td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>Parameter area<br><br></td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>Dynamic area<br><br></td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>Locals area<br><br></td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>Saved registers area<br><br></td>` +:raw-html:`</tr>` +:raw-html:`<tr style="border-style: none hidden none hidden;">` +:raw-html:`<td><br></td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>Previous Frame<br><br></td>` +:raw-html:`</tr>` +:raw-html:`</table>` + +The *linkage* area is used by a callee to save special registers prior to +allocating its own frame. Only three entries are relevant to LLVM. The first +entry is the previous stack pointer (sp), aka link. This allows probing tools +like gdb or exception handlers to quickly scan the frames in the stack. A +function epilog can also use the link to pop the frame from the stack. The +third entry in the linkage area is used to save the return address from the lr +register. Finally, as mentioned above, the last entry is used to save the +previous frame pointer (r31.) The entries in the linkage area are the size of a +GPR, thus the linkage area is 24 bytes long in 32 bit mode and 48 bytes in 64 +bit mode. 
+ +32 bit linkage area: + +:raw-html:`<table border="1" cellspacing="0">` +:raw-html:`<tr>` +:raw-html:`<td>0</td>` +:raw-html:`<td>Saved SP (r1)</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>4</td>` +:raw-html:`<td>Saved CR</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>8</td>` +:raw-html:`<td>Saved LR</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>12</td>` +:raw-html:`<td>Reserved</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>16</td>` +:raw-html:`<td>Reserved</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>20</td>` +:raw-html:`<td>Saved FP (r31)</td>` +:raw-html:`</tr>` +:raw-html:`</table>` + +64 bit linkage area: + +:raw-html:`<table border="1" cellspacing="0">` +:raw-html:`<tr>` +:raw-html:`<td>0</td>` +:raw-html:`<td>Saved SP (r1)</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>8</td>` +:raw-html:`<td>Saved CR</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>16</td>` +:raw-html:`<td>Saved LR</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>24</td>` +:raw-html:`<td>Reserved</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>32</td>` +:raw-html:`<td>Reserved</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>40</td>` +:raw-html:`<td>Saved FP (r31)</td>` +:raw-html:`</tr>` +:raw-html:`</table>` + +The *parameter area* is used to store arguments being passed to a callee +function. Following the PowerPC ABI, the first few arguments are actually +passed in registers, with the space in the parameter area unused. However, if +there are not enough registers or the callee is a thunk or vararg function, +these register arguments can be spilled into the parameter area. Thus, the +parameter area must be large enough to store all the parameters for the largest +call sequence made by the caller. The size must also be minimally large enough +to spill registers r3-r10. This allows callees blind to the call signature, +such as thunks and vararg functions, enough space to cache the argument +registers. Therefore, the parameter area is minimally 32 bytes (64 bytes in 64 +bit mode.) Also note that since the parameter area is a fixed offset from the +top of the frame, that a callee can access its spilt arguments using fixed +offsets from the stack pointer (or base pointer.) + +Combining the information about the linkage, parameter areas and alignment. A +stack frame is minimally 64 bytes in 32 bit mode and 128 bytes in 64 bit mode. + +The *dynamic area* starts out as size zero. If a function uses dynamic alloca +then space is added to the stack, the linkage and parameter areas are shifted to +top of stack, and the new space is available immediately below the linkage and +parameter areas. The cost of shifting the linkage and parameter areas is minor +since only the link value needs to be copied. The link value can be easily +fetched by adding the original frame size to the base pointer. Note that +allocations in the dynamic space need to observe 16 byte alignment. + +The *locals area* is where the llvm compiler reserves space for local variables. + +The *saved registers area* is where the llvm compiler spills callee saved +registers on entry to the callee. + +Prolog/Epilog +^^^^^^^^^^^^^ + +The llvm prolog and epilog are the same as described in the PowerPC ABI, with +the following exceptions. Callee saved registers are spilled after the frame is +created. This allows the llvm epilog/prolog support to be common with other +targets. 
The base pointer callee saved register r31 is saved in the TOC slot of the
linkage area. This simplifies allocation of space for the base pointer and
makes it convenient to locate programmatically and during debugging.

Dynamic Allocation
^^^^^^^^^^^^^^^^^^

.. note::

   TODO - More to come.

The PTX backend
---------------

The PTX code generator lives in the lib/Target/PTX directory. It is currently a
work-in-progress, but already supports most of the code generation functionality
needed to generate correct PTX kernels for CUDA devices.

The code generator can target PTX 2.0+ and shader model 1.0+. The PTX ISA
Reference Manual is used as the primary source of ISA information, though an
effort is made to make the output of the code generator match the output of the
NVIDIA nvcc compiler whenever possible.

Code Generator Options:

:raw-html:`<table border="1" cellspacing="0">`
:raw-html:`<tr>`
:raw-html:`<th>Option</th>`
:raw-html:`<th>Description</th>`
:raw-html:`</tr>`
:raw-html:`<tr>`
:raw-html:`<td>``double``</td>`
:raw-html:`<td align="left">If enabled, the map_f64_to_f32 directive is disabled in the PTX output, allowing native double-precision arithmetic</td>`
:raw-html:`</tr>`
:raw-html:`<tr>`
:raw-html:`<td>``no-fma``</td>`
:raw-html:`<td align="left">Disable generation of Fused-Multiply Add instructions, which may be beneficial for some devices</td>`
:raw-html:`</tr>`
:raw-html:`<tr>`
:raw-html:`<td>``smxy / computexy``</td>`
:raw-html:`<td align="left">Set shader model/compute capability to x.y, e.g. sm20 or compute13</td>`
:raw-html:`</tr>`
:raw-html:`</table>`

Working:

* Arithmetic instruction selection (including combo FMA)

* Bitwise instruction selection

* Control-flow instruction selection

* Function calls (only on SM 2.0+ and no return arguments)

* Address spaces (0 = global, 1 = constant, 2 = local, 4 = shared)

* Thread synchronization (bar.sync)

* Special register reads ([N]TID, [N]CTAID, PMx, CLOCK, etc.)
+ +In Progress: + +* Robust call instruction selection + +* Stack frame allocation + +* Device-specific instruction scheduling optimizations diff --git a/docs/CodingStandards.html b/docs/CodingStandards.html deleted file mode 100644 index 847ac4c..0000000 --- a/docs/CodingStandards.html +++ /dev/null @@ -1,1568 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <link rel="stylesheet" href="llvm.css" type="text/css"> - <title>LLVM Coding Standards</title> -</head> -<body> - -<h1> - LLVM Coding Standards -</h1> - -<ol> - <li><a href="#introduction">Introduction</a></li> - <li><a href="#mechanicalissues">Mechanical Source Issues</a> - <ol> - <li><a href="#sourceformating">Source Code Formatting</a> - <ol> - <li><a href="#scf_commenting">Commenting</a></li> - <li><a href="#scf_commentformat">Comment Formatting</a></li> - <li><a href="#scf_includes"><tt>#include</tt> Style</a></li> - <li><a href="#scf_codewidth">Source Code Width</a></li> - <li><a href="#scf_spacestabs">Use Spaces Instead of Tabs</a></li> - <li><a href="#scf_indentation">Indent Code Consistently</a></li> - </ol></li> - <li><a href="#compilerissues">Compiler Issues</a> - <ol> - <li><a href="#ci_warningerrors">Treat Compiler Warnings Like - Errors</a></li> - <li><a href="#ci_portable_code">Write Portable Code</a></li> - <li><a href="#ci_rtti_exceptions">Do not use RTTI or Exceptions</a></li> - <li><a href="#ci_static_ctors">Do not use Static Constructors</a></li> - <li><a href="#ci_class_struct">Use of <tt>class</tt>/<tt>struct</tt> Keywords</a></li> - </ol></li> - </ol></li> - <li><a href="#styleissues">Style Issues</a> - <ol> - <li><a href="#macro">The High-Level Issues</a> - <ol> - <li><a href="#hl_module">A Public Header File <b>is</b> a - Module</a></li> - <li><a href="#hl_dontinclude"><tt>#include</tt> as Little as Possible</a></li> - <li><a href="#hl_privateheaders">Keep "internal" Headers - Private</a></li> - <li><a href="#hl_earlyexit">Use Early Exits and <tt>continue</tt> to Simplify - Code</a></li> - <li><a href="#hl_else_after_return">Don't use <tt>else</tt> after a - <tt>return</tt></a></li> - <li><a href="#hl_predicateloops">Turn Predicate Loops into Predicate - Functions</a></li> - </ol></li> - <li><a href="#micro">The Low-Level Issues</a> - <ol> - <li><a href="#ll_naming">Name Types, Functions, Variables, and Enumerators Properly</a></li> - <li><a href="#ll_assert">Assert Liberally</a></li> - <li><a href="#ll_ns_std">Do not use '<tt>using namespace std</tt>'</a></li> - <li><a href="#ll_virtual_anch">Provide a virtual method anchor for - classes in headers</a></li> - <li><a href="#ll_end">Don't evaluate <tt>end()</tt> every time through a - loop</a></li> - <li><a href="#ll_iostream"><tt>#include <iostream></tt> is - <em>forbidden</em></a></li> - <li><a href="#ll_raw_ostream">Use <tt>raw_ostream</tt></a></li> - <li><a href="#ll_avoidendl">Avoid <tt>std::endl</tt></a></li> - </ol></li> - - <li><a href="#nano">Microscopic Details</a> - <ol> - <li><a href="#micro_spaceparen">Spaces Before Parentheses</a></li> - <li><a href="#micro_preincrement">Prefer Preincrement</a></li> - <li><a href="#micro_namespaceindent">Namespace Indentation</a></li> - <li><a href="#micro_anonns">Anonymous Namespaces</a></li> - </ol></li> - - - </ol></li> - <li><a href="#seealso">See Also</a></li> -</ol> - -<div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> 
-</div> - - -<!-- *********************************************************************** --> -<h2><a name="introduction">Introduction</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>This document attempts to describe a few coding standards that are being used -in the LLVM source tree. Although no coding standards should be regarded as -absolute requirements to be followed in all instances, coding standards are -particularly important for large-scale code bases that follow a library-based -design (like LLVM).</p> - -<p>This document intentionally does not prescribe fixed standards for religious -issues such as brace placement and space usage. For issues like this, follow -the golden rule:</p> - -<blockquote> - -<p><b><a name="goldenrule">If you are extending, enhancing, or bug fixing -already implemented code, use the style that is already being used so that the -source is uniform and easy to follow.</a></b></p> - -</blockquote> - -<p>Note that some code bases (e.g. libc++) have really good reasons to deviate -from the coding standards. In the case of libc++, this is because the naming -and other conventions are dictated by the C++ standard. If you think there is -a specific good reason to deviate from the standards here, please bring it up -on the LLVMdev mailing list.</p> - -<p>There are some conventions that are not uniformly followed in the code base -(e.g. the naming convention). This is because they are relatively new, and a -lot of code was written before they were put in place. Our long term goal is -for the entire codebase to follow the convention, but we explicitly <em>do -not</em> want patches that do large-scale reformating of existing code. OTOH, -it is reasonable to rename the methods of a class if you're about to change it -in some other way. Just do the reformating as a separate commit from the -functionality change. </p> - -<p>The ultimate goal of these guidelines is the increase readability and -maintainability of our common source base. If you have suggestions for topics to -be included, please mail them to <a -href="mailto:sabre@nondot.org">Chris</a>.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="mechanicalissues">Mechanical Source Issues</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<!-- ======================================================================= --> -<h3> - <a name="sourceformating">Source Code Formatting</a> -</h3> - -<div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="scf_commenting">Commenting</a> -</h4> - -<div> - -<p>Comments are one critical part of readability and maintainability. Everyone -knows they should comment their code, and so should you. When writing comments, -write them as English prose, which means they should use proper capitalization, -punctuation, etc. Aim to describe what a code is trying to do and why, not -"how" it does it at a micro level. Here are a few critical things to -document:</p> - -<h5>File Headers</h5> - -<div> - -<p>Every source file should have a header on it that describes the basic -purpose of the file. If a file does not have a header, it should not be -checked into the tree. 
The standard header looks like this:</p> - -<div class="doc_code"> -<pre> -//===-- llvm/Instruction.h - Instruction class definition -------*- C++ -*-===// -// -// The LLVM Compiler Infrastructure -// -// This file is distributed under the University of Illinois Open Source -// License. See LICENSE.TXT for details. -// -//===----------------------------------------------------------------------===// -// -// This file contains the declaration of the Instruction class, which is the -// base class for all of the VM instructions. -// -//===----------------------------------------------------------------------===// -</pre> -</div> - -<p>A few things to note about this particular format: The "<tt>-*- C++ --*-</tt>" string on the first line is there to tell Emacs that the source file -is a C++ file, not a C file (Emacs assumes <tt>.h</tt> files are C files by default). -Note that this tag is not necessary in <tt>.cpp</tt> files. The name of the file is also -on the first line, along with a very short description of the purpose of the -file. This is important when printing out code and flipping though lots of -pages.</p> - -<p>The next section in the file is a concise note that defines the license -that the file is released under. This makes it perfectly clear what terms the -source code can be distributed under and should not be modified in any way.</p> - -<p>The main body of the description does not have to be very long in most cases. -Here it's only two lines. If an algorithm is being implemented or something -tricky is going on, a reference to the paper where it is published should be -included, as well as any notes or "gotchas" in the code to watch out for.</p> - -</div> - -<h5>Class overviews</h5> - -<p>Classes are one fundamental part of a good object oriented design. As such, -a class definition should have a comment block that explains what the class is -used for and how it works. Every non-trivial class is expected to have a -doxygen comment block.</p> - - -<h5>Method information</h5> - -<div> - -<p>Methods defined in a class (as well as any global functions) should also be -documented properly. A quick note about what it does and a description of the -borderline behaviour is all that is necessary here (unless something -particularly tricky or insidious is going on). The hope is that people can -figure out how to use your interfaces without reading the code itself.</p> - -<p>Good things to talk about here are what happens when something unexpected -happens: does the method return null? Abort? Format your hard disk?</p> - -</div> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="scf_commentformat">Comment Formatting</a> -</h4> - -<div> - -<p>In general, prefer C++ style (<tt>//</tt>) comments. They take less space, -require less typing, don't have nesting problems, etc. There are a few cases -when it is useful to use C style (<tt>/* */</tt>) comments however:</p> - -<ol> - <li>When writing C code: Obviously if you are writing C code, use C style - comments.</li> - <li>When writing a header file that may be <tt>#include</tt>d by a C source - file.</li> - <li>When writing a source file that is used by a tool that only accepts C - style comments.</li> -</ol> - -<p>To comment out a large block of code, use <tt>#if 0</tt> and <tt>#endif</tt>. 
-These nest properly and are better behaved in general than C style comments.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="scf_includes"><tt>#include</tt> Style</a> -</h4> - -<div> - -<p>Immediately after the <a href="#scf_commenting">header file comment</a> (and -include guards if working on a header file), the <a -href="#hl_dontinclude">minimal</a> list of <tt>#include</tt>s required by the -file should be listed. We prefer these <tt>#include</tt>s to be listed in this -order:</p> - -<ol> - <li><a href="#mmheader">Main Module Header</a></li> - <li><a href="#hl_privateheaders">Local/Private Headers</a></li> - <li><tt>llvm/*</tt></li> - <li><tt>llvm/Analysis/*</tt></li> - <li><tt>llvm/Assembly/*</tt></li> - <li><tt>llvm/Bitcode/*</tt></li> - <li><tt>llvm/CodeGen/*</tt></li> - <li>...</li> - <li><tt>Support/*</tt></li> - <li><tt>Config/*</tt></li> - <li>System <tt>#includes</tt></li> -</ol> - -<p>and each category should be sorted by name.</p> - -<p><a name="mmheader">The "Main Module Header"</a> file applies to <tt>.cpp</tt> files -which implement an interface defined by a <tt>.h</tt> file. This <tt>#include</tt> -should always be included <b>first</b> regardless of where it lives on the file -system. By including a header file first in the <tt>.cpp</tt> files that implement the -interfaces, we ensure that the header does not have any hidden dependencies -which are not explicitly #included in the header, but should be. It is also a -form of documentation in the <tt>.cpp</tt> file to indicate where the interfaces it -implements are defined.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="scf_codewidth">Source Code Width</a> -</h4> - -<div> - -<p>Write your code to fit within 80 columns of text. This helps those of us who -like to print out code and look at your code in an xterm without resizing -it.</p> - -<p>The longer answer is that there must be some limit to the width of the code -in order to reasonably allow developers to have multiple files side-by-side in -windows on a modest display. If you are going to pick a width limit, it is -somewhat arbitrary but you might as well pick something standard. Going with -90 columns (for example) instead of 80 columns wouldn't add any significant -value and would be detrimental to printing out code. Also many other projects -have standardized on 80 columns, so some people have already configured their -editors for it (vs something else, like 90 columns).</p> - -<p>This is one of many contentious issues in coding standards, but it is not up -for debate.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="scf_spacestabs">Use Spaces Instead of Tabs</a> -</h4> - -<div> - -<p>In all cases, prefer spaces to tabs in source files. People have different -preferred indentation levels, and different styles of indentation that they -like; this is fine. What isn't fine is that different editors/viewers expand -tabs out to different tab stops. This can cause your code to look completely -unreadable, and it is not worth dealing with.</p> - -<p>As always, follow the <a href="#goldenrule">Golden Rule</a> above: follow the -style of existing code if you are modifying and extending it. If you like four -spaces of indentation, <b>DO NOT</b> do that in the middle of a chunk of code -with two spaces of indentation. 
Also, do not reindent a whole source file: it -makes for incredible diffs that are absolutely worthless.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="scf_indentation">Indent Code Consistently</a> -</h4> - -<div> - -<p>Okay, in your first year of programming you were told that indentation is -important. If you didn't believe and internalize this then, now is the time. -Just do it.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="compilerissues">Compiler Issues</a> -</h3> - -<div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ci_warningerrors">Treat Compiler Warnings Like Errors</a> -</h4> - -<div> - -<p>If your code has compiler warnings in it, something is wrong — you -aren't casting values correctly, your have "questionable" constructs in your -code, or you are doing something legitimately wrong. Compiler warnings can -cover up legitimate errors in output and make dealing with a translation unit -difficult.</p> - -<p>It is not possible to prevent all warnings from all compilers, nor is it -desirable. Instead, pick a standard compiler (like <tt>gcc</tt>) that provides -a good thorough set of warnings, and stick to it. At least in the case of -<tt>gcc</tt>, it is possible to work around any spurious errors by changing the -syntax of the code slightly. For example, a warning that annoys me occurs when -I write code like this:</p> - -<div class="doc_code"> -<pre> -if (V = getValue()) { - ... -} -</pre> -</div> - -<p><tt>gcc</tt> will warn me that I probably want to use the <tt>==</tt> -operator, and that I probably mistyped it. In most cases, I haven't, and I -really don't want the spurious errors. To fix this particular problem, I -rewrite the code like this:</p> - -<div class="doc_code"> -<pre> -if ((V = getValue())) { - ... -} -</pre> -</div> - -<p>which shuts <tt>gcc</tt> up. Any <tt>gcc</tt> warning that annoys you can -be fixed by massaging the code appropriately.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ci_portable_code">Write Portable Code</a> -</h4> - -<div> - -<p>In almost all cases, it is possible and within reason to write completely -portable code. If there are cases where it isn't possible to write portable -code, isolate it behind a well defined (and well documented) interface.</p> - -<p>In practice, this means that you shouldn't assume much about the host -compiler, and Visual Studio tends to be the lowest common denominator. -If advanced features are used, they should only be an implementation detail of -a library which has a simple exposed API, and preferably be buried in -libSystem.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> -<a name="ci_rtti_exceptions">Do not use RTTI or Exceptions</a> -</h4> -<div> - -<p>In an effort to reduce code and executable size, LLVM does not use RTTI -(e.g. <tt>dynamic_cast<></tt>) or exceptions. These two language features -violate the general C++ principle of <i>"you only pay for what you use"</i>, -causing executable bloat even if exceptions are never used in the code base, or -if RTTI is never used for a class. 
Because of this, we turn them off globally -in the code.</p> - -<p>That said, LLVM does make extensive use of a hand-rolled form of RTTI that -use templates like <a href="ProgrammersManual.html#isa"><tt>isa<></tt>, -<tt>cast<></tt>, and <tt>dyn_cast<></tt></a>. This form of RTTI is -opt-in and can be added to any class. It is also substantially more efficient -than <tt>dynamic_cast<></tt>.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> -<a name="ci_static_ctors">Do not use Static Constructors</a> -</h4> -<div> - -<p>Static constructors and destructors (e.g. global variables whose types have -a constructor or destructor) should not be added to the code base, and should be -removed wherever possible. Besides <a -href="http://yosefk.com/c++fqa/ctors.html#fqa-10.12">well known problems</a> -where the order of initialization is undefined between globals in different -source files, the entire concept of static constructors is at odds with the -common use case of LLVM as a library linked into a larger application.</p> - -<p>Consider the use of LLVM as a JIT linked into another application (perhaps -for <a href="http://llvm.org/Users.html">OpenGL, custom languages</a>, -<a href="http://llvm.org/devmtg/2010-11/Gritz-OpenShadingLang.pdf">shaders in -movies</a>, etc). Due to the design of static constructors, they must be -executed at startup time of the entire application, regardless of whether or -how LLVM is used in that larger application. There are two problems with -this:</p> - -<ol> - <li>The time to run the static constructors impacts startup time of - applications — a critical time for GUI apps, among others.</li> - - <li>The static constructors cause the app to pull many extra pages of memory - off the disk: both the code for the constructor in each <tt>.o</tt> file and - the small amount of data that gets touched. In addition, touched/dirty pages - put more pressure on the VM system on low-memory machines.</li> -</ol> - -<p>We would really like for there to be zero cost for linking in an additional -LLVM target or other library into an application, but static constructors -violate this goal.</p> - -<p>That said, LLVM unfortunately does contain static constructors. It would be -a <a href="http://llvm.org/PR11944">great project</a> for someone to purge all -static constructors from LLVM, and then enable the -<tt>-Wglobal-constructors</tt> warning flag (when building with Clang) to ensure -we do not regress in the future. -</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> -<a name="ci_class_struct">Use of <tt>class</tt> and <tt>struct</tt> Keywords</a> -</h4> -<div> - -<p>In C++, the <tt>class</tt> and <tt>struct</tt> keywords can be used almost -interchangeably. The only difference is when they are used to declare a class: -<tt>class</tt> makes all members private by default while <tt>struct</tt> makes -all members public by default.</p> - -<p>Unfortunately, not all compilers follow the rules and some will generate -different symbols based on whether <tt>class</tt> or <tt>struct</tt> was used to -declare the symbol. 
This can lead to problems at link time.</p> - -<p>So, the rule for LLVM is to always use the <tt>class</tt> keyword, unless -<b>all</b> members are public and the type is a C++ -<a href="http://en.wikipedia.org/wiki/Plain_old_data_structure">POD</a> type, in -which case <tt>struct</tt> is allowed.</p> - -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="styleissues">Style Issues</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<!-- ======================================================================= --> -<h3> - <a name="macro">The High-Level Issues</a> -</h3> -<!-- ======================================================================= --> - -<div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="hl_module">A Public Header File <b>is</b> a Module</a> -</h4> - -<div> - -<p>C++ doesn't do too well in the modularity department. There is no real -encapsulation or data hiding (unless you use expensive protocol classes), but it -is what we have to work with. When you write a public header file (in the LLVM -source tree, they live in the top level "<tt>include</tt>" directory), you are -defining a module of functionality.</p> - -<p>Ideally, modules should be completely independent of each other, and their -header files should only <tt>#include</tt> the absolute minimum number of -headers possible. A module is not just a class, a function, or a -namespace: <a href="http://www.cuj.com/articles/2000/0002/0002c/0002c.htm">it's -a collection of these</a> that defines an interface. This interface may be -several functions, classes, or data structures, but the important issue is how -they work together.</p> - -<p>In general, a module should be implemented by one or more <tt>.cpp</tt> -files. Each of these <tt>.cpp</tt> files should include the header that defines -their interface first. This ensures that all of the dependences of the module -header have been properly added to the module header itself, and are not -implicit. System headers should be included after user headers for a -translation unit.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="hl_dontinclude"><tt>#include</tt> as Little as Possible</a> -</h4> - -<div> - -<p><tt>#include</tt> hurts compile time performance. Don't do it unless you -have to, especially in header files.</p> - -<p>But wait! Sometimes you need to have the definition of a class to use it, or -to inherit from it. In these cases go ahead and <tt>#include</tt> that header -file. Be aware however that there are many cases where you don't need to have -the full definition of a class. If you are using a pointer or reference to a -class, you don't need the header file. If you are simply returning a class -instance from a prototyped function or method, you don't need it. In fact, for -most cases, you simply don't need the definition of a class. And not -<tt>#include</tt>'ing speeds up compilation.</p> - -<p>It is easy to try to go too overboard on this recommendation, however. You -<b>must</b> include all of the header files that you are using — you can -include them either directly or indirectly (through another header file). To -make sure that you don't accidentally forget to include a header file in your -module header, make sure to include your module header <b>first</b> in the -implementation file (as mentioned above). 
This way there won't be any hidden -dependencies that you'll find out about later.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="hl_privateheaders">Keep "Internal" Headers Private</a> -</h4> - -<div> - -<p>Many modules have a complex implementation that causes them to use more than -one implementation (<tt>.cpp</tt>) file. It is often tempting to put the -internal communication interface (helper classes, extra functions, etc) in the -public module header file. Don't do this!</p> - -<p>If you really need to do something like this, put a private header file in -the same directory as the source files, and include it locally. This ensures -that your private interface remains private and undisturbed by outsiders.</p> - -<p>Note however, that it's okay to put extra implementation methods in a public -class itself. Just make them private (or protected) and all is well.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="hl_earlyexit">Use Early Exits and <tt>continue</tt> to Simplify Code</a> -</h4> - -<div> - -<p>When reading code, keep in mind how much state and how many previous -decisions have to be remembered by the reader to understand a block of code. -Aim to reduce indentation where possible when it doesn't make it more difficult -to understand the code. One great way to do this is by making use of early -exits and the <tt>continue</tt> keyword in long loops. As an example of using -an early exit from a function, consider this "bad" code:</p> - -<div class="doc_code"> -<pre> -Value *DoSomething(Instruction *I) { - if (!isa<TerminatorInst>(I) && - I->hasOneUse() && SomeOtherThing(I)) { - ... some long code .... - } - - return 0; -} -</pre> -</div> - -<p>This code has several problems if the body of the '<tt>if</tt>' is large. -When you're looking at the top of the function, it isn't immediately clear that -this <em>only</em> does interesting things with non-terminator instructions, and -only applies to things with the other predicates. Second, it is relatively -difficult to describe (in comments) why these predicates are important because -the <tt>if</tt> statement makes it difficult to lay out the comments. Third, -when you're deep within the body of the code, it is indented an extra level. -Finally, when reading the top of the function, it isn't clear what the result is -if the predicate isn't true; you have to read to the end of the function to know -that it returns null.</p> - -<p>It is much preferred to format the code like this:</p> - -<div class="doc_code"> -<pre> -Value *DoSomething(Instruction *I) { - // Terminators never need 'something' done to them because ... - if (isa<TerminatorInst>(I)) - return 0; - - // We conservatively avoid transforming instructions with multiple uses - // because goats like cheese. - if (!I->hasOneUse()) - return 0; - - // This is really just here for example. - if (!SomeOtherThing(I)) - return 0; - - ... some long code .... -} -</pre> -</div> - -<p>This fixes these problems. A similar problem frequently happens in <tt>for</tt> -loops. A silly example is something like this:</p> - -<div class="doc_code"> -<pre> - for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) { - if (BinaryOperator *BO = dyn_cast<BinaryOperator>(II)) { - Value *LHS = BO->getOperand(0); - Value *RHS = BO->getOperand(1); - if (LHS != RHS) { - ... 
- } - } - } -</pre> -</div> - -<p>When you have very, very small loops, this sort of structure is fine. But if -it exceeds more than 10-15 lines, it becomes difficult for people to read and -understand at a glance. The problem with this sort of code is that it gets very -nested very quickly. Meaning that the reader of the code has to keep a lot of -context in their brain to remember what is going immediately on in the loop, -because they don't know if/when the <tt>if</tt> conditions will have elses etc. -It is strongly preferred to structure the loop like this:</p> - -<div class="doc_code"> -<pre> - for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) { - BinaryOperator *BO = dyn_cast<BinaryOperator>(II); - if (!BO) continue; - - Value *LHS = BO->getOperand(0); - Value *RHS = BO->getOperand(1); - if (LHS == RHS) continue; - - ... - } -</pre> -</div> - -<p>This has all the benefits of using early exits for functions: it reduces -nesting of the loop, it makes it easier to describe why the conditions are true, -and it makes it obvious to the reader that there is no <tt>else</tt> coming up -that they have to push context into their brain for. If a loop is large, this -can be a big understandability win.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="hl_else_after_return">Don't use <tt>else</tt> after a <tt>return</tt></a> -</h4> - -<div> - -<p>For similar reasons above (reduction of indentation and easier reading), -please do not use '<tt>else</tt>' or '<tt>else if</tt>' after something that -interrupts control flow — like <tt>return</tt>, <tt>break</tt>, -<tt>continue</tt>, <tt>goto</tt>, etc. For example, this is <em>bad</em>:</p> - -<div class="doc_code"> -<pre> - case 'J': { - if (Signed) { - Type = Context.getsigjmp_bufType(); - if (Type.isNull()) { - Error = ASTContext::GE_Missing_sigjmp_buf; - return QualType(); - <b>} else { - break; - }</b> - } else { - Type = Context.getjmp_bufType(); - if (Type.isNull()) { - Error = ASTContext::GE_Missing_jmp_buf; - return QualType(); - <b>} else { - break; - }</b> - } - } - } -</pre> -</div> - -<p>It is better to write it like this:</p> - -<div class="doc_code"> -<pre> - case 'J': - if (Signed) { - Type = Context.getsigjmp_bufType(); - if (Type.isNull()) { - Error = ASTContext::GE_Missing_sigjmp_buf; - return QualType(); - } - } else { - Type = Context.getjmp_bufType(); - if (Type.isNull()) { - Error = ASTContext::GE_Missing_jmp_buf; - return QualType(); - } - } - <b>break;</b> -</pre> -</div> - -<p>Or better yet (in this case) as:</p> - -<div class="doc_code"> -<pre> - case 'J': - if (Signed) - Type = Context.getsigjmp_bufType(); - else - Type = Context.getjmp_bufType(); - - if (Type.isNull()) { - Error = Signed ? ASTContext::GE_Missing_sigjmp_buf : - ASTContext::GE_Missing_jmp_buf; - return QualType(); - } - <b>break;</b> -</pre> -</div> - -<p>The idea is to reduce indentation and the amount of code you have to keep -track of when reading the code.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="hl_predicateloops">Turn Predicate Loops into Predicate Functions</a> -</h4> - -<div> - -<p>It is very common to write small loops that just compute a boolean value. 
-There are a number of ways that people commonly write these, but an example of -this sort of thing is:</p> - -<div class="doc_code"> -<pre> - <b>bool FoundFoo = false;</b> - for (unsigned i = 0, e = BarList.size(); i != e; ++i) - if (BarList[i]->isFoo()) { - <b>FoundFoo = true;</b> - break; - } - - <b>if (FoundFoo) {</b> - ... - } -</pre> -</div> - -<p>This sort of code is awkward to write, and is almost always a bad sign. -Instead of this sort of loop, we strongly prefer to use a predicate function -(which may be <a href="#micro_anonns">static</a>) that uses -<a href="#hl_earlyexit">early exits</a> to compute the predicate. We prefer -the code to be structured like this:</p> - -<div class="doc_code"> -<pre> -/// ListContainsFoo - Return true if the specified list has an element that is -/// a foo. -static bool ListContainsFoo(const std::vector<Bar*> &List) { - for (unsigned i = 0, e = List.size(); i != e; ++i) - if (List[i]->isFoo()) - return true; - return false; -} -... - - <b>if (ListContainsFoo(BarList)) {</b> - ... - } -</pre> -</div> - -<p>There are many reasons for doing this: it reduces indentation and factors out -code which can often be shared by other code that checks for the same predicate. -More importantly, it <em>forces you to pick a name</em> for the function, and -forces you to write a comment for it. In this silly example, this doesn't add -much value. However, if the condition is complex, this can make it a lot easier -for the reader to understand the code that queries for this predicate. Instead -of being faced with the in-line details of how we check to see if the BarList -contains a foo, we can trust the function name and continue reading with better -locality.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="micro">The Low-Level Issues</a> -</h3> -<!-- ======================================================================= --> - -<div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ll_naming"> - Name Types, Functions, Variables, and Enumerators Properly - </a> -</h4> - -<div> - -<p>Poorly-chosen names can mislead the reader and cause bugs. We cannot stress -enough how important it is to use <em>descriptive</em> names. Pick names that -match the semantics and role of the underlying entities, within reason. Avoid -abbreviations unless they are well known. After picking a good name, make sure -to use consistent capitalization for the name, as inconsistency requires clients -to either memorize the APIs or to look it up to find the exact spelling.</p> - -<p>In general, names should be in camel case (e.g. <tt>TextFileReader</tt> -and <tt>isLValue()</tt>). Different kinds of declarations have different -rules:</p> - -<ul> -<li><p><b>Type names</b> (including classes, structs, enums, typedefs, etc) - should be nouns and start with an upper-case letter (e.g. - <tt>TextFileReader</tt>).</p></li> - -<li><p><b>Variable names</b> should be nouns (as they represent state). The - name should be camel case, and start with an upper case letter (e.g. - <tt>Leader</tt> or <tt>Boats</tt>).</p></li> - -<li><p><b>Function names</b> should be verb phrases (as they represent - actions), and command-like function should be imperative. The name should - be camel case, and start with a lower case letter (e.g. <tt>openFile()</tt> - or <tt>isFoo()</tt>).</p></li> - -<li><p><b>Enum declarations</b> (e.g. 
<tt>enum Foo {...}</tt>) are types, so - they should follow the naming conventions for types. A common use for enums - is as a discriminator for a union, or an indicator of a subclass. When an - enum is used for something like this, it should have a <tt>Kind</tt> suffix - (e.g. <tt>ValueKind</tt>).</p></li> - -<li><p><b>Enumerators</b> (e.g. <tt>enum { Foo, Bar }</tt>) and <b>public member - variables</b> should start with an upper-case letter, just like types. - Unless the enumerators are defined in their own small namespace or inside a - class, enumerators should have a prefix corresponding to the enum - declaration name. For example, <tt>enum ValueKind { ... };</tt> may contain - enumerators like <tt>VK_Argument</tt>, <tt>VK_BasicBlock</tt>, etc. - Enumerators that are just convenience constants are exempt from the - requirement for a prefix. For instance:</p> - -<div class="doc_code"> -<pre> -enum { - MaxSize = 42, - Density = 12 -}; -</pre> -</div> -</li> - -</ul> - -<p>As an exception, classes that mimic STL classes can have member names in -STL's style of lower-case words separated by underscores (e.g. <tt>begin()</tt>, -<tt>push_back()</tt>, and <tt>empty()</tt>).</p> - -<p>Here are some examples of good and bad names:</p> - -<div class="doc_code"> -<pre> -class VehicleMaker { - ... - Factory<Tire> F; // Bad -- abbreviation and non-descriptive. - Factory<Tire> Factory; // Better. - Factory<Tire> TireFactory; // Even better -- if VehicleMaker has more than one - // kind of factories. -}; - -Vehicle MakeVehicle(VehicleType Type) { - VehicleMaker M; // Might be OK if having a short life-span. - Tire tmp1 = M.makeTire(); // Bad -- 'tmp1' provides no information. - Light headlight = M.makeLight("head"); // Good -- descriptive. - ... -} -</pre> -</div> - -</div> - - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ll_assert">Assert Liberally</a> -</h4> - -<div> - -<p>Use the "<tt>assert</tt>" macro to its fullest. Check all of your -preconditions and assumptions, you never know when a bug (not necessarily even -yours) might be caught early by an assertion, which reduces debugging time -dramatically. The "<tt><cassert></tt>" header file is probably already -included by the header files you are using, so it doesn't cost anything to use -it.</p> - -<p>To further assist with debugging, make sure to put some kind of error message -in the assertion statement, which is printed if the assertion is tripped. This -helps the poor debugger make sense of why an assertion is being made and -enforced, and hopefully what to do about it. Here is one complete example:</p> - -<div class="doc_code"> -<pre> -inline Value *getOperand(unsigned i) { - assert(i < Operands.size() && "getOperand() out of range!"); - return Operands[i]; -} -</pre> -</div> - -<p>Here are more examples:</p> - -<div class="doc_code"> -<pre> -assert(Ty->isPointerType() && "Can't allocate a non pointer type!"); - -assert((Opcode == Shl || Opcode == Shr) && "ShiftInst Opcode invalid!"); - -assert(idx < getNumSuccessors() && "Successor # out of range!"); - -assert(V1.getType() == V2.getType() && "Constant types must be identical!"); - -assert(isa<PHINode>(Succ->front()) && "Only works on PHId BBs!"); -</pre> -</div> - -<p>You get the idea.</p> - -<p>Please be aware that, when adding assert statements, not all compilers are aware of -the semantics of the assert. In some places, asserts are used to indicate a piece of -code that should not be reached. 
These are typically of the form:</p> - -<div class="doc_code"> -<pre> -assert(0 && "Some helpful error message"); -</pre> -</div> - -<p>When used in a function that returns a value, they should be followed with a return -statement and a comment indicating that this line is never reached. This will prevent -a compiler which is unable to deduce that the assert statement never returns from -generating a warning.</p> - -<div class="doc_code"> -<pre> -assert(0 && "Some helpful error message"); -// Not reached -return 0; -</pre> -</div> - -<p>Another issue is that values used only by assertions will produce an "unused -value" warning when assertions are disabled. For example, this code will -warn:</p> - -<div class="doc_code"> -<pre> -unsigned Size = V.size(); -assert(Size > 42 && "Vector smaller than it should be"); - -bool NewToSet = Myset.insert(Value); -assert(NewToSet && "The value shouldn't be in the set yet"); -</pre> -</div> - -<p>These are two interesting different cases. In the first case, the call to -V.size() is only useful for the assert, and we don't want it executed when -assertions are disabled. Code like this should move the call into the assert -itself. In the second case, the side effects of the call must happen whether -the assert is enabled or not. In this case, the value should be cast to void to -disable the warning. To be specific, it is preferred to write the code like -this:</p> - -<div class="doc_code"> -<pre> -assert(V.size() > 42 && "Vector smaller than it should be"); - -bool NewToSet = Myset.insert(Value); (void)NewToSet; -assert(NewToSet && "The value shouldn't be in the set yet"); -</pre> -</div> - - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ll_ns_std">Do Not Use '<tt>using namespace std</tt>'</a> -</h4> - -<div> - -<p>In LLVM, we prefer to explicitly prefix all identifiers from the standard -namespace with an "<tt>std::</tt>" prefix, rather than rely on -"<tt>using namespace std;</tt>".</p> - -<p> In header files, adding a '<tt>using namespace XXX</tt>' directive pollutes -the namespace of any source file that <tt>#include</tt>s the header. This is -clearly a bad thing.</p> - -<p>In implementation files (e.g. <tt>.cpp</tt> files), the rule is more of a stylistic -rule, but is still important. Basically, using explicit namespace prefixes -makes the code <b>clearer</b>, because it is immediately obvious what facilities -are being used and where they are coming from. And <b>more portable</b>, because -namespace clashes cannot occur between LLVM code and other namespaces. The -portability rule is important because different standard library implementations -expose different symbols (potentially ones they shouldn't), and future revisions -to the C++ standard will add more symbols to the <tt>std</tt> namespace. As -such, we never use '<tt>using namespace std;</tt>' in LLVM.</p> - -<p>The exception to the general rule (i.e. it's not an exception for -the <tt>std</tt> namespace) is for implementation files. For example, all of -the code in the LLVM project implements code that lives in the 'llvm' namespace. -As such, it is ok, and actually clearer, for the <tt>.cpp</tt> files to have a -'<tt>using namespace llvm;</tt>' directive at the top, after the -<tt>#include</tt>s. This reduces indentation in the body of the file for source -editors that indent based on braces, and keeps the conceptual context cleaner. 
-The general form of this rule is that any <tt>.cpp</tt> file that implements -code in any namespace may use that namespace (and its parents'), but should not -use any others.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ll_virtual_anch"> - Provide a Virtual Method Anchor for Classes in Headers - </a> -</h4> - -<div> - -<p>If a class is defined in a header file and has a v-table (either it has -virtual methods or it derives from classes with virtual methods), it must -always have at least one out-of-line virtual method in the class. Without -this, the compiler will copy the vtable and RTTI into every <tt>.o</tt> file -that <tt>#include</tt>s the header, bloating <tt>.o</tt> file sizes and -increasing link times.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ll_end">Don't evaluate <tt>end()</tt> every time through a loop</a> -</h4> - -<div> - -<p>Because C++ doesn't have a standard "<tt>foreach</tt>" loop (though it can be -emulated with macros and may be coming in C++'0x) we end up writing a lot of -loops that manually iterate from begin to end on a variety of containers or -through other data structures. One common mistake is to write a loop in this -style:</p> - -<div class="doc_code"> -<pre> - BasicBlock *BB = ... - for (BasicBlock::iterator I = BB->begin(); I != <b>BB->end()</b>; ++I) - ... use I ... -</pre> -</div> - -<p>The problem with this construct is that it evaluates "<tt>BB->end()</tt>" -every time through the loop. Instead of writing the loop like this, we strongly -prefer loops to be written so that they evaluate it once before the loop starts. -A convenient way to do this is like so:</p> - -<div class="doc_code"> -<pre> - BasicBlock *BB = ... - for (BasicBlock::iterator I = BB->begin(), E = <b>BB->end()</b>; I != E; ++I) - ... use I ... -</pre> -</div> - -<p>The observant may quickly point out that these two loops may have different -semantics: if the container (a basic block in this case) is being mutated, then -"<tt>BB->end()</tt>" may change its value every time through the loop and the -second loop may not in fact be correct. If you actually do depend on this -behavior, please write the loop in the first form and add a comment indicating -that you did it intentionally.</p> - -<p>Why do we prefer the second form (when correct)? Writing the loop in the -first form has two problems. First it may be less efficient than evaluating it -at the start of the loop. In this case, the cost is probably minor — a -few extra loads every time through the loop. However, if the base expression is -more complex, then the cost can rise quickly. I've seen loops where the end -expression was actually something like: "<tt>SomeMap[x]->end()</tt>" and map -lookups really aren't cheap. By writing it in the second form consistently, you -eliminate the issue entirely and don't even have to think about it.</p> - -<p>The second (even bigger) issue is that writing the loop in the first form -hints to the reader that the loop is mutating the container (a fact that a -comment would handily confirm!). 
If you write the loop in the second form, it -is immediately obvious without even looking at the body of the loop that the -container isn't being modified, which makes it easier to read the code and -understand what it does.</p> - -<p>While the second form of the loop is a few extra keystrokes, we do strongly -prefer it.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ll_iostream"><tt>#include <iostream></tt> is Forbidden</a> -</h4> - -<div> - -<p>The use of <tt>#include <iostream></tt> in library files is -hereby <b><em>forbidden</em></b>, because many common implementations -transparently inject a <a href="#ci_static_ctors">static constructor</a> into -every translation unit that includes it.</p> - -<p>Note that using the other stream headers (<tt><sstream></tt> for -example) is not problematic in this regard — -just <tt><iostream></tt>. However, <tt>raw_ostream</tt> provides various -APIs that are better performing for almost every use than <tt>std::ostream</tt> -style APIs. <b>Therefore new code should always -use <a href="#ll_raw_ostream"><tt>raw_ostream</tt></a> for writing, or -the <tt>llvm::MemoryBuffer</tt> API for reading files.</b></p> - -</div> - - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ll_raw_ostream">Use <tt>raw_ostream</tt></a> -</h4> - -<div> - -<p>LLVM includes a lightweight, simple, and efficient stream implementation -in <tt>llvm/Support/raw_ostream.h</tt>, which provides all of the common -features of <tt>std::ostream</tt>. All new code should use <tt>raw_ostream</tt> -instead of <tt>ostream</tt>.</p> - -<p>Unlike <tt>std::ostream</tt>, <tt>raw_ostream</tt> is not a template and can -be forward declared as <tt>class raw_ostream</tt>. Public headers should -generally not include the <tt>raw_ostream</tt> header, but use forward -declarations and constant references to <tt>raw_ostream</tt> instances.</p> - -</div> - - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ll_avoidendl">Avoid <tt>std::endl</tt></a> -</h4> - -<div> - -<p>The <tt>std::endl</tt> modifier, when used with <tt>iostreams</tt> outputs a -newline to the output stream specified. In addition to doing this, however, it -also flushes the output stream. In other words, these are equivalent:</p> - -<div class="doc_code"> -<pre> -std::cout << std::endl; -std::cout << '\n' << std::flush; -</pre> -</div> - -<p>Most of the time, you probably have no reason to flush the output stream, so -it's better to use a literal <tt>'\n'</tt>.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="nano">Microscopic Details</a> -</h3> -<!-- ======================================================================= --> - -<div> - -<p>This section describes preferred low-level formatting guidelines along with -reasoning on why we prefer them.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="micro_spaceparen">Spaces Before Parentheses</a> -</h4> - -<div> - -<p>We prefer to put a space before an open parenthesis only in control flow -statements, but not in normal function call expressions and function-like -macros. For example, this is good:</p> - -<div class="doc_code"> -<pre> -<b>if (</b>x) ... -<b>for (</b>i = 0; i != 100; ++i) ... -<b>while (</b>llvm_rocks) ... 
- -<b>somefunc(</b>42); -<b><a href="#ll_assert">assert</a>(</b>3 != 4 && "laws of math are failing me"); - -a = <b>foo(</b>42, 92) + <b>bar(</b>x); -</pre> -</div> - -<p>and this is bad:</p> - -<div class="doc_code"> -<pre> -<b>if(</b>x) ... -<b>for(</b>i = 0; i != 100; ++i) ... -<b>while(</b>llvm_rocks) ... - -<b>somefunc (</b>42); -<b><a href="#ll_assert">assert</a> (</b>3 != 4 && "laws of math are failing me"); - -a = <b>foo (</b>42, 92) + <b>bar (</b>x); -</pre> -</div> - -<p>The reason for doing this is not completely arbitrary. This style makes -control flow operators stand out more, and makes expressions flow better. The -function call operator binds very tightly as a postfix operator. Putting a -space after a function name (as in the last example) makes it appear that the -code might bind the arguments of the left-hand-side of a binary operator with -the argument list of a function and the name of the right side. More -specifically, it is easy to misread the "a" example as:</p> - -<div class="doc_code"> -<pre> -a = foo <b>(</b>(42, 92) + bar<b>)</b> (x); -</pre> -</div> - -<p>when skimming through the code. By avoiding a space in a function, we avoid -this misinterpretation.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="micro_preincrement">Prefer Preincrement</a> -</h4> - -<div> - -<p>Hard fast rule: Preincrement (<tt>++X</tt>) may be no slower than -postincrement (<tt>X++</tt>) and could very well be a lot faster than it. Use -preincrementation whenever possible.</p> - -<p>The semantics of postincrement include making a copy of the value being -incremented, returning it, and then preincrementing the "work value". For -primitive types, this isn't a big deal... but for iterators, it can be a huge -issue (for example, some iterators contains stack and set objects in them... -copying an iterator could invoke the copy ctor's of these as well). In general, -get in the habit of always using preincrement, and you won't have a problem.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="micro_namespaceindent">Namespace Indentation</a> -</h4> - -<div> - -<p> -In general, we strive to reduce indentation wherever possible. This is useful -because we want code to <a href="#scf_codewidth">fit into 80 columns</a> without -wrapping horribly, but also because it makes it easier to understand the code. -Namespaces are a funny thing: they are often large, and we often desire to put -lots of stuff into them (so they can be large). Other times they are tiny, -because they just hold an enum or something similar. In order to balance this, -we use different approaches for small versus large namespaces. -</p> - -<p> -If a namespace definition is small and <em>easily</em> fits on a screen (say, -less than 35 lines of code), then you should indent its body. Here's an -example: -</p> - -<div class="doc_code"> -<pre> -namespace llvm { - namespace X86 { - /// RelocationType - An enum for the x86 relocation codes. Note that - /// the terminology here doesn't follow x86 convention - word means - /// 32-bit and dword means 64-bit. - enum RelocationType { - /// reloc_pcrel_word - PC relative relocation, add the relocated value to - /// the value already in memory, after we adjust it for where the PC is. 
- reloc_pcrel_word = 0, - - /// reloc_picrel_word - PIC base relative relocation, add the relocated - /// value to the value already in memory, after we adjust it for where the - /// PIC base is. - reloc_picrel_word = 1, - - /// reloc_absolute_word, reloc_absolute_dword - Absolute relocation, just - /// add the relocated value to the value already in memory. - reloc_absolute_word = 2, - reloc_absolute_dword = 3 - }; - } -} -</pre> -</div> - -<p>Since the body is small, indenting adds value because it makes it very clear -where the namespace starts and ends, and it is easy to take the whole thing in -in one "gulp" when reading the code. If the blob of code in the namespace is -larger (as it typically is in a header in the <tt>llvm</tt> or <tt>clang</tt> namespaces), do not -indent the code, and add a comment indicating what namespace is being closed. -For example:</p> - -<div class="doc_code"> -<pre> -namespace llvm { -namespace knowledge { - -/// Grokable - This class represents things that Smith can have an intimate -/// understanding of and contains the data associated with it. -class Grokable { -... -public: - explicit Grokable() { ... } - virtual ~Grokable() = 0; - - ... - -}; - -} // end namespace knowledge -} // end namespace llvm -</pre> -</div> - -<p>Because the class is large, we don't expect that the reader can easily -understand the entire concept in a glance, and the end of the file (where the -namespaces end) may be a long ways away from the place they open. As such, -indenting the contents of the namespace doesn't add any value, and detracts from -the readability of the class. In these cases it is best to <em>not</em> indent -the contents of the namespace.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="micro_anonns">Anonymous Namespaces</a> -</h4> - -<div> - -<p>After talking about namespaces in general, you may be wondering about -anonymous namespaces in particular. -Anonymous namespaces are a great language feature that tells the C++ compiler -that the contents of the namespace are only visible within the current -translation unit, allowing more aggressive optimization and eliminating the -possibility of symbol name collisions. Anonymous namespaces are to C++ as -"static" is to C functions and global variables. While "static" is available -in C++, anonymous namespaces are more general: they can make entire classes -private to a file.</p> - -<p>The problem with anonymous namespaces is that they naturally want to -encourage indentation of their body, and they reduce locality of reference: if -you see a random function definition in a C++ file, it is easy to see if it is -marked static, but seeing if it is in an anonymous namespace requires scanning -a big chunk of the file.</p> - -<p>Because of this, we have a simple guideline: make anonymous namespaces as -small as possible, and only use them for class declarations. For example, this -is good:</p> - -<div class="doc_code"> -<pre> -<b>namespace {</b> - class StringSort { - ... - public: - StringSort(...) - bool operator<(const char *RHS) const; - }; -<b>} // end anonymous namespace</b> - -static void Helper() { - ... -} - -bool StringSort::operator<(const char *RHS) const { - ... -} - -</pre> -</div> - -<p>This is bad:</p> - - -<div class="doc_code"> -<pre> -<b>namespace {</b> -class StringSort { -... -public: - StringSort(...) - bool operator<(const char *RHS) const; -}; - -void Helper() { - ... -} - -bool StringSort::operator<(const char *RHS) const { - ... 
-} - -<b>} // end anonymous namespace</b> - -</pre> -</div> - - -<p>This is bad specifically because if you're looking at "Helper" in the middle -of a large C++ file, that you have no immediate way to tell if it is local to -the file. When it is marked static explicitly, this is immediately obvious. -Also, there is no reason to enclose the definition of "operator<" in the -namespace just because it was declared there. -</p> - -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="seealso">See Also</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>A lot of these comments and recommendations have been culled for other -sources. Two particularly important books for our work are:</p> - -<ol> - -<li><a href="http://www.amazon.com/Effective-Specific-Addison-Wesley-Professional-Computing/dp/0321334876">Effective -C++</a> by Scott Meyers. Also -interesting and useful are "More Effective C++" and "Effective STL" by the same -author.</li> - -<li>Large-Scale C++ Software Design by John Lakos</li> - -</ol> - -<p>If you get some free time, and you haven't read them: do so, you might learn -something.</p> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-03-27 13:25:16 +0200 (Tue, 27 Mar 2012) $ -</address> - -</body> -</html> diff --git a/docs/CodingStandards.rst b/docs/CodingStandards.rst new file mode 100644 index 0000000..a416a1e --- /dev/null +++ b/docs/CodingStandards.rst @@ -0,0 +1,1147 @@ +.. _coding_standards: + +===================== +LLVM Coding Standards +===================== + +.. contents:: + :local: + +Introduction +============ + +This document attempts to describe a few coding standards that are being used in +the LLVM source tree. Although no coding standards should be regarded as +absolute requirements to be followed in all instances, coding standards are +particularly important for large-scale code bases that follow a library-based +design (like LLVM). + +This document intentionally does not prescribe fixed standards for religious +issues such as brace placement and space usage. For issues like this, follow +the golden rule: + +.. _Golden Rule: + + **If you are extending, enhancing, or bug fixing already implemented code, + use the style that is already being used so that the source is uniform and + easy to follow.** + +Note that some code bases (e.g. ``libc++``) have really good reasons to deviate +from the coding standards. In the case of ``libc++``, this is because the +naming and other conventions are dictated by the C++ standard. If you think +there is a specific good reason to deviate from the standards here, please bring +it up on the LLVMdev mailing list. + +There are some conventions that are not uniformly followed in the code base +(e.g. the naming convention). This is because they are relatively new, and a +lot of code was written before they were put in place. 
Our long term goal is +for the entire codebase to follow the convention, but we explicitly *do not* +want patches that do large-scale reformating of existing code. On the other +hand, it is reasonable to rename the methods of a class if you're about to +change it in some other way. Just do the reformating as a separate commit from +the functionality change. + +The ultimate goal of these guidelines is the increase readability and +maintainability of our common source base. If you have suggestions for topics to +be included, please mail them to `Chris <mailto:sabre@nondot.org>`_. + +Mechanical Source Issues +======================== + +Source Code Formatting +---------------------- + +Commenting +^^^^^^^^^^ + +Comments are one critical part of readability and maintainability. Everyone +knows they should comment their code, and so should you. When writing comments, +write them as English prose, which means they should use proper capitalization, +punctuation, etc. Aim to describe what the code is trying to do and why, not +*how* it does it at a micro level. Here are a few critical things to document: + +.. _header file comment: + +File Headers +"""""""""""" + +Every source file should have a header on it that describes the basic purpose of +the file. If a file does not have a header, it should not be checked into the +tree. The standard header looks like this: + +.. code-block:: c++ + + //===-- llvm/Instruction.h - Instruction class definition -------*- C++ -*-===// + // + // The LLVM Compiler Infrastructure + // + // This file is distributed under the University of Illinois Open Source + // License. See LICENSE.TXT for details. + // + //===----------------------------------------------------------------------===// + // + // This file contains the declaration of the Instruction class, which is the + // base class for all of the VM instructions. + // + //===----------------------------------------------------------------------===// + +A few things to note about this particular format: The "``-*- C++ -*-``" string +on the first line is there to tell Emacs that the source file is a C++ file, not +a C file (Emacs assumes ``.h`` files are C files by default). + +.. note:: + + This tag is not necessary in ``.cpp`` files. The name of the file is also + on the first line, along with a very short description of the purpose of the + file. This is important when printing out code and flipping though lots of + pages. + +The next section in the file is a concise note that defines the license that the +file is released under. This makes it perfectly clear what terms the source +code can be distributed under and should not be modified in any way. + +The main body of the description does not have to be very long in most cases. +Here it's only two lines. If an algorithm is being implemented or something +tricky is going on, a reference to the paper where it is published should be +included, as well as any notes or *gotchas* in the code to watch out for. + +Class overviews +""""""""""""""" + +Classes are one fundamental part of a good object oriented design. As such, a +class definition should have a comment block that explains what the class is +used for and how it works. Every non-trivial class is expected to have a +``doxygen`` comment block. + +Method information +"""""""""""""""""" + +Methods defined in a class (as well as any global functions) should also be +documented properly. 
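+For example, a hypothetical class sketched along these lines (the class and
+method names are invented purely for illustration and are not real LLVM APIs)
+shows roughly the level of commenting that is expected:
+
+.. code-block:: c++
+
+  /// FileCache - Keeps the contents of recently opened files in memory so
+  /// that repeated lookups do not have to touch the disk.
+  class FileCache {
+  public:
+    /// getFile - Return the cached contents of the named file, or null if
+    /// the file could not be opened.  This never aborts; callers must check
+    /// for null.
+    const char *getFile(const char *Name);
+  };
+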
A quick note about what it does and a description of the +borderline behaviour is all that is necessary here (unless something +particularly tricky or insidious is going on). The hope is that people can +figure out how to use your interfaces without reading the code itself. + +Good things to talk about here are what happens when something unexpected +happens: does the method return null? Abort? Format your hard disk? + +Comment Formatting +^^^^^^^^^^^^^^^^^^ + +In general, prefer C++ style (``//``) comments. They take less space, require +less typing, don't have nesting problems, etc. There are a few cases when it is +useful to use C style (``/* */``) comments however: + +#. When writing C code: Obviously if you are writing C code, use C style + comments. + +#. When writing a header file that may be ``#include``\d by a C source file. + +#. When writing a source file that is used by a tool that only accepts C style + comments. + +To comment out a large block of code, use ``#if 0`` and ``#endif``. These nest +properly and are better behaved in general than C style comments. + +``#include`` Style +^^^^^^^^^^^^^^^^^^ + +Immediately after the `header file comment`_ (and include guards if working on a +header file), the `minimal list of #includes`_ required by the file should be +listed. We prefer these ``#include``\s to be listed in this order: + +.. _Main Module Header: +.. _Local/Private Headers: + +#. Main Module Header +#. Local/Private Headers +#. ``llvm/*`` +#. ``llvm/Analysis/*`` +#. ``llvm/Assembly/*`` +#. ``llvm/Bitcode/*`` +#. ``llvm/CodeGen/*`` +#. ... +#. ``llvm/Support/*`` +#. ``llvm/Config/*`` +#. System ``#include``\s + +and each category should be sorted by name. + +The `Main Module Header`_ file applies to ``.cpp`` files which implement an +interface defined by a ``.h`` file. This ``#include`` should always be included +**first** regardless of where it lives on the file system. By including a +header file first in the ``.cpp`` files that implement the interfaces, we ensure +that the header does not have any hidden dependencies which are not explicitly +``#include``\d in the header, but should be. It is also a form of documentation +in the ``.cpp`` file to indicate where the interfaces it implements are defined. + +.. _fit into 80 columns: + +Source Code Width +^^^^^^^^^^^^^^^^^ + +Write your code to fit within 80 columns of text. This helps those of us who +like to print out code and look at your code in an ``xterm`` without resizing +it. + +The longer answer is that there must be some limit to the width of the code in +order to reasonably allow developers to have multiple files side-by-side in +windows on a modest display. If you are going to pick a width limit, it is +somewhat arbitrary but you might as well pick something standard. Going with 90 +columns (for example) instead of 80 columns wouldn't add any significant value +and would be detrimental to printing out code. Also many other projects have +standardized on 80 columns, so some people have already configured their editors +for it (vs something else, like 90 columns). + +This is one of many contentious issues in coding standards, but it is not up for +debate. + +Use Spaces Instead of Tabs +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In all cases, prefer spaces to tabs in source files. People have different +preferred indentation levels, and different styles of indentation that they +like; this is fine. What isn't fine is that different editors/viewers expand +tabs out to different tab stops. 
This can cause your code to look completely +unreadable, and it is not worth dealing with. + +As always, follow the `Golden Rule`_ above: follow the style of +existing code if you are modifying and extending it. If you like four spaces of +indentation, **DO NOT** do that in the middle of a chunk of code with two spaces +of indentation. Also, do not reindent a whole source file: it makes for +incredible diffs that are absolutely worthless. + +Indent Code Consistently +^^^^^^^^^^^^^^^^^^^^^^^^ + +Okay, in your first year of programming you were told that indentation is +important. If you didn't believe and internalize this then, now is the time. +Just do it. + +Compiler Issues +--------------- + +Treat Compiler Warnings Like Errors +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If your code has compiler warnings in it, something is wrong --- you aren't +casting values correctly, you have "questionable" constructs in your code, or +you are doing something legitimately wrong. Compiler warnings can cover up +legitimate errors in output and make dealing with a translation unit difficult. + +It is not possible to prevent all warnings from all compilers, nor is it +desirable. Instead, pick a standard compiler (like ``gcc``) that provides a +good thorough set of warnings, and stick to it. At least in the case of +``gcc``, it is possible to work around any spurious errors by changing the +syntax of the code slightly. For example, a warning that annoys me occurs when +I write code like this: + +.. code-block:: c++ + + if (V = getValue()) { + ... + } + +``gcc`` will warn me that I probably want to use the ``==`` operator, and that I +probably mistyped it. In most cases, I haven't, and I really don't want the +spurious errors. To fix this particular problem, I rewrite the code like +this: + +.. code-block:: c++ + + if ((V = getValue())) { + ... + } + +which shuts ``gcc`` up. Any ``gcc`` warning that annoys you can be fixed by +massaging the code appropriately. + +Write Portable Code +^^^^^^^^^^^^^^^^^^^ + +In almost all cases, it is possible and within reason to write completely +portable code. If there are cases where it isn't possible to write portable +code, isolate it behind a well defined (and well documented) interface. + +In practice, this means that you shouldn't assume much about the host compiler +(and Visual Studio tends to be the lowest common denominator). If advanced +features are used, they should only be an implementation detail of a library +which has a simple exposed API, and preferably be buried in ``libSystem``. + +Do not use RTTI or Exceptions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In an effort to reduce code and executable size, LLVM does not use RTTI +(e.g. ``dynamic_cast<>;``) or exceptions. These two language features violate +the general C++ principle of *"you only pay for what you use"*, causing +executable bloat even if exceptions are never used in the code base, or if RTTI +is never used for a class. Because of this, we turn them off globally in the +code. + +That said, LLVM does make extensive use of a hand-rolled form of RTTI that use +templates like `isa<>, cast<>, and dyn_cast<> <ProgrammersManual.html#isa>`_. +This form of RTTI is opt-in and can be added to any class. It is also +substantially more efficient than ``dynamic_cast<>``. + +.. _static constructor: + +Do not use Static Constructors +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Static constructors and destructors (e.g. 
global variables whose types have a +constructor or destructor) should not be added to the code base, and should be +removed wherever possible. Besides `well known problems +<http://yosefk.com/c++fqa/ctors.html#fqa-10.12>`_ where the order of +initialization is undefined between globals in different source files, the +entire concept of static constructors is at odds with the common use case of +LLVM as a library linked into a larger application. + +Consider the use of LLVM as a JIT linked into another application (perhaps for +`OpenGL, custom languages <http://llvm.org/Users.html>`_, `shaders in movies +<http://llvm.org/devmtg/2010-11/Gritz-OpenShadingLang.pdf>`_, etc). Due to the +design of static constructors, they must be executed at startup time of the +entire application, regardless of whether or how LLVM is used in that larger +application. There are two problems with this: + +* The time to run the static constructors impacts startup time of applications + --- a critical time for GUI apps, among others. + +* The static constructors cause the app to pull many extra pages of memory off + the disk: both the code for the constructor in each ``.o`` file and the small + amount of data that gets touched. In addition, touched/dirty pages put more + pressure on the VM system on low-memory machines. + +We would really like for there to be zero cost for linking in an additional LLVM +target or other library into an application, but static constructors violate +this goal. + +That said, LLVM unfortunately does contain static constructors. It would be a +`great project <http://llvm.org/PR11944>`_ for someone to purge all static +constructors from LLVM, and then enable the ``-Wglobal-constructors`` warning +flag (when building with Clang) to ensure we do not regress in the future. + +Use of ``class`` and ``struct`` Keywords +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In C++, the ``class`` and ``struct`` keywords can be used almost +interchangeably. The only difference is when they are used to declare a class: +``class`` makes all members private by default while ``struct`` makes all +members public by default. + +Unfortunately, not all compilers follow the rules and some will generate +different symbols based on whether ``class`` or ``struct`` was used to declare +the symbol. This can lead to problems at link time. + +So, the rule for LLVM is to always use the ``class`` keyword, unless **all** +members are public and the type is a C++ `POD +<http://en.wikipedia.org/wiki/Plain_old_data_structure>`_ type, in which case +``struct`` is allowed. + +Style Issues +============ + +The High-Level Issues +--------------------- + +A Public Header File **is** a Module +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +C++ doesn't do too well in the modularity department. There is no real +encapsulation or data hiding (unless you use expensive protocol classes), but it +is what we have to work with. When you write a public header file (in the LLVM +source tree, they live in the top level "``include``" directory), you are +defining a module of functionality. + +Ideally, modules should be completely independent of each other, and their +header files should only ``#include`` the absolute minimum number of headers +possible. A module is not just a class, a function, or a namespace: it's a +collection of these that defines an interface. This interface may be several +functions, classes, or data structures, but the important issue is how they work +together. 
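+
+As a purely illustrative sketch (the header path, class, and function below
+are invented for this example and are not actual LLVM APIs), a small module
+header might group a class together with the free functions that operate on
+it:
+
+.. code-block:: c++
+
+  // llvm/Analysis/WidgetInfo.h (hypothetical module header)
+  namespace llvm {
+
+  class Function; // Forward declaration; no #include needed for a reference.
+
+  /// WidgetInfo - Summarizes how many widgets a function uses.
+  class WidgetInfo {
+    unsigned NumWidgets;
+  public:
+    explicit WidgetInfo(unsigned N) : NumWidgets(N) {}
+    unsigned getNumWidgets() const { return NumWidgets; }
+  };
+
+  /// computeWidgetInfo - Build a WidgetInfo summary for the given function.
+  WidgetInfo computeWidgetInfo(const Function &F);
+
+  } // end namespace llvm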
+ +In general, a module should be implemented by one or more ``.cpp`` files. Each +of these ``.cpp`` files should include the header that defines their interface +first. This ensures that all of the dependences of the module header have been +properly added to the module header itself, and are not implicit. System +headers should be included after user headers for a translation unit. + +.. _minimal list of #includes: + +``#include`` as Little as Possible +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``#include`` hurts compile time performance. Don't do it unless you have to, +especially in header files. + +But wait! Sometimes you need to have the definition of a class to use it, or to +inherit from it. In these cases go ahead and ``#include`` that header file. Be +aware however that there are many cases where you don't need to have the full +definition of a class. If you are using a pointer or reference to a class, you +don't need the header file. If you are simply returning a class instance from a +prototyped function or method, you don't need it. In fact, for most cases, you +simply don't need the definition of a class. And not ``#include``\ing speeds up +compilation. + +It is easy to try to go too overboard on this recommendation, however. You +**must** include all of the header files that you are using --- you can include +them either directly or indirectly through another header file. To make sure +that you don't accidentally forget to include a header file in your module +header, make sure to include your module header **first** in the implementation +file (as mentioned above). This way there won't be any hidden dependencies that +you'll find out about later. + +Keep "Internal" Headers Private +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Many modules have a complex implementation that causes them to use more than one +implementation (``.cpp``) file. It is often tempting to put the internal +communication interface (helper classes, extra functions, etc) in the public +module header file. Don't do this! + +If you really need to do something like this, put a private header file in the +same directory as the source files, and include it locally. This ensures that +your private interface remains private and undisturbed by outsiders. + +.. note:: + + It's okay to put extra implementation methods in a public class itself. Just + make them private (or protected) and all is well. + +.. _early exits: + +Use Early Exits and ``continue`` to Simplify Code +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +When reading code, keep in mind how much state and how many previous decisions +have to be remembered by the reader to understand a block of code. Aim to +reduce indentation where possible when it doesn't make it more difficult to +understand the code. One great way to do this is by making use of early exits +and the ``continue`` keyword in long loops. As an example of using an early +exit from a function, consider this "bad" code: + +.. code-block:: c++ + + Value *DoSomething(Instruction *I) { + if (!isa<TerminatorInst>(I) && + I->hasOneUse() && SomeOtherThing(I)) { + ... some long code .... + } + + return 0; + } + +This code has several problems if the body of the ``'if'`` is large. When +you're looking at the top of the function, it isn't immediately clear that this +*only* does interesting things with non-terminator instructions, and only +applies to things with the other predicates. 
Second, it is relatively difficult +to describe (in comments) why these predicates are important because the ``if`` +statement makes it difficult to lay out the comments. Third, when you're deep +within the body of the code, it is indented an extra level. Finally, when +reading the top of the function, it isn't clear what the result is if the +predicate isn't true; you have to read to the end of the function to know that +it returns null. + +It is much preferred to format the code like this: + +.. code-block:: c++ + + Value *DoSomething(Instruction *I) { + // Terminators never need 'something' done to them because ... + if (isa<TerminatorInst>(I)) + return 0; + + // We conservatively avoid transforming instructions with multiple uses + // because goats like cheese. + if (!I->hasOneUse()) + return 0; + + // This is really just here for example. + if (!SomeOtherThing(I)) + return 0; + + ... some long code .... + } + +This fixes these problems. A similar problem frequently happens in ``for`` +loops. A silly example is something like this: + +.. code-block:: c++ + + for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) { + if (BinaryOperator *BO = dyn_cast<BinaryOperator>(II)) { + Value *LHS = BO->getOperand(0); + Value *RHS = BO->getOperand(1); + if (LHS != RHS) { + ... + } + } + } + +When you have very, very small loops, this sort of structure is fine. But if it +exceeds 10-15 lines, it becomes difficult for people to read and +understand at a glance. The problem with this sort of code is that it gets very +nested very quickly, meaning that the reader of the code has to keep a lot of +context in their brain to remember what is immediately going on in the loop, +because they don't know if/when the ``if`` conditions will have ``else``\s etc. +It is strongly preferred to structure the loop like this: + +.. code-block:: c++ + + for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) { + BinaryOperator *BO = dyn_cast<BinaryOperator>(II); + if (!BO) continue; + + Value *LHS = BO->getOperand(0); + Value *RHS = BO->getOperand(1); + if (LHS == RHS) continue; + + ... + } + +This has all the benefits of using early exits for functions: it reduces nesting +of the loop, it makes it easier to describe why the conditions are true, and it +makes it obvious to the reader that there is no ``else`` coming up that they +have to push context into their brain for. If a loop is large, this can be a +big understandability win. + +Don't use ``else`` after a ``return`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +For reasons similar to those above (reduction of indentation and easier reading), please +do not use ``'else'`` or ``'else if'`` after something that interrupts control +flow --- like ``return``, ``break``, ``continue``, ``goto``, etc. For +example, this is *bad*: + +.. code-block:: c++ + + case 'J': { + if (Signed) { + Type = Context.getsigjmp_bufType(); + if (Type.isNull()) { + Error = ASTContext::GE_Missing_sigjmp_buf; + return QualType(); + } else { + break; + } + } else { + Type = Context.getjmp_bufType(); + if (Type.isNull()) { + Error = ASTContext::GE_Missing_jmp_buf; + return QualType(); + } else { + break; + } + } + } + +It is better to write it like this: + +.. 
code-block:: c++ + + case 'J': + if (Signed) { + Type = Context.getsigjmp_bufType(); + if (Type.isNull()) { + Error = ASTContext::GE_Missing_sigjmp_buf; + return QualType(); + } + } else { + Type = Context.getjmp_bufType(); + if (Type.isNull()) { + Error = ASTContext::GE_Missing_jmp_buf; + return QualType(); + } + } + break; + +Or better yet (in this case) as: + +.. code-block:: c++ + + case 'J': + if (Signed) + Type = Context.getsigjmp_bufType(); + else + Type = Context.getjmp_bufType(); + + if (Type.isNull()) { + Error = Signed ? ASTContext::GE_Missing_sigjmp_buf : + ASTContext::GE_Missing_jmp_buf; + return QualType(); + } + break; + +The idea is to reduce indentation and the amount of code you have to keep track +of when reading the code. + +Turn Predicate Loops into Predicate Functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +It is very common to write small loops that just compute a boolean value. There +are a number of ways that people commonly write these, but an example of this +sort of thing is: + +.. code-block:: c++ + + bool FoundFoo = false; + for (unsigned i = 0, e = BarList.size(); i != e; ++i) + if (BarList[i]->isFoo()) { + FoundFoo = true; + break; + } + + if (FoundFoo) { + ... + } + +This sort of code is awkward to write, and is almost always a bad sign. Instead +of this sort of loop, we strongly prefer to use a predicate function (which may +be `static`_) that uses `early exits`_ to compute the predicate. We prefer the +code to be structured like this: + +.. code-block:: c++ + + /// ListContainsFoo - Return true if the specified list has an element that is + /// a foo. + static bool ListContainsFoo(const std::vector<Bar*> &List) { + for (unsigned i = 0, e = List.size(); i != e; ++i) + if (List[i]->isFoo()) + return true; + return false; + } + ... + + if (ListContainsFoo(BarList)) { + ... + } + +There are many reasons for doing this: it reduces indentation and factors out +code which can often be shared by other code that checks for the same predicate. +More importantly, it *forces you to pick a name* for the function, and forces +you to write a comment for it. In this silly example, this doesn't add much +value. However, if the condition is complex, this can make it a lot easier for +the reader to understand the code that queries for this predicate. Instead of +being faced with the in-line details of how we check to see if the BarList +contains a foo, we can trust the function name and continue reading with better +locality. + +The Low-Level Issues +-------------------- + +Name Types, Functions, Variables, and Enumerators Properly +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Poorly-chosen names can mislead the reader and cause bugs. We cannot stress +enough how important it is to use *descriptive* names. Pick names that match +the semantics and role of the underlying entities, within reason. Avoid +abbreviations unless they are well known. After picking a good name, make sure +to use consistent capitalization for the name, as inconsistency requires clients +to either memorize the APIs or to look it up to find the exact spelling. + +In general, names should be in camel case (e.g. ``TextFileReader`` and +``isLValue()``). Different kinds of declarations have different rules: + +* **Type names** (including classes, structs, enums, typedefs, etc) should be + nouns and start with an upper-case letter (e.g. ``TextFileReader``). + +* **Variable names** should be nouns (as they represent state). 
The name should + be camel case, and start with an upper case letter (e.g. ``Leader`` or + ``Boats``). + +* **Function names** should be verb phrases (as they represent actions), and + command-like functions should be imperative. The name should be camel case, + and start with a lower case letter (e.g. ``openFile()`` or ``isFoo()``). + +* **Enum declarations** (e.g. ``enum Foo {...}``) are types, so they should + follow the naming conventions for types. A common use for enums is as a + discriminator for a union, or an indicator of a subclass. When an enum is + used for something like this, it should have a ``Kind`` suffix + (e.g. ``ValueKind``). + +* **Enumerators** (e.g. ``enum { Foo, Bar }``) and **public member variables** + should start with an upper-case letter, just like types. Unless the + enumerators are defined in their own small namespace or inside a class, + enumerators should have a prefix corresponding to the enum declaration name. + For example, ``enum ValueKind { ... };`` may contain enumerators like + ``VK_Argument``, ``VK_BasicBlock``, etc. Enumerators that are just + convenience constants are exempt from the requirement for a prefix. For + instance: + + .. code-block:: c++ + + enum { + MaxSize = 42, + Density = 12 + }; + +As an exception, classes that mimic STL classes can have member names in STL's +style of lower-case words separated by underscores (e.g. ``begin()``, +``push_back()``, and ``empty()``). + +Here are some examples of good and bad names: + +.. code-block:: c++ + + class VehicleMaker { + ... + Factory<Tire> F; // Bad -- abbreviation and non-descriptive. + Factory<Tire> Factory; // Better. + Factory<Tire> TireFactory; // Even better -- if VehicleMaker has more than one + // kind of factory. + }; + + Vehicle MakeVehicle(VehicleType Type) { + VehicleMaker M; // Might be OK if it has a short life-span. + Tire tmp1 = M.makeTire(); // Bad -- 'tmp1' provides no information. + Light headlight = M.makeLight("head"); // Good -- descriptive. + ... + } + +Assert Liberally +^^^^^^^^^^^^^^^^ + +Use the "``assert``" macro to its fullest. Check all of your preconditions and +assumptions; you never know when a bug (not necessarily even yours) might be +caught early by an assertion, which reduces debugging time dramatically. The +"``<cassert>``" header file is probably already included by the header files you +are using, so it doesn't cost anything to use it. + +To further assist with debugging, make sure to put some kind of error message in +the assertion statement, which is printed if the assertion is tripped. This +helps the poor debugger make sense of why an assertion is being made and +enforced, and hopefully what to do about it. Here is one complete example: + +.. code-block:: c++ + + inline Value *getOperand(unsigned i) { + assert(i < Operands.size() && "getOperand() out of range!"); + return Operands[i]; + } + +Here are more examples: + +.. code-block:: c++ + + assert(Ty->isPointerType() && "Can't allocate a non pointer type!"); + + assert((Opcode == Shl || Opcode == Shr) && "ShiftInst Opcode invalid!"); + + assert(idx < getNumSuccessors() && "Successor # out of range!"); + + assert(V1.getType() == V2.getType() && "Constant types must be identical!"); + + assert(isa<PHINode>(Succ->front()) && "Only works on PHId BBs!"); + +You get the idea. + +Please be aware that, when adding assert statements, not all compilers are aware +of the semantics of the assert. In some places, asserts are used to indicate a +piece of code that should not be reached. 
These are typically of the form: + +.. code-block:: c++ + + assert(0 && "Some helpful error message"); + +When used in a function that returns a value, they should be followed with a +return statement and a comment indicating that this line is never reached. This +will prevent a compiler which is unable to deduce that the assert statement +never returns from generating a warning. + +.. code-block:: c++ + + assert(0 && "Some helpful error message"); + return 0; + +Another issue is that values used only by assertions will produce an "unused +value" warning when assertions are disabled. For example, this code will warn: + +.. code-block:: c++ + + unsigned Size = V.size(); + assert(Size > 42 && "Vector smaller than it should be"); + + bool NewToSet = Myset.insert(Value); + assert(NewToSet && "The value shouldn't be in the set yet"); + +These are two interesting different cases. In the first case, the call to +``V.size()`` is only useful for the assert, and we don't want it executed when +assertions are disabled. Code like this should move the call into the assert +itself. In the second case, the side effects of the call must happen whether +the assert is enabled or not. In this case, the value should be cast to void to +disable the warning. To be specific, it is preferred to write the code like +this: + +.. code-block:: c++ + + assert(V.size() > 42 && "Vector smaller than it should be"); + + bool NewToSet = Myset.insert(Value); (void)NewToSet; + assert(NewToSet && "The value shouldn't be in the set yet"); + +Do Not Use ``using namespace std`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In LLVM, we prefer to explicitly prefix all identifiers from the standard +namespace with an "``std::``" prefix, rather than rely on "``using namespace +std;``". + +In header files, adding a ``'using namespace XXX'`` directive pollutes the +namespace of any source file that ``#include``\s the header. This is clearly a +bad thing. + +In implementation files (e.g. ``.cpp`` files), the rule is more of a stylistic +rule, but is still important. Basically, using explicit namespace prefixes +makes the code **clearer**, because it is immediately obvious what facilities +are being used and where they are coming from. And **more portable**, because +namespace clashes cannot occur between LLVM code and other namespaces. The +portability rule is important because different standard library implementations +expose different symbols (potentially ones they shouldn't), and future revisions +to the C++ standard will add more symbols to the ``std`` namespace. As such, we +never use ``'using namespace std;'`` in LLVM. + +The exception to the general rule (i.e. it's not an exception for the ``std`` +namespace) is for implementation files. For example, all of the code in the +LLVM project implements code that lives in the 'llvm' namespace. As such, it is +ok, and actually clearer, for the ``.cpp`` files to have a ``'using namespace +llvm;'`` directive at the top, after the ``#include``\s. This reduces +indentation in the body of the file for source editors that indent based on +braces, and keeps the conceptual context cleaner. The general form of this rule +is that any ``.cpp`` file that implements code in any namespace may use that +namespace (and its parents'), but should not use any others. 
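+
+As a hedged illustration of this rule (the file and header names below are
+hypothetical), an implementation file of code in the ``llvm`` namespace might
+begin like this, while standard library names keep their explicit ``std::``
+prefix:
+
+.. code-block:: c++
+
+  // WidgetLowering.cpp - hypothetical implementation file.
+  #include "llvm/Widget/WidgetLowering.h"  // This module's own header, first.
+  #include <string>
+  #include <vector>
+
+  using namespace llvm;  // OK: this file implements code in the 'llvm' namespace.
+
+  // Standard library identifiers are still written with the std:: prefix.
+  static std::vector<std::string> collectWidgetNames() {
+    return std::vector<std::string>();
+  }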
+ +Provide a Virtual Method Anchor for Classes in Headers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If a class is defined in a header file and has a vtable (either it has virtual +methods or it derives from classes with virtual methods), it must always have at +least one out-of-line virtual method in the class. Without this, the compiler +will copy the vtable and RTTI into every ``.o`` file that ``#include``\s the +header, bloating ``.o`` file sizes and increasing link times. + +Don't evaluate ``end()`` every time through a loop +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Because C++ doesn't have a standard "``foreach``" loop (though it can be +emulated with macros and may be coming in C++'0x) we end up writing a lot of +loops that manually iterate from begin to end on a variety of containers or +through other data structures. One common mistake is to write a loop in this +style: + +.. code-block:: c++ + + BasicBlock *BB = ... + for (BasicBlock::iterator I = BB->begin(); I != BB->end(); ++I) + ... use I ... + +The problem with this construct is that it evaluates "``BB->end()``" every time +through the loop. Instead of writing the loop like this, we strongly prefer +loops to be written so that they evaluate it once before the loop starts. A +convenient way to do this is like so: + +.. code-block:: c++ + + BasicBlock *BB = ... + for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I) + ... use I ... + +The observant may quickly point out that these two loops may have different +semantics: if the container (a basic block in this case) is being mutated, then +"``BB->end()``" may change its value every time through the loop and the second +loop may not in fact be correct. If you actually do depend on this behavior, +please write the loop in the first form and add a comment indicating that you +did it intentionally. + +Why do we prefer the second form (when correct)? Writing the loop in the first +form has two problems. First it may be less efficient than evaluating it at the +start of the loop. In this case, the cost is probably minor --- a few extra +loads every time through the loop. However, if the base expression is more +complex, then the cost can rise quickly. I've seen loops where the end +expression was actually something like: "``SomeMap[x]->end()``" and map lookups +really aren't cheap. By writing it in the second form consistently, you +eliminate the issue entirely and don't even have to think about it. + +The second (even bigger) issue is that writing the loop in the first form hints +to the reader that the loop is mutating the container (a fact that a comment +would handily confirm!). If you write the loop in the second form, it is +immediately obvious without even looking at the body of the loop that the +container isn't being modified, which makes it easier to read the code and +understand what it does. + +While the second form of the loop is a few extra keystrokes, we do strongly +prefer it. + +``#include <iostream>`` is Forbidden +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The use of ``#include <iostream>`` in library files is hereby **forbidden**, +because many common implementations transparently inject a `static constructor`_ +into every translation unit that includes it. + +Note that using the other stream headers (``<sstream>`` for example) is not +problematic in this regard --- just ``<iostream>``. However, ``raw_ostream`` +provides various APIs that are better performing for almost every use than +``std::ostream`` style APIs. + +.. 
note:: + + New code should always use `raw_ostream`_ for writing, or the + ``llvm::MemoryBuffer`` API for reading files. + +.. _raw_ostream: + +Use ``raw_ostream`` +^^^^^^^^^^^^^^^^^^^ + +LLVM includes a lightweight, simple, and efficient stream implementation in +``llvm/Support/raw_ostream.h``, which provides all of the common features of +``std::ostream``. All new code should use ``raw_ostream`` instead of +``ostream``. + +Unlike ``std::ostream``, ``raw_ostream`` is not a template and can be forward +declared as ``class raw_ostream``. Public headers should generally not include +the ``raw_ostream`` header, but use forward declarations and constant references +to ``raw_ostream`` instances. + +Avoid ``std::endl`` +^^^^^^^^^^^^^^^^^^^ + +The ``std::endl`` modifier, when used with ``iostreams``, outputs a newline to +the output stream specified. In addition to doing this, however, it also +flushes the output stream. In other words, these are equivalent: + +.. code-block:: c++ + + std::cout << std::endl; + std::cout << '\n' << std::flush; + +Most of the time, you probably have no reason to flush the output stream, so +it's better to use a literal ``'\n'``. + +Microscopic Details +------------------- + +This section describes preferred low-level formatting guidelines along with +reasoning on why we prefer them. + +Spaces Before Parentheses +^^^^^^^^^^^^^^^^^^^^^^^^^ + +We prefer to put a space before an open parenthesis only in control flow +statements, but not in normal function call expressions and function-like +macros. For example, this is good: + +.. code-block:: c++ + + if (x) ... + for (i = 0; i != 100; ++i) ... + while (llvm_rocks) ... + + somefunc(42); + assert(3 != 4 && "laws of math are failing me"); + + a = foo(42, 92) + bar(x); + +and this is bad: + +.. code-block:: c++ + + if(x) ... + for(i = 0; i != 100; ++i) ... + while(llvm_rocks) ... + + somefunc (42); + assert (3 != 4 && "laws of math are failing me"); + + a = foo (42, 92) + bar (x); + +The reason for doing this is not completely arbitrary. This style makes control +flow operators stand out more, and makes expressions flow better. The function +call operator binds very tightly as a postfix operator. Putting a space after a +function name (as in the last example) makes it appear that the code might bind +the arguments of the left-hand-side of a binary operator with the argument list +of a function and the name of the right side. More specifically, it is easy to +misread the "``a``" example as: + +.. code-block:: c++ + + a = foo ((42, 92) + bar) (x); + +when skimming through the code. By avoiding a space after the function name, we avoid +this misinterpretation. + +Prefer Preincrement +^^^^^^^^^^^^^^^^^^^ + +Hard and fast rule: Preincrement (``++X``) may be no slower than postincrement +(``X++``) and could very well be a lot faster than it. Use preincrementation +whenever possible. + +The semantics of postincrement include making a copy of the value being +incremented, returning it, and then preincrementing the "work value". For +primitive types, this isn't a big deal. But for iterators, it can be a huge +issue (for example, some iterators contain stack and set objects in them... +copying an iterator could invoke the copy constructors of these as well). In general, +get in the habit of always using preincrement, and you won't have a problem. + + +Namespace Indentation +^^^^^^^^^^^^^^^^^^^^^ + +In general, we strive to reduce indentation wherever possible. 
This is useful +because we want code to `fit into 80 columns`_ without wrapping horribly, but +also because it makes it easier to understand the code. Namespaces are a funny +thing: they are often large, and we often desire to put lots of stuff into them +(so they can be large). Other times they are tiny, because they just hold an +enum or something similar. In order to balance this, we use different +approaches for small versus large namespaces. + +If a namespace definition is small and *easily* fits on a screen (say, less than +35 lines of code), then you should indent its body. Here's an example: + +.. code-block:: c++ + + namespace llvm { + namespace X86 { + /// RelocationType - An enum for the x86 relocation codes. Note that + /// the terminology here doesn't follow x86 convention - word means + /// 32-bit and dword means 64-bit. + enum RelocationType { + /// reloc_pcrel_word - PC relative relocation, add the relocated value to + /// the value already in memory, after we adjust it for where the PC is. + reloc_pcrel_word = 0, + + /// reloc_picrel_word - PIC base relative relocation, add the relocated + /// value to the value already in memory, after we adjust it for where the + /// PIC base is. + reloc_picrel_word = 1, + + /// reloc_absolute_word, reloc_absolute_dword - Absolute relocation, just + /// add the relocated value to the value already in memory. + reloc_absolute_word = 2, + reloc_absolute_dword = 3 + }; + } + } + +Since the body is small, indenting adds value because it makes it very clear +where the namespace starts and ends, and it is easy to take the whole thing in +in one "gulp" when reading the code. If the blob of code in the namespace is +larger (as it typically is in a header in the ``llvm`` or ``clang`` namespaces), +do not indent the code, and add a comment indicating what namespace is being +closed. For example: + +.. code-block:: c++ + + namespace llvm { + namespace knowledge { + + /// Grokable - This class represents things that Smith can have an intimate + /// understanding of and contains the data associated with it. + class Grokable { + ... + public: + explicit Grokable() { ... } + virtual ~Grokable() = 0; + + ... + + }; + + } // end namespace knowledge + } // end namespace llvm + +Because the class is large, we don't expect that the reader can easily +understand the entire concept in a glance, and the end of the file (where the +namespaces end) may be a long ways away from the place they open. As such, +indenting the contents of the namespace doesn't add any value, and detracts from +the readability of the class. In these cases it is best to *not* indent the +contents of the namespace. + +.. _static: + +Anonymous Namespaces +^^^^^^^^^^^^^^^^^^^^ + +After talking about namespaces in general, you may be wondering about anonymous +namespaces in particular. Anonymous namespaces are a great language feature +that tells the C++ compiler that the contents of the namespace are only visible +within the current translation unit, allowing more aggressive optimization and +eliminating the possibility of symbol name collisions. Anonymous namespaces are +to C++ as "static" is to C functions and global variables. While "``static``" +is available in C++, anonymous namespaces are more general: they can make entire +classes private to a file. 
+ +The problem with anonymous namespaces is that they naturally want to encourage +indentation of their body, and they reduce locality of reference: if you see a +random function definition in a C++ file, it is easy to see if it is marked +static, but seeing if it is in an anonymous namespace requires scanning a big +chunk of the file. + +Because of this, we have a simple guideline: make anonymous namespaces as small +as possible, and only use them for class declarations. For example, this is +good: + +.. code-block:: c++ + + namespace { + class StringSort { + ... + public: + StringSort(...) + bool operator<(const char *RHS) const; + }; + } // end anonymous namespace + + static void Helper() { + ... + } + + bool StringSort::operator<(const char *RHS) const { + ... + } + +This is bad: + +.. code-block:: c++ + + namespace { + class StringSort { + ... + public: + StringSort(...) + bool operator<(const char *RHS) const; + }; + + void Helper() { + ... + } + + bool StringSort::operator<(const char *RHS) const { + ... + } + + } // end anonymous namespace + +This is bad specifically because if you're looking at "``Helper``" in the middle +of a large C++ file, that you have no immediate way to tell if it is local to +the file. When it is marked static explicitly, this is immediately obvious. +Also, there is no reason to enclose the definition of "``operator<``" in the +namespace just because it was declared there. + +See Also +======== + +A lot of these comments and recommendations have been culled for other sources. +Two particularly important books for our work are: + +#. `Effective C++ + <http://www.amazon.com/Effective-Specific-Addison-Wesley-Professional-Computing/dp/0321334876>`_ + by Scott Meyers. Also interesting and useful are "More Effective C++" and + "Effective STL" by the same author. + +#. `Large-Scale C++ Software Design + <http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620/ref=sr_1_1>`_ + by John Lakos + +If you get some free time, and you haven't read them: do so, you might learn +something. diff --git a/docs/CommandGuide/FileCheck.pod b/docs/CommandGuide/FileCheck.rst index 2662cc0..51a9bf6 100644 --- a/docs/CommandGuide/FileCheck.pod +++ b/docs/CommandGuide/FileCheck.rst @@ -1,67 +1,87 @@ +FileCheck - Flexible pattern matching file verifier +=================================================== -=pod -=head1 NAME +SYNOPSIS +-------- -FileCheck - Flexible pattern matching file verifier -=head1 SYNOPSIS +**FileCheck** *match-filename* [*--check-prefix=XXX*] [*--strict-whitespace*] + -B<FileCheck> I<match-filename> [I<--check-prefix=XXX>] [I<--strict-whitespace>] +DESCRIPTION +----------- -=head1 DESCRIPTION -B<FileCheck> reads two files (one from standard input, and one specified on the +**FileCheck** reads two files (one from standard input, and one specified on the command line) and uses one to verify the other. This behavior is particularly useful for the testsuite, which wants to verify that the output of some tool (e.g. llc) contains the expected information (for example, a movsd from esp or whatever is interesting). This is similar to using grep, but it is optimized for matching multiple different inputs in one file in a specific order. -The I<match-filename> file specifies the file that contains the patterns to +The *match-filename* file specifies the file that contains the patterns to match. The file to verify is always read from standard input. -=head1 OPTIONS -=over +OPTIONS +------- + + + +**-help** + + Print a summary of command line options. 
+ + + +**--check-prefix** *prefix* + + FileCheck searches the contents of *match-filename* for patterns to match. By + default, these patterns are prefixed with "CHECK:". If you'd like to use a + different prefix (e.g. because the same input file is checking multiple + different tool or options), the **--check-prefix** argument allows you to specify + a specific prefix to match. + -=item B<-help> -Print a summary of command line options. +**--strict-whitespace** -=item B<--check-prefix> I<prefix> + By default, FileCheck canonicalizes input horizontal whitespace (spaces and + tabs) which causes it to ignore these differences (a space will match a tab). + The --strict-whitespace argument disables this behavior. -FileCheck searches the contents of I<match-filename> for patterns to match. By -default, these patterns are prefixed with "CHECK:". If you'd like to use a -different prefix (e.g. because the same input file is checking multiple -different tool or options), the B<--check-prefix> argument allows you to specify -a specific prefix to match. -=item B<--strict-whitespace> -By default, FileCheck canonicalizes input horizontal whitespace (spaces and -tabs) which causes it to ignore these differences (a space will match a tab). -The --strict-whitespace argument disables this behavior. +**-version** -=item B<-version> + Show the version number of this program. -Show the version number of this program. -=back -=head1 EXIT STATUS -If B<FileCheck> verifies that the file matches the expected contents, it exits +EXIT STATUS +----------- + + +If **FileCheck** verifies that the file matches the expected contents, it exits with 0. Otherwise, if not, or if an error occurs, it will exit with a non-zero value. -=head1 TUTORIAL + +TUTORIAL +-------- + FileCheck is typically used from LLVM regression tests, being invoked on the RUN line of the test. A simple example of using FileCheck from a RUN line looks like this: - ; RUN: llvm-as < %s | llc -march=x86-64 | FileCheck %s + +.. code-block:: llvm + + ; RUN: llvm-as < %s | llc -march=x86-64 | FileCheck %s + This syntax says to pipe the current file ("%s") into llvm-as, pipe that into llc, then pipe the output of llc into FileCheck. This means that FileCheck will @@ -69,21 +89,25 @@ be verifying its standard input (the llc output) against the filename argument specified (the original .ll file specified by "%s"). To see how this works, let's look at the rest of the .ll file (after the RUN line): - define void @sub1(i32* %p, i32 %v) { - entry: - ; CHECK: sub1: - ; CHECK: subl - %0 = tail call i32 @llvm.atomic.load.sub.i32.p0i32(i32* %p, i32 %v) - ret void - } - - define void @inc4(i64* %p) { - entry: - ; CHECK: inc4: - ; CHECK: incq - %0 = tail call i64 @llvm.atomic.load.add.i64.p0i64(i64* %p, i64 1) - ret void - } + +.. code-block:: llvm + + define void @sub1(i32* %p, i32 %v) { + entry: + ; CHECK: sub1: + ; CHECK: subl + %0 = tail call i32 @llvm.atomic.load.sub.i32.p0i32(i32* %p, i32 %v) + ret void + } + + define void @inc4(i64* %p) { + entry: + ; CHECK: inc4: + ; CHECK: incq + %0 = tail call i64 @llvm.atomic.load.add.i64.p0i64(i64* %p, i64 1) + ret void + } + Here you can see some "CHECK:" lines specified in comments. Now you can see how the file is piped into llvm-as, then llc, and the machine code output is @@ -102,35 +126,40 @@ is a "subl" in between those labels. If it existed somewhere else in the file, that would not count: "grep subl" matches if subl exists anywhere in the file. 
+The FileCheck -check-prefix option +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -=head2 The FileCheck -check-prefix option - The FileCheck -check-prefix option allows multiple test configurations to be driven from one .ll file. This is useful in many circumstances, for example, testing different architectural variants with llc. Here's a simple example: - ; RUN: llvm-as < %s | llc -mtriple=i686-apple-darwin9 -mattr=sse41 \ - ; RUN: | FileCheck %s -check-prefix=X32> - ; RUN: llvm-as < %s | llc -mtriple=x86_64-apple-darwin9 -mattr=sse41 \ - ; RUN: | FileCheck %s -check-prefix=X64> - - define <4 x i32> @pinsrd_1(i32 %s, <4 x i32> %tmp) nounwind { - %tmp1 = insertelement <4 x i32>; %tmp, i32 %s, i32 1 - ret <4 x i32> %tmp1 - ; X32: pinsrd_1: - ; X32: pinsrd $1, 4(%esp), %xmm0 - - ; X64: pinsrd_1: - ; X64: pinsrd $1, %edi, %xmm0 - } + +.. code-block:: llvm + + ; RUN: llvm-as < %s | llc -mtriple=i686-apple-darwin9 -mattr=sse41 \ + ; RUN: | FileCheck %s -check-prefix=X32 + ; RUN: llvm-as < %s | llc -mtriple=x86_64-apple-darwin9 -mattr=sse41 \ + ; RUN: | FileCheck %s -check-prefix=X64 + + define <4 x i32> @pinsrd_1(i32 %s, <4 x i32> %tmp) nounwind { + %tmp1 = insertelement <4 x i32> %tmp, i32 %s, i32 1 + ret <4 x i32> %tmp1 + ; X32: pinsrd_1: + ; X32: pinsrd $1, 4(%esp), %xmm0 + + ; X64: pinsrd_1: + ; X64: pinsrd $1, %edi, %xmm0 + } + In this case, we're testing that we get the expected code generation with both 32-bit and 64-bit code generation. +The "CHECK-NEXT:" directive +~~~~~~~~~~~~~~~~~~~~~~~~~~~ -=head2 The "CHECK-NEXT:" directive Sometimes you want to match lines and would like to verify that matches happen on exactly consecutive lines with no other lines in between them. In @@ -138,64 +167,78 @@ this case, you can use CHECK: and CHECK-NEXT: directives to specify this. If you specified a custom check prefix, just use "<PREFIX>-NEXT:". For example, something like this works as you'd expect: - define void @t2(<2 x double>* %r, <2 x double>* %A, double %B) { - %tmp3 = load <2 x double>* %A, align 16 - %tmp7 = insertelement <2 x double> undef, double %B, i32 0 - %tmp9 = shufflevector <2 x double> %tmp3, - <2 x double> %tmp7, - <2 x i32> < i32 0, i32 2 > - store <2 x double> %tmp9, <2 x double>* %r, align 16 - ret void - - ; CHECK: t2: - ; CHECK: movl 8(%esp), %eax - ; CHECK-NEXT: movapd (%eax), %xmm0 - ; CHECK-NEXT: movhpd 12(%esp), %xmm0 - ; CHECK-NEXT: movl 4(%esp), %eax - ; CHECK-NEXT: movapd %xmm0, (%eax) - ; CHECK-NEXT: ret - } + +.. code-block:: llvm + + define void @t2(<2 x double>* %r, <2 x double>* %A, double %B) { + %tmp3 = load <2 x double>* %A, align 16 + %tmp7 = insertelement <2 x double> undef, double %B, i32 0 + %tmp9 = shufflevector <2 x double> %tmp3, + <2 x double> %tmp7, + <2 x i32> < i32 0, i32 2 > + store <2 x double> %tmp9, <2 x double>* %r, align 16 + ret void + + ; CHECK: t2: + ; CHECK: movl 8(%esp), %eax + ; CHECK-NEXT: movapd (%eax), %xmm0 + ; CHECK-NEXT: movhpd 12(%esp), %xmm0 + ; CHECK-NEXT: movl 4(%esp), %eax + ; CHECK-NEXT: movapd %xmm0, (%eax) + ; CHECK-NEXT: ret + } + A CHECK-NEXT: directive rejects the input unless there is exactly one newline between it and the previous directive. A CHECK-NEXT cannot be the first directive in a file. +The "CHECK-NOT:" directive +~~~~~~~~~~~~~~~~~~~~~~~~~~ -=head2 The "CHECK-NOT:" directive The CHECK-NOT: directive is used to verify that a string doesn't occur between two matches (or before the first match, or after the last match). 
For example, to verify that a load is removed by a transformation, a test like this can be used: - define i8 @coerce_offset0(i32 %V, i32* %P) { - store i32 %V, i32* %P - - %P2 = bitcast i32* %P to i8* - %P3 = getelementptr i8* %P2, i32 2 - %A = load i8* %P3 - ret i8 %A - ; CHECK: @coerce_offset0 - ; CHECK-NOT: load - ; CHECK: ret i8 - } +.. code-block:: llvm + + define i8 @coerce_offset0(i32 %V, i32* %P) { + store i32 %V, i32* %P + + %P2 = bitcast i32* %P to i8* + %P3 = getelementptr i8* %P2, i32 2 + + %A = load i8* %P3 + ret i8 %A + ; CHECK: @coerce_offset0 + ; CHECK-NOT: load + ; CHECK: ret i8 + } -=head2 FileCheck Pattern Matching Syntax +FileCheck Pattern Matching Syntax +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + The CHECK: and CHECK-NOT: directives both take a pattern to match. For most uses of FileCheck, fixed string matching is perfectly sufficient. For some things, a more flexible form of matching is desired. To support this, FileCheck allows you to specify regular expressions in matching strings, surrounded by -double braces: B<{{yourregex}}>. Because we want to use fixed string +double braces: **{{yourregex}}**. Because we want to use fixed string matching for a majority of what we do, FileCheck has been designed to support mixing and matching fixed string matching with regular expressions. This allows you to write things like this: - ; CHECK: movhpd {{[0-9]+}}(%esp), {{%xmm[0-7]}} + +.. code-block:: llvm + + ; CHECK: movhpd {{[0-9]+}}(%esp), {{%xmm[0-7]}} + In this case, any offset from the ESP register will be allowed, and any xmm register will be allowed. @@ -204,11 +247,12 @@ Because regular expressions are enclosed with double braces, they are visually distinct, and you don't need to use escape characters within the double braces like you would in C. In the rare case that you want to match double braces explicitly from the input, you can use something ugly like -B<{{[{][{]}}> as your pattern. +**{{[{][{]}}** as your pattern. +FileCheck Variables +~~~~~~~~~~~~~~~~~~~ -=head2 FileCheck Variables It is often useful to match a pattern and then verify that it occurs again later in the file. For codegen tests, this can be useful to allow any register, @@ -216,30 +260,25 @@ but verify that that register is used consistently later. To do this, FileCheck allows named variables to be defined and substituted into patterns. Here is a simple example: - ; CHECK: test5: - ; CHECK: notw [[REGISTER:%[a-z]+]] - ; CHECK: andw {{.*}}[REGISTER]] -The first check line matches a regex (B<%[a-z]+>) and captures it into +.. code-block:: llvm + + ; CHECK: test5: + ; CHECK: notw [[REGISTER:%[a-z]+]] + ; CHECK: andw {{.*}}[[REGISTER]] + + +The first check line matches a regex (**%[a-z]+**) and captures it into the variable "REGISTER". The second line verifies that whatever is in REGISTER occurs later in the file after an "andw". FileCheck variable references are -always contained in B<[[ ]]> pairs, are named, and their names can be -formed with the regex "B<[a-zA-Z_][a-zA-Z0-9_]*>". If a colon follows the +always contained in **[[ ]]** pairs, are named, and their names can be name, then it is a definition of the variable, if not, it is a use. FileCheck variables can be defined multiple times, and uses always get the latest value. Note that variables are all read at the start of a "CHECK" line and are all defined at the end. 
This means that if you have something like -"B<CHECK: [[XYZ:.*]]x[[XYZ]]>", the check line will read the previous +"**CHECK: [[XYZ:.\\*]]x[[XYZ]]**", the check line will read the previous value of the XYZ variable and define a new one after the match is performed. If you need to do something like this you can probably take advantage of the fact that FileCheck is not actually line-oriented when it matches, this allows you to define two separate CHECK lines that match on the same line. - - - -=head1 AUTHORS - -Maintained by The LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/Makefile b/docs/CommandGuide/Makefile deleted file mode 100644 index 3f9f60b..0000000 --- a/docs/CommandGuide/Makefile +++ /dev/null @@ -1,103 +0,0 @@ -##===- docs/CommandGuide/Makefile --------------------------*- Makefile -*-===## -# -# The LLVM Compiler Infrastructure -# -# This file is distributed under the University of Illinois Open Source -# License. See LICENSE.TXT for details. -# -##===----------------------------------------------------------------------===## - -ifdef BUILD_FOR_WEBSITE -# This special case is for keeping the CommandGuide on the LLVM web site -# up to date automatically as the documents are checked in. It must build -# the POD files to HTML only and keep them in the src directories. It must also -# build in an unconfigured tree, hence the ifdef. To use this, run -# make -s BUILD_FOR_WEBSITE=1 inside the cvs commit script. -SRC_DOC_DIR= -DST_HTML_DIR=html/ -DST_MAN_DIR=man/man1/ -DST_PS_DIR=ps/ - -# If we are in BUILD_FOR_WEBSITE mode, default to the all target. -all:: html man ps - -clean: - rm -f pod2htm*.*~~ $(HTML) $(MAN) $(PS) - -# To create other directories, as needed, and timestamp their creation -%/.dir: - -mkdir $* > /dev/null - date > $@ - -else - -# Otherwise, if not in BUILD_FOR_WEBSITE mode, use the project info. -LEVEL := ../.. -include $(LEVEL)/Makefile.common - -SRC_DOC_DIR=$(PROJ_SRC_DIR)/ -DST_HTML_DIR=$(PROJ_OBJ_DIR)/ -DST_MAN_DIR=$(PROJ_OBJ_DIR)/ -DST_PS_DIR=$(PROJ_OBJ_DIR)/ - -endif - - -POD := $(wildcard $(SRC_DOC_DIR)*.pod) -HTML := $(patsubst $(SRC_DOC_DIR)%.pod, $(DST_HTML_DIR)%.html, $(POD)) -MAN := $(patsubst $(SRC_DOC_DIR)%.pod, $(DST_MAN_DIR)%.1, $(POD)) -PS := $(patsubst $(SRC_DOC_DIR)%.pod, $(DST_PS_DIR)%.ps, $(POD)) - -# The set of man pages we will not install -NO_INSTALL_MANS = $(DST_MAN_DIR)FileCheck.1 $(DST_MAN_DIR)llvm-build.1 - -# The set of man pages that we will install -INSTALL_MANS = $(filter-out $(NO_INSTALL_MANS), $(MAN)) - -.SUFFIXES: -.SUFFIXES: .html .pod .1 .ps - -$(DST_HTML_DIR)%.html: %.pod $(DST_HTML_DIR)/.dir - pod2html --css=manpage.css --htmlroot=. \ - --podpath=. 
--noindex --infile=$< --outfile=$@ --title=$* - -$(DST_MAN_DIR)%.1: %.pod $(DST_MAN_DIR)/.dir - pod2man --release=CVS --center="LLVM Command Guide" $< $@ - -$(DST_PS_DIR)%.ps: $(DST_MAN_DIR)%.1 $(DST_PS_DIR)/.dir - groff -Tps -man $< > $@ - - -html: $(HTML) -man: $(MAN) -ps: $(PS) - -EXTRA_DIST := $(POD) index.html - -clean-local:: - $(Verb) $(RM) -f pod2htm*.*~~ $(HTML) $(MAN) $(PS) - -HTML_DIR := $(DESTDIR)$(PROJ_docsdir)/html/CommandGuide -MAN_DIR := $(DESTDIR)$(PROJ_mandir)/man1 -PS_DIR := $(DESTDIR)$(PROJ_docsdir)/ps - -install-local:: $(HTML) $(INSTALL_MANS) $(PS) - $(Echo) Installing HTML CommandGuide Documentation - $(Verb) $(MKDIR) $(HTML_DIR) - $(Verb) $(DataInstall) $(HTML) $(HTML_DIR) - $(Verb) $(DataInstall) $(PROJ_SRC_DIR)/index.html $(HTML_DIR) - $(Verb) $(DataInstall) $(PROJ_SRC_DIR)/manpage.css $(HTML_DIR) - $(Echo) Installing MAN CommandGuide Documentation - $(Verb) $(MKDIR) $(MAN_DIR) - $(Verb) $(DataInstall) $(INSTALL_MANS) $(MAN_DIR) - $(Echo) Installing PS CommandGuide Documentation - $(Verb) $(MKDIR) $(PS_DIR) - $(Verb) $(DataInstall) $(PS) $(PS_DIR) - -uninstall-local:: - $(Echo) Uninstalling CommandGuide Documentation - $(Verb) $(RM) -rf $(HTML_DIR) $(MAN_DIR) $(PS_DIR) - -printvars:: - $(Echo) "POD : " '$(POD)' - $(Echo) "HTML : " '$(HTML)' diff --git a/docs/CommandGuide/bugpoint.pod b/docs/CommandGuide/bugpoint.pod deleted file mode 100644 index 31db62f..0000000 --- a/docs/CommandGuide/bugpoint.pod +++ /dev/null @@ -1,186 +0,0 @@ -=pod - -=head1 NAME - -bugpoint - automatic test case reduction tool - -=head1 SYNOPSIS - -B<bugpoint> [I<options>] [I<input LLVM ll/bc files>] [I<LLVM passes>] B<--args> -I<program arguments> - -=head1 DESCRIPTION - -B<bugpoint> narrows down the source of problems in LLVM tools and passes. It -can be used to debug three types of failures: optimizer crashes, miscompilations -by optimizers, or bad native code generation (including problems in the static -and JIT compilers). It aims to reduce large test cases to small, useful ones. -For more information on the design and inner workings of B<bugpoint>, as well as -advice for using bugpoint, see F<llvm/docs/Bugpoint.html> in the LLVM -distribution. - -=head1 OPTIONS - -=over - -=item B<--additional-so> F<library> - -Load the dynamic shared object F<library> into the test program whenever it is -run. This is useful if you are debugging programs which depend on non-LLVM -libraries (such as the X or curses libraries) to run. - -=item B<--append-exit-code>=I<{true,false}> - -Append the test programs exit code to the output file so that a change in exit -code is considered a test failure. Defaults to false. - -=item B<--args> I<program args> - -Pass all arguments specified after -args to the test program whenever it runs. -Note that if any of the I<program args> start with a '-', you should use: - - bugpoint [bugpoint args] --args -- [program args] - -The "--" right after the B<--args> option tells B<bugpoint> to consider any -options starting with C<-> to be part of the B<--args> option, not as options to -B<bugpoint> itself. - -=item B<--tool-args> I<tool args> - -Pass all arguments specified after --tool-args to the LLVM tool under test -(B<llc>, B<lli>, etc.) whenever it runs. You should use this option in the -following way: - - bugpoint [bugpoint args] --tool-args -- [tool args] - -The "--" right after the B<--tool-args> option tells B<bugpoint> to consider any -options starting with C<-> to be part of the B<--tool-args> option, not as -options to B<bugpoint> itself. 
(See B<--args>, above.) - -=item B<--safe-tool-args> I<tool args> - -Pass all arguments specified after B<--safe-tool-args> to the "safe" execution -tool. - -=item B<--gcc-tool-args> I<gcc tool args> - -Pass all arguments specified after B<--gcc-tool-args> to the invocation of -B<gcc>. - -=item B<--opt-args> I<opt args> - -Pass all arguments specified after B<--opt-args> to the invocation of B<opt>. - -=item B<--disable-{dce,simplifycfg}> - -Do not run the specified passes to clean up and reduce the size of the test -program. By default, B<bugpoint> uses these passes internally when attempting to -reduce test programs. If you're trying to find a bug in one of these passes, -B<bugpoint> may crash. - -=item B<--enable-valgrind> - -Use valgrind to find faults in the optimization phase. This will allow -bugpoint to find otherwise asymptomatic problems caused by memory -mis-management. - -=item B<-find-bugs> - -Continually randomize the specified passes and run them on the test program -until a bug is found or the user kills B<bugpoint>. - -=item B<-help> - -Print a summary of command line options. - -=item B<--input> F<filename> - -Open F<filename> and redirect the standard input of the test program, whenever -it runs, to come from that file. - -=item B<--load> F<plugin> - -Load the dynamic object F<plugin> into B<bugpoint> itself. This object should -register new optimization passes. Once loaded, the object will add new command -line options to enable various optimizations. To see the new complete list of -optimizations, use the B<-help> and B<--load> options together; for example: - - bugpoint --load myNewPass.so -help - -=item B<--mlimit> F<megabytes> - -Specifies an upper limit on memory usage of the optimization and codegen. Set -to zero to disable the limit. - -=item B<--output> F<filename> - -Whenever the test program produces output on its standard output stream, it -should match the contents of F<filename> (the "reference output"). If you -do not use this option, B<bugpoint> will attempt to generate a reference output -by compiling the program with the "safe" backend and running it. - -=item B<--profile-info-file> F<filename> - -Profile file loaded by B<--profile-loader>. - -=item B<--run-{int,jit,llc,cbe,custom}> - -Whenever the test program is compiled, B<bugpoint> should generate code for it -using the specified code generator. These options allow you to choose the -interpreter, the JIT compiler, the static native code compiler, the C -backend, or a custom command (see B<--exec-command>) respectively. - -=item B<--safe-{llc,cbe,custom}> - -When debugging a code generator, B<bugpoint> should use the specified code -generator as the "safe" code generator. This is a known-good code generator -used to generate the "reference output" if it has not been provided, and to -compile portions of the program that as they are excluded from the testcase. -These options allow you to choose the -static native code compiler, the C backend, or a custom command, -(see B<--exec-command>) respectively. The interpreter and the JIT backends -cannot currently be used as the "safe" backends. - -=item B<--exec-command> I<command> - -This option defines the command to use with the B<--run-custom> and -B<--safe-custom> options to execute the bitcode testcase. This can -be useful for cross-compilation. - -=item B<--compile-command> I<command> - -This option defines the command to use with the B<--compile-custom> -option to compile the bitcode testcase. 
This can be useful for -testing compiler output without running any link or execute stages. To -generate a reduced unit test, you may add CHECK directives to the -testcase and pass the name of an executable compile-command script in this form: - - #!/bin/sh - llc "$@" - not FileCheck [bugpoint input file].ll < bugpoint-test-program.s - -This script will "fail" as long as FileCheck passes. So the result -will be the minimum bitcode that passes FileCheck. - -=item B<--safe-path> I<path> - -This option defines the path to the command to execute with the -B<--safe-{int,jit,llc,cbe,custom}> -option. - -=back - -=head1 EXIT STATUS - -If B<bugpoint> succeeds in finding a problem, it will exit with 0. Otherwise, -if an error occurs, it will exit with a non-zero value. - -=head1 SEE ALSO - -L<opt|opt> - -=head1 AUTHOR - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/bugpoint.rst b/docs/CommandGuide/bugpoint.rst new file mode 100644 index 0000000..c1b3b6e --- /dev/null +++ b/docs/CommandGuide/bugpoint.rst @@ -0,0 +1,247 @@ +bugpoint - automatic test case reduction tool +============================================= + + +SYNOPSIS +-------- + + +**bugpoint** [*options*] [*input LLVM ll/bc files*] [*LLVM passes*] **--args** +*program arguments* + + +DESCRIPTION +----------- + + +**bugpoint** narrows down the source of problems in LLVM tools and passes. It +can be used to debug three types of failures: optimizer crashes, miscompilations +by optimizers, or bad native code generation (including problems in the static +and JIT compilers). It aims to reduce large test cases to small, useful ones. +For more information on the design and inner workings of **bugpoint**, as well as +advice for using bugpoint, see *llvm/docs/Bugpoint.html* in the LLVM +distribution. + + +OPTIONS +------- + + + +**--additional-so** *library* + + Load the dynamic shared object *library* into the test program whenever it is + run. This is useful if you are debugging programs which depend on non-LLVM + libraries (such as the X or curses libraries) to run. + + + +**--append-exit-code**\ =\ *{true,false}* + + Append the test programs exit code to the output file so that a change in exit + code is considered a test failure. Defaults to false. + + + +**--args** *program args* + + Pass all arguments specified after -args to the test program whenever it runs. + Note that if any of the *program args* start with a '-', you should use: + + + .. code-block:: perl + + bugpoint [bugpoint args] --args -- [program args] + + + The "--" right after the **--args** option tells **bugpoint** to consider any + options starting with ``-`` to be part of the **--args** option, not as options to + **bugpoint** itself. + + + +**--tool-args** *tool args* + + Pass all arguments specified after --tool-args to the LLVM tool under test + (**llc**, **lli**, etc.) whenever it runs. You should use this option in the + following way: + + + .. code-block:: perl + + bugpoint [bugpoint args] --tool-args -- [tool args] + + + The "--" right after the **--tool-args** option tells **bugpoint** to consider any + options starting with ``-`` to be part of the **--tool-args** option, not as + options to **bugpoint** itself. (See **--args**, above.) + + + +**--safe-tool-args** *tool args* + + Pass all arguments specified after **--safe-tool-args** to the "safe" execution + tool. + + + +**--gcc-tool-args** *gcc tool args* + + Pass all arguments specified after **--gcc-tool-args** to the invocation of + **gcc**. 
+ + + +**--opt-args** *opt args* + + Pass all arguments specified after **--opt-args** to the invocation of **opt**. + + + +**--disable-{dce,simplifycfg}** + + Do not run the specified passes to clean up and reduce the size of the test + program. By default, **bugpoint** uses these passes internally when attempting to + reduce test programs. If you're trying to find a bug in one of these passes, + **bugpoint** may crash. + + + +**--enable-valgrind** + + Use valgrind to find faults in the optimization phase. This will allow + bugpoint to find otherwise asymptomatic problems caused by memory + mis-management. + + + +**-find-bugs** + + Continually randomize the specified passes and run them on the test program + until a bug is found or the user kills **bugpoint**. + + + +**-help** + + Print a summary of command line options. + + + +**--input** *filename* + + Open *filename* and redirect the standard input of the test program, whenever + it runs, to come from that file. + + + +**--load** *plugin* + + Load the dynamic object *plugin* into **bugpoint** itself. This object should + register new optimization passes. Once loaded, the object will add new command + line options to enable various optimizations. To see the new complete list of + optimizations, use the **-help** and **--load** options together; for example: + + + .. code-block:: perl + + bugpoint --load myNewPass.so -help + + + + +**--mlimit** *megabytes* + + Specifies an upper limit on memory usage of the optimization and codegen. Set + to zero to disable the limit. + + + +**--output** *filename* + + Whenever the test program produces output on its standard output stream, it + should match the contents of *filename* (the "reference output"). If you + do not use this option, **bugpoint** will attempt to generate a reference output + by compiling the program with the "safe" backend and running it. + + + +**--profile-info-file** *filename* + + Profile file loaded by **--profile-loader**. + + + +**--run-{int,jit,llc,custom}** + + Whenever the test program is compiled, **bugpoint** should generate code for it + using the specified code generator. These options allow you to choose the + interpreter, the JIT compiler, the static native code compiler, or a + custom command (see **--exec-command**), respectively. + + + +**--safe-{llc,custom}** + + When debugging a code generator, **bugpoint** should use the specified code + generator as the "safe" code generator. This is a known-good code generator + used to generate the "reference output" if it has not been provided, and to + compile portions of the program that are excluded from the testcase. + These options allow you to choose the + static native code compiler, or a custom command (see **--exec-command**), + respectively. The interpreter and the JIT backends cannot currently + be used as the "safe" backends. + + + +**--exec-command** *command* + + This option defines the command to use with the **--run-custom** and + **--safe-custom** options to execute the bitcode testcase. This can + be useful for cross-compilation. + + + +**--compile-command** *command* + + This option defines the command to use with the **--compile-custom** + option to compile the bitcode testcase. This can be useful for + testing compiler output without running any link or execute stages. To + generate a reduced unit test, you may add CHECK directives to the + testcase and pass the name of an executable compile-command script in this form: + + + .. 
code-block:: sh + + #!/bin/sh + llc "$@" + not FileCheck [bugpoint input file].ll < bugpoint-test-program.s + + + This script will "fail" as long as FileCheck passes. So the result + will be the minimum bitcode that passes FileCheck. + + + +**--safe-path** *path* + + This option defines the path to the command to execute with the + **--safe-{int,jit,llc,custom}** + option. + + + + +EXIT STATUS +----------- + + +If **bugpoint** succeeds in finding a problem, it will exit with 0. Otherwise, +if an error occurs, it will exit with a non-zero value. + + +SEE ALSO +-------- + + +opt|opt diff --git a/docs/CommandGuide/html/manpage.css b/docs/CommandGuide/html/manpage.css deleted file mode 100644 index b200343..0000000 --- a/docs/CommandGuide/html/manpage.css +++ /dev/null @@ -1,256 +0,0 @@ -/* Based on http://www.perldoc.com/css/perldoc.css */ - -@import url("../llvm.css"); - -body { font-family: Arial,Helvetica; } - -blockquote { margin: 10pt; } - -h1, a { color: #336699; } - - -/*** Top menu style ****/ -.mmenuon { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #ff6600; font-size: 10pt; - } -.mmenuoff { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #ffffff; font-size: 10pt; -} -.cpyright { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #ffffff; font-size: xx-small; -} -.cpyrightText { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #ffffff; font-size: xx-small; -} -.sections { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 11pt; -} -.dsections { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 12pt; -} -.slink { - font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; - color: #000000; font-size: 9pt; -} - -.slink2 { font-family: Arial,Helvetica; text-decoration: none; color: #336699; } - -.maintitle { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 18pt; -} -.dblArrow { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: small; -} -.menuSec { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: small; -} - -.newstext { - font-family: Arial,Helvetica; font-size: small; -} - -.linkmenu { - font-family: Arial,Helvetica; color: #000000; font-weight: bold; - text-decoration: none; -} - -P { - font-family: Arial,Helvetica; -} - -PRE { - font-size: 10pt; -} -.quote { - font-family: Times; text-decoration: none; - color: #000000; font-size: 9pt; font-style: italic; -} -.smstd { font-family: Arial,Helvetica; color: #000000; font-size: x-small; } -.std { font-family: Arial,Helvetica; color: #000000; } -.meerkatTitle { - font-family: sans-serif; font-size: x-small; color: black; } - -.meerkatDescription { font-family: sans-serif; font-size: 10pt; color: black } -.meerkatCategory { - font-family: sans-serif; font-size: 9pt; font-weight: bold; font-style: italic; - color: brown; } -.meerkatChannel { - font-family: sans-serif; font-size: 9pt; font-style: italic; color: brown; } -.meerkatDate { font-family: sans-serif; font-size: xx-small; color: #336699; } - -.tocTitle { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #333333; font-size: 10pt; -} - -.toc-item { - font-family: Arial,Helvetica; font-weight: bold; - color: #336699; font-size: 10pt; 
text-decoration: underline; -} - -.perlVersion { - font-family: Arial,Helvetica; font-weight: bold; - color: #336699; font-size: 10pt; text-decoration: none; -} - -.podTitle { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #000000; -} - -.docTitle { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #000000; font-size: 10pt; -} -.dotDot { - font-family: Arial,Helvetica; font-weight: bold; - color: #000000; font-size: 9pt; -} - -.docSec { - font-family: Arial,Helvetica; font-weight: normal; - color: #333333; font-size: 9pt; -} -.docVersion { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 10pt; -} - -.docSecs-on { - font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; - color: #ff0000; font-size: 10pt; -} -.docSecs-off { - font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; - color: #333333; font-size: 10pt; -} - -h2 { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: medium; -} -h1 { - font-family: Verdana,Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: large; -} - -DL { - font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; - color: #333333; font-size: 10pt; -} - -UL > LI > A { - font-family: Arial,Helvetica; font-weight: bold; - color: #336699; font-size: 10pt; -} - -.moduleInfo { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #333333; font-size: 11pt; -} - -.moduleInfoSec { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 10pt; -} - -.moduleInfoVal { - font-family: Arial,Helvetica; font-weight: normal; text-decoration: underline; - color: #000000; font-size: 10pt; -} - -.cpanNavTitle { - font-family: Arial,Helvetica; font-weight: bold; - color: #ffffff; font-size: 10pt; -} -.cpanNavLetter { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #333333; font-size: 9pt; -} -.cpanCat { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 9pt; -} - -.bttndrkblue-bkgd-top { - background-color: #225688; - background-image: url(/global/mvc_objects/images/bttndrkblue_bgtop.gif); -} -.bttndrkblue-bkgd-left { - background-color: #225688; - background-image: url(/global/mvc_objects/images/bttndrkblue_bgleft.gif); -} -.bttndrkblue-bkgd { - padding-top: 0px; - padding-bottom: 0px; - margin-bottom: 0px; - margin-top: 0px; - background-repeat: no-repeat; - background-color: #225688; - background-image: url(/global/mvc_objects/images/bttndrkblue_bgmiddle.gif); - vertical-align: top; -} -.bttndrkblue-bkgd-right { - background-color: #225688; - background-image: url(/global/mvc_objects/images/bttndrkblue_bgright.gif); -} -.bttndrkblue-bkgd-bottom { - background-color: #225688; - background-image: url(/global/mvc_objects/images/bttndrkblue_bgbottom.gif); -} -.bttndrkblue-text a { - color: #ffffff; - text-decoration: none; -} -a.bttndrkblue-text:hover { - color: #ffDD3C; - text-decoration: none; -} -.bg-ltblue { - background-color: #f0f5fa; -} - -.border-left-b { - background: #f0f5fa url(/i/corner-leftline.gif) repeat-y; -} - -.border-right-b { - background: #f0f5fa url(/i/corner-rightline.gif) repeat-y; -} - -.border-top-b { - background: #f0f5fa url(/i/corner-topline.gif) repeat-x; -} - -.border-bottom-b { - background: #f0f5fa url(/i/corner-botline.gif) 
repeat-x; -} - -.border-right-w { - background: #ffffff url(/i/corner-rightline.gif) repeat-y; -} - -.border-top-w { - background: #ffffff url(/i/corner-topline.gif) repeat-x; -} - -.border-bottom-w { - background: #ffffff url(/i/corner-botline.gif) repeat-x; -} - -.bg-white { - background-color: #ffffff; -} - -.border-left-w { - background: #ffffff url(/i/corner-leftline.gif) repeat-y; -} diff --git a/docs/CommandGuide/index.html b/docs/CommandGuide/index.html deleted file mode 100644 index 74ac004..0000000 --- a/docs/CommandGuide/index.html +++ /dev/null @@ -1,142 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <title>LLVM Command Guide</title> - <link rel="stylesheet" href="../llvm.css" type="text/css"> -</head> -<body> - -<h1> - LLVM Command Guide -</h1> - -<div> - -<p>These documents are HTML versions of the <a href="man/man1/">man pages</a> -for all of the LLVM tools. These pages describe how to use the LLVM commands -and what their options are. Note that these pages do not describe all of the -options available for all tools. To get a complete listing, pass the -<tt>-help</tt> (general options) or <tt>-help-hidden</tt> (general+debugging -options) arguments to the tool you are interested in.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="basic">Basic Commands</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<ul> - -<li><a href="/cmds/llvm-as.html"><b>llvm-as</b></a> - - assemble a human-readable .ll file into bytecode</li> - -<li><a href="/cmds/llvm-dis.html"><b>llvm-dis</b></a> - - disassemble a bytecode file into a human-readable .ll file</li> - -<li><a href="/cmds/opt.html"><b>opt</b></a> - - run a series of LLVM-to-LLVM optimizations on a bytecode file</li> - -<li><a href="/cmds/llc.html"><b>llc</b></a> - - generate native machine code for a bytecode file</li> - -<li><a href="/cmds/lli.html"><b>lli</b></a> - - directly run a program compiled to bytecode using a JIT compiler or - interpreter</li> - -<li><a href="/cmds/llvm-link.html"><b>llvm-link</b></a> - - link several bytecode files into one</li> - -<li><a href="/cmds/llvm-ar.html"><b>llvm-ar</b></a> - - archive bytecode files</li> - -<li><a href="/cmds/llvm-ranlib.html"><b>llvm-ranlib</b></a> - - create an index for archives made with llvm-ar</li> - -<li><a href="/cmds/llvm-nm.html"><b>llvm-nm</b></a> - - print out the names and types of symbols in a bytecode file</li> - -<li><a href="/cmds/llvm-prof.html"><b>llvm-prof</b></a> - - format raw `<tt>llvmprof.out</tt>' data into a human-readable report</li> - -<li><a href="/cmds/llvm-ld.html"><b>llvm-ld</b></a> - - general purpose linker with loadable runtime optimization support</li> - -<li><a href="/cmds/llvm-config.html"><b>llvm-config</b></a> - - print out LLVM compilation options, libraries, etc. 
as configured</li> - -<li><a href="/cmds/llvm-diff.html"><b>llvm-diff</b></a> - - structurally compare two modules</li> - -<li><a href="/cmds/llvm-cov.html"><b>llvm-cov</b></a> - - emit coverage information</li> - -<li><a href="/cmds/llvm-stress.html"><b>llvm-stress</b></a> - - generate random .ll files to fuzz different llvm components</li> - -</ul> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="debug">Debugging Tools</a> -</h2> -<!-- *********************************************************************** --> - - -<div> - -<ul> - -<li><a href="/cmds/bugpoint.html"><b>bugpoint</b></a> - - automatic test-case reducer</li> - -<li><a href="/cmds/llvm-extract.html"><b>llvm-extract</b></a> - - extract a function from an LLVM bytecode file</li> - -<li><a href="/cmds/llvm-bcanalyzer.html"><b>llvm-bcanalyzer</b></a> - - bytecode analyzer (analyzes the binary encoding itself, not the program it - represents)</li> - -</ul> -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="internal">Internal Tools</a> -</h2> -<!-- *********************************************************************** --> - -<div> -<ul> - -<li><a href="/cmds/FileCheck.html"><b>FileCheck</b></a> - - Flexible file verifier used extensively by the testing harness</li> -<li><a href="/cmds/tblgen.html"><b>tblgen</b></a> - - target description reader and generator</li> -<li><a href="/cmds/lit.html"><b>lit</b></a> - - LLVM Integrated Tester, for running tests</li> - -</ul> -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-02-26 09:35:53 +0100 (Sun, 26 Feb 2012) $ -</address> - -</body> -</html> diff --git a/docs/CommandGuide/index.rst b/docs/CommandGuide/index.rst new file mode 100644 index 0000000..73a4835 --- /dev/null +++ b/docs/CommandGuide/index.rst @@ -0,0 +1,53 @@ +.. _commands: + +LLVM Command Guide +------------------ + +The following documents are command descriptions for all of the LLVM tools. +These pages describe how to use the LLVM commands and what their options are. +Note that these pages do not describe all of the options available for all +tools. To get a complete listing, pass the ``--help`` (general options) or +``--help-hidden`` (general and debugging options) arguments to the tool you are +interested in. + +Basic Commands +~~~~~~~~~~~~~~ + +.. toctree:: + :maxdepth: 1 + + llvm-as + llvm-dis + opt + llc + lli + llvm-link + llvm-ar + llvm-ranlib + llvm-nm + llvm-prof + llvm-config + llvm-diff + llvm-cov + llvm-stress + +Debugging Tools +~~~~~~~~~~~~~~~ + +.. toctree:: + :maxdepth: 1 + + bugpoint + llvm-extract + llvm-bcanalyzer + +Developer Tools +~~~~~~~~~~~~~~~ + +.. 
toctree:: + :maxdepth: 1 + + FileCheck + tblgen + lit + llvm-build diff --git a/docs/CommandGuide/lit.pod b/docs/CommandGuide/lit.pod deleted file mode 100644 index 81fc2c9..0000000 --- a/docs/CommandGuide/lit.pod +++ /dev/null @@ -1,404 +0,0 @@ -=pod - -=head1 NAME - -lit - LLVM Integrated Tester - -=head1 SYNOPSIS - -B<lit> [I<options>] [I<tests>] - -=head1 DESCRIPTION - -B<lit> is a portable tool for executing LLVM and Clang style test suites, -summarizing their results, and providing indication of failures. B<lit> is -designed to be a lightweight testing tool with as simple a user interface as -possible. - -B<lit> should be run with one or more I<tests> to run specified on the command -line. Tests can be either individual test files or directories to search for -tests (see L<"TEST DISCOVERY">). - -Each specified test will be executed (potentially in parallel) and once all -tests have been run B<lit> will print summary information on the number of tests -which passed or failed (see L<"TEST STATUS RESULTS">). The B<lit> program will -execute with a non-zero exit code if any tests fail. - -By default B<lit> will use a succinct progress display and will only print -summary information for test failures. See L<"OUTPUT OPTIONS"> for options -controlling the B<lit> progress display and output. - -B<lit> also includes a number of options for controlling how tests are executed -(specific features may depend on the particular test format). See L<"EXECUTION -OPTIONS"> for more information. - -Finally, B<lit> also supports additional options for only running a subset of -the options specified on the command line, see L<"SELECTION OPTIONS"> for -more information. - -Users interested in the B<lit> architecture or designing a B<lit> testing -implementation should see L<"LIT INFRASTRUCTURE"> - -=head1 GENERAL OPTIONS - -=over - -=item B<-h>, B<--help> - -Show the B<lit> help message. - -=item B<-j> I<N>, B<--threads>=I<N> - -Run I<N> tests in parallel. By default, this is automatically chosen to match -the number of detected available CPUs. - -=item B<--config-prefix>=I<NAME> - -Search for I<NAME.cfg> and I<NAME.site.cfg> when searching for test suites, -instead of I<lit.cfg> and I<lit.site.cfg>. - -=item B<--param> I<NAME>, B<--param> I<NAME>=I<VALUE> - -Add a user defined parameter I<NAME> with the given I<VALUE> (or the empty -string if not given). The meaning and use of these parameters is test suite -dependent. - -=back - -=head1 OUTPUT OPTIONS - -=over - -=item B<-q>, B<--quiet> - -Suppress any output except for test failures. - -=item B<-s>, B<--succinct> - -Show less output, for example don't show information on tests that pass. - -=item B<-v>, B<--verbose> - -Show more information on test failures, for example the entire test output -instead of just the test result. - -=item B<--no-progress-bar> - -Do not use curses based progress bar. - -=back - -=head1 EXECUTION OPTIONS - -=over - -=item B<--path>=I<PATH> - -Specify an addition I<PATH> to use when searching for executables in tests. - -=item B<--vg> - -Run individual tests under valgrind (using the memcheck tool). The -I<--error-exitcode> argument for valgrind is used so that valgrind failures will -cause the program to exit with a non-zero status. - -=item B<--vg-arg>=I<ARG> - -When I<--vg> is used, specify an additional argument to pass to valgrind itself. - -=item B<--time-tests> - -Track the wall time individual tests take to execute and includes the results in -the summary output. 
This is useful for determining which tests in a test suite -take the most time to execute. Note that this option is most useful with I<-j -1>. - -=back - -=head1 SELECTION OPTIONS - -=over - -=item B<--max-tests>=I<N> - -Run at most I<N> tests and then terminate. - -=item B<--max-time>=I<N> - -Spend at most I<N> seconds (approximately) running tests and then terminate. - -=item B<--shuffle> - -Run the tests in a random order. - -=back - -=head1 ADDITIONAL OPTIONS - -=over - -=item B<--debug> - -Run B<lit> in debug mode, for debugging configuration issues and B<lit> itself. - -=item B<--show-suites> - -List the discovered test suites as part of the standard output. - -=item B<--no-tcl-as-sh> - -Run Tcl scripts internally (instead of converting to shell scripts). - -=item B<--repeat>=I<N> - -Run each test I<N> times. Currently this is primarily useful for timing tests, -other results are not collated in any reasonable fashion. - -=back - -=head1 EXIT STATUS - -B<lit> will exit with an exit code of 1 if there are any FAIL or XPASS -results. Otherwise, it will exit with the status 0. Other exit codes are used -for non-test related failures (for example a user error or an internal program -error). - -=head1 TEST DISCOVERY - -The inputs passed to B<lit> can be either individual tests, or entire -directories or hierarchies of tests to run. When B<lit> starts up, the first -thing it does is convert the inputs into a complete list of tests to run as part -of I<test discovery>. - -In the B<lit> model, every test must exist inside some I<test suite>. B<lit> -resolves the inputs specified on the command line to test suites by searching -upwards from the input path until it finds a I<lit.cfg> or I<lit.site.cfg> -file. These files serve as both a marker of test suites and as configuration -files which B<lit> loads in order to understand how to find and run the tests -inside the test suite. - -Once B<lit> has mapped the inputs into test suites it traverses the list of -inputs adding tests for individual files and recursively searching for tests in -directories. - -This behavior makes it easy to specify a subset of tests to run, while still -allowing the test suite configuration to control exactly how tests are -interpreted. In addition, B<lit> always identifies tests by the test suite they -are in, and their relative path inside the test suite. For appropriately -configured projects, this allows B<lit> to provide convenient and flexible -support for out-of-tree builds. - -=head1 TEST STATUS RESULTS - -Each test ultimately produces one of the following six results: - -=over - -=item B<PASS> - -The test succeeded. - -=item B<XFAIL> - -The test failed, but that is expected. This is used for test formats which allow -specifying that a test does not currently work, but wish to leave it in the test -suite. - -=item B<XPASS> - -The test succeeded, but it was expected to fail. This is used for tests which -were specified as expected to fail, but are now succeeding (generally because -the feature they test was broken and has been fixed). - -=item B<FAIL> - -The test failed. - -=item B<UNRESOLVED> - -The test result could not be determined. For example, this occurs when the test -could not be run, the test itself is invalid, or the test was interrupted. - -=item B<UNSUPPORTED> - -The test is not supported in this environment. This is used by test formats -which can report unsupported tests. 
- -=back - -Depending on the test format tests may produce additional information about -their status (generally only for failures). See the L<Output|"OUTPUT OPTIONS"> -section for more information. - -=head1 LIT INFRASTRUCTURE - -This section describes the B<lit> testing architecture for users interested in -creating a new B<lit> testing implementation, or extending an existing one. - -B<lit> proper is primarily an infrastructure for discovering and running -arbitrary tests, and to expose a single convenient interface to these -tests. B<lit> itself doesn't know how to run tests, rather this logic is -defined by I<test suites>. - -=head2 TEST SUITES - -As described in L<"TEST DISCOVERY">, tests are always located inside a I<test -suite>. Test suites serve to define the format of the tests they contain, the -logic for finding those tests, and any additional information to run the tests. - -B<lit> identifies test suites as directories containing I<lit.cfg> or -I<lit.site.cfg> files (see also B<--config-prefix>). Test suites are initially -discovered by recursively searching up the directory hierarchy for all the input -files passed on the command line. You can use B<--show-suites> to display the -discovered test suites at startup. - -Once a test suite is discovered, its config file is loaded. Config files -themselves are Python modules which will be executed. When the config file is -executed, two important global variables are predefined: - -=over - -=item B<lit> - -The global B<lit> configuration object (a I<LitConfig> instance), which defines -the builtin test formats, global configuration parameters, and other helper -routines for implementing test configurations. - -=item B<config> - -This is the config object (a I<TestingConfig> instance) for the test suite, -which the config file is expected to populate. The following variables are also -available on the I<config> object, some of which must be set by the config and -others are optional or predefined: - -B<name> I<[required]> The name of the test suite, for use in reports and -diagnostics. - -B<test_format> I<[required]> The test format object which will be used to -discover and run tests in the test suite. Generally this will be a builtin test -format available from the I<lit.formats> module. - -B<test_src_root> The filesystem path to the test suite root. For out-of-dir -builds this is the directory that will be scanned for tests. - -B<test_exec_root> For out-of-dir builds, the path to the test suite root inside -the object directory. This is where tests will be run and temporary output files -placed. - -B<environment> A dictionary representing the environment to use when executing -tests in the suite. - -B<suffixes> For B<lit> test formats which scan directories for tests, this -variable is a list of suffixes to identify test files. Used by: I<ShTest>, -I<TclTest>. - -B<substitutions> For B<lit> test formats which substitute variables into a test -script, the list of substitutions to perform. Used by: I<ShTest>, I<TclTest>. - -B<unsupported> Mark an unsupported directory, all tests within it will be -reported as unsupported. Used by: I<ShTest>, I<TclTest>. - -B<parent> The parent configuration, this is the config object for the directory -containing the test suite, or None. - -B<root> The root configuration. This is the top-most B<lit> configuration in -the project. - -B<on_clone> The config is actually cloned for every subdirectory inside a test -suite, to allow local configuration on a per-directory basis. 
The I<on_clone> -variable can be set to a Python function which will be called whenever a -configuration is cloned (for a subdirectory). The function should takes three -arguments: (1) the parent configuration, (2) the new configuration (which the -I<on_clone> function will generally modify), and (3) the test path to the new -directory being scanned. - -=back - -=head2 TEST DISCOVERY - -Once test suites are located, B<lit> recursively traverses the source directory -(following I<test_src_root>) looking for tests. When B<lit> enters a -sub-directory, it first checks to see if a nested test suite is defined in that -directory. If so, it loads that test suite recursively, otherwise it -instantiates a local test config for the directory (see L<"LOCAL CONFIGURATION -FILES">). - -Tests are identified by the test suite they are contained within, and the -relative path inside that suite. Note that the relative path may not refer to an -actual file on disk; some test formats (such as I<GoogleTest>) define "virtual -tests" which have a path that contains both the path to the actual test file and -a subpath to identify the virtual test. - -=head2 LOCAL CONFIGURATION FILES - -When B<lit> loads a subdirectory in a test suite, it instantiates a local test -configuration by cloning the configuration for the parent direction -- the root -of this configuration chain will always be a test suite. Once the test -configuration is cloned B<lit> checks for a I<lit.local.cfg> file in the -subdirectory. If present, this file will be loaded and can be used to specialize -the configuration for each individual directory. This facility can be used to -define subdirectories of optional tests, or to change other configuration -parameters -- for example, to change the test format, or the suffixes which -identify test files. - -=head2 TEST RUN OUTPUT FORMAT - -The b<lit> output for a test run conforms to the following schema, in both short -and verbose modes (although in short mode no PASS lines will be shown). This -schema has been chosen to be relatively easy to reliably parse by a machine (for -example in buildbot log scraping), and for other tools to generate. - -Each test result is expected to appear on a line that matches: - -<result code>: <test name> (<progress info>) - -where <result-code> is a standard test result such as PASS, FAIL, XFAIL, XPASS, -UNRESOLVED, or UNSUPPORTED. The performance result codes of IMPROVED and -REGRESSED are also allowed. - -The <test name> field can consist of an arbitrary string containing no newline. - -The <progress info> field can be used to report progress information such as -(1/300) or can be empty, but even when empty the parentheses are required. - -Each test result may include additional (multiline) log information in the -following format. - -<log delineator> TEST '(<test name>)' <trailing delineator> -... log message ... -<log delineator> - -where <test name> should be the name of a preceeding reported test, <log -delineator> is a string of '*' characters I<at least> four characters long (the -recommended length is 20), and <trailing delineator> is an arbitrary (unparsed) -string. - -The following is an example of a test run output which consists of four tests A, -B, C, and D, and a log message for the failing test C. - -=head3 Example Test Run Output Listing - -PASS: A (1 of 4) -PASS: B (2 of 4) -FAIL: C (3 of 4) -******************** TEST 'C' FAILED ******************** -Test 'C' failed as a result of exit code 1. 
-******************** -PASS: D (4 of 4) - -=back - -=head2 LIT EXAMPLE TESTS - -The B<lit> distribution contains several example implementations of test suites -in the I<ExampleTests> directory. - -=head1 SEE ALSO - -L<valgrind(1)> - -=head1 AUTHOR - -Written by Daniel Dunbar and maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/lit.rst b/docs/CommandGuide/lit.rst new file mode 100644 index 0000000..3eb0be9 --- /dev/null +++ b/docs/CommandGuide/lit.rst @@ -0,0 +1,474 @@ +lit - LLVM Integrated Tester +============================ + + +SYNOPSIS +-------- + + +**lit** [*options*] [*tests*] + + +DESCRIPTION +----------- + + +**lit** is a portable tool for executing LLVM and Clang style test suites, +summarizing their results, and providing indication of failures. **lit** is +designed to be a lightweight testing tool with as simple a user interface as +possible. + +**lit** should be run with one or more *tests* to run specified on the command +line. Tests can be either individual test files or directories to search for +tests (see "TEST DISCOVERY"). + +Each specified test will be executed (potentially in parallel) and once all +tests have been run **lit** will print summary information on the number of tests +which passed or failed (see "TEST STATUS RESULTS"). The **lit** program will +execute with a non-zero exit code if any tests fail. + +By default **lit** will use a succinct progress display and will only print +summary information for test failures. See "OUTPUT OPTIONS" for options +controlling the **lit** progress display and output. + +**lit** also includes a number of options for controlling how tests are executed +(specific features may depend on the particular test format). See "EXECUTION +OPTIONS" for more information. + +Finally, **lit** also supports additional options for only running a subset of +the options specified on the command line, see "SELECTION OPTIONS" for +more information. + +Users interested in the **lit** architecture or designing a **lit** testing +implementation should see "LIT INFRASTRUCTURE" + + +GENERAL OPTIONS +--------------- + + + +**-h**, **--help** + + Show the **lit** help message. + + + +**-j** *N*, **--threads**\ =\ *N* + + Run *N* tests in parallel. By default, this is automatically chosen to match + the number of detected available CPUs. + + + +**--config-prefix**\ =\ *NAME* + + Search for *NAME.cfg* and *NAME.site.cfg* when searching for test suites, + instead of *lit.cfg* and *lit.site.cfg*. + + + +**--param** *NAME*, **--param** *NAME*\ =\ *VALUE* + + Add a user defined parameter *NAME* with the given *VALUE* (or the empty + string if not given). The meaning and use of these parameters is test suite + dependent. + + + + +OUTPUT OPTIONS +-------------- + + + +**-q**, **--quiet** + + Suppress any output except for test failures. + + + +**-s**, **--succinct** + + Show less output, for example don't show information on tests that pass. + + + +**-v**, **--verbose** + + Show more information on test failures, for example the entire test output + instead of just the test result. + + + +**--no-progress-bar** + + Do not use curses based progress bar. + + + + +EXECUTION OPTIONS +----------------- + + + +**--path**\ =\ *PATH* + + Specify an addition *PATH* to use when searching for executables in tests. + + + +**--vg** + + Run individual tests under valgrind (using the memcheck tool). The + *--error-exitcode* argument for valgrind is used so that valgrind failures will + cause the program to exit with a non-zero status. 
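For instance (the test directory name here is only illustrative, not part of the option's behaviour), a directory of tests can be run with each test inside valgrind's memcheck tool:

.. code-block:: sh

   # run every test under test/ one at a time, each under valgrind (memcheck)
   lit -j 1 --vg test/

Running with **-j 1** can make the per-test valgrind output easier to follow.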
+ + + +**--vg-arg**\ =\ *ARG* + + When *--vg* is used, specify an additional argument to pass to valgrind itself. + + + +**--time-tests** + + Track the wall time individual tests take to execute and includes the results in + the summary output. This is useful for determining which tests in a test suite + take the most time to execute. Note that this option is most useful with *-j + 1*. + + + + +SELECTION OPTIONS +----------------- + + + +**--max-tests**\ =\ *N* + + Run at most *N* tests and then terminate. + + + +**--max-time**\ =\ *N* + + Spend at most *N* seconds (approximately) running tests and then terminate. + + + +**--shuffle** + + Run the tests in a random order. + + + + +ADDITIONAL OPTIONS +------------------ + + + +**--debug** + + Run **lit** in debug mode, for debugging configuration issues and **lit** itself. + + + +**--show-suites** + + List the discovered test suites as part of the standard output. + + + +**--no-tcl-as-sh** + + Run Tcl scripts internally (instead of converting to shell scripts). + + + +**--repeat**\ =\ *N* + + Run each test *N* times. Currently this is primarily useful for timing tests, + other results are not collated in any reasonable fashion. + + + + +EXIT STATUS +----------- + + +**lit** will exit with an exit code of 1 if there are any FAIL or XPASS +results. Otherwise, it will exit with the status 0. Other exit codes are used +for non-test related failures (for example a user error or an internal program +error). + + +TEST DISCOVERY +-------------- + + +The inputs passed to **lit** can be either individual tests, or entire +directories or hierarchies of tests to run. When **lit** starts up, the first +thing it does is convert the inputs into a complete list of tests to run as part +of *test discovery*. + +In the **lit** model, every test must exist inside some *test suite*. **lit** +resolves the inputs specified on the command line to test suites by searching +upwards from the input path until it finds a *lit.cfg* or *lit.site.cfg* +file. These files serve as both a marker of test suites and as configuration +files which **lit** loads in order to understand how to find and run the tests +inside the test suite. + +Once **lit** has mapped the inputs into test suites it traverses the list of +inputs adding tests for individual files and recursively searching for tests in +directories. + +This behavior makes it easy to specify a subset of tests to run, while still +allowing the test suite configuration to control exactly how tests are +interpreted. In addition, **lit** always identifies tests by the test suite they +are in, and their relative path inside the test suite. For appropriately +configured projects, this allows **lit** to provide convenient and flexible +support for out-of-tree builds. + + +TEST STATUS RESULTS +------------------- + + +Each test ultimately produces one of the following six results: + + +**PASS** + + The test succeeded. + + + +**XFAIL** + + The test failed, but that is expected. This is used for test formats which allow + specifying that a test does not currently work, but wish to leave it in the test + suite. + + + +**XPASS** + + The test succeeded, but it was expected to fail. This is used for tests which + were specified as expected to fail, but are now succeeding (generally because + the feature they test was broken and has been fixed). + + + +**FAIL** + + The test failed. + + + +**UNRESOLVED** + + The test result could not be determined. 
For example, this occurs when the test + could not be run, the test itself is invalid, or the test was interrupted. + + + +**UNSUPPORTED** + + The test is not supported in this environment. This is used by test formats + which can report unsupported tests. + + + +Depending on the test format tests may produce additional information about +their status (generally only for failures). See the Output|"OUTPUT OPTIONS" +section for more information. + + +LIT INFRASTRUCTURE +------------------ + + +This section describes the **lit** testing architecture for users interested in +creating a new **lit** testing implementation, or extending an existing one. + +**lit** proper is primarily an infrastructure for discovering and running +arbitrary tests, and to expose a single convenient interface to these +tests. **lit** itself doesn't know how to run tests, rather this logic is +defined by *test suites*. + +TEST SUITES +~~~~~~~~~~~ + + +As described in "TEST DISCOVERY", tests are always located inside a *test +suite*. Test suites serve to define the format of the tests they contain, the +logic for finding those tests, and any additional information to run the tests. + +**lit** identifies test suites as directories containing *lit.cfg* or +*lit.site.cfg* files (see also **--config-prefix**). Test suites are initially +discovered by recursively searching up the directory hierarchy for all the input +files passed on the command line. You can use **--show-suites** to display the +discovered test suites at startup. + +Once a test suite is discovered, its config file is loaded. Config files +themselves are Python modules which will be executed. When the config file is +executed, two important global variables are predefined: + + +**lit** + + The global **lit** configuration object (a *LitConfig* instance), which defines + the builtin test formats, global configuration parameters, and other helper + routines for implementing test configurations. + + + +**config** + + This is the config object (a *TestingConfig* instance) for the test suite, + which the config file is expected to populate. The following variables are also + available on the *config* object, some of which must be set by the config and + others are optional or predefined: + + **name** *[required]* The name of the test suite, for use in reports and + diagnostics. + + **test_format** *[required]* The test format object which will be used to + discover and run tests in the test suite. Generally this will be a builtin test + format available from the *lit.formats* module. + + **test_src_root** The filesystem path to the test suite root. For out-of-dir + builds this is the directory that will be scanned for tests. + + **test_exec_root** For out-of-dir builds, the path to the test suite root inside + the object directory. This is where tests will be run and temporary output files + placed. + + **environment** A dictionary representing the environment to use when executing + tests in the suite. + + **suffixes** For **lit** test formats which scan directories for tests, this + variable is a list of suffixes to identify test files. Used by: *ShTest*, + *TclTest*. + + **substitutions** For **lit** test formats which substitute variables into a test + script, the list of substitutions to perform. Used by: *ShTest*, *TclTest*. + + **unsupported** Mark an unsupported directory, all tests within it will be + reported as unsupported. Used by: *ShTest*, *TclTest*. 
+ + **parent** The parent configuration; this is the config object for the directory + containing the test suite, or None. + + **root** The root configuration. This is the top-most **lit** configuration in + the project. + + **on_clone** The config is actually cloned for every subdirectory inside a test + suite, to allow local configuration on a per-directory basis. The *on_clone* + variable can be set to a Python function which will be called whenever a + configuration is cloned (for a subdirectory). The function should take three + arguments: (1) the parent configuration, (2) the new configuration (which the + *on_clone* function will generally modify), and (3) the test path to the new + directory being scanned. + + + + +TEST DISCOVERY +~~~~~~~~~~~~~~ + + +Once test suites are located, **lit** recursively traverses the source directory +(following *test_src_root*) looking for tests. When **lit** enters a +sub-directory, it first checks to see if a nested test suite is defined in that +directory. If so, it loads that test suite recursively, otherwise it +instantiates a local test config for the directory (see "LOCAL CONFIGURATION +FILES"). + +Tests are identified by the test suite they are contained within, and the +relative path inside that suite. Note that the relative path may not refer to an +actual file on disk; some test formats (such as *GoogleTest*) define "virtual +tests" which have a path that contains both the path to the actual test file and +a subpath to identify the virtual test. + + +LOCAL CONFIGURATION FILES +~~~~~~~~~~~~~~~~~~~~~~~~~ + + +When **lit** loads a subdirectory in a test suite, it instantiates a local test +configuration by cloning the configuration for the parent directory -- the root +of this configuration chain will always be a test suite. Once the test +configuration is cloned, **lit** checks for a *lit.local.cfg* file in the +subdirectory. If present, this file will be loaded and can be used to specialize +the configuration for each individual directory. This facility can be used to +define subdirectories of optional tests, or to change other configuration +parameters -- for example, to change the test format, or the suffixes which +identify test files. + + +TEST RUN OUTPUT FORMAT +~~~~~~~~~~~~~~~~~~~~~~ + + +The **lit** output for a test run conforms to the following schema, in both short +and verbose modes (although in short mode no PASS lines will be shown). This +schema has been chosen to be relatively easy to reliably parse by a machine (for +example in buildbot log scraping), and for other tools to generate. + +Each test result is expected to appear on a line that matches: + +<result code>: <test name> (<progress info>) + +where <result-code> is a standard test result such as PASS, FAIL, XFAIL, XPASS, +UNRESOLVED, or UNSUPPORTED. The performance result codes of IMPROVED and +REGRESSED are also allowed. + +The <test name> field can consist of an arbitrary string containing no newline. + +The <progress info> field can be used to report progress information such as +(1/300) or can be empty, but even when empty the parentheses are required. + +Each test result may include additional (multiline) log information in the +following format. + +<log delineator> TEST '(<test name>)' <trailing delineator> +... log message ...
+<log delineator> + +where <test name> should be the name of a preceding reported test, <log +delineator> is a string of '\*' characters *at least* four characters long (the +recommended length is 20), and <trailing delineator> is an arbitrary (unparsed) +string. + +The following is an example of a test run output which consists of four tests A, +B, C, and D, and a log message for the failing test C:: + + PASS: A (1 of 4) + PASS: B (2 of 4) + FAIL: C (3 of 4) + \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\* TEST 'C' FAILED \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\* + Test 'C' failed as a result of exit code 1. + \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\* + PASS: D (4 of 4) + + +LIT EXAMPLE TESTS +~~~~~~~~~~~~~~~~~ + + +The **lit** distribution contains several example implementations of test suites +in the *ExampleTests* directory. + + +SEE ALSO +-------- + + +valgrind(1) diff --git a/docs/CommandGuide/llc.pod b/docs/CommandGuide/llc.pod deleted file mode 100644 index 35abdae..0000000 --- a/docs/CommandGuide/llc.pod +++ /dev/null @@ -1,201 +0,0 @@ -=pod - -=head1 NAME - -llc - LLVM static compiler - -=head1 SYNOPSIS - -B<llc> [I<options>] [I<filename>] - -=head1 DESCRIPTION - -The B<llc> command compiles LLVM source inputs into assembly language for a -specified architecture. The assembly language output can then be passed through -a native assembler and linker to generate a native executable. - -The choice of architecture for the output assembly code is automatically -determined from the input file, unless the B<-march> option is used to override -the default. - -=head1 OPTIONS - -If I<filename> is - or omitted, B<llc> reads from standard input. Otherwise, it -will from I<filename>. Inputs can be in either the LLVM assembly language -format (.ll) or the LLVM bitcode format (.bc). - -If the B<-o> option is omitted, then B<llc> will send its output to standard -output if the input is from standard input. If the B<-o> option specifies -, -then the output will also be sent to standard output. - -If no B<-o> option is specified and an input file other than - is specified, -then B<llc> creates the output filename by taking the input filename, -removing any existing F<.bc> extension, and adding a F<.s> suffix. - -Other B<llc> options are as follows: - -=head2 End-user Options - -=over - -=item B<-help> - -Print a summary of command line options. - -=item B<-O>=I<uint> - -Generate code at different optimization levels. These correspond to the I<-O0>, -I<-O1>, I<-O2>, and I<-O3> optimization levels used by B<llvm-gcc> and -B<clang>. - -=item B<-mtriple>=I<target triple> - -Override the target triple specified in the input file with the specified -string. - -=item B<-march>=I<arch> - -Specify the architecture for which to generate assembly, overriding the target -encoded in the input file. See the output of B<llc -help> for a list of -valid architectures. By default this is inferred from the target triple or -autodetected to the current architecture. - -=item B<-mcpu>=I<cpuname> - -Specify a specific chip in the current architecture to generate code for. -By default this is inferred from the target triple and autodetected to -the current architecture. For a list of available CPUs, use: -B<llvm-as E<lt> /dev/null | llc -march=xyz -mcpu=help> - -=item B<-mattr>=I<a1,+a2,-a3,...> - -Override or control specific attributes of the target, such as whether SIMD -operations are enabled or not. The default set of attributes is set by the -current CPU. 
For a list of available attributes, use: -B<llvm-as E<lt> /dev/null | llc -march=xyz -mattr=help> - -=item B<--disable-fp-elim> - -Disable frame pointer elimination optimization. - -=item B<--disable-excess-fp-precision> - -Disable optimizations that may produce excess precision for floating point. -Note that this option can dramatically slow down code on some systems -(e.g. X86). - -=item B<--enable-no-infs-fp-math> - -Enable optimizations that assume no Inf values. - -=item B<--enable-no-nans-fp-math> - -Enable optimizations that assume no NAN values. - -=item B<--enable-unsafe-fp-math> - -Enable optimizations that make unsafe assumptions about IEEE math (e.g. that -addition is associative) or may not work for all input ranges. These -optimizations allow the code generator to make use of some instructions which -would otherwise not be usable (such as fsin on X86). - -=item B<--enable-correct-eh-support> - -Instruct the B<lowerinvoke> pass to insert code for correct exception handling -support. This is expensive and is by default omitted for efficiency. - -=item B<--stats> - -Print statistics recorded by code-generation passes. - -=item B<--time-passes> - -Record the amount of time needed for each pass and print a report to standard -error. - -=item B<--load>=F<dso_path> - -Dynamically load F<dso_path> (a path to a dynamically shared object) that -implements an LLVM target. This will permit the target name to be used with the -B<-march> option so that code can be generated for that target. - -=back - -=head2 Tuning/Configuration Options - -=over - -=item B<--print-machineinstrs> - -Print generated machine code between compilation phases (useful for debugging). - -=item B<--regalloc>=I<allocator> - -Specify the register allocator to use. The default I<allocator> is I<local>. -Valid register allocators are: - -=over - -=item I<simple> - -Very simple "always spill" register allocator - -=item I<local> - -Local register allocator - -=item I<linearscan> - -Linear scan global register allocator - -=item I<iterativescan> - -Iterative scan global register allocator - -=back - -=item B<--spiller>=I<spiller> - -Specify the spiller to use for register allocators that support it. Currently -this option is used only by the linear scan register allocator. The default -I<spiller> is I<local>. Valid spillers are: - -=over - -=item I<simple> - -Simple spiller - -=item I<local> - -Local spiller - -=back - -=back - -=head2 Intel IA-32-specific Options - -=over - -=item B<--x86-asm-syntax=att|intel> - -Specify whether to emit assembly code in AT&T syntax (the default) or intel -syntax. - -=back - -=head1 EXIT STATUS - -If B<llc> succeeds, it will exit with 0. Otherwise, if an error occurs, -it will exit with a non-zero value. - -=head1 SEE ALSO - -L<lli|lli> - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llc.rst b/docs/CommandGuide/llc.rst new file mode 100644 index 0000000..6f1c486 --- /dev/null +++ b/docs/CommandGuide/llc.rst @@ -0,0 +1,251 @@ +llc - LLVM static compiler +========================== + + +SYNOPSIS +-------- + + +**llc** [*options*] [*filename*] + + +DESCRIPTION +----------- + + +The **llc** command compiles LLVM source inputs into assembly language for a +specified architecture. The assembly language output can then be passed through +a native assembler and linker to generate a native executable. 
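As an illustrative sketch of that pipeline (the file names below are hypothetical), LLVM assembly can be turned into bitcode, compiled to native assembly with **llc**, and then assembled and linked with the system compiler driver:

.. code-block:: sh

   llvm-as hello.ll -o hello.bc   # assemble LLVM IR into bitcode
   llc hello.bc -o hello.s        # compile the bitcode to native assembly
   cc hello.s -o hello            # assemble and link into a native executable

As described under OPTIONS below, **llc** also accepts *.ll* input directly, so the **llvm-as** step can be skipped.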
+ +The choice of architecture for the output assembly code is automatically +determined from the input file, unless the **-march** option is used to override +the default. + + +OPTIONS +------- + + +If *filename* is - or omitted, **llc** reads from standard input. Otherwise, it +will from *filename*. Inputs can be in either the LLVM assembly language +format (.ll) or the LLVM bitcode format (.bc). + +If the **-o** option is omitted, then **llc** will send its output to standard +output if the input is from standard input. If the **-o** option specifies -, +then the output will also be sent to standard output. + +If no **-o** option is specified and an input file other than - is specified, +then **llc** creates the output filename by taking the input filename, +removing any existing *.bc* extension, and adding a *.s* suffix. + +Other **llc** options are as follows: + +End-user Options +~~~~~~~~~~~~~~~~ + + + +**-help** + + Print a summary of command line options. + + + +**-O**\ =\ *uint* + + Generate code at different optimization levels. These correspond to the *-O0*, + *-O1*, *-O2*, and *-O3* optimization levels used by **llvm-gcc** and + **clang**. + + + +**-mtriple**\ =\ *target triple* + + Override the target triple specified in the input file with the specified + string. + + + +**-march**\ =\ *arch* + + Specify the architecture for which to generate assembly, overriding the target + encoded in the input file. See the output of **llc -help** for a list of + valid architectures. By default this is inferred from the target triple or + autodetected to the current architecture. + + + +**-mcpu**\ =\ *cpuname* + + Specify a specific chip in the current architecture to generate code for. + By default this is inferred from the target triple and autodetected to + the current architecture. For a list of available CPUs, use: + **llvm-as < /dev/null | llc -march=xyz -mcpu=help** + + + +**-mattr**\ =\ *a1,+a2,-a3,...* + + Override or control specific attributes of the target, such as whether SIMD + operations are enabled or not. The default set of attributes is set by the + current CPU. For a list of available attributes, use: + **llvm-as < /dev/null | llc -march=xyz -mattr=help** + + + +**--disable-fp-elim** + + Disable frame pointer elimination optimization. + + + +**--disable-excess-fp-precision** + + Disable optimizations that may produce excess precision for floating point. + Note that this option can dramatically slow down code on some systems + (e.g. X86). + + + +**--enable-no-infs-fp-math** + + Enable optimizations that assume no Inf values. + + + +**--enable-no-nans-fp-math** + + Enable optimizations that assume no NAN values. + + + +**--enable-unsafe-fp-math** + + Enable optimizations that make unsafe assumptions about IEEE math (e.g. that + addition is associative) or may not work for all input ranges. These + optimizations allow the code generator to make use of some instructions which + would otherwise not be usable (such as fsin on X86). + + + +**--enable-correct-eh-support** + + Instruct the **lowerinvoke** pass to insert code for correct exception handling + support. This is expensive and is by default omitted for efficiency. + + + +**--stats** + + Print statistics recorded by code-generation passes. + + + +**--time-passes** + + Record the amount of time needed for each pass and print a report to standard + error. + + + +**--load**\ =\ *dso_path* + + Dynamically load *dso_path* (a path to a dynamically shared object) that + implements an LLVM target. 
This will permit the target name to be used with the + **-march** option so that code can be generated for that target. + + + + +Tuning/Configuration Options +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + + +**--print-machineinstrs** + + Print generated machine code between compilation phases (useful for debugging). + + + +**--regalloc**\ =\ *allocator* + + Specify the register allocator to use. The default *allocator* is *local*. + Valid register allocators are: + + + *simple* + + Very simple "always spill" register allocator + + + + *local* + + Local register allocator + + + + *linearscan* + + Linear scan global register allocator + + + + *iterativescan* + + Iterative scan global register allocator + + + + + +**--spiller**\ =\ *spiller* + + Specify the spiller to use for register allocators that support it. Currently + this option is used only by the linear scan register allocator. The default + *spiller* is *local*. Valid spillers are: + + + *simple* + + Simple spiller + + + + *local* + + Local spiller + + + + + + +Intel IA-32-specific Options +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + + +**--x86-asm-syntax=att|intel** + + Specify whether to emit assembly code in AT&T syntax (the default) or intel + syntax. + + + + + +EXIT STATUS +----------- + + +If **llc** succeeds, it will exit with 0. Otherwise, if an error occurs, +it will exit with a non-zero value. + + +SEE ALSO +-------- + + +lli|lli diff --git a/docs/CommandGuide/lli.pod b/docs/CommandGuide/lli.pod deleted file mode 100644 index a313a31..0000000 --- a/docs/CommandGuide/lli.pod +++ /dev/null @@ -1,219 +0,0 @@ -=pod - -=head1 NAME - -lli - directly execute programs from LLVM bitcode - -=head1 SYNOPSIS - -B<lli> [I<options>] [I<filename>] [I<program args>] - -=head1 DESCRIPTION - -B<lli> directly executes programs in LLVM bitcode format. It takes a program -in LLVM bitcode format and executes it using a just-in-time compiler, if one is -available for the current architecture, or an interpreter. B<lli> takes all of -the same code generator options as L<llc|llc>, but they are only effective when -B<lli> is using the just-in-time compiler. - -If I<filename> is not specified, then B<lli> reads the LLVM bitcode for the -program from standard input. - -The optional I<args> specified on the command line are passed to the program as -arguments. - -=head1 GENERAL OPTIONS - -=over - -=item B<-fake-argv0>=I<executable> - -Override the C<argv[0]> value passed into the executing program. - -=item B<-force-interpreter>=I<{false,true}> - -If set to true, use the interpreter even if a just-in-time compiler is available -for this architecture. Defaults to false. - -=item B<-help> - -Print a summary of command line options. - -=item B<-load>=I<puginfilename> - -Causes B<lli> to load the plugin (shared object) named I<pluginfilename> and use -it for optimization. - -=item B<-stats> - -Print statistics from the code-generation passes. This is only meaningful for -the just-in-time compiler, at present. - -=item B<-time-passes> - -Record the amount of time needed for each code-generation pass and print it to -standard error. - -=item B<-version> - -Print out the version of B<lli> and exit without doing anything else. - -=back - -=head1 TARGET OPTIONS - -=over - -=item B<-mtriple>=I<target triple> - -Override the target triple specified in the input bitcode file with the -specified string. This may result in a crash if you pick an -architecture which is not compatible with the current system. 
- -=item B<-march>=I<arch> - -Specify the architecture for which to generate assembly, overriding the target -encoded in the bitcode file. See the output of B<llc -help> for a list of -valid architectures. By default this is inferred from the target triple or -autodetected to the current architecture. - -=item B<-mcpu>=I<cpuname> - -Specify a specific chip in the current architecture to generate code for. -By default this is inferred from the target triple and autodetected to -the current architecture. For a list of available CPUs, use: -B<llvm-as E<lt> /dev/null | llc -march=xyz -mcpu=help> - -=item B<-mattr>=I<a1,+a2,-a3,...> - -Override or control specific attributes of the target, such as whether SIMD -operations are enabled or not. The default set of attributes is set by the -current CPU. For a list of available attributes, use: -B<llvm-as E<lt> /dev/null | llc -march=xyz -mattr=help> - -=back - - -=head1 FLOATING POINT OPTIONS - -=over - -=item B<-disable-excess-fp-precision> - -Disable optimizations that may increase floating point precision. - -=item B<-enable-no-infs-fp-math> - -Enable optimizations that assume no Inf values. - -=item B<-enable-no-nans-fp-math> - -Enable optimizations that assume no NAN values. - -=item B<-enable-unsafe-fp-math> - -Causes B<lli> to enable optimizations that may decrease floating point -precision. - -=item B<-soft-float> - -Causes B<lli> to generate software floating point library calls instead of -equivalent hardware instructions. - -=back - -=head1 CODE GENERATION OPTIONS - -=over - -=item B<-code-model>=I<model> - -Choose the code model from: - - default: Target default code model - small: Small code model - kernel: Kernel code model - medium: Medium code model - large: Large code model - -=item B<-disable-post-RA-scheduler> - -Disable scheduling after register allocation. - -=item B<-disable-spill-fusing> - -Disable fusing of spill code into instructions. - -=item B<-enable-correct-eh-support> - -Make the -lowerinvoke pass insert expensive, but correct, EH code. - -=item B<-jit-enable-eh> - -Exception handling should be enabled in the just-in-time compiler. - -=item B<-join-liveintervals> - -Coalesce copies (default=true). - -=item B<-nozero-initialized-in-bss> -Don't place zero-initialized symbols into the BSS section. 
- -=item B<-pre-RA-sched>=I<scheduler> - -Instruction schedulers available (before register allocation): - - =default: Best scheduler for the target - =none: No scheduling: breadth first sequencing - =simple: Simple two pass scheduling: minimize critical path and maximize processor utilization - =simple-noitin: Simple two pass scheduling: Same as simple except using generic latency - =list-burr: Bottom-up register reduction list scheduling - =list-tdrr: Top-down register reduction list scheduling - =list-td: Top-down list scheduler -print-machineinstrs - Print generated machine code - -=item B<-regalloc>=I<allocator> - -Register allocator to use (default=linearscan) - - =bigblock: Big-block register allocator - =linearscan: linear scan register allocator =local - local register allocator - =simple: simple register allocator - -=item B<-relocation-model>=I<model> - -Choose relocation model from: - - =default: Target default relocation model - =static: Non-relocatable code =pic - Fully relocatable, position independent code - =dynamic-no-pic: Relocatable external references, non-relocatable code - -=item B<-spiller> - -Spiller to use (default=local) - - =simple: simple spiller - =local: local spiller - -=item B<-x86-asm-syntax>=I<syntax> - -Choose style of code to emit from X86 backend: - - =att: Emit AT&T-style assembly - =intel: Emit Intel-style assembly - -=back - -=head1 EXIT STATUS - -If B<lli> fails to load the program, it will exit with an exit code of 1. -Otherwise, it will return the exit code of the program it executes. - -=head1 SEE ALSO - -L<llc|llc> - -=head1 AUTHOR - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/lli.rst b/docs/CommandGuide/lli.rst new file mode 100644 index 0000000..7cc1284 --- /dev/null +++ b/docs/CommandGuide/lli.rst @@ -0,0 +1,300 @@ +lli - directly execute programs from LLVM bitcode +================================================= + + +SYNOPSIS +-------- + + +**lli** [*options*] [*filename*] [*program args*] + + +DESCRIPTION +----------- + + +**lli** directly executes programs in LLVM bitcode format. It takes a program +in LLVM bitcode format and executes it using a just-in-time compiler, if one is +available for the current architecture, or an interpreter. **lli** takes all of +the same code generator options as llc|llc, but they are only effective when +**lli** is using the just-in-time compiler. + +If *filename* is not specified, then **lli** reads the LLVM bitcode for the +program from standard input. + +The optional *args* specified on the command line are passed to the program as +arguments. + + +GENERAL OPTIONS +--------------- + + + +**-fake-argv0**\ =\ *executable* + + Override the ``argv[0]`` value passed into the executing program. + + + +**-force-interpreter**\ =\ *{false,true}* + + If set to true, use the interpreter even if a just-in-time compiler is available + for this architecture. Defaults to false. + + + +**-help** + + Print a summary of command line options. + + + +**-load**\ =\ *puginfilename* + + Causes **lli** to load the plugin (shared object) named *pluginfilename* and use + it for optimization. + + + +**-stats** + + Print statistics from the code-generation passes. This is only meaningful for + the just-in-time compiler, at present. + + + +**-time-passes** + + Record the amount of time needed for each code-generation pass and print it to + standard error. + + + +**-version** + + Print out the version of **lli** and exit without doing anything else. 
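As a short illustrative sketch (the bitcode file and the program argument are hypothetical), a bitcode program can be run directly, and the interpreter can be forced with the option described above:

.. code-block:: sh

   lli factorial.bc 10                           # run under the JIT if one is available
   lli -force-interpreter=true factorial.bc 10   # always use the interpreter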
+ + + + +TARGET OPTIONS +-------------- + + + +**-mtriple**\ =\ *target triple* + + Override the target triple specified in the input bitcode file with the + specified string. This may result in a crash if you pick an + architecture which is not compatible with the current system. + + + +**-march**\ =\ *arch* + + Specify the architecture for which to generate assembly, overriding the target + encoded in the bitcode file. See the output of **llc -help** for a list of + valid architectures. By default this is inferred from the target triple or + autodetected to the current architecture. + + + +**-mcpu**\ =\ *cpuname* + + Specify a specific chip in the current architecture to generate code for. + By default this is inferred from the target triple and autodetected to + the current architecture. For a list of available CPUs, use: + **llvm-as < /dev/null | llc -march=xyz -mcpu=help** + + + +**-mattr**\ =\ *a1,+a2,-a3,...* + + Override or control specific attributes of the target, such as whether SIMD + operations are enabled or not. The default set of attributes is set by the + current CPU. For a list of available attributes, use: + **llvm-as < /dev/null | llc -march=xyz -mattr=help** + + + + +FLOATING POINT OPTIONS +---------------------- + + + +**-disable-excess-fp-precision** + + Disable optimizations that may increase floating point precision. + + + +**-enable-no-infs-fp-math** + + Enable optimizations that assume no Inf values. + + + +**-enable-no-nans-fp-math** + + Enable optimizations that assume no NAN values. + + + +**-enable-unsafe-fp-math** + + Causes **lli** to enable optimizations that may decrease floating point + precision. + + + +**-soft-float** + + Causes **lli** to generate software floating point library calls instead of + equivalent hardware instructions. + + + + +CODE GENERATION OPTIONS +----------------------- + + + +**-code-model**\ =\ *model* + + Choose the code model from: + + + .. code-block:: perl + + default: Target default code model + small: Small code model + kernel: Kernel code model + medium: Medium code model + large: Large code model + + + + +**-disable-post-RA-scheduler** + + Disable scheduling after register allocation. + + + +**-disable-spill-fusing** + + Disable fusing of spill code into instructions. + + + +**-enable-correct-eh-support** + + Make the -lowerinvoke pass insert expensive, but correct, EH code. + + + +**-jit-enable-eh** + + Exception handling should be enabled in the just-in-time compiler. + + + +**-join-liveintervals** + + Coalesce copies (default=true). + + + +**-nozero-initialized-in-bss** Don't place zero-initialized symbols into the BSS section. + + + +**-pre-RA-sched**\ =\ *scheduler* + + Instruction schedulers available (before register allocation): + + + .. code-block:: perl + + =default: Best scheduler for the target + =none: No scheduling: breadth first sequencing + =simple: Simple two pass scheduling: minimize critical path and maximize processor utilization + =simple-noitin: Simple two pass scheduling: Same as simple except using generic latency + =list-burr: Bottom-up register reduction list scheduling + =list-tdrr: Top-down register reduction list scheduling + =list-td: Top-down list scheduler -print-machineinstrs - Print generated machine code + + + + +**-regalloc**\ =\ *allocator* + + Register allocator to use (default=linearscan) + + + .. 
code-block:: perl + + =bigblock: Big-block register allocator + =linearscan: linear scan register allocator =local - local register allocator + =simple: simple register allocator + + + + +**-relocation-model**\ =\ *model* + + Choose relocation model from: + + + .. code-block:: perl + + =default: Target default relocation model + =static: Non-relocatable code =pic - Fully relocatable, position independent code + =dynamic-no-pic: Relocatable external references, non-relocatable code + + + + +**-spiller** + + Spiller to use (default=local) + + + .. code-block:: perl + + =simple: simple spiller + =local: local spiller + + + + +**-x86-asm-syntax**\ =\ *syntax* + + Choose style of code to emit from X86 backend: + + + .. code-block:: perl + + =att: Emit AT&T-style assembly + =intel: Emit Intel-style assembly + + + + + +EXIT STATUS +----------- + + +If **lli** fails to load the program, it will exit with an exit code of 1. +Otherwise, it will return the exit code of the program it executes. + + +SEE ALSO +-------- + + +llc|llc diff --git a/docs/CommandGuide/llvm-ar.pod b/docs/CommandGuide/llvm-ar.pod deleted file mode 100644 index a8f01b0..0000000 --- a/docs/CommandGuide/llvm-ar.pod +++ /dev/null @@ -1,406 +0,0 @@ -=pod - -=head1 NAME - -llvm-ar - LLVM archiver - -=head1 SYNOPSIS - -B<llvm-ar> [-]{dmpqrtx}[Rabfikouz] [relpos] [count] <archive> [files...] - - -=head1 DESCRIPTION - -The B<llvm-ar> command is similar to the common Unix utility, C<ar>. It -archives several files together into a single file. The intent for this is -to produce archive libraries by LLVM bitcode that can be linked into an -LLVM program. However, the archive can contain any kind of file. By default, -B<llvm-ar> generates a symbol table that makes linking faster because -only the symbol table needs to be consulted, not each individual file member -of the archive. - -The B<llvm-ar> command can be used to I<read> both SVR4 and BSD style archive -files. However, it cannot be used to write them. While the B<llvm-ar> command -produces files that are I<almost> identical to the format used by other C<ar> -implementations, it has two significant departures in order to make the -archive appropriate for LLVM. The first departure is that B<llvm-ar> only -uses BSD4.4 style long path names (stored immediately after the header) and -never contains a string table for long names. The second departure is that the -symbol table is formated for efficient construction of an in-memory data -structure that permits rapid (red-black tree) lookups. Consequently, archives -produced with B<llvm-ar> usually won't be readable or editable with any -C<ar> implementation or useful for linking. Using the C<f> modifier to flatten -file names will make the archive readable by other C<ar> implementations -but not for linking because the symbol table format for LLVM is unique. If an -SVR4 or BSD style archive is used with the C<r> (replace) or C<q> (quick -update) operations, the archive will be reconstructed in LLVM format. This -means that the string table will be dropped (in deference to BSD 4.4 long names) -and an LLVM symbol table will be added (by default). The system symbol table -will be retained. - -Here's where B<llvm-ar> departs from previous C<ar> implementations: - -=over - -=item I<Symbol Table> - -Since B<llvm-ar> is intended to archive bitcode files, the symbol table -won't make much sense to anything but LLVM. Consequently, the symbol table's -format has been simplified. 
It consists simply of a sequence of pairs -of a file member index number as an LSB 4byte integer and a null-terminated -string. - -=item I<Long Paths> - -Some C<ar> implementations (SVR4) use a separate file member to record long -path names (> 15 characters). B<llvm-ar> takes the BSD 4.4 and Mac OS X -approach which is to simply store the full path name immediately preceding -the data for the file. The path name is null terminated and may contain the -slash (/) character. - -=item I<Compression> - -B<llvm-ar> can compress the members of an archive to save space. The -compression used depends on what's available on the platform and what choices -the LLVM Compressor utility makes. It generally favors bzip2 but will select -between "no compression" or bzip2 depending on what makes sense for the -file's content. - -=item I<Directory Recursion> - -Most C<ar> implementations do not recurse through directories but simply -ignore directories if they are presented to the program in the F<files> -option. B<llvm-ar>, however, can recurse through directory structures and -add all the files under a directory, if requested. - -=item I<TOC Verbose Output> - -When B<llvm-ar> prints out the verbose table of contents (C<tv> option), it -precedes the usual output with a character indicating the basic kind of -content in the file. A blank means the file is a regular file. A 'Z' means -the file is compressed. A 'B' means the file is an LLVM bitcode file. An -'S' means the file is the symbol table. - -=back - -=head1 OPTIONS - -The options to B<llvm-ar> are compatible with other C<ar> implementations. -However, there are a few modifiers (F<zR>) that are not found in other -C<ar>s. The options to B<llvm-ar> specify a single basic operation to -perform on the archive, a variety of modifiers for that operation, the -name of the archive file, and an optional list of file names. These options -are used to determine how B<llvm-ar> should process the archive file. - -The Operations and Modifiers are explained in the sections below. The minimal -set of options is at least one operator and the name of the archive. Typically -archive files end with a C<.a> suffix, but this is not required. Following -the F<archive-name> comes a list of F<files> that indicate the specific members -of the archive to operate on. If the F<files> option is not specified, it -generally means either "none" or "all" members, depending on the operation. - -=head2 Operations - -=over - -=item d - -Delete files from the archive. No modifiers are applicable to this operation. -The F<files> options specify which members should be removed from the -archive. It is not an error if a specified file does not appear in the archive. -If no F<files> are specified, the archive is not modified. - -=item m[abi] - -Move files from one location in the archive to another. The F<a>, F<b>, and -F<i> modifiers apply to this operation. The F<files> will all be moved -to the location given by the modifiers. If no modifiers are used, the files -will be moved to the end of the archive. If no F<files> are specified, the -archive is not modified. - -=item p[k] - -Print files to the standard output. The F<k> modifier applies to this -operation. This operation simply prints the F<files> indicated to the -standard output. If no F<files> are specified, the entire archive is printed. -Printing bitcode files is ill-advised as they might confuse your terminal -settings. The F<p> operation never modifies the archive. 
- -=item q[Rfz] - -Quickly append files to the end of the archive. The F<R>, F<f>, and F<z> -modifiers apply to this operation. This operation quickly adds the -F<files> to the archive without checking for duplicates that should be -removed first. If no F<files> are specified, the archive is not modified. -Because of the way that B<llvm-ar> constructs the archive file, its dubious -whether the F<q> operation is any faster than the F<r> operation. - -=item r[Rabfuz] - -Replace or insert file members. The F<R>, F<a>, F<b>, F<f>, F<u>, and F<z> -modifiers apply to this operation. This operation will replace existing -F<files> or insert them at the end of the archive if they do not exist. If no -F<files> are specified, the archive is not modified. - -=item t[v] - -Print the table of contents. Without any modifiers, this operation just prints -the names of the members to the standard output. With the F<v> modifier, -B<llvm-ar> also prints out the file type (B=bitcode, Z=compressed, S=symbol -table, blank=regular file), the permission mode, the owner and group, the -size, and the date. If any F<files> are specified, the listing is only for -those files. If no F<files> are specified, the table of contents for the -whole archive is printed. - -=item x[oP] - -Extract archive members back to files. The F<o> modifier applies to this -operation. This operation retrieves the indicated F<files> from the archive -and writes them back to the operating system's file system. If no -F<files> are specified, the entire archive is extract. - -=back - -=head2 Modifiers (operation specific) - -The modifiers below are specific to certain operations. See the Operations -section (above) to determine which modifiers are applicable to which operations. - -=over - -=item [a] - -When inserting or moving member files, this option specifies the destination of -the new files as being C<a>fter the F<relpos> member. If F<relpos> is not found, -the files are placed at the end of the archive. - -=item [b] - -When inserting or moving member files, this option specifies the destination of -the new files as being C<b>efore the F<relpos> member. If F<relpos> is not -found, the files are placed at the end of the archive. This modifier is -identical to the the F<i> modifier. - -=item [f] - -Normally, B<llvm-ar> stores the full path name to a file as presented to it on -the command line. With this option, truncated (15 characters max) names are -used. This ensures name compatibility with older versions of C<ar> but may also -thwart correct extraction of the files (duplicates may overwrite). If used with -the F<R> option, the directory recursion will be performed but the file names -will all be C<f>lattened to simple file names. - -=item [i] - -A synonym for the F<b> option. - -=item [k] - -Normally, B<llvm-ar> will not print the contents of bitcode files when the -F<p> operation is used. This modifier defeats the default and allows the -bitcode members to be printed. - -=item [N] - -This option is ignored by B<llvm-ar> but provided for compatibility. - -=item [o] - -When extracting files, this option will cause B<llvm-ar> to preserve the -original modification times of the files it writes. - -=item [P] - -use full path names when matching - -=item [R] - -This modifier instructions the F<r> option to recursively process directories. -Without F<R>, directories are ignored and only those F<files> that refer to -files will be added to the archive. 
When F<R> is used, any directories specified -with F<files> will be scanned (recursively) to find files to be added to the -archive. Any file whose name begins with a dot will not be added. - -=item [u] - -When replacing existing files in the archive, only replace those files that have -a time stamp than the time stamp of the member in the archive. - -=item [z] - -When inserting or replacing any file in the archive, compress the file first. -This -modifier is safe to use when (previously) compressed bitcode files are added to -the archive; the compressed bitcode files will not be doubly compressed. - -=back - -=head2 Modifiers (generic) - -The modifiers below may be applied to any operation. - -=over - -=item [c] - -For all operations, B<llvm-ar> will always create the archive if it doesn't -exist. Normally, B<llvm-ar> will print a warning message indicating that the -archive is being created. Using this modifier turns off that warning. - -=item [s] - -This modifier requests that an archive index (or symbol table) be added to the -archive. This is the default mode of operation. The symbol table will contain -all the externally visible functions and global variables defined by all the -bitcode files in the archive. Using this modifier is more efficient that using -L<llvm-ranlib|llvm-ranlib> which also creates the symbol table. - -=item [S] - -This modifier is the opposite of the F<s> modifier. It instructs B<llvm-ar> to -not build the symbol table. If both F<s> and F<S> are used, the last modifier to -occur in the options will prevail. - -=item [v] - -This modifier instructs B<llvm-ar> to be verbose about what it is doing. Each -editing operation taken against the archive will produce a line of output saying -what is being done. - -=back - -=head1 STANDARDS - -The B<llvm-ar> utility is intended to provide a superset of the IEEE Std 1003.2 -(POSIX.2) functionality for C<ar>. B<llvm-ar> can read both SVR4 and BSD4.4 (or -Mac OS X) archives. If the C<f> modifier is given to the C<x> or C<r> operations -then B<llvm-ar> will write SVR4 compatible archives. Without this modifier, -B<llvm-ar> will write BSD4.4 compatible archives that have long names -immediately after the header and indicated using the "#1/ddd" notation for the -name in the header. - -=head1 FILE FORMAT - -The file format for LLVM Archive files is similar to that of BSD 4.4 or Mac OSX -archive files. In fact, except for the symbol table, the C<ar> commands on those -operating systems should be able to read LLVM archive files. The details of the -file format follow. - -Each archive begins with the archive magic number which is the eight printable -characters "!<arch>\n" where \n represents the newline character (0x0A). -Following the magic number, the file is composed of even length members that -begin with an archive header and end with a \n padding character if necessary -(to make the length even). Each file member is composed of a header (defined -below), an optional newline-terminated "long file name" and the contents of -the file. - -The fields of the header are described in the items below. All fields of the -header contain only ASCII characters, are left justified and are right padded -with space characters. - -=over - -=item name - char[16] - -This field of the header provides the name of the archive member. If the name is -longer than 15 characters or contains a slash (/) character, then this field -contains C<#1/nnn> where C<nnn> provides the length of the name and the C<#1/> -is literal. 
In this case, the actual name of the file is provided in the C<nnn> -bytes immediately following the header. If the name is 15 characters or less, it -is contained directly in this field and terminated with a slash (/) character. - -=item date - char[12] - -This field provides the date of modification of the file in the form of a -decimal encoded number that provides the number of seconds since the epoch -(since 00:00:00 Jan 1, 1970) per Posix specifications. - -=item uid - char[6] - -This field provides the user id of the file encoded as a decimal ASCII string. -This field might not make much sense on non-Unix systems. On Unix, it is the -same value as the st_uid field of the stat structure returned by the stat(2) -operating system call. - -=item gid - char[6] - -This field provides the group id of the file encoded as a decimal ASCII string. -This field might not make much sense on non-Unix systems. On Unix, it is the -same value as the st_gid field of the stat structure returned by the stat(2) -operating system call. - -=item mode - char[8] - -This field provides the access mode of the file encoded as an octal ASCII -string. This field might not make much sense on non-Unix systems. On Unix, it -is the same value as the st_mode field of the stat structure returned by the -stat(2) operating system call. - -=item size - char[10] - -This field provides the size of the file, in bytes, encoded as a decimal ASCII -string. If the size field is negative (starts with a minus sign, 0x02D), then -the archive member is stored in compressed form. The first byte of the archive -member's data indicates the compression type used. A value of 0 (0x30) indicates -that no compression was used. A value of 2 (0x32) indicates that bzip2 -compression was used. - -=item fmag - char[2] - -This field is the archive file member magic number. Its content is always the -two characters back tick (0x60) and newline (0x0A). This provides some measure -utility in identifying archive files that have been corrupted. - -=back - -The LLVM symbol table has the special name "#_LLVM_SYM_TAB_#". It is presumed -that no regular archive member file will want this name. The LLVM symbol table -is simply composed of a sequence of triplets: byte offset, length of symbol, -and the symbol itself. Symbols are not null or newline terminated. Here are -the details on each of these items: - -=over - -=item offset - vbr encoded 32-bit integer - -The offset item provides the offset into the archive file where the bitcode -member is stored that is associated with the symbol. The offset value is 0 -based at the start of the first "normal" file member. To derive the actual -file offset of the member, you must add the number of bytes occupied by the file -signature (8 bytes) and the symbol tables. The value of this item is encoded -using variable bit rate encoding to reduce the size of the symbol table. -Variable bit rate encoding uses the high bit (0x80) of each byte to indicate -if there are more bytes to follow. The remaining 7 bits in each byte carry bits -from the value. The final byte does not have the high bit set. - -=item length - vbr encoded 32-bit integer - -The length item provides the length of the symbol that follows. Like this -I<offset> item, the length is variable bit rate encoded. - -=item symbol - character array - -The symbol item provides the text of the symbol that is associated with the -I<offset>. The symbol is not terminated by any character. Its length is provided -by the I<length> field. 
Note that is allowed (but unwise) to use non-printing -characters (even 0x00) in the symbol. This allows for multiple encodings of -symbol names. - -=back - -=head1 EXIT STATUS - -If B<llvm-ar> succeeds, it will exit with 0. A usage error, results -in an exit code of 1. A hard (file system typically) error results in an -exit code of 2. Miscellaneous or unknown errors result in an -exit code of 3. - -=head1 SEE ALSO - -L<llvm-ranlib|llvm-ranlib>, ar(1) - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llvm-ar.rst b/docs/CommandGuide/llvm-ar.rst new file mode 100644 index 0000000..8ff4192 --- /dev/null +++ b/docs/CommandGuide/llvm-ar.rst @@ -0,0 +1,458 @@ +llvm-ar - LLVM archiver +======================= + + +SYNOPSIS +-------- + + +**llvm-ar** [-]{dmpqrtx}[Rabfikou] [relpos] [count] <archive> [files...] + + +DESCRIPTION +----------- + + +The **llvm-ar** command is similar to the common Unix utility, ``ar``. It +archives several files together into a single file. The intent for this is +to produce archive libraries by LLVM bitcode that can be linked into an +LLVM program. However, the archive can contain any kind of file. By default, +**llvm-ar** generates a symbol table that makes linking faster because +only the symbol table needs to be consulted, not each individual file member +of the archive. + +The **llvm-ar** command can be used to *read* both SVR4 and BSD style archive +files. However, it cannot be used to write them. While the **llvm-ar** command +produces files that are *almost* identical to the format used by other ``ar`` +implementations, it has two significant departures in order to make the +archive appropriate for LLVM. The first departure is that **llvm-ar** only +uses BSD4.4 style long path names (stored immediately after the header) and +never contains a string table for long names. The second departure is that the +symbol table is formated for efficient construction of an in-memory data +structure that permits rapid (red-black tree) lookups. Consequently, archives +produced with **llvm-ar** usually won't be readable or editable with any +``ar`` implementation or useful for linking. Using the ``f`` modifier to flatten +file names will make the archive readable by other ``ar`` implementations +but not for linking because the symbol table format for LLVM is unique. If an +SVR4 or BSD style archive is used with the ``r`` (replace) or ``q`` (quick +update) operations, the archive will be reconstructed in LLVM format. This +means that the string table will be dropped (in deference to BSD 4.4 long names) +and an LLVM symbol table will be added (by default). The system symbol table +will be retained. + +Here's where **llvm-ar** departs from previous ``ar`` implementations: + + +*Symbol Table* + + Since **llvm-ar** is intended to archive bitcode files, the symbol table + won't make much sense to anything but LLVM. Consequently, the symbol table's + format has been simplified. It consists simply of a sequence of pairs + of a file member index number as an LSB 4byte integer and a null-terminated + string. + + + +*Long Paths* + + Some ``ar`` implementations (SVR4) use a separate file member to record long + path names (> 15 characters). **llvm-ar** takes the BSD 4.4 and Mac OS X + approach which is to simply store the full path name immediately preceding + the data for the file. The path name is null terminated and may contain the + slash (/) character. 
+ + + +*Directory Recursion* + + Most ``ar`` implementations do not recurse through directories but simply + ignore directories if they are presented to the program in the *files* + option. **llvm-ar**, however, can recurse through directory structures and + add all the files under a directory, if requested. + + + +*TOC Verbose Output* + + When **llvm-ar** prints out the verbose table of contents (``tv`` option), it + precedes the usual output with a character indicating the basic kind of + content in the file. A blank means the file is a regular file. A 'B' means + the file is an LLVM bitcode file. An 'S' means the file is the symbol table. + + + + +OPTIONS +------- + + +The options to **llvm-ar** are compatible with other ``ar`` implementations. +However, there are a few modifiers (*R*) that are not found in other ``ar`` +implementations. The options to **llvm-ar** specify a single basic operation to +perform on the archive, a variety of modifiers for that operation, the name of +the archive file, and an optional list of file names. These options are used to +determine how **llvm-ar** should process the archive file. + +The Operations and Modifiers are explained in the sections below. The minimal +set of options is at least one operator and the name of the archive. Typically +archive files end with a ``.a`` suffix, but this is not required. Following +the *archive-name* comes a list of *files* that indicate the specific members +of the archive to operate on. If the *files* option is not specified, it +generally means either "none" or "all" members, depending on the operation. + +Operations +~~~~~~~~~~ + + + +d + + Delete files from the archive. No modifiers are applicable to this operation. + The *files* options specify which members should be removed from the + archive. It is not an error if a specified file does not appear in the archive. + If no *files* are specified, the archive is not modified. + + + +m[abi] + + Move files from one location in the archive to another. The *a*, *b*, and + *i* modifiers apply to this operation. The *files* will all be moved + to the location given by the modifiers. If no modifiers are used, the files + will be moved to the end of the archive. If no *files* are specified, the + archive is not modified. + + + +p[k] + + Print files to the standard output. The *k* modifier applies to this + operation. This operation simply prints the *files* indicated to the + standard output. If no *files* are specified, the entire archive is printed. + Printing bitcode files is ill-advised as they might confuse your terminal + settings. The *p* operation never modifies the archive. + + + +q[Rf] + + Quickly append files to the end of the archive. The *R*, and *f* + modifiers apply to this operation. This operation quickly adds the + *files* to the archive without checking for duplicates that should be + removed first. If no *files* are specified, the archive is not modified. + Because of the way that **llvm-ar** constructs the archive file, its dubious + whether the *q* operation is any faster than the *r* operation. + + + +r[Rabfu] + + Replace or insert file members. The *R*, *a*, *b*, *f*, and *u* + modifiers apply to this operation. This operation will replace existing + *files* or insert them at the end of the archive if they do not exist. If no + *files* are specified, the archive is not modified. + + + +t[v] + + Print the table of contents. Without any modifiers, this operation just prints + the names of the members to the standard output. 
With the *v* modifier, + **llvm-ar** also prints out the file type (B=bitcode, S=symbol + table, blank=regular file), the permission mode, the owner and group, the + size, and the date. If any *files* are specified, the listing is only for + those files. If no *files* are specified, the table of contents for the + whole archive is printed. + + + +x[oP] + + Extract archive members back to files. The *o* modifier applies to this + operation. This operation retrieves the indicated *files* from the archive + and writes them back to the operating system's file system. If no + *files* are specified, the entire archive is extract. + + + + +Modifiers (operation specific) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + +The modifiers below are specific to certain operations. See the Operations +section (above) to determine which modifiers are applicable to which operations. + + +[a] + + When inserting or moving member files, this option specifies the destination of + the new files as being after the *relpos* member. If *relpos* is not found, + the files are placed at the end of the archive. + + + +[b] + + When inserting or moving member files, this option specifies the destination of + the new files as being before the *relpos* member. If *relpos* is not + found, the files are placed at the end of the archive. This modifier is + identical to the *i* modifier. + + + +[f] + + Normally, **llvm-ar** stores the full path name to a file as presented to it on + the command line. With this option, truncated (15 characters max) names are + used. This ensures name compatibility with older versions of ``ar`` but may also + thwart correct extraction of the files (duplicates may overwrite). If used with + the *R* option, the directory recursion will be performed but the file names + will all be flattened to simple file names. + + + +[i] + + A synonym for the *b* option. + + + +[k] + + Normally, **llvm-ar** will not print the contents of bitcode files when the + *p* operation is used. This modifier defeats the default and allows the + bitcode members to be printed. + + + +[N] + + This option is ignored by **llvm-ar** but provided for compatibility. + + + +[o] + + When extracting files, this option will cause **llvm-ar** to preserve the + original modification times of the files it writes. + + + +[P] + + use full path names when matching + + + +[R] + + This modifier instructions the *r* option to recursively process directories. + Without *R*, directories are ignored and only those *files* that refer to + files will be added to the archive. When *R* is used, any directories specified + with *files* will be scanned (recursively) to find files to be added to the + archive. Any file whose name begins with a dot will not be added. + + + +[u] + + When replacing existing files in the archive, only replace those files that have + a time stamp than the time stamp of the member in the archive. + + + + +Modifiers (generic) +~~~~~~~~~~~~~~~~~~~ + + +The modifiers below may be applied to any operation. + + +[c] + + For all operations, **llvm-ar** will always create the archive if it doesn't + exist. Normally, **llvm-ar** will print a warning message indicating that the + archive is being created. Using this modifier turns off that warning. + + + +[s] + + This modifier requests that an archive index (or symbol table) be added to the + archive. This is the default mode of operation. The symbol table will contain + all the externally visible functions and global variables defined by all the + bitcode files in the archive. 
Using this modifier is more efficient that using + llvm-ranlib|llvm-ranlib which also creates the symbol table. + + + +[S] + + This modifier is the opposite of the *s* modifier. It instructs **llvm-ar** to + not build the symbol table. If both *s* and *S* are used, the last modifier to + occur in the options will prevail. + + + +[v] + + This modifier instructs **llvm-ar** to be verbose about what it is doing. Each + editing operation taken against the archive will produce a line of output saying + what is being done. + + + + + +STANDARDS +--------- + + +The **llvm-ar** utility is intended to provide a superset of the IEEE Std 1003.2 +(POSIX.2) functionality for ``ar``. **llvm-ar** can read both SVR4 and BSD4.4 (or +Mac OS X) archives. If the ``f`` modifier is given to the ``x`` or ``r`` operations +then **llvm-ar** will write SVR4 compatible archives. Without this modifier, +**llvm-ar** will write BSD4.4 compatible archives that have long names +immediately after the header and indicated using the "#1/ddd" notation for the +name in the header. + + +FILE FORMAT +----------- + + +The file format for LLVM Archive files is similar to that of BSD 4.4 or Mac OSX +archive files. In fact, except for the symbol table, the ``ar`` commands on those +operating systems should be able to read LLVM archive files. The details of the +file format follow. + +Each archive begins with the archive magic number which is the eight printable +characters "!<arch>\n" where \n represents the newline character (0x0A). +Following the magic number, the file is composed of even length members that +begin with an archive header and end with a \n padding character if necessary +(to make the length even). Each file member is composed of a header (defined +below), an optional newline-terminated "long file name" and the contents of +the file. + +The fields of the header are described in the items below. All fields of the +header contain only ASCII characters, are left justified and are right padded +with space characters. + + +name - char[16] + + This field of the header provides the name of the archive member. If the name is + longer than 15 characters or contains a slash (/) character, then this field + contains ``#1/nnn`` where ``nnn`` provides the length of the name and the ``#1/`` + is literal. In this case, the actual name of the file is provided in the ``nnn`` + bytes immediately following the header. If the name is 15 characters or less, it + is contained directly in this field and terminated with a slash (/) character. + + + +date - char[12] + + This field provides the date of modification of the file in the form of a + decimal encoded number that provides the number of seconds since the epoch + (since 00:00:00 Jan 1, 1970) per Posix specifications. + + + +uid - char[6] + + This field provides the user id of the file encoded as a decimal ASCII string. + This field might not make much sense on non-Unix systems. On Unix, it is the + same value as the st_uid field of the stat structure returned by the stat(2) + operating system call. + + + +gid - char[6] + + This field provides the group id of the file encoded as a decimal ASCII string. + This field might not make much sense on non-Unix systems. On Unix, it is the + same value as the st_gid field of the stat structure returned by the stat(2) + operating system call. + + + +mode - char[8] + + This field provides the access mode of the file encoded as an octal ASCII + string. This field might not make much sense on non-Unix systems. 
On Unix, it + is the same value as the st_mode field of the stat structure returned by the + stat(2) operating system call. + + + +size - char[10] + + This field provides the size of the file, in bytes, encoded as a decimal ASCII + string. + + + +fmag - char[2] + + This field is the archive file member magic number. Its content is always the + two characters back tick (0x60) and newline (0x0A). This provides some measure + utility in identifying archive files that have been corrupted. + + + +The LLVM symbol table has the special name "#_LLVM_SYM_TAB_#". It is presumed +that no regular archive member file will want this name. The LLVM symbol table +is simply composed of a sequence of triplets: byte offset, length of symbol, +and the symbol itself. Symbols are not null or newline terminated. Here are +the details on each of these items: + + +offset - vbr encoded 32-bit integer + + The offset item provides the offset into the archive file where the bitcode + member is stored that is associated with the symbol. The offset value is 0 + based at the start of the first "normal" file member. To derive the actual + file offset of the member, you must add the number of bytes occupied by the file + signature (8 bytes) and the symbol tables. The value of this item is encoded + using variable bit rate encoding to reduce the size of the symbol table. + Variable bit rate encoding uses the high bit (0x80) of each byte to indicate + if there are more bytes to follow. The remaining 7 bits in each byte carry bits + from the value. The final byte does not have the high bit set. + + + +length - vbr encoded 32-bit integer + + The length item provides the length of the symbol that follows. Like this + *offset* item, the length is variable bit rate encoded. + + + +symbol - character array + + The symbol item provides the text of the symbol that is associated with the + *offset*. The symbol is not terminated by any character. Its length is provided + by the *length* field. Note that is allowed (but unwise) to use non-printing + characters (even 0x00) in the symbol. This allows for multiple encodings of + symbol names. + + + + +EXIT STATUS +----------- + + +If **llvm-ar** succeeds, it will exit with 0. A usage error, results +in an exit code of 1. A hard (file system typically) error results in an +exit code of 2. Miscellaneous or unknown errors result in an +exit code of 3. + + +SEE ALSO +-------- + + +llvm-ranlib|llvm-ranlib, ar(1) diff --git a/docs/CommandGuide/llvm-as.pod b/docs/CommandGuide/llvm-as.pod deleted file mode 100644 index cc81887..0000000 --- a/docs/CommandGuide/llvm-as.pod +++ /dev/null @@ -1,77 +0,0 @@ -=pod - -=head1 NAME - -llvm-as - LLVM assembler - -=head1 SYNOPSIS - -B<llvm-as> [I<options>] [I<filename>] - -=head1 DESCRIPTION - -B<llvm-as> is the LLVM assembler. It reads a file containing human-readable -LLVM assembly language, translates it to LLVM bitcode, and writes the result -into a file or to standard output. - -If F<filename> is omitted or is C<->, then B<llvm-as> reads its input from -standard input. - -If an output file is not specified with the B<-o> option, then -B<llvm-as> sends its output to a file or standard output by following -these rules: - -=over - -=item * - -If the input is standard input, then the output is standard output. - -=item * - -If the input is a file that ends with C<.ll>, then the output file is of -the same name, except that the suffix is changed to C<.bc>. 
- -=item * - -If the input is a file that does not end with the C<.ll> suffix, then the -output file has the same name as the input file, except that the C<.bc> -suffix is appended. - -=back - -=head1 OPTIONS - -=over - -=item B<-f> - -Enable binary output on terminals. Normally, B<llvm-as> will refuse to -write raw bitcode output if the output stream is a terminal. With this option, -B<llvm-as> will write raw bitcode regardless of the output device. - -=item B<-help> - -Print a summary of command line options. - -=item B<-o> F<filename> - -Specify the output file name. If F<filename> is C<->, then B<llvm-as> -sends its output to standard output. - -=back - -=head1 EXIT STATUS - -If B<llvm-as> succeeds, it will exit with 0. Otherwise, if an error -occurs, it will exit with a non-zero value. - -=head1 SEE ALSO - -L<llvm-dis|llvm-dis>, L<gccas|gccas> - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llvm-as.rst b/docs/CommandGuide/llvm-as.rst new file mode 100644 index 0000000..1b499bb --- /dev/null +++ b/docs/CommandGuide/llvm-as.rst @@ -0,0 +1,56 @@ +llvm-as - LLVM assembler +======================== + +SYNOPSIS +-------- + +**llvm-as** [*options*] [*filename*] + +DESCRIPTION +----------- + +**llvm-as** is the LLVM assembler. It reads a file containing human-readable +LLVM assembly language, translates it to LLVM bitcode, and writes the result +into a file or to standard output. + +If *filename* is omitted or is ``-``, then **llvm-as** reads its input from +standard input. + +If an output file is not specified with the **-o** option, then +**llvm-as** sends its output to a file or standard output by following +these rules: + +* If the input is standard input, then the output is standard output. + +* If the input is a file that ends with ``.ll``, then the output file is of the + same name, except that the suffix is changed to ``.bc``. + +* If the input is a file that does not end with the ``.ll`` suffix, then the + output file has the same name as the input file, except that the ``.bc`` + suffix is appended. + +OPTIONS +------- + +**-f** + Enable binary output on terminals. Normally, **llvm-as** will refuse to + write raw bitcode output if the output stream is a terminal. With this option, + **llvm-as** will write raw bitcode regardless of the output device. + +**-help** + Print a summary of command line options. + +**-o** *filename* + Specify the output file name. If *filename* is ``-``, then **llvm-as** + sends its output to standard output. + +EXIT STATUS +----------- + +If **llvm-as** succeeds, it will exit with 0. Otherwise, if an error occurs, it +will exit with a non-zero value. + +SEE ALSO +-------- + +llvm-dis|llvm-dis, gccas|gccas diff --git a/docs/CommandGuide/llvm-bcanalyzer.pod b/docs/CommandGuide/llvm-bcanalyzer.pod deleted file mode 100644 index 9c5021b..0000000 --- a/docs/CommandGuide/llvm-bcanalyzer.pod +++ /dev/null @@ -1,315 +0,0 @@ -=pod - -=head1 NAME - -llvm-bcanalyzer - LLVM bitcode analyzer - -=head1 SYNOPSIS - -B<llvm-bcanalyzer> [I<options>] [F<filename>] - -=head1 DESCRIPTION - -The B<llvm-bcanalyzer> command is a small utility for analyzing bitcode files. -The tool reads a bitcode file (such as generated with the B<llvm-as> tool) and -produces a statistical report on the contents of the bitcode file. The tool -can also dump a low level but human readable version of the bitcode file. -This tool is probably not of much interest or utility except for those working -directly with the bitcode file format. 
Most LLVM users can just ignore -this tool. - -If F<filename> is omitted or is C<->, then B<llvm-bcanalyzer> reads its input -from standard input. This is useful for combining the tool into a pipeline. -Output is written to the standard output. - -=head1 OPTIONS - -=over - -=item B<-nodetails> - -Causes B<llvm-bcanalyzer> to abbreviate its output by writing out only a module -level summary. The details for individual functions are not displayed. - -=item B<-dump> - -Causes B<llvm-bcanalyzer> to dump the bitcode in a human readable format. This -format is significantly different from LLVM assembly and provides details about -the encoding of the bitcode file. - -=item B<-verify> - -Causes B<llvm-bcanalyzer> to verify the module produced by reading the -bitcode. This ensures that the statistics generated are based on a consistent -module. - -=item B<-help> - -Print a summary of command line options. - -=back - -=head1 EXIT STATUS - -If B<llvm-bcanalyzer> succeeds, it will exit with 0. Otherwise, if an error -occurs, it will exit with a non-zero value, usually 1. - -=head1 SUMMARY OUTPUT DEFINITIONS - -The following items are always printed by llvm-bcanalyzer. They comprize the -summary output. - -=over - -=item B<Bitcode Analysis Of Module> - -This just provides the name of the module for which bitcode analysis is being -generated. - -=item B<Bitcode Version Number> - -The bitcode version (not LLVM version) of the file read by the analyzer. - -=item B<File Size> - -The size, in bytes, of the entire bitcode file. - -=item B<Module Bytes> - -The size, in bytes, of the module block. Percentage is relative to File Size. - -=item B<Function Bytes> - -The size, in bytes, of all the function blocks. Percentage is relative to File -Size. - -=item B<Global Types Bytes> - -The size, in bytes, of the Global Types Pool. Percentage is relative to File -Size. This is the size of the definitions of all types in the bitcode file. - -=item B<Constant Pool Bytes> - -The size, in bytes, of the Constant Pool Blocks Percentage is relative to File -Size. - -=item B<Module Globals Bytes> - -Ths size, in bytes, of the Global Variable Definitions and their initializers. -Percentage is relative to File Size. - -=item B<Instruction List Bytes> - -The size, in bytes, of all the instruction lists in all the functions. -Percentage is relative to File Size. Note that this value is also included in -the Function Bytes. - -=item B<Compaction Table Bytes> - -The size, in bytes, of all the compaction tables in all the functions. -Percentage is relative to File Size. Note that this value is also included in -the Function Bytes. - -=item B<Symbol Table Bytes> - -The size, in bytes, of all the symbol tables in all the functions. Percentage is -relative to File Size. Note that this value is also included in the Function -Bytes. - -=item B<Dependent Libraries Bytes> - -The size, in bytes, of the list of dependent libraries in the module. Percentage -is relative to File Size. Note that this value is also included in the Module -Global Bytes. - -=item B<Number Of Bitcode Blocks> - -The total number of blocks of any kind in the bitcode file. - -=item B<Number Of Functions> - -The total number of function definitions in the bitcode file. - -=item B<Number Of Types> - -The total number of types defined in the Global Types Pool. - -=item B<Number Of Constants> - -The total number of constants (of any type) defined in the Constant Pool. 
- -=item B<Number Of Basic Blocks> - -The total number of basic blocks defined in all functions in the bitcode file. - -=item B<Number Of Instructions> - -The total number of instructions defined in all functions in the bitcode file. - -=item B<Number Of Long Instructions> - -The total number of long instructions defined in all functions in the bitcode -file. Long instructions are those taking greater than 4 bytes. Typically long -instructions are GetElementPtr with several indices, PHI nodes, and calls to -functions with large numbers of arguments. - -=item B<Number Of Operands> - -The total number of operands used in all instructions in the bitcode file. - -=item B<Number Of Compaction Tables> - -The total number of compaction tables in all functions in the bitcode file. - -=item B<Number Of Symbol Tables> - -The total number of symbol tables in all functions in the bitcode file. - -=item B<Number Of Dependent Libs> - -The total number of dependent libraries found in the bitcode file. - -=item B<Total Instruction Size> - -The total size of the instructions in all functions in the bitcode file. - -=item B<Average Instruction Size> - -The average number of bytes per instruction across all functions in the bitcode -file. This value is computed by dividing Total Instruction Size by Number Of -Instructions. - -=item B<Maximum Type Slot Number> - -The maximum value used for a type's slot number. Larger slot number values take -more bytes to encode. - -=item B<Maximum Value Slot Number> - -The maximum value used for a value's slot number. Larger slot number values take -more bytes to encode. - -=item B<Bytes Per Value> - -The average size of a Value definition (of any type). This is computed by -dividing File Size by the total number of values of any type. - -=item B<Bytes Per Global> - -The average size of a global definition (constants and global variables). - -=item B<Bytes Per Function> - -The average number of bytes per function definition. This is computed by -dividing Function Bytes by Number Of Functions. - -=item B<# of VBR 32-bit Integers> - -The total number of 32-bit integers encoded using the Variable Bit Rate -encoding scheme. - -=item B<# of VBR 64-bit Integers> - -The total number of 64-bit integers encoded using the Variable Bit Rate encoding -scheme. - -=item B<# of VBR Compressed Bytes> - -The total number of bytes consumed by the 32-bit and 64-bit integers that use -the Variable Bit Rate encoding scheme. - -=item B<# of VBR Expanded Bytes> - -The total number of bytes that would have been consumed by the 32-bit and 64-bit -integers had they not been compressed with the Variable Bit Rage encoding -scheme. - -=item B<Bytes Saved With VBR> - -The total number of bytes saved by using the Variable Bit Rate encoding scheme. -The percentage is relative to # of VBR Expanded Bytes. - -=back - -=head1 DETAILED OUTPUT DEFINITIONS - -The following definitions occur only if the -nodetails option was not given. -The detailed output provides additional information on a per-function basis. - -=over - -=item B<Type> - -The type signature of the function. - -=item B<Byte Size> - -The total number of bytes in the function's block. - -=item B<Basic Blocks> - -The number of basic blocks defined by the function. - -=item B<Instructions> - -The number of instructions defined by the function. - -=item B<Long Instructions> - -The number of instructions using the long instruction format in the function. - -=item B<Operands> - -The number of operands used by all instructions in the function. 
- -=item B<Instruction Size> - -The number of bytes consumed by instructions in the function. - -=item B<Average Instruction Size> - -The average number of bytes consumed by the instructions in the function. This -value is computed by dividing Instruction Size by Instructions. - -=item B<Bytes Per Instruction> - -The average number of bytes used by the function per instruction. This value is -computed by dividing Byte Size by Instructions. Note that this is not the same -as Average Instruction Size. It computes a number relative to the total function -size not just the size of the instruction list. - -=item B<Number of VBR 32-bit Integers> - -The total number of 32-bit integers found in this function (for any use). - -=item B<Number of VBR 64-bit Integers> - -The total number of 64-bit integers found in this function (for any use). - -=item B<Number of VBR Compressed Bytes> - -The total number of bytes in this function consumed by the 32-bit and 64-bit -integers that use the Variable Bit Rate encoding scheme. - -=item B<Number of VBR Expanded Bytes> - -The total number of bytes in this function that would have been consumed by -the 32-bit and 64-bit integers had they not been compressed with the Variable -Bit Rate encoding scheme. - -=item B<Bytes Saved With VBR> - -The total number of bytes saved in this function by using the Variable Bit -Rate encoding scheme. The percentage is relative to # of VBR Expanded Bytes. - -=back - -=head1 SEE ALSO - -L<llvm-dis|llvm-dis>, L<http://llvm.org/docs/BitCodeFormat.html> - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llvm-bcanalyzer.rst b/docs/CommandGuide/llvm-bcanalyzer.rst new file mode 100644 index 0000000..f1e4eac --- /dev/null +++ b/docs/CommandGuide/llvm-bcanalyzer.rst @@ -0,0 +1,424 @@ +llvm-bcanalyzer - LLVM bitcode analyzer +======================================= + + +SYNOPSIS +-------- + + +**llvm-bcanalyzer** [*options*] [*filename*] + + +DESCRIPTION +----------- + + +The **llvm-bcanalyzer** command is a small utility for analyzing bitcode files. +The tool reads a bitcode file (such as generated with the **llvm-as** tool) and +produces a statistical report on the contents of the bitcode file. The tool +can also dump a low level but human readable version of the bitcode file. +This tool is probably not of much interest or utility except for those working +directly with the bitcode file format. Most LLVM users can just ignore +this tool. + +If *filename* is omitted or is ``-``, then **llvm-bcanalyzer** reads its input +from standard input. This is useful for combining the tool into a pipeline. +Output is written to the standard output. + + +OPTIONS +------- + + + +**-nodetails** + + Causes **llvm-bcanalyzer** to abbreviate its output by writing out only a module + level summary. The details for individual functions are not displayed. + + + +**-dump** + + Causes **llvm-bcanalyzer** to dump the bitcode in a human readable format. This + format is significantly different from LLVM assembly and provides details about + the encoding of the bitcode file. + + + +**-verify** + + Causes **llvm-bcanalyzer** to verify the module produced by reading the + bitcode. This ensures that the statistics generated are based on a consistent + module. + + + +**-help** + + Print a summary of command line options. + + + + +EXIT STATUS +----------- + + +If **llvm-bcanalyzer** succeeds, it will exit with 0. Otherwise, if an error +occurs, it will exit with a non-zero value, usually 1. 
+ + +SUMMARY OUTPUT DEFINITIONS +-------------------------- + + +The following items are always printed by llvm-bcanalyzer. They comprize the +summary output. + + +**Bitcode Analysis Of Module** + + This just provides the name of the module for which bitcode analysis is being + generated. + + + +**Bitcode Version Number** + + The bitcode version (not LLVM version) of the file read by the analyzer. + + + +**File Size** + + The size, in bytes, of the entire bitcode file. + + + +**Module Bytes** + + The size, in bytes, of the module block. Percentage is relative to File Size. + + + +**Function Bytes** + + The size, in bytes, of all the function blocks. Percentage is relative to File + Size. + + + +**Global Types Bytes** + + The size, in bytes, of the Global Types Pool. Percentage is relative to File + Size. This is the size of the definitions of all types in the bitcode file. + + + +**Constant Pool Bytes** + + The size, in bytes, of the Constant Pool Blocks Percentage is relative to File + Size. + + + +**Module Globals Bytes** + + Ths size, in bytes, of the Global Variable Definitions and their initializers. + Percentage is relative to File Size. + + + +**Instruction List Bytes** + + The size, in bytes, of all the instruction lists in all the functions. + Percentage is relative to File Size. Note that this value is also included in + the Function Bytes. + + + +**Compaction Table Bytes** + + The size, in bytes, of all the compaction tables in all the functions. + Percentage is relative to File Size. Note that this value is also included in + the Function Bytes. + + + +**Symbol Table Bytes** + + The size, in bytes, of all the symbol tables in all the functions. Percentage is + relative to File Size. Note that this value is also included in the Function + Bytes. + + + +**Dependent Libraries Bytes** + + The size, in bytes, of the list of dependent libraries in the module. Percentage + is relative to File Size. Note that this value is also included in the Module + Global Bytes. + + + +**Number Of Bitcode Blocks** + + The total number of blocks of any kind in the bitcode file. + + + +**Number Of Functions** + + The total number of function definitions in the bitcode file. + + + +**Number Of Types** + + The total number of types defined in the Global Types Pool. + + + +**Number Of Constants** + + The total number of constants (of any type) defined in the Constant Pool. + + + +**Number Of Basic Blocks** + + The total number of basic blocks defined in all functions in the bitcode file. + + + +**Number Of Instructions** + + The total number of instructions defined in all functions in the bitcode file. + + + +**Number Of Long Instructions** + + The total number of long instructions defined in all functions in the bitcode + file. Long instructions are those taking greater than 4 bytes. Typically long + instructions are GetElementPtr with several indices, PHI nodes, and calls to + functions with large numbers of arguments. + + + +**Number Of Operands** + + The total number of operands used in all instructions in the bitcode file. + + + +**Number Of Compaction Tables** + + The total number of compaction tables in all functions in the bitcode file. + + + +**Number Of Symbol Tables** + + The total number of symbol tables in all functions in the bitcode file. + + + +**Number Of Dependent Libs** + + The total number of dependent libraries found in the bitcode file. + + + +**Total Instruction Size** + + The total size of the instructions in all functions in the bitcode file. 
+ + + +**Average Instruction Size** + + The average number of bytes per instruction across all functions in the bitcode + file. This value is computed by dividing Total Instruction Size by Number Of + Instructions. + + + +**Maximum Type Slot Number** + + The maximum value used for a type's slot number. Larger slot number values take + more bytes to encode. + + + +**Maximum Value Slot Number** + + The maximum value used for a value's slot number. Larger slot number values take + more bytes to encode. + + + +**Bytes Per Value** + + The average size of a Value definition (of any type). This is computed by + dividing File Size by the total number of values of any type. + + + +**Bytes Per Global** + + The average size of a global definition (constants and global variables). + + + +**Bytes Per Function** + + The average number of bytes per function definition. This is computed by + dividing Function Bytes by Number Of Functions. + + + +**# of VBR 32-bit Integers** + + The total number of 32-bit integers encoded using the Variable Bit Rate + encoding scheme. + + + +**# of VBR 64-bit Integers** + + The total number of 64-bit integers encoded using the Variable Bit Rate encoding + scheme. + + + +**# of VBR Compressed Bytes** + + The total number of bytes consumed by the 32-bit and 64-bit integers that use + the Variable Bit Rate encoding scheme. + + + +**# of VBR Expanded Bytes** + + The total number of bytes that would have been consumed by the 32-bit and 64-bit + integers had they not been compressed with the Variable Bit Rage encoding + scheme. + + + +**Bytes Saved With VBR** + + The total number of bytes saved by using the Variable Bit Rate encoding scheme. + The percentage is relative to # of VBR Expanded Bytes. + + + + +DETAILED OUTPUT DEFINITIONS +--------------------------- + + +The following definitions occur only if the -nodetails option was not given. +The detailed output provides additional information on a per-function basis. + + +**Type** + + The type signature of the function. + + + +**Byte Size** + + The total number of bytes in the function's block. + + + +**Basic Blocks** + + The number of basic blocks defined by the function. + + + +**Instructions** + + The number of instructions defined by the function. + + + +**Long Instructions** + + The number of instructions using the long instruction format in the function. + + + +**Operands** + + The number of operands used by all instructions in the function. + + + +**Instruction Size** + + The number of bytes consumed by instructions in the function. + + + +**Average Instruction Size** + + The average number of bytes consumed by the instructions in the function. This + value is computed by dividing Instruction Size by Instructions. + + + +**Bytes Per Instruction** + + The average number of bytes used by the function per instruction. This value is + computed by dividing Byte Size by Instructions. Note that this is not the same + as Average Instruction Size. It computes a number relative to the total function + size not just the size of the instruction list. + + + +**Number of VBR 32-bit Integers** + + The total number of 32-bit integers found in this function (for any use). + + + +**Number of VBR 64-bit Integers** + + The total number of 64-bit integers found in this function (for any use). + + + +**Number of VBR Compressed Bytes** + + The total number of bytes in this function consumed by the 32-bit and 64-bit + integers that use the Variable Bit Rate encoding scheme. 
+ + + +**Number of VBR Expanded Bytes** + + The total number of bytes in this function that would have been consumed by + the 32-bit and 64-bit integers had they not been compressed with the Variable + Bit Rate encoding scheme. + + + +**Bytes Saved With VBR** + + The total number of bytes saved in this function by using the Variable Bit + Rate encoding scheme. The percentage is relative to # of VBR Expanded Bytes. + + + + +SEE ALSO +-------- + + +llvm-dis|llvm-dis, `http://llvm.org/docs/BitCodeFormat.html <http://llvm.org/docs/BitCodeFormat.html>`_ diff --git a/docs/CommandGuide/llvm-build.pod b/docs/CommandGuide/llvm-build.pod deleted file mode 100644 index 14e08cb..0000000 --- a/docs/CommandGuide/llvm-build.pod +++ /dev/null @@ -1,86 +0,0 @@ -=pod - -=head1 NAME - -llvm-build - LLVM Project Build Utility - -=head1 SYNOPSIS - -B<llvm-build> [I<options>] - -=head1 DESCRIPTION - -B<llvm-build> is a tool for working with LLVM projects that use the LLVMBuild -system for describing their components. - -At heart, B<llvm-build> is responsible for loading, verifying, and manipulating -the project's component data. The tool is primarily designed for use in -implementing build systems and tools which need access to the project structure -information. - -=head1 OPTIONS - -=over - -=item B<-h>, B<--help> - -Print the builtin program help. - -=item B<--source-root>=I<PATH> - -If given, load the project at the given source root path. If this option is not -given, the location of the project sources will be inferred from the location of -the B<llvm-build> script itself. - -=item B<--print-tree> - -Print the component tree for the project. - -=item B<--write-library-table> - -Write out the C++ fragment which defines the components, library names, and -required libraries. This C++ fragment is built into L<llvm-config|llvm-config> -in order to provide clients with the list of required libraries for arbitrary -component combinations. - -=item B<--write-llvmbuild> - -Write out new I<LLVMBuild.txt> files based on the loaded components. This is -useful for auto-upgrading the schema of the files. B<llvm-build> will try to a -limited extent to preserve the comments which were written in the original -source file, although at this time it only preserves block comments that preceed -the section names in the I<LLVMBuild> files. - -=item B<--write-cmake-fragment> - -Write out the LLVMBuild in the form of a CMake fragment, so it can easily be -consumed by the CMake based build system. The exact contents and format of this -file are closely tied to how LLVMBuild is integrated with CMake, see LLVM's -top-level CMakeLists.txt. - -=item B<--write-make-fragment> - -Write out the LLVMBuild in the form of a Makefile fragment, so it can easily be -consumed by a Make based build system. The exact contents and format of this -file are closely tied to how LLVMBuild is integrated with the Makefiles, see -LLVM's Makefile.rules. - -=item B<--llvmbuild-source-root>=I<PATH> - -If given, expect the I<LLVMBuild> files for the project to be rooted at the -given path, instead of inside the source tree itself. This option is primarily -designed for use in conjunction with B<--write-llvmbuild> to test changes to -I<LLVMBuild> schema. - -=back - -=head1 EXIT STATUS - -B<llvm-build> exits with 0 if operation was successful. Otherwise, it will exist -with a non-zero value. - -=head1 AUTHOR - -Maintained by the LLVM Team (L<http://llvm.org/>). 
- -=cut diff --git a/docs/CommandGuide/llvm-build.rst b/docs/CommandGuide/llvm-build.rst new file mode 100644 index 0000000..f788f7c --- /dev/null +++ b/docs/CommandGuide/llvm-build.rst @@ -0,0 +1,102 @@ +llvm-build - LLVM Project Build Utility +======================================= + + +SYNOPSIS +-------- + + +**llvm-build** [*options*] + + +DESCRIPTION +----------- + + +**llvm-build** is a tool for working with LLVM projects that use the LLVMBuild +system for describing their components. + +At heart, **llvm-build** is responsible for loading, verifying, and manipulating +the project's component data. The tool is primarily designed for use in +implementing build systems and tools which need access to the project structure +information. + + +OPTIONS +------- + + + +**-h**, **--help** + + Print the builtin program help. + + + +**--source-root**\ =\ *PATH* + + If given, load the project at the given source root path. If this option is not + given, the location of the project sources will be inferred from the location of + the **llvm-build** script itself. + + + +**--print-tree** + + Print the component tree for the project. + + + +**--write-library-table** + + Write out the C++ fragment which defines the components, library names, and + required libraries. This C++ fragment is built into llvm-config|llvm-config + in order to provide clients with the list of required libraries for arbitrary + component combinations. + + + +**--write-llvmbuild** + + Write out new *LLVMBuild.txt* files based on the loaded components. This is + useful for auto-upgrading the schema of the files. **llvm-build** will try to a + limited extent to preserve the comments which were written in the original + source file, although at this time it only preserves block comments that precede + the section names in the *LLVMBuild* files. + + + +**--write-cmake-fragment** + + Write out the LLVMBuild in the form of a CMake fragment, so it can easily be + consumed by the CMake based build system. The exact contents and format of this + file are closely tied to how LLVMBuild is integrated with CMake, see LLVM's + top-level CMakeLists.txt. + + + +**--write-make-fragment** + + Write out the LLVMBuild in the form of a Makefile fragment, so it can easily be + consumed by a Make based build system. The exact contents and format of this + file are closely tied to how LLVMBuild is integrated with the Makefiles, see + LLVM's Makefile.rules. + + + +**--llvmbuild-source-root**\ =\ *PATH* + + If given, expect the *LLVMBuild* files for the project to be rooted at the + given path, instead of inside the source tree itself. This option is primarily + designed for use in conjunction with **--write-llvmbuild** to test changes to + *LLVMBuild* schema. + + + + +EXIT STATUS +----------- + + +**llvm-build** exits with 0 if operation was successful. Otherwise, it will exist +with a non-zero value. diff --git a/docs/CommandGuide/llvm-config.pod b/docs/CommandGuide/llvm-config.pod deleted file mode 100644 index 7d68564..0000000 --- a/docs/CommandGuide/llvm-config.pod +++ /dev/null @@ -1,131 +0,0 @@ -=pod - -=head1 NAME - -llvm-config - Print LLVM compilation options - -=head1 SYNOPSIS - -B<llvm-config> I<option> [I<components>...] - -=head1 DESCRIPTION - -B<llvm-config> makes it easier to build applications that use LLVM. It can -print the compiler flags, linker flags and object libraries needed to link -against LLVM. 
- -=head1 EXAMPLES - -To link against the JIT: - - g++ `llvm-config --cxxflags` -o HowToUseJIT.o -c HowToUseJIT.cpp - g++ `llvm-config --ldflags` -o HowToUseJIT HowToUseJIT.o \ - `llvm-config --libs engine bcreader scalaropts` - -=head1 OPTIONS - -=over - -=item B<--version> - -Print the version number of LLVM. - -=item B<-help> - -Print a summary of B<llvm-config> arguments. - -=item B<--prefix> - -Print the installation prefix for LLVM. - -=item B<--src-root> - -Print the source root from which LLVM was built. - -=item B<--obj-root> - -Print the object root used to build LLVM. - -=item B<--bindir> - -Print the installation directory for LLVM binaries. - -=item B<--includedir> - -Print the installation directory for LLVM headers. - -=item B<--libdir> - -Print the installation directory for LLVM libraries. - -=item B<--cxxflags> - -Print the C++ compiler flags needed to use LLVM headers. - -=item B<--ldflags> - -Print the flags needed to link against LLVM libraries. - -=item B<--libs> - -Print all the libraries needed to link against the specified LLVM -I<components>, including any dependencies. - -=item B<--libnames> - -Similar to B<--libs>, but prints the bare filenames of the libraries -without B<-l> or pathnames. Useful for linking against a not-yet-installed -copy of LLVM. - -=item B<--libfiles> - -Similar to B<--libs>, but print the full path to each library file. This is -useful when creating makefile dependencies, to ensure that a tool is relinked if -any library it uses changes. - -=item B<--components> - -Print all valid component names. - -=item B<--targets-built> - -Print the component names for all targets supported by this copy of LLVM. - -=item B<--build-mode> - -Print the build mode used when LLVM was built (e.g. Debug or Release) - -=back - -=head1 COMPONENTS - -To print a list of all available components, run B<llvm-config ---components>. In most cases, components correspond directly to LLVM -libraries. Useful "virtual" components include: - -=over - -=item B<all> - -Includes all LLVM libaries. The default if no components are specified. - -=item B<backend> - -Includes either a native backend or the C backend. - -=item B<engine> - -Includes either a native JIT or the bitcode interpreter. - -=back - -=head1 EXIT STATUS - -If B<llvm-config> succeeds, it will exit with 0. Otherwise, if an error -occurs, it will exit with a non-zero value. - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llvm-config.rst b/docs/CommandGuide/llvm-config.rst new file mode 100644 index 0000000..0ebb344 --- /dev/null +++ b/docs/CommandGuide/llvm-config.rst @@ -0,0 +1,176 @@ +llvm-config - Print LLVM compilation options +============================================ + + +SYNOPSIS +-------- + + +**llvm-config** *option* [*components*...] + + +DESCRIPTION +----------- + + +**llvm-config** makes it easier to build applications that use LLVM. It can +print the compiler flags, linker flags and object libraries needed to link +against LLVM. + + +EXAMPLES +-------- + + +To link against the JIT: + + +.. code-block:: sh + + g++ `llvm-config --cxxflags` -o HowToUseJIT.o -c HowToUseJIT.cpp + g++ `llvm-config --ldflags` -o HowToUseJIT HowToUseJIT.o \ + `llvm-config --libs engine bcreader scalaropts` + + + +OPTIONS +------- + + + +**--version** + + Print the version number of LLVM. + + + +**-help** + + Print a summary of **llvm-config** arguments. + + + +**--prefix** + + Print the installation prefix for LLVM. 
+ + + +**--src-root** + + Print the source root from which LLVM was built. + + + +**--obj-root** + + Print the object root used to build LLVM. + + + +**--bindir** + + Print the installation directory for LLVM binaries. + + + +**--includedir** + + Print the installation directory for LLVM headers. + + + +**--libdir** + + Print the installation directory for LLVM libraries. + + + +**--cxxflags** + + Print the C++ compiler flags needed to use LLVM headers. + + + +**--ldflags** + + Print the flags needed to link against LLVM libraries. + + + +**--libs** + + Print all the libraries needed to link against the specified LLVM + *components*, including any dependencies. + + + +**--libnames** + + Similar to **--libs**, but prints the bare filenames of the libraries + without **-l** or pathnames. Useful for linking against a not-yet-installed + copy of LLVM. + + + +**--libfiles** + + Similar to **--libs**, but print the full path to each library file. This is + useful when creating makefile dependencies, to ensure that a tool is relinked if + any library it uses changes. + + + +**--components** + + Print all valid component names. + + + +**--targets-built** + + Print the component names for all targets supported by this copy of LLVM. + + + +**--build-mode** + + Print the build mode used when LLVM was built (e.g. Debug or Release) + + + + +COMPONENTS +---------- + + +To print a list of all available components, run **llvm-config +--components**. In most cases, components correspond directly to LLVM +libraries. Useful "virtual" components include: + + +**all** + + Includes all LLVM libaries. The default if no components are specified. + + + +**backend** + + Includes either a native backend or the C backend. + + + +**engine** + + Includes either a native JIT or the bitcode interpreter. + + + + +EXIT STATUS +----------- + + +If **llvm-config** succeeds, it will exit with 0. Otherwise, if an error +occurs, it will exit with a non-zero value. diff --git a/docs/CommandGuide/llvm-cov.pod b/docs/CommandGuide/llvm-cov.pod deleted file mode 100644 index e8ff683..0000000 --- a/docs/CommandGuide/llvm-cov.pod +++ /dev/null @@ -1,45 +0,0 @@ -=pod - -=head1 NAME - -llvm-cov - emit coverage information - -=head1 SYNOPSIS - -B<llvm-cov> [-gcno=filename] [-gcda=filename] [dump] - -=head1 DESCRIPTION - -The experimental B<llvm-cov> tool reads in description file generated by compiler -and coverage data file generated by instrumented program. This program assumes -that the description and data file uses same format as gcov files. - -=head1 OPTIONS - -=over - -=item B<-gcno=filename] - -This option selects input description file generated by compiler while instrumenting -program. - -=item B<-gcda=filename] - -This option selects coverage data file generated by instrumented compiler. - -=item B<-dump> - -This options enables output dump that is suitable for a developer to help debug -B<llvm-cov> itself. - -=back - -=head1 EXIT STATUS - -B<llvm-cov> returns 1 if it cannot read input files. Otherwise, it exits with zero. - -=head1 AUTHOR - -B<llvm-cov> is maintained by the LLVM Team (L<http://llvm.org/>). 
- -=cut diff --git a/docs/CommandGuide/llvm-cov.rst b/docs/CommandGuide/llvm-cov.rst new file mode 100644 index 0000000..09275f6 --- /dev/null +++ b/docs/CommandGuide/llvm-cov.rst @@ -0,0 +1,51 @@ +llvm-cov - emit coverage information +==================================== + + +SYNOPSIS +-------- + + +**llvm-cov** [-gcno=filename] [-gcda=filename] [dump] + + +DESCRIPTION +----------- + + +The experimental **llvm-cov** tool reads in description file generated by compiler +and coverage data file generated by instrumented program. This program assumes +that the description and data file uses same format as gcov files. + + +OPTIONS +------- + + + +**-gcno=filename]** + + This option selects input description file generated by compiler while instrumenting + program. + + + +**-gcda=filename]** + + This option selects coverage data file generated by instrumented compiler. + + + +**-dump** + + This options enables output dump that is suitable for a developer to help debug + **llvm-cov** itself. + + + + +EXIT STATUS +----------- + + +**llvm-cov** returns 1 if it cannot read input files. Otherwise, it exits with zero. diff --git a/docs/CommandGuide/llvm-diff.pod b/docs/CommandGuide/llvm-diff.rst index ffe0b48..991d4fe 100644 --- a/docs/CommandGuide/llvm-diff.pod +++ b/docs/CommandGuide/llvm-diff.rst @@ -1,16 +1,19 @@ -=pod +llvm-diff - LLVM structural 'diff' +================================== -=head1 NAME -llvm-diff - LLVM structural 'diff' +SYNOPSIS +-------- + -=head1 SYNOPSIS +**llvm-diff** [*options*] *module 1* *module 2* [*global name ...*] -B<llvm-diff> [I<options>] I<module 1> I<module 2> [I<global name ...>] -=head1 DESCRIPTION +DESCRIPTION +----------- -B<llvm-diff> compares the structure of two LLVM modules, primarily + +**llvm-diff** compares the structure of two LLVM modules, primarily focusing on differences in function definitions. Insignificant differences, such as changes in the ordering of globals or in the names of local values, are ignored. @@ -23,31 +26,31 @@ are compared; otherwise, all global values are compared, and diagnostics are produced for globals which only appear in one module or the other. -B<llvm-diff> compares two functions by comparing their basic blocks, +**llvm-diff** compares two functions by comparing their basic blocks, beginning with the entry blocks. If the terminators seem to match, then the corresponding successors are compared; otherwise they are ignored. This algorithm is very sensitive to changes in control flow, which tend to stop any downstream changes from being detected. -B<llvm-diff> is intended as a debugging tool for writers of LLVM +**llvm-diff** is intended as a debugging tool for writers of LLVM passes and frontends. It does not have a stable output format. -=head1 EXIT STATUS -If B<llvm-diff> finds no differences between the modules, it will exit +EXIT STATUS +----------- + + +If **llvm-diff** finds no differences between the modules, it will exit with 0 and produce no output. Otherwise it will exit with a non-zero value. -=head1 BUGS + +BUGS +---- + Many important differences, like changes in linkage or function attributes, are not diagnosed. Changes in memory behavior (for example, coalescing loads) can cause massive detected differences in blocks. - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). 
- -=cut diff --git a/docs/CommandGuide/llvm-dis.pod b/docs/CommandGuide/llvm-dis.pod deleted file mode 100644 index 9f4026c..0000000 --- a/docs/CommandGuide/llvm-dis.pod +++ /dev/null @@ -1,60 +0,0 @@ -=pod - -=head1 NAME - -llvm-dis - LLVM disassembler - -=head1 SYNOPSIS - -B<llvm-dis> [I<options>] [I<filename>] - -=head1 DESCRIPTION - -The B<llvm-dis> command is the LLVM disassembler. It takes an LLVM -bitcode file and converts it into human-readable LLVM assembly language. - -If filename is omitted or specified as C<->, B<llvm-dis> reads its -input from standard input. - -If the input is being read from standard input, then B<llvm-dis> -will send its output to standard output by default. Otherwise, the -output will be written to a file named after the input file, with -a C<.ll> suffix added (any existing C<.bc> suffix will first be -removed). You can override the choice of output file using the -B<-o> option. - -=head1 OPTIONS - -=over - -=item B<-f> - -Enable binary output on terminals. Normally, B<llvm-dis> will refuse to -write raw bitcode output if the output stream is a terminal. With this option, -B<llvm-dis> will write raw bitcode regardless of the output device. - -=item B<-help> - -Print a summary of command line options. - -=item B<-o> F<filename> - -Specify the output file name. If F<filename> is -, then the output is sent -to standard output. - -=back - -=head1 EXIT STATUS - -If B<llvm-dis> succeeds, it will exit with 0. Otherwise, if an error -occurs, it will exit with a non-zero value. - -=head1 SEE ALSO - -L<llvm-as|llvm-as> - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llvm-dis.rst b/docs/CommandGuide/llvm-dis.rst new file mode 100644 index 0000000..85cdca8 --- /dev/null +++ b/docs/CommandGuide/llvm-dis.rst @@ -0,0 +1,69 @@ +llvm-dis - LLVM disassembler +============================ + + +SYNOPSIS +-------- + + +**llvm-dis** [*options*] [*filename*] + + +DESCRIPTION +----------- + + +The **llvm-dis** command is the LLVM disassembler. It takes an LLVM +bitcode file and converts it into human-readable LLVM assembly language. + +If filename is omitted or specified as ``-``, **llvm-dis** reads its +input from standard input. + +If the input is being read from standard input, then **llvm-dis** +will send its output to standard output by default. Otherwise, the +output will be written to a file named after the input file, with +a ``.ll`` suffix added (any existing ``.bc`` suffix will first be +removed). You can override the choice of output file using the +**-o** option. + + +OPTIONS +------- + + + +**-f** + + Enable binary output on terminals. Normally, **llvm-dis** will refuse to + write raw bitcode output if the output stream is a terminal. With this option, + **llvm-dis** will write raw bitcode regardless of the output device. + + + +**-help** + + Print a summary of command line options. + + + +**-o** *filename* + + Specify the output file name. If *filename* is -, then the output is sent + to standard output. + + + + +EXIT STATUS +----------- + + +If **llvm-dis** succeeds, it will exit with 0. Otherwise, if an error +occurs, it will exit with a non-zero value. 
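+
+As an illustration (the file names here are purely hypothetical), a typical
+round trip between LLVM assembly and bitcode might look like this:
+
+.. code-block:: sh
+
+   llvm-as foo.ll -o foo.bc    # assemble LLVM assembly into bitcode
+   llvm-dis foo.bc -o foo.ll   # disassemble the bitcode back into foo.ll
+   llvm-dis < foo.bc           # read standard input, write to standard output
+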
+ + +SEE ALSO +-------- + + +llvm-as|llvm-as diff --git a/docs/CommandGuide/llvm-extract.pod b/docs/CommandGuide/llvm-extract.pod deleted file mode 100644 index 67f00f0..0000000 --- a/docs/CommandGuide/llvm-extract.pod +++ /dev/null @@ -1,85 +0,0 @@ -=pod - -=head1 NAME - -llvm-extract - extract a function from an LLVM module - -=head1 SYNOPSIS - -B<llvm-extract> [I<options>] B<--func> I<function-name> [I<filename>] - -=head1 DESCRIPTION - -The B<llvm-extract> command takes the name of a function and extracts it from -the specified LLVM bitcode file. It is primarily used as a debugging tool to -reduce test cases from larger programs that are triggering a bug. - -In addition to extracting the bitcode of the specified function, -B<llvm-extract> will also remove unreachable global variables, prototypes, and -unused types. - -The B<llvm-extract> command reads its input from standard input if filename is -omitted or if filename is -. The output is always written to standard output, -unless the B<-o> option is specified (see below). - -=head1 OPTIONS - -=over - -=item B<-f> - -Enable binary output on terminals. Normally, B<llvm-extract> will refuse to -write raw bitcode output if the output stream is a terminal. With this option, -B<llvm-extract> will write raw bitcode regardless of the output device. - -=item B<--func> I<function-name> - -Extract the function named I<function-name> from the LLVM bitcode. May be -specified multiple times to extract multiple functions at once. - -=item B<--rfunc> I<function-regular-expr> - -Extract the function(s) matching I<function-regular-expr> from the LLVM bitcode. -All functions matching the regular expression will be extracted. May be -specified multiple times. - -=item B<--glob> I<global-name> - -Extract the global variable named I<global-name> from the LLVM bitcode. May be -specified multiple times to extract multiple global variables at once. - -=item B<--rglob> I<glob-regular-expr> - -Extract the global variable(s) matching I<global-regular-expr> from the LLVM -bitcode. All global variables matching the regular expression will be extracted. -May be specified multiple times. - -=item B<-help> - -Print a summary of command line options. - -=item B<-o> I<filename> - -Specify the output filename. If filename is "-" (the default), then -B<llvm-extract> sends its output to standard output. - -=item B<-S> - -Write output in LLVM intermediate language (instead of bitcode). - -=back - -=head1 EXIT STATUS - -If B<llvm-extract> succeeds, it will exit with 0. Otherwise, if an error -occurs, it will exit with a non-zero value. - -=head1 SEE ALSO - -L<bugpoint|bugpoint> - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llvm-extract.rst b/docs/CommandGuide/llvm-extract.rst new file mode 100644 index 0000000..d569e35 --- /dev/null +++ b/docs/CommandGuide/llvm-extract.rst @@ -0,0 +1,104 @@ +llvm-extract - extract a function from an LLVM module +===================================================== + + +SYNOPSIS +-------- + + +**llvm-extract** [*options*] **--func** *function-name* [*filename*] + + +DESCRIPTION +----------- + + +The **llvm-extract** command takes the name of a function and extracts it from +the specified LLVM bitcode file. It is primarily used as a debugging tool to +reduce test cases from larger programs that are triggering a bug. + +In addition to extracting the bitcode of the specified function, +**llvm-extract** will also remove unreachable global variables, prototypes, and +unused types. 
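+
+As a sketch (the module and function names below are hypothetical), pulling a
+single function out of a larger module might look like:
+
+.. code-block:: sh
+
+   llvm-extract --func=my_function program.bc -o my_function.bc
+   llvm-dis my_function.bc -o my_function.ll   # inspect the extracted module
+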
+ +The **llvm-extract** command reads its input from standard input if filename is +omitted or if filename is -. The output is always written to standard output, +unless the **-o** option is specified (see below). + + +OPTIONS +------- + + + +**-f** + + Enable binary output on terminals. Normally, **llvm-extract** will refuse to + write raw bitcode output if the output stream is a terminal. With this option, + **llvm-extract** will write raw bitcode regardless of the output device. + + + +**--func** *function-name* + + Extract the function named *function-name* from the LLVM bitcode. May be + specified multiple times to extract multiple functions at once. + + + +**--rfunc** *function-regular-expr* + + Extract the function(s) matching *function-regular-expr* from the LLVM bitcode. + All functions matching the regular expression will be extracted. May be + specified multiple times. + + + +**--glob** *global-name* + + Extract the global variable named *global-name* from the LLVM bitcode. May be + specified multiple times to extract multiple global variables at once. + + + +**--rglob** *glob-regular-expr* + + Extract the global variable(s) matching *global-regular-expr* from the LLVM + bitcode. All global variables matching the regular expression will be extracted. + May be specified multiple times. + + + +**-help** + + Print a summary of command line options. + + + +**-o** *filename* + + Specify the output filename. If filename is "-" (the default), then + **llvm-extract** sends its output to standard output. + + + +**-S** + + Write output in LLVM intermediate language (instead of bitcode). + + + + +EXIT STATUS +----------- + + +If **llvm-extract** succeeds, it will exit with 0. Otherwise, if an error +occurs, it will exit with a non-zero value. + + +SEE ALSO +-------- + + +bugpoint|bugpoint diff --git a/docs/CommandGuide/llvm-ld.pod b/docs/CommandGuide/llvm-ld.pod deleted file mode 100644 index efa9ebd..0000000 --- a/docs/CommandGuide/llvm-ld.pod +++ /dev/null @@ -1,234 +0,0 @@ -=pod - -=head1 NAME - -llvm-ld - LLVM linker - -=head1 SYNOPSIS - -B<llvm-ld> <options> <files> - -=head1 DESCRIPTION - -The B<llvm-ld> tool takes a set of LLVM bitcode files and links them -together into a single LLVM bitcode file. The output bitcode file can be -another bitcode file or an executable bitcode program. Using additional -options, B<llvm-ld> is able to produce native code executables. - -The B<llvm-ld> tool is the main linker for LLVM. It is used to link together -the output of LLVM front-end compilers and run "link time" optimizations (mostly -the inter-procedural kind). - -The B<llvm-ld> tools attempts to mimic the interface provided by the default -system linker so that it can act as a I<drop-in> replacement. - -=head2 Search Order - -When looking for objects specified on the command line, B<llvm-ld> will search -for the object first in the current directory and then in the directory -specified by the B<LLVM_LIB_SEARCH_PATH> environment variable. If it cannot -find the object, it fails. - -When looking for a library specified with the B<-l> option, B<llvm-ld> first -attempts to load a file with that name from the current directory. If that -fails, it looks for libI<library>.bc, libI<library>.a, or libI<library>.I<shared -library extension>, in that order, in each directory added to the library search -path with the B<-L> option. These directories are searched in the order they -are specified. 
If the library cannot be located, then B<llvm-ld> looks in the -directory specified by the B<LLVM_LIB_SEARCH_PATH> environment variable. If it -does not find a library there, it fails. - -The I<shared library extension> may be I<.so>, I<.dyld>, I<.dll>, or something -different, depending upon the system. - -The B<-L> option is global. It does not matter where it is specified in the -list of command line arguments; the directory is simply added to the search path -and is applied to all libraries, preceding or succeeding, in the command line. - -=head2 Link order - -All object and bitcode files are linked first in the order they were -specified on the command line. All library files are linked next. -Some libraries may not be linked into the object program; see below. - -=head2 Library Linkage - -Object files and static bitcode objects are always linked into the output -file. Library archives (.a files) load only the objects within the archive -that define symbols needed by the output file. Hence, libraries should be -listed after the object files and libraries which need them; otherwise, the -library may not be linked in, and the dependent library will not have its -undefined symbols defined. - -=head2 Native code generation - -The B<llvm-ld> program has limited support for native code generation, when -using the B<-native> or B<-native-cbe> options. Native code generation is -performed by converting the linked bitcode into native assembly (.s) or C code -and running the system compiler (typically gcc) on the result. - -=head1 OPTIONS - -=head2 General Options - -=over - -=item B<-help> - -Print a summary of command line options. - -=item B<-v> - -Specifies verbose mode. In this mode the linker will print additional -information about the actions it takes, programs it executes, etc. - -=item B<-stats> - -Print statistics. - -=item B<-time-passes> - -Record the amount of time needed for each pass and print it to standard -error. - -=back - -=head2 Input/Output Options - -=over - -=item B<-o> F<filename> - -This overrides the default output file and specifies the name of the file that -should be generated by the linker. By default, B<llvm-ld> generates a file named -F<a.out> for compatibility with B<ld>. The output will be written to -F<filename>. - -=item B<-b> F<filename> - -This option can be used to override the output bitcode file name. By default, -the name of the bitcode output file is one more ".bc" suffix added to the name -specified by B<-o filename> option. - -=item B<-l>F<name> - -This option specifies the F<name> of a library to search when resolving symbols -for the program. Only the base name should be specified as F<name>, without a -F<lib> prefix or any suffix. - -=item B<-L>F<Path> - -This option tells B<llvm-ld> to look in F<Path> to find any library subsequently -specified with the B<-l> option. The paths will be searched in the order in -which they are specified on the command line. If the library is still not found, -a small set of system specific directories will also be searched. Note that -libraries specified with the B<-l> option that occur I<before> any B<-L> options -will not search the paths given by the B<-L> options following it. - -=item B<-link-as-library> - -Link the bitcode files together as a library, not an executable. In this mode, -undefined symbols will be permitted. - -=item B<-r> - -An alias for -link-as-library. - -=item B<-native> - -Generate a native machine code executable. 
- -When generating native executables, B<llvm-ld> first checks for a bitcode -version of the library and links it in, if necessary. If the library is -missing, B<llvm-ld> skips it. Then, B<llvm-ld> links in the same -libraries as native code. - -In this way, B<llvm-ld> should be able to link in optimized bitcode -subsets of common libraries and then link in any part of the library that -hasn't been converted to bitcode. - -=item B<-native-cbe> - -Generate a native machine code executable with the LLVM C backend. - -This option is identical to the B<-native> option, but uses the -C backend to generate code for the program instead of an LLVM native -code generator. - -=back - -=head2 Optimization Options - -=over - -=item B<-disable-inlining> - -Do not run the inlining pass. Functions will not be inlined into other -functions. - -=item B<-disable-opt> - -Completely disable optimization. - -=item B<-disable-internalize> - -Do not mark all symbols as internal. - -=item B<-verify-each> - -Run the verification pass after each of the passes to verify intermediate -results. - -=item B<-strip-all> - -Strip all debug and symbol information from the executable to make it smaller. - -=item B<-strip-debug> - -Strip all debug information from the executable to make it smaller. - -=item B<-s> - -An alias for B<-strip-all>. - -=item B<-S> - -An alias for B<-strip-debug>. - -=item B<-export-dynamic> - -An alias for B<-disable-internalize> - -=item B<-post-link-opt>F<Path> - -Run post-link optimization program. After linking is completed a bitcode file -will be generated. It will be passed to the program specified by F<Path> as the -first argument. The second argument to the program will be the name of a -temporary file into which the program should place its optimized output. For -example, the "no-op optimization" would be a simple shell script: - - #!/bin/bash - cp $1 $2 - -=back - -=head1 EXIT STATUS - -If B<llvm-ld> succeeds, it will exit with 0 return code. If an error occurs, -it will exit with a non-zero return code. - -=head1 ENVIRONMENT - -The C<LLVM_LIB_SEARCH_PATH> environment variable is used to find bitcode -libraries. Any paths specified in this variable will be searched after the C<-L> -options. - -=head1 SEE ALSO - -L<llvm-link|llvm-link> - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llvm-link.pod b/docs/CommandGuide/llvm-link.pod deleted file mode 100644 index 1e466a5..0000000 --- a/docs/CommandGuide/llvm-link.pod +++ /dev/null @@ -1,79 +0,0 @@ -=pod - -=head1 NAME - -llvm-link - LLVM linker - -=head1 SYNOPSIS - -B<llvm-link> [I<options>] I<filename ...> - -=head1 DESCRIPTION - -B<llvm-link> takes several LLVM bitcode files and links them together into a -single LLVM bitcode file. It writes the output file to standard output, unless -the B<-o> option is used to specify a filename. - -B<llvm-link> attempts to load the input files from the current directory. If -that fails, it looks for each file in each of the directories specified by the -B<-L> options on the command line. The library search paths are global; each -one is searched for every input file if necessary. The directories are searched -in the order they were specified on the command line. - -=head1 OPTIONS - -=over - -=item B<-L> F<directory> - -Add the specified F<directory> to the library search path. When looking for -libraries, B<llvm-link> will look in path name for libraries. 
This option can be -specified multiple times; B<llvm-link> will search inside these directories in -the order in which they were specified on the command line. - -=item B<-f> - -Enable binary output on terminals. Normally, B<llvm-link> will refuse to -write raw bitcode output if the output stream is a terminal. With this option, -B<llvm-link> will write raw bitcode regardless of the output device. - -=item B<-o> F<filename> - -Specify the output file name. If F<filename> is C<->, then B<llvm-link> will -write its output to standard output. - -=item B<-S> - -Write output in LLVM intermediate language (instead of bitcode). - -=item B<-d> - -If specified, B<llvm-link> prints a human-readable version of the output -bitcode file to standard error. - -=item B<-help> - -Print a summary of command line options. - -=item B<-v> - -Verbose mode. Print information about what B<llvm-link> is doing. This -typically includes a message for each bitcode file linked in and for each -library found. - -=back - -=head1 EXIT STATUS - -If B<llvm-link> succeeds, it will exit with 0. Otherwise, if an error -occurs, it will exit with a non-zero value. - -=head1 SEE ALSO - -L<gccld|gccld> - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llvm-link.rst b/docs/CommandGuide/llvm-link.rst new file mode 100644 index 0000000..63019d7 --- /dev/null +++ b/docs/CommandGuide/llvm-link.rst @@ -0,0 +1,96 @@ +llvm-link - LLVM linker +======================= + + +SYNOPSIS +-------- + + +**llvm-link** [*options*] *filename ...* + + +DESCRIPTION +----------- + + +**llvm-link** takes several LLVM bitcode files and links them together into a +single LLVM bitcode file. It writes the output file to standard output, unless +the **-o** option is used to specify a filename. + +**llvm-link** attempts to load the input files from the current directory. If +that fails, it looks for each file in each of the directories specified by the +**-L** options on the command line. The library search paths are global; each +one is searched for every input file if necessary. The directories are searched +in the order they were specified on the command line. + + +OPTIONS +------- + + + +**-L** *directory* + + Add the specified *directory* to the library search path. When looking for + libraries, **llvm-link** will look in path name for libraries. This option can be + specified multiple times; **llvm-link** will search inside these directories in + the order in which they were specified on the command line. + + + +**-f** + + Enable binary output on terminals. Normally, **llvm-link** will refuse to + write raw bitcode output if the output stream is a terminal. With this option, + **llvm-link** will write raw bitcode regardless of the output device. + + + +**-o** *filename* + + Specify the output file name. If *filename* is ``-``, then **llvm-link** will + write its output to standard output. + + + +**-S** + + Write output in LLVM intermediate language (instead of bitcode). + + + +**-d** + + If specified, **llvm-link** prints a human-readable version of the output + bitcode file to standard error. + + + +**-help** + + Print a summary of command line options. + + + +**-v** + + Verbose mode. Print information about what **llvm-link** is doing. This + typically includes a message for each bitcode file linked in and for each + library found. + + + + +EXIT STATUS +----------- + + +If **llvm-link** succeeds, it will exit with 0. Otherwise, if an error +occurs, it will exit with a non-zero value. 
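+
+For example (the module names are illustrative only), two bitcode files can be
+combined into one, either as bitcode or as readable LLVM assembly:
+
+.. code-block:: sh
+
+   llvm-link a.bc b.bc -o program.bc       # link into a single bitcode file
+   llvm-link -S a.bc b.bc -o program.ll    # same link, written as LLVM assembly
+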
+ + +SEE ALSO +-------- + + +gccld|gccld diff --git a/docs/CommandGuide/llvm-nm.pod b/docs/CommandGuide/llvm-nm.pod deleted file mode 100644 index a6dc490..0000000 --- a/docs/CommandGuide/llvm-nm.pod +++ /dev/null @@ -1,122 +0,0 @@ -=pod - -=head1 NAME - -llvm-nm - list LLVM bitcode file's symbol table - -=head1 SYNOPSIS - -B<llvm-nm> [I<options>] [I<filenames...>] - -=head1 DESCRIPTION - -The B<llvm-nm> utility lists the names of symbols from the LLVM bitcode files, -or B<ar> archives containing LLVM bitcode files, named on the command line. -Each symbol is listed along with some simple information about its provenance. -If no file name is specified, or I<-> is used as a file name, B<llvm-nm> will -process a bitcode file on its standard input stream. - -B<llvm-nm>'s default output format is the traditional BSD B<nm> output format. -Each such output record consists of an (optional) 8-digit hexadecimal address, -followed by a type code character, followed by a name, for each symbol. One -record is printed per line; fields are separated by spaces. When the address is -omitted, it is replaced by 8 spaces. - -Type code characters currently supported, and their meanings, are as follows: - -=over - -=item U - -Named object is referenced but undefined in this bitcode file - -=item C - -Common (multiple definitions link together into one def) - -=item W - -Weak reference (multiple definitions link together into zero or one definitions) - -=item t - -Local function (text) object - -=item T - -Global function (text) object - -=item d - -Local data object - -=item D - -Global data object - -=item ? - -Something unrecognizable - -=back - -Because LLVM bitcode files typically contain objects that are not considered to -have addresses until they are linked into an executable image or dynamically -compiled "just-in-time", B<llvm-nm> does not print an address for any symbol, -even symbols which are defined in the bitcode file. - -=head1 OPTIONS - -=over - -=item B<-P> - -Use POSIX.2 output format. Alias for B<--format=posix>. - -=item B<-B> (default) - -Use BSD output format. Alias for B<--format=bsd>. - -=item B<-help> - -Print a summary of command-line options and their meanings. - -=item B<--defined-only> - -Print only symbols defined in this bitcode file (as opposed to -symbols which may be referenced by objects in this file, but not -defined in this file.) - -=item B<--extern-only>, B<-g> - -Print only symbols whose definitions are external; that is, accessible -from other bitcode files. - -=item B<--undefined-only>, B<-u> - -Print only symbols referenced but not defined in this bitcode file. - -=item B<--format=>I<fmt>, B<-f> - -Select an output format; I<fmt> may be I<sysv>, I<posix>, or I<bsd>. The -default is I<bsd>. - -=back - -=head1 BUGS - -B<llvm-nm> cannot demangle C++ mangled names, like GNU B<nm> can. - -=head1 EXIT STATUS - -B<llvm-nm> exits with an exit code of zero. - -=head1 SEE ALSO - -L<llvm-dis|llvm-dis>, ar(1), nm(1) - -=head1 AUTHOR - -Maintained by the LLVM Team (L<http://llvm.org/>). 
- -=cut diff --git a/docs/CommandGuide/llvm-nm.rst b/docs/CommandGuide/llvm-nm.rst new file mode 100644 index 0000000..cbc7af2 --- /dev/null +++ b/docs/CommandGuide/llvm-nm.rst @@ -0,0 +1,189 @@ +llvm-nm - list LLVM bitcode and object file's symbol table +========================================================== + + +SYNOPSIS +-------- + + +:program:`llvm-nm` [*options*] [*filenames...*] + + +DESCRIPTION +----------- + + +The :program:`llvm-nm` utility lists the names of symbols from the LLVM bitcode +files, object files, or :program:`ar` archives containing them, named on the +command line. Each symbol is listed along with some simple information about its +provenance. If no file name is specified, or *-* is used as a file name, +:program:`llvm-nm` will process a file on its standard input stream. + +:program:`llvm-nm`'s default output format is the traditional BSD :program:`nm` +output format. Each such output record consists of an (optional) 8-digit +hexadecimal address, followed by a type code character, followed by a name, for +each symbol. One record is printed per line; fields are separated by spaces. +When the address is omitted, it is replaced by 8 spaces. + +Type code characters currently supported, and their meanings, are as follows: + + +U + + Named object is referenced but undefined in this bitcode file + + + +C + + Common (multiple definitions link together into one def) + + + +W + + Weak reference (multiple definitions link together into zero or one definitions) + + + +t + + Local function (text) object + + + +T + + Global function (text) object + + + +d + + Local data object + + + +D + + Global data object + + + +? + + Something unrecognizable + + + +Because LLVM bitcode files typically contain objects that are not considered to +have addresses until they are linked into an executable image or dynamically +compiled "just-in-time", :program:`llvm-nm` does not print an address for any +symbol in a LLVM bitcode file, even symbols which are defined in the bitcode +file. + + +OPTIONS +------- + + +.. program:: llvm-nm + + +.. option:: -B (default) + + Use BSD output format. Alias for :option:`--format=bsd`. + + +.. option:: -P + + Use POSIX.2 output format. Alias for :option:`--format=posix`. + + +.. option:: --debug-syms, -a + + Show all symbols, even debugger only. + + +.. option:: --defined-only + + Print only symbols defined in this file (as opposed to + symbols which may be referenced by objects in this file, but not + defined in this file.) + + +.. option:: --dynamic, -D + + Display dynamic symbols instead of normal symbols. + + +.. option:: --extern-only, -g + + Print only symbols whose definitions are external; that is, accessible + from other files. + + +.. option:: --format=format, -f format + + Select an output format; *format* may be *sysv*, *posix*, or *bsd*. The default + is *bsd*. + + +.. option:: -help + + Print a summary of command-line options and their meanings. + + +.. option:: --no-sort, -p + + Shows symbols in order encountered. + + +.. option:: --numeric-sort, -n, -v + + Sort symbols by address. + + +.. option:: --print-file-name, -A, -o + + Precede each symbol with the file it came from. + + +.. option:: --print-size, -S + + Show symbol size instead of address. + + +.. option:: --size-sort + + Sort symbols by size. + + +.. option:: --undefined-only, -u + + Print only symbols referenced but not defined in this file. + + +BUGS +---- + + + * :program:`llvm-nm` cannot demangle C++ mangled names, like GNU :program:`nm` + can. 
+ + * :program:`llvm-nm` does not support the full set of arguments that GNU + :program:`nm` does. + + +EXIT STATUS +----------- + + +:program:`llvm-nm` exits with an exit code of zero. + + +SEE ALSO +-------- + + +llvm-dis|llvm-dis, ar(1), nm(1) diff --git a/docs/CommandGuide/llvm-prof.pod b/docs/CommandGuide/llvm-prof.pod deleted file mode 100644 index 4b2e09d..0000000 --- a/docs/CommandGuide/llvm-prof.pod +++ /dev/null @@ -1,57 +0,0 @@ -=pod - -=head1 NAME - -llvm-prof - print execution profile of LLVM program - -=head1 SYNOPSIS - -B<llvm-prof> [I<options>] [I<bitcode file>] [I<llvmprof.out>] - -=head1 DESCRIPTION - -The B<llvm-prof> tool reads in an F<llvmprof.out> file (which can -optionally use a specific file with the third program argument), a bitcode file -for the program, and produces a human readable report, suitable for determining -where the program hotspots are. - -This program is often used in conjunction with the F<utils/profile.pl> -script. This script automatically instruments a program, runs it with the JIT, -then runs B<llvm-prof> to format a report. To get more information about -F<utils/profile.pl>, execute it with the B<-help> option. - -=head1 OPTIONS - -=over - -=item B<--annotated-llvm> or B<-A> - -In addition to the normal report printed, print out the code for the -program, annotated with execution frequency information. This can be -particularly useful when trying to visualize how frequently basic blocks -are executed. This is most useful with basic block profiling -information or better. - -=item B<--print-all-code> - -Using this option enables the B<--annotated-llvm> option, but it -prints the entire module, instead of just the most commonly executed -functions. - -=item B<--time-passes> - -Record the amount of time needed for each pass and print it to standard -error. - -=back - -=head1 EXIT STATUS - -B<llvm-prof> returns 1 if it cannot load the bitcode file or the profile -information. Otherwise, it exits with zero. - -=head1 AUTHOR - -B<llvm-prof> is maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llvm-prof.rst b/docs/CommandGuide/llvm-prof.rst new file mode 100644 index 0000000..e8d0b19 --- /dev/null +++ b/docs/CommandGuide/llvm-prof.rst @@ -0,0 +1,63 @@ +llvm-prof - print execution profile of LLVM program +=================================================== + + +SYNOPSIS +-------- + + +**llvm-prof** [*options*] [*bitcode file*] [*llvmprof.out*] + + +DESCRIPTION +----------- + + +The **llvm-prof** tool reads in an *llvmprof.out* file (which can +optionally use a specific file with the third program argument), a bitcode file +for the program, and produces a human readable report, suitable for determining +where the program hotspots are. + +This program is often used in conjunction with the *utils/profile.pl* +script. This script automatically instruments a program, runs it with the JIT, +then runs **llvm-prof** to format a report. To get more information about +*utils/profile.pl*, execute it with the **-help** option. + + +OPTIONS +------- + + + +**--annotated-llvm** or **-A** + + In addition to the normal report printed, print out the code for the + program, annotated with execution frequency information. This can be + particularly useful when trying to visualize how frequently basic blocks + are executed. This is most useful with basic block profiling + information or better. 
+ + + +**--print-all-code** + + Using this option enables the **--annotated-llvm** option, but it + prints the entire module, instead of just the most commonly executed + functions. + + + +**--time-passes** + + Record the amount of time needed for each pass and print it to standard + error. + + + + +EXIT STATUS +----------- + + +**llvm-prof** returns 1 if it cannot load the bitcode file or the profile +information. Otherwise, it exits with zero. diff --git a/docs/CommandGuide/llvm-ranlib.pod b/docs/CommandGuide/llvm-ranlib.pod deleted file mode 100644 index 431bc55..0000000 --- a/docs/CommandGuide/llvm-ranlib.pod +++ /dev/null @@ -1,52 +0,0 @@ -=pod - -=head1 NAME - -llvm-ranlib - Generate index for LLVM archive - -=head1 SYNOPSIS - -B<llvm-ranlib> [--version] [-help] <archive-file> - -=head1 DESCRIPTION - -The B<llvm-ranlib> command is similar to the common Unix utility, C<ranlib>. It -adds or updates the symbol table in an LLVM archive file. Note that using the -B<llvm-ar> modifier F<s> is usually more efficient than running B<llvm-ranlib> -which is only provided only for completness and compatibility. Unlike other -implementations of C<ranlib>, B<llvm-ranlib> indexes LLVM bitcode files, not -native object modules. You can list the contents of the symbol table with the -C<llvm-nm -s> command. - -=head1 OPTIONS - -=over - -=item F<archive-file> - -Specifies the archive-file to which the symbol table is added or updated. - -=item F<--version> - -Print the version of B<llvm-ranlib> and exit without building a symbol table. - -=item F<-help> - -Print usage help for B<llvm-ranlib> and exit without building a symbol table. - -=back - -=head1 EXIT STATUS - -If B<llvm-ranlib> succeeds, it will exit with 0. If an error occurs, a non-zero -exit code will be returned. - -=head1 SEE ALSO - -L<llvm-ar|llvm-ar>, ranlib(1) - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llvm-ranlib.rst b/docs/CommandGuide/llvm-ranlib.rst new file mode 100644 index 0000000..6658818 --- /dev/null +++ b/docs/CommandGuide/llvm-ranlib.rst @@ -0,0 +1,61 @@ +llvm-ranlib - Generate index for LLVM archive +============================================= + + +SYNOPSIS +-------- + + +**llvm-ranlib** [--version] [-help] <archive-file> + + +DESCRIPTION +----------- + + +The **llvm-ranlib** command is similar to the common Unix utility, ``ranlib``. It +adds or updates the symbol table in an LLVM archive file. Note that using the +**llvm-ar** modifier *s* is usually more efficient than running **llvm-ranlib** +which is only provided only for completness and compatibility. Unlike other +implementations of ``ranlib``, **llvm-ranlib** indexes LLVM bitcode files, not +native object modules. You can list the contents of the symbol table with the +``llvm-nm -s`` command. + + +OPTIONS +------- + + + +*archive-file* + + Specifies the archive-file to which the symbol table is added or updated. + + + +*--version* + + Print the version of **llvm-ranlib** and exit without building a symbol table. + + + +*-help* + + Print usage help for **llvm-ranlib** and exit without building a symbol table. + + + + +EXIT STATUS +----------- + + +If **llvm-ranlib** succeeds, it will exit with 0. If an error occurs, a non-zero +exit code will be returned. 
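+
+For example (the archive name is illustrative only), the symbol table of an
+LLVM bitcode archive can be rebuilt and then inspected:
+
+.. code-block:: sh
+
+   llvm-ranlib libfoo.a    # add or update the archive's symbol table
+   llvm-nm -s libfoo.a     # list the contents of the symbol table
+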
+ + +SEE ALSO +-------- + + +llvm-ar|llvm-ar, ranlib(1) diff --git a/docs/CommandGuide/llvm-stress.pod b/docs/CommandGuide/llvm-stress.pod deleted file mode 100644 index 92083d2..0000000 --- a/docs/CommandGuide/llvm-stress.pod +++ /dev/null @@ -1,42 +0,0 @@ -=pod - -=head1 NAME - -llvm-stress - generate random .ll files - -=head1 SYNOPSIS - -B<llvm-cov> [-gcno=filename] [-gcda=filename] [dump] - -=head1 DESCRIPTION - -The B<llvm-stress> tool is used to generate random .ll files that can be used to -test different components of LLVM. - -=head1 OPTIONS - -=over - -=item B<-o> I<filename> - -Specify the output filename. - -=item B<-size> I<size> - -Specify the size of the generated .ll file. - -=item B<-seed> I<seed> - -Specify the seed to be used for the randomly generated instructions. - -=back - -=head1 EXIT STATUS - -B<llvm-stress> returns 0. - -=head1 AUTHOR - -B<llvm-stress> is maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/llvm-stress.rst b/docs/CommandGuide/llvm-stress.rst new file mode 100644 index 0000000..44aa32c --- /dev/null +++ b/docs/CommandGuide/llvm-stress.rst @@ -0,0 +1,48 @@ +llvm-stress - generate random .ll files +======================================= + + +SYNOPSIS +-------- + + +**llvm-stress** [-size=filesize] [-seed=initialseed] [-o=outfile] + + +DESCRIPTION +----------- + + +The **llvm-stress** tool is used to generate random .ll files that can be used to +test different components of LLVM. + + +OPTIONS +------- + + + +**-o** *filename* + + Specify the output filename. + + + +**-size** *size* + + Specify the size of the generated .ll file. + + + +**-seed** *seed* + + Specify the seed to be used for the randomly generated instructions. + + + + +EXIT STATUS +----------- + + +**llvm-stress** returns 0. 
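+
+For example (the seed, size, and file name below are arbitrary), a random
+module can be generated and then fed to other LLVM tools for testing:
+
+.. code-block:: sh
+
+   llvm-stress -size=512 -seed=17 -o random.ll   # generate a random .ll file
+   opt -verify -disable-output random.ll         # check it with the IR verifier
+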
diff --git a/docs/CommandGuide/manpage.css b/docs/CommandGuide/manpage.css deleted file mode 100644 index c922564..0000000 --- a/docs/CommandGuide/manpage.css +++ /dev/null @@ -1,256 +0,0 @@ -/* Based on http://www.perldoc.com/css/perldoc.css */ - -@import url("../llvm.css"); - -body { font-family: Arial,Helvetica; } - -blockquote { margin: 10pt; } - -h1, a { color: #336699; } - - -/*** Top menu style ****/ -.mmenuon { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #ff6600; font-size: 10pt; -} -.mmenuoff { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #ffffff; font-size: 10pt; -} -.cpyright { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #ffffff; font-size: xx-small; -} -.cpyrightText { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #ffffff; font-size: xx-small; -} -.sections { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 11pt; -} -.dsections { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 12pt; -} -.slink { - font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; - color: #000000; font-size: 9pt; -} - -.slink2 { font-family: Arial,Helvetica; text-decoration: none; color: #336699; } - -.maintitle { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 18pt; -} -.dblArrow { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: small; -} -.menuSec { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: small; -} - -.newstext { - font-family: Arial,Helvetica; font-size: small; -} - -.linkmenu { - font-family: Arial,Helvetica; color: #000000; font-weight: bold; - text-decoration: none; -} - -P { - font-family: Arial,Helvetica; -} - -PRE { - font-size: 10pt; -} -.quote { - font-family: Times; text-decoration: none; - color: #000000; font-size: 9pt; font-style: italic; -} -.smstd { font-family: Arial,Helvetica; color: #000000; font-size: x-small; } -.std { font-family: Arial,Helvetica; color: #000000; } -.meerkatTitle { - font-family: sans-serif; font-size: x-small; color: black; } - -.meerkatDescription { font-family: sans-serif; font-size: 10pt; color: black } -.meerkatCategory { - font-family: sans-serif; font-size: 9pt; font-weight: bold; font-style: italic; - color: brown; } -.meerkatChannel { - font-family: sans-serif; font-size: 9pt; font-style: italic; color: brown; } -.meerkatDate { font-family: sans-serif; font-size: xx-small; color: #336699; } - -.tocTitle { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #333333; font-size: 10pt; -} - -.toc-item { - font-family: Arial,Helvetica; font-weight: bold; - color: #336699; font-size: 10pt; text-decoration: underline; -} - -.perlVersion { - font-family: Arial,Helvetica; font-weight: bold; - color: #336699; font-size: 10pt; text-decoration: none; -} - -.podTitle { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #000000; -} - -.docTitle { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #000000; font-size: 10pt; -} -.dotDot { - font-family: Arial,Helvetica; font-weight: bold; - color: #000000; font-size: 9pt; -} - -.docSec { - font-family: Arial,Helvetica; font-weight: normal; - color: #333333; font-size: 9pt; -} 
-.docVersion { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 10pt; -} - -.docSecs-on { - font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; - color: #ff0000; font-size: 10pt; -} -.docSecs-off { - font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; - color: #333333; font-size: 10pt; -} - -h2 { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: medium; -} -h1 { - font-family: Verdana,Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: large; -} - -DL { - font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; - color: #333333; font-size: 10pt; -} - -UL > LI > A { - font-family: Arial,Helvetica; font-weight: bold; - color: #336699; font-size: 10pt; -} - -.moduleInfo { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #333333; font-size: 11pt; -} - -.moduleInfoSec { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 10pt; -} - -.moduleInfoVal { - font-family: Arial,Helvetica; font-weight: normal; text-decoration: underline; - color: #000000; font-size: 10pt; -} - -.cpanNavTitle { - font-family: Arial,Helvetica; font-weight: bold; - color: #ffffff; font-size: 10pt; -} -.cpanNavLetter { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #333333; font-size: 9pt; -} -.cpanCat { - font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; - color: #336699; font-size: 9pt; -} - -.bttndrkblue-bkgd-top { - background-color: #225688; - background-image: url(/global/mvc_objects/images/bttndrkblue_bgtop.gif); -} -.bttndrkblue-bkgd-left { - background-color: #225688; - background-image: url(/global/mvc_objects/images/bttndrkblue_bgleft.gif); -} -.bttndrkblue-bkgd { - padding-top: 0px; - padding-bottom: 0px; - margin-bottom: 0px; - margin-top: 0px; - background-repeat: no-repeat; - background-color: #225688; - background-image: url(/global/mvc_objects/images/bttndrkblue_bgmiddle.gif); - vertical-align: top; -} -.bttndrkblue-bkgd-right { - background-color: #225688; - background-image: url(/global/mvc_objects/images/bttndrkblue_bgright.gif); -} -.bttndrkblue-bkgd-bottom { - background-color: #225688; - background-image: url(/global/mvc_objects/images/bttndrkblue_bgbottom.gif); -} -.bttndrkblue-text a { - color: #ffffff; - text-decoration: none; -} -a.bttndrkblue-text:hover { - color: #ffDD3C; - text-decoration: none; -} -.bg-ltblue { - background-color: #f0f5fa; -} - -.border-left-b { - background: #f0f5fa url(/i/corner-leftline.gif) repeat-y; -} - -.border-right-b { - background: #f0f5fa url(/i/corner-rightline.gif) repeat-y; -} - -.border-top-b { - background: #f0f5fa url(/i/corner-topline.gif) repeat-x; -} - -.border-bottom-b { - background: #f0f5fa url(/i/corner-botline.gif) repeat-x; -} - -.border-right-w { - background: #ffffff url(/i/corner-rightline.gif) repeat-y; -} - -.border-top-w { - background: #ffffff url(/i/corner-topline.gif) repeat-x; -} - -.border-bottom-w { - background: #ffffff url(/i/corner-botline.gif) repeat-x; -} - -.bg-white { - background-color: #ffffff; -} - -.border-left-w { - background: #ffffff url(/i/corner-leftline.gif) repeat-y; -} diff --git a/docs/CommandGuide/opt.pod b/docs/CommandGuide/opt.pod deleted file mode 100644 index f5f4968..0000000 --- a/docs/CommandGuide/opt.pod +++ /dev/null @@ -1,143 +0,0 @@ -=pod - -=head1 NAME - -opt - LLVM 
optimizer - -=head1 SYNOPSIS - -B<opt> [I<options>] [I<filename>] - -=head1 DESCRIPTION - -The B<opt> command is the modular LLVM optimizer and analyzer. It takes LLVM -source files as input, runs the specified optimizations or analyses on it, and then -outputs the optimized file or the analysis results. The function of -B<opt> depends on whether the B<-analyze> option is given. - -When B<-analyze> is specified, B<opt> performs various analyses of the input -source. It will usually print the results on standard output, but in a few -cases, it will print output to standard error or generate a file with the -analysis output, which is usually done when the output is meant for another -program. - -While B<-analyze> is I<not> given, B<opt> attempts to produce an optimized -output file. The optimizations available via B<opt> depend upon what -libraries were linked into it as well as any additional libraries that have -been loaded with the B<-load> option. Use the B<-help> option to determine -what optimizations you can use. - -If I<filename> is omitted from the command line or is I<->, B<opt> reads its -input from standard input. Inputs can be in either the LLVM assembly language -format (.ll) or the LLVM bitcode format (.bc). - -If an output filename is not specified with the B<-o> option, B<opt> -writes its output to the standard output. - -=head1 OPTIONS - -=over - -=item B<-f> - -Enable binary output on terminals. Normally, B<opt> will refuse to -write raw bitcode output if the output stream is a terminal. With this option, -B<opt> will write raw bitcode regardless of the output device. - -=item B<-help> - -Print a summary of command line options. - -=item B<-o> I<filename> - -Specify the output filename. - -=item B<-S> - -Write output in LLVM intermediate language (instead of bitcode). - -=item B<-{passname}> - -B<opt> provides the ability to run any of LLVM's optimization or analysis passes -in any order. The B<-help> option lists all the passes available. The order in -which the options occur on the command line are the order in which they are -executed (within pass constraints). - -=item B<-std-compile-opts> - -This is short hand for a standard list of I<compile time optimization> passes. -This is typically used to optimize the output from the llvm-gcc front end. It -might be useful for other front end compilers as well. To discover the full set -of options available, use the following command: - - llvm-as < /dev/null | opt -std-compile-opts -disable-output -debug-pass=Arguments - -=item B<-disable-inlining> - -This option is only meaningful when B<-std-compile-opts> is given. It simply -removes the inlining pass from the standard list. - -=item B<-disable-opt> - -This option is only meaningful when B<-std-compile-opts> is given. It disables -most, but not all, of the B<-std-compile-opts>. The ones that remain are -B<-verify>, B<-lower-setjmp>, and B<-funcresolve>. - -=item B<-strip-debug> - -This option causes opt to strip debug information from the module before -applying other optimizations. It is essentially the same as B<-strip> but it -ensures that stripping of debug information is done first. - -=item B<-verify-each> - -This option causes opt to add a verify pass after every pass otherwise specified -on the command line (including B<-verify>). This is useful for cases where it -is suspected that a pass is creating an invalid module but it is not clear which -pass is doing it. The combination of B<-std-compile-opts> and B<-verify-each> -can quickly track down this kind of problem. 
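As a rough sketch of that debugging workflow (the bitcode file name below is only a placeholder), one might run the standard pass list with a verifier inserted after every pass:

  opt -std-compile-opts -verify-each buggy.bc -o buggy-opt.bc

The first pass whose output fails the verifier is then the prime suspect.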
- -=item B<-profile-info-file> I<filename> - -Specify the name of the file loaded by the -profile-loader option. - -=item B<-stats> - -Print statistics. - -=item B<-time-passes> - -Record the amount of time needed for each pass and print it to standard -error. - -=item B<-debug> - -If this is a debug build, this option will enable debug printouts -from passes which use the I<DEBUG()> macro. See the B<LLVM Programmer's -Manual>, section I<#DEBUG> for more information. - -=item B<-load>=I<plugin> - -Load the dynamic object I<plugin>. This object should register new optimization -or analysis passes. Once loaded, the object will add new command line options to -enable various optimizations or analyses. To see the new complete list of -optimizations, use the B<-help> and B<-load> options together. For example: - - opt -load=plugin.so -help - -=item B<-p> - -Print module after each transformation. - -=back - -=head1 EXIT STATUS - -If B<opt> succeeds, it will exit with 0. Otherwise, if an error -occurs, it will exit with a non-zero value. - -=head1 AUTHORS - -Maintained by the LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/opt.rst b/docs/CommandGuide/opt.rst new file mode 100644 index 0000000..72f1903 --- /dev/null +++ b/docs/CommandGuide/opt.rst @@ -0,0 +1,183 @@ +opt - LLVM optimizer +==================== + + +SYNOPSIS +-------- + + +**opt** [*options*] [*filename*] + + +DESCRIPTION +----------- + + +The **opt** command is the modular LLVM optimizer and analyzer. It takes LLVM +source files as input, runs the specified optimizations or analyses on it, and then +outputs the optimized file or the analysis results. The function of +**opt** depends on whether the **-analyze** option is given. + +When **-analyze** is specified, **opt** performs various analyses of the input +source. It will usually print the results on standard output, but in a few +cases, it will print output to standard error or generate a file with the +analysis output, which is usually done when the output is meant for another +program. + +While **-analyze** is *not* given, **opt** attempts to produce an optimized +output file. The optimizations available via **opt** depend upon what +libraries were linked into it as well as any additional libraries that have +been loaded with the **-load** option. Use the **-help** option to determine +what optimizations you can use. + +If *filename* is omitted from the command line or is *-*, **opt** reads its +input from standard input. Inputs can be in either the LLVM assembly language +format (.ll) or the LLVM bitcode format (.bc). + +If an output filename is not specified with the **-o** option, **opt** +writes its output to the standard output. + + +OPTIONS +------- + + + +**-f** + + Enable binary output on terminals. Normally, **opt** will refuse to + write raw bitcode output if the output stream is a terminal. With this option, + **opt** will write raw bitcode regardless of the output device. + + + +**-help** + + Print a summary of command line options. + + + +**-o** *filename* + + Specify the output filename. + + + +**-S** + + Write output in LLVM intermediate language (instead of bitcode). + + + +**-{passname}** + + **opt** provides the ability to run any of LLVM's optimization or analysis passes + in any order. The **-help** option lists all the passes available. The order in + which the options occur on the command line are the order in which they are + executed (within pass constraints). 
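  As a brief, hedged illustration of the point above (the pass names and file names are only examples), two passes can be run in exactly the order they are written on the command line:

  .. code-block:: sh

     opt -mem2reg -gvn input.ll -S -o input-opt.ll

  Here **opt** runs *mem2reg* first and *gvn* second, mirroring their order on the command line.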
+ + + +**-std-compile-opts** + + This is short hand for a standard list of *compile time optimization* passes. + This is typically used to optimize the output from the llvm-gcc front end. It + might be useful for other front end compilers as well. To discover the full set + of options available, use the following command: + + + .. code-block:: sh + + llvm-as < /dev/null | opt -std-compile-opts -disable-output -debug-pass=Arguments + + + + +**-disable-inlining** + + This option is only meaningful when **-std-compile-opts** is given. It simply + removes the inlining pass from the standard list. + + + +**-disable-opt** + + This option is only meaningful when **-std-compile-opts** is given. It disables + most, but not all, of the **-std-compile-opts**. The ones that remain are + **-verify**, **-lower-setjmp**, and **-funcresolve**. + + + +**-strip-debug** + + This option causes opt to strip debug information from the module before + applying other optimizations. It is essentially the same as **-strip** but it + ensures that stripping of debug information is done first. + + + +**-verify-each** + + This option causes opt to add a verify pass after every pass otherwise specified + on the command line (including **-verify**). This is useful for cases where it + is suspected that a pass is creating an invalid module but it is not clear which + pass is doing it. The combination of **-std-compile-opts** and **-verify-each** + can quickly track down this kind of problem. + + + +**-profile-info-file** *filename* + + Specify the name of the file loaded by the -profile-loader option. + + + +**-stats** + + Print statistics. + + + +**-time-passes** + + Record the amount of time needed for each pass and print it to standard + error. + + + +**-debug** + + If this is a debug build, this option will enable debug printouts + from passes which use the *DEBUG()* macro. See the **LLVM Programmer's + Manual**, section *#DEBUG* for more information. + + + +**-load**\ =\ *plugin* + + Load the dynamic object *plugin*. This object should register new optimization + or analysis passes. Once loaded, the object will add new command line options to + enable various optimizations or analyses. To see the new complete list of + optimizations, use the **-help** and **-load** options together. For example: + + + .. code-block:: sh + + opt -load=plugin.so -help + + + + +**-p** + + Print module after each transformation. + + + + +EXIT STATUS +----------- + + +If **opt** succeeds, it will exit with 0. Otherwise, if an error +occurs, it will exit with a non-zero value. diff --git a/docs/CommandGuide/tblgen.pod b/docs/CommandGuide/tblgen.pod deleted file mode 100644 index 180bcc1..0000000 --- a/docs/CommandGuide/tblgen.pod +++ /dev/null @@ -1,139 +0,0 @@ - -=pod - -=head1 NAME - -tblgen - Target Description To C++ Code Generator - -=head1 SYNOPSIS - -B<tblgen> [I<options>] [I<filename>] - -=head1 DESCRIPTION - -B<tblgen> translates from target description (.td) files into C++ code that can -be included in the definition of an LLVM target library. Most users of LLVM will -not need to use this program. It is only for assisting with writing an LLVM -target backend. - -The input and output of B<tblgen> is beyond the scope of this short -introduction. Please see the I<CodeGeneration> page in the LLVM documentation. - -The F<filename> argument specifies the name of a Target Description (.td) file -to read as input. - -=head1 OPTIONS - -=over - -=item B<-help> - -Print a summary of command line options. 
- -=item B<-o> F<filename> - -Specify the output file name. If F<filename> is C<->, then B<tblgen> -sends its output to standard output. - -=item B<-I> F<directory> - -Specify where to find other target description files for inclusion. The -F<directory> value should be a full or partial path to a directory that contains -target description files. - -=item B<-asmparsernum> F<N> - -Make -gen-asm-parser emit assembly writer number F<N>. - -=item B<-asmwriternum> F<N> - -Make -gen-asm-writer emit assembly writer number F<N>. - -=item B<-class> F<class Name> - -Print the enumeration list for this class. - -=item B<-print-records> - -Print all records to standard output (default). - -=item B<-print-enums> - -Print enumeration values for a class - -=item B<-print-sets> - -Print expanded sets for testing DAG exprs. - -=item B<-gen-emitter> - -Generate machine code emitter. - -=item B<-gen-register-info> - -Generate registers and register classes info. - -=item B<-gen-instr-info> - -Generate instruction descriptions. - -=item B<-gen-asm-writer> - -Generate the assembly writer. - -=item B<-gen-disassembler> - -Generate disassembler. - -=item B<-gen-pseudo-lowering> - -Generate pseudo instruction lowering. - -=item B<-gen-dag-isel> - -Generate a DAG (Directed Acycle Graph) instruction selector. - -=item B<-gen-asm-matcher> - -Generate assembly instruction matcher. - -=item B<-gen-dfa-packetizer> - -Generate DFA Packetizer for VLIW targets. - -=item B<-gen-fast-isel> - -Generate a "fast" instruction selector. - -=item B<-gen-subtarget> - -Generate subtarget enumerations. - -=item B<-gen-intrinsic> - -Generate intrinsic information. - -=item B<-gen-tgt-intrinsic> - -Generate target intrinsic information. - -=item B<-gen-enhanced-disassembly-info> - -Generate enhanced disassembly info. - -=item B<-version> - -Show the version number of this program. - -=back - -=head1 EXIT STATUS - -If B<tblgen> succeeds, it will exit with 0. Otherwise, if an error -occurs, it will exit with a non-zero value. - -=head1 AUTHORS - -Maintained by The LLVM Team (L<http://llvm.org/>). - -=cut diff --git a/docs/CommandGuide/tblgen.rst b/docs/CommandGuide/tblgen.rst new file mode 100644 index 0000000..2d19167 --- /dev/null +++ b/docs/CommandGuide/tblgen.rst @@ -0,0 +1,186 @@ +tblgen - Target Description To C++ Code Generator +================================================= + + +SYNOPSIS +-------- + + +**tblgen** [*options*] [*filename*] + + +DESCRIPTION +----------- + + +**tblgen** translates from target description (.td) files into C++ code that can +be included in the definition of an LLVM target library. Most users of LLVM will +not need to use this program. It is only for assisting with writing an LLVM +target backend. + +The input and output of **tblgen** is beyond the scope of this short +introduction. Please see the *CodeGeneration* page in the LLVM documentation. + +The *filename* argument specifies the name of a Target Description (.td) file +to read as input. + + +OPTIONS +------- + + + +**-help** + + Print a summary of command line options. + + + +**-o** *filename* + + Specify the output file name. If *filename* is ``-``, then **tblgen** + sends its output to standard output. + + + +**-I** *directory* + + Specify where to find other target description files for inclusion. The + *directory* value should be a full or partial path to a directory that contains + target description files. + + + +**-asmparsernum** *N* + + Make -gen-asm-parser emit assembly writer number *N*. 
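  As a sketch of how the **-I** and **-o** options above are commonly combined (the directory, .td file, and output file names here are hypothetical), a register-info generation run might look like:

  .. code-block:: sh

     tblgen -gen-register-info -I include Sparc.td -o SparcGenRegisterInfo.inc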
+ + + +**-asmwriternum** *N* + + Make -gen-asm-writer emit assembly writer number *N*. + + + +**-class** *class Name* + + Print the enumeration list for this class. + + + +**-print-records** + + Print all records to standard output (default). + + + +**-print-enums** + + Print enumeration values for a class + + + +**-print-sets** + + Print expanded sets for testing DAG exprs. + + + +**-gen-emitter** + + Generate machine code emitter. + + + +**-gen-register-info** + + Generate registers and register classes info. + + + +**-gen-instr-info** + + Generate instruction descriptions. + + + +**-gen-asm-writer** + + Generate the assembly writer. + + + +**-gen-disassembler** + + Generate disassembler. + + + +**-gen-pseudo-lowering** + + Generate pseudo instruction lowering. + + + +**-gen-dag-isel** + + Generate a DAG (Directed Acycle Graph) instruction selector. + + + +**-gen-asm-matcher** + + Generate assembly instruction matcher. + + + +**-gen-dfa-packetizer** + + Generate DFA Packetizer for VLIW targets. + + + +**-gen-fast-isel** + + Generate a "fast" instruction selector. + + + +**-gen-subtarget** + + Generate subtarget enumerations. + + + +**-gen-intrinsic** + + Generate intrinsic information. + + + +**-gen-tgt-intrinsic** + + Generate target intrinsic information. + + + +**-gen-enhanced-disassembly-info** + + Generate enhanced disassembly info. + + + +**-version** + + Show the version number of this program. + + + + +EXIT STATUS +----------- + + +If **tblgen** succeeds, it will exit with 0. Otherwise, if an error +occurs, it will exit with a non-zero value. diff --git a/docs/CommandLine.html b/docs/CommandLine.html deleted file mode 100644 index 7535ca4..0000000 --- a/docs/CommandLine.html +++ /dev/null @@ -1,1976 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>CommandLine 2.0 Library Manual</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> - -<h1> - CommandLine 2.0 Library Manual -</h1> - -<ol> - <li><a href="#introduction">Introduction</a></li> - - <li><a href="#quickstart">Quick Start Guide</a> - <ol> - <li><a href="#bool">Boolean Arguments</a></li> - <li><a href="#alias">Argument Aliases</a></li> - <li><a href="#onealternative">Selecting an alternative from a - set of possibilities</a></li> - <li><a href="#namedalternatives">Named alternatives</a></li> - <li><a href="#list">Parsing a list of options</a></li> - <li><a href="#bits">Collecting options as a set of flags</a></li> - <li><a href="#description">Adding freeform text to help output</a></li> - </ol></li> - - <li><a href="#referenceguide">Reference Guide</a> - <ol> - <li><a href="#positional">Positional Arguments</a> - <ul> - <li><a href="#--">Specifying positional options with hyphens</a></li> - <li><a href="#getPosition">Determining absolute position with - getPosition</a></li> - <li><a href="#cl::ConsumeAfter">The <tt>cl::ConsumeAfter</tt> - modifier</a></li> - </ul></li> - - <li><a href="#storage">Internal vs External Storage</a></li> - - <li><a href="#attributes">Option Attributes</a></li> - - <li><a href="#modifiers">Option Modifiers</a> - <ul> - <li><a href="#hiding">Hiding an option from <tt>-help</tt> - output</a></li> - <li><a href="#numoccurrences">Controlling the number of occurrences - required and allowed</a></li> - <li><a href="#valrequired">Controlling whether or not a value must be - specified</a></li> - <li><a 
href="#formatting">Controlling other formatting options</a></li> - <li><a href="#misc">Miscellaneous option modifiers</a></li> - <li><a href="#response">Response files</a></li> - </ul></li> - - <li><a href="#toplevel">Top-Level Classes and Functions</a> - <ul> - <li><a href="#cl::ParseCommandLineOptions">The - <tt>cl::ParseCommandLineOptions</tt> function</a></li> - <li><a href="#cl::ParseEnvironmentOptions">The - <tt>cl::ParseEnvironmentOptions</tt> function</a></li> - <li><a href="#cl::SetVersionPrinter">The <tt>cl::SetVersionPrinter</tt> - function</a></li> - <li><a href="#cl::opt">The <tt>cl::opt</tt> class</a></li> - <li><a href="#cl::list">The <tt>cl::list</tt> class</a></li> - <li><a href="#cl::bits">The <tt>cl::bits</tt> class</a></li> - <li><a href="#cl::alias">The <tt>cl::alias</tt> class</a></li> - <li><a href="#cl::extrahelp">The <tt>cl::extrahelp</tt> class</a></li> - </ul></li> - - <li><a href="#builtinparsers">Builtin parsers</a> - <ul> - <li><a href="#genericparser">The Generic <tt>parser<t></tt> - parser</a></li> - <li><a href="#boolparser">The <tt>parser<bool></tt> - specialization</a></li> - <li><a href="#boolOrDefaultparser">The <tt>parser<boolOrDefault></tt> - specialization</a></li> - <li><a href="#stringparser">The <tt>parser<string></tt> - specialization</a></li> - <li><a href="#intparser">The <tt>parser<int></tt> - specialization</a></li> - <li><a href="#doubleparser">The <tt>parser<double></tt> and - <tt>parser<float></tt> specializations</a></li> - </ul></li> - </ol></li> - <li><a href="#extensionguide">Extension Guide</a> - <ol> - <li><a href="#customparser">Writing a custom parser</a></li> - <li><a href="#explotingexternal">Exploiting external storage</a></li> - <li><a href="#dynamicopts">Dynamically adding command line - options</a></li> - </ol></li> -</ol> - -<div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="introduction">Introduction</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>This document describes the CommandLine argument processing library. It will -show you how to use it, and what it can do. The CommandLine library uses a -declarative approach to specifying the command line options that your program -takes. By default, these options declarations implicitly hold the value parsed -for the option declared (of course this <a href="#storage">can be -changed</a>).</p> - -<p>Although there are a <b>lot</b> of command line argument parsing libraries -out there in many different languages, none of them fit well with what I needed. -By looking at the features and problems of other libraries, I designed the -CommandLine library to have the following features:</p> - -<ol> -<li>Speed: The CommandLine library is very quick and uses little resources. The -parsing time of the library is directly proportional to the number of arguments -parsed, not the the number of options recognized. Additionally, command line -argument values are captured transparently into user defined global variables, -which can be accessed like any other variable (and with the same -performance).</li> - -<li>Type Safe: As a user of CommandLine, you don't have to worry about -remembering the type of arguments that you want (is it an int? a string? a -bool? an enum?) and keep casting it around. 
Not only does this help prevent -error prone constructs, it also leads to dramatically cleaner source code.</li> - -<li>No subclasses required: To use CommandLine, you instantiate variables that -correspond to the arguments that you would like to capture, you don't subclass a -parser. This means that you don't have to write <b>any</b> boilerplate -code.</li> - -<li>Globally accessible: Libraries can specify command line arguments that are -automatically enabled in any tool that links to the library. This is possible -because the application doesn't have to keep a list of arguments to pass to -the parser. This also makes supporting <a href="#dynamicopts">dynamically -loaded options</a> trivial.</li> - -<li>Cleaner: CommandLine supports enum and other types directly, meaning that -there is less error and more security built into the library. You don't have to -worry about whether your integral command line argument accidentally got -assigned a value that is not valid for your enum type.</li> - -<li>Powerful: The CommandLine library supports many different types of -arguments, from simple <a href="#boolparser">boolean flags</a> to <a -href="#cl::opt">scalars arguments</a> (<a href="#stringparser">strings</a>, <a -href="#intparser">integers</a>, <a href="#genericparser">enums</a>, <a -href="#doubleparser">doubles</a>), to <a href="#cl::list">lists of -arguments</a>. This is possible because CommandLine is...</li> - -<li>Extensible: It is very simple to add a new argument type to CommandLine. -Simply specify the parser that you want to use with the command line option when -you declare it. <a href="#customparser">Custom parsers</a> are no problem.</li> - -<li>Labor Saving: The CommandLine library cuts down on the amount of grunt work -that you, the user, have to do. For example, it automatically provides a -<tt>-help</tt> option that shows the available command line options for your -tool. Additionally, it does most of the basic correctness checking for -you.</li> - -<li>Capable: The CommandLine library can handle lots of different forms of -options often found in real programs. For example, <a -href="#positional">positional</a> arguments, <tt>ls</tt> style <a -href="#cl::Grouping">grouping</a> options (to allow processing '<tt>ls --lad</tt>' naturally), <tt>ld</tt> style <a href="#cl::Prefix">prefix</a> -options (to parse '<tt>-lmalloc -L/usr/lib</tt>'), and <a -href="#cl::ConsumeAfter">interpreter style options</a>.</li> - -</ol> - -<p>This document will hopefully let you jump in and start using CommandLine in -your utility quickly and painlessly. Additionally it should be a simple -reference manual to figure out how stuff works. If it is failing in some area -(or you want an extension to the library), nag the author, <a -href="mailto:sabre@nondot.org">Chris Lattner</a>.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="quickstart">Quick Start Guide</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>This section of the manual runs through a simple CommandLine'ification of a -basic compiler tool. 
This is intended to show you how to jump into using the -CommandLine library in your own program, and show you some of the cool things it -can do.</p> - -<p>To start out, you need to include the CommandLine header file into your -program:</p> - -<div class="doc_code"><pre> - #include "llvm/Support/CommandLine.h" -</pre></div> - -<p>Additionally, you need to add this as the first line of your main -program:</p> - -<div class="doc_code"><pre> -int main(int argc, char **argv) { - <a href="#cl::ParseCommandLineOptions">cl::ParseCommandLineOptions</a>(argc, argv); - ... -} -</pre></div> - -<p>... which actually parses the arguments and fills in the variable -declarations.</p> - -<p>Now that you are ready to support command line arguments, we need to tell the -system which ones we want, and what type of arguments they are. The CommandLine -library uses a declarative syntax to model command line arguments with the -global variable declarations that capture the parsed values. This means that -for every command line option that you would like to support, there should be a -global variable declaration to capture the result. For example, in a compiler, -we would like to support the Unix-standard '<tt>-o <filename></tt>' option -to specify where to put the output. With the CommandLine library, this is -represented like this:</p> - -<a name="value_desc_example"></a> -<div class="doc_code"><pre> -<a href="#cl::opt">cl::opt</a><string> OutputFilename("<i>o</i>", <a href="#cl::desc">cl::desc</a>("<i>Specify output filename</i>"), <a href="#cl::value_desc">cl::value_desc</a>("<i>filename</i>")); -</pre></div> - -<p>This declares a global variable "<tt>OutputFilename</tt>" that is used to -capture the result of the "<tt>o</tt>" argument (first parameter). We specify -that this is a simple scalar option by using the "<tt><a -href="#cl::opt">cl::opt</a></tt>" template (as opposed to the <a -href="#list">"<tt>cl::list</tt> template</a>), and tell the CommandLine library -that the data type that we are parsing is a string.</p> - -<p>The second and third parameters (which are optional) are used to specify what -to output for the "<tt>-help</tt>" option. In this case, we get a line that -looks like this:</p> - -<div class="doc_code"><pre> -USAGE: compiler [options] - -OPTIONS: - -help - display available options (-help-hidden for more) - <b>-o <filename> - Specify output filename</b> -</pre></div> - -<p>Because we specified that the command line option should parse using the -<tt>string</tt> data type, the variable declared is automatically usable as a -real string in all contexts that a normal C++ string object may be used. For -example:</p> - -<div class="doc_code"><pre> - ... - std::ofstream Output(OutputFilename.c_str()); - if (Output.good()) ... - ... -</pre></div> - -<p>There are many different options that you can use to customize the command -line option handling library, but the above example shows the general interface -to these options. The options can be specified in any order, and are specified -with helper functions like <a href="#cl::desc"><tt>cl::desc(...)</tt></a>, so -there are no positional dependencies to remember. The available options are -discussed in detail in the <a href="#referenceguide">Reference Guide</a>.</p> - -<p>Continuing the example, we would like to have our compiler take an input -filename as well as an output filename, but we do not want the input filename to -be specified with a hyphen (ie, not <tt>-filename.c</tt>). 
To support this -style of argument, the CommandLine library allows for <a -href="#positional">positional</a> arguments to be specified for the program. -These positional arguments are filled with command line parameters that are not -in option form. We use this feature like this:</p> - -<div class="doc_code"><pre> -<a href="#cl::opt">cl::opt</a><string> InputFilename(<a href="#cl::Positional">cl::Positional</a>, <a href="#cl::desc">cl::desc</a>("<i><input file></i>"), <a href="#cl::init">cl::init</a>("<i>-</i>")); -</pre></div> - -<p>This declaration indicates that the first positional argument should be -treated as the input filename. Here we use the <tt><a -href="#cl::init">cl::init</a></tt> option to specify an initial value for the -command line option, which is used if the option is not specified (if you do not -specify a <tt><a href="#cl::init">cl::init</a></tt> modifier for an option, then -the default constructor for the data type is used to initialize the value). -Command line options default to being optional, so if we would like to require -that the user always specify an input filename, we would add the <tt><a -href="#cl::Required">cl::Required</a></tt> flag, and we could eliminate the -<tt><a href="#cl::init">cl::init</a></tt> modifier, like this:</p> - -<div class="doc_code"><pre> -<a href="#cl::opt">cl::opt</a><string> InputFilename(<a href="#cl::Positional">cl::Positional</a>, <a href="#cl::desc">cl::desc</a>("<i><input file></i>"), <b><a href="#cl::Required">cl::Required</a></b>); -</pre></div> - -<p>Again, the CommandLine library does not require the options to be specified -in any particular order, so the above declaration is equivalent to:</p> - -<div class="doc_code"><pre> -<a href="#cl::opt">cl::opt</a><string> InputFilename(<a href="#cl::Positional">cl::Positional</a>, <a href="#cl::Required">cl::Required</a>, <a href="#cl::desc">cl::desc</a>("<i><input file></i>")); -</pre></div> - -<p>By simply adding the <tt><a href="#cl::Required">cl::Required</a></tt> flag, -the CommandLine library will automatically issue an error if the argument is not -specified, which shifts all of the command line option verification code out of -your application into the library. This is just one example of how using flags -can alter the default behaviour of the library, on a per-option basis. By -adding one of the declarations above, the <tt>-help</tt> option synopsis is now -extended to:</p> - -<div class="doc_code"><pre> -USAGE: compiler [options] <b><input file></b> - -OPTIONS: - -help - display available options (-help-hidden for more) - -o <filename> - Specify output filename -</pre></div> - -<p>... indicating that an input filename is expected.</p> - -<!-- ======================================================================= --> -<h3> - <a name="bool">Boolean Arguments</a> -</h3> - -<div> - -<p>In addition to input and output filenames, we would like the compiler example -to support three boolean flags: "<tt>-f</tt>" to force writing binary output to -a terminal, "<tt>--quiet</tt>" to enable quiet mode, and "<tt>-q</tt>" for -backwards compatibility with some of our users. 
We can support these by -declaring options of boolean type like this:</p> - -<div class="doc_code"><pre> -<a href="#cl::opt">cl::opt</a><bool> Force ("<i>f</i>", <a href="#cl::desc">cl::desc</a>("<i>Enable binary output on terminals</i>")); -<a href="#cl::opt">cl::opt</a><bool> Quiet ("<i>quiet</i>", <a href="#cl::desc">cl::desc</a>("<i>Don't print informational messages</i>")); -<a href="#cl::opt">cl::opt</a><bool> Quiet2("<i>q</i>", <a href="#cl::desc">cl::desc</a>("<i>Don't print informational messages</i>"), <a href="#cl::Hidden">cl::Hidden</a>); -</pre></div> - -<p>This does what you would expect: it declares three boolean variables -("<tt>Force</tt>", "<tt>Quiet</tt>", and "<tt>Quiet2</tt>") to recognize these -options. Note that the "<tt>-q</tt>" option is specified with the "<a -href="#cl::Hidden"><tt>cl::Hidden</tt></a>" flag. This modifier prevents it -from being shown by the standard "<tt>-help</tt>" output (note that it is still -shown in the "<tt>-help-hidden</tt>" output).</p> - -<p>The CommandLine library uses a <a href="#builtinparsers">different parser</a> -for different data types. For example, in the string case, the argument passed -to the option is copied literally into the content of the string variable... we -obviously cannot do that in the boolean case, however, so we must use a smarter -parser. In the case of the boolean parser, it allows no options (in which case -it assigns the value of true to the variable), or it allows the values -"<tt>true</tt>" or "<tt>false</tt>" to be specified, allowing any of the -following inputs:</p> - -<div class="doc_code"><pre> - compiler -f # No value, 'Force' == true - compiler -f=true # Value specified, 'Force' == true - compiler -f=TRUE # Value specified, 'Force' == true - compiler -f=FALSE # Value specified, 'Force' == false -</pre></div> - -<p>... you get the idea. The <a href="#boolparser">bool parser</a> just turns -the string values into boolean values, and rejects things like '<tt>compiler --f=foo</tt>'. Similarly, the <a href="#doubleparser">float</a>, <a -href="#doubleparser">double</a>, and <a href="#intparser">int</a> parsers work -like you would expect, using the '<tt>strtol</tt>' and '<tt>strtod</tt>' C -library calls to parse the string value into the specified data type.</p> - -<p>With the declarations above, "<tt>compiler -help</tt>" emits this:</p> - -<div class="doc_code"><pre> -USAGE: compiler [options] <input file> - -OPTIONS: - <b>-f - Enable binary output on terminals</b> - -o - Override output filename - <b>-quiet - Don't print informational messages</b> - -help - display available options (-help-hidden for more) -</pre></div> - -<p>and "<tt>compiler -help-hidden</tt>" prints this:</p> - -<div class="doc_code"><pre> -USAGE: compiler [options] <input file> - -OPTIONS: - -f - Enable binary output on terminals - -o - Override output filename - <b>-q - Don't print informational messages</b> - -quiet - Don't print informational messages - -help - display available options (-help-hidden for more) -</pre></div> - -<p>This brief example has shown you how to use the '<tt><a -href="#cl::opt">cl::opt</a></tt>' class to parse simple scalar command line -arguments. 
In addition to simple scalar arguments, the CommandLine library also -provides primitives to support CommandLine option <a href="#alias">aliases</a>, -and <a href="#list">lists</a> of options.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="alias">Argument Aliases</a> -</h3> - -<div> - -<p>So far, the example works well, except for the fact that we need to check the -quiet condition like this now:</p> - -<div class="doc_code"><pre> -... - if (!Quiet && !Quiet2) printInformationalMessage(...); -... -</pre></div> - -<p>... which is a real pain! Instead of defining two values for the same -condition, we can use the "<tt><a href="#cl::alias">cl::alias</a></tt>" class to make the "<tt>-q</tt>" -option an <b>alias</b> for the "<tt>-quiet</tt>" option, instead of providing -a value itself:</p> - -<div class="doc_code"><pre> -<a href="#cl::opt">cl::opt</a><bool> Force ("<i>f</i>", <a href="#cl::desc">cl::desc</a>("<i>Overwrite output files</i>")); -<a href="#cl::opt">cl::opt</a><bool> Quiet ("<i>quiet</i>", <a href="#cl::desc">cl::desc</a>("<i>Don't print informational messages</i>")); -<a href="#cl::alias">cl::alias</a> QuietA("<i>q</i>", <a href="#cl::desc">cl::desc</a>("<i>Alias for -quiet</i>"), <a href="#cl::aliasopt">cl::aliasopt</a>(Quiet)); -</pre></div> - -<p>The third line (which is the only one we modified from above) defines a -"<tt>-q</tt>" alias that updates the "<tt>Quiet</tt>" variable (as specified by -the <tt><a href="#cl::aliasopt">cl::aliasopt</a></tt> modifier) whenever it is -specified. Because aliases do not hold state, the only thing the program has to -query is the <tt>Quiet</tt> variable now. Another nice feature of aliases is -that they automatically hide themselves from the <tt>-help</tt> output -(although, again, they are still visible in the <tt>-help-hidden -output</tt>).</p> - -<p>Now the application code can simply use:</p> - -<div class="doc_code"><pre> -... - if (!Quiet) printInformationalMessage(...); -... -</pre></div> - -<p>... which is much nicer! The "<tt><a href="#cl::alias">cl::alias</a></tt>" -can be used to specify an alternative name for any variable type, and has many -uses.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="onealternative">Selecting an alternative from a set of - possibilities</a> -</h3> - -<div> - -<p>So far we have seen how the CommandLine library handles builtin types like -<tt>std::string</tt>, <tt>bool</tt> and <tt>int</tt>, but how does it handle -things it doesn't know about, like enums or '<tt>int*</tt>'s?</p> - -<p>The answer is that it uses a table-driven generic parser (unless you specify -your own parser, as described in the <a href="#extensionguide">Extension -Guide</a>). This parser maps literal strings to whatever type is required, and -requires you to tell it what this mapping should be.</p> - -<p>Let's say that we would like to add four optimization levels to our -optimizer, using the standard flags "<tt>-g</tt>", "<tt>-O0</tt>", -"<tt>-O1</tt>", and "<tt>-O2</tt>". We could easily implement this with boolean -options like above, but there are several problems with this strategy:</p> - -<ol> -<li>A user could specify more than one of the options at a time, for example, -"<tt>compiler -O3 -O2</tt>". 
The CommandLine library would not be able to -catch this erroneous input for us.</li> - -<li>We would have to test 4 different variables to see which ones are set.</li> - -<li>This doesn't map to the numeric levels that we want... so we cannot easily -see if some level >= "<tt>-O1</tt>" is enabled.</li> - -</ol> - -<p>To cope with these problems, we can use an enum value, and have the -CommandLine library fill it in with the appropriate level directly, which is -used like this:</p> - -<div class="doc_code"><pre> -enum OptLevel { - g, O1, O2, O3 -}; - -<a href="#cl::opt">cl::opt</a><OptLevel> OptimizationLevel(<a href="#cl::desc">cl::desc</a>("<i>Choose optimization level:</i>"), - <a href="#cl::values">cl::values</a>( - clEnumVal(g , "<i>No optimizations, enable debugging</i>"), - clEnumVal(O1, "<i>Enable trivial optimizations</i>"), - clEnumVal(O2, "<i>Enable default optimizations</i>"), - clEnumVal(O3, "<i>Enable expensive optimizations</i>"), - clEnumValEnd)); - -... - if (OptimizationLevel >= O2) doPartialRedundancyElimination(...); -... -</pre></div> - -<p>This declaration defines a variable "<tt>OptimizationLevel</tt>" of the -"<tt>OptLevel</tt>" enum type. This variable can be assigned any of the values -that are listed in the declaration (Note that the declaration list must be -terminated with the "<tt>clEnumValEnd</tt>" argument!). The CommandLine -library enforces -that the user can only specify one of the options, and it ensure that only valid -enum values can be specified. The "<tt>clEnumVal</tt>" macros ensure that the -command line arguments matched the enum values. With this option added, our -help output now is:</p> - -<div class="doc_code"><pre> -USAGE: compiler [options] <input file> - -OPTIONS: - <b>Choose optimization level: - -g - No optimizations, enable debugging - -O1 - Enable trivial optimizations - -O2 - Enable default optimizations - -O3 - Enable expensive optimizations</b> - -f - Enable binary output on terminals - -help - display available options (-help-hidden for more) - -o <filename> - Specify output filename - -quiet - Don't print informational messages -</pre></div> - -<p>In this case, it is sort of awkward that flag names correspond directly to -enum names, because we probably don't want a enum definition named "<tt>g</tt>" -in our program. Because of this, we can alternatively write this example like -this:</p> - -<div class="doc_code"><pre> -enum OptLevel { - Debug, O1, O2, O3 -}; - -<a href="#cl::opt">cl::opt</a><OptLevel> OptimizationLevel(<a href="#cl::desc">cl::desc</a>("<i>Choose optimization level:</i>"), - <a href="#cl::values">cl::values</a>( - clEnumValN(Debug, "g", "<i>No optimizations, enable debugging</i>"), - clEnumVal(O1 , "<i>Enable trivial optimizations</i>"), - clEnumVal(O2 , "<i>Enable default optimizations</i>"), - clEnumVal(O3 , "<i>Enable expensive optimizations</i>"), - clEnumValEnd)); - -... - if (OptimizationLevel == Debug) outputDebugInfo(...); -... -</pre></div> - -<p>By using the "<tt>clEnumValN</tt>" macro instead of "<tt>clEnumVal</tt>", we -can directly specify the name that the flag should get. In general a direct -mapping is nice, but sometimes you can't or don't want to preserve the mapping, -which is when you would use it.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="namedalternatives">Named Alternatives</a> -</h3> - -<div> - -<p>Another useful argument form is a named alternative style. 
We shall use this -style in our compiler to specify different debug levels that can be used. -Instead of each debug level being its own switch, we want to support the -following options, of which only one can be specified at a time: -"<tt>--debug-level=none</tt>", "<tt>--debug-level=quick</tt>", -"<tt>--debug-level=detailed</tt>". To do this, we use the exact same format as -our optimization level flags, but we also specify an option name. For this -case, the code looks like this:</p> - -<div class="doc_code"><pre> -enum DebugLev { - nodebuginfo, quick, detailed -}; - -// Enable Debug Options to be specified on the command line -<a href="#cl::opt">cl::opt</a><DebugLev> DebugLevel("<i>debug_level</i>", <a href="#cl::desc">cl::desc</a>("<i>Set the debugging level:</i>"), - <a href="#cl::values">cl::values</a>( - clEnumValN(nodebuginfo, "none", "<i>disable debug information</i>"), - clEnumVal(quick, "<i>enable quick debug information</i>"), - clEnumVal(detailed, "<i>enable detailed debug information</i>"), - clEnumValEnd)); -</pre></div> - -<p>This definition defines an enumerated command line variable of type "<tt>enum -DebugLev</tt>", which works exactly the same way as before. The difference here -is just the interface exposed to the user of your program and the help output by -the "<tt>-help</tt>" option:</p> - -<div class="doc_code"><pre> -USAGE: compiler [options] <input file> - -OPTIONS: - Choose optimization level: - -g - No optimizations, enable debugging - -O1 - Enable trivial optimizations - -O2 - Enable default optimizations - -O3 - Enable expensive optimizations - <b>-debug_level - Set the debugging level: - =none - disable debug information - =quick - enable quick debug information - =detailed - enable detailed debug information</b> - -f - Enable binary output on terminals - -help - display available options (-help-hidden for more) - -o <filename> - Specify output filename - -quiet - Don't print informational messages -</pre></div> - -<p>Again, the only structural difference between the debug level declaration and -the optimization level declaration is that the debug level declaration includes -an option name (<tt>"debug_level"</tt>), which automatically changes how the -library processes the argument. The CommandLine library supports both forms so -that you can choose the form most appropriate for your application.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="list">Parsing a list of options</a> -</h3> - -<div> - -<p>Now that we have the standard run-of-the-mill argument types out of the way, -lets get a little wild and crazy. Lets say that we want our optimizer to accept -a <b>list</b> of optimizations to perform, allowing duplicates. For example, we -might want to run: "<tt>compiler -dce -constprop -inline -dce -strip</tt>". In -this case, the order of the arguments and the number of appearances is very -important. This is what the "<tt><a href="#cl::list">cl::list</a></tt>" -template is for. 
First, start by defining an enum of the optimizations that you -would like to perform:</p> - -<div class="doc_code"><pre> -enum Opts { - // 'inline' is a C++ keyword, so name it 'inlining' - dce, constprop, inlining, strip -}; -</pre></div> - -<p>Then define your "<tt><a href="#cl::list">cl::list</a></tt>" variable:</p> - -<div class="doc_code"><pre> -<a href="#cl::list">cl::list</a><Opts> OptimizationList(<a href="#cl::desc">cl::desc</a>("<i>Available Optimizations:</i>"), - <a href="#cl::values">cl::values</a>( - clEnumVal(dce , "<i>Dead Code Elimination</i>"), - clEnumVal(constprop , "<i>Constant Propagation</i>"), - clEnumValN(inlining, "<i>inline</i>", "<i>Procedure Integration</i>"), - clEnumVal(strip , "<i>Strip Symbols</i>"), - clEnumValEnd)); -</pre></div> - -<p>This defines a variable that is conceptually of the type -"<tt>std::vector<enum Opts></tt>". Thus, you can access it with standard -vector methods:</p> - -<div class="doc_code"><pre> - for (unsigned i = 0; i != OptimizationList.size(); ++i) - switch (OptimizationList[i]) - ... -</pre></div> - -<p>... to iterate through the list of options specified.</p> - -<p>Note that the "<tt><a href="#cl::list">cl::list</a></tt>" template is -completely general and may be used with any data types or other arguments that -you can use with the "<tt><a href="#cl::opt">cl::opt</a></tt>" template. One -especially useful way to use a list is to capture all of the positional -arguments together if there may be more than one specified. In the case of a -linker, for example, the linker takes several '<tt>.o</tt>' files, and needs to -capture them into a list. This is naturally specified as:</p> - -<div class="doc_code"><pre> -... -<a href="#cl::list">cl::list</a><std::string> InputFilenames(<a href="#cl::Positional">cl::Positional</a>, <a href="#cl::desc">cl::desc</a>("<Input files>"), <a href="#cl::OneOrMore">cl::OneOrMore</a>); -... -</pre></div> - -<p>This variable works just like a "<tt>vector<string></tt>" object. As -such, accessing the list is simple, just like above. In this example, we used -the <tt><a href="#cl::OneOrMore">cl::OneOrMore</a></tt> modifier to inform the -CommandLine library that it is an error if the user does not specify any -<tt>.o</tt> files on our command line. Again, this just reduces the amount of -checking we have to do.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="bits">Collecting options as a set of flags</a> -</h3> - -<div> - -<p>Instead of collecting sets of options in a list, it is also possible to -gather information for enum values in a <b>bit vector</b>. The representation used by -the <a href="#bits"><tt>cl::bits</tt></a> class is an <tt>unsigned</tt> -integer. An enum value is represented by a 0/1 in the enum's ordinal value bit -position. 1 indicating that the enum was specified, 0 otherwise. As each -specified value is parsed, the resulting enum's bit is set in the option's bit -vector:</p> - -<div class="doc_code"><pre> - <i>bits</i> |= 1 << (unsigned)<i>enum</i>; -</pre></div> - -<p>Options that are specified multiple times are redundant. 
Any instances after -the first are discarded.</p> - -<p>Reworking the above list example, we could replace <a href="#list"> -<tt>cl::list</tt></a> with <a href="#bits"><tt>cl::bits</tt></a>:</p> - -<div class="doc_code"><pre> -<a href="#cl::bits">cl::bits</a><Opts> OptimizationBits(<a href="#cl::desc">cl::desc</a>("<i>Available Optimizations:</i>"), - <a href="#cl::values">cl::values</a>( - clEnumVal(dce , "<i>Dead Code Elimination</i>"), - clEnumVal(constprop , "<i>Constant Propagation</i>"), - clEnumValN(inlining, "<i>inline</i>", "<i>Procedure Integration</i>"), - clEnumVal(strip , "<i>Strip Symbols</i>"), - clEnumValEnd)); -</pre></div> - -<p>To test to see if <tt>constprop</tt> was specified, we can use the -<tt>cl:bits::isSet</tt> function:</p> - -<div class="doc_code"><pre> - if (OptimizationBits.isSet(constprop)) { - ... - } -</pre></div> - -<p>It's also possible to get the raw bit vector using the -<tt>cl::bits::getBits</tt> function:</p> - -<div class="doc_code"><pre> - unsigned bits = OptimizationBits.getBits(); -</pre></div> - -<p>Finally, if external storage is used, then the location specified must be of -<b>type</b> <tt>unsigned</tt>. In all other ways a <a -href="#bits"><tt>cl::bits</tt></a> option is equivalent to a <a -href="#list"> <tt>cl::list</tt></a> option.</p> - -</div> - - -<!-- ======================================================================= --> -<h3> - <a name="description">Adding freeform text to help output</a> -</h3> - -<div> - -<p>As our program grows and becomes more mature, we may decide to put summary -information about what it does into the help output. The help output is styled -to look similar to a Unix <tt>man</tt> page, providing concise information about -a program. Unix <tt>man</tt> pages, however often have a description about what -the program does. To add this to your CommandLine program, simply pass a third -argument to the <a -href="#cl::ParseCommandLineOptions"><tt>cl::ParseCommandLineOptions</tt></a> -call in main. This additional argument is then printed as the overview -information for your program, allowing you to include any additional information -that you want. For example:</p> - -<div class="doc_code"><pre> -int main(int argc, char **argv) { - <a href="#cl::ParseCommandLineOptions">cl::ParseCommandLineOptions</a>(argc, argv, " CommandLine compiler example\n\n" - " This program blah blah blah...\n"); - ... -} -</pre></div> - -<p>would yield the help output:</p> - -<div class="doc_code"><pre> -<b>OVERVIEW: CommandLine compiler example - - This program blah blah blah...</b> - -USAGE: compiler [options] <input file> - -OPTIONS: - ... - -help - display available options (-help-hidden for more) - -o <filename> - Specify output filename -</pre></div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="referenceguide">Reference Guide</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Now that you know the basics of how to use the CommandLine library, this -section will give you the detailed information you need to tune how command line -options work, as well as information on more "advanced" command line option -processing capabilities.</p> - -<!-- ======================================================================= --> -<h3> - <a name="positional">Positional Arguments</a> -</h3> - -<div> - -<p>Positional arguments are those arguments that are not named, and are not -specified with a hyphen. 
Positional arguments should be used when an option is -specified by its position alone. For example, the standard Unix <tt>grep</tt> -tool takes a regular expression argument, and an optional filename to search -through (which defaults to standard input if a filename is not specified). -Using the CommandLine library, this would be specified as:</p> - -<div class="doc_code"><pre> -<a href="#cl::opt">cl::opt</a><string> Regex (<a href="#cl::Positional">cl::Positional</a>, <a href="#cl::desc">cl::desc</a>("<i><regular expression></i>"), <a href="#cl::Required">cl::Required</a>); -<a href="#cl::opt">cl::opt</a><string> Filename(<a href="#cl::Positional">cl::Positional</a>, <a href="#cl::desc">cl::desc</a>("<i><input file></i>"), <a href="#cl::init">cl::init</a>("<i>-</i>")); -</pre></div> - -<p>Given these two option declarations, the <tt>-help</tt> output for our grep -replacement would look like this:</p> - -<div class="doc_code"><pre> -USAGE: spiffygrep [options] <b><regular expression> <input file></b> - -OPTIONS: - -help - display available options (-help-hidden for more) -</pre></div> - -<p>... and the resultant program could be used just like the standard -<tt>grep</tt> tool.</p> - -<p>Positional arguments are sorted by their order of construction. This means -that command line options will be ordered according to how they are listed in a -.cpp file, but will not have an ordering defined if the positional arguments -are defined in multiple .cpp files. The fix for this problem is simply to -define all of your positional arguments in one .cpp file.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="--">Specifying positional options with hyphens</a> -</h4> - -<div> - -<p>Sometimes you may want to specify a value to your positional argument that -starts with a hyphen (for example, searching for '<tt>-foo</tt>' in a file). At -first, you will have trouble doing this, because it will try to find an argument -named '<tt>-foo</tt>', and will fail (and single quotes will not save you). -Note that the system <tt>grep</tt> has the same problem:</p> - -<div class="doc_code"><pre> - $ spiffygrep '-foo' test.txt - Unknown command line argument '-foo'. Try: spiffygrep -help' - - $ grep '-foo' test.txt - grep: illegal option -- f - grep: illegal option -- o - grep: illegal option -- o - Usage: grep -hblcnsviw pattern file . . . -</pre></div> - -<p>The solution for this problem is the same for both your tool and the system -version: use the '<tt>--</tt>' marker. When the user specifies '<tt>--</tt>' on -the command line, it is telling the program that all options after the -'<tt>--</tt>' should be treated as positional arguments, not options. Thus, we -can use it like this:</p> - -<div class="doc_code"><pre> - $ spiffygrep -- -foo test.txt - ...output... -</pre></div> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="getPosition">Determining absolute position with getPosition()</a> -</h4> -<div> - <p>Sometimes an option can affect or modify the meaning of another option. For - example, consider <tt>gcc</tt>'s <tt>-x LANG</tt> option. This tells - <tt>gcc</tt> to ignore the suffix of subsequent positional arguments and force - the file to be interpreted as if it contained source code in language - <tt>LANG</tt>. In order to handle this properly, you need to know the - absolute position of each argument, especially those in lists, so their - interaction(s) can be applied correctly. 
This is also useful for options like - <tt>-llibname</tt> which is actually a positional argument that starts with - a dash.</p> - <p>So, generally, the problem is that you have two <tt>cl::list</tt> variables - that interact in some way. To ensure the correct interaction, you can use the - <tt>cl::list::getPosition(optnum)</tt> method. This method returns the - absolute position (as found on the command line) of the <tt>optnum</tt> - item in the <tt>cl::list</tt>.</p> - <p>The idiom for usage is like this:</p> - - <div class="doc_code"><pre> - static cl::list<std::string> Files(cl::Positional, cl::OneOrMore); - static cl::list<std::string> Libraries("l", cl::ZeroOrMore); - - int main(int argc, char**argv) { - // ... - std::vector<std::string>::iterator fileIt = Files.begin(); - std::vector<std::string>::iterator libIt = Libraries.begin(); - unsigned libPos = 0, filePos = 0; - while ( 1 ) { - if ( libIt != Libraries.end() ) - libPos = Libraries.getPosition( libIt - Libraries.begin() ); - else - libPos = 0; - if ( fileIt != Files.end() ) - filePos = Files.getPosition( fileIt - Files.begin() ); - else - filePos = 0; - - if ( filePos != 0 && (libPos == 0 || filePos < libPos) ) { - // Source File Is next - ++fileIt; - } - else if ( libPos != 0 && (filePos == 0 || libPos < filePos) ) { - // Library is next - ++libIt; - } - else - break; // we're done with the list - } - }</pre></div> - - <p>Note that, for compatibility reasons, the <tt>cl::opt</tt> also supports an - <tt>unsigned getPosition()</tt> option that will provide the absolute position - of that option. You can apply the same approach as above with a - <tt>cl::opt</tt> and a <tt>cl::list</tt> option as you can with two lists.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="cl::ConsumeAfter">The <tt>cl::ConsumeAfter</tt> modifier</a> -</h4> - -<div> - -<p>The <tt>cl::ConsumeAfter</tt> <a href="#formatting">formatting option</a> is -used to construct programs that use "interpreter style" option processing. With -this style of option processing, all arguments specified after the last -positional argument are treated as special interpreter arguments that are not -interpreted by the command line argument.</p> - -<p>As a concrete example, lets say we are developing a replacement for the -standard Unix Bourne shell (<tt>/bin/sh</tt>). To run <tt>/bin/sh</tt>, first -you specify options to the shell itself (like <tt>-x</tt> which turns on trace -output), then you specify the name of the script to run, then you specify -arguments to the script. These arguments to the script are parsed by the Bourne -shell command line option processor, but are not interpreted as options to the -shell itself. 
Using the CommandLine library, we would specify this as:</p> - -<div class="doc_code"><pre> -<a href="#cl::opt">cl::opt</a><string> Script(<a href="#cl::Positional">cl::Positional</a>, <a href="#cl::desc">cl::desc</a>("<i><input script></i>"), <a href="#cl::init">cl::init</a>("-")); -<a href="#cl::list">cl::list</a><string> Argv(<a href="#cl::ConsumeAfter">cl::ConsumeAfter</a>, <a href="#cl::desc">cl::desc</a>("<i><program arguments>...</i>")); -<a href="#cl::opt">cl::opt</a><bool> Trace("<i>x</i>", <a href="#cl::desc">cl::desc</a>("<i>Enable trace output</i>")); -</pre></div> - -<p>which automatically provides the help output:</p> - -<div class="doc_code"><pre> -USAGE: spiffysh [options] <b><input script> <program arguments>...</b> - -OPTIONS: - -help - display available options (-help-hidden for more) - <b>-x - Enable trace output</b> -</pre></div> - -<p>At runtime, if we run our new shell replacement as `<tt>spiffysh -x test.sh --a -x -y bar</tt>', the <tt>Trace</tt> variable will be set to true, the -<tt>Script</tt> variable will be set to "<tt>test.sh</tt>", and the -<tt>Argv</tt> list will contain <tt>["-a", "-x", "-y", "bar"]</tt>, because they -were specified after the last positional argument (which is the script -name).</p> - -<p>There are several limitations to when <tt>cl::ConsumeAfter</tt> options can -be specified. For example, only one <tt>cl::ConsumeAfter</tt> can be specified -per program, there must be at least one <a href="#positional">positional -argument</a> specified, there must not be any <a href="#cl::list">cl::list</a> -positional arguments, and the <tt>cl::ConsumeAfter</tt> option should be a <a -href="#cl::list">cl::list</a> option.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="storage">Internal vs External Storage</a> -</h3> - -<div> - -<p>By default, all command line options automatically hold the value that they -parse from the command line. This is very convenient in the common case, -especially when combined with the ability to define command line options in the -files that use them. This is called the internal storage model.</p> - -<p>Sometimes, however, it is nice to separate the command line option processing -code from the storage of the value parsed. For example, let's say that we have a -'<tt>-debug</tt>' option that we would like to use to enable debug information -across the entire body of our program. In this case, the boolean value -controlling the debug code should be globally accessible (in a header file, for -example) yet the command line option processing code should not be exposed to -all of these clients (requiring lots of .cpp files to #include -<tt>CommandLine.h</tt>).</p> - -<p>To do this, set up your .h file with your option, like this for example:</p> - -<div class="doc_code"> -<pre> -<i>// DebugFlag.h - Get access to the '-debug' command line option -// - -// DebugFlag - This boolean is set to true if the '-debug' command line option -// is specified. This should probably not be referenced directly, instead, use -// the DEBUG macro below. -//</i> -extern bool DebugFlag; - -<i>// DEBUG macro - This macro should be used by code to emit debug information. -// If the '-debug' option is specified on the command line, and if this is a -// debug build, then the code specified as the option to the macro will be -// executed.
Otherwise it will not be.</i> -<span class="doc_hilite">#ifdef NDEBUG -#define DEBUG(X) -#else -#define DEBUG(X)</span> do { if (DebugFlag) { X; } } while (0) -<span class="doc_hilite">#endif</span> -</pre> -</div> - -<p>This allows clients to blissfully use the <tt>DEBUG()</tt> macro, or the -<tt>DebugFlag</tt> explicitly if they want to. Now we just need to be able to -set the <tt>DebugFlag</tt> boolean when the option is set. To do this, we pass -an additional argument to our command line argument processor, and we specify -where to fill in with the <a href="#cl::location">cl::location</a> -attribute:</p> - -<div class="doc_code"> -<pre> -bool DebugFlag; <i>// the actual value</i> -static <a href="#cl::opt">cl::opt</a><bool, true> <i>// The parser</i> -Debug("<i>debug</i>", <a href="#cl::desc">cl::desc</a>("<i>Enable debug output</i>"), <a href="#cl::Hidden">cl::Hidden</a>, <a href="#cl::location">cl::location</a>(DebugFlag)); -</pre> -</div> - -<p>In the above example, we specify "<tt>true</tt>" as the second argument to -the <tt><a href="#cl::opt">cl::opt</a></tt> template, indicating that the -template should not maintain a copy of the value itself. In addition to this, -we specify the <tt><a href="#cl::location">cl::location</a></tt> attribute, so -that <tt>DebugFlag</tt> is automatically set.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="attributes">Option Attributes</a> -</h3> - -<div> - -<p>This section describes the basic attributes that you can specify on -options.</p> - -<ul> - -<li>The option name attribute (which is required for all options, except <a -href="#positional">positional options</a>) specifies what the option name is. -This name is specified in simple double quotes: - -<pre> -<a href="#cl::opt">cl::opt</a><<b>bool</b>> Quiet("<i>quiet</i>"); -</pre> - -</li> - -<li><a name="cl::desc">The <b><tt>cl::desc</tt></b></a> attribute specifies a -description for the option to be shown in the <tt>-help</tt> output for the -program.</li> - -<li><a name="cl::value_desc">The <b><tt>cl::value_desc</tt></b></a> attribute -specifies a string that can be used to fine tune the <tt>-help</tt> output for -a command line option. Look <a href="#value_desc_example">here</a> for an -example.</li> - -<li><a name="cl::init">The <b><tt>cl::init</tt></b></a> attribute specifies an -initial value for a <a href="#cl::opt">scalar</a> option. If this attribute is -not specified, then the command line option value defaults to the value created -by the default constructor for the type. <b>Warning</b>: If you specify both -<b><tt>cl::init</tt></b> and <b><tt>cl::location</tt></b> for an option, -you must specify <b><tt>cl::location</tt></b> first, so that when the -command-line parser sees <b><tt>cl::init</tt></b>, it knows where to put the -initial value. (You will get an error at runtime if you don't put them in -the right order.)</li> - -<li><a name="cl::location">The <b><tt>cl::location</tt></b></a> attribute specifies where -to store the value for a parsed command line option if using external storage. -See the section on <a href="#storage">Internal vs External Storage</a> for more -information.</li> - -<li><a name="cl::aliasopt">The <b><tt>cl::aliasopt</tt></b></a> attribute -specifies which option a <tt><a href="#cl::alias">cl::alias</a></tt> option is -an alias for.</li> - -<li><a name="cl::values">The <b><tt>cl::values</tt></b></a> attribute specifies -the string-to-value mapping to be used by the generic parser.
It takes a -<b>clEnumValEnd-terminated</b> list of (option, value, description) triplets -that specify the option name, the value mapped to, and the description shown in the -<tt>-help</tt> output for the tool. Because the generic parser is used most -frequently with enum values, two macros are often useful: - -<ol> - -<li><a name="clEnumVal">The <b><tt>clEnumVal</tt></b></a> macro is used as a -nice, simple way to specify a triplet for an enum. This macro automatically -makes the option name be the same as the enum name. The first argument to the -macro is the enum value, the second is the description for the command line -option.</li> - -<li><a name="clEnumValN">The <b><tt>clEnumValN</tt></b></a> macro is used to -specify options where the option name doesn't equal the enum name. For -this macro, the first argument is the enum value, the second is the flag name, -and the third is the description.</li> - -</ol> - -You will get a compile time error if you try to use cl::values with a parser -that does not support it.</li> - -<li><a name="cl::multi_val">The <b><tt>cl::multi_val</tt></b></a> -attribute specifies that this option takes multiple values -(example: <tt>-sectalign segname sectname sectvalue</tt>). This -attribute takes one unsigned argument - the number of values for the -option. This attribute is valid only on <tt>cl::list</tt> options (and -will fail with a compile error if you try to use it with other option -types). It is allowed to use all of the usual modifiers on -multi-valued options (besides <tt>cl::ValueDisallowed</tt>, -obviously).</li> - -</ul> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="modifiers">Option Modifiers</a> -</h3> - -<div> - -<p>Option modifiers are the flags and expressions that you pass into the -constructors for <tt><a href="#cl::opt">cl::opt</a></tt> and <tt><a -href="#cl::list">cl::list</a></tt>. These modifiers give you the ability to -tweak how options are parsed and how <tt>-help</tt> output is generated to fit -your application well.</p> - -<p>These options fall into five main categories:</p> - -<ol> -<li><a href="#hiding">Hiding an option from <tt>-help</tt> output</a></li> -<li><a href="#numoccurrences">Controlling the number of occurrences - required and allowed</a></li> -<li><a href="#valrequired">Controlling whether or not a value must be - specified</a></li> -<li><a href="#formatting">Controlling other formatting options</a></li> -<li><a href="#misc">Miscellaneous option modifiers</a></li> -</ol> - -<p>It is not possible to specify two modifiers from the same category (you'll get -a runtime error) on a single option, except for modifiers in the miscellaneous -category.
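-<p>For example, the following declaration (a hypothetical sketch, not taken from a real
-tool) combines one modifier from each of three different categories, which is perfectly
-legal: <tt>cl::Required</tt> comes from the occurrences category,
-<tt>cl::ValueRequired</tt> from the value category, and <tt>cl::Hidden</tt> from the
-hiding category:</p>
-
-<div class="doc_code"><pre>
-<a href="#cl::opt">cl::opt</a><string> Config("<i>config</i>", <a href="#cl::Required">cl::Required</a>, <a href="#cl::ValueRequired">cl::ValueRequired</a>, <a href="#cl::Hidden">cl::Hidden</a>);
-</pre></div>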
The CommandLine library specifies defaults for all of these settings -that are the most useful in practice and the most common, which means that you -usually shouldn't have to worry about these.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="hiding">Hiding an option from <tt>-help</tt> output</a> -</h4> - -<div> - -<p>The <tt>cl::NotHidden</tt>, <tt>cl::Hidden</tt>, and -<tt>cl::ReallyHidden</tt> modifiers are used to control whether or not an option -appears in the <tt>-help</tt> and <tt>-help-hidden</tt> output for the -compiled program:</p> - -<ul> - -<li><a name="cl::NotHidden">The <b><tt>cl::NotHidden</tt></b></a> modifier -(which is the default for <tt><a href="#cl::opt">cl::opt</a></tt> and <tt><a -href="#cl::list">cl::list</a></tt> options) indicates the option is to appear -in both help listings.</li> - -<li><a name="cl::Hidden">The <b><tt>cl::Hidden</tt></b></a> modifier (which is the -default for <tt><a href="#cl::alias">cl::alias</a></tt> options) indicates that -the option should not appear in the <tt>-help</tt> output, but should appear in -the <tt>-help-hidden</tt> output.</li> - -<li><a name="cl::ReallyHidden">The <b><tt>cl::ReallyHidden</tt></b></a> modifier -indicates that the option should not appear in any help output.</li> - -</ul> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="numoccurrences">Controlling the number of occurrences required and - allowed</a> -</h4> - -<div> - -<p>This group of options is used to control how many times an option is allowed -(or required) to be specified on the command line of your program. Specifying a -value for this setting allows the CommandLine library to do error checking for -you.</p> - -<p>The allowed values for this option group are:</p> - -<ul> - -<li><a name="cl::Optional">The <b><tt>cl::Optional</tt></b></a> modifier (which -is the default for the <tt><a href="#cl::opt">cl::opt</a></tt> and <tt><a -href="#cl::alias">cl::alias</a></tt> classes) indicates that your program will -allow either zero or one occurrence of the option to be specified.</li> - -<li><a name="cl::ZeroOrMore">The <b><tt>cl::ZeroOrMore</tt></b></a> modifier -(which is the default for the <tt><a href="#cl::list">cl::list</a></tt> class) -indicates that your program will allow the option to be specified zero or more -times.</li> - -<li><a name="cl::Required">The <b><tt>cl::Required</tt></b></a> modifier -indicates that the option must be specified exactly one time.</li> - -<li><a name="cl::OneOrMore">The <b><tt>cl::OneOrMore</tt></b></a> modifier -indicates that the option must be specified at least one time.</li> - -<li>The <b><tt>cl::ConsumeAfter</tt></b> modifier is described in the <a -href="#positional">Positional arguments section</a>.</li> - -</ul> - -<p>If an option is not specified, then the value of the option is equal to the -value specified by the <tt><a href="#cl::init">cl::init</a></tt> attribute.
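-<p>For example, given a (purely illustrative) declaration such as the following, running
-the program without <tt>-threshold</tt> on the command line leaves <tt>Threshold</tt>
-equal to 10:</p>
-
-<div class="doc_code"><pre>
-<a href="#cl::opt">cl::opt</a><unsigned> Threshold("<i>threshold</i>", <a href="#cl::desc">cl::desc</a>("<i>Threshold for some hypothetical heuristic</i>"), <a href="#cl::init">cl::init</a>(10));
-</pre></div>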
If -the <tt><a href="#cl::init">cl::init</a></tt> attribute is not specified, the -option value is initialized with the default constructor for the data type.</p> - -<p>If an option of the <tt><a -href="#cl::opt">cl::opt</a></tt> class is specified multiple times, only the last value will be -retained.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="valrequired">Controlling whether or not a value must be specified</a> -</h4> - -<div> - -<p>This group of options is used to control whether or not the option allows a -value to be present. In the case of the CommandLine library, a value is either -specified with an equal sign (e.g. '<tt>-index-depth=17</tt>') or as a trailing -string (e.g. '<tt>-o a.out</tt>').</p> - -<p>The allowed values for this option group are:</p> - -<ul> - -<li><a name="cl::ValueOptional">The <b><tt>cl::ValueOptional</tt></b></a> modifier -(which is the default for <tt>bool</tt> typed options) specifies that it is -acceptable to have a value, or not. A boolean argument can be enabled just by -appearing on the command line, or it can have an explicit '<tt>-foo=true</tt>'. -If an option is specified with this mode, it is illegal for the value to be -provided without the equal sign. Therefore '<tt>-foo true</tt>' is illegal. To -get this behavior, you must use the <a -href="#cl::ValueRequired">cl::ValueRequired</a> modifier.</li> - -<li><a name="cl::ValueRequired">The <b><tt>cl::ValueRequired</tt></b></a> modifier -(which is the default for all other types except for <a -href="#onealternative">unnamed alternatives using the generic parser</a>) -specifies that a value must be provided. This mode informs the command line -library that if an option is not provided with an equal sign, the next -argument provided must be the value. This allows things like '<tt>-o -a.out</tt>' to work.</li> - -<li><a name="cl::ValueDisallowed">The <b><tt>cl::ValueDisallowed</tt></b></a> -modifier (which is the default for <a href="#onealternative">unnamed -alternatives using the generic parser</a>) indicates that it is a runtime error -for the user to specify a value. This can be provided to disallow users from -providing values to boolean options (like '<tt>-foo=true</tt>').</li> - -</ul> - -<p>In general, the default values for this option group work just like you would -want them to. As mentioned above, you can specify the <a -href="#cl::ValueDisallowed">cl::ValueDisallowed</a> modifier to a boolean -argument to restrict your command line parser. These options are mostly useful -when <a href="#extensionguide">extending the library</a>.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="formatting">Controlling other formatting options</a> -</h4> - -<div> - -<p>The formatting option group is used to specify that the command line option -has special abilities and is otherwise different from other command line -arguments. As usual, you can specify at most one of these arguments.</p> - -<ul> - -<li><a name="cl::NormalFormatting">The <b><tt>cl::NormalFormatting</tt></b></a> -modifier (which is the default for all options) specifies that this option is -"normal".</li> - -<li><a name="cl::Positional">The <b><tt>cl::Positional</tt></b></a> modifier -specifies that this is a positional argument that does not have a command line -option associated with it.
See the <a href="#positional">Positional -Arguments</a> section for more information.</li> - -<li>The <b><a href="#cl::ConsumeAfter"><tt>cl::ConsumeAfter</tt></a></b> modifier -specifies that this option is used to capture "interpreter style" arguments. See <a href="#cl::ConsumeAfter">this section for more information</a>.</li> - -<li><a name="cl::Prefix">The <b><tt>cl::Prefix</tt></b></a> modifier specifies -that this option prefixes its value. With 'Prefix' options, the equal sign does -not separate the value from the option name specified. Instead, the value is -everything after the prefix, including any equal sign if present. This is useful -for processing odd arguments like <tt>-lmalloc</tt> and <tt>-L/usr/lib</tt> in a -linker tool or <tt>-DNAME=value</tt> in a compiler tool. Here, the -'<tt>l</tt>', '<tt>D</tt>' and '<tt>L</tt>' options are normal string (or list) -options, that have the <b><tt><a href="#cl::Prefix">cl::Prefix</a></tt></b> -modifier added to allow the CommandLine library to recognize them. Note that -<b><tt><a href="#cl::Prefix">cl::Prefix</a></tt></b> options must not have the -<b><tt><a href="#cl::ValueDisallowed">cl::ValueDisallowed</a></tt></b> modifier -specified.</li> - -<li><a name="cl::Grouping">The <b><tt>cl::Grouping</tt></b></a> modifier is used -to implement Unix-style tools (like <tt>ls</tt>) that have lots of single letter -arguments, but only require a single dash. For example, the '<tt>ls -labF</tt>' -command actually enables four different options, all of which are single -letters. Note that <b><tt><a href="#cl::Grouping">cl::Grouping</a></tt></b> -options cannot have values.</li> - -</ul> - -<p>The CommandLine library does not restrict how you use the <b><tt><a -href="#cl::Prefix">cl::Prefix</a></tt></b> or <b><tt><a -href="#cl::Grouping">cl::Grouping</a></tt></b> modifiers, but it is possible to -specify ambiguous argument settings. Thus, it is possible to have multiple -letter options that are prefix or grouping options, and they will still work as -designed.</p> - -<p>To do this, the CommandLine library uses a greedy algorithm to parse the -input option into (potentially multiple) prefix and grouping options. The -strategy basically looks like this:</p> - -<div class="doc_code"><tt>parse(string OrigInput) {</tt> - -<ol> -<li><tt>string input = OrigInput;</tt> -<li><tt>if (isOption(input)) return getOption(input).parse();</tt> <i>// Normal option</i> -<li><tt>while (!isOption(input) && !input.empty()) input.pop_back();</tt> <i>// Remove the last letter</i> -<li><tt>if (input.empty()) return error();</tt> <i>// No matching option</i> -<li><tt>if (getOption(input).isPrefix())<br> - return getOption(input).parse(input);</tt> -<li><tt>while (!input.empty()) { <i>// Must be grouping options</i><br> - getOption(input).parse();<br> - OrigInput.erase(OrigInput.begin(), OrigInput.begin()+input.length());<br> - input = OrigInput;<br> - while (!isOption(input) && !input.empty()) input.pop_back();<br> -}</tt> -<li><tt>if (!OrigInput.empty()) error();</tt></li> -</ol> - -<p><tt>}</tt></p> -</div> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="misc">Miscellaneous option modifiers</a> -</h4> - -<div> - -<p>The miscellaneous option modifiers are the only flags where you can specify -more than one flag from the set: they are not mutually exclusive. 
These flags -specify boolean properties that modify the option.</p> - -<ul> - -<li><a name="cl::CommaSeparated">The <b><tt>cl::CommaSeparated</tt></b></a> modifier -indicates that any commas specified for an option's value should be used to -split the value up into multiple values for the option. For example, these two -options are equivalent when <tt>cl::CommaSeparated</tt> is specified: -"<tt>-foo=a -foo=b -foo=c</tt>" and "<tt>-foo=a,b,c</tt>". This option only -makes sense to be used in a case where the option is allowed to accept one or -more values (i.e. it is a <a href="#cl::list">cl::list</a> option).</li> - -<li><a name="cl::PositionalEatsArgs">The -<b><tt>cl::PositionalEatsArgs</tt></b></a> modifier (which only applies to -positional arguments, and only makes sense for lists) indicates that positional -argument should consume any strings after it (including strings that start with -a "-") up until another recognized positional argument. For example, if you -have two "eating" positional arguments, "<tt>pos1</tt>" and "<tt>pos2</tt>", the -string "<tt>-pos1 -foo -bar baz -pos2 -bork</tt>" would cause the "<tt>-foo -bar --baz</tt>" strings to be applied to the "<tt>-pos1</tt>" option and the -"<tt>-bork</tt>" string to be applied to the "<tt>-pos2</tt>" option.</li> - -<li><a name="cl::Sink">The <b><tt>cl::Sink</tt></b></a> modifier is -used to handle unknown options. If there is at least one option with -<tt>cl::Sink</tt> modifier specified, the parser passes -unrecognized option strings to it as values instead of signaling an -error. As with <tt>cl::CommaSeparated</tt>, this modifier -only makes sense with a <a href="#cl::list">cl::list</a> option.</li> - -</ul> - -<p>So far, these are the only three miscellaneous option modifiers.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="response">Response files</a> -</h4> - -<div> - -<p>Some systems, such as certain variants of Microsoft Windows and -some older Unices have a relatively low limit on command-line -length. It is therefore customary to use the so-called 'response -files' to circumvent this restriction. These files are mentioned on -the command-line (using the "@file") syntax. The program reads these -files and inserts the contents into argv, thereby working around the -command-line length limits. Response files are enabled by an optional -fourth argument to -<a href="#cl::ParseEnvironmentOptions"><tt>cl::ParseEnvironmentOptions</tt></a> -and -<a href="#cl::ParseCommandLineOptions"><tt>cl::ParseCommandLineOptions</tt></a>. -</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="toplevel">Top-Level Classes and Functions</a> -</h3> - -<div> - -<p>Despite all of the built-in flexibility, the CommandLine option library -really only consists of one function (<a -href="#cl::ParseCommandLineOptions"><tt>cl::ParseCommandLineOptions</tt></a>) -and three main classes: <a href="#cl::opt"><tt>cl::opt</tt></a>, <a -href="#cl::list"><tt>cl::list</tt></a>, and <a -href="#cl::alias"><tt>cl::alias</tt></a>. 
This section describes these three -classes in detail.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="cl::ParseCommandLineOptions">The <tt>cl::ParseCommandLineOptions</tt> - function</a> -</h4> - -<div> - -<p>The <tt>cl::ParseCommandLineOptions</tt> function is designed to be called -directly from <tt>main</tt>, and is used to fill in the values of all of the -command line option variables once <tt>argc</tt> and <tt>argv</tt> are -available.</p> - -<p>The <tt>cl::ParseCommandLineOptions</tt> function requires two parameters -(<tt>argc</tt> and <tt>argv</tt>), but may also take an optional third parameter -which holds <a href="#description">additional extra text</a> to emit when the -<tt>-help</tt> option is invoked, and a fourth boolean parameter that enables -<a href="#response">response files</a>.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="cl::ParseEnvironmentOptions">The <tt>cl::ParseEnvironmentOptions</tt> - function</a> -</h4> - -<div> - -<p>The <tt>cl::ParseEnvironmentOptions</tt> function has mostly the same effects -as <a -href="#cl::ParseCommandLineOptions"><tt>cl::ParseCommandLineOptions</tt></a>, -except that it is designed to take values for options from an environment -variable, for those cases in which reading the command line is not convenient or -desired. It fills in the values of all the command line option variables just -like <a -href="#cl::ParseCommandLineOptions"><tt>cl::ParseCommandLineOptions</tt></a> -does.</p> - -<p>It takes four parameters: the name of the program (since <tt>argv</tt> may -not be available, it can't just look in <tt>argv[0]</tt>), the name of the -environment variable to examine, the optional -<a href="#description">additional extra text</a> to emit when the -<tt>-help</tt> option is invoked, and the boolean -switch that controls whether <a href="#response">response files</a> -should be read.</p> - -<p><tt>cl::ParseEnvironmentOptions</tt> will break the environment -variable's value up into words and then process them using -<a href="#cl::ParseCommandLineOptions"><tt>cl::ParseCommandLineOptions</tt></a>. -<b>Note:</b> Currently <tt>cl::ParseEnvironmentOptions</tt> does not support -quoting, so an environment variable containing <tt>-option "foo bar"</tt> will -be parsed as three words, <tt>-option</tt>, <tt>"foo</tt>, and <tt>bar"</tt>, -which is different from what you would get from the shell with the same -input.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="cl::SetVersionPrinter">The <tt>cl::SetVersionPrinter</tt> - function</a> -</h4> - -<div> - -<p>The <tt>cl::SetVersionPrinter</tt> function is designed to be called -directly from <tt>main</tt> and <i>before</i> -<tt>cl::ParseCommandLineOptions</tt>. Its use is optional. It simply arranges -for a function to be called in response to the <tt>--version</tt> option instead -of having the <tt>CommandLine</tt> library print out the usual version string -for LLVM. This is useful for programs that are not part of LLVM but wish to use -the <tt>CommandLine</tt> facilities. Such programs should just define a small -function that takes no arguments and returns <tt>void</tt> and that prints out -whatever version information is appropriate for the program. 
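-<p>For example, a minimal sketch (the function name and version text here are made up)
-might look like this:</p>
-
-<div class="doc_code"><pre>
-<b>void</b> PrintMyToolVersion() {
-  std::cout << "MyTool version 1.4\n";  <i>// print whatever is appropriate</i>
-}
-</pre></div>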
Pass the address -of that function to <tt>cl::SetVersionPrinter</tt> to arrange for it to be -called when the <tt>--version</tt> option is given by the user.</p> - -</div> -<!-- _______________________________________________________________________ --> -<h4> - <a name="cl::opt">The <tt>cl::opt</tt> class</a> -</h4> - -<div> - -<p>The <tt>cl::opt</tt> class is the class used to represent scalar command line -options, and is the one used most of the time. It is a templated class which -can take up to three arguments (all except for the first have default values -though):</p> - -<div class="doc_code"><pre> -<b>namespace</b> cl { - <b>template</b> <<b>class</b> DataType, <b>bool</b> ExternalStorage = <b>false</b>, - <b>class</b> ParserClass = parser<DataType> > - <b>class</b> opt; -} -</pre></div> - -<p>The first template argument specifies what underlying data type the command -line argument is, and is used to select a default parser implementation. The -second template argument is used to specify whether the option should contain -the storage for the option (the default) or whether external storage should be -used to contain the value parsed for the option (see <a href="#storage">Internal -vs External Storage</a> for more information).</p> - -<p>The third template argument specifies which parser to use. The default value -selects an instantiation of the <tt>parser</tt> class based on the underlying -data type of the option. In general, this default works well for most -applications, so this option is only used when using a <a -href="#customparser">custom parser</a>.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="cl::list">The <tt>cl::list</tt> class</a> -</h4> - -<div> - -<p>The <tt>cl::list</tt> class is the class used to represent a list of command -line options. It too is a templated class which can take up to three -arguments:</p> - -<div class="doc_code"><pre> -<b>namespace</b> cl { - <b>template</b> <<b>class</b> DataType, <b>class</b> Storage = <b>bool</b>, - <b>class</b> ParserClass = parser<DataType> > - <b>class</b> list; -} -</pre></div> - -<p>This class works the exact same as the <a -href="#cl::opt"><tt>cl::opt</tt></a> class, except that the second argument is -the <b>type</b> of the external storage, not a boolean value. For this class, -the marker type '<tt>bool</tt>' is used to indicate that internal storage should -be used.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="cl::bits">The <tt>cl::bits</tt> class</a> -</h4> - -<div> - -<p>The <tt>cl::bits</tt> class is the class used to represent a list of command -line options in the form of a bit vector. 
It is also a templated class which -can take up to three arguments:</p> - -<div class="doc_code"><pre> -<b>namespace</b> cl { - <b>template</b> <<b>class</b> DataType, <b>class</b> Storage = <b>bool</b>, - <b>class</b> ParserClass = parser<DataType> > - <b>class</b> bits; -} -</pre></div> - -<p>This class works exactly the same as the <a -href="#cl::list"><tt>cl::list</tt></a> class, except that the second argument -must be of <b>type</b> <tt>unsigned</tt> if external storage is used.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="cl::alias">The <tt>cl::alias</tt> class</a> -</h4> - -<div> - -<p>The <tt>cl::alias</tt> class is a nontemplated class that is used to form -aliases for other arguments.</p> - -<div class="doc_code"><pre> -<b>namespace</b> cl { - <b>class</b> alias; -} -</pre></div> - -<p>The <a href="#cl::aliasopt"><tt>cl::aliasopt</tt></a> attribute should be -used to specify which option this is an alias for. Alias arguments default to -being <a href="#cl::Hidden">Hidden</a>, and use the aliased option's parser to do -the conversion from string to data.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="cl::extrahelp">The <tt>cl::extrahelp</tt> class</a> -</h4> - -<div> - -<p>The <tt>cl::extrahelp</tt> class is a nontemplated class that allows extra -help text to be printed out for the <tt>-help</tt> option.</p> - -<div class="doc_code"><pre> -<b>namespace</b> cl { - <b>struct</b> extrahelp; -} -</pre></div> - -<p>To use the extrahelp, simply construct one with a <tt>const char*</tt> -parameter to the constructor. The text passed to the constructor will be printed -at the bottom of the help message, verbatim. Note that multiple -<tt>cl::extrahelp</tt> instances <b>can</b> be used, but this practice is discouraged. If -your tool needs to print additional help information, put all that help into a -single <tt>cl::extrahelp</tt> instance.</p> -<p>For example:</p> -<div class="doc_code"><pre> - cl::extrahelp("\nADDITIONAL HELP:\n\n This is the extra help\n"); -</pre></div> -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="builtinparsers">Builtin parsers</a> -</h3> - -<div> - -<p>Parsers control how the string value taken from the command line is -translated into a typed value, suitable for use in a C++ program. By default, -the CommandLine library uses an instance of <tt>parser<type></tt> if the -command line option specifies that it uses values of type '<tt>type</tt>'. -Because of this, custom option processing is specified with specializations of -the '<tt>parser</tt>' class.</p> - -<p>The CommandLine library provides the following builtin parser -specializations, which are sufficient for most applications. It can, however, -also be extended to work with new data types and new ways of interpreting the -same data. See the <a href="#customparser">Writing a Custom Parser</a> section for more -details on this type of library extension.</p> - -<ul> - -<li><a name="genericparser">The <b>generic <tt>parser<t></tt> parser</b></a> -can be used to map string values to any data type, through the use of the <a -href="#cl::values">cl::values</a> property, which specifies the mapping -information.
The most common use of this parser is for parsing enum values, -which allows you to use the CommandLine library for all of the error checking to -make sure that only valid enum values are specified (as opposed to accepting -arbitrary strings). Despite this, however, the generic parser class can be used -for any data type.</li> - -<li><a name="boolparser">The <b><tt>parser<bool></tt> specialization</b></a> -is used to convert boolean strings to a boolean value. Currently accepted -strings are "<tt>true</tt>", "<tt>TRUE</tt>", "<tt>True</tt>", "<tt>1</tt>", -"<tt>false</tt>", "<tt>FALSE</tt>", "<tt>False</tt>", and "<tt>0</tt>".</li> - -<li><a name="boolOrDefaultparser">The <b><tt>parser<boolOrDefault></tt> - specialization</b></a> is used for cases where the value is boolean, -but we also need to know whether the option was specified at all. boolOrDefault -is an enum with 3 values, BOU_UNSET, BOU_TRUE and BOU_FALSE. This parser accepts -the same strings as <b><tt>parser<bool></tt></b>.</li> - -<li><a name="stringparser">The <b><tt>parser<string></tt> -specialization</b></a> simply stores the parsed string into the string value -specified. No conversion or modification of the data is performed.</li> - -<li><a name="intparser">The <b><tt>parser<int></tt> specialization</b></a> -uses the C <tt>strtol</tt> function to parse the string input. As such, it will -accept a decimal number (with an optional '+' or '-' prefix) which must start -with a non-zero digit. It accepts octal numbers, which are identified with a -'<tt>0</tt>' prefix digit, and hexadecimal numbers with a prefix of -'<tt>0x</tt>' or '<tt>0X</tt>'.</li> - -<li><a name="doubleparser">The <b><tt>parser<double></tt></b></a> and -<b><tt>parser<float></tt> specializations</b> use the standard C -<tt>strtod</tt> function to convert floating point strings into floating point -values. As such, a broad range of string formats is supported, including -exponential notation (ex: <tt>1.7e15</tt>) and properly supports locales. -</li> - -</ul> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="extensionguide">Extension Guide</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Although the CommandLine library has a lot of functionality built into it -already (as discussed previously), one of its true strengths lie in its -extensibility. This section discusses how the CommandLine library works under -the covers and illustrates how to do some simple, common, extensions.</p> - -<!-- ======================================================================= --> -<h3> - <a name="customparser">Writing a custom parser</a> -</h3> - -<div> - -<p>One of the simplest and most common extensions is the use of a custom parser. -As <a href="#builtinparsers">discussed previously</a>, parsers are the portion -of the CommandLine library that turns string input from the user into a -particular parsed data type, validating the input in the process.</p> - -<p>There are two ways to use a new parser:</p> - -<ol> - -<li> - -<p>Specialize the <a href="#genericparser"><tt>cl::parser</tt></a> template for -your custom data type.<p> - -<p>This approach has the advantage that users of your custom data type will -automatically use your custom parser whenever they define an option with a value -type of your data type. 
The disadvantage of this approach is that it doesn't -work if your fundamental data type is something that is already supported.</p> - -</li> - -<li> - -<p>Write an independent class, using it explicitly from options that need -it.</p> - -<p>This approach works well in situations where you would like to parse an -option using special syntax for a not-very-special data type. The drawback of -this approach is that users of your parser have to be aware that they are using -your parser instead of the builtin ones.</p> - -</li> - -</ol> - -<p>To guide the discussion, we will discuss a custom parser that accepts file -sizes, specified with an optional unit after the numeric size. For example, we -would like to parse "102kb", "41M", "1G" into the appropriate integer value. In -this case, the underlying data type we want to parse into is -'<tt>unsigned</tt>'. We choose approach #2 above because we don't want to make -this the default for all <tt>unsigned</tt> options.</p> - -<p>To start out, we declare our new <tt>FileSizeParser</tt> class:</p> - -<div class="doc_code"><pre> -<b>struct</b> FileSizeParser : <b>public</b> cl::basic_parser<<b>unsigned</b>> { - <i>// parse - Return true on error.</i> - <b>bool</b> parse(cl::Option &O, <b>const char</b> *ArgName, <b>const</b> std::string &ArgValue, - <b>unsigned</b> &Val); -}; -</pre></div> - -<p>Our new class inherits from the <tt>cl::basic_parser</tt> template class to -fill in the default, boilerplate code for us. We give it the data type that -we parse into, which is the type of the last argument to the <tt>parse</tt> method, so that clients of -our custom parser know what object type to pass in to the parse method. (Here we -declare that we parse into '<tt>unsigned</tt>' variables.)</p> - -<p>For most purposes, the only method that must be implemented in a custom -parser is the <tt>parse</tt> method. The <tt>parse</tt> method is called -whenever the option is invoked, passing in the option itself, the option name, -the string to parse, and a reference to a return value. If the string to parse -is not well-formed, the parser should output an error message and return true. -Otherwise it should return false and set '<tt>Val</tt>' to the parsed value. In -our example, we implement <tt>parse</tt> as:</p> - -<div class="doc_code"><pre> -<b>bool</b> FileSizeParser::parse(cl::Option &O, <b>const char</b> *ArgName, - <b>const</b> std::string &Arg, <b>unsigned</b> &Val) { - <b>const char</b> *ArgStart = Arg.c_str(); - <b>char</b> *End; - - <i>// Parse integer part, leaving 'End' pointing to the first non-integer char</i> - Val = (unsigned)strtol(ArgStart, &End, 0); - - <b>while</b> (1) { - <b>switch</b> (*End++) { - <b>case</b> 0: <b>return</b> false; <i>// No error</i> - <b>case</b> 'i': <i>// Ignore the 'i' in KiB if people use that</i> - <b>case</b> 'b': <b>case</b> 'B': <i>// Ignore B suffix</i> - <b>break</b>; - - <b>case</b> 'g': <b>case</b> 'G': Val *= 1024*1024*1024; <b>break</b>; - <b>case</b> 'm': <b>case</b> 'M': Val *= 1024*1024; <b>break</b>; - <b>case</b> 'k': <b>case</b> 'K': Val *= 1024; <b>break</b>; - - default: - <i>// Print an error message if unrecognized character!</i> - <b>return</b> O.error("'" + Arg + "' value invalid for file size argument!"); - } - } -} -</pre></div> - -<p>This function implements a very simple parser for the kinds of strings we are -interested in. Although it has some holes (it allows "<tt>123KKK</tt>" for -example), it is good enough for this example.
Note that we use the option -itself to print out the error message (the <tt>error</tt> method always returns -true) in order to get a nice error message (shown below). Now that we have our -parser class, we can use it like this:</p> - -<div class="doc_code"><pre> -<b>static</b> <a href="#cl::opt">cl::opt</a><<b>unsigned</b>, <b>false</b>, FileSizeParser> -MFS(<i>"max-file-size"</i>, <a href="#cl::desc">cl::desc</a>(<i>"Maximum file size to accept"</i>), - <a href="#cl::value_desc">cl::value_desc</a>("<i>size</i>")); -</pre></div> - -<p>Which adds this to the output of our program:</p> - -<div class="doc_code"><pre> -OPTIONS: - -help - display available options (-help-hidden for more) - ... - <b>-max-file-size=<size> - Maximum file size to accept</b> -</pre></div> - -<p>And we can test that our parse works correctly now (the test program just -prints out the max-file-size argument value):</p> - -<div class="doc_code"><pre> -$ ./test -MFS: 0 -$ ./test -max-file-size=123MB -MFS: 128974848 -$ ./test -max-file-size=3G -MFS: 3221225472 -$ ./test -max-file-size=dog --max-file-size option: 'dog' value invalid for file size argument! -</pre></div> - -<p>It looks like it works. The error message that we get is nice and helpful, -and we seem to accept reasonable file sizes. This wraps up the "custom parser" -tutorial.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="explotingexternal">Exploiting external storage</a> -</h3> - -<div> - <p>Several of the LLVM libraries define static <tt>cl::opt</tt> instances that - will automatically be included in any program that links with that library. - This is a feature. However, sometimes it is necessary to know the value of the - command line option outside of the library. In these cases the library does or - should provide an external storage location that is accessible to users of the - library. Examples of this include the <tt>llvm::DebugFlag</tt> exported by the - <tt>lib/Support/Debug.cpp</tt> file and the <tt>llvm::TimePassesIsEnabled</tt> - flag exported by the <tt>lib/VMCore/Pass.cpp</tt> file.</p> - -<p>TODO: complete this section</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="dynamicopts">Dynamically adding command line options</a> -</h3> - -<div> - -<p>TODO: fill in this section</p> - -</div> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ -</address> - -</body> -</html> diff --git a/docs/CommandLine.rst b/docs/CommandLine.rst new file mode 100644 index 0000000..302f5a4 --- /dev/null +++ b/docs/CommandLine.rst @@ -0,0 +1,1615 @@ +.. _commandline: + +============================== +CommandLine 2.0 Library Manual +============================== + +Introduction +============ + +This document describes the CommandLine argument processing library. It will +show you how to use it, and what it can do. 
The CommandLine library uses a +declarative approach to specifying the command line options that your program +takes. By default, these options declarations implicitly hold the value parsed +for the option declared (of course this `can be changed`_). + +Although there are a **lot** of command line argument parsing libraries out +there in many different languages, none of them fit well with what I needed. By +looking at the features and problems of other libraries, I designed the +CommandLine library to have the following features: + +#. Speed: The CommandLine library is very quick and uses little resources. The + parsing time of the library is directly proportional to the number of + arguments parsed, not the number of options recognized. Additionally, + command line argument values are captured transparently into user defined + global variables, which can be accessed like any other variable (and with the + same performance). + +#. Type Safe: As a user of CommandLine, you don't have to worry about + remembering the type of arguments that you want (is it an int? a string? a + bool? an enum?) and keep casting it around. Not only does this help prevent + error prone constructs, it also leads to dramatically cleaner source code. + +#. No subclasses required: To use CommandLine, you instantiate variables that + correspond to the arguments that you would like to capture, you don't + subclass a parser. This means that you don't have to write **any** + boilerplate code. + +#. Globally accessible: Libraries can specify command line arguments that are + automatically enabled in any tool that links to the library. This is + possible because the application doesn't have to keep a list of arguments to + pass to the parser. This also makes supporting `dynamically loaded options`_ + trivial. + +#. Cleaner: CommandLine supports enum and other types directly, meaning that + there is less error and more security built into the library. You don't have + to worry about whether your integral command line argument accidentally got + assigned a value that is not valid for your enum type. + +#. Powerful: The CommandLine library supports many different types of arguments, + from simple `boolean flags`_ to `scalars arguments`_ (`strings`_, + `integers`_, `enums`_, `doubles`_), to `lists of arguments`_. This is + possible because CommandLine is... + +#. Extensible: It is very simple to add a new argument type to CommandLine. + Simply specify the parser that you want to use with the command line option + when you declare it. `Custom parsers`_ are no problem. + +#. Labor Saving: The CommandLine library cuts down on the amount of grunt work + that you, the user, have to do. For example, it automatically provides a + ``-help`` option that shows the available command line options for your tool. + Additionally, it does most of the basic correctness checking for you. + +#. Capable: The CommandLine library can handle lots of different forms of + options often found in real programs. For example, `positional`_ arguments, + ``ls`` style `grouping`_ options (to allow processing '``ls -lad``' + naturally), ``ld`` style `prefix`_ options (to parse '``-lmalloc + -L/usr/lib``'), and interpreter style options. + +This document will hopefully let you jump in and start using CommandLine in your +utility quickly and painlessly. Additionally it should be a simple reference +manual to figure out how stuff works. 
If it is failing in some area (or you +want an extension to the library), nag the author, `Chris +Lattner <mailto:sabre@nondot.org>`_. + +Quick Start Guide +================= + +This section of the manual runs through a simple CommandLine'ification of a +basic compiler tool. This is intended to show you how to jump into using the +CommandLine library in your own program, and show you some of the cool things it +can do. + +To start out, you need to include the CommandLine header file into your program: + +.. code-block:: c++ + + #include "llvm/Support/CommandLine.h" + +Additionally, you need to add this as the first line of your main program: + +.. code-block:: c++ + + int main(int argc, char **argv) { + cl::ParseCommandLineOptions(argc, argv); + ... + } + +... which actually parses the arguments and fills in the variable declarations. + +Now that you are ready to support command line arguments, we need to tell the +system which ones we want, and what type of arguments they are. The CommandLine +library uses a declarative syntax to model command line arguments with the +global variable declarations that capture the parsed values. This means that +for every command line option that you would like to support, there should be a +global variable declaration to capture the result. For example, in a compiler, +we would like to support the Unix-standard '``-o <filename>``' option to specify +where to put the output. With the CommandLine library, this is represented like +this: + +.. _scalars arguments: +.. _here: + +.. code-block:: c++ + + cl::opt<string> OutputFilename("o", cl::desc("Specify output filename"), cl::value_desc("filename")); + +This declares a global variable "``OutputFilename``" that is used to capture the +result of the "``o``" argument (first parameter). We specify that this is a +simple scalar option by using the "``cl::opt``" template (as opposed to the +"``cl::list``" template), and tell the CommandLine library that the data +type that we are parsing is a string. + +The second and third parameters (which are optional) are used to specify what to +output for the "``-help``" option. In this case, we get a line that looks like +this: + +:: + + USAGE: compiler [options] + + OPTIONS: + -help - display available options (-help-hidden for more) + -o <filename> - Specify output filename + +Because we specified that the command line option should parse using the +``string`` data type, the variable declared is automatically usable as a real +string in all contexts that a normal C++ string object may be used. For +example: + +.. code-block:: c++ + + ... + std::ofstream Output(OutputFilename.c_str()); + if (Output.good()) ... + ... + +There are many different options that you can use to customize the command line +option handling library, but the above example shows the general interface to +these options. The options can be specified in any order, and are specified +with helper functions like `cl::desc(...)`_, so there are no positional +dependencies to remember. The available options are discussed in detail in the +`Reference Guide`_. + +Continuing the example, we would like to have our compiler take an input +filename as well as an output filename, but we do not want the input filename to +be specified with a hyphen (ie, not ``-filename.c``). To support this style of +argument, the CommandLine library allows for `positional`_ arguments to be +specified for the program. These positional arguments are filled with command +line parameters that are not in option form. 
We use this feature like this: + +.. code-block:: c++ + + + cl::opt<string> InputFilename(cl::Positional, cl::desc("<input file>"), cl::init("-")); + +This declaration indicates that the first positional argument should be treated +as the input filename. Here we use the `cl::init`_ option to specify an initial +value for the command line option, which is used if the option is not specified +(if you do not specify a `cl::init`_ modifier for an option, then the default +constructor for the data type is used to initialize the value). Command line +options default to being optional, so if we would like to require that the user +always specify an input filename, we would add the `cl::Required`_ flag, and we +could eliminate the `cl::init`_ modifier, like this: + +.. code-block:: c++ + + cl::opt<string> InputFilename(cl::Positional, cl::desc("<input file>"), cl::Required); + +Again, the CommandLine library does not require the options to be specified in +any particular order, so the above declaration is equivalent to: + +.. code-block:: c++ + + cl::opt<string> InputFilename(cl::Positional, cl::Required, cl::desc("<input file>")); + +By simply adding the `cl::Required`_ flag, the CommandLine library will +automatically issue an error if the argument is not specified, which shifts all +of the command line option verification code out of your application into the +library. This is just one example of how using flags can alter the default +behaviour of the library, on a per-option basis. By adding one of the +declarations above, the ``-help`` option synopsis is now extended to: + +:: + + USAGE: compiler [options] <input file> + + OPTIONS: + -help - display available options (-help-hidden for more) + -o <filename> - Specify output filename + +... indicating that an input filename is expected. + +Boolean Arguments +----------------- + +In addition to input and output filenames, we would like the compiler example to +support three boolean flags: "``-f``" to force writing binary output to a +terminal, "``--quiet``" to enable quiet mode, and "``-q``" for backwards +compatibility with some of our users. We can support these by declaring options +of boolean type like this: + +.. code-block:: c++ + + cl::opt<bool> Force ("f", cl::desc("Enable binary output on terminals")); + cl::opt<bool> Quiet ("quiet", cl::desc("Don't print informational messages")); + cl::opt<bool> Quiet2("q", cl::desc("Don't print informational messages"), cl::Hidden); + +This does what you would expect: it declares three boolean variables +("``Force``", "``Quiet``", and "``Quiet2``") to recognize these options. Note +that the "``-q``" option is specified with the "`cl::Hidden`_" flag. This +modifier prevents it from being shown by the standard "``-help``" output (note +that it is still shown in the "``-help-hidden``" output). + +The CommandLine library uses a `different parser`_ for different data types. +For example, in the string case, the argument passed to the option is copied +literally into the content of the string variable... we obviously cannot do that +in the boolean case, however, so we must use a smarter parser. 
In the case of +the boolean parser, it allows no options (in which case it assigns the value of +true to the variable), or it allows the values "``true``" or "``false``" to be +specified, allowing any of the following inputs: + +:: + + compiler -f # No value, 'Force' == true + compiler -f=true # Value specified, 'Force' == true + compiler -f=TRUE # Value specified, 'Force' == true + compiler -f=FALSE # Value specified, 'Force' == false + +... you get the idea. The `bool parser`_ just turns the string values into +boolean values, and rejects things like '``compiler -f=foo``'. Similarly, the +`float`_, `double`_, and `int`_ parsers work like you would expect, using the +'``strtol``' and '``strtod``' C library calls to parse the string value into the +specified data type. + +With the declarations above, "``compiler -help``" emits this: + +:: + + USAGE: compiler [options] <input file> + + OPTIONS: + -f - Enable binary output on terminals + -o - Override output filename + -quiet - Don't print informational messages + -help - display available options (-help-hidden for more) + +and "``compiler -help-hidden``" prints this: + +:: + + USAGE: compiler [options] <input file> + + OPTIONS: + -f - Enable binary output on terminals + -o - Override output filename + -q - Don't print informational messages + -quiet - Don't print informational messages + -help - display available options (-help-hidden for more) + +This brief example has shown you how to use the '`cl::opt`_' class to parse +simple scalar command line arguments. In addition to simple scalar arguments, +the CommandLine library also provides primitives to support CommandLine option +`aliases`_, and `lists`_ of options. + +.. _aliases: + +Argument Aliases +---------------- + +So far, the example works well, except for the fact that we need to check the +quiet condition like this now: + +.. code-block:: c++ + + ... + if (!Quiet && !Quiet2) printInformationalMessage(...); + ... + +... which is a real pain! Instead of defining two values for the same +condition, we can use the "`cl::alias`_" class to make the "``-q``" option an +**alias** for the "``-quiet``" option, instead of providing a value itself: + +.. code-block:: c++ + + cl::opt<bool> Force ("f", cl::desc("Overwrite output files")); + cl::opt<bool> Quiet ("quiet", cl::desc("Don't print informational messages")); + cl::alias QuietA("q", cl::desc("Alias for -quiet"), cl::aliasopt(Quiet)); + +The third line (which is the only one we modified from above) defines a "``-q``" +alias that updates the "``Quiet``" variable (as specified by the `cl::aliasopt`_ +modifier) whenever it is specified. Because aliases do not hold state, the only +thing the program has to query is the ``Quiet`` variable now. Another nice +feature of aliases is that they automatically hide themselves from the ``-help`` +output (although, again, they are still visible in the ``-help-hidden output``). + +Now the application code can simply use: + +.. code-block:: c++ + + ... + if (!Quiet) printInformationalMessage(...); + ... + +... which is much nicer! The "`cl::alias`_" can be used to specify an +alternative name for any variable type, and has many uses. + +.. _unnamed alternatives using the generic parser: + +Selecting an alternative from a set of possibilities +---------------------------------------------------- + +So far we have seen how the CommandLine library handles builtin types like +``std::string``, ``bool`` and ``int``, but how does it handle things it doesn't +know about, like enums or '``int*``'s? 
+ +The answer is that it uses a table-driven generic parser (unless you specify +your own parser, as described in the `Extension Guide`_). This parser maps +literal strings to whatever type is required, and requires you to tell it what +this mapping should be. + +Let's say that we would like to add four optimization levels to our optimizer, +using the standard flags "``-g``", "``-O1``", "``-O2``", and "``-O3``". We +could easily implement this with boolean options like above, but there are +several problems with this strategy: + +#. A user could specify more than one of the options at a time, for example, + "``compiler -O3 -O2``". The CommandLine library would not be able to catch + this erroneous input for us. + +#. We would have to test 4 different variables to see which ones are set. + +#. This doesn't map to the numeric levels that we want... so we cannot easily + see if some level >= "``-O1``" is enabled. + +To cope with these problems, we can use an enum value, and have the CommandLine +library fill it in with the appropriate level directly, which is used like this: + +.. code-block:: c++ + + enum OptLevel { + g, O1, O2, O3 + }; + + cl::opt<OptLevel> OptimizationLevel(cl::desc("Choose optimization level:"), + cl::values( + clEnumVal(g , "No optimizations, enable debugging"), + clEnumVal(O1, "Enable trivial optimizations"), + clEnumVal(O2, "Enable default optimizations"), + clEnumVal(O3, "Enable expensive optimizations"), + clEnumValEnd)); + + ... + if (OptimizationLevel >= O2) doPartialRedundancyElimination(...); + ... + +This declaration defines a variable "``OptimizationLevel``" of the +"``OptLevel``" enum type. This variable can be assigned any of the values that +are listed in the declaration (note that the declaration list must be terminated +with the "``clEnumValEnd``" argument!). The CommandLine library enforces that +the user can only specify one of the options, and it ensures that only valid enum +values can be specified. The "``clEnumVal``" macros ensure that the command +line arguments match the enum values. With this option added, our help output +now is: + +:: + + USAGE: compiler [options] <input file> + + OPTIONS: + Choose optimization level: + -g - No optimizations, enable debugging + -O1 - Enable trivial optimizations + -O2 - Enable default optimizations + -O3 - Enable expensive optimizations + -f - Enable binary output on terminals + -help - display available options (-help-hidden for more) + -o <filename> - Specify output filename + -quiet - Don't print informational messages + +In this case, it is sort of awkward that flag names correspond directly to enum +names, because we probably don't want an enum definition named "``g``" in our +program. Because of this, we can alternatively write this example like this: + +.. code-block:: c++ + + enum OptLevel { + Debug, O1, O2, O3 + }; + + cl::opt<OptLevel> OptimizationLevel(cl::desc("Choose optimization level:"), + cl::values( + clEnumValN(Debug, "g", "No optimizations, enable debugging"), + clEnumVal(O1 , "Enable trivial optimizations"), + clEnumVal(O2 , "Enable default optimizations"), + clEnumVal(O3 , "Enable expensive optimizations"), + clEnumValEnd)); + + ... + if (OptimizationLevel == Debug) outputDebugInfo(...); + ... + +By using the "``clEnumValN``" macro instead of "``clEnumVal``", we can directly +specify the name that the flag should get. In general a direct mapping is nice, +but sometimes you can't or don't want to preserve the mapping, which is when you +would use it.
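+
+For instance (an illustrative sketch, not part of the original example; the
+optimization handler functions named here are hypothetical), the parsed value behaves
+like any other variable of the enum type, so it can drive a ``switch`` directly:
+
+.. code-block:: c++
+
+  switch (OptimizationLevel) {
+  case Debug: outputDebugInfo(...);          break;  // the user passed -g
+  case O1:    doTrivialOptimizations(...);   break;
+  case O2:    doDefaultOptimizations(...);   break;
+  case O3:    doExpensiveOptimizations(...); break;
+  }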
+ +Named Alternatives +------------------ + +Another useful argument form is a named alternative style. We shall use this +style in our compiler to specify different debug levels that can be used. +Instead of each debug level being its own switch, we want to support the +following options, of which only one can be specified at a time: +"``--debug-level=none``", "``--debug-level=quick``", +"``--debug-level=detailed``". To do this, we use the exact same format as our +optimization level flags, but we also specify an option name. For this case, +the code looks like this: + +.. code-block:: c++ + + enum DebugLev { + nodebuginfo, quick, detailed + }; + + // Enable Debug Options to be specified on the command line + cl::opt<DebugLev> DebugLevel("debug_level", cl::desc("Set the debugging level:"), + cl::values( + clEnumValN(nodebuginfo, "none", "disable debug information"), + clEnumVal(quick, "enable quick debug information"), + clEnumVal(detailed, "enable detailed debug information"), + clEnumValEnd)); + +This definition defines an enumerated command line variable of type "``enum +DebugLev``", which works exactly the same way as before. The difference here is +just the interface exposed to the user of your program and the help output by +the "``-help``" option: + +:: + + USAGE: compiler [options] <input file> + + OPTIONS: + Choose optimization level: + -g - No optimizations, enable debugging + -O1 - Enable trivial optimizations + -O2 - Enable default optimizations + -O3 - Enable expensive optimizations + -debug_level - Set the debugging level: + =none - disable debug information + =quick - enable quick debug information + =detailed - enable detailed debug information + -f - Enable binary output on terminals + -help - display available options (-help-hidden for more) + -o <filename> - Specify output filename + -quiet - Don't print informational messages + +Again, the only structural difference between the debug level declaration and +the optimization level declaration is that the debug level declaration includes +an option name (``"debug_level"``), which automatically changes how the library +processes the argument. The CommandLine library supports both forms so that you +can choose the form most appropriate for your application. + +.. _lists: + +Parsing a list of options +------------------------- + +Now that we have the standard run-of-the-mill argument types out of the way, +lets get a little wild and crazy. Lets say that we want our optimizer to accept +a **list** of optimizations to perform, allowing duplicates. For example, we +might want to run: "``compiler -dce -constprop -inline -dce -strip``". In this +case, the order of the arguments and the number of appearances is very +important. This is what the "``cl::list``" template is for. First, start by +defining an enum of the optimizations that you would like to perform: + +.. code-block:: c++ + + enum Opts { + // 'inline' is a C++ keyword, so name it 'inlining' + dce, constprop, inlining, strip + }; + +Then define your "``cl::list``" variable: + +.. code-block:: c++ + + cl::list<Opts> OptimizationList(cl::desc("Available Optimizations:"), + cl::values( + clEnumVal(dce , "Dead Code Elimination"), + clEnumVal(constprop , "Constant Propagation"), + clEnumValN(inlining, "inline", "Procedure Integration"), + clEnumVal(strip , "Strip Symbols"), + clEnumValEnd)); + +This defines a variable that is conceptually of the type +"``std::vector<enum Opts>``". Thus, you can access it with standard vector +methods: + +.. 
code-block:: c++ + + for (unsigned i = 0; i != OptimizationList.size(); ++i) + switch (OptimizationList[i]) + ... + +... to iterate through the list of options specified. + +Note that the "``cl::list``" template is completely general and may be used with +any data types or other arguments that you can use with the "``cl::opt``" +template. One especially useful way to use a list is to capture all of the +positional arguments together if there may be more than one specified. In the +case of a linker, for example, the linker takes several '``.o``' files, and +needs to capture them into a list. This is naturally specified as: + +.. code-block:: c++ + + ... + cl::list<std::string> InputFilenames(cl::Positional, cl::desc("<Input files>"), cl::OneOrMore); + ... + +This variable works just like a "``vector<string>``" object. As such, accessing +the list is simple, just like above. In this example, we used the +`cl::OneOrMore`_ modifier to inform the CommandLine library that it is an error +if the user does not specify any ``.o`` files on our command line. Again, this +just reduces the amount of checking we have to do. + +Collecting options as a set of flags +------------------------------------ + +Instead of collecting sets of options in a list, it is also possible to gather +information for enum values in a **bit vector**. The representation used by the +`cl::bits`_ class is an ``unsigned`` integer. An enum value is represented by a +0/1 in the enum's ordinal value bit position. 1 indicating that the enum was +specified, 0 otherwise. As each specified value is parsed, the resulting enum's +bit is set in the option's bit vector: + +.. code-block:: c++ + + bits |= 1 << (unsigned)enum; + +Options that are specified multiple times are redundant. Any instances after +the first are discarded. + +Reworking the above list example, we could replace `cl::list`_ with `cl::bits`_: + +.. code-block:: c++ + + cl::bits<Opts> OptimizationBits(cl::desc("Available Optimizations:"), + cl::values( + clEnumVal(dce , "Dead Code Elimination"), + clEnumVal(constprop , "Constant Propagation"), + clEnumValN(inlining, "inline", "Procedure Integration"), + clEnumVal(strip , "Strip Symbols"), + clEnumValEnd)); + +To test to see if ``constprop`` was specified, we can use the ``cl:bits::isSet`` +function: + +.. code-block:: c++ + + if (OptimizationBits.isSet(constprop)) { + ... + } + +It's also possible to get the raw bit vector using the ``cl::bits::getBits`` +function: + +.. code-block:: c++ + + unsigned bits = OptimizationBits.getBits(); + +Finally, if external storage is used, then the location specified must be of +**type** ``unsigned``. In all other ways a `cl::bits`_ option is equivalent to a +`cl::list`_ option. + +.. _additional extra text: + +Adding freeform text to help output +----------------------------------- + +As our program grows and becomes more mature, we may decide to put summary +information about what it does into the help output. The help output is styled +to look similar to a Unix ``man`` page, providing concise information about a +program. Unix ``man`` pages, however often have a description about what the +program does. To add this to your CommandLine program, simply pass a third +argument to the `cl::ParseCommandLineOptions`_ call in main. This additional +argument is then printed as the overview information for your program, allowing +you to include any additional information that you want. For example: + +.. 
code-block:: c++ + + int main(int argc, char **argv) { + cl::ParseCommandLineOptions(argc, argv, " CommandLine compiler example\n\n" + " This program blah blah blah...\n"); + ... + } + +would yield the help output: + +:: + + **OVERVIEW: CommandLine compiler example + + This program blah blah blah...** + + USAGE: compiler [options] <input file> + + OPTIONS: + ... + -help - display available options (-help-hidden for more) + -o <filename> - Specify output filename + +.. _Reference Guide: + +Reference Guide +=============== + +Now that you know the basics of how to use the CommandLine library, this section +will give you the detailed information you need to tune how command line options +work, as well as information on more "advanced" command line option processing +capabilities. + +.. _positional: +.. _positional argument: +.. _Positional Arguments: +.. _Positional arguments section: +.. _positional options: + +Positional Arguments +-------------------- + +Positional arguments are those arguments that are not named, and are not +specified with a hyphen. Positional arguments should be used when an option is +specified by its position alone. For example, the standard Unix ``grep`` tool +takes a regular expression argument, and an optional filename to search through +(which defaults to standard input if a filename is not specified). Using the +CommandLine library, this would be specified as: + +.. code-block:: c++ + + cl::opt<string> Regex (cl::Positional, cl::desc("<regular expression>"), cl::Required); + cl::opt<string> Filename(cl::Positional, cl::desc("<input file>"), cl::init("-")); + +Given these two option declarations, the ``-help`` output for our grep +replacement would look like this: + +:: + + USAGE: spiffygrep [options] <regular expression> <input file> + + OPTIONS: + -help - display available options (-help-hidden for more) + +... and the resultant program could be used just like the standard ``grep`` +tool. + +Positional arguments are sorted by their order of construction. This means that +command line options will be ordered according to how they are listed in a .cpp +file, but will not have an ordering defined if the positional arguments are +defined in multiple .cpp files. The fix for this problem is simply to define +all of your positional arguments in one .cpp file. + +Specifying positional options with hyphens +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Sometimes you may want to specify a value to your positional argument that +starts with a hyphen (for example, searching for '``-foo``' in a file). At +first, you will have trouble doing this, because it will try to find an argument +named '``-foo``', and will fail (and single quotes will not save you). Note +that the system ``grep`` has the same problem: + +:: + + $ spiffygrep '-foo' test.txt + Unknown command line argument '-foo'. Try: spiffygrep -help' + + $ grep '-foo' test.txt + grep: illegal option -- f + grep: illegal option -- o + grep: illegal option -- o + Usage: grep -hblcnsviw pattern file . . . + +The solution for this problem is the same for both your tool and the system +version: use the '``--``' marker. When the user specifies '``--``' on the +command line, it is telling the program that all options after the '``--``' +should be treated as positional arguments, not options. Thus, we can use it +like this: + +:: + + $ spiffygrep -- -foo test.txt + ...output... 
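+
+For completeness, here is a minimal sketch of how the two ``spiffygrep``
+declarations above might be consumed once parsing is done. The declarations are
+repeated with an explicit ``std::string``, and the ``main`` body is hypothetical:
+it simply echoes the parsed values rather than actually searching:
+
+.. code-block:: c++
+
+  #include "llvm/Support/CommandLine.h"
+  #include <cstdio>
+  #include <string>
+  using namespace llvm;
+
+  static cl::opt<std::string> Regex(cl::Positional,
+                                    cl::desc("<regular expression>"), cl::Required);
+  static cl::opt<std::string> Filename(cl::Positional,
+                                       cl::desc("<input file>"), cl::init("-"));
+
+  int main(int argc, char **argv) {
+    // Arguments appearing after a bare "--" are still delivered to the
+    // positional options, even if they start with a dash.
+    cl::ParseCommandLineOptions(argc, argv, " spiffygrep example\n");
+
+    std::string Pattern = Regex;    // cl::opt<std::string> converts to its value
+    std::string Input   = Filename;
+    printf("searching for '%s' in '%s'\n", Pattern.c_str(), Input.c_str());
+    return 0;
+  }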
+ +Determining absolute position with getPosition() +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Sometimes an option can affect or modify the meaning of another option. For +example, consider ``gcc``'s ``-x LANG`` option. This tells ``gcc`` to ignore the +suffix of subsequent positional arguments and force the file to be interpreted +as if it contained source code in language ``LANG``. In order to handle this +properly, you need to know the absolute position of each argument, especially +those in lists, so their interaction(s) can be applied correctly. This is also +useful for options like ``-llibname`` which is actually a positional argument +that starts with a dash. + +So, generally, the problem is that you have two ``cl::list`` variables that +interact in some way. To ensure the correct interaction, you can use the +``cl::list::getPosition(optnum)`` method. This method returns the absolute +position (as found on the command line) of the ``optnum`` item in the +``cl::list``. + +The idiom for usage is like this: + +.. code-block:: c++ + + static cl::list<std::string> Files(cl::Positional, cl::OneOrMore); + static cl::list<std::string> Libraries("l", cl::ZeroOrMore); + + int main(int argc, char**argv) { + // ... + std::vector<std::string>::iterator fileIt = Files.begin(); + std::vector<std::string>::iterator libIt = Libraries.begin(); + unsigned libPos = 0, filePos = 0; + while ( 1 ) { + if ( libIt != Libraries.end() ) + libPos = Libraries.getPosition( libIt - Libraries.begin() ); + else + libPos = 0; + if ( fileIt != Files.end() ) + filePos = Files.getPosition( fileIt - Files.begin() ); + else + filePos = 0; + + if ( filePos != 0 && (libPos == 0 || filePos < libPos) ) { + // Source File Is next + ++fileIt; + } + else if ( libPos != 0 && (filePos == 0 || libPos < filePos) ) { + // Library is next + ++libIt; + } + else + break; // we're done with the list + } + } + +Note that, for compatibility reasons, the ``cl::opt`` also supports an +``unsigned getPosition()`` option that will provide the absolute position of +that option. You can apply the same approach as above with a ``cl::opt`` and a +``cl::list`` option as you can with two lists. + +.. _interpreter style options: +.. _cl::ConsumeAfter: +.. _this section for more information: + +The ``cl::ConsumeAfter`` modifier +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``cl::ConsumeAfter`` `formatting option`_ is used to construct programs that +use "interpreter style" option processing. With this style of option +processing, all arguments specified after the last positional argument are +treated as special interpreter arguments that are not interpreted by the command +line argument. + +As a concrete example, lets say we are developing a replacement for the standard +Unix Bourne shell (``/bin/sh``). To run ``/bin/sh``, first you specify options +to the shell itself (like ``-x`` which turns on trace output), then you specify +the name of the script to run, then you specify arguments to the script. These +arguments to the script are parsed by the Bourne shell command line option +processor, but are not interpreted as options to the shell itself. Using the +CommandLine library, we would specify this as: + +.. 
code-block:: c++
+
+  cl::opt<string> Script(cl::Positional, cl::desc("<input script>"), cl::init("-"));
+  cl::list<string> Argv(cl::ConsumeAfter, cl::desc("<program arguments>..."));
+  cl::opt<bool> Trace("x", cl::desc("Enable trace output"));
+
+which automatically provides the help output:
+
+::
+
+  USAGE: spiffysh [options] <input script> <program arguments>...
+
+  OPTIONS:
+    -help - display available options (-help-hidden for more)
+    -x    - Enable trace output
+
+At runtime, if we run our new shell replacement as '``spiffysh -x test.sh -a -x
+-y bar``', the ``Trace`` variable will be set to true, the ``Script`` variable
+will be set to "``test.sh``", and the ``Argv`` list will contain ``["-a", "-x",
+"-y", "bar"]``, because they were specified after the last positional argument
+(which is the script name).
+
+There are several limitations to when ``cl::ConsumeAfter`` options can be
+specified. For example, only one ``cl::ConsumeAfter`` can be specified per
+program, there must be at least one `positional argument`_ specified, there must
+not be any `cl::list`_ positional arguments, and the ``cl::ConsumeAfter`` option
+should be a `cl::list`_ option.
+
+.. _can be changed:
+.. _Internal vs External Storage:
+
+Internal vs External Storage
+----------------------------
+
+By default, all command line options automatically hold the value that they
+parse from the command line. This is very convenient in the common case,
+especially when combined with the ability to define command line options in the
+files that use them. This is called the internal storage model.
+
+Sometimes, however, it is nice to separate the command line option processing
+code from the storage of the value parsed. For example, let's say that we have a
+'``-debug``' option that we would like to use to enable debug information across
+the entire body of our program. In this case, the boolean value controlling the
+debug code should be globally accessible (in a header file, for example) yet the
+command line option processing code should not be exposed to all of these
+clients (requiring lots of .cpp files to ``#include CommandLine.h``).
+
+To do this, set up your .h file with your option, like this, for example:
+
+.. code-block:: c++
+
+  // DebugFlag.h - Get access to the '-debug' command line option
+  //
+
+  // DebugFlag - This boolean is set to true if the '-debug' command line option
+  // is specified. This should probably not be referenced directly, instead, use
+  // the DEBUG macro below.
+  //
+  extern bool DebugFlag;
+
+  // DEBUG macro - This macro should be used by code to emit debug information.
+  // If the '-debug' option is specified on the command line, and if this is a
+  // debug build, then the code specified as the option to the macro will be
+  // executed. Otherwise it will not be.
+  #ifdef NDEBUG
+  #define DEBUG(X)
+  #else
+  #define DEBUG(X) do { if (DebugFlag) { X; } } while (0)
+  #endif
+
+This allows clients to blissfully use the ``DEBUG()`` macro, or the
+``DebugFlag`` explicitly if they want to. Now we just need to be able to set
+the ``DebugFlag`` boolean when the option is set. To do this, we pass an
+additional argument to our command line argument processor, and we specify where
+to fill in with the `cl::location`_ attribute:
+
+.. code-block:: c++
+
+  bool DebugFlag;                  // the actual value
+  static cl::opt<bool, true>       // The parser
+  Debug("debug", cl::desc("Enable debug output"), cl::Hidden, cl::location(DebugFlag));
+
+In the above example, we specify "``true``" as the second argument to the
+`cl::opt`_ template, indicating that the template should not maintain a copy of
+the value itself. In addition to this, we specify the `cl::location`_
+attribute, so that ``DebugFlag`` is automatically set.
+
+Option Attributes
+-----------------
+
+This section describes the basic attributes that you can specify on options.
+
+* The option name attribute (which is required for all options, except
+  `positional options`_) specifies what the option name is. This option is
+  specified in simple double quotes:
+
+  .. code-block:: c++
+
+    cl::opt<bool> Quiet("quiet");
+
+.. _cl::desc(...):
+
+* The **cl::desc** attribute specifies a description for the option to be
+  shown in the ``-help`` output for the program.
+
+.. _cl::value_desc:
+
+* The **cl::value_desc** attribute specifies a string that can be used to
+  fine tune the ``-help`` output for a command line option. Look `here`_ for an
+  example.
+
+.. _cl::init:
+
+* The **cl::init** attribute specifies an initial value for a `scalar`_
+  option. If this attribute is not specified then the command line option value
+  defaults to the value created by the default constructor for the
+  type.
+
+  .. warning::
+
+    If you specify both **cl::init** and **cl::location** for an option, you
+    must specify **cl::location** first, so that when the command-line parser
+    sees **cl::init**, it knows where to put the initial value. (You will get an
+    error at runtime if you don't put them in the right order.)
+
+.. _cl::location:
+
+* The **cl::location** attribute specifies where to store the value for a parsed
+  command line option if using external storage. See the section on `Internal vs
+  External Storage`_ for more information.
+
+.. _cl::aliasopt:
+
+* The **cl::aliasopt** attribute specifies which option a `cl::alias`_ option is
+  an alias for.
+
+.. _cl::values:
+
+* The **cl::values** attribute specifies the string-to-value mapping to be used
+  by the generic parser. It takes a **clEnumValEnd terminated** list of
+  (option, value, description) triplets that specify the option name, the value
+  mapped to, and the description shown in the ``-help`` for the tool. Because
+  the generic parser is used most frequently with enum values, two macros are
+  often useful:
+
+  #. The **clEnumVal** macro is used as a nice simple way to specify a triplet
+     for an enum. This macro automatically makes the option name be the same as
+     the enum name. The first option to the macro is the enum, the second is
+     the description for the command line option.
+
+  #. The **clEnumValN** macro is used to specify macro options where the option
+     name doesn't equal the enum name. For this macro, the first argument is
+     the enum value, the second is the flag name, and the third is the
+     description.
+
+  You will get a compile time error if you try to use cl::values with a parser
+  that does not support it.
+
+.. _cl::multi_val:
+
+* The **cl::multi_val** attribute specifies that this option takes multiple
+  values (example: ``-sectalign segname sectname sectvalue``). This attribute
+  takes one unsigned argument - the number of values for the option. This
+  attribute is valid only on ``cl::list`` options (and will fail with a compile
+  error if you try to use it with other option types).
+  It is allowed to use all of the usual modifiers on multi-valued options
+  (besides ``cl::ValueDisallowed``, obviously).
+
+Option Modifiers
+----------------
+
+Option modifiers are the flags and expressions that you pass into the
+constructors for `cl::opt`_ and `cl::list`_. These modifiers give you the
+ability to tweak how options are parsed and how ``-help`` output is generated to
+fit your application well.
+
+These options fall into five main categories:
+
+#. Hiding an option from ``-help`` output
+
+#. Controlling the number of occurrences required and allowed
+
+#. Controlling whether or not a value must be specified
+
+#. Controlling other formatting options
+
+#. Miscellaneous option modifiers
+
+It is not possible to specify two options from the same category (you'll get a
+runtime error) to a single option, except for options in the miscellaneous
+category. The CommandLine library specifies defaults for all of these settings
+that are the most useful in practice and the most common, which means that you
+usually shouldn't have to worry about these.
+
+Hiding an option from ``-help`` output
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::NotHidden``, ``cl::Hidden``, and ``cl::ReallyHidden`` modifiers are
+used to control whether or not an option appears in the ``-help`` and
+``-help-hidden`` output for the compiled program:
+
+.. _cl::NotHidden:
+
+* The **cl::NotHidden** modifier (which is the default for `cl::opt`_ and
+  `cl::list`_ options) indicates the option is to appear in both help
+  listings.
+
+.. _cl::Hidden:
+
+* The **cl::Hidden** modifier (which is the default for `cl::alias`_ options)
+  indicates that the option should not appear in the ``-help`` output, but
+  should appear in the ``-help-hidden`` output.
+
+.. _cl::ReallyHidden:
+
+* The **cl::ReallyHidden** modifier indicates that the option should not appear
+  in any help output.
+
+Controlling the number of occurrences required and allowed
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This group of options is used to control how many times an option is allowed (or
+required) to be specified on the command line of your program. Specifying a
+value for this setting allows the CommandLine library to do error checking for
+you.
+
+The allowed values for this option group are:
+
+.. _cl::Optional:
+
+* The **cl::Optional** modifier (which is the default for the `cl::opt`_ and
+  `cl::alias`_ classes) indicates that your program will allow either zero or
+  one occurrence of the option to be specified.
+
+.. _cl::ZeroOrMore:
+
+* The **cl::ZeroOrMore** modifier (which is the default for the `cl::list`_
+  class) indicates that your program will allow the option to be specified zero
+  or more times.
+
+.. _cl::Required:
+
+* The **cl::Required** modifier indicates that the specified option must be
+  specified exactly one time.
+
+.. _cl::OneOrMore:
+
+* The **cl::OneOrMore** modifier indicates that the option must be specified at
+  least one time.
+
+* The **cl::ConsumeAfter** modifier is described in the `Positional arguments
+  section`_.
+
+If an option is not specified, then the value of the option is equal to the
+value specified by the `cl::init`_ attribute. If the ``cl::init`` attribute is
+not specified, the option value is initialized with the default constructor for
+the data type.
+
+If an option of the `cl::opt`_ class is specified multiple times, only the last
+value will be retained.
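+
+To illustrate the occurrence modifiers, here is a small sketch that is not taken
+from the compiler example above: a hypothetical tool that requires exactly one
+configuration file but accepts any number of ``-D`` values:
+
+.. code-block:: c++
+
+  #include "llvm/Support/CommandLine.h"
+  #include <string>
+  using namespace llvm;
+
+  // Exactly one occurrence of -config=<file> must be given, or
+  // ParseCommandLineOptions reports an error and exits.
+  static cl::opt<std::string> ConfigFile("config",
+      cl::desc("Configuration file"), cl::value_desc("filename"), cl::Required);
+
+  // -D may be given zero or more times; every occurrence is appended, in order.
+  static cl::list<std::string> Defines("D",
+      cl::desc("Preprocessor-style definitions"), cl::ZeroOrMore);
+
+  int main(int argc, char **argv) {
+    cl::ParseCommandLineOptions(argc, argv, " occurrence modifier example\n");
+    // ConfigFile holds the single required value; Defines holds 0..N values.
+    return 0;
+  }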
+
+Controlling whether or not a value must be specified
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This group of options is used to control whether or not the option allows a
+value to be present. In the case of the CommandLine library, a value is either
+specified with an equal sign (e.g. '``-index-depth=17``') or as a trailing
+string (e.g. '``-o a.out``').
+
+The allowed values for this option group are:
+
+.. _cl::ValueOptional:
+
+* The **cl::ValueOptional** modifier (which is the default for ``bool`` typed
+  options) specifies that it is acceptable to have a value, or not. A boolean
+  argument can be enabled just by appearing on the command line, or it can have
+  an explicit '``-foo=true``'. If an option is specified with this mode, it is
+  illegal for the value to be provided without the equal sign. Therefore
+  '``-foo true``' is illegal. To allow the '``-foo true``' form, you must use
+  the `cl::ValueRequired`_ modifier.
+
+.. _cl::ValueRequired:
+
+* The **cl::ValueRequired** modifier (which is the default for all other types
+  except for `unnamed alternatives using the generic parser`_) specifies that a
+  value must be provided. This mode informs the command line library that if an
+  option is not provided with an equal sign, the next argument provided must be
+  the value. This allows things like '``-o a.out``' to work.
+
+.. _cl::ValueDisallowed:
+
+* The **cl::ValueDisallowed** modifier (which is the default for `unnamed
+  alternatives using the generic parser`_) indicates that it is a runtime error
+  for the user to specify a value. This can be provided to disallow users from
+  providing values to boolean options (like '``-foo=true``').
+
+In general, the default values for this option group work just like you would
+want them to. As mentioned above, you can specify the `cl::ValueDisallowed`_
+modifier to a boolean argument to restrict your command line parser. These
+options are mostly useful when `extending the library`_.
+
+.. _formatting option:
+
+Controlling other formatting options
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The formatting option group is used to specify that the command line option has
+special abilities and is otherwise different from other command line arguments.
+As usual, you can specify at most one of these arguments.
+
+.. _cl::NormalFormatting:
+
+* The **cl::NormalFormatting** modifier (which is the default for all options)
+  specifies that this option is "normal".
+
+.. _cl::Positional:
+
+* The **cl::Positional** modifier specifies that this is a positional argument
+  that does not have a command line option associated with it. See the
+  `Positional Arguments`_ section for more information.
+
+* The **cl::ConsumeAfter** modifier specifies that this option is used to
+  capture "interpreter style" arguments. See `this section for more
+  information`_.
+
+.. _prefix:
+.. _cl::Prefix:
+
+* The **cl::Prefix** modifier specifies that this option prefixes its value.
+  With 'Prefix' options, the equal sign does not separate the value from the
+  option name specified. Instead, the value is everything after the prefix,
+  including any equal sign if present. This is useful for processing odd
+  arguments like ``-lmalloc`` and ``-L/usr/lib`` in a linker tool or
+  ``-DNAME=value`` in a compiler tool. Here, the '``l``', '``D``' and '``L``'
+  options are normal string (or list) options that have the **cl::Prefix**
+  modifier added to allow the CommandLine library to recognize them.
Note that + **cl::Prefix** options must not have the **cl::ValueDisallowed** modifier + specified. + +.. _grouping: +.. _cl::Grouping: + +* The **cl::Grouping** modifier is used to implement Unix-style tools (like + ``ls``) that have lots of single letter arguments, but only require a single + dash. For example, the '``ls -labF``' command actually enables four different + options, all of which are single letters. Note that **cl::Grouping** options + cannot have values. + +The CommandLine library does not restrict how you use the **cl::Prefix** or +**cl::Grouping** modifiers, but it is possible to specify ambiguous argument +settings. Thus, it is possible to have multiple letter options that are prefix +or grouping options, and they will still work as designed. + +To do this, the CommandLine library uses a greedy algorithm to parse the input +option into (potentially multiple) prefix and grouping options. The strategy +basically looks like this: + +:: + + parse(string OrigInput) { + + 1. string input = OrigInput; + 2. if (isOption(input)) return getOption(input).parse(); // Normal option + 3. while (!isOption(input) && !input.empty()) input.pop_back(); // Remove the last letter + 4. if (input.empty()) return error(); // No matching option + 5. if (getOption(input).isPrefix()) + return getOption(input).parse(input); + 6. while (!input.empty()) { // Must be grouping options + getOption(input).parse(); + OrigInput.erase(OrigInput.begin(), OrigInput.begin()+input.length()); + input = OrigInput; + while (!isOption(input) && !input.empty()) input.pop_back(); + } + 7. if (!OrigInput.empty()) error(); + + } + +Miscellaneous option modifiers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The miscellaneous option modifiers are the only flags where you can specify more +than one flag from the set: they are not mutually exclusive. These flags +specify boolean properties that modify the option. + +.. _cl::CommaSeparated: + +* The **cl::CommaSeparated** modifier indicates that any commas specified for an + option's value should be used to split the value up into multiple values for + the option. For example, these two options are equivalent when + ``cl::CommaSeparated`` is specified: "``-foo=a -foo=b -foo=c``" and + "``-foo=a,b,c``". This option only makes sense to be used in a case where the + option is allowed to accept one or more values (i.e. it is a `cl::list`_ + option). + +.. _cl::PositionalEatsArgs: + +* The **cl::PositionalEatsArgs** modifier (which only applies to positional + arguments, and only makes sense for lists) indicates that positional argument + should consume any strings after it (including strings that start with a "-") + up until another recognized positional argument. For example, if you have two + "eating" positional arguments, "``pos1``" and "``pos2``", the string "``-pos1 + -foo -bar baz -pos2 -bork``" would cause the "``-foo -bar -baz``" strings to + be applied to the "``-pos1``" option and the "``-bork``" string to be applied + to the "``-pos2``" option. + +.. _cl::Sink: + +* The **cl::Sink** modifier is used to handle unknown options. If there is at + least one option with ``cl::Sink`` modifier specified, the parser passes + unrecognized option strings to it as values instead of signaling an error. As + with ``cl::CommaSeparated``, this modifier only makes sense with a `cl::list`_ + option. + +So far, these are the only three miscellaneous option modifiers. + +.. 
_response files: + +Response files +^^^^^^^^^^^^^^ + +Some systems, such as certain variants of Microsoft Windows and some older +Unices have a relatively low limit on command-line length. It is therefore +customary to use the so-called 'response files' to circumvent this +restriction. These files are mentioned on the command-line (using the "@file") +syntax. The program reads these files and inserts the contents into argv, +thereby working around the command-line length limits. Response files are +enabled by an optional fourth argument to `cl::ParseEnvironmentOptions`_ and +`cl::ParseCommandLineOptions`_. + +Top-Level Classes and Functions +------------------------------- + +Despite all of the built-in flexibility, the CommandLine option library really +only consists of one function `cl::ParseCommandLineOptions`_) and three main +classes: `cl::opt`_, `cl::list`_, and `cl::alias`_. This section describes +these three classes in detail. + +.. _cl::ParseCommandLineOptions: + +The ``cl::ParseCommandLineOptions`` function +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``cl::ParseCommandLineOptions`` function is designed to be called directly +from ``main``, and is used to fill in the values of all of the command line +option variables once ``argc`` and ``argv`` are available. + +The ``cl::ParseCommandLineOptions`` function requires two parameters (``argc`` +and ``argv``), but may also take an optional third parameter which holds +`additional extra text`_ to emit when the ``-help`` option is invoked, and a +fourth boolean parameter that enables `response files`_. + +.. _cl::ParseEnvironmentOptions: + +The ``cl::ParseEnvironmentOptions`` function +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``cl::ParseEnvironmentOptions`` function has mostly the same effects as +`cl::ParseCommandLineOptions`_, except that it is designed to take values for +options from an environment variable, for those cases in which reading the +command line is not convenient or desired. It fills in the values of all the +command line option variables just like `cl::ParseCommandLineOptions`_ does. + +It takes four parameters: the name of the program (since ``argv`` may not be +available, it can't just look in ``argv[0]``), the name of the environment +variable to examine, the optional `additional extra text`_ to emit when the +``-help`` option is invoked, and the boolean switch that controls whether +`response files`_ should be read. + +``cl::ParseEnvironmentOptions`` will break the environment variable's value up +into words and then process them using `cl::ParseCommandLineOptions`_. +**Note:** Currently ``cl::ParseEnvironmentOptions`` does not support quoting, so +an environment variable containing ``-option "foo bar"`` will be parsed as three +words, ``-option``, ``"foo``, and ``bar"``, which is different from what you +would get from the shell with the same input. + +The ``cl::SetVersionPrinter`` function +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``cl::SetVersionPrinter`` function is designed to be called directly from +``main`` and *before* ``cl::ParseCommandLineOptions``. Its use is optional. It +simply arranges for a function to be called in response to the ``--version`` +option instead of having the ``CommandLine`` library print out the usual version +string for LLVM. This is useful for programs that are not part of LLVM but wish +to use the ``CommandLine`` facilities. 
Such programs should just define a small +function that takes no arguments and returns ``void`` and that prints out +whatever version information is appropriate for the program. Pass the address of +that function to ``cl::SetVersionPrinter`` to arrange for it to be called when +the ``--version`` option is given by the user. + +.. _cl::opt: +.. _scalar: + +The ``cl::opt`` class +^^^^^^^^^^^^^^^^^^^^^ + +The ``cl::opt`` class is the class used to represent scalar command line +options, and is the one used most of the time. It is a templated class which +can take up to three arguments (all except for the first have default values +though): + +.. code-block:: c++ + + namespace cl { + template <class DataType, bool ExternalStorage = false, + class ParserClass = parser<DataType> > + class opt; + } + +The first template argument specifies what underlying data type the command line +argument is, and is used to select a default parser implementation. The second +template argument is used to specify whether the option should contain the +storage for the option (the default) or whether external storage should be used +to contain the value parsed for the option (see `Internal vs External Storage`_ +for more information). + +The third template argument specifies which parser to use. The default value +selects an instantiation of the ``parser`` class based on the underlying data +type of the option. In general, this default works well for most applications, +so this option is only used when using a `custom parser`_. + +.. _lists of arguments: +.. _cl::list: + +The ``cl::list`` class +^^^^^^^^^^^^^^^^^^^^^^ + +The ``cl::list`` class is the class used to represent a list of command line +options. It too is a templated class which can take up to three arguments: + +.. code-block:: c++ + + namespace cl { + template <class DataType, class Storage = bool, + class ParserClass = parser<DataType> > + class list; + } + +This class works the exact same as the `cl::opt`_ class, except that the second +argument is the **type** of the external storage, not a boolean value. For this +class, the marker type '``bool``' is used to indicate that internal storage +should be used. + +.. _cl::bits: + +The ``cl::bits`` class +^^^^^^^^^^^^^^^^^^^^^^ + +The ``cl::bits`` class is the class used to represent a list of command line +options in the form of a bit vector. It is also a templated class which can +take up to three arguments: + +.. code-block:: c++ + + namespace cl { + template <class DataType, class Storage = bool, + class ParserClass = parser<DataType> > + class bits; + } + +This class works the exact same as the `cl::list`_ class, except that the second +argument must be of **type** ``unsigned`` if external storage is used. + +.. _cl::alias: + +The ``cl::alias`` class +^^^^^^^^^^^^^^^^^^^^^^^ + +The ``cl::alias`` class is a nontemplated class that is used to form aliases for +other arguments. + +.. code-block:: c++ + + namespace cl { + class alias; + } + +The `cl::aliasopt`_ attribute should be used to specify which option this is an +alias for. Alias arguments default to being `cl::Hidden`_, and use the aliased +options parser to do the conversion from string to data. + +.. _cl::extrahelp: + +The ``cl::extrahelp`` class +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``cl::extrahelp`` class is a nontemplated class that allows extra help text +to be printed out for the ``-help`` option. + +.. 
code-block:: c++ + + namespace cl { + struct extrahelp; + } + +To use the extrahelp, simply construct one with a ``const char*`` parameter to +the constructor. The text passed to the constructor will be printed at the +bottom of the help message, verbatim. Note that multiple ``cl::extrahelp`` +**can** be used, but this practice is discouraged. If your tool needs to print +additional help information, put all that help into a single ``cl::extrahelp`` +instance. + +For example: + +.. code-block:: c++ + + cl::extrahelp("\nADDITIONAL HELP:\n\n This is the extra help\n"); + +.. _different parser: +.. _discussed previously: + +Builtin parsers +--------------- + +Parsers control how the string value taken from the command line is translated +into a typed value, suitable for use in a C++ program. By default, the +CommandLine library uses an instance of ``parser<type>`` if the command line +option specifies that it uses values of type '``type``'. Because of this, +custom option processing is specified with specializations of the '``parser``' +class. + +The CommandLine library provides the following builtin parser specializations, +which are sufficient for most applications. It can, however, also be extended to +work with new data types and new ways of interpreting the same data. See the +`Writing a Custom Parser`_ for more details on this type of library extension. + +.. _enums: +.. _cl::parser: + +* The generic ``parser<t>`` parser can be used to map strings values to any data + type, through the use of the `cl::values`_ property, which specifies the + mapping information. The most common use of this parser is for parsing enum + values, which allows you to use the CommandLine library for all of the error + checking to make sure that only valid enum values are specified (as opposed to + accepting arbitrary strings). Despite this, however, the generic parser class + can be used for any data type. + +.. _boolean flags: +.. _bool parser: + +* The **parser<bool> specialization** is used to convert boolean strings to a + boolean value. Currently accepted strings are "``true``", "``TRUE``", + "``True``", "``1``", "``false``", "``FALSE``", "``False``", and "``0``". + +* The **parser<boolOrDefault> specialization** is used for cases where the value + is boolean, but we also need to know whether the option was specified at all. + boolOrDefault is an enum with 3 values, BOU_UNSET, BOU_TRUE and BOU_FALSE. + This parser accepts the same strings as **``parser<bool>``**. + +.. _strings: + +* The **parser<string> specialization** simply stores the parsed string into the + string value specified. No conversion or modification of the data is + performed. + +.. _integers: +.. _int: + +* The **parser<int> specialization** uses the C ``strtol`` function to parse the + string input. As such, it will accept a decimal number (with an optional '+' + or '-' prefix) which must start with a non-zero digit. It accepts octal + numbers, which are identified with a '``0``' prefix digit, and hexadecimal + numbers with a prefix of '``0x``' or '``0X``'. + +.. _doubles: +.. _float: +.. _double: + +* The **parser<double>** and **parser<float> specializations** use the standard + C ``strtod`` function to convert floating point strings into floating point + values. As such, a broad range of string formats is supported, including + exponential notation (ex: ``1.7e15``) and properly supports locales. + +.. _Extension Guide: +.. 
_extending the library:
+
+Extension Guide
+===============
+
+Although the CommandLine library has a lot of functionality built into it
+already (as discussed previously), one of its true strengths lies in its
+extensibility. This section discusses how the CommandLine library works under
+the covers and illustrates how to do some simple, common extensions.
+
+.. _Custom parsers:
+.. _custom parser:
+.. _Writing a Custom Parser:
+
+Writing a custom parser
+-----------------------
+
+One of the simplest and most common extensions is the use of a custom parser.
+As `discussed previously`_, parsers are the portion of the CommandLine library
+that turns string input from the user into a particular parsed data type,
+validating the input in the process.
+
+There are two ways to use a new parser:
+
+#. Specialize the `cl::parser`_ template for your custom data type.
+
+   This approach has the advantage that users of your custom data type will
+   automatically use your custom parser whenever they define an option with a
+   value type of your data type. The disadvantage of this approach is that it
+   doesn't work if your fundamental data type is something that is already
+   supported.
+
+#. Write an independent class, using it explicitly from options that need it.
+
+   This approach works well in situations where you would like to parse an
+   option using special syntax for a not-very-special data-type. The drawback
+   of this approach is that users of your parser have to be aware that they are
+   using your parser instead of the builtin ones.
+
+To guide the discussion, we will discuss a custom parser that accepts file
+sizes, specified with an optional unit after the numeric size. For example, we
+would like to parse "102kb", "41M", "1G" into the appropriate integer value. In
+this case, the underlying data type we want to parse into is '``unsigned``'. We
+choose approach #2 above because we don't want to make this the default for all
+``unsigned`` options.
+
+To start out, we declare our new ``FileSizeParser`` class:
+
+.. code-block:: c++
+
+  struct FileSizeParser : public cl::basic_parser<unsigned> {
+    // parse - Return true on error.
+    bool parse(cl::Option &O, const char *ArgName, const std::string &ArgValue,
+               unsigned &Val);
+  };
+
+Our new class inherits from the ``cl::basic_parser`` template class to fill in
+the default, boilerplate code for us. We give it the data type that we parse
+into, the last argument to the ``parse`` method, so that clients of our custom
+parser know what object type to pass in to the parse method. (Here we declare
+that we parse into '``unsigned``' variables.)
+
+For most purposes, the only method that must be implemented in a custom parser
+is the ``parse`` method. The ``parse`` method is called whenever the option is
+invoked, passing in the option itself, the option name, the string to parse, and
+a reference to a return value. If the string to parse is not well-formed, the
+parser should output an error message and return true. Otherwise it should
+return false and set '``Val``' to the parsed value. In our example, we
+implement ``parse`` as:
+
+..
code-block:: c++ + + bool FileSizeParser::parse(cl::Option &O, const char *ArgName, + const std::string &Arg, unsigned &Val) { + const char *ArgStart = Arg.c_str(); + char *End; + + // Parse integer part, leaving 'End' pointing to the first non-integer char + Val = (unsigned)strtol(ArgStart, &End, 0); + + while (1) { + switch (*End++) { + case 0: return false; // No error + case 'i': // Ignore the 'i' in KiB if people use that + case 'b': case 'B': // Ignore B suffix + break; + + case 'g': case 'G': Val *= 1024*1024*1024; break; + case 'm': case 'M': Val *= 1024*1024; break; + case 'k': case 'K': Val *= 1024; break; + + default: + // Print an error message if unrecognized character! + return O.error("'" + Arg + "' value invalid for file size argument!"); + } + } + } + +This function implements a very simple parser for the kinds of strings we are +interested in. Although it has some holes (it allows "``123KKK``" for example), +it is good enough for this example. Note that we use the option itself to print +out the error message (the ``error`` method always returns true) in order to get +a nice error message (shown below). Now that we have our parser class, we can +use it like this: + +.. code-block:: c++ + + static cl::opt<unsigned, false, FileSizeParser> + MFS("max-file-size", cl::desc("Maximum file size to accept"), + cl::value_desc("size")); + +Which adds this to the output of our program: + +:: + + OPTIONS: + -help - display available options (-help-hidden for more) + ... + -max-file-size=<size> - Maximum file size to accept + +And we can test that our parse works correctly now (the test program just prints +out the max-file-size argument value): + +:: + + $ ./test + MFS: 0 + $ ./test -max-file-size=123MB + MFS: 128974848 + $ ./test -max-file-size=3G + MFS: 3221225472 + $ ./test -max-file-size=dog + -max-file-size option: 'dog' value invalid for file size argument! + +It looks like it works. The error message that we get is nice and helpful, and +we seem to accept reasonable file sizes. This wraps up the "custom parser" +tutorial. + +Exploiting external storage +--------------------------- + +Several of the LLVM libraries define static ``cl::opt`` instances that will +automatically be included in any program that links with that library. This is +a feature. However, sometimes it is necessary to know the value of the command +line option outside of the library. In these cases the library does or should +provide an external storage location that is accessible to users of the +library. Examples of this include the ``llvm::DebugFlag`` exported by the +``lib/Support/Debug.cpp`` file and the ``llvm::TimePassesIsEnabled`` flag +exported by the ``lib/VMCore/PassManager.cpp`` file. + +.. todo:: + + TODO: complete this section + +.. _dynamically loaded options: + +Dynamically adding command line options + +.. 
todo:: + + TODO: fill in this section diff --git a/docs/CompilerWriterInfo.html b/docs/CompilerWriterInfo.html index 5fdb4fc..67da783 100644 --- a/docs/CompilerWriterInfo.html +++ b/docs/CompilerWriterInfo.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>Architecture/platform information for compiler writers</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -260,7 +260,7 @@ processors.</li> <a href="http://misha.brukman.net">Misha Brukman</a><br> <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-28 00:56:32 +0200 (Fri, 28 Oct 2011) $ + Last modified: $Date: 2012-04-19 22:20:34 +0200 (Thu, 19 Apr 2012) $ </address> </body> diff --git a/docs/DebuggingJITedCode.html b/docs/DebuggingJITedCode.html index ffb4cd9..652572c 100644 --- a/docs/DebuggingJITedCode.html +++ b/docs/DebuggingJITedCode.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>Debugging JITed Code With GDB</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -51,7 +51,7 @@ necessary debug information. <p>In order to debug code JIT-ed by LLVM, you need GDB 7.0 or newer, which is available on most modern distributions of Linux. The version of GDB that Apple -ships with XCode has been frozen at 6.3 for a while. LLDB may be a better +ships with Xcode has been frozen at 6.3 for a while. LLDB may be a better option for debugging JIT-ed code on Mac OS X. </p> @@ -178,7 +178,7 @@ Program exited with code 0170. <a href="mailto:reid.kleckner@gmail.com">Reid Kleckner</a>, <a href="mailto:eliben@gmail.com">Eli Bendersky</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-05-01 09:58:54 +0200 (Tue, 01 May 2012) $ + Last modified: $Date: 2012-05-13 16:36:15 +0200 (Sun, 13 May 2012) $ </address> </body> </html> diff --git a/docs/DeveloperPolicy.html b/docs/DeveloperPolicy.html deleted file mode 100644 index 264975e..0000000 --- a/docs/DeveloperPolicy.html +++ /dev/null @@ -1,642 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>LLVM Developer Policy</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> - -<h1>LLVM Developer Policy</h1> -<ol> - <li><a href="#introduction">Introduction</a></li> - <li><a href="#policies">Developer Policies</a> - <ol> - <li><a href="#informed">Stay Informed</a></li> - <li><a href="#patches">Making a Patch</a></li> - <li><a href="#reviews">Code Reviews</a></li> - <li><a href="#owners">Code Owners</a></li> - <li><a href="#testcases">Test Cases</a></li> - <li><a href="#quality">Quality</a></li> - <li><a href="#commitaccess">Obtaining Commit Access</a></li> - <li><a href="#newwork">Making a Major Change</a></li> - <li><a href="#incremental">Incremental Development</a></li> - <li><a href="#attribution">Attribution of Changes</a></li> - </ol></li> - <li><a href="#clp">Copyright, License, and Patents</a> - <ol> - <li><a href="#copyright">Copyright</a></li> - <li><a href="#license">License</a></li> - <li><a href="#patents">Patents</a></li> - </ol></li> -</ol> -<div class="doc_author">Written by the LLVM Oversight Team</div> - 
-<!--=========================================================================--> -<h2><a name="introduction">Introduction</a></h2> -<!--=========================================================================--> -<div> -<p>This document contains the LLVM Developer Policy which defines the project's - policy towards developers and their contributions. The intent of this policy - is to eliminate miscommunication, rework, and confusion that might arise from - the distributed nature of LLVM's development. By stating the policy in clear - terms, we hope each developer can know ahead of time what to expect when - making LLVM contributions. This policy covers all llvm.org subprojects, - including Clang, LLDB, libc++, etc.</p> -<p>This policy is also designed to accomplish the following objectives:</p> - -<ol> - <li>Attract both users and developers to the LLVM project.</li> - - <li>Make life as simple and easy for contributors as possible.</li> - - <li>Keep the top of Subversion trees as stable as possible.</li> - - <li>Establish awareness of the project's <a href="#clp">copyright, - license, and patent policies</a> with contributors to the project.</li> -</ol> - -<p>This policy is aimed at frequent contributors to LLVM. People interested in - contributing one-off patches can do so in an informal way by sending them to - the - <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits">llvm-commits - mailing list</a> and engaging another developer to see it through the - process.</p> -</div> - -<!--=========================================================================--> -<h2><a name="policies">Developer Policies</a></h2> -<!--=========================================================================--> -<div> -<p>This section contains policies that pertain to frequent LLVM developers. We - always welcome <a href="#patches">one-off patches</a> from people who do not - routinely contribute to LLVM, but we expect more from frequent contributors - to keep the system as efficient as possible for everyone. Frequent LLVM - contributors are expected to meet the following requirements in order for - LLVM to maintain a high standard of quality.<p> - -<!-- _______________________________________________________________________ --> -<h3><a name="informed">Stay Informed</a></h3> -<div> -<p>Developers should stay informed by reading at least the "dev" mailing list - for the projects you are interested in, such as - <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">llvmdev</a> for - LLVM, <a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev">cfe-dev</a> - for Clang, or <a - href="http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev">lldb-dev</a> - for LLDB. If you are doing anything more than just casual work on LLVM, it - is suggested that you also subscribe to the "commits" mailing list for the - subproject you're interested in, such as - <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits">llvm-commits</a>, - <a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits">cfe-commits</a>, - or <a href="http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits">lldb-commits</a>. 
- Reading the "commits" list and paying attention to changes being made by - others is a good way to see what other people are interested in and watching - the flow of the project as a whole.</p> - -<p>We recommend that active developers register an email account with - <a href="http://llvm.org/bugs/">LLVM Bugzilla</a> and preferably subscribe to - the <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmbugs">llvm-bugs</a> - email list to keep track of bugs and enhancements occurring in LLVM. We - really appreciate people who are proactive at catching incoming bugs in their - components and dealing with them promptly.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="patches">Making a Patch</a></h3> - -<div> -<p>When making a patch for review, the goal is to make it as easy for the - reviewer to read it as possible. As such, we recommend that you:</p> - -<ol> - <li>Make your patch against the Subversion trunk, not a branch, and not an old - version of LLVM. This makes it easy to apply the patch. For information - on how to check out SVN trunk, please see the <a - href="GettingStarted.html#checkout">Getting Started Guide</a>.</li> - - <li>Similarly, patches should be submitted soon after they are generated. Old - patches may not apply correctly if the underlying code changes between the - time the patch was created and the time it is applied.</li> - - <li>Patches should be made with <tt>svn diff</tt>, or similar. If you use - a different tool, make sure it uses the <tt>diff -u</tt> format and - that it doesn't contain clutter which makes it hard to read.</li> - - <li>If you are modifying generated files, such as the top-level - <tt>configure</tt> script, please separate out those changes into - a separate patch from the rest of your changes.</li> -</ol> - -<p>When sending a patch to a mailing list, it is a good idea to send it as an - <em>attachment</em> to the message, not embedded into the text of the - message. This ensures that your mailer will not mangle the patch when it - sends it (e.g. by making whitespace changes or by wrapping lines).</p> - -<p><em>For Thunderbird users:</em> Before submitting a patch, please open - <em>Preferences → Advanced → General → Config Editor</em>, - find the key <tt>mail.content_disposition_type</tt>, and set its value to - <tt>1</tt>. Without this setting, Thunderbird sends your attachment using - <tt>Content-Disposition: inline</tt> rather than <tt>Content-Disposition: - attachment</tt>. Apple Mail gamely displays such a file inline, making it - difficult to work with for reviewers using that program.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="reviews">Code Reviews</a></h3> -<div> -<p>LLVM has a code review policy. Code review is one way to increase the quality - of software. We generally follow these policies:</p> - -<ol> - <li>All developers are required to have significant changes reviewed before - they are committed to the repository.</li> - - <li>Code reviews are conducted by email, usually on the llvm-commits - list.</li> - - <li>Code can be reviewed either before it is committed or after. 
We expect - major changes to be reviewed before being committed, but smaller changes - (or changes where the developer owns the component) can be reviewed after - commit.</li> - - <li>The developer responsible for a code change is also responsible for making - all necessary review-related changes.</li> - - <li>Code review can be an iterative process, which continues until the patch - is ready to be committed.</li> -</ol> - -<p>Developers should participate in code reviews as both reviewers and - reviewees. If someone is kind enough to review your code, you should return - the favor for someone else. Note that anyone is welcome to review and give - feedback on a patch, but only people with Subversion write access can approve - it.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="owners">Code Owners</a></h3> -<div> - -<p>The LLVM Project relies on two features of its process to maintain rapid - development in addition to the high quality of its source base: the - combination of code review plus post-commit review for trusted maintainers. - Having both is a great way for the project to take advantage of the fact that - most people do the right thing most of the time, and only commit patches - without pre-commit review when they are confident they are right.</p> - -<p>The trick to this is that the project has to guarantee that all patches that - are committed are reviewed after they go in: you don't want everyone to - assume someone else will review it, allowing the patch to go unreviewed. To - solve this problem, we have a notion of an 'owner' for a piece of the code. - The sole responsibility of a code owner is to ensure that a commit to their - area of the code is appropriately reviewed, either by themself or by someone - else. The current code owners are:</p> - -<ol> - <li><b>Evan Cheng</b>: Code generator and all targets.</li> - - <li><b>Greg Clayton</b>: LLDB.</li> - - <li><b>Doug Gregor</b>: Clang Frontend Libraries.</li> - - <li><b>Howard Hinnant</b>: libc++.</li> - - <li><b>Anton Korobeynikov</b>: Exception handling, debug information, and - Windows codegen.</li> - - <li><b>Ted Kremenek</b>: Clang Static Analyzer.</li> - - <li><b>Chris Lattner</b>: Everything not covered by someone else.</li> - - <li><b>John McCall</b>: Clang LLVM IR generation.</li> - - <li><b>Jakob Olesen</b>: Register allocators and TableGen.</li> - - <li><b>Duncan Sands</b>: dragonegg and llvm-gcc 4.2.</li> - - <li><b>Peter Collingbourne</b>: libclc.</li> - - <li><b>Tobias Grosser</b>: polly.</li> -</ol> - -<p>Note that code ownership is completely different than reviewers: anyone can - review a piece of code, and we welcome code review from anyone who is - interested. Code owners are the "last line of defense" to guarantee that all - patches that are committed are actually reviewed.</p> - -<p>Being a code owner is a somewhat unglamorous position, but it is incredibly - important for the ongoing success of the project. Because people get busy, - interests change, and unexpected things happen, code ownership is purely - opt-in, and anyone can choose to resign their "title" at any time. For now, - we do not have an official policy on how one gets elected to be a code - owner.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="testcases">Test Cases</a></h3> -<div> -<p>Developers are required to create test cases for any bugs fixed and any new - features added. 
Some tips for getting your testcase approved:</p> - -<ol> - <li>All feature and regression test cases are added to the - <tt>llvm/test</tt> directory. The appropriate sub-directory should be - selected (see the <a href="TestingGuide.html">Testing Guide</a> for - details).</li> - - <li>Test cases should be written in <a href="LangRef.html">LLVM assembly - language</a> unless the feature or regression being tested requires - another language (e.g. the bug being fixed or feature being implemented is - in the llvm-gcc C++ front-end, in which case it must be written in - C++).</li> - - <li>Test cases, especially for regressions, should be reduced as much as - possible, by <a href="Bugpoint.html">bugpoint</a> or manually. It is - unacceptable to place an entire failing program into <tt>llvm/test</tt> as - this creates a <i>time-to-test</i> burden on all developers. Please keep - them short.</li> -</ol> - -<p>Note that llvm/test and clang/test are designed for regression and small - feature tests only. More extensive test cases (e.g., entire applications, - benchmarks, etc) - should be added to the <tt>llvm-test</tt> test suite. The llvm-test suite is - for coverage (correctness, performance, etc) testing, not feature or - regression testing.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="quality">Quality</a></h3> -<div> -<p>The minimum quality standards that any change must satisfy before being - committed to the main development branch are:</p> - -<ol> - <li>Code must adhere to the <a href="CodingStandards.html">LLVM Coding - Standards</a>.</li> - - <li>Code must compile cleanly (no errors, no warnings) on at least one - platform.</li> - - <li>Bug fixes and new features should <a href="#testcases">include a - testcase</a> so we know if the fix/feature ever regresses in the - future.</li> - - <li>Code must pass the <tt>llvm/test</tt> test suite.</li> - - <li>The code must not cause regressions on a reasonable subset of llvm-test, - where "reasonable" depends on the contributor's judgement and the scope of - the change (more invasive changes require more testing). A reasonable - subset might be something like - "<tt>llvm-test/MultiSource/Benchmarks</tt>".</li> -</ol> - -<p>Additionally, the committer is responsible for addressing any problems found - in the future that the change is responsible for. For example:</p> - -<ul> - <li>The code should compile cleanly on all supported platforms.</li> - - <li>The changes should not cause any correctness regressions in the - <tt>llvm-test</tt> suite and must not cause any major performance - regressions.</li> - - <li>The change set should not cause performance or correctness regressions for - the LLVM tools.</li> - - <li>The changes should not cause performance or correctness regressions in - code compiled by LLVM on all applicable targets.</li> - - <li>You are expected to address any <a href="http://llvm.org/bugs/">bugzilla - bugs</a> that result from your change.</li> -</ul> - -<p>We prefer for this to be handled before submission but understand that it - isn't possible to test all of this for every submission. Our build bots and - nightly testing infrastructure normally finds these problems. A good rule of - thumb is to check the nightly testers for regressions the day after your - change. Build bots will directly email you if a group of commits that - included yours caused a failure. 
You are expected to check the build bot - messages to see if they are your fault and, if so, fix the breakage.</p> - -<p>Commits that violate these quality standards (e.g. are very broken) may be - reverted. This is necessary when the change blocks other developers from - making progress. The developer is welcome to re-commit the change after the - problem has been fixed.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="commitaccess">Obtaining Commit Access</a></h3> -<div> - -<p>We grant commit access to contributors with a track record of submitting high - quality patches. If you would like commit access, please send an email to - <a href="mailto:sabre@nondot.org">Chris</a> with the following - information:</p> - -<ol> - <li>The user name you want to commit with, e.g. "hacker".</li> - - <li>The full name and email address you want message to llvm-commits to come - from, e.g. "J. Random Hacker <hacker@yoyodyne.com>".</li> - - <li>A "password hash" of the password you want to use, e.g. "2ACR96qjUqsyM". - Note that you don't ever tell us what your password is, you just give it - to us in an encrypted form. To get this, run "htpasswd" (a utility that - comes with apache) in crypt mode (often enabled with "-d"), or find a web - page that will do it for you.</li> -</ol> - -<p>Once you've been granted commit access, you should be able to check out an - LLVM tree with an SVN URL of "https://username@llvm.org/..." instead of the - normal anonymous URL of "http://llvm.org/...". The first time you commit - you'll have to type in your password. Note that you may get a warning from - SVN about an untrusted key, you can ignore this. To verify that your commit - access works, please do a test commit (e.g. change a comment or add a blank - line). Your first commit to a repository may require the autogenerated email - to be approved by a mailing list. This is normal, and will be done when - the mailing list owner has time.</p> - -<p>If you have recently been granted commit access, these policies apply:</p> - -<ol> - <li>You are granted <i>commit-after-approval</i> to all parts of LLVM. To get - approval, submit a <a href="#patches">patch</a> to - <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits">llvm-commits</a>. - When approved you may commit it yourself.</li> - - <li>You are allowed to commit patches without approval which you think are - obvious. This is clearly a subjective decision — we simply expect - you to use good judgement. Examples include: fixing build breakage, - reverting obviously broken patches, documentation/comment changes, any - other minor changes.</li> - - <li>You are allowed to commit patches without approval to those portions of - LLVM that you have contributed or maintain (i.e., have been assigned - responsibility for), with the proviso that such commits must not break the - build. This is a "trust but verify" policy and commits of this nature are - reviewed after they are committed.</li> - - <li>Multiple violations of these policies or a single egregious violation may - cause commit access to be revoked.</li> -</ol> - -<p>In any case, your changes are still subject to <a href="#reviews">code - review</a> (either before or after they are committed, depending on the - nature of the change). 
You are encouraged to review other peoples' patches - as well, but you aren't required to.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="newwork">Making a Major Change</a></h3> -<div> -<p>When a developer begins a major new project with the aim of contributing it - back to LLVM, s/he should inform the community with an email to - the <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">llvmdev</a> - email list, to the extent possible. The reason for this is to: - -<ol> - <li>keep the community informed about future changes to LLVM, </li> - - <li>avoid duplication of effort by preventing multiple parties working on the - same thing and not knowing about it, and</li> - - <li>ensure that any technical issues around the proposed work are discussed - and resolved before any significant work is done.</li> -</ol> - -<p>The design of LLVM is carefully controlled to ensure that all the pieces fit - together well and are as consistent as possible. If you plan to make a major - change to the way LLVM works or want to add a major new extension, it is a - good idea to get consensus with the development community before you start - working on it.</p> - -<p>Once the design of the new feature is finalized, the work itself should be - done as a series of <a href="#incremental">incremental changes</a>, not as a - long-term development branch.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="incremental">Incremental Development</a></h3> -<div> -<p>In the LLVM project, we do all significant changes as a series of incremental - patches. We have a strong dislike for huge changes or long-term development - branches. Long-term development branches have a number of drawbacks:</p> - -<ol> - <li>Branches must have mainline merged into them periodically. If the branch - development and mainline development occur in the same pieces of code, - resolving merge conflicts can take a lot of time.</li> - - <li>Other people in the community tend to ignore work on branches.</li> - - <li>Huge changes (produced when a branch is merged back onto mainline) are - extremely difficult to <a href="#reviews">code review</a>.</li> - - <li>Branches are not routinely tested by our nightly tester - infrastructure.</li> - - <li>Changes developed as monolithic large changes often don't work until the - entire set of changes is done. Breaking it down into a set of smaller - changes increases the odds that any of the work will be committed to the - main repository.</li> -</ol> - -<p>To address these problems, LLVM uses an incremental development style and we - require contributors to follow this practice when making a large/invasive - change. Some tips:</p> - -<ul> - <li>Large/invasive changes usually have a number of secondary changes that are - required before the big change can be made (e.g. API cleanup, etc). These - sorts of changes can often be done before the major change is done, - independently of that work.</li> - - <li>The remaining inter-related work should be decomposed into unrelated sets - of changes if possible. Once this is done, define the first increment and - get consensus on what the end goal of the change is.</li> - - <li>Each change in the set can be stand alone (e.g. to fix a bug), or part of - a planned series of changes that works towards the development goal.</li> - - <li>Each change should be kept as small as possible. 
This simplifies your work - (into a logical progression), simplifies code review and reduces the - chance that you will get negative feedback on the change. Small increments - also facilitate the maintenance of a high quality code base.</li> - - <li>Often, an independent precursor to a big change is to add a new API and - slowly migrate clients to use the new API. Each change to use the new API - is often "obvious" and can be committed without review. Once the new API - is in place and used, it is much easier to replace the underlying - implementation of the API. This implementation change is logically - separate from the API change.</li> -</ul> - -<p>If you are interested in making a large change, and this scares you, please - make sure to first <a href="#newwork">discuss the change/gather consensus</a> - then ask about the best way to go about making the change.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="attribution">Attribution of Changes</a></h3> -<div> -<p>We believe in correct attribution of contributions to their contributors. - However, we do not want the source code to be littered with random - attributions "this code written by J. Random Hacker" (this is noisy and - distracting). In practice, the revision control system keeps a perfect - history of who changed what, and the CREDITS.txt file describes higher-level - contributions. If you commit a patch for someone else, please say "patch - contributed by J. Random Hacker!" in the commit message.</p> - -<p>Overall, please do not add contributor names to the source code.</p> -</div> - -</div> - -<!--=========================================================================--> -<h2> - <a name="clp">Copyright, License, and Patents</a> -</h2> -<!--=========================================================================--> - -<div> - -<div class="doc_notes"> -<p style="text-align:center;font-weight:bold">NOTE: This section deals with - legal matters but does not provide legal advice. We are not lawyers — - please seek legal counsel from an attorney.</p> -</div> - -<div> -<p>This section addresses the issues of copyright, license and patents for the - LLVM project. The copyright for the code is held by the individual - contributors of the code and the terms of its license to LLVM users and - developers is the - <a href="http://www.opensource.org/licenses/UoI-NCSA.php">University of - Illinois/NCSA Open Source License</a> (with portions dual licensed under the - <a href="http://www.opensource.org/licenses/mit-license.php">MIT License</a>, - see below). As contributor to the LLVM project, you agree to allow any - contributions to the project to licensed under these terms.</p> - - -<!-- _______________________________________________________________________ --> -<h3><a name="copyright">Copyright</a></h3> -<div> - -<p>The LLVM project does not require copyright assignments, which means that the - copyright for the code in the project is held by its respective contributors - who have each agreed to release their contributed code under the terms of the - <a href="#license">LLVM License</a>.</p> - -<p>An implication of this is that the LLVM license is unlikely to ever change: - changing it would require tracking down all the contributors to LLVM and - getting them to agree that a license change is acceptable for their - contribution. 
Since there are no plans to change the license, this is not a - cause for concern.</p> - -<p>As a contributor to the project, this means that you (or your company) retain - ownership of the code you contribute, that it cannot be used in a way that - contradicts the license (which is a liberal BSD-style license), and that the - license for your contributions won't change without your approval in the - future.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="license">License</a></h3> -<div> -<p>We intend to keep LLVM perpetually open source and to use a liberal open - source license. <b>As a contributor to the project, you agree that any - contributions be licensed under the terms of the corresponding - subproject.</b> - All of the code in LLVM is available under the - <a href="http://www.opensource.org/licenses/UoI-NCSA.php">University of - Illinois/NCSA Open Source License</a>, which boils down to this:</p> - -<ul> - <li>You can freely distribute LLVM.</li> - <li>You must retain the copyright notice if you redistribute LLVM.</li> - <li>Binaries derived from LLVM must reproduce the copyright notice (e.g. in an - included readme file).</li> - <li>You can't use our names to promote your LLVM derived products.</li> - <li>There's no warranty on LLVM at all.</li> -</ul> - -<p>We believe this fosters the widest adoption of LLVM because it <b>allows - commercial products to be derived from LLVM</b> with few restrictions and - without a requirement for making any derived works also open source (i.e. - LLVM's license is not a "copyleft" license like the GPL). We suggest that you - read the <a href="http://www.opensource.org/licenses/UoI-NCSA.php">License</a> - if further clarification is needed.</p> - -<p>In addition to the UIUC license, the runtime library components of LLVM - (<b>compiler_rt, libc++, and libclc</b>) are also licensed under the <a - href="http://www.opensource.org/licenses/mit-license.php">MIT license</a>, - which does not contain the binary redistribution clause. As a user of these - runtime libraries, it means that you can choose to use the code under either - license (and thus don't need the binary redistribution clause), and as a - contributor to the code that you agree that any contributions to these - libraries be licensed under both licenses. We feel that this is important - for runtime libraries, because they are implicitly linked into applications - and therefore should not subject those applications to the binary - redistribution clause. This also means that it is ok to move code from (e.g.) - libc++ to the LLVM core without concern, but that code cannot be moved from - the LLVM core to libc++ without the copyright owner's permission. -</p> - -<p>Note that the LLVM Project does distribute llvm-gcc and dragonegg, <b>which - are GPL.</b> - This means that anything "linked" into llvm-gcc must itself be compatible - with the GPL, and must be releasable under the terms of the GPL. This - implies that <b>any code linked into llvm-gcc and distributed to others may - be subject to the viral aspects of the GPL</b> (for example, a proprietary - code generator linked into llvm-gcc must be made available under the GPL). 
- This is not a problem for code already distributed under a more liberal - license (like the UIUC license), and GPL-containing subprojects are kept - in separate SVN repositories whose LICENSE.txt files specifically indicate - that they contain GPL code.</p> - -<p>We have no plans to change the license of LLVM. If you have questions or - comments about the license, please contact the - <a href="mailto:llvmdev@cs.uiuc.edu">LLVM Developer's Mailing List</a>.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="patents">Patents</a></h3> -<div> -<p>To the best of our knowledge, LLVM does not infringe on any patents (we have - actually removed code from LLVM in the past that was found to infringe). - Having code in LLVM that infringes on patents would violate an important goal - of the project by making it hard or impossible to reuse the code for - arbitrary purposes (including commercial use).</p> - -<p>When contributing code, we expect contributors to notify us of any potential - for patent-related trouble with their changes (including from third parties). - If you or your employer own - the rights to a patent and would like to contribute code to LLVM that relies - on it, we require that the copyright owner sign an agreement that allows any - other user of LLVM to freely use your patent. Please contact - the <a href="mailto:llvm-oversight@cs.uiuc.edu">oversight group</a> for more - details.</p> -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - Written by the - <a href="mailto:llvm-oversight@cs.uiuc.edu">LLVM Oversight Group</a><br> - <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-03-27 13:25:16 +0200 (Tue, 27 Mar 2012) $ -</address> -</body> -</html> diff --git a/docs/DeveloperPolicy.rst b/docs/DeveloperPolicy.rst new file mode 100644 index 0000000..cda281a --- /dev/null +++ b/docs/DeveloperPolicy.rst @@ -0,0 +1,508 @@ +.. _developer_policy: + +===================== +LLVM Developer Policy +===================== + +.. contents:: + :local: + +Introduction +============ + +This document contains the LLVM Developer Policy which defines the project's +policy towards developers and their contributions. The intent of this policy is +to eliminate miscommunication, rework, and confusion that might arise from the +distributed nature of LLVM's development. By stating the policy in clear terms, +we hope each developer can know ahead of time what to expect when making LLVM +contributions. This policy covers all llvm.org subprojects, including Clang, +LLDB, libc++, etc. + +This policy is also designed to accomplish the following objectives: + +#. Attract both users and developers to the LLVM project. + +#. Make life as simple and easy for contributors as possible. + +#. Keep the top of Subversion trees as stable as possible. + +#. Establish awareness of the project's `copyright, license, and patent + policies`_ with contributors to the project. + +This policy is aimed at frequent contributors to LLVM. 
People interested in +contributing one-off patches can do so in an informal way by sending them to the +`llvm-commits mailing list +<http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>`_ and engaging another +developer to see it through the process. + +Developer Policies +================== + +This section contains policies that pertain to frequent LLVM developers. We +always welcome `one-off patches`_ from people who do not routinely contribute to +LLVM, but we expect more from frequent contributors to keep the system as +efficient as possible for everyone. Frequent LLVM contributors are expected to +meet the following requirements in order for LLVM to maintain a high standard of +quality. + +Stay Informed +------------- + +Developers should stay informed by reading at least the "dev" mailing list for +the projects you are interested in, such as `llvmdev +<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>`_ for LLVM, `cfe-dev +<http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev>`_ for Clang, or `lldb-dev +<http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev>`_ for LLDB. If you are +doing anything more than just casual work on LLVM, it is suggested that you also +subscribe to the "commits" mailing list for the subproject you're interested in, +such as `llvm-commits +<http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>`_, `cfe-commits +<http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits>`_, or `lldb-commits +<http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits>`_. Reading the +"commits" list and paying attention to changes being made by others is a good +way to see what other people are interested in and watching the flow of the +project as a whole. + +We recommend that active developers register an email account with `LLVM +Bugzilla <http://llvm.org/bugs/>`_ and preferably subscribe to the `llvm-bugs +<http://lists.cs.uiuc.edu/mailman/listinfo/llvmbugs>`_ email list to keep track +of bugs and enhancements occurring in LLVM. We really appreciate people who are +proactive at catching incoming bugs in their components and dealing with them +promptly. + +.. _patch: +.. _one-off patches: + +Making a Patch +-------------- + +When making a patch for review, the goal is to make it as easy for the reviewer +to read it as possible. As such, we recommend that you: + +#. Make your patch against the Subversion trunk, not a branch, and not an old + version of LLVM. This makes it easy to apply the patch. For information on + how to check out SVN trunk, please see the `Getting Started + Guide <GettingStarted.html#checkout>`_. + +#. Similarly, patches should be submitted soon after they are generated. Old + patches may not apply correctly if the underlying code changes between the + time the patch was created and the time it is applied. + +#. Patches should be made with ``svn diff``, or similar. If you use a + different tool, make sure it uses the ``diff -u`` format and that it + doesn't contain clutter which makes it hard to read. + +#. If you are modifying generated files, such as the top-level ``configure`` + script, please separate out those changes into a separate patch from the rest + of your changes. + +When sending a patch to a mailing list, it is a good idea to send it as an +*attachment* to the message, not embedded into the text of the message. This +ensures that your mailer will not mangle the patch when it sends it (e.g. by +making whitespace changes or by wrapping lines). 
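To make the recommendations above concrete, a typical sequence looks something
like the sketch below (the checkout directory and the patch file name are only
placeholders, not something this policy mandates)::

   # Check out (or update) Subversion trunk, then make your change.
   svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
   cd llvm
   #   ... edit the relevant files ...

   # Generate a unified diff from the top of the tree and send the resulting
   # file to llvm-commits as an attachment.
   svn diff > my-change.patch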
+ +*For Thunderbird users:* Before submitting a patch, please open *Preferences > +Advanced > General > Config Editor*, find the key +``mail.content_disposition_type``, and set its value to ``1``. Without this +setting, Thunderbird sends your attachment using ``Content-Disposition: inline`` +rather than ``Content-Disposition: attachment``. Apple Mail gamely displays such +a file inline, making it difficult to work with for reviewers using that +program. + +.. _code review: + +Code Reviews +------------ + +LLVM has a code review policy. Code review is one way to increase the quality of +software. We generally follow these policies: + +#. All developers are required to have significant changes reviewed before they + are committed to the repository. + +#. Code reviews are conducted by email, usually on the llvm-commits list. + +#. Code can be reviewed either before it is committed or after. We expect major + changes to be reviewed before being committed, but smaller changes (or + changes where the developer owns the component) can be reviewed after commit. + +#. The developer responsible for a code change is also responsible for making + all necessary review-related changes. + +#. Code review can be an iterative process, which continues until the patch is + ready to be committed. + +Developers should participate in code reviews as both reviewers and +reviewees. If someone is kind enough to review your code, you should return the +favor for someone else. Note that anyone is welcome to review and give feedback +on a patch, but only people with Subversion write access can approve it. + +Code Owners +----------- + +The LLVM Project relies on two features of its process to maintain rapid +development in addition to the high quality of its source base: the combination +of code review plus post-commit review for trusted maintainers. Having both is +a great way for the project to take advantage of the fact that most people do +the right thing most of the time, and only commit patches without pre-commit +review when they are confident they are right. + +The trick to this is that the project has to guarantee that all patches that are +committed are reviewed after they go in: you don't want everyone to assume +someone else will review it, allowing the patch to go unreviewed. To solve this +problem, we have a notion of an 'owner' for a piece of the code. The sole +responsibility of a code owner is to ensure that a commit to their area of the +code is appropriately reviewed, either by themself or by someone else. The list +of current code owners can be found in the file +`CODE_OWNERS.TXT <http://llvm.org/viewvc/llvm-project/llvm/trunk/CODE_OWNERS.TXT?view=markup>`_ +in the root of the LLVM source tree. + +Note that code ownership is completely different than reviewers: anyone can +review a piece of code, and we welcome code review from anyone who is +interested. Code owners are the "last line of defense" to guarantee that all +patches that are committed are actually reviewed. + +Being a code owner is a somewhat unglamorous position, but it is incredibly +important for the ongoing success of the project. Because people get busy, +interests change, and unexpected things happen, code ownership is purely opt-in, +and anyone can choose to resign their "title" at any time. For now, we do not +have an official policy on how one gets elected to be a code owner. + +.. _include a testcase: + +Test Cases +---------- + +Developers are required to create test cases for any bugs fixed and any new +features added. 
Some tips for getting your testcase approved: + +* All feature and regression test cases are added to the ``llvm/test`` + directory. The appropriate sub-directory should be selected (see the `Testing + Guide <TestingGuide.html>`_ for details). + +* Test cases should be written in `LLVM assembly language <LangRef.html>`_ + unless the feature or regression being tested requires another language + (e.g. the bug being fixed or feature being implemented is in the llvm-gcc C++ + front-end, in which case it must be written in C++). + +* Test cases, especially for regressions, should be reduced as much as possible, + by `bugpoint <Bugpoint.html>`_ or manually. It is unacceptable to place an + entire failing program into ``llvm/test`` as this creates a *time-to-test* + burden on all developers. Please keep them short. + +Note that llvm/test and clang/test are designed for regression and small feature +tests only. More extensive test cases (e.g., entire applications, benchmarks, +etc) should be added to the ``llvm-test`` test suite. The llvm-test suite is +for coverage (correctness, performance, etc) testing, not feature or regression +testing. + +Quality +------- + +The minimum quality standards that any change must satisfy before being +committed to the main development branch are: + +#. Code must adhere to the `LLVM Coding Standards <CodingStandards.html>`_. + +#. Code must compile cleanly (no errors, no warnings) on at least one platform. + +#. Bug fixes and new features should `include a testcase`_ so we know if the + fix/feature ever regresses in the future. + +#. Code must pass the ``llvm/test`` test suite. + +#. The code must not cause regressions on a reasonable subset of llvm-test, + where "reasonable" depends on the contributor's judgement and the scope of + the change (more invasive changes require more testing). A reasonable subset + might be something like "``llvm-test/MultiSource/Benchmarks``". + +Additionally, the committer is responsible for addressing any problems found in +the future that the change is responsible for. For example: + +* The code should compile cleanly on all supported platforms. + +* The changes should not cause any correctness regressions in the ``llvm-test`` + suite and must not cause any major performance regressions. + +* The change set should not cause performance or correctness regressions for the + LLVM tools. + +* The changes should not cause performance or correctness regressions in code + compiled by LLVM on all applicable targets. + +* You are expected to address any `Bugzilla bugs <http://llvm.org/bugs/>`_ that + result from your change. + +We prefer for this to be handled before submission but understand that it isn't +possible to test all of this for every submission. Our build bots and nightly +testing infrastructure normally finds these problems. A good rule of thumb is +to check the nightly testers for regressions the day after your change. Build +bots will directly email you if a group of commits that included yours caused a +failure. You are expected to check the build bot messages to see if they are +your fault and, if so, fix the breakage. + +Commits that violate these quality standards (e.g. are very broken) may be +reverted. This is necessary when the change blocks other developers from making +progress. The developer is welcome to re-commit the change after the problem has +been fixed. + +Obtaining Commit Access +----------------------- + +We grant commit access to contributors with a track record of submitting high +quality patches. 
If you would like commit access, please send an email to
+`Chris <mailto:sabre@nondot.org>`_ with the following information:
+
+#. The user name you want to commit with, e.g. "hacker".
+
+#. The full name and email address you want messages to llvm-commits to come
+   from, e.g. "J. Random Hacker <hacker@yoyodyne.com>".
+
+#. A "password hash" of the password you want to use, e.g. "``2ACR96qjUqsyM``".
+   Note that you don't ever tell us what your password is, you just give it to
+   us in an encrypted form. To get this, run "``htpasswd``" (a utility that
+   comes with apache) in crypt mode (often enabled with "``-d``"), or find a web
+   page that will do it for you.
+
+Once you've been granted commit access, you should be able to check out an LLVM
+tree with an SVN URL of "https://username@llvm.org/..." instead of the normal
+anonymous URL of "http://llvm.org/...". The first time you commit you'll have
+to type in your password. Note that you may get a warning from SVN about an
+untrusted key; you can ignore this. To verify that your commit access works,
+please do a test commit (e.g. change a comment or add a blank line). Your first
+commit to a repository may require the autogenerated email to be approved by a
+mailing list. This is normal, and will be done when the mailing list owner has
+time.
+
+If you have recently been granted commit access, these policies apply:
+
+#. You are granted *commit-after-approval* to all parts of LLVM. To get
+   approval, submit a `patch`_ to `llvm-commits
+   <http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>`_. When approved
+   you may commit it yourself.
+
+#. You are allowed to commit patches without approval which you think are
+   obvious. This is clearly a subjective decision --- we simply expect you to
+   use good judgement. Examples include: fixing build breakage, reverting
+   obviously broken patches, documentation/comment changes, any other minor
+   changes.
+
+#. You are allowed to commit patches without approval to those portions of LLVM
+   that you have contributed or maintain (i.e., have been assigned
+   responsibility for), with the proviso that such commits must not break the
+   build. This is a "trust but verify" policy and commits of this nature are
+   reviewed after they are committed.
+
+#. Multiple violations of these policies or a single egregious violation may
+   cause commit access to be revoked.
+
+In any case, your changes are still subject to `code review`_ (either before or
+after they are committed, depending on the nature of the change). You are
+encouraged to review other peoples' patches as well, but you aren't required
+to.
+
+.. _discuss the change/gather consensus:
+
+Making a Major Change
+---------------------
+
+When a developer begins a major new project with the aim of contributing it back
+to LLVM, s/he should inform the community with an email to the `llvmdev
+<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>`_ email list, to the extent
+possible. The reason for this is to:
+
+#. keep the community informed about future changes to LLVM,
+
+#. avoid duplication of effort by preventing multiple parties working on the
+   same thing and not knowing about it, and
+
+#. ensure that any technical issues around the proposed work are discussed and
+   resolved before any significant work is done.
+
+The design of LLVM is carefully controlled to ensure that all the pieces fit
+together well and are as consistent as possible.
If you plan to make a major +change to the way LLVM works or want to add a major new extension, it is a good +idea to get consensus with the development community before you start working on +it. + +Once the design of the new feature is finalized, the work itself should be done +as a series of `incremental changes`_, not as a long-term development branch. + +.. _incremental changes: + +Incremental Development +----------------------- + +In the LLVM project, we do all significant changes as a series of incremental +patches. We have a strong dislike for huge changes or long-term development +branches. Long-term development branches have a number of drawbacks: + +#. Branches must have mainline merged into them periodically. If the branch + development and mainline development occur in the same pieces of code, + resolving merge conflicts can take a lot of time. + +#. Other people in the community tend to ignore work on branches. + +#. Huge changes (produced when a branch is merged back onto mainline) are + extremely difficult to `code review`_. + +#. Branches are not routinely tested by our nightly tester infrastructure. + +#. Changes developed as monolithic large changes often don't work until the + entire set of changes is done. Breaking it down into a set of smaller + changes increases the odds that any of the work will be committed to the main + repository. + +To address these problems, LLVM uses an incremental development style and we +require contributors to follow this practice when making a large/invasive +change. Some tips: + +* Large/invasive changes usually have a number of secondary changes that are + required before the big change can be made (e.g. API cleanup, etc). These + sorts of changes can often be done before the major change is done, + independently of that work. + +* The remaining inter-related work should be decomposed into unrelated sets of + changes if possible. Once this is done, define the first increment and get + consensus on what the end goal of the change is. + +* Each change in the set can be stand alone (e.g. to fix a bug), or part of a + planned series of changes that works towards the development goal. + +* Each change should be kept as small as possible. This simplifies your work + (into a logical progression), simplifies code review and reduces the chance + that you will get negative feedback on the change. Small increments also + facilitate the maintenance of a high quality code base. + +* Often, an independent precursor to a big change is to add a new API and slowly + migrate clients to use the new API. Each change to use the new API is often + "obvious" and can be committed without review. Once the new API is in place + and used, it is much easier to replace the underlying implementation of the + API. This implementation change is logically separate from the API + change. + +If you are interested in making a large change, and this scares you, please make +sure to first `discuss the change/gather consensus`_ then ask about the best way +to go about making the change. + +Attribution of Changes +---------------------- + +We believe in correct attribution of contributions to their contributors. +However, we do not want the source code to be littered with random attributions +"this code written by J. Random Hacker" (this is noisy and distracting). In +practice, the revision control system keeps a perfect history of who changed +what, and the CREDITS.txt file describes higher-level contributions. 
If you
+commit a patch for someone else, please say "patch contributed by J. Random
+Hacker!" in the commit message.
+
+Overall, please do not add contributor names to the source code.
+
+.. _copyright, license, and patent policies:
+
+Copyright, License, and Patents
+===============================
+
+.. note::
+
+   This section deals with legal matters but does not provide legal advice. We
+   are not lawyers --- please seek legal counsel from an attorney.
+
+This section addresses the issues of copyright, license and patents for the LLVM
+project. The copyright for the code is held by the individual contributors of
+the code, and the terms of its license to LLVM users and developers are the
+`University of Illinois/NCSA Open Source License
+<http://www.opensource.org/licenses/UoI-NCSA.php>`_ (with portions dual licensed
+under the `MIT License <http://www.opensource.org/licenses/mit-license.php>`_,
+see below). As a contributor to the LLVM project, you agree to allow any
+contributions to the project to be licensed under these terms.
+
+Copyright
+---------
+
+The LLVM project does not require copyright assignments, which means that the
+copyright for the code in the project is held by its respective contributors who
+have each agreed to release their contributed code under the terms of the `LLVM
+License`_.
+
+An implication of this is that the LLVM license is unlikely to ever change:
+changing it would require tracking down all the contributors to LLVM and getting
+them to agree that a license change is acceptable for their contribution. Since
+there are no plans to change the license, this is not a cause for concern.
+
+As a contributor to the project, this means that you (or your company) retain
+ownership of the code you contribute, that it cannot be used in a way that
+contradicts the license (which is a liberal BSD-style license), and that the
+license for your contributions won't change without your approval in the
+future.
+
+.. _LLVM License:
+
+License
+-------
+
+We intend to keep LLVM perpetually open source and to use a liberal open source
+license. **As a contributor to the project, you agree that any contributions be
+licensed under the terms of the corresponding subproject.** All of the code in
+LLVM is available under the `University of Illinois/NCSA Open Source License
+<http://www.opensource.org/licenses/UoI-NCSA.php>`_, which boils down to
+this:
+
+* You can freely distribute LLVM.
+* You must retain the copyright notice if you redistribute LLVM.
+* Binaries derived from LLVM must reproduce the copyright notice (e.g. in an
+  included readme file).
+* You can't use our names to promote your LLVM derived products.
+* There's no warranty on LLVM at all.
+
+We believe this fosters the widest adoption of LLVM because it **allows
+commercial products to be derived from LLVM** with few restrictions and without
+a requirement for making any derived works also open source (i.e. LLVM's
+license is not a "copyleft" license like the GPL). We suggest that you read the
+`License <http://www.opensource.org/licenses/UoI-NCSA.php>`_ if further
+clarification is needed.
+
+In addition to the UIUC license, the runtime library components of LLVM
+(**compiler_rt, libc++, and libclc**) are also licensed under the `MIT License
+<http://www.opensource.org/licenses/mit-license.php>`_, which does not contain
+the binary redistribution clause.
As a user of these runtime libraries, it +means that you can choose to use the code under either license (and thus don't +need the binary redistribution clause), and as a contributor to the code that +you agree that any contributions to these libraries be licensed under both +licenses. We feel that this is important for runtime libraries, because they +are implicitly linked into applications and therefore should not subject those +applications to the binary redistribution clause. This also means that it is ok +to move code from (e.g.) libc++ to the LLVM core without concern, but that code +cannot be moved from the LLVM core to libc++ without the copyright owner's +permission. + +Note that the LLVM Project does distribute llvm-gcc and dragonegg, **which are +GPL.** This means that anything "linked" into llvm-gcc must itself be compatible +with the GPL, and must be releasable under the terms of the GPL. This implies +that **any code linked into llvm-gcc and distributed to others may be subject to +the viral aspects of the GPL** (for example, a proprietary code generator linked +into llvm-gcc must be made available under the GPL). This is not a problem for +code already distributed under a more liberal license (like the UIUC license), +and GPL-containing subprojects are kept in separate SVN repositories whose +LICENSE.txt files specifically indicate that they contain GPL code. + +We have no plans to change the license of LLVM. If you have questions or +comments about the license, please contact the `LLVM Developer's Mailing +List <mailto:llvmdev@cs.uiuc.edu>`_. + +Patents +------- + +To the best of our knowledge, LLVM does not infringe on any patents (we have +actually removed code from LLVM in the past that was found to infringe). Having +code in LLVM that infringes on patents would violate an important goal of the +project by making it hard or impossible to reuse the code for arbitrary purposes +(including commercial use). + +When contributing code, we expect contributors to notify us of any potential for +patent-related trouble with their changes (including from third parties). If +you or your employer own the rights to a patent and would like to contribute +code to LLVM that relies on it, we require that the copyright owner sign an +agreement that allows any other user of LLVM to freely use your patent. Please +contact the `oversight group <mailto:llvm-oversight@cs.uiuc.edu>`_ for more +details. 
diff --git a/docs/ExceptionHandling.html b/docs/ExceptionHandling.html deleted file mode 100644 index 49e6b01..0000000 --- a/docs/ExceptionHandling.html +++ /dev/null @@ -1,563 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <title>Exception Handling in LLVM</title> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <meta name="description" - content="Exception Handling in LLVM."> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> - -<body> - -<h1>Exception Handling in LLVM</h1> - -<table class="layout" style="width:100%"> - <tr class="layout"> - <td class="left"> -<ul> - <li><a href="#introduction">Introduction</a> - <ol> - <li><a href="#itanium">Itanium ABI Zero-cost Exception Handling</a></li> - <li><a href="#sjlj">Setjmp/Longjmp Exception Handling</a></li> - <li><a href="#overview">Overview</a></li> - </ol></li> - <li><a href="#codegen">LLVM Code Generation</a> - <ol> - <li><a href="#throw">Throw</a></li> - <li><a href="#try_catch">Try/Catch</a></li> - <li><a href="#cleanups">Cleanups</a></li> - <li><a href="#throw_filters">Throw Filters</a></li> - <li><a href="#restrictions">Restrictions</a></li> - </ol></li> - <li><a href="#format_common_intrinsics">Exception Handling Intrinsics</a> - <ol> - <li><a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a></li> - <li><a href="#llvm_eh_sjlj_setjmp"><tt>llvm.eh.sjlj.setjmp</tt></a></li> - <li><a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a></li> - <li><a href="#llvm_eh_sjlj_lsda"><tt>llvm.eh.sjlj.lsda</tt></a></li> - <li><a href="#llvm_eh_sjlj_callsite"><tt>llvm.eh.sjlj.callsite</tt></a></li> - </ol></li> - <li><a href="#asm">Asm Table Formats</a> - <ol> - <li><a href="#unwind_tables">Exception Handling Frame</a></li> - <li><a href="#exception_tables">Exception Tables</a></li> - </ol></li> -</ul> -</td> -</tr></table> - -<div class="doc_author"> - <p>Written by the <a href="http://llvm.org/">LLVM Team</a></p> -</div> - - -<!-- *********************************************************************** --> -<h2><a name="introduction">Introduction</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>This document is the central repository for all information pertaining to - exception handling in LLVM. It describes the format that LLVM exception - handling information takes, which is useful for those interested in creating - front-ends or dealing directly with the information. Further, this document - provides specific examples of what exception handling information is used for - in C and C++.</p> - -<!-- ======================================================================= --> -<h3> - <a name="itanium">Itanium ABI Zero-cost Exception Handling</a> -</h3> - -<div> - -<p>Exception handling for most programming languages is designed to recover from - conditions that rarely occur during general use of an application. To that - end, exception handling should not interfere with the main flow of an - application's algorithm by performing checkpointing tasks, such as saving the - current pc or register state.</p> - -<p>The Itanium ABI Exception Handling Specification defines a methodology for - providing outlying data in the form of exception tables without inlining - speculative exception handling code in the flow of an application's main - algorithm. 
Thus, the specification is said to add "zero-cost" to the normal - execution of an application.</p> - -<p>A more complete description of the Itanium ABI exception handling runtime - support of can be found at - <a href="http://www.codesourcery.com/cxx-abi/abi-eh.html">Itanium C++ ABI: - Exception Handling</a>. A description of the exception frame format can be - found at - <a href="http://refspecs.freestandards.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html">Exception - Frames</a>, with details of the DWARF 4 specification at - <a href="http://dwarfstd.org/Dwarf4Std.php">DWARF 4 Standard</a>. - A description for the C++ exception table formats can be found at - <a href="http://www.codesourcery.com/cxx-abi/exceptions.pdf">Exception Handling - Tables</a>.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="sjlj">Setjmp/Longjmp Exception Handling</a> -</h3> - -<div> - -<p>Setjmp/Longjmp (SJLJ) based exception handling uses LLVM intrinsics - <a href="#llvm_eh_sjlj_setjmp"><tt>llvm.eh.sjlj.setjmp</tt></a> and - <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a> to - handle control flow for exception handling.</p> - -<p>For each function which does exception processing — be - it <tt>try</tt>/<tt>catch</tt> blocks or cleanups — that function - registers itself on a global frame list. When exceptions are unwinding, the - runtime uses this list to identify which functions need processing.<p> - -<p>Landing pad selection is encoded in the call site entry of the function - context. The runtime returns to the function via - <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a>, where - a switch table transfers control to the appropriate landing pad based on - the index stored in the function context.</p> - -<p>In contrast to DWARF exception handling, which encodes exception regions - and frame information in out-of-line tables, SJLJ exception handling - builds and removes the unwind frame context at runtime. This results in - faster exception handling at the expense of slower execution when no - exceptions are thrown. As exceptions are, by their nature, intended for - uncommon code paths, DWARF exception handling is generally preferred to - SJLJ.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="overview">Overview</a> -</h3> - -<div> - -<p>When an exception is thrown in LLVM code, the runtime does its best to find a - handler suited to processing the circumstance.</p> - -<p>The runtime first attempts to find an <i>exception frame</i> corresponding to - the function where the exception was thrown. If the programming language - supports exception handling (e.g. C++), the exception frame contains a - reference to an exception table describing how to process the exception. If - the language does not support exception handling (e.g. C), or if the - exception needs to be forwarded to a prior activation, the exception frame - contains information about how to unwind the current activation and restore - the state of the prior activation. This process is repeated until the - exception is handled. If the exception is not handled and no activations - remain, then the application is terminated with an appropriate error - message.</p> - -<p>Because different programming languages have different behaviors when - handling exceptions, the exception handling ABI provides a mechanism for - supplying <i>personalities</i>. 
An exception handling personality is defined - by way of a <i>personality function</i> (e.g. <tt>__gxx_personality_v0</tt> - in C++), which receives the context of the exception, an <i>exception - structure</i> containing the exception object type and value, and a reference - to the exception table for the current function. The personality function - for the current compile unit is specified in a <i>common exception - frame</i>.</p> - -<p>The organization of an exception table is language dependent. For C++, an - exception table is organized as a series of code ranges defining what to do - if an exception occurs in that range. Typically, the information associated - with a range defines which types of exception objects (using C++ <i>type - info</i>) that are handled in that range, and an associated action that - should take place. Actions typically pass control to a <i>landing - pad</i>.</p> - -<p>A landing pad corresponds roughly to the code found in the <tt>catch</tt> - portion of a <tt>try</tt>/<tt>catch</tt> sequence. When execution resumes at - a landing pad, it receives an <i>exception structure</i> and a - <i>selector value</i> corresponding to the <i>type</i> of exception - thrown. The selector is then used to determine which <i>catch</i> should - actually process the exception.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h2> - <a name="codegen">LLVM Code Generation</a> -</h2> - -<div> - -<p>From a C++ developer's perspective, exceptions are defined in terms of the - <tt>throw</tt> and <tt>try</tt>/<tt>catch</tt> statements. In this section - we will describe the implementation of LLVM exception handling in terms of - C++ examples.</p> - -<!-- ======================================================================= --> -<h3> - <a name="throw">Throw</a> -</h3> - -<div> - -<p>Languages that support exception handling typically provide a <tt>throw</tt> - operation to initiate the exception process. Internally, a <tt>throw</tt> - operation breaks down into two steps.</p> - -<ol> - <li>A request is made to allocate exception space for an exception structure. - This structure needs to survive beyond the current activation. This - structure will contain the type and value of the object being thrown.</li> - - <li>A call is made to the runtime to raise the exception, passing the - exception structure as an argument.</li> -</ol> - -<p>In C++, the allocation of the exception structure is done by the - <tt>__cxa_allocate_exception</tt> runtime function. The exception raising is - handled by <tt>__cxa_throw</tt>. The type of the exception is represented - using a C++ RTTI structure.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="try_catch">Try/Catch</a> -</h3> - -<div> - -<p>A call within the scope of a <i>try</i> statement can potentially raise an - exception. In those circumstances, the LLVM C++ front-end replaces the call - with an <tt>invoke</tt> instruction. Unlike a call, the <tt>invoke</tt> has - two potential continuation points:</p> - -<ol> - <li>where to continue when the call succeeds as per normal, and</li> - - <li>where to continue if the call raises an exception, either by a throw or - the unwinding of a throw</li> -</ol> - -<p>The term used to define a the place where an <tt>invoke</tt> continues after - an exception is called a <i>landing pad</i>. 
LLVM landing pads are - conceptually alternative function entry points where an exception structure - reference and a type info index are passed in as arguments. The landing pad - saves the exception structure reference and then proceeds to select the catch - block that corresponds to the type info of the exception object.</p> - -<p>The LLVM <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> - instruction</a> is used to convey information about the landing pad to the - back end. For C++, the <tt>landingpad</tt> instruction returns a pointer and - integer pair corresponding to the pointer to the <i>exception structure</i> - and the <i>selector value</i> respectively.</p> - -<p>The <tt>landingpad</tt> instruction takes a reference to the personality - function to be used for this <tt>try</tt>/<tt>catch</tt> sequence. The - remainder of the instruction is a list of <i>cleanup</i>, <i>catch</i>, - and <i>filter</i> clauses. The exception is tested against the clauses - sequentially from first to last. The selector value is a positive number if - the exception matched a type info, a negative number if it matched a filter, - and zero if it matched a cleanup. If nothing is matched, the behavior of - the program is <a href="#restrictions">undefined</a>. If a type info matched, - then the selector value is the index of the type info in the exception table, - which can be obtained using the - <a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a> intrinsic.</p> - -<p>Once the landing pad has the type info selector, the code branches to the - code for the first catch. The catch then checks the value of the type info - selector against the index of type info for that catch. Since the type info - index is not known until all the type infos have been gathered in the - backend, the catch code must call the - <a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a> intrinsic to - determine the index for a given type info. If the catch fails to match the - selector then control is passed on to the next catch.</p> - -<p>Finally, the entry and exit of catch code is bracketed with calls to - <tt>__cxa_begin_catch</tt> and <tt>__cxa_end_catch</tt>.</p> - -<ul> - <li><tt>__cxa_begin_catch</tt> takes an exception structure reference as an - argument and returns the value of the exception object.</li> - - <li><tt>__cxa_end_catch</tt> takes no arguments. This function:<br><br> - <ol> - <li>Locates the most recently caught exception and decrements its handler - count,</li> - <li>Removes the exception from the <i>caught</i> stack if the handler - count goes to zero, and</li> - <li>Destroys the exception if the handler count goes to zero and the - exception was not re-thrown by throw.</li> - </ol> - <p><b>Note:</b> a rethrow from within the catch may replace this call with - a <tt>__cxa_rethrow</tt>.</p></li> -</ul> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="cleanups">Cleanups</a> -</h3> - -<div> - -<p>A cleanup is extra code which needs to be run as part of unwinding a scope. - C++ destructors are a typical example, but other languages and language - extensions provide a variety of different kinds of cleanups. In general, a - landing pad may need to run arbitrary amounts of cleanup code before actually - entering a catch block. To indicate the presence of cleanups, a - <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a> - should have a <i>cleanup</i> clause. 
Otherwise, the unwinder will not stop at - the landing pad if there are no catches or filters that require it to.</p> - -<p><b>Note:</b> Do not allow a new exception to propagate out of the execution - of a cleanup. This can corrupt the internal state of the unwinder. - Different languages describe different high-level semantics for these - situations: for example, C++ requires that the process be terminated, whereas - Ada cancels both exceptions and throws a third.</p> - -<p>When all cleanups are finished, if the exception is not handled by the - current function, resume unwinding by calling the - <a href="LangRef.html#i_resume"><tt>resume</tt> instruction</a>, passing in - the result of the <tt>landingpad</tt> instruction for the original landing - pad.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="throw_filters">Throw Filters</a> -</h3> - -<div> - -<p>C++ allows the specification of which exception types may be thrown from a - function. To represent this, a top level landing pad may exist to filter out - invalid types. To express this in LLVM code the - <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a> will - have a filter clause. The clause consists of an array of type infos. - <tt>landingpad</tt> will return a negative value if the exception does not - match any of the type infos. If no match is found then a call - to <tt>__cxa_call_unexpected</tt> should be made, otherwise - <tt>_Unwind_Resume</tt>. Each of these functions requires a reference to the - exception structure. Note that the most general form of a - <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a> can - have any number of catch, cleanup, and filter clauses (though having more - than one cleanup is pointless). The LLVM C++ front-end can generate such - <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instructions</a> due - to inlining creating nested exception handling scopes.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="restrictions">Restrictions</a> -</h3> - -<div> - -<p>The unwinder delegates the decision of whether to stop in a call frame to - that call frame's language-specific personality function. Not all unwinders - guarantee that they will stop to perform cleanups. For example, the GNU C++ - unwinder doesn't do so unless the exception is actually caught somewhere - further up the stack.</p> - -<p>In order for inlining to behave correctly, landing pads must be prepared to - handle selector results that they did not originally advertise. Suppose that - a function catches exceptions of type <tt>A</tt>, and it's inlined into a - function that catches exceptions of type <tt>B</tt>. The inliner will update - the <tt>landingpad</tt> instruction for the inlined landing pad to include - the fact that <tt>B</tt> is also caught. If that landing pad assumes that it - will only be entered to catch an <tt>A</tt>, it's in for a rude awakening. 
- Consequently, landing pads must test for the selector results they understand - and then resume exception propagation with the - <a href="LangRef.html#i_resume"><tt>resume</tt> instruction</a> if none of - the conditions match.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h2> - <a name="format_common_intrinsics">Exception Handling Intrinsics</a> -</h2> - -<div> - -<p>In addition to the - <a href="LangRef.html#i_landingpad"><tt>landingpad</tt></a> and - <a href="LangRef.html#i_resume"><tt>resume</tt></a> instructions, LLVM uses - several intrinsic functions (name prefixed with <i><tt>llvm.eh</tt></i>) to - provide exception handling information at various points in generated - code.</p> - -<!-- ======================================================================= --> -<h4> - <a name="llvm_eh_typeid_for">llvm.eh.typeid.for</a> -</h4> - -<div> - -<pre> - i32 @llvm.eh.typeid.for(i8* %type_info) -</pre> - -<p>This intrinsic returns the type info index in the exception table of the - current function. This value can be used to compare against the result - of <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a>. - The single argument is a reference to a type info.</p> - -</div> - -<!-- ======================================================================= --> -<h4> - <a name="llvm_eh_sjlj_setjmp">llvm.eh.sjlj.setjmp</a> -</h4> - -<div> - -<pre> - i32 @llvm.eh.sjlj.setjmp(i8* %setjmp_buf) -</pre> - -<p>For SJLJ based exception handling, this intrinsic forces register saving for - the current function and stores the address of the following instruction for - use as a destination address - by <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a>. The - buffer format and the overall functioning of this intrinsic is compatible - with the GCC <tt>__builtin_setjmp</tt> implementation allowing code built - with the clang and GCC to interoperate.</p> - -<p>The single parameter is a pointer to a five word buffer in which the calling - context is saved. The front end places the frame pointer in the first word, - and the target implementation of this intrinsic should place the destination - address for a - <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a> in the - second word. The following three words are available for use in a - target-specific manner.</p> - -</div> - -<!-- ======================================================================= --> -<h4> - <a name="llvm_eh_sjlj_longjmp">llvm.eh.sjlj.longjmp</a> -</h4> - -<div> - -<pre> - void @llvm.eh.sjlj.longjmp(i8* %setjmp_buf) -</pre> - -<p>For SJLJ based exception handling, the <tt>llvm.eh.sjlj.longjmp</tt> - intrinsic is used to implement <tt>__builtin_longjmp()</tt>. The single - parameter is a pointer to a buffer populated - by <a href="#llvm_eh_sjlj_setjmp"><tt>llvm.eh.sjlj.setjmp</tt></a>. The frame - pointer and stack pointer are restored from the buffer, then control is - transferred to the destination address.</p> - -</div> -<!-- ======================================================================= --> -<h4> - <a name="llvm_eh_sjlj_lsda">llvm.eh.sjlj.lsda</a> -</h4> - -<div> - -<pre> - i8* @llvm.eh.sjlj.lsda() -</pre> - -<p>For SJLJ based exception handling, the <tt>llvm.eh.sjlj.lsda</tt> intrinsic - returns the address of the Language Specific Data Area (LSDA) for the current - function. 
The SJLJ front-end code stores this address in the exception - handling function context for use by the runtime.</p> - -</div> - -<!-- ======================================================================= --> -<h4> - <a name="llvm_eh_sjlj_callsite">llvm.eh.sjlj.callsite</a> -</h4> - -<div> - -<pre> - void @llvm.eh.sjlj.callsite(i32 %call_site_num) -</pre> - -<p>For SJLJ based exception handling, the <tt>llvm.eh.sjlj.callsite</tt> - intrinsic identifies the callsite value associated with the - following <tt>invoke</tt> instruction. This is used to ensure that landing - pad entries in the LSDA are generated in matching order.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h2> - <a name="asm">Asm Table Formats</a> -</h2> - -<div> - -<p>There are two tables that are used by the exception handling runtime to - determine which actions should be taken when an exception is thrown.</p> - -<!-- ======================================================================= --> -<h3> - <a name="unwind_tables">Exception Handling Frame</a> -</h3> - -<div> - -<p>An exception handling frame <tt>eh_frame</tt> is very similar to the unwind - frame used by DWARF debug info. The frame contains all the information - necessary to tear down the current frame and restore the state of the prior - frame. There is an exception handling frame for each function in a compile - unit, plus a common exception handling frame that defines information common - to all functions in the unit.</p> - -<!-- Todo - Table details here. --> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="exception_tables">Exception Tables</a> -</h3> - -<div> - -<p>An exception table contains information about what actions to take when an - exception is thrown in a particular part of a function's code. There is one - exception table per function, except leaf functions and functions that have - calls only to non-throwing functions. They do not need an exception - table.</p> - -<!-- Todo - Table details here. --> - -</div> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-03-27 13:25:16 +0200 (Tue, 27 Mar 2012) $ -</address> - -</body> -</html> diff --git a/docs/ExceptionHandling.rst b/docs/ExceptionHandling.rst new file mode 100644 index 0000000..190f182 --- /dev/null +++ b/docs/ExceptionHandling.rst @@ -0,0 +1,367 @@ +.. _exception_handling: + +========================== +Exception Handling in LLVM +========================== + +.. contents:: + :local: + +Introduction +============ + +This document is the central repository for all information pertaining to +exception handling in LLVM. It describes the format that LLVM exception +handling information takes, which is useful for those interested in creating +front-ends or dealing directly with the information. Further, this document +provides specific examples of what exception handling information is used for in +C and C++. 
+ +Itanium ABI Zero-cost Exception Handling +---------------------------------------- + +Exception handling for most programming languages is designed to recover from +conditions that rarely occur during general use of an application. To that end, +exception handling should not interfere with the main flow of an application's +algorithm by performing checkpointing tasks, such as saving the current pc or +register state. + +The Itanium ABI Exception Handling Specification defines a methodology for +providing outlying data in the form of exception tables without inlining +speculative exception handling code in the flow of an application's main +algorithm. Thus, the specification is said to add "zero-cost" to the normal +execution of an application. + +A more complete description of the Itanium ABI exception handling runtime +support of can be found at `Itanium C++ ABI: Exception Handling +<http://www.codesourcery.com/cxx-abi/abi-eh.html>`_. A description of the +exception frame format can be found at `Exception Frames +<http://refspecs.freestandards.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html>`_, +with details of the DWARF 4 specification at `DWARF 4 Standard +<http://dwarfstd.org/Dwarf4Std.php>`_. A description for the C++ exception +table formats can be found at `Exception Handling Tables +<http://www.codesourcery.com/cxx-abi/exceptions.pdf>`_. + +Setjmp/Longjmp Exception Handling +--------------------------------- + +Setjmp/Longjmp (SJLJ) based exception handling uses LLVM intrinsics +`llvm.eh.sjlj.setjmp`_ and `llvm.eh.sjlj.longjmp`_ to handle control flow for +exception handling. + +For each function which does exception processing --- be it ``try``/``catch`` +blocks or cleanups --- that function registers itself on a global frame +list. When exceptions are unwinding, the runtime uses this list to identify +which functions need processing. + +Landing pad selection is encoded in the call site entry of the function +context. The runtime returns to the function via `llvm.eh.sjlj.longjmp`_, where +a switch table transfers control to the appropriate landing pad based on the +index stored in the function context. + +In contrast to DWARF exception handling, which encodes exception regions and +frame information in out-of-line tables, SJLJ exception handling builds and +removes the unwind frame context at runtime. This results in faster exception +handling at the expense of slower execution when no exceptions are thrown. As +exceptions are, by their nature, intended for uncommon code paths, DWARF +exception handling is generally preferred to SJLJ. + +Overview +-------- + +When an exception is thrown in LLVM code, the runtime does its best to find a +handler suited to processing the circumstance. + +The runtime first attempts to find an *exception frame* corresponding to the +function where the exception was thrown. If the programming language supports +exception handling (e.g. C++), the exception frame contains a reference to an +exception table describing how to process the exception. If the language does +not support exception handling (e.g. C), or if the exception needs to be +forwarded to a prior activation, the exception frame contains information about +how to unwind the current activation and restore the state of the prior +activation. This process is repeated until the exception is handled. If the +exception is not handled and no activations remain, then the application is +terminated with an appropriate error message. 
+ +Because different programming languages have different behaviors when handling +exceptions, the exception handling ABI provides a mechanism for +supplying *personalities*. An exception handling personality is defined by +way of a *personality function* (e.g. ``__gxx_personality_v0`` in C++), +which receives the context of the exception, an *exception structure* +containing the exception object type and value, and a reference to the exception +table for the current function. The personality function for the current +compile unit is specified in a *common exception frame*. + +The organization of an exception table is language dependent. For C++, an +exception table is organized as a series of code ranges defining what to do if +an exception occurs in that range. Typically, the information associated with a +range defines which types of exception objects (using C++ *type info*) that are +handled in that range, and an associated action that should take place. Actions +typically pass control to a *landing pad*. + +A landing pad corresponds roughly to the code found in the ``catch`` portion of +a ``try``/``catch`` sequence. When execution resumes at a landing pad, it +receives an *exception structure* and a *selector value* corresponding to the +*type* of exception thrown. The selector is then used to determine which *catch* +should actually process the exception. + +LLVM Code Generation +==================== + +From a C++ developer's perspective, exceptions are defined in terms of the +``throw`` and ``try``/``catch`` statements. In this section we will describe the +implementation of LLVM exception handling in terms of C++ examples. + +Throw +----- + +Languages that support exception handling typically provide a ``throw`` +operation to initiate the exception process. Internally, a ``throw`` operation +breaks down into two steps. + +#. A request is made to allocate exception space for an exception structure. + This structure needs to survive beyond the current activation. This structure + will contain the type and value of the object being thrown. + +#. A call is made to the runtime to raise the exception, passing the exception + structure as an argument. + +In C++, the allocation of the exception structure is done by the +``__cxa_allocate_exception`` runtime function. The exception raising is handled +by ``__cxa_throw``. The type of the exception is represented using a C++ RTTI +structure. + +Try/Catch +--------- + +A call within the scope of a *try* statement can potentially raise an +exception. In those circumstances, the LLVM C++ front-end replaces the call with +an ``invoke`` instruction. Unlike a call, the ``invoke`` has two potential +continuation points: + +#. where to continue when the call succeeds as per normal, and + +#. where to continue if the call raises an exception, either by a throw or the + unwinding of a throw + +The term used to define a the place where an ``invoke`` continues after an +exception is called a *landing pad*. LLVM landing pads are conceptually +alternative function entry points where an exception structure reference and a +type info index are passed in as arguments. The landing pad saves the exception +structure reference and then proceeds to select the catch block that corresponds +to the type info of the exception object. + +The LLVM `landingpad instruction <LangRef.html#i_landingpad>`_ is used to convey +information about the landing pad to the back end. 
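+For illustration, a single call inside a ``try`` block that catches ``int``
+might be lowered roughly as follows. This is a hand-written sketch rather than
+actual front-end output: ``@caller`` and ``@may_throw`` are placeholder names,
+``@_ZTIi`` stands for the C++ type info of ``int``, and the intrinsics and
+runtime calls it uses are described in the remainder of this section.
+
+.. code-block:: llvm
+
+   ; Placeholder callee that may throw a C++ exception.
+   declare i32 @may_throw(i32)
+
+   declare i32 @__gxx_personality_v0(...)
+   declare i32 @llvm.eh.typeid.for(i8*)
+   declare i8* @__cxa_begin_catch(i8*)
+   declare void @__cxa_end_catch()
+
+   ; C++ type info for "int", normally emitted by the front end.
+   @_ZTIi = external constant i8*
+
+   define i32 @caller(i32 %arg) uwtable {
+   entry:
+     ; The call is replaced by an invoke with a normal and an unwind edge.
+     %call = invoke i32 @may_throw(i32 %arg)
+               to label %cont unwind label %lpad
+
+   cont:                                        ; normal continuation
+     ret i32 %call
+
+   lpad:                                        ; the landing pad
+     %lp  = landingpad { i8*, i32 }
+              personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
+              catch i8* bitcast (i8** @_ZTIi to i8*)
+     %exn = extractvalue { i8*, i32 } %lp, 0    ; exception structure
+     %sel = extractvalue { i8*, i32 } %lp, 1    ; selector value
+     %id  = call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*))
+     %match = icmp eq i32 %sel, %id
+     br i1 %match, label %catch, label %eh.resume
+
+   catch:                                       ; catch (int)
+     %obj = call i8* @__cxa_begin_catch(i8* %exn)
+     call void @__cxa_end_catch()
+     ret i32 -1
+
+   eh.resume:                                   ; not our exception: keep unwinding
+     resume { i8*, i32 } %lp
+   }
+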
For C++, the ``landingpad`` +instruction returns a pointer and integer pair corresponding to the pointer to +the *exception structure* and the *selector value* respectively. + +The ``landingpad`` instruction takes a reference to the personality function to +be used for this ``try``/``catch`` sequence. The remainder of the instruction is +a list of *cleanup*, *catch*, and *filter* clauses. The exception is tested +against the clauses sequentially from first to last. The selector value is a +positive number if the exception matched a type info, a negative number if it +matched a filter, and zero if it matched a cleanup. If nothing is matched, the +behavior of the program is `undefined`_. If a type info matched, then the +selector value is the index of the type info in the exception table, which can +be obtained using the `llvm.eh.typeid.for`_ intrinsic. + +Once the landing pad has the type info selector, the code branches to the code +for the first catch. The catch then checks the value of the type info selector +against the index of type info for that catch. Since the type info index is not +known until all the type infos have been gathered in the backend, the catch code +must call the `llvm.eh.typeid.for`_ intrinsic to determine the index for a given +type info. If the catch fails to match the selector then control is passed on to +the next catch. + +Finally, the entry and exit of catch code is bracketed with calls to +``__cxa_begin_catch`` and ``__cxa_end_catch``. + +* ``__cxa_begin_catch`` takes an exception structure reference as an argument + and returns the value of the exception object. + +* ``__cxa_end_catch`` takes no arguments. This function: + + #. Locates the most recently caught exception and decrements its handler + count, + + #. Removes the exception from the *caught* stack if the handler count goes to + zero, and + + #. Destroys the exception if the handler count goes to zero and the exception + was not re-thrown by throw. + + .. note:: + + a rethrow from within the catch may replace this call with a + ``__cxa_rethrow``. + +Cleanups +-------- + +A cleanup is extra code which needs to be run as part of unwinding a scope. C++ +destructors are a typical example, but other languages and language extensions +provide a variety of different kinds of cleanups. In general, a landing pad may +need to run arbitrary amounts of cleanup code before actually entering a catch +block. To indicate the presence of cleanups, a `landingpad +instruction <LangRef.html#i_landingpad>`_ should have a *cleanup* +clause. Otherwise, the unwinder will not stop at the landing pad if there are no +catches or filters that require it to. + +.. note:: + + Do not allow a new exception to propagate out of the execution of a + cleanup. This can corrupt the internal state of the unwinder. Different + languages describe different high-level semantics for these situations: for + example, C++ requires that the process be terminated, whereas Ada cancels both + exceptions and throws a third. + +When all cleanups are finished, if the exception is not handled by the current +function, resume unwinding by calling the `resume +instruction <LangRef.html#i_resume>`_, passing in the result of the +``landingpad`` instruction for the original landing pad. + +Throw Filters +------------- + +C++ allows the specification of which exception types may be thrown from a +function. To represent this, a top level landing pad may exist to filter out +invalid types. 
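+As a concrete sketch (hand-written, with placeholder names; ``@_ZTIi`` again
+stands for the C++ type info of ``int``), a function declared with the
+exception specification ``throw(int)`` might guard its calls with a filtering
+landing pad of the following shape, using the filter clause described below:
+
+.. code-block:: llvm
+
+   declare void @may_throw()
+   declare i32 @__gxx_personality_v0(...)
+   declare void @__cxa_call_unexpected(i8*)
+
+   @_ZTIi = external constant i8*
+
+   define void @only_throws_int() uwtable {
+   entry:
+     invoke void @may_throw()
+             to label %cont unwind label %lpad
+
+   cont:
+     ret void
+
+   lpad:
+     ; The filter lists the only type info allowed to propagate out.
+     %lp  = landingpad { i8*, i32 }
+              personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
+              filter [1 x i8*] [i8* bitcast (i8** @_ZTIi to i8*)]
+     %exn = extractvalue { i8*, i32 } %lp, 0
+     %sel = extractvalue { i8*, i32 } %lp, 1
+     %bad = icmp slt i32 %sel, 0                ; negative selector: filter violated
+     br i1 %bad, label %unexpected, label %eh.resume
+
+   unexpected:
+     call void @__cxa_call_unexpected(i8* %exn) noreturn
+     unreachable
+
+   eh.resume:
+     resume { i8*, i32 } %lp                    ; re-raise and keep unwinding
+   }
+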
To express this in LLVM code the `landingpad +instruction <LangRef.html#i_landingpad>`_ will have a filter clause. The clause +consists of an array of type infos. ``landingpad`` will return a negative value +if the exception does not match any of the type infos. If no match is found then +a call to ``__cxa_call_unexpected`` should be made, otherwise +``_Unwind_Resume``. Each of these functions requires a reference to the +exception structure. Note that the most general form of a ``landingpad`` +instruction can have any number of catch, cleanup, and filter clauses (though +having more than one cleanup is pointless). The LLVM C++ front-end can generate +such ``landingpad`` instructions due to inlining creating nested exception +handling scopes. + +.. _undefined: + +Restrictions +------------ + +The unwinder delegates the decision of whether to stop in a call frame to that +call frame's language-specific personality function. Not all unwinders guarantee +that they will stop to perform cleanups. For example, the GNU C++ unwinder +doesn't do so unless the exception is actually caught somewhere further up the +stack. + +In order for inlining to behave correctly, landing pads must be prepared to +handle selector results that they did not originally advertise. Suppose that a +function catches exceptions of type ``A``, and it's inlined into a function that +catches exceptions of type ``B``. The inliner will update the ``landingpad`` +instruction for the inlined landing pad to include the fact that ``B`` is also +caught. If that landing pad assumes that it will only be entered to catch an +``A``, it's in for a rude awakening. Consequently, landing pads must test for +the selector results they understand and then resume exception propagation with +the `resume instruction <LangRef.html#i_resume>`_ if none of the conditions +match. + +Exception Handling Intrinsics +============================= + +In addition to the ``landingpad`` and ``resume`` instructions, LLVM uses several +intrinsic functions (name prefixed with ``llvm.eh``) to provide exception +handling information at various points in generated code. + +.. _llvm.eh.typeid.for: + +llvm.eh.typeid.for +------------------ + +.. code-block:: llvm + + i32 @llvm.eh.typeid.for(i8* %type_info) + + +This intrinsic returns the type info index in the exception table of the current +function. This value can be used to compare against the result of +``landingpad`` instruction. The single argument is a reference to a type info. + +.. _llvm.eh.sjlj.setjmp: + +llvm.eh.sjlj.setjmp +------------------- + +.. code-block:: llvm + + i32 @llvm.eh.sjlj.setjmp(i8* %setjmp_buf) + +For SJLJ based exception handling, this intrinsic forces register saving for the +current function and stores the address of the following instruction for use as +a destination address by `llvm.eh.sjlj.longjmp`_. The buffer format and the +overall functioning of this intrinsic is compatible with the GCC +``__builtin_setjmp`` implementation allowing code built with the clang and GCC +to interoperate. + +The single parameter is a pointer to a five word buffer in which the calling +context is saved. The front end places the frame pointer in the first word, and +the target implementation of this intrinsic should place the destination address +for a `llvm.eh.sjlj.longjmp`_ in the second word. The following three words are +available for use in a target-specific manner. + +.. _llvm.eh.sjlj.longjmp: + +llvm.eh.sjlj.longjmp +-------------------- + +.. 
code-block:: llvm + + void @llvm.eh.sjlj.longjmp(i8* %setjmp_buf) + +For SJLJ based exception handling, the ``llvm.eh.sjlj.longjmp`` intrinsic is +used to implement ``__builtin_longjmp()``. The single parameter is a pointer to +a buffer populated by `llvm.eh.sjlj.setjmp`_. The frame pointer and stack +pointer are restored from the buffer, then control is transferred to the +destination address. + +llvm.eh.sjlj.lsda +----------------- + +.. code-block:: llvm + + i8* @llvm.eh.sjlj.lsda() + +For SJLJ based exception handling, the ``llvm.eh.sjlj.lsda`` intrinsic returns +the address of the Language Specific Data Area (LSDA) for the current +function. The SJLJ front-end code stores this address in the exception handling +function context for use by the runtime. + +llvm.eh.sjlj.callsite +--------------------- + +.. code-block:: llvm + + void @llvm.eh.sjlj.callsite(i32 %call_site_num) + +For SJLJ based exception handling, the ``llvm.eh.sjlj.callsite`` intrinsic +identifies the callsite value associated with the following ``invoke`` +instruction. This is used to ensure that landing pad entries in the LSDA are +generated in matching order. + +Asm Table Formats +================= + +There are two tables that are used by the exception handling runtime to +determine which actions should be taken when an exception is thrown. + +Exception Handling Frame +------------------------ + +An exception handling frame ``eh_frame`` is very similar to the unwind frame +used by DWARF debug info. The frame contains all the information necessary to +tear down the current frame and restore the state of the prior frame. There is +an exception handling frame for each function in a compile unit, plus a common +exception handling frame that defines information common to all functions in the +unit. + +Exception Tables +---------------- + +An exception table contains information about what actions to take when an +exception is thrown in a particular part of a function's code. There is one +exception table per function, except leaf functions and functions that have +calls only to non-throwing functions. They do not need an exception table. 
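+As a rough IR-level illustration of that distinction (placeholder functions,
+not tied to any particular front end): a frame containing an ``invoke`` gets
+call-site entries in its exception table, while a function whose callees are
+all known not to throw can keep plain calls and needs no table at all.
+
+.. code-block:: llvm
+
+   declare void @may_throw()
+   declare void @never_throws() nounwind
+   declare i32 @__gxx_personality_v0(...)
+
+   ; Has an exception table: the invoke's call site describes where to land
+   ; if @may_throw raises an exception, and the cleanup clause asks the
+   ; unwinder to stop here so this frame's cleanup code can run.
+   define void @needs_table() uwtable {
+   entry:
+     invoke void @may_throw()
+             to label %done unwind label %lpad
+   done:
+     ret void
+   lpad:
+     %lp = landingpad { i8*, i32 }
+              personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
+              cleanup
+     ; cleanup code would go here
+     resume { i8*, i32 } %lp
+   }
+
+   ; Needs no exception table: every callee is nounwind, so there is nothing
+   ; for the unwinder to do in this frame.
+   define void @needs_no_table() nounwind {
+   entry:
+     call void @never_throws()
+     ret void
+   }
+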
diff --git a/docs/ExtendingLLVM.html b/docs/ExtendingLLVM.html index a0cc4ea..6782787 100644 --- a/docs/ExtendingLLVM.html +++ b/docs/ExtendingLLVM.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>Extending LLVM: Adding instructions, intrinsics, types, etc.</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -372,7 +372,7 @@ void calcTypeName(const Type *Ty, <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a> <br> - Last modified: $Date: 2012-03-23 06:50:46 +0100 (Fri, 23 Mar 2012) $ + Last modified: $Date: 2012-04-19 22:20:34 +0200 (Thu, 19 Apr 2012) $ </address> </body> diff --git a/docs/FAQ.html b/docs/FAQ.html deleted file mode 100644 index 78c0268..0000000 --- a/docs/FAQ.html +++ /dev/null @@ -1,948 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>LLVM: Frequently Asked Questions</title> - <style type="text/css"> - @import url("llvm.css"); - .question { font-weight: bold } - .answer { margin-left: 2em } - </style> -</head> -<body> - -<h1> - LLVM: Frequently Asked Questions -</h1> - -<ol> - <li><a href="#license">License</a> - <ol> - <li>Why are the LLVM source code and the front-end distributed under - different licenses?</li> - - <li>Does the University of Illinois Open Source License really qualify as an - "open source" license?</li> - - <li>Can I modify LLVM source code and redistribute the modified source?</li> - - <li>Can I modify LLVM source code and redistribute binaries or other tools - based on it, without redistributing the source?</li> - </ol></li> - - <li><a href="#source">Source code</a> - <ol> - <li>In what language is LLVM written?</li> - - <li>How portable is the LLVM source code?</li> - </ol></li> - - <li><a href="#build">Build Problems</a> - <ol> - <li>When I run configure, it finds the wrong C compiler.</li> - - <li>The <tt>configure</tt> script finds the right C compiler, but it uses - the LLVM linker from a previous build. What do I do?</li> - - <li>When creating a dynamic library, I get a strange GLIBC error.</li> - - <li>I've updated my source tree from Subversion, and now my build is trying - to use a file/directory that doesn't exist.</li> - - <li>I've modified a Makefile in my source tree, but my build tree keeps - using the old version. What do I do?</li> - - <li>I've upgraded to a new version of LLVM, and I get strange build - errors.</li> - - <li>I've built LLVM and am testing it, but the tests freeze.</li> - - <li>Why do test results differ when I perform different types of - builds?</li> - - <li>Compiling LLVM with GCC 3.3.2 fails, what should I do?</li> - - <li>Compiling LLVM with GCC succeeds, but the resulting tools do not work, - what can be wrong?</li> - - <li>When I use the test suite, all of the C Backend tests fail. What is - wrong?</li> - - <li>After Subversion update, rebuilding gives the error "No rule to make - target".</li> - - <li><a href="#srcdir-objdir">When I compile LLVM-GCC with srcdir == objdir, - it fails. Why?</a></li> - </ol></li> - - <li><a href="#felangs">Source Languages</a> - <ol> - <li><a href="#langs">What source languages are supported?</a></li> - - <li><a href="#langirgen">I'd like to write a self-hosting LLVM compiler. 
How - should I interface with the LLVM middle-end optimizers and back-end code - generators?</a></li> - - <li><a href="#langhlsupp">What support is there for higher level source - language constructs for building a compiler?</a></li> - - <li><a href="GetElementPtr.html">I don't understand the GetElementPtr - instruction. Help!</a></li> - </ol> - - <li><a href="#cfe">Using the GCC Front End</a> - <ol> - <li>When I compile software that uses a configure script, the configure - script thinks my system has all of the header files and libraries it is - testing for. How do I get configure to work correctly?</li> - - <li>When I compile code using the LLVM GCC front end, it complains that it - cannot find libcrtend.a?</li> - - <li>How can I disable all optimizations when compiling code using the LLVM - GCC front end?</li> - - <li><a href="#translatecxx">Can I use LLVM to convert C++ code to C - code?</a></li> - - <li><a href="#platformindependent">Can I compile C or C++ code to - platform-independent LLVM bitcode?</a></li> - </ol> - </li> - - <li><a href="#cfe_code">Questions about code generated by the GCC front-end</a> - <ol> - <li><a href="#iosinit">What is this <tt>llvm.global_ctors</tt> and - <tt>_GLOBAL__I__tmp_webcompile...</tt> stuff that happens when I - #include <iostream>?</a></li> - - <li><a href="#codedce">Where did all of my code go??</a></li> - - <li><a href="#undef">What is this "<tt>undef</tt>" thing that shows up in - my code?</a></li> - - <li><a href="#callconvwrong">Why does instcombine + simplifycfg turn - a call to a function with a mismatched calling convention into "unreachable"? - Why not make the verifier reject it?</a></li> - </ol> - </li> -</ol> - -<div class="doc_author"> - <p>Written by <a href="http://llvm.org/">The LLVM Team</a></p> -</div> - - -<!-- *********************************************************************** --> -<h2> - <a name="license">License</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<div class="question"> -<p>Why are the LLVM source code and the front-end distributed under different - licenses?</p> -</div> - -<div class="answer"> -<p>The C/C++ front-ends are based on GCC and must be distributed under the GPL. - Our aim is to distribute LLVM source code under a <em>much less - restrictive</em> license, in particular one that does not compel users who - distribute tools based on modifying the source to redistribute the modified - source code as well.</p> -</div> - -<div class="question"> -<p>Does the University of Illinois Open Source License really qualify as an - "open source" license?</p> -</div> - -<div class="answer"> -<p>Yes, the license - is <a href="http://www.opensource.org/licenses/UoI-NCSA.php">certified</a> by - the Open Source Initiative (OSI).</p> -</div> - -<div class="question"> -<p>Can I modify LLVM source code and redistribute the modified source?</p> -</div> - -<div class="answer"> -<p>Yes. The modified source distribution must retain the copyright notice and - follow the three bulletted conditions listed in - the <a href="http://llvm.org/svn/llvm-project/llvm/trunk/LICENSE.TXT">LLVM - license</a>.</p> -</div> - -<div class="question"> -<p>Can I modify LLVM source code and redistribute binaries or other tools based - on it, without redistributing the source?</p> -</div> - -<div class="answer"> -<p>Yes. 
This is why we distribute LLVM under a less restrictive license than - GPL, as explained in the first question above.</p> -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="source">Source Code</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<div class="question"> -<p>In what language is LLVM written?</p> -</div> - -<div class="answer"> -<p>All of the LLVM tools and libraries are written in C++ with extensive use of - the STL.</p> -</div> - -<div class="question"> -<p>How portable is the LLVM source code?</p> -</div> - -<div class="answer"> -<p>The LLVM source code should be portable to most modern UNIX-like operating -systems. Most of the code is written in standard C++ with operating system -services abstracted to a support library. The tools required to build and test -LLVM have been ported to a plethora of platforms.</p> - -<p>Some porting problems may exist in the following areas:</p> - -<ul> - <li>The GCC front end code is not as portable as the LLVM suite, so it may not - compile as well on unsupported platforms.</li> - - <li>The LLVM build system relies heavily on UNIX shell tools, like the Bourne - Shell and sed. Porting to systems without these tools (MacOS 9, Plan 9) - will require more effort.</li> -</ul> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="build">Build Problems</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<div class="question"> -<p>When I run configure, it finds the wrong C compiler.</p> -</div> - -<div class="answer"> -<p>The <tt>configure</tt> script attempts to locate first <tt>gcc</tt> and then - <tt>cc</tt>, unless it finds compiler paths set in <tt>CC</tt> - and <tt>CXX</tt> for the C and C++ compiler, respectively.</p> - -<p>If <tt>configure</tt> finds the wrong compiler, either adjust your - <tt>PATH</tt> environment variable or set <tt>CC</tt> and <tt>CXX</tt> - explicitly.</p> - -</div> - -<div class="question"> -<p>The <tt>configure</tt> script finds the right C compiler, but it uses the - LLVM linker from a previous build. What do I do?</p> -</div> - -<div class="answer"> -<p>The <tt>configure</tt> script uses the <tt>PATH</tt> to find executables, so - if it's grabbing the wrong linker/assembler/etc, there are two ways to fix - it:</p> - -<ol> - <li><p>Adjust your <tt>PATH</tt> environment variable so that the correct - program appears first in the <tt>PATH</tt>. This may work, but may not be - convenient when you want them <i>first</i> in your path for other - work.</p></li> - - <li><p>Run <tt>configure</tt> with an alternative <tt>PATH</tt> that is - correct. In a Borne compatible shell, the syntax would be:</p> - -<pre class="doc_code"> -% PATH=[the path without the bad program] ./configure ... -</pre> - - <p>This is still somewhat inconvenient, but it allows <tt>configure</tt> - to do its work without having to adjust your <tt>PATH</tt> - permanently.</p></li> -</ol> -</div> - -<div class="question"> -<p>When creating a dynamic library, I get a strange GLIBC error.</p> -</div> - -<div class="answer"> -<p>Under some operating systems (i.e. Linux), libtool does not work correctly if - GCC was compiled with the --disable-shared option. 
To work around this, - install your own version of GCC that has shared libraries enabled by - default.</p> -</div> - -<div class="question"> -<p>I've updated my source tree from Subversion, and now my build is trying to - use a file/directory that doesn't exist.</p> -</div> - -<div class="answer"> -<p>You need to re-run configure in your object directory. When new Makefiles - are added to the source tree, they have to be copied over to the object tree - in order to be used by the build.</p> -</div> - -<div class="question"> -<p>I've modified a Makefile in my source tree, but my build tree keeps using the - old version. What do I do?</p> -</div> - -<div class="answer"> -<p>If the Makefile already exists in your object tree, you can just run the - following command in the top level directory of your object tree:</p> - -<pre class="doc_code"> -% ./config.status <relative path to Makefile> -</pre> - -<p>If the Makefile is new, you will have to modify the configure script to copy - it over.</p> -</div> - -<div class="question"> -<p>I've upgraded to a new version of LLVM, and I get strange build errors.</p> -</div> - -<div class="answer"> - -<p>Sometimes, changes to the LLVM source code alters how the build system works. - Changes in libtool, autoconf, or header file dependencies are especially - prone to this sort of problem.</p> - -<p>The best thing to try is to remove the old files and re-build. In most - cases, this takes care of the problem. To do this, just type <tt>make - clean</tt> and then <tt>make</tt> in the directory that fails to build.</p> -</div> - -<div class="question"> -<p>I've built LLVM and am testing it, but the tests freeze.</p> -</div> - -<div class="answer"> -<p>This is most likely occurring because you built a profile or release - (optimized) build of LLVM and have not specified the same information on the - <tt>gmake</tt> command line.</p> - -<p>For example, if you built LLVM with the command:</p> - -<pre class="doc_code"> -% gmake ENABLE_PROFILING=1 -</pre> - -<p>...then you must run the tests with the following commands:</p> - -<pre class="doc_code"> -% cd llvm/test -% gmake ENABLE_PROFILING=1 -</pre> -</div> - -<div class="question"> -<p>Why do test results differ when I perform different types of builds?</p> -</div> - -<div class="answer"> -<p>The LLVM test suite is dependent upon several features of the LLVM tools and - libraries.</p> - -<p>First, the debugging assertions in code are not enabled in optimized or - profiling builds. Hence, tests that used to fail may pass.</p> - -<p>Second, some tests may rely upon debugging options or behavior that is only - available in the debug build. These tests will fail in an optimized or - profile build.</p> -</div> - -<div class="question"> -<p>Compiling LLVM with GCC 3.3.2 fails, what should I do?</p> -</div> - -<div class="answer"> -<p>This is <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13392">a bug in - GCC</a>, and affects projects other than LLVM. Try upgrading or downgrading - your GCC.</p> -</div> - -<div class="question"> -<p>Compiling LLVM with GCC succeeds, but the resulting tools do not work, what - can be wrong?</p> -</div> - -<div class="answer"> -<p>Several versions of GCC have shown a weakness in miscompiling the LLVM - codebase. Please consult your compiler version (<tt>gcc --version</tt>) to - find out whether it is <a href="GettingStarted.html#brokengcc">broken</a>. 
- If so, your only option is to upgrade GCC to a known good version.</p> -</div> - -<div class="question"> -<p>After Subversion update, rebuilding gives the error "No rule to make - target".</p> -</div> - -<div class="answer"> -<p>If the error is of the form:</p> - -<pre class="doc_code"> -gmake[2]: *** No rule to make target `/path/to/somefile', needed by -`/path/to/another/file.d'.<br> -Stop. -</pre> - -<p>This may occur anytime files are moved within the Subversion repository or - removed entirely. In this case, the best solution is to erase all - <tt>.d</tt> files, which list dependencies for source files, and rebuild:</p> - -<pre class="doc_code"> -% cd $LLVM_OBJ_DIR -% rm -f `find . -name \*\.d` -% gmake -</pre> - -<p>In other cases, it may be necessary to run <tt>make clean</tt> before - rebuilding.</p> -</div> - -<div class="question"> -<p><a name="srcdir-objdir">When I compile LLVM-GCC with srcdir == objdir, it - fails. Why?</a></p> -</div> - -<div class="answer"> -<p>The <tt>GNUmakefile</tt> in the top-level directory of LLVM-GCC is a special - <tt>Makefile</tt> used by Apple to invoke the <tt>build_gcc</tt> script after - setting up a special environment. This has the unfortunate side-effect that - trying to build LLVM-GCC with srcdir == objdir in a "non-Apple way" invokes - the <tt>GNUmakefile</tt> instead of <tt>Makefile</tt>. Because the - environment isn't set up correctly to do this, the build fails.</p> - -<p>People not building LLVM-GCC the "Apple way" need to build LLVM-GCC with - srcdir != objdir, or simply remove the GNUmakefile entirely.</p> - -<p>We regret the inconvenience.</p> -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="felangs">Source Languages</a> -</h2> - -<div> - -<div class="question"> -<p><a name="langs">What source languages are supported?</a></p> -</div> - -<div class="answer"> -<p>LLVM currently has full support for C and C++ source languages. These are - available through a special version of GCC that LLVM calls the - <a href="#cfe">C Front End</a></p> - -<p>There is an incomplete version of a Java front end available in the - <tt>java</tt> module. There is no documentation on this yet so you'll need to - download the code, compile it, and try it.</p> - -<p>The PyPy developers are working on integrating LLVM into the PyPy backend so - that PyPy language can translate to LLVM.</p> -</div> - -<div class="question"> -<p><a name="langirgen">I'd like to write a self-hosting LLVM compiler. How - should I interface with the LLVM middle-end optimizers and back-end code - generators?</a></p> -</div> - -<div class="answer"> -<p>Your compiler front-end will communicate with LLVM by creating a module in - the LLVM intermediate representation (IR) format. 
Assuming you want to write - your language's compiler in the language itself (rather than C++), there are - 3 major ways to tackle generating LLVM IR from a front-end:</p> - -<ul> - <li><strong>Call into the LLVM libraries code using your language's FFI - (foreign function interface).</strong> - - <ul> - <li><em>for:</em> best tracks changes to the LLVM IR, .ll syntax, and .bc - format</li> - - <li><em>for:</em> enables running LLVM optimization passes without a - emit/parse overhead</li> - - <li><em>for:</em> adapts well to a JIT context</li> - - <li><em>against:</em> lots of ugly glue code to write</li> - </ul></li> - - <li> <strong>Emit LLVM assembly from your compiler's native language.</strong> - <ul> - <li><em>for:</em> very straightforward to get started</li> - - <li><em>against:</em> the .ll parser is slower than the bitcode reader - when interfacing to the middle end</li> - - <li><em>against:</em> you'll have to re-engineer the LLVM IR object model - and asm writer in your language</li> - - <li><em>against:</em> it may be harder to track changes to the IR</li> - </ul></li> - - <li><strong>Emit LLVM bitcode from your compiler's native language.</strong> - - <ul> - <li><em>for:</em> can use the more-efficient bitcode reader when - interfacing to the middle end</li> - - <li><em>against:</em> you'll have to re-engineer the LLVM IR object - model and bitcode writer in your language</li> - - <li><em>against:</em> it may be harder to track changes to the IR</li> - </ul></li> -</ul> - -<p>If you go with the first option, the C bindings in include/llvm-c should help - a lot, since most languages have strong support for interfacing with C. The - most common hurdle with calling C from managed code is interfacing with the - garbage collector. The C interface was designed to require very little memory - management, and so is straightforward in this regard.</p> -</div> - -<div class="question"> -<p><a name="langhlsupp">What support is there for a higher level source language - constructs for building a compiler?</a></p> -</div> - -<div class="answer"> -<p>Currently, there isn't much. LLVM supports an intermediate representation - which is useful for code representation but will not support the high level - (abstract syntax tree) representation needed by most compilers. There are no - facilities for lexical nor semantic analysis.</p> -</div> - -<div class="question"> -<p><a name="getelementptr">I don't understand the GetElementPtr - instruction. Help!</a></p> -</div> - -<div class="answer"> -<p>See <a href="GetElementPtr.html">The Often Misunderstood GEP - Instruction</a>.</p> -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="cfe">Using the GCC Front End</a> -</h2> - -<div> - -<div class="question"> -<p>When I compile software that uses a configure script, the configure script - thinks my system has all of the header files and libraries it is testing for. - How do I get configure to work correctly?</p> -</div> - -<div class="answer"> -<p>The configure script is getting things wrong because the LLVM linker allows - symbols to be undefined at link time (so that they can be resolved during JIT - or translation to the C back end). 
That is why configure thinks your system - "has everything."</p> - -<p>To work around this, perform the following steps:</p> - -<ol> - <li>Make sure the CC and CXX environment variables contains the full path to - the LLVM GCC front end.</li> - - <li>Make sure that the regular C compiler is first in your PATH. </li> - - <li>Add the string "-Wl,-native" to your CFLAGS environment variable.</li> -</ol> - -<p>This will allow the <tt>llvm-ld</tt> linker to create a native code - executable instead of shell script that runs the JIT. Creating native code - requires standard linkage, which in turn will allow the configure script to - find out if code is not linking on your system because the feature isn't - available on your system.</p> -</div> - -<div class="question"> -<p>When I compile code using the LLVM GCC front end, it complains that it cannot - find libcrtend.a. -</p> -</div> - -<div class="answer"> -<p>The only way this can happen is if you haven't installed the runtime - library. To correct this, do:</p> - -<pre class="doc_code"> -% cd llvm/runtime -% make clean ; make install-bytecode -</pre> -</div> - -<div class="question"> -<p>How can I disable all optimizations when compiling code using the LLVM GCC - front end?</p> -</div> - -<div class="answer"> -<p>Passing "-Wa,-disable-opt -Wl,-disable-opt" will disable *all* cleanup and - optimizations done at the llvm level, leaving you with the truly horrible - code that you desire.</p> -</div> - - -<div class="question"> -<p><a name="translatecxx">Can I use LLVM to convert C++ code to C code?</a></p> -</div> - -<div class="answer"> -<p>Yes, you can use LLVM to convert code from any language LLVM supports to C. - Note that the generated C code will be very low level (all loops are lowered - to gotos, etc) and not very pretty (comments are stripped, original source - formatting is totally lost, variables are renamed, expressions are - regrouped), so this may not be what you're looking for. Also, there are - several limitations noted below.<p> - -<p>Use commands like this:</p> - -<ol> - <li><p>Compile your program with llvm-g++:</p> - -<pre class="doc_code"> -% llvm-g++ -emit-llvm x.cpp -o program.bc -c -</pre> - - <p>or:</p> - -<pre class="doc_code"> -% llvm-g++ a.cpp -c -emit-llvm -% llvm-g++ b.cpp -c -emit-llvm -% llvm-ld a.o b.o -o program -</pre> - - <p>This will generate program and program.bc. The .bc - file is the LLVM version of the program all linked together.</p></li> - - <li><p>Convert the LLVM code to C code, using the LLC tool with the C - backend:</p> - -<pre class="doc_code"> -% llc -march=c program.bc -o program.c -</pre></li> - - <li><p>Finally, compile the C file:</p> - -<pre class="doc_code"> -% cc x.c -lstdc++ -</pre></li> - -</ol> - -<p>Using LLVM does not eliminate the need for C++ library support. If you use - the llvm-g++ front-end, the generated code will depend on g++'s C++ support - libraries in the same way that code generated from g++ would. If you use - another C++ front-end, the generated code will depend on whatever library - that front-end would normally require.</p> - -<p>If you are working on a platform that does not provide any C++ libraries, you - may be able to manually compile libstdc++ to LLVM bitcode, statically link it - into your program, then use the commands above to convert the whole result - into C code. 
Alternatively, you might compile the libraries and your - application into two different chunks of C code and link them.</p> - -<p>Note that, by default, the C back end does not support exception handling. - If you want/need it for a certain program, you can enable it by passing - "-enable-correct-eh-support" to the llc program. The resultant code will use - setjmp/longjmp to implement exception support that is relatively slow, and - not C++-ABI-conforming on most platforms, but otherwise correct.</p> - -<p>Also, there are a number of other limitations of the C backend that cause it - to produce code that does not fully conform to the C++ ABI on most - platforms. Some of the C++ programs in LLVM's test suite are known to fail - when compiled with the C back end because of ABI incompatibilities with - standard C++ libraries.</p> -</div> - -<div class="question"> -<p><a name="platformindependent">Can I compile C or C++ code to - platform-independent LLVM bitcode?</a></p> -</div> - -<div class="answer"> -<p>No. C and C++ are inherently platform-dependent languages. The most obvious - example of this is the preprocessor. A very common way that C code is made - portable is by using the preprocessor to include platform-specific code. In - practice, information about other platforms is lost after preprocessing, so - the result is inherently dependent on the platform that the preprocessing was - targeting.</p> - -<p>Another example is <tt>sizeof</tt>. It's common for <tt>sizeof(long)</tt> to - vary between platforms. In most C front-ends, <tt>sizeof</tt> is expanded to - a constant immediately, thus hard-wiring a platform-specific detail.</p> - -<p>Also, since many platforms define their ABIs in terms of C, and since LLVM is - lower-level than C, front-ends currently must emit platform-specific IR in - order to have the result conform to the platform ABI.</p> -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="cfe_code">Questions about code generated by the GCC front-end</a> -</h2> - -<div> - -<div class="question"> -<p><a name="iosinit">What is this <tt>llvm.global_ctors</tt> and - <tt>_GLOBAL__I__tmp_webcompile...</tt> stuff that happens when I <tt>#include - <iostream></tt>?</a></p> -</div> - -<div class="answer"> -<p>If you <tt>#include</tt> the <tt><iostream></tt> header into a C++ - translation unit, the file will probably use - the <tt>std::cin</tt>/<tt>std::cout</tt>/... global objects. However, C++ - does not guarantee an order of initialization between static objects in - different translation units, so if a static ctor/dtor in your .cpp file - used <tt>std::cout</tt>, for example, the object would not necessarily be - automatically initialized before your use.</p> - -<p>To make <tt>std::cout</tt> and friends work correctly in these scenarios, the - STL that we use declares a static object that gets created in every - translation unit that includes <tt><iostream></tt>. This object has a - static constructor and destructor that initializes and destroys the global - iostream objects before they could possibly be used in the file. The code - that you see in the .ll file corresponds to the constructor and destructor - registration code. 
-</p> - -<p>If you would like to make it easier to <b>understand</b> the LLVM code - generated by the compiler in the demo page, consider using <tt>printf()</tt> - instead of <tt>iostream</tt>s to print values.</p> -</div> - -<!--=========================================================================--> - -<div class="question"> -<p><a name="codedce">Where did all of my code go??</a></p> -</div> - -<div class="answer"> -<p>If you are using the LLVM demo page, you may often wonder what happened to - all of the code that you typed in. Remember that the demo script is running - the code through the LLVM optimizers, so if your code doesn't actually do - anything useful, it might all be deleted.</p> - -<p>To prevent this, make sure that the code is actually needed. For example, if - you are computing some expression, return the value from the function instead - of leaving it in a local variable. If you really want to constrain the - optimizer, you can read from and assign to <tt>volatile</tt> global - variables.</p> -</div> - -<!--=========================================================================--> - -<div class="question"> -<p><a name="undef">What is this "<tt>undef</tt>" thing that shows up in my - code?</a></p> -</div> - -<div class="answer"> -<p><a href="LangRef.html#undef"><tt>undef</tt></a> is the LLVM way of - representing a value that is not defined. You can get these if you do not - initialize a variable before you use it. For example, the C function:</p> - -<pre class="doc_code"> -int X() { int i; return i; } -</pre> - -<p>Is compiled to "<tt>ret i32 undef</tt>" because "<tt>i</tt>" never has a - value specified for it.</p> -</div> - -<!--=========================================================================--> - -<div class="question"> -<p><a name="callconvwrong">Why does instcombine + simplifycfg turn - a call to a function with a mismatched calling convention into "unreachable"? - Why not make the verifier reject it?</a></p> -</div> - -<div class="answer"> -<p>This is a common problem run into by authors of front-ends that are using -custom calling conventions: you need to make sure to set the right calling -convention on both the function and on each call to the function. For example, -this code:</p> - -<pre class="doc_code"> -define fastcc void @foo() { - ret void -} -define void @bar() { - call void @foo() - ret void -} -</pre> - -<p>Is optimized to:</p> - -<pre class="doc_code"> -define fastcc void @foo() { - ret void -} -define void @bar() { - unreachable -} -</pre> - -<p>... with "opt -instcombine -simplifycfg". This often bites people because -"all their code disappears". Setting the calling convention on the caller and -callee is required for indirect calls to work, so people often ask why not make -the verifier reject this sort of thing.</p> - -<p>The answer is that this code has undefined behavior, but it is not illegal. -If we made it illegal, then every transformation that could potentially create -this would have to ensure that it doesn't, and there is valid code that can -create this sort of construct (in dead code). The sorts of things that can -cause this to happen are fairly contrived, but we still need to accept them. 
-Here's an example:</p> - -<pre class="doc_code"> -define fastcc void @foo() { - ret void -} -define internal void @bar(void()* %FP, i1 %cond) { - br i1 %cond, label %T, label %F -T: - call void %FP() - ret void -F: - call fastcc void %FP() - ret void -} -define void @test() { - %X = or i1 false, false - call void @bar(void()* @foo, i1 %X) - ret void -} -</pre> - -<p>In this example, "test" always passes @foo/false into bar, which ensures that - it is dynamically called with the right calling conv (thus, the code is - perfectly well defined). If you run this through the inliner, you get this - (the explicit "or" is there so that the inliner doesn't dead code eliminate - a bunch of stuff): -</p> - -<pre class="doc_code"> -define fastcc void @foo() { - ret void -} -define void @test() { - %X = or i1 false, false - br i1 %X, label %T.i, label %F.i -T.i: - call void @foo() - br label %bar.exit -F.i: - call fastcc void @foo() - br label %bar.exit -bar.exit: - ret void -} -</pre> - -<p>Here you can see that the inlining pass made an undefined call to @foo with - the wrong calling convention. We really don't want to make the inliner have - to know about this sort of thing, so it needs to be valid code. In this case, - dead code elimination can trivially remove the undefined code. However, if %X - was an input argument to @test, the inliner would produce this: -</p> - -<pre class="doc_code"> -define fastcc void @foo() { - ret void -} - -define void @test(i1 %X) { - br i1 %X, label %T.i, label %F.i -T.i: - call void @foo() - br label %bar.exit -F.i: - call fastcc void @foo() - br label %bar.exit -bar.exit: - ret void -} -</pre> - -<p>The interesting thing about this is that %X <em>must</em> be false for the -code to be well-defined, but no amount of dead code elimination will be able to -delete the broken call as unreachable. However, since instcombine/simplifycfg -turns the undefined call into unreachable, we end up with a branch on a -condition that goes to unreachable: a branch to unreachable can never happen, so -"-inline -instcombine -simplifycfg" is able to produce:</p> - -<pre class="doc_code"> -define fastcc void @foo() { - ret void -} -define void @test(i1 %X) { -F.i: - call fastcc void @foo() - ret void -} -</pre> - -</div> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-03-27 13:25:16 +0200 (Tue, 27 Mar 2012) $ -</address> - -</body> -</html> diff --git a/docs/FAQ.rst b/docs/FAQ.rst new file mode 100644 index 0000000..b0e3ca0 --- /dev/null +++ b/docs/FAQ.rst @@ -0,0 +1,464 @@ +.. _faq: + +================================ +Frequently Asked Questions (FAQ) +================================ + +.. contents:: + :local: + + +License +======= + +Does the University of Illinois Open Source License really qualify as an "open source" license? +----------------------------------------------------------------------------------------------- +Yes, the license is `certified +<http://www.opensource.org/licenses/UoI-NCSA.php>`_ by the Open Source +Initiative (OSI). + + +Can I modify LLVM source code and redistribute the modified source? 
+------------------------------------------------------------------- +Yes. The modified source distribution must retain the copyright notice and +follow the three bulletted conditions listed in the `LLVM license +<http://llvm.org/svn/llvm-project/llvm/trunk/LICENSE.TXT>`_. + + +Can I modify the LLVM source code and redistribute binaries or other tools based on it, without redistributing the source? +-------------------------------------------------------------------------------------------------------------------------- +Yes. This is why we distribute LLVM under a less restrictive license than GPL, +as explained in the first question above. + + +Source Code +=========== + +In what language is LLVM written? +--------------------------------- +All of the LLVM tools and libraries are written in C++ with extensive use of +the STL. + + +How portable is the LLVM source code? +------------------------------------- +The LLVM source code should be portable to most modern Unix-like operating +systems. Most of the code is written in standard C++ with operating system +services abstracted to a support library. The tools required to build and +test LLVM have been ported to a plethora of platforms. + +Some porting problems may exist in the following areas: + +* The autoconf/makefile build system relies heavily on UNIX shell tools, + like the Bourne Shell and sed. Porting to systems without these tools + (MacOS 9, Plan 9) will require more effort. + + +Build Problems +============== + +When I run configure, it finds the wrong C compiler. +---------------------------------------------------- +The ``configure`` script attempts to locate first ``gcc`` and then ``cc``, +unless it finds compiler paths set in ``CC`` and ``CXX`` for the C and C++ +compiler, respectively. + +If ``configure`` finds the wrong compiler, either adjust your ``PATH`` +environment variable or set ``CC`` and ``CXX`` explicitly. + + +The ``configure`` script finds the right C compiler, but it uses the LLVM tools from a previous build. What do I do? +--------------------------------------------------------------------------------------------------------------------- +The ``configure`` script uses the ``PATH`` to find executables, so if it's +grabbing the wrong linker/assembler/etc, there are two ways to fix it: + +#. Adjust your ``PATH`` environment variable so that the correct program + appears first in the ``PATH``. This may work, but may not be convenient + when you want them *first* in your path for other work. + +#. Run ``configure`` with an alternative ``PATH`` that is correct. In a + Bourne compatible shell, the syntax would be: + +.. code-block:: bash + + % PATH=[the path without the bad program] ./configure ... + +This is still somewhat inconvenient, but it allows ``configure`` to do its +work without having to adjust your ``PATH`` permanently. + + +When creating a dynamic library, I get a strange GLIBC error. +------------------------------------------------------------- +Under some operating systems (i.e. Linux), libtool does not work correctly if +GCC was compiled with the ``--disable-shared option``. To work around this, +install your own version of GCC that has shared libraries enabled by default. + + +I've updated my source tree from Subversion, and now my build is trying to use a file/directory that doesn't exist. +------------------------------------------------------------------------------------------------------------------- +You need to re-run configure in your object directory. 
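For example (a sketch; ``OBJ_ROOT`` and ``LLVM_SRC`` here stand for your object
directory and your source tree, and you should repeat whatever configure options
you originally used):

.. code-block:: bash

   % cd $OBJ_ROOT
   % $LLVM_SRC/configure [your original configure options]
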
When new Makefiles +are added to the source tree, they have to be copied over to the object tree +in order to be used by the build. + + +I've modified a Makefile in my source tree, but my build tree keeps using the old version. What do I do? +--------------------------------------------------------------------------------------------------------- +If the Makefile already exists in your object tree, you can just run the +following command in the top level directory of your object tree: + +.. code-block:: bash + + % ./config.status <relative path to Makefile>; + +If the Makefile is new, you will have to modify the configure script to copy +it over. + + +I've upgraded to a new version of LLVM, and I get strange build errors. +----------------------------------------------------------------------- +Sometimes, changes to the LLVM source code alters how the build system works. +Changes in ``libtool``, ``autoconf``, or header file dependencies are +especially prone to this sort of problem. + +The best thing to try is to remove the old files and re-build. In most cases, +this takes care of the problem. To do this, just type ``make clean`` and then +``make`` in the directory that fails to build. + + +I've built LLVM and am testing it, but the tests freeze. +-------------------------------------------------------- +This is most likely occurring because you built a profile or release +(optimized) build of LLVM and have not specified the same information on the +``gmake`` command line. + +For example, if you built LLVM with the command: + +.. code-block:: bash + + % gmake ENABLE_PROFILING=1 + +...then you must run the tests with the following commands: + +.. code-block:: bash + + % cd llvm/test + % gmake ENABLE_PROFILING=1 + +Why do test results differ when I perform different types of builds? +-------------------------------------------------------------------- +The LLVM test suite is dependent upon several features of the LLVM tools and +libraries. + +First, the debugging assertions in code are not enabled in optimized or +profiling builds. Hence, tests that used to fail may pass. + +Second, some tests may rely upon debugging options or behavior that is only +available in the debug build. These tests will fail in an optimized or +profile build. + + +Compiling LLVM with GCC 3.3.2 fails, what should I do? +------------------------------------------------------ +This is `a bug in GCC <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13392>`_, +and affects projects other than LLVM. Try upgrading or downgrading your GCC. + + +Compiling LLVM with GCC succeeds, but the resulting tools do not work, what can be wrong? +----------------------------------------------------------------------------------------- +Several versions of GCC have shown a weakness in miscompiling the LLVM +codebase. Please consult your compiler version (``gcc --version``) to find +out whether it is `broken <GettingStarted.html#brokengcc>`_. If so, your only +option is to upgrade GCC to a known good version. + + +After Subversion update, rebuilding gives the error "No rule to make target". +----------------------------------------------------------------------------- +If the error is of the form: + +.. code-block:: bash + + gmake[2]: *** No rule to make target `/path/to/somefile', + needed by `/path/to/another/file.d'. + Stop. + +This may occur anytime files are moved within the Subversion repository or +removed entirely. 
In this case, the best solution is to erase all ``.d`` +files, which list dependencies for source files, and rebuild: + +.. code-block:: bash + + % cd $LLVM_OBJ_DIR + % rm -f `find . -name \*\.d` + % gmake + +In other cases, it may be necessary to run ``make clean`` before rebuilding. + + +Source Languages +================ + +What source languages are supported? +------------------------------------ +LLVM currently has full support for C and C++ source languages. These are +available through both `Clang <http://clang.llvm.org/>`_ and `DragonEgg +<http://dragonegg.llvm.org/>`_. + +The PyPy developers are working on integrating LLVM into the PyPy backend so +that PyPy language can translate to LLVM. + + +I'd like to write a self-hosting LLVM compiler. How should I interface with the LLVM middle-end optimizers and back-end code generators? +---------------------------------------------------------------------------------------------------------------------------------------- +Your compiler front-end will communicate with LLVM by creating a module in the +LLVM intermediate representation (IR) format. Assuming you want to write your +language's compiler in the language itself (rather than C++), there are 3 +major ways to tackle generating LLVM IR from a front-end: + +1. **Call into the LLVM libraries code using your language's FFI (foreign + function interface).** + + * *for:* best tracks changes to the LLVM IR, .ll syntax, and .bc format + + * *for:* enables running LLVM optimization passes without a emit/parse + overhead + + * *for:* adapts well to a JIT context + + * *against:* lots of ugly glue code to write + +2. **Emit LLVM assembly from your compiler's native language.** + + * *for:* very straightforward to get started + + * *against:* the .ll parser is slower than the bitcode reader when + interfacing to the middle end + + * *against:* it may be harder to track changes to the IR + +3. **Emit LLVM bitcode from your compiler's native language.** + + * *for:* can use the more-efficient bitcode reader when interfacing to the + middle end + + * *against:* you'll have to re-engineer the LLVM IR object model and bitcode + writer in your language + + * *against:* it may be harder to track changes to the IR + +If you go with the first option, the C bindings in include/llvm-c should help +a lot, since most languages have strong support for interfacing with C. The +most common hurdle with calling C from managed code is interfacing with the +garbage collector. The C interface was designed to require very little memory +management, and so is straightforward in this regard. + +What support is there for a higher level source language constructs for building a compiler? +-------------------------------------------------------------------------------------------- +Currently, there isn't much. LLVM supports an intermediate representation +which is useful for code representation but will not support the high level +(abstract syntax tree) representation needed by most compilers. There are no +facilities for lexical nor semantic analysis. + + +I don't understand the ``GetElementPtr`` instruction. Help! +----------------------------------------------------------- +See `The Often Misunderstood GEP Instruction <GetElementPtr.html>`_. + + +Using the C and C++ Front Ends +============================== + +Can I compile C or C++ code to platform-independent LLVM bitcode? +----------------------------------------------------------------- +No. C and C++ are inherently platform-dependent languages. 
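A short, made-up fragment illustrates two of the problems described below:

.. code-block:: c

   #ifdef __linux__
   #include <endian.h>   /* platform-specific code selected at preprocess time */
   #endif

   unsigned long_bits(void) {
     return sizeof(long) * 8;  /* folded to a target-specific constant by the front-end */
   }
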
The most obvious +example of this is the preprocessor. A very common way that C code is made +portable is by using the preprocessor to include platform-specific code. In +practice, information about other platforms is lost after preprocessing, so +the result is inherently dependent on the platform that the preprocessing was +targeting. + +Another example is ``sizeof``. It's common for ``sizeof(long)`` to vary +between platforms. In most C front-ends, ``sizeof`` is expanded to a +constant immediately, thus hard-wiring a platform-specific detail. + +Also, since many platforms define their ABIs in terms of C, and since LLVM is +lower-level than C, front-ends currently must emit platform-specific IR in +order to have the result conform to the platform ABI. + + +Questions about code generated by the demo page +=============================================== + +What is this ``llvm.global_ctors`` and ``_GLOBAL__I_a...`` stuff that happens when I ``#include <iostream>``? +------------------------------------------------------------------------------------------------------------- +If you ``#include`` the ``<iostream>`` header into a C++ translation unit, +the file will probably use the ``std::cin``/``std::cout``/... global objects. +However, C++ does not guarantee an order of initialization between static +objects in different translation units, so if a static ctor/dtor in your .cpp +file used ``std::cout``, for example, the object would not necessarily be +automatically initialized before your use. + +To make ``std::cout`` and friends work correctly in these scenarios, the STL +that we use declares a static object that gets created in every translation +unit that includes ``<iostream>``. This object has a static constructor +and destructor that initializes and destroys the global iostream objects +before they could possibly be used in the file. The code that you see in the +``.ll`` file corresponds to the constructor and destructor registration code. + +If you would like to make it easier to *understand* the LLVM code generated +by the compiler in the demo page, consider using ``printf()`` instead of +``iostream``\s to print values. + + +Where did all of my code go?? +----------------------------- +If you are using the LLVM demo page, you may often wonder what happened to +all of the code that you typed in. Remember that the demo script is running +the code through the LLVM optimizers, so if your code doesn't actually do +anything useful, it might all be deleted. + +To prevent this, make sure that the code is actually needed. For example, if +you are computing some expression, return the value from the function instead +of leaving it in a local variable. If you really want to constrain the +optimizer, you can read from and assign to ``volatile`` global variables. + + +What is this "``undef``" thing that shows up in my code? +-------------------------------------------------------- +``undef`` is the LLVM way of representing a value that is not defined. You +can get these if you do not initialize a variable before you use it. For +example, the C function: + +.. code-block:: c + + int X() { int i; return i; } + +Is compiled to "``ret i32 undef``" because "``i``" never has a value specified +for it. + + +Why does instcombine + simplifycfg turn a call to a function with a mismatched calling convention into "unreachable"? Why not make the verifier reject it? 
+---------------------------------------------------------------------------------------------------------------------------------------------------------- +This is a common problem run into by authors of front-ends that are using +custom calling conventions: you need to make sure to set the right calling +convention on both the function and on each call to the function. For +example, this code: + +.. code-block:: llvm + + define fastcc void @foo() { + ret void + } + define void @bar() { + call void @foo() + ret void + } + +Is optimized to: + +.. code-block:: llvm + + define fastcc void @foo() { + ret void + } + define void @bar() { + unreachable + } + +... with "``opt -instcombine -simplifycfg``". This often bites people because +"all their code disappears". Setting the calling convention on the caller and +callee is required for indirect calls to work, so people often ask why not +make the verifier reject this sort of thing. + +The answer is that this code has undefined behavior, but it is not illegal. +If we made it illegal, then every transformation that could potentially create +this would have to ensure that it doesn't, and there is valid code that can +create this sort of construct (in dead code). The sorts of things that can +cause this to happen are fairly contrived, but we still need to accept them. +Here's an example: + +.. code-block:: llvm + + define fastcc void @foo() { + ret void + } + define internal void @bar(void()* %FP, i1 %cond) { + br i1 %cond, label %T, label %F + T: + call void %FP() + ret void + F: + call fastcc void %FP() + ret void + } + define void @test() { + %X = or i1 false, false + call void @bar(void()* @foo, i1 %X) + ret void + } + +In this example, "test" always passes ``@foo``/``false`` into ``bar``, which +ensures that it is dynamically called with the right calling conv (thus, the +code is perfectly well defined). If you run this through the inliner, you +get this (the explicit "or" is there so that the inliner doesn't dead code +eliminate a bunch of stuff): + +.. code-block:: llvm + + define fastcc void @foo() { + ret void + } + define void @test() { + %X = or i1 false, false + br i1 %X, label %T.i, label %F.i + T.i: + call void @foo() + br label %bar.exit + F.i: + call fastcc void @foo() + br label %bar.exit + bar.exit: + ret void + } + +Here you can see that the inlining pass made an undefined call to ``@foo`` +with the wrong calling convention. We really don't want to make the inliner +have to know about this sort of thing, so it needs to be valid code. In this +case, dead code elimination can trivially remove the undefined code. However, +if ``%X`` was an input argument to ``@test``, the inliner would produce this: + +.. code-block:: llvm + + define fastcc void @foo() { + ret void + } + + define void @test(i1 %X) { + br i1 %X, label %T.i, label %F.i + T.i: + call void @foo() + br label %bar.exit + F.i: + call fastcc void @foo() + br label %bar.exit + bar.exit: + ret void + } + +The interesting thing about this is that ``%X`` *must* be false for the +code to be well-defined, but no amount of dead code elimination will be able +to delete the broken call as unreachable. However, since +``instcombine``/``simplifycfg`` turns the undefined call into unreachable, we +end up with a branch on a condition that goes to unreachable: a branch to +unreachable can never happen, so "``-inline -instcombine -simplifycfg``" is +able to produce: + +.. 
code-block:: llvm + + define fastcc void @foo() { + ret void + } + define void @test(i1 %X) { + F.i: + call fastcc void @foo() + ret void + } diff --git a/docs/GCCFEBuildInstrs.html b/docs/GCCFEBuildInstrs.html index f502481..37800c8 100644 --- a/docs/GCCFEBuildInstrs.html +++ b/docs/GCCFEBuildInstrs.html @@ -3,7 +3,7 @@ <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> - <link rel="stylesheet" href="llvm.css" type="text/css" media="screen"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css" media="screen"> <title>Building the LLVM GCC Front-End</title> </head> <body> @@ -272,7 +272,7 @@ More information is <a href="FAQ.html#license">available in the FAQ</a>. src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ + Last modified: $Date: 2012-04-19 22:20:34 +0200 (Thu, 19 Apr 2012) $ </address> </body> diff --git a/docs/GarbageCollection.html b/docs/GarbageCollection.html index 9463eaa..0b8f588 100644 --- a/docs/GarbageCollection.html +++ b/docs/GarbageCollection.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" Content="text/html; charset=UTF-8" > <title>Accurate Garbage Collection with LLVM</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> <style type="text/css"> .rowhead { text-align: left; background: inherit; } .indent { padding-left: 1em; } @@ -475,7 +475,7 @@ Entry: ;; Tell LLVM that the stack space is a stack root. ;; Java has type-tags on objects, so we pass null as metadata. %tmp = bitcast %Object** %X to i8** - call void @llvm.gcroot(i8** %X, i8* null) + call void @llvm.gcroot(i8** %tmp, i8* null) ... ;; "CodeBlock" is the block corresponding to the start @@ -1382,7 +1382,7 @@ Fergus Henderson. International Symposium on Memory Management 2002.</p> <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-03-03 05:32:33 +0100 (Sat, 03 Mar 2012) $ + Last modified: $Date: 2012-05-03 17:25:19 +0200 (Thu, 03 May 2012) $ </address> </body> diff --git a/docs/GetElementPtr.html b/docs/GetElementPtr.html deleted file mode 100644 index 17a93f5..0000000 --- a/docs/GetElementPtr.html +++ /dev/null @@ -1,753 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>The Often Misunderstood GEP Instruction</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> - <style type="text/css"> - TABLE { text-align: left; border: 1px solid black; border-collapse: collapse; margin: 0 0 0 0; } - </style> -</head> -<body> - -<h1> - The Often Misunderstood GEP Instruction -</h1> - -<ol> - <li><a href="#intro">Introduction</a></li> - <li><a href="#addresses">Address Computation</a> - <ol> - <li><a href="#extra_index">Why is the extra 0 index required?</a></li> - <li><a href="#deref">What is dereferenced by GEP?</a></li> - <li><a href="#firstptr">Why can you index through the first pointer but not - subsequent ones?</a></li> - <li><a href="#lead0">Why don't GEP x,0,0,1 and GEP x,1 alias? </a></li> - <li><a href="#trail0">Why do GEP x,1,0,0 and GEP x,1 alias? 
</a></li> - <li><a href="#vectors">Can GEP index into vector elements?</a> - <li><a href="#addrspace">What effect do address spaces have on GEPs?</a> - <li><a href="#int">How is GEP different from ptrtoint, arithmetic, and inttoptr?</a></li> - <li><a href="#be">I'm writing a backend for a target which needs custom lowering for GEP. How do I do this?</a> - <li><a href="#vla">How does VLA addressing work with GEPs?</a> - </ol></li> - <li><a href="#rules">Rules</a> - <ol> - <li><a href="#bounds">What happens if an array index is out of bounds?</a> - <li><a href="#negative">Can array indices be negative?</a> - <li><a href="#compare">Can I compare two values computed with GEPs?</a> - <li><a href="#types">Can I do GEP with a different pointer type than the type of the underlying object?</a> - <li><a href="#null">Can I cast an object's address to integer and add it to null?</a> - <li><a href="#ptrdiff">Can I compute the distance between two objects, and add that value to one address to compute the other address?</a> - <li><a href="#tbaa">Can I do type-based alias analysis on LLVM IR?</a> - <li><a href="#overflow">What happens if a GEP computation overflows?</a> - <li><a href="#check">How can I tell if my front-end is following the rules?</a> - </ol></li> - <li><a href="#rationale">Rationale</a> - <ol> - <li><a href="#goals">Why is GEP designed this way?</a></li> - <li><a href="#i32">Why do struct member indices always use i32?</a></li> - <li><a href="#uglygep">What's an uglygep?</a> - </ol></li> - <li><a href="#summary">Summary</a></li> -</ol> - -<div class="doc_author"> - <p>Written by: <a href="mailto:rspencer@reidspencer.com">Reid Spencer</a>.</p> -</div> - - -<!-- *********************************************************************** --> -<h2><a name="intro">Introduction</a></h2> -<!-- *********************************************************************** --> - -<div> - <p>This document seeks to dispel the mystery and confusion surrounding LLVM's - <a href="LangRef.html#i_getelementptr">GetElementPtr</a> (GEP) instruction. - Questions about the wily GEP instruction are - probably the most frequently occurring questions once a developer gets down to - coding with LLVM. Here we lay out the sources of confusion and show that the - GEP instruction is really quite simple. - </p> -</div> - -<!-- *********************************************************************** --> -<h2><a name="addresses">Address Computation</a></h2> -<!-- *********************************************************************** --> -<div> - <p>When people are first confronted with the GEP instruction, they tend to - relate it to known concepts from other programming paradigms, most notably C - array indexing and field selection. GEP closely resembles C array indexing - and field selection, however it's is a little different and this leads to - the following questions.</p> - -<!-- *********************************************************************** --> -<h3> - <a name="firstptr">What is the first index of the GEP instruction?</a> -</h3> -<div> - <p>Quick answer: The index stepping through the first operand.</p> - <p>The confusion with the first index usually arises from thinking about - the GetElementPtr instruction as if it was a C index operator. They aren't the - same. For example, when we write, in "C":</p> - -<div class="doc_code"> -<pre> -AType *Foo; -... -X = &Foo->F; -</pre> -</div> - - <p>it is natural to think that there is only one index, the selection of the - field <tt>F</tt>. 
However, in this example, <tt>Foo</tt> is a pointer. That - pointer must be indexed explicitly in LLVM. C, on the other hand, indices - through it transparently. To arrive at the same address location as the C - code, you would provide the GEP instruction with two index operands. The - first operand indexes through the pointer; the second operand indexes the - field <tt>F</tt> of the structure, just as if you wrote:</p> - -<div class="doc_code"> -<pre> -X = &Foo[0].F; -</pre> -</div> - - <p>Sometimes this question gets rephrased as:</p> - <blockquote><p><i>Why is it okay to index through the first pointer, but - subsequent pointers won't be dereferenced?</i></p></blockquote> - <p>The answer is simply because memory does not have to be accessed to - perform the computation. The first operand to the GEP instruction must be a - value of a pointer type. The value of the pointer is provided directly to - the GEP instruction as an operand without any need for accessing memory. It - must, therefore be indexed and requires an index operand. Consider this - example:</p> - -<div class="doc_code"> -<pre> -struct munger_struct { - int f1; - int f2; -}; -void munge(struct munger_struct *P) { - P[0].f1 = P[1].f1 + P[2].f2; -} -... -munger_struct Array[3]; -... -munge(Array); -</pre> -</div> - - <p>In this "C" example, the front end compiler (llvm-gcc) will generate three - GEP instructions for the three indices through "P" in the assignment - statement. The function argument <tt>P</tt> will be the first operand of each - of these GEP instructions. The second operand indexes through that pointer. - The third operand will be the field offset into the - <tt>struct munger_struct</tt> type, for either the <tt>f1</tt> or - <tt>f2</tt> field. So, in LLVM assembly the <tt>munge</tt> function looks - like:</p> - -<div class="doc_code"> -<pre> -void %munge(%struct.munger_struct* %P) { -entry: - %tmp = getelementptr %struct.munger_struct* %P, i32 1, i32 0 - %tmp = load i32* %tmp - %tmp6 = getelementptr %struct.munger_struct* %P, i32 2, i32 1 - %tmp7 = load i32* %tmp6 - %tmp8 = add i32 %tmp7, %tmp - %tmp9 = getelementptr %struct.munger_struct* %P, i32 0, i32 0 - store i32 %tmp8, i32* %tmp9 - ret void -} -</pre> -</div> - - <p>In each case the first operand is the pointer through which the GEP - instruction starts. The same is true whether the first operand is an - argument, allocated memory, or a global variable. </p> - <p>To make this clear, let's consider a more obtuse example:</p> - -<div class="doc_code"> -<pre> -%MyVar = uninitialized global i32 -... -%idx1 = getelementptr i32* %MyVar, i64 0 -%idx2 = getelementptr i32* %MyVar, i64 1 -%idx3 = getelementptr i32* %MyVar, i64 2 -</pre> -</div> - - <p>These GEP instructions are simply making address computations from the - base address of <tt>MyVar</tt>. They compute, as follows (using C syntax): - </p> - -<div class="doc_code"> -<pre> -idx1 = (char*) &MyVar + 0 -idx2 = (char*) &MyVar + 4 -idx3 = (char*) &MyVar + 8 -</pre> -</div> - - <p>Since the type <tt>i32</tt> is known to be four bytes long, the indices - 0, 1 and 2 translate into memory offsets of 0, 4, and 8, respectively. No - memory is accessed to make these computations because the address of - <tt>%MyVar</tt> is passed directly to the GEP instructions.</p> - <p>The obtuse part of this example is in the cases of <tt>%idx2</tt> and - <tt>%idx3</tt>. 
They result in the computation of addresses that point to - memory past the end of the <tt>%MyVar</tt> global, which is only one - <tt>i32</tt> long, not three <tt>i32</tt>s long. While this is legal in LLVM, - it is inadvisable because any load or store with the pointer that results - from these GEP instructions would produce undefined results.</p> -</div> - -<!-- *********************************************************************** --> -<h3> - <a name="extra_index">Why is the extra 0 index required?</a> -</h3> -<!-- *********************************************************************** --> -<div> - <p>Quick answer: there are no superfluous indices.</p> - <p>This question arises most often when the GEP instruction is applied to a - global variable which is always a pointer type. For example, consider - this:</p> - -<div class="doc_code"> -<pre> -%MyStruct = uninitialized global { float*, i32 } -... -%idx = getelementptr { float*, i32 }* %MyStruct, i64 0, i32 1 -</pre> -</div> - - <p>The GEP above yields an <tt>i32*</tt> by indexing the <tt>i32</tt> typed - field of the structure <tt>%MyStruct</tt>. When people first look at it, they - wonder why the <tt>i64 0</tt> index is needed. However, a closer inspection - of how globals and GEPs work reveals the need. Becoming aware of the following - facts will dispel the confusion:</p> - <ol> - <li>The type of <tt>%MyStruct</tt> is <i>not</i> <tt>{ float*, i32 }</tt> - but rather <tt>{ float*, i32 }*</tt>. That is, <tt>%MyStruct</tt> is a - pointer to a structure containing a pointer to a <tt>float</tt> and an - <tt>i32</tt>.</li> - <li>Point #1 is evidenced by noticing the type of the first operand of - the GEP instruction (<tt>%MyStruct</tt>) which is - <tt>{ float*, i32 }*</tt>.</li> - <li>The first index, <tt>i64 0</tt> is required to step over the global - variable <tt>%MyStruct</tt>. Since the first argument to the GEP - instruction must always be a value of pointer type, the first index - steps through that pointer. A value of 0 means 0 elements offset from that - pointer.</li> - <li>The second index, <tt>i32 1</tt> selects the second field of the - structure (the <tt>i32</tt>). </li> - </ol> -</div> - -<!-- *********************************************************************** --> -<h3> - <a name="deref">What is dereferenced by GEP?</a> -</h3> -<div> - <p>Quick answer: nothing.</p> - <p>The GetElementPtr instruction dereferences nothing. That is, it doesn't - access memory in any way. That's what the Load and Store instructions are for. - GEP is only involved in the computation of addresses. For example, consider - this:</p> - -<div class="doc_code"> -<pre> -%MyVar = uninitialized global { [40 x i32 ]* } -... -%idx = getelementptr { [40 x i32]* }* %MyVar, i64 0, i32 0, i64 0, i64 17 -</pre> -</div> - - <p>In this example, we have a global variable, <tt>%MyVar</tt> that is a - pointer to a structure containing a pointer to an array of 40 ints. The - GEP instruction seems to be accessing the 18th integer of the structure's - array of ints. However, this is actually an illegal GEP instruction. It - won't compile. The reason is that the pointer in the structure <i>must</i> - be dereferenced in order to index into the array of 40 ints. 
Since the - GEP instruction never accesses memory, it is illegal.</p> - <p>In order to access the 18th integer in the array, you would need to do the - following:</p> - -<div class="doc_code"> -<pre> -%idx = getelementptr { [40 x i32]* }* %, i64 0, i32 0 -%arr = load [40 x i32]** %idx -%idx = getelementptr [40 x i32]* %arr, i64 0, i64 17 -</pre> -</div> - - <p>In this case, we have to load the pointer in the structure with a load - instruction before we can index into the array. If the example was changed - to:</p> - -<div class="doc_code"> -<pre> -%MyVar = uninitialized global { [40 x i32 ] } -... -%idx = getelementptr { [40 x i32] }*, i64 0, i32 0, i64 17 -</pre> -</div> - - <p>then everything works fine. In this case, the structure does not contain a - pointer and the GEP instruction can index through the global variable, - into the first field of the structure and access the 18th <tt>i32</tt> in the - array there.</p> -</div> - -<!-- *********************************************************************** --> -<h3> - <a name="lead0">Why don't GEP x,0,0,1 and GEP x,1 alias?</a> -</h3> -<div> - <p>Quick Answer: They compute different address locations.</p> - <p>If you look at the first indices in these GEP - instructions you find that they are different (0 and 1), therefore the address - computation diverges with that index. Consider this example:</p> - -<div class="doc_code"> -<pre> -%MyVar = global { [10 x i32 ] } -%idx1 = getelementptr { [10 x i32 ] }* %MyVar, i64 0, i32 0, i64 1 -%idx2 = getelementptr { [10 x i32 ] }* %MyVar, i64 1 -</pre> -</div> - - <p>In this example, <tt>idx1</tt> computes the address of the second integer - in the array that is in the structure in <tt>%MyVar</tt>, that is - <tt>MyVar+4</tt>. The type of <tt>idx1</tt> is <tt>i32*</tt>. However, - <tt>idx2</tt> computes the address of <i>the next</i> structure after - <tt>%MyVar</tt>. The type of <tt>idx2</tt> is <tt>{ [10 x i32] }*</tt> and its - value is equivalent to <tt>MyVar + 40</tt> because it indexes past the ten - 4-byte integers in <tt>MyVar</tt>. Obviously, in such a situation, the - pointers don't alias.</p> - -</div> - -<!-- *********************************************************************** --> -<h3> - <a name="trail0">Why do GEP x,1,0,0 and GEP x,1 alias?</a> -</h3> -<div> - <p>Quick Answer: They compute the same address location.</p> - <p>These two GEP instructions will compute the same address because indexing - through the 0th element does not change the address. However, it does change - the type. Consider this example:</p> - -<div class="doc_code"> -<pre> -%MyVar = global { [10 x i32 ] } -%idx1 = getelementptr { [10 x i32 ] }* %MyVar, i64 1, i32 0, i64 0 -%idx2 = getelementptr { [10 x i32 ] }* %MyVar, i64 1 -</pre> -</div> - - <p>In this example, the value of <tt>%idx1</tt> is <tt>%MyVar+40</tt> and - its type is <tt>i32*</tt>. The value of <tt>%idx2</tt> is also - <tt>MyVar+40</tt> but its type is <tt>{ [10 x i32] }*</tt>.</p> -</div> - -<!-- *********************************************************************** --> - -<h3> - <a name="vectors">Can GEP index into vector elements?</a> -</h3> -<div> - <p>This hasn't always been forcefully disallowed, though it's not recommended. - It leads to awkward special cases in the optimizers, and fundamental - inconsistency in the IR. 
In the future, it will probably be outright - disallowed.</p> - -</div> - -<!-- *********************************************************************** --> - -<h3> - <a name="addrspace">What effect do address spaces have on GEPs?</a> -</h3> -<div> - <p>None, except that the address space qualifier on the first operand pointer - type always matches the address space qualifier on the result type.</p> - -</div> - -<!-- *********************************************************************** --> - -<h3> - <a name="int"> - How is GEP different from ptrtoint, arithmetic, and inttoptr? - </a> -</h3> -<div> - <p>It's very similar; there are only subtle differences.</p> - - <p>With ptrtoint, you have to pick an integer type. One approach is to pick i64; - this is safe on everything LLVM supports (LLVM internally assumes pointers - are never wider than 64 bits in many places), and the optimizer will actually - narrow the i64 arithmetic down to the actual pointer size on targets which - don't support 64-bit arithmetic in most cases. However, there are some cases - where it doesn't do this. With GEP you can avoid this problem. - - <p>Also, GEP carries additional pointer aliasing rules. It's invalid to take a - GEP from one object, address into a different separately allocated - object, and dereference it. IR producers (front-ends) must follow this rule, - and consumers (optimizers, specifically alias analysis) benefit from being - able to rely on it. See the <a href="#rules">Rules</a> section for more - information.</p> - - <p>And, GEP is more concise in common cases.</p> - - <p>However, for the underlying integer computation implied, there - is no difference.</p> - -</div> - -<!-- *********************************************************************** --> - -<h3> - <a name="be"> - I'm writing a backend for a target which needs custom lowering for GEP. - How do I do this? - </a> -</h3> -<div> - <p>You don't. The integer computation implied by a GEP is target-independent. - Typically what you'll need to do is make your backend pattern-match - expressions trees involving ADD, MUL, etc., which are what GEP is lowered - into. This has the advantage of letting your code work correctly in more - cases.</p> - - <p>GEP does use target-dependent parameters for the size and layout of data - types, which targets can customize.</p> - - <p>If you require support for addressing units which are not 8 bits, you'll - need to fix a lot of code in the backend, with GEP lowering being only a - small piece of the overall picture.</p> - -</div> - -<!-- *********************************************************************** --> - -<h3> - <a name="vla">How does VLA addressing work with GEPs?</a> -</h3> -<div> - <p>GEPs don't natively support VLAs. LLVM's type system is entirely static, - and GEP address computations are guided by an LLVM type.</p> - - <p>VLA indices can be implemented as linearized indices. For example, an - expression like X[a][b][c], must be effectively lowered into a form - like X[a*m+b*n+c], so that it appears to the GEP as a single-dimensional - array reference.</p> - - <p>This means if you want to write an analysis which understands array - indices and you want to support VLAs, your code will have to be - prepared to reverse-engineer the linearization. 
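<p>For instance (an illustrative sketch), an access such as <tt>X[a][b][c]</tt>
   into a VLA of doubles with inner dimensions <tt>%m</tt> and <tt>%k</tt> might be
   emitted as:</p>

<div class="doc_code">
<pre>
%t1  = mul i64 %a, %m        ; linearize: (a*m + b)*k + c
%t2  = add i64 %t1, %b
%t3  = mul i64 %t2, %k
%idx = add i64 %t3, %c
%p   = getelementptr double* %X, i64 %idx   ; single-dimensional GEP
</pre>
</div>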
One way to solve this - problem is to use the ScalarEvolution library, which always presents - VLA and non-VLA indexing in the same manner.</p> -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="rules">Rules</a></h2> -<!-- *********************************************************************** --> -<div> -<!-- *********************************************************************** --> - -<h3> - <a name="bounds">What happens if an array index is out of bounds?</a> -</h3> -<div> - <p>There are two senses in which an array index can be out of bounds.</p> - - <p>First, there's the array type which comes from the (static) type of - the first operand to the GEP. Indices greater than the number of elements - in the corresponding static array type are valid. There is no problem with - out of bounds indices in this sense. Indexing into an array only depends - on the size of the array element, not the number of elements.</p> - - <p>A common example of how this is used is arrays where the size is not known. - It's common to use array types with zero length to represent these. The - fact that the static type says there are zero elements is irrelevant; it's - perfectly valid to compute arbitrary element indices, as the computation - only depends on the size of the array element, not the number of - elements. Note that zero-sized arrays are not a special case here.</p> - - <p>This sense is unconnected with <tt>inbounds</tt> keyword. The - <tt>inbounds</tt> keyword is designed to describe low-level pointer - arithmetic overflow conditions, rather than high-level array - indexing rules. - - <p>Analysis passes which wish to understand array indexing should not - assume that the static array type bounds are respected.</p> - - <p>The second sense of being out of bounds is computing an address that's - beyond the actual underlying allocated object.</p> - - <p>With the <tt>inbounds</tt> keyword, the result value of the GEP is - undefined if the address is outside the actual underlying allocated - object and not the address one-past-the-end.</p> - - <p>Without the <tt>inbounds</tt> keyword, there are no restrictions - on computing out-of-bounds addresses. Obviously, performing a load or - a store requires an address of allocated and sufficiently aligned - memory. But the GEP itself is only concerned with computing addresses.</p> - -</div> - -<!-- *********************************************************************** --> -<h3> - <a name="negative">Can array indices be negative?</a> -</h3> -<div> - <p>Yes. This is basically a special case of array indices being out - of bounds.</p> - -</div> - -<!-- *********************************************************************** --> -<h3> - <a name="compare">Can I compare two values computed with GEPs?</a> -</h3> -<div> - <p>Yes. If both addresses are within the same allocated object, or - one-past-the-end, you'll get the comparison result you expect. If either - is outside of it, integer arithmetic wrapping may occur, so the - comparison may not be meaningful.</p> - -</div> - -<!-- *********************************************************************** --> -<h3> - <a name="types"> - Can I do GEP with a different pointer type than the type of - the underlying object? - </a> -</h3> -<div> - <p>Yes. There are no restrictions on bitcasting a pointer value to an arbitrary - pointer type. The types in a GEP serve only to define the parameters for the - underlying integer computation. 
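<p>For example (a sketch), nothing prevents addressing an <tt>i32</tt> object
   byte-by-byte through an <tt>i8*</tt>:</p>

<div class="doc_code">
<pre>
%MyVar = global i32 0
...
%bytes = bitcast i32* %MyVar to i8*
%third = getelementptr i8* %bytes, i64 3   ; byte offset 3 within the i32
</pre>
</div>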
They need not correspond with the actual - type of the underlying object.</p> - - <p>Furthermore, loads and stores don't have to use the same types as the type - of the underlying object. Types in this context serve only to specify - memory size and alignment. Beyond that there are merely a hint to the - optimizer indicating how the value will likely be used.</p> - -</div> - -<!-- *********************************************************************** --> -<h3> - <a name="null"> - Can I cast an object's address to integer and add it to null? - </a> -</h3> -<div> - <p>You can compute an address that way, but if you use GEP to do the add, - you can't use that pointer to actually access the object, unless the - object is managed outside of LLVM.</p> - - <p>The underlying integer computation is sufficiently defined; null has a - defined value -- zero -- and you can add whatever value you want to it.</p> - - <p>However, it's invalid to access (load from or store to) an LLVM-aware - object with such a pointer. This includes GlobalVariables, Allocas, and - objects pointed to by noalias pointers.</p> - - <p>If you really need this functionality, you can do the arithmetic with - explicit integer instructions, and use inttoptr to convert the result to - an address. Most of GEP's special aliasing rules do not apply to pointers - computed from ptrtoint, arithmetic, and inttoptr sequences.</p> - -</div> - -<!-- *********************************************************************** --> -<h3> - <a name="ptrdiff"> - Can I compute the distance between two objects, and add - that value to one address to compute the other address? - </a> -</h3> -<div> - <p>As with arithmetic on null, You can use GEP to compute an address that - way, but you can't use that pointer to actually access the object if you - do, unless the object is managed outside of LLVM.</p> - - <p>Also as above, ptrtoint and inttoptr provide an alternative way to do this - which do not have this restriction.</p> - -</div> - -<!-- *********************************************************************** --> -<h3> - <a name="tbaa">Can I do type-based alias analysis on LLVM IR?</a> -</h3> -<div> - <p>You can't do type-based alias analysis using LLVM's built-in type system, - because LLVM has no restrictions on mixing types in addressing, loads or - stores.</p> - - <p>LLVM's type-based alias analysis pass uses metadata to describe a different - type system (such as the C type system), and performs type-based aliasing - on top of that. Further details are in the - <a href="LangRef.html#tbaa">language reference</a>.</p> - -</div> - -<!-- *********************************************************************** --> - -<h3> - <a name="overflow">What happens if a GEP computation overflows?</a> -</h3> -<div> - <p>If the GEP lacks the <tt>inbounds</tt> keyword, the value is the result - from evaluating the implied two's complement integer computation. However, - since there's no guarantee of where an object will be allocated in the - address space, such values have limited meaning.</p> - - <p>If the GEP has the <tt>inbounds</tt> keyword, the result value is - undefined (a "<a href="LangRef.html#trapvalues">trap value</a>") if the GEP - overflows (i.e. wraps around the end of the address space).</p> - - <p>As such, there are some ramifications of this for inbounds GEPs: scales - implied by array/vector/pointer indices are always known to be "nsw" since - they are signed values that are scaled by the element size. 
These values - are also allowed to be negative (e.g. "gep i32 *%P, i32 -1") but the - pointer itself is logically treated as an unsigned value. This means that - GEPs have an asymmetric relation between the pointer base (which is treated - as unsigned) and the offset applied to it (which is treated as signed). The - result of the additions within the offset calculation cannot have signed - overflow, but when applied to the base pointer, there can be signed - overflow. - </p> - - -</div> - -<!-- *********************************************************************** --> - -<h3> - <a name="check"> - How can I tell if my front-end is following the rules? - </a> -</h3> -<div> - <p>There is currently no checker for the getelementptr rules. Currently, - the only way to do this is to manually check each place in your front-end - where GetElementPtr operators are created.</p> - - <p>It's not possible to write a checker which could find all rule - violations statically. It would be possible to write a checker which - works by instrumenting the code with dynamic checks though. Alternatively, - it would be possible to write a static checker which catches a subset of - possible problems. However, no such checker exists today.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="rationale">Rationale</a></h2> -<!-- *********************************************************************** --> -<div> -<!-- *********************************************************************** --> - -<h3> - <a name="goals">Why is GEP designed this way?</a> -</h3> -<div> - <p>The design of GEP has the following goals, in rough unofficial - order of priority:</p> - <ul> - <li>Support C, C-like languages, and languages which can be - conceptually lowered into C (this covers a lot).</li> - <li>Support optimizations such as those that are common in - C compilers. In particular, GEP is a cornerstone of LLVM's - <a href="LangRef.html#pointeraliasing">pointer aliasing model</a>.</li> - <li>Provide a consistent method for computing addresses so that - address computations don't need to be a part of load and - store instructions in the IR.</li> - <li>Support non-C-like languages, to the extent that it doesn't - interfere with other goals.</li> - <li>Minimize target-specific information in the IR.</li> - </ul> -</div> - -<!-- *********************************************************************** --> -<h3> - <a name="i32">Why do struct member indices always use i32?</a> -</h3> -<div> - <p>The specific type i32 is probably just a historical artifact, however it's - wide enough for all practical purposes, so there's been no need to change it. - It doesn't necessarily imply i32 address arithmetic; it's just an identifier - which identifies a field in a struct. Requiring that all struct indices be - the same reduces the range of possibilities for cases where two GEPs are - effectively the same but have distinct operand types.</p> - -</div> - -<!-- *********************************************************************** --> - -<h3> - <a name="uglygep">What's an uglygep?</a> -</h3> -<div> - <p>Some LLVM optimizers operate on GEPs by internally lowering them into - more primitive integer expressions, which allows them to be combined - with other integer expressions and/or split into multiple separate - integer expressions. 
If they've made non-trivial changes, translating - back into LLVM IR can involve reverse-engineering the structure of - the addressing in order to fit it into the static type of the original - first operand. It isn't always possibly to fully reconstruct this - structure; sometimes the underlying addressing doesn't correspond with - the static type at all. In such cases the optimizer instead will emit - a GEP with the base pointer casted to a simple address-unit pointer, - using the name "uglygep". This isn't pretty, but it's just as - valid, and it's sufficient to preserve the pointer aliasing guarantees - that GEP provides.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="summary">Summary</a></h2> -<!-- *********************************************************************** --> - -<div> - <p>In summary, here's some things to always remember about the GetElementPtr - instruction:</p> - <ol> - <li>The GEP instruction never accesses memory, it only provides pointer - computations.</li> - <li>The first operand to the GEP instruction is always a pointer and it must - be indexed.</li> - <li>There are no superfluous indices for the GEP instruction.</li> - <li>Trailing zero indices are superfluous for pointer aliasing, but not for - the types of the pointers.</li> - <li>Leading zero indices are not superfluous for pointer aliasing nor the - types of the pointers.</li> - </ol> -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-31 14:04:26 +0100 (Mon, 31 Oct 2011) $ -</address> -</body> -</html> diff --git a/docs/GetElementPtr.rst b/docs/GetElementPtr.rst new file mode 100644 index 0000000..f6f904b --- /dev/null +++ b/docs/GetElementPtr.rst @@ -0,0 +1,538 @@ +.. _gep: + +======================================= +The Often Misunderstood GEP Instruction +======================================= + +.. contents:: + :local: + +Introduction +============ + +This document seeks to dispel the mystery and confusion surrounding LLVM's +`GetElementPtr <LangRef.html#i_getelementptr>`_ (GEP) instruction. Questions +about the wily GEP instruction are probably the most frequently occurring +questions once a developer gets down to coding with LLVM. Here we lay out the +sources of confusion and show that the GEP instruction is really quite simple. + +Address Computation +=================== + +When people are first confronted with the GEP instruction, they tend to relate +it to known concepts from other programming paradigms, most notably C array +indexing and field selection. GEP closely resembles C array indexing and field +selection, however it's is a little different and this leads to the following +questions. + +What is the first index of the GEP instruction? +----------------------------------------------- + +Quick answer: The index stepping through the first operand. + +The confusion with the first index usually arises from thinking about the +GetElementPtr instruction as if it was a C index operator. They aren't the +same. For example, when we write, in "C": + +.. code-block:: c++ + + AType *Foo; + ... 
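   // a single C expression: it steps through the pointer Foo and selects field F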
+ X = &Foo->F; + +it is natural to think that there is only one index, the selection of the field +``F``. However, in this example, ``Foo`` is a pointer. That pointer +must be indexed explicitly in LLVM. C, on the other hand, indices through it +transparently. To arrive at the same address location as the C code, you would +provide the GEP instruction with two index operands. The first operand indexes +through the pointer; the second operand indexes the field ``F`` of the +structure, just as if you wrote: + +.. code-block:: c++ + + X = &Foo[0].F; + +Sometimes this question gets rephrased as: + +.. _GEP index through first pointer: + + *Why is it okay to index through the first pointer, but subsequent pointers + won't be dereferenced?* + +The answer is simply because memory does not have to be accessed to perform the +computation. The first operand to the GEP instruction must be a value of a +pointer type. The value of the pointer is provided directly to the GEP +instruction as an operand without any need for accessing memory. It must, +therefore be indexed and requires an index operand. Consider this example: + +.. code-block:: c++ + + struct munger_struct { + int f1; + int f2; + }; + void munge(struct munger_struct *P) { + P[0].f1 = P[1].f1 + P[2].f2; + } + ... + munger_struct Array[3]; + ... + munge(Array); + +In this "C" example, the front end compiler (llvm-gcc) will generate three GEP +instructions for the three indices through "P" in the assignment statement. The +function argument ``P`` will be the first operand of each of these GEP +instructions. The second operand indexes through that pointer. The third +operand will be the field offset into the ``struct munger_struct`` type, for +either the ``f1`` or ``f2`` field. So, in LLVM assembly the ``munge`` function +looks like: + +.. code-block:: llvm + + void %munge(%struct.munger_struct* %P) { + entry: + %tmp = getelementptr %struct.munger_struct* %P, i32 1, i32 0 + %tmp = load i32* %tmp + %tmp6 = getelementptr %struct.munger_struct* %P, i32 2, i32 1 + %tmp7 = load i32* %tmp6 + %tmp8 = add i32 %tmp7, %tmp + %tmp9 = getelementptr %struct.munger_struct* %P, i32 0, i32 0 + store i32 %tmp8, i32* %tmp9 + ret void + } + +In each case the first operand is the pointer through which the GEP instruction +starts. The same is true whether the first operand is an argument, allocated +memory, or a global variable. + +To make this clear, let's consider a more obtuse example: + +.. code-block:: llvm + + %MyVar = uninitialized global i32 + ... + %idx1 = getelementptr i32* %MyVar, i64 0 + %idx2 = getelementptr i32* %MyVar, i64 1 + %idx3 = getelementptr i32* %MyVar, i64 2 + +These GEP instructions are simply making address computations from the base +address of ``MyVar``. They compute, as follows (using C syntax): + +.. code-block:: c++ + + idx1 = (char*) &MyVar + 0 + idx2 = (char*) &MyVar + 4 + idx3 = (char*) &MyVar + 8 + +Since the type ``i32`` is known to be four bytes long, the indices 0, 1 and 2 +translate into memory offsets of 0, 4, and 8, respectively. No memory is +accessed to make these computations because the address of ``%MyVar`` is passed +directly to the GEP instructions. + +The obtuse part of this example is in the cases of ``%idx2`` and ``%idx3``. They +result in the computation of addresses that point to memory past the end of the +``%MyVar`` global, which is only one ``i32`` long, not three ``i32``\s long. 
+While this is legal in LLVM, it is inadvisable because any load or store with
+the pointer that results from these GEP instructions would produce undefined
+results.
+
+Why is the extra 0 index required?
+----------------------------------
+
+Quick answer: there are no superfluous indices.
+
+This question arises most often when the GEP instruction is applied to a global
+variable which is always a pointer type. For example, consider this:
+
+.. code-block:: llvm
+
+  %MyStruct = uninitialized global { float*, i32 }
+  ...
+  %idx = getelementptr { float*, i32 }* %MyStruct, i64 0, i32 1
+
+The GEP above yields an ``i32*`` by indexing the ``i32`` typed field of the
+structure ``%MyStruct``. When people first look at it, they wonder why the
+``i64 0`` index is needed. However, a closer inspection of how globals and GEPs
+work reveals the need. Becoming aware of the following facts will dispel the
+confusion:
+
+#. The type of ``%MyStruct`` is *not* ``{ float*, i32 }`` but rather ``{
+   float*, i32 }*``. That is, ``%MyStruct`` is a pointer to a structure
+   containing a pointer to a ``float`` and an ``i32``.
+
+#. Point #1 is evidenced by noticing the type of the first operand of the GEP
+   instruction (``%MyStruct``), which is ``{ float*, i32 }*``.
+
+#. The first index, ``i64 0``, is required to step over the global variable
+   ``%MyStruct``. Since the first argument to the GEP instruction must always
+   be a value of pointer type, the first index steps through that pointer. A
+   value of 0 means 0 elements offset from that pointer.
+
+#. The second index, ``i32 1``, selects the second field of the structure (the
+   ``i32``).
+
+What is dereferenced by GEP?
+----------------------------
+
+Quick answer: nothing.
+
+The GetElementPtr instruction dereferences nothing. That is, it doesn't access
+memory in any way. That's what the Load and Store instructions are for. GEP is
+only involved in the computation of addresses. For example, consider this:
+
+.. code-block:: llvm
+
+  %MyVar = uninitialized global { [40 x i32 ]* }
+  ...
+  %idx = getelementptr { [40 x i32]* }* %MyVar, i64 0, i32 0, i64 0, i64 17
+
+In this example, we have a global variable, ``%MyVar``, that is a pointer to a
+structure containing a pointer to an array of 40 ints. The GEP instruction
+seems to be accessing the 18th integer of the structure's array of ints.
+However, this is actually an illegal GEP instruction. It won't compile. The
+reason is that the pointer in the structure *must* be dereferenced in order to
+index into the array of 40 ints. Since the GEP instruction never accesses
+memory, it is illegal.
+
+In order to access the 18th integer in the array, you would need to do the
+following:
+
+.. code-block:: llvm
+
+  %idx1 = getelementptr { [40 x i32]* }* %MyVar, i64 0, i32 0
+  %arr = load [40 x i32]** %idx1
+  %idx2 = getelementptr [40 x i32]* %arr, i64 0, i64 17
+
+In this case, we have to load the pointer in the structure with a load
+instruction before we can index into the array. If the example were changed to:
+
+.. code-block:: llvm
+
+  %MyVar = uninitialized global { [40 x i32 ] }
+  ...
+  %idx = getelementptr { [40 x i32] }* %MyVar, i64 0, i32 0, i64 17
+
+then everything works fine. In this case, the structure does not contain a
+pointer and the GEP instruction can index through the global variable, into the
+first field of the structure and access the 18th ``i32`` in the array there.
+
+Why don't GEP x,0,0,1 and GEP x,1 alias?
+----------------------------------------
+
+Quick Answer: They compute different address locations.
+ +If you look at the first indices in these GEP instructions you find that they +are different (0 and 1), therefore the address computation diverges with that +index. Consider this example: + +.. code-block:: llvm + + %MyVar = global { [10 x i32 ] } + %idx1 = getelementptr { [10 x i32 ] }* %MyVar, i64 0, i32 0, i64 1 + %idx2 = getelementptr { [10 x i32 ] }* %MyVar, i64 1 + +In this example, ``idx1`` computes the address of the second integer in the +array that is in the structure in ``%MyVar``, that is ``MyVar+4``. The type of +``idx1`` is ``i32*``. However, ``idx2`` computes the address of *the next* +structure after ``%MyVar``. The type of ``idx2`` is ``{ [10 x i32] }*`` and its +value is equivalent to ``MyVar + 40`` because it indexes past the ten 4-byte +integers in ``MyVar``. Obviously, in such a situation, the pointers don't +alias. + +Why do GEP x,1,0,0 and GEP x,1 alias? +------------------------------------- + +Quick Answer: They compute the same address location. + +These two GEP instructions will compute the same address because indexing +through the 0th element does not change the address. However, it does change the +type. Consider this example: + +.. code-block:: llvm + + %MyVar = global { [10 x i32 ] } + %idx1 = getelementptr { [10 x i32 ] }* %MyVar, i64 1, i32 0, i64 0 + %idx2 = getelementptr { [10 x i32 ] }* %MyVar, i64 1 + +In this example, the value of ``%idx1`` is ``%MyVar+40`` and its type is +``i32*``. The value of ``%idx2`` is also ``MyVar+40`` but its type is ``{ [10 x +i32] }*``. + +Can GEP index into vector elements? +----------------------------------- + +This hasn't always been forcefully disallowed, though it's not recommended. It +leads to awkward special cases in the optimizers, and fundamental inconsistency +in the IR. In the future, it will probably be outright disallowed. + +What effect do address spaces have on GEPs? +------------------------------------------- + +None, except that the address space qualifier on the first operand pointer type +always matches the address space qualifier on the result type. + +How is GEP different from ``ptrtoint``, arithmetic, and ``inttoptr``? +--------------------------------------------------------------------- + +It's very similar; there are only subtle differences. + +With ptrtoint, you have to pick an integer type. One approach is to pick i64; +this is safe on everything LLVM supports (LLVM internally assumes pointers are +never wider than 64 bits in many places), and the optimizer will actually narrow +the i64 arithmetic down to the actual pointer size on targets which don't +support 64-bit arithmetic in most cases. However, there are some cases where it +doesn't do this. With GEP you can avoid this problem. + +Also, GEP carries additional pointer aliasing rules. It's invalid to take a GEP +from one object, address into a different separately allocated object, and +dereference it. IR producers (front-ends) must follow this rule, and consumers +(optimizers, specifically alias analysis) benefit from being able to rely on +it. See the `Rules`_ section for more information. + +And, GEP is more concise in common cases. + +However, for the underlying integer computation implied, there is no +difference. + + +I'm writing a backend for a target which needs custom lowering for GEP. How do I do this? +----------------------------------------------------------------------------------------- + +You don't. The integer computation implied by a GEP is target-independent. 
+Typically what you'll need to do is make your backend pattern-match expressions +trees involving ADD, MUL, etc., which are what GEP is lowered into. This has the +advantage of letting your code work correctly in more cases. + +GEP does use target-dependent parameters for the size and layout of data types, +which targets can customize. + +If you require support for addressing units which are not 8 bits, you'll need to +fix a lot of code in the backend, with GEP lowering being only a small piece of +the overall picture. + +How does VLA addressing work with GEPs? +--------------------------------------- + +GEPs don't natively support VLAs. LLVM's type system is entirely static, and GEP +address computations are guided by an LLVM type. + +VLA indices can be implemented as linearized indices. For example, an expression +like ``X[a][b][c]``, must be effectively lowered into a form like +``X[a*m+b*n+c]``, so that it appears to the GEP as a single-dimensional array +reference. + +This means if you want to write an analysis which understands array indices and +you want to support VLAs, your code will have to be prepared to reverse-engineer +the linearization. One way to solve this problem is to use the ScalarEvolution +library, which always presents VLA and non-VLA indexing in the same manner. + +.. _Rules: + +Rules +===== + +What happens if an array index is out of bounds? +------------------------------------------------ + +There are two senses in which an array index can be out of bounds. + +First, there's the array type which comes from the (static) type of the first +operand to the GEP. Indices greater than the number of elements in the +corresponding static array type are valid. There is no problem with out of +bounds indices in this sense. Indexing into an array only depends on the size of +the array element, not the number of elements. + +A common example of how this is used is arrays where the size is not known. +It's common to use array types with zero length to represent these. The fact +that the static type says there are zero elements is irrelevant; it's perfectly +valid to compute arbitrary element indices, as the computation only depends on +the size of the array element, not the number of elements. Note that zero-sized +arrays are not a special case here. + +This sense is unconnected with ``inbounds`` keyword. The ``inbounds`` keyword is +designed to describe low-level pointer arithmetic overflow conditions, rather +than high-level array indexing rules. + +Analysis passes which wish to understand array indexing should not assume that +the static array type bounds are respected. + +The second sense of being out of bounds is computing an address that's beyond +the actual underlying allocated object. + +With the ``inbounds`` keyword, the result value of the GEP is undefined if the +address is outside the actual underlying allocated object and not the address +one-past-the-end. + +Without the ``inbounds`` keyword, there are no restrictions on computing +out-of-bounds addresses. Obviously, performing a load or a store requires an +address of allocated and sufficiently aligned memory. But the GEP itself is only +concerned with computing addresses. + +Can array indices be negative? +------------------------------ + +Yes. This is basically a special case of array indices being out of bounds. + +Can I compare two values computed with GEPs? +-------------------------------------------- + +Yes. 
If both addresses are within the same allocated object, or
+one-past-the-end, you'll get the comparison result you expect. If either is
+outside of it, integer arithmetic wrapping may occur, so the comparison may not
+be meaningful.
+
+Can I do GEP with a different pointer type than the type of the underlying object?
+----------------------------------------------------------------------------------
+
+Yes. There are no restrictions on bitcasting a pointer value to an arbitrary
+pointer type. The types in a GEP serve only to define the parameters for the
+underlying integer computation. They need not correspond with the actual type
+of the underlying object.
+
+Furthermore, loads and stores don't have to use the same types as the type of
+the underlying object. Types in this context serve only to specify memory size
+and alignment. Beyond that, they are merely a hint to the optimizer indicating
+how the value will likely be used.
+
+Can I cast an object's address to integer and add it to null?
+-------------------------------------------------------------
+
+You can compute an address that way, but if you use GEP to do the add, you can't
+use that pointer to actually access the object, unless the object is managed
+outside of LLVM.
+
+The underlying integer computation is sufficiently defined; null has a defined
+value --- zero --- and you can add whatever value you want to it.
+
+However, it's invalid to access (load from or store to) an LLVM-aware object
+with such a pointer. This includes ``GlobalVariables``, ``Allocas``, and objects
+pointed to by noalias pointers.
+
+If you really need this functionality, you can do the arithmetic with explicit
+integer instructions, and use inttoptr to convert the result to an address. Most
+of GEP's special aliasing rules do not apply to pointers computed from ptrtoint,
+arithmetic, and inttoptr sequences.
+
+Can I compute the distance between two objects, and add that value to one address to compute the other address?
+---------------------------------------------------------------------------------------------------------------
+
+As with arithmetic on null, you can use GEP to compute an address that way, but
+you can't use that pointer to actually access the object if you do, unless the
+object is managed outside of LLVM.
+
+Also as above, ptrtoint and inttoptr provide an alternative way to do this which
+does not have this restriction.
+
+Can I do type-based alias analysis on LLVM IR?
+----------------------------------------------
+
+You can't do type-based alias analysis using LLVM's built-in type system,
+because LLVM has no restrictions on mixing types in addressing, loads, or
+stores.
+
+LLVM's type-based alias analysis pass uses metadata to describe a different type
+system (such as the C type system), and performs type-based aliasing on top of
+that. Further details are in the `language reference <LangRef.html#tbaa>`_.
+
+What happens if a GEP computation overflows?
+--------------------------------------------
+
+If the GEP lacks the ``inbounds`` keyword, the value is the result from
+evaluating the implied two's complement integer computation. However, since
+there's no guarantee of where an object will be allocated in the address space,
+such values have limited meaning.
+
+If the GEP has the ``inbounds`` keyword, the result value is undefined (a "trap
+value") if the GEP overflows (i.e. wraps around the end of the address space).
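+
+As a purely illustrative sketch (the index value below is invented for
+demonstration and is not taken from the specification), assume ``%P`` is an
+``i32*`` in a 64-bit address space. The implied byte offset (4 bytes times
+2^62) wraps a 64-bit address space, so the two forms behave differently:
+
+.. code-block:: llvm
+
+  ; Plain GEP: the implied two's complement computation simply wraps.
+  %wrapped = getelementptr i32* %P, i64 4611686018427387904
+  ; Inbounds GEP: the same overflow makes the result an undefined ("trap") value.
+  %trap = getelementptr inbounds i32* %P, i64 4611686018427387904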
+
+As such, there are some ramifications of this rule for inbounds GEPs: scales
+implied by array/vector/pointer indices are always known to be "nsw" since they
+are signed values that are scaled by the element size. These values are also
+allowed to be negative (e.g. "``gep i32 *%P, i32 -1``") but the pointer itself
+is logically treated as an unsigned value. This means that GEPs have an
+asymmetric relation between the pointer base (which is treated as unsigned) and
+the offset applied to it (which is treated as signed). The result of the
+additions within the offset calculation cannot have signed overflow, but when
+applied to the base pointer, there can be signed overflow.
+
+How can I tell if my front-end is following the rules?
+------------------------------------------------------
+
+There is currently no checker for the getelementptr rules. The only way to
+verify them is to manually check each place in your front-end where
+GetElementPtr operators are created.
+
+It's not possible to write a checker which could find all rule violations
+statically. It would, however, be possible to write a checker which works by
+instrumenting the code with dynamic checks. Alternatively, it would be possible
+to write a static checker which catches a subset of possible problems. However,
+no such checker exists today.
+
+Rationale
+=========
+
+Why is GEP designed this way?
+-----------------------------
+
+The design of GEP has the following goals, in rough unofficial order of
+priority:
+
+* Support C, C-like languages, and languages which can be conceptually lowered
+  into C (this covers a lot).
+
+* Support optimizations such as those that are common in C compilers. In
+  particular, GEP is a cornerstone of LLVM's `pointer aliasing
+  model <LangRef.html#pointeraliasing>`_.
+
+* Provide a consistent method for computing addresses so that address
+  computations don't need to be a part of load and store instructions in the
+  IR.
+
+* Support non-C-like languages, to the extent that it doesn't interfere with
+  other goals.
+
+* Minimize target-specific information in the IR.
+
+Why do struct member indices always use ``i32``?
+------------------------------------------------
+
+The specific type i32 is probably just a historical artifact; however, it's wide
+enough for all practical purposes, so there's been no need to change it. It
+doesn't necessarily imply i32 address arithmetic; it's just an identifier which
+identifies a field in a struct. Requiring that all struct indices be the same
+type reduces the range of possibilities for cases where two GEPs are effectively
+the same but have distinct operand types.
+
+What's an uglygep?
+------------------
+
+Some LLVM optimizers operate on GEPs by internally lowering them into more
+primitive integer expressions, which allows them to be combined with other
+integer expressions and/or split into multiple separate integer expressions. If
+they've made non-trivial changes, translating back into LLVM IR can involve
+reverse-engineering the structure of the addressing in order to fit it into the
+static type of the original first operand. It isn't always possible to fully
+reconstruct this structure; sometimes the underlying addressing doesn't
+correspond with the static type at all. In such cases the optimizer will instead
+emit a GEP with the base pointer cast to a simple address-unit pointer, using
+the name "uglygep". This isn't pretty, but it's just as valid, and it's
+sufficient to preserve the pointer aliasing guarantees that GEP provides.
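+
+As a rough, purely illustrative sketch (the struct type, byte offset, and value
+names below are invented rather than actual optimizer output), an uglygep
+generally amounts to bitcasting the base pointer to ``i8*``, applying the raw
+byte offset directly, and casting the result back to the desired pointer type:
+
+.. code-block:: llvm
+
+  ; Hypothetical addressing that no longer matches the static type of %base:
+  ;   original: getelementptr %some.struct* %base, i64 1, i32 2
+  ; Re-emitted using an address-unit (i8*) base instead:
+  %raw = bitcast %some.struct* %base to i8*
+  %uglygep = getelementptr i8* %raw, i64 20
+  %result = bitcast i8* %uglygep to i32*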
+ +Summary +======= + +In summary, here's some things to always remember about the GetElementPtr +instruction: + + +#. The GEP instruction never accesses memory, it only provides pointer + computations. + +#. The first operand to the GEP instruction is always a pointer and it must be + indexed. + +#. There are no superfluous indices for the GEP instruction. + +#. Trailing zero indices are superfluous for pointer aliasing, but not for the + types of the pointers. + +#. Leading zero indices are not superfluous for pointer aliasing nor the types + of the pointers. diff --git a/docs/GettingStarted.html b/docs/GettingStarted.html index 52baf90..61335af 100644 --- a/docs/GettingStarted.html +++ b/docs/GettingStarted.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>Getting Started with LLVM System</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -344,7 +344,7 @@ up</a></li> <li><a name="pf_7">Native code generation exists but is not complete.</a></li> <li><a name="pf_8">Binutils 2.20 or later is required to build the assembler generated by LLVM properly.</a></li> -<li><a name="pf_9">XCode 2.5 and gcc 4.0.1</a> (Apple Build 5370) will trip +<li><a name="pf_9">Xcode 2.5 and gcc 4.0.1</a> (Apple Build 5370) will trip internal LLVM assert messages when compiled for Release at optimization levels greater than 0 (i.e., <i>"-O1"</i> and higher). Add <i>OPTIMIZE_OPTION="-O0"</i> to the build command line @@ -590,6 +590,7 @@ as the previous one. It appears to work with ENABLE_OPTIMIZED=0 (the default).</ <p><b>GCC 4.3.3 (Debian 4.3.3-10) on ARM</b>: Miscompiles parts of LLVM 2.6 when optimizations are turned on. The symptom is an infinite loop in FoldingSetImpl::RemoveNode while running the code generator.</p> +<p><b>SUSE 11 GCC 4.3.4</b>: Miscompiles LLVM, causing crashes in ValueHandle logic.</p> <p><b>GCC 4.3.5 and GCC 4.4.5 on ARM</b>: These can miscompile <tt>value >> 1</tt> even at -O0. A test failure in <tt>test/Assembler/alignstack.ll</tt> is one symptom of the problem. @@ -626,7 +627,7 @@ upgrading to a newer version of Gold.</p> LLVM and to give you some basic information about the LLVM environment.</p> <p>The later sections of this guide describe the <a -href="#layout">general layout</a> of the the LLVM source tree, a <a +href="#layout">general layout</a> of the LLVM source tree, a <a href="#tutorial">simple example</a> using the LLVM tool chain, and <a href="#links">links</a> to find more information about LLVM or to get help via e-mail.</p> @@ -748,6 +749,8 @@ revision), you can checkout it from the '<tt>tags</tt>' directory (instead of subdirectories of the '<tt>tags</tt>' directory:</p> <ul> +<li>Release 3.1: <b>RELEASE_31/final</b></li> +<li>Release 3.0: <b>RELEASE_30/final</b></li> <li>Release 2.9: <b>RELEASE_29/final</b></li> <li>Release 2.8: <b>RELEASE_28</b></li> <li>Release 2.7: <b>RELEASE_27</b></li> @@ -1015,7 +1018,7 @@ script to configure the build system:</p> selected as the target of the build host. You can also specify a comma separated list of target names that you want available in llc. The target names use all lower case. The current set of targets is: <br> - <tt>arm, cbe, cpp, hexagon, mblaze, mips, mipsel, msp430, powerpc, ptx, sparc, spu, x86, x86_64, xcore</tt>. + <tt>arm, cpp, hexagon, mblaze, mips, mipsel, msp430, powerpc, ptx, sparc, spu, x86, x86_64, xcore</tt>. 
<br><br></dd> <dt><i>--enable-doxygen</i></dt> <dd>Look for the doxygen program and enable construction of doxygen based @@ -1510,12 +1513,6 @@ information is in the <a href="CommandGuide/index.html">Command Guide</a>.</p> <dd>The disassembler transforms the LLVM bitcode to human readable LLVM assembly.</dd> - <dt><tt><b>llvm-ld</b></tt></dt> - <dd><tt>llvm-ld</tt> is a general purpose and extensible linker for LLVM. - It performs standard link time optimizations and allows optimization - modules to be loaded and run so that language specific optimizations can - be applied at link time.</dd> - <dt><tt><b>llvm-link</b></tt></dt> <dd><tt>llvm-link</tt>, not surprisingly, links multiple LLVM modules into a single program.</dd> @@ -1538,7 +1535,7 @@ information is in the <a href="CommandGuide/index.html">Command Guide</a>.</p> bitcode or assembly (with the <tt>-emit-llvm</tt> option) instead of the usual machine code output. It works just like any other GCC compiler, taking the typical <tt>-c, -S, -E, -o</tt> options that are typically used. - Additionally, the the source code for <tt>llvm-gcc</tt> is available as a + Additionally, the source code for <tt>llvm-gcc</tt> is available as a separate Subversion module.</dd> <dt><tt><b>opt</b></tt></dt> @@ -1757,7 +1754,7 @@ out:</p> <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.x10sys.com/rspencer/">Reid Spencer</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-03-27 13:25:16 +0200 (Tue, 27 Mar 2012) $ + Last modified: $Date: 2012-07-23 10:51:15 +0200 (Mon, 23 Jul 2012) $ </address> </body> </html> diff --git a/docs/GettingStartedVS.html b/docs/GettingStartedVS.html deleted file mode 100644 index beadd0b..0000000 --- a/docs/GettingStartedVS.html +++ /dev/null @@ -1,368 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>Getting Started with LLVM System for Microsoft Visual Studio</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> - -<h1> - Getting Started with the LLVM System using Microsoft Visual Studio -</h1> - -<ul> - <li><a href="#overview">Overview</a> - <li><a href="#requirements">Requirements</a> - <ol> - <li><a href="#hardware">Hardware</a> - <li><a href="#software">Software</a> - </ol></li> - <li><a href="#quickstart">Getting Started</a> - <li><a href="#tutorial">An Example Using the LLVM Tool Chain</a> - <li><a href="#problems">Common Problems</a> - <li><a href="#links">Links</a> -</ul> - -<div class="doc_author"> - <p>Written by: <a href="http://llvm.org/">The LLVM Team</a></p> -</div> - - -<!-- *********************************************************************** --> -<h2> - <a name="overview"><b>Overview</b></a> -</h2> -<!-- *********************************************************************** --> - -<div> - - <p>Welcome to LLVM on Windows! This document only covers LLVM on Windows using - Visual Studio, not mingw or cygwin. In order to get started, you first need to - know some basic information.</p> - - <p>There are many different projects that compose LLVM. The first is the LLVM - suite. This contains all of the tools, libraries, and header files needed to - use LLVM. It contains an assembler, disassembler, - bitcode analyzer and bitcode optimizer. 
It also contains a test suite that can - be used to test the LLVM tools.</p> - - <p>Another useful project on Windows is - <a href="http://clang.llvm.org/">clang</a>. Clang is a C family - ([Objective]C/C++) compiler. Clang mostly works on Windows, but does not - currently understand all of the Microsoft extensions to C and C++. Because of - this, clang cannot parse the C++ standard library included with Visual Studio, - nor parts of the Windows Platform SDK. However, most standard C programs do - compile. Clang can be used to emit bitcode, directly emit object files or - even linked executables using Visual Studio's <tt>link.exe</tt></p> - - <p>The large LLVM test suite cannot be run on the Visual Studio port at this - time.</p> - - <p>Most of the tools build and work. <tt>bugpoint</tt> does build, but does - not work.</p> - - <p>Additional information about the LLVM directory structure and tool chain - can be found on the main <a href="GettingStarted.html">Getting Started</a> - page.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="requirements"><b>Requirements</b></a> -</h2> -<!-- *********************************************************************** --> - -<div> - - <p>Before you begin to use the LLVM system, review the requirements given - below. This may save you some trouble by knowing ahead of time what hardware - and software you will need.</p> - -<!-- ======================================================================= --> -<h3> - <a name="hardware"><b>Hardware</b></a> -</h3> - -<div> - - <p>Any system that can adequately run Visual Studio 2008 is fine. The LLVM - source tree and object files, libraries and executables will consume - approximately 3GB.</p> - -</div> - -<!-- ======================================================================= --> -<h3><a name="software"><b>Software</b></a></h3> -<div> - - <p>You will need Visual Studio 2008 or higher. Earlier versions of Visual - Studio have bugs, are not completely compatible, or do not support the C++ - standard well enough.</p> - - <p>You will also need the <a href="http://www.cmake.org/">CMake</a> build - system since it generates the project files you will use to build with.</p> - - <p>If you would like to run the LLVM tests you will need - <a href="http://www.python.org/">Python</a>. Versions 2.4-2.7 are known to - work. You will need <a href="http://gnuwin32.sourceforge.net/">"GnuWin32"</a> - tools, too.</p> - - <p>Do not install the LLVM directory tree into a path containing spaces (e.g. - C:\Documents and Settings\...) 
as the configure step will fail.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="quickstart"><b>Getting Started</b></a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Here's the short story for getting up and running quickly with LLVM:</p> - -<ol> - <li>Read the documentation.</li> - <li>Seriously, read the documentation.</li> - <li>Remember that you were warned twice about reading the documentation.</li> - - <li>Get the Source Code - <ul> - <li>With the distributed files: - <ol> - <li><tt>cd <i>where-you-want-llvm-to-live</i></tt> - <li><tt>gunzip --stdout llvm-<i>version</i>.tar.gz | tar -xvf -</tt> - <i> or use WinZip</i> - <li><tt>cd llvm</tt></li> - </ol></li> - - <li>With anonymous Subversion access: - <ol> - <li><tt>cd <i>where-you-want-llvm-to-live</i></tt></li> - <li><tt>svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm</tt></li> - <li><tt>cd llvm</tt></li> - </ol></li> - </ul></li> - - <li> Use <a href="http://www.cmake.org/">CMake</a> to generate up-to-date - project files: - <ul> - <li>Once CMake is installed then the simplest way is to just start the - CMake GUI, select the directory where you have LLVM extracted to, and the - default options should all be fine. One option you may really want to - change, regardless of anything else, might be the CMAKE_INSTALL_PREFIX - setting to select a directory to INSTALL to once compiling is complete, - although installation is not mandatory for using LLVM. Another important - option is LLVM_TARGETS_TO_BUILD, which controls the LLVM target - architectures that are included on the build. - <li>See the <a href="CMake.html">LLVM CMake guide</a> for - detailed information about how to configure the LLVM - build.</li> - </ul> - </li> - - <li>Start Visual Studio - <ul> - <li>In the directory you created the project files will have - an <tt>llvm.sln</tt> file, just double-click on that to open - Visual Studio.</li> - </ul></li> - - <li>Build the LLVM Suite: - <ul> - <li>The projects may still be built individually, but - to build them all do not just select all of them in batch build (as some - are meant as configuration projects), but rather select and build just - the ALL_BUILD project to build everything, or the INSTALL project, which - first builds the ALL_BUILD project, then installs the LLVM headers, libs, - and other useful things to the directory set by the CMAKE_INSTALL_PREFIX - setting when you first configured CMake.</li> - <li>The Fibonacci project is a sample program that uses the JIT. - Modify the project's debugging properties to provide a numeric - command line argument or run it from the command line. The - program will print the corresponding fibonacci value.</li> - </ul></li> - - <li>Test LLVM on Visual Studio: - <ul> - <li>If %PATH% does not contain GnuWin32, you may specify LLVM_LIT_TOOLS_DIR - on CMake for the path to GnuWin32.</li> - <li>You can run LLVM tests by merely building the project - "check". The test results will be shown in the VS output - window.</li> - </ul> - </li> - - <!-- FIXME: Is it up-to-date? 
--> - <li>Test LLVM: - <ul> - <li>The LLVM tests can be run by <tt>cd</tt>ing to the llvm source directory - and running: - -<div class="doc_code"> -<pre> -% llvm-lit test -</pre> -</div> - - <p>Note that quite a few of these test will fail.</p> - </li> - - <li>A specific test or test directory can be run with: - -<div class="doc_code"> -<pre> -% llvm-lit test/path/to/test -</pre> -</div> - </li> - </ul> -</ol> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="tutorial">An Example Using the LLVM Tool Chain</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<ol> - <li><p>First, create a simple C file, name it 'hello.c':</p> - -<div class="doc_code"> -<pre> -#include <stdio.h> -int main() { - printf("hello world\n"); - return 0; -} -</pre></div></li> - - <li><p>Next, compile the C file into a LLVM bitcode file:</p> - -<div class="doc_code"> -<pre> -% clang -c hello.c -emit-llvm -o hello.bc -</pre> -</div> - - <p>This will create the result file <tt>hello.bc</tt> which is the LLVM - bitcode that corresponds the the compiled program and the library - facilities that it required. You can execute this file directly using - <tt>lli</tt> tool, compile it to native assembly with the <tt>llc</tt>, - optimize or analyze it further with the <tt>opt</tt> tool, etc.</p> - - <p>Alternatively you can directly output an executable with clang with: - </p> - -<div class="doc_code"> -<pre> -% clang hello.c -o hello.exe -</pre> -</div> - - <p>The <tt>-o hello.exe</tt> is required because clang currently outputs - <tt>a.out</tt> when neither <tt>-o</tt> nor <tt>-c</tt> are given.</p> - - <li><p>Run the program using the just-in-time compiler:</p> - -<div class="doc_code"> -<pre> -% lli hello.bc -</pre> -</div> - - <li><p>Use the <tt>llvm-dis</tt> utility to take a look at the LLVM assembly - code:</p> - -<div class="doc_code"> -<pre> -% llvm-dis < hello.bc | more -</pre> -</div></li> - - <li><p>Compile the program to object code using the LLC code generator:</p> - -<div class="doc_code"> -<pre> -% llc -filetype=obj hello.bc -</pre> -</div></li> - - <li><p>Link to binary using Microsoft link:</p> - -<div class="doc_code"> -<pre> -% link hello.obj -defaultlib:libcmt -</pre> -</div> - - <li><p>Execute the native code program:</p> - -<div class="doc_code"> -<pre> -% hello.exe -</pre> -</div></li> -</ol> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="problems">Common Problems</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>If you are having problems building or using LLVM, or if you have any other -general questions about LLVM, please consult the <a href="FAQ.html">Frequently -Asked Questions</a> page.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="links">Links</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>This document is just an <b>introduction</b> to how to use LLVM to do -some simple things... there are many more interesting and complicated things -that you can do that aren't documented here (but we'll gladly accept a patch -if you want to write something up!). 
For more information about LLVM, check -out:</p> - -<ul> - <li><a href="http://llvm.org/">LLVM homepage</a></li> - <li><a href="http://llvm.org/doxygen/">LLVM doxygen tree</a></li> -</ul> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-01-25 23:00:23 +0100 (Wed, 25 Jan 2012) $ -</address> -</body> -</html> diff --git a/docs/GettingStartedVS.rst b/docs/GettingStartedVS.rst new file mode 100644 index 0000000..35f97f0 --- /dev/null +++ b/docs/GettingStartedVS.rst @@ -0,0 +1,234 @@ +.. _winvs: + +================================================================== +Getting Started with the LLVM System using Microsoft Visual Studio +================================================================== + +.. contents:: + :local: + + +Overview +======== +Welcome to LLVM on Windows! This document only covers LLVM on Windows using +Visual Studio, not mingw or cygwin. In order to get started, you first need to +know some basic information. + +There are many different projects that compose LLVM. The first is the LLVM +suite. This contains all of the tools, libraries, and header files needed to +use LLVM. It contains an assembler, disassembler, +bitcode analyzer and bitcode optimizer. It also contains a test suite that can +be used to test the LLVM tools. + +Another useful project on Windows is `Clang <http://clang.llvm.org/>`_. +Clang is a C family ([Objective]C/C++) compiler. Clang mostly works on +Windows, but does not currently understand all of the Microsoft extensions +to C and C++. Because of this, clang cannot parse the C++ standard library +included with Visual Studio, nor parts of the Windows Platform SDK. However, +most standard C programs do compile. Clang can be used to emit bitcode, +directly emit object files or even linked executables using Visual Studio's +``link.exe``. + +The large LLVM test suite cannot be run on the Visual Studio port at this +time. + +Most of the tools build and work. ``bugpoint`` does build, but does +not work. + +Additional information about the LLVM directory structure and tool chain +can be found on the main `Getting Started <GettingStarted.html>`_ page. + + +Requirements +============ +Before you begin to use the LLVM system, review the requirements given +below. This may save you some trouble by knowing ahead of time what hardware +and software you will need. + +Hardware +-------- +Any system that can adequately run Visual Studio 2008 is fine. The LLVM +source tree and object files, libraries and executables will consume +approximately 3GB. + +Software +-------- +You will need Visual Studio 2008 or higher. Earlier versions of Visual +Studio have bugs, are not completely compatible, or do not support the C++ +standard well enough. + +You will also need the `CMake <http://www.cmake.org/>`_ build system since it +generates the project files you will use to build with. + +If you would like to run the LLVM tests you will need `Python +<http://www.python.org/>`_. Versions 2.4-2.7 are known to work. You will need +`GnuWin32 <http://gnuwin32.sourceforge.net/>`_ tools, too. 
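+
+If you prefer a plain command prompt to the CMake GUI described below, the
+Visual Studio project files can also be generated with an invocation roughly
+like the following; the generator name, install prefix, and source path are
+examples only and must match your own Visual Studio version and directory
+layout:
+
+.. code-block:: bat
+
+  C:\..> cmake -G "Visual Studio 9 2008" -DCMAKE_INSTALL_PREFIX=C:\llvm-install C:\path\to\llvm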
+ +Do not install the LLVM directory tree into a path containing spaces (e.g. +``C:\Documents and Settings\...``) as the configure step will fail. + + +Getting Started +=============== +Here's the short story for getting up and running quickly with LLVM: + +1. Read the documentation. +2. Seriously, read the documentation. +3. Remember that you were warned twice about reading the documentation. +4. Get the Source Code + + * With the distributed files: + + 1. ``cd <where-you-want-llvm-to-live>`` + 2. ``gunzip --stdout llvm-VERSION.tar.gz | tar -xvf -`` + (*or use WinZip*) + 3. ``cd llvm`` + + * With anonymous Subversion access: + + 1. ``cd <where-you-want-llvm-to-live>`` + 2. ``svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm`` + 3. ``cd llvm`` + +5. Use `CMake <http://www.cmake.org/>`_ to generate up-to-date project files: + + * Once CMake is installed then the simplest way is to just start the + CMake GUI, select the directory where you have LLVM extracted to, and + the default options should all be fine. One option you may really + want to change, regardless of anything else, might be the + ``CMAKE_INSTALL_PREFIX`` setting to select a directory to INSTALL to + once compiling is complete, although installation is not mandatory for + using LLVM. Another important option is ``LLVM_TARGETS_TO_BUILD``, + which controls the LLVM target architectures that are included on the + build. + * See the `LLVM CMake guide <CMake.html>`_ for detailed information about + how to configure the LLVM build. + +6. Start Visual Studio + + * In the directory you created the project files will have an ``llvm.sln`` + file, just double-click on that to open Visual Studio. + +7. Build the LLVM Suite: + + * The projects may still be built individually, but to build them all do + not just select all of them in batch build (as some are meant as + configuration projects), but rather select and build just the + ``ALL_BUILD`` project to build everything, or the ``INSTALL`` project, + which first builds the ``ALL_BUILD`` project, then installs the LLVM + headers, libs, and other useful things to the directory set by the + ``CMAKE_INSTALL_PREFIX`` setting when you first configured CMake. + * The Fibonacci project is a sample program that uses the JIT. Modify the + project's debugging properties to provide a numeric command line argument + or run it from the command line. The program will print the + corresponding fibonacci value. + +8. Test LLVM on Visual Studio: + + * If ``%PATH%`` does not contain GnuWin32, you may specify + ``LLVM_LIT_TOOLS_DIR`` on CMake for the path to GnuWin32. + * You can run LLVM tests by merely building the project "check". The test + results will be shown in the VS output window. + +.. FIXME: Is it up-to-date? + +9. Test LLVM: + + * The LLVM tests can be run by changing directory to the llvm source + directory and running: + + .. code-block:: bat + + C:\..\llvm> llvm-lit test + + Note that quite a few of these test will fail. + + A specific test or test directory can be run with: + + .. code-block:: bat + + C:\..\llvm> llvm-lit test/path/to/test + + +An Example Using the LLVM Tool Chain +==================================== + +1. First, create a simple C file, name it '``hello.c``': + + .. code-block:: c + + #include <stdio.h> + int main() { + printf("hello world\n"); + return 0; + } + +2. Next, compile the C file into a LLVM bitcode file: + + .. 
code-block:: bat + + C:\..> clang -c hello.c -emit-llvm -o hello.bc + + This will create the result file ``hello.bc`` which is the LLVM bitcode + that corresponds the compiled program and the library facilities that + it required. You can execute this file directly using ``lli`` tool, + compile it to native assembly with the ``llc``, optimize or analyze it + further with the ``opt`` tool, etc. + + Alternatively you can directly output an executable with clang with: + + .. code-block:: bat + + C:\..> clang hello.c -o hello.exe + + The ``-o hello.exe`` is required because clang currently outputs ``a.out`` + when neither ``-o`` nor ``-c`` are given. + +3. Run the program using the just-in-time compiler: + + .. code-block:: bat + + C:\..> lli hello.bc + +4. Use the ``llvm-dis`` utility to take a look at the LLVM assembly code: + + .. code-block:: bat + + C:\..> llvm-dis < hello.bc | more + +5. Compile the program to object code using the LLC code generator: + + .. code-block:: bat + + C:\..> llc -filetype=obj hello.bc + +6. Link to binary using Microsoft link: + + .. code-block:: bat + + C:\..> link hello.obj -defaultlib:libcmt + +7. Execute the native code program: + + .. code-block:: bat + + C:\..> hello.exe + + +Common Problems +=============== +If you are having problems building or using LLVM, or if you have any other +general questions about LLVM, please consult the `Frequently Asked Questions +<FAQ.html>`_ page. + + +Links +===== +This document is just an **introduction** to how to use LLVM to do some simple +things... there are many more interesting and complicated things that you can +do that aren't documented here (but we'll gladly accept a patch if you want to +write something up!). For more information about LLVM, check out: + +* `LLVM homepage <http://llvm.org/>`_ +* `LLVM doxygen tree <http://llvm.org/doxygen/>`_ + diff --git a/docs/GoldPlugin.html b/docs/GoldPlugin.html index 2c08bd0..1e99a5a 100644 --- a/docs/GoldPlugin.html +++ b/docs/GoldPlugin.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>LLVM gold plugin</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> diff --git a/docs/HowToAddABuilder.html b/docs/HowToAddABuilder.html index 0de2dac..985b30e 100644 --- a/docs/HowToAddABuilder.html +++ b/docs/HowToAddABuilder.html @@ -6,7 +6,7 @@ <title> How To Add Your Build Configuration To LLVM Buildbot Infrastructure </title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> diff --git a/docs/HowToReleaseLLVM.html b/docs/HowToReleaseLLVM.html index 30c4f0c..6fdec2c 100644 --- a/docs/HowToReleaseLLVM.html +++ b/docs/HowToReleaseLLVM.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>How To Release LLVM To The Public</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -476,7 +476,7 @@ $ tar -cvf - llvm-test-<i>X.Y</i>rc1 | gzip > llvm-test-<i>X.Y</i>rc1.src.t <p>Review the documentation and ensure that it is up to date. The "Release Notes" must be updated to reflect new features, bug fixes, new known issues, and changes in the list of supported platforms. 
The "Getting Started Guide" - should be updated to reflect the new release version number tag avaiable from + should be updated to reflect the new release version number tag available from Subversion and changes in basic system requirements. Merge both changes from mainline into the release branch.</p> @@ -575,7 +575,7 @@ $ svn copy https://llvm.org/svn/llvm-project/test-suite/branches/release_XY \ src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a> <br> - Last modified: $Date: 2011-10-31 12:21:59 +0100 (Mon, 31 Oct 2011) $ + Last modified: $Date: 2012-07-31 09:05:57 +0200 (Tue, 31 Jul 2012) $ </address> </body> </html> diff --git a/docs/HowToSubmitABug.html b/docs/HowToSubmitABug.html index 0071ec6..ef7cf9e 100644 --- a/docs/HowToSubmitABug.html +++ b/docs/HowToSubmitABug.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>How to submit an LLVM bug report</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -31,9 +31,6 @@ <a href="http://misha.brukman.net">Misha Brukman</a></p> </div> </td> -<td class="right"> - <img src="img/Debugging.gif" alt="Debugging" width="444" height="314"> -</td> </tr> </table> @@ -226,12 +223,12 @@ we have chased down ended up being bugs in the program being compiled, not LLVM.</p> <p>Once you determine that the program itself is not buggy, you should choose -which code generator you wish to compile the program with (e.g. C backend, the -JIT, or LLC) and optionally a series of LLVM passes to run. For example:</p> +which code generator you wish to compile the program with (e.g. LLC or the JIT) +and optionally a series of LLVM passes to run. For example:</p> <div class="doc_code"> <p><tt> -<b>bugpoint</b> -run-cbe [... optzn passes ...] file-to-test.bc --args -- [program arguments]</tt></p> +<b>bugpoint</b> -run-llc [... optzn passes ...] 
file-to-test.bc --args -- [program arguments]</tt></p> </div> <p><tt>bugpoint</tt> will try to narrow down your list of passes to the one pass @@ -341,7 +338,7 @@ the following:</p> <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a> <br> - Last modified: $Date: 2011-10-31 12:21:59 +0100 (Mon, 31 Oct 2011) $ + Last modified: $Date: 2012-06-14 18:52:55 +0200 (Thu, 14 Jun 2012) $ </address> </body> diff --git a/docs/LLVMBuild.html b/docs/LLVMBuild.html index a8420dd..9e7f8c7 100644 --- a/docs/LLVMBuild.html +++ b/docs/LLVMBuild.html @@ -3,7 +3,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>LLVMBuild Documentation</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> diff --git a/docs/LangRef.html b/docs/LangRef.html index 5261230..946380e 100644 --- a/docs/LangRef.html +++ b/docs/LangRef.html @@ -7,7 +7,7 @@ <meta name="author" content="Chris Lattner"> <meta name="description" content="LLVM Assembly Language Reference Manual."> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -257,6 +257,8 @@ <li><a href="#int_exp">'<tt>llvm.exp.*</tt>' Intrinsic</a></li> <li><a href="#int_log">'<tt>llvm.log.*</tt>' Intrinsic</a></li> <li><a href="#int_fma">'<tt>llvm.fma.*</tt>' Intrinsic</a></li> + <li><a href="#int_fabs">'<tt>llvm.fabs.*</tt>' Intrinsic</a></li> + <li><a href="#int_floor">'<tt>llvm.floor.*</tt>' Intrinsic</a></li> </ol> </li> <li><a href="#int_manip">Bit Manipulation Intrinsics</a> @@ -277,6 +279,11 @@ <li><a href="#int_umul_overflow">'<tt>llvm.umul.with.overflow.*</tt> Intrinsics</a></li> </ol> </li> + <li><a href="#spec_arithmetic">Specialised Arithmetic Intrinsics</a> + <ol> + <li><a href="#fmuladd">'<tt>llvm.fmuladd</tt> Intrinsic</a></li> + </ol> + </li> <li><a href="#int_fp16">Half Precision Floating Point Intrinsics</a> <ol> <li><a href="#int_convert_to_fp16">'<tt>llvm.convert.to.fp16</tt>' Intrinsic</a></li> @@ -307,12 +314,16 @@ '<tt>llvm.annotation.*</tt>' Intrinsic</a></li> <li><a href="#int_trap"> '<tt>llvm.trap</tt>' Intrinsic</a></li> + <li><a href="#int_debugtrap"> + '<tt>llvm.debugtrap</tt>' Intrinsic</a></li> <li><a href="#int_stackprotector"> '<tt>llvm.stackprotector</tt>' Intrinsic</a></li> - <li><a href="#int_objectsize"> + <li><a href="#int_objectsize"> '<tt>llvm.objectsize</tt>' Intrinsic</a></li> - <li><a href="#int_expect"> + <li><a href="#int_expect"> '<tt>llvm.expect</tt>' Intrinsic</a></li> + <li><a href="#int_donothing"> + '<tt>llvm.donothing</tt>' Intrinsic</a></li> </ol> </li> </ol> @@ -831,9 +842,32 @@ define i32 @main() { <i>; i32()* </i> <p>Global variables define regions of memory allocated at compilation time instead of run-time. Global variables may optionally be initialized, may have an explicit section to be placed in, and may have an optional explicit - alignment specified. A variable may be defined as "thread_local", which + alignment specified.</p> + +<p>A variable may be defined as <tt>thread_local</tt>, which means that it will not be shared by threads (each thread will have a - separated copy of the variable). A variable may be defined as a global + separated copy of the variable). Not all targets support thread-local + variables. 
Optionally, a TLS model may be specified:</p> + +<dl> + <dt><b><tt>localdynamic</tt></b>:</dt> + <dd>For variables that are only used within the current shared library.</dd> + + <dt><b><tt>initialexec</tt></b>:</dt> + <dd>For variables in modules that will not be loaded dynamically.</dd> + + <dt><b><tt>localexec</tt></b>:</dt> + <dd>For variables defined in the executable and only used within it.</dd> +</dl> + +<p>The models correspond to the ELF TLS models; see + <a href="http://people.redhat.com/drepper/tls.pdf">ELF + Handling For Thread-Local Storage</a> for more information on under which + circumstances the different models may be used. The target may choose a + different TLS model if the specified model is not supported, or if a better + choice of model can be made.</p> + +<p>A variable may be defined as a global "constant," which indicates that the contents of the variable will <b>never</b> be modified (enabling better optimization, allowing the global data to be placed in the read-only section of an executable, etc). @@ -886,6 +920,13 @@ define i32 @main() { <i>; i32()* </i> @G = addrspace(5) constant float 1.0, section "foo", align 4 </pre> +<p>The following example defines a thread-local global with + the <tt>initialexec</tt> TLS model:</p> + +<pre class="doc_code"> +@G = thread_local(initialexec) global i32 0, align 4 +</pre> + </div> @@ -1048,7 +1089,7 @@ declare signext i8 @returns_signed_char() value to the function. The attribute implies that a hidden copy of the pointee is made between the caller and the callee, so the callee is unable to - modify the value in the callee. This attribute is only valid on LLVM + modify the value in the caller. This attribute is only valid on LLVM pointer arguments. It is generally used to pass structs and arrays by value, but is also valid on pointers to scalars. The copy is considered to belong to the caller not the callee (for example, @@ -1167,6 +1208,13 @@ define void @f() optsize { ... } may make calls to the function faster, at the cost of extra program startup time if the function is not called during program startup.</dd> + <dt><tt><b>ia_nsdialect</b></tt></dt> + <dd>This attribute indicates the associated inline assembly call is using a + non-standard assembly dialect. The standard dialect is ATT, which is + assumed when this attribute is not present. When present, the dialect + is assumed to be Intel. Currently, ATT and Intel are the only supported + dialects.</dd> + <dt><tt><b>inlinehint</b></tt></dt> <dd>This attribute indicates that the source code contained a hint that inlining this function is desirable (such as the "inline" keyword in C/C++). It @@ -1392,7 +1440,7 @@ target datalayout = "<i>layout specification</i>" <li>If no match is found, and the type sought is an integer type, then the smallest integer type that is larger than the bitwidth of the sought type is used. If none of the specifications are larger than the bitwidth then - the the largest integer type is used. For example, given the default + the largest integer type is used. For example, given the default specifications above, the i7 type will use the alignment of i8 (next largest) while both i65 and i256 will use the alignment of i64 (largest specified).</li> @@ -2287,8 +2335,9 @@ in signal handlers).</p> by <tt>0xM</tt> followed by 32 hexadecimal digits. The IEEE 128-bit format is represented by <tt>0xL</tt> followed by 32 hexadecimal digits; no currently supported target uses this format. 
Long doubles will only work if - they match the long double format on your target. All hexadecimal formats - are big-endian (sign bit at the left).</p> + they match the long double format on your target. The IEEE 16-bit format + (half precision) is represented by <tt>0xH</tt> followed by 4 hexadecimal + digits. All hexadecimal formats are big-endian (sign bit at the left).</p> <p>There are no constants of type x86mmx.</p> </div> @@ -2739,7 +2788,7 @@ second_end: make it fit in <tt>TYPE</tt>.</dd> <dt><b><tt>inttoptr (CST to TYPE)</tt></b></dt> - <dd>Convert a integer constant to a pointer constant. TYPE must be a pointer + <dd>Convert an integer constant to a pointer constant. TYPE must be a pointer type. CST must be of integer type. The CST value is zero extended, truncated, or unchanged to make it fit in a pointer size. This one is <i>really</i> dangerous!</dd> @@ -2826,8 +2875,9 @@ i32 (i32) asm "bswap $0", "=r,r" </pre> <p>Inline assembler expressions may <b>only</b> be used as the callee operand of - a <a href="#i_call"><tt>call</tt> instruction</a>. Thus, typically we - have:</p> + a <a href="#i_call"><tt>call</tt></a> or an + <a href="#i_invoke"><tt>invoke</tt></a> instruction. + Thus, typically we have:</p> <pre class="doc_code"> %X = call i32 asm "<a href="#int_bswap">bswap</a> $0", "=r,r"(i32 %Y) @@ -3051,6 +3101,8 @@ call void @llvm.dbg.value(metadata !24, i64 0, metadata !25) <li>The range should not represent the full or empty set. That is, <tt>a!=b</tt>. </li> </ul> +<p> In addition, the pairs must be in signed order of the lower bound and + they must be non-contiguous.</p> <p>Examples:</p> <div class="doc_code"> @@ -3058,10 +3110,12 @@ call void @llvm.dbg.value(metadata !24, i64 0, metadata !25) %a = load i8* %x, align 1, !range !0 ; Can only be 0 or 1 %b = load i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1 %c = load i8* %z, align 1, !range !2 ; Can only be 0, 1, 3, 4 or 5 + %d = load i8* %z, align 1, !range !3 ; Can only be -2, -1, 3, 4 or 5 ... !0 = metadata !{ i8 0, i8 2 } !1 = metadata !{ i8 255, i8 2 } !2 = metadata !{ i8 0, i8 2, i8 3, i8 6 } +!3 = metadata !{ i8 -2, i8 0, i8 3, i8 6 } </pre> </div> </div> @@ -4727,7 +4781,7 @@ IfUnequal: <h5>Arguments:</h5> <p>The first two operands of a '<tt>shufflevector</tt>' instruction are vectors - with types that match each other. The third argument is a shuffle mask whose + with the same type. The third argument is a shuffle mask whose element type is always 'i32'. The result of the instruction is a vector whose length is the same as the shuffle mask and whose element type is the same as the element type of the first two operands.</p> @@ -7464,6 +7518,74 @@ LLVM</a>.</p> </div> +<!-- _______________________________________________________________________ --> +<h4> + <a name="int_fabs">'<tt>llvm.fabs.*</tt>' Intrinsic</a> +</h4> + +<div> + +<h5>Syntax:</h5> +<p>This is an overloaded intrinsic. You can use <tt>llvm.fabs</tt> on any + floating point or vector of floating point type. 
Not all targets support all + types however.</p> + +<pre> + declare float @llvm.fabs.f32(float %Val) + declare double @llvm.fabs.f64(double %Val) + declare x86_fp80 @llvm.fabs.f80(x86_fp80 %Val) + declare fp128 @llvm.fabs.f128(fp128 %Val) + declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val) +</pre> + +<h5>Overview:</h5> +<p>The '<tt>llvm.fabs.*</tt>' intrinsics return the absolute value of + the operand.</p> + +<h5>Arguments:</h5> +<p>The argument and return value are floating point numbers of the same + type.</p> + +<h5>Semantics:</h5> +<p>This function returns the same values as the libm <tt>fabs</tt> functions + would, and handles error conditions in the same way.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<h4> + <a name="int_floor">'<tt>llvm.floor.*</tt>' Intrinsic</a> +</h4> + +<div> + +<h5>Syntax:</h5> +<p>This is an overloaded intrinsic. You can use <tt>llvm.floor</tt> on any + floating point or vector of floating point type. Not all targets support all + types however.</p> + +<pre> + declare float @llvm.floor.f32(float %Val) + declare double @llvm.floor.f64(double %Val) + declare x86_fp80 @llvm.floor.f80(x86_fp80 %Val) + declare fp128 @llvm.floor.f128(fp128 %Val) + declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val) +</pre> + +<h5>Overview:</h5> +<p>The '<tt>llvm.floor.*</tt>' intrinsics return the floor of + the operand.</p> + +<h5>Arguments:</h5> +<p>The argument and return value are floating point numbers of the same + type.</p> + +<h5>Semantics:</h5> +<p>This function returns the same values as the libm <tt>floor</tt> functions + would, and handles error conditions in the same way.</p> + +</div> + </div> <!-- ======================================================================= --> @@ -7940,12 +8062,59 @@ LLVM</a>.</p> <!-- ======================================================================= --> <h3> + <a name="spec_arithmetic">Specialised Arithmetic Intrinsics</a> +</h3> + +<!-- _______________________________________________________________________ --> + +<h4> + <a name="fmuladd">'<tt>llvm.fmuladd.*</tt>' Intrinsic</a> +</h4> + +<div> + +<h5>Syntax:</h5> +<pre> + declare float @llvm.fmuladd.f32(float %a, float %b, float %c) + declare double @llvm.fmuladd.f64(double %a, double %b, double %c) +</pre> + +<h5>Overview:</h5> +<p>The '<tt>llvm.fmuladd.*</tt>' intrinsic functions represent multiply-add +expressions that can be fused if the code generator determines that the fused +expression would be legal and efficient.</p> + +<h5>Arguments:</h5> +<p>The '<tt>llvm.fmuladd.*</tt>' intrinsics each take three arguments: two +multiplicands, a and b, and an addend c.</p> + +<h5>Semantics:</h5> +<p>The expression:</p> +<pre> + %0 = call float @llvm.fmuladd.f32(%a, %b, %c) +</pre> +<p>is equivalent to the expression a * b + c, except that rounding will not be +performed between the multiplication and addition steps if the code generator +fuses the operations. Fusion is not guaranteed, even if the target platform +supports it. If a fused multiply-add is required the corresponding llvm.fma.* +intrinsic function should be used instead.</p> + +<h5>Examples:</h5> +<pre> + %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields {float}:r2 = (a * b) + c +</pre> + +</div> + +<!-- ======================================================================= --> +<h3> <a name="int_fp16">Half Precision Floating Point Intrinsics</a> </h3> <div> -<p>Half precision floating point is a storage-only format. 
This means that it is +<p>For most target platforms, half precision floating point is a storage-only + format. This means that it is a dense encoding (in memory) but does not support computation in the format.</p> @@ -8382,7 +8551,7 @@ LLVM</a>.</p> <h5>Syntax:</h5> <pre> - declare void @llvm.trap() + declare void @llvm.trap() noreturn nounwind </pre> <h5>Overview:</h5> @@ -8392,9 +8561,33 @@ LLVM</a>.</p> <p>None.</p> <h5>Semantics:</h5> -<p>This intrinsics is lowered to the target dependent trap instruction. If the +<p>This intrinsic is lowered to the target dependent trap instruction. If the target does not have a trap instruction, this intrinsic will be lowered to - the call of the <tt>abort()</tt> function.</p> + a call of the <tt>abort()</tt> function.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<h4> + <a name="int_debugtrap">'<tt>llvm.debugtrap</tt>' Intrinsic</a> +</h4> + +<div> + +<h5>Syntax:</h5> +<pre> + declare void @llvm.debugtrap() nounwind +</pre> + +<h5>Overview:</h5> +<p>The '<tt>llvm.debugtrap</tt>' intrinsic.</p> + +<h5>Arguments:</h5> +<p>None.</p> + +<h5>Semantics:</h5> +<p>This intrinsic is lowered to code which is intended to cause an execution + trap with the intention of requesting the attention of a debugger.</p> </div> @@ -8441,8 +8634,8 @@ LLVM</a>.</p> <h5>Syntax:</h5> <pre> - declare i32 @llvm.objectsize.i32(i8* <object>, i1 <type>) - declare i64 @llvm.objectsize.i64(i8* <object>, i1 <type>) + declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>) + declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>) </pre> <h5>Overview:</h5> @@ -8455,15 +8648,15 @@ LLVM</a>.</p> <h5>Arguments:</h5> <p>The <tt>llvm.objectsize</tt> intrinsic takes two arguments. The first argument is a pointer to or into the <tt>object</tt>. The second argument - is a boolean 0 or 1. This argument determines whether you want the - maximum (0) or minimum (1) bytes remaining. This needs to be a literal 0 or - 1, variables are not allowed.</p> + is a boolean and determines whether <tt>llvm.objectsize</tt> returns 0 (if + true) or -1 (if false) when the object size is unknown. + The second argument only accepts constants.</p> <h5>Semantics:</h5> -<p>The <tt>llvm.objectsize</tt> intrinsic is lowered to either a constant - representing the size of the object concerned, or <tt>i32/i64 -1 or 0</tt>, - depending on the <tt>type</tt> argument, if the size cannot be determined at - compile time.</p> +<p>The <tt>llvm.objectsize</tt> intrinsic is lowered to a constant representing + the size of the object concerned. If the size cannot be determined at compile + time, <tt>llvm.objectsize</tt> returns <tt>i32/i64 -1 or 0</tt> + (depending on the <tt>min</tt> argument).</p> </div> <!-- _______________________________________________________________________ --> @@ -8492,6 +8685,30 @@ LLVM</a>.</p> <p>This intrinsic is lowered to the <tt>val</tt>.</p> </div> +<!-- _______________________________________________________________________ --> +<h4> + <a name="int_donothing">'<tt>llvm.donothing</tt>' Intrinsic</a> +</h4> + +<div> + +<h5>Syntax:</h5> +<pre> + declare void @llvm.donothing() nounwind readnone +</pre> + +<h5>Overview:</h5> +<p>The <tt>llvm.donothing</tt> intrinsic doesn't perform any operation. 
It's the +only intrinsic that can be called with an invoke instruction.</p> + +<h5>Arguments:</h5> +<p>None.</p> + +<h5>Semantics:</h5> +<p>This intrinsic does nothing, and it's removed by optimizers and ignored by +codegen.</p> +</div> + </div> </div> @@ -8505,7 +8722,7 @@ LLVM</a>.</p> <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-04-16 21:39:33 +0200 (Mon, 16 Apr 2012) $ + Last modified: $Date: 2012-08-10 02:00:22 +0200 (Fri, 10 Aug 2012) $ </address> </body> diff --git a/docs/Lexicon.html b/docs/Lexicon.html deleted file mode 100644 index dbb7f9b..0000000 --- a/docs/Lexicon.html +++ /dev/null @@ -1,292 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>The LLVM Lexicon</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> - <meta name="author" content="Various"> - <meta name="description" - content="A glossary of terms used with the LLVM project."> -</head> -<body> -<h1>The LLVM Lexicon</h1> -<p class="doc_warning">NOTE: This document is a work in progress!</p> -<!-- *********************************************************************** --> -<h2>Table Of Contents</h2> -<!-- *********************************************************************** --> -<div> - <table> - <tr><th colspan="8"><b>- <a href="#A">A</a> -</b></th></tr> - <tr> - <td><a href="#ADCE">ADCE</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#B">B</a> -</b></th></tr> - <tr> - <td><a href="#BURS">BURS</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#C">C</a> -</b></th></tr> - <tr> - <td><a href="#CSE">CSE</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#D">D</a> -</b></th></tr> - <tr> - <td><a href="#DAG">DAG</a></td> - <td><a href="#Derived_Pointer">Derived Pointer</a></td> - <td><a href="#DSA">DSA</a></td> - <td><a href="#DSE">DSE</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#F">F</a> -</b></th></tr> - <tr> - <td><a href="#FCA">FCA</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#G">G</a> -</b></th></tr> - <tr> - <td><a href="#GC">GC</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#I">I</a> -</b></th></tr> - <tr> - <td><a href="#IPA">IPA</a></td> - <td><a href="#IPO">IPO</a></td> - <td><a href="#ISel">ISel</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#L">L</a> -</b></th></tr> - <tr> - <td><a href="#LCSSA">LCSSA</a></td> - <td><a href="#LICM">LICM</a></td> - <td><a href="#Load-VN">Load-VN</a></td> - <td><a href="#LTO">LTO</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#M">M</a> -</b></th></tr> - <tr> - <td><a href="#MC">MC</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#O">O</a> -</b></th></tr> - <tr> - <td><a href="#Object_Pointer">Object Pointer</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#P">P</a> -</b></th></tr> - <tr> - <td><a href="#PRE">PRE</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#R">R</a> -</b></th></tr> - <tr> - <td><a href="#RAUW">RAUW</a></td> - <td><a href="#Reassociation">Reassociation</a></td> - <td><a href="#Root">Root</a></td> - </tr> - <tr><th colspan="8"><b>- <a href="#S">S</a> -</b></th></tr> - <tr> - <td><a href="#Safe_Point">Safe Point</a></td> - <td><a href="#SCC">SCC</a></td> - <td><a href="#SCCP">SCCP</a></td> - <td><a href="#SDISel">SDISel</a></td> - <td><a href="#SRoA">SRoA</a></td> - <td><a href="#Stack_Map">Stack Map</a></td> - </tr> - </table> -</div> - -<!-- 
*********************************************************************** --> -<h2>Definitions</h2> -<!-- *********************************************************************** --> -<div> -<!-- _______________________________________________________________________ --> -<h3><a name="A">- A -</a></h3> -<div> - <dl> - <dt><a name="ADCE"><b>ADCE</b></a></dt> - <dd>Aggressive Dead Code Elimination</dd> - </dl> -</div> -<!-- _______________________________________________________________________ --> -<h3><a name="B">- B -</a></h3> -<div> - <dl> - <dt><a name="BURS"><b>BURS</b></a></dt> - <dd>Bottom Up Rewriting System—A method of instruction selection for - code generation. An example is the <a -href="http://www.program-transformation.org/Transform/BURG">BURG</a> tool.</dd> - </dl> -</div> -<!-- _______________________________________________________________________ --> -<h3><a name="C">- C -</a></h3> -<div> - <dl> - <dt><a name="CSE"><b>CSE</b></a></dt> - <dd>Common Subexpression Elimination. An optimization that removes common - subexpression compuation. For example <tt>(a+b)*(a+b)</tt> has two - subexpressions that are the same: <tt>(a+b)</tt>. This optimization would - perform the addition only once and then perform the multiply (but only if - it's compulationally correct/safe). - </dl> -</div> -<!-- _______________________________________________________________________ --> -<h3><a name="D">- D -</a></h3> -<div> - <dl> - <dt><a name="DAG"><b>DAG</b></a></dt> - <dd>Directed Acyclic Graph</dd> - <dt><a name="Derived_Pointer"><b>Derived Pointer</b></a></dt> - <dd>A pointer to the interior of an object, such that a garbage collector - is unable to use the pointer for reachability analysis. While a derived - pointer is live, the corresponding object pointer must be kept in a root, - otherwise the collector might free the referenced object. With copying - collectors, derived pointers pose an additional hazard that they may be - invalidated at any <a href="Safe_Point">safe point</a>. This term is used in - opposition to <a href="#Object_Pointer">object pointer</a>.</dd> - <dt><a name="DSA"><b>DSA</b></a></dt> - <dd>Data Structure Analysis</dd> - <dt><a name="DSE"><b>DSE</b></a></dt> - <dd>Dead Store Elimination</dd> - </dl> -</div> -<!-- _______________________________________________________________________ --> -<h3><a name="F">- F -</a></h3> -<div> - <dl> - <dt><a name="FCA"><b>FCA</b></a></dt> - <dd>First Class Aggregate</dd> - </dl> -</div> -<!-- _______________________________________________________________________ --> -<h3><a name="G">- G -</a></h3> -<div> - <dl> - <dt><a name="GC"><b>GC</b></a></dt> - <dd>Garbage Collection. The practice of using reachability analysis instead - of explicit memory management to reclaim unused memory.</dd> - </dl> -</div> -<!-- _______________________________________________________________________ --> -<h3><a name="H">- H -</a></h3> -<div> - <dl> - <dt><a name="Heap"><b>Heap</b></a></dt> - <dd>In garbage collection, the region of memory which is managed using - reachability analysis.</dd> - </dl> -</div> -<!-- _______________________________________________________________________ --> -<h3><a name="I">- I -</a></h3> -<div> - <dl> - <dt><a name="IPA"><b>IPA</b></a></dt> - <dd>Inter-Procedural Analysis. Refers to any variety of code analysis that - occurs between procedures, functions or compilation units (modules).</dd> - <dt><a name="IPO"><b>IPO</b></a></dt> - <dd>Inter-Procedural Optimization. 
Refers to any variety of code - optimization that occurs between procedures, functions or compilation units - (modules).</dd> - <dt><a name="ISel"><b>ISel</b></a></dt> - <dd>Instruction Selection.</dd> - </dl> -</div> -<!-- _______________________________________________________________________ --> -<h3><a name="L">- L -</a></h3> -<div> - <dl> - <dt><a name="LCSSA"><b>LCSSA</b></a></dt> - <dd>Loop-Closed Static Single Assignment Form</dd> - <dt><a name="LICM"><b>LICM</b></a></dt> - <dd>Loop Invariant Code Motion</dd> - <dt><a name="Load-VN"><b>Load-VN</b></a></dt> - <dd>Load Value Numbering</dd> - <dt><a name="LTO"><b>LTO</b></a></dt> - <dd>Link-Time Optimization</dd> - </dl> -</div> -<!-- _______________________________________________________________________ --> -<h3><a name="M">- M -</a></h3> -<div> - <dl> - <dt><a name="MC"><b>MC</b></a></dt> - <dd>Machine Code</dd> - </dl> -</div> -<!-- _______________________________________________________________________ --> -<h3><a name="O">- O -</a></h3> -<div> - <dl> - <dt><a name="Object_Pointer"><b>Object Pointer</b></a></dt> - <dd>A pointer to an object such that the garbage collector is able to trace - references contained within the object. This term is used in opposition to - <a href="#Derived_Pointer">derived pointer</a>.</dd> - </dl> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="P">- P -</a></h3> -<div> - <dl> - <dt><a name="PRE"><b>PRE</b></a></dt> - <dd>Partial Redundancy Elimination</dd> - </dl> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="R">- R -</a></h3> -<div> - <dl> - <dt><a name="RAUW"><b>RAUW</b></a></dt> <dd>An abbreviation for Replace - All Uses With. The functions User::replaceUsesOfWith(), - Value::replaceAllUsesWith(), and Constant::replaceUsesOfWithOnConstant() - implement the replacement of one Value with another by iterating over its - def/use chain and fixing up all of the pointers to point to the new value. - See also <a href="ProgrammersManual.html#iterate_chains">def/use chains</a>. - </dd> - <dt><a name="Reassociation"><b>Reassociation</b></a></dt> <dd>Rearranging - associative expressions to promote better redundancy elimination and other - optimization. For example, changing (A+B-A) into (B+A-A), permitting it to - be optimized into (B+0) then (B).</dd> - <dt><a name="Root"><b>Root</b></a></dt> <dd>In garbage collection, a - pointer variable lying outside of the <a href="#Heap">heap</a> from which - the collector begins its reachability analysis. In the context of code - generation, "root" almost always refers to a "stack root" -- a local or - temporary variable within an executing function.</dd> - </dl> -</div> - -<!-- _______________________________________________________________________ --> -<h3><a name="S">- S -</a></h3> -<div> - <dl> - <dt><a name="Safe_Point"><b>Safe Point</b></a></dt> - <dd>In garbage collection, it is necessary to identify <a href="#Root">stack - roots</a> so that reachability analysis may proceed. It may be infeasible to - provide this information for every instruction, so instead the information - may is calculated only at designated safe points. 
With a copying collector, - <a href="#Derived_Pointers">derived pointers</a> must not be retained across - safe points and <a href="#Object_Pointers">object pointers</a> must be - reloaded from stack roots.</dd> - <dt><a name="SDISel"><b>SDISel</b></a></dt> - <dd>Selection DAG Instruction Selection.</dd> - <dt><a name="SCC"><b>SCC</b></a></dt> - <dd>Strongly Connected Component</dd> - <dt><a name="SCCP"><b>SCCP</b></a></dt> - <dd>Sparse Conditional Constant Propagation</dd> - <dt><a name="SRoA"><b>SRoA</b></a></dt> - <dd>Scalar Replacement of Aggregates</dd> - <dt><a name="SSA"><b>SSA</b></a></dt> - <dd>Static Single Assignment</dd> - <dt><a name="Stack_Map"><b>Stack Map</b></a></dt> - <dd>In garbage collection, metadata emitted by the code generator which - identifies <a href="#Root">roots</a> within the stack frame of an executing - function.</dd> - </dl> -</div> - -</div> -<!-- *********************************************************************** --> -<hr> -<address> <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a><a - href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a><a - href="http://llvm.org/">The LLVM Team</a><br> -<a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> -Last modified: $Date: 2012-01-05 09:18:41 +0100 (Thu, 05 Jan 2012) $ -</address> -<!-- vim: sw=2 ---> -</body> -</html> diff --git a/docs/Lexicon.rst b/docs/Lexicon.rst new file mode 100644 index 0000000..6ebe614 --- /dev/null +++ b/docs/Lexicon.rst @@ -0,0 +1,194 @@ +.. _lexicon: + +================ +The LLVM Lexicon +================ + +.. note:: + + This document is a work in progress! + +Definitions +=========== + +A +- + +**ADCE** + Aggressive Dead Code Elimination + +B +- + +**BURS** + + Bottom Up Rewriting System --- A method of instruction selection for code + generation. An example is the `BURG + <http://www.program-transformation.org/Transform/BURG>`_ tool. + +C +- + +**CSE** + Common Subexpression Elimination. An optimization that removes common + subexpression compuation. For example ``(a+b)*(a+b)`` has two subexpressions + that are the same: ``(a+b)``. This optimization would perform the addition + only once and then perform the multiply (but only if it's compulationally + correct/safe). + +D +- + +**DAG** + Directed Acyclic Graph + +.. _derived pointer: +.. _derived pointers: + +**Derived Pointer** + A pointer to the interior of an object, such that a garbage collector is + unable to use the pointer for reachability analysis. While a derived pointer + is live, the corresponding object pointer must be kept in a root, otherwise + the collector might free the referenced object. With copying collectors, + derived pointers pose an additional hazard that they may be invalidated at + any `safe point`_. This term is used in opposition to `object pointer`_. + +**DSA** + Data Structure Analysis + +**DSE** + Dead Store Elimination + +F +- + +**FCA** + First Class Aggregate + +G +- + +**GC** + Garbage Collection. The practice of using reachability analysis instead of + explicit memory management to reclaim unused memory. + +H +- + +.. _heap: + +**Heap** + In garbage collection, the region of memory which is managed using + reachability analysis. + +I +- + +**IPA** + Inter-Procedural Analysis. Refers to any variety of code analysis that + occurs between procedures, functions or compilation units (modules). 
+ +**IPO** + Inter-Procedural Optimization. Refers to any variety of code optimization + that occurs between procedures, functions or compilation units (modules). + +**ISel** + Instruction Selection + +L +- + +**LCSSA** + Loop-Closed Static Single Assignment Form + +**LICM** + Loop Invariant Code Motion + +**Load-VN** + Load Value Numbering + +**LTO** + Link-Time Optimization + +M +- + +**MC** + Machine Code + +O +- +.. _object pointer: +.. _object pointers: + +**Object Pointer** + A pointer to an object such that the garbage collector is able to trace + references contained within the object. This term is used in opposition to + `derived pointer`_. + +P +- + +**PRE** + Partial Redundancy Elimination + +R +- + +**RAUW** + + Replace All Uses With. The functions ``User::replaceUsesOfWith()``, + ``Value::replaceAllUsesWith()``, and + ``Constant::replaceUsesOfWithOnConstant()`` implement the replacement of one + Value with another by iterating over its def/use chain and fixing up all of + the pointers to point to the new value. See + also `def/use chains <ProgrammersManual.html#iterate_chains>`_. + +**Reassociation** + Rearranging associative expressions to promote better redundancy elimination + and other optimization. For example, changing ``(A+B-A)`` into ``(B+A-A)``, + permitting it to be optimized into ``(B+0)`` then ``(B)``. + +.. _roots: +.. _stack roots: + +**Root** + In garbage collection, a pointer variable lying outside of the `heap`_ from + which the collector begins its reachability analysis. In the context of code + generation, "root" almost always refers to a "stack root" --- a local or + temporary variable within an executing function. + +**RPO** + Reverse postorder + +S +- + +.. _safe point: + +**Safe Point** + In garbage collection, it is necessary to identify `stack roots`_ so that + reachability analysis may proceed. It may be infeasible to provide this + information for every instruction, so instead the information is + calculated only at designated safe points. With a copying collector, + `derived pointers`_ must not be retained across safe points and `object + pointers`_ must be reloaded from stack roots. + +**SDISel** + Selection DAG Instruction Selection. + +**SCC** + Strongly Connected Component + +**SCCP** + Sparse Conditional Constant Propagation + +**SRoA** + Scalar Replacement of Aggregates + +**SSA** + Static Single Assignment + +**Stack Map** + In garbage collection, metadata emitted by the code generator which + identifies `roots`_ within the stack frame of an executing function.
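To make the RAUW entry above concrete, here is a minimal sketch built around the C++ functions that entry names. The helper function and its name are illustrative only, and the header paths assume the LLVM source layout of this import.

.. code-block:: c++

   // Minimal RAUW sketch: rewrite every use of a now-redundant instruction to
   // refer to a replacement value of the same type, then delete it.
   // replaceAndErase is a hypothetical helper, not an LLVM API.
   #include "llvm/Instruction.h"
   #include "llvm/Value.h"

   static void replaceAndErase(llvm::Instruction *OldInst, llvm::Value *NewVal) {
     OldInst->replaceAllUsesWith(NewVal); // walks OldInst's def/use chain
     OldInst->eraseFromParent();          // OldInst is now dead; remove it
   }

Transformation passes typically apply exactly this pattern once they have proven that the replacement value computes the same result as the instruction being removed.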
diff --git a/docs/LinkTimeOptimization.html b/docs/LinkTimeOptimization.html deleted file mode 100644 index 5652555..0000000 --- a/docs/LinkTimeOptimization.html +++ /dev/null @@ -1,401 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>LLVM Link Time Optimization: Design and Implementation</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> - -<h1> - LLVM Link Time Optimization: Design and Implementation -</h1> - -<ul> - <li><a href="#desc">Description</a></li> - <li><a href="#design">Design Philosophy</a> - <ul> - <li><a href="#example1">Example of link time optimization</a></li> - <li><a href="#alternative_approaches">Alternative Approaches</a></li> - </ul></li> - <li><a href="#multiphase">Multi-phase communication between LLVM and linker</a> - <ul> - <li><a href="#phase1">Phase 1 : Read LLVM Bitcode Files</a></li> - <li><a href="#phase2">Phase 2 : Symbol Resolution</a></li> - <li><a href="#phase3">Phase 3 : Optimize Bitcode Files</a></li> - <li><a href="#phase4">Phase 4 : Symbol Resolution after optimization</a></li> - </ul></li> - <li><a href="#lto">libLTO</a> - <ul> - <li><a href="#lto_module_t">lto_module_t</a></li> - <li><a href="#lto_code_gen_t">lto_code_gen_t</a></li> - </ul> -</ul> - -<div class="doc_author"> -<p>Written by Devang Patel and Nick Kledzik</p> -</div> - -<!-- *********************************************************************** --> -<h2> -<a name="desc">Description</a> -</h2> -<!-- *********************************************************************** --> - -<div> -<p> -LLVM features powerful intermodular optimizations which can be used at link -time. Link Time Optimization (LTO) is another name for intermodular optimization -when performed during the link stage. This document describes the interface -and design between the LTO optimizer and the linker.</p> -</div> - -<!-- *********************************************************************** --> -<h2> -<a name="design">Design Philosophy</a> -</h2> -<!-- *********************************************************************** --> - -<div> -<p> -The LLVM Link Time Optimizer provides complete transparency, while doing -intermodular optimization, in the compiler tool chain. Its main goal is to let -the developer take advantage of intermodular optimizations without making any -significant changes to the developer's makefiles or build system. This is -achieved through tight integration with the linker. In this model, the linker -treates LLVM bitcode files like native object files and allows mixing and -matching among them. The linker uses <a href="#lto">libLTO</a>, a shared -object, to handle LLVM bitcode files. This tight integration between -the linker and LLVM optimizer helps to do optimizations that are not possible -in other models. The linker input allows the optimizer to avoid relying on -conservative escape analysis. -</p> - -<!-- ======================================================================= --> -<h3> - <a name="example1">Example of link time optimization</a> -</h3> - -<div> - <p>The following example illustrates the advantages of LTO's integrated - approach and clean interface. This example requires a system linker which - supports LTO through the interface described in this document. Here, - clang transparently invokes system linker. </p> - <ul> - <li> Input source file <tt>a.c</tt> is compiled into LLVM bitcode form. 
- <li> Input source file <tt>main.c</tt> is compiled into native object code. - </ul> -<pre class="doc_code"> ---- a.h --- -extern int foo1(void); -extern void foo2(void); -extern void foo4(void); - ---- a.c --- -#include "a.h" - -static signed int i = 0; - -void foo2(void) { - i = -1; -} - -static int foo3() { - foo4(); - return 10; -} - -int foo1(void) { - int data = 0; - - if (i < 0) - data = foo3(); - - data = data + 42; - return data; -} - ---- main.c --- -#include <stdio.h> -#include "a.h" - -void foo4(void) { - printf("Hi\n"); -} - -int main() { - return foo1(); -} - ---- command lines --- -$ clang -emit-llvm -c a.c -o a.o # <-- a.o is LLVM bitcode file -$ clang -c main.c -o main.o # <-- main.o is native object file -$ clang a.o main.o -o main # <-- standard link command without any modifications -</pre> - -<ul> - <li>In this example, the linker recognizes that <tt>foo2()</tt> is an - externally visible symbol defined in LLVM bitcode file. The linker - completes its usual symbol resolution pass and finds that <tt>foo2()</tt> - is not used anywhere. This information is used by the LLVM optimizer and - it removes <tt>foo2()</tt>.</li> - <li>As soon as <tt>foo2()</tt> is removed, the optimizer recognizes that condition - <tt>i < 0</tt> is always false, which means <tt>foo3()</tt> is never - used. Hence, the optimizer also removes <tt>foo3()</tt>.</li> - <li>And this in turn, enables linker to remove <tt>foo4()</tt>.</li> -</ul> - -<p>This example illustrates the advantage of tight integration with the - linker. Here, the optimizer can not remove <tt>foo3()</tt> without the - linker's input.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="alternative_approaches">Alternative Approaches</a> -</h3> - -<div> - <dl> - <dt><b>Compiler driver invokes link time optimizer separately.</b></dt> - <dd>In this model the link time optimizer is not able to take advantage of - information collected during the linker's normal symbol resolution phase. - In the above example, the optimizer can not remove <tt>foo2()</tt> without - the linker's input because it is externally visible. This in turn prohibits - the optimizer from removing <tt>foo3()</tt>.</dd> - <dt><b>Use separate tool to collect symbol information from all object - files.</b></dt> - <dd>In this model, a new, separate, tool or library replicates the linker's - capability to collect information for link time optimization. Not only is - this code duplication difficult to justify, but it also has several other - disadvantages. For example, the linking semantics and the features - provided by the linker on various platform are not unique. This means, - this new tool needs to support all such features and platforms in one - super tool or a separate tool per platform is required. This increases - maintenance cost for link time optimizer significantly, which is not - necessary. This approach also requires staying synchronized with linker - developements on various platforms, which is not the main focus of the link - time optimizer. Finally, this approach increases end user's build time due - to the duplication of work done by this separate tool and the linker itself. 
- </dd> - </dl> -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="multiphase">Multi-phase communication between libLTO and linker</a> -</h2> - -<div> - <p>The linker collects information about symbol defininitions and uses in - various link objects which is more accurate than any information collected - by other tools during typical build cycles. The linker collects this - information by looking at the definitions and uses of symbols in native .o - files and using symbol visibility information. The linker also uses - user-supplied information, such as a list of exported symbols. LLVM - optimizer collects control flow information, data flow information and knows - much more about program structure from the optimizer's point of view. - Our goal is to take advantage of tight integration between the linker and - the optimizer by sharing this information during various linking phases. -</p> - -<!-- ======================================================================= --> -<h3> - <a name="phase1">Phase 1 : Read LLVM Bitcode Files</a> -</h3> - -<div> - <p>The linker first reads all object files in natural order and collects - symbol information. This includes native object files as well as LLVM bitcode - files. To minimize the cost to the linker in the case that all .o files - are native object files, the linker only calls <tt>lto_module_create()</tt> - when a supplied object file is found to not be a native object file. If - <tt>lto_module_create()</tt> returns that the file is an LLVM bitcode file, - the linker - then iterates over the module using <tt>lto_module_get_symbol_name()</tt> and - <tt>lto_module_get_symbol_attribute()</tt> to get all symbols defined and - referenced. - This information is added to the linker's global symbol table. -</p> - <p>The lto* functions are all implemented in a shared object libLTO. This - allows the LLVM LTO code to be updated independently of the linker tool. - On platforms that support it, the shared object is lazily loaded. -</p> -</div> - -<!-- ======================================================================= --> -<h3> - <a name="phase2">Phase 2 : Symbol Resolution</a> -</h3> - -<div> - <p>In this stage, the linker resolves symbols using global symbol table. - It may report undefined symbol errors, read archive members, replace - weak symbols, etc. The linker is able to do this seamlessly even though it - does not know the exact content of input LLVM bitcode files. If dead code - stripping is enabled then the linker collects the list of live symbols. - </p> -</div> - -<!-- ======================================================================= --> -<h3> - <a name="phase3">Phase 3 : Optimize Bitcode Files</a> -</h3> -<div> - <p>After symbol resolution, the linker tells the LTO shared object which - symbols are needed by native object files. In the example above, the linker - reports that only <tt>foo1()</tt> is used by native object files using - <tt>lto_codegen_add_must_preserve_symbol()</tt>. Next the linker invokes - the LLVM optimizer and code generators using <tt>lto_codegen_compile()</tt> - which returns a native object file creating by merging the LLVM bitcode files - and applying various optimization passes. 
-</p> -</div> - -<!-- ======================================================================= --> -<h3> - <a name="phase4">Phase 4 : Symbol Resolution after optimization</a> -</h3> - -<div> - <p>In this phase, the linker reads optimized a native object file and - updates the internal global symbol table to reflect any changes. The linker - also collects information about any changes in use of external symbols by - LLVM bitcode files. In the example above, the linker notes that - <tt>foo4()</tt> is not used any more. If dead code stripping is enabled then - the linker refreshes the live symbol information appropriately and performs - dead code stripping.</p> - <p>After this phase, the linker continues linking as if it never saw LLVM - bitcode files.</p> -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> -<a name="lto">libLTO</a> -</h2> - -<div> - <p><tt>libLTO</tt> is a shared object that is part of the LLVM tools, and - is intended for use by a linker. <tt>libLTO</tt> provides an abstract C - interface to use the LLVM interprocedural optimizer without exposing details - of LLVM's internals. The intention is to keep the interface as stable as - possible even when the LLVM optimizer continues to evolve. It should even - be possible for a completely different compilation technology to provide - a different libLTO that works with their object files and the standard - linker tool.</p> - -<!-- ======================================================================= --> -<h3> - <a name="lto_module_t">lto_module_t</a> -</h3> - -<div> - -<p>A non-native object file is handled via an <tt>lto_module_t</tt>. -The following functions allow the linker to check if a file (on disk -or in a memory buffer) is a file which libLTO can process:</p> - -<pre class="doc_code"> -lto_module_is_object_file(const char*) -lto_module_is_object_file_for_target(const char*, const char*) -lto_module_is_object_file_in_memory(const void*, size_t) -lto_module_is_object_file_in_memory_for_target(const void*, size_t, const char*) -</pre> - -<p>If the object file can be processed by libLTO, the linker creates a -<tt>lto_module_t</tt> by using one of</p> - -<pre class="doc_code"> -lto_module_create(const char*) -lto_module_create_from_memory(const void*, size_t) -</pre> - -<p>and when done, the handle is released via</p> - -<pre class="doc_code"> -lto_module_dispose(lto_module_t) -</pre> - -<p>The linker can introspect the non-native object file by getting the number of -symbols and getting the name and attributes of each symbol via:</p> - -<pre class="doc_code"> -lto_module_get_num_symbols(lto_module_t) -lto_module_get_symbol_name(lto_module_t, unsigned int) -lto_module_get_symbol_attribute(lto_module_t, unsigned int) -</pre> - -<p>The attributes of a symbol include the alignment, visibility, and kind.</p> -</div> - -<!-- ======================================================================= --> -<h3> - <a name="lto_code_gen_t">lto_code_gen_t</a> -</h3> - -<div> - -<p>Once the linker has loaded each non-native object files into an -<tt>lto_module_t</tt>, it can request libLTO to process them all and -generate a native object file. This is done in a couple of steps. 
-First, a code generator is created with:</p> - -<pre class="doc_code">lto_codegen_create()</pre> - -<p>Then, each non-native object file is added to the code generator with:</p> - -<pre class="doc_code"> -lto_codegen_add_module(lto_code_gen_t, lto_module_t) -</pre> - -<p>The linker then has the option of setting some codegen options. Whether or -not to generate DWARF debug info is set with:</p> - -<pre class="doc_code">lto_codegen_set_debug_model(lto_code_gen_t)</pre> - -<p>Which kind of position independence is set with:</p> - -<pre class="doc_code">lto_codegen_set_pic_model(lto_code_gen_t) </pre> - -<p>And each symbol that is referenced by a native object file or otherwise must -not be optimized away is set with:</p> - -<pre class="doc_code"> -lto_codegen_add_must_preserve_symbol(lto_code_gen_t, const char*) -</pre> - -<p>After all these settings are done, the linker requests that a native object -file be created from the modules with the settings using:</p> - -<pre class="doc_code">lto_codegen_compile(lto_code_gen_t, size*)</pre> - -<p>which returns a pointer to a buffer containing the generated native -object file. The linker then parses that and links it with the rest -of the native object files.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - Devang Patel and Nick Kledzik<br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-31 12:21:59 +0100 (Mon, 31 Oct 2011) $ -</address> - -</body> -</html> - diff --git a/docs/LinkTimeOptimization.rst b/docs/LinkTimeOptimization.rst new file mode 100644 index 0000000..53d673e --- /dev/null +++ b/docs/LinkTimeOptimization.rst @@ -0,0 +1,298 @@ +.. _lto: + +====================================================== +LLVM Link Time Optimization: Design and Implementation +====================================================== + +.. contents:: + :local: + +Description +=========== + +LLVM features powerful intermodular optimizations which can be used at link +time. Link Time Optimization (LTO) is another name for intermodular +optimization when performed during the link stage. This document describes the +interface and design between the LTO optimizer and the linker. + +Design Philosophy +================= + +The LLVM Link Time Optimizer provides complete transparency, while doing +intermodular optimization, in the compiler tool chain. Its main goal is to let +the developer take advantage of intermodular optimizations without making any +significant changes to the developer's makefiles or build system. This is +achieved through tight integration with the linker. In this model, the linker +treates LLVM bitcode files like native object files and allows mixing and +matching among them. The linker uses `libLTO`_, a shared object, to handle LLVM +bitcode files. This tight integration between the linker and LLVM optimizer +helps to do optimizations that are not possible in other models. The linker +input allows the optimizer to avoid relying on conservative escape analysis. + +Example of link time optimization +--------------------------------- + +The following example illustrates the advantages of LTO's integrated approach +and clean interface. 
This example requires a system linker which supports LTO +through the interface described in this document. Here, clang transparently +invokes system linker. + +* Input source file ``a.c`` is compiled into LLVM bitcode form. +* Input source file ``main.c`` is compiled into native object code. + +.. code-block:: c++ + + --- a.h --- + extern int foo1(void); + extern void foo2(void); + extern void foo4(void); + + --- a.c --- + #include "a.h" + + static signed int i = 0; + + void foo2(void) { + i = -1; + } + + static int foo3() { + foo4(); + return 10; + } + + int foo1(void) { + int data = 0; + + if (i < 0) + data = foo3(); + + data = data + 42; + return data; + } + + --- main.c --- + #include <stdio.h> + #include "a.h" + + void foo4(void) { + printf("Hi\n"); + } + + int main() { + return foo1(); + } + +.. code-block:: bash + + --- command lines --- + % clang -emit-llvm -c a.c -o a.o # <-- a.o is LLVM bitcode file + % clang -c main.c -o main.o # <-- main.o is native object file + % clang a.o main.o -o main # <-- standard link command without modifications + +* In this example, the linker recognizes that ``foo2()`` is an externally + visible symbol defined in LLVM bitcode file. The linker completes its usual + symbol resolution pass and finds that ``foo2()`` is not used + anywhere. This information is used by the LLVM optimizer and it + removes ``foo2()``.</li> + +* As soon as ``foo2()`` is removed, the optimizer recognizes that condition ``i + < 0`` is always false, which means ``foo3()`` is never used. Hence, the + optimizer also removes ``foo3()``. + +* And this in turn, enables linker to remove ``foo4()``. + +This example illustrates the advantage of tight integration with the +linker. Here, the optimizer can not remove ``foo3()`` without the linker's +input. + +Alternative Approaches +---------------------- + +**Compiler driver invokes link time optimizer separately.** + In this model the link time optimizer is not able to take advantage of + information collected during the linker's normal symbol resolution phase. + In the above example, the optimizer can not remove ``foo2()`` without the + linker's input because it is externally visible. This in turn prohibits the + optimizer from removing ``foo3()``. + +**Use separate tool to collect symbol information from all object files.** + In this model, a new, separate, tool or library replicates the linker's + capability to collect information for link time optimization. Not only is + this code duplication difficult to justify, but it also has several other + disadvantages. For example, the linking semantics and the features provided + by the linker on various platform are not unique. This means, this new tool + needs to support all such features and platforms in one super tool or a + separate tool per platform is required. This increases maintenance cost for + link time optimizer significantly, which is not necessary. This approach + also requires staying synchronized with linker developements on various + platforms, which is not the main focus of the link time optimizer. Finally, + this approach increases end user's build time due to the duplication of work + done by this separate tool and the linker itself. + +Multi-phase communication between ``libLTO`` and linker +======================================================= + +The linker collects information about symbol defininitions and uses in various +link objects which is more accurate than any information collected by other +tools during typical build cycles. 
The linker collects this information by +looking at the definitions and uses of symbols in native .o files and using +symbol visibility information. The linker also uses user-supplied information, +such as a list of exported symbols. LLVM optimizer collects control flow +information, data flow information and knows much more about program structure +from the optimizer's point of view. Our goal is to take advantage of tight +integration between the linker and the optimizer by sharing this information +during various linking phases. + +Phase 1 : Read LLVM Bitcode Files +--------------------------------- + +The linker first reads all object files in natural order and collects symbol +information. This includes native object files as well as LLVM bitcode files. +To minimize the cost to the linker in the case that all .o files are native +object files, the linker only calls ``lto_module_create()`` when a supplied +object file is found to not be a native object file. If ``lto_module_create()`` +returns that the file is an LLVM bitcode file, the linker then iterates over the +module using ``lto_module_get_symbol_name()`` and +``lto_module_get_symbol_attribute()`` to get all symbols defined and referenced. +This information is added to the linker's global symbol table. + + +The lto* functions are all implemented in a shared object libLTO. This allows +the LLVM LTO code to be updated independently of the linker tool. On platforms +that support it, the shared object is lazily loaded. + +Phase 2 : Symbol Resolution +--------------------------- + +In this stage, the linker resolves symbols using global symbol table. It may +report undefined symbol errors, read archive members, replace weak symbols, etc. +The linker is able to do this seamlessly even though it does not know the exact +content of input LLVM bitcode files. If dead code stripping is enabled then the +linker collects the list of live symbols. + +Phase 3 : Optimize Bitcode Files +-------------------------------- + +After symbol resolution, the linker tells the LTO shared object which symbols +are needed by native object files. In the example above, the linker reports +that only ``foo1()`` is used by native object files using +``lto_codegen_add_must_preserve_symbol()``. Next the linker invokes the LLVM +optimizer and code generators using ``lto_codegen_compile()`` which returns a +native object file creating by merging the LLVM bitcode files and applying +various optimization passes. + +Phase 4 : Symbol Resolution after optimization +---------------------------------------------- + +In this phase, the linker reads optimized a native object file and updates the +internal global symbol table to reflect any changes. The linker also collects +information about any changes in use of external symbols by LLVM bitcode +files. In the example above, the linker notes that ``foo4()`` is not used any +more. If dead code stripping is enabled then the linker refreshes the live +symbol information appropriately and performs dead code stripping. + +After this phase, the linker continues linking as if it never saw LLVM bitcode +files. + +.. _libLTO: + +``libLTO`` +========== + +``libLTO`` is a shared object that is part of the LLVM tools, and is intended +for use by a linker. ``libLTO`` provides an abstract C interface to use the LLVM +interprocedural optimizer without exposing details of LLVM's internals. The +intention is to keep the interface as stable as possible even when the LLVM +optimizer continues to evolve. 
It should even be possible for a completely +different compilation technology to provide a different libLTO that works with +their object files and the standard linker tool. + +``lto_module_t`` +---------------- + +A non-native object file is handled via an ``lto_module_t``. The following +functions allow the linker to check if a file (on disk or in a memory buffer) is +a file which libLTO can process: + +.. code-block:: c + + lto_module_is_object_file(const char*) + lto_module_is_object_file_for_target(const char*, const char*) + lto_module_is_object_file_in_memory(const void*, size_t) + lto_module_is_object_file_in_memory_for_target(const void*, size_t, const char*) + +If the object file can be processed by ``libLTO``, the linker creates a +``lto_module_t`` by using one of: + +.. code-block:: c + + lto_module_create(const char*) + lto_module_create_from_memory(const void*, size_t) + +and when done, the handle is released via + +.. code-block:: c + + lto_module_dispose(lto_module_t) + + +The linker can introspect the non-native object file by getting the number of +symbols and getting the name and attributes of each symbol via: + +.. code-block:: c + + lto_module_get_num_symbols(lto_module_t) + lto_module_get_symbol_name(lto_module_t, unsigned int) + lto_module_get_symbol_attribute(lto_module_t, unsigned int) + +The attributes of a symbol include the alignment, visibility, and kind. + +``lto_code_gen_t`` +------------------ + +Once the linker has loaded each non-native object files into an +``lto_module_t``, it can request ``libLTO`` to process them all and generate a +native object file. This is done in a couple of steps. First, a code generator +is created with: + +.. code-block:: c + + lto_codegen_create() + +Then, each non-native object file is added to the code generator with: + +.. code-block:: c + + lto_codegen_add_module(lto_code_gen_t, lto_module_t) + +The linker then has the option of setting some codegen options. Whether or not +to generate DWARF debug info is set with: + +.. code-block:: c + + lto_codegen_set_debug_model(lto_code_gen_t) + +Which kind of position independence is set with: + +.. code-block:: c + + lto_codegen_set_pic_model(lto_code_gen_t) + +And each symbol that is referenced by a native object file or otherwise must not +be optimized away is set with: + +.. code-block:: c + + lto_codegen_add_must_preserve_symbol(lto_code_gen_t, const char*) + +After all these settings are done, the linker requests that a native object file +be created from the modules with the settings using: + +.. code-block:: c + + lto_codegen_compile(lto_code_gen_t, size*) + +which returns a pointer to a buffer containing the generated native object file. +The linker then parses that and links it with the rest of the native object +files. diff --git a/docs/Makefile b/docs/Makefile index 389fd90..122c4b8 100644 --- a/docs/Makefile +++ b/docs/Makefile @@ -8,7 +8,7 @@ ##===----------------------------------------------------------------------===## LEVEL := .. -DIRS := CommandGuide tutorial +DIRS := ifdef BUILD_FOR_WEBSITE PROJ_OBJ_DIR = . 
@@ -26,10 +26,9 @@ include $(LEVEL)/Makefile.common HTML := $(wildcard $(PROJ_SRC_DIR)/*.html) \ $(wildcard $(PROJ_SRC_DIR)/*.css) -IMAGES := $(wildcard $(PROJ_SRC_DIR)/img/*.*) DOXYFILES := doxygen.cfg.in doxygen.css doxygen.footer doxygen.header \ doxygen.intro -EXTRA_DIST := $(HTML) $(DOXYFILES) llvm.css CommandGuide img +EXTRA_DIST := $(HTML) $(DOXYFILES) llvm.css CommandGuide .PHONY: install-html install-doxygen doxygen install-ocamldoc ocamldoc generated @@ -56,9 +55,7 @@ generated:: $(generated_targets) install-html: $(PROJ_OBJ_DIR)/html.tar.gz $(Echo) Installing HTML documentation $(Verb) $(MKDIR) $(DESTDIR)$(PROJ_docsdir)/html - $(Verb) $(MKDIR) $(DESTDIR)$(PROJ_docsdir)/html/img $(Verb) $(DataInstall) $(HTML) $(DESTDIR)$(PROJ_docsdir)/html - $(Verb) $(DataInstall) $(IMAGES) $(DESTDIR)$(PROJ_docsdir)/html/img $(Verb) $(DataInstall) $(PROJ_OBJ_DIR)/html.tar.gz $(DESTDIR)$(PROJ_docsdir) $(PROJ_OBJ_DIR)/html.tar.gz: $(HTML) diff --git a/docs/Makefile.sphinx b/docs/Makefile.sphinx new file mode 100644 index 0000000..21f6648 --- /dev/null +++ b/docs/Makefile.sphinx @@ -0,0 +1,155 @@ +# Makefile for Sphinx documentation +# + +# You can set these variables from the command line. +SPHINXOPTS = +SPHINXBUILD = sphinx-build +PAPER = +BUILDDIR = _build + +# Internal variables. +PAPEROPT_a4 = -D latex_paper_size=a4 +PAPEROPT_letter = -D latex_paper_size=letter +ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) . +# the i18n builder cannot share the environment and doctrees with the others +I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) . + +.PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext + +all: html + +help: + @echo "Please use \`make <target>' where <target> is one of" + @echo " html to make standalone HTML files" + @echo " dirhtml to make HTML files named index.html in directories" + @echo " singlehtml to make a single large HTML file" + @echo " pickle to make pickle files" + @echo " json to make JSON files" + @echo " htmlhelp to make HTML files and a HTML help project" + @echo " qthelp to make HTML files and a qthelp project" + @echo " devhelp to make HTML files and a Devhelp project" + @echo " epub to make an epub" + @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter" + @echo " latexpdf to make LaTeX files and run them through pdflatex" + @echo " text to make text files" + @echo " man to make manual pages" + @echo " texinfo to make Texinfo files" + @echo " info to make Texinfo files and run them through makeinfo" + @echo " gettext to make PO message catalogs" + @echo " changes to make an overview of all changed/added/deprecated items" + @echo " linkcheck to check all external links for integrity" + @echo " doctest to run all doctests embedded in the documentation (if enabled)" + +clean: + -rm -rf $(BUILDDIR)/* + +html: + $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html + @echo + @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." + +dirhtml: + $(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml + @echo + @echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml." + +singlehtml: + $(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml + @echo + @echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml." + +pickle: + $(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle + @echo + @echo "Build finished; now you can process the pickle files." 
+ +json: + $(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json + @echo + @echo "Build finished; now you can process the JSON files." + +htmlhelp: + $(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp + @echo + @echo "Build finished; now you can run HTML Help Workshop with the" \ + ".hhp project file in $(BUILDDIR)/htmlhelp." + +qthelp: + $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp + @echo + @echo "Build finished; now you can run "qcollectiongenerator" with the" \ + ".qhcp project file in $(BUILDDIR)/qthelp, like this:" + @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/llvm.qhcp" + @echo "To view the help file:" + @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/llvm.qhc" + +devhelp: + $(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp + @echo + @echo "Build finished." + @echo "To view the help file:" + @echo "# mkdir -p $$HOME/.local/share/devhelp/llvm" + @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/llvm" + @echo "# devhelp" + +epub: + $(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub + @echo + @echo "Build finished. The epub file is in $(BUILDDIR)/epub." + +latex: + $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex + @echo + @echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex." + @echo "Run \`make' in that directory to run these through (pdf)latex" \ + "(use \`make latexpdf' here to do that automatically)." + +latexpdf: + $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex + @echo "Running LaTeX files through pdflatex..." + $(MAKE) -C $(BUILDDIR)/latex all-pdf + @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex." + +text: + $(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text + @echo + @echo "Build finished. The text files are in $(BUILDDIR)/text." + +man: + $(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man + @echo + @echo "Build finished. The manual pages are in $(BUILDDIR)/man." + +texinfo: + $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo + @echo + @echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo." + @echo "Run \`make' in that directory to run these through makeinfo" \ + "(use \`make info' here to do that automatically)." + +info: + $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo + @echo "Running Texinfo files through makeinfo..." + make -C $(BUILDDIR)/texinfo info + @echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo." + +gettext: + $(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale + @echo + @echo "Build finished. The message catalogs are in $(BUILDDIR)/locale." + +changes: + $(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes + @echo + @echo "The overview file is in $(BUILDDIR)/changes." + +linkcheck: + $(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck + @echo + @echo "Link check complete; look for any errors in the above output " \ + "or in $(BUILDDIR)/linkcheck/output.txt." + +doctest: + $(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest + @echo "Testing of doctests in the sources finished, look at the " \ + "results in $(BUILDDIR)/doctest/output.txt." 
diff --git a/docs/MakefileGuide.html b/docs/MakefileGuide.html deleted file mode 100644 index 1e7c3469..0000000 --- a/docs/MakefileGuide.html +++ /dev/null @@ -1,1039 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>LLVM Makefile Guide</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> - -<h1>LLVM Makefile Guide</h1> - -<ol> - <li><a href="#introduction">Introduction</a></li> - <li><a href="#general">General Concepts</a> - <ol> - <li><a href="#projects">Projects</a></li> - <li><a href="#varvals">Variable Values</a></li> - <li><a href="#including">Including Makefiles</a> - <ol> - <li><a href="#Makefile">Makefile</a></li> - <li><a href="#Makefile.common">Makefile.common</a></li> - <li><a href="#Makefile.config">Makefile.config</a></li> - <li><a href="#Makefile.rules">Makefile.rules</a></li> - </ol> - </li> - <li><a href="#Comments">Comments</a></li> - </ol> - </li> - <li><a href="#tutorial">Tutorial</a> - <ol> - <li><a href="#libraries">Libraries</a> - <ol> - <li><a href="#BCModules">Bitcode Modules</a></li> - <li><a href="#LoadableModules">Loadable Modules</a></li> - </ol> - </li> - <li><a href="#tools">Tools</a> - <ol> - <li><a href="#JIT">JIT Tools</a></li> - </ol> - </li> - <li><a href="#projects">Projects</a></li> - </ol> - </li> - <li><a href="#targets">Targets Supported</a> - <ol> - <li><a href="#all">all</a></li> - <li><a href="#all-local">all-local</a></li> - <li><a href="#check">check</a></li> - <li><a href="#check-local">check-local</a></li> - <li><a href="#clean">clean</a></li> - <li><a href="#clean-local">clean-local</a></li> - <li><a href="#dist">dist</a></li> - <li><a href="#dist-check">dist-check</a></li> - <li><a href="#dist-clean">dist-clean</a></li> - <li><a href="#install">install</a></li> - <li><a href="#preconditions">preconditions</a></li> - <li><a href="#printvars">printvars</a></li> - <li><a href="#reconfigure">reconfigure</a></li> - <li><a href="#spotless">spotless</a></li> - <li><a href="#tags">tags</a></li> - <li><a href="#uninstall">uninstall</a></li> - </ol> - </li> - <li><a href="#variables">Using Variables</a> - <ol> - <li><a href="#setvars">Control Variables</a></li> - <li><a href="#overvars">Override Variables</a></li> - <li><a href="#getvars">Readable Variables</a></li> - <li><a href="#intvars">Internal Variables</a></li> - </ol> - </li> -</ol> - -<div class="doc_author"> - <p>Written by <a href="mailto:reid@x10sys.com">Reid Spencer</a></p> -</div> - -<!-- *********************************************************************** --> -<h2><a name="introduction">Introduction</a></h2> -<!-- *********************************************************************** --> - -<div> - <p>This document provides <em>usage</em> information about the LLVM makefile - system. While loosely patterned after the BSD makefile system, LLVM has taken - a departure from BSD in order to implement additional features needed by LLVM. - Although makefile systems such as automake were attempted at one point, it - has become clear that the features needed by LLVM and the Makefile norm are - too great to use a more limited tool. Consequently, LLVM requires simply GNU - Make 3.79, a widely portable makefile processor. LLVM unabashedly makes heavy - use of the features of GNU Make so the dependency on GNU Make is firm. 
If - you're not familiar with <tt>make</tt>, it is recommended that you read the - <a href="http://www.gnu.org/software/make/manual/make.html">GNU Makefile - Manual</a>.</p> - <p>While this document is rightly part of the - <a href="ProgrammersManual.html">LLVM Programmer's Manual</a>, it is treated - separately here because of the volume of content and because it is often an - early source of bewilderment for new developers.</p> -</div> - -<!-- *********************************************************************** --> -<h2><a name="general">General Concepts</a></h2> -<!-- *********************************************************************** --> - -<div> - <p>The LLVM Makefile System is the component of LLVM that is responsible for - building the software, testing it, generating distributions, checking those - distributions, installing and uninstalling, etc. It consists of a several - files throughout the source tree. These files and other general concepts are - described in this section.</p> - -<!-- ======================================================================= --> -<h3><a name="projects">Projects</a></h3> -<div> - <p>The LLVM Makefile System is quite generous. It not only builds its own - software, but it can build yours too. Built into the system is knowledge of - the <tt>llvm/projects</tt> directory. Any directory under <tt>projects</tt> - that has both a <tt>configure</tt> script and a <tt>Makefile</tt> is assumed - to be a project that uses the LLVM Makefile system. Building software that - uses LLVM does not require the LLVM Makefile System nor even placement in the - <tt>llvm/projects</tt> directory. However, doing so will allow your project - to get up and running quickly by utilizing the built-in features that are used - to compile LLVM. LLVM compiles itself using the same features of the makefile - system as used for projects.</p> - <p>For complete details on setting up your projects configuration, simply - mimic the <tt>llvm/projects/sample</tt> project or for further details, - consult the <a href="Projects.html">Projects.html</a> page.</p> -</div> - -<!-- ======================================================================= --> -<h3><a name="varvalues">Variable Values</a></h3> -<div> - <p>To use the makefile system, you simply create a file named - <tt>Makefile</tt> in your directory and declare values for certain variables. - The variables and values that you select determine what the makefile system - will do. These variables enable rules and processing in the makefile system - that automatically Do The Right Thing™. -</div> - -<!-- ======================================================================= --> -<h3><a name="including">Including Makefiles</a></h3> -<div> - <p>Setting variables alone is not enough. You must include into your Makefile - additional files that provide the rules of the LLVM Makefile system. The - various files involved are described in the sections that follow.</p> - -<!-- ======================================================================= --> -<h4><a name="Makefile">Makefile</a></h4> -<div> - <p>Each directory to participate in the build needs to have a file named - <tt>Makefile</tt>. This is the file first read by <tt>make</tt>. It has three - sections:</p> - <ol> - <li><a href="#setvars">Settable Variables</a> - Required that must be set - first.</li> - <li><a href="#Makefile.common">include <tt>$(LEVEL)/Makefile.common</tt></a> - - include the LLVM Makefile system. 
- <li><a href="#overvars">Override Variables</a> - Override variables set by - the LLVM Makefile system. - </ol> -</div> - -<!-- ======================================================================= --> -<h4><a name="Makefile.common">Makefile.common</a></h4> -<div> - <p>Every project must have a <tt>Makefile.common</tt> file at its top source - directory. This file serves three purposes:</p> - <ol> - <li>It includes the project's configuration makefile to obtain values - determined by the <tt>configure</tt> script. This is done by including the - <a href="#Makefile.config"><tt>$(LEVEL)/Makefile.config</tt></a> file.</li> - <li>It specifies any other (static) values that are needed throughout the - project. Only values that are used in all or a large proportion of the - project's directories should be placed here.</li> - <li>It includes the standard rules for the LLVM Makefile system, - <a href="#Makefile.rules"><tt>$(LLVM_SRC_ROOT)/Makefile.rules</tt></a>. - This file is the "guts" of the LLVM Makefile system.</li> - </ol> -</div> - -<!-- ======================================================================= --> -<h4><a name="Makefile.config">Makefile.config</a></h4> -<div> - <p>Every project must have a <tt>Makefile.config</tt> at the top of its - <em>build</em> directory. This file is <b>generated</b> by the - <tt>configure</tt> script from the pattern provided by the - <tt>Makefile.config.in</tt> file located at the top of the project's - <em>source</em> directory. The contents of this file depend largely on what - configuration items the project uses, however most projects can get what they - need by just relying on LLVM's configuration found in - <tt>$(LLVM_OBJ_ROOT)/Makefile.config</tt>. -</div> - -<!-- ======================================================================= --> -<h4><a name="Makefile.rules">Makefile.rules</a></h4> -<div> - <p>This file, located at <tt>$(LLVM_SRC_ROOT)/Makefile.rules</tt> is the heart - of the LLVM Makefile System. It provides all the logic, dependencies, and - rules for building the targets supported by the system. What it does largely - depends on the values of <tt>make</tt> <a href="#variables">variables</a> that - have been set <em>before</em> <tt>Makefile.rules</tt> is included. -</div> - -</div> - -<!-- ======================================================================= --> -<h3><a name="Comments">Comments</a></h3> -<div> - <p>User Makefiles need not have comments in them unless the construction is - unusual or it does not strictly follow the rules and patterns of the LLVM - makefile system. Makefile comments are invoked with the pound (#) character. - The # character and any text following it, to the end of the line, are ignored - by <tt>make</tt>.</p> -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="tutorial">Tutorial</a></h2> -<!-- *********************************************************************** --> -<div> - <p>This section provides some examples of the different kinds of modules you - can build with the LLVM makefile system. In general, each directory you - provide will build a single object although that object may be composed of - additionally compiled components.</p> - -<!-- ======================================================================= --> -<h3><a name="libraries">Libraries</a></h3> -<div> - <p>Only a few variable definitions are needed to build a regular library. 
- Normally, the makefile system will build all the software into a single - <tt>libname.o</tt> (pre-linked) object. This means the library is not - searchable and that the distinction between compilation units has been - dissolved. Optionally, you can ask for a shared library (.so) or archive - library (.a) to be built. Archive libraries are the default. For example:</p> - <pre><tt> - LIBRARYNAME = mylib - SHARED_LIBRARY = 1 - ARCHIVE_LIBRARY = 1 - </tt></pre> - <p>says to build a library named "mylib" with both a shared library - (<tt>mylib.so</tt>) and an archive library (<tt>mylib.a</tt>) version. The - contents of all the - libraries produced will be the same; they are just constructed differently. - Note that you normally do not need to specify the sources involved. The LLVM - Makefile system will infer the source files from the contents of the source - directory.</p> - <p>The <tt>LOADABLE_MODULE=1</tt> directive can be used in conjunction with - <tt>SHARED_LIBRARY=1</tt> to indicate that the resulting shared library should - be openable with the <tt>dlopen</tt> function and searchable with the - <tt>dlsym</tt> function (or your operating system's equivalents). While this - isn't strictly necessary on Linux and a few other platforms, it is required - on systems like HP-UX and Darwin. You should use <tt>LOADABLE_MODULE</tt> for - any shared library that you intend to be loaded into a tool via the - <tt>-load</tt> option. See the - <a href="WritingAnLLVMPass.html#makefile">WritingAnLLVMPass.html</a> document - for an example of why you might want to do this. - -<!-- ======================================================================= --> -<h4><a name="BCModules">Bitcode Modules</a></h4> -<div> - <p>In some situations, it is desirable to build a single bitcode module from - a variety of sources, instead of an archive, shared library, or bitcode - library. Bitcode modules can be specified in addition to any of the other - types of libraries by defining the <a href="#MODULE_NAME">MODULE_NAME</a> - variable. For example:</p> - <pre><tt> - LIBRARYNAME = mylib - BYTECODE_LIBRARY = 1 - MODULE_NAME = mymod - </tt></pre> - <p>will build a module named <tt>mymod.bc</tt> from the sources in the - directory. This module will be an aggregation of all the bitcode modules - derived from the sources. The example will also build a bitcode archive - containing a bitcode module for each compiled source file. The difference is - subtle, but important depending on how the module or library is to be linked. - </p> -</div> - -<!-- ======================================================================= --> -<h4> - <a name="LoadableModules">Loadable Modules</a> -</h4> -<div> - <p>In some situations, you need to create a loadable module. Loadable modules - can be loaded into programs like <tt>opt</tt> or <tt>llc</tt> to specify - additional passes to run or targets to support. Loadable modules are also - useful for debugging a pass or providing a pass with another package if that - pass can't be included in LLVM.</p> - <p>LLVM provides complete support for building such a module. All you need to - do is use the LOADABLE_MODULE variable in your Makefile.
For example, to - build a loadable module named <tt>MyMod</tt> that uses the LLVM libraries - <tt>LLVMSupport.a</tt> and <tt>LLVMSystem.a</tt>, you would specify:</p> - <pre><tt> - LIBRARYNAME := MyMod - LOADABLE_MODULE := 1 - LINK_COMPONENTS := support system - </tt></pre> - <p>Use of the <tt>LOADABLE_MODULE</tt> facility implies several things:</p> - <ol> - <li>There will be no "lib" prefix on the module. This differentiates it from - a standard shared library of the same name.</li> - <li>The <a href="#SHARED_LIBRARY">SHARED_LIBRARY</a> variable is turned - on.</li> - <li>The <a href="#LINK_LIBS_IN_SHARED">LINK_LIBS_IN_SHARED</a> variable - is turned on.</li> - </ol> - <p>A loadable module is loaded by LLVM via the facilities of libtool's libltdl - library which is part of the <tt>lib/System</tt> implementation.</p> -</div> - -</div> - -<!-- ======================================================================= --> -<h3><a name="tools">Tools</a></h3> -<div> - <p>For building executable programs (tools), you must provide the name of the - tool and the names of the libraries you wish to link with the tool. For - example:</p> - <pre><tt> - TOOLNAME = mytool - USEDLIBS = mylib - LINK_COMPONENTS = support system - </tt></pre> - <p>says that we are to build a tool named <tt>mytool</tt> and that it requires - three libraries: <tt>mylib</tt>, <tt>LLVMSupport.a</tt> and - <tt>LLVMSystem.a</tt>.</p> - <p>Note that two different variables are used to indicate which libraries are - linked: <tt>USEDLIBS</tt> and <tt>LLVMLIBS</tt>. This distinction is necessary - to support projects. <tt>LLVMLIBS</tt> refers to the LLVM libraries found in - the LLVM object directory. <tt>USEDLIBS</tt> refers to the libraries built by - your project. In the case of building LLVM tools, <tt>USEDLIBS</tt> and - <tt>LLVMLIBS</tt> can be used interchangeably since the "project" is LLVM - itself and <tt>USEDLIBS</tt> refers to the same place as <tt>LLVMLIBS</tt>. - </p> - <p>Also note that there are two different ways of specifying a library: with a - <tt>.a</tt> suffix and without. Without the suffix, the entry refers to the - re-linked (.o) file which will include <em>all</em> symbols of the library. - This is useful, for example, to include all passes from a library of passes. - If the <tt>.a</tt> suffix is used then the library is linked as a searchable - library (with the <tt>-l</tt> option). In this case, only the symbols that are - unresolved <em>at that point</em> will be resolved from the library, if they - exist. Other (unreferenced) symbols will not be included when the <tt>.a</tt> - syntax is used. Note that in order to use the <tt>.a</tt> suffix, the library - in question must have been built with the <tt>ARCHIVE_LIBRARY</tt> option set. - </p> - -<!-- ======================================================================= --> -<h4><a name="JIT">JIT Tools</a></h4> -<div> - <p>Many tools will want to use the JIT features of LLVM. To do this, you - simply specify that you want an execution 'engine', and the makefiles will - automatically link in the appropriate JIT for the host or an interpreter - if none is available:</p> - <pre><tt> - TOOLNAME = my_jit_tool - USEDLIBS = mylib - LINK_COMPONENTS = engine - </tt></pre> - <p>Of course, any additional libraries may be listed as other components.
To - get a full understanding of how this changes the linker command, it is - recommended that you:</p> - <pre><tt> - cd examples/Fibonacci - make VERBOSE=1 - </tt></pre> -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="targets">Targets Supported</a></h2> -<!-- *********************************************************************** --> - -<div> - <p>This section describes each of the targets that can be built using the LLVM - Makefile system. Any target can be invoked from any directory but not all are - applicable to a given directory (e.g. "check", "dist" and "install" will - always operate as if invoked from the top level directory).</p> - - <table style="text-align:left"> - <tr> - <th>Target Name</th><th>Implied Targets</th><th>Target Description</th> - </tr> - <tr><td><a href="#all"><tt>all</tt></a></td><td></td> - <td>Compile the software recursively. Default target. - </td></tr> - <tr><td><a href="#all-local"><tt>all-local</tt></a></td><td></td> - <td>Compile the software in the local directory only. - </td></tr> - <tr><td><a href="#check"><tt>check</tt></a></td><td></td> - <td>Change to the <tt>test</tt> directory in a project and run the - test suite there. - </td></tr> - <tr><td><a href="#check-local"><tt>check-local</tt></a></td><td></td> - <td>Run a local test suite. Generally this is only defined in the - <tt>Makefile</tt> of the project's <tt>test</tt> directory. - </td></tr> - <tr><td><a href="#clean"><tt>clean</tt></a></td><td></td> - <td>Remove built objects recursively. - </td></tr> - <tr><td><a href="#clean-local"><tt>clean-local</tt></a></td><td></td> - <td>Remove built objects from the local directory only. - </td></tr> - <tr><td><a href="#dist"><tt>dist</tt></a></td><td>all</td> - <td>Prepare a source distribution tarball. - </td></tr> - <tr><td><a href="#dist-check"><tt>dist-check</tt></a></td><td>all</td> - <td>Prepare a source distribution tarball and check that it builds. - </td></tr> - <tr><td><a href="#dist-clean"><tt>dist-clean</tt></a></td><td>clean</td> - <td>Clean source distribution tarball temporary files. - </td></tr> - <tr><td><a href="#install"><tt>install</tt></a></td><td>all</td> - <td>Copy built objects to installation directory. - </td></tr> - <tr><td><a href="#preconditions"><tt>preconditions</tt></a></td><td>all</td> - <td>Check to make sure configuration and makefiles are up to date. - </td></tr> - <tr><td><a href="#printvars"><tt>printvars</tt></a></td><td>all</td> - <td>Prints variables defined by the makefile system (for debugging). - </td></tr> - <tr><td><a href="#tags"><tt>tags</tt></a></td><td></td> - <td>Make C and C++ tags files for emacs and vi. - </td></tr> - <tr><td><a href="#uninstall"><tt>uninstall</tt></a></td><td></td> - <td>Remove built objects from installation directory. - </td></tr> - </table> - -<!-- ======================================================================= --> -<h3><a name="all">all (default)</a></h3> -<div> - <p>When you invoke <tt>make</tt> with no arguments, you are implicitly - instructing it to seek the "all" target (goal). This target is used for - building the software recursively and will do different things in different - directories. For example, in a <tt>lib</tt> directory, the "all" target will - compile source files and generate libraries. 
But, in a <tt>tools</tt> - directory, it will link libraries and generate executables.</p> -</div> - -<!-- ======================================================================= --> -<h3><a name="all-local">all-local</a></h3> -<div> - <p>This target is the same as <a href="#all">all</a> but it operates only on - the current directory instead of recursively.</p> -</div> - -<!-- ======================================================================= --> -<h3><a name="check">check</a></h3> -<div> - <p>This target can be invoked from anywhere within a project's directories - but always invokes the <a href="#check-local"><tt>check-local</tt></a> target - in the project's <tt>test</tt> directory, if it exists and has a - <tt>Makefile</tt>. A warning is produced otherwise. If - <a href="#TESTSUITE"><tt>TESTSUITE</tt></a> is defined on the <tt>make</tt> - command line, it will be passed down to the invocation of - <tt>make check-local</tt> in the <tt>test</tt> directory. The intended usage - for this is to assist in running specific suites of tests. If - <tt>TESTSUITE</tt> is not set, the implementation of <tt>check-local</tt> - should run all normal tests. It is up to the project to define what - different values for <tt>TESTSUITE</tt> will do. See the - <a href="TestingGuide.html">TestingGuide</a> for further details.</p> -</div> - -<!-- ======================================================================= --> -<h3><a name="check-local">check-local</a></h3> -<div> - <p>This target should be implemented by the <tt>Makefile</tt> in the project's - <tt>test</tt> directory. It is invoked by the <tt>check</tt> target elsewhere. - Each project is free to define the actions of <tt>check-local</tt> as - appropriate for that project. The LLVM project itself uses dejagnu to run a - suite of feature and regression tests. Other projects may choose to use - dejagnu or any other testing mechanism.</p> -</div> - -<!-- ======================================================================= --> -<h3><a name="clean">clean</a></h3> -<div> - <p>This target cleans the build directory, recursively removing all things - that the Makefile builds. The cleaning rules have been made guarded so they - shouldn't go awry (via <tt>rm -f $(UNSET_VARIABLE)/*</tt>, which would attempt - to erase the entire directory structure).</p> -</div> - -<!-- ======================================================================= --> -<h3><a name="clean-local">clean-local</a></h3> -<div> - <p>This target does the same thing as <tt>clean</tt> but only for the current - (local) directory.</p> -</div> - -<!-- ======================================================================= --> -<h3><a name="dist">dist</a></h3> -<div> - <p>This target builds a distribution tarball. It first builds the entire - project using the <tt>all</tt> target and then tars up the necessary files and - compresses it. The generated tarball is sufficient for a casual source - distribution, but probably not for a release (see <tt>dist-check</tt>).</p> -</div> - -<!-- ======================================================================= --> -<h3><a name="dist-check">dist-check</a></h3> -<div> - <p>This target does the same thing as the <tt>dist</tt> target but also checks - the distribution tarball. The check is made by unpacking the tarball to a new - directory, configuring it, building it, installing it, and then verifying that - the installation results are correct (by comparing to the original build).
- This target can take a long time to run but should be done before a release - goes out to make sure that the distributed tarball can actually be built into - a working release.</p> -</div> - -<!-- ======================================================================= --> -<h3><a name="dist-clean">dist-clean</a></h3> -<div> - <p>This is a special form of the <tt>clean</tt> target. It performs a - normal <tt>clean</tt> but also removes things pertaining to building the - distribution.</p> -</div> - -<!-- ======================================================================= --> -<h3><a name="install">install</a></h3> -<div> - <p>This target finalizes shared objects and executables and copies all - libraries, headers, executables and documentation to the directory given - with the <tt>--prefix</tt> option to <tt>configure</tt>. When completed, - the prefix directory will have everything needed to <b>use</b> LLVM. </p> - <p>The LLVM makefiles can generate complete <b>internal</b> documentation - for all the classes by using <tt>doxygen</tt>. By default, this feature is - <b>not</b> enabled because it takes a long time and generates a massive - amount of data (>100MB). If you want this feature, you must configure LLVM - with the --enable-doxygen switch and ensure that a modern version of doxygen - (1.3.7 or later) is available in your <tt>PATH</tt>. You can download - doxygen from - <a href="http://www.stack.nl/~dimitri/doxygen/download.html#latestsrc"> - here</a>. -</div> - -<!-- ======================================================================= --> -<h3><a name="preconditions">preconditions</a></h3> -<div> - <p>This utility target checks to see if the <tt>Makefile</tt> in the object - directory is older than the <tt>Makefile</tt> in the source directory and - copies it if so. It also reruns the <tt>configure</tt> script if that needs to - be done and rebuilds the <tt>Makefile.config</tt> file similarly. Users may - overload this target to ensure that sanity checks are run <em>before</em> any - building of targets as all the targets depend on <tt>preconditions</tt>.</p> -</div> - -<!-- ======================================================================= --> -<h3><a name="printvars">printvars</a></h3> -<div> - <p>This utility target just causes the LLVM makefiles to print out some of - the makefile variables so that you can double check how things are set. </p> -</div> - -<!-- ======================================================================= --> -<h3><a name="reconfigure">reconfigure</a></h3> -<div> - <p>This utility target will force a reconfigure of LLVM or your project. It - simply runs <tt>$(PROJ_OBJ_ROOT)/config.status --recheck</tt> to rerun the - configuration tests and rebuild the configured files. This isn't generally - useful as the makefiles will reconfigure themselves whenever it is necessary. - </p> -</div> - -<!-- ======================================================================= --> -<h3><a name="spotless">spotless</a></h3> -<div> - <p>This utility target, only available when <tt>$(PROJ_OBJ_ROOT)</tt> is not - the same as <tt>$(PROJ_SRC_ROOT)</tt>, will completely clean the - <tt>$(PROJ_OBJ_ROOT)</tt> directory by removing its content entirely and - reconfiguring the directory. This returns the <tt>$(PROJ_OBJ_ROOT)</tt> - directory to a completely fresh state.
All content in the directory except - configured files and top-level makefiles will be lost.</p> - <div class="doc_warning"><p>Use with caution.</p></div> -</div> - -<!-- ======================================================================= --> -<h3><a name="tags">tags</a></h3> -<div> - <p>This target will generate a <tt>TAGS</tt> file in the top-level source - directory. It is meant for use with emacs, XEmacs, or ViM. The TAGS file - provides an index of symbol definitions so that the editor can jump you to the - definition quickly. </p> -</div> - -<!-- ======================================================================= --> -<h3><a name="uninstall">uninstall</a></h3> -<div> - <p>This target is the opposite of the <tt>install</tt> target. It removes the - header, library and executable files from the installation directories. Note - that the directories themselves are not removed because it is not guaranteed - that LLVM is the only thing installing there (e.g. --prefix=/usr).</p> -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="variables">Variables</a></h2> -<!-- *********************************************************************** --> -<div> - <p>Variables are used to tell the LLVM Makefile System what to do and to - obtain information from it. Variables are also used internally by the LLVM - Makefile System. Variable names that contain only the upper case alphabetic - letters and underscore are intended for use by the end user. All other - variables are internal to the LLVM Makefile System and should not be relied - upon nor modified. The sections below describe how to use the LLVM Makefile - variables.</p> - -<!-- ======================================================================= --> -<h3><a name="setvars">Control Variables</a></h3> -<div> - <p>Variables listed in the table below should be set <em>before</em> the - inclusion of <a href="#Makefile.common"><tt>$(LEVEL)/Makefile.common</tt></a>. - These variables provide input to the LLVM make system that tell it what to do - for the current directory.</p> - <dl> - <dt><a name="BUILD_ARCHIVE"><tt>BUILD_ARCHIVE</tt></a></dt> - <dd>If set to any value, causes an archive (.a) library to be built.</dd> - <dt><a name="BUILT_SOURCES"><tt>BUILT_SOURCES</tt></a></dt> - <dd>Specifies a set of source files that are generated from other source - files. These sources will be built before any other target processing to - ensure they are present.</dd> - <dt><a name="BYTECODE_LIBRARY"><tt>BYTECODE_LIBRARY</tt></a></dt> - <dd>If set to any value, causes a bitcode library (.bc) to be built.</dd> - <dt><a name="CONFIG_FILES"><tt>CONFIG_FILES</tt></a></dt> - <dd>Specifies a set of configuration files to be installed.</dd> - <dt><a name="DEBUG_SYMBOLS"><tt>DEBUG_SYMBOLS</tt></a></dt> - <dd>If set to any value, causes the build to include debugging - symbols even in optimized objects, libraries and executables. This - alters the flags specified to the compilers and linkers. Debugging - isn't fun in an optimized build, but it is possible.</dd> - <dt><a name="DIRS"><tt>DIRS</tt></a></dt> - <dd>Specifies a set of directories, usually children of the current - directory, that should also be made using the same goal. These directories - will be built serially.</dd> - <dt><a name="DISABLE_AUTO_DEPENDENCIES"><tt>DISABLE_AUTO_DEPENDENCIES</tt></a></dt> - <dd>If set to any value, causes the makefiles to <b>not</b> automatically - generate dependencies when running the compiler. 
Use of this feature is - discouraged and it may be removed at a later date.</dd> - <dt><a name="ENABLE_OPTIMIZED"><tt>ENABLE_OPTIMIZED</tt></a></dt> - <dd>If set to 1, causes the build to generate optimized objects, - libraries and executables. This alters the flags specified to the compilers - and linkers. Generally debugging won't be a fun experience with an optimized - build.</dd> - <dt><a name="ENABLE_PROFILING"><tt>ENABLE_PROFILING</tt></a></dt> - <dd>If set to 1, causes the build to generate both optimized and - profiled objects, libraries and executables. This alters the flags specified - to the compilers and linkers to ensure that profile data can be collected - from the tools built. Use the <tt>gprof</tt> tool to analyze the output from - the profiled tools (<tt>gmon.out</tt>).</dd> - <dt><a name="DISABLE_ASSERTIONS"><tt>DISABLE_ASSERTIONS</tt></a></dt> - <dd>If set to 1, causes the build to disable assertions, even if - building a debug or profile build. This will exclude all assertion check - code from the build. LLVM will execute faster, but with little help when - things go wrong.</dd> - <dt><a name="EXPERIMENTAL_DIRS"><tt>EXPERIMENTAL_DIRS</tt></a></dt> - <dd>Specify a set of directories that should be built, but if they fail, it - should not cause the build to fail. Note that this should only be used - temporarily while code is being written.</dd> - <dt><a name="EXPORTED_SYMBOL_FILE"><tt>EXPORTED_SYMBOL_FILE</tt></a></dt> - <dd>Specifies the name of a single file that contains a list of the - symbols to be exported by the linker. One symbol per line.</dd> - <dt><a name="EXPORTED_SYMBOL_LIST"><tt>EXPORTED_SYMBOL_LIST</tt></a></dt> - <dd>Specifies a set of symbols to be exported by the linker.</dd> - <dt><a name="EXTRA_DIST"><tt>EXTRA_DIST</tt></a></dt> - <dd>Specifies additional files that should be distributed with LLVM. All - source files, all built sources, all Makefiles, and most documentation files - will be automatically distributed. Use this variable to distribute any - files that are not automatically distributed.</dd> - <dt><a name="KEEP_SYMBOLS"><tt>KEEP_SYMBOLS</tt></a></dt> - <dd>If set to any value, specifies that when linking executables the - makefiles should retain debug symbols in the executable. Normally, symbols - are stripped from the executable.</dd> - <dt><a name="LEVEL"><tt>LEVEL</tt></a><small>(required)</small></dt> - <dd>Specify the level of nesting from the top level. This variable must be - set in each makefile as it is used to find the top level and thus the other - makefiles.</dd> - <dt><a name="LIBRARYNAME"><tt>LIBRARYNAME</tt></a></dt> - <dd>Specify the name of the library to be built. (Required For - Libraries)</dd> - <dt><a name="LINK_COMPONENTS"><tt>LINK_COMPONENTS</tt></a></dt> - <dd>When specified for building a tool, the value of this variable will be - passed to the <tt>llvm-config</tt> tool to generate a link line for the - tool. Unlike <tt>USEDLIBS</tt> and <tt>LLVMLIBS</tt>, not all libraries need - to be specified. The <tt>llvm-config</tt> tool will figure out the library - dependencies and add any libraries that are needed. 
The <tt>USEDLIBS</tt> - variable can still be used in conjunction with <tt>LINK_COMPONENTS</tt> so - that additional project-specific libraries can be linked with the LLVM - libraries specified by <tt>LINK_COMPONENTS</tt></dd> - <dt><a name="LINK_LIBS_IN_SHARED"><tt>LINK_LIBS_IN_SHARED</tt></a></dt> - <dd>By default, shared library linking will ignore any libraries specified - with the <a href="LLVMLIBS">LLVMLIBS</a> or <a href="USEDLIBS">USEDLIBS</a>. - This prevents shared libs from including things that will be in the LLVM - tool the shared library will be loaded into. However, sometimes it is useful - to link certain libraries into your shared library and this option enables - that feature.</dd> - <dt><a name="LLVMLIBS"><tt>LLVMLIBS</tt></a></dt> - <dd>Specifies the set of libraries from the LLVM $(ObjDir) that will be - linked into the tool or library.</dd> - <dt><a name="LOADABLE_MODULE"><tt>LOADABLE_MODULE</tt></a></dt> - <dd>If set to any value, causes the shared library being built to also be - a loadable module. Loadable modules can be opened with the dlopen() function - and searched with dlsym (or the operating system's equivalent). Note that - setting this variable without also setting <tt>SHARED_LIBRARY</tt> will have - no effect.</dd> - <dt><a name="MODULE_NAME"><tt>MODULE_NAME</tt></a></dt> - <dd>Specifies the name of a bitcode module to be created. A bitcode - module can be specified in conjunction with other kinds of library builds - or by itself. It constructs from the sources a single linked bitcode - file.</dd> - <dt><a name="NO_INSTALL"><tt>NO_INSTALL</tt></a></dt> - <dd>Specifies that the build products of the directory should not be - installed but should be built even if the <tt>install</tt> target is given. - This is handy for directories that build libraries or tools that are only - used as part of the build process, such as code generators (e.g. - <tt>tblgen</tt>).</dd> - <dt><a name="OPTIONAL_DIRS"><tt>OPTIONAL_DIRS</tt></a></dt> - <dd>Specify a set of directories that may be built, if they exist, but its - not an error for them not to exist.</dd> - <dt><a name="PARALLEL_DIRS"><tt>PARALLEL_DIRS</tt></a></dt> - <dd>Specify a set of directories to build recursively and in parallel if - the -j option was used with <tt>make</tt>.</dd> - <dt><a name="SHARED_LIBRARY"><tt>SHARED_LIBRARY</tt></a></dt> - <dd>If set to any value, causes a shared library (.so) to be built in - addition to any other kinds of libraries. Note that this option will cause - all source files to be built twice: once with options for position - independent code and once without. Use it only where you really need a - shared library.</dd> - <dt><a name="SOURCES"><tt>SOURCES</tt><small>(optional)</small></a></dt> - <dd>Specifies the list of source files in the current directory to be - built. Source files of any type may be specified (programs, documentation, - config files, etc.). If not specified, the makefile system will infer the - set of source files from the files present in the current directory.</dd> - <dt><a name="SUFFIXES"><tt>SUFFIXES</tt></a></dt> - <dd>Specifies a set of filename suffixes that occur in suffix match rules. - Only set this if your local <tt>Makefile</tt> specifies additional suffix - match rules.</dd> - <dt><a name="TARGET"><tt>TARGET</tt></a></dt> - <dd>Specifies the name of the LLVM code generation target that the - current directory builds. Setting this variable enables additional rules to - build <tt>.inc</tt> files from <tt>.td</tt> files. 
</dd> - <dt><a name="TESTSUITE"><tt>TESTSUITE</tt></a></dt> - <dd>Specifies the directory of tests to run in <tt>llvm/test</tt>.</dd> - <dt><a name="TOOLNAME"><tt>TOOLNAME</tt></a></dt> - <dd>Specifies the name of the tool that the current directory should - build.</dd> - <dt><a name="TOOL_VERBOSE"><tt>TOOL_VERBOSE</tt></a></dt> - <dd>Implies VERBOSE and also tells each tool invoked to be verbose. This is - handy when you're trying to see the sub-tools invoked by each tool invoked - by the makefile. For example, this will pass <tt>-v</tt> to the GCC - compilers which causes it to print out the command lines it uses to invoke - sub-tools (compiler, assembler, linker).</dd> - <dt><a name="USEDLIBS"><tt>USEDLIBS</tt></a></dt> - <dd>Specifies the list of project libraries that will be linked into the - tool or library.</dd> - <dt><a name="VERBOSE"><tt>VERBOSE</tt></a></dt> - <dd>Tells the Makefile system to produce detailed output of what it is doing - instead of just summary comments. This will generate a LOT of output.</dd> - </dl> -</div> - -<!-- ======================================================================= --> -<h3><a name="overvars">Override Variables</a></h3> -<div> - <p>Override variables can be used to override the default - values provided by the LLVM makefile system. These variables can be set in - several ways:</p> - <ul> - <li>In the environment (e.g. setenv, export) -- not recommended.</li> - <li>On the <tt>make</tt> command line -- recommended.</li> - <li>On the <tt>configure</tt> command line</li> - <li>In the Makefile (only <em>after</em> the inclusion of <a - href="#Makefile.common"><tt>$(LEVEL)/Makefile.common</tt></a>).</li> - </ul> - <p>The override variables are given below:</p> - <dl> - <dt><a name="AR"><tt>AR</tt></a> <small>(defaulted)</small></dt> - <dd>Specifies the path to the <tt>ar</tt> tool.</dd> - <dt><a name="PROJ_OBJ_DIR"><tt>PROJ_OBJ_DIR</tt></a></dt> - <dd>The directory into which the products of build rules will be placed. 
- This might be the same as - <a href="#PROJ_SRC_DIR"><tt>PROJ_SRC_DIR</tt></a> but typically is - not.</dd> - <dt><a name="PROJ_SRC_DIR"><tt>PROJ_SRC_DIR</tt></a></dt> - <dd>The directory which contains the source files to be built.</dd> - <dt><a name="BUILD_EXAMPLES"><tt>BUILD_EXAMPLES</tt></a></dt> - <dd>If set to 1, build examples in <tt>examples</tt> and (if building - Clang) <tt>tools/clang/examples</tt> directories.</dd> - <dt><a name="BZIP2"><tt>BZIP2</tt></a><small>(configured)</small></dt> - <dd>The path to the <tt>bzip2</tt> tool.</dd> - <dt><a name="CC"><tt>CC</tt></a><small>(configured)</small></dt> - <dd>The path to the 'C' compiler.</dd> - <dt><a name="CFLAGS"><tt>CFLAGS</tt></a></dt> - <dd>Additional flags to be passed to the 'C' compiler.</dd> - <dt><a name="CXX"><tt>CXX</tt></a></dt> - <dd>Specifies the path to the C++ compiler.</dd> - <dt><a name="CXXFLAGS"><tt>CXXFLAGS</tt></a></dt> - <dd>Additional flags to be passed to the C++ compiler.</dd> - <dt><a name="DATE"><tt>DATE<small>(configured)</small></tt></a></dt> - <dd>Specifies the path to the <tt>date</tt> program or any program that can - generate the current date and time on its standard output</dd> - <dt><a name="DOT"><tt>DOT</tt></a><small>(configured)</small></dt> - <dd>Specifies the path to the <tt>dot</tt> tool or <tt>false</tt> if there - isn't one.</dd> - <dt><a name="ECHO"><tt>ECHO</tt></a><small>(configured)</small></dt> - <dd>Specifies the path to the <tt>echo</tt> tool for printing output.</dd> - <dt><a name="EXEEXT"><tt>EXEEXT</tt></a><small>(configured)</small></dt> - <dd>Provides the extension to be used on executables built by the makefiles. - The value may be empty on platforms that do not use file extensions for - executables (e.g. Unix).</dd> - <dt><a name="INSTALL"><tt>INSTALL</tt></a><small>(configured)</small></dt> - <dd>Specifies the path to the <tt>install</tt> tool.</dd> - <dt><a name="LDFLAGS"><tt>LDFLAGS</tt></a><small>(configured)</small></dt> - <dd>Allows users to specify additional flags to pass to the linker.</dd> - <dt><a name="LIBS"><tt>LIBS</tt></a><small>(configured)</small></dt> - <dd>The list of libraries that should be linked with each tool.</dd> - <dt><a name="LIBTOOL"><tt>LIBTOOL</tt></a><small>(configured)</small></dt> - <dd>Specifies the path to the <tt>libtool</tt> tool. 
This tool is renamed - <tt>mklib</tt> by the <tt>configure</tt> script and always located in the - <dt><a name="LLVMAS"><tt>LLVMAS</tt></a><small>(defaulted)</small></dt> - <dd>Specifies the path to the <tt>llvm-as</tt> tool.</dd> - <dt><a name="LLVMCC"><tt>LLVMCC</tt></a></dt> - <dd>Specifies the path to the LLVM capable compiler.</dd> - <dt><a name="LLVMCXX"><tt>LLVMCXX</tt></a></dt> - <dd>Specifies the path to the LLVM C++ capable compiler.</dd> - <dt><a name="LLVMGCC"><tt>LLVMGCC</tt></a><small>(defaulted)</small></dt> - <dd>Specifies the path to the LLVM version of the GCC 'C' Compiler</dd> - <dt><a name="LLVMGXX"><tt>LLVMGXX</tt></a><small>(defaulted)</small></dt> - <dd>Specifies the path to the LLVM version of the GCC C++ Compiler</dd> - <dt><a name="LLVMLD"><tt>LLVMLD</tt></a><small>(defaulted)</small></dt> - <dd>Specifies the path to the LLVM bitcode linker tool</dd> - <dt><a name="LLVM_OBJ_ROOT"><tt>LLVM_OBJ_ROOT</tt></a><small>(configured) - </small></dt> - <dd>Specifies the top directory into which the output of the build is - placed.</dd> - <dt><a name="LLVM_SRC_ROOT"><tt>LLVM_SRC_ROOT</tt></a><small>(configured) - </small></dt> - <dd>Specifies the top directory in which the sources are found.</dd> - <dt><a name="LLVM_TARBALL_NAME"><tt>LLVM_TARBALL_NAME</tt></a> - <small>(configured)</small></dt> - <dd>Specifies the name of the distribution tarball to create. This is - configured from the name of the project and its version number.</dd> - <dt><a name="MKDIR"><tt>MKDIR</tt></a><small>(defaulted)</small></dt> - <dd>Specifies the path to the <tt>mkdir</tt> tool that creates - directories.</dd> - <dt><a name="ONLY_TOOLS"><tt>ONLY_TOOLS</tt></a></dt> - <dd>If set, specifies the list of tools to build.</dd> - <dt><a name="PLATFORMSTRIPOPTS"><tt>PLATFORMSTRIPOPTS</tt></a></dt> - <dd>The options to provide to the linker to specify that a stripped (no - symbols) executable should be built.</dd> - <dt><a name="RANLIB"><tt>RANLIB</tt></a><small>(defaulted)</small></dt> - <dd>Specifies the path to the <tt>ranlib</tt> tool.</dd> - <dt><a name="RM"><tt>RM</tt></a><small>(defaulted)</small></dt> - <dd>Specifies the path to the <tt>rm</tt> tool.</dd> - <dt><a name="SED"><tt>SED</tt></a><small>(defaulted)</small></dt> - <dd>Specifies the path to the <tt>sed</tt> tool.</dd> - <dt><a name="SHLIBEXT"><tt>SHLIBEXT</tt></a><small>(configured)</small></dt> - <dd>Provides the filename extension to use for shared libraries.</dd> - <dt><a name="TBLGEN"><tt>TBLGEN</tt></a><small>(defaulted)</small></dt> - <dd>Specifies the path to the <tt>tblgen</tt> tool.</dd> - <dt><a name="TAR"><tt>TAR</tt></a><small>(defaulted)</small></dt> - <dd>Specifies the path to the <tt>tar</tt> tool.</dd> - <dt><a name="ZIP"><tt>ZIP</tt></a><small>(defaulted)</small></dt> - <dd>Specifies the path to the <tt>zip</tt> tool.</dd> - </dl> -</div> - -<!-- ======================================================================= --> -<h3><a name="getvars">Readable Variables</a></h3> -<div> - <p>Variables listed in the table below can be used by the user's Makefile but - should not be changed. Changing the value will generally cause the build to go - wrong, so don't do it.</p> - <dl> - <dt><a name="bindir"><tt>bindir</tt></a></dt> - <dd>The directory into which executables will ultimately be installed. 
This - value is derived from the <tt>--prefix</tt> option given to - <tt>configure</tt>.</dd> - <dt><a name="BuildMode"><tt>BuildMode</tt></a></dt> - <dd>The name of the type of build being performed: Debug, Release, or - Profile</dd> - <dt><a name="bitcode_libdir"><tt>bytecode_libdir</tt></a></dt> - <dd>The directory into which bitcode libraries will ultimately be - installed. This value is derived from the <tt>--prefix</tt> option given to - <tt>configure</tt>.</dd> - <dt><a name="ConfigureScriptFLAGS"><tt>ConfigureScriptFLAGS</tt></a></dt> - <dd>Additional flags given to the <tt>configure</tt> script when - reconfiguring.</dd> - <dt><a name="DistDir"><tt>DistDir</tt></a></dt> - <dd>The <em>current</em> directory for which a distribution copy is being - made.</dd> - <dt><a name="Echo"><tt>Echo</tt></a></dt> - <dd>The LLVM Makefile System output command. This provides the - <tt>llvm[n]</tt> prefix and starts with @ so the command itself is not - printed by <tt>make</tt>.</dd> - <dt><a name="EchoCmd"><tt>EchoCmd</tt></a></dt> - <dd> Same as <a href="#Echo"><tt>Echo</tt></a> but without the leading @. - </dd> - <dt><a name="includedir"><tt>includedir</tt></a></dt> - <dd>The directory into which include files will ultimately be installed. - This value is derived from the <tt>--prefix</tt> option given to - <tt>configure</tt>.</dd> - <dt><a name="libdir"><tt>libdir</tt></a></dt><dd></dd> - <dd>The directory into which native libraries will ultimately be installed. - This value is derived from the <tt>--prefix</tt> option given to - <tt>configure</tt>.</dd> - <dt><a name="LibDir"><tt>LibDir</tt></a></dt> - <dd>The configuration specific directory into which libraries are placed - before installation.</dd> - <dt><a name="MakefileConfig"><tt>MakefileConfig</tt></a></dt> - <dd>Full path of the <tt>Makefile.config</tt> file.</dd> - <dt><a name="MakefileConfigIn"><tt>MakefileConfigIn</tt></a></dt> - <dd>Full path of the <tt>Makefile.config.in</tt> file.</dd> - <dt><a name="ObjDir"><tt>ObjDir</tt></a></dt> - <dd>The configuration and directory specific directory where build objects - (compilation results) are placed.</dd> - <dt><a name="SubDirs"><tt>SubDirs</tt></a></dt> - <dd>The complete list of sub-directories of the current directory as - specified by other variables.</dd> - <dt><a name="Sources"><tt>Sources</tt></a></dt> - <dd>The complete list of source files.</dd> - <dt><a name="sysconfdir"><tt>sysconfdir</tt></a></dt> - <dd>The directory into which configuration files will ultimately be - installed. This value is derived from the <tt>--prefix</tt> option given to - <tt>configure</tt>.</dd> - <dt><a name="ToolDir"><tt>ToolDir</tt></a></dt> - <dd>The configuration specific directory into which executables are placed - before they are installed.</dd> - <dt><a name="TopDistDir"><tt>TopDistDir</tt></a></dt> - <dd>The top most directory into which the distribution files are copied. - </dd> - <dt><a name="Verb"><tt>Verb</tt></a></dt> - <dd>Use this as the first thing on your build script lines to enable or - disable verbose mode. It expands to either an @ (quiet mode) or nothing - (verbose mode). </dd> - </dl> -</div> - -<!-- ======================================================================= --> -<h3><a name="intvars">Internal Variables</a></h3> -<div> - <p>Variables listed below are used by the LLVM Makefile System - and considered internal. 
You should not use these variables under any - circumstances.</p> - <p><tt> - Archive - AR.Flags - BaseNameSources - BCCompile.C - BCCompile.CXX - BCLinkLib - C.Flags - Compile.C - CompileCommonOpts - Compile.CXX - ConfigStatusScript - ConfigureScript - CPP.Flags - CPP.Flags - CXX.Flags - DependFiles - DestArchiveLib - DestBitcodeLib - DestModule - DestSharedLib - DestTool - DistAlways - DistCheckDir - DistCheckTop - DistFiles - DistName - DistOther - DistSources - DistSubDirs - DistTarBZ2 - DistTarGZip - DistZip - ExtraLibs - FakeSources - INCFiles - InternalTargets - LD.Flags - LibName.A - LibName.BC - LibName.LA - LibName.O - LibTool.Flags - Link - LinkModule - LLVMLibDir - LLVMLibsOptions - LLVMLibsPaths - LLVMToolDir - LLVMUsedLibs - LocalTargets - Module - ObjectsBC - ObjectsLO - ObjectsO - ObjMakefiles - ParallelTargets - PreConditions - ProjLibsOptions - ProjLibsPaths - ProjUsedLibs - Ranlib - RecursiveTargets - SrcMakefiles - Strip - StripWarnMsg - TableGen - TDFiles - ToolBuildPath - TopLevelTargets - UserTargets - </tt></p> -</div> - -</div> - -<!-- *********************************************************************** --> -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:rspencer@x10sys.com">Reid Spencer</a><br> - <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ -</address> -</body> -</html> diff --git a/docs/MakefileGuide.rst b/docs/MakefileGuide.rst new file mode 100644 index 0000000..d2bdd24 --- /dev/null +++ b/docs/MakefileGuide.rst @@ -0,0 +1,956 @@ +.. _makefile_guide: + +=================== +LLVM Makefile Guide +=================== + +.. contents:: + :local: + +Introduction +============ + +This document provides *usage* information about the LLVM makefile system. While +loosely patterned after the BSD makefile system, LLVM has taken a departure from +BSD in order to implement additional features needed by LLVM. Although makefile +systems, such as ``automake``, were attempted at one point, it has become clear +that the features needed by LLVM and the ``Makefile`` norm are too great to use +a more limited tool. Consequently, LLVM requires simply GNU Make 3.79, a widely +portable makefile processor. LLVM unabashedly makes heavy use of the features of +GNU Make so the dependency on GNU Make is firm. If you're not familiar with +``make``, it is recommended that you read the `GNU Makefile Manual +<http://www.gnu.org/software/make/manual/make.html>`_. + +While this document is rightly part of the `LLVM Programmer's +Manual <ProgrammersManual.html>`_, it is treated separately here because of the +volume of content and because it is often an early source of bewilderment for +new developers. + +General Concepts +================ + +The LLVM Makefile System is the component of LLVM that is responsible for +building the software, testing it, generating distributions, checking those +distributions, installing and uninstalling, etc. It consists of a several files +throughout the source tree. These files and other general concepts are described +in this section. + +Projects +-------- + +The LLVM Makefile System is quite generous. It not only builds its own software, +but it can build yours too. 
Built into the system is knowledge of the +``llvm/projects`` directory. Any directory under ``projects`` that has both a +``configure`` script and a ``Makefile`` is assumed to be a project that uses the +LLVM Makefile system. Building software that uses LLVM does not require the +LLVM Makefile System nor even placement in the ``llvm/projects`` +directory. However, doing so will allow your project to get up and running +quickly by utilizing the built-in features that are used to compile LLVM. LLVM +compiles itself using the same features of the makefile system as used for +projects. + +For complete details on setting up your projects configuration, simply mimic the +``llvm/projects/sample`` project. Or for further details, consult the +`Projects <Projects.html>`_ page. + +Variable Values +--------------- + +To use the makefile system, you simply create a file named ``Makefile`` in your +directory and declare values for certain variables. The variables and values +that you select determine what the makefile system will do. These variables +enable rules and processing in the makefile system that automatically Do The +Right Thing™. + +Including Makefiles +------------------- + +Setting variables alone is not enough. You must include into your Makefile +additional files that provide the rules of the LLVM Makefile system. The various +files involved are described in the sections that follow. + +``Makefile`` +^^^^^^^^^^^^ + +Each directory to participate in the build needs to have a file named +``Makefile``. This is the file first read by ``make``. It has three +sections: + +#. Settable Variables --- Required that must be set first. +#. ``include $(LEVEL)/Makefile.common`` --- include the LLVM Makefile system. +#. Override Variables --- Override variables set by the LLVM Makefile system. + +.. _$(LEVEL)/Makefile.common: + +``Makefile.common`` +^^^^^^^^^^^^^^^^^^^ + +Every project must have a ``Makefile.common`` file at its top source +directory. This file serves three purposes: + +#. It includes the project's configuration makefile to obtain values determined + by the ``configure`` script. This is done by including the + `$(LEVEL)/Makefile.config`_ file. + +#. It specifies any other (static) values that are needed throughout the + project. Only values that are used in all or a large proportion of the + project's directories should be placed here. + +#. It includes the standard rules for the LLVM Makefile system, + `$(LLVM_SRC_ROOT)/Makefile.rules`_. This file is the *guts* of the LLVM + ``Makefile`` system. + +.. _$(LEVEL)/Makefile.config: + +``Makefile.config`` +^^^^^^^^^^^^^^^^^^^ + +Every project must have a ``Makefile.config`` at the top of its *build* +directory. This file is **generated** by the ``configure`` script from the +pattern provided by the ``Makefile.config.in`` file located at the top of the +project's *source* directory. The contents of this file depend largely on what +configuration items the project uses, however most projects can get what they +need by just relying on LLVM's configuration found in +``$(LLVM_OBJ_ROOT)/Makefile.config``. + +.. _$(LLVM_SRC_ROOT)/Makefile.rules: + +``Makefile.rules`` +^^^^^^^^^^^^^^^^^^ + +This file, located at ``$(LLVM_SRC_ROOT)/Makefile.rules`` is the heart of the +LLVM Makefile System. It provides all the logic, dependencies, and rules for +building the targets supported by the system. What it does largely depends on +the values of ``make`` `variables`_ that have been set *before* +``Makefile.rules`` is included. 
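+
+Putting these pieces together, a minimal directory ``Makefile`` follows the
+three-section layout described above. The sketch below is illustrative only;
+the library name ``mylib``, the two-level directory depth, and the extra
+warning flag are hypothetical values, not taken from any particular project:
+
+.. code-block:: makefile
+
+   # Settable variables: these must come before the include.
+   LEVEL = ../..
+   LIBRARYNAME = mylib
+
+   # Include the LLVM Makefile system.
+   include $(LEVEL)/Makefile.common
+
+   # Override variables: adjust values set by the LLVM Makefile system.
+   CXXFLAGS += -Wextra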
+ +Comments +^^^^^^^^ + +User ``Makefile``\s need not have comments in them unless the construction is +unusual or it does not strictly follow the rules and patterns of the LLVM +makefile system. Makefile comments are invoked with the pound (``#``) character. +The ``#`` character and any text following it, to the end of the line, are +ignored by ``make``. + +Tutorial +======== + +This section provides some examples of the different kinds of modules you can +build with the LLVM makefile system. In general, each directory you provide will +build a single object although that object may be composed of additionally +compiled components. + +Libraries +--------- + +Only a few variable definitions are needed to build a regular library. +Normally, the makefile system will build all the software into a single +``libname.o`` (pre-linked) object. This means the library is not searchable and +that the distinction between compilation units has been dissolved. Optionally, +you can ask for a shared library (.so) or archive library (.a) to be built. Archive +libraries are the default. For example: + +.. code-block:: makefile + + LIBRARYNAME = mylib + SHARED_LIBRARY = 1 + ARCHIVE_LIBRARY = 1 + +says to build a library named ``mylib`` with both a shared library +(``mylib.so``) and an archive library (``mylib.a``) version. The contents of all +the libraries produced will be the same; they are just constructed differently. +Note that you normally do not need to specify the sources involved. The LLVM +Makefile system will infer the source files from the contents of the source +directory. + +The ``LOADABLE_MODULE=1`` directive can be used in conjunction with +``SHARED_LIBRARY=1`` to indicate that the resulting shared library should be +openable with the ``dlopen`` function and searchable with the ``dlsym`` function +(or your operating system's equivalents). While this isn't strictly necessary on +Linux and a few other platforms, it is required on systems like HP-UX and +Darwin. You should use ``LOADABLE_MODULE`` for any shared library that you +intend to be loaded into a tool via the ``-load`` option. See the +`WritingAnLLVMPass.html <WritingAnLLVMPass.html#makefile>`_ document for an +example of why you might want to do this. + +Bitcode Modules +^^^^^^^^^^^^^^^ + +In some situations, it is desirable to build a single bitcode module from a +variety of sources, instead of an archive, shared library, or bitcode +library. Bitcode modules can be specified in addition to any of the other types +of libraries by defining the `MODULE_NAME`_ variable. For example: + +.. code-block:: makefile + + LIBRARYNAME = mylib + BYTECODE_LIBRARY = 1 + MODULE_NAME = mymod + +will build a module named ``mymod.bc`` from the sources in the directory. This +module will be an aggregation of all the bitcode modules derived from the +sources. The example will also build a bitcode archive containing a bitcode +module for each compiled source file. The difference is subtle, but important +depending on how the module or library is to be linked. + +Loadable Modules +^^^^^^^^^^^^^^^^ + +In some situations, you need to create a loadable module. Loadable modules can +be loaded into programs like ``opt`` or ``llc`` to specify additional passes to +run or targets to support. Loadable modules are also useful for debugging a +pass or providing a pass with another package if that pass can't be included in +LLVM. + +LLVM provides complete support for building such a module. All you need to do is +use the ``LOADABLE_MODULE`` variable in your ``Makefile``.
For example, to build +a loadable module named ``MyMod`` that uses the LLVM libraries ``LLVMSupport.a`` +and ``LLVMSystem.a``, you would specify: + +.. code-block:: makefile + + LIBRARYNAME := MyMod + LOADABLE_MODULE := 1 + LINK_COMPONENTS := support system + +Use of the ``LOADABLE_MODULE`` facility implies several things: + +#. There will be no "``lib``" prefix on the module. This differentiates it from + a standard shared library of the same name. + +#. The `SHARED_LIBRARY`_ variable is turned on. + +#. The `LINK_LIBS_IN_SHARED`_ variable is turned on. + +A loadable module is loaded by LLVM via the facilities of libtool's libltdl +library which is part of the ``lib/System`` implementation. + +Tools +----- + +For building executable programs (tools), you must provide the name of the tool +and the names of the libraries you wish to link with the tool. For example: + +.. code-block:: makefile + + TOOLNAME = mytool + USEDLIBS = mylib + LINK_COMPONENTS = support system + +says that we are to build a tool named ``mytool`` and that it requires three +libraries: ``mylib``, ``LLVMSupport.a`` and ``LLVMSystem.a``. + +Note that two different variables are used to indicate which libraries are +linked: ``USEDLIBS`` and ``LLVMLIBS``. This distinction is necessary to support +projects. ``LLVMLIBS`` refers to the LLVM libraries found in the LLVM object +directory. ``USEDLIBS`` refers to the libraries built by your project. In the +case of building LLVM tools, ``USEDLIBS`` and ``LLVMLIBS`` can be used +interchangeably since the "project" is LLVM itself and ``USEDLIBS`` refers to +the same place as ``LLVMLIBS``. + +Also note that there are two different ways of specifying a library: with a +``.a`` suffix and without. Without the suffix, the entry refers to the re-linked +(.o) file which will include *all* symbols of the library. This is +useful, for example, to include all passes from a library of passes. If the +``.a`` suffix is used then the library is linked as a searchable library (with +the ``-l`` option). In this case, only the symbols that are unresolved *at +that point* will be resolved from the library, if they exist. Other +(unreferenced) symbols will not be included when the ``.a`` syntax is used. Note +that in order to use the ``.a`` suffix, the library in question must have been +built with the ``ARCHIVE_LIBRARY`` option set. + +JIT Tools +^^^^^^^^^ + +Many tools will want to use the JIT features of LLVM. To do this, you simply +specify that you want an execution 'engine', and the makefiles will +automatically link in the appropriate JIT for the host or an interpreter if none +is available: + +.. code-block:: makefile + + TOOLNAME = my_jit_tool + USEDLIBS = mylib + LINK_COMPONENTS = engine + +Of course, any additional libraries may be listed as other components. To get a +full understanding of how this changes the linker command, it is recommended +that you: + +.. code-block:: bash + + % cd examples/Fibonacci + % make VERBOSE=1 + +Targets Supported +================= + +This section describes each of the targets that can be built using the LLVM +Makefile system. Any target can be invoked from any directory but not all are +applicable to a given directory (e.g. "check", "dist" and "install" will always +operate as if invoked from the top level directory). + +================= =============== ================== +Target Name Implied Targets Target Description +================= =============== ================== +``all`` \ Compile the software recursively. Default target.
+``all-local`` \ Compile the software in the local directory only. +``check`` \ Change to the ``test`` directory in a project and run the test suite there. +``check-local`` \ Run a local test suite. Generally this is only defined in the ``Makefile`` of the project's ``test`` directory. +``clean`` \ Remove built objects recursively. +``clean-local`` \ Remove built objects from the local directory only. +``dist`` ``all`` Prepare a source distribution tarball. +``dist-check`` ``all`` Prepare a source distribution tarball and check that it builds. +``dist-clean`` ``clean`` Clean source distribution tarball temporary files. +``install`` ``all`` Copy built objects to installation directory. +``preconditions`` ``all`` Check to make sure configuration and makefiles are up to date. +``printvars`` ``all`` Prints variables defined by the makefile system (for debugging). +``tags`` \ Make C and C++ tags files for emacs and vi. +``uninstall`` \ Remove built objects from installation directory. +================= =============== ================== + +.. _all: + +``all`` (default) +----------------- + +When you invoke ``make`` with no arguments, you are implicitly instructing it to +seek the ``all`` target (goal). This target is used for building the software +recursively and will do different things in different directories. For example, +in a ``lib`` directory, the ``all`` target will compile source files and +generate libraries. But, in a ``tools`` directory, it will link libraries and +generate executables. + +``all-local`` +------------- + +This target is the same as `all`_ but it operates only on the current directory +instead of recursively. + +``check`` +--------- + +This target can be invoked from anywhere within a project's directories but +always invokes the `check-local`_ target in the project's ``test`` directory, if +it exists and has a ``Makefile``. A warning is produced otherwise. If +`TESTSUITE`_ is defined on the ``make`` command line, it will be passed down to +the invocation of ``make check-local`` in the ``test`` directory. The intended +usage for this is to assist in running specific suites of tests. If +``TESTSUITE`` is not set, the implementation of ``check-local`` should run all +normal tests. It is up to the project to define what different values for +``TESTSUITE`` will do. See the `Testing Guide <TestingGuide.html>`_ for further +details. + +``check-local`` +--------------- + +This target should be implemented by the ``Makefile`` in the project's ``test`` +directory. It is invoked by the ``check`` target elsewhere. Each project is +free to define the actions of ``check-local`` as appropriate for that +project. The LLVM project itself uses dejagnu to run a suite of feature and +regression tests. Other projects may choose to use dejagnu or any other testing +mechanism. + +``clean`` +--------- + +This target cleans the build directory, recursively removing all things that the +Makefile builds. The cleaning rules have been made guarded so they shouldn't go +awry (via ``rm -f $(UNSET_VARIABLE)/*``, which would attempt to erase the entire +directory structure). + +``clean-local`` +--------------- + +This target does the same thing as ``clean`` but only for the current (local) +directory. + +``dist`` +-------- + +This target builds a distribution tarball. It first builds the entire project +using the ``all`` target and then tars up the necessary files and compresses +it.
The generated tarball is sufficient for a casual source distribution, but
+probably not for a release (see ``dist-check``).
+
+``dist-check``
+--------------
+
+This target does the same thing as the ``dist`` target but also checks the
+distribution tarball. The check is made by unpacking the tarball to a new
+directory, configuring it, building it, installing it, and then verifying that
+the installation results are correct (by comparing to the original build). This
+target can take a long time to run but should be done before a release goes out
+to make sure that the distributed tarball can actually be built into a working
+release.
+
+``dist-clean``
+--------------
+
+This is a special form of the ``clean`` target. It performs a normal ``clean``
+but also removes things pertaining to building the distribution.
+
+``install``
+-----------
+
+This target finalizes shared objects and executables and copies all libraries,
+headers, executables and documentation to the directory given with the
+``--prefix`` option to ``configure``. When completed, the prefix directory will
+have everything needed to **use** LLVM.
+
+The LLVM makefiles can generate complete **internal** documentation for all the
+classes by using ``doxygen``. By default, this feature is **not** enabled
+because it takes a long time and generates a massive amount of data (>100MB). If
+you want this feature, you must configure LLVM with the ``--enable-doxygen``
+switch and ensure that a modern version of doxygen (1.3.7 or later) is available
+in your ``PATH``. You can download doxygen from `here
+<http://www.stack.nl/~dimitri/doxygen/download.html#latestsrc>`_.
+
+``preconditions``
+-----------------
+
+This utility target checks to see if the ``Makefile`` in the object directory is
+older than the ``Makefile`` in the source directory and copies it if so. It also
+reruns the ``configure`` script if that needs to be done and rebuilds the
+``Makefile.config`` file similarly. Users may overload this target to ensure
+that sanity checks are run *before* any building of targets as all the targets
+depend on ``preconditions``.
+
+``printvars``
+-------------
+
+This utility target just causes the LLVM makefiles to print out some of the
+makefile variables so that you can double check how things are set.
+
+``reconfigure``
+---------------
+
+This utility target will force a reconfigure of LLVM or your project. It simply
+runs ``$(PROJ_OBJ_ROOT)/config.status --recheck`` to rerun the configuration
+tests and rebuild the configured files. This isn't generally useful as the
+makefiles will reconfigure themselves whenever it's necessary.
+
+``spotless``
+------------
+
+.. warning::
+
+  Use with caution!
+
+This utility target, only available when ``$(PROJ_OBJ_ROOT)`` is not the same as
+``$(PROJ_SRC_ROOT)``, will completely clean the ``$(PROJ_OBJ_ROOT)`` directory
+by removing its content entirely and reconfiguring the directory. This returns
+the ``$(PROJ_OBJ_ROOT)`` directory to a completely fresh state. All content in
+the directory except configured files and top-level makefiles will be lost.
+
+``tags``
+--------
+
+This target will generate a ``TAGS`` file in the top-level source directory. It
+is meant for use with emacs, XEmacs, or ViM. The TAGS file provides an index of
+symbol definitions so that the editor can jump you to the definition
+quickly.
+
+``uninstall``
+-------------
+
+This target is the opposite of the ``install`` target. It removes the header,
+library and executable files from the installation directories.
Note that the +directories themselves are not removed because it is not guaranteed that LLVM is +the only thing installing there (e.g. ``--prefix=/usr``). + +.. _variables: + +Variables +========= + +Variables are used to tell the LLVM Makefile System what to do and to obtain +information from it. Variables are also used internally by the LLVM Makefile +System. Variable names that contain only the upper case alphabetic letters and +underscore are intended for use by the end user. All other variables are +internal to the LLVM Makefile System and should not be relied upon nor +modified. The sections below describe how to use the LLVM Makefile +variables. + +Control Variables +----------------- + +Variables listed in the table below should be set *before* the inclusion of +`$(LEVEL)/Makefile.common`_. These variables provide input to the LLVM make +system that tell it what to do for the current directory. + +``BUILD_ARCHIVE`` + If set to any value, causes an archive (.a) library to be built. + +``BUILT_SOURCES`` + Specifies a set of source files that are generated from other source + files. These sources will be built before any other target processing to + ensure they are present. + +``BYTECODE_LIBRARY`` + If set to any value, causes a bitcode library (.bc) to be built. + +``CONFIG_FILES`` + Specifies a set of configuration files to be installed. + +``DEBUG_SYMBOLS`` + If set to any value, causes the build to include debugging symbols even in + optimized objects, libraries and executables. This alters the flags + specified to the compilers and linkers. Debugging isn't fun in an optimized + build, but it is possible. + +``DIRS`` + Specifies a set of directories, usually children of the current directory, + that should also be made using the same goal. These directories will be + built serially. + +``DISABLE_AUTO_DEPENDENCIES`` + If set to any value, causes the makefiles to **not** automatically generate + dependencies when running the compiler. Use of this feature is discouraged + and it may be removed at a later date. + +``ENABLE_OPTIMIZED`` + If set to 1, causes the build to generate optimized objects, libraries and + executables. This alters the flags specified to the compilers and + linkers. Generally debugging won't be a fun experience with an optimized + build. + +``ENABLE_PROFILING`` + If set to 1, causes the build to generate both optimized and profiled + objects, libraries and executables. This alters the flags specified to the + compilers and linkers to ensure that profile data can be collected from the + tools built. Use the ``gprof`` tool to analyze the output from the profiled + tools (``gmon.out``). + +``DISABLE_ASSERTIONS`` + If set to 1, causes the build to disable assertions, even if building a + debug or profile build. This will exclude all assertion check code from the + build. LLVM will execute faster, but with little help when things go + wrong. + +``EXPERIMENTAL_DIRS`` + Specify a set of directories that should be built, but if they fail, it + should not cause the build to fail. Note that this should only be used + temporarily while code is being written. + +``EXPORTED_SYMBOL_FILE`` + Specifies the name of a single file that contains a list of the symbols to + be exported by the linker. One symbol per line. + +``EXPORTED_SYMBOL_LIST`` + Specifies a set of symbols to be exported by the linker. + +``EXTRA_DIST`` + Specifies additional files that should be distributed with LLVM. 
All source + files, all built sources, all Makefiles, and most documentation files will + be automatically distributed. Use this variable to distribute any files that + are not automatically distributed. + +``KEEP_SYMBOLS`` + If set to any value, specifies that when linking executables the makefiles + should retain debug symbols in the executable. Normally, symbols are + stripped from the executable. + +``LEVEL`` (required) + Specify the level of nesting from the top level. This variable must be set + in each makefile as it is used to find the top level and thus the other + makefiles. + +``LIBRARYNAME`` + Specify the name of the library to be built. (Required For Libraries) + +``LINK_COMPONENTS`` + When specified for building a tool, the value of this variable will be + passed to the ``llvm-config`` tool to generate a link line for the + tool. Unlike ``USEDLIBS`` and ``LLVMLIBS``, not all libraries need to be + specified. The ``llvm-config`` tool will figure out the library dependencies + and add any libraries that are needed. The ``USEDLIBS`` variable can still + be used in conjunction with ``LINK_COMPONENTS`` so that additional + project-specific libraries can be linked with the LLVM libraries specified + by ``LINK_COMPONENTS``. + +.. _LINK_LIBS_IN_SHARED: + +``LINK_LIBS_IN_SHARED`` + By default, shared library linking will ignore any libraries specified with + the `LLVMLIBS`_ or `USEDLIBS`_. This prevents shared libs from including + things that will be in the LLVM tool the shared library will be loaded + into. However, sometimes it is useful to link certain libraries into your + shared library and this option enables that feature. + +.. _LLVMLIBS: + +``LLVMLIBS`` + Specifies the set of libraries from the LLVM ``$(ObjDir)`` that will be + linked into the tool or library. + +``LOADABLE_MODULE`` + If set to any value, causes the shared library being built to also be a + loadable module. Loadable modules can be opened with the dlopen() function + and searched with dlsym (or the operating system's equivalent). Note that + setting this variable without also setting ``SHARED_LIBRARY`` will have no + effect. + +.. _MODULE_NAME: + +``MODULE_NAME`` + Specifies the name of a bitcode module to be created. A bitcode module can + be specified in conjunction with other kinds of library builds or by + itself. It constructs from the sources a single linked bitcode file. + +``NO_INSTALL`` + Specifies that the build products of the directory should not be installed + but should be built even if the ``install`` target is given. This is handy + for directories that build libraries or tools that are only used as part of + the build process, such as code generators (e.g. ``tblgen``). + +``OPTIONAL_DIRS`` + Specify a set of directories that may be built, if they exist, but its not + an error for them not to exist. + +``PARALLEL_DIRS`` + Specify a set of directories to build recursively and in parallel if the + ``-j`` option was used with ``make``. + +.. _SHARED_LIBRARY: + +``SHARED_LIBRARY`` + If set to any value, causes a shared library (``.so``) to be built in + addition to any other kinds of libraries. Note that this option will cause + all source files to be built twice: once with options for position + independent code and once without. Use it only where you really need a + shared library. + +``SOURCES`` (optional) + Specifies the list of source files in the current directory to be + built. Source files of any type may be specified (programs, documentation, + config files, etc.). 
If not specified, the makefile system will infer the + set of source files from the files present in the current directory. + +``SUFFIXES`` + Specifies a set of filename suffixes that occur in suffix match rules. Only + set this if your local ``Makefile`` specifies additional suffix match + rules. + +``TARGET`` + Specifies the name of the LLVM code generation target that the current + directory builds. Setting this variable enables additional rules to build + ``.inc`` files from ``.td`` files. + +.. _TESTSUITE: + +``TESTSUITE`` + Specifies the directory of tests to run in ``llvm/test``. + +``TOOLNAME`` + Specifies the name of the tool that the current directory should build. + +``TOOL_VERBOSE`` + Implies ``VERBOSE`` and also tells each tool invoked to be verbose. This is + handy when you're trying to see the sub-tools invoked by each tool invoked + by the makefile. For example, this will pass ``-v`` to the GCC compilers + which causes it to print out the command lines it uses to invoke sub-tools + (compiler, assembler, linker). + +.. _USEDLIBS: + +``USEDLIBS`` + Specifies the list of project libraries that will be linked into the tool or + library. + +``VERBOSE`` + Tells the Makefile system to produce detailed output of what it is doing + instead of just summary comments. This will generate a LOT of output. + +Override Variables +------------------ + +Override variables can be used to override the default values provided by the +LLVM makefile system. These variables can be set in several ways: + +* In the environment (e.g. setenv, export) --- not recommended. +* On the ``make`` command line --- recommended. +* On the ``configure`` command line. +* In the Makefile (only *after* the inclusion of `$(LEVEL)/Makefile.common`_). + +The override variables are given below: + +``AR`` (defaulted) + Specifies the path to the ``ar`` tool. + +``PROJ_OBJ_DIR`` + The directory into which the products of build rules will be placed. This + might be the same as `PROJ_SRC_DIR`_ but typically is not. + +.. _PROJ_SRC_DIR: + +``PROJ_SRC_DIR`` + The directory which contains the source files to be built. + +``BUILD_EXAMPLES`` + If set to 1, build examples in ``examples`` and (if building Clang) + ``tools/clang/examples`` directories. + +``BZIP2`` (configured) + The path to the ``bzip2`` tool. + +``CC`` (configured) + The path to the 'C' compiler. + +``CFLAGS`` + Additional flags to be passed to the 'C' compiler. + +``CXX`` + Specifies the path to the C++ compiler. + +``CXXFLAGS`` + Additional flags to be passed to the C++ compiler. + +``DATE`` (configured) + Specifies the path to the ``date`` program or any program that can generate + the current date and time on its standard output. + +``DOT`` (configured) + Specifies the path to the ``dot`` tool or ``false`` if there isn't one. + +``ECHO`` (configured) + Specifies the path to the ``echo`` tool for printing output. + +``EXEEXT`` (configured) + Provides the extension to be used on executables built by the makefiles. + The value may be empty on platforms that do not use file extensions for + executables (e.g. Unix). + +``INSTALL`` (configured) + Specifies the path to the ``install`` tool. + +``LDFLAGS`` (configured) + Allows users to specify additional flags to pass to the linker. + +``LIBS`` (configured) + The list of libraries that should be linked with each tool. + +``LIBTOOL`` (configured) + Specifies the path to the ``libtool`` tool. This tool is renamed ``mklib`` + by the ``configure`` script. 
+ +``LLVMAS`` (defaulted) + Specifies the path to the ``llvm-as`` tool. + +``LLVMCC`` + Specifies the path to the LLVM capable compiler. + +``LLVMCXX`` + Specifies the path to the LLVM C++ capable compiler. + +``LLVMGCC`` (defaulted) + Specifies the path to the LLVM version of the GCC 'C' Compiler. + +``LLVMGXX`` (defaulted) + Specifies the path to the LLVM version of the GCC C++ Compiler. + +``LLVMLD`` (defaulted) + Specifies the path to the LLVM bitcode linker tool + +``LLVM_OBJ_ROOT`` (configured) + Specifies the top directory into which the output of the build is placed. + +``LLVM_SRC_ROOT`` (configured) + Specifies the top directory in which the sources are found. + +``LLVM_TARBALL_NAME`` (configured) + Specifies the name of the distribution tarball to create. This is configured + from the name of the project and its version number. + +``MKDIR`` (defaulted) + Specifies the path to the ``mkdir`` tool that creates directories. + +``ONLY_TOOLS`` + If set, specifies the list of tools to build. + +``PLATFORMSTRIPOPTS`` + The options to provide to the linker to specify that a stripped (no symbols) + executable should be built. + +``RANLIB`` (defaulted) + Specifies the path to the ``ranlib`` tool. + +``RM`` (defaulted) + Specifies the path to the ``rm`` tool. + +``SED`` (defaulted) + Specifies the path to the ``sed`` tool. + +``SHLIBEXT`` (configured) + Provides the filename extension to use for shared libraries. + +``TBLGEN`` (defaulted) + Specifies the path to the ``tblgen`` tool. + +``TAR`` (defaulted) + Specifies the path to the ``tar`` tool. + +``ZIP`` (defaulted) + Specifies the path to the ``zip`` tool. + +Readable Variables +------------------ + +Variables listed in the table below can be used by the user's Makefile but +should not be changed. Changing the value will generally cause the build to go +wrong, so don't do it. + +``bindir`` + The directory into which executables will ultimately be installed. This + value is derived from the ``--prefix`` option given to ``configure``. + +``BuildMode`` + The name of the type of build being performed: Debug, Release, or + Profile. + +``bytecode_libdir`` + The directory into which bitcode libraries will ultimately be installed. + This value is derived from the ``--prefix`` option given to ``configure``. + +``ConfigureScriptFLAGS`` + Additional flags given to the ``configure`` script when reconfiguring. + +``DistDir`` + The *current* directory for which a distribution copy is being made. + +.. _Echo: + +``Echo`` + The LLVM Makefile System output command. This provides the ``llvm[n]`` + prefix and starts with ``@`` so the command itself is not printed by + ``make``. + +``EchoCmd`` + Same as `Echo`_ but without the leading ``@``. + +``includedir`` + The directory into which include files will ultimately be installed. This + value is derived from the ``--prefix`` option given to ``configure``. + +``libdir`` + The directory into which native libraries will ultimately be installed. + This value is derived from the ``--prefix`` option given to + ``configure``. + +``LibDir`` + The configuration specific directory into which libraries are placed before + installation. + +``MakefileConfig`` + Full path of the ``Makefile.config`` file. + +``MakefileConfigIn`` + Full path of the ``Makefile.config.in`` file. + +``ObjDir`` + The configuration and directory specific directory where build objects + (compilation results) are placed. + +``SubDirs`` + The complete list of sub-directories of the current directory as + specified by other variables. 
+ +``Sources`` + The complete list of source files. + +``sysconfdir`` + The directory into which configuration files will ultimately be + installed. This value is derived from the ``--prefix`` option given to + ``configure``. + +``ToolDir`` + The configuration specific directory into which executables are placed + before they are installed. + +``TopDistDir`` + The top most directory into which the distribution files are copied. + +``Verb`` + Use this as the first thing on your build script lines to enable or disable + verbose mode. It expands to either an ``@`` (quiet mode) or nothing (verbose + mode). + +Internal Variables +------------------ + +Variables listed below are used by the LLVM Makefile System and considered +internal. You should not use these variables under any circumstances. + +.. code-block:: makefile + + Archive + AR.Flags + BaseNameSources + BCCompile.C + BCCompile.CXX + BCLinkLib + C.Flags + Compile.C + CompileCommonOpts + Compile.CXX + ConfigStatusScript + ConfigureScript + CPP.Flags + CPP.Flags + CXX.Flags + DependFiles + DestArchiveLib + DestBitcodeLib + DestModule + DestSharedLib + DestTool + DistAlways + DistCheckDir + DistCheckTop + DistFiles + DistName + DistOther + DistSources + DistSubDirs + DistTarBZ2 + DistTarGZip + DistZip + ExtraLibs + FakeSources + INCFiles + InternalTargets + LD.Flags + LibName.A + LibName.BC + LibName.LA + LibName.O + LibTool.Flags + Link + LinkModule + LLVMLibDir + LLVMLibsOptions + LLVMLibsPaths + LLVMToolDir + LLVMUsedLibs + LocalTargets + Module + ObjectsBC + ObjectsLO + ObjectsO + ObjMakefiles + ParallelTargets + PreConditions + ProjLibsOptions + ProjLibsPaths + ProjUsedLibs + Ranlib + RecursiveTargets + SrcMakefiles + Strip + StripWarnMsg + TableGen + TDFiles + ToolBuildPath + TopLevelTargets + UserTargets diff --git a/docs/Packaging.html b/docs/Packaging.html deleted file mode 100644 index ac4dcf0..0000000 --- a/docs/Packaging.html +++ /dev/null @@ -1,119 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>Advice on Packaging LLVM</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> - -<h1>Advice on Packaging LLVM</h1> -<ol> - <li><a href="#overview">Overview</a></li> - <li><a href="#compilation">Compile Flags</a></li> - <li><a href="#cxx-features">C++ Features</a></li> - <li><a href="#shared-library">Shared Library</a></li> - <li><a href="#deps">Dependencies</a></li> -</ol> - -<!--=========================================================================--> -<h2><a name="overview">Overview</a></h2> -<!--=========================================================================--> -<div> - -<p>LLVM sets certain default configure options to make sure our developers don't -break things for constrained platforms. These settings are not optimal for most -desktop systems, and we hope that packagers (e.g., Redhat, Debian, MacPorts, -etc.) will tweak them. This document lists settings we suggest you tweak. -</p> - -<p>LLVM's API changes with each release, so users are likely to want, for -example, both LLVM-2.6 and LLVM-2.7 installed at the same time to support apps -developed against each. 
-</p> -</div> - -<!--=========================================================================--> -<h2><a name="compilation">Compile Flags</a></h2> -<!--=========================================================================--> -<div> - -<p>LLVM runs much more quickly when it's optimized and assertions are removed. -However, such a build is currently incompatible with users who build without -defining NDEBUG, and the lack of assertions makes it hard to debug problems in -user code. We recommend allowing users to install both optimized and debug -versions of LLVM in parallel. The following configure flags are relevant: -</p> - -<dl> - <dt><tt>--disable-assertions</tt></dt><dd>Builds LLVM with <tt>NDEBUG</tt> - defined. Changes the LLVM ABI. Also available by setting - <tt>DISABLE_ASSERTIONS=0|1</tt> in <tt>make</tt>'s environment. This defaults - to enabled regardless of the optimization setting, but it slows things - down.</dd> - - <dt><tt>--enable-debug-symbols</tt></dt><dd>Builds LLVM with <tt>-g</tt>. - Also available by setting <tt>DEBUG_SYMBOLS=0|1</tt> in <tt>make</tt>'s - environment. This defaults to disabled when optimizing, so you should turn it - back on to let users debug their programs.</dd> - - <dt><tt>--enable-optimized</tt></dt><dd>(For svn checkouts) Builds LLVM with - <tt>-O2</tt> and, by default, turns off debug symbols. Also available by - setting <tt>ENABLE_OPTIMIZED=0|1</tt> in <tt>make</tt>'s environment. This - defaults to enabled when not in a checkout.</dd> -</dl> -</div> - -<!--=========================================================================--> -<h2><a name="cxx-features">C++ Features</a></h2> -<!--=========================================================================--> -<div> - -<dl> - <dt>RTTI</dt><dd>LLVM disables RTTI by default. Add <tt>REQUIRES_RTTI=1</tt> - to your environment while running <tt>make</tt> to re-enable it. This will - allow users to build with RTTI enabled and still inherit from LLVM - classes.</dd> -</dl> -</div> - -<!--=========================================================================--> -<h2><a name="shared-library">Shared Library</a></h2> -<!--=========================================================================--> -<div> - -<p>Configure with <tt>--enable-shared</tt> to build -<tt>libLLVM-<var>major</var>.<var>minor</var>.(so|dylib)</tt> and link the tools -against it. This saves lots of binary size at the cost of some startup time. 
-</p> -</div> - -<!--=========================================================================--> -<h2><a name="deps">Dependencies</a></h2> -<!--=========================================================================--> -<div> - -<dl> -<dt><tt>--enable-libffi</tt></dt><dd>Depend on <a -href="http://sources.redhat.com/libffi/">libffi</a> to allow the LLVM -interpreter to call external functions.</dd> -<dt><tt>--with-oprofile</tt></dt><dd>Depend on <a -href="http://oprofile.sourceforge.net/doc/devel/index.html">libopagent</a> -(>=version 0.9.4) to let the LLVM JIT tell oprofile about function addresses and -line numbers.</dd> -</dl> -</div> - -<!-- *********************************************************************** --> -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-31 12:21:59 +0100 (Mon, 31 Oct 2011) $ -</address> -</body> -</html> diff --git a/docs/Packaging.rst b/docs/Packaging.rst new file mode 100644 index 0000000..6e74158 --- /dev/null +++ b/docs/Packaging.rst @@ -0,0 +1,75 @@ +.. _packaging: + +======================== +Advice on Packaging LLVM +======================== + +.. contents:: + :local: + +Overview +======== + +LLVM sets certain default configure options to make sure our developers don't +break things for constrained platforms. These settings are not optimal for most +desktop systems, and we hope that packagers (e.g., Redhat, Debian, MacPorts, +etc.) will tweak them. This document lists settings we suggest you tweak. + +LLVM's API changes with each release, so users are likely to want, for example, +both LLVM-2.6 and LLVM-2.7 installed at the same time to support apps developed +against each. + +Compile Flags +============= + +LLVM runs much more quickly when it's optimized and assertions are removed. +However, such a build is currently incompatible with users who build without +defining ``NDEBUG``, and the lack of assertions makes it hard to debug problems +in user code. We recommend allowing users to install both optimized and debug +versions of LLVM in parallel. The following configure flags are relevant: + +``--disable-assertions`` + Builds LLVM with ``NDEBUG`` defined. Changes the LLVM ABI. Also available + by setting ``DISABLE_ASSERTIONS=0|1`` in ``make``'s environment. This + defaults to enabled regardless of the optimization setting, but it slows + things down. + +``--enable-debug-symbols`` + Builds LLVM with ``-g``. Also available by setting ``DEBUG_SYMBOLS=0|1`` in + ``make``'s environment. This defaults to disabled when optimizing, so you + should turn it back on to let users debug their programs. + +``--enable-optimized`` + (For svn checkouts) Builds LLVM with ``-O2`` and, by default, turns off + debug symbols. Also available by setting ``ENABLE_OPTIMIZED=0|1`` in + ``make``'s environment. This defaults to enabled when not in a + checkout. + +C++ Features +============ + +RTTI + LLVM disables RTTI by default. Add ``REQUIRES_RTTI=1`` to your environment + while running ``make`` to re-enable it. This will allow users to build with + RTTI enabled and still inherit from LLVM classes. 
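+
+For example, a packager's optimized build might combine the flags above roughly
+as follows. This is only a sketch: the source path, build directory, and
+``--prefix`` value are placeholders chosen for illustration, not values
+required by LLVM.
+
+.. code-block:: bash
+
+   # Illustrative only: an optimized, assertion-free build with RTTI
+   # re-enabled; paths and prefix are placeholders.
+   mkdir build && cd build
+   ../llvm/configure --enable-optimized --disable-assertions --prefix=/usr
+   REQUIRES_RTTI=1 make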
+ +Shared Library +============== + +Configure with ``--enable-shared`` to build +``libLLVM-<major>.<minor>.(so|dylib)`` and link the tools against it. This +saves lots of binary size at the cost of some startup time. + +Dependencies +============ + +``--enable-libffi`` + Depend on `libffi <http://sources.redhat.com/libffi/>`_ to allow the LLVM + interpreter to call external functions. + +``--with-oprofile`` + + Depend on `libopagent + <http://oprofile.sourceforge.net/doc/devel/index.html>`_ (>=version 0.9.4) + to let the LLVM JIT tell oprofile about function addresses and line + numbers. diff --git a/docs/Passes.html b/docs/Passes.html index 37a304d..e8048d5 100644 --- a/docs/Passes.html +++ b/docs/Passes.html @@ -3,7 +3,7 @@ <html> <head> <title>LLVM's Analysis and Transform Passes</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> </head> <body> @@ -100,7 +100,6 @@ perl -e '$/ = undef; for (split(/\n/, <>)) { s:^ *///? ?::; print " <p>\n" if ! <tr><td><a href="#module-debuginfo">-module-debuginfo</a></td><td>Decodes module-level debug info</td></tr> <tr><td><a href="#no-aa">-no-aa</a></td><td>No Alias Analysis (always returns 'may' alias)</td></tr> <tr><td><a href="#no-profile">-no-profile</a></td><td>No Profile Information</td></tr> -<tr><td><a href="#postdomfrontier">-postdomfrontier</a></td><td>Post-Dominance Frontier Construction</td></tr> <tr><td><a href="#postdomtree">-postdomtree</a></td><td>Post-Dominator Tree Construction</td></tr> <tr><td><a href="#print-alias-sets">-print-alias-sets</a></td><td>Alias Set Printer</td></tr> <tr><td><a href="#print-callgraph">-print-callgraph</a></td><td>Print a call graph</td></tr> @@ -755,7 +754,7 @@ perl -e '$/ = undef; for (split(/\n/, <>)) { s:^ *///? ?::; print " <p>\n" if ! </h3> <div> <p>Provides other passes access to information on how the size and alignment - required by the the target ABI for various data types.</p> + required by the target ABI for various data types.</p> </div> </div> @@ -1617,7 +1616,7 @@ if (X < 3) {</pre> </h3> <div> <p> - This file demotes all registers to memory references. It is intented to be + This file demotes all registers to memory references. It is intended to be the inverse of <a href="#mem2reg"><tt>-mem2reg</tt></a>. 
By converting to <tt>load</tt> instructions, the only values live across basic blocks are <tt>alloca</tt> instructions and <tt>load</tt> instructions before @@ -1971,7 +1970,7 @@ if (X < 3) {</pre> <li>Verify that a function's argument list agrees with its declared type.</li> <li>It is illegal to specify a name for a void value.</li> - <li>It is illegal to have a internal global value with no initializer.</li> + <li>It is illegal to have an internal global value with no initializer.</li> <li>It is illegal to have a ret instruction that returns a value that does not agree with the function return value type.</li> <li>Function call argument types match the function prototype.</li> @@ -2060,7 +2059,7 @@ if (X < 3) {</pre> <a href="mailto:rspencer@x10sys.com">Reid Spencer</a><br> <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-02-01 04:51:43 +0100 (Wed, 01 Feb 2012) $ + Last modified: $Date: 2012-07-26 00:01:31 +0200 (Thu, 26 Jul 2012) $ </address> </body> diff --git a/docs/ProgrammersManual.html b/docs/ProgrammersManual.html index 625ef9a..5bf499b 100644 --- a/docs/ProgrammersManual.html +++ b/docs/ProgrammersManual.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-type" content="text/html;charset=UTF-8"> <title>LLVM Programmer's Manual</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -507,8 +507,9 @@ small and pervasive enough in LLVM that it should always be passed by value.</p> <div> -<p>The <tt>Twine</tt> class is an efficient way for APIs to accept concatenated -strings. For example, a common LLVM paradigm is to name one instruction based on +<p>The <tt><a href="/doxygen/classllvm_1_1Twine.html">Twine</a></tt> class is an +efficient way for APIs to accept concatenated strings. For example, a common +LLVM paradigm is to name one instruction based on the name of another instruction with a suffix, for example:</p> <div class="doc_code"> @@ -517,17 +518,17 @@ the name of another instruction with a suffix, for example:</p> </pre> </div> -<p>The <tt>Twine</tt> class is effectively a -lightweight <a href="http://en.wikipedia.org/wiki/Rope_(computer_science)">rope</a> +<p>The <tt>Twine</tt> class is effectively a lightweight +<a href="http://en.wikipedia.org/wiki/Rope_(computer_science)">rope</a> which points to temporary (stack allocated) objects. Twines can be implicitly constructed as the result of the plus operator applied to strings (i.e., a C -strings, an <tt>std::string</tt>, or a <tt>StringRef</tt>). The twine delays the -actual concatenation of strings until it is actually required, at which point -it can be efficiently rendered directly into a character array. This avoids -unnecessary heap allocation involved in constructing the temporary results of -string concatenation. See -"<tt><a href="/doxygen/classllvm_1_1Twine_8h-source.html">llvm/ADT/Twine.h</a></tt>" -for more information.</p> +strings, an <tt>std::string</tt>, or a <tt>StringRef</tt>). The twine delays +the actual concatenation of strings until it is actually required, at which +point it can be efficiently rendered directly into a character array. This +avoids unnecessary heap allocation involved in constructing the temporary +results of string concatenation. 
See +"<tt><a href="/doxygen/Twine_8h_source.html">llvm/ADT/Twine.h</a></tt>" +and <a href="#dss_twine">here</a> for more information.</p> <p>As with a <tt>StringRef</tt>, <tt>Twine</tt> objects point to external memory and should almost never be stored or mentioned directly. They are intended @@ -3374,8 +3375,9 @@ provide a name for it (probably based on the name of the translation unit).</p> <hr> <ul> - <li><tt><a href="#Function">Function</a> *getFunction(const std::string - &Name, const <a href="#FunctionType">FunctionType</a> *Ty)</tt> + + <li><tt><a href="#Function">Function</a> *getFunction(StringRef Name) const + </tt> <p>Look up the specified function in the <tt>Module</tt> <a href="#SymbolTable"><tt>SymbolTable</tt></a>. If it does not exist, return @@ -3863,7 +3865,7 @@ is its address (after linking) which is guaranteed to be constant.</p> *Ty, LinkageTypes Linkage, const std::string &N = "", Module* Parent = 0)</tt> <p>Constructor used when you need to create new <tt>Function</tt>s to add - the the program. The constructor must specify the type of the function to + the program. The constructor must specify the type of the function to create and what type of linkage the function should have. The <a href="#FunctionType"><tt>FunctionType</tt></a> argument specifies the formal arguments and return value for the function. The same @@ -4128,7 +4130,7 @@ arguments. An argument has a pointer to the parent Function.</p> <a href="mailto:dhurjati@cs.uiuc.edu">Dinakar Dhurjati</a> and <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-04-18 22:28:55 +0200 (Wed, 18 Apr 2012) $ + Last modified: $Date: 2012-07-25 15:46:11 +0200 (Wed, 25 Jul 2012) $ </address> </body> diff --git a/docs/Projects.html b/docs/Projects.html deleted file mode 100644 index ebd7203..0000000 --- a/docs/Projects.html +++ /dev/null @@ -1,489 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>Creating an LLVM Project</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> - -<h1>Creating an LLVM Project</h1> - -<ol> -<li><a href="#overview">Overview</a></li> -<li><a href="#create">Create a project from the Sample Project</a></li> -<li><a href="#source">Source tree layout</a></li> -<li><a href="#makefiles">Writing LLVM-style Makefiles</a> - <ol> - <li><a href="#reqVars">Required Variables</a></li> - <li><a href="#varsBuildDir">Variables for Building Subdirectories</a></li> - <li><a href="#varsBuildLib">Variables for Building Libraries</a></li> - <li><a href="#varsBuildProg">Variables for Building Programs</a></li> - <li><a href="#miscVars">Miscellaneous Variables</a></li> - </ol></li> -<li><a href="#objcode">Placement of object code</a></li> -<li><a href="#help">Further help</a></li> -</ol> - -<div class="doc_author"> - <p>Written by John Criswell</p> -</div> - -<!-- *********************************************************************** --> -<h2><a name="overview">Overview</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>The LLVM build system is designed to facilitate the building of third party -projects that use LLVM header files, libraries, and tools. In order to use -these facilities, a Makefile from a project must do the following things:</p> - -<ol> - <li>Set <tt>make</tt> variables. 
There are several variables that a Makefile - needs to set to use the LLVM build system: - <ul> - <li><tt>PROJECT_NAME</tt> - The name by which your project is known.</li> - <li><tt>LLVM_SRC_ROOT</tt> - The root of the LLVM source tree.</li> - <li><tt>LLVM_OBJ_ROOT</tt> - The root of the LLVM object tree.</li> - <li><tt>PROJ_SRC_ROOT</tt> - The root of the project's source tree.</li> - <li><tt>PROJ_OBJ_ROOT</tt> - The root of the project's object tree.</li> - <li><tt>PROJ_INSTALL_ROOT</tt> - The root installation directory.</li> - <li><tt>LEVEL</tt> - The relative path from the current directory to the - project's root ($PROJ_OBJ_ROOT).</li> - </ul></li> - <li>Include <tt>Makefile.config</tt> from <tt>$(LLVM_OBJ_ROOT)</tt>.</li> - <li>Include <tt>Makefile.rules</tt> from <tt>$(LLVM_SRC_ROOT)</tt>.</li> -</ol> - -<p>There are two ways that you can set all of these variables:</p> -<ol> - <li>You can write your own Makefiles which hard-code these values.</li> - <li>You can use the pre-made LLVM sample project. This sample project - includes Makefiles, a configure script that can be used to configure the - location of LLVM, and the ability to support multiple object directories - from a single source directory.</li> -</ol> - -<p>This document assumes that you will base your project on the LLVM sample -project found in <tt>llvm/projects/sample</tt>. If you want to devise your own -build system, studying the sample project and LLVM Makefiles will probably -provide enough information on how to write your own Makefiles.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="create">Create a Project from the Sample Project</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Follow these simple steps to start your project:</p> - -<ol> -<li>Copy the <tt>llvm/projects/sample</tt> directory to any place of your -choosing. You can place it anywhere you like. Rename the directory to match -the name of your project.</li> - -<li> -If you downloaded LLVM using Subversion, remove all the directories named .svn -(and all the files therein) from your project's new source tree. This will -keep Subversion from thinking that your project is inside -<tt>llvm/trunk/projects/sample</tt>.</li> - -<li>Add your source code and Makefiles to your source tree.</li> - -<li>If you want your project to be configured with the <tt>configure</tt> script -then you need to edit <tt>autoconf/configure.ac</tt> as follows: - <ul> - <li><b>AC_INIT</b>. Place the name of your project, its version number and - a contact email address for your project as the arguments to this macro</li> - <li><b>AC_CONFIG_AUX_DIR</b>. If your project isn't in the - <tt>llvm/projects</tt> directory then you might need to adjust this so that - it specifies a relative path to the <tt>llvm/autoconf</tt> directory.</li> - <li><b>LLVM_CONFIG_PROJECT</b>. Just leave this alone.</li> - <li><b>AC_CONFIG_SRCDIR</b>. Specify a path to a file name that identifies - your project; or just leave it at <tt>Makefile.common.in</tt></li> - <li><b>AC_CONFIG_FILES</b>. Do not change.</li> - <li><b>AC_CONFIG_MAKEFILE</b>. Use one of these macros for each Makefile - that your project uses. 
This macro arranges for your makefiles to be copied - from the source directory, unmodified, to the build directory.</li> - </ul> -</li> - -<li>After updating <tt>autoconf/configure.ac</tt>, regenerate the -configure script with these commands: - -<div class="doc_code"> -<p><tt>% cd autoconf<br> - % ./AutoRegen.sh</tt></p> -</div> - -<p>You must be using Autoconf version 2.59 or later and your aclocal version -should be 1.9 or later.</p></li> - -<li>Run <tt>configure</tt> in the directory in which you want to place -object code. Use the following options to tell your project where it -can find LLVM: - - <dl> - <dt><tt>--with-llvmsrc=<directory></tt></dt> - <dd>Tell your project where the LLVM source tree is located.</dd> - <dt><br><tt>--with-llvmobj=<directory></tt></dt> - <dd>Tell your project where the LLVM object tree is located.</dd> - <dt><br><tt>--prefix=<directory></tt></dt> - <dd>Tell your project where it should get installed.</dd> - </dl> -</ol> - -<p>That's it! Now all you have to do is type <tt>gmake</tt> (or <tt>make</tt> -if your on a GNU/Linux system) in the root of your object directory, and your -project should build.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="source">Source Tree Layout</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>In order to use the LLVM build system, you will want to organize your -source code so that it can benefit from the build system's features. -Mainly, you want your source tree layout to look similar to the LLVM -source tree layout. The best way to do this is to just copy the -project tree from <tt>llvm/projects/sample</tt> and modify it to meet -your needs, but you can certainly add to it if you want.</p> - -<p>Underneath your top level directory, you should have the following -directories:</p> - -<dl> - <dt><b>lib</b> - <dd> - This subdirectory should contain all of your library source - code. For each library that you build, you will have one - directory in <b>lib</b> that will contain that library's source - code. - - <p> - Libraries can be object files, archives, or dynamic libraries. - The <b>lib</b> directory is just a convenient place for libraries - as it places them all in a directory from which they can be linked - later. - - <dt><b>include</b> - <dd> - This subdirectory should contain any header files that are - global to your project. By global, we mean that they are used - by more than one library or executable of your project. - <p> - By placing your header files in <b>include</b>, they will be - found automatically by the LLVM build system. For example, if - you have a file <b>include/jazz/note.h</b>, then your source - files can include it simply with <b>#include "jazz/note.h"</b>. - - <dt><b>tools</b> - <dd> - This subdirectory should contain all of your source - code for executables. For each program that you build, you - will have one directory in <b>tools</b> that will contain that - program's source code. - <p> - - <dt><b>test</b> - <dd> - This subdirectory should contain tests that verify that your code - works correctly. Automated tests are especially useful. - <p> - Currently, the LLVM build system provides basic support for tests. - The LLVM system provides the following: - <ul> - <li> - LLVM provides a tcl procedure that is used by Dejagnu to run - tests. It can be found in <tt>llvm/lib/llvm-dg.exp</tt>. 
This - test procedure uses RUN lines in the actual test case to determine - how to run the test. See the <a - href="TestingGuide.html">TestingGuide</a> for more details. You - can easily write Makefile support similar to the Makefiles in - <tt>llvm/test</tt> to use Dejagnu to run your project's tests.<br></li> - <li> - LLVM contains an optional package called <tt>llvm-test</tt> - which provides benchmarks and programs that are known to compile with the - LLVM GCC front ends. You can use these - programs to test your code, gather statistics information, and - compare it to the current LLVM performance statistics. - <br>Currently, there is no way to hook your tests directly into the - <tt>llvm/test</tt> testing harness. You will simply - need to find a way to use the source provided within that directory - on your own. - </ul> -</dl> - -<p>Typically, you will want to build your <b>lib</b> directory first followed by -your <b>tools</b> directory.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="makefiles">Writing LLVM Style Makefiles</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>The LLVM build system provides a convenient way to build libraries and -executables. Most of your project Makefiles will only need to define a few -variables. Below is a list of the variables one can set and what they can -do:</p> - -<!-- ======================================================================= --> -<h3> - <a name="reqVars">Required Variables</a> -</h3> - -<div> - -<dl> - <dt>LEVEL - <dd> - This variable is the relative path from this Makefile to the - top directory of your project's source code. For example, if - your source code is in <tt>/tmp/src</tt>, then the Makefile in - <tt>/tmp/src/jump/high</tt> would set <tt>LEVEL</tt> to <tt>"../.."</tt>. -</dl> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="varsBuildDir">Variables for Building Subdirectories</a> -</h3> - -<div> - -<dl> - <dt>DIRS - <dd> - This is a space separated list of subdirectories that should be - built. They will be built, one at a time, in the order - specified. - <p> - - <dt>PARALLEL_DIRS - <dd> - This is a list of directories that can be built in parallel. - These will be built after the directories in DIRS have been - built. - <p> - - <dt>OPTIONAL_DIRS - <dd> - This is a list of directories that can be built if they exist, - but will not cause an error if they do not exist. They are - built serially in the order in which they are listed. -</dl> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="varsBuildLib">Variables for Building Libraries</a> -</h3> - -<div> - -<dl> - <dt>LIBRARYNAME - <dd> - This variable contains the base name of the library that will - be built. For example, to build a library named - <tt>libsample.a</tt>, LIBRARYNAME should be set to - <tt>sample</tt>. - <p> - - <dt>BUILD_ARCHIVE - <dd> - By default, a library is a <tt>.o</tt> file that is linked - directly into a program. To build an archive (also known as - a static library), set the BUILD_ARCHIVE variable. - <p> - - <dt>SHARED_LIBRARY - <dd> - If SHARED_LIBRARY is defined in your Makefile, a shared - (or dynamic) library will be built. 
-</dl> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="varsBuildProg">Variables for Building Programs</a> -</h3> - -<div> - -<dl> - <dt>TOOLNAME - <dd> - This variable contains the name of the program that will - be built. For example, to build an executable named - <tt>sample</tt>, TOOLNAME should be set to <tt>sample</tt>. - <p> - - <dt>USEDLIBS - <dd> - This variable holds a space separated list of libraries that should - be linked into the program. These libraries must be libraries that - come from your <b>lib</b> directory. The libraries must be - specified without their "lib" prefix. For example, to link - libsample.a, you would set USEDLIBS to - <tt>sample.a</tt>. - <p> - Note that this works only for statically linked libraries. - <p> - - <dt>LLVMLIBS - <dd> - This variable holds a space separated list of libraries that should - be linked into the program. These libraries must be LLVM libraries. - The libraries must be specified without their "lib" prefix. For - example, to link with a driver that performs an IR transformation - you might set LLVMLIBS to this minimal set of libraries - <tt>LLVMSupport.a LLVMCore.a LLVMBitReader.a LLVMAsmParser.a LLVMAnalysis.a LLVMTransformUtils.a LLVMScalarOpts.a LLVMTarget.a</tt>. - <p> - Note that this works only for statically linked libraries. LLVM is - split into a large number of static libraries, and the list of libraries you - require may be much longer than the list above. To see a full list - of libraries use: - <tt>llvm-config --libs all</tt>. - Using LINK_COMPONENTS as described below, obviates the need to set LLVMLIBS. - <p> - - <dt>LINK_COMPONENTS - <dd>This variable holds a space separated list of components that - the LLVM Makefiles pass to the <tt>llvm-config</tt> tool to generate - a link line for the program. For example, to link with all LLVM - libraries use - <tt>LINK_COMPONENTS = all</tt>. - <p> - - <dt>LIBS - <dd> - To link dynamic libraries, add <tt>-l<library base name></tt> to - the LIBS variable. The LLVM build system will look in the same places - for dynamic libraries as it does for static libraries. - <p> - For example, to link <tt>libsample.so</tt>, you would have the - following line in your <tt>Makefile</tt>: - <p> - <tt> - LIBS += -lsample - </tt> - <p> - Note that LIBS must occur in the Makefile after the inclusion of Makefile.common. - <p> -</dl> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="miscVars">Miscellaneous Variables</a> -</h3> - -<div> - -<dl> - <dt>ExtraSource - <dd> - This variable contains a space separated list of extra source - files that need to be built. It is useful for including the - output of Lex and Yacc programs. - <p> - - <dt>CFLAGS - <dt>CPPFLAGS - <dd> - This variable can be used to add options to the C and C++ - compiler, respectively. It is typically used to add options - that tell the compiler the location of additional directories - to search for header files. - <p> - It is highly suggested that you append to CFLAGS and CPPFLAGS as - opposed to overwriting them. The master Makefiles may already - have useful options in them that you may not want to overwrite. 
- <p> -</dl> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="objcode">Placement of Object Code</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>The final location of built libraries and executables will depend upon -whether you do a Debug, Release, or Profile build.</p> - -<dl> - <dt>Libraries - <dd> - All libraries (static and dynamic) will be stored in - <tt>PROJ_OBJ_ROOT/<type>/lib</tt>, where type is <tt>Debug</tt>, - <tt>Release</tt>, or <tt>Profile</tt> for a debug, optimized, or - profiled build, respectively.<p> - - <dt>Executables - <dd>All executables will be stored in - <tt>PROJ_OBJ_ROOT/<type>/bin</tt>, where type is <tt>Debug</tt>, - <tt>Release</tt>, or <tt>Profile</tt> for a debug, optimized, or profiled - build, respectively. -</dl> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="help">Further Help</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>If you have any questions or need any help creating an LLVM project, -the LLVM team would be more than happy to help. You can always post your -questions to the <a -href="http://mail.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVM Developers -Mailing List</a>.</p> - -</div> - -<!-- *********************************************************************** --> -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:criswell@uiuc.edu">John Criswell</a><br> - <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a> - <br> - Last modified: $Date: 2011-10-31 12:21:59 +0100 (Mon, 31 Oct 2011) $ -</address> - -</body> -</html> diff --git a/docs/Projects.rst b/docs/Projects.rst new file mode 100644 index 0000000..6313288 --- /dev/null +++ b/docs/Projects.rst @@ -0,0 +1,327 @@ +.. _projects: + +======================== +Creating an LLVM Project +======================== + +.. contents:: + :local: + +Overview +======== + +The LLVM build system is designed to facilitate the building of third party +projects that use LLVM header files, libraries, and tools. In order to use +these facilities, a ``Makefile`` from a project must do the following things: + +* Set ``make`` variables. There are several variables that a ``Makefile`` needs + to set to use the LLVM build system: + + * ``PROJECT_NAME`` - The name by which your project is known. + * ``LLVM_SRC_ROOT`` - The root of the LLVM source tree. + * ``LLVM_OBJ_ROOT`` - The root of the LLVM object tree. + * ``PROJ_SRC_ROOT`` - The root of the project's source tree. + * ``PROJ_OBJ_ROOT`` - The root of the project's object tree. + * ``PROJ_INSTALL_ROOT`` - The root installation directory. + * ``LEVEL`` - The relative path from the current directory to the + project's root ``($PROJ_OBJ_ROOT)``. + +* Include ``Makefile.config`` from ``$(LLVM_OBJ_ROOT)``. + +* Include ``Makefile.rules`` from ``$(LLVM_SRC_ROOT)``. + +There are two ways that you can set all of these variables: + +* You can write your own ``Makefiles`` which hard-code these values. + +* You can use the pre-made LLVM sample project. 
This sample project includes + ``Makefiles``, a configure script that can be used to configure the location + of LLVM, and the ability to support multiple object directories from a single + source directory. + +This document assumes that you will base your project on the LLVM sample project +found in ``llvm/projects/sample``. If you want to devise your own build system, +studying the sample project and LLVM ``Makefiles`` will probably provide enough +information on how to write your own ``Makefiles``. + +Create a Project from the Sample Project +======================================== + +Follow these simple steps to start your project: + +1. Copy the ``llvm/projects/sample`` directory to any place of your choosing. + You can place it anywhere you like. Rename the directory to match the name + of your project. + +2. If you downloaded LLVM using Subversion, remove all the directories named + ``.svn`` (and all the files therein) from your project's new source tree. + This will keep Subversion from thinking that your project is inside + ``llvm/trunk/projects/sample``. + +3. Add your source code and Makefiles to your source tree. + +4. If you want your project to be configured with the ``configure`` script then + you need to edit ``autoconf/configure.ac`` as follows: + + * **AC_INIT** - Place the name of your project, its version number and a + contact email address for your project as the arguments to this macro + + * **AC_CONFIG_AUX_DIR** - If your project isn't in the ``llvm/projects`` + directory then you might need to adjust this so that it specifies a + relative path to the ``llvm/autoconf`` directory. + + * **LLVM_CONFIG_PROJECT** - Just leave this alone. + + * **AC_CONFIG_SRCDIR** - Specify a path to a file name that identifies your + project; or just leave it at ``Makefile.common.in``. + + * **AC_CONFIG_FILES** - Do not change. + + * **AC_CONFIG_MAKEFILE** - Use one of these macros for each Makefile that + your project uses. This macro arranges for your makefiles to be copied from + the source directory, unmodified, to the build directory. + +5. After updating ``autoconf/configure.ac``, regenerate the configure script + with these commands. (You must be using ``Autoconf`` version 2.59 or later + and your ``aclocal`` version should be 1.9 or later.) + + .. code-block:: bash + + % cd autoconf + % ./AutoRegen.sh + +6. Run ``configure`` in the directory in which you want to place object code. + Use the following options to tell your project where it can find LLVM: + + ``--with-llvmsrc=<directory>`` + Tell your project where the LLVM source tree is located. + + ``--with-llvmobj=<directory>`` + Tell your project where the LLVM object tree is located. + + ``--prefix=<directory>`` + Tell your project where it should get installed. + +That's it! Now all you have to do is type ``gmake`` (or ``make`` if you're on a +GNU/Linux system) in the root of your object directory, and your project should +build. + +Source Tree Layout +================== + +In order to use the LLVM build system, you will want to organize your source +code so that it can benefit from the build system's features. Mainly, you want +your source tree layout to look similar to the LLVM source tree layout. The +best way to do this is to just copy the project tree from +``llvm/projects/sample`` and modify it to meet your needs, but you can certainly +add to it if you want. 
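+
+As a rough sketch, the commands below create the conventional layout for a
+hypothetical project that provides a ``jazz`` library and a single tool; the
+directory names are purely illustrative and are not part of the sample project.
+
+.. code-block:: bash
+
+   # Run from the top of your (copied and renamed) project tree.
+   # "jazz" and "jazzplayer" are placeholder names.
+   mkdir -p lib/jazz include/jazz tools/jazzplayer test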
+ +Underneath your top level directory, you should have the following directories: + +**lib** + + This subdirectory should contain all of your library source code. For each + library that you build, you will have one directory in **lib** that will + contain that library's source code. + + Libraries can be object files, archives, or dynamic libraries. The **lib** + directory is just a convenient place for libraries as it places them all in + a directory from which they can be linked later. + +**include** + + This subdirectory should contain any header files that are global to your + project. By global, we mean that they are used by more than one library or + executable of your project. + + By placing your header files in **include**, they will be found + automatically by the LLVM build system. For example, if you have a file + **include/jazz/note.h**, then your source files can include it simply with + **#include "jazz/note.h"**. + +**tools** + + This subdirectory should contain all of your source code for executables. + For each program that you build, you will have one directory in **tools** + that will contain that program's source code. + +**test** + + This subdirectory should contain tests that verify that your code works + correctly. Automated tests are especially useful. + + Currently, the LLVM build system provides basic support for tests. The LLVM + system provides the following: + +* LLVM provides a ``tcl`` procedure that is used by ``Dejagnu`` to run tests. + It can be found in ``llvm/lib/llvm-dg.exp``. This test procedure uses ``RUN`` + lines in the actual test case to determine how to run the test. See the + `TestingGuide <TestingGuide.html>`_ for more details. You can easily write + Makefile support similar to the Makefiles in ``llvm/test`` to use ``Dejagnu`` + to run your project's tests. + +* LLVM contains an optional package called ``llvm-test``, which provides + benchmarks and programs that are known to compile with the Clang front + end. You can use these programs to test your code, gather statistical + information, and compare it to the current LLVM performance statistics. + + Currently, there is no way to hook your tests directly into the ``llvm/test`` + testing harness. You will simply need to find a way to use the source + provided within that directory on your own. + +Typically, you will want to build your **lib** directory first followed by your +**tools** directory. + +Writing LLVM Style Makefiles +============================ + +The LLVM build system provides a convenient way to build libraries and +executables. Most of your project Makefiles will only need to define a few +variables. Below is a list of the variables one can set and what they can +do: + +Required Variables +------------------ + +``LEVEL`` + + This variable is the relative path from this ``Makefile`` to the top + directory of your project's source code. For example, if your source code + is in ``/tmp/src``, then the ``Makefile`` in ``/tmp/src/jump/high`` + would set ``LEVEL`` to ``"../.."``. + +Variables for Building Subdirectories +------------------------------------- + +``DIRS`` + + This is a space separated list of subdirectories that should be built. They + will be built, one at a time, in the order specified. + +``PARALLEL_DIRS`` + + This is a list of directories that can be built in parallel. These will be + built after the directories in DIRS have been built. 
+ +``OPTIONAL_DIRS`` + + This is a list of directories that can be built if they exist, but will not + cause an error if they do not exist. They are built serially in the order + in which they are listed. + +Variables for Building Libraries +-------------------------------- + +``LIBRARYNAME`` + + This variable contains the base name of the library that will be built. For + example, to build a library named ``libsample.a``, ``LIBRARYNAME`` should + be set to ``sample``. + +``BUILD_ARCHIVE`` + + By default, a library is a ``.o`` file that is linked directly into a + program. To build an archive (also known as a static library), set the + ``BUILD_ARCHIVE`` variable. + +``SHARED_LIBRARY`` + + If ``SHARED_LIBRARY`` is defined in your Makefile, a shared (or dynamic) + library will be built. + +Variables for Building Programs +------------------------------- + +``TOOLNAME`` + + This variable contains the name of the program that will be built. For + example, to build an executable named ``sample``, ``TOOLNAME`` should be set + to ``sample``. + +``USEDLIBS`` + + This variable holds a space separated list of libraries that should be + linked into the program. These libraries must be libraries that come from + your **lib** directory. The libraries must be specified without their + ``lib`` prefix. For example, to link ``libsample.a``, you would set + ``USEDLIBS`` to ``sample.a``. + + Note that this works only for statically linked libraries. + +``LLVMLIBS`` + + This variable holds a space separated list of libraries that should be + linked into the program. These libraries must be LLVM libraries. The + libraries must be specified without their ``lib`` prefix. For example, to + link with a driver that performs an IR transformation you might set + ``LLVMLIBS`` to this minimal set of libraries ``LLVMSupport.a LLVMCore.a + LLVMBitReader.a LLVMAsmParser.a LLVMAnalysis.a LLVMTransformUtils.a + LLVMScalarOpts.a LLVMTarget.a``. + + Note that this works only for statically linked libraries. LLVM is split + into a large number of static libraries, and the list of libraries you + require may be much longer than the list above. To see a full list of + libraries use: ``llvm-config --libs all``. Using ``LINK_COMPONENTS`` as + described below, obviates the need to set ``LLVMLIBS``. + +``LINK_COMPONENTS`` + + This variable holds a space separated list of components that the LLVM + ``Makefiles`` pass to the ``llvm-config`` tool to generate a link line for + the program. For example, to link with all LLVM libraries use + ``LINK_COMPONENTS = all``. + +``LIBS`` + + To link dynamic libraries, add ``-l<library base name>`` to the ``LIBS`` + variable. The LLVM build system will look in the same places for dynamic + libraries as it does for static libraries. + + For example, to link ``libsample.so``, you would have the following line in + your ``Makefile``: + + .. code-block:: makefile + + LIBS += -lsample + +Note that ``LIBS`` must occur in the Makefile after the inclusion of +``Makefile.common``. + +Miscellaneous Variables +----------------------- + +``CFLAGS`` & ``CPPFLAGS`` + + This variable can be used to add options to the C and C++ compiler, + respectively. It is typically used to add options that tell the compiler + the location of additional directories to search for header files. + + It is highly suggested that you append to ``CFLAGS`` and ``CPPFLAGS`` as + opposed to overwriting them. The master ``Makefiles`` may already have + useful options in them that you may not want to overwrite. 
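+
+Putting the variables above together, the ``Makefile`` for a library and the
+``Makefile`` for a tool are typically only a few lines long. The two sketches
+below are illustrative only (the name ``sample``, the directory depths, and
+the component choice are assumptions, not taken verbatim from the sample
+project):
+
+.. code-block:: makefile
+
+   # lib/sample/Makefile -- builds the static library libsample.a
+   LEVEL = ../..
+   LIBRARYNAME = sample
+   BUILD_ARCHIVE = 1
+   include $(LEVEL)/Makefile.common
+
+.. code-block:: makefile
+
+   # tools/sample/Makefile -- builds the "sample" executable
+   LEVEL = ../..
+   TOOLNAME = sample
+
+   # Project libraries from the lib directory, without the "lib" prefix.
+   USEDLIBS = sample.a
+
+   # Let llvm-config compute the LLVM link line.
+   LINK_COMPONENTS = all
+
+   include $(LEVEL)/Makefile.common
+
+   # Dynamic libraries must be added to LIBS *after* Makefile.common has
+   # been included, for example:
+   # LIBS += -lsample
+
+As noted earlier, the ``lib`` directory should be built before ``tools`` so
+that ``libsample.a`` already exists when the tool is linked.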
+ +Placement of Object Code +======================== + +The final location of built libraries and executables will depend upon whether +you do a ``Debug``, ``Release``, or ``Profile`` build. + +Libraries + + All libraries (static and dynamic) will be stored in + ``PROJ_OBJ_ROOT/<type>/lib``, where *type* is ``Debug``, ``Release``, or + ``Profile`` for a debug, optimized, or profiled build, respectively. + +Executables + + All executables will be stored in ``PROJ_OBJ_ROOT/<type>/bin``, where *type* + is ``Debug``, ``Release``, or ``Profile`` for a debug, optimized, or + profiled build, respectively. + +Further Help +============ + +If you have any questions or need any help creating an LLVM project, the LLVM +team would be more than happy to help. You can always post your questions to +the `LLVM Developers Mailing List +<http://lists.cs.uiuc.edu/pipermail/llvmdev/>`_. diff --git a/docs/README.txt b/docs/README.txt new file mode 100644 index 0000000..2fbbf98 --- /dev/null +++ b/docs/README.txt @@ -0,0 +1,12 @@ +LLVM Documentation +================== + +The LLVM documentation is currently written in two formats: + + * Plain HTML documentation. + + * reStructured Text documentation using the Sphinx documentation generator. It + is currently tested with Sphinx 1.1.3. + + For more information, see the "Sphinx Introduction for LLVM Developers" + document. diff --git a/docs/ReleaseNotes.html b/docs/ReleaseNotes.html index 71f2cea..85448a5 100644 --- a/docs/ReleaseNotes.html +++ b/docs/ReleaseNotes.html @@ -4,11 +4,11 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <link rel="stylesheet" href="_static/llvm.css" type="text/css"> - <title>LLVM 3.1 Release Notes</title> + <title>LLVM 3.2 Release Notes</title> </head> <body> -<h1>LLVM 3.1 Release Notes</h1> +<h1>LLVM 3.2 Release Notes</h1> <div> <img style="float:right" src="http://llvm.org/img/DragonSmall.png" @@ -18,7 +18,7 @@ <ol> <li><a href="#intro">Introduction</a></li> <li><a href="#subproj">Sub-project Status Update</a></li> - <li><a href="#externalproj">External Projects Using LLVM 3.1</a></li> + <li><a href="#externalproj">External Projects Using LLVM 3.2</a></li> <li><a href="#whatsnew">What's New in LLVM?</a></li> <li><a href="GettingStarted.html">Installation Instructions</a></li> <li><a href="#knownproblems">Known Problems</a></li> @@ -29,6 +29,12 @@ <p>Written by the <a href="http://llvm.org/">LLVM Team</a></p> </div> +<h1 style="color:red">These are in-progress notes for the upcoming LLVM 3.2 +release.<br> +You may prefer the +<a href="http://llvm.org/releases/3.1/docs/ReleaseNotes.html">LLVM 3.1 +Release Notes</a>.</h1> + <!-- *********************************************************************** --> <h2> <a name="intro">Introduction</a> @@ -38,11 +44,11 @@ <div> <p>This document contains the release notes for the LLVM Compiler - Infrastructure, release 3.1. Here we describe the status of LLVM, including + Infrastructure, release 3.2. Here we describe the status of LLVM, including major improvements from the previous release, improvements in various - subprojects of LLVM, and some of the current users of the code. - All LLVM releases may be downloaded from - the <a href="http://llvm.org/releases/">LLVM releases web site</a>.</p> + subprojects of LLVM, and some of the current users of the code. 
All LLVM + releases may be downloaded from the <a href="http://llvm.org/releases/">LLVM + releases web site</a>.</p> <p>For more information about LLVM, including information about the latest release, please check out the <a href="http://llvm.org/">main LLVM web @@ -66,10 +72,10 @@ <div> -<p>The LLVM 3.1 distribution currently consists of code from the core LLVM - repository (which roughly includes the LLVM optimizers, code generators and - supporting tools), and the Clang repository. In addition to this code, the - LLVM Project includes other sub-projects that are in development. Here we +<p>The LLVM 3.2 distribution currently consists of code from the core LLVM + repository, which roughly includes the LLVM optimizers, code generators and + supporting tools, and the Clang repository. In addition to this code, the + LLVM Project includes other sub-projects that are in development. Here we include updates on these subprojects.</p> <!--=========================================================================--> @@ -88,20 +94,13 @@ production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86 (32- and 64-bit), and for Darwin/ARM targets.</p> -<p>In the LLVM 3.1 time-frame, the Clang team has made many improvements. +<p>In the LLVM 3.2 time-frame, the Clang team has made many improvements. Highlights include:</p> <ul> - <li>Greatly expanded <a href="http://clang.llvm.org/cxx_status.html">C++11 - support</a> including lambdas, initializer lists, constexpr, user-defined - literals, and atomics.</li> - <li>A new <a href="http://clang.llvm.org/docs/Tooling.html">tooling</a> - library to ease building of clang-based standalone tools.</li> - <li>Extended support for - <a href="http://clang.llvm.org/docs/ObjectiveCLiterals.html">literals in - Objective C</a>.</li> + <li>...</li> </ul> -<p>For more details about the changes to Clang since the 3.0 release, see the +<p>For more details about the changes to Clang since the 3.1 release, see the <a href="http://clang.llvm.org/docs/ReleaseNotes.html">Clang release notes.</a></p> @@ -127,23 +126,10 @@ Linux and OpenBSD platforms. It fully supports Ada, C, C++ and Fortran. It has partial support for Go, Java, Obj-C and Obj-C++.</p> -<p>The 3.1 release has the following notable changes:</p> +<p>The 3.2 release has the following notable changes:</p> <ul> - <li>Partial support for gcc-4.7. Ada support is poor, but other languages work - fairly well.</li> - - <li>Support for ARM processors. Some essential gcc headers that are needed to - build DragonEgg for ARM are not installed by gcc. To work around this, - copy the missing headers from the gcc source tree.</li> - - <li>Better optimization for Fortran by exploiting the fact that Fortran scalar - arguments have 'restrict' semantics.</li> - - <li>Better optimization for all languages by passing information about type - aliasing and type ranges to the LLVM optimizers.</li> - - <li>A regression test-suite was added.</li> + <li>...</li> </ul> </div> @@ -160,13 +146,15 @@ target-specific hooks required by code generation and other runtime components. For example, when compiling for a 32-bit target, converting a double to a 64-bit unsigned integer is compiled into a runtime call to the - "__fixunsdfdi" function. The compiler-rt library provides highly optimized - implementations of this and other low-level routines (some are 3x faster than - the equivalent libgcc routines).</p> + <code>__fixunsdfdi</code> function. 
The compiler-rt library provides highly + optimized implementations of this and other low-level routines (some are 3x + faster than the equivalent libgcc routines).</p> -<p>As of 3.1, compiler-rt includes the helper functions for atomic operations, - allowing atomic operations on arbitrary-sized quantities to work. These - functions follow the specification defined by gcc and are used by clang.</p> +<p>The 3.2 release has the following notable changes:</p> + +<ul> + <li>...</li> +</ul> </div> @@ -183,6 +171,12 @@ expression parsing (particularly for C++) and uses the LLVM JIT for target support.</p> +<p>The 3.2 release has the following notable changes:</p> + +<ul> + <li>...</li> +</ul> + </div> <!--=========================================================================--> @@ -196,15 +190,10 @@ licensed</a> under the MIT and UIUC license, allowing it to be used more permissively.</p> -<p>Within the LLVM 3.1 time-frame there were the following highlights:</p> +<p>Within the LLVM 3.2 time-frame there were the following highlights:</p> <ul> - <li>The <code><atomic></code> header is now passing all tests, when - compiling with clang and linking against the support code from - compiler-rt.</li> - <li>FreeBSD now includes libc++ as part of the base system.</li> - <li>libc++ has been ported to Solaris and, in combination with libcxxrt and - clang, is working with a large body of existing code.</li> + <li>...</li> </ul> </div> @@ -220,8 +209,11 @@ of a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and just-in-time compilation.</p> -<p>In the LLVM 3.1 time-frame, VMKit has had significant improvements on both - runtime and startup performance.</p> +<p>The 3.2 release has the following notable changes:</p> + +<ul> + <li>...</li> +</ul> </div> @@ -239,16 +231,10 @@ Work in the area of automatic SIMD and accelerator code generation was started.</p> -<p>Within the LLVM 3.1 time-frame there were the following highlights:</p> +<p>Within the LLVM 3.2 time-frame there were the following highlights:</p> <ul> - <li>Polly became an official LLVM project</li> - <li>Polly can be loaded directly into clang (enabled by '-O3 -mllvm -polly')</li> - <li>An automatic scheduling optimizer (derived - from <a href="http://pluto-compiler.sourceforge.net/">Pluto</a>) was - integrated. It performs loop transformations to optimize for data-locality - and parallelism. The transformations include, but are not limited to - interchange, fusion, fission, skewing and tiling.</li> + <li>...</li> </ul> </div> @@ -257,15 +243,15 @@ <!-- *********************************************************************** --> <h2> - <a name="externalproj">External Open Source Projects Using LLVM 3.1</a> + <a name="externalproj">External Open Source Projects Using LLVM 3.2</a> </h2> <!-- *********************************************************************** --> <div> <p>An exciting aspect of LLVM is that it is used as an enabling technology for - a lot of other language and tools projects. This section lists some of the - projects that have already been updated to work with LLVM 3.1.</p> + a lot of other language and tools projects. 
This section lists some of the + projects that have already been updated to work with LLVM 3.2.</p> <h3>Crack</h3> @@ -409,14 +395,14 @@ <!-- *********************************************************************** --> <h2> - <a name="whatsnew">What's New in LLVM 3.1?</a> + <a name="whatsnew">What's New in LLVM 3.2?</a> </h2> <!-- *********************************************************************** --> <div> <p>This release includes a huge number of bug fixes, performance tweaks and - minor improvements. Some of the major improvements and new features are + minor improvements. Some of the major improvements and new features are listed in this section.</p> <!--=========================================================================--> @@ -426,13 +412,13 @@ <div> - <!-- Features that need text if they're finished for 3.1: + <!-- Features that need text if they're finished for 3.2: ARM EHABI combiner-aa? strong phi elim loop dependence analysis CorrelatedValuePropagation - lib/Transforms/IPO/MergeFunctions.cpp => consider for 3.1. + lib/Transforms/IPO/MergeFunctions.cpp => consider for 3.2. Integrated assembler on by default for arm/thumb? --> @@ -443,17 +429,10 @@ llvm/lib/Archive - replace with lib object? --> -<p>LLVM 3.1 includes several major changes and big features:</p> +<p>LLVM 3.2 includes several major changes and big features:</p> <ul> - <li><a href="../tools/clang/docs/AddressSanitizer.html">AddressSanitizer</a>, - a fast memory error detector.</li> - <li><a href="CodeGenerator.html#machineinstrbundle">MachineInstr Bundles</a>, - Support to model instruction bundling / packing.</li> - <li><a href="#armintegratedassembler">ARM Integrated Assembler</a>, - A full featured assembler and direct-to-object support for ARM.</li> - <li><a href="#blockplacement">Basic Block Placement</a> - Probability driven basic block placement.</li> + <li>...</li> </ul> </div> @@ -470,19 +449,9 @@ expose new optimization opportunities:</p> <ul> - <li>A new type representing 16 bit <i>half</i> floating point values has - been added.</li> - <li>IR now supports vectors of pointers, including vector GEPs.</li> - <li>Module flags have been introduced. They convey information about the - module as a whole to LLVM subsystems. This is currently used to encode - Objective C ABI information.</li> - <li>Loads can now have range metadata attached to them to describe the - possible values being loaded.</li> - <li>The <tt>llvm.ctlz</tt> and <tt>llvm.cttz</tt> intrinsics now have an - additional argument which indicates whether the behavior of the intrinsic - is undefined on a zero input. This can be used to generate more efficient - code on platforms that only have instructions which don't return the type - size when counting bits in 0.</li> + <li>Thread local variables may have a specified TLS model. See the + <a href="LangRef.html#globalvars">Language Reference Manual</a>.</li> + <li>...</li> </ul> </div> @@ -494,22 +463,11 @@ <div> -<p>In addition to many minor performance tweaks and bug fixes, this - release includes a few major enhancements and additions to the - optimizers:</p> +<p>In addition to many minor performance tweaks and bug fixes, this release + includes a few major enhancements and additions to the optimizers:</p> <ul> - <li>The loop unroll pass now is able to unroll loops with run-time trip counts. - This feature is turned off by default, and is enabled with the - <code>-unroll-runtime</code> flag.</li> - <li>A new basic-block autovectorization pass is available. 
Pass - <code>-vectorize</code> to run this pass along with some associated - post-vectorization cleanup passes. For more information, see the EuroLLVM - 2012 slides: <a href="http://llvm.org/devmtg/2012-04-12/Slides/Hal_Finkel.pdf"> - Autovectorization with LLVM</a>.</li> - <li>Inline cost heuristics have been completely overhauled and now closely - model constant propagation through call sites, disregard trivially dead - code costs, and can model C++ STL iterator patterns.</li> + <li>...</li> </ul> </div> @@ -524,14 +482,12 @@ <p>The LLVM Machine Code (aka MC) subsystem was created to solve a number of problems in the realm of assembly, disassembly, object file format handling, and a number of other related areas that CPU instruction-set level tools work - in. For more information, please see - the <a href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro - to the LLVM MC Project Blog Post</a>.</p> + in. For more information, please see the + <a href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro + to the LLVM MC Project Blog Post</a>.</p> <ul> - <li>The integrated assembler can optionally emit debug information when - assembling a </tt>.s</tt> file. It can be enabled by passing the - <tt>-g</tt> option to <tt>llvm-mc</tt>.</li> + <li>...</li> </ul> </div> @@ -556,21 +512,7 @@ make it run faster:</p> <ul> - <li>TableGen can now synthesize register classes that are only needed to - represent combinations of constraints from instructions and sub-registers. - The synthetic register classes inherit most of their properties form their - closest user-defined super-class.</li> - <li><code>MachineRegisterInfo</code> now allows the reserved registers to be - frozen when register allocation starts. Target hooks should use the - <code>MRI->canReserveReg(FramePtr)</code> method to avoid accidentally - disabling frame pointer elimination during register allocation.</li> - <li>A new kind of <code>MachineOperand</code> provides a compact - representation of large clobber lists on call instructions. The register - mask operand references a bit mask of preserved registers. Everything else - is clobbered.</li> - <li>The DWARF debug info writer gained support for emitting data for the - <a href="SourceLevelDebugging.html#acceltable">name accelerator tables - DWARF extension</a>. It is used by LLDB to speed up name lookup.</li> + <li>...</li> </ul> <p> We added new TableGen infrastructure to support bundling for @@ -587,11 +529,14 @@ <h4> <a name="blockplacement">Basic Block Placement</a> </h4> + <div> + <p>A probability based block placement and code layout algorithm was added to -LLVM's code generator. This layout pass supports probabilities derived from -static heuristics as well as source code annotations such as -<code>__builtin_expect</code>.</p> + LLVM's code generator. This layout pass supports probabilities derived from + static heuristics as well as source code annotations such as + <code>__builtin_expect</code>.</p> + </div> <!--=========================================================================--> @@ -604,14 +549,7 @@ static heuristics as well as source code annotations such as <p>New features and major changes in the X86 target include:</p> <ul> - <li>Greatly improved support for AVX2.</li> - <li>Lots of bug fixes and improvements for AVX1.</li> - <li>Support for the FMA4 and XOP instruction set extensions.</li> - <li>Call instructions use the new register mask operands for faster compile - times and better support for different calling conventions. 
The old WINCALL - instructions are no longer needed.</li> - <li>DW2 Exception Handling is enabled on Cygwin and MinGW.</li> - <li>Support for implicit TLS model used with MSVC runtime.</li> + <li>...</li> </ul> </div> @@ -626,65 +564,45 @@ static heuristics as well as source code annotations such as <p>New features of the ARM target include:</p> <ul> - <li>The constant island pass now supports basic block and constant pool entry - alignments greater than 4 bytes.</li> - <li>On Darwin, the ARM target now has a full-featured integrated assembler. - </li> + <li>...</li> </ul> +<!--_________________________________________________________________________--> + <h4> <a name="armintegratedassembler">ARM Integrated Assembler</a> </h4> + <div> + <p>The ARM target now includes a full featured macro assembler, including -direct-to-object module support for clang. The assembler is currently enabled -by default for Darwin only pending testing and any additional necessary -platform specific support for Linux.</p> + direct-to-object module support for clang. The assembler is currently enabled + by default for Darwin only pending testing and any additional necessary + platform specific support for Linux.</p> <p>Full support is included for Thumb1, Thumb2 and ARM modes, along with -subtarget and CPU specific extensions for VFP2, VFP3 and NEON.</p> + subtarget and CPU specific extensions for VFP2, VFP3 and NEON.</p> <p>The assembler is Unified Syntax only (see ARM Architecural Reference Manual -for details). While there is some, and growing, support for pre-unfied (divided) -syntax, there are still significant gaps in that support.</p> -</div> + for details). While there is some, and growing, support for pre-unfied + (divided) syntax, there are still significant gaps in that support.</p> </div> -<!--=========================================================================--> -<h3> -<a name="MIPS">MIPS Target Improvements</a> -</h3> -<div> -New features and major changes in the MIPS target include:</p> - -<ul> - <li>MIPS32 little-endian direct object code emission is functional.</li> - <li>MIPS64 little-endian code generation is largely functional for N64 ABI in assembly printing mode with the exception of handling of long double (f128) type.</li> - <li>Support for new instructions has been added, which includes swap-bytes - instructions (WSBH and DSBH), floating point multiply-add/subtract and - negative multiply-add/subtract instructions, and floating - point load/store instructions with reg+reg addressing (LWXC1, etc.)</li> - <li>Various fixes to improve performance have been implemented.</li> - <li>Post-RA scheduling is now enabled at -O3.</li> - <li>Support for soft-float code generation has been added.</li> - <li>clang driver's support for MIPS 64-bits targets.</li> - <li>Support for MIPS floating point ABI option in clang driver.</li> -</ul> </div> <!--=========================================================================--> <h3> -<a name="PTX">PTX Target Improvements</a> +<a name="MIPS">MIPS Target Improvements</a> </h3> <div> -<p>An outstanding conditional inversion bug was fixed in this release.</p> +<p>New features and major changes in the MIPS target include:</p> -<p><b>NOTE</b>: LLVM 3.1 marks the last release of the PTX back-end, in its - current form. 
The back-end is currently being replaced by the NVPTX - back-end, currently in SVN ToT.</p> +<ul> + <li>...</li> +</ul> </div> @@ -696,7 +614,7 @@ New features and major changes in the MIPS target include:</p> <div> <ul> - <li>Support for Qualcomm's Hexagon VLIW processor has been added.</li> + <li>...</li> </ul> </div> @@ -709,25 +627,11 @@ New features and major changes in the MIPS target include:</p> <div> <p>If you're already an LLVM user or developer with out-of-tree changes based on - LLVM 3.1, this section lists some "gotchas" that you may run into upgrading + LLVM 3.2, this section lists some "gotchas" that you may run into upgrading from the previous release.</p> <ul> - <li>LLVM's build system now requires a python 2 interpreter to be present at - build time. A perl interpreter is no longer required.</li> - <li>The C backend has been removed. It had numerous problems, to the point of - not being able to compile any nontrivial program.</li> - <li>The Alpha, Blackfin and SystemZ targets have been removed due to lack of - maintenance.</li> - <li>LLVM 3.1 removes support for reading LLVM 2.9 bitcode files. Going - forward, we aim for all future versions of LLVM to read bitcode files and - <tt>.ll</tt> files produced by LLVM 3.0 and later.</li> - <li>The <tt>unwind</tt> instruction is now gone. With the introduction of the - new exception handling system in LLVM 3.0, the <tt>unwind</tt> instruction - became obsolete.</li> - <li>LLVM 3.0 and earlier automatically added the returns_twice fo functions - like setjmp based on the name. This functionality was removed in 3.1. - This affects Clang users, if -ffreestanding is used.</li> + <li>...</li> </ul> </div> @@ -743,40 +647,7 @@ New features and major changes in the MIPS target include:</p> LLVM API changes are:</p> <ul> - <li>Target specific options have been moved from global variables to members - on the new <code>TargetOptions</code> class, which is local to each - <code>TargetMachine</code>. As a consequence, the associated flags will - no longer be accepted by <tt>clang -mllvm</tt>. 
This includes: -<ul> -<li><code>llvm::PrintMachineCode</code></li> -<li><code>llvm::NoFramePointerElim</code></li> -<li><code>llvm::NoFramePointerElimNonLeaf</code></li> -<li><code>llvm::DisableFramePointerElim(const MachineFunction &)</code></li> -<li><code>llvm::LessPreciseFPMADOption</code></li> -<li><code>llvm::LessPrecideFPMAD()</code></li> -<li><code>llvm::NoExcessFPPrecision</code></li> -<li><code>llvm::UnsafeFPMath</code></li> -<li><code>llvm::NoInfsFPMath</code></li> -<li><code>llvm::NoNaNsFPMath</code></li> -<li><code>llvm::HonorSignDependentRoundingFPMathOption</code></li> -<li><code>llvm::HonorSignDependentRoundingFPMath()</code></li> -<li><code>llvm::UseSoftFloat</code></li> -<li><code>llvm::FloatABIType</code></li> -<li><code>llvm::NoZerosInBSS</code></li> -<li><code>llvm::JITExceptionHandling</code></li> -<li><code>llvm::JITEmitDebugInfo</code></li> -<li><code>llvm::JITEmitDebugInfoToDisk</code></li> -<li><code>llvm::GuaranteedTailCallOpt</code></li> -<li><code>llvm::StackAlignmentOverride</code></li> -<li><code>llvm::RealignStack</code></li> -<li><code>llvm::DisableJumpTables</code></li> -<li><code>llvm::EnableFastISel</code></li> -<li><code>llvm::getTrapFunctionName()</code></li> -<li><code>llvm::EnableSegmentedStacks</code></li> -</ul></li> - - <li>The <code>MDBuilder</code> class has been added to simplify the creation - of metadata.</li> + <li>...</li> </ul> </div> @@ -791,13 +662,8 @@ New features and major changes in the MIPS target include:</p> <p>In addition, some tools have changed in this release. Some of the changes are:</p> - <ul> - <li><tt>llvm-stress</tt> is a command line tool for generating random - <tt>.ll</tt> files to fuzz different LLVM components. </li> - <li>The <tt>llvm-ld</tt> tool has been removed. The clang driver provides a - more reliable solution for turning a set of bitcode files into a binary. - To merge bitcode files <tt>llvm-link</tt> can be used instead.</li> + <li>...</li> </ul> </div> @@ -811,19 +677,12 @@ New features and major changes in the MIPS target include:</p> <div> <p>Officially supported Python bindings have been added! Feature support is far -from complete. The current bindings support interfaces to:</p> + from complete. The current bindings support interfaces to:</p> + <ul> - <li>Object File Interface</li> - <li>Disassembler</li> + <li>...</li> </ul> -<p>Using the Object File Interface, it is possible to inspect binary object files. -Think of it as a Python version of readelf or llvm-objdump.</p> - -<p>Support for additional features is currently being developed by community -contributors. If you are interested in shaping the direction of the Python -bindings, please express your intent on IRC or the developers list.</p> - </div> </div> @@ -839,11 +698,11 @@ bindings, please express your intent on IRC or the developers list.</p> <p>LLVM is generally a production quality compiler, and is used by a broad range of applications and shipping in many products. That said, not every subsystem is as mature as the aggregate, particularly the more obscure - targets. If you run into a problem, please check the <a - href="http://llvm.org/bugs/">LLVM bug database</a> and submit a bug if - there isn't already one or ask on the <a - href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev - list</a>.</p> + targets. 
If you run into a problem, please check + the <a href="http://llvm.org/bugs/">LLVM bug database</a> and submit a bug if + there isn't already one or ask on + the <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev + list</a>.</p> <p>Known problem areas include:</p> @@ -851,7 +710,7 @@ bindings, please express your intent on IRC or the developers list.</p> <li>The CellSPU, MSP430, PTX and XCore backends are experimental.</li> <li>The integrated assembler, disassembler, and JIT is not supported by - several targets. If an integrated assembler is not supported, then a + several targets. If an integrated assembler is not supported, then a system assembler is required. For more details, see the <a href="CodeGenerator.html#targetfeatures">Target Features Matrix</a>. </li> @@ -890,7 +749,7 @@ bindings, please express your intent on IRC or the developers list.</p> src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-05-15 23:58:06 +0200 (Tue, 15 May 2012) $ + Last modified: $Date: 2012-07-13 14:44:23 +0200 (Fri, 13 Jul 2012) $ </address> </body> diff --git a/docs/SegmentedStacks.html b/docs/SegmentedStacks.html deleted file mode 100644 index 16f5507..0000000 --- a/docs/SegmentedStacks.html +++ /dev/null @@ -1,93 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> -<html> - <head> - <title>Segmented Stacks in LLVM</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> - <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> - </head> - - <body> - <h1>Segmented Stacks in LLVM</h1> - <div class="doc_author"> - <p>Written by <a href="mailto:sanjoy@playingwithpointers.com">Sanjoy Das</a></p> - </div> - - <ol> - <li><a href="#intro">Introduction</a></li> - <li><a href="#implementation">Implementation Details</a> - <ol> - <li><a href="#morestack">Allocating Stacklets</a></li> - <li><a href="#alloca">Variable Sized Allocas</a></li> - </ol> - </li> - </ol> - - <h2><a name="intro">Introduction</a></h2> - <div> - <p> - Segmented stack allows stack space to be allocated incrementally than as a monolithic chunk (of some worst case size) at thread initialization. This is done by allocating stack blocks (henceforth called <em>stacklets</em>) and linking them into a doubly linked list. The function prologue is responsible for checking if the current stacklet has enough space for the function to execute; and if not, call into the libgcc runtime to allocate more stack space. When using <tt>llc</tt>, segmented stacks can be enabled by adding <tt>-segmented-stacks</tt> to the command line. - </p> - <p> - The runtime functionality is <a href="http://gcc.gnu.org/wiki/SplitStacks">already there in libgcc</a>. - </p> - </div> - - <h2><a name="implementation">Implementation Details</a></h2> - <div> - <h3><a name="morestack">Allocating Stacklets</a></h3> - <div> - <p> - As mentioned above, the function prologue checks if the current stacklet has enough space. The current approach is to use a slot in the TCB to store the current stack limit (minus the amount of space needed to allocate a new block) - this slot's offset is again dictated by <code>libgcc</code>. 
The generated assembly looks like this on x86-64: - </p> - <pre> - leaq -8(%rsp), %r10 - cmpq %fs:112, %r10 - jg .LBB0_2 - - # More stack space needs to be allocated - movabsq $8, %r10 # The amount of space needed - movabsq $0, %r11 # The total size of arguments passed on stack - callq __morestack - ret # The reason for this extra return is explained below - .LBB0_2: - # Usual prologue continues here - </pre> - <p> - The size of function arguments on the stack needs to be passed to <code> __morestack</code> (this function is implemented in <code>libgcc</code>) since that number of bytes has to be copied from the previous stacklet to the current one. This is so that SP (and FP) relative addressing of function arguments work as expected. - </p> - <p> - The unusual <code>ret</code> is needed to have the function which made a call to <code>__morestack</code> return correctly. <code>__morestack</code>, instead of returning, calls into <code>.LBB0_2</code>. This is possible since both, the size of the <code>ret</code> instruction and the PC of call to <code>__morestack</code> are known. When the function body returns, control is transferred back to <code>__morestack</code>. <code>__morestack</code> then de-allocates the new stacklet, restores the correct SP value, and does a second return, which returns control to the correct caller. - </p> - </div> - - <h3><a name="alloca">Variable Sized Allocas</a></h3> - <div> - <p> - The section on <a href="#morestack">allocating stacklets</a> automatically assumes that every stack frame will be of fixed size. However, LLVM allows the use of the <code>llvm.alloca</code> intrinsic to allocate dynamically sized blocks of memory on the stack. When faced with such a variable-sized alloca, code is generated to - </p> - <ul> - <li>Check if the current stacklet has enough space. If yes, just bump the SP, like in the normal case.</li> - <li>If not, generate a call to <code>libgcc</code>, which allocates the memory from the heap.</li> - </ul> - <p> - The memory allocated from the heap is linked into a list in the current stacklet, and freed along with the same. This prevents a memory leak. - </p> - </div> - - </div> - - <hr> - <address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"> - <img src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"> - </a> - <a href="http://validator.w3.org/check/referer"> - <img src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"> - </a> - <a href="mailto:sanjoy@playingwithpointers.com">Sanjoy Das</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date$ - </address> - </body> -</html> - diff --git a/docs/SegmentedStacks.rst b/docs/SegmentedStacks.rst new file mode 100644 index 0000000..f97d62a --- /dev/null +++ b/docs/SegmentedStacks.rst @@ -0,0 +1,80 @@ +.. _segmented_stacks: + +======================== +Segmented Stacks in LLVM +======================== + +.. contents:: + :local: + +Introduction +============ + +Segmented stack allows stack space to be allocated incrementally than as a +monolithic chunk (of some worst case size) at thread initialization. This is +done by allocating stack blocks (henceforth called *stacklets*) and linking them +into a doubly linked list. The function prologue is responsible for checking if +the current stacklet has enough space for the function to execute; and if not, +call into the libgcc runtime to allocate more stack space. 
When using ``llc``, +segmented stacks can be enabled by adding ``-segmented-stacks`` to the command +line. + +The runtime functionality is `already there in libgcc +<http://gcc.gnu.org/wiki/SplitStacks>`_. + +Implementation Details +====================== + +.. _allocating stacklets: + +Allocating Stacklets +-------------------- + +As mentioned above, the function prologue checks if the current stacklet has +enough space. The current approach is to use a slot in the TCB to store the +current stack limit (minus the amount of space needed to allocate a new block) - +this slot's offset is again dictated by ``libgcc``. The generated +assembly looks like this on x86-64: + +.. code-block:: nasm + + leaq -8(%rsp), %r10 + cmpq %fs:112, %r10 + jg .LBB0_2 + + # More stack space needs to be allocated + movabsq $8, %r10 # The amount of space needed + movabsq $0, %r11 # The total size of arguments passed on stack + callq __morestack + ret # The reason for this extra return is explained below + .LBB0_2: + # Usual prologue continues here + +The size of function arguments on the stack needs to be passed to +``__morestack`` (this function is implemented in ``libgcc``) since that number +of bytes has to be copied from the previous stacklet to the current one. This is +so that SP (and FP) relative addressing of function arguments work as expected. + +The unusual ``ret`` is needed to have the function which made a call to +``__morestack`` return correctly. ``__morestack``, instead of returning, calls +into ``.LBB0_2``. This is possible since both, the size of the ``ret`` +instruction and the PC of call to ``__morestack`` are known. When the function +body returns, control is transferred back to ``__morestack``. ``__morestack`` +then de-allocates the new stacklet, restores the correct SP value, and does a +second return, which returns control to the correct caller. + +Variable Sized Allocas +---------------------- + +The section on `allocating stacklets`_ automatically assumes that every stack +frame will be of fixed size. However, LLVM allows the use of the ``llvm.alloca`` +intrinsic to allocate dynamically sized blocks of memory on the stack. When +faced with such a variable-sized alloca, code is generated to: + +* Check if the current stacklet has enough space. If yes, just bump the SP, like + in the normal case. +* If not, generate a call to ``libgcc``, which allocates the memory from the + heap. + +The memory allocated from the heap is linked into a list in the current +stacklet, and freed along with the same. This prevents a memory leak. diff --git a/docs/SourceLevelDebugging.html b/docs/SourceLevelDebugging.html index 259a259..bb72bf3 100644 --- a/docs/SourceLevelDebugging.html +++ b/docs/SourceLevelDebugging.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>Source Level Debugging with LLVM</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -77,10 +77,6 @@ </li> </ul> </td> -<td class="right"> -<img src="img/venusflytrap.jpg" alt="A leafy and green bug eater" width="247" -height="369"> -</td> </tr></table> <div class="doc_author"> @@ -2716,7 +2712,7 @@ HashData[hash_data_count] has address attributes: DW_AT_low_pc, DW_AT_high_pc, DW_AT_ranges or DW_AT_entry_pc. It also contains DW_TAG_variable DIEs that have a DW_OP_addr in the location (global and static variables). 
All global and static variables - should be included, including those scoped withing functions and classes. For + should be included, including those scoped within functions and classes. For example using the following code:</p> <div class="doc_code"> <pre> @@ -2855,7 +2851,7 @@ int main () <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-04-03 02:43:49 +0200 (Tue, 03 Apr 2012) $ + Last modified: $Date: 2012-06-02 12:20:22 +0200 (Sat, 02 Jun 2012) $ </address> </body> diff --git a/docs/SystemLibrary.html b/docs/SystemLibrary.html index 7cafedf..4b09e7c 100644 --- a/docs/SystemLibrary.html +++ b/docs/SystemLibrary.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>System Library</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -310,7 +310,7 @@ <a href="mailto:rspencer@x10sys.com">Reid Spencer</a><br> <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-31 12:21:59 +0100 (Mon, 31 Oct 2011) $ + Last modified: $Date: 2012-04-19 22:20:34 +0200 (Thu, 19 Apr 2012) $ </address> </body> </html> diff --git a/docs/TableGenFundamentals.html b/docs/TableGenFundamentals.html deleted file mode 100644 index a211389..0000000 --- a/docs/TableGenFundamentals.html +++ /dev/null @@ -1,973 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>TableGen Fundamentals</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> - -<h1>TableGen Fundamentals</h1> - -<div> -<ul> - <li><a href="#introduction">Introduction</a> - <ol> - <li><a href="#concepts">Basic concepts</a></li> - <li><a href="#example">An example record</a></li> - <li><a href="#running">Running TableGen</a></li> - </ol></li> - <li><a href="#syntax">TableGen syntax</a> - <ol> - <li><a href="#primitives">TableGen primitives</a> - <ol> - <li><a href="#comments">TableGen comments</a></li> - <li><a href="#types">The TableGen type system</a></li> - <li><a href="#values">TableGen values and expressions</a></li> - </ol></li> - <li><a href="#classesdefs">Classes and definitions</a> - <ol> - <li><a href="#valuedef">Value definitions</a></li> - <li><a href="#recordlet">'let' expressions</a></li> - <li><a href="#templateargs">Class template arguments</a></li> - <li><a href="#multiclass">Multiclass definitions and instances</a></li> - </ol></li> - <li><a href="#filescope">File scope entities</a> - <ol> - <li><a href="#include">File inclusion</a></li> - <li><a href="#globallet">'let' expressions</a></li> - <li><a href="#foreach">'foreach' blocks</a></li> - </ol></li> - </ol></li> - <li><a href="#backends">TableGen backends</a> - <ol> - <li><a href="#">todo</a></li> - </ol></li> -</ul> -</div> - -<div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> -</div> - -<!-- *********************************************************************** --> -<h2><a name="introduction">Introduction</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>TableGen's purpose is to help a human develop and maintain records of -domain-specific information. 
Because there may be a large number of these -records, it is specifically designed to allow writing flexible descriptions and -for common features of these records to be factored out. This reduces the -amount of duplication in the description, reduces the chance of error, and -makes it easier to structure domain specific information.</p> - -<p>The core part of TableGen <a href="#syntax">parses a file</a>, instantiates -the declarations, and hands the result off to a domain-specific "<a -href="#backends">TableGen backend</a>" for processing. The current major user -of TableGen is the <a href="CodeGenerator.html">LLVM code generator</a>.</p> - -<p>Note that if you work on TableGen much, and use emacs or vim, that you can -find an emacs "TableGen mode" and a vim language file in the -<tt>llvm/utils/emacs</tt> and <tt>llvm/utils/vim</tt> directories of your LLVM -distribution, respectively.</p> - -<!-- ======================================================================= --> -<h3><a name="concepts">Basic concepts</a></h3> - -<div> - -<p>TableGen files consist of two key parts: 'classes' and 'definitions', both -of which are considered 'records'.</p> - -<p><b>TableGen records</b> have a unique name, a list of values, and a list of -superclasses. The list of values is the main data that TableGen builds for each -record; it is this that holds the domain specific information for the -application. The interpretation of this data is left to a specific <a -href="#backends">TableGen backend</a>, but the structure and format rules are -taken care of and are fixed by TableGen.</p> - -<p><b>TableGen definitions</b> are the concrete form of 'records'. These -generally do not have any undefined values, and are marked with the -'<tt>def</tt>' keyword.</p> - -<p><b>TableGen classes</b> are abstract records that are used to build and -describe other records. These 'classes' allow the end-user to build -abstractions for either the domain they are targeting (such as "Register", -"RegisterClass", and "Instruction" in the LLVM code generator) or for the -implementor to help factor out common properties of records (such as "FPInst", -which is used to represent floating point instructions in the X86 backend). -TableGen keeps track of all of the classes that are used to build up a -definition, so the backend can find all definitions of a particular class, such -as "Instruction".</p> - -<p><b>TableGen multiclasses</b> are groups of abstract records that are -instantiated all at once. Each instantiation can result in multiple -TableGen definitions. If a multiclass inherits from another multiclass, -the definitions in the sub-multiclass become part of the current -multiclass, as if they were declared in the current multiclass.</p> - -</div> - -<!-- ======================================================================= --> -<h3><a name="example">An example record</a></h3> - -<div> - -<p>With no other arguments, TableGen parses the specified file and prints out -all of the classes, then all of the definitions. This is a good way to see what -the various definitions expand to fully. Running this on the <tt>X86.td</tt> -file prints this (at the time of this writing):</p> - -<div class="doc_code"> -<pre> -... 
-<b>def</b> ADD32rr { <i>// Instruction X86Inst I</i> - <b>string</b> Namespace = "X86"; - <b>dag</b> OutOperandList = (outs GR32:$dst); - <b>dag</b> InOperandList = (ins GR32:$src1, GR32:$src2); - <b>string</b> AsmString = "add{l}\t{$src2, $dst|$dst, $src2}"; - <b>list</b><dag> Pattern = [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]; - <b>list</b><Register> Uses = []; - <b>list</b><Register> Defs = [EFLAGS]; - <b>list</b><Predicate> Predicates = []; - <b>int</b> CodeSize = 3; - <b>int</b> AddedComplexity = 0; - <b>bit</b> isReturn = 0; - <b>bit</b> isBranch = 0; - <b>bit</b> isIndirectBranch = 0; - <b>bit</b> isBarrier = 0; - <b>bit</b> isCall = 0; - <b>bit</b> canFoldAsLoad = 0; - <b>bit</b> mayLoad = 0; - <b>bit</b> mayStore = 0; - <b>bit</b> isImplicitDef = 0; - <b>bit</b> isConvertibleToThreeAddress = 1; - <b>bit</b> isCommutable = 1; - <b>bit</b> isTerminator = 0; - <b>bit</b> isReMaterializable = 0; - <b>bit</b> isPredicable = 0; - <b>bit</b> hasDelaySlot = 0; - <b>bit</b> usesCustomInserter = 0; - <b>bit</b> hasCtrlDep = 0; - <b>bit</b> isNotDuplicable = 0; - <b>bit</b> hasSideEffects = 0; - <b>bit</b> neverHasSideEffects = 0; - InstrItinClass Itinerary = NoItinerary; - <b>string</b> Constraints = ""; - <b>string</b> DisableEncoding = ""; - <b>bits</b><8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 }; - Format Form = MRMDestReg; - <b>bits</b><6> FormBits = { 0, 0, 0, 0, 1, 1 }; - ImmType ImmT = NoImm; - <b>bits</b><3> ImmTypeBits = { 0, 0, 0 }; - <b>bit</b> hasOpSizePrefix = 0; - <b>bit</b> hasAdSizePrefix = 0; - <b>bits</b><4> Prefix = { 0, 0, 0, 0 }; - <b>bit</b> hasREX_WPrefix = 0; - FPFormat FPForm = ?; - <b>bits</b><3> FPFormBits = { 0, 0, 0 }; -} -... -</pre> -</div> - -<p>This definition corresponds to a 32-bit register-register add instruction in -the X86. The string after the '<tt>def</tt>' string indicates the name of the -record—"<tt>ADD32rr</tt>" in this case—and the comment at the end of -the line indicates the superclasses of the definition. The body of the record -contains all of the data that TableGen assembled for the record, indicating that -the instruction is part of the "X86" namespace, the pattern indicating how the -the instruction should be emitted into the assembly file, that it is a -two-address instruction, has a particular encoding, etc. The contents and -semantics of the information in the record is specific to the needs of the X86 -backend, and is only shown as an example.</p> - -<p>As you can see, a lot of information is needed for every instruction -supported by the code generator, and specifying it all manually would be -unmaintainable, prone to bugs, and tiring to do in the first place. Because we -are using TableGen, all of the information was derived from the following -definition:</p> - -<div class="doc_code"> -<pre> -let Defs = [EFLAGS], - isCommutable = 1, <i>// X = ADD Y,Z --> X = ADD Z,Y</i> - isConvertibleToThreeAddress = 1 <b>in</b> <i>// Can transform into LEA.</i> -def ADD32rr : I<0x01, MRMDestReg, (outs GR32:$dst), - (ins GR32:$src1, GR32:$src2), - "add{l}\t{$src2, $dst|$dst, $src2}", - [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>; -</pre> -</div> - -<p>This definition makes use of the custom class <tt>I</tt> (extended from the -custom class <tt>X86Inst</tt>), which is defined in the X86-specific TableGen -file, to factor out the common features that instructions of its class share. 
A -key feature of TableGen is that it allows the end-user to define the -abstractions they prefer to use when describing their information.</p> - -<p>Each def record has a special entry called "NAME." This is the -name of the def ("ADD32rr" above). In the general case def names can -be formed from various kinds of string processing expressions and NAME -resolves to the final value obtained after resolving all of those -expressions. The user may refer to NAME anywhere she desires to use -the ultimate name of the def. NAME should not be defined anywhere -else in user code to avoid conflict problems.</p> - -</div> - -<!-- ======================================================================= --> -<h3><a name="running">Running TableGen</a></h3> - -<div> - -<p>TableGen runs just like any other LLVM tool. The first (optional) argument -specifies the file to read. If a filename is not specified, <tt>tblgen</tt> -reads from standard input.</p> - -<p>To be useful, one of the <a href="#backends">TableGen backends</a> must be -used. These backends are selectable on the command line (type '<tt>tblgen --help</tt>' for a list). For example, to get a list of all of the definitions -that subclass a particular type (which can be useful for building up an enum -list of these records), use the <tt>-print-enums</tt> option:</p> - -<div class="doc_code"> -<pre> -$ tblgen X86.td -print-enums -class=Register -AH, AL, AX, BH, BL, BP, BPL, BX, CH, CL, CX, DH, DI, DIL, DL, DX, EAX, EBP, EBX, -ECX, EDI, EDX, EFLAGS, EIP, ESI, ESP, FP0, FP1, FP2, FP3, FP4, FP5, FP6, IP, -MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10W, R11, R11B, R11D, -R11W, R12, R12B, R12D, R12W, R13, R13B, R13D, R13W, R14, R14B, R14D, R14W, R15, -R15B, R15D, R15W, R8, R8B, R8D, R8W, R9, R9B, R9D, R9W, RAX, RBP, RBX, RCX, RDI, -RDX, RIP, RSI, RSP, SI, SIL, SP, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7, -XMM0, XMM1, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM2, XMM3, XMM4, XMM5, -XMM6, XMM7, XMM8, XMM9, - -$ tblgen X86.td -print-enums -class=Instruction -ABS_F, ABS_Fp32, ABS_Fp64, ABS_Fp80, ADC32mi, ADC32mi8, ADC32mr, ADC32ri, -ADC32ri8, ADC32rm, ADC32rr, ADC64mi32, ADC64mi8, ADC64mr, ADC64ri32, ADC64ri8, -ADC64rm, ADC64rr, ADD16mi, ADD16mi8, ADD16mr, ADD16ri, ADD16ri8, ADD16rm, -ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr, -ADD64mi32, ADD64mi8, ADD64mr, ADD64ri32, ... -</pre> -</div> - -<p>The default backend prints out all of the records, as described <a -href="#example">above</a>.</p> - -<p>If you plan to use TableGen, you will most likely have to <a -href="#backends">write a backend</a> that extracts the information specific to -what you need and formats it in the appropriate way.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="syntax">TableGen syntax</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>TableGen doesn't care about the meaning of data (that is up to the backend to -define), but it does care about syntax, and it enforces a simple type system. -This section describes the syntax and the constructs allowed in a TableGen file. 
-</p> - -<!-- ======================================================================= --> -<h3><a name="primitives">TableGen primitives</a></h3> - -<div> - -<!-- --------------------------------------------------------------------------> -<h4><a name="comments">TableGen comments</a></h4> - -<div> - -<p>TableGen supports BCPL style "<tt>//</tt>" comments, which run to the end of -the line, and it also supports <b>nestable</b> "<tt>/* */</tt>" comments.</p> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="types">The TableGen type system</a> -</h4> - -<div> - -<p>TableGen files are strongly typed, in a simple (but complete) type-system. -These types are used to perform automatic conversions, check for errors, and to -help interface designers constrain the input that they allow. Every <a -href="#valuedef">value definition</a> is required to have an associated type. -</p> - -<p>TableGen supports a mixture of very low-level types (such as <tt>bit</tt>) -and very high-level types (such as <tt>dag</tt>). This flexibility is what -allows it to describe a wide range of information conveniently and compactly. -The TableGen types are:</p> - -<dl> -<dt><tt><b>bit</b></tt></dt> - <dd>A 'bit' is a boolean value that can hold either 0 or 1.</dd> - -<dt><tt><b>int</b></tt></dt> - <dd>The 'int' type represents a simple 32-bit integer value, such as 5.</dd> - -<dt><tt><b>string</b></tt></dt> - <dd>The 'string' type represents an ordered sequence of characters of - arbitrary length.</dd> - -<dt><tt><b>bits</b><n></tt></dt> - <dd>A 'bits' type is an arbitrary, but fixed, size integer that is broken up - into individual bits. This type is useful because it can handle some bits - being defined while others are undefined.</dd> - -<dt><tt><b>list</b><ty></tt></dt> - <dd>This type represents a list whose elements are some other type. The - contained type is arbitrary: it can even be another list type.</dd> - -<dt>Class type</dt> - <dd>Specifying a class name in a type context means that the defined value - must be a subclass of the specified class. This is useful in conjunction with - the <b><tt>list</tt></b> type, for example, to constrain the elements of the - list to a common base class (e.g., a <tt><b>list</b><Register></tt> can - only contain definitions derived from the "<tt>Register</tt>" class).</dd> - -<dt><tt><b>dag</b></tt></dt> - <dd>This type represents a nestable directed graph of elements.</dd> - -<dt><tt><b>code</b></tt></dt> - <dd>This represents a big hunk of text. This is lexically distinct from - string values because it doesn't require escapeing double quotes and other - common characters that occur in code.</dd> -</dl> - -<p>To date, these types have been sufficient for describing things that -TableGen has been used for, but it is straight-forward to extend this list if -needed.</p> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="values">TableGen values and expressions</a> -</h4> - -<div> - -<p>TableGen allows for a pretty reasonable number of different expression forms -when building up values. These forms allow the TableGen file to be written in a -natural syntax and flavor for the application. 
The current expression forms -supported include:</p> - -<dl> -<dt><tt>?</tt></dt> - <dd>uninitialized field</dd> -<dt><tt>0b1001011</tt></dt> - <dd>binary integer value</dd> -<dt><tt>07654321</tt></dt> - <dd>octal integer value (indicated by a leading 0)</dd> -<dt><tt>7</tt></dt> - <dd>decimal integer value</dd> -<dt><tt>0x7F</tt></dt> - <dd>hexadecimal integer value</dd> -<dt><tt>"foo"</tt></dt> - <dd>string value</dd> -<dt><tt>[{ ... }]</tt></dt> - <dd>code fragment</dd> -<dt><tt>[ X, Y, Z ]<type></tt></dt> - <dd>list value. <type> is the type of the list -element and is usually optional. In rare cases, -TableGen is unable to deduce the element type in -which case the user must specify it explicitly.</dd> -<dt><tt>{ a, b, c }</tt></dt> - <dd>initializer for a "bits<3>" value</dd> -<dt><tt>value</tt></dt> - <dd>value reference</dd> -<dt><tt>value{17}</tt></dt> - <dd>access to one bit of a value</dd> -<dt><tt>value{15-17}</tt></dt> - <dd>access to multiple bits of a value</dd> -<dt><tt>DEF</tt></dt> - <dd>reference to a record definition</dd> -<dt><tt>CLASS<val list></tt></dt> - <dd>reference to a new anonymous definition of CLASS with the specified - template arguments.</dd> -<dt><tt>X.Y</tt></dt> - <dd>reference to the subfield of a value</dd> -<dt><tt>list[4-7,17,2-3]</tt></dt> - <dd>A slice of the 'list' list, including elements 4,5,6,7,17,2, and 3 from - it. Elements may be included multiple times.</dd> -<dt><tt>foreach <var> = <list> in { <body> }</tt></dt> -<dt><tt>foreach <var> = <list> in <def></tt></dt> - <dd> Replicate <body> or <def>, replacing instances of - <var> with each value in <list>. <var> is scoped at the - level of the <tt>foreach</tt> loop and must not conflict with any other object - introduced in <body> or <def>. Currently only <tt>def</tt>s are - expanded within <body>. - </dd> -<dt><tt>(DEF a, b)</tt></dt> - <dd>a dag value. The first element is required to be a record definition, the - remaining elements in the list may be arbitrary other values, including nested - `<tt>dag</tt>' values.</dd> -<dt><tt>!strconcat(a, b)</tt></dt> - <dd>A string value that is the result of concatenating the 'a' and 'b' - strings.</dd> -<dt><tt>str1#str2</tt></dt> - <dd>"#" (paste) is a shorthand for !strconcat. It may concatenate - things that are not quoted strings, in which case an implicit - !cast<string> is done on the operand of the paste.</dd> -<dt><tt>!cast<type>(a)</tt></dt> - <dd>A symbol of type <em>type</em> obtained by looking up the string 'a' in -the symbol table. If the type of 'a' does not match <em>type</em>, TableGen -aborts with an error. !cast<string> is a special case in that the argument must -be an object defined by a 'def' construct.</dd> -<dt><tt>!subst(a, b, c)</tt></dt> - <dd>If 'a' and 'b' are of string type or are symbol references, substitute -'b' for 'a' in 'c.' This operation is analogous to $(subst) in GNU make.</dd> -<dt><tt>!foreach(a, b, c)</tt></dt> - <dd>For each member 'b' of dag or list 'a' apply operator 'c.' 'b' is a -dummy variable that should be declared as a member variable of an instantiated -class. 
This operation is analogous to $(foreach) in GNU make.</dd> -<dt><tt>!head(a)</tt></dt> - <dd>The first element of list 'a.'</dd> -<dt><tt>!tail(a)</tt></dt> - <dd>The 2nd-N elements of list 'a.'</dd> -<dt><tt>!empty(a)</tt></dt> - <dd>An integer {0,1} indicating whether list 'a' is empty.</dd> -<dt><tt>!if(a,b,c)</tt></dt> - <dd>'b' if the result of 'int' or 'bit' operator 'a' is nonzero, - 'c' otherwise.</dd> -<dt><tt>!eq(a,b)</tt></dt> - <dd>'bit 1' if string a is equal to string b, 0 otherwise. This - only operates on string, int and bit objects. Use !cast<string> to - compare other types of objects.</dd> -</dl> - -<p>Note that all of the values have rules specifying how they convert to values -for different types. These rules allow you to assign a value like "<tt>7</tt>" -to a "<tt>bits<4></tt>" value, for example.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="classesdefs">Classes and definitions</a> -</h3> - -<div> - -<p>As mentioned in the <a href="#concepts">intro</a>, classes and definitions -(collectively known as 'records') in TableGen are the main high-level unit of -information that TableGen collects. Records are defined with a <tt>def</tt> or -<tt>class</tt> keyword, the record name, and an optional list of "<a -href="#templateargs">template arguments</a>". If the record has superclasses, -they are specified as a comma separated list that starts with a colon character -("<tt>:</tt>"). If <a href="#valuedef">value definitions</a> or <a -href="#recordlet">let expressions</a> are needed for the class, they are -enclosed in curly braces ("<tt>{}</tt>"); otherwise, the record ends with a -semicolon.</p> - -<p>Here is a simple TableGen file:</p> - -<div class="doc_code"> -<pre> -<b>class</b> C { <b>bit</b> V = 1; } -<b>def</b> X : C; -<b>def</b> Y : C { - <b>string</b> Greeting = "hello"; -} -</pre> -</div> - -<p>This example defines two definitions, <tt>X</tt> and <tt>Y</tt>, both of -which derive from the <tt>C</tt> class. Because of this, they both get the -<tt>V</tt> bit value. The <tt>Y</tt> definition also gets the Greeting member -as well.</p> - -<p>In general, classes are useful for collecting together the commonality -between a group of records and isolating it in a single place. Also, classes -permit the specification of default values for their subclasses, allowing the -subclasses to override them as they wish.</p> - -<!----------------------------------------------------------------------------> -<h4> - <a name="valuedef">Value definitions</a> -</h4> - -<div> - -<p>Value definitions define named entries in records. A value must be defined -before it can be referred to as the operand for another value definition or -before the value is reset with a <a href="#recordlet">let expression</a>. A -value is defined by specifying a <a href="#types">TableGen type</a> and a name. -If an initial value is available, it may be specified after the type with an -equal sign. Value definitions require terminating semicolons.</p> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="recordlet">'let' expressions</a> -</h4> - -<div> - -<p>A record-level let expression is used to change the value of a value -definition in a record. This is primarily useful when a superclass defines a -value that a derived class or definition wants to override. 
Let expressions -consist of the '<tt>let</tt>' keyword followed by a value name, an equal sign -("<tt>=</tt>"), and a new value. For example, a new class could be added to the -example above, redefining the <tt>V</tt> field for all of its subclasses:</p> - -<div class="doc_code"> -<pre> -<b>class</b> D : C { let V = 0; } -<b>def</b> Z : D; -</pre> -</div> - -<p>In this case, the <tt>Z</tt> definition will have a zero value for its "V" -value, despite the fact that it derives (indirectly) from the <tt>C</tt> class, -because the <tt>D</tt> class overrode its value.</p> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="templateargs">Class template arguments</a> -</h4> - -<div> - -<p>TableGen permits the definition of parameterized classes as well as normal -concrete classes. Parameterized TableGen classes specify a list of variable -bindings (which may optionally have defaults) that are bound when used. Here is -a simple example:</p> - -<div class="doc_code"> -<pre> -<b>class</b> FPFormat<<b>bits</b><3> val> { - <b>bits</b><3> Value = val; -} -<b>def</b> NotFP : FPFormat<0>; -<b>def</b> ZeroArgFP : FPFormat<1>; -<b>def</b> OneArgFP : FPFormat<2>; -<b>def</b> OneArgFPRW : FPFormat<3>; -<b>def</b> TwoArgFP : FPFormat<4>; -<b>def</b> CompareFP : FPFormat<5>; -<b>def</b> CondMovFP : FPFormat<6>; -<b>def</b> SpecialFP : FPFormat<7>; -</pre> -</div> - -<p>In this case, template arguments are used as a space efficient way to specify -a list of "enumeration values", each with a "<tt>Value</tt>" field set to the -specified integer.</p> - -<p>The more esoteric forms of <a href="#values">TableGen expressions</a> are -useful in conjunction with template arguments. As an example:</p> - -<div class="doc_code"> -<pre> -<b>class</b> ModRefVal<<b>bits</b><2> val> { - <b>bits</b><2> Value = val; -} - -<b>def</b> None : ModRefVal<0>; -<b>def</b> Mod : ModRefVal<1>; -<b>def</b> Ref : ModRefVal<2>; -<b>def</b> ModRef : ModRefVal<3>; - -<b>class</b> Value<ModRefVal MR> { - <i>// Decode some information into a more convenient format, while providing - // a nice interface to the user of the "Value" class.</i> - <b>bit</b> isMod = MR.Value{0}; - <b>bit</b> isRef = MR.Value{1}; - - <i>// other stuff...</i> -} - -<i>// Example uses</i> -<b>def</b> bork : Value<Mod>; -<b>def</b> zork : Value<Ref>; -<b>def</b> hork : Value<ModRef>; -</pre> -</div> - -<p>This is obviously a contrived example, but it shows how template arguments -can be used to decouple the interface provided to the user of the class from the -actual internal data representation expected by the class. In this case, -running <tt>tblgen</tt> on the example prints the following definitions:</p> - -<div class="doc_code"> -<pre> -<b>def</b> bork { <i>// Value</i> - <b>bit</b> isMod = 1; - <b>bit</b> isRef = 0; -} -<b>def</b> hork { <i>// Value</i> - <b>bit</b> isMod = 1; - <b>bit</b> isRef = 1; -} -<b>def</b> zork { <i>// Value</i> - <b>bit</b> isMod = 0; - <b>bit</b> isRef = 1; -} -</pre> -</div> - -<p> This shows that TableGen was able to dig into the argument and extract a -piece of information that was requested by the designer of the "Value" class. 
-For more realistic examples, please see existing users of TableGen, such as the -X86 backend.</p> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="multiclass">Multiclass definitions and instances</a> -</h4> - -<div> - -<p> -While classes with template arguments are a good way to factor commonality -between two instances of a definition, multiclasses allow a convenient notation -for defining multiple definitions at once (instances of implicitly constructed -classes). For example, consider an 3-address instruction set whose instructions -come in two forms: "<tt>reg = reg op reg</tt>" and "<tt>reg = reg op imm</tt>" -(e.g. SPARC). In this case, you'd like to specify in one place that this -commonality exists, then in a separate place indicate what all the ops are. -</p> - -<p> -Here is an example TableGen fragment that shows this idea: -</p> - -<div class="doc_code"> -<pre> -<b>def</b> ops; -<b>def</b> GPR; -<b>def</b> Imm; -<b>class</b> inst<<b>int</b> opc, <b>string</b> asmstr, <b>dag</b> operandlist>; - -<b>multiclass</b> ri_inst<<b>int</b> opc, <b>string</b> asmstr> { - def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), - (ops GPR:$dst, GPR:$src1, GPR:$src2)>; - def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), - (ops GPR:$dst, GPR:$src1, Imm:$src2)>; -} - -<i>// Instantiations of the ri_inst multiclass.</i> -<b>defm</b> ADD : ri_inst<0b111, "add">; -<b>defm</b> SUB : ri_inst<0b101, "sub">; -<b>defm</b> MUL : ri_inst<0b100, "mul">; -... -</pre> -</div> - -<p>The name of the resultant definitions has the multidef fragment names - appended to them, so this defines <tt>ADD_rr</tt>, <tt>ADD_ri</tt>, - <tt>SUB_rr</tt>, etc. A defm may inherit from multiple multiclasses, - instantiating definitions from each multiclass. Using a multiclass - this way is exactly equivalent to instantiating the classes multiple - times yourself, e.g. by writing:</p> - -<div class="doc_code"> -<pre> -<b>def</b> ops; -<b>def</b> GPR; -<b>def</b> Imm; -<b>class</b> inst<<b>int</b> opc, <b>string</b> asmstr, <b>dag</b> operandlist>; - -<b>class</b> rrinst<<b>int</b> opc, <b>string</b> asmstr> - : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), - (ops GPR:$dst, GPR:$src1, GPR:$src2)>; - -<b>class</b> riinst<<b>int</b> opc, <b>string</b> asmstr> - : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), - (ops GPR:$dst, GPR:$src1, Imm:$src2)>; - -<i>// Instantiations of the ri_inst multiclass.</i> -<b>def</b> ADD_rr : rrinst<0b111, "add">; -<b>def</b> ADD_ri : riinst<0b111, "add">; -<b>def</b> SUB_rr : rrinst<0b101, "sub">; -<b>def</b> SUB_ri : riinst<0b101, "sub">; -<b>def</b> MUL_rr : rrinst<0b100, "mul">; -<b>def</b> MUL_ri : riinst<0b100, "mul">; -... -</pre> -</div> - -<p> -A defm can also be used inside a multiclass providing several levels of -multiclass instanciations. -</p> - -<div class="doc_code"> -<pre> -<b>class</b> Instruction<bits<4> opc, string Name> { - bits<4> opcode = opc; - string name = Name; -} - -<b>multiclass</b> basic_r<bits<4> opc> { - <b>def</b> rr : Instruction<opc, "rr">; - <b>def</b> rm : Instruction<opc, "rm">; -} - -<b>multiclass</b> basic_s<bits<4> opc> { - <b>defm</b> SS : basic_r<opc>; - <b>defm</b> SD : basic_r<opc>; - <b>def</b> X : Instruction<opc, "x">; -} - -<b>multiclass</b> basic_p<bits<4> opc> { - <b>defm</b> PS : basic_r<opc>; - <b>defm</b> PD : basic_r<opc>; - <b>def</b> Y : Instruction<opc, "y">; -} - -<b>defm</b> ADD : basic_s<0xf>, basic_p<0xf>; -... 
- -<i>// Results</i> -<b>def</b> ADDPDrm { ... -<b>def</b> ADDPDrr { ... -<b>def</b> ADDPSrm { ... -<b>def</b> ADDPSrr { ... -<b>def</b> ADDSDrm { ... -<b>def</b> ADDSDrr { ... -<b>def</b> ADDY { ... -<b>def</b> ADDX { ... -</pre> -</div> - -<p> -defm declarations can inherit from classes too, the -rule to follow is that the class list must start after the -last multiclass, and there must be at least one multiclass -before them. -</p> - -<div class="doc_code"> -<pre> -<b>class</b> XD { bits<4> Prefix = 11; } -<b>class</b> XS { bits<4> Prefix = 12; } - -<b>class</b> I<bits<4> op> { - bits<4> opcode = op; -} - -<b>multiclass</b> R { - <b>def</b> rr : I<4>; - <b>def</b> rm : I<2>; -} - -<b>multiclass</b> Y { - <b>defm</b> SS : R, XD; - <b>defm</b> SD : R, XS; -} - -<b>defm</b> Instr : Y; - -<i>// Results</i> -<b>def</b> InstrSDrm { - bits<4> opcode = { 0, 0, 1, 0 }; - bits<4> Prefix = { 1, 1, 0, 0 }; -} -... -<b>def</b> InstrSSrr { - bits<4> opcode = { 0, 1, 0, 0 }; - bits<4> Prefix = { 1, 0, 1, 1 }; -} -</pre> -</div> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="filescope">File scope entities</a> -</h3> - -<div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="include">File inclusion</a> -</h4> - -<div> -<p>TableGen supports the '<tt>include</tt>' token, which textually substitutes -the specified file in place of the include directive. The filename should be -specified as a double quoted string immediately after the '<tt>include</tt>' -keyword. Example:</p> - -<div class="doc_code"> -<pre> -<b>include</b> "foo.td" -</pre> -</div> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="globallet">'let' expressions</a> -</h4> - -<div> - -<p>"Let" expressions at file scope are similar to <a href="#recordlet">"let" -expressions within a record</a>, except they can specify a value binding for -multiple records at a time, and may be useful in certain other cases. -File-scope let expressions are really just another way that TableGen allows the -end-user to factor out commonality from the records.</p> - -<p>File-scope "let" expressions take a comma-separated list of bindings to -apply, and one or more records to bind the values in. 
Here are some -examples:</p> - -<div class="doc_code"> -<pre> -<b>let</b> isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 <b>in</b> - <b>def</b> RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>; - -<b>let</b> isCall = 1 <b>in</b> - <i>// All calls clobber the non-callee saved registers...</i> - <b>let</b> Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0, - MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, - XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] <b>in</b> { - <b>def</b> CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops), - "call\t${dst:call}", []>; - <b>def</b> CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops), - "call\t{*}$dst", [(X86call GR32:$dst)]>; - <b>def</b> CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops), - "call\t{*}$dst", []>; - } -</pre> -</div> - -<p>File-scope "let" expressions are often useful when a couple of definitions -need to be added to several records, and the records do not otherwise need to be -opened, as in the case with the <tt>CALL*</tt> instructions above.</p> - -<p>It's also possible to use "let" expressions inside multiclasses, providing -more ways to factor out commonality from the records, specially if using -several levels of multiclass instanciations. This also avoids the need of using -"let" expressions within subsequent records inside a multiclass.</p> - -<pre class="doc_code"> -<b>multiclass </b>basic_r<bits<4> opc> { - <b>let </b>Predicates = [HasSSE2] in { - <b>def </b>rr : Instruction<opc, "rr">; - <b>def </b>rm : Instruction<opc, "rm">; - } - <b>let </b>Predicates = [HasSSE3] in - <b>def </b>rx : Instruction<opc, "rx">; -} - -<b>multiclass </b>basic_ss<bits<4> opc> { - <b>let </b>IsDouble = 0 in - <b>defm </b>SS : basic_r<opc>; - - <b>let </b>IsDouble = 1 in - <b>defm </b>SD : basic_r<opc>; -} - -<b>defm </b>ADD : basic_ss<0xf>; -</pre> -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="foreach">Looping</a> -</h4> - -<div> -<p>TableGen supports the '<tt>foreach</tt>' block, which textually replicates -the loop body, substituting iterator values for iterator references in the -body. Example:</p> - -<div class="doc_code"> -<pre> -<b>foreach</b> i = [0, 1, 2, 3] in { - <b>def</b> R#i : Register<...>; - <b>def</b> F#i : Register<...>; -} -</pre> -</div> - -<p>This will create objects <tt>R0</tt>, <tt>R1</tt>, <tt>R2</tt> and -<tt>R3</tt>. <tt>foreach</tt> blocks may be nested. If there is only -one item in the body the braces may be elided:</p> - -<div class="doc_code"> -<pre> -<b>foreach</b> i = [0, 1, 2, 3] in - <b>def</b> R#i : Register<...>; - -</pre> -</div> - -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="codegen">Code Generator backend info</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>Expressions used by code generator to describe instructions and isel -patterns:</p> - -<dl> -<dt><tt>(implicit a)</tt></dt> - <dd>an implicitly defined physical register. 
This tells the dag instruction - selection emitter the input pattern's extra definitions matches implicit - physical register definitions.</dd> -</dl> -</div> - -<!-- *********************************************************************** --> -<h2><a name="backends">TableGen backends</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>TODO: How they work, how to write one. This section should not contain -details about any particular backend, except maybe -print-enums as an example. -This should highlight the APIs in <tt>TableGen/Record.h</tt>.</p> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-03-27 13:25:16 +0200 (Tue, 27 Mar 2012) $ -</address> - -</body> -</html> diff --git a/docs/TableGenFundamentals.rst b/docs/TableGenFundamentals.rst new file mode 100644 index 0000000..bfb2618 --- /dev/null +++ b/docs/TableGenFundamentals.rst @@ -0,0 +1,799 @@ +.. _tablegen: + +===================== +TableGen Fundamentals +===================== + +.. contents:: + :local: + +Introduction +============ + +TableGen's purpose is to help a human develop and maintain records of +domain-specific information. Because there may be a large number of these +records, it is specifically designed to allow writing flexible descriptions and +for common features of these records to be factored out. This reduces the +amount of duplication in the description, reduces the chance of error, and makes +it easier to structure domain specific information. + +The core part of TableGen `parses a file`_, instantiates the declarations, and +hands the result off to a domain-specific `TableGen backend`_ for processing. +The current major user of TableGen is the `LLVM code +generator <CodeGenerator.html>`_. + +Note that if you work on TableGen much, and use emacs or vim, that you can find +an emacs "TableGen mode" and a vim language file in the ``llvm/utils/emacs`` and +``llvm/utils/vim`` directories of your LLVM distribution, respectively. + +.. _intro: + +Basic concepts +-------------- + +TableGen files consist of two key parts: 'classes' and 'definitions', both of +which are considered 'records'. + +**TableGen records** have a unique name, a list of values, and a list of +superclasses. The list of values is the main data that TableGen builds for each +record; it is this that holds the domain specific information for the +application. The interpretation of this data is left to a specific `TableGen +backend`_, but the structure and format rules are taken care of and are fixed by +TableGen. + +**TableGen definitions** are the concrete form of 'records'. These generally do +not have any undefined values, and are marked with the '``def``' keyword. + +**TableGen classes** are abstract records that are used to build and describe +other records. 
These 'classes' allow the end-user to build abstractions for
+either the domain they are targeting (such as "Register", "RegisterClass", and
+"Instruction" in the LLVM code generator) or for the implementor to help factor
+out common properties of records (such as "FPInst", which is used to represent
+floating point instructions in the X86 backend). TableGen keeps track of all of
+the classes that are used to build up a definition, so the backend can find all
+definitions of a particular class, such as "Instruction".
+
+**TableGen multiclasses** are groups of abstract records that are instantiated
+all at once. Each instantiation can result in multiple TableGen definitions.
+If a multiclass inherits from another multiclass, the definitions in the
+sub-multiclass become part of the current multiclass, as if they were declared
+in the current multiclass.
+
+.. _described above:
+
+An example record
+-----------------
+
+With no other arguments, TableGen parses the specified file and prints out all
+of the classes, then all of the definitions. This is a good way to see what the
+various definitions expand to fully. Running this on the ``X86.td`` file prints
+this (at the time of this writing):
+
+.. code-block:: llvm
+
+  ...
+  def ADD32rr {   // Instruction X86Inst I
+    string Namespace = "X86";
+    dag OutOperandList = (outs GR32:$dst);
+    dag InOperandList = (ins GR32:$src1, GR32:$src2);
+    string AsmString = "add{l}\t{$src2, $dst|$dst, $src2}";
+    list<dag> Pattern = [(set GR32:$dst, (add GR32:$src1, GR32:$src2))];
+    list<Register> Uses = [];
+    list<Register> Defs = [EFLAGS];
+    list<Predicate> Predicates = [];
+    int CodeSize = 3;
+    int AddedComplexity = 0;
+    bit isReturn = 0;
+    bit isBranch = 0;
+    bit isIndirectBranch = 0;
+    bit isBarrier = 0;
+    bit isCall = 0;
+    bit canFoldAsLoad = 0;
+    bit mayLoad = 0;
+    bit mayStore = 0;
+    bit isImplicitDef = 0;
+    bit isConvertibleToThreeAddress = 1;
+    bit isCommutable = 1;
+    bit isTerminator = 0;
+    bit isReMaterializable = 0;
+    bit isPredicable = 0;
+    bit hasDelaySlot = 0;
+    bit usesCustomInserter = 0;
+    bit hasCtrlDep = 0;
+    bit isNotDuplicable = 0;
+    bit hasSideEffects = 0;
+    bit neverHasSideEffects = 0;
+    InstrItinClass Itinerary = NoItinerary;
+    string Constraints = "";
+    string DisableEncoding = "";
+    bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 };
+    Format Form = MRMDestReg;
+    bits<6> FormBits = { 0, 0, 0, 0, 1, 1 };
+    ImmType ImmT = NoImm;
+    bits<3> ImmTypeBits = { 0, 0, 0 };
+    bit hasOpSizePrefix = 0;
+    bit hasAdSizePrefix = 0;
+    bits<4> Prefix = { 0, 0, 0, 0 };
+    bit hasREX_WPrefix = 0;
+    FPFormat FPForm = ?;
+    bits<3> FPFormBits = { 0, 0, 0 };
+  }
+  ...
+
+This definition corresponds to a 32-bit register-register add instruction in the
+X86 backend. The string after the '``def``' keyword indicates the name of the
+record---"``ADD32rr``" in this case---and the comment at the end of the line
+indicates the superclasses of the definition. The body of the record contains
+all of the data that TableGen assembled for the record, indicating that the
+instruction is part of the "X86" namespace, the pattern indicating how the
+instruction should be emitted into the assembly file, that it is a two-address
+instruction, has a particular encoding, etc. The contents and semantics of the
+information in the record are specific to the needs of the X86 backend, and are
+only shown as an example.
+ +As you can see, a lot of information is needed for every instruction supported +by the code generator, and specifying it all manually would be unmaintainable, +prone to bugs, and tiring to do in the first place. Because we are using +TableGen, all of the information was derived from the following definition: + +.. code-block:: llvm + + let Defs = [EFLAGS], + isCommutable = 1, // X = ADD Y,Z --> X = ADD Z,Y + isConvertibleToThreeAddress = 1 in // Can transform into LEA. + def ADD32rr : I<0x01, MRMDestReg, (outs GR32:$dst), + (ins GR32:$src1, GR32:$src2), + "add{l}\t{$src2, $dst|$dst, $src2}", + [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>; + +This definition makes use of the custom class ``I`` (extended from the custom +class ``X86Inst``), which is defined in the X86-specific TableGen file, to +factor out the common features that instructions of its class share. A key +feature of TableGen is that it allows the end-user to define the abstractions +they prefer to use when describing their information. + +Each def record has a special entry called "``NAME``." This is the name of the +def ("``ADD32rr``" above). In the general case def names can be formed from +various kinds of string processing expressions and ``NAME`` resolves to the +final value obtained after resolving all of those expressions. The user may +refer to ``NAME`` anywhere she desires to use the ultimate name of the def. +``NAME`` should not be defined anywhere else in user code to avoid conflict +problems. + +Running TableGen +---------------- + +TableGen runs just like any other LLVM tool. The first (optional) argument +specifies the file to read. If a filename is not specified, ``llvm-tblgen`` +reads from standard input. + +To be useful, one of the `TableGen backends`_ must be used. These backends are +selectable on the command line (type '``llvm-tblgen -help``' for a list). For +example, to get a list of all of the definitions that subclass a particular type +(which can be useful for building up an enum list of these records), use the +``-print-enums`` option: + +.. code-block:: bash + + $ llvm-tblgen X86.td -print-enums -class=Register + AH, AL, AX, BH, BL, BP, BPL, BX, CH, CL, CX, DH, DI, DIL, DL, DX, EAX, EBP, EBX, + ECX, EDI, EDX, EFLAGS, EIP, ESI, ESP, FP0, FP1, FP2, FP3, FP4, FP5, FP6, IP, + MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10W, R11, R11B, R11D, + R11W, R12, R12B, R12D, R12W, R13, R13B, R13D, R13W, R14, R14B, R14D, R14W, R15, + R15B, R15D, R15W, R8, R8B, R8D, R8W, R9, R9B, R9D, R9W, RAX, RBP, RBX, RCX, RDI, + RDX, RIP, RSI, RSP, SI, SIL, SP, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7, + XMM0, XMM1, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM2, XMM3, XMM4, XMM5, + XMM6, XMM7, XMM8, XMM9, + + $ llvm-tblgen X86.td -print-enums -class=Instruction + ABS_F, ABS_Fp32, ABS_Fp64, ABS_Fp80, ADC32mi, ADC32mi8, ADC32mr, ADC32ri, + ADC32ri8, ADC32rm, ADC32rr, ADC64mi32, ADC64mi8, ADC64mr, ADC64ri32, ADC64ri8, + ADC64rm, ADC64rr, ADD16mi, ADD16mi8, ADD16mr, ADD16ri, ADD16ri8, ADD16rm, + ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr, + ADD64mi32, ADD64mi8, ADD64mr, ADD64ri32, ... + +The default backend prints out all of the records, as `described above`_. + +If you plan to use TableGen, you will most likely have to `write a backend`_ +that extracts the information specific to what you need and formats it in the +appropriate way. + +.. 
_parses a file: + +TableGen syntax +=============== + +TableGen doesn't care about the meaning of data (that is up to the backend to +define), but it does care about syntax, and it enforces a simple type system. +This section describes the syntax and the constructs allowed in a TableGen file. + +TableGen primitives +------------------- + +TableGen comments +^^^^^^^^^^^^^^^^^ + +TableGen supports BCPL style "``//``" comments, which run to the end of the +line, and it also supports **nestable** "``/* */``" comments. + +.. _TableGen type: + +The TableGen type system +^^^^^^^^^^^^^^^^^^^^^^^^ + +TableGen files are strongly typed, in a simple (but complete) type-system. +These types are used to perform automatic conversions, check for errors, and to +help interface designers constrain the input that they allow. Every `value +definition`_ is required to have an associated type. + +TableGen supports a mixture of very low-level types (such as ``bit``) and very +high-level types (such as ``dag``). This flexibility is what allows it to +describe a wide range of information conveniently and compactly. The TableGen +types are: + +``bit`` + A 'bit' is a boolean value that can hold either 0 or 1. + +``int`` + The 'int' type represents a simple 32-bit integer value, such as 5. + +``string`` + The 'string' type represents an ordered sequence of characters of arbitrary + length. + +``bits<n>`` + A 'bits' type is an arbitrary, but fixed, size integer that is broken up + into individual bits. This type is useful because it can handle some bits + being defined while others are undefined. + +``list<ty>`` + This type represents a list whose elements are some other type. The + contained type is arbitrary: it can even be another list type. + +Class type + Specifying a class name in a type context means that the defined value must + be a subclass of the specified class. This is useful in conjunction with + the ``list`` type, for example, to constrain the elements of the list to a + common base class (e.g., a ``list<Register>`` can only contain definitions + derived from the "``Register``" class). + +``dag`` + This type represents a nestable directed graph of elements. + +``code`` + This represents a big hunk of text. This is lexically distinct from string + values because it doesn't require escaping double quotes and other common + characters that occur in code. + +To date, these types have been sufficient for describing things that TableGen +has been used for, but it is straight-forward to extend this list if needed. + +.. _TableGen expressions: + +TableGen values and expressions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +TableGen allows for a pretty reasonable number of different expression forms +when building up values. These forms allow the TableGen file to be written in a +natural syntax and flavor for the application. The current expression forms +supported include: + +``?`` + uninitialized field + +``0b1001011`` + binary integer value + +``07654321`` + octal integer value (indicated by a leading 0) + +``7`` + decimal integer value + +``0x7F`` + hexadecimal integer value + +``"foo"`` + string value + +``[{ ... }]`` + code fragment + +``[ X, Y, Z ]<type>`` + list value. <type> is the type of the list element and is usually optional. + In rare cases, TableGen is unable to deduce the element type in which case + the user must specify it explicitly. 
+ +``{ a, b, c }`` + initializer for a "bits<3>" value + +``value`` + value reference + +``value{17}`` + access to one bit of a value + +``value{15-17}`` + access to multiple bits of a value + +``DEF`` + reference to a record definition + +``CLASS<val list>`` + reference to a new anonymous definition of CLASS with the specified template + arguments. + +``X.Y`` + reference to the subfield of a value + +``list[4-7,17,2-3]`` + A slice of the 'list' list, including elements 4,5,6,7,17,2, and 3 from it. + Elements may be included multiple times. + +``foreach <var> = [ <list> ] in { <body> }`` + +``foreach <var> = [ <list> ] in <def>`` + Replicate <body> or <def>, replacing instances of <var> with each value + in <list>. <var> is scoped at the level of the ``foreach`` loop and must + not conflict with any other object introduced in <body> or <def>. Currently + only ``def``\s are expanded within <body>. + +``foreach <var> = 0-15 in ...`` + +``foreach <var> = {0-15,32-47} in ...`` + Loop over ranges of integers. The braces are required for multiple ranges. + +``(DEF a, b)`` + a dag value. The first element is required to be a record definition, the + remaining elements in the list may be arbitrary other values, including + nested ```dag``' values. + +``!strconcat(a, b)`` + A string value that is the result of concatenating the 'a' and 'b' strings. + +``str1#str2`` + "#" (paste) is a shorthand for !strconcat. It may concatenate things that + are not quoted strings, in which case an implicit !cast<string> is done on + the operand of the paste. + +``!cast<type>(a)`` + A symbol of type *type* obtained by looking up the string 'a' in the symbol + table. If the type of 'a' does not match *type*, TableGen aborts with an + error. !cast<string> is a special case in that the argument must be an + object defined by a 'def' construct. + +``!subst(a, b, c)`` + If 'a' and 'b' are of string type or are symbol references, substitute 'b' + for 'a' in 'c.' This operation is analogous to $(subst) in GNU make. + +``!foreach(a, b, c)`` + For each member 'b' of dag or list 'a' apply operator 'c.' 'b' is a dummy + variable that should be declared as a member variable of an instantiated + class. This operation is analogous to $(foreach) in GNU make. + +``!head(a)`` + The first element of list 'a.' + +``!tail(a)`` + The 2nd-N elements of list 'a.' + +``!empty(a)`` + An integer {0,1} indicating whether list 'a' is empty. + +``!if(a,b,c)`` + 'b' if the result of 'int' or 'bit' operator 'a' is nonzero, 'c' otherwise. + +``!eq(a,b)`` + 'bit 1' if string a is equal to string b, 0 otherwise. This only operates + on string, int and bit objects. Use !cast<string> to compare other types of + objects. + +Note that all of the values have rules specifying how they convert to values +for different types. These rules allow you to assign a value like "``7``" +to a "``bits<4>``" value, for example. + +Classes and definitions +----------------------- + +As mentioned in the `intro`_, classes and definitions (collectively known as +'records') in TableGen are the main high-level unit of information that TableGen +collects. Records are defined with a ``def`` or ``class`` keyword, the record +name, and an optional list of "`template arguments`_". If the record has +superclasses, they are specified as a comma separated list that starts with a +colon character ("``:``"). If `value definitions`_ or `let expressions`_ are +needed for the class, they are enclosed in curly braces ("``{}``"); otherwise, +the record ends with a semicolon. 
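+
+For illustration only (the class names, field names, and values below are
+invented, not part of any real backend), a single record can combine all of
+these pieces: a template argument, several superclasses listed after the
+colon, and either a braced body or a terminating semicolon. A minimal sketch:
+
+.. code-block:: llvm
+
+  // Hypothetical classes, shown only to illustrate the declaration syntax.
+  class Width<int n> { int Bits = n; }
+  class Tagged { bit IsTagged = 1; }
+
+  // A braced body adds value definitions (and may contain let expressions).
+  def WideTagged : Width<32>, Tagged {
+    string Desc = "a 32-bit tagged record";
+  }
+
+  // A record with nothing to add simply ends with a semicolon.
+  def PlainTagged : Tagged;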
+ +Here is a simple TableGen file: + +.. code-block:: llvm + + class C { bit V = 1; } + def X : C; + def Y : C { + string Greeting = "hello"; + } + +This example defines two definitions, ``X`` and ``Y``, both of which derive from +the ``C`` class. Because of this, they both get the ``V`` bit value. The ``Y`` +definition also gets the Greeting member as well. + +In general, classes are useful for collecting together the commonality between a +group of records and isolating it in a single place. Also, classes permit the +specification of default values for their subclasses, allowing the subclasses to +override them as they wish. + +.. _value definition: +.. _value definitions: + +Value definitions +^^^^^^^^^^^^^^^^^ + +Value definitions define named entries in records. A value must be defined +before it can be referred to as the operand for another value definition or +before the value is reset with a `let expression`_. A value is defined by +specifying a `TableGen type`_ and a name. If an initial value is available, it +may be specified after the type with an equal sign. Value definitions require +terminating semicolons. + +.. _let expression: +.. _let expressions: +.. _"let" expressions within a record: + +'let' expressions +^^^^^^^^^^^^^^^^^ + +A record-level let expression is used to change the value of a value definition +in a record. This is primarily useful when a superclass defines a value that a +derived class or definition wants to override. Let expressions consist of the +'``let``' keyword followed by a value name, an equal sign ("``=``"), and a new +value. For example, a new class could be added to the example above, redefining +the ``V`` field for all of its subclasses: + +.. code-block:: llvm + + class D : C { let V = 0; } + def Z : D; + +In this case, the ``Z`` definition will have a zero value for its ``V`` value, +despite the fact that it derives (indirectly) from the ``C`` class, because the +``D`` class overrode its value. + +.. _template arguments: + +Class template arguments +^^^^^^^^^^^^^^^^^^^^^^^^ + +TableGen permits the definition of parameterized classes as well as normal +concrete classes. Parameterized TableGen classes specify a list of variable +bindings (which may optionally have defaults) that are bound when used. Here is +a simple example: + +.. code-block:: llvm + + class FPFormat<bits<3> val> { + bits<3> Value = val; + } + def NotFP : FPFormat<0>; + def ZeroArgFP : FPFormat<1>; + def OneArgFP : FPFormat<2>; + def OneArgFPRW : FPFormat<3>; + def TwoArgFP : FPFormat<4>; + def CompareFP : FPFormat<5>; + def CondMovFP : FPFormat<6>; + def SpecialFP : FPFormat<7>; + +In this case, template arguments are used as a space efficient way to specify a +list of "enumeration values", each with a "``Value``" field set to the specified +integer. + +The more esoteric forms of `TableGen expressions`_ are useful in conjunction +with template arguments. As an example: + +.. code-block:: llvm + + class ModRefVal<bits<2> val> { + bits<2> Value = val; + } + + def None : ModRefVal<0>; + def Mod : ModRefVal<1>; + def Ref : ModRefVal<2>; + def ModRef : ModRefVal<3>; + + class Value<ModRefVal MR> { + // Decode some information into a more convenient format, while providing + // a nice interface to the user of the "Value" class. + bit isMod = MR.Value{0}; + bit isRef = MR.Value{1}; + + // other stuff... 
+  }
+
+  // Example uses
+  def bork : Value<Mod>;
+  def zork : Value<Ref>;
+  def hork : Value<ModRef>;
+
+This is obviously a contrived example, but it shows how template arguments can
+be used to decouple the interface provided to the user of the class from the
+actual internal data representation expected by the class. In this case,
+running ``llvm-tblgen`` on the example prints the following definitions:
+
+.. code-block:: llvm
+
+  def bork {      // Value
+    bit isMod = 1;
+    bit isRef = 0;
+  }
+  def hork {      // Value
+    bit isMod = 1;
+    bit isRef = 1;
+  }
+  def zork {      // Value
+    bit isMod = 0;
+    bit isRef = 1;
+  }
+
+This shows that TableGen was able to dig into the argument and extract a piece
+of information that was requested by the designer of the "Value" class. For
+more realistic examples, please see existing users of TableGen, such as the X86
+backend.
+
+Multiclass definitions and instances
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+While classes with template arguments are a good way to factor commonality
+between two instances of a definition, multiclasses allow a convenient notation
+for defining multiple definitions at once (instances of implicitly constructed
+classes). For example, consider a 3-address instruction set whose instructions
+come in two forms: "``reg = reg op reg``" and "``reg = reg op imm``"
+(e.g. SPARC). In this case, you'd like to specify in one place that this
+commonality exists, then in a separate place indicate what all the ops are.
+
+Here is an example TableGen fragment that shows this idea:
+
+.. code-block:: llvm
+
+  def ops;
+  def GPR;
+  def Imm;
+  class inst<int opc, string asmstr, dag operandlist>;
+
+  multiclass ri_inst<int opc, string asmstr> {
+    def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
+                   (ops GPR:$dst, GPR:$src1, GPR:$src2)>;
+    def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
+                   (ops GPR:$dst, GPR:$src1, Imm:$src2)>;
+  }
+
+  // Instantiations of the ri_inst multiclass.
+  defm ADD : ri_inst<0b111, "add">;
+  defm SUB : ri_inst<0b101, "sub">;
+  defm MUL : ri_inst<0b100, "mul">;
+  ...
+
+The names of the resultant definitions have the multidef fragment names appended
+to them, so this defines ``ADD_rr``, ``ADD_ri``, ``SUB_rr``, etc. A defm may
+inherit from multiple multiclasses, instantiating definitions from each
+multiclass. Using a multiclass this way is exactly equivalent to instantiating
+the classes multiple times yourself, e.g. by writing:
+
+.. code-block:: llvm
+
+  def ops;
+  def GPR;
+  def Imm;
+  class inst<int opc, string asmstr, dag operandlist>;
+
+  class rrinst<int opc, string asmstr>
+    : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
+           (ops GPR:$dst, GPR:$src1, GPR:$src2)>;
+
+  class riinst<int opc, string asmstr>
+    : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
+           (ops GPR:$dst, GPR:$src1, Imm:$src2)>;
+
+  // Instantiations of the ri_inst multiclass.
+  def ADD_rr : rrinst<0b111, "add">;
+  def ADD_ri : riinst<0b111, "add">;
+  def SUB_rr : rrinst<0b101, "sub">;
+  def SUB_ri : riinst<0b101, "sub">;
+  def MUL_rr : rrinst<0b100, "mul">;
+  def MUL_ri : riinst<0b100, "mul">;
+  ...
+
+A ``defm`` can also be used inside a multiclass, providing several levels of
+multiclass instantiations.
+
+.. code-block:: llvm
+
+  class Instruction<bits<4> opc, string Name> {
+    bits<4> opcode = opc;
+    string name = Name;
+  }
+
+  multiclass basic_r<bits<4> opc> {
+    def rr : Instruction<opc, "rr">;
+    def rm : Instruction<opc, "rm">;
+  }
+
+  multiclass basic_s<bits<4> opc> {
+    defm SS : basic_r<opc>;
+    defm SD : basic_r<opc>;
+    def X : Instruction<opc, "x">;
+  }
+
+  multiclass basic_p<bits<4> opc> {
+    defm PS : basic_r<opc>;
+    defm PD : basic_r<opc>;
+    def Y : Instruction<opc, "y">;
+  }
+
+  defm ADD : basic_s<0xf>, basic_p<0xf>;
+  ...
+
+  // Results
+  def ADDPDrm { ...
+  def ADDPDrr { ...
+  def ADDPSrm { ...
+  def ADDPSrr { ...
+  def ADDSDrm { ...
+  def ADDSDrr { ...
+  def ADDY { ...
+  def ADDX { ...
+
+``defm`` declarations can inherit from classes too; the rule to follow is that
+the class list must start after the last multiclass, and there must be at least
+one multiclass before them.
+
+.. code-block:: llvm
+
+  class XD { bits<4> Prefix = 11; }
+  class XS { bits<4> Prefix = 12; }
+
+  class I<bits<4> op> {
+    bits<4> opcode = op;
+  }
+
+  multiclass R {
+    def rr : I<4>;
+    def rm : I<2>;
+  }
+
+  multiclass Y {
+    defm SS : R, XD;
+    defm SD : R, XS;
+  }
+
+  defm Instr : Y;
+
+  // Results
+  def InstrSDrm {
+    bits<4> opcode = { 0, 0, 1, 0 };
+    bits<4> Prefix = { 1, 1, 0, 0 };
+  }
+  ...
+  def InstrSSrr {
+    bits<4> opcode = { 0, 1, 0, 0 };
+    bits<4> Prefix = { 1, 0, 1, 1 };
+  }
+
+File scope entities
+-------------------
+
+File inclusion
+^^^^^^^^^^^^^^
+
+TableGen supports the '``include``' token, which textually substitutes the
+specified file in place of the include directive. The filename should be
+specified as a double quoted string immediately after the '``include``' keyword.
+Example:
+
+.. code-block:: llvm
+
+  include "foo.td"
+
+'let' expressions
+^^^^^^^^^^^^^^^^^
+
+"Let" expressions at file scope are similar to `"let" expressions within a
+record`_, except they can specify a value binding for multiple records at a
+time, and may be useful in certain other cases. File-scope let expressions are
+really just another way that TableGen allows the end-user to factor out
+commonality from the records.
+
+File-scope "let" expressions take a comma-separated list of bindings to apply,
+and one or more records to bind the values in. Here are some examples:
+
+.. code-block:: llvm
+
+  let isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 in
+    def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>;
+
+  let isCall = 1 in
+    // All calls clobber the non-callee saved registers...
+    let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0,
+                MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
+                XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in {
+      def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops),
+                             "call\t${dst:call}", []>;
+      def CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops),
+                      "call\t{*}$dst", [(X86call GR32:$dst)]>;
+      def CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops),
+                      "call\t{*}$dst", []>;
+    }
+
+File-scope "let" expressions are often useful when a couple of definitions need
+to be added to several records, and the records do not otherwise need to be
+opened, as in the case with the ``CALL*`` instructions above.
+
+It's also possible to use "let" expressions inside multiclasses, providing more
+ways to factor out commonality from the records, especially if using several
+levels of multiclass instantiations. This also avoids the need to use "let"
+expressions within subsequent records inside a multiclass.
+
+..
code-block:: llvm + + multiclass basic_r<bits<4> opc> { + let Predicates = [HasSSE2] in { + def rr : Instruction<opc, "rr">; + def rm : Instruction<opc, "rm">; + } + let Predicates = [HasSSE3] in + def rx : Instruction<opc, "rx">; + } + + multiclass basic_ss<bits<4> opc> { + let IsDouble = 0 in + defm SS : basic_r<opc>; + + let IsDouble = 1 in + defm SD : basic_r<opc>; + } + + defm ADD : basic_ss<0xf>; + +Looping +^^^^^^^ + +TableGen supports the '``foreach``' block, which textually replicates the loop +body, substituting iterator values for iterator references in the body. +Example: + +.. code-block:: llvm + + foreach i = [0, 1, 2, 3] in { + def R#i : Register<...>; + def F#i : Register<...>; + } + +This will create objects ``R0``, ``R1``, ``R2`` and ``R3``. ``foreach`` blocks +may be nested. If there is only one item in the body the braces may be +elided: + +.. code-block:: llvm + + foreach i = [0, 1, 2, 3] in + def R#i : Register<...>; + +Code Generator backend info +=========================== + +Expressions used by code generator to describe instructions and isel patterns: + +``(implicit a)`` + an implicitly defined physical register. This tells the dag instruction + selection emitter the input pattern's extra definitions matches implicit + physical register definitions. + +.. _TableGen backend: +.. _TableGen backends: +.. _write a backend: + +TableGen backends +================= + +TODO: How they work, how to write one. This section should not contain details +about any particular backend, except maybe ``-print-enums`` as an example. This +should highlight the APIs in ``TableGen/Record.h``. diff --git a/docs/TestSuiteMakefileGuide.html b/docs/TestSuiteMakefileGuide.html index 876fe42..1b24250 100644 --- a/docs/TestSuiteMakefileGuide.html +++ b/docs/TestSuiteMakefileGuide.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>LLVM test-suite Makefile Guide</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -238,12 +238,12 @@ LLVM.</p> simple one is simply running <tt>gmake</tt> with no arguments. This will compile and run all programs in the tree using a number of different methods and compare results. Any failures are reported in the output, but are likely - drowned in the other output. Passes are not reported explicitely.</p> + drowned in the other output. Passes are not reported explicitly.</p> <p>Somewhat better is running <tt>gmake TEST=sometest test</tt>, which runs the specified test and usually adds per-program summaries to the output (depending on which sometest you use). For example, the <tt>nightly</tt> test - explicitely outputs TEST-PASS or TEST-FAIL for every test after each program. + explicitly outputs TEST-PASS or TEST-FAIL for every test after each program. 
Though these lines are still drowned in the output, it's easy to grep the output logs in the Output directories.</p> diff --git a/docs/TestingGuide.html b/docs/TestingGuide.html index 33ce793..804e929 100644 --- a/docs/TestingGuide.html +++ b/docs/TestingGuide.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>LLVM Testing Infrastructure Guide</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -626,6 +626,8 @@ define i8 @coerce_offset0(i32 %V, i32* %P) { <div> +<!-- {% raw %} --> + <p>The CHECK: and CHECK-NOT: directives both take a pattern to match. For most uses of FileCheck, fixed string matching is perfectly sufficient. For some things, a more flexible form of matching is desired. To support this, FileCheck @@ -650,6 +652,8 @@ braces like you would in C. In the rare case that you want to match double braces explicitly from the input, you can use something ugly like <b>{{[{][{]}}</b> as your pattern.</p> +<!-- {% endraw %} --> + </div> <!-- _______________________________________________________________________ --> @@ -659,6 +663,9 @@ braces explicitly from the input, you can use something ugly like <div> + +<!-- {% raw %} --> + <p>It is often useful to match a pattern and then verify that it occurs again later in the file. For codegen tests, this can be useful to allow any register, but verify that that register is used consistently later. To do this, FileCheck @@ -690,6 +697,8 @@ that FileCheck is not actually line-oriented when it matches, this allows you to define two separate CHECK lines that match on the same line. </p> +<!-- {% endraw %} --> + </div> </div> @@ -900,7 +909,7 @@ the <a href="TestSuiteMakefileGuide.html">Test Suite Makefile Guide.</a></p> John T. Criswell, Daniel Dunbar, Reid Spencer, and Tanya Lattner<br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-04-18 10:02:25 +0200 (Wed, 18 Apr 2012) $ + Last modified: $Date: 2012-05-08 20:26:07 +0200 (Tue, 08 May 2012) $ </address> </body> </html> diff --git a/docs/WritingAnLLVMBackend.html b/docs/WritingAnLLVMBackend.html index 85548ea..11517c2 100644 --- a/docs/WritingAnLLVMBackend.html +++ b/docs/WritingAnLLVMBackend.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>Writing an LLVM Compiler Backend</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -2526,7 +2526,7 @@ with assembler. 
<a href="http://www.woo.com">Mason Woo</a> and <a href="http://misha.brukman.net">Misha Brukman</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a> <br> - Last modified: $Date: 2012-03-01 16:14:19 +0100 (Thu, 01 Mar 2012) $ + Last modified: $Date: 2012-04-19 22:20:34 +0200 (Thu, 19 Apr 2012) $ </address> </body> diff --git a/docs/WritingAnLLVMPass.html b/docs/WritingAnLLVMPass.html index 5dc67ae..149b103 100644 --- a/docs/WritingAnLLVMPass.html +++ b/docs/WritingAnLLVMPass.html @@ -4,7 +4,7 @@ <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>Writing an LLVM Pass</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> + <link rel="stylesheet" href="_static/llvm.css" type="text/css"> </head> <body> @@ -1947,7 +1947,7 @@ Despite that, we have kept the LLVM passes SMP ready, and you should too.</p> <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-04-08 13:52:52 +0200 (Sun, 08 Apr 2012) $ + Last modified: $Date: 2012-04-19 22:20:34 +0200 (Thu, 19 Apr 2012) $ </address> </body> diff --git a/docs/img/lines.gif b/docs/_static/lines.gif Binary files differindex 88f491e..88f491e 100644 --- a/docs/img/lines.gif +++ b/docs/_static/lines.gif diff --git a/docs/llvm.css b/docs/_static/llvm.css index e3e6351..d7b5dae 100644 --- a/docs/llvm.css +++ b/docs/_static/llvm.css @@ -16,7 +16,7 @@ table { text-align: center; border: 2px solid black; margin-right: 1em; margin-bottom: 1em; } tr, td { border: 2px solid gray; padding: 4pt 4pt 2pt 2pt; } th { border: 2px solid gray; font-weight: bold; font-size: 105%; - background: url("img/lines.gif"); + background: url("lines.gif"); font-family: "Georgia,Palatino,Times,Roman,SanSerif"; text-align: center; vertical-align: middle; } /* @@ -24,7 +24,7 @@ th { border: 2px solid gray; font-weight: bold; font-size: 105%; */ /* Common for title and header */ .doc_title, .doc_section, .doc_subsection, h1, h2, h3 { - color: black; background: url("img/lines.gif"); + color: black; background: url("lines.gif"); font-family: "Georgia,Palatino,Times,Roman,SanSerif"; font-weight: bold; border-width: 1px; border-style: solid none solid none; diff --git a/docs/_templates/indexsidebar.html b/docs/_templates/indexsidebar.html new file mode 100644 index 0000000..4161742 --- /dev/null +++ b/docs/_templates/indexsidebar.html @@ -0,0 +1,7 @@ +{# This template defines sidebar which can be used to provide common links on + all documentation pages. #} + +<h3>Bugs</h3> + +<p>LLVM bugs should be reported to + <a href="http://llvm.org/bugs">Bugzilla</a>.</p> diff --git a/docs/_templates/layout.html b/docs/_templates/layout.html new file mode 100644 index 0000000..de5db5c --- /dev/null +++ b/docs/_templates/layout.html @@ -0,0 +1,13 @@ +{% extends "!layout.html" %} + +{% block extrahead %} +<style type="text/css"> + table.right { float: right; margin-left: 20px; } + table.right td { border: 1px solid #ccc; } +</style> +{% endblock %} + +{% block rootrellink %} + <li><a href="http://llvm.org/">LLVM Home</a> | </li> + <li><a href="{{ pathto('index') }}">Documentation</a>»</li> +{% endblock %} diff --git a/docs/conf.py b/docs/conf.py new file mode 100644 index 0000000..de0585d --- /dev/null +++ b/docs/conf.py @@ -0,0 +1,263 @@ +# -*- coding: utf-8 -*- +# +# LLVM documentation build configuration file. +# +# This file is execfile()d with the current directory set to its containing dir. 
+# +# Note that not all possible configuration values are present in this +# autogenerated file. +# +# All configuration values have a default; values that are commented out +# serve to show the default. + +import sys, os + +# If extensions (or modules to document with autodoc) are in another directory, +# add these directories to sys.path here. If the directory is relative to the +# documentation root, use os.path.abspath to make it absolute, like shown here. +#sys.path.insert(0, os.path.abspath('.')) + +# -- General configuration ----------------------------------------------------- + +# If your documentation needs a minimal Sphinx version, state it here. +#needs_sphinx = '1.0' + +# Add any Sphinx extension module names here, as strings. They can be extensions +# coming with Sphinx (named 'sphinx.ext.*') or your custom ones. +extensions = ['sphinx.ext.intersphinx', 'sphinx.ext.todo'] + +# Add any paths that contain templates here, relative to this directory. +templates_path = ['_templates'] + +# The suffix of source filenames. +source_suffix = '.rst' + +# The encoding of source files. +#source_encoding = 'utf-8-sig' + +# The master toctree document. +master_doc = 'index' + +# General information about the project. +project = u'LLVM' +copyright = u'2012, LLVM Project' + +# The version info for the project you're documenting, acts as replacement for +# |version| and |release|, also used in various other places throughout the +# built documents. +# +# The short X.Y version. +version = '3.2' +# The full version, including alpha/beta/rc tags. +release = '3.2' + +# The language for content autogenerated by Sphinx. Refer to documentation +# for a list of supported languages. +#language = None + +# There are two options for replacing |today|: either, you set today to some +# non-false value, then it is used: +#today = '' +# Else, today_fmt is used as the format for a strftime call. +today_fmt = '%Y-%m-%d' + +# List of patterns, relative to source directory, that match files and +# directories to ignore when looking for source files. +exclude_patterns = ['_build'] + +# The reST default role (used for this markup: `text`) to use for all documents. +#default_role = None + +# If true, '()' will be appended to :func: etc. cross-reference text. +#add_function_parentheses = True + +# If true, the current module name will be prepended to all description +# unit titles (such as .. function::). +#add_module_names = True + +# If true, sectionauthor and moduleauthor directives will be shown in the +# output. They are ignored by default. +show_authors = True + +# The name of the Pygments (syntax highlighting) style to use. +pygments_style = 'friendly' + +# A list of ignored prefixes for module index sorting. +#modindex_common_prefix = [] + + +# -- Options for HTML output --------------------------------------------------- + +# The theme to use for HTML and HTML Help pages. See the documentation for +# a list of builtin themes. +html_theme = 'llvm-theme' + +# Theme options are theme-specific and customize the look and feel of a theme +# further. For a list of options available for each theme, see the +# documentation. +#html_theme_options = {} + +# Add any paths that contain custom themes here, relative to this directory. +html_theme_path = ["."] + +# The name for this set of Sphinx documents. If None, it defaults to +# "<project> v<release> documentation". +#html_title = None + +# A shorter title for the navigation bar. Default is the same as html_title. 
+#html_short_title = None + +# The name of an image file (relative to this directory) to place at the top +# of the sidebar. +#html_logo = None + +# The name of an image file (within the static path) to use as favicon of the +# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 +# pixels large. +#html_favicon = None + +# Add any paths that contain custom static files (such as style sheets) here, +# relative to this directory. They are copied after the builtin static files, +# so a file named "default.css" will overwrite the builtin "default.css". +html_static_path = ['_static'] + +# If not '', a 'Last updated on:' timestamp is inserted at every page bottom, +# using the given strftime format. +html_last_updated_fmt = '%Y-%m-%d' + +# If true, SmartyPants will be used to convert quotes and dashes to +# typographically correct entities. +#html_use_smartypants = True + +# Custom sidebar templates, maps document names to template names. +html_sidebars = {'index': 'indexsidebar.html'} + +# Additional templates that should be rendered to pages, maps page names to +# template names. +# +# We load all the old-school HTML documentation pages into Sphinx here. +basedir = os.path.dirname(__file__) +html_additional_pages = {} +for directory in ('', 'tutorial'): + for file in os.listdir(os.path.join(basedir, directory)): + if not file.endswith('.html'): + continue + + subpath = os.path.join(directory, file) + name,_ = os.path.splitext(subpath) + html_additional_pages[name] = subpath + +# If false, no module index is generated. +#html_domain_indices = True + +# If false, no index is generated. +#html_use_index = True + +# If true, the index is split into individual pages for each letter. +#html_split_index = False + +# If true, links to the reST sources are added to the pages. +html_show_sourcelink = True + +# If true, "Created using Sphinx" is shown in the HTML footer. Default is True. +#html_show_sphinx = True + +# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. +#html_show_copyright = True + +# If true, an OpenSearch description file will be output, and all pages will +# contain a <link> tag referring to it. The value of this option must be the +# base URL from which the finished HTML is served. +#html_use_opensearch = '' + +# This is the file name suffix for HTML files (e.g. ".xhtml"). +#html_file_suffix = None + +# Output file base name for HTML help builder. +htmlhelp_basename = 'LLVMdoc' + + +# -- Options for LaTeX output -------------------------------------------------- + +latex_elements = { +# The paper size ('letterpaper' or 'a4paper'). +#'papersize': 'letterpaper', + +# The font size ('10pt', '11pt' or '12pt'). +#'pointsize': '10pt', + +# Additional stuff for the LaTeX preamble. +#'preamble': '', +} + +# Grouping the document tree into LaTeX files. List of tuples +# (source start file, target name, title, author, documentclass [howto/manual]). +latex_documents = [ + ('index', 'LLVM.tex', u'LLVM Documentation', + u'LLVM project', 'manual'), +] + +# The name of an image file (relative to this directory) to place at the top of +# the title page. +#latex_logo = None + +# For "manual" documents, if this is true, then toplevel headings are parts, +# not chapters. +#latex_use_parts = False + +# If true, show page references after internal links. +#latex_show_pagerefs = False + +# If true, show URL addresses after external links. +#latex_show_urls = False + +# Documents to append as an appendix to all manuals. 
+#latex_appendices = [] + +# If false, no module index is generated. +#latex_domain_indices = True + + +# -- Options for manual page output -------------------------------------------- + +# One entry per manual page. List of tuples +# (source start file, name, description, authors, manual section). +man_pages = [] + +# Automatically derive the list of man pages from the contents of the command +# guide subdirectory. +man_page_authors = "Maintained by The LLVM Team (http://llvm.org/)." +command_guide_subpath = 'CommandGuide' +command_guide_path = os.path.join(basedir, command_guide_subpath) +for name in os.listdir(command_guide_path): + # Ignore non-ReST files and the index page. + if not name.endswith('.rst') or name in ('index.rst',): + continue + + # Otherwise, automatically extract the description. + file_subpath = os.path.join(command_guide_subpath, name) + with open(os.path.join(command_guide_path, name)) as f: + it = iter(f) + title = it.next()[:-1] + header = it.next()[:-1] + + if len(header) != len(title): + print >>sys.stderr, ( + "error: invalid header in %r (does not match title)" % ( + file_subpath,)) + if ' - ' not in title: + print >>sys.stderr, ( + ("error: invalid title in %r " + "(expected '<name> - <description>')") % ( + file_subpath,)) + + # Split the name out of the title. + name,description = title.split(' - ', 1) + man_pages.append((file_subpath.replace('.rst',''), name, + description, man_page_authors, 1)) + +# If true, show URL addresses after external links. +#man_show_urls = False + +# FIXME: Define intersphinx configration. +intersphinx_mapping = {} diff --git a/docs/design_and_overview.rst b/docs/design_and_overview.rst new file mode 100644 index 0000000..ea68415 --- /dev/null +++ b/docs/design_and_overview.rst @@ -0,0 +1,36 @@ +.. _design_and_overview: + +LLVM Design & Overview +====================== + +.. toctree:: + :hidden: + + GetElementPtr + +* `LLVM Language Reference Manual <LangRef.html>`_ + + Defines the LLVM intermediate representation. + +* `Introduction to the LLVM Compiler <http://llvm.org/pubs/2008-10-04-ACAT-LLVM-Intro.html>`_ + + Presentation providing a users introduction to LLVM. + +* `Intro to LLVM <http://www.aosabook.org/en/llvm.html>`_ + + Book chapter providing a compiler hacker's introduction to LLVM. + +* `LLVM: A Compilation Framework forLifelong Program Analysis & Transformation + <http://llvm.org/pubs/2004-01-30-CGO-LLVM.html>`_ + + Design overview. + +* `LLVM: An Infrastructure for Multi-Stage Optimization + <http://llvm.org/pubs/2002-12-LattnerMSThesis.html>`_ + + More details (quite old now). + +* :ref:`gep` + + Answers to some very frequent questions about LLVM's most frequently + misunderstood instruction. diff --git a/docs/development_process.rst b/docs/development_process.rst new file mode 100644 index 0000000..4fc20b3 --- /dev/null +++ b/docs/development_process.rst @@ -0,0 +1,30 @@ +.. _development_process: + +Development Process Documentation +================================= + +.. toctree:: + :hidden: + + MakefileGuide + Projects + +* :ref:`projects` + + How-to guide and templates for new projects that *use* the LLVM + infrastructure. The templates (directory organization, Makefiles, and test + tree) allow the project code to be located outside (or inside) the ``llvm/`` + tree, while using LLVM header files and libraries. + +* `LLVMBuild Documentation <LLVMBuild.html>`_ + + Describes the LLVMBuild organization and files used by LLVM to specify + component descriptions. 
+ +* :ref:`makefile_guide` + + Describes how the LLVM makefiles work and how to use them. + +* `How To Release LLVM To The Public <HowToReleaseLLVM.html>`_ + + This is a guide to preparing LLVM releases. Most developers can ignore it. diff --git a/docs/doxygen.css b/docs/doxygen.css index 80c6cad..83951f6 100644 --- a/docs/doxygen.css +++ b/docs/doxygen.css @@ -327,7 +327,7 @@ HR { height: 1px; } .title { font-size: 25pt; - color: black; background: url("../img/lines.gif"); + color: black; font-weight: bold; border-width: 1px; border-style: solid none solid none; diff --git a/docs/img/Debugging.gif b/docs/img/Debugging.gif Binary files differdeleted file mode 100644 index 662d35a..0000000 --- a/docs/img/Debugging.gif +++ /dev/null diff --git a/docs/img/libdeps.gif b/docs/img/libdeps.gif Binary files differdeleted file mode 100644 index c5c0ed4..0000000 --- a/docs/img/libdeps.gif +++ /dev/null diff --git a/docs/img/objdeps.gif b/docs/img/objdeps.gif Binary files differdeleted file mode 100644 index 57c3e2e..0000000 --- a/docs/img/objdeps.gif +++ /dev/null diff --git a/docs/img/venusflytrap.jpg b/docs/img/venusflytrap.jpg Binary files differdeleted file mode 100644 index 59340ef..0000000 --- a/docs/img/venusflytrap.jpg +++ /dev/null diff --git a/docs/index.html b/docs/index.html deleted file mode 100644 index edd476d..0000000 --- a/docs/index.html +++ /dev/null @@ -1,286 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>Documentation for the LLVM System at SVN head</title> - <link rel="stylesheet" href="llvm.css" type="text/css"> -</head> -<body> - -<h1>Documentation for the LLVM System at SVN head</h1> - -<p class="doc_warning">If you are using a released version of LLVM, -see <a href="http://llvm.org/releases/">the download page</a> to find -your documentation.</p> - -<table class="layout" width="95%"><tr class="layout"><td class="left"> -<ul> - <li><a href="#llvmdesign">LLVM Design</a></li> - <li><a href="/pubs/">LLVM Publications</a></li> - <li><a href="#userguide">LLVM User Guides</a></li> - <li><a href="#llvmprog">LLVM Programming Documentation</a></li> - <li><a href="#subsystems">LLVM Subsystem Documentation</a></li> - <li><a href="#develprocess">LLVM Development Process Documentation</a></li> - <li><a href="#maillist">LLVM Mailing Lists</a></li> -</ul> -</td><td class="right"> - <form action="http://www.google.com/search" method=get> - <p> - <input type="hidden" name="sitesearch" value="llvm.org/docs"> - <input type=text name=q size=25><br> - <input type=submit value="Search the LLVM Docs" name="submit"> - </p> - </form> -</td></tr></table> - -<div class="doc_author"> - <p>Written by <a href="http://llvm.org/">The LLVM Team</a></p> -</div> - -<!--=======================================================================--> -<h2><a name="llvmdesign">LLVM Design & Overview</a></h2> -<!--=======================================================================--> - -<ul> -<li><a href="LangRef.html">LLVM Language Reference Manual</a> - Defines the LLVM -intermediate representation.</li> -<li><a href="http://llvm.org/pubs/2008-10-04-ACAT-LLVM-Intro.html">Introduction to the LLVM Compiler </a> - Presentation providing a users introduction to LLVM.</li> -<li><a href="http://www.aosabook.org/en/llvm.html">Intro to LLVM</a> - book chapter providing a compiler hacker's introduction to LLVM.</li> -<li><a 
href="http://llvm.org/pubs/2004-01-30-CGO-LLVM.html">LLVM: A Compilation Framework for -Lifelong Program Analysis & Transformation</a> - Design overview.</li> -<li><a href="http://llvm.org/pubs/2002-12-LattnerMSThesis.html">LLVM: An Infrastructure for -Multi-Stage Optimization</a> - More details (quite old now).</li> -<li><a href="GetElementPtr.html">GetElementPtr FAQ</a> - Answers to some very -frequent questions about LLVM's most frequently misunderstood instruction.</li> -</ul> - -<!--=======================================================================--> -<h2><a name="userguide">LLVM User Guides</a></h2> -<!--=======================================================================--> - -<ul> -<li><a href="GettingStarted.html">The LLVM Getting Started Guide</a> - -Discusses how to get up and running quickly with the LLVM infrastructure. -Everything from unpacking and compilation of the distribution to execution of -some tools.</li> - -<li><a href="CMake.html">LLVM CMake guide</a> - An addendum to the main Getting -Started guide for those using the <a href="http://www.cmake.org/">CMake build -system</a>. -</li> - -<li><a href="GettingStartedVS.html">Getting Started with the LLVM System using -Microsoft Visual Studio</a> - An addendum to the main Getting Started guide for -those using Visual Studio on Windows.</li> - -<li><a href="tutorial/">LLVM Tutorial</a> - A walk through the process of using -LLVM for a custom language, and the facilities LLVM offers in tutorial form.</li> -<li><a href="DeveloperPolicy.html">Developer Policy</a> - The LLVM project's -policy towards developers and their contributions.</li> - -<li><a href="CommandGuide/index.html">LLVM Command Guide</a> - A reference -manual for the LLVM command line utilities ("man" pages for LLVM tools).</li> - -<li><a href="Passes.html">LLVM's Analysis and Transform Passes</a> - A list of -optimizations and analyses implemented in LLVM.</li> - -<li><a href="FAQ.html">Frequently Asked Questions</a> - A list of common -questions and problems and their solutions.</li> - -<li><a href="ReleaseNotes.html">Release notes for the current release</a> -- This describes new features, known bugs, and other limitations.</li> - -<li><a href="HowToSubmitABug.html">How to Submit A Bug Report</a> - -Instructions for properly submitting information about any bugs you run into in -the LLVM system.</li> - -<li><a href="TestingGuide.html">LLVM Testing Infrastructure Guide</a> - A reference -manual for using the LLVM testing infrastructure.</li> - -<li><a href="http://clang.llvm.org/get_started.html">How to build the C, C++, ObjC, -and ObjC++ front end</a> - Instructions for building the clang front-end from -source.</li> - -<li><a href="Packaging.html">Packaging guide</a> - Advice on packaging -LLVM into a distribution.</li> - -<li><a href="Lexicon.html">The LLVM Lexicon</a> - Definition of acronyms, terms -and concepts used in LLVM.</li> - -<li><a name="irc">You can probably find help on the unofficial LLVM IRC -channel</a>. We often are on irc.oftc.net in the #llvm channel. 
If you are -using the mozilla browser, and have chatzilla installed, you can <a -href="irc://irc.oftc.net/llvm">join #llvm on irc.oftc.net</a> directly.</li> - -<li><a href="HowToAddABuilder.html">How To Add Your Build Configuration -To LLVM Buildbot Infrastructure</a> - Instructions for adding new builder to -LLVM buildbot master.</li> - -</ul> - - -<!--=======================================================================--> -<h2><a name="llvmprog">LLVM Programming Documentation</a></h2> -<!--=======================================================================--> - -<ul> -<li><a href="LangRef.html">LLVM Language Reference Manual</a> - Defines the LLVM -intermediate representation and the assembly form of the different nodes.</li> - -<li><a href="ProgrammersManual.html">The LLVM Programmers Manual</a> - -Introduction to the general layout of the LLVM sourcebase, important classes -and APIs, and some tips & tricks.</li> - -<li><a href="CommandLine.html">CommandLine library Reference Manual</a> - -Provides information on using the command line parsing library.</li> - -<li><a href="CodingStandards.html">LLVM Coding standards</a> - -Details the LLVM coding standards and provides useful information on writing -efficient C++ code.</li> - -<li><a href="ExtendingLLVM.html">Extending LLVM</a> - Look here to see how -to add instructions and intrinsics to LLVM.</li> - -<li><a href="http://llvm.org/doxygen/">Doxygen generated -documentation</a> (<a -href="http://llvm.org/doxygen/inherits.html">classes</a>) - -(<a href="http://llvm.org/doxygen/doxygen.tar.gz">tarball</a>) -</li> - -<li><a href="http://llvm.org/viewvc/">ViewVC Repository Browser</a></li> - -</ul> - -<!--=======================================================================--> -<h2><a name="subsystems">LLVM Subsystem Documentation</a></h2> -<!--=======================================================================--> - -<ul> - -<li><a href="WritingAnLLVMPass.html">Writing an LLVM Pass</a> - Information -on how to write LLVM transformations and analyses.</li> - -<li><a href="WritingAnLLVMBackend.html">Writing an LLVM Backend</a> - Information -on how to write LLVM backends for machine targets.</li> - -<li><a href="CodeGenerator.html">The LLVM Target-Independent Code -Generator</a> - The design and implementation of the LLVM code generator. 
-Useful if you are working on retargetting LLVM to a new architecture, designing -a new codegen pass, or enhancing existing components.</li> - -<li><a href="TableGenFundamentals.html">TableGen Fundamentals</a> - -Describes the TableGen tool, which is used heavily by the LLVM code -generator.</li> - -<li><a href="AliasAnalysis.html">Alias Analysis in LLVM</a> - Information -on how to write a new alias analysis implementation or how to use existing -analyses.</li> - -<li><a href="GarbageCollection.html">Accurate Garbage Collection with -LLVM</a> - The interfaces source-language compilers should use for compiling -GC'd programs.</li> - -<li><a href="SourceLevelDebugging.html">Source Level Debugging with -LLVM</a> - This document describes the design and philosophy behind the LLVM -source-level debugger.</li> - -<li><a href="ExceptionHandling.html">Zero Cost Exception handling in LLVM</a> -- This document describes the design and implementation of exception handling -in LLVM.</li> - -<li><a href="Bugpoint.html">Bugpoint</a> - automatic bug finder and test-case -reducer description and usage information.</li> - -<li><a href="BitCodeFormat.html">LLVM Bitcode File Format</a> - This describes -the file format and encoding used for LLVM "bc" files.</li> - -<li><a href="SystemLibrary.html">System Library</a> - This document describes -the LLVM System Library (<tt>lib/System</tt>) and how to keep LLVM source code -portable</li> - -<li><a href="LinkTimeOptimization.html">Link Time Optimization</a> - This -document describes the interface between LLVM intermodular optimizer and -the linker and its design</li> - -<li><a href="GoldPlugin.html">The LLVM gold plugin</a> - How to build your -programs with link-time optimization on Linux.</li> - -<li><a href="DebuggingJITedCode.html">The GDB JIT interface</a> - How to debug -JITed code with GDB.</li> - -<li><a href="BranchWeightMetadata.html">Branch Weight Metadata</a> - Provides -information about Branch Prediction Information.</li> - -</ul> - -<!--=======================================================================--> -<h2><a name="develprocess">LLVM Development Process Documentation</a></h2> -<!--=======================================================================--> - -<ul> - -<li><a href="Projects.html">LLVM Project Guide</a> - How-to guide and -templates for new projects that <em>use</em> the LLVM infrastructure. The -templates (directory organization, Makefiles, and test tree) allow the project -code to be located outside (or inside) the <tt>llvm/</tt> tree, while using LLVM -header files and libraries.</li> - -<li><a href="LLVMBuild.html">LLVMBuild Documentation</a> - Describes the -LLVMBuild organization and files used by LLVM to specify component -descriptions.</li> - -<li><a href="MakefileGuide.html">LLVM Makefile Guide</a> - Describes how the -LLVM makefiles work and how to use them.</li> - -<li><a href="HowToReleaseLLVM.html">How To Release LLVM To The Public</a> - This -is a guide to preparing LLVM releases. Most developers can ignore it.</li> - -</ul> - -<!--=======================================================================--> -<h2><a name="maillist">LLVM Mailing Lists</a></h2> -<!--=======================================================================--> - -<ul> -<li>The <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-announce"> -LLVM Announcements List</a>: This is a low volume list that provides important -announcements regarding LLVM. 
It gets email about once a month.</li> - -<li>The <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">Developer's -List</a>: This list is for people who want to be included in technical -discussions of LLVM. People post to this list when they have questions about -writing code for or using the LLVM tools. It is relatively low volume.</li> - -<li>The <a href="http://lists.cs.uiuc.edu/pipermail/llvmbugs/">Bugs & -Patches Archive</a>: This list gets emailed every time a bug is opened and -closed, and when people submit patches to be included in LLVM. It is higher -volume than the LLVMdev list.</li> - -<li>The <a href="http://lists.cs.uiuc.edu/pipermail/llvm-commits/">Commits -Archive</a>: This list contains all commit messages that are made when LLVM -developers commit code changes to the repository. It is useful for those who -want to stay on the bleeding edge of LLVM development. This list is very high -volume.</li> - -<li>The <a href="http://lists.cs.uiuc.edu/pipermail/llvm-testresults/"> -Test Results Archive</a>: A message is automatically sent to this list by every -active nightly tester when it completes. As such, this list gets email several -times each day, making it a high volume list.</li> - -</ul> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-02-26 23:26:37 +0100 (Sun, 26 Feb 2012) $ -</address> -</body></html> diff --git a/docs/index.rst b/docs/index.rst new file mode 100644 index 0000000..53d3e7c --- /dev/null +++ b/docs/index.rst @@ -0,0 +1,70 @@ +.. _contents: + +Overview +======== + +.. warning:: + + If you are using a released version of LLVM, see `the download page + <http://llvm.org/releases/>`_ to find your documentation. + +The LLVM compiler infrastructure supports a wide range of projects, from +industrial strength compilers to specialized JIT applications to small +research projects. + +Similarly, documentation is broken down into several high-level groupings +targeted at different audiences: + + * **Design & Overview** + + Several introductory papers and presentations are available at + :ref:`design_and_overview`. + + * **Publications** + + The list of `publications <http://llvm.org/pubs>`_ based on LLVM. + + * **User Guides** + + Those new to the LLVM system should first vist the :ref:`userguides`. + + NOTE: If you are a user who is only interested in using LLVM-based + compilers, you should look into `Clang <http://clang.llvm.org>`_ or + `DragonEgg <http://dragonegg.llvm.org>`_ instead. The documentation here is + intended for users who have a need to work with the intermediate LLVM + representation. + + * **API Clients** + + Developers of applications which use LLVM as a library should visit the + :ref:`programming`. + + * **Subsystems** + + API clients and LLVM developers may be interested in the + :ref:`subsystems` documentation. + + * **Development Process** + + Additional documentation on the LLVM project can be found at + :ref:`development_process`. + + * **Mailing Lists** + + For more information, consider consulting the LLVM :ref:`mailing_lists`. + +.. 
toctree:: + :maxdepth: 2 + + design_and_overview + userguides + programming + subsystems + development_process + mailing_lists + +Indices and tables +================== + +* :ref:`genindex` +* :ref:`search` diff --git a/docs/llvm-theme/layout.html b/docs/llvm-theme/layout.html new file mode 100644 index 0000000..746c2f5 --- /dev/null +++ b/docs/llvm-theme/layout.html @@ -0,0 +1,23 @@ +{# + sphinxdoc/layout.html + ~~~~~~~~~~~~~~~~~~~~~ + + Sphinx layout template for the sphinxdoc theme. + + :copyright: Copyright 2007-2010 by the Sphinx team, see AUTHORS. + :license: BSD, see LICENSE for details. +#} +{% extends "basic/layout.html" %} + +{% block relbar1 %} +<div class="logo"> + <a href="{{ pathto('index') }}"> + <img src="{{pathto("_static/logo.png", 1) }}" + alt="LLVM Logo" width="250" height="88"/></a> +</div> +{{ super() }} +{% endblock %} + +{# put the sidebar before the body #} +{% block sidebar1 %}{{ sidebar() }}{% endblock %} +{% block sidebar2 %}{% endblock %} diff --git a/docs/llvm-theme/static/contents.png b/docs/llvm-theme/static/contents.png Binary files differnew file mode 100644 index 0000000..7fb8215 --- /dev/null +++ b/docs/llvm-theme/static/contents.png diff --git a/docs/llvm-theme/static/llvm-theme.css b/docs/llvm-theme/static/llvm-theme.css new file mode 100644 index 0000000..f684d00 --- /dev/null +++ b/docs/llvm-theme/static/llvm-theme.css @@ -0,0 +1,374 @@ +/* + * sphinxdoc.css_t + * ~~~~~~~~~~~~~~~ + * + * Sphinx stylesheet -- sphinxdoc theme. Originally created by + * Armin Ronacher for Werkzeug. + * + * :copyright: Copyright 2007-2010 by the Sphinx team, see AUTHORS. + * :license: BSD, see LICENSE for details. + * + */ + +@import url("basic.css"); + +/* -- page layout ----------------------------------------------------------- */ + +body { + font-family: 'Lucida Grande', 'Lucida Sans Unicode', 'Geneva', + 'Verdana', sans-serif; + font-size: 14px; + letter-spacing: -0.01em; + line-height: 150%; + text-align: center; + background-color: #BFD1D4; + color: black; + padding: 0; + border: 1px solid #aaa; + + margin: 0px 80px 0px 80px; + min-width: 740px; +} + +div.logo { + background-color: white; + text-align: left; + padding: 10px 10px 15px 15px; +} + +div.document { + background-color: white; + text-align: left; + background-image: url(contents.png); + background-repeat: repeat-x; +} + +div.bodywrapper { + margin: 0 240px 0 0; + border-right: 1px solid #ccc; +} + +div.body { + margin: 0; + padding: 0.5em 20px 20px 20px; +} + +div.related { + font-size: 1em; +} + +div.related ul { + background-image: url(navigation.png); + height: 2em; + border-top: 1px solid #ddd; + border-bottom: 1px solid #ddd; +} + +div.related ul li { + margin: 0; + padding: 0; + height: 2em; + float: left; +} + +div.related ul li.right { + float: right; + margin-right: 5px; +} + +div.related ul li a { + margin: 0; + padding: 0 5px 0 5px; + line-height: 1.75em; + color: #EE9816; +} + +div.related ul li a:hover { + color: #3CA8E7; +} + +div.sphinxsidebarwrapper { + padding: 0; +} + +div.sphinxsidebar { + margin: 0; + padding: 0.5em 15px 15px 0; + width: 210px; + float: right; + font-size: 1em; + text-align: left; +} + +div.sphinxsidebar h3, div.sphinxsidebar h4 { + margin: 1em 0 0.5em 0; + font-size: 1em; + padding: 0.1em 0 0.1em 0.5em; + color: white; + border: 1px solid #86989B; + background-color: #AFC1C4; +} + +div.sphinxsidebar h3 a { + color: white; +} + +div.sphinxsidebar ul { + padding-left: 1.5em; + margin-top: 7px; + padding: 0; + line-height: 130%; +} + +div.sphinxsidebar ul ul { + 
margin-left: 20px; +} + +div.footer { + background-color: #E3EFF1; + color: #86989B; + padding: 3px 8px 3px 0; + clear: both; + font-size: 0.8em; + text-align: right; +} + +div.footer a { + color: #86989B; + text-decoration: underline; +} + +/* -- body styles ----------------------------------------------------------- */ + +p { + margin: 0.8em 0 0.5em 0; +} + +a { + color: #CA7900; + text-decoration: none; +} + +a:hover { + color: #2491CF; +} + +div.body p a{ + text-decoration: underline; +} + +h1 { + margin: 0; + padding: 0.7em 0 0.3em 0; + font-size: 1.5em; + color: #11557C; +} + +h2 { + margin: 1.3em 0 0.2em 0; + font-size: 1.35em; + padding: 0; +} + +h3 { + margin: 1em 0 -0.3em 0; + font-size: 1.2em; +} + +h3 a:hover { + text-decoration: underline; +} + +div.body h1 a, div.body h2 a, div.body h3 a, div.body h4 a, div.body h5 a, div.body h6 a { + color: black!important; +} + +div.body h1, +div.body h2, +div.body h3, +div.body h4, +div.body h5, +div.body h6 { + background-color: #f2f2f2; + font-weight: normal; + color: #20435c; + border-bottom: 1px solid #ccc; + margin: 20px -20px 10px -20px; + padding: 3px 0 3px 10px; +} + +div.body h1 { margin-top: 0; font-size: 200%; } +div.body h2 { font-size: 160%; } +div.body h3 { font-size: 140%; } +div.body h4 { font-size: 120%; } +div.body h5 { font-size: 110%; } +div.body h6 { font-size: 100%; } + +h1 a.anchor, h2 a.anchor, h3 a.anchor, h4 a.anchor, h5 a.anchor, h6 a.anchor { + display: none; + margin: 0 0 0 0.3em; + padding: 0 0.2em 0 0.2em; + color: #aaa!important; +} + +h1:hover a.anchor, h2:hover a.anchor, h3:hover a.anchor, h4:hover a.anchor, +h5:hover a.anchor, h6:hover a.anchor { + display: inline; +} + +h1 a.anchor:hover, h2 a.anchor:hover, h3 a.anchor:hover, h4 a.anchor:hover, +h5 a.anchor:hover, h6 a.anchor:hover { + color: #777; + background-color: #eee; +} + +a.headerlink { + color: #c60f0f!important; + font-size: 1em; + margin-left: 6px; + padding: 0 4px 0 4px; + text-decoration: none!important; +} + +a.headerlink:hover { + background-color: #ccc; + color: white!important; +} + +cite, code, tt { + font-family: 'Consolas', 'Deja Vu Sans Mono', + 'Bitstream Vera Sans Mono', monospace; + font-size: 0.95em; + letter-spacing: 0.01em; +} + +:not(a.reference) > tt { + background-color: #f2f2f2; + border-bottom: 1px solid #ddd; + color: #333; +} + +tt.descname, tt.descclassname, tt.xref { + border: 0; +} + +hr { + border: 1px solid #abc; + margin: 2em; +} + +p a tt { + border: 0; + color: #CA7900; +} + +p a tt:hover { + color: #2491CF; +} + +a tt { + border: none; +} + +pre { + font-family: 'Consolas', 'Deja Vu Sans Mono', + 'Bitstream Vera Sans Mono', monospace; + font-size: 0.95em; + letter-spacing: 0.015em; + line-height: 120%; + padding: 0.5em; + border: 1px solid #ccc; + background-color: #f8f8f8; +} + +pre a { + color: inherit; + text-decoration: underline; +} + +td.linenos pre { + padding: 0.5em 0; +} + +div.quotebar { + background-color: #f8f8f8; + max-width: 250px; + float: right; + padding: 2px 7px; + border: 1px solid #ccc; +} + +div.topic { + background-color: #f8f8f8; +} + +table { + border-collapse: collapse; + margin: 0 -0.5em 0 -0.5em; +} + +table td, table th { + padding: 0.2em 0.5em 0.2em 0.5em; +} + +div.admonition, div.warning { + font-size: 0.9em; + margin: 1em 0 1em 0; + border: 1px solid #86989B; + background-color: #f7f7f7; + padding: 0; +} + +div.admonition p, div.warning p { + margin: 0.5em 1em 0.5em 1em; + padding: 0; +} + +div.admonition pre, div.warning pre { + margin: 0.4em 1em 0.4em 1em; +} + +div.admonition 
p.admonition-title, +div.warning p.admonition-title { + margin: 0; + padding: 0.1em 0 0.1em 0.5em; + color: white; + border-bottom: 1px solid #86989B; + font-weight: bold; + background-color: #AFC1C4; +} + +div.warning { + border: 1px solid #940000; +} + +div.warning p.admonition-title { + background-color: #CF0000; + border-bottom-color: #940000; +} + +div.admonition ul, div.admonition ol, +div.warning ul, div.warning ol { + margin: 0.1em 0.5em 0.5em 3em; + padding: 0; +} + +div.versioninfo { + margin: 1em 0 0 0; + border: 1px solid #ccc; + background-color: #DDEAF0; + padding: 8px; + line-height: 1.3em; + font-size: 0.9em; +} + +.viewcode-back { + font-family: 'Lucida Grande', 'Lucida Sans Unicode', 'Geneva', + 'Verdana', sans-serif; +} + +div.viewcode-block:target { + background-color: #f4debf; + border-top: 1px solid #ac9; + border-bottom: 1px solid #ac9; +} diff --git a/docs/llvm-theme/static/logo.png b/docs/llvm-theme/static/logo.png Binary files differnew file mode 100644 index 0000000..18d424c --- /dev/null +++ b/docs/llvm-theme/static/logo.png diff --git a/docs/llvm-theme/static/navigation.png b/docs/llvm-theme/static/navigation.png Binary files differnew file mode 100644 index 0000000..1081dc1 --- /dev/null +++ b/docs/llvm-theme/static/navigation.png diff --git a/docs/llvm-theme/theme.conf b/docs/llvm-theme/theme.conf new file mode 100644 index 0000000..573fd78 --- /dev/null +++ b/docs/llvm-theme/theme.conf @@ -0,0 +1,4 @@ +[theme] +inherit = basic +stylesheet = llvm-theme.css +pygments_style = friendly diff --git a/docs/mailing_lists.rst b/docs/mailing_lists.rst new file mode 100644 index 0000000..106f1da --- /dev/null +++ b/docs/mailing_lists.rst @@ -0,0 +1,35 @@ +.. _mailing_lists: + +Mailing Lists +============= + + * `LLVM Announcements List + <http://lists.cs.uiuc.edu/mailman/listinfo/llvm-announce>`_ + + This is a low volume list that provides important announcements regarding + LLVM. It gets email about once a month. + + * `Developer's List <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>`_ + + This list is for people who want to be included in technical discussions of + LLVM. People post to this list when they have questions about writing code + for or using the LLVM tools. It is relatively low volume. + + * `Bugs & Patches Archive <http://lists.cs.uiuc.edu/pipermail/llvmbugs/>`_ + + This list gets emailed every time a bug is opened and closed, and when people + submit patches to be included in LLVM. It is higher volume than the LLVMdev + list. + + * `Commits Archive <http://lists.cs.uiuc.edu/pipermail/llvm-commits/>`_ + + This list contains all commit messages that are made when LLVM developers + commit code changes to the repository. It is useful for those who want to + stay on the bleeding edge of LLVM development. This list is very high volume. + + * `Test Results Archive + <http://lists.cs.uiuc.edu/pipermail/llvm-testresults/>`_ + + A message is automatically sent to this list by every active nightly tester + when it completes. As such, this list gets email several times each day, + making it a high volume list. diff --git a/docs/make.bat b/docs/make.bat new file mode 100644 index 0000000..8dfec03 --- /dev/null +++ b/docs/make.bat @@ -0,0 +1,190 @@ +@ECHO OFF + +REM Command file for Sphinx documentation + +if "%SPHINXBUILD%" == "" ( + set SPHINXBUILD=sphinx-build +) +set BUILDDIR=_build +set ALLSPHINXOPTS=-d %BUILDDIR%/doctrees %SPHINXOPTS% . +set I18NSPHINXOPTS=%SPHINXOPTS% . 
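
As an aside on the ``make.bat`` fragment above and below: its ``html`` target reduces to a single ``sphinx-build`` call assembled from ``ALLSPHINXOPTS`` and ``BUILDDIR``. A rough, non-authoritative Python equivalent, assuming ``sphinx-build`` is on ``PATH``:

.. code-block:: python

   # Rough equivalent of the "html" target: sphinx-build with the doctree cache
   # under BUILDDIR, reading sources from the current directory. Illustrative
   # only; sphinx-build is assumed to be on PATH.
   import subprocess

   def build_html(builddir="_build", sphinxbuild="sphinx-build", extra_opts=()):
       cmd = ([sphinxbuild, "-b", "html", "-d", builddir + "/doctrees"]
              + list(extra_opts) + [".", builddir + "/html"])
       return subprocess.call(cmd)

   # Mirrors: %SPHINXBUILD% -b html %ALLSPHINXOPTS% %BUILDDIR%/html
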
+if NOT "%PAPER%" == "" ( + set ALLSPHINXOPTS=-D latex_paper_size=%PAPER% %ALLSPHINXOPTS% + set I18NSPHINXOPTS=-D latex_paper_size=%PAPER% %I18NSPHINXOPTS% +) + +if "%1" == "" goto help + +if "%1" == "help" ( + :help + echo.Please use `make ^<target^>` where ^<target^> is one of + echo. html to make standalone HTML files + echo. dirhtml to make HTML files named index.html in directories + echo. singlehtml to make a single large HTML file + echo. pickle to make pickle files + echo. json to make JSON files + echo. htmlhelp to make HTML files and a HTML help project + echo. qthelp to make HTML files and a qthelp project + echo. devhelp to make HTML files and a Devhelp project + echo. epub to make an epub + echo. latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter + echo. text to make text files + echo. man to make manual pages + echo. texinfo to make Texinfo files + echo. gettext to make PO message catalogs + echo. changes to make an overview over all changed/added/deprecated items + echo. linkcheck to check all external links for integrity + echo. doctest to run all doctests embedded in the documentation if enabled + goto end +) + +if "%1" == "clean" ( + for /d %%i in (%BUILDDIR%\*) do rmdir /q /s %%i + del /q /s %BUILDDIR%\* + goto end +) + +if "%1" == "html" ( + %SPHINXBUILD% -b html %ALLSPHINXOPTS% %BUILDDIR%/html + if errorlevel 1 exit /b 1 + echo. + echo.Build finished. The HTML pages are in %BUILDDIR%/html. + goto end +) + +if "%1" == "dirhtml" ( + %SPHINXBUILD% -b dirhtml %ALLSPHINXOPTS% %BUILDDIR%/dirhtml + if errorlevel 1 exit /b 1 + echo. + echo.Build finished. The HTML pages are in %BUILDDIR%/dirhtml. + goto end +) + +if "%1" == "singlehtml" ( + %SPHINXBUILD% -b singlehtml %ALLSPHINXOPTS% %BUILDDIR%/singlehtml + if errorlevel 1 exit /b 1 + echo. + echo.Build finished. The HTML pages are in %BUILDDIR%/singlehtml. + goto end +) + +if "%1" == "pickle" ( + %SPHINXBUILD% -b pickle %ALLSPHINXOPTS% %BUILDDIR%/pickle + if errorlevel 1 exit /b 1 + echo. + echo.Build finished; now you can process the pickle files. + goto end +) + +if "%1" == "json" ( + %SPHINXBUILD% -b json %ALLSPHINXOPTS% %BUILDDIR%/json + if errorlevel 1 exit /b 1 + echo. + echo.Build finished; now you can process the JSON files. + goto end +) + +if "%1" == "htmlhelp" ( + %SPHINXBUILD% -b htmlhelp %ALLSPHINXOPTS% %BUILDDIR%/htmlhelp + if errorlevel 1 exit /b 1 + echo. + echo.Build finished; now you can run HTML Help Workshop with the ^ +.hhp project file in %BUILDDIR%/htmlhelp. + goto end +) + +if "%1" == "qthelp" ( + %SPHINXBUILD% -b qthelp %ALLSPHINXOPTS% %BUILDDIR%/qthelp + if errorlevel 1 exit /b 1 + echo. + echo.Build finished; now you can run "qcollectiongenerator" with the ^ +.qhcp project file in %BUILDDIR%/qthelp, like this: + echo.^> qcollectiongenerator %BUILDDIR%\qthelp\llvm.qhcp + echo.To view the help file: + echo.^> assistant -collectionFile %BUILDDIR%\qthelp\llvm.ghc + goto end +) + +if "%1" == "devhelp" ( + %SPHINXBUILD% -b devhelp %ALLSPHINXOPTS% %BUILDDIR%/devhelp + if errorlevel 1 exit /b 1 + echo. + echo.Build finished. + goto end +) + +if "%1" == "epub" ( + %SPHINXBUILD% -b epub %ALLSPHINXOPTS% %BUILDDIR%/epub + if errorlevel 1 exit /b 1 + echo. + echo.Build finished. The epub file is in %BUILDDIR%/epub. + goto end +) + +if "%1" == "latex" ( + %SPHINXBUILD% -b latex %ALLSPHINXOPTS% %BUILDDIR%/latex + if errorlevel 1 exit /b 1 + echo. + echo.Build finished; the LaTeX files are in %BUILDDIR%/latex. 
+ goto end +) + +if "%1" == "text" ( + %SPHINXBUILD% -b text %ALLSPHINXOPTS% %BUILDDIR%/text + if errorlevel 1 exit /b 1 + echo. + echo.Build finished. The text files are in %BUILDDIR%/text. + goto end +) + +if "%1" == "man" ( + %SPHINXBUILD% -b man %ALLSPHINXOPTS% %BUILDDIR%/man + if errorlevel 1 exit /b 1 + echo. + echo.Build finished. The manual pages are in %BUILDDIR%/man. + goto end +) + +if "%1" == "texinfo" ( + %SPHINXBUILD% -b texinfo %ALLSPHINXOPTS% %BUILDDIR%/texinfo + if errorlevel 1 exit /b 1 + echo. + echo.Build finished. The Texinfo files are in %BUILDDIR%/texinfo. + goto end +) + +if "%1" == "gettext" ( + %SPHINXBUILD% -b gettext %I18NSPHINXOPTS% %BUILDDIR%/locale + if errorlevel 1 exit /b 1 + echo. + echo.Build finished. The message catalogs are in %BUILDDIR%/locale. + goto end +) + +if "%1" == "changes" ( + %SPHINXBUILD% -b changes %ALLSPHINXOPTS% %BUILDDIR%/changes + if errorlevel 1 exit /b 1 + echo. + echo.The overview file is in %BUILDDIR%/changes. + goto end +) + +if "%1" == "linkcheck" ( + %SPHINXBUILD% -b linkcheck %ALLSPHINXOPTS% %BUILDDIR%/linkcheck + if errorlevel 1 exit /b 1 + echo. + echo.Link check complete; look for any errors in the above output ^ +or in %BUILDDIR%/linkcheck/output.txt. + goto end +) + +if "%1" == "doctest" ( + %SPHINXBUILD% -b doctest %ALLSPHINXOPTS% %BUILDDIR%/doctest + if errorlevel 1 exit /b 1 + echo. + echo.Testing of doctests in the sources finished, look at the ^ +results in %BUILDDIR%/doctest/output.txt. + goto end +) + +:end diff --git a/docs/programming.rst b/docs/programming.rst new file mode 100644 index 0000000..27e4301 --- /dev/null +++ b/docs/programming.rst @@ -0,0 +1,40 @@ +.. _programming: + +Programming Documentation +========================= + +.. toctree:: + :hidden: + + CodingStandards + CommandLine + +* `LLVM Language Reference Manual <LangRef.html>`_ + + Defines the LLVM intermediate representation and the assembly form of the + different nodes. + +* `The LLVM Programmers Manual <ProgrammersManual.html>`_ + + Introduction to the general layout of the LLVM sourcebase, important classes + and APIs, and some tips & tricks. + +* :ref:`commandline` + + Provides information on using the command line parsing library. + +* :ref:`coding_standards` + + Details the LLVM coding standards and provides useful information on writing + efficient C++ code. + +* `Extending LLVM <ExtendingLLVM.html>`_ + + Look here to see how to add instructions and intrinsics to LLVM. + +* `Doxygen generated documentation <http://llvm.org/doxygen/>`_ + + (`classes <http://llvm.org/doxygen/inherits.html>`_) + (`tarball <http://llvm.org/doxygen/doxygen.tar.gz>`_) + +* `ViewVC Repository Browser <http://llvm.org/viewvc/>`_ diff --git a/docs/subsystems.rst b/docs/subsystems.rst new file mode 100644 index 0000000..be33295 --- /dev/null +++ b/docs/subsystems.rst @@ -0,0 +1,91 @@ +.. _subsystems: + +Subsystem Documentation +======================= + +.. toctree:: + :hidden: + + AliasAnalysis + BitCodeFormat + BranchWeightMetadata + Bugpoint + CodeGenerator + ExceptionHandling + LinkTimeOptimization + SegmentedStacks + TableGenFundamentals + +* `Writing an LLVM Pass <WritingAnLLVMPass.html>`_ + + Information on how to write LLVM transformations and analyses. + +* `Writing an LLVM Backend <WritingAnLLVMBackend.html>`_ + + Information on how to write LLVM backends for machine targets. + +* :ref:`code_generator` + + The design and implementation of the LLVM code generator. 
Useful if you are + working on retargetting LLVM to a new architecture, designing a new codegen + pass, or enhancing existing components. + +* :ref:`tablegen` + + Describes the TableGen tool, which is used heavily by the LLVM code + generator. + +* :ref:`alias_analysis` + + Information on how to write a new alias analysis implementation or how to + use existing analyses. + +* `Accurate Garbage Collection with LLVM <GarbageCollection.html>`_ + + The interfaces source-language compilers should use for compiling GC'd + programs. + +* `Source Level Debugging with LLVM <SourceLevelDebugging.html>`_ + + This document describes the design and philosophy behind the LLVM + source-level debugger. + +* :ref:`exception_handling` + + This document describes the design and implementation of exception handling + in LLVM. + +* :ref:`bugpoint` + + Automatic bug finder and test-case reducer description and usage + information. + +* :ref:`bitcode_format` + + This describes the file format and encoding used for LLVM "bc" files. + +* `System Library <SystemLibrary.html>`_ + + This document describes the LLVM System Library (<tt>lib/System</tt>) and + how to keep LLVM source code portable + +* :ref:`lto` + + This document describes the interface between LLVM intermodular optimizer + and the linker and its design + +* `The LLVM gold plugin <GoldPlugin.html>`_ + + How to build your programs with link-time optimization on Linux. + +* `The GDB JIT interface <DebuggingJITedCode.html>`_ + + How to debug JITed code with GDB. + +* :ref:`branch_weight` + + Provides information about Branch Prediction Information. + +* :ref:`segmented_stacks` + + This document describes segmented stacks and how they are used in LLVM. diff --git a/docs/tutorial/LangImpl1.html b/docs/tutorial/LangImpl1.html index 22a2b12..717454f 100644 --- a/docs/tutorial/LangImpl1.html +++ b/docs/tutorial/LangImpl1.html @@ -6,7 +6,7 @@ <title>Kaleidoscope: Tutorial Introduction and the Lexer</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -342,7 +342,7 @@ so that you can use the lexer and parser together. 
<a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ + Last modified: $Date: 2012-05-03 00:46:36 +0200 (Thu, 03 May 2012) $ </address> </body> </html> diff --git a/docs/tutorial/LangImpl2.html b/docs/tutorial/LangImpl2.html index e4707b3..694f734 100644 --- a/docs/tutorial/LangImpl2.html +++ b/docs/tutorial/LangImpl2.html @@ -6,7 +6,7 @@ <title>Kaleidoscope: Implementing a Parser and AST</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -1225,7 +1225,7 @@ int main() { <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-16 10:06:54 +0200 (Sun, 16 Oct 2011) $ + Last modified: $Date: 2012-05-03 00:46:36 +0200 (Thu, 03 May 2012) $ </address> </body> </html> diff --git a/docs/tutorial/LangImpl3.html b/docs/tutorial/LangImpl3.html index 9647b43..1390153 100644 --- a/docs/tutorial/LangImpl3.html +++ b/docs/tutorial/LangImpl3.html @@ -6,7 +6,7 @@ <title>Kaleidoscope: Implementing code generation to LLVM IR</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -685,10 +685,10 @@ clang++ -g -O3 toy.cpp `llvm-config --cppflags --ldflags --libs core` -o toy // See example below. #include "llvm/DerivedTypes.h" +#include "llvm/IRBuilder.h" #include "llvm/LLVMContext.h" #include "llvm/Module.h" #include "llvm/Analysis/Verifier.h" -#include "llvm/Support/IRBuilder.h" #include <cstdio> #include <string> #include <map> @@ -1262,7 +1262,7 @@ int main() { <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-16 10:06:54 +0200 (Sun, 16 Oct 2011) $ + Last modified: $Date: 2012-06-29 14:38:19 +0200 (Fri, 29 Jun 2012) $ </address> </body> </html> diff --git a/docs/tutorial/LangImpl4.html b/docs/tutorial/LangImpl4.html index 06a8a13..3f8d4a4 100644 --- a/docs/tutorial/LangImpl4.html +++ b/docs/tutorial/LangImpl4.html @@ -6,7 +6,7 @@ <title>Kaleidoscope: Adding JIT and Optimizer Support</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -253,10 +253,9 @@ add instruction from every execution of this function.</p> <p>LLVM provides a wide variety of optimizations that can be used in certain circumstances. Some <a href="../Passes.html">documentation about the various passes</a> is available, but it isn't very complete. Another good source of -ideas can come from looking at the passes that <tt>llvm-gcc</tt> or -<tt>llvm-ld</tt> run to get started. The "<tt>opt</tt>" tool allows you to -experiment with passes from the command line, so you can see if they do -anything.</p> +ideas can come from looking at the passes that <tt>Clang</tt> runs to get +started. 
The "<tt>opt</tt>" tool allows you to experiment with passes from the +command line, so you can see if they do anything.</p> <p>Now that we have reasonable code coming out of our front-end, lets talk about executing it!</p> @@ -518,6 +517,7 @@ at runtime.</p> #include "llvm/DerivedTypes.h" #include "llvm/ExecutionEngine/ExecutionEngine.h" #include "llvm/ExecutionEngine/JIT.h" +#include "llvm/IRBuilder.h" #include "llvm/LLVMContext.h" #include "llvm/Module.h" #include "llvm/PassManager.h" @@ -525,7 +525,6 @@ at runtime.</p> #include "llvm/Analysis/Passes.h" #include "llvm/Target/TargetData.h" #include "llvm/Transforms/Scalar.h" -#include "llvm/Support/IRBuilder.h" #include "llvm/Support/TargetSelect.h" #include <cstdio> #include <string> @@ -1147,7 +1146,7 @@ int main() { <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-16 10:06:54 +0200 (Sun, 16 Oct 2011) $ + Last modified: $Date: 2012-06-29 14:38:19 +0200 (Fri, 29 Jun 2012) $ </address> </body> </html> diff --git a/docs/tutorial/LangImpl5.html b/docs/tutorial/LangImpl5.html index 0164ca3..a7a3737 100644 --- a/docs/tutorial/LangImpl5.html +++ b/docs/tutorial/LangImpl5.html @@ -6,7 +6,7 @@ <title>Kaleidoscope: Extending the Language: Control Flow</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -895,6 +895,7 @@ clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 #include "llvm/DerivedTypes.h" #include "llvm/ExecutionEngine/ExecutionEngine.h" #include "llvm/ExecutionEngine/JIT.h" +#include "llvm/IRBuilder.h" #include "llvm/LLVMContext.h" #include "llvm/Module.h" #include "llvm/PassManager.h" @@ -902,7 +903,6 @@ clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 #include "llvm/Analysis/Passes.h" #include "llvm/Target/TargetData.h" #include "llvm/Transforms/Scalar.h" -#include "llvm/Support/IRBuilder.h" #include "llvm/Support/TargetSelect.h" #include <cstdio> #include <string> @@ -1766,7 +1766,7 @@ int main() { <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-16 10:06:54 +0200 (Sun, 16 Oct 2011) $ + Last modified: $Date: 2012-06-29 14:38:19 +0200 (Fri, 29 Jun 2012) $ </address> </body> </html> diff --git a/docs/tutorial/LangImpl6.html b/docs/tutorial/LangImpl6.html index 4fcf109..1128893 100644 --- a/docs/tutorial/LangImpl6.html +++ b/docs/tutorial/LangImpl6.html @@ -6,7 +6,7 @@ <title>Kaleidoscope: Extending the Language: User-defined Operators</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -636,7 +636,7 @@ def mandelhelp(xmin xmax xstep ymin ymax ystep) : putchard(10) ) -# mandel - This is a convenient helper function for ploting the mandelbrot set +# mandel - This is a convenient helper function for plotting the mandelbrot set # from the specified position with the specified Magnification. 
def mandel(realstart imagstart realmag imagmag) mandelhelp(realstart, realstart+realmag*78, realmag, @@ -834,6 +834,7 @@ library, although doing that will cause problems on Windows.</p> #include "llvm/DerivedTypes.h" #include "llvm/ExecutionEngine/ExecutionEngine.h" #include "llvm/ExecutionEngine/JIT.h" +#include "llvm/IRBuilder.h" #include "llvm/LLVMContext.h" #include "llvm/Module.h" #include "llvm/PassManager.h" @@ -841,7 +842,6 @@ library, although doing that will cause problems on Windows.</p> #include "llvm/Analysis/Passes.h" #include "llvm/Target/TargetData.h" #include "llvm/Transforms/Scalar.h" -#include "llvm/Support/IRBuilder.h" #include "llvm/Support/TargetSelect.h" #include <cstdio> #include <string> @@ -1823,7 +1823,7 @@ int main() { <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-16 10:06:54 +0200 (Sun, 16 Oct 2011) $ + Last modified: $Date: 2012-07-31 09:05:57 +0200 (Tue, 31 Jul 2012) $ </address> </body> </html> diff --git a/docs/tutorial/LangImpl7.html b/docs/tutorial/LangImpl7.html index ebf6514..f1fe404 100644 --- a/docs/tutorial/LangImpl7.html +++ b/docs/tutorial/LangImpl7.html @@ -7,7 +7,7 @@ construction</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -1002,6 +1002,7 @@ clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 #include "llvm/DerivedTypes.h" #include "llvm/ExecutionEngine/ExecutionEngine.h" #include "llvm/ExecutionEngine/JIT.h" +#include "llvm/IRBuilder.h" #include "llvm/LLVMContext.h" #include "llvm/Module.h" #include "llvm/PassManager.h" @@ -1009,7 +1010,6 @@ clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 #include "llvm/Analysis/Passes.h" #include "llvm/Target/TargetData.h" #include "llvm/Transforms/Scalar.h" -#include "llvm/Support/IRBuilder.h" #include "llvm/Support/TargetSelect.h" #include <cstdio> #include <string> @@ -2158,7 +2158,7 @@ int main() { <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-10-16 10:06:54 +0200 (Sun, 16 Oct 2011) $ + Last modified: $Date: 2012-06-29 14:38:19 +0200 (Fri, 29 Jun 2012) $ </address> </body> </html> diff --git a/docs/tutorial/LangImpl8.html b/docs/tutorial/LangImpl8.html index cc55d40..50fcd8c 100644 --- a/docs/tutorial/LangImpl8.html +++ b/docs/tutorial/LangImpl8.html @@ -6,7 +6,7 @@ <title>Kaleidoscope: Conclusion and other useful LLVM tidbits</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -47,7 +47,7 @@ <div> -<p>Welcome to the the final chapter of the "<a href="index.html">Implementing a +<p>Welcome to the final chapter of the "<a href="index.html">Implementing a language with LLVM</a>" tutorial. In the course of this tutorial, we have grown our little Kaleidoscope language from being a useless toy, to being a semi-interesting (but probably still useless) toy. 
:)</p> @@ -353,7 +353,7 @@ Passing Style</a> and the use of tail calls (which LLVM also supports).</p> <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ + Last modified: $Date: 2012-07-23 10:51:15 +0200 (Mon, 23 Jul 2012) $ </address> </body> </html> diff --git a/docs/tutorial/Makefile b/docs/tutorial/Makefile deleted file mode 100644 index fdf1bb6..0000000 --- a/docs/tutorial/Makefile +++ /dev/null @@ -1,30 +0,0 @@ -##===- docs/tutorial/Makefile ------------------------------*- Makefile -*-===## -# -# The LLVM Compiler Infrastructure -# -# This file is distributed under the University of Illinois Open Source -# License. See LICENSE.TXT for details. -# -##===----------------------------------------------------------------------===## - -LEVEL := ../.. -include $(LEVEL)/Makefile.common - -HTML := $(wildcard $(PROJ_SRC_DIR)/*.html) -PNG := $(wildcard $(PROJ_SRC_DIR)/*.png) -EXTRA_DIST := $(HTML) index.html -HTML_DIR := $(DESTDIR)$(PROJ_docsdir)/html/tutorial - -install-local:: $(HTML) - $(Echo) Installing HTML Tutorial Documentation - $(Verb) $(MKDIR) $(HTML_DIR) - $(Verb) $(DataInstall) $(HTML) $(HTML_DIR) - $(Verb) $(DataInstall) $(PNG) $(HTML_DIR) - $(Verb) $(DataInstall) $(PROJ_SRC_DIR)/index.html $(HTML_DIR) - -uninstall-local:: - $(Echo) Uninstalling Tutorial Documentation - $(Verb) $(RM) -rf $(HTML_DIR) - -printvars:: - $(Echo) "HTML : " '$(HTML)' diff --git a/docs/tutorial/OCamlLangImpl1.html b/docs/tutorial/OCamlLangImpl1.html index 7cae68c..86a395a 100644 --- a/docs/tutorial/OCamlLangImpl1.html +++ b/docs/tutorial/OCamlLangImpl1.html @@ -7,7 +7,7 @@ <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> <meta name="author" content="Erick Tryzelaar"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -359,7 +359,7 @@ include a driver so that you can use the lexer and parser together. 
<a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ + Last modified: $Date: 2012-05-03 00:46:36 +0200 (Thu, 03 May 2012) $ </address> </body> </html> diff --git a/docs/tutorial/OCamlLangImpl2.html b/docs/tutorial/OCamlLangImpl2.html index e1bb871..9bb4c40 100644 --- a/docs/tutorial/OCamlLangImpl2.html +++ b/docs/tutorial/OCamlLangImpl2.html @@ -7,7 +7,7 @@ <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> <meta name="author" content="Erick Tryzelaar"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -1037,7 +1037,7 @@ main () <a href="mailto:sabre@nondot.org">Chris Lattner</a> <a href="mailto:erickt@users.sourceforge.net">Erick Tryzelaar</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ + Last modified: $Date: 2012-05-03 00:46:36 +0200 (Thu, 03 May 2012) $ </address> </body> </html> diff --git a/docs/tutorial/OCamlLangImpl3.html b/docs/tutorial/OCamlLangImpl3.html index c240bb9..e6105e8 100644 --- a/docs/tutorial/OCamlLangImpl3.html +++ b/docs/tutorial/OCamlLangImpl3.html @@ -7,7 +7,7 @@ <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> <meta name="author" content="Erick Tryzelaar"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -1087,7 +1087,7 @@ main () <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-07-15 22:03:30 +0200 (Fri, 15 Jul 2011) $ + Last modified: $Date: 2012-05-03 00:46:36 +0200 (Thu, 03 May 2012) $ </address> </body> </html> diff --git a/docs/tutorial/OCamlLangImpl4.html b/docs/tutorial/OCamlLangImpl4.html index db164d5..e3e2469 100644 --- a/docs/tutorial/OCamlLangImpl4.html +++ b/docs/tutorial/OCamlLangImpl4.html @@ -7,7 +7,7 @@ <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> <meta name="author" content="Erick Tryzelaar"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -270,10 +270,9 @@ add instruction from every execution of this function.</p> <p>LLVM provides a wide variety of optimizations that can be used in certain circumstances. Some <a href="../Passes.html">documentation about the various passes</a> is available, but it isn't very complete. Another good source of -ideas can come from looking at the passes that <tt>llvm-gcc</tt> or -<tt>llvm-ld</tt> run to get started. The "<tt>opt</tt>" tool allows you to -experiment with passes from the command line, so you can see if they do -anything.</p> +ideas can come from looking at the passes that <tt>Clang</tt> runs to get +started. 
The "<tt>opt</tt>" tool allows you to experiment with passes from the +command line, so you can see if they do anything.</p> <p>Now that we have reasonable code coming out of our front-end, lets talk about executing it!</p> @@ -1021,7 +1020,7 @@ extern double putchard(double X) { <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ + Last modified: $Date: 2012-05-03 00:46:36 +0200 (Thu, 03 May 2012) $ </address> </body> </html> diff --git a/docs/tutorial/OCamlLangImpl5.html b/docs/tutorial/OCamlLangImpl5.html index ca79691..994957e 100644 --- a/docs/tutorial/OCamlLangImpl5.html +++ b/docs/tutorial/OCamlLangImpl5.html @@ -7,7 +7,7 @@ <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> <meta name="author" content="Erick Tryzelaar"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -1554,7 +1554,7 @@ operators</a> <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ + Last modified: $Date: 2012-05-03 00:46:36 +0200 (Thu, 03 May 2012) $ </address> </body> </html> diff --git a/docs/tutorial/OCamlLangImpl6.html b/docs/tutorial/OCamlLangImpl6.html index bde429b..cef3884 100644 --- a/docs/tutorial/OCamlLangImpl6.html +++ b/docs/tutorial/OCamlLangImpl6.html @@ -7,7 +7,7 @@ <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> <meta name="author" content="Erick Tryzelaar"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -611,7 +611,7 @@ def mandelhelp(xmin xmax xstep ymin ymax ystep) : putchard(10) ) -# mandel - This is a convenient helper function for ploting the mandelbrot set +# mandel - This is a convenient helper function for plotting the mandelbrot set # from the specified position with the specified Magnification. 
def mandel(realstart imagstart realmag imagmag) mandelhelp(realstart, realstart+realmag*78, realmag, @@ -1568,7 +1568,7 @@ SSA construction</a> <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ + Last modified: $Date: 2012-07-31 09:05:57 +0200 (Tue, 31 Jul 2012) $ </address> </body> </html> diff --git a/docs/tutorial/OCamlLangImpl7.html b/docs/tutorial/OCamlLangImpl7.html index a48e679..abe8913 100644 --- a/docs/tutorial/OCamlLangImpl7.html +++ b/docs/tutorial/OCamlLangImpl7.html @@ -8,7 +8,7 @@ <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> <meta name="author" content="Erick Tryzelaar"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -1898,7 +1898,7 @@ extern double printd(double X) { <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a><br> - Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ + Last modified: $Date: 2012-05-03 00:46:36 +0200 (Thu, 03 May 2012) $ </address> </body> </html> diff --git a/docs/tutorial/OCamlLangImpl8.html b/docs/tutorial/OCamlLangImpl8.html index eed8c03..7c1a500 100644 --- a/docs/tutorial/OCamlLangImpl8.html +++ b/docs/tutorial/OCamlLangImpl8.html @@ -6,7 +6,7 @@ <title>Kaleidoscope: Conclusion and other useful LLVM tidbits</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="author" content="Chris Lattner"> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> @@ -47,7 +47,7 @@ <div> -<p>Welcome to the the final chapter of the "<a href="index.html">Implementing a +<p>Welcome to the final chapter of the "<a href="index.html">Implementing a language with LLVM</a>" tutorial. In the course of this tutorial, we have grown our little Kaleidoscope language from being a useless toy, to being a semi-interesting (but probably still useless) toy. :)</p> diff --git a/docs/tutorial/index.html b/docs/tutorial/index.html index 0a8cae2..2c11a9a 100644 --- a/docs/tutorial/index.html +++ b/docs/tutorial/index.html @@ -7,7 +7,7 @@ <meta name="author" content="Owen Anderson"> <meta name="description" content="LLVM Tutorial: Table of Contents."> - <link rel="stylesheet" href="../llvm.css" type="text/css"> + <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> </head> <body> diff --git a/docs/userguides.rst b/docs/userguides.rst new file mode 100644 index 0000000..26a5a8c --- /dev/null +++ b/docs/userguides.rst @@ -0,0 +1,89 @@ +.. _userguides: + +User Guides +=========== + +.. toctree:: + :hidden: + + CMake + CommandGuide/index + DeveloperPolicy + GettingStartedVS + FAQ + Lexicon + Packaging + +* `The LLVM Getting Started Guide <GettingStarted.html>`_ + + Discusses how to get up and running quickly with the LLVM infrastructure. + Everything from unpacking and compilation of the distribution to execution + of some tools. + +* :ref:`building-with-cmake` + + An addendum to the main Getting Started guide for those using the `CMake + build system <http://www.cmake.org>`_. 
+ +* `Getting Started with the LLVM System using Microsoft Visual Studio + <GettingStartedVS.html>`_ + + An addendum to the main Getting Started guide for those using Visual Studio + on Windows. + +* `LLVM Tutorial <tutorial/>`_ + + A walk through the process of using LLVM for a custom language, and the + facilities LLVM offers in tutorial form. + +* :ref:`developer_policy` + + The LLVM project's policy towards developers and their contributions. + +* :ref:`LLVM Command Guide <commands>` + + A reference manual for the LLVM command line utilities ("man" pages for LLVM + tools). + +* `LLVM's Analysis and Transform Passes <Passes.html>`_ + + A list of optimizations and analyses implemented in LLVM. + +* :ref:`faq` + + A list of common questions and problems and their solutions. + +* `Release notes for the current release <ReleaseNotes.html>`_ + + This describes new features, known bugs, and other limitations. + +* `How to Submit A Bug Report <HowToSubmitABug.html>`_ + + Instructions for properly submitting information about any bugs you run into + in the LLVM system. + +* `LLVM Testing Infrastructure Guide <TestingGuide.html>`_ + + A reference manual for using the LLVM testing infrastructure. + +* `How to build the C, C++, ObjC, and ObjC++ front end <http://clang.llvm.org/get_started.html>`_ + + Instructions for building the clang front-end from source. + +* :ref:`packaging` + + Advice on packaging LLVM into a distribution. + +* :ref:`lexicon` + + Definition of acronyms, terms and concepts used in LLVM. + +* `How To Add Your Build Configuration To LLVM Buildbot Infrastructure <HowToAddABuilder.html>`_ + + Instructions for adding a new builder to the LLVM buildbot master. + +* **IRC** -- You can probably find help on the unofficial LLVM IRC. + + We are often on irc.oftc.net in the #llvm channel. If you are using the + Mozilla browser and have ChatZilla installed, you can `join #llvm on + irc.oftc.net <irc://irc.oftc.net/llvm>`_. diff --git a/docs/yaml2obj.rst b/docs/yaml2obj.rst new file mode 100644 index 0000000..cb59162 --- /dev/null +++ b/docs/yaml2obj.rst @@ -0,0 +1,222 @@ +.. _yaml2obj: + +yaml2obj +======== + +yaml2obj takes a YAML description of an object file and converts it to a binary +file. + + $ yaml2obj input-file + +.. program:: yaml2obj + +Outputs the binary to stdout. + +COFF Syntax +----------- + +Here's a sample COFF file. + +.. code-block:: yaml + + header: + Machine: IMAGE_FILE_MACHINE_I386 # (0x14C) + + sections: + - Name: .text + Characteristics: [ IMAGE_SCN_CNT_CODE + , IMAGE_SCN_ALIGN_16BYTES + , IMAGE_SCN_MEM_EXECUTE + , IMAGE_SCN_MEM_READ + ] # 0x60500020 + SectionData: + "\x83\xEC\x0C\xC7\x44\x24\x08\x00\x00\x00\x00\xC7\x04\x24\x00\x00\x00\x00\xE8\x00\x00\x00\x00\xE8\x00\x00\x00\x00\x8B\x44\x24\x08\x83\xC4\x0C\xC3" # |....D$.......$...............D$.....| + + symbols: + - Name: .text + Value: 0 + SectionNumber: 1 + SimpleType: IMAGE_SYM_TYPE_NULL # (0) + ComplexType: IMAGE_SYM_DTYPE_NULL # (0) + StorageClass: IMAGE_SYM_CLASS_STATIC # (3) + NumberOfAuxSymbols: 1 + AuxillaryData: + "\x24\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00" # |$.................| + + - Name: _main + Value: 0 + SectionNumber: 1 + SimpleType: IMAGE_SYM_TYPE_NULL # (0) + ComplexType: IMAGE_SYM_DTYPE_NULL # (0) + StorageClass: IMAGE_SYM_CLASS_EXTERNAL # (2) + +Here's a simplified Kwalify_ schema with an extension to allow alternate types. + +.. _Kwalify: http://www.kuwata-lab.com/kwalify/ruby/users-guide.html + +..
code-block:: yaml + + type: map + mapping: + header: + type: map + mapping: + Machine: [ {type: str, enum: + [ IMAGE_FILE_MACHINE_UNKNOWN + , IMAGE_FILE_MACHINE_AM33 + , IMAGE_FILE_MACHINE_AMD64 + , IMAGE_FILE_MACHINE_ARM + , IMAGE_FILE_MACHINE_ARMV7 + , IMAGE_FILE_MACHINE_EBC + , IMAGE_FILE_MACHINE_I386 + , IMAGE_FILE_MACHINE_IA64 + , IMAGE_FILE_MACHINE_M32R + , IMAGE_FILE_MACHINE_MIPS16 + , IMAGE_FILE_MACHINE_MIPSFPU + , IMAGE_FILE_MACHINE_MIPSFPU16 + , IMAGE_FILE_MACHINE_POWERPC + , IMAGE_FILE_MACHINE_POWERPCFP + , IMAGE_FILE_MACHINE_R4000 + , IMAGE_FILE_MACHINE_SH3 + , IMAGE_FILE_MACHINE_SH3DSP + , IMAGE_FILE_MACHINE_SH4 + , IMAGE_FILE_MACHINE_SH5 + , IMAGE_FILE_MACHINE_THUMB + , IMAGE_FILE_MACHINE_WCEMIPSV2 + ]} + , {type: int} + ] + Characteristics: + - type: seq + sequence: + - type: str + enum: [ IMAGE_FILE_RELOCS_STRIPPED + , IMAGE_FILE_EXECUTABLE_IMAGE + , IMAGE_FILE_LINE_NUMS_STRIPPED + , IMAGE_FILE_LOCAL_SYMS_STRIPPED + , IMAGE_FILE_AGGRESSIVE_WS_TRIM + , IMAGE_FILE_LARGE_ADDRESS_AWARE + , IMAGE_FILE_BYTES_REVERSED_LO + , IMAGE_FILE_32BIT_MACHINE + , IMAGE_FILE_DEBUG_STRIPPED + , IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP + , IMAGE_FILE_NET_RUN_FROM_SWAP + , IMAGE_FILE_SYSTEM + , IMAGE_FILE_DLL + , IMAGE_FILE_UP_SYSTEM_ONLY + , IMAGE_FILE_BYTES_REVERSED_HI + ] + - type: int + sections: + type: seq + sequence: + - type: map + mapping: + Name: {type: str} + Characteristics: + - type: seq + sequence: + - type: str + enum: [ IMAGE_SCN_TYPE_NO_PAD + , IMAGE_SCN_CNT_CODE + , IMAGE_SCN_CNT_INITIALIZED_DATA + , IMAGE_SCN_CNT_UNINITIALIZED_DATA + , IMAGE_SCN_LNK_OTHER + , IMAGE_SCN_LNK_INFO + , IMAGE_SCN_LNK_REMOVE + , IMAGE_SCN_LNK_COMDAT + , IMAGE_SCN_GPREL + , IMAGE_SCN_MEM_PURGEABLE + , IMAGE_SCN_MEM_16BIT + , IMAGE_SCN_MEM_LOCKED + , IMAGE_SCN_MEM_PRELOAD + , IMAGE_SCN_ALIGN_1BYTES + , IMAGE_SCN_ALIGN_2BYTES + , IMAGE_SCN_ALIGN_4BYTES + , IMAGE_SCN_ALIGN_8BYTES + , IMAGE_SCN_ALIGN_16BYTES + , IMAGE_SCN_ALIGN_32BYTES + , IMAGE_SCN_ALIGN_64BYTES + , IMAGE_SCN_ALIGN_128BYTES + , IMAGE_SCN_ALIGN_256BYTES + , IMAGE_SCN_ALIGN_512BYTES + , IMAGE_SCN_ALIGN_1024BYTES + , IMAGE_SCN_ALIGN_2048BYTES + , IMAGE_SCN_ALIGN_4096BYTES + , IMAGE_SCN_ALIGN_8192BYTES + , IMAGE_SCN_LNK_NRELOC_OVFL + , IMAGE_SCN_MEM_DISCARDABLE + , IMAGE_SCN_MEM_NOT_CACHED + , IMAGE_SCN_MEM_NOT_PAGED + , IMAGE_SCN_MEM_SHARED + , IMAGE_SCN_MEM_EXECUTE + , IMAGE_SCN_MEM_READ + , IMAGE_SCN_MEM_WRITE + ] + - type: int + SectionData: {type: str} + symbols: + type: seq + sequence: + - type: map + mapping: + Name: {type: str} + Value: {type: int} + SectionNumber: {type: int} + SimpleType: [ {type: str, enum: [ IMAGE_SYM_TYPE_NULL + , IMAGE_SYM_TYPE_VOID + , IMAGE_SYM_TYPE_CHAR + , IMAGE_SYM_TYPE_SHORT + , IMAGE_SYM_TYPE_INT + , IMAGE_SYM_TYPE_LONG + , IMAGE_SYM_TYPE_FLOAT + , IMAGE_SYM_TYPE_DOUBLE + , IMAGE_SYM_TYPE_STRUCT + , IMAGE_SYM_TYPE_UNION + , IMAGE_SYM_TYPE_ENUM + , IMAGE_SYM_TYPE_MOE + , IMAGE_SYM_TYPE_BYTE + , IMAGE_SYM_TYPE_WORD + , IMAGE_SYM_TYPE_UINT + , IMAGE_SYM_TYPE_DWORD + ]} + , {type: int} + ] + ComplexType: [ {type: str, enum: [ IMAGE_SYM_DTYPE_NULL + , IMAGE_SYM_DTYPE_POINTER + , IMAGE_SYM_DTYPE_FUNCTION + , IMAGE_SYM_DTYPE_ARRAY + ]} + , {type: int} + ] + StorageClass: [ {type: str, enum: + [ IMAGE_SYM_CLASS_END_OF_FUNCTION + , IMAGE_SYM_CLASS_NULL + , IMAGE_SYM_CLASS_AUTOMATIC + , IMAGE_SYM_CLASS_EXTERNAL + , IMAGE_SYM_CLASS_STATIC + , IMAGE_SYM_CLASS_REGISTER + , IMAGE_SYM_CLASS_EXTERNAL_DEF + , IMAGE_SYM_CLASS_LABEL + , IMAGE_SYM_CLASS_UNDEFINED_LABEL + , IMAGE_SYM_CLASS_MEMBER_OF_STRUCT + , 
IMAGE_SYM_CLASS_ARGUMENT + , IMAGE_SYM_CLASS_STRUCT_TAG + , IMAGE_SYM_CLASS_MEMBER_OF_UNION + , IMAGE_SYM_CLASS_UNION_TAG + , IMAGE_SYM_CLASS_TYPE_DEFINITION + , IMAGE_SYM_CLASS_UNDEFINED_STATIC + , IMAGE_SYM_CLASS_ENUM_TAG + , IMAGE_SYM_CLASS_MEMBER_OF_ENUM + , IMAGE_SYM_CLASS_REGISTER_PARAM + , IMAGE_SYM_CLASS_BIT_FIELD + , IMAGE_SYM_CLASS_BLOCK + , IMAGE_SYM_CLASS_FUNCTION + , IMAGE_SYM_CLASS_END_OF_STRUCT + , IMAGE_SYM_CLASS_FILE + , IMAGE_SYM_CLASS_SECTION + , IMAGE_SYM_CLASS_WEAK_EXTERNAL + , IMAGE_SYM_CLASS_CLR_TOKEN + ]} + , {type: int} + ]
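As a minimal usage sketch (the coff.yaml and coff.obj file names are illustrative), the COFF description shown above can be assembled into an object file by redirecting yaml2obj's standard output:

    # illustrative file names; yaml2obj writes the binary to stdout
    $ yaml2obj coff.yaml > coff.obj

The resulting file can then be examined with ordinary object-file inspection tools to confirm that the described header, sections, and symbols were emitted.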