Diffstat (limited to 'docs/GetElementPtr.html')
-rw-r--r-- | docs/GetElementPtr.html | 232 |
1 files changed, 123 insertions, 109 deletions
diff --git a/docs/GetElementPtr.html b/docs/GetElementPtr.html
index 5410137..4c347a6 100644
--- a/docs/GetElementPtr.html
+++ b/docs/GetElementPtr.html
@@ -11,9 +11,9 @@
 </head>
 <body>
 
-<div class="doc_title">
+<h1>
   The Often Misunderstood GEP Instruction
-</div>
+</h1>
 
 <ol>
   <li><a href="#intro">Introduction</a></li>
@@ -58,10 +58,10 @@
 
 
 <!-- *********************************************************************** -->
-<div class="doc_section"><a name="intro"><b>Introduction</b></a></div>
+<h2><a name="intro">Introduction</a></h2>
 <!-- *********************************************************************** -->
 
-<div class="doc_text">
+<div>
   <p>This document seeks to dispel the mystery and confusion surrounding LLVM's
   <a href="LangRef.html#i_getelementptr">GetElementPtr</a> (GEP) instruction.
   Questions about the wily GEP instruction are
@@ -72,21 +72,20 @@
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_section"><a name="addresses"><b>Address Computation</b></a></div>
+<h2><a name="addresses">Address Computation</a></h2>
 <!-- *********************************************************************** -->
-<div class="doc_text">
+<div>
   <p>When people are first confronted with the GEP instruction, they tend to
   relate it to known concepts from other programming paradigms, most notably C
   array indexing and field selection. GEP closely resembles C array indexing
   and field selection, however it's is a little different and this leads to
   the following questions.</p>
-</div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="firstptr"><b>What is the first index of the GEP instruction?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="firstptr">What is the first index of the GEP instruction?</a>
+</h3>
+<div>
   <p>Quick answer: The index stepping through the first operand.</p>
   <p>The confusion with the first index usually arises from thinking about
   the GetElementPtr instruction as if it was a C index operator. They aren't the
@@ -205,11 +204,11 @@ idx3 = (char*) &MyVar + 8
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="extra_index"><b>Why is the extra 0 index required?</b></a>
-</div>
+<h3>
+  <a name="extra_index">Why is the extra 0 index required?</a>
+</h3>
 <!-- *********************************************************************** -->
-<div class="doc_text">
+<div>
   <p>Quick answer: there are no superfluous indices.</p>
   <p>This question arises most often when the GEP instruction is applied to a
   global variable which is always a pointer type. For example, consider
@@ -247,10 +246,10 @@ idx3 = (char*) &MyVar + 8
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="deref"><b>What is dereferenced by GEP?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="deref">What is dereferenced by GEP?</a>
+</h3>
+<div>
   <p>Quick answer: nothing.</p>
   <p>The GetElementPtr instruction dereferences nothing. That is, it doesn't
   access memory in any way. That's what the Load and Store instructions are for.
@@ -302,10 +301,10 @@ idx3 = (char*) &MyVar + 8
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="lead0"><b>Why don't GEP x,0,0,1 and GEP x,1 alias?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="lead0">Why don't GEP x,0,0,1 and GEP x,1 alias?</a>
+</h3>
+<div>
   <p>Quick Answer: They compute different address locations.</p>
   <p>If you look at the first indices in these GEP
   instructions you find that they are different (0 and 1), therefore the address
@@ -331,10 +330,10 @@ idx3 = (char*) &MyVar + 8
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="trail0"><b>Why do GEP x,1,0,0 and GEP x,1 alias?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="trail0">Why do GEP x,1,0,0 and GEP x,1 alias?</a>
+</h3>
+<div>
   <p>Quick Answer: They compute the same address location.</p>
   <p>These two GEP instructions will compute the same address because indexing
   through the 0th element does not change the address. However, it does change
@@ -355,10 +354,10 @@ idx3 = (char*) &MyVar + 8
 
 <!-- *********************************************************************** -->
 
-<div class="doc_subsection">
-  <a name="vectors"><b>Can GEP index into vector elements?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="vectors">Can GEP index into vector elements?</a>
+</h3>
+<div>
   <p>This hasn't always been forcefully disallowed, though it's not recommended.
   It leads to awkward special cases in the optimizers, and fundamental
   inconsistency in the IR. In the future, it will probably be outright
@@ -368,10 +367,10 @@ idx3 = (char*) &MyVar + 8
 
 <!-- *********************************************************************** -->
 
-<div class="doc_subsection">
-  <a name="addrspace"><b>What effect do address spaces have on GEPs?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="addrspace">What effect do address spaces have on GEPs?</a>
+</h3>
+<div>
   <p>None, except that the address space qualifier on the first operand pointer
   type always matches the address space qualifier on the result type.</p>
 
@@ -379,11 +378,12 @@ idx3 = (char*) &MyVar + 8
 
 <!-- *********************************************************************** -->
 
-<div class="doc_subsection">
-  <a name="int"><b>How is GEP different from ptrtoint, arithmetic,
-                   and inttoptr?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="int">
+    How is GEP different from ptrtoint, arithmetic, and inttoptr?
+  </a>
+</h3>
+<div>
   <p>It's very similar; there are only subtle differences.</p>
 
   <p>With ptrtoint, you have to pick an integer type. One approach is to pick i64;
@@ -409,11 +409,13 @@ idx3 = (char*) &MyVar + 8
 
 <!-- *********************************************************************** -->
 
-<div class="doc_subsection">
-  <a name="be"><b>I'm writing a backend for a target which needs custom
-                  lowering for GEP. How do I do this?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="be">
+    I'm writing a backend for a target which needs custom lowering for GEP.
+    How do I do this?
+  </a>
+</h3>
+<div>
   <p>You don't. The integer computation implied by a GEP is target-independent.
   Typically what you'll need to do is make your backend pattern-match
   expressions trees involving ADD, MUL, etc., which are what GEP is lowered
@@ -431,10 +433,10 @@ idx3 = (char*) &MyVar + 8
 
 <!-- *********************************************************************** -->
 
-<div class="doc_subsection">
-  <a name="vla"><b>How does VLA addressing work with GEPs?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="vla">How does VLA addressing work with GEPs?</a>
+</h3>
+<div>
   <p>GEPs don't natively support VLAs. LLVM's type system is entirely static,
   and GEP address computations are guided by an LLVM type.</p>
 
@@ -450,16 +452,18 @@ idx3 = (char*) &MyVar + 8
   VLA and non-VLA indexing in the same manner.</p>
 </div>
 
+</div>
+
 <!-- *********************************************************************** -->
-<div class="doc_section"><a name="rules"><b>Rules</b></a></div>
+<h2><a name="rules">Rules</a></h2>
 <!-- *********************************************************************** -->
-
+<div>
 <!-- *********************************************************************** -->
 
-<div class="doc_subsection">
-  <a name="bounds"><b>What happens if an array index is out of bounds?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="bounds">What happens if an array index is out of bounds?</a>
+</h3>
+<div>
   <p>There are two senses in which an array index can be out of bounds.</p>
 
   <p>First, there's the array type which comes from the (static) type of
@@ -498,20 +502,20 @@ idx3 = (char*) &MyVar + 8
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="negative"><b>Can array indices be negative?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="negative">Can array indices be negative?</a>
+</h3>
+<div>
   <p>Yes. This is basically a special case of array indices being out
   of bounds.</p>
 
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="compare"><b>Can I compare two values computed with GEPs?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="compare">Can I compare two values computed with GEPs?</a>
+</h3>
+<div>
   <p>Yes. If both addresses are within the same allocated object, or
   one-past-the-end, you'll get the comparison result you expect. If either
   is outside of it, integer arithmetic wrapping may occur, so the
@@ -520,11 +524,13 @@ idx3 = (char*) &MyVar + 8
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="types"><b>Can I do GEP with a different pointer type than the type of
-                     the underlying object?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="types">
+    Can I do GEP with a different pointer type than the type of
+    the underlying object?
+  </a>
+</h3>
+<div>
   <p>Yes. There are no restrictions on bitcasting a pointer value to an arbitrary
   pointer type. The types in a GEP serve only to define the parameters for the
   underlying integer computation. They need not correspond with the actual
@@ -538,11 +544,12 @@ idx3 = (char*) &MyVar + 8
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="null"><b>Can I cast an object's address to integer and add it
-                    to null?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="null">
+    Can I cast an object's address to integer and add it to null?
+  </a>
+</h3>
+<div>
   <p>You can compute an address that way, but if you use GEP to do the add,
   you can't use that pointer to actually access the object, unless the
   object is managed outside of LLVM.</p>
@@ -562,11 +569,13 @@ idx3 = (char*) &MyVar + 8
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="ptrdiff"><b>Can I compute the distance between two objects, and add
-                       that value to one address to compute the other address?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="ptrdiff">
+    Can I compute the distance between two objects, and add
+    that value to one address to compute the other address?
+  </a>
+</h3>
+<div>
   <p>As with arithmetic on null, You can use GEP to compute an address that
   way, but you can't use that pointer to actually access the object if you
   do, unless the object is managed outside of LLVM.</p>
@@ -577,10 +586,10 @@ idx3 = (char*) &MyVar + 8
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="tbaa"><b>Can I do type-based alias analysis on LLVM IR?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="tbaa">Can I do type-based alias analysis on LLVM IR?</a>
+</h3>
+<div>
   <p>You can't do type-based alias analysis using LLVM's built-in type system,
   because LLVM has no restrictions on mixing types in addressing, loads or
   stores.</p>
@@ -594,10 +603,10 @@ idx3 = (char*) &MyVar + 8
 
 <!-- *********************************************************************** -->
 
-<div class="doc_subsection">
-  <a name="overflow"><b>What happens if a GEP computation overflows?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="overflow">What happens if a GEP computation overflows?</a>
+</h3>
+<div>
   <p>If the GEP lacks the <tt>inbounds</tt> keyword, the value is the result
   from evaluating the implied two's complement integer computation. However,
   since there's no guarantee of where an object will be allocated in the
@@ -624,11 +633,12 @@ idx3 = (char*) &MyVar + 8
 
 <!-- *********************************************************************** -->
 
-<div class="doc_subsection">
-  <a name="check"><b>How can I tell if my front-end is following the
-                     rules?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="check">
+    How can I tell if my front-end is following the rules?
+  </a>
+</h3>
+<div>
   <p>There is currently no checker for the getelementptr rules. Currently,
   the only way to do this is to manually check each place in your front-end
   where GetElementPtr operators are created.</p>
@@ -641,16 +651,18 @@ idx3 = (char*) &MyVar + 8
 
 </div>
 
+</div>
+
 <!-- *********************************************************************** -->
-<div class="doc_section"><a name="rationale"><b>Rationale</b></a></div>
+<h2><a name="rationale">Rationale</a></h2>
 <!-- *********************************************************************** -->
-
+<div>
 <!-- *********************************************************************** -->
 
-<div class="doc_subsection">
-  <a name="goals"><b>Why is GEP designed this way?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="goals">Why is GEP designed this way?</a>
+</h3>
+<div>
   <p>The design of GEP has the following goals, in rough unofficial
   order of priority:</p>
   <ul>
@@ -669,10 +681,10 @@ idx3 = (char*) &MyVar + 8
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_subsection">
-  <a name="i32"><b>Why do struct member indices always use i32?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="i32">Why do struct member indices always use i32?</a>
+</h3>
+<div>
   <p>The specific type i32 is probably just a historical artifact, however it's
   wide enough for all practical purposes, so there's been no need to change it.
   It doesn't necessarily imply i32 address arithmetic; it's just an identifier
@@ -684,10 +696,10 @@ idx3 = (char*) &MyVar + 8
 
 <!-- *********************************************************************** -->
 
-<div class="doc_subsection">
-  <a name="uglygep"><b>What's an uglygep?</b></a>
-</div>
-<div class="doc_text">
+<h3>
+  <a name="uglygep">What's an uglygep?</a>
+</h3>
+<div>
   <p>Some LLVM optimizers operate on GEPs by internally lowering them into
   more primitive integer expressions, which allows them to be combined
   with other integer expressions and/or split into multiple separate
@@ -704,11 +716,13 @@ idx3 = (char*) &MyVar + 8
 
 </div>
 
+</div>
+
 <!-- *********************************************************************** -->
-<div class="doc_section"><a name="summary"><b>Summary</b></a></div>
+<h2><a name="summary">Summary</a></h2>
 <!-- *********************************************************************** -->
 
-<div class="doc_text">
+<div>
   <p>In summary, here's some things to always remember about the GetElementPtr
   instruction:</p>
   <ol>
@@ -732,8 +746,8 @@ idx3 = (char*) &MyVar + 8
   src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
   <a href="http://validator.w3.org/check/referer"><img
   src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
-  <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br/>
-  Last modified: $Date: 2011-02-11 22:50:52 +0100 (Fri, 11 Feb 2011) $
+  <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br/>
+  Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $
 </address>
 </body>
 </html>