From 2b066988909948dc3d53d01760bc2d71d32f3feb Mon Sep 17 00:00:00 2001
From: dim
Date: Mon, 2 May 2011 19:34:44 +0000
Subject: Vendor import of llvm trunk r130700: http://llvm.org/svn/llvm-project/llvm/trunk@130700

---
 docs/BitCodeFormat.html | 401 +++++++++++++++++++++++-------------------------
 1 file changed, 192 insertions(+), 209 deletions(-)
 (limited to 'docs/BitCodeFormat.html')

diff --git a/docs/BitCodeFormat.html b/docs/BitCodeFormat.html
index 8d3d382..9a042a0 100644
--- a/docs/BitCodeFormat.html
+++ b/docs/BitCodeFormat.html
@@ -7,7 +7,7 @@
-
LLVM Bitcode File Format
+

LLVM Bitcode File Format

  1. Abstract
  2. Overview
  3. @@ -47,10 +47,10 @@ - +

    Abstract

    -
    +

    This document describes the LLVM bitstream file format and the encoding of the LLVM IR into it.

    @@ -58,10 +58,10 @@ the LLVM IR into it.

    - +

    Overview

    -
    +

    What is commonly known as the LLVM bitcode file format (also, sometimes @@ -88,10 +88,10 @@ wrapper format, then describes the record structure used by LLVM IR files.

    - +

    Bitstream Format

    -
    +

    The bitstream format is literally a stream of bits, with a very simple @@ -114,13 +114,12 @@ llvm-bcanalyzer tool can be used to dump and inspect arbitrary bitstreams, which is very useful for understanding the encoding.

    -
    - - +

    + Magic Numbers +

    -
    +

    The first two bytes of a bitcode file are 'BC' (0x42, 0x43). The second two bytes are an application-specific magic number. Generic @@ -130,10 +129,11 @@ bitcode, while application-specific programs will want to look at all four.
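The two-level check described above can be sketched in Python. This is a minimal illustration, not code from LLVM: `split_magic` and `BITCODE_MAGIC` are hypothetical names, and the caller is assumed to have already read the first four bytes of the file.

```python
BITCODE_MAGIC = b"BC"  # 0x42, 0x43, as stated in the text


def split_magic(header: bytes):
    """Return (is_bitstream, app_magic) for the first four bytes of a file.

    is_bitstream is True when the first two bytes are 'BC'.
    app_magic is the application-specific second pair of bytes.
    """
    if len(header) < 4:
        raise ValueError("need at least four bytes")
    is_bitstream = header[:2] == BITCODE_MAGIC
    app_magic = header[2:4]
    return is_bitstream, app_magic
```

A generic tool would only test `is_bitstream`; an application-specific reader would also compare `app_magic` against its own expected value.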

    - +

    + Primitives +

    -
    +

    A bitstream literally consists of a stream of bits, which are read in order @@ -144,13 +144,12 @@ Width Integers or as Variable Width Integers.

    -
    - - +

    + Fixed Width Integers +

    -
    +

    Fixed-width integer values have their low bits emitted directly to the file. For example, a 3-bit integer value encodes 1 as 001. Fixed width integers @@ -161,10 +160,11 @@ Integers.

    - +

    + Variable Width Integers +

    -
    +

    Variable-width integer (VBR) values encode values of arbitrary size, optimizing for the case where the values are small. Given a 4-bit VBR field, @@ -182,9 +182,9 @@ value of 24 (011 << 3) with no continuation. The sum (3+24) yields the value
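The chunking rule can be sketched in Python as follows. This is a minimal illustration under the assumption that chunks are produced low bits first, as in the 3 + 24 example above; `encode_vbr` and `decode_vbr` are hypothetical helper names, and each chunk is returned as an integer rather than written to a bit stream.

```python
def encode_vbr(value: int, width: int) -> list[int]:
    """Split a non-negative integer into width-bit VBR chunks, low bits first."""
    if value < 0 or width < 2:
        raise ValueError("value must be >= 0 and width >= 2")
    payload_bits = width - 1
    high_bit = 1 << payload_bits      # continuation flag
    mask = high_bit - 1               # low (width - 1) payload bits
    chunks = []
    while value >= high_bit:
        chunks.append((value & mask) | high_bit)  # more chunks follow
        value >>= payload_bits
    chunks.append(value)              # final chunk, continuation bit clear
    return chunks


def decode_vbr(chunks: list[int], width: int) -> int:
    """Reassemble a value from width-bit VBR chunks."""
    payload_bits = width - 1
    high_bit = 1 << payload_bits
    value = 0
    shift = 0
    for chunk in chunks:
        value |= (chunk & (high_bit - 1)) << shift
        shift += payload_bits
        if not chunk & high_bit:
            return value
    raise ValueError("truncated VBR: no chunk with a clear continuation bit")
```

With these helpers, `encode_vbr(27, 4)` yields the two chunks 1011 and 0011, and `decode_vbr` on those chunks recovers 3 + (3 << 3) = 27.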

    - +

    6-bit characters

    -
    +

    6-bit characters encode common characters into a fixed 6-bit field. They represent the following characters with the following 6-bit values:

    @@ -206,9 +206,9 @@ characters not in the set.

    - +

    Word Alignment

    -
    +

    Occasionally, it is useful to emit zero bits until the bitstream is a multiple of 32 bits. This ensures that the bit position in the stream can be @@ -216,12 +216,14 @@ represented as a multiple of 32-bit words.
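The amount of padding is a one-line computation. A minimal Python sketch, where `bits_to_align32` is a hypothetical helper name and `bit_pos` is assumed to be the current bit offset from the start of the stream:

```python
def bits_to_align32(bit_pos: int) -> int:
    """Number of zero bits to emit so that bit_pos becomes a multiple of 32."""
    return -bit_pos % 32
```

For example, a stream at bit position 5 needs 27 zero bits, while a stream already at position 32 needs none.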

    +
    - +

    + Abbreviation IDs +

    -
    +

    A bitstream is a sequential series of Blocks and @@ -253,10 +255,11 @@ an abbreviated record encoding.

    - +

    + Blocks +

    -
    +

    Blocks in a bitstream denote nested regions of the stream, and are identified by @@ -297,13 +300,10 @@ its own set of abbreviations, and its own abbrev id width. When a sub-block is popped, the saved values are restored.

    -
    - - +

    ENTER_SUBBLOCK Encoding

    -
    +

    [ENTER_SUBBLOCK, blockid_vbr8, newabbrevlen_vbr4, <align32bits>, blocklen_32]

    @@ -322,10 +322,9 @@ reader to skip over the entire block in one jump.
    - +

    END_BLOCK Encoding

    -
    +

    [END_BLOCK, <align32bits>]

    @@ -337,13 +336,14 @@ an even multiple of 32-bits.
    - +
    - +

    + Data Records +

    -
    +

    Data records consist of a record code and a number of (up to) 64-bit integer values. The interpretation of the code and values is @@ -355,13 +355,10 @@ which encodes the target triple of a module. The code is ASCII codes for the characters in the string.

    -
    - - +

    UNABBREV_RECORD Encoding

    -
    +

    [UNABBREV_RECORD, code_vbr6, numops_vbr6, op0_vbr6, op1_vbr6, ...]

    @@ -385,10 +382,9 @@ bits. This is not an efficient encoding, but it is fully general.
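Reading an unabbreviated record can be sketched in Python as below. This is a minimal illustration, not LLVM's reader: `BitReader` and `read_unabbrev_record` are hypothetical names, bits are assumed to be consumed starting from the least significant bit of each byte, and the abbreviation ID that selects UNABBREV_RECORD is assumed to have been read already.

```python
class BitReader:
    """Reads bits from a byte string, least significant bit of each byte first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def read_fixed(self, width: int) -> int:
        """Read a fixed-width integer, low bits first."""
        value = 0
        for i in range(width):
            byte = self.data[self.pos >> 3]
            value |= ((byte >> (self.pos & 7)) & 1) << i
            self.pos += 1
        return value

    def read_vbr(self, width: int) -> int:
        """Read a variable-width integer made of width-bit chunks."""
        high_bit = 1 << (width - 1)
        value, shift = 0, 0
        while True:
            chunk = self.read_fixed(width)
            value |= (chunk & (high_bit - 1)) << shift
            if not chunk & high_bit:
                return value
            shift += width - 1


def read_unabbrev_record(reader: BitReader):
    """Decode the code, operand count, and operands, each as vbr6."""
    code = reader.read_vbr(6)
    numops = reader.read_vbr(6)
    ops = [reader.read_vbr(6) for _ in range(numops)]
    return code, ops
```

Every operand costs at least six bits regardless of its magnitude, which is why the text calls this encoding general but not efficient.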
    - +

    Abbreviated Record Encoding

    -
    +

    [<abbrevid>, fields...]

    @@ -409,11 +405,14 @@ operand value).

    - - -
    + +

    + Abbreviations +

    + +

    Abbreviations are an important form of compression for bitstreams. The idea is to specify a dense encoding for a class of records once, then use that encoding @@ -431,13 +430,11 @@ As a concrete example, LLVM IR files usually emit an abbreviation for binary operators. If a specific LLVM module contained no or few binary operators, the abbreviation does not need to be emitted.

    -
    - +

    DEFINE_ABBREV Encoding

    -
    +

    [DEFINE_ABBREV, numabbrevops_vbr5, abbrevop0, abbrevop1, ...]

    @@ -552,11 +549,14 @@ used for any other string value.
    - - -
    + +

    + Standard Blocks +

    + +

    In addition to the basic block structure and record encodings, the bitstream @@ -565,13 +565,10 @@ stream is to be decoded or other metadata. In the future, new standard blocks may be added. Block IDs 0-7 are reserved for standard blocks.

    -
    - - +

    #0 - BLOCKINFO Block

    -
    +

    The BLOCKINFO block allows the description of metadata for other @@ -620,11 +617,15 @@ from the corresponding blocks. It is not safe to skip them.

    +
    + +
    + - +

    Bitcode Wrapper Format

    -
    +

    Bitcode files for LLVM IR may optionally be wrapped in a simple wrapper @@ -652,10 +653,10 @@ value that can be used to encode the CPU of the target.

    - +

    LLVM IR Encoding

    -
    +

    LLVM IR is encoded into a bitstream by defining blocks and records. It uses @@ -666,16 +667,17 @@ that the writer uses, as these are fully self-described in the file, and the reader is not allowed to build in any knowledge of this.

    -
    - - +

    + Basics +

    + +
    - +

    LLVM IR Magic Number

    -
    +

    The magic number for LLVM IR files is: @@ -695,9 +697,9 @@ When combined with the bitcode magic number and viewed as bytes, this is

    - +

    Signed VBRs

    -
    +

    Variable Width Integer encoding is an efficient way to @@ -728,9 +730,9 @@ within CONSTANTS_BLOCK blocks. -

    +

    LLVM IR Blocks

    -
    +

    LLVM IR is defined with the following blocks: @@ -758,11 +760,14 @@ LLVM IR is defined with the following blocks:

    - - -
    + +

    + MODULE_BLOCK Contents +

    + +

    The MODULE_BLOCK block (id 8) is the top-level block for LLVM bitcode files, and each bitcode file must contain exactly one. In @@ -782,13 +787,10 @@ following sub-blocks:

  4. METADATA_BLOCK
  5. -
    - - +

    MODULE_CODE_VERSION Record

    -
    +

    [VERSION, version#]

    @@ -798,10 +800,9 @@ time.

    - +

    MODULE_CODE_TRIPLE Record

    -
    +

    [TRIPLE, ...string...]

    The TRIPLE record (code 2) contains a variable number of @@ -810,10 +811,9 @@ specification string.

    - +

    MODULE_CODE_DATALAYOUT Record

    -
    +

    [DATALAYOUT, ...string...]

    The DATALAYOUT record (code 3) contains a variable number of @@ -822,10 +822,9 @@ specification string.

    - +

    MODULE_CODE_ASM Record

    -
    +

    [ASM, ...string...]

    The ASM record (code 4) contains a variable number of @@ -834,10 +833,9 @@ individual assembly blocks separated by newline (ASCII 10) characters.

    - +

    MODULE_CODE_SECTIONNAME Record

    -
    +

    [SECTIONNAME, ...string...]

    The SECTIONNAME record (code 5) contains a variable number @@ -850,10 +848,9 @@ referenced by the 1-based index in the section fields of

    - +

    MODULE_CODE_DEPLIB Record

    -
    +

    [DEPLIB, ...string...]

    The DEPLIB record (code 6) contains a variable number of @@ -864,10 +861,9 @@ library name referenced.

    - +

    MODULE_CODE_GLOBALVAR Record

    -
    +

    [GLOBALVAR, pointer type, isconst, initid, linkage, alignment, section, visibility, threadlocal]

    The GLOBALVAR record (code 7) marks the declaration or @@ -923,16 +919,15 @@ encoding of the visibility of this variable: is thread_local

  6. unnamed_addr: If present and non-zero, indicates that the variable -has unnamed_addr
  7. +has unnamed_addr
    - +

    MODULE_CODE_FUNCTION Record

    -
    +

    [FUNCTION, type, callingconv, isproto, linkage, paramattr, alignment, section, visibility, gc]

    @@ -980,16 +975,15 @@ index in the table of MODULE_CODE_GCNAME entries.
  8. unnamed_addr: If present and non-zero, indicates that the function -has unnamed_addr
  9. +has unnamed_addr
    - +

    MODULE_CODE_ALIAS Record

    -
    +

    [ALIAS, alias type, aliasee val#, linkage, visibility]

    @@ -1011,10 +1005,9 @@ for this alias
    - +

    MODULE_CODE_PURGEVALS Record

    -
    +

    [PURGEVALS, numvals]

    The PURGEVALS record (code 10) resets the module-level @@ -1025,10 +1018,9 @@ new value indices will start from the given numvals value.

    - +

    MODULE_CODE_GCNAME Record

    -
    +

    [GCNAME, ...string...]

    The GCNAME record (code 11) contains a variable number of @@ -1039,11 +1031,14 @@ the module. These records can be referenced by 1-based index in the gc fields of FUNCTION records.

    - - -
    + +

    + PARAMATTR_BLOCK Contents +

    + +

    The PARAMATTR_BLOCK block (id 9) contains a table of entries describing the attributes of function parameters. These @@ -1057,14 +1052,10 @@ INST_CALL records.

    that each is unique (i.e., no two indices represent equivalent attribute lists).

    -
    - - - +

    PARAMATTR_CODE_ENTRY Record

    -
    +

    [ENTRY, paramidx0, attr0, paramidx1, attr1...]

    @@ -1105,11 +1096,14 @@ the logarithm base 2 of the requested alignment, plus 1
    - - -
    + +

    + TYPE_BLOCK Contents +

    + +

    The TYPE_BLOCK block (id 10) contains records which constitute a table of type operator entries used to represent types @@ -1124,13 +1118,10 @@ type operator records. each entry is unique (i.e., no two indices represent structurally equivalent types).

    -
    - - +

    TYPE_CODE_NUMENTRY Record

    -
    +

    [NUMENTRY, numentries]

    @@ -1142,10 +1133,9 @@ in the block.
    - +

    TYPE_CODE_VOID Record

    -
    +

    [VOID]

    @@ -1155,10 +1145,9 @@ type table.
    - +

    TYPE_CODE_FLOAT Record

    -
    +

    [FLOAT]

    @@ -1168,10 +1157,9 @@ floating point) type to the type table.
    - +

    TYPE_CODE_DOUBLE Record

    -
    +

    [DOUBLE]

    @@ -1181,10 +1169,9 @@ floating point) type to the type table.
    - +

    TYPE_CODE_LABEL Record

    -
    +

    [LABEL]

    @@ -1194,10 +1181,9 @@ the type table.
    - +

    TYPE_CODE_OPAQUE Record

    -
    +

    [OPAQUE]

    @@ -1208,10 +1194,9 @@ unified.
    - +

    TYPE_CODE_INTEGER Record

    -
    +

    [INTEGER, width]

    @@ -1222,10 +1207,9 @@ integer type.
    - +

    TYPE_CODE_POINTER Record

    -
    +

    [POINTER, pointee type, address space]

    @@ -1243,10 +1227,9 @@ default address space is zero.
    - +

    TYPE_CODE_FUNCTION Record

    -
    +

    [FUNCTION, vararg, ignored, retty, ...paramty... ]

    @@ -1268,10 +1251,9 @@ parameter types of the function
    - +

    TYPE_CODE_STRUCT Record

    -
    +

    [STRUCT, ispacked, ...eltty...]

    @@ -1287,10 +1269,9 @@ types of the structure
    - +

    TYPE_CODE_ARRAY Record

    -
    +

    [ARRAY, numelts, eltty]

    @@ -1305,10 +1286,9 @@ table. The operand fields are

    - +

    TYPE_CODE_VECTOR Record

    -
    +

    [VECTOR, numelts, eltty]

    @@ -1323,10 +1303,9 @@ table. The operand fields are

    - +

    TYPE_CODE_X86_FP80 Record

    -
    +

    [X86_FP80]

    @@ -1336,10 +1315,9 @@ floating point) type to the type table.
    - +

    TYPE_CODE_FP128 Record

    -
    +

    [FP128]

    @@ -1349,10 +1327,9 @@ floating point) type to the type table.
    - +

    TYPE_CODE_PPC_FP128 Record

    -
    +

    [PPC_FP128]

    @@ -1362,10 +1339,9 @@ floating point) type to the type table.
    - +

    TYPE_CODE_METADATA Record

    -
    +

    [METADATA]

    @@ -1374,11 +1350,14 @@ type to the type table.

    - - -
    + +

    + CONSTANTS_BLOCK Contents +

    + +

    The CONSTANTS_BLOCK block (id 11) ...

    @@ -1387,10 +1366,11 @@ type to the type table. - +

    + FUNCTION_BLOCK Contents +

    -
    +

    The FUNCTION_BLOCK block (id 12) ...

    @@ -1409,23 +1389,21 @@ type to the type table. - +

    + TYPE_SYMTAB_BLOCK Contents +

    -
    +

    The TYPE_SYMTAB_BLOCK block (id 13) contains entries which map between module-level named types and their corresponding type indices.

    -
    - - +

    TST_CODE_ENTRY Record

    -
    +

    [ENTRY, typeid, ...string...]

    @@ -1436,12 +1414,14 @@ name. Each entry corresponds to a single named type.

    +
    - +

    + VALUE_SYMTAB_BLOCK Contents +

    -
    +

    The VALUE_SYMTAB_BLOCK block (id 14) ...

    @@ -1450,10 +1430,11 @@ name. Each entry corresponds to a single named type. - +

    + METADATA_BLOCK Contents +

    -
    +

    The METADATA_BLOCK block (id 15) ...

    @@ -1462,16 +1443,18 @@ name. Each entry corresponds to a single named type. - +

    + METADATA_ATTACHMENT Contents +

    -
    +

    The METADATA_ATTACHMENT block (id 16) ...

    +

    @@ -1480,8 +1463,8 @@ name. Each entry corresponds to a single named type. Valid HTML 4.01 Chris Lattner
    -The LLVM Compiler Infrastructure
    -Last modified: $Date: 2011-01-08 17:42:36 +0100 (Sat, 08 Jan 2011) $ +The LLVM Compiler Infrastructure
    +Last modified: $Date: 2011-04-23 02:30:22 +0200 (Sat, 23 Apr 2011) $ -- cgit v1.1