Diffstat (limited to 'docs/ReleaseNotes.rst')
-rw-r--r-- | docs/ReleaseNotes.rst | 410 |
1 files changed, 385 insertions, 25 deletions
diff --git a/docs/ReleaseNotes.rst b/docs/ReleaseNotes.rst index c0d2ea1..fd149c9 100644 --- a/docs/ReleaseNotes.rst +++ b/docs/ReleaseNotes.rst @@ -5,12 +5,6 @@ LLVM 3.7 Release Notes .. contents:: :local: -.. warning:: - These are in-progress notes for the upcoming LLVM 3.7 release. You may - prefer the `LLVM 3.6 Release Notes <http://llvm.org/releases/3.6.0/docs - /ReleaseNotes.html>`_. - - Introduction ============ @@ -23,7 +17,7 @@ from the `LLVM releases web site <http://llvm.org/releases/>`_. For more information about LLVM, including information about the latest release, please check out the `main LLVM web site <http://llvm.org/>`_. If you have questions or comments, the `LLVM Developer's Mailing List -<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>`_ is a good place to send +<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ is a good place to send them. Note that if you are reading this file from a Subversion checkout or the main @@ -48,46 +42,346 @@ Non-comprehensive list of changes in this release collection of tips for frontend authors on how to generate IR which LLVM is able to effectively optimize. -* The DataLayout is no longer optional. All the IR level optimizations expects +* The ``DataLayout`` is no longer optional. All the IR level optimizations expects it to be present and the API has been changed to use a reference instead of a pointer to make it explicit. The Module owns the datalayout and it has to match the one attached to the TargetMachine for generating code. -* ... next change ... + In 3.6, a pass was inserted in the pipeline to make the ``DataLayout`` accessible: + ``MyPassManager->add(new DataLayoutPass(MyTargetMachine->getDataLayout()));`` + In 3.7, you don't need a pass, you set the ``DataLayout`` on the ``Module``: + ``MyModule->setDataLayout(MyTargetMachine->createDataLayout());`` -.. NOTE - If you would like to document a larger change, then you can add a - subsection about it right here. 
You can copy the following boilerplate - and un-indent it (the indentation causes it to be inside this comment). + The LLVM C API ``LLVMGetTargetMachineData`` is deprecated to reflect the fact + that it won't be available anymore from ``TargetMachine`` in 3.8. - Special New Feature - ------------------- +* Comdats are now orthogonal to the linkage. LLVM will not create + comdats for weak linkage globals and the frontends are responsible + for explicitly adding them. - Makes programs 10x faster by doing Special New Thing. +* On ELF we now support multiple sections with the same name and + comdat. This allows for smaller object files since multiple + sections can have a simple name (`.text`, `.rodata`, etc.). -Changes to the ARM Backend --------------------------- +* LLVM now lazily loads metadata in some cases. Creating archives + with IR files with debug info is now 25X faster. + +* llvm-ar can create archives in the BSD format used by OS X. + +* LLVM received a backend for the extended Berkeley Packet Filter + instruction set that can be dynamically loaded into the Linux kernel via the + `bpf(2) <http://man7.org/linux/man-pages/man2/bpf.2.html>`_ syscall. + + Support for BPF has been present in the kernel for some time, but starting + from 3.18 has been extended with such features as: 64-bit registers, 8 + additional registers, conditional backwards jumps, call + instruction, shift instructions, map (hash table, array, etc.), 1-8 byte + load/store from stack, and more. - During this release ... + Up until now, users of BPF had to write bytecode by hand, or use + custom generators. This release adds a proper LLVM backend target for the BPF + bytecode architecture. + The BPF target is now available by default, and options exist in both Clang + (-target bpf) and llc (-march=bpf) to pick eBPF as a backend.
+ +* Switch-case lowering was rewritten to avoid generating unbalanced search trees + (`PR22262 <http://llvm.org/pr22262>`_) and to exploit profile information + when available. Some lowering strategies are now disabled when optimizations + are turned off, to save compile time. + +* The debug info IR class hierarchy now inherits from ``Metadata`` and has its + own bitcode records and assembly syntax + (`documented in LangRef <LangRef.html#specialized-metadata-nodes>`_). The debug + info verifier has been merged with the main verifier. + +* LLVM IR and APIs are in a period of transition to aid in the removal of + pointer types (the end goal being that pointers are typeless/opaque - void*, + if you will). Some APIs and IR constructs have been modified to take + explicit types that are currently checked to match the target type of their + pre-existing pointer type operands. Further changes are still needed, but the + more you can avoid using ``PointerType::getPointeeType``, the easier the + migration will be. + +* Argument-less ``TargetMachine::getSubtarget`` and + ``TargetMachine::getSubtargetImpl`` have been removed from the tree. Updating + out of tree ports is as simple as implementing a non-virtual version in the + target, but implementing full ``Function`` based ``TargetSubtargetInfo`` + support is recommended. + +* This is expected to be the last major release of LLVM that supports being + run on Windows XP and Windows Vista. For the next major release the minimum + Windows version requirement will be Windows 7. Changes to the MIPS Target -------------------------- - During this release ... +During this release the MIPS target has: + +* Added support for MIPS32R3, MIPS32R5, MIPS64R3, MIPS64R5, and microMIPS32. + +* Added support for dynamic stack realignment. This is of particular importance + to MSA on 32-bit subtargets since vectors always exceed the stack alignment on + the O32 ABI.
+ +* Added support for compiler-rt including: + + * Support for the Address and Undefined Behaviour Sanitizers for all MIPS + subtargets. + + * Support for the Data Flow and Memory Sanitizers for 64-bit subtargets. + + * Support for the Profiler for all MIPS subtargets. + +* Added support for libcxx and libcxxabi. + +* Improved inline assembly support such that memory constraints may now make use + of the appropriate address offsets available to the instructions. Also, added + support for the ``ZC`` constraint. + +* Added support for 128-bit integers on 64-bit subtargets and 16-bit floating + point conversions on all subtargets. + +* Added support for read-only ``.eh_frame`` sections by storing type information + indirectly. + +* Added support for MCJIT on all 64-bit subtargets as well as MIPS32R6. + +* Added support for fast instruction selection on MIPS32 and MIPS32R2 with PIC. + +* Various bug fixes, including the following notable fixes: + * Fixed 'jumpy' debug line info around calls where calculation of the address + of the function would inappropriately change the line number. + + * Fixed missing ``__mips_isa_rev`` macro on the MIPS32R6 and MIPS64R6 + subtargets. + + * Fixed representation of NaN when targeting systems using traditional + encodings. Traditionally, MIPS has used NaN encodings that were compatible + with IEEE754-1985 but would later be found incompatible with IEEE754-2008. + + * Fixed multiple segfaults and assertions in the disassembler when + disassembling instructions that have memory operands. + + * Fixed multiple cases of suboptimal code generation involving $zero. + + * Fixed code generation of 128-bit shifts on 64-bit subtargets. + + * Prevented the delay slot filler from filling call delay slots with + instructions that modify or use $ra. + + * Fixed some remaining N32/N64 calling convention bugs when using small + structures on big-endian subtargets.
+ + * Fixed missing sign-extensions that are required by the N32/N64 calling + convention when generating calls to library functions with 32-bit + parameters. + + * Corrected the ``int64_t`` typedef to be ``long`` for N64. + + * ``-mno-odd-spreg`` is now honoured for vector insertion/extraction + operations when using -mmsa. + + * Fixed vector insertion and extraction for MSA on 64-bit subtargets. + + * Corrected the representation of member function pointers. This makes them + usable on microMIPS subtargets. Changes to the PowerPC Target ----------------------------- - During this release ... +There are numerous improvements to the PowerPC target in this release: + +* LLVM now supports the ISA 2.07B (POWER8) instruction set, including + direct moves between general registers and vector registers, and + built-in support for hardware transactional memory (HTM). Some missing + instructions from ISA 2.06 (POWER7) were also added. + +* Code generation for the local-dynamic and global-dynamic thread-local + storage models has been improved. + +* Loops may be restructured to leverage pre-increment loads and stores. + +* QPX - The vector instruction set used by the IBM Blue Gene/Q supercomputers + is now supported. + +* Loads from the TOC area are now correctly treated as invariant. + +* PowerPC now has support for i128 and v1i128 types. The types differ + in how they are passed in registers for the ELFv2 ABI. + +* Disassembly will now print shorter mnemonic aliases when available. + +* Optional register name prefixes for VSX and QPX registers are now + supported in the assembly parser. + +* The back end now contains a pass to remove unnecessary vector swaps + from POWER8 little-endian code generation. Additional improvements + are planned for release 3.8. + +* The undefined-behavior sanitizer (UBSan) is now supported for PowerPC. + +* Many new vector programming APIs have been added to altivec.h. + Additional ones are planned for release 3.8. 
+ +* PowerPC now supports __builtin_call_with_static_chain. + +* PowerPC now supports the revised -mrecip option that permits finer + control over reciprocal estimates. +* Many bugs have been identified and fixed. -Changes to the OCaml bindings +Changes to the SystemZ Target ----------------------------- - During this release ... +* LLVM no longer attempts to automatically detect the current host CPU when + invoked natively. +* Support for all thread-local storage models. (Previous releases would support + only the local-exec TLS model.) + +* The POPCNT instruction is now used on z196 and above. + +* The RISBGN instruction is now used on zEC12 and above. + +* Support for the transactional-execution facility on zEC12 and above. + +* Support for the z13 processor and its vector facility. + + +Changes to the JIT APIs +----------------------- + +* Added a new C++ JIT API called On Request Compilation, or ORC. + + ORC is a new JIT API inspired by MCJIT but designed to be more testable, and + easier to extend with new features. A key new feature already in tree is lazy, + function-at-a-time compilation for X86. Also included is a reimplementation of + MCJIT's API and behavior (OrcMCJITReplacement). MCJIT itself remains in tree, + and continues to be the default JIT ExecutionEngine, though new users are + encouraged to try ORC out for their projects. (A good place to start is the + new ORC tutorials under llvm/examples/kaleidoscope/orc). + +Sub-project Status Update +========================= + +In addition to the core LLVM 3.7 distribution of production-quality compiler +infrastructure, the LLVM project includes sub-projects that use the LLVM core +and share the same distribution license. This section provides updates on these +sub-projects. 
+ +Polly - The Polyhedral Loop Optimizer in LLVM +--------------------------------------------- + +`Polly <http://polly.llvm.org>`_ is a polyhedral loop optimization +infrastructure that provides data-locality optimizations to LLVM-based +compilers. When compiled as part of clang or loaded as a module into clang, +it can perform loop optimizations such as tiling, loop fusion or outer-loop +vectorization. As a generic loop optimization infrastructure it allows +developers to get a per-loop-iteration model of a loop nest on which detailed +analysis and transformations can be performed. + +Changes since the last release: + +* isl imported into Polly distribution + + `isl <http://repo.or.cz/w/isl.git>`_, the math library Polly uses, has been + imported into the source code repository of Polly and is now distributed as part + of Polly. As this was the last external library dependency of Polly, Polly can + now be compiled right after checking out the Polly source code without the need + for any additional libraries to be pre-installed. + +* Small integer optimization of isl + + The MIT licensed imath backend used in `isl <http://repo.or.cz/w/isl.git>`_ for + arbitrary width integer computations has been optimized to use native integer + operations for the common case where the operands of a computation fit into 32 + bits and to only fall back to large arbitrary precision integers for the + remaining cases. This optimization has greatly improved the compile-time + performance of Polly, both due to faster native operations and due to a + reduction in malloc traffic and pointer indirections. As a result, computations + that use arbitrary precision integers heavily have been sped up by almost 6x. + Overall, the compile-time of Polly on the Polybench test kernels in the LNT + suite has been reduced by 20% on average with compile time reductions between + 9% and 43%.
+ +* Schedule Trees + + Polly now internally uses so-called schedule trees to model the loop + structure it optimizes. Schedule trees are an easy to understand tree structure + that describes a loop nest using integer constraint sets to keep track of + execution constraints. It allows the developer to use per-tree-node operations + to modify the loop tree. Programmatic analyses that work on the schedule tree + (e.g., dependence analysis) also show a visible speedup as they can exploit + the tree structure of the schedule and need to fall back to ILP based + optimization problems less often. Section 6 of `Polyhedral AST generation is + more than scanning polyhedra + <http://www.grosser.es/#pub-polyhedral-AST-generation>`_ gives a detailed + explanation of schedule trees. + +* Scalar and PHI node modeling - Polly as an analysis + + Polly now requires almost no preprocessing to analyse LLVM-IR, which makes it + easier to use Polly as a pure analysis pass, e.g. to provide more precise + dependence information to non-polyhedral transformation passes. Originally, + Polly required the input LLVM-IR to be preprocessed such that all scalar and + PHI-node dependences are translated to in-memory operations. Since this release, + Polly has full support for scalar and PHI node dependences and requires no + scalar-to-memory translation for such dependences. + +* Modeling of modulo and non-affine conditions + + Polly now supports modulo operations such as A[t%2][i][j] as they appear + often in stencil computations and also allows data-dependent conditional + branches as they result e.g. from ternary conditions such as A[i] > 255 ? 255 : + A[i]. + +* Delinearization + + Polly now supports the analysis of manually linearized multi-dimensional arrays + as they result from macros such as + "#define 2DARRAY(A,i,j) (A.data[(i) * A.size + (j)])".
Similar constructs appear + in old C code written before C99, C++ code such as boost::ublas, LLVM exported + from Julia, Matlab generated code and many others. Our work titled + `Optimistic Delinearization of Parametrically Sized Arrays + <http://www.grosser.es/#pub-optimistic-delinerization>`_ gives details. + +* Compile time improvements + + Pratik Bahtu worked on compile-time performance tuning of Polly. His work + together with the support for schedule trees and the small integer optimization + in isl notably reduced the compile time. + +* Increased compute timeouts + + As Polly's compile time has been notably improved, we were able to increase + the compile time safeguards in Polly. As a result, the default configuration + of Polly can now analyze larger loop nests without running into compile time + restrictions. + +* Export Debug Locations via JSCoP file + + Polly's JSCoP import/export format gained support for debug locations that show + the user the source code location of detected scops. + +* Improved Windows support + + The compilation of Polly on Windows using CMake has been improved and several + Visual Studio build issues have been addressed. + +* Many bug fixes + +libunwind +--------- + +The unwind implementation which used to reside in `libc++abi` has been moved into +a separate repository. This implementation can still be used for `libc++abi` by +specifying `-DLIBCXXABI_USE_LLVM_UNWINDER=YES` and +`-DLIBCXXABI_LIBUNWIND_PATH=<path to libunwind source>` when configuring +`libc++abi`, which defaults to `true` when building on ARM. + +The new repository can also be built standalone if just `libunwind` is desired. External Open Source Projects Using LLVM 3.7 ============================================ @@ -96,7 +390,74 @@ An exciting aspect of LLVM is that it is used as an enabling technology for a lot of other language and tools projects. This section lists some of the projects that have already been updated to work with LLVM 3.7.
-* A project + +LDC - the LLVM-based D compiler +------------------------------- + +`D <http://dlang.org>`_ is a language with C-like syntax and static typing. It +pragmatically combines efficiency, control, and modeling power, with safety and +programmer productivity. D supports powerful concepts like Compile-Time Function +Execution (CTFE) and Template Meta-Programming, provides an innovative approach +to concurrency and offers many classical paradigms. + +`LDC <http://wiki.dlang.org/LDC>`_ uses the frontend from the reference compiler +combined with LLVM as backend to produce efficient native code. LDC targets +x86/x86_64 systems like Linux, OS X, FreeBSD and Windows and also Linux on +PowerPC (32/64 bit). Ports to other architectures like ARM, AArch64 and MIPS64 +are underway. + +Portable Computing Language (pocl) +---------------------------------- + +In addition to producing an easily portable open source OpenCL +implementation, another major goal of `pocl <http://portablecl.org/>`_ +is improving performance portability of OpenCL programs with +compiler optimizations, reducing the need for target-dependent manual +optimizations. An important part of pocl is a set of LLVM passes used to +statically parallelize multiple work-items with the kernel compiler, even in +the presence of work-group barriers. + + +TTA-based Co-design Environment (TCE) +------------------------------------- + +`TCE <http://tce.cs.tut.fi/>`_ is a toolset for designing customized +exposed datapath processors based on the Transport triggered +architecture (TTA). + +The toolset provides a complete co-design flow from C/C++ +programs down to synthesizable VHDL/Verilog and parallel program binaries. +Processor customization points include the register files, function units, +supported operations, and the interconnection network. + +TCE uses Clang and LLVM for C/C++/OpenCL C language support, target independent +optimizations and also for parts of code generation. 
It generates +new LLVM-based code generators "on the fly" for the designed processors and +loads them into the compiler backend as runtime libraries to avoid +per-target recompilation of larger parts of the compiler chain. + +BPF Compiler Collection (BCC) +----------------------------- +`BCC <https://github.com/iovisor/bcc>`_ is a Python + C framework for tracing and +networking that uses the Clang rewriter, a second pass of Clang, and the BPF backend to +generate eBPF and push it into the kernel. + +LLVMSharp & ClangSharp +---------------------- + +`LLVMSharp <http://www.llvmsharp.org>`_ and +`ClangSharp <http://www.clangsharp.org>`_ are type-safe C# bindings for +Microsoft .NET and Mono that Platform Invoke into the native libraries. +ClangSharp is self-hosted and is used to generate LLVMSharp using the +LLVM-C API. + +`LLVMSharp Kaleidoscope Tutorials <http://www.llvmsharp.org/Kaleidoscope/>`_ +are instructive examples of writing a compiler in C#, with certain improvements +like using the visitor pattern to generate LLVM IR. + +`ClangSharp PInvoke Generator <http://www.clangsharp.org/PInvoke/>`_ is the +self-hosting mechanism for LLVM/ClangSharp and is demonstrative of using +LibClang to generate Platform Invoke (PInvoke) signatures for C APIs. Additional Information @@ -111,4 +472,3 @@ going into the ``llvm/docs/`` directory in the LLVM tree. If you have any questions or comments about LLVM, please feel free to contact us via the `mailing lists <http://llvm.org/docs/#maillist>`_. -
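As an illustration of the Delinearization entry in the Polly notes above, the following minimal sketch shows what a manually linearized two-dimensional access looks like and how a flat offset maps back to ``(i, j)``. It is not Polly's algorithm. The ``Matrix`` struct, the ``ARRAY2D`` macro name (the macro quoted in the notes starts with a digit, which is not a valid C or C++ identifier) and the helper functions are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container with the `data` and `size` fields that the macro
// quoted in the notes expects; `size` is the row length.
struct Matrix {
    std::size_t size;        // number of columns per row
    std::vector<long> data;  // row-major storage of rows * size elements
};

// Manually linearized 2D access: element (i, j) lives at offset i * size + j.
#define ARRAY2D(A, i, j) ((A).data[(i) * (A).size + (j)])

// Build a rows x cols matrix, writing every element through the macro.
inline Matrix make_matrix(std::size_t rows, std::size_t cols) {
    Matrix A{cols, std::vector<long>(rows * cols)};
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            ARRAY2D(A, i, j) = static_cast<long>(10 * i + j);
    return A;
}

// Recover the two-dimensional subscripts from a flat offset. A delinearizing
// analysis has to establish this mapping symbolically; here it is computed
// for one concrete offset.
inline void delinearize(const Matrix &A, std::size_t offset,
                        std::size_t &i, std::size_t &j) {
    i = offset / A.size;
    j = offset % A.size;
}
```

For a 3 x 4 matrix built with ``make_matrix(3, 4)``, ``ARRAY2D(A, 2, 3)`` and ``A.data[11]`` name the same element, and ``delinearize(A, 11, i, j)`` yields ``i == 2`` and ``j == 3``.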