Diffstat (limited to 'www/performance-2008-10-31.html')
-rw-r--r-- www/performance-2008-10-31.html | 134
1 files changed, 134 insertions, 0 deletions
diff --git a/www/performance-2008-10-31.html b/www/performance-2008-10-31.html
new file mode 100644
index 0000000..5246ac3
--- /dev/null
+++ b/www/performance-2008-10-31.html
@@ -0,0 +1,134 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
+ "http://www.w3.org/TR/html4/strict.dtd">
+<html>
+<head>
+ <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+ <title>Clang - Performance</title>
+ <link type="text/css" rel="stylesheet" href="menu.css" />
+ <link type="text/css" rel="stylesheet" href="content.css" />
+ <style type="text/css">
+</style>
+</head>
+<body>
+
+<!--#include virtual="menu.html.incl"-->
+
+<div id="content">
+
+<!--*************************************************************************-->
+<h1>Clang - Performance</h1>
+<!--*************************************************************************-->
+
+<p>This page tracks the compile-time performance of Clang on two
+interesting benchmarks:
+<ul>
+ <li><i>Sketch</i>: The Objective-C example application shipped on
+ Mac OS X as part of Xcode. <i>Sketch</i> is indicative of a
+ "typical" Objective-C app. The source itself has a relatively
+ small amount of code (~7,500 lines of source code), but it relies
+ on the extensive Cocoa APIs to build its functionality. Like many
+ Objective-C applications, it includes
+ <tt>Cocoa/Cocoa.h</tt> in all of its source files, which represents a
+ significant stress test of the front-end's performance on lexing,
+ preprocessing, parsing, and syntax analysis.</li>
+ <li><i>176.gcc</i>: This is the gcc-2.7.2.2 code base as present in
+ SPECINT 2000. In contrast to Sketch, <i>176.gcc</i> consists of a
+ large amount of C source code (~220,000 lines) with few system
+ dependencies. This stresses the back-end's performance on generating
+ assembly code and debug information.</li>
+</ul>
+</p>
+
+<!--*************************************************************************-->
+<h2><a name="enduser">Experiments</a></h2>
+<!--*************************************************************************-->
+
+<p>Measurements are done by serially processing each file in the
+respective benchmark, using Clang, gcc, and llvm-gcc as compilers. In
+order to track the performance of various subsystems the timings have
+been broken down into separate stages where possible:
+
+<ul>
+ <li><tt>-Eonly</tt>: This option runs the preprocessor but does not
+ perform any output. For gcc and llvm-gcc, the -MM option is used
+ as a rough equivalent to this step.</li>
+ <li><tt>-parse-noop</tt>: This option runs the parser on the input,
+ but without semantic analysis or any output. gcc and llvm-gcc have
+ no equivalent for this option.</li>
+ <li><tt>-fsyntax-only</tt>: This option runs the parser with semantic
+ analysis.</li>
+ <li><tt>-emit-llvm -O0</tt>: For Clang and llvm-gcc, this option
+ converts to the LLVM intermediate representation but doesn't
+ generate native code.</li>
+ <li><tt>-S -O0</tt>: Perform actual code generation to produce a
+ native assembler file.</li>
+ <li><tt>-S -O0 -g</tt>: This adds emission of debug information to
+ the assembly output.</li>
+</ul>
+</p>
+
+<p>This set of stages is chosen to be approximately additive; that is,
+each subsequent stage simply adds some additional processing. The
+timings measure the delta of the given stage from the previous
+one. For example, the timings for <tt>-fsyntax-only</tt> below show
+the difference of running with <tt>-fsyntax-only</tt> versus running
+with <tt>-parse-noop</tt> (for Clang) or <tt>-MM</tt> (for gcc and
+llvm-gcc). This amounts to a fairly accurate measure of only the time
+to perform semantic analysis (and parsing, in the case of gcc and llvm-gcc).</p>
+
+<p>These timings are chosen to break down the compilation process for
+Clang as much as possible. The graphs below show these numbers
+combined so that it is easy to see how the time for a particular task
+is divided among various components. For example, <tt>-S -O0</tt>
+includes the time of <tt>-fsyntax-only</tt> and <tt>-emit-llvm -O0</tt>.</p>
+
+<p>Note that we already know that the LLVM optimizers are substantially (30-40%)
+faster than the GCC optimizers at a given -O level, so we only focus on -O0
+compile time here.</p>
+
+<!--*************************************************************************-->
+<h2><a name="enduser">Timing Results</a></h2>
+<!--*************************************************************************-->
+
+<!--=======================================================================-->
+<h3><a name="2008-10-31">2008-10-31</a></h3>
+<!--=======================================================================-->
+
+<center><h4>Sketch</h4></center>
+<img class="img_slide"
+ src="timing-data/2008-10-31/sketch.png" alt="Sketch Timings"/>
+
+<p>This shows Clang's substantial performance improvements in
+preprocessing and semantic analysis: over 90% faster on
+<tt>-fsyntax-only</tt>. As expected, time spent in code generation for this
+benchmark is relatively small. One caveat: Clang's debug information
+generation for Objective-C is very incomplete; this means the <tt>-S
+-O0 -g</tt> numbers are unfair since Clang is generating substantially
+less output.</p>
+
+<p>This chart also shows the effect of using precompiled headers (PCH)
+on compile time. gcc and llvm-gcc see a large performance improvement
+with PCH: about 4x in wall time. Unfortunately, Clang does not yet
+have an implementation of PCH-style optimizations, but we are actively
+working to address this.</p>
+
+<center><h4>176.gcc</h4></center>
+<img class="img_slide"
+ src="timing-data/2008-10-31/176.gcc.png" alt="176.gcc Timings"/>
+
+<p>Unlike the <i>Sketch</i> timings, compilation of <i>176.gcc</i>
+involves a large amount of code generation. The time spent in Clang's
+LLVM IR generation and code generation is on par with gcc's code
+generation time, but the improved parsing and semantic analysis
+performance means Clang still comes in at ~29% faster versus gcc
+on <tt>-S -O0 -g</tt> and ~20% faster versus llvm-gcc.</p>
+
+<p>These numbers indicate that Clang still has room for improvement in
+several areas: notably, our LLVM IR generation is significantly slower
+than that of llvm-gcc, and both Clang and llvm-gcc incur a
+significantly higher cost for adding debugging information compared to
+gcc.</p>
+
+</div>
+</body>
+</html>
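
The per-stage accounting described under "Experiments" in the page above (each timing is the delta from the previous stage, and a compiler that lacks a stage is compared against the last stage it does support) can be sketched as follows. This is a minimal illustration and not the script used to produce the published graphs: the function name `stage_deltas` and all numbers are hypothetical, and the gcc/llvm-gcc `-MM` run is stored under the `-Eonly` key purely for convenience.

```python
from typing import Dict, List

# Stage order as listed in the Experiments section. Each stage is assumed to
# do everything the previous one does, plus some additional work.
STAGES: List[str] = [
    "-Eonly",          # gcc/llvm-gcc use -MM as a rough equivalent
    "-parse-noop",     # Clang only
    "-fsyntax-only",
    "-emit-llvm -O0",  # Clang and llvm-gcc only
    "-S -O0",
    "-S -O0 -g",
]


def stage_deltas(cumulative: Dict[str, float]) -> Dict[str, float]:
    """Return the time attributed to each measured stage.

    `cumulative` maps a stage name to the total wall time of a run with that
    option. Stages a compiler does not support are simply absent, so the
    delta is taken against the most recent stage that was measured.
    """
    deltas: Dict[str, float] = {}
    previous = 0.0
    for stage in STAGES:
        if stage not in cumulative:
            continue
        deltas[stage] = cumulative[stage] - previous
        previous = cumulative[stage]
    return deltas


# Hypothetical totals in seconds, invented for illustration only.
clang_like = {
    "-Eonly": 1.0,
    "-parse-noop": 1.5,
    "-fsyntax-only": 2.5,
    "-emit-llvm -O0": 3.0,
    "-S -O0": 4.0,
    "-S -O0 -g": 4.5,
}
gcc_like = {
    "-Eonly": 2.0,       # measured with -MM
    "-fsyntax-only": 5.0,
    "-S -O0": 7.5,
    "-S -O0 -g": 8.0,
}

if __name__ == "__main__":
    for name, totals in (("clang-like", clang_like), ("gcc-like", gcc_like)):
        print(name, stage_deltas(totals))
```

Because the deltas are differences of consecutive totals, summing them up to a given stage recovers that stage's total run time, which is how the stacked graphs combine the numbers.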