From f27e5a09a0d815b8a4814152954ff87dadfdefc0 Mon Sep 17 00:00:00 2001
From: ed
Date: Tue, 2 Jun 2009 17:58:47 +0000
Subject: Import Clang, at r72732.

---
 www/diagnostics.html | 279 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 279 insertions(+)
 create mode 100644 www/diagnostics.html

diff --git a/www/diagnostics.html b/www/diagnostics.html
new file mode 100644
index 0000000..5be4db2
--- /dev/null
+++ b/www/diagnostics.html
@@ -0,0 +1,279 @@
+  Clang - Expressive Diagnostics
+ + + +

Expressive Diagnostics

+ + +

In addition to being fast and functional, we aim to make Clang extremely user friendly. For a command-line compiler, this boils down to making the diagnostics (error and warning messages) generated by the compiler as useful as possible. There are several ways that we do this. This section describes the experience provided by the command-line compiler, contrasting Clang output with GCC 4.2's output in several examples.

+ +

Column Numbers and Caret Diagnostics

+ +

First, all diagnostics produced by Clang include full column number information and use it to print "caret diagnostics". This feature is provided by many commercial compilers but is generally missing from open source compilers. It makes it easy to see exactly what is wrong in a particular piece of code. For example:

+ +
+  $ gcc-4.2 -fsyntax-only -Wformat format-strings.c
+  format-strings.c:91: warning: too few arguments for format
+  $ clang -fsyntax-only format-strings.c
+  format-strings.c:91:13: warning: '.*' specified field precision is missing a matching 'int' argument
+    printf("%.*d");
+              ^
+
+ +

The caret (the blue "^" character) shows exactly where the problem is, even inside the string. This makes it easy to jump to the problem and helps when multiple instances of the same character occur on a line. We'll revisit this in the following examples.
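To make the warning above concrete: the `.*` in `%.*d` tells `printf` to read the precision from an extra `int` argument that comes before the value. The following is a minimal sketch of a corrected call. The helper name `format_with_precision` is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats `value` with a run-time precision.
 * The '*' in "%.*d" consumes one int argument (the precision) before
 * the value itself; that is the argument the diagnostic reports as missing. */
int format_with_precision(char *out, size_t cap, int precision, int value)
{
    assert(out != NULL && cap > 0);
    return snprintf(out, cap, "%.*d", precision, value);
}
```

For instance, a precision of 3 pads the value 42 to three digits, giving `042`.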

+ +

Range Highlighting for Related Text

+ +

Clang captures and accurately tracks range information for expressions, statements, and other constructs in your program and uses this to make diagnostics highlight related information. Here is a somewhat nonsensical example to illustrate this:

+ +
+  $ gcc-4.2 -fsyntax-only t.c
+  t.c:7: error: invalid operands to binary + (have 'int' and 'struct A')
+  $ clang -fsyntax-only t.c
+  t.c:7:39: error: invalid operands to binary expression ('int' and 'struct A')
+    return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);
+                         ~~~~~~~~~~~~~~ ^ ~~~~~
+
+ +

Here you don't even need to see the original source code to understand what is wrong based on the Clang error: because Clang prints a caret, you know exactly which plus it is complaining about. The range information highlights the left and right sides of the plus, which makes it immediately obvious what the compiler is talking about. This is very useful for cases involving precedence issues and many other cases.
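The marker line printed under the source text can be derived from column information alone. The sketch below shows one way to do that, assuming 1-based inclusive columns. The names `col_range` and `render_markers` are hypothetical, and this is not a description of how Clang itself is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A highlighted source range, as 1-based inclusive columns. */
struct col_range {
    int first;
    int last;
};

/* Writes a marker line into `out`: '~' under each range and '^' at
 * `caret_col` (1-based). Columns that do not fit in `cap` are dropped.
 * Illustrative sketch only. */
void render_markers(char *out, size_t cap, int caret_col,
                    const struct col_range *ranges, size_t nranges)
{
    assert(out != NULL && cap > 0 && caret_col >= 1);

    /* Find the rightmost column that needs a marker. */
    int width = caret_col;
    for (size_t i = 0; i < nranges; i++) {
        if (ranges[i].last > width)
            width = ranges[i].last;
    }
    if ((size_t)width > cap - 1)
        width = (int)(cap - 1);

    memset(out, ' ', (size_t)width);
    out[width] = '\0';

    for (size_t i = 0; i < nranges; i++) {
        int c = ranges[i].first < 1 ? 1 : ranges[i].first;
        for (; c <= ranges[i].last && c <= width; c++)
            out[c - 1] = '~';
    }
    /* The caret is drawn last so it wins over an overlapping range. */
    if (caret_col <= width)
        out[caret_col - 1] = '^';
}
```

With a single range covering columns 1 through 5 and a caret at column 6, the function produces `~~~~~^`.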

+ +

Precision in Wording

+ +

We have also tried hard to make the diagnostics that come out of Clang contain exactly the pertinent information about what is wrong and why. In the example above, we tell you what the inferred types are for the left and right hand sides, and we don't repeat what is obvious from the caret (that this is a "binary +"). Many other examples abound; here is a simple one:

+ +
+  $ gcc-4.2 -fsyntax-only t.c
+  t.c:5: error: invalid type argument of 'unary *'
+  $ clang -fsyntax-only t.c
+  t.c:5:11: error: indirection requires pointer operand ('int' invalid)
+    int y = *SomeA.X;
+            ^~~~~~~~
+
+ +

In this example, not only do we tell you that there is a problem with the * and point to it, we say exactly why and tell you what the type is (in case it is a complicated subexpression, such as a call to an overloaded function). This sort of attention to detail makes it much easier to understand and fix problems quickly.

+ +

No Pretty Printing of Expressions in Diagnostics

+ +

Since Clang has range highlighting, it never needs to pretty print your code back out to you. Pretty printing is particularly bad in G++ (which often emits errors containing lowered vtable references), but even GCC can produce inscrutable error messages in some cases when it tries to do this. In this example P and Q have type "int*":

+ +
+  $ gcc-4.2 -fsyntax-only t.c
+  #'exact_div_expr' not supported by pp_c_expression#'t.c:12: error: called object  is not a function
+  $ clang -fsyntax-only t.c
+  t.c:12:8: error: called object type 'int' is not a function or function pointer
+    (P-Q)();
+    ~~~~~^
+
+ + +

Typedef Preservation and Selective Unwrapping

+ +

Many programmers use high-level user-defined types, typedefs, and other syntactic sugar to refer to types in their program. This is useful because they can abbreviate otherwise very long types, and it is useful to preserve the type name in diagnostics. However, sometimes very simple typedefs wrap trivial types, and it is important to strip off the typedef to understand what is going on. Clang aims to handle both cases well.

+ +

Here is an example that shows where it is important to preserve a typedef in C:

+ +
+  $ gcc-4.2 -fsyntax-only t.c
+  t.c:15: error: invalid operands to binary / (have 'float __vector__' and 'const int *')
+  $ clang -fsyntax-only t.c
+  t.c:15:11: error: can't convert between vector values of different size ('__m128' and 'int const *')
+    myvec[1]/P;
+    ~~~~~~~~^~
+
+ +

Here the type printed by GCC isn't even valid, but if the error were about a very long and complicated type (as often happens in C++), the error message would be ugly just because it was long and hard to read. Here's an example where it is useful for the compiler to expose the underlying details of a typedef:

+ +
+  $ gcc-4.2 -fsyntax-only t.c
+  t.c:13: error: request for member 'x' in something not a structure or union
+  $ clang -fsyntax-only t.c
+  t.c:13:9: error: member reference base type 'pid_t' (aka 'int') is not a structure or union
+    myvar = myvar.x;
+            ~~~~~ ^
+
+ +

If the user was somehow confused about how the system "pid_t" typedef is defined, Clang helpfully displays it with "aka".
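The two behaviours shown above (keeping `__m128` as written, expanding `pid_t` with "aka") can be sketched as a small formatting routine. The function name `format_type` and its `show_underlying` flag are hypothetical. The text does not spell out how Clang decides when to unwrap a typedef, so that decision is left to the caller here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Formats a type for a diagnostic. `spelled` is the name as written in
 * the source; `underlying` is the type it resolves to. The "aka" note is
 * added only when the caller asks for it and the two names differ.
 * Hypothetical helper, not Clang's API. */
int format_type(char *out, size_t cap, const char *spelled,
                const char *underlying, int show_underlying)
{
    assert(out != NULL && spelled != NULL && underlying != NULL);
    if (!show_underlying || strcmp(spelled, underlying) == 0)
        return snprintf(out, cap, "'%s'", spelled);
    return snprintf(out, cap, "'%s' (aka '%s')", spelled, underlying);
}
```

Called with `"pid_t"`, `"int"`, and a non-zero flag, this yields `'pid_t' (aka 'int')`. With the flag set to zero, the typedef name is printed on its own.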

+ +

In C++, type preservation includes retaining any qualification written into type names. For example, if we take a small snippet of code such as:

+
+namespace services {
+  struct WebService {  };
+}
+namespace myapp {
+  namespace servers {
+    struct Server {  };
+  }
+}
+
+using namespace myapp;
+void addHTTPService(servers::Server const &server, ::services::WebService const *http) {
+  server += http;
+}
+
+
+ +

and then compile it, we see that Clang both provides more accurate information and retains the types as written by the user (e.g., "servers::Server", "::services::WebService"):

+  $ g++-4.2 -fsyntax-only t.cpp
+  t.cpp:9: error: no match for 'operator+=' in 'server += http'
+  $ clang -fsyntax-only t.cpp
+  t.cpp:9:10: error: invalid operands to binary expression ('servers::Server const' and '::services::WebService const *')
+    server += http;
+    ~~~~~~ ^  ~~~~
+
+ +

Naturally, type preservation extends to uses of templates, and Clang retains information about how a particular template specialization (like std::vector<Real>) was spelled within the source code. For example:

+ +
+  $ g++-4.2 -fsyntax-only t.cpp
+  t.cpp:12: error: no match for 'operator=' in 'str = vec'
+  $ clang -fsyntax-only t.cpp
+  t.cpp:12:7: error: incompatible type assigning 'vector<Real>', expected 'std::string' (aka 'class std::basic_string<char>')
+    str = vec;
+        ^ ~~~
+
+ +

Fix-it Hints

+ +

"Fix-it" hints provide advice for fixing small, localized problems +in source code. When Clang produces a diagnostic about a particular +problem that it can work around (e.g., non-standard or redundant +syntax, missing keywords, common mistakes, etc.), it may also provide +specific guidance in the form of a code transformation to correct the +problem. For example, here Clang warns about the use of a GCC +extension that has been considered obsolete since 1993:

+ +
+  $ clang t.c
+  t.c:5:28: warning: use of GNU old-style field designator extension
+  struct point origin = { x: 0.0, y: 0.0 };
+                          ~~ ^
+                          .x = 
+  t.c:5:36: warning: use of GNU old-style field designator extension
+  struct point origin = { x: 0.0, y: 0.0 };
+                                  ~~ ^
+                                  .y = 
+
+ +

The underlined code should be removed, then replaced with the code below the caret line (".x =" or ".y =", respectively). "Fix-it" hints are most useful for working around common user errors and misconceptions. For example, C++ users commonly forget the syntax for explicit specialization of class templates, as in the following error:

+ +
+  $ clang t.cpp
+  t.cpp:9:3: error: template specialization requires 'template<>'
+    struct iterator_traits<file_iterator> {
+    ^
+    template<> 
+
+ +

Again, after describing the problem, Clang provides the fix--add template<>--as part of the diagnostic.
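A fix-it hint amounts to a small text edit: remove a span, then insert replacement text at the same position. The sketch below applies one such edit to a single source line. The function `apply_fixit` and its parameters are hypothetical; it illustrates the idea and is not Clang's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Applies one fix-it to `line`: removes `remove_len` bytes starting at the
 * 0-based offset `offset`, then inserts `insertion` at that position.
 * Returns 0 on success, -1 if the edit is out of bounds or `out` is too small. */
int apply_fixit(char *out, size_t cap, const char *line,
                size_t offset, size_t remove_len, const char *insertion)
{
    assert(out != NULL && line != NULL && insertion != NULL);

    size_t line_len = strlen(line);
    size_t ins_len = strlen(insertion);
    if (offset > line_len || remove_len > line_len - offset)
        return -1;

    size_t tail_len = line_len - offset - remove_len;
    if (offset + ins_len + tail_len + 1 > cap)
        return -1;

    memcpy(out, line, offset);                      /* text before the edit */
    memcpy(out + offset, insertion, ins_len);       /* replacement text     */
    memcpy(out + offset + ins_len,
           line + offset + remove_len, tail_len);   /* text after the edit  */
    out[offset + ins_len + tail_len] = '\0';
    return 0;
}
```

Removing the two bytes `x:` at offset 2 of `{ x: 0.0 }` and inserting `.x =` gives `{ .x = 0.0 }`. A pure insertion, such as adding `template<> ` at offset 0, uses a removal length of zero.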

+ +

Automatic Macro Expansion

+ +

Many errors happen in macros that are sometimes deeply nested. With traditional compilers, you need to dig deep into the definition of the macro to understand how you got into trouble. Here's a simple example that shows how Clang helps you out:

+ +
+  $ gcc-4.2 -fsyntax-only t.c
+  t.c: In function 'test':
+  t.c:80: error: invalid operands to binary < (have 'struct mystruct' and 'float')
+  $ clang -fsyntax-only t.c
+  t.c:80:3: error: invalid operands to binary expression ('typeof(P)' (aka 'struct mystruct') and 'typeof(F)' (aka 'float'))
+    X = MYMAX(P, F);
+        ^~~~~~~~~~~
+  t.c:76:94: note: instantiated from:
+  #define MYMAX(A,B)    __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a < __b ? __b : __a; })
+                                                                                           ~~~ ^ ~~~
+
+ +

This shows how Clang automatically prints instantiation information and nested range information for diagnostics as they are instantiated through macros, and also shows how some of the other pieces work in a bigger example. Here's another real-world warning that occurs in the "window" Unix package (which implements the "wwopen" class of APIs):

+ +
+  $ clang -fsyntax-only t.c
+  t.c:22:2: warning: type specifier missing, defaults to 'int'
+          ILPAD();
+          ^
+  t.c:17:17: note: instantiated from:
+  #define ILPAD() PAD((NROW - tt.tt_row) * 10)    /* 1 ms per char */
+                  ^
+  t.c:14:2: note: instantiated from:
+          register i; \
+          ^
+
+ +

In practice, we've found that this is actually more useful in multiply nested macros than in simple ones.
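For contrast with the failing `MYMAX(P, F)` call in the first macro example, here is a sketch of the same macro used with operands that can be compared. It relies on GNU statement expressions and `__typeof__`, which are extensions accepted by both GCC and Clang. The wrapper function names are hypothetical.

```c
/* Same macro as in the diagnostic above: a GNU statement expression that
 * evaluates each argument once and yields the larger value. */
#define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A); \
                                    __typeof__(B) __b = (B); \
                                    __a < __b ? __b : __a; })

/* Hypothetical wrappers: both operands are arithmetic, so the `<` inside
 * the macro body is well-formed. */
int max_int(int a, int b)
{
    return MYMAX(a, b);
}

double max_mixed(int a, double b)
{
    return MYMAX(a, b);
}
```

The error in the earlier example arises because one operand is a `struct mystruct`, for which `<` is not defined. That is why the note points at `__a < __b` inside the macro body.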

+ +